Re: Can we afford to offer SPARQL endpoints when we are successful? (Was "linked data hosted somewhere")

Kingsley Idehen Wed, 26 Nov 2008 17:27:30 -0800


Hugh Glaser wrote:

Prompted by the thread on "linked data hosted somewhere" I would like to ask
the above question that has been bothering me for a while.


The only reason anyone can afford to offer a SPARQL endpoint is because it
doesn't get used too much?

No.

For instance, DBpedia has offered a SPARQL endpoint in public view from day one to demonstrate what a public SPARQL endpoint can deliver.

The SPARQL Engine has to be able to work out the "Cost of a Query" and have intelligence re. resultset (solution) sizes and final delivery of the resultsets. In short, it has to construct a query fulfillment matrix that is server side configurable and enforceable.
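
To make the idea concrete, here is a minimal Python sketch of such a server-side policy. It is an illustration only, not how Virtuoso or any other engine implements it: the names `FulfillmentPolicy`, `admit`, and `deliver` are hypothetical, and the cost estimate is assumed to come from the engine's own query planner.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class FulfillmentPolicy:
    """Hypothetical server-side limits set by the endpoint operator."""
    max_estimated_cost: float  # refuse queries estimated above this cost
    max_result_rows: int       # cap on solutions delivered per query


def admit(estimated_cost: float, policy: FulfillmentPolicy) -> bool:
    """Decide, before execution, whether a query may run at all."""
    return estimated_cost <= policy.max_estimated_cost


def deliver(rows: Iterable, policy: FulfillmentPolicy) -> Tuple[List, bool]:
    """Return at most max_result_rows rows plus a 'truncated' flag."""
    delivered = []
    for row in rows:
        if len(delivered) >= policy.max_result_rows:
            # More rows exist than the policy allows: stop and flag it.
            return delivered, True
        delivered.append(row)
    return delivered, False
```

A query whose estimated cost exceeds the limit (for example an accidental Cartesian product) is refused up front, while an admitted query still has its result set capped on delivery.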

In the SQL realm of ODBC/JDBC/etc. we had to do the same thing with our Drivers, knowing the high probability of deliberate or inadvertent DOS via Cartesian products. Naturally, this approach is intrinsic to Virtuoso.

Any public facing query interface needs to have the capabilities above. Even Google uses similar techniques when delivering its document database realm search engine services.

As abstract components for studying interaction, performance, etc.:
DB=KB, SQL=SPARQL.
In fact, I often consider the components themselves interchangeable; that
is, the first step of the migration to SW technologies for an application is
to take an SQL-based back end and simply replace it with a SPARQL/RDF back
end and then carry on.

However.
No serious DB publisher gives direct SQL access to their DB (I think).

It really depends on the task at hand, and the factors allotted to "change sensitivity". If the "change sensitivity" factor has a high weighting then some form of cursoring against the main server offers a viable solution, but most don't go there because only a handful of DBMS Drivers actually support all the cursor models (Keyset, Dynamic, Mixed, and Static). Even worse, most of the Drivers (bar ours) aren't equipped with the fulfillment matrix capabilities I described above. If scrollable cursors aren't workable, you also have highly granular transactional replication as a "change sensitivity" issue handler re. indirect access, but these aren't common across all DBMS engines.

There are often commercial reasons, of course.
But even when there are not (the Open in LOD), there are only search options
and possibly download facilities.
Even government organisations that have a remit to publish their data don't
offer SQL access.

From my vantage point exposing SQL wouldn't have really solved the issue at hand (putting the DOS issues aside) anyhow. The data source name granularity offered in the RDBMS realm simply isn't there. This is fundamentally why HTTP based Data Source Naming (using URIs) and HTTP based Data Access by Reference (Linked Data) is ultimately so powerful. It addresses what open SQL RDBMS access would never have been able to deliver re. open data access and connectivity.
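
As an illustration of data access by reference, the following Python sketch builds an HTTP request that asks for an RDF description of a resource by dereferencing its URI. The helper name `build_linked_data_request`, the example DBpedia URI, and the choice of `text/turtle` as the requested media type are assumptions made for this sketch; which representations a given server actually offers depends on that server.

```python
from urllib.request import Request, urlopen


def build_linked_data_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Build a GET request that negotiates for an RDF representation of `uri`."""
    return Request(uri, headers={"Accept": media_type})


def fetch_description(uri: str, media_type: str = "text/turtle") -> bytes:
    """Dereference `uri` and return the raw response body (needs network access)."""
    with urlopen(build_linked_data_request(uri, media_type)) as response:
        return response.read()


# Example URI, chosen only for illustration.
request = build_linked_data_request("http://dbpedia.org/resource/Berlin")
```

The point is that the name of the data item and the means of accessing it are the same HTTP URI, so no separate connection string or query interface is needed to retrieve a description of a single resource.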

Will we not have to do the same?
Or perhaps there is a subset of SPARQL that I could offer that will allow me
to offer a "safer" service that conforms to others' safer services (so it is
well understood)?
Is this defined, or is anyone working on it?

I really think this is going to come down to the RDF DBMS Engine (as per my initial comments).

And I am not referring to any particular software - it seems to me that this
is something that LODers need to worry about.

LODers are not necessarily DBMS people, I think it's important to note :-)

It's one thing to know how to query a DBMS and a totally different kettle of fish re. building one. What LOD needs to do is take engagement of the broader DBMS community very seriously.

We aim to take over the world; and if SPARQL endpoints are part of that
(maybe they aren't - just resolvable URIs?), then we should make damn sure
that we think they can be delivered.

I would say we aim to open up data access to the world via the World Wide Web :-)


Kingsley

My answer to my subject question?
No, not as it stands. And we need to have a story to replace it.

Best
Hugh

=======================
Sorry if this is a second copy, but the first, sent as a new post, seemed to
only elicit a message from <[EMAIL PROTECTED]> and I can't work out or
find out whether it means the message was rejected or something else, such
as awaiting moderation.
So I've done this as a reply.
=======================
And now a response to the message from Aldo, done here to reduce traffic:

Very generous of you to write in this way.
And yes, humour is good.
And sorry to all for the traffic.

On 27/11/2008 00:02, "Aldo Bucchi" <[EMAIL PROTECTED]> wrote:

OK Hugh,

I see what you mean and I understand you being upset. Just re-read the
conversation word by word because I felt something was not right.
I did say "wacky"... is that it?

In that case, and if this caused the confusion, I am really sorry.

I was not talking about your software, this was just a joke. Talking in
general.
You replied to my joke with an absurd reply.

My point was simply that, if you want to push things over the edge,
why not get your own box. We all take care of our infrastructure and
know its limitations.

So, I formally apologize.
I am by no means endorsing one piece of software over another (save
for mine, but it doesn't exist yet ;).
My preferences for virtuoso come from experiential bias.

I hope this clears things up.
I apologize for the traffic.

However, I do make a formal request for some sense of humor.
This list tends to get into this kind of discussions, and we will
start getting more and more visits from outsiders who are not used to
this sort of "sharpness".

Best,
A



--


Regards,

Kingsley Idehen       Weblog: http://www.openlinksw.com/blog/~kidehen

President & CEO
OpenLink Software     Web: http://www.openlinksw.com
