Re: [Social] OT: list subscribers

Dan Brickley Wed, 21 May 2008 16:02:22 -0700

Peter Saint-Andre wrote:

On 05/12/2008 12:53 PM, Dan Brickley wrote:

I've been hacking around with
the use of XMPP as a data bus for RDF querying using SPARQL (as Peter
well knows, being my XMPP helpline). Some notes on that at [1].

[1] http://danbri.org/words/2008/02/11/278


See also http://crschmidt.net/semweb/sparqlxmpp/

Yup, Chris wrote this up based on IRC chats after the original design discussions I had with you. He beat me to running code :) The mapping of SPARQL result set format to XMPP IQ markup mutated a bit over time, so there isn't clean interop currently between his Python and my jqbus Java stuff. But it's all wrong anyway since large result sets are too big for one IQ; need to move to some batching or attachments-based model.

I still don't quite understand SPARQL, but I'm sure that's because I'm
missing the appropriate synapses. :)


OK basic idea of SPARQL:

1. understand the RDF 'nodes and arcs' data model. Quick messy potted version here, sorry if this is too hasty...

An RDF graph encodes a collection of simple statements or claims, which can be visualised as an edge-labelled graph, where the inter-node edges in this graph correspond to properties/relationships/attributes. Each edge links something either to another thing, or to a literal value. So each node is either a literal (which may either be tagged with a language code, or with a datatype URI), or a non-literal. Non-literals may be labelled with a URI, or may be 'blank' (although actual representations of this graph structure often have a private-to-the-graph identifier; however this is invisible to RDF). The label on each edge is itself a URI, for usual namespacing reasons.

2. think of RDF graphs as descriptions of the world; sets of claims which may or may not be accurate. The graph can be written/published/exchanged in any of the various concrete RDF syntaxes. RDF/XML is a common if ugly one. Also we have RDFa, Turtle/N3, and GRDDL which uses XSLT to turn colloquial XML into RDF graphs.

3. think of an RDF dataset as a collection of one or more of these graphs; a system dealing with multiple such graphs identifies each with a URI.

4. RDF querying in SPARQL is all about asking questions of one or more such graphs. As such, an RDF query is conceptually a bit like an RDF document, except bits can be marked as missing and labelled with variable names.


So an RDF/XML document might encode a graph that says something equiv to:

'there exists a Movie, its :homepage property has a value which is the URI <http://ironmanmovie.marvel.com/>; its :title property has the literal value "Iron Man" and its :starring property has the literal value "Robert Downey Jr.".'


(I'll spare you the XML version here)

By contrast SPARQL uses a non-XML notation to express questions. You might write SPARQL which says,

'OK give me values ?x and ?y where ?x is the URI of the movie's :homepage, and ?y is the title, wherever some thing that is a Movie has a :starring property with value "Robert Downey Jr.".'


(that's in pseudo-sparql english for now)

And depending on the dataset you ran the query against, you might get one row back, with ?x=http://ironmanmovie.marvel.com/ ?y="Iron Man".

Or you might get a load more rows describing other movies.

5. For now, just focus on this part of SPARQL: the bit that looks most like SQL. We have a query which is asking for variable-to-value bindings, tabular ... against some target data. It returns a set of 'hits' just like SQL over JDBC/ODBC/DBI etc. There are detail differences, but conceptually it is similar.

SPARQL defines XML and JSON bindings for these result sets. The XMPP/SPARQL binding work is an attempt to flow these through XMPP instead of the more traditional HTTP-based bindings.


http://www.w3.org/TR/rdf-sparql-XMLres/

http://www.w3.org/TR/rdf-sparql-json-res/ ...are the formats, there's also a protocol spec, http://www.w3.org/TR/rdf-sparql-protocol/ which also covers the HTTP binding. By far the most work is in the query language spec though, http://www.w3.org/TR/rdf-sparql-query/

6. er that's it. I guess I should write the above stuff in RDF/XML and SPARQL here.


target data:

_:r1 rdf:type eg:Movie .
_:r1 eg:homepage <http://ironmanmovie.marvel.com/> .
_:r1 eg:title "Iron Man" .
_:r1 eg:starring "Robert Downey Jr." .

this could also be written

[ a eg:Movie;
  eg:homepage <http://ironmanmovie.marvel.com/> ;
  eg:title "Iron Man";
  eg:starring "Robert Downey Jr.";
] .

...in Turtle/SPARQL notation. The RDF/XML snippet for this might be

<eg:Movie xmlns:eg="http://eg.example.com/abcd#"
          xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 <eg:homepage rdf:resource="http://ironmanmovie.marvel.com/"/>
 <eg:title>Iron Man</eg:title>
 <eg:starring>Robert Downey Jr.</eg:starring>
</eg:Movie>

A key point is that SPARQL doesn't care which notation the source data was originally in. Everything gets normalised into the triple/graph model, and queries are written in terms of it.
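To make that normalised triple model a bit more concrete, here is a rough Python sketch of the movie description as a set of (subject, property, value) triples. The URI, BlankNode and Literal classes are hypothetical names made up for this illustration; they are not taken from any RDF library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class URI:
    """A node or edge label identified by a URI."""
    value: str


@dataclass(frozen=True)
class BlankNode:
    """A non-literal node with no URI; the label is private to the graph."""
    label: str


@dataclass(frozen=True)
class Literal:
    """A literal value, optionally tagged with a language code or a datatype URI."""
    text: str
    lang: Optional[str] = None
    datatype: Optional[URI] = None


EG = "http://eg.example.com/abcd#"  # the example namespace used in this mail
RDF_TYPE = URI("http://www.w3.org/1999/02/22-rdf-syntax-ns#type")

movie = BlankNode("r1")

# Whatever syntax the data arrived in, it ends up as a set of triples like this.
graph = {
    (movie, RDF_TYPE, URI(EG + "Movie")),
    (movie, URI(EG + "homepage"), URI("http://ironmanmovie.marvel.com/")),
    (movie, URI(EG + "title"), Literal("Iron Man")),
    (movie, URI(EG + "starring"), Literal("Robert Downey Jr.")),
}

for subject, prop, value in graph:
    print(subject, prop, value)
```

Note that every edge label is a URI, while the value at the far end of an edge is either a literal or another (URI-labelled or blank) node.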


So our sample query in SPARQL might be:

PREFIX eg: <http://eg.example.com/abcd#>
SELECT ?x ?y
WHERE {
[ a eg:Movie;
  eg:homepage ?x ;
  eg:title ?y ;
  eg:starring "Robert Downey Jr." ;
] .
}

# here 'a' is short for rdf:type, and the [ chunk bracket ] notation is a way of writing a blank node


and the results are like this:

?x=http://ironmanmovie.marvel.com/ ?y="Iron Man"
?x=http://someothermovie.com/ ?y="Some Other Movie"
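For comparison, here is a rough Python sketch of how rows like these might be packed into the general shape of the SPARQL JSON results format (a "head" listing the variables, then one binding object per row). The helper name to_sparql_json is made up for this example, and the structure is simplified from memory of the JSON results note, so check the spec for the full details.

```python
import json


def to_sparql_json(variables, rows):
    """Build a SPARQL-JSON-style result document.

    `rows` is a list of dicts mapping a variable name to a (kind, value)
    pair, where kind is e.g. "uri" or "literal".
    """
    bindings = [
        {var: {"type": kind, "value": value} for var, (kind, value) in row.items()}
        for row in rows
    ]
    return {"head": {"vars": list(variables)}, "results": {"bindings": bindings}}


doc = to_sparql_json(
    ["x", "y"],
    [
        {"x": ("uri", "http://ironmanmovie.marvel.com/"), "y": ("literal", "Iron Man")},
        {"x": ("uri", "http://someothermovie.com/"), "y": ("literal", "Some Other Movie")},
    ],
)
print(json.dumps(doc, indent=2))
```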

Hmm, OK, that was a bit rambling... does it help at all? The main point I was aiming at here is that understanding basic SPARQL is a small step from being comfortable with the RDF graph model, since queries are really just RDF graphs with bits labelled missing ("?x" etc), and query results are the values from the graph that fit the pattern specified.

There is a lot of extra detail, e.g. around the GRAPH clause, which lets you match specific graphs in the target dataset, or for optionals, filters, datatyping etc. But the core is pretty simple.
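The "graph with bits labelled missing" idea can be sketched in a few lines of Python. This is a toy matcher over plain string triples, not how a real SPARQL engine works; the function names are invented for the example, the second movie's details are made up, and the blank node in the query is modelled as an ordinary variable (?m) that just isn't returned.

```python
def is_var(term):
    """Pattern terms starting with '?' are the 'missing bits'."""
    return isinstance(term, str) and term.startswith("?")


def match_pattern(pattern, triple, bindings):
    """Return bindings extended so that pattern matches triple, or None."""
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it agrees with an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def query(patterns, graph):
    """Find every set of bindings under which all patterns occur in graph."""
    solutions = [{}]
    for pattern in patterns:
        solutions = [
            extended
            for b in solutions
            for triple in graph
            if (extended := match_pattern(pattern, triple, b)) is not None
        ]
    return solutions


graph = [
    ("_:r1", "rdf:type", "eg:Movie"),
    ("_:r1", "eg:homepage", "http://ironmanmovie.marvel.com/"),
    ("_:r1", "eg:title", "Iron Man"),
    ("_:r1", "eg:starring", "Robert Downey Jr."),
    ("_:r2", "rdf:type", "eg:Movie"),
    ("_:r2", "eg:homepage", "http://someothermovie.com/"),
    ("_:r2", "eg:title", "Some Other Movie"),
    ("_:r2", "eg:starring", "Someone Else"),
]

patterns = [
    ("?m", "rdf:type", "eg:Movie"),
    ("?m", "eg:homepage", "?x"),
    ("?m", "eg:title", "?y"),
    ("?m", "eg:starring", "Robert Downey Jr."),
]

rows = [(s["?x"], s["?y"]) for s in query(patterns, graph)]
print(rows)
```

With this toy data only the first movie fits the whole pattern, so a single row comes back.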


Hope that helps :)

cheers,

Dan

--
http://danbri.org/
