Hi folks
(sorry for not chiming in on LOD list before btw; I tried to join
sometime back but it got wedged, but I think that was the old MIT-hosted
list anyways)
Richard Cyganiak wrote:
I don't think this is a problem. For provenance purposes, whatever works
for RDF/XML documents will also work for HTML+RDFa documents. Just think
of RDFa as a very verbose RDF syntax that contains a lot of “comments”
(the non-RDF, pure-HTML parts of the document). In the end, an RDF agent
just sees triples, no matter if they are parsed out of an HTML+RDFa
document or an RDF/XML document.
It's a very reasonable question.
On my homepage I have some small pieces of experimental RDFa, in
addition to my FOAF files. If you ever see a foaf:name for me that is
"Dan A. Brickley" you've probably found something whose provenance
traces to that homepage markup.
<div typeof="foaf:Person" property="foaf:name" content="Dan A. Brickley"> ...etc
We had an interesting chat in the #laconica IRC channel last week (I'll
wikify the notes soon I hope; not publicly logged yet) with Brad
Fitzpatrick, about how Laconica and the Google Social Graph API might
relate to each other. The key idea here is discovering which profile etc
pages have rel=me (or equiv FOAF) "claims" of which other pages. When
Google SGAPI
Brad Fitzpatrick: "just because a person tells friendfeed that they're
Bob on Twitter and Bob2 on Jaiku, that doesn't make it true."
But if the Twitter account 'bob' has rel=me pointing to the Jaiku
account, and jaiku's bob2 page has the same pointing back, that suggests
they're at least telling a mutually consistent story.
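
To make the "mutually consistent story" idea concrete, here is a rough sketch in Python. It assumes the rel=me links have already been crawled into a simple mapping from each profile page to the pages it claims; the function name and the example URLs are made up for illustration and have nothing to do with how SGAPI does it internally.

```python
def reciprocated_claims(rel_me):
    """Return the unordered page pairs that claim each other via rel=me.

    rel_me maps a profile URL to the set of URLs it points at with rel=me.
    A one-way claim is ignored; only claims that are returned count.
    """
    pairs = set()
    for page, targets in rel_me.items():
        for target in targets:
            # Keep the pair only if the target page points back at us.
            if target != page and page in rel_me.get(target, set()):
                pairs.add(frozenset((page, target)))
    return pairs


# Hypothetical crawl results (example URLs, not real accounts).
rel_me = {
    "http://twitter.example/bob": {"http://jaiku.example/bob2"},
    "http://jaiku.example/bob2": {"http://twitter.example/bob"},
    # The aggregator claims both accounts, but neither claims it back.
    "http://friendfeed.example/bob": {
        "http://twitter.example/bob",
        "http://jaiku.example/bob2",
    },
}

mutual = reciprocated_claims(rel_me)
```

Here only the twitter/jaiku pair survives; the third-party claim from the aggregator page is dropped because nothing points back at it.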
How does all this relate to the original question? :)
Humm, well the way I see it, we'll want SPARQL / named graph conventions
for taking all the chunks of RDF data that are reasonably ascribed to
me, and making them queryable as a unit. I can see two styles here: put
them all in a single named graph (maybe with a tag: or uuid: URI to
avoid confusion), or else keep them in separate graphs and put claims
about those graphs in another graph (maybe the default graph, or a
'table of contents' graph).
Sure, there are contexts in which it is important to know that a certain
triple came from the RDFa in my homepage, versus in my RSS feed, versus
in my foaf.rdf or one extracted from Flickr or loaded off my Friendfeed,
MyBlogLog, or Pownce or Ecademy accounts.
But it would be even more powerful if we could also allow them to be
grouped together, so people could ask the SPARQL store, "what values are
there for foaf:name of the person whose :openid is <http://danbri.org/> ?".
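
As a rough illustration of the 'table of contents' style, here is a toy sketch in plain Python rather than a real SPARQL store. Quads are just (graph, subject, predicate, object) tuples. The graph URIs, the ex:ascribedTo property, and the "Dan Brickley" name value are all assumptions invented for the example; only the foaf: terms, the "Dan A. Brickley" value, and the openid URI come from the discussion above.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_OPENID = "http://xmlns.com/foaf/0.1/openid"

# Hypothetical names: a table-of-contents graph and a made-up grouping property.
TOC = "http://example.org/g/toc"
ASCRIBED_TO = "http://example.org/ns#ascribedTo"

ME = "http://danbri.org/"
HOMEPAGE = "http://example.org/g/homepage-rdfa"
FOAF_FILE = "http://example.org/g/foaf-file"
OTHER = "http://example.org/g/someone-else"

# Each quad is (graph, subject, predicate, object); data is illustrative.
quads = [
    (HOMEPAGE, "_:p1", FOAF_OPENID, ME),
    (HOMEPAGE, "_:p1", FOAF_NAME, "Dan A. Brickley"),
    (FOAF_FILE, "_:p2", FOAF_OPENID, ME),
    (FOAF_FILE, "_:p2", FOAF_NAME, "Dan Brickley"),
    (OTHER, "_:p3", FOAF_OPENID, "http://other.example/"),
    (OTHER, "_:p3", FOAF_NAME, "Someone Else"),
    # The table of contents says which graphs are ascribed to whom.
    (TOC, HOMEPAGE, ASCRIBED_TO, ME),
    (TOC, FOAF_FILE, ASCRIBED_TO, ME),
]


def names_by_graph(quads, openid):
    """Map each foaf:name value for the person with this openid to the
    graphs it was found in, looking only at graphs the TOC ascribes to them."""
    grouped = {s for (g, s, p, o) in quads
               if g == TOC and p == ASCRIBED_TO and o == openid}
    result = {}
    for graph in grouped:
        # Subjects in this graph that carry the openid we are asking about.
        people = {s for (g, s, p, o) in quads
                  if g == graph and p == FOAF_OPENID and o == openid}
        for (g, s, p, o) in quads:
            if g == graph and p == FOAF_NAME and s in people:
                result.setdefault(o, set()).add(graph)
    return result


names = names_by_graph(quads, ME)
```

The point of returning the graph alongside each name is that the data can be queried as a unit without throwing away where each triple came from.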
Now SGAPI has some things that help here. And a similar API could be
exposed by SINDICE or Garlik/qdos too. You can say, "What are the IDs of
the 'other mes' on the Web?".
Here's mine:
http://socialgraph.apis.google.com/otherme?pretty=1&q=danbri.org
(yep there are probably bugs). This is sugar around the existing SGAPI
calls, and is based on the notion of reciprocated claims between profile
pages.
(btw, some of these URIs are indirected identifiers (ie. of documents);
but let's not go off on one of those people-versus-doc URI threads right
now? :)
So.... conclusions?
Knowing where/when to split data into multiple chunks versus serve from
a single (possibly aggregated) source is a timeless problem. Different
people and parties will split their data up in multiple ways; that is
inevitable. We have in SPARQL and the RDF environment a glimmer of hope:
data split up into multiple files can still be tracked down as having
the same real-world source or publisher. And we have a couple of ways we
can represent this: put them in one named graph in a SPARQL store, or
put them in multiple named graphs, but use the URIs of those graphs to
group them.
Plausible?
cheers,
Dan
--
http://danbri.org/