Hi Martynas,

On Mon, 2011-05-02 at 10:51 +0200, Martynas Jusevicius wrote: 
> Hey list,
> 
> I want to improve provenance of RDF data in my app, and I'm mostly
> looking at named graphs since reification seems not be used that much.
> 
> One point of view is logical divisions:
> - read-only core ontologies
> - user ontologies
> - user instance data
> I could make a named graph for each of them.
> 
> The other is that I'd like to have metadata about every added/updated
> triple so the app could say "User X updated resource Y with value of Z
> on date W". In this case basically every triple should have its own
> unique URI - i.e. be a named graph with a single statement?
> 
> It seems that I could implement either the first case or the second
> with named graphs, but not both, which I would prefer.
> How would you go about it - has anyone worked on use cases like this?
> Should I still consider reification - and maybe use it together with
> named graphs?

I guess it depends on how you want to manage the data, whether you need
to limit queries to particular sub-categories of data, and just how much
data you are talking about.

In principle you could have a separate named graph both for each
ontology and for each atomic addition of user triples plus a separate
metadata graph. If atomic additions are made one triple at a time that
would be a lot of named graphs but it is possible.
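
As a rough illustration of that layout, here is a minimal sketch in plain
Python with no RDF library. The quad store is just a set of tuples, and the
example.org URIs, the addedBy/addedOn property names and the helper functions
are all made up for illustration, not taken from any real vocabulary or API.

```python
import itertools
from datetime import datetime, timezone

# Hypothetical URIs, for illustration only.
EX = "http://example.org/"
META_GRAPH = EX + "graph/metadata"

# Toy quad store: each entry is (graph, subject, predicate, object).
quads = set()
_counter = itertools.count(1)


def add_user_triples(triples, user, when):
    """Put one atomic addition in its own named graph and describe
    that graph in the separate metadata graph."""
    graph = f"{EX}graph/update/{next(_counter)}"
    for s, p, o in triples:
        quads.add((graph, s, p, o))
    # Provenance is stated about the graph URI, not about each triple.
    quads.add((META_GRAPH, graph, EX + "addedBy", user))
    quads.add((META_GRAPH, graph, EX + "addedOn", when.isoformat()))
    return graph


def who_added(s, p, o):
    """Return the users recorded against every graph containing the triple."""
    graphs = {g for (g, s2, p2, o2) in quads if (s2, p2, o2) == (s, p, o)}
    return sorted(
        obj
        for (g, subj, pred, obj) in quads
        if g == META_GRAPH and subj in graphs and pred == EX + "addedBy"
    )


g = add_user_triples(
    [(EX + "Y", EX + "value", "Z")],
    user=EX + "user/X",
    when=datetime(2011, 5, 2, tzinfo=timezone.utc),
)
print(who_added(EX + "Y", EX + "value", "Z"))
```

The core and user ontologies would each get one fixed graph URI in the same
store, so the logical divisions and the per-update provenance can coexist.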

If your updates include retractions then that gets messier, in that you
have to remove the old graph as well as add to the new one. Still
possible I guess.

FWIW the last time I did serious work with triple level provenance
(which was before named graphs were so much in vogue) I worked it with
just two graphs - one for the asserted data and one for the metadata.
The metadata graph could have used the reification vocabulary but I
found it easier to generate a hash to identify the triple in the data
graph and then use the hash (as a UUID URI) as the subject of provenance
triples in the metadata graph. That's isomorphic to using reification
but is more compact and easier to query.
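
A minimal sketch of that idea, in plain Python rather than the code I
actually used: the property names, the example.org URIs and the helper
functions are hypothetical, and it assumes each term is already a string in
an N-Triples-like form so that joining with spaces is unambiguous. It uses
a name-based UUID (uuid5) as the hash, which is one way to get a UUID URI
from the triple; other hash schemes would do just as well.

```python
import uuid

# Hypothetical namespace, for illustration only.
EX = "http://example.org/"

data = set()  # asserted triples: (subject, predicate, object)
meta = set()  # provenance triples whose subject is the triple's hash URI


def triple_id(s, p, o):
    """Derive a stable urn:uuid: identifier from the triple's content.
    Assumes terms are pre-serialised strings (an assumption of this sketch)."""
    return uuid.uuid5(uuid.NAMESPACE_URL, " ".join((s, p, o))).urn


def assert_triple(s, p, o, user, when):
    """Add the triple to the data graph and its provenance to the metadata graph."""
    data.add((s, p, o))
    tid = triple_id(s, p, o)
    meta.add((tid, EX + "updatedBy", user))
    meta.add((tid, EX + "updatedOn", when))
    return tid


def provenance(s, p, o):
    """Recompute the hash and look up what the metadata graph says about it."""
    tid = triple_id(s, p, o)
    return {p2: o2 for (s2, p2, o2) in meta if s2 == tid}


tid = assert_triple(EX + "Y", EX + "value", '"Z"', EX + "user/X", "2011-05-02")
print(tid)
print(provenance(EX + "Y", EX + "value", '"Z"'))
```

Because the identifier is computed from the triple itself, a lookup needs no
join through rdf:subject/rdf:predicate/rdf:object, which is where the
compactness comes from.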

Dave

