Re: rdf questions

pieter-gmail Wed, 23 Dec 2015 12:52:17 -0800

Thanks to @Mike and @Joshua for the information.

Interesting stuff. Admittedly I need to do some homework on RDF to
understand things better.


Having a keen interest in UML, I noticed this OMG spec,
http://www.omg.org/spec/ODM/1.1/PDF/, which includes an RDF metamodel
specification.
A brief scan of the spec shows that the MOF also had some trouble with
the meta modeling of RDF semantics.

Maybe some day PG too will get some attention from the OMG, for what it's
worth.

Thanks
Pieter

On 23/12/2015 20:43, Mike Personick wrote:
> Here is Blazegraph's TP3 RDF*/PG mapping just for reference.  Different
> from the original TP2 mapping, which did not use RDF*.
>
> // vertex (id="a", label="person")
> pg:a rdfs:label "A" .
>
> // vertex property (single or set)
> pg:a pg:key1 "val" .
>
> // vertex property (list)
> pg:a pg:key2 _:b1 .
> _:b1 rdf:value "val" .
> _:b1 rdf:li 0 .
>
> // vertex property property
> <<pg:a pg:key1 "val">> pg:acl "public" .
> <<pg:a pg:key2 _:b1>> pg:acl "private" .
>
> // edge (id="x", from="a", to="b", label="knows")
> pg:a pg:x pg:b .
> <<pg:a pg:x pg:b>> rdfs:label "knows" .
>
> // edge property
> <<pg:a pg:x pg:b>> pg:key "val" .
>
> Here is a link to Olaf and Bryan's original work on RDF*:
>
> http://arxiv.org/abs/1406.3399
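
To make the quoted mapping concrete, here is a minimal Python sketch that
builds the same kind of statements for a toy vertex and edge. It is not
Blazegraph code: the function names are made up, it only assembles strings,
and it assumes the vertex label goes into rdfs:label the same way the edge
label does. List cardinality is left out.

```python
# Hypothetical helpers that emit RDF*-style statements for a tiny property
# graph, following the mapping quoted above. Plain strings only; no RDF
# library is involved.

def lit(value):
    """Render a Python value as a simple literal (strings quoted, numbers bare)."""
    return f'"{value}"' if isinstance(value, str) else str(value)

def vertex_statements(vid, label, props):
    """props maps key -> value (single or set cardinality only)."""
    out = [f"pg:{vid} rdfs:label {lit(label)} ."]
    for key, value in props.items():
        out.append(f"pg:{vid} pg:{key} {lit(value)} .")
    return out

def edge_statements(eid, src, dst, label, props):
    """The edge id is used as the predicate; the label and the edge
    properties are attached to the quoted triple."""
    triple = f"pg:{src} pg:{eid} pg:{dst}"
    out = [f"{triple} .", f"<<{triple}>> rdfs:label {lit(label)} ."]
    for key, value in props.items():
        out.append(f"<<{triple}>> pg:{key} {lit(value)} .")
    return out

if __name__ == "__main__":
    for line in vertex_statements("a", "person", {"key1": "val"}):
        print(line)
    for line in edge_statements("x", "a", "b", "knows", {"key": "val"}):
        print(line)
```

Because the edge id ends up in the predicate position, two edges with
different ids between the same pair of vertices produce two different
triples in this sketch.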
>
>
> On Tue, Dec 22, 2015 at 4:33 PM, Joshua Shinavier <[email protected]> wrote:
>
>> On Tue, Dec 22, 2015 at 8:39 AM, Mike Personick <[email protected]> wrote:
>>
>>> Neither generic RDF -> PG nor PG -> generic RDF can be lossless.
>>>
>>
>> Both can be lossless: you can translate any RDF graph or dataset into a PG
>> graph, and any PG graph into an RDF graph such that you can recover the
>> original graph exactly, having lost no information.  What you can't have is
>> a one-to-one mapping between RDF graphs and PG graphs.
>>
>>
>>
>>
>>> Even with reification you can't solve the problem that PG allows multiple
>>> edge instances with the same (s, p, o).  Same from, to, and edge label.
>>> Olaf and I went back and forth on this point quite a bit and we agreed
>>> that this made the two models irreconcilable without using some specific
>>> RDF schema to keep track of edge ids.
>>
>>
>> You said it: use edge ids.  See the first example from [3].  A dataset
>> alternative I mentioned is to create one named graph per statement, but
>> that would be pretty unusual.
>>
>>
>>
>>
>>>   PG -> RDF cannot be lossless without a
>>> custom RDF schema for edge identifiers.
>>
>>
>> How are edge identifiers different than URIs or blank node IDs?  Their
>> syntax is opaque to either data model, but you do need a property to
>> connect the edge resource with the id resource.  Other vocabulary elements
>> are also needed, as you can't get away from mapping into a schema, in
>> either direction.
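
A small Python sketch of the parallel-edge point, under my own assumptions:
the ex:from, ex:label and ex:to property names are invented for illustration
and are not the PropertyGraphSail vocabulary. It shows two edges with the
same (from, label, to) collapsing into one triple when edges are treated as
statements, and surviving a round trip when each edge is a resource named by
its id.

```python
# Two parallel "knows" edges between the same vertices: (id, from, label, to).
edges = [("e1", "a", "knows", "b"), ("e2", "a", "knows", "b")]

def as_statements(edges):
    """Compact mapping: one (s, p, o) triple per edge. An RDF graph is a
    set of triples, so identical triples collapse into one."""
    return {(src, label, dst) for _eid, src, label, dst in edges}

def as_reified(edges):
    """Edge-reified mapping: each edge becomes a resource named by its id.
    The ex: property names are hypothetical."""
    triples = set()
    for eid, src, label, dst in edges:
        triples |= {
            (eid, "ex:from", src),
            (eid, "ex:label", label),
            (eid, "ex:to", dst),
        }
    return triples

def from_reified(triples):
    """Rebuild the edge list from the reified triples."""
    by_edge = {}
    for eid, prop, value in triples:
        by_edge.setdefault(eid, {})[prop] = value
    return sorted(
        (eid, e["ex:from"], e["ex:label"], e["ex:to"])
        for eid, e in by_edge.items()
    )

if __name__ == "__main__":
    print(len(as_statements(edges)))               # one triple left
    print(from_reified(as_reified(edges)) == sorted(edges))
```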
>>
>>
>>
>>
>>>   There are other things about PG
>>> that force a conversion to RDF to require a RDF/PG schema, such as
>>> Cardinality.list.  RDF lends itself well to Cardinality.single and
>>> Cardinality.set, list not so much.
>>>
>>> The reverse is true as well, RDF -> PG is not lossless either, since
>>> there are many things you can do in RDF that you cannot do with PG.  One
>>> example is edges connecting edges.  Another example is unlimited depth of
>>> property properties with RDF* or old-school reification.
>>>
>>
>> Yes, and that's not even getting into named graphs, which are important for
>> SPARQL and most real applications.
>>
>>
>>
>>> Long and short of it - you can have a feature limited PG implementation
>>> that works with some kinds of generic RDF, or you can have a full
>>> featured PG implementation that only works on RDF graphs conforming to
>>> some specific schema to deal with the impedance mismatches between RDF
>>> and PG.
>>
>>
>> You can have PG views of any RDF data, or RDF views of any PG data, but you
>> can't have it both ways at once because the data models aren't equivalent.
>>
>>
>>
>>
>>>   What
>>> might be nice in the future is decide on a standardized RDF/PG schema so
>>> that each vendor doesn't do it differently.
>>>
>>
>> PropertyGraphSail was probably the first PG-->RDF mapper [1].  It suggests
>> a vocabulary of five terms. SailGraph likewise has a handful of terms (some
>> of which, like "ng" and "kind", could use some tweaking) which could serve
>> as a starting point.
>>
>> Best,
>>
>> Josh
>>
>>
>> [1] https://groups.google.com/forum/#!topic/gremlin-users/Ov91RPkajBI
>>
>>
>>
>>
>>
>>>
>>>
>>> On Mon, Dec 21, 2015 at 10:56 PM, pieter-gmail <[email protected]>
>>> wrote:
>>>
>>>> Thanks for the explanation.
>>>> Cheers
>>>> Pieter
>>>>
>>>> On 21/12/2015 23:59, Joshua Shinavier wrote:
>>>>> Hi Pieter,
>>>>>
>>>>> Yes, it is possible to map RDF graphs, and also RDF datasets
>>>>> (collections of graphs with names), to a property graph data model
>>>>> without loss.  GraphSail [1] had to do this in order to use
>>>>> Blueprints-based DBs as triple stores, querying over the RDF data and
>>>>> retrieving it.  GraphSail uses a mapping almost identical to that of
>>>>> SailGraph [2] (see a schematic on that page), which maps RDF to property
>>>>> graphs.  For the "opposite" of GraphSail and SailGraph (i.e. arbitrary
>>>>> property graphs to RDF), see PropertyGraphSail [3].
>>>>>
>>>>> Olaf Hartig discusses some incompatibilities between PG and RDF in his
>>>>> paper.  Some essential things to keep in mind:
>>>>> *) In mapping between PG and RDF, you are forced to treat edges either
>>>>> as resources or as statements.  If edges are statements, then any edge
>>>>> properties are lost in the PG-->RDF mapping (unless you were to do
>>>>> something a little weird with named graphs: one graph per statement).
>>>>> If edges are vertices, the RDF format is quite verbose and is not
>>>>> symmetrical with a useful RDF-->PG mapping.  PropertyGraphSail supports
>>>>> two styles of mapping: one "verbose" (edge-reified) and the other
>>>>> compact (edges as statements).
>>>>> *) A straightforward RDF(datasets)-->PG mapping treats resources as
>>>>> vertices and statements as edges or as properties depending on the
>>>>> object, but this is more complicated if you want to preserve named graph
>>>>> metadata, as you can't attach metadata to PG properties.  You already
>>>>> have a bit of a problem if you want to do anything graph-like with named
>>>>> graph metadata, as PG is not a hypergraph data model (no edges from
>>>>> edges).
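
The "straightforward" RDF-->PG mapping described in the second point can be
sketched in a few lines of Python. This is only an illustration under my own
assumptions, not GraphSail or SailGraph code: literals are marked by wrapping
them in a ("lit", value) tuple, the function name is made up, and named
graphs are ignored entirely, which is exactly the part the quoted text says
is hard.

```python
def rdf_to_pg(triples):
    """Map (subject, predicate, object) triples to a toy property graph.

    A literal object is written as ("lit", value); any other object is
    treated as a resource. Resources become vertices, statements with a
    resource object become edges, and statements with a literal object
    become vertex properties (set cardinality).
    """
    vertices = {}  # resource -> {property key: set of values}
    edges = []     # (from, label, to)
    for s, p, o in triples:
        vertices.setdefault(s, {})
        if isinstance(o, tuple) and o[0] == "lit":
            vertices[s].setdefault(p, set()).add(o[1])
        else:
            vertices.setdefault(o, {})
            edges.append((s, p, o))
    return vertices, edges

if __name__ == "__main__":
    vertices, edges = rdf_to_pg([
        ("ex:a", "ex:knows", "ex:b"),
        ("ex:a", "ex:name", ("lit", "A")),
    ])
    print(vertices)
    print(edges)
```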
>>>>>
>>>>> Best,
>>>>>
>>>>> Josh
>>>>>
>>>>>
>>>>> [1] https://github.com/tinkerpop/blueprints/wiki/Sail-Ouplementation
>>>>> [2] https://github.com/tinkerpop/blueprints/wiki/Sail-Implementation
>>>>> [3] https://github.com/tinkerpop/blueprints/wiki/PropertyGraphSail-Ouplementation
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Dec 21, 2015 at 11:28 AM, pieter-gmail <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thanks, I have just started on the rdf path.
>>>>>>
>>>>>> When you say the RDF data model and PG data model are not 100% aligned,
>>>>>> does that mean that mapping some RDF models to a PG model loses
>>>>>> information, or just that complexity increases and efficiency drops?
>>>>>>
>>>>>> Does the same hold for the other way around PG model to RDF model?
>>>>>>
>>>>>> I'll have a look at your implementation to understand things better.
>>>>>>
>>>>>> Cheers
>>>>>> Pieter
>>>>>>
>>>>>> On 21/12/2015 18:46, Mike Personick wrote:
>>>>>>> The RDF data model and the PG data model are not 100% aligned.  I know
>>>>>>> there have been a few academic papers on the subject.  For Blazegraph
>>>>>>> I am using a PG schema built on top of raw RDF.  But a raw RDF graph
>>>>>>> would not work with the Blazegraph TP3 interface if it doesn't follow
>>>>>>> the PG schema.
>>>>>>> On Mon, Dec 21, 2015 at 3:22 AM, pieter-gmail <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Found this recently, fyi
>>>>>>>>
>>>>>>>> http://arxiv.org/abs/1409.3288
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> Pieter
>>>>>>>>
>>>>>>>> On 12/12/2015 16:01, pieter wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I know many rdf vendors are TinkerPop providers.
>>>>>>>>>
>>>>>>>>> Can it work in the other direction, i.e. can an rdf dataset be loaded
>>>>>>>>> into a TinkerPop database?
>>>>>>>>> Is it possible to load any rdf dataset into TinkerPop without loss?
>>>>>>>>> Is this something TinkerPop is interested in?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Pieter
>>>>>>>>>
>>>>>>>>>
>>>>
