Re: INSERT resource CBDs into separate graphs
On 12/05/16 23:50, Martynas Jusevičius wrote:
> Thanks Andy.
>
> So my options are limited, I had anticipated that :)
>
> Being able to do it in one SPARQL operation would be a huge advantage
> when it comes to configuration. It's supposed to be generic, so it
> cannot really depend on data, except for some internal convention, so
> the OPTIONAL pattern is not great.
>
> To get a DESCRIBE of each resource I would need to iterate through
> them, which would be expensive if there are many of them?
>
> The function is an interesting idea, I might try it out.
>
> Could this be solved more naturally if SPARQL supported DESCRIBE
> subqueries?

Martynas,

Where are you thinking DESCRIBE subqueries would be allowed?

Much of pattern evaluation is based on "tables", not graph* -> graph.
(SPARQL is an access language: you get data out of graphs.)

Graph-points in the evaluation are fixed, e.g. FROM. There are the
graph ops in SPARQL Update (ADD, MOVE, COPY), which have equivalents in
INSERT-DELETE-WHERE, but you don't get arbitrary subqueries: the only
subquery forms are SELECT and EXISTS (an ASK subquery, with
substitution).

Andy
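For reference, the whole-graph operations Andy mentions (ADD, MOVE, COPY) are shorthand for update requests that can be spelled out by hand. Below is a minimal sketch of the long form of MOVE, following the "similar in operation to" equivalence given in the SPARQL 1.1 Update specification. Both graph IRIs are placeholders, not names from this thread.

```sparql
# Roughly the long form of:  MOVE <urn:example:src> TO <urn:example:dst>
# (both graph IRIs are hypothetical)

# 1. Empty the target graph; SILENT ignores the case where it does not exist.
DROP SILENT GRAPH <urn:example:dst> ;

# 2. Copy every triple from the source graph into the target graph.
INSERT { GRAPH <urn:example:dst> { ?s ?p ?o } }
WHERE  { GRAPH <urn:example:src> { ?s ?p ?o } } ;

# 3. Remove the source graph.
DROP GRAPH <urn:example:src>
```

Leaving out step 3 gives the long form of COPY, and leaving out steps 1 and 3 gives ADD. The shorthand operations take fixed graph names, whereas an INSERT template can use a variable such as ?graph, which is why the per-resource split in this thread has to be written as INSERT-DELETE-WHERE.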
Re: INSERT resource CBDs into separate graphs
Thanks Andy.

So my options are limited, I had anticipated that :)

Being able to do it in one SPARQL operation would be a huge advantage
when it comes to configuration. It's supposed to be generic, so it
cannot really depend on data, except for some internal convention, so
the OPTIONAL pattern is not great.

To get a DESCRIBE of each resource I would need to iterate through
them, which would be expensive if there are many of them?

The function is an interesting idea, I might try it out.

Could this be solved more naturally if SPARQL supported DESCRIBE
subqueries?

Martynas
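The per-resource DESCRIBE iteration raised in this reply would look roughly like this on the query side: one SELECT to list the roots, then one DESCRIBE per root, with each result POSTed to its own named graph over the Graph Store Protocol. The sketch below shows only the per-resource query. The base IRI is made up, and what DESCRIBE returns (for example, whether nested blank nodes are included) is left to the SPARQL service by the specification, so this is an assumption to check against the store in use.

```sparql
BASE <http://example.org/data>   # hypothetical base IRI

# Step 1, run once, lists the roots:
#   PREFIX foaf: <http://xmlns.com/foaf/0.1/>
#   SELECT ?doc WHERE { ?doc a foaf:Document }
#
# Step 2, run once per ?doc returned by step 1
# (shown here for <#resource1> from the sample data):
DESCRIBE <#resource1>
```

That is one query and one HTTP POST per resource, so the cost grows linearly with the number of documents, which is the expense the message is concerned about.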
Re: INSERT resource CBDs into separate graphs
On 10/05/16 23:27, Martynas Jusevičius wrote:
> Hey,
>
> I know this is a SPARQL question not specific to Jena, but I couldn't
> get an answer on public-sparql-dev, so I'll try my luck here:
>
> https://lists.w3.org/Archives/Public/public-sparql-dev/2016AprJun/0010.html
>
> This is a real problem we have, triples end up in wrong graphs when we
> split one large graph into multiple per-resource named graphs.
>
> Martynas

It is always risky to say "can't".

You can't in any practical sense and in the general case within a
single standard SPARQL update request.

If you know something about the data, some things become possible.

The only way to match arbitrary-length paths is via property paths. If
you know the possible properties to traverse,

    (:p1|:p2|...)+

may help. If you know the stopping properties (non-traversals):

    !(:not_p1|:not_p2|...)+

and ultimately:

    (!:DoesNotExist)+

which follows anything, expensively, grabbing a lot you don't want,
making it impractical.

Other ways I can think of looking at:

* Make the OPTIONAL more complicated.
* Find the data, use DESCRIBE to get a model, then use GSP to POST it.
* Use a property function to calculate variables to use in the INSERT:

    INSERT { ?s ?p ?o }
    WHERE {
      ... pattern ...
      (?x ?y) my:calcTriples ( ?s ?p ?o )
    }

Andy

2016AprJun/0010 ==>
> Hey all,
>
> when mapping tabular data to RDF, we end up with multiple resource
> descriptions in one graph. The descriptions are rooted in document
> resources and can have optional nested bnode paths attached:
>
> <#resource1> a foaf:Document ;
>     foaf:name "Resource 1" ;
>     foaf:maker [ a foaf:Person ; foaf:familyName "FamilyName 1" ] .
>
> <#resource2> a foaf:Document ;
>     foaf:name "Resource 2" . # no bnodes attached to this one
>
> <#resource3> a foaf:Document ;
>     foaf:name "Resource 3" ;
>     foaf:maker [ a foaf:Person ; foaf:familyName "FamilyName 3" ] .
>
> As the next thing, we want to store each description in a separate
> named graph. Currently we do something like:
>
> DELETE
> {
>     ?doc ?docP ?docO .
>     ?docO ?thingP ?thingO .
> }
> INSERT
> {
>     GRAPH ?graph {
>         ?doc ?docP ?docO .
>         ?docO ?thingP ?thingO .
>     }
> }
> WHERE
> {
>     {
>         SELECT ?doc ?graph
>         {
>             ?doc a foaf:Document
>             BIND (UUID() AS ?graph)
>         }
>     }
>     ?doc ?docP ?docO .
>     OPTIONAL {
>         ?docO ?thingP ?thingO .
>     }
> }
>
> This works in simple cases, but breaks down as soon as the bounded
> description is "deeper" than the pattern in OPTIONAL.
>
> I was wondering if this can be solved in a general way, for
> descriptions with arbitrarily nested blank nodes? And is it maybe even
> possible to get rid of the pattern checking the resource type?
>
> Something like INSERT { GRAPH ?graph { DESCRIBE ?doc } } :)
>
> Martynas
> graphityhq.com
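To make the known-properties case concrete for the sample data quoted above: the only property leading into a nested blank node there is foaf:maker, so a path over it reaches every subject in a description. The sketch below relies on that assumption and is therefore not generic; any other linking property would have to be added to the path by hand.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

DELETE { ?s ?p ?o }
INSERT { GRAPH ?graph { ?s ?p ?o } }
WHERE
{
  # One fresh graph name per document, as in the original update.
  { SELECT ?doc ?graph
    {
      ?doc a foaf:Document
      BIND (UUID() AS ?graph)
    }
  }
  # Zero or more foaf:maker steps: ?s is ?doc itself or any node
  # reachable from it, so one pattern covers every nesting depth.
  ?doc foaf:maker* ?s .
  ?s ?p ?o .
  # Keep only the root and blank nodes as subjects. This constrains the
  # end of the path only; intermediate nodes may still be IRIs.
  FILTER (?s = ?doc || isBlank(?s))
}
```

Replacing foaf:maker* with (!<urn:example:DoesNotExist>)*, where the IRI stands for any property that does not occur in the data, gives the follow-anything variant. It needs no knowledge of the data, but it also walks through links to other documents, which is the expense and over-matching described in the message.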
INSERT resource CBDs into separate graphs
Hey,

I know this is a SPARQL question not specific to Jena, but I couldn't
get an answer on public-sparql-dev, so I'll try my luck here:

https://lists.w3.org/Archives/Public/public-sparql-dev/2016AprJun/0010.html

This is a real problem we have, triples end up in wrong graphs when we
split one large graph into multiple per-resource named graphs.

Martynas