Re: SHACL-based data extraction from a knowledge graph

Andy Seaborne Mon, 28 Mar 2022 12:26:00 -0700

Some inspiration from ShEx may help. The "validation" process is defined by assigning triples to non-overlapping partitions defined by constraints. There can be more than one way to partition the triples in a disjunction or conjunction when there are multiple occurrences of triples matching multiple constraints in the conjunction. These are OR and AND in ShExC.

The process can involve backtracking to search through alternatives (it's like string regex except "bag regex" is assigning triples to bags as the regex passes over). It's also more "closed" by default in style.
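To make the "bag regex" idea concrete, here is a minimal Python sketch of assigning each triple to at most one constraint, with backtracking when an early choice blocks a later triple. It is a toy under stated assumptions, not the ShEx algorithm as specified: the `Constraint` type, the `partition` function and the `ex:` names are all made up for illustration, and triples are plain string tuples.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple

Triple = Tuple[str, str, str]


class Constraint(NamedTuple):
    """Hypothetical triple constraint: a test plus a cardinality range."""
    name: str
    matches: Callable[[Triple], bool]
    min_count: int
    max_count: int


def partition(triples: List[Triple],
              constraints: List[Constraint]) -> Optional[Dict[str, List[Triple]]]:
    """Assign every triple to exactly one bag, or return None if impossible."""
    bags: Dict[str, List[Triple]] = {c.name: [] for c in constraints}

    def search(i: int) -> bool:
        if i == len(triples):
            # All triples placed: check the minimum cardinalities.
            return all(len(bags[c.name]) >= c.min_count for c in constraints)
        t = triples[i]
        for c in constraints:
            if c.matches(t) and len(bags[c.name]) < c.max_count:
                bags[c.name].append(t)      # "use once": t goes in one bag only
                if search(i + 1):
                    return True
                bags[c.name].pop()          # backtrack and try the next bag
        return False                        # "closed": t fits no bag

    return bags if search(0) else None


# Both triples match any_name, but only one matches titled_name,
# so the first (greedy) choice has to be undone.
any_name = Constraint("any_name", lambda t: t[1] == "ex:name", 1, 1)
titled_name = Constraint(
    "titled_name", lambda t: t[1] == "ex:name" and t[2].startswith("Dr"), 1, 1)

triples = [("ex:s", "ex:name", "Dr Who"), ("ex:s", "ex:name", "Who")]
result = partition(triples, [any_name, titled_name])
unmatched = partition(triples + [("ex:s", "ex:age", "900")],
                      [any_name, titled_name])
```

Here `result` puts "Who" in the `any_name` bag and "Dr Who" in the `titled_name` bag, and `unmatched` is `None` because the extra `ex:age` triple fits no bag in this closed-style sketch.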

SHACL does not have this "use once". The sub-shapes of a shape are more independent. But it's only the compositional operations that matter - not the basic triple constraints. And it matters less if what is being extracted is a graph because it's a set.

Some restrictions are necessary - SHACL-SPARQL does say why a constraint matched and can need the rest of the graph.


    Andy

On 10/03/2022 19:19, Florian Kleedorfer wrote:

Not sure how that could work. You could keep a set of triples per focus node validation, add all triples that pass the constraint tests (given that you somehow are able to reconstruct the triple(s) from the data the SHACL logic is working on (which is not triples but, in many instances, sets of nodes - e.g. the result of G.allSP())), and emit the set once you've established that the complete shape is valid for the focus node. I would be very sceptical of adding such a special-interest aspect to code like SHACL that must be relied on and be as fast as can be.

Having said that, I've wanted to modify the way Jena evaluates SHACL recently - maybe a way to extend it would be useful (allowing inheritance or having some kind of callback or somesuch). However, I found that for my use case, the trick with the graph wrapper that observes which triples are pulled by SHACL works just fine and is very simple to implement (the SHACL validation algorithm, if you want to modify it, is not that simple and is easy to mess up).
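
The graph wrapper trick can be sketched in a few lines. The following Python is only an illustration of the idea under stated assumptions, not Jena code: `Graph`, `RecordingGraph` and `has_name` are hypothetical stand-ins for a graph, the observing wrapper, and a validator that reads data only through `find`.

```python
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class Graph:
    """Toy in-memory graph; None acts as a wildcard in find()."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self._triples = set(triples)

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
        for t in self._triples:
            if all(q is None or q == v for q, v in zip((s, p, o), t)):
                yield t


class RecordingGraph:
    """Wrapper that remembers every triple handed out through find()."""

    def __init__(self, base: Graph) -> None:
        self._base = base
        self.seen: Set[Triple] = set()

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
        for t in self._base.find(s, p, o):
            self.seen.add(t)
            yield t


def has_name(graph, focus: str) -> bool:
    """Hypothetical validator: the focus node needs at least one ex:name."""
    return len(list(graph.find(focus, "ex:name", None))) >= 1


data = Graph([
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:age", "42"),
    ("ex:bob", "ex:name", "Bob"),
])
recorder = RecordingGraph(data)
conforms = has_name(recorder, "ex:alice")
# Keep the observed triples only if validation succeeded.
extracted = recorder.seen if conforms else set()
```

The validator is untouched; everything it pulled ends up in `recorder.seen`. Note that a wrapper like this records what was read, which is not necessarily the same as what conformed, so the result is discarded when validation fails.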

On 2022-03-09 14:26, Thomas Francart wrote:
What if VLib.validateShape actually returned the focusNode + Path +
valueNodes that conform to each shape, or emitted them through a listener?
(https://github.com/apache/jena/blob/5ce8c141d425655bcaa9d7567117659e502a7ff1/jena-shacl/src/main/java/org/apache/jena/shacl/validation/VLib.java#L89)
The idea would be to use the Validator as a "filter" that emits the triples
valid according to shapes, so that they can be aggregated in an output
graph.
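
A rough Python sketch of this "validator as filter" idea follows. It does not reflect the actual signature or behaviour of VLib.validateShape: `validate_shape`, the `(path, value test)` shape encoding and the listener callback are assumptions made up for illustration. Triples are buffered per focus node and only emitted once the whole shape is known to conform.

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]
Listener = Callable[[str, str, str], None]
# Hypothetical property shape: (path, test applied to each value node).
PropertyShape = Tuple[str, Callable[[str], bool]]


def validate_shape(data: Set[Triple], focus: str,
                   shape: List[PropertyShape], listener: Listener) -> bool:
    """Validate one focus node; emit (focus, path, value) only on success."""
    buffered: List[Triple] = []
    for path, value_ok in shape:
        values = [o for s, p, o in data if s == focus and p == path]
        # Each path needs at least one value, and every value must pass.
        if not values or not all(value_ok(v) for v in values):
            return False                    # nothing is emitted for this node
        buffered.extend((focus, path, v) for v in values)
    for s, p, o in buffered:
        listener(s, p, o)
    return True


data = {
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:age", "42"),
    ("ex:alice", "ex:email", "alice@example.org"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:bob", "ex:age", "unknown"),
}
person_shape: List[PropertyShape] = [
    ("ex:name", lambda v: len(v) > 0),
    ("ex:age", str.isdigit),
]

output: Set[Triple] = set()                 # the aggregated output graph
results = {
    focus: validate_shape(data, focus, person_shape,
                          lambda s, p, o: output.add((s, p, o)))
    for focus in ("ex:alice", "ex:bob")
}
```

After this runs, `output` holds only the `ex:name` and `ex:age` triples of `ex:alice`: her `ex:email` triple is not covered by the shape, and `ex:bob` fails on `ex:age`, so none of his triples are emitted.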
