Hi Andy,

I'm not trying to be pedantic below, but I am trying to understand how to think in SHACL and to establish some expectations of the validation process.

> On May 31, 2020, at 9:40 AM, Andy Seaborne <[email protected]> wrote:
>
> Do we agree that this is a test case?
> (one file, data and shapes combined)
> Only command line tools needed.

I agree that the combined data and shapes file exhibits differences in report results when interchanging bds:PersonShape and bds:PersonLocalShape.

> ------------------------
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix sh: <http://www.w3.org/ns/shacl#> .
> @prefix bdo: <http://purl.bdrc.io/ontology/core/> .
> @prefix bdr: <http://purl.bdrc.io/resource/> .
> @prefix bds: <http://purl.bdrc.io/ontology/shapes/core/> .
>
> ## Data:
>
> bdr:NM0895CB6787E8AC6E
>     a bdo:PersonName ;
> .
>
> bdr:P707 a bdo:Person ;
>     bdo:personName bdr:NM0895CB6787E8AC6E ;
> .
>
> ## Shapes:
>
> #bds:PersonShape # 2
> bds:PersonLocalShape # 1
>     sh:property bds:PersonShape-personName ;
>     sh:targetClass bdo:Person ;
> .
>
> bds:PersonShape-personName
>     sh:message "PersonName is not well-formed, wrong Class or missing rdfs:label"@en ;
>     sh:node bds:PersonNameShape ;
>     sh:path bdo:personName ;
> .
>
> bds:PersonNameShape a sh:NodeShape ;
>     sh:property bds:PersonNameShape-personNameLabel ;
>     sh:targetClass bdo:PersonName ;
> .
>
> bds:PersonNameShape-personNameLabel
>     sh:message ":PersonName must have exactly one rdfs:label"@en ;
>     sh:minCount 1 ;
>     sh:path rdfs:label ;
> .
> ------------------------
>
> The difference seems to be that the hash order is different and it affects
> finding targets, combined with the fact that targets are nested:

I see JENA-1907 <https://issues.apache.org/jira/browse/JENA-1907> raises the issue; I understand:

> If A is processed first as a target then the parser shapes now includes B so
> processing B is skipped.
> Note - the effect is only in the number of times constraints are executed,
> once or twice, not whether they are omitted.

to say that, in the current test case w/ the hash order issue, when nesting occurs owing to sh:node and a violation is found by (A) bds:PersonShape-personName, the validation does not "go deeper" to consider (B) bds:PersonNameShape by itself. W/o sh:node in bds:PersonShape-personName, both bds:PersonShape-personName and bds:PersonNameShape are parsed as independent targets and executed independently.

> bds:PersonLocalShape (target)
> -> bds:PersonLocalShape
> -> bds:PersonNameShape (target)
> -> bds:PersonNameShape-personNameLabel

I think the second line above is supposed to be

    -> bds:PersonShape-personName

> Both targets match bdr:P707, one by class, one by property.

I understand the NodeShape bds:PersonLocalShape matching bdr:P707, meaning, to me, that the constraints expressed in that shape need to be evaluated w/ P707 being the subject (== focus node). I take this to be "by class".

I do not understand how the NodeShape bds:PersonNameShape matches bdr:P707. I think bds:PersonNameShape matches bdr:NM0895CB6787E8AC6E because of sh:targetClass bdo:PersonName.

> It should execute twice -

I'm not following the referent "it" (but see below, I think I may). My understanding of (target) bds:PersonLocalShape is that, for resources of targetClass bdo:Person, it checks that the constraints expressed in bds:PersonShape-personName conform for all objects of bdo:personName where the subject of that property path is bdr:P707 (in this case); and (target) bds:PersonNameShape says that, for resources of targetClass bdo:PersonName, it checks that the constraints in bds:PersonNameShape-personNameLabel conform where the resource is a bdo:PersonName, in this case bdr:NM0895CB6787E8AC6E. I don't see what's supposed to execute twice.

> but did you mean to do this in the first place? Note while it is a minCount
> failure, because of going through the sh:node, the message is the "wrong
> Class" one because executing via bds:PersonShape-personName makes that the
> message.

I meant to express that for a bdo:Person there must be at least 1 bdo:personName, via bds:PersonShape-personName (the test case omits sh:minCount 1 in bds:PersonShape-personName); and that a conforming bdo:PersonName must have exactly 1 rdfs:label (the test case omits sh:maxCount 1 in bds:PersonNameShape-personNameLabel). I used "sh:node bds:PersonNameShape" in the declaration for bds:PersonShape-personName to identify the particular NodeShape that is intended to validate objects of the "sh:path bdo:personName" in this situation.

Perhaps I see what is "supposed to execute twice". With the "sh:node bds:PersonNameShape" in bds:PersonShape-personName, bds:PersonNameShape validation must be executed (if it hasn't already been executed); and since bdr:NM0895CB6787E8AC6E will match bds:PersonNameShape separately by considering "sh:targetClass bdo:PersonName", then unless there is some check in the validator to see if a (node, shape) pair has already been executed, there will be 2 executions instead of just 1.

> You can see the differences with "shacl print".

I do see differences w/ "shacl parse" w/ and w/o "sh:node bds:PersonNameShape". I'll learn to use the tool.

My takeaway is that I shouldn't be using sh:node as I have, or perhaps I could remove the sh:targetClass from bds:PersonNameShape and use sh:node to steer the validation. But I guess the latter would lead to the generic "PersonName is not well-formed …" message instead of the more specific "PersonName must have exactly one rdfs:label". There seem to be many nuances to SHACL.

Anyway, thanks very much for the valuable information regarding using SHACL,
Chris

>
> Andy
>
>
> On 29/05/2020 20:39, Chris Tomlinson wrote:
>> Hi Andy,
>> Thank you for the reply. Focussing on just the first question. I have
>> prepared small self-contained tests of jena-shacl from 3.14.0 (JS) and
>> TopQuadrant Shacl 1.3.2 (TQ).
>> The apps differ only according to differences imposed by the JS and TQ APIs:
>> ShaclName_validateGraphJS.java <https://pastebin.com/5382xZeL>
>> ShaclName_validateGraphTQ.java <https://pastebin.com/3BxmyhqA>
>> The DATA_P707.ttl <https://pastebin.com/ugCZfABj> contains the three needed
>> triples from the ontology and the bare minimum from the example P707 with
>> two different errors in two of the PersonName instances.
>> The ShapeName_01.ttl <https://pastebin.com/jDqzvPTe> contains the shape
>> definitions and all tests are performed only by changing the name on line 9.
>> The ShaclName_validateGraphJS-results-PersonShape.txt
>> <https://pastebin.com/seEfWKNa> shows the results when the JS app is run
>> with the name bds:PersonShape and gives the expected results.
>> The ShaclName_validateGraphJS-results-PersonLocalShape…
>> <https://pastebin.com/q1SWMC4H> shows the results when the JS app is run
>> with the name bds:PersonLocalShape and gives unexpected results. Namely, the
>> expected violation regarding the PersonName which uses skos:prefLabel
>> instead of rdfs:label is erroneously reported as conforming.
>> The ShaclName_validateGraphJS-results-varying.txt
>> <https://pastebin.com/CNwnE5kg> shows results for names ranging from "P",
>> "Pe", "Per" thru "PersonLocal", "PersonShape" up to "PersonLocalShape",
>> "PersonLocalShaper", and finally "PersonLocalShapers" for the JS app. In the
>> table a "0" means the unexpected result and a "1" means the expected result
>> - 7 names produce unexpected results and 20 names produce expected results.
>> The ShaclName_validateGraphTQ-results.txt <https://pastebin.com/BQnStjVq>
>> shows the results when the TQ app is run for any spelling of the name on
>> line 9 of ShapeName_01.ttl <https://pastebin.com/jDqzvPTe>. The results are
>> the expected results as with some spellings of the name in the JS case. TQ
>> shows no variation owing to the name on line 9 as is expected.
>> (Note: The TQ engine needed to be re-initialized for each use otherwise it
>> accumulated results. This is why there is an init of the
>> ShaclSimpleValidator at each use in the JS app even though it is not needed.
>> I just wanted to produce as much as possible an apples-to-apples comparison
>> of JS and TQ.)
>> (Note: The TQ report does not include sh:conforms true ; in the results,
>> just: [ a sh:ValidationReport ] . I don't know if this conforms to the
>> SHACL spec but that's another matter.)
>> The results from the command line tests show the same as the above.
>> Running with line 9 of ShapeName_01.ttl <https://pastebin.com/jDqzvPTe>
>> set to bds:PersonLocalShape:
>> shacl v -s ShapeName_01.ttl -d DATA_P707.ttl >
>> PersonLocalShape_JS_Results.ttl <https://pastebin.com/M9s859Kc>
>> produces the unexpected results, namely there is no detail regarding the
>> missing rdfs:label on bdr:NM0895CB6787E8AC6E.
>> However, running with line 9 of ShapeName_01.ttl
>> <https://pastebin.com/jDqzvPTe> set to bds:PersonShape:
>> shacl v -s ShapeName_01.ttl -d DATA_P707.ttl >
>> PersonShape_JS_Results.ttl <https://pastebin.com/DhBNucpX>
>> produces the expected results, in that the detail regarding the missing
>> rdfs:label on bdr:NM0895CB6787E8AC6E is present among the results.
>> I did not set up the TQ command line but I think the above TQ results make
>> this testing unnecessary.
>> I think these tests show that there is an unexpected dependence on a shape
>> name in the JS library and not in the TQ library. I think this is an error
>> and I can open a JIRA issue if appropriate.
>> A consideration I have is that we want to be able to use the fuseki shacl
>> endpoint for some processing and hence need to understand the expected
>> behavior of the JS library which is integrated.
>> Thank you again for your help
>> Chris
>>> On May 29, 2020, at 6:26 AM, Andy Seaborne <[email protected]> wrote:
>>>
>>>> Question 1: regarding the name bds:PersonShape at line 9 of
>>>> ShapeName_01.ttl <https://pastebin.com/spJJAsJ3>. With that name the
>>>> results of running ShaclName_validateGraph.java
>>>> <https://pastebin.com/qvUy2XeB> are as expected, see
>>>> ShapeName-results-PersonShape.txt <https://pastebin.com/Hbk4dj04>.
>>>> There are two errors in P707_nameErrs02.ttl
>>>> <https://pastebin.com/8wZeMiEU> regarding bdr:NMC2A097019ABA499F and
>>>> bdr:NM0895CB6787E8AC6E which are reported in the
>>>> ShapeName-results-PersonShape.txt <https://pastebin.com/Hbk4dj04> file.
>>>> However, if the name at line 9 of ShapeName_01.ttl
>>>> <https://pastebin.com/spJJAsJ3> is changed to: bds:PersonLocalShape or
>>>> bds:Frogs; then detail for bdr:NM0895CB6787E8AC6E reports, (see
>>>> ShapeName-results-PersonLocalShape.txt <https://pastebin.com/f4F9h1E2>):
>>>> [ a sh:ValidationReport ;
>>>>   sh:conforms true ] .
>>>> instead of:
>>>> [ a sh:ValidationReport ;
>>>>   sh:conforms false ;
>>>>   sh:result [ a sh:ValidationResult ;
>>>>               sh:focusNode bdr:NM0895CB6787E8AC6E ;
>>>>               sh:resultMessage ":PersonName must have exactly one rdfs:label"@en ;
>>>>               sh:resultPath rdfs:label ;
>>>>               sh:resultSeverity sh:Violation ;
>>>>               sh:sourceConstraintComponent sh:MinCountConstraintComponent ;
>>>>               sh:sourceShape bds:PersonNameShape-personNameLabel
>>>>             ]
>>>> ] .
>>>> which is the result with bds:PersonShape at line 9 of ShapeName_01.ttl
>>>> <https://pastebin.com/spJJAsJ3>. In fact changing the name to
>>>> bds:FrogTarts also produces the expected results.
>>>> Summary: If the shape name at line 9 of ShapeName_01.ttl
>>>> <https://pastebin.com/spJJAsJ3> is either bds:PersonShape or bds:FrogTarts
>>>> then the results are as expected; while if the shape name is either
>>>> bds:PersonLocalShape or bds:Frogs then one of the detail results
>>>> disappears and is replaced by sh:conforms true.
>>>> Why this dependence on the shape name? The shape name isn't referred to
>>>> elsewhere in ShapeName_01.ttl <https://pastebin.com/spJJAsJ3>.
>>>
>>>
>>> A way to check is run both Jena Shacl and TQ Shacl and see if they get the
>>> same violations
>>>
>>> I ran the shapes and data in both and get 32 violations (with no ontology
>>> added)
>>>
>>> and then running with the datafile as P707+ontology. Now 5 results each.
>>>
>>> shacl v -s ShapeName_01.ttl -d P707_nameErrs02.ttl > V1.ttl
>>>
>>> tb-shacl -shapesfile ShapeName_01.ttl -datafile P707_nameErrs02.ttl
>>>
>>> The name of the shape does not seem to make a difference when run like this.
>>>
>>> Have you tried with targetNode to select the node to validate? With a
>>> subset of the shapes? That would make discussing it much easier as would a
>>> self-contained data (the ontology isn't particularly small).
>>>
>>> Do you have an example which has one target shape and shows differences?
>>>
>>>
>>> This:
>>>
>>> bds:PersonShape-personName
>>>     a sh:PropertyShape ;
>>>     sh:class bdo:PersonName ;
>>>     sh:message "PersonName is not well-formed, wrong Class or missing rdfs:label"@en ;
>>>     sh:minCount 1 ;
>>>     sh:node bds:PersonNameShape ;
>>>     sh:nodeKind sh:IRI ;
>>>     sh:path bdo:personName ;
>>> .
>>>
>>> (and others) could be split up into separate shapes, one per constraint
>>> (this has node kind, node shape, and minCount) which might make the report
>>> clearer
>>>
>>> bds:PersonNameShape also has a target - it can get called via two
>>> different routes.
>>>
>>> It's quite complicated to track what's going on.
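
For reference, here is a minimal sketch of the one-shape-per-constraint split suggested in the oldest quoted message above. It is only an illustration, not a tested fix for the behaviour discussed in this thread. The four property shape names ending in -count, -kind, -class and -node, and all of the sh:message texts, are hypothetical; the constraints themselves are the ones in the quoted bds:PersonShape-personName.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix bdo: <http://purl.bdrc.io/ontology/core/> .
@prefix bds: <http://purl.bdrc.io/ontology/shapes/core/> .

# Hypothetical split: one property shape per constraint, each with its own
# message, so each validation result names the constraint that failed.
bds:PersonShape a sh:NodeShape ;
    sh:targetClass bdo:Person ;
    sh:property bds:PersonShape-personName-count ,
                bds:PersonShape-personName-kind ,
                bds:PersonShape-personName-class ,
                bds:PersonShape-personName-node ;
.

bds:PersonShape-personName-count a sh:PropertyShape ;
    sh:path bdo:personName ;
    sh:minCount 1 ;
    sh:message "Person must have at least one bdo:personName"@en ;   # illustrative text
.

bds:PersonShape-personName-kind a sh:PropertyShape ;
    sh:path bdo:personName ;
    sh:nodeKind sh:IRI ;
    sh:message "bdo:personName value must be an IRI"@en ;            # illustrative text
.

bds:PersonShape-personName-class a sh:PropertyShape ;
    sh:path bdo:personName ;
    sh:class bdo:PersonName ;
    sh:message "bdo:personName value must be a bdo:PersonName"@en ;  # illustrative text
.

bds:PersonShape-personName-node a sh:PropertyShape ;
    sh:path bdo:personName ;
    sh:node bds:PersonNameShape ;
    sh:message "bdo:personName value does not conform to bds:PersonNameShape"@en ;  # illustrative text
.
```

Whether bds:PersonNameShape should also keep its own sh:targetClass bdo:PersonName is the open question in the thread. With both the target and the sh:node reference in place, the shape can still be reached via two different routes.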
