On 01/06/2020 21:08, Chris Tomlinson wrote:
Hi Andy,

Not trying to be pedantic below but I’m trying to understand how to think in 
shacl and establish some expectations of the validation process.

If it helps, the general pattern is

Target ->
  (Node shape -> property shape->)*
  Constraint*

On May 31, 2020, at 9:40 AM, Andy Seaborne wrote:

Do we agree that this is a test case?
(one file, data and shapes combined)
Only command line tools needed.

I agree that the combined data and shapes file exhibits differences in report 
results, when interchanging bds:PersonShape and bds:PersonLocalShape.


------------------------
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sh:    <http://www.w3.org/ns/shacl#> .
@prefix bdo:   <http://purl.bdrc.io/ontology/core/> .
@prefix bdr:   <http://purl.bdrc.io/resource/> .
@prefix bds:   <http://purl.bdrc.io/ontology/shapes/core/> .

## Data:

bdr:NM0895CB6787E8AC6E
         a           bdo:PersonName ;
.

bdr:P707  a                  bdo:Person ;
        bdo:personName       bdr:NM0895CB6787E8AC6E ;
.

## Shapes:

#bds:PersonShape           # 2
bds:PersonLocalShape      # 1
    sh:property     bds:PersonShape-personName ;
    sh:targetClass  bdo:Person ;
.

bds:PersonShape-personName
    sh:message      "PersonName is not well-formed, wrong Class or missing rdfs:label"@en ;
    sh:node         bds:PersonNameShape ;
    sh:path         bdo:personName ;
.

bds:PersonNameShape  a  sh:NodeShape ;
    sh:property     bds:PersonNameShape-personNameLabel ;
    sh:targetClass  bdo:PersonName ;
.

bds:PersonNameShape-personNameLabel
    sh:message      ":PersonName must have exactly one rdfs:label"@en ;
    sh:minCount     1 ;
    sh:path         rdfs:label ;
.
------------------------

The difference seems to be that the hash order is different, which affects 
finding targets, combined with the fact that targets are nested:

I see JENA-1907 <https://issues.apache.org/jira/browse/JENA-1907> raises the 
issue; I understand:

If A is processed first as a target then the parser shapes now includes B so 
processing B is skipped.
Note - the effect is only in the number of times constraints are executed, 
once or twice, not whether they are omitted.


to say that, in the current test case w/ the hash order issue, when nesting occurs owing 
to sh:node, then when a violation is found by (A) bds:PersonShape-personName, then the 
validation does not "go deeper" to consider (B) bds:PersonNameShape, by itself. 
W/o sh:node, in bds:PersonShape-personName, then both  bds:PersonShape-personName and 
bds:PersonNameShape are parsed as independent targets and  executed independently.
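
The order effect described above can be sketched as a toy model. This is not Jena's parser code: the shape names mirror the test case, but `REFERS_TO`, `HAS_TARGET` and `parse_targets` are hypothetical names, and the skip rule is only an assumption based on the sentence "if A is processed first as a target then the parsed shapes now include B so processing B is skipped".

```python
# Toy model only: shape names mirror the thread, the parsing logic is assumed.
REFERS_TO = {
    "PersonLocalShape": ["PersonShape-personName"],
    "PersonShape-personName": ["PersonNameShape"],  # via sh:node
    "PersonNameShape": ["PersonNameShape-personNameLabel"],
    "PersonNameShape-personNameLabel": [],
}
HAS_TARGET = {"PersonLocalShape", "PersonNameShape"}


def parse_targets(visit_order):
    """Return the shapes registered as targets for a given visiting order."""
    parsed = set()  # every shape reached so far, by any route
    targets = []    # shapes that end up being validated as targets

    def parse(name):
        if name in parsed:
            return
        parsed.add(name)
        for child in REFERS_TO[name]:
            parse(child)

    for name in visit_order:
        if name not in HAS_TARGET:
            continue
        if name in parsed:
            continue  # the assumed skip: already reached through another shape
        parse(name)
        targets.append(name)
    return targets


a_first = parse_targets(["PersonLocalShape", "PersonNameShape"])
b_first = parse_targets(["PersonNameShape", "PersonLocalShape"])
print(a_first)
print(b_first)
```

In this model, visiting PersonLocalShape first leaves one registered target, so the label constraint is reached only through sh:node; visiting PersonNameShape first leaves two targets, so the constraint is reached twice. A hash-ordered iteration over shape names would pick one order or the other depending on the name.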


bds:PersonLocalShape (target)
-> bds:PersonLocalShape
   -> bds:PersonNameShape (target)
     -> bds:PersonNameShape-personNameLabel

I think the second line above is supposed to be

     -> bds:PersonShape-personName


Both targets match bdr:P707, one by class, one by property.

I understand the NodeShape, bds:PersonLocalShape, matching bdr:P707, meaning, 
to me, that the constraints expressed in that shape need to be evaluated w/ 
P707 being the subject (== focus node). I take this to be “by class”.

I do not understand how NodeShape, bds:PersonNameShape, matches bdr:P707. I 
think bds:PersonNameShape matches bdr:NM0895CB6787E8AC6E because of 
sh:targetClass bdo:PersonName.

1/
bds:PersonShape
  sh:targetClass  bdo:Person
  -> bdr:P707

and it has
  sh:property     bds:PersonShape-personName ;
->
 sh:node         bds:PersonNameShape ;
->
 sh:property     bds:PersonNameShape-personNameLabel ;

2/
bds:PersonNameShape  a  sh:NodeShape ;
    sh:property     bds:PersonNameShape-personNameLabel ;
    sh:targetClass  bdo:PersonName ; <-- which is part of bdr:P707
->  bdr:NM0895CB6787E8AC6E ;

so two ways to get to bds:PersonNameShape-personNameLabel from target declarations.
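
A minimal sketch of the two routes, assuming the shapes graph is reduced to a plain dictionary. It only walks the shape structure, not the data, and `SHAPES` and `routes_to` are illustrative names rather than anything from Jena or TQ.

```python
# The four shapes of the test case, reduced to the links that matter here.
SHAPES = {
    "bds:PersonLocalShape": {
        "targetClass": "bdo:Person",
        "property": ["bds:PersonShape-personName"],
    },
    "bds:PersonShape-personName": {
        "path": "bdo:personName",
        "node": ["bds:PersonNameShape"],
    },
    "bds:PersonNameShape": {
        "targetClass": "bdo:PersonName",
        "property": ["bds:PersonNameShape-personNameLabel"],
    },
    "bds:PersonNameShape-personNameLabel": {"path": "rdfs:label", "minCount": 1},
}


def routes_to(goal, shapes):
    """List every chain of shapes that starts at a target and ends at goal."""
    found = []

    def descend(name, chain):
        chain = chain + [name]
        if name == goal:
            found.append(chain)
            return
        shape = shapes[name]
        for child in shape.get("property", []) + shape.get("node", []):
            if child not in chain:  # guard against recursive shapes
                descend(child, chain)

    for name, shape in shapes.items():
        if "targetClass" in shape:
            descend(name, [])
    return found


routes = routes_to("bds:PersonNameShape-personNameLabel", SHAPES)
for chain in routes:
    print(" -> ".join(chain))
```

The first chain goes through sh:property and sh:node from bds:PersonLocalShape; the second starts directly at bds:PersonNameShape because it carries its own sh:targetClass.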

(try "shacl validate -v")

In case1: you can see the paths:
2 targets.
each with one focus node
leading to the same property shape /PersonNameShape-personNameLabel
which has a constraint.

(I checked the spec and it only says to execute once if the same focus node comes up multiple times for the same target shape, but here there are two different target shapes. TQ shacl agrees.)

F: Focus node
S: Node Shape
P: Property Shape.
C: Constraint

NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonLocalShape]
N: FocusNodes(1): [http://purl.bdrc.io/resource/P707]
  F: http://purl.bdrc.io/resource/P707
  S: NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonLocalShape]
P: PropertyShape[http://purl.bdrc.io/ontology/shapes/core/PersonShape-personName -> <http://purl.bdrc.io/ontology/core/personName>]
  C: http://purl.bdrc.io/resource/P707 :: Node
  S: NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape]
P: PropertyShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape-personNameLabel -> <http://www.w3.org/2000/01/rdf-schema#label>]
  C: http://purl.bdrc.io/resource/NM0895CB6787E8AC6E :: minCount[1]


NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape]
N: FocusNodes(1): [http://purl.bdrc.io/resource/NM0895CB6787E8AC6E]
  F: http://purl.bdrc.io/resource/NM0895CB6787E8AC6E
  S: NodeShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape]
P: PropertyShape[http://purl.bdrc.io/ontology/shapes/core/PersonNameShape-personNameLabel -> <http://www.w3.org/2000/01/rdf-schema#label>]
  C: http://purl.bdrc.io/resource/NM0895CB6787E8AC6E :: minCount[1]




It should execute twice -

I’m not following the referent “it” (but see below, I think I may).

The constraint(s) of bds:PersonShape-personName


My understanding of (target) bds:PersonLocalShape is that for resources of 
targetClass, bdo:Person, check that the constraints expressed in 
bds:PersonShape-personName conform for all objects of bdo:personName where the 
subject of that property path is bdr:P707 (in this case); and

(target) bds:PersonNameShape says that for resources of targetClass, 
bdo:PersonName, check that the constraints in PersonShape-personNameLabel 
conform where the resource is a bdo:PersonName, in this case 
bdr:NM0895CB6787E8AC6E.

I don’t see what’s supposed to execute twice.

/PersonNameShape-personNameLabel

and constraint minCount[1] on NM0895CB6787E8AC6E



but did you mean to do this in the first place? Note while it is a minCount failure, 
because of going through the sh:node, the message is the "wrong Class" one 
because executing via bds:PersonShape-personName makes that the message.

I meant to express that for a bdo:Person there must be at least 1 
bdo:personName - via bds:PersonShape-personName (the test case omits 
sh:minCount 1 in bds:PersonShape-personName);

Yes - because that minCount was not a factor.

I worked through the data, removing each element that did not affect the outcome (3 vs 2), then removed the SPARQL constraint, which is not relevant (it contributed one violation in both cases), leaving 2 vs 1.
That is due to the /PersonNameShape-personNameLabel minCount.


and that a conforming bdo:PersonName must have exactly 1 rdfs:label (the test 
case omits sh:maxCount 1 in bds:PersonNameShape-personNameLabel).

I used "sh:node bds:PersonNameShape" in the declaration for 
bds:PersonShape-personName to identify the particular NodeShape that is intended to validate 
objects of the "sh:path bdo:personName” in this situation.

Perhaps I see what is "supposed to execute twice”.

With the "sh:node bds:PersonNameShape” in bds:PersonShape-personName, then 
bds:PersonNameShape validation must be executed (if it hasn’t already been 
executed); and

since bdr:NM0895CB6787E8AC6E will match bds:PersonNameShape separately by 
considering “sh:targetClass bdo:PersonName” then unless there is some check in 
the validator to see if a (node, shape) pair has already been executed, then 
there will be 2 executions instead of just 1.
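
The counting argument above can be sketched as follows, assuming a hypothetical `dedupe` switch that remembers (node, shape) pairs. Neither validator is claimed to work this way, and elsewhere in this thread the spec is read as requiring at most one execution per focus node only for the same target shape.

```python
# Data of the test case as plain triples.
DATA = {
    ("bdr:P707", "rdf:type", "bdo:Person"),
    ("bdr:P707", "bdo:personName", "bdr:NM0895CB6787E8AC6E"),
    ("bdr:NM0895CB6787E8AC6E", "rdf:type", "bdo:PersonName"),
}


def values(node, path):
    """Value nodes reached from node along a single predicate."""
    return sorted(o for s, p, o in DATA if s == node and p == path)


def instances(cls):
    """Focus nodes selected by sh:targetClass cls (no subclass reasoning)."""
    return sorted(s for s, p, o in DATA if p == "rdf:type" and o == cls)


def validate(dedupe):
    """Log each evaluation of bds:PersonNameShape on a focus node."""
    executed = []  # log of (focus node, shape) evaluations
    seen = set()   # consulted only when dedupe is True

    def check_person_name_shape(node):
        key = (node, "bds:PersonNameShape")
        if dedupe and key in seen:
            return
        seen.add(key)
        executed.append(key)
        # the sh:minCount 1 test on rdfs:label would run here

    # Route 1: bds:PersonLocalShape targets bdo:Person, follows
    # bdo:personName, and applies sh:node to each value node.
    for person in instances("bdo:Person"):
        for name in values(person, "bdo:personName"):
            check_person_name_shape(name)

    # Route 2: bds:PersonNameShape targets bdo:PersonName directly.
    for name in instances("bdo:PersonName"):
        check_person_name_shape(name)

    return executed


without_check = validate(dedupe=False)
with_check = validate(dedupe=True)
print(len(without_check), len(with_check))
```

Without the check the pair (bdr:NM0895CB6787E8AC6E, bds:PersonNameShape) is evaluated twice; with it, once. Which route runs first would then decide whether the generic or the specific message is reported.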


You can see the differences with "shacl print”.

I do see differences w/ “shacl parse” w/ and w/o "sh:node bds:PersonNameShape”. 
I’ll learn to use the tool.

My take away is that I shouldn’t be using sh:node as I have or perhaps I could remove the 
sh:targetClass from bds:PersonNameShape and use sh:node to steer the validation. But I 
guess the latter would lead to the generic "PersonName is not well-formed …” message 
instead of the more specific "PersonName must have exactly one rdfs:label”.

Duplication arises when there is a target that is also referred to by another target through some connection in the shapes graph; sh:node is one way of doing that.

There are other ways to link in a constraint twice like graph linking:

## Data:

:foo a :C ;
   :prop 1 , 2 .


## Shapes:

:A
  sh:targetClass :C ;
  sh:property :P .

:B
  sh:targetClass :C ;
  sh:property :P .

:P
    sh:path :prop ;
    sh:message "Hello world" ;
    sh:maxCount 1 .

2 violations, both with "Hello world", for the same reason.
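
A small sketch of why the linked example yields two results, assuming a hand-rolled maxCount check rather than a real SHACL engine. `TARGET_SHAPES` and `validate` are illustrative names; a real report would give :P as sh:sourceShape for both results, while the tuple below records the target shape only to show where each result came from.

```python
# Data: :foo is a :C with two values for :prop.
DATA = [(":foo", "rdf:type", ":C"), (":foo", ":prop", 1), (":foo", ":prop", 2)]

# One property shape object, linked from two target shapes.
P = {"path": ":prop", "message": "Hello world", "maxCount": 1}
TARGET_SHAPES = {
    ":A": {"targetClass": ":C", "property": [P]},
    ":B": {"targetClass": ":C", "property": [P]},
}


def validate():
    """Run every target shape independently and collect violations."""
    violations = []
    for shape_name, shape in TARGET_SHAPES.items():
        focus_nodes = [
            s for s, p, o in DATA
            if p == "rdf:type" and o == shape["targetClass"]
        ]
        for focus in focus_nodes:
            for prop in shape["property"]:
                count = sum(
                    1 for s, p, o in DATA if s == focus and p == prop["path"]
                )
                if count > prop["maxCount"]:
                    violations.append((shape_name, focus, prop["message"]))
    return violations


violations = validate()
for violation in violations:
    print(violation)
```

Each target shape selects :foo on its own, so the shared maxCount constraint fires once per target shape.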


There seem to be many nuances to shacl.

Anyway thanks very much for the valuable information regarding using shacl,
Chris





    Andy


On 29/05/2020 20:39, Chris Tomlinson wrote:
Hi Andy,
Thank you for the reply. Focussing on just the first question. I have prepared 
small self-contained tests of jena-shacl from 3.14.0 (JS) and TopQuadrant Shacl 
1.3.2 (TQ).
The apps differ only according to differences imposed by the JS and TQ APIs:
     ShaclName_validateGraphJS.java <https://pastebin.com/5382xZeL>
     ShaclName_validateGraphTQ.java <https://pastebin.com/3BxmyhqA>
The DATA_P707.ttl <https://pastebin.com/ugCZfABj> contains the three needed 
triples from the ontology and the bare minimum from the example P707 with two 
different errors in two of the PersonName instances.
The ShapeName_01.ttl <https://pastebin.com/jDqzvPTe> contains the shape 
definitions and all tests are performed only by changing the name on line 9.
The ShaclName_validateGraphJS-results-PersonShape.txt 
<https://pastebin.com/seEfWKNa> shows the results when the JS app is run with 
the name bds:PersonShape and gives the expected results.
The ShaclName_validateGraphJS-results-PersonLocalShape… 
<https://pastebin.com/q1SWMC4H> shows the results when the JS app is run with 
the name bds:PersonLocalShape and gives unexpected results. Namely, the expected 
violation regarding the PersonName which uses skos:prefLabel instead of rdfs:label is 
erroneously reported as conforming.
The ShaclName_validateGraphJS-results-varying.txt 
<https://pastebin.com/CNwnE5kg> shows results for names ranging from “P”, “Pe”, 
“Per” thru “PersonLocal”, “PersonShape” up to “PersonLocalShape”, “PersonLocalShaper”, 
and finally “PersonLocalShapers” for the JS app. In the table a “0” means the 
unexpected result and a “1” means the expected result - 7 names produce unexpected 
results and 20 names produce expected results.
The ShaclName_validateGraphTQ-results.txt <https://pastebin.com/BQnStjVq> shows the 
results when the TQ app is run for any spelling of the name on line 9 of ShapeName_01.ttl 
<https://pastebin.com/jDqzvPTe>. The results are the expected results as with some 
spellings of the name in the JS case. TQ shows no variation owing to the name on line 9 as 
is expected.
(Note: The TQ engine needed to be re-initialized for each use otherwise it 
accumulated results. This is why there is an init of the ShaclSimpleValidator 
at each use in the JS app even though it is not needed. I just wanted to 
produce as much as possible an apples-to-apples comparison of JS and TQ.)
(Note: The TQ report does not include sh:conforms true ; in the results, just: 
[ a       sh:ValidationReport ] . I don’t know if this conforms to the SHACL 
spec but that’s another matter.)
The results from the command line tests show the same as the above.
Running  with line 9 of  ShapeName_01.ttl <https://pastebin.com/jDqzvPTe> set 
to bds:PersonLocalShape:
     shacl v -s ShapeName_01.ttl -d DATA_P707.ttl > PersonLocalShape_JS_Results.ttl 
<https://pastebin.com/M9s859Kc>
produces the unexpected results, namely there is no detail regarding the 
missing rdfs:label on bdr:NM0895CB6787E8AC6E.
However, running with line 9 of  ShapeName_01.ttl 
<https://pastebin.com/jDqzvPTe> set to bds:PersonShape:
     shacl v -s ShapeName_01.ttl -d DATA_P707.ttl > PersonShape_JS_Results.ttl 
<https://pastebin.com/DhBNucpX>
produces the expected results, in that the detail regarding the missing 
rdfs:label on bdr:NM0895CB6787E8AC6E is present among the results.
I did not set up the TQ command line but I think the above TQ results make this 
testing unnecessary.
I think these tests show that there is an unexpected dependence on a shape name 
in the JS library and not in the TQ library. I think this is an error and I can 
open a JIRA issue if appropriate.
A consideration I have is that we want to be able to use the fuseki shacl 
endpoint for some processing and hence need to understand the expected behavior 
of the JS library which is integrated.
Thank you again for your help
Chris
On May 29, 2020, at 6:26 AM, Andy Seaborne wrote:

Question 1: regarding the name  bds:PersonShape at line 9 of ShapeName_01.ttl 
<https://pastebin.com/spJJAsJ3>. With that name the results of running 
ShaclName_validateGraph.java <https://pastebin.com/qvUy2XeB> are as expected, see 
ShapeName-results-PersonShape.txt <https://pastebin.com/Hbk4dj04>.
There are two errors in P707_nameErrs02.ttl <https://pastebin.com/8wZeMiEU> regarding 
bdr:NMC2A097019ABA499F and bdr:NM0895CB6787E8AC6E which are reported in the 
ShapeName-results-PersonShape.txt <https://pastebin.com/Hbk4dj04> file.
However, if the name at line 9 of ShapeName_01.ttl <https://pastebin.com/spJJAsJ3> is 
changed to: bds:PersonLocalShape or bds:Frogs; then detail for bdr:NM0895CB6787E8AC6E 
reports, (see ShapeName-results-PersonLocalShape.txt <https://pastebin.com/f4F9h1E2>):
     [ a sh:ValidationReport ;
       sh:conforms true ] .
instead of:
[ a            sh:ValidationReport ;
   sh:conforms  false ;
   sh:result    [ a                             sh:ValidationResult ;
                  sh:focusNode                  bdr:NM0895CB6787E8AC6E ;
                  sh:resultMessage              ":PersonName must have exactly one rdfs:label"@en ;
                  sh:resultPath                 rdfs:label ;
                  sh:resultSeverity             sh:Violation ;
                  sh:sourceConstraintComponent  sh:MinCountConstraintComponent ;
                  sh:sourceShape                bds:PersonNameShape-personNameLabel
                ]
] .
which is the result with bds:PersonShape at line 9 of ShapeName_01.ttl 
<https://pastebin.com/spJJAsJ3>. In fact changing the name to bds:FrogTarts 
also produces the expected results.
Summary: If the shape name at line 9 of ShapeName_01.ttl 
<https://pastebin.com/spJJAsJ3> is either bds:PersonShape or bds:FrogTarts then 
the results are as expected; while if the shape name is either bds:PersonLocalShape 
or bds:Frogs then one of the detail results disappears and is replaced by  
sh:conforms true.
Why this dependence on the shape name? The shape name isn’t referred to elsewhere in 
ShapeName_01.ttl <https://pastebin.com/spJJAsJ3>.


A way to check is run both Jena Shacl and TQ Shacl and see if they get the same 
violations

I ran the shapes and data in both and get 32 violations (with no ontology added)

and then running with the datafile as P707+ontology.  Now 5 results each.

shacl v -s ShapeName_01.ttl -d P707_nameErrs02.ttl > V1.ttl

tb-shacl -shapesfile ShapeName_01.ttl -datafile P707_nameErrs02.ttl

The name of the shape does not seem to make a difference when run like this.

Have you tried with targetNode to select the node to validate? With a subset of 
the shapes? That would make discussing it much easier, as would 
self-contained data (the ontology isn't particularly small).

Do you have an example which has one target shape and shows differences?


This:

bds:PersonShape-personName
    a               sh:PropertyShape ;
    sh:class        bdo:PersonName ;
    sh:message      "PersonName is not well-formed, wrong Class or missing rdfs:label"@en ;
    sh:minCount     1 ;
    sh:node         bds:PersonNameShape ;
    sh:nodeKind     sh:IRI ;
    sh:path         bdo:personName ;
.

(and others) could be split up into separate shapes, one per constraint (this 
has node kind, node shape, and minCount) which might make the report clearer

bds:PersonNameShape  also has a target - it can get called via two different 
routes.

It's quite complicated to track what's going on.

