Re: SHACL Endpoint questions

Chris Tomlinson Thu, 14 May 2020 11:06:28 -0700

Hi Andy,

I want to validate a named graph in the context of the union graph. I don’t 
want to validate the union graph. The union graph has information in it such as 
the ontology which defines subClass and subProperty relations needed to 
successfully validate a target graph such as http://purl.bdrc.io/graph/P707.


Also, P707 refers to a parent and teacher, P705, which needs to be verified as 
meeting the minimum criteria for a Person.

I thought that validate(shapes, graph, node) should accomplish this if graph = 
the dataset graph which contains all these additional bits of information.

That’s why the endpoint is interesting, since in principle it gives access to 
SHACL inside Fuseki, where the entire dataset is available, without 
having to write an independent bit of code that we add to our Fuseki 
deployments.

I hope this clarifies what I’m wanting to accomplish. I probably don’t 
understand what validate(shapes, graph, node) is supposed to do.

Thanks for your patience,
Chris


> On May 14, 2020, at 12:34 PM, Andy Seaborne wrote:
> 
> ?graph names the graph to be validated.
> 
> ?graph can be a URI of a named graph in the dataset
> 
> or ?graph=default for the default graph (note: this is the storage default 
> graph, not the union default graph)
> 
> or ?graph=union for the union of all named graphs which is what I think 
> you're asking for.
> 
> (This is the org.apache.jena.fuseki.servlets.SHACL_Validation servlet.)
> 
> 
> On 14/05/2020 15:40, Chris Tomlinson wrote:
>> Hi Andy,
>> Thanks very much for the shacl guidance. The use of sh:targetSubjectsOf is 
>> quite helpful. I replaced the bdo:personName w/ bdo:isRoot which must be 
>> present on any Entity resource so that if a Work or Place or other entity is 
>> checked it will fail if it isn’t a bdo:Person.
>> This still fails in the event that there is no bdo:isRoot so in some way 
>> that negative needs also to be caught to weed out really malformed graphs.
>> I still have a question about the shacl endpoint:
>>     Is the ?graph parameter validated in the context of the entire dataset 
>> specified in the endpoint URL or just the named graph itself?
>> It appears to be just the named graph itself so is the same as running the 
>> shacl command outside of Fuseki.
> 
> Yes - as above, it can be the union.
> 
>> We are wanting a validation of the named graph against the entire (union) 
>> dataset graph
> 
> Not sure what "against" means here. The validate request has a shapes graph 
> and a data graph, and the data graph can be the union graph of the dataset.
> 
> To direct the validation to a certain node, use sh:targetNode.
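> 
> A minimal sketch of such a shape, assuming the data graph is the union graph 
> (?graph=union) so that the ontology's rdfs:subClassOf triples are visible. 
> The shape name ex:P707IsPersonShape and the ex: prefix are hypothetical; 
> bdr:P707 and bdo:Person are taken from this thread:
> 
> PREFIX sh:   <http://www.w3.org/ns/shacl#>
> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
> PREFIX bdr:  <http://purl.bdrc.io/resource/>
> PREFIX ex:   <http://example/>
> 
> ex:P707IsPersonShape
>   a sh:NodeShape ;
>   # Only this node becomes $this; other subjects in the data are not checked.
>   sh:targetNode bdr:P707 ;
>   sh:sparql [
>     a sh:SPARQLConstraint ;
>     sh:message "Focus node is not a bdo:Person or an instance of a subclass" ;
>     sh:prefixes [
>       sh:declare [
>         sh:prefix "rdf" ;
>         sh:namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#"^^xsd:anyURI ;
>       ] , [
>         sh:prefix "rdfs" ;
>         sh:namespace "http://www.w3.org/2000/01/rdf-schema#"^^xsd:anyURI ;
>       ] , [
>         sh:prefix "bdo" ;
>         sh:namespace "http://purl.bdrc.io/ontology/core/"^^xsd:anyURI ;
>       ]
>     ] ;
>     # A row is returned (a violation) when no type of $this is bdo:Person 
>     # or reaches bdo:Person through zero or more rdfs:subClassOf steps.
>     sh:select """
>       SELECT $this
>       WHERE {
>         FILTER NOT EXISTS { $this rdf:type/rdfs:subClassOf* bdo:Person }
>       }
>     """ ;
>   ] .
> 
> The property path only finds subclass links that are present in the data 
> graph being validated, which is why the choice of ?graph matters here.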
> 
>> which will have sufficient information about subClassOf* and external 
>> resources like P705 without entailing a validation of all nodes reachable 
>> from triples in the ?graph named graph. This might be similar to:
>>     validator.validate(shapes, dsg, node)
>> where node would be the root resource URI like, 
>> <http://purl.bdrc.io/resource/P707>.
>> Is this something that needs an issue raised and a bit of extension of the 
>> endpoint or is there another way to get this kind of behavior through the 
>> endpoint?
>> Thank you very much for your help,
>> Chris
>>> On May 13, 2020, at 12:16 PM, Andy Seaborne wrote:
>>> 
>>> 
>>> 
>>> On 13/05/2020 16:03, Chris Tomlinson wrote:
>>>> Hi Andy,
>>>> Thank you for the reply. I can get your example to work as you indicate, 
>>>> but have some questions:
>>>> 1) I went through the latest SHACL draft 
>>>> <https://w3c.github.io/data-shapes/shacl/> and I cannot find how to know 
>>>> that sh:targetNode always executes. It’s also not clear to me what it 
>>>> means to execute. I thought that sh:targetNode X was a way to restrict a 
>>>> shape to X in the data graph, whatever X might be.
>>> 
>>> It sets the target node to X and that becomes $this.
>>> 
>>> It does not say the target has to be in the graph.
>>> 
>>> The tests use this idiom quite a lot.
>>> 
>>> This matters because in some places the spec is not complete and without 
>>> some light reverse engineering from the tests, I'd not have been able to 
>>> implement some of the SPARQL functionality (particularly SPARQL components, 
>>> not the SPARQL constraints we're talking about here).
>>> 
>>> Also, RDF graphs do not have a formally defined set of nodes - they are a 
>>> set of edges and any nodes you want can be used in triples.
>>> 
>>>> 2) What I’m trying to do is validate that a resource like 
>>>> http://purl.bdrc.io/resource/P707 is a 
>>>> Person, which at a minimum means that:
>>>>     <http://purl.bdrc.io/resource/P707>  a  <http://purl.bdrc.io/ontology/core/Person> .
>>>> is present in the graph http://purl.bdrc.io/graph/P707. The PersonShape 
>>>> <https://github.com/buda-base/editor-templates/blob/master/templates/core/person.shapes.ttl>
>>>>  has:
>>>>     sh:targetClass bdo:Person
>>>> but that only serves to say that PersonShape only applies to resources of 
>>>> class bdo:Person and if there are none, then there are no violations which 
>>>> means I can try to validate a bibliographic element such as 
>>>> http://purl.bdrc.io/resource/W1FPL1 
>>>> which is of class bdo:ImageInstance but of course that still sh:conforms 
>>>> true since bds:PersonShape doesn’t apply and hence there aren’t any 
>>>> violations. (to see the resources, use 
>>>> http://ldspdi-dev.bdrc.io/resource/W1FPL1.ttl, for example).
>>>> The use case is: a client submits a graph of a resource and claims it to 
>>>> be a bdo:Person or a subClassOf* it; and we want to validate the graph as 
>>>> a bdo:Person and so want to get the result “false” for bdr:W1FPL1 instead 
>>>> of “true”.
>>>> It’s our intent to use a tool like shacl for this top-level task as well 
>>>> as validating the details like having at least one name, a gender, and so 
>>>> on.
>>>> I tried using something like your example:
>>>> bds:CheckPersonClassShape  a      sh:NodeShape ;
>>>>     rdfs:label      "Check Person Class Shape"@en ;
>>>>     sh:targetNode "Check Class" ;
>>>>     sh:sparql [
>>>>       a sh:SPARQLConstraint ;
>>>>       sh:prefixes [
>>>>         sh:declare [
>>>>           sh:prefix "rdf" ;
>>>>           sh:namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#" ;
>>>>         ] , [
>>>>           sh:prefix "bdo" ;
>>>>           sh:namespace "http://purl.bdrc.io/ontology/core/" ;
>>>>         ]
>>>>       ] ;
>>>>       sh:select """
>>>>         select $this (rdf:type as ?path) (bdo:Person as ?value)
>>>>        where {
>>>>            filter not exists { $this ?path ?value }
>>>>        }
>>>>      """ ;
>>> 
>>> That query does not look right.
>>> 
>>> 1/ $this is the targetNode
>>> 
>>> $this is "Check Class" - the shape needs to find the thing that is the 
>>> person amongst the several subjects in the data. That can be in the SPARQL 
>>> or as a target of some kind.
>>> 
>>> Either set the target to be bdr:P707
>>> or find a signature such as a bdo:personName triple. "sh:targetSubjectsOf 
>>> bdo:personName"
>>> or write some pattern in the SPARQL query.
>>> 
>>> You may want some "whole graph" validation such as not completely empty or 
>>> has at least some relevant vocabulary to ensure that the data is not so off 
>>> that nothing will trigger.  That's where the sh:targetNode "foobar" trick 
>>> comes in.
>>> 
>>> 2/ It's looking for any triple with $this as subject, not "a bdo:Person"
>>> 
>>> The SELECT-AS happens after the WHERE.
>>> FILTER NOT EXISTS does not set ?path or ?value, so inside the filter they 
>>> are unset and act as free variables.
>>> 
>>>   filter not exists { $this ?P ?O }
>>> 
>>> would be just the same and matches any triple with $this as subject.
>>> 
>>> 
>>> 
>>> You want to set ?value and ?path before the FILTER:
>>> 
>>>  BIND (bdo:Person as ?value)
>>>  BIND (rdf:type as ?path)
>>> 
>>> 
>>> 
>>> or write directly and not worry about ?path and ?value.
>>> 
>>>  filter not exists { $this rdf:type bdo:Person }
>>> 
>>> (The message processing from SPARQL constraints and components doesn't do 
>>> templating.)
>>> 
>>>>     ] ;
>>>> .
>>>> But this just always reports a violation that the literal, “Check Class”, 
>>>> doesn’t conform, which is true since it isn’t in the data graph.
>>> 
>>> 
>>> bds:CheckPersonClassShape  a      sh:NodeShape ;
>>>    rdfs:label      "Check Person Class Shape"@en ;
>>>    ## sh:targetNode bdr:P707 ;
>>>    sh:targetSubjectsOf bdo:personName ;
>>>    sh:sparql [
>>>      a sh:SPARQLConstraint ;
>>>      sh:prefixes [
>>>        sh:declare [
>>>          sh:prefix "rdf" ;
>>>          sh:namespace "http://www.w3.org/1999/02/22-rdf-syntax-ns#" ;
>>>        ] , [
>>>          sh:prefix "bdo" ;
>>>          sh:namespace "http://purl.bdrc.io/ontology/core/" ;
>>>        ]
>>>      ] ;
>>>      sh:select """
>>>        select $this
>>>         where {
>>>           filter not exists { $this rdf:type bdo:Person }
>>>         }
>>>       """ ;
>>>    ] ;
>>> .
>>> 
>>> 
>>> 
>>> shacl validate -v -s shapes.ttl -d P707.ttl
>>> 
>>> shows the validation when "a  bdo:Person ;" is commented out of the data:
>>> 
>>> NodeShape[http://example/CheckPersonClassShape]
>>> N: FocusNodes(1): [http://purl.bdrc.io/resource/P707]
>>>  F: http://purl.bdrc.io/resource/P707
>>>  S: NodeShape[http://example/CheckPersonClassShape]
>>>  C: SPARQL[PREFIX  bdo:  <http://purl.bdrc.io/ontology/core/> PREFIX rdf:  
>>> <http://www.w3.org/1999/02/22-rdf-syntax-ns#>  SELECT  ?this WHERE {   
>>> FILTER NOT EXISTS { ?this  rdf:type  bdo:Person } }]
>>> 
>>> ... prefixes ...
>>> 
>>> [ a            sh:ValidationReport ;
>>>  sh:conforms  false ;
>>>  sh:result    [ a                             sh:ValidationResult ;
>>>                 sh:focusNode                  bdr:P707 ;
>>>                 sh:resultMessage              "SPARQL SELECT constraint for <http://purl.bdrc.io/resource/P707> returns <http://purl.bdrc.io/resource/P707>" ;
>>>                 sh:resultSeverity             sh:Violation ;
>>>                 sh:sourceConstraintComponent sh:SPARQLConstraintComponent ;
>>>                 sh:sourceShape                bds:CheckPersonClassShape ;
>>>                 sh:value                      bdr:P707
>>>               ]
>>> ] .
>>> 
>>> 
>>>> 3) The original reason for wanting to use the shacl endpoint was so that 
>>>> we could PUT the submitted graph in the Fuseki dataset and then use the 
>>>> endpoint to validate the resource bdr:P707 (or bdr:W1FPL1) as a Person (or 
>>>> not) with the rest of the dataset graph available to handle things like 
>>>> subClassOf*  and subPropertyOf* for various items as well as validating 
>>>> the minimum of resources referenced by P707 such as that P705 is a male 
>>>> person and hence can be a father of P707.
>>> 
>>> That sounds like
>>> 
>>>   sh:targetNode bdr:P707
>>> 
>>> and also some shapes to check "is there anything relevant at all".
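>>> 
>>> A minimal sketch of such an "anything relevant at all" check, reusing the 
>>> sh:targetNode literal idiom discussed in this thread. The shape name 
>>> ex:HasPersonNameShape is hypothetical, and treating bdo:personName as the 
>>> marker of relevant data is an assumption:
>>> 
>>> PREFIX sh:   <http://www.w3.org/ns/shacl#>
>>> PREFIX xsd:  <http://www.w3.org/2001/XMLSchema#>
>>> PREFIX ex:   <http://example/>
>>> 
>>> ex:HasPersonNameShape
>>>   a sh:NodeShape ;
>>>   # The literal is only a hook to make the shape run; it need not be in the data.
>>>   sh:targetNode "Has person name" ;
>>>   sh:sparql [
>>>     a sh:SPARQLConstraint ;
>>>     sh:prefixes [
>>>       sh:declare [
>>>         sh:prefix "bdo" ;
>>>         sh:namespace "http://purl.bdrc.io/ontology/core/"^^xsd:anyURI ;
>>>       ]
>>>     ] ;
>>>     # A row is returned (a violation) when no bdo:personName triple exists.
>>>     sh:select """
>>>       SELECT $this
>>>       WHERE {
>>>         FILTER NOT EXISTS { ?s bdo:personName ?o }
>>>       }
>>>     """ ;
>>>   ] .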
>>> 
>>>    Andy
>>> 
>>>> The graph for P707 that is submitted would only have references to P705, 
>>>> with no properties on P705, since that resource is in its own graph.
>>>> I thought this is pretty much how validate(Shapes, Graph, Node) would work, 
>>>> where Graph would be the union dataset graph.
>>>> I’m evidently missing some understanding.
>>>> I appreciate your patience,
>>>> Chris
>>>>> On May 12, 2020, at 3:52 AM, Andy Seaborne wrote:
>>>>> 
>>>>> Chris,
>>>>> 
>>>>> Here's a shape that always executes and tests for an empty data graph.
>>>>> 
>>>>> # No violation
>>>>> shacl validate -v -shapes ex-shapes.ttl -data not-empty.ttl
>>>>> 
>>>>> # Violation
>>>>> shacl validate -v -shapes ex-shapes.ttl -data empty.nt
>>>>> 
>>>>> "sh:targetNode" always executes.
>>>>> 
>>>>> With this pattern, the SPARQL query can do arbitrary checks.
>>>>> 
>>>>>    Andy
>>>>> 
>>>>> ## ex-shapes.ttl
>>>>> PREFIX rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>>> PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>
>>>>> 
>>>>> PREFIX sh:      <http://www.w3.org/ns/shacl#>
>>>>> PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>
>>>>> 
>>>>> PREFIX ex:        <http://example/>
>>>>> 
>>>>> ex:NotEmptyGraphShape
>>>>>  rdf:type sh:NodeShape ;
>>>>>  sh:targetNode "Empty Graph" ;
>>>>>  sh:sparql [
>>>>>    a sh:SPARQLConstraint ;
>>>>>    sh:select """
>>>>>   SELECT $this ?value
>>>>>   WHERE {
>>>>>            FILTER NOT EXISTS { ?s ?p ?o }
>>>>>   }
>>>>>   """ ;
>>>>>   ] .
>>>>> 
>>>>> On 11/05/2020 17:14, Chris Tomlinson wrote:
>>>>> 
>>>>>> I appreciate that it works that way but until and unless I can 
>>>>>> understand your point about
>>>>>>  [] sh:targetNode ex:myNode
>>>>>> then I don’t know how to distinguish: 1) no violations because a Person 
>>>>>> graph conforms to the PersonShapes - like there’s no Work indicated as a 
>>>>>> parent of the person or an rdfs:label is used where a skos:prefLabel is 
>>>>>> expected; versus 2) no violations because the question is vacuous like 
>>>>>> asking if a Work looks like a person or an empty non-existent graph 
>>>>>> looks like a person.
