RDFS inference produces invalid data

Jindřich Mynarz Tue, 22 Oct 2019 11:07:11 -0700

Hi,

it is reasonably common in RDF vocabularies to see a property declared
as an rdfs:subPropertyOf of a blank node, as in this example:


### vocabulary.ttl

PREFIX :     <http://example.com/>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

:p1 rdfs:subPropertyOf _:b1 .
_:b1 owl:inverseOf :p2 .

###

Example data using this vocabulary:

### data.ttl

PREFIX : <http://example.com/>

:e1 :p1 :e2 .

###

When RDFS inference is applied to this data, using the vocabulary with
Jena's command-line tool infer (the current 3.13.1 version, source code:
https://github.com/apache/jena/blob/master/jena-cmds/src/main/java/riotcmd/infer.java),
i.e. infer --rdfs vocabulary.ttl data.ttl, we get the following result
(here prettified):

###

PREFIX : <http://example.com/>

:e1 :p1 :e2 ;
  _:b2 :e2 .

###

As you can see, we infer that :e1 _:b2 :e2, which is invalid, because blank
nodes (i.e. _:b2) are not permitted as predicates in RDF (
https://www.w3.org/TR/rdf11-concepts/#h3_section-triples).

Now, it is clear how that follows from the rdfs:subPropertyOf inference
rule (setting aside that what we might actually want is an RDFS/OWL reasoner
that would give us :e2 :p2 :e1), but should such an inference be made if it
violates the RDF data model?
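
To make the question concrete, here is a minimal Python sketch of a single
pass of the rdfs:subPropertyOf rule over the example above. It is not Jena's
implementation: the names apply_rdfs7 and allow_blank_predicates are
hypothetical, terms are plain strings, and no transitive closure is computed.
It only shows where the blank-node predicate comes from and what a guard
against it could look like.

```python
# Terms are plain strings; blank nodes carry the "_:" label prefix, as in Turtle.
SUB_PROPERTY_OF = "rdfs:subPropertyOf"


def is_blank(term):
    """Return True if the term is written as a blank node label."""
    return term.startswith("_:")


def apply_rdfs7(schema, data, allow_blank_predicates=True):
    """One pass of: (s p o), (p rdfs:subPropertyOf q) => (s q o).

    Hypothetical helper for illustration only; it does not chain
    sub-property links and does not model any other RDFS rule.
    """
    # Map each property to its directly declared super-properties.
    supers = {}
    for s, p, o in schema:
        if p == SUB_PROPERTY_OF:
            supers.setdefault(s, set()).add(o)

    inferred = set()
    for s, p, o in data:
        for q in supers.get(p, ()):
            # Optional guard: skip conclusions that would put a blank node
            # in the predicate position, which RDF does not allow.
            if is_blank(q) and not allow_blank_predicates:
                continue
            inferred.add((s, q, o))
    return inferred


schema = [(":p1", SUB_PROPERTY_OF, "_:b1"), ("_:b1", "owl:inverseOf", ":p2")]
data = [(":e1", ":p1", ":e2")]

generalized = apply_rdfs7(schema, data)  # keeps the blank-node predicate
strict = apply_rdfs7(schema, data, allow_blank_predicates=False)  # drops it
```

With the guard off, the sketch yields the triple with _:b1 as predicate,
matching the shape of the output above; with the guard on, nothing is
inferred for this data.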

I wonder if checking the produced inferences for validity is expensive, or
if Jena's infer assumes a superset of RDF. Removing such inferences in
post-processing is a bit tricky because RDF parsers recognize this as an
error and fail.
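
One possible workaround that avoids a full RDF parser is a line-based filter,
sketched below in Python. It assumes the raw output has one statement per
line in an N-Triples-like syntax, so that the second whitespace-separated
token is the predicate; I have not verified that assumption against every
output format of infer, and drop_blank_predicates is a hypothetical helper
name.

```python
def drop_blank_predicates(lines):
    """Yield only the lines whose predicate is not a blank node.

    Assumes one statement per line, N-Triples style: subject and predicate
    contain no whitespace, so the second token is the predicate.
    """
    for line in lines:
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[1].startswith("_:"):
            continue  # blank node in predicate position: not valid RDF
        yield line


raw = [
    "<http://example.com/e1> <http://example.com/p1> <http://example.com/e2> .",
    "<http://example.com/e1> _:b2 <http://example.com/e2> .",
]
kept = list(drop_blank_predicates(raw))
```

Here kept contains only the first statement, which a standard parser should
then accept.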

- Jindrich

-- 
Jindrich Mynarz
https://mynarz.net/#jindrich
