[
https://issues.apache.org/jira/browse/JENA-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424345#comment-17424345
]
Andy Seaborne commented on JENA-2176:
-------------------------------------
Hi [~justin2004],
Yes, that can happen. Internally, Jena works with any {{Node}} type in
any position. That includes literals-as-subjects but also variables, which are
an extension of RDF terms, markers with special meanings and now RDF-star
quoted triples, which are a new kind of {{Node}}. There is provision for {{Graph}}
references as {{Node}} (see N3) in the basic classes.
When such a pattern is looked up in the data, there is simply no match. It is
simpler to delegate this to the lookup than to test the subject every time. In
the case of TDB2 (and TDB1), execution of a basic graph pattern is not by RDF
term but by internal id, and the internal ids do not indicate whether a term is
a URI, blank node or literal.
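A minimal sketch of the idea, as a toy model in plain Java. This is not Jena or TDB2 code: the class {{ToyIdStore}} and its methods are made up for illustration, and a real store uses indexes rather than a list scan. It only shows that once terms are replaced by opaque ids, a literal in the subject slot is looked up like any other id and simply matches nothing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT Jena or TDB2 code. All names are hypothetical.
class ToyIdStore {
    // Node table: RDF term (written as a string) -> opaque internal id.
    private final Map<String, Long> nodeTable = new HashMap<>();
    // Stored triples as id tuples {subject, predicate, object}.
    private final List<long[]> triples = new ArrayList<>();

    // Allocate an id for a term on first use; the id carries no type information.
    private long idFor(String term) {
        Long id = nodeTable.get(term);
        if (id == null) {
            id = (long) nodeTable.size();
            nodeTable.put(term, id);
        }
        return id;
    }

    void add(String s, String p, String o) {
        triples.add(new long[] { idFor(s), idFor(p), idFor(o) });
    }

    // Look up triples by subject. The kind of term (URI, blank node, literal)
    // is never inspected; the lookup just compares ids.
    List<long[]> findBySubject(String subjectTerm) {
        List<long[]> out = new ArrayList<>();
        Long sid = nodeTable.get(subjectTerm);
        if (sid == null) {
            return out; // term not in the node table at all
        }
        for (long[] t : triples) {
            if (t[0] == sid) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        ToyIdStore store = new ToyIdStore();
        store.add("ex:apple", "ex:hasPart", "ex:skin");
        store.add("ex:skin", "ex:hasName", "\"Skin\"");
        store.add("ex:file", "ex:hasPart", "\"lala\"");
        // "lala" has an id because it occurs as an object, but no triple
        // has that id in the subject slot.
        System.out.println(store.findBySubject("\"lala\"").size()); // prints 0
        System.out.println(store.findBySubject("ex:skin").size());  // prints 1
    }
}
```

In this sketch, skipping the lookup would require a per-term type test before every access, which is the extra check described above as not worth making.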
There are some optimizations possible by knowing a variable can only be a URI
but they are not general and so can not be used everywhere.
In SPARQL, the [Triple
Patterns|https://www.w3.org/TR/sparql11-query/#sparqlTriplePatterns] definition
includes literals in the subject position. It is easier to define it this way
because variables may be bound to literals dynamically.
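To illustrate the dynamic binding, here is a rough sketch of how the query in this issue evaluates, written as a toy evaluator in plain Java. It is not ARQ code: {{ToyOptional}}, {{match}} and {{run}} are hypothetical names, and terms are plain strings. The point is only that {{?o}} is substituted into the subject slot of the OPTIONAL pattern whatever it is bound to.

```java
import java.util.ArrayList;
import java.util.List;

// Toy evaluator only: NOT ARQ code. All names are hypothetical.
class ToyOptional {
    // The three triples from z.ttl, as {subject, predicate, object}.
    static final String[][] DATA = {
        { "ex:apple", "ex:hasPart", "ex:skin" },
        { "ex:skin", "ex:hasName", "\"Skin\"" },
        { "ex:file", "ex:hasPart", "\"lala\"" },
    };

    // Match a triple pattern; a null slot stands for an unbound variable.
    static List<String[]> match(String s, String p, String o) {
        List<String[]> out = new ArrayList<>();
        for (String[] t : DATA) {
            boolean ok = (s == null || s.equals(t[0]))
                      && (p == null || p.equals(t[1]))
                      && (o == null || o.equals(t[2]));
            if (ok) {
                out.add(t);
            }
        }
        return out;
    }

    // Evaluates { ?s ex:hasPart ?o OPTIONAL { ?o ?p ?o1 } }.
    static List<String> run() {
        List<String> rows = new ArrayList<>();
        for (String[] left : match(null, "ex:hasPart", null)) {
            // ?o goes into the subject slot, even when it is bound to a literal.
            List<String[]> right = match(left[2], null, null);
            if (right.isEmpty()) {
                // OPTIONAL keeps the left-hand row with ?p and ?o1 unbound.
                rows.add(left[0] + " " + left[2]);
            } else {
                for (String[] r : right) {
                    rows.add(left[0] + " " + left[2] + " " + r[1] + " " + r[2]);
                }
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : run()) {
            System.out.println(row);
        }
    }
}
```

The second row comes from the pattern with "lala" as subject: it finds nothing, and the OPTIONAL leaves {{?p}} and {{?o1}} unbound, which corresponds to the result table in the report.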
RDF 1.1 notes this: [7. Generalized RDF Triples, Graphs, and
Datasets|https://www.w3.org/TR/rdf11-concepts/#section-generalized-rdf]. As in
SPARQL, out-of-place terms occur naturally in rules-based inference.
The parsers do not accept non-conformant RDF data.
> TDB2 queries can execute quadpatterns with a literal in the subject position
> ----------------------------------------------------------------------------
>
> Key: JENA-2176
> URL: https://issues.apache.org/jira/browse/JENA-2176
> Project: Apache Jena
> Issue Type: Bug
> Components: TDB2
> Affects Versions: Jena 4.2.0
> Reporter: Justin
> Priority: Major
> Attachments: z.rq, z.ttl
>
>
> Hello,
> If you try to put a triple into a TDB2 with a literal in the subject position
> you get the following:
> ```
> ERROR riot :: [line: 6, col: 18] Subject is not a URI or blank node
> ```
> So far so good.
> But since literals can not be in the subject position of a triple a query
> against a TDB2 should never attempt to find a literal in the subject position
> of a triple, right? It would be a waste of time.
> But if I am reading the logs correctly that is what appears to happen:
> ```
> root@ec6206bb523f:/mnt/tdb_42# cat /mnt/z.ttl
> @prefix ex: <http://example.com/> .
> ex:apple ex:hasPart ex:skin .
> ex:skin ex:hasName "Skin" .
> ex:file ex:hasPart "lala" .
> root@ec6206bb523f:/mnt/tdb_42#
> root@ec6206bb523f:/mnt/tdb_42# cat /mnt/z.rq
> prefix ex: <http://example.com/>
> select * where
> { ?s ex:hasPart ?o . optional { ?o ?p ?o1 . }
> }
>
> root@ec6206bb523f:/mnt/tdb_42# /mnt/apache-jena-4.2.0/bin/tdb2.tdbloader --loc=`pwd` /mnt/z.ttl
> 00:31:49 INFO loader :: Loader = LoaderPhased
> 00:31:49 INFO loader :: Start: /mnt/z.ttl
> 00:31:49 INFO loader :: Finished: /mnt/z.ttl: 3 tuples in 0.07s (Avg: 40)
> 00:31:49 INFO loader :: Finish - index SPO
> 00:31:49 INFO loader :: Start replay index SPO
> 00:31:49 INFO loader :: Index set: SPO => SPO->POS, SPO->OSP
> 00:31:49 INFO loader :: Index set: SPO => SPO->POS, SPO->OSP [3 items, 0.0 seconds]
> 00:31:49 INFO loader :: Finish - index OSP
> 00:31:49 INFO loader :: Finish - index POS
> root@ec6206bb523f:/mnt/tdb_42# /mnt/apache-jena-4.2.0/bin/tdb2.tdbquery -v --loc=`pwd` --query=/mnt/z.rq
> 1 PREFIX ex: <http://example.com/>
> 2
> 3 SELECT *
> 4 WHERE
> 5 { ?s ex:hasPart ?o
> 6 OPTIONAL
> 7 { ?o ?p ?o1 }
> 8 }
>
> 00:31:59 INFO exec :: QUERY
> PREFIX ex: <http://example.com/>
>
> SELECT *
> WHERE
>
> { ?s ex:hasPart ?o OPTIONAL
> { ?o ?p ?o1 }
> }
> 00:31:59 INFO exec :: ALGEBRA
> (conditional
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?s
> <http://example.com/hasPart> ?o))
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?o ?p ?o1)))
> 00:32:00 INFO exec :: TDB
> (conditional
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?s
> <http://example.com/hasPart> ?o))
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?o ?p ?o1)))
> 00:32:00 INFO exec :: Execute :: ?s <http://example.com/hasPart> ?o
> 00:32:00 INFO exec :: TDB
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> <http://example.com/skin>
> ?p ?o1))
> 00:32:00 INFO exec :: Execute :: <http://example.com/skin> ?p ?o1
> 00:32:00 INFO exec :: TDB
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> "lala" ?p ?o1))
> 00:32:00 INFO exec :: Execute :: "lala" ?p ?o1
> --------------------------------------------
> |s|o|p|o1|
> ============================================
> |ex:apple|ex:skin|ex:hasName|"Skin"|
> |ex:file|"lala"| | |
> --------------------------------------------
>
> ```
> Doesn't this:
> ```
> 00:32:00 INFO exec :: TDB
> (quadpattern (quad <urn:x-arq:DefaultGraphNode> "lala" ?p ?o1))
> 00:32:00 INFO exec :: Execute :: "lala" ?p ?o1
> ```
> mean a lookup was done in the TDB2 for a triple with the literal "lala" in
> the subject position? If so, shouldn't lookups like that be ignored as they
> will never find matching triples in the TDB2?