[ 
https://issues.apache.org/jira/browse/JENA-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17424345#comment-17424345
 ] 

Andy Seaborne commented on JENA-2176:
-------------------------------------

Hi [~justin2004],

Yes, that can happen. Internally, Jena works with any {{Node}} type in any 
position. That includes literals as subjects, but also variables (which are an 
extension of RDF terms), markers with special meanings, and now RDF-star 
quoted triples, which are a new kind of {{Node}}. There is also provision in 
the basic classes for {{Graph}} references as {{Node}} (see N3).

When looked up in the data, there is simply no match. It is simpler to delegate 
this to the lookup than to test the subject every time. In the case of TDB2 
(and TDB1), execution of a basic graph pattern is not by RDF term but by 
internal id, and the internal ids do not indicate whether a term is a URI, 
blank node or literal.
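
As a rough illustration of id-based matching, here is a toy Java sketch. It is 
not TDB2 code: {{ToyIdStore}}, its node table and its linear scan are 
hypothetical stand-ins for the real node table and indexes. It only aims to 
show that once terms are replaced by opaque ids, a literal in the subject 
position needs no special test, because the lookup simply finds nothing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only (assumption): not Jena's or TDB2's actual classes.
class ToyIdStore {
    // Wildcard marker; real ids start at 1.
    private static final long ANY = 0L;

    // Term string -> internal id. The id says nothing about the kind of term.
    private final Map<String, Long> nodeTable = new HashMap<>();
    // Stored triples as id tuples {subject, predicate, object}.
    private final List<long[]> triples = new ArrayList<>();

    private long idFor(String term) {
        Long id = nodeTable.get(term);
        if (id == null) {
            id = (long) nodeTable.size() + 1;
            nodeTable.put(term, id);
        }
        return id;
    }

    void add(String s, String p, String o) {
        triples.add(new long[] { idFor(s), idFor(p), idFor(o) });
    }

    // Match a pattern; a null term is a wildcard.
    List<long[]> find(String s, String p, String o) {
        List<long[]> out = new ArrayList<>();
        String[] terms = { s, p, o };
        long[] pattern = new long[3];
        for (int i = 0; i < 3; i++) {
            if (terms[i] == null) {
                pattern[i] = ANY;
                continue;
            }
            Long id = nodeTable.get(terms[i]);
            if (id == null) {
                return out; // term was never stored, so nothing can match
            }
            pattern[i] = id;
        }
        // Compare ids only; the kind of term is never inspected.
        for (long[] t : triples) {
            boolean match = true;
            for (int i = 0; i < 3; i++) {
                if (pattern[i] != ANY && pattern[i] != t[i]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                out.add(t);
            }
        }
        return out;
    }
}
```

With the three triples from z.ttl added, {{find("\"lala\"", null, null)}} 
returns an empty list: the literal has an id because it occurs as an object, 
but no stored tuple has that id in the subject slot.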

There are some optimizations possible by knowing a variable can only be a URI, 
but they are not general and so cannot be used everywhere.

In SPARQL, the [Triple 
Patterns|https://www.w3.org/TR/sparql11-query/#sparqlTriplePatterns] definition 
includes literals in the subject position. It is easier to define it this way 
because variables may be bound to literals dynamically.
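
To make the dynamic binding concrete, here is a minimal Java sketch of 
substituting a solution into a triple pattern. It is an assumption-laden toy, 
not ARQ code: terms are plain strings, variables are strings starting with 
{{?}}, and {{ToyPattern.substitute}} is a hypothetical helper.

```java
import java.util.Map;

// Toy illustration only (assumption): terms are strings, variables start with '?'.
class ToyPattern {
    // Replace each bound variable in a triple pattern with its value.
    // Unbound variables and constant terms are left unchanged.
    static String[] substitute(String[] pattern, Map<String, String> binding) {
        String[] result = new String[pattern.length];
        for (int i = 0; i < pattern.length; i++) {
            String term = pattern[i];
            boolean isBoundVar = term.startsWith("?") && binding.containsKey(term);
            result[i] = isBoundVar ? binding.get(term) : term;
        }
        return result;
    }
}
```

Substituting a binding of {{?o}} to {{"lala"}} into the pattern 
{{?o ?p ?o1}} gives a pattern with a literal as its subject, which is the 
form that appears in the query log of this issue.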

RDF 1.1 notes this: [7. Generalized RDF Triples, Graphs, and 
Datasets|https://www.w3.org/TR/rdf11-concepts/#section-generalized-rdf]. As in 
SPARQL, out-of-place terms occur naturally in rules-based inference.

The parsers do not accept non-conformant RDF data.


> TDB2 queries can execute quadpatterns with a literal in the subject position
> ----------------------------------------------------------------------------
>
>                 Key: JENA-2176
>                 URL: https://issues.apache.org/jira/browse/JENA-2176
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 4.2.0
>            Reporter: Justin
>            Priority: Major
>         Attachments: z.rq, z.ttl
>
>
> Hello,
> If you try to put a triple into a TDB2 with a literal in the subject position 
> you get the following:
>  ```
>  ERROR riot :: [line: 6, col: 18] Subject is not a URI or blank node
>  ```
> So far so good.
> But since literals can not be in the subject position of a triple a query 
> against a TDB2 should never attempt to find a literal in the subject position 
> of a triple, right? It would be a waste of time.
> But if I am reading the logs correctly that is what appears to happen:
>  ```
> root@ec6206bb523f:/mnt/tdb_42# cat /mnt/z.ttl 
>  @prefix ex: <http://example.com/> .
> ex:apple ex:hasPart ex:skin .
>  ex:skin ex:hasName "Skin" .
>  ex:file ex:hasPart "lala" .
> root@ec6206bb523f:/mnt/tdb_42# 
>  root@ec6206bb523f:/mnt/tdb_42# cat /mnt/z.rq 
>  prefix ex: <http://example.com/>
> select * where
> { ?s ex:hasPart ?o . optional { ?o ?p ?o1 . }
> }
>  
> root@ec6206bb523f:/mnt/tdb_42# /mnt/apache-jena-4.2.0/bin/tdb2.tdbloader 
> --loc=`pwd` /mnt/z.ttl
>  00:31:49 INFO loader :: Loader = LoaderPhased
>  00:31:49 INFO loader :: Start: /mnt/z.ttl
>  00:31:49 INFO loader :: Finished: /mnt/z.ttl: 3 tuples in 0.07s (Avg: 40)
>  00:31:49 INFO loader :: Finish - index SPO
>  00:31:49 INFO loader :: Start replay index SPO
>  00:31:49 INFO loader :: Index set: SPO => SPO->POS, SPO->OSP
>  00:31:49 INFO loader :: Index set: SPO => SPO->POS, SPO->OSP [3 items, 0.0 
> seconds]
>  00:31:49 INFO loader :: Finish - index OSP
>  00:31:49 INFO loader :: Finish - index POS
>  root@ec6206bb523f:/mnt/tdb_42# /mnt/apache-jena-4.2.0/bin/tdb2.tdbquery -v 
> --loc=`pwd` --query=/mnt/z.rq
>  1 PREFIX ex: <http://example.com/>
>  2
>  3 SELECT *
>  4 WHERE
>  5 { ?s ex:hasPart ?o
>  6 OPTIONAL
>  7 { ?o ?p ?o1 }
>  8 }
>  
>  00:31:59 INFO exec :: QUERY
>  PREFIX ex: <http://example.com/>
>  
>  SELECT *
>  WHERE
>  { ?s ex:hasPart ?o OPTIONAL
>  { ?o ?p ?o1 }
>  }
>  00:31:59 INFO exec :: ALGEBRA
>  (conditional
>  (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?s 
> <http://example.com/hasPart> ?o))
>  (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?o ?p ?o1)))
>  00:32:00 INFO exec :: TDB
>  (conditional
>  (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?s 
> <http://example.com/hasPart> ?o))
>  (quadpattern (quad <urn:x-arq:DefaultGraphNode> ?o ?p ?o1)))
>  00:32:00 INFO exec :: Execute :: ?s <http://example.com/hasPart> ?o
>  00:32:00 INFO exec :: TDB
>  (quadpattern (quad <urn:x-arq:DefaultGraphNode> <http://example.com/skin> 
> ?p ?o1))
>  00:32:00 INFO exec :: Execute :: <http://example.com/skin> ?p ?o1
>  00:32:00 INFO exec :: TDB
>  (quadpattern (quad <urn:x-arq:DefaultGraphNode> "lala" ?p ?o1))
>  00:32:00 INFO exec :: Execute :: "lala" ?p ?o1
>  --------------------------------------------
> |s|o|p|o1|
> ============================================
> |ex:apple|ex:skin|ex:hasName|"Skin"|
> |ex:file|"lala"| | |
> --------------------------------------------
>  
>  ```
> Doesn't this:
>  ```
>  00:32:00 INFO exec :: TDB
>  (quadpattern (quad <urn:x-arq:DefaultGraphNode> "lala" ?p ?o1))
>  00:32:00 INFO exec :: Execute :: "lala" ?p ?o1
>  ```
>  mean a lookup was done in the TDB2 for a triple with the literal "lala" in 
> the subject position? If so, shouldn't lookups like that be ignored as they 
> will never find matching triples in the TDB2?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
