[ 
https://issues.apache.org/jira/browse/JENA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126902#comment-14126902
 ] 

Andy Seaborne edited comment on JENA-781 at 9/9/14 2:11 PM:
------------------------------------------------------------

It's not specifically about XMLLiterals.  The issue is that the literal has an 
illegal lexical form for its datatype:

{noformat}
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>

SELECT ?x
WHERE
{
  VALUES ?x { <http://uri> "a" 2 true }
  FILTER(?x NOT IN ("two"^^xsd:integer))
}
{noformat}
Translating the {{NOT IN}} to the equivalent form with {{!=}} gives
{noformat}
PREFIX xsd:     <http://www.w3.org/2001/XMLSchema#>

SELECT ?x
WHERE
{
  VALUES ?x { <http://uri> "a" 2 true }
  FILTER(?x != "two"^^xsd:integer)
}
{noformat}
and so we're into the fact that {{!=}} cannot be certain two things are not 
equal unless it knows their value spaces are disjoint: the open world 
assumption applied to datatypes.

A URI cannot be a literal, so 
{{<http://uri> != "lexical"^^fn:unknownDatatype}} is known to be true.

But {{2 != "two"^^fn:unknownDatatype}} is not known to the query engine.  Maybe 
{{fn:unknownDatatype}} has a value space of numbers and a lexical space of 
spellings; or it may not be a number at all.

If a value test is "unknown" (ARQ does not know it to be definitely true or 
definitely false) then it's an evaluation error, which rejects the row because 
it is not positively known to pass the filter.
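The behaviour described above can be sketched outside of ARQ. The following is a minimal Python model, not ARQ code: the names {{IRI}}, {{Literal}}, {{parse_value}} and {{not_equal}} are invented for illustration, only three XSD datatypes are "understood", and the value spaces of those three are assumed to be disjoint. {{None}} stands for an evaluation error.

```python
from dataclasses import dataclass

XSD = "http://www.w3.org/2001/XMLSchema#"

@dataclass(frozen=True)
class IRI:
    value: str

@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: str

def parse_value(lit):
    """Return a Python value if the datatype is understood AND the lexical
    form is legal for it; otherwise None (the engine cannot tell what it is)."""
    try:
        if lit.datatype == XSD + "integer":
            return int(lit.lexical)
        if lit.datatype == XSD + "boolean":
            return {"true": True, "false": False}[lit.lexical]
        if lit.datatype == XSD + "string":
            return lit.lexical
    except (ValueError, KeyError):
        pass  # illegal lexical form: handled like an unknown datatype
    return None

def not_equal(a, b):
    """Three-valued '!=': True, False, or None for an evaluation error."""
    if isinstance(a, IRI) or isinstance(b, IRI):
        return a != b          # an IRI can never be a literal
    if a == b:
        return False           # same term, so certainly equal
    va, vb = parse_value(a), parse_value(b)
    if va is None or vb is None:
        return None            # open world: might be equal, might not
    if type(va) is not type(vb):
        return True            # assumed-disjoint value spaces
    return va != vb

rows = [
    IRI("http://uri"),
    Literal("a", XSD + "string"),
    Literal("2", XSD + "integer"),
    Literal("true", XSD + "boolean"),
]
bad = Literal("two", XSD + "integer")

# FILTER keeps a row only when the test is positively true;
# an error (None) rejects the row just as False does.
kept = [x for x in rows if not_equal(x, bad) is True]
```

In this model {{kept}} contains only the IRI, which mirrors the single row reported in the issue: every literal row compared against the ill-formed literal yields an error rather than true.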

As far as ARQ is concerned, {{"two"^^xsd:integer}} is not an integer value, but 
ARQ has no idea what it is. It treats it as an unknown datatype, but it could 
be treated with some knowledge, and this might be tightened up a little for the 
known XSD datatypes.  It's a non-trivial change that may have knock-on 
effects.
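One possible reading of "tightened up", sketched in Python purely as an illustration: this is not how ARQ is implemented, nor a proposal taken from the issue. The assumption made here is that an illegal lexical form for a *known* datatype has no value in that datatype's value space, so it cannot equal a different term, whereas a genuinely unknown datatype still produces an error. {{KNOWN_PARSERS}}, {{classify}} and {{not_equal_tightened}} are hypothetical names.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical table of datatypes the engine understands, one parser each.
KNOWN_PARSERS = {XSD + "integer": int}

def classify(lexical, datatype):
    """Return ("value", v), ("ill-formed", None) or ("unknown", None)."""
    parser = KNOWN_PARSERS.get(datatype)
    if parser is None:
        return ("unknown", None)       # nothing is known about this datatype
    try:
        return ("value", parser(lexical))
    except ValueError:
        return ("ill-formed", None)    # known datatype, illegal lexical form

def not_equal_tightened(a, b):
    """a and b are (lexical, datatype) pairs.
    Returns True, False, or None for an evaluation error."""
    if a == b:
        return False                   # same term
    (kind_a, va), (kind_b, vb) = classify(*a), classify(*b)
    if "unknown" in (kind_a, kind_b):
        return None                    # open world still applies
    if "ill-formed" in (kind_a, kind_b):
        return True                    # assumed: no value, so never equal
    return va != vb
```

Under that assumption {{2 != "two"^^xsd:integer}} would evaluate to true instead of an error, while a comparison against {{"two"^^fn:unknownDatatype}} would stay an error.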




> XML Literal used in FILTER NOT IN leads to incorrect results
> ------------------------------------------------------------
>
>                 Key: JENA-781
>                 URL: https://issues.apache.org/jira/browse/JENA-781
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: ARQ, Optimizer
>    Affects Versions: Jena 2.12.0
>            Reporter: Rob Vesse
>
> Originally spotted this while answering a question on Answers - 
> http://answers.semanticweb.com/questions/30243/sparql-in-problem
> The actual question had a bug in the user's data, but even with that fixed 
> incorrect query results were still received.  I have reduced this to the 
> following query, which demonstrates that the issue is the presence of an 
> XML Literal in the {{NOT IN}} list.
> Working query:
> {noformat}
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT ?x
> WHERE
> {
>   VALUES ?x { <http://uri> "a" 2 true }
> }
> {noformat}
> Broken query:
> {noformat}
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT ?x
> WHERE
> {
>   VALUES ?x { <http://uri> "a" 2 true }
>   FILTER(?x NOT IN ("<foo />"^^rdf:XMLLiteral))
> }
> {noformat}
> The expected results are the same for both queries: all four values given for 
> {{?x}} should be returned.  However, the second query with the {{FILTER}} only 
> returns {{http://uri}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
