On 06/04/13 20:59, Bob DuCharme wrote:
With the following data,
@prefix d: <http://learningsparql.com/ns/data#> .
@prefix dm: <http://learningsparql.com/ns/demo#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix mt: <http://learningsparql.com/ns/mytypesystem#> .
d:item2a dm:prop "two" .
d:item2b dm:prop "two"^^xsd:string .
d:item2c dm:prop "two"^^mt:potrzebies .
d:item2d dm:prop "two"@en .
I run this query,
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX d: <http://learningsparql.com/ns/data#>
SELECT ?s
WHERE { ?s ?p "two"^^xsd:string . }
and Jena 2.7.4 ARQ gives me this result,
------------
| s        |
============
| d:item2b |
| d:item2a |
------------
but running the same query against the same data with Fuseki 0.2.6 gives
me this:
------------
| s        |
============
| d:item2b |
------------
Why is this? According to current W3C Recommendations (as opposed to
future plans for RDF 1.1), Fuseki is more correct, right?
Hi Bob,
Not quite - it depends on which recommendations you read :-)
Specifically, on how much of the RDF Model Theory (RDF-MT) you include.
How are you running Fuseki?
Jena memory models provide the inference that simple literals and
xsd:strings are the same value. These are RDF-MT rules xsd 1a and xsd 1b:
== xsd 1a
uuu aaa "sss". -> uuu aaa "sss"^^xsd:string .
== xsd 1b
uuu aaa "sss"^^xsd:string . -> uuu aaa "sss".
So if you include that spec, you get two rows because the
"two"^^xsd:string in the query matches both "two" and "two"^^xsd:string.
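A minimal sketch in Python of how the two rules change the answer. It is not Jena code: the tuple representation of literals and the helper names (lit, xsd_string_closure, subjects_with_object) are assumptions made up for illustration.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
POTRZEBIES = "http://learningsparql.com/ns/mytypesystem#potrzebies"


def lit(lex, datatype=None, lang=None):
    """Hypothetical literal: (lexical form, datatype IRI or None, language tag or None)."""
    return (lex, datatype, lang)


# The four triples from the question.
DATA = {
    ("d:item2a", "dm:prop", lit("two")),
    ("d:item2b", "dm:prop", lit("two", XSD_STRING)),
    ("d:item2c", "dm:prop", lit("two", POTRZEBIES)),
    ("d:item2d", "dm:prop", lit("two", lang="en")),
}


def xsd_string_closure(triples):
    """Apply rules xsd 1a and xsd 1b. One pass suffices: the rules are inverses."""
    inferred = set(triples)
    for s, p, (lex, dt, lang) in triples:
        if dt is None and lang is None:
            # xsd 1a: uuu aaa "sss" . -> uuu aaa "sss"^^xsd:string .
            inferred.add((s, p, lit(lex, XSD_STRING)))
        elif dt == XSD_STRING:
            # xsd 1b: uuu aaa "sss"^^xsd:string . -> uuu aaa "sss" .
            inferred.add((s, p, lit(lex)))
    return inferred


def subjects_with_object(triples, obj):
    """Match ?s ?p obj by exact term equality."""
    return sorted(s for s, _, o in triples if o == obj)


target = lit("two", XSD_STRING)
term_only = subjects_with_object(DATA, target)
with_rules = subjects_with_object(xsd_string_closure(DATA), target)
print(term_only)   # ['d:item2b']
print(with_rules)  # ['d:item2a', 'd:item2b']
```

Matching on the stored terms alone gives the one-row answer; matching after the two rules have been applied gives the two-row answer. item2c and item2d never match, because neither rule touches other datatypes or language-tagged literals.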
But the memory models also keep those terms apart so that
:x :p "foo" .
:x :p "foo"^^xsd:string .
is stored as two triples.
Storage models just keep those two triples apart. It would be costly to
also index on value at scale. TDB, whether in-memory or on-disk, treats
the "" and ""^^xsd:string forms separately. So Fuseki/TDB isn't
including RDF MT.
TDB could have translated ^^xsd:string to simple literals on loading
(cf. the treatment of integers in TDB), but that risks being
inconvenient for ontology storage, where the use of xsd:string is common
and simple literals are not treated as xsd:string.
Future:
In RDF 1.1, this changes. All literals have datatypes. For @lang ones
it's rdf:langString (SPARQL already includes this).
A simple literal (no language tag, no datatype) becomes surface syntax
for xsd:string. So at parse time,
"foo" ==> "foo"^^xsd:string
and
:x :p "foo" .
:x :p "foo"^^xsd:string .
is the same triple - and hence one triple in the graph.
You will then get
------------
| s        |
============
| d:item2b |
| d:item2a |
------------
on all storage types (and printing xsd:strings will be the un-^^ form).
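As a rough illustration of the RDF 1.1 behaviour described above, here is a small Python sketch. The function names (normalize_literal, to_surface) and the tuple representation are hypothetical, not any parser's real API; the only things taken from the text are the rules themselves.

```python
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


def normalize_literal(lex, datatype=None, lang=None):
    """Hypothetical parse-time step: every literal ends up with a datatype."""
    if lang is not None:
        return (lex, RDF_LANGSTRING, lang)   # "foo"@en
    if datatype is None:
        return (lex, XSD_STRING, None)       # "foo" ==> "foo"^^xsd:string
    return (lex, datatype, None)


def to_surface(literal):
    """Print xsd:string literals in the short, un-^^ form."""
    lex, dt, lang = literal
    if lang is not None:
        return f'"{lex}"@{lang}'
    if dt == XSD_STRING:
        return f'"{lex}"'
    return f'"{lex}"^^<{dt}>'


# Both spellings normalize to the same term, so the set holds one triple.
graph = {
    (":x", ":p", normalize_literal("foo")),
    (":x", ":p", normalize_literal("foo", XSD_STRING)),
}
print(len(graph))                               # 1
print([to_surface(o) for _, _, o in graph])     # ['"foo"']
```

Because the normalization happens before anything is stored, term equality alone is enough for the query to return both item2a and item2b, with no value index needed.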
While a small change in the grand scheme of things, it's going to be an
interesting one.
Andy
Thanks,
Bob