Yes, Jena preserves the case of language tags (except in in-memory plan graphs).

There is a small problem: RDF suggests normalising to lowercase, but the formal canonical form (RFC 5646) isn't like that; there the canonical form is "en-US".

So which is the "right" form when retrieved (rhetorical question!)?

And users like the data that goes in to be the same as what comes out!

The parsers have an option to canonicalize data on input.
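
To make the two conventions concrete, here is a minimal Python sketch. It is not Jena's code, and the function names are made up for illustration. It compares plain lowercasing with the RFC 5646 (section 2.1.1) case conventions, then shows how the three literals from the question collapse to two once the tag is normalised either way.

```python
def lowercase_langtag(tag: str) -> str:
    """Lowercase the whole tag (the normalisation RDF suggests)."""
    return tag.lower()


def rfc5646_case(tag: str) -> str:
    """Apply the RFC 5646 section 2.1.1 case conventions (hypothetical helper).

    Everything is lowercase, except that two-letter subtags become uppercase
    and four-letter subtags become titlecase, provided they are neither the
    first subtag nor directly after a singleton (a one-character subtag).
    This only adjusts case; it does not validate the tag.
    """
    parts = tag.split("-")
    out = []
    for i, sub in enumerate(parts):
        after_singleton = i > 0 and len(parts[i - 1]) == 1
        special = i > 0 and not after_singleton and sub.isalpha()
        if special and len(sub) == 2:
            out.append(sub.upper())        # region, e.g. "US"
        elif special and len(sub) == 4:
            out.append(sub.capitalize())   # script, e.g. "Latn"
        else:
            out.append(sub.lower())
    return "-".join(out)


# The three literals from the INSERT DATA in the question: (lexical form, tag).
literals = [("TEST", None), ("TEST", "en-US"), ("TEST", "en-us")]


def distinct(normalise) -> set:
    """Distinct literals after applying `normalise` to each language tag."""
    return {(lex, normalise(tag) if tag else None) for lex, tag in literals}


as_stored = distinct(lambda t: t)          # case preserved: three literals
lowercased = distinct(lowercase_langtag)   # two literals, tag "en-us"
canonical = distinct(rfc5646_case)         # two literals, tag "en-US"
```

Either normalisation gives two literals; they differ only in which spelling of the tag comes back out.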

    Andy

On 15/03/2021 13:16, Paul Appleby wrote:
Hi

I just want to check the behaviour of language tags on SPARQL queries in
regard to case sensitivity. In Fuseki 3.16.0 if I run this query:

INSERT DATA {
   <http://example.com> <http://test.com> "TEST" .
   <http://example.com> <http://test.com> "TEST"@en-US .
   <http://example.com> <http://test.com> "TEST"@en-us .
}

This inserts three triples. My interpretation is that this should only
insert two triples (with language tags normalised to lower case).

After running the above insert the following query returns three results,
not two:

SELECT * {
   ?s ?p ?o
}

Similarly, updates trying to delete the triples seem to be case
sensitive on the language tags. So if I then run the following query only
two triples are deleted:

DELETE DATA {
   <http://example.com> <http://test.com> "TEST" .
   <http://example.com> <http://test.com> "TEST"@en-us .
}

When we try the insert query above in both MarkLogic and GraphDB we get
just two triples.

Is my interpretation of language tag handling incorrect?

Regards
