[jira] [Comment Edited] (JENA-1377) Model.isIsomorphicWith() returns false when language tags case do not match

Andy Seaborne (JIRA) Mon, 24 Jul 2017 13:54:22 -0700

    [ 
https://issues.apache.org/jira/browse/JENA-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099107#comment-16099107
 ]


Andy Seaborne edited comment on JENA-1377 at 7/24/17 8:53 PM:
--------------------------------------------------------------

All parsing goes via an {{RDFParser}} -- see {{RDFDataMgr}} -- so, yes, it's 
possible.

{{StreamRDFLib.graph}} will create a destination for a graph (parsing acts on 
the storage units - Graph and DatasetGraph).

It is not a "major flaw"; it sounds like your application has a requirement to 
work with RFC 5646/canonicalised language tags.

The RDF design is lower level and is a compromise between different viewpoints 
(requirements) and RDF's history. Note especially the counting issue if the 
case is forced. Jena models do comparison without regard to case for 
"listStatements" etc.  Isomorphism is clearly documented to work on RDF terms.  
I do not know what isomorphism would mean if values were used instead.  Graphs 
of different sizes could compare isomorphic, which is somewhat strange.
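
To make the counting point concrete, here is a minimal sketch in plain Java 
(no Jena involved). It assumes a literal can be reduced to a (lexical form, 
language tag) pair; the class and method names are hypothetical and exist only 
for this illustration. It is one reading of the counting issue, not a 
description of how Jena stores or compares terms.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class LangTagCounting {
    // Hypothetical stand-in for a language-tagged literal: (lexical form, tag).
    static Map.Entry<String, String> literal(String lexical, String lang) {
        return new SimpleImmutableEntry<>(lexical, lang);
    }

    // Count distinct literals as written: tag case is significant (term equality).
    static int countAsTerms(List<Map.Entry<String, String>> literals) {
        return new HashSet<>(literals).size();
    }

    // Count distinct literals after forcing every tag to lower case.
    static int countWithForcedCase(List<Map.Entry<String, String>> literals) {
        Set<Map.Entry<String, String>> seen = new HashSet<>();
        for (Map.Entry<String, String> lit : literals) {
            String forced = lit.getValue().toLowerCase(Locale.ROOT);
            seen.add(literal(lit.getKey(), forced));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        List<Map.Entry<String, String>> literals = Arrays.asList(
            literal("example", "zh-Latn-pinyin"),
            literal("example", "zh-latn-pinyin"));

        System.out.println(countAsTerms(literals));        // prints 2
        System.out.println(countWithForcedCase(literals)); // prints 1
    }
}
```

Under these assumptions, the same input has two entries when compared as 
written and one entry once the case is forced, so any size-based reasoning 
gives different answers depending on which view is taken.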

RFC 5646 says tags "are to be treated as case insensitive", not that they are 
forced from one case to another. Canonical form is one way to work - it is not 
the only one.  "Treated" is qualified as "conveys the same meaning", not "is 
written the same".

SPARQL has {{langMatches}}.

The value 1 can be written "001"^^xsd:integer, "+1"^^xsd:integer and 
"1"^^xsd:integer. They are "treated" as the same value (meaning) in, say, a 
SPARQL {{FILTER(?x < 23)}}; they are written differently.  Context matters - 
what about "1"^^xsd:decimal or "1"^^xsd:double?
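
The written-form versus value distinction can be sketched in plain Java. This 
is only an illustration using java.math, not Jena's or SPARQL's actual 
evaluation; the class and method names are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

public class LexicalVsValue {
    // Term-style comparison: the written (lexical) forms must be identical.
    static boolean sameTerm(String a, String b) {
        return a.equals(b);
    }

    // Value-style comparison for integer lexical forms such as "001" or "+1".
    static boolean sameIntegerValue(String a, String b) {
        return new BigInteger(a).equals(new BigInteger(b));
    }

    // Analogue of FILTER(?x < 23) applied to an integer lexical form.
    static boolean lessThan23(String lexical) {
        return new BigInteger(lexical).compareTo(BigInteger.valueOf(23)) < 0;
    }

    public static void main(String[] args) {
        String[] forms = {"001", "+1", "1"};
        for (String f : forms) {
            // Each line prints: <form> false-or-true true true
            System.out.println(f + " " + sameTerm(f, "1") + " "
                + sameIntegerValue(f, "1") + " " + lessThan23(f));
        }

        // Across datatypes the numbers still compare equal numerically;
        // whether that counts as "the same" depends on the context.
        BigDecimal asDecimal = new BigDecimal("1");
        BigDecimal fromInteger = new BigDecimal(new BigInteger("001"));
        System.out.println(asDecimal.compareTo(fromInteger) == 0); // prints true
        System.out.println(Double.parseDouble("1") == 1.0);        // prints true
    }
}
```

Only "1" matches "1" as written, while all three forms have the same integer 
value and all three pass the less-than-23 test.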






> Model.isIsomorphicWith() returns false when language tags case do not match
> ---------------------------------------------------------------------------
>
>                 Key: JENA-1377
>                 URL: https://issues.apache.org/jira/browse/JENA-1377
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: Jena 3.3.0
>         Environment: Linux
>            Reporter: Elie Roux
>
> Model.isIsomorphicWith() treats language tags in a case-sensitive way, which 
> is against BCP47 spec. It is easily shown with an example:
> {noformat}
>            Model m1 = ModelFactory.createDefaultModel();
>            Resource r = m1.getResource("http://example.com/resource");
>            Property p = m1.getProperty("http://example.com/property");
>            m1.add(r, p, m1.createLiteral("example", "zh-Latn-pinyin")); // canonical
>            Model m2 = ModelFactory.createDefaultModel();
>            r = m2.getResource("http://example.com/resource");
>            p = m2.getProperty("http://example.com/property");
>            m2.add(r, p, m1.createLiteral("example", "zh-latn-pinyin")); // lower case
>            System.out.println(m1.isIsomorphicWith(m2));
> {noformat}
> prints false, while it clearly should print true. Related bug (which is not 
> really a bug per se, just a trigger for this one): 
> https://github.com/jsonld-java/jsonld-java/issues/199
> See also https://issues.apache.org/jira/browse/COMMONSRDF-51 for some 
> consideration of the language tag case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
