[ 
https://issues.apache.org/jira/browse/COMMONSRDF-51?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15824637#comment-15824637
 ] 

ASF GitHub Bot commented on COMMONSRDF-51:
------------------------------------------

Github user ansell commented on a diff in the pull request:

    https://github.com/apache/commons-rdf/pull/30#discussion_r96309778
  
    --- Diff: api/src/test/java/org/apache/commons/rdf/api/AbstractRDFTest.java ---
    @@ -194,6 +194,114 @@ public void testCreateLiteralLangISO693_3() throws Exception {
             assertEquals("\"Herbert Van de Sompel\"@vls", vls.ntriplesString());
         }
     
    +    public void testCreateLiteralLangCaseInsensitive() throws Exception {
    +        // COMMONSRDF-51: Literal langtag may not be in lowercase, but
    +        // must be COMPARED (aka .equals and .hashCode()) in lowercase
    +        // as the language space is lower case.       
    +        final Literal lower = factory.createLiteral("Hello", "en-gb"); 
    +        final Literal upper = factory.createLiteral("Hello", "EN-GB"); 
    +        final Literal mixed = factory.createLiteral("Hello", "en-GB");
    +
    +        
    +        assertEquals("en-gb", lower.getLanguageTag().get());
    --- End diff --
    
    RDF4J may not follow this in some cases. It may instead apply the BCP47
    case normalisation conventions, in which case this would return en-GB.
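
    To illustrate the difference mentioned above, here is a minimal sketch
    using only java.util.Locale. It is not RDF4J's code; the class and method
    names are hypothetical, and Locale.forLanguageTag is used only as a
    convenient stand-in for BCP47-style case normalisation (it may also
    canonicalise some tags in other ways).

```java
import java.util.Locale;

// Hypothetical demo class, not part of Commons RDF or RDF4J.
final class Bcp47CaseDemo {

    // BCP47-style casing via the JDK: language lowercase, region uppercase.
    static String bcp47Case(String tag) {
        return Locale.forLanguageTag(tag).toLanguageTag();
    }

    // RDF-1.1 value space: the whole tag in lowercase, with a fixed locale.
    static String rdfValueSpace(String tag) {
        return tag.toLowerCase(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        System.out.println(bcp47Case("en-gb"));      // en-GB
        System.out.println(rdfValueSpace("en-GB"));  // en-gb
    }
}
```

    An implementation that returns the first form from getLanguageTag() would
    fail an assertion expecting "en-gb", even though both forms denote the
    same language tag.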


> RDF-1.1 specifies that language tags need to be compared using lower-case
> -------------------------------------------------------------------------
>
>                 Key: COMMONSRDF-51
>                 URL: https://issues.apache.org/jira/browse/COMMONSRDF-51
>             Project: Apache Commons RDF
>          Issue Type: Bug
>          Components: api
>    Affects Versions: 0.3.0
>            Reporter: Peter Ansell
>            Assignee: Stian Soiland-Reyes
>
> The RDF-1.1 specification states that the [value space of Literal language
> tags is
> lowercase|https://www.w3.org/TR/rdf11-concepts/#section-Graph-Literal], which
> does not conflict with the case-insensitive specification in BCP47. The
> Literal.equals and Literal.hashCode API contracts should specify that
> language tags must be compared in lowercase, even if they are otherwise
> stored and returned as upper-case by getLanguageTag. The API documentation
> currently says "character-by-character" for language tag comparisons, which
> is incorrect because it implies that case-sensitive comparisons are used.
> The lowercasing must also be done using a consistent locale (a known
> example where lowercase and uppercase do not round-trip as expected for
> US-ASCII characters is Turkish [1]), so I would recommend explicitly stating
> that .toLowerCase(Locale.ENGLISH) is used.
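
The comparison rule described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions: the class and helper names are hypothetical and are not part of the Commons RDF API, and the hash helper merely shows that hashCode has to use the same lowercased form as equals.

```java
import java.util.Locale;
import java.util.Objects;

// Hypothetical helper class, not part of the Commons RDF API.
final class LangTagDemo {

    // Lowercase with a fixed locale so the result does not depend on the
    // JVM's default locale.
    static String normalise(String tag) {
        return tag.toLowerCase(Locale.ENGLISH);
    }

    // Case-insensitive comparison of two language tags.
    static boolean sameLangTag(String a, String b) {
        return normalise(a).equals(normalise(b));
    }

    // Hash that is consistent with sameLangTag for equal lexical forms.
    static int literalHash(String lexicalForm, String tag) {
        return Objects.hash(lexicalForm, normalise(tag));
    }

    public static void main(String[] args) {
        System.out.println(sameLangTag("en-gb", "EN-GB")); // true

        // Under a Turkish locale, "I" lowercases to dotless "ı" (U+0131),
        // so an ASCII tag no longer matches its expected lowercase form.
        Locale turkish = Locale.forLanguageTag("tr");
        System.out.println("FI".toLowerCase(turkish).equals("fi"));        // false
        System.out.println("FI".toLowerCase(Locale.ENGLISH).equals("fi")); // true
    }
}
```

With this approach getLanguageTag() can still return the tag as it was supplied, while equals and hashCode agree for "en-gb", "EN-GB" and "en-GB".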



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
