Github user kinow commented on the issue:

    https://github.com/apache/jena/pull/237
  
    >I agree with the other commenters, the general order should be (lang, lex) to avoid potentially inconsistent ordering.
    
    Ack, that makes sense +1
    
    >Also the language tag may not match any Locale. We also need to have unit tests that verify that the code works in corner cases like this.
    
    Sure, tests and more defensive programming will come later. Right now I'm looking more for comments on how to sort, where to sort, etc.
    
    Besides typos/misspellings, there are also valid tags such as i-klingon (I believe this is mentioned in some specification linked from the SPARQL spec page). For cases like this I think we would simply try to match against the JVM's available locales, and if there is no match, fall back to normal string comparison.
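
    A minimal sketch of that fallback, not the code in this pull request. The class and method names (LangTagCollation, comparatorFor) are hypothetical; only the java.text.Collator and java.util.Locale calls are standard JDK API. Which tags count as "known" depends on the locale data shipped with the JVM.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;

public class LangTagCollation {

    /**
     * Hypothetical helper: returns a locale-aware comparator when the JVM has
     * collation data for the tag, otherwise plain String comparison.
     */
    static Comparator<String> comparatorFor(String langTag) {
        // forLanguageTag never throws; ill-formed subtags are simply dropped.
        Locale locale = Locale.forLanguageTag(langTag);

        // Only trust the tag if the JVM lists it among its collation locales.
        boolean known = Arrays.asList(Collator.getAvailableLocales()).contains(locale);
        if (!known) {
            // Fallback: normal string comparison (String.compareTo).
            return Comparator.naturalOrder();
        }

        Collator collator = Collator.getInstance(locale);
        return collator::compare;
    }
}
```

    With this shape, a tag the JVM does not recognise still sorts deterministically, just without language-specific rules.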
    
    >But what about subtags like en-US and en-GB? If the language tag is the primary sort key, then all en-GB values would sort before "a"@en-US, which I think would be confusing for most users.
    
    The sort order and collation locale could be based on just the main tag (en in this case) ignoring the subtags, but I'm quite sure there is some language subtag out there in the world that requires a different collation order from that of the main language...
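
    To make the (lang, lex) idea concrete, here is a rough sketch that keys on the primary language subtag only, so en-GB and en-US values interleave. LangLexOrder, LangString and BY_PRIMARY_LANG_THEN_LEX are hypothetical names for illustration and are not Jena classes; the final tie-break on the full tag is my own assumption, added only to keep the order total.

```java
import java.text.Collator;
import java.util.Comparator;
import java.util.Locale;

public class LangLexOrder {

    /** Hypothetical stand-in for a language-tagged literal. */
    static final class LangString {
        final String lex;
        final String lang;

        LangString(String lex, String lang) {
            this.lex = lex;
            this.lang = lang;
        }

        @Override
        public String toString() {
            return "\"" + lex + "\"@" + lang;
        }
    }

    /** "en-GB" -> "en"; subtags are ignored for ordering purposes. */
    static String primaryLanguage(String langTag) {
        return Locale.forLanguageTag(langTag).getLanguage();
    }

    /** Order by primary language first, then by lexical form using that language's collator. */
    static final Comparator<LangString> BY_PRIMARY_LANG_THEN_LEX = (a, b) -> {
        String la = primaryLanguage(a.lang);
        String lb = primaryLanguage(b.lang);
        int byLang = la.compareTo(lb);
        if (byLang != 0) {
            return byLang;
        }
        Collator collator = Collator.getInstance(Locale.forLanguageTag(la));
        int byLex = collator.compare(a.lex, b.lex);
        if (byLex != 0) {
            return byLex;
        }
        // Assumed tie-break: the full tag, so equal lexical forms order consistently.
        return a.lang.compareToIgnoreCase(b.lang);
    };
}
```

    The cost of this choice is the one noted above: a subtag whose collation differs from its main language would be sorted with the main language's rules.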
    
    The sort order of accented letters is different for fr-CA and fr-FR.
    
    fr-FR:
    
    * cote
    * coté
    * côte
    * côté
    
    fr-CA:
    
    * cote
    * côte
    * coté
    * côté
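
    One way to check this on a given JVM, assuming the French locales fr-FR and fr-CA are the ones meant here. FrenchAccentOrder is a hypothetical class name. Whether the two lists actually differ depends on the collation data bundled with the JDK in use, so no expected output is shown.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FrenchAccentOrder {

    /** Sorts the four sample words with the collator the JVM provides for the locale. */
    static List<String> sorted(Locale locale) {
        List<String> words = new ArrayList<>(Arrays.asList(
                "c\u00f4t\u00e9", // côté
                "cot\u00e9",      // coté
                "c\u00f4te",      // côte
                "cote"));
        words.sort(Collator.getInstance(locale));
        return words;
    }

    public static void main(String[] args) {
        System.out.println("fr-FR: " + sorted(Locale.FRANCE));
        System.out.println("fr-CA: " + sorted(Locale.CANADA_FRENCH));
    }
}
```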



