Hello,

I've been trying to get this working for some time now. The problem is with Polish characters in OrientDB (e.g. 'ą', 'ę', 'ć', 'ź', 'ż'). When using the order by clause, the results aren't ordered properly.

Test case:

1. Create a new database (test)
2. Connect to test
3. Create class Book
4. Create a 'name' property (string) in Book
5. Add 6 records to Book:
   - name = "ącki"
   - name = "abrakadabra"
   - name = "baran"
   - name = "bączek"
   - name = "ćwierkacz"
   - name = "czarny"
6. select * from Book order by name asc

*Expected result (sort order by name):*

abrakadabra
ącki
baran
bączek
czarny
ćwierkacz

*Received result:*

abrakadabra
baran
bączek
czarny
ącki
ćwierkacz

I have already tried two things:

- A Lucene index with the analyzer org.apache.lucene.analysis.pl.PolishAnalyzer. It doesn't seem to work: the index distinguishes "a" from "ą", and I don't see a way to set the field as an ICUCollationField, which works as expected in Solr.
- The normalize function (i.e. select * from Book order by name.normalize()). This works almost as expected, except that "ą" ends up in front of "a". It also seems that normalize() was meant for a different purpose than being used in order by.

To summarize, I expect order by to work for any language that uses diacritics, so it should handle German ü, Polish ą, Czech č, etc. In Polish, the letter "ą" comes after "a" but before "b".

What would be the proper way to get this working? The only engine I have found that handles this properly is ArangoDB. Neo4j has the same problem, and now that I am trying OrientDB I cannot get it to work either.

Would someone be so kind as to point me in the right direction on how to approach this issue?

Best Regards
Rafal
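
The two orderings in the test case can be reproduced outside the database with a minimal Python sketch. It is only an illustration and uses no OrientDB feature: the names POLISH_ALPHABET, RANK and polish_key are made up for this example, and the claim that the received result matches a plain code-point comparison is an observation about these six values, not a statement about how OrientDB implements order by.

```python
# Illustration only: sorting is done in the client, not in OrientDB.
# Polish alphabet in dictionary order (q, v, x included for loanwords).
POLISH_ALPHABET = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {ch: i for i, ch in enumerate(POLISH_ALPHABET)}


def polish_key(word):
    """Build a sort key from each letter's position in the Polish alphabet.

    Characters outside the alphabet sort after all letters, by code point.
    """
    return [
        (RANK[ch], 0) if ch in RANK else (len(RANK), ord(ch))
        for ch in word.lower()
    ]


names = ["ącki", "abrakadabra", "baran", "bączek", "ćwierkacz", "czarny"]

# Plain code-point comparison: every accented letter sorts after 'z'.
by_code_point = sorted(names)

# Alphabet-aware comparison: 'ą' sorts between 'a' and 'b', 'ć' after 'c'.
by_polish_alphabet = sorted(names, key=polish_key)

print(by_code_point)
print(by_polish_alphabet)
```

For these six names, the first list matches the received result and the second matches the expected result. A hand-written alphabet only covers one language; a general solution would need locale-aware collation rules rather than a fixed table like this one.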
