[ https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051824#comment-13051824 ]
Dawid Weiss commented on LUCENE-3206:
-------------------------------------

This is fine -- "Sorting of UTF-8 strings as arrays of unsigned bytes will produce the same results as sorting them based on Unicode code points.", http://en.wikipedia.org/wiki/UTF-8. Indeed, when you look at how UTF-8 encodes multibyte code points, you can see that code point order is preserved.

> FST package API refactoring
> ---------------------------
>
>                 Key: LUCENE-3206
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3206
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: core/FSTs
>    Affects Versions: 3.2
>            Reporter: Dawid Weiss
>            Assignee: Dawid Weiss
>            Priority: Minor
>             Fix For: 3.3, 4.0
>
>         Attachments: LUCENE-3206.patch
>
>
> The current API is still marked @experimental, so I think there's still time
> to fiddle with it. I've been using the current API for some time and I do
> have some ideas for improvement. This is a placeholder for these -- I'll post
> a patch once I have a working proof of concept.
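
The ordering claim in the comment (UTF-8 bytes compared as unsigned values sort the same way as Unicode code points) can be checked with a small standalone sketch. The class and method names below are hypothetical and are not part of the Lucene FST package or the attached patch; the sample strings are chosen only for illustration. The last check also shows that Java's String.compareTo, which compares UTF-16 code units, can disagree with both orders once supplementary characters are involved.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not Lucene code.
class Utf8OrderSketch {

    // Lexicographic comparison treating each byte as unsigned (0..255).
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Lexicographic comparison by Unicode code point (not by UTF-16 code unit).
    static int compareCodePoints(String a, String b) {
        int i = 0, j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other: the shorter one sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    public static void main(String[] args) {
        String[] samples = {
            "a",                                    // 1-byte UTF-8 sequence
            "\u00E9",                               // 2-byte sequence
            "\u20AC",                               // 3-byte sequence
            "\uFF5E",                               // 3-byte sequence, high in the BMP
            new String(Character.toChars(0x10000))  // 4-byte sequence, surrogate pair in UTF-16
        };

        // For every pair, the sign of both comparisons should agree.
        for (String x : samples) {
            for (String y : samples) {
                int byBytes = Integer.signum(compareUnsignedBytes(
                        x.getBytes(StandardCharsets.UTF_8),
                        y.getBytes(StandardCharsets.UTF_8)));
                int byCodePoints = Integer.signum(compareCodePoints(x, y));
                if (byBytes != byCodePoints) {
                    throw new AssertionError("order mismatch");
                }
            }
        }

        // UTF-16 code unit order differs: U+FF5E compares greater than U+10000 here,
        // because the high surrogate 0xD800 is numerically smaller than 0xFF5E.
        System.out.println(samples[3].compareTo(samples[4]) > 0);
    }
}
```

Masking with 0xFF matters because Java bytes are signed; without it, any lead or continuation byte above 0x7F would compare as negative and the order would break for non-ASCII input.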