[
https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051824#comment-13051824
]
Dawid Weiss commented on LUCENE-3206:
-------------------------------------
This is fine -- "Sorting of UTF-8 strings as arrays of unsigned bytes will
produce the same results as sorting them based on Unicode code points.",
http://en.wikipedia.org/wiki/UTF-8.
Indeed, if you look at how UTF-8 encodes multibyte code points, you can see
that code point order is preserved.
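As a rough illustration of the property quoted above, the following sketch sorts a few well-formed strings twice, once by their UTF-8 bytes treated as unsigned values and once by code point, then checks that the two orders agree. The class name `Utf8OrderDemo` and both helper methods are hypothetical names made up for this example; they are not part of Lucene or of the FST package.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Lucene.
class Utf8OrderDemo {

    /** Compares two strings by their UTF-8 encodings, treating each byte as unsigned. */
    static int compareUtf8Bytes(String a, String b) {
        byte[] x = a.getBytes(StandardCharsets.UTF_8);
        byte[] y = b.getBytes(StandardCharsets.UTF_8);
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Mask with 0xFF because Java bytes are signed.
            int diff = (x[i] & 0xFF) - (y[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return x.length - y.length;
    }

    /** Compares two strings code point by code point. */
    static int compareCodePoints(String a, String b) {
        int[] x = a.codePoints().toArray();
        int[] y = b.codePoints().toArray();
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            if (x[i] != y[i]) {
                return Integer.compare(x[i], y[i]);
            }
        }
        return Integer.compare(x.length, y.length);
    }

    public static void main(String[] args) {
        // 1-, 2-, 3- and 4-byte UTF-8 sequences; the last but one is U+1F600.
        String[] words = { "z", "\u00e9", "\u20ac", "\uD83D\uDE00", "\uFFFD", "a" };

        String[] byBytes = words.clone();
        String[] byCodePoints = words.clone();
        Arrays.sort(byBytes, Utf8OrderDemo::compareUtf8Bytes);
        Arrays.sort(byCodePoints, Utf8OrderDemo::compareCodePoints);

        // Prints whether both sort orders are identical.
        System.out.println(Arrays.equals(byBytes, byCodePoints));
    }
}
```

The sample deliberately includes U+FFFD next to a supplementary character (U+1F600). Java's `String.compareTo` compares UTF-16 code units, so it places the surrogate pair before U+FFFD, whereas both comparators above place U+FFFD first. The sketch assumes well-formed input: an unpaired surrogate would be replaced during encoding and the comparison would no longer be meaningful.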
> FST package API refactoring
> ---------------------------
>
> Key: LUCENE-3206
> URL: https://issues.apache.org/jira/browse/LUCENE-3206
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/FSTs
> Affects Versions: 3.2
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Priority: Minor
> Fix For: 3.3, 4.0
>
> Attachments: LUCENE-3206.patch
>
>
> The current API is still marked @experimental, so I think there's still time
> to fiddle with it. I've been using the current API for some time and I do
> have some ideas for improvement. This is a placeholder for these -- I'll post
> a patch once I have a working proof of concept.