[ https://issues.apache.org/jira/browse/LUCENE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13051725#comment-13051725 ]
Dawid Weiss commented on LUCENE-3206:
-------------------------------------
UTF-32 is basically a plain code point representation, so there are no
surrogates (as in UTF-16) and no special encoding of higher code points (as in
UTF-8). I don't know what sort order is used inside Lucene (is it UTF-8
byte-by-byte values or decoded code points?). If it is code point order then
no problem -- UTF-32 input preserves that order.
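
To make the sort-order question concrete, here is a minimal sketch (not Lucene
code; the class and method names are made up for illustration) comparing one
BMP character with one supplementary character in three ways. It relies only on
a general property of the encodings: unsigned UTF-8 byte order agrees with code
point order, while UTF-16 code unit order does not once surrogates are involved.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Lucene.
public class SortOrderDemo {
    // Lexicographic comparison of byte arrays, treating bytes as unsigned.
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    // Lexicographic comparison by decoded code points (the UTF-32 view).
    static int compareCodePoints(String a, String b) {
        int[] ca = a.codePoints().toArray();
        int[] cb = b.codePoints().toArray();
        int n = Math.min(ca.length, cb.length);
        for (int i = 0; i < n; i++) {
            if (ca[i] != cb[i]) {
                return Integer.compare(ca[i], cb[i]);
            }
        }
        return ca.length - cb.length;
    }

    // Comparison of the UTF-8 encodings, byte by byte.
    static int compareUtf8(String a, String b) {
        return compareUnsignedBytes(
            a.getBytes(StandardCharsets.UTF_8),
            b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String bmp = "\uFFFD";                                 // U+FFFD, one UTF-16 unit
        String supp = new String(Character.toChars(0x10400));  // U+10400, surrogate pair

        // Code point order and UTF-8 byte order agree: U+FFFD sorts first.
        System.out.println(Integer.signum(compareCodePoints(bmp, supp))); // -1
        System.out.println(Integer.signum(compareUtf8(bmp, supp)));       // -1
        // UTF-16 code unit order disagrees: 0xFFFD > 0xD801 (high surrogate).
        System.out.println(Integer.signum(bmp.compareTo(supp)));          // 1
    }
}
```

So if terms are ordered by UTF-8 bytes or by code points, feeding UTF-32 ints
gives the same ordering; only a UTF-16 code unit ordering would differ.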
I'll stick to BYTE1/BYTE4 inputs then for now and I'll try to push this patch
forward in my spare time.
> FST package API refactoring
> ---------------------------
>
> Key: LUCENE-3206
> URL: https://issues.apache.org/jira/browse/LUCENE-3206
> Project: Lucene - Java
> Issue Type: Improvement
> Components: core/FSTs
> Affects Versions: 3.2
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Priority: Minor
> Fix For: 3.3, 4.0
>
> Attachments: LUCENE-3206.patch
>
>
> The current API is still marked @experimental, so I think there's still time
> to fiddle with it. I've been using the current API for some time and I do
> have some ideas for improvement. This is a placeholder for these -- I'll post
> a patch once I have a working proof of concept.