[ https://issues.apache.org/jira/browse/LUCENE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless reassigned LUCENE-1434:
------------------------------------------
Assignee: Michael McCandless
> IndexableBinaryStringTools: convert arbitrary byte sequences into Strings
> that can be used as index terms, and vice versa
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: LUCENE-1434
> URL: https://issues.apache.org/jira/browse/LUCENE-1434
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Other
> Affects Versions: 2.4
> Reporter: Steven Rowe
> Assignee: Michael McCandless
> Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1434.patch
>
>
> Provides support for converting byte sequences to Strings that can be used as
> index terms, and back again. The resulting Strings preserve the original byte
> sequences' sort order (assuming the bytes are interpreted as unsigned).
> The Strings are constructed using a Base 8000h encoding of the original
> binary data - each char of an encoded String represents a 15-bit chunk from
> the byte sequence. Base 8000h was chosen because it allows all of the lower
> 15 bits of a char to be used without restriction. Using char's high bit as
> well would require complicated handling to avoid the surrogate range
> [U+D800-U+DFFF], which does not represent valid chars.
> This class is intended to serve as a mechanism to allow CollationKeys to
> serve as index terms.
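
To make the 15-bit chunking in the description concrete, here is a minimal
sketch in Java. It is not the code from LUCENE-1434.patch: the class name
Base8000hSketch, the method names, and the trailing char that records the
number of padding bits are all assumptions made for illustration, and the
patch may mark the end of the data differently. With this trailer, the sketch
keeps unsigned byte order for inputs of equal length, but it makes no claim
about ordering when one input is a prefix of another.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical illustration only; not the class proposed in the patch.
final class Base8000hSketch {

    // Packs the input into chars that each carry 15 bits (values 0x0000-0x7FFF).
    // Assumption: one extra trailing char stores how many padding bits were added.
    static String encode(byte[] input) {
        int totalBits = input.length * 8;
        int chunks = (totalBits + 14) / 15;
        int padBits = chunks * 15 - totalBits;
        StringBuilder out = new StringBuilder(chunks + 1);
        int buffer = 0;        // pending bits, most recent byte in the low end
        int bitsInBuffer = 0;
        for (byte b : input) {
            buffer = (buffer << 8) | (b & 0xFF);   // treat the byte as unsigned
            bitsInBuffer += 8;
            if (bitsInBuffer >= 15) {
                bitsInBuffer -= 15;
                out.append((char) ((buffer >>> bitsInBuffer) & 0x7FFF));
                buffer &= (1 << bitsInBuffer) - 1; // keep only the leftover bits
            }
        }
        if (bitsInBuffer > 0) {
            // Left-align the remaining bits and zero-fill the rest of the chunk.
            out.append((char) ((buffer << (15 - bitsInBuffer)) & 0x7FFF));
        }
        out.append((char) padBits);
        return out.toString();
    }

    // Reverses encode(): reads 15 bits per char and drops the padding bits.
    static byte[] decode(String encoded) {
        int chunks = encoded.length() - 1;
        int padBits = encoded.charAt(chunks);
        byte[] out = new byte[(chunks * 15 - padBits) / 8];
        int buffer = 0;
        int bitsInBuffer = 0;
        int pos = 0;
        for (int i = 0; i < chunks && pos < out.length; i++) {
            buffer = (buffer << 15) | encoded.charAt(i);
            bitsInBuffer += 15;
            while (bitsInBuffer >= 8 && pos < out.length) {
                bitsInBuffer -= 8;
                out[pos++] = (byte) (buffer >>> bitsInBuffer);
                buffer &= (1 << bitsInBuffer) - 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // A CollationKey's byte form is the kind of input the issue has in mind.
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        byte[] key = collator.getCollationKey("index").toByteArray();
        String term = encode(key);
        System.out.println("term length: " + term.length());
        System.out.println("round trip ok: " + Arrays.equals(key, decode(term)));
    }
}
```

Because every emitted char stays below 8000h, none of them can fall in the
surrogate range, so the resulting String needs no special handling there.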
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.