[ https://issues.apache.org/jira/browse/LUCENE-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159528#comment-14159528 ]
Shai Erera commented on LUCENE-5989:
------------------------------------
bq. Is that simply a typo
Yes, fixed :).
The term 'keyword' is, of course, overloaded here. In proposing KeywordField, I
am following the existing Keyword* classes: KeywordTokenizer, KeywordAnalyzer,
and KeywordAttribute. As far as I remember, when users ask how to parse
'keywords' that they indexed as StringFields, we usually tell them to use
PerFieldAnalyzerWrapper with a KeywordAnalyzer for that field. That is why I
feel KeywordField fits better with the overall Keyword* TokenStream API.
> Add BinaryField, to index a single binary token
> -----------------------------------------------
>
> Key: LUCENE-5989
> URL: https://issues.apache.org/jira/browse/LUCENE-5989
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 5.0, Trunk
>
> Attachments: LUCENE-5989.patch
>
>
> 5 years ago (LUCENE-1458) we "enabled" fully binary terms in the
> lowest levels of Lucene (the codec APIs) yet today, actually adding an
> arbitrary byte[] binary term during indexing is far from simple: you
> must make a custom Field with a custom TokenStream and a custom
> TermToBytesRefAttribute, as far as I know.
> This is supremely expert; I wonder whether anyone out there has
> succeeded in doing so.
> I think we should make indexing a single byte[] as simple as indexing
> a single String.
> This is a precursor for issues like LUCENE-5596 (encoding an IPv6
> address as byte[16]) and LUCENE-5879 (encoding native numeric values
> in their simple binary form).
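
To make the two encodings mentioned in the description concrete, here is a
minimal sketch using only the JDK. It is not Lucene code and not taken from the
attached patch: the class and method names (BinaryTermEncoding, ipv6Bytes,
longBytes, compareUnsigned) are hypothetical, and the sign-bit flip for longs is
just one common way to make byte order match numeric order, assumed here for
illustration rather than being what LUCENE-5879 necessarily does.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only; none of this is Lucene API.
final class BinaryTermEncoding {

    /**
     * Returns the 16 raw bytes of an IPv6 literal such as "2001:db8::1".
     * Pass literals only: InetAddress.getByName may do a DNS lookup for host names.
     */
    public static byte[] ipv6Bytes(String literal) {
        try {
            InetAddress addr = InetAddress.getByName(literal);
            if (!(addr instanceof Inet6Address)) {
                throw new IllegalArgumentException("not an IPv6 address: " + literal);
            }
            return addr.getAddress(); // network byte order, 16 bytes
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("cannot parse: " + literal, e);
        }
    }

    /**
     * Encodes a long as 8 big-endian bytes with the sign bit flipped, so that
     * comparing the bytes as unsigned values agrees with numeric order.
     */
    public static byte[] longBytes(long value) {
        return ByteBuffer.allocate(8).putLong(value ^ Long.MIN_VALUE).array();
    }

    /** Lexicographic comparison of two byte arrays, treating bytes as unsigned. */
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        System.out.println("IPv6 term length: " + ipv6Bytes("2001:db8::1").length);
        System.out.println("-5 sorts before 3: "
            + (compareUnsigned(longBytes(-5L), longBytes(3L)) < 0));
    }
}
```

Each resulting byte[] is the kind of single binary token the issue wants to be
as easy to index as a String; how it would actually be handed to a field
depends on the API the patch ends up with.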