[
https://issues.apache.org/jira/browse/LUCENE-580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12465797
]
Nadav Har'El commented on LUCENE-580:
-------------------------------------
This patch will be useful for users of LUCENE-755, the payloads patch. That patch
adds "payloads" to tokens, but using it to attach payloads to a few tokens in
some field can be ugly, because the code has to be split across two places: in
one place you add the field as plain text, and in another you write a special
analyzer that works only on that field, recognizes the specific tokens, and adds
the payloads to them. This patch makes that easier: when you add a field, you
can add it pre-analyzed, as a list of tokens that already carry their special
payloads.
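As a rough illustration of the idea above, here is a minimal sketch in plain
Java. It does not use the Lucene API or the classes from either patch: Token,
PreAnalyzedField and buildTitleField are hypothetical stand-ins, and the payload
bytes are made up for the example. It only shows the shape of the workflow, where
the field and its token payloads are built in one place with no field-specific
analyzer.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for illustration; none of these types are Lucene classes.
final class PreAnalyzedSketch {

    /** A term plus an optional payload (null when the token carries none). */
    static final class Token {
        final String term;
        final byte[] payload;

        Token(String term, byte[] payload) {
            this.term = term;
            this.payload = payload;
        }
    }

    /** A field whose value is a ready-made token list rather than raw text. */
    static final class PreAnalyzedField {
        final String name;
        final List<Token> tokens;

        PreAnalyzedField(String name, List<Token> tokens) {
            if (name == null || tokens == null) {
                throw new NullPointerException("name and tokens must be non-null");
            }
            this.name = name;
            // Defensive copy so later changes to the caller's list do not leak in.
            this.tokens = Collections.unmodifiableList(new ArrayList<Token>(tokens));
        }
    }

    /** Builds the field and its payloads together, with no custom analyzer. */
    static PreAnalyzedField buildTitleField() {
        byte[] boost = "boost=2".getBytes(StandardCharsets.UTF_8); // made-up payload
        return new PreAnalyzedField("title", Arrays.asList(
                new Token("pre", null),
                new Token("analyzed", boost),   // only this token has a payload
                new Token("fields", null)));
    }

    public static void main(String[] args) {
        PreAnalyzedField field = buildTitleField();
        for (Token t : field.tokens) {
            String payload = t.payload == null
                    ? "(none)"
                    : new String(t.payload, StandardCharsets.UTF_8);
            System.out.println(field.name + ": " + t.term + " payload=" + payload);
        }
    }
}
```

With the payloads patch alone, the equivalent logic would presumably live in a
separate analyzer that has to recognize "analyzed" in the "title" field and
attach the payload there.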
I have just a few comments on this patch:
1. The description above suggests that it might not work if the same field name
is used for two Fields, one stored and the other pre-analyzed. I think it is
important that this combination (as well as all other combinations) is
supported. I actually use all these combinations in my code, and I don't see
why it should cause problems.
2. The patch has some strange changes in the comments, changing the word
"Index" to "NotificationService". I bet this wasn't intentional :-)
3. The new Field constructor still has an "Index" parameter, taking TOKENIZED,
UN_TOKENIZED or NO_NORMS (only NO is forbidden). I wonder, what's the
difference between TOKENIZED and UN_TOKENIZED in this case? The NO_NORMS is a
very useful case, because it allows you to do something not previously possible
in Lucene (a tokenized field, but without norms). Perhaps this parameter should
be better documented in the javadoc comment.
4. In the new Field constructor's comment, the phrase "if name or reader"
should be "if name or tokenStream".
Thanks!
> Pre-analyzed fields
> -------------------
>
> Key: LUCENE-580
> URL: https://issues.apache.org/jira/browse/LUCENE-580
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Affects Versions: 1.9
> Reporter: Karl Wettin
> Priority: Minor
> Attachments: preanalyze.tar
>
>
> Adds the possibility to set a TokenStream at Field construction time,
> available as tokenStreamValue in addition to stringValue, readerValue and
> binaryValue.
> There might be some problems with mixing stored fields with the same name as
> a field with tokenStreamValue.