[
https://issues.apache.org/jira/browse/LUCENE-7866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16045457#comment-16045457
]
Uwe Schindler edited comment on LUCENE-7866 at 6/10/17 8:45 AM:
----------------------------------------------------------------
OK, I figured it out:
DelimitedPayloadTokenFilter never fails in TestFactories or
TestRandomChains, because it is impossible to "randomly" get a PayloadEncoder
for ints (you need to pass a special value to the factory). By default it
just encodes the text after the delimiter as a string payload, so it
never fails.
But in our case the constructor takes just a char (the delimiter), and the part
after the delimiter MUST be an integer. That breaks with all kinds of
random data.
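
To illustrate the difference, here is a minimal sketch in plain Java (no Lucene classes). The class and method names are hypothetical stand-ins, not the actual filter or encoder code, and splitting at the first delimiter is an assumption. It only shows why a string-encoding path accepts any suffix while an integer-only path rejects random text:

```java
import java.nio.charset.StandardCharsets;

public class DelimiterSuffixDemo {

    /** Hypothetical stand-in for a string payload encoder: any suffix is accepted. */
    public static byte[] encodeAsString(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        if (idx < 0) {
            return null; // no delimiter, so no payload
        }
        return token.substring(idx + 1).getBytes(StandardCharsets.UTF_8);
    }

    /** Hypothetical stand-in for integer-only handling: a non-numeric suffix throws. */
    public static int parseAsInt(String token, char delimiter) {
        int idx = token.indexOf(delimiter);
        if (idx < 0) {
            return 1; // assumption: a token without a delimiter counts once
        }
        // Integer.parseInt throws NumberFormatException for non-numeric input
        return Integer.parseInt(token.substring(idx + 1));
    }

    public static void main(String[] args) {
        String randomToken = "foo|bar";
        // The string path copes with arbitrary text after the delimiter.
        System.out.println("string payload bytes: " + encodeAsString(randomToken, '|').length);
        // The integer path does not.
        try {
            parseAsInt(randomToken, '|');
        } catch (NumberFormatException e) {
            System.out.println("integer path rejected the token: " + e.getMessage());
        }
    }
}
```

A random-data test only ever exercises the first path for the payload filter, which is why it passes there and fails here.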
was (Author: thetaphi):
OK, I figured it out:
DelimitedPayloadTokenFilter never fails in TestFactories or
TestRandomChains, because it is impossible to "randomly" get a PayloadEncoder
for ints (you need to pass a special value to the factory). By default it
just encodes the text after the delimiter as a string payload, so it
never fails.
But in our case the constructor takes just an int (the delimiter), and the part
after the delimiter MUST be an integer. That breaks with all kinds of
random data.
> Add TokenFilter to add custom term frequency (like
> DelimitedPayloadTokenFilter)
> -------------------------------------------------------------------------------
>
> Key: LUCENE-7866
> URL: https://issues.apache.org/jira/browse/LUCENE-7866
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/analysis
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: master (7.0)
>
> Attachments: LUCENE-7866.patch, LUCENE-7866.patch, LUCENE-7866.patch,
> LUCENE-7866.patch
>
>
> This is a follow-up to LUCENE-7854. It adds a simple {{TokenFilter}},
> similar to {{DelimitedPayloadTokenFilter}}, that can be used to index a custom term
> frequency: {{"token|5"}} will index the token "token" with a term freq of 5.
> The effect is the same as adding the token 5 times with a "repeat token filter".
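
The equivalence described in the issue can be sketched in plain Java. This is an illustration under stated assumptions, not the patch itself: the class and method names are hypothetical, a simple map stands in for the index, the token is split at the first delimiter, and a token without a delimiter is counted once.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TermFreqSketch {

    /** Counts "term|freq" tokens, using the integer after the delimiter as the frequency. */
    public static Map<String, Integer> countWithDelimiter(List<String> tokens, char delimiter) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            int idx = token.indexOf(delimiter);
            String term = idx < 0 ? token : token.substring(0, idx);
            // assumption: no delimiter means a frequency of 1
            int freq = idx < 0 ? 1 : Integer.parseInt(token.substring(idx + 1));
            counts.merge(term, freq, Integer::sum);
        }
        return counts;
    }

    /** Counts plain tokens, one occurrence each. */
    public static Map<String, Integer> countPlain(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : tokens) {
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    /** Mimics a "repeat token filter" by emitting the term the given number of times. */
    public static List<String> repeat(String term, int times) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < times; i++) {
            out.add(term);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Integer> viaDelimiter =
            countWithDelimiter(Collections.singletonList("token|5"), '|');
        Map<String, Integer> viaRepeat = countPlain(repeat("token", 5));
        System.out.println("same frequencies: " + viaDelimiter.equals(viaRepeat));
    }
}
```

Encoding the frequency in the token avoids pushing the same term through the analysis chain repeatedly, while the resulting count is the same.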