[ https://issues.apache.org/jira/browse/LUCENE-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483625#comment-16483625 ]
ASF subversion and git services commented on LUCENE-8273:
---------------------------------------------------------
Commit 24c186eff9a9b2b2c0a86fc0a828bd81ba0993e8 in lucene-solr's branch
refs/heads/master from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=24c186e ]
LUCENE-8273: Don't wrap MinHashFilter in a condition
MinHashFilter needs to consume the entire tokenstream, so wrapping it in a
randomized condition makes no sense, and breaks offsets.
> Add a ConditionalTokenFilter
> ----------------------------
>
> Key: LUCENE-8273
> URL: https://issues.apache.org/jira/browse/LUCENE-8273
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Major
> Fix For: 7.4
>
> Attachments: LUCENE-8273-2.patch, LUCENE-8273-2.patch,
> LUCENE-8273-part2-rebased.patch, LUCENE-8273-part2-rebased.patch,
> LUCENE-8273-part2.patch, LUCENE-8273-part2.patch, LUCENE-8273.patch,
> LUCENE-8273.patch, LUCENE-8273.patch, LUCENE-8273.patch, LUCENE-8273.patch,
> LUCENE-8273.patch, LUCENE-8273.patch, LUCENE-8273.patch
>
>
> Spinoff of LUCENE-8265. It would be useful to be able to wrap a TokenFilter
> in such a way that it could optionally be bypassed based on the current state
> of the TokenStream. This could be used to, for example, only apply
> WordDelimiterFilter to terms that contain hyphens.
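
The idea in the issue description can be illustrated with a minimal, self-contained sketch. It does not use the Lucene API: the class ConditionalFilterSketch and the methods applyConditionally and splitWords are hypothetical names invented for this illustration, and splitWords is only a rough stand-in for WordDelimiterFilter (it splits on hyphens and on lower-to-upper case changes). Tokens are modelled as plain strings rather than as TokenStream state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical illustration only; not the Lucene ConditionalTokenFilter API.
public class ConditionalFilterSketch {

    // Applies `filter` to tokens that satisfy `condition`;
    // every other token bypasses the filter and is passed through unchanged.
    public static List<String> applyConditionally(
            List<String> tokens,
            Predicate<String> condition,
            Function<String, List<String>> filter) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (condition.test(token)) {
                out.addAll(filter.apply(token));
            } else {
                out.add(token);
            }
        }
        return out;
    }

    // Rough stand-in for a word-delimiter filter (an assumption, not the real
    // filter's behaviour): splits on '-' and on lower-to-upper case changes.
    public static List<String> splitWords(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            boolean caseChange = current.length() > 0
                    && Character.isLowerCase(token.charAt(i - 1))
                    && Character.isUpperCase(c);
            if (c == '-' || caseChange) {
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
                if (c == '-') {
                    continue; // the hyphen itself is dropped
                }
            }
            current.append(c);
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("Wi-Fi", "PowerShot", "router");
        // Only tokens containing a hyphen are handed to the wrapped filter.
        System.out.println(applyConditionally(
                tokens, t -> t.contains("-"), ConditionalFilterSketch::splitWords));
    }
}
```

In this sketch "PowerShot" reaches the output intact because it contains no hyphen, whereas applying splitWords unconditionally would also split it at the case change. A filter that must see the whole stream, such as the MinHashFilter mentioned in the commit message, does not fit this per-token pattern.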