[
https://issues.apache.org/jira/browse/LUCENE-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483624#comment-16483624
]
ASF subversion and git services commented on LUCENE-8273:
---------------------------------------------------------
Commit 0934e2a998ac43e46594e049daab751d8cae2476 in lucene-solr's branch
refs/heads/branch_7x from [~romseygeek]
[ https://git-wip-us.apache.org/repos/asf?p=lucene-solr.git;h=0934e2a ]
LUCENE-8273: Don't wrap MinHashFilter in a condition
MinHashFilter needs to consume the entire tokenstream, so wrapping it in a
randomized condition makes no sense, and breaks offsets.
> Add a ConditionalTokenFilter
> ----------------------------
>
> Key: LUCENE-8273
> URL: https://issues.apache.org/jira/browse/LUCENE-8273
> Project: Lucene - Core
> Issue Type: New Feature
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Major
> Fix For: 7.4
>
> Attachments: LUCENE-8273-2.patch, LUCENE-8273-2.patch,
> LUCENE-8273-part2-rebased.patch, LUCENE-8273-part2-rebased.patch,
> LUCENE-8273-part2.patch, LUCENE-8273-part2.patch, LUCENE-8273.patch,
> LUCENE-8273.patch, LUCENE-8273.patch, LUCENE-8273.patch, LUCENE-8273.patch,
> LUCENE-8273.patch, LUCENE-8273.patch, LUCENE-8273.patch
>
>
> Spinoff of LUCENE-8265. It would be useful to be able to wrap a TokenFilter
> in such a way that it could optionally be bypassed based on the current state
> of the TokenStream. This could be used to, for example, only apply
> WordDelimiterFilter to terms that contain hyphens.
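
To illustrate the idea in the description above, here is a minimal standalone sketch in plain Java. It does not use Lucene's classes and is not the API added by this issue; `ConditionalFilterSketch`, `splitLikeWordDelimiter` and `filterConditionally` are hypothetical names, and the splitting rule is only a rough stand-in for WordDelimiterFilter. It models a token stream as a list of strings and bypasses the wrapped filter for any token that fails the condition.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

public class ConditionalFilterSketch {

    // Hypothetical stand-in for the wrapped filter: splits a token on
    // non-alphanumeric characters and at letter/digit boundaries.
    public static List<String> splitLikeWordDelimiter(String token) {
        List<String> parts = new ArrayList<>();
        String boundary = "[^A-Za-z0-9]+|(?<=[A-Za-z])(?=[0-9])|(?<=[0-9])(?=[A-Za-z])";
        for (String part : token.split(boundary)) {
            if (!part.isEmpty()) {
                parts.add(part);
            }
        }
        return parts;
    }

    // Applies 'filter' only to tokens for which 'shouldFilter' is true;
    // all other tokens pass through unchanged.
    public static List<String> filterConditionally(
            List<String> tokens,
            Predicate<String> shouldFilter,
            Function<String, List<String>> filter) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (shouldFilter.test(token)) {
                out.addAll(filter.apply(token));
            } else {
                out.add(token);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("wi-fi", "sd500", "router");
        // Only tokens containing a hyphen reach the wrapped filter.
        List<String> result = filterConditionally(
                tokens,
                t -> t.contains("-"),
                ConditionalFilterSketch::splitLikeWordDelimiter);
        System.out.println(result); // prints [wi, fi, sd500, router]
    }
}
```

Without the condition, the stand-in filter would also split "sd500" into "sd" and "500". The sketch decides per token, which is also why a filter that must see the whole stream (such as MinHashFilter in the commit above) is a poor fit for this kind of wrapping.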