[ 
https://issues.apache.org/jira/browse/LUCENE-8323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16489305#comment-16489305
 ] 

Jim Ferenczi commented on LUCENE-8323:
--------------------------------------

{quote}
What I'd really like to see is this thing enhanced to use 
GraphTokenStreamFiniteStrings. That did not exist at the time it was developed. 
It would add proper support for different position increments & lengths so that 
the analysis chain can usefully add synonyms, etc.
{quote}

You can also check the CompletionTokenStream in the suggest package. It does 
exactly what you want and it's already a TokenStream, so maybe it can be renamed 
and moved to the analysis module?

> New ConcatenateFilter, a TokenFilter to concat/join tokens
> ----------------------------------------------------------
>
>                 Key: LUCENE-8323
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8323
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/analysis
>            Reporter: David Smiley
>            Assignee: David Smiley
>            Priority: Major
>         Attachments: LUCENE-8323.patch
>
>
> Here I introduce the ConcatenateFilter (with Factory) to concatenate/join 
> tokens with a provided separator to produce one final token.  It's similar to 
> FingerprintFilter but doesn't deduplicate or sort.  It's useful for doing 
> exact-ish search on short text (think names or titles) with simple analysis.  
> At this task, it's faster than a PhraseQuery equivalent, and solves the issue 
> of matching completely and not a portion of the tokens.  It's also useful for 
> using Lucene to hold a dictionary of short names/phrases for 
> entity-extraction (aka text tagging).  The OpenSextant SolrTextTagger uses it 
> for this purpose, which is where I'm taking it from.
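
To make the contrast in the description concrete, here is a minimal sketch of 
concatenating versus fingerprinting. It is an illustration only: plain strings 
stand in for tokens, and the class and method names (ConcatenateSketch, 
concatenate, fingerprint) are hypothetical. It is not the code in 
LUCENE-8323.patch and does not use the Lucene TokenFilter API.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration: plain strings stand in for analyzed tokens.
// This is NOT the filter from the attached patch.
final class ConcatenateSketch {

    // Concatenate-style: join tokens in their original order, keeping duplicates.
    static String concatenate(List<String> tokens, char separator) {
        return String.join(String.valueOf(separator), tokens);
    }

    // Fingerprint-style: sort and deduplicate first, then join.
    static String fingerprint(List<String> tokens, char separator) {
        return String.join(String.valueOf(separator), new TreeSet<>(tokens));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("new", "york", "new");
        System.out.println(concatenate(tokens, ' ')); // new york new
        System.out.println(fingerprint(tokens, ' ')); // new york
    }
}
```

Because the whole field value becomes a single token, an exact term lookup on 
that token matches the complete text rather than a subset of its tokens, which 
is the "exact-ish" behavior the description refers to.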



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
