GitHub user cbeer opened a pull request:

    https://github.com/apache/lucene-solr/pull/308

    Add a suggester that operates on tokenized values from a field

    The `TokenizingSuggester` is very similar to the 
`AnalyzingInfixSuggester` (and presumably it could be merged into, or extend, the 
`AnalyzingInfixSuggester`), but it adds one feature: a 
`tokenizingAnalyzer` that lets us pre-tokenize suggestions into pieces of a 
manageable size (perhaps single words, shingles of multiple words, or perhaps 
even NLP-extracted noun phrases).
    
    Our use case is providing autocomplete suggestions for searching within the OCR 
text of a document (search-within is powered by highlighting), and we're 
dealing with some page-level OCR that can easily exceed the 32k size limit on 
the `AnalyzingInfixSuggester`'s `exacttext` string field.
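
To illustrate the idea of pre-tokenizing a large text into small suggestion candidates, here is a minimal sketch in plain Java. It is not the code from this pull request and does not use Lucene; the class name `ShingleSketch`, its methods, and the whitespace-only tokenization are assumptions made for illustration. In a real setup the `tokenizingAnalyzer` would presumably be a Lucene `Analyzer` (for example one that includes a shingle filter).

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only: splits a text into word shingles so that
// each suggestion candidate stays far below a per-term size limit.
public class ShingleSketch {

    // Assumed limit, matching the "32k" mentioned above
    // (Lucene's maximum indexed term length is 32766 bytes).
    static final int MAX_TERM_BYTES = 32766;

    // Returns every run of 1..maxSize consecutive whitespace-separated words.
    static List<String> shingles(String text, int maxSize) {
        List<String> out = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out; // nothing to suggest for blank input
        }
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder sb = new StringBuilder();
            for (int n = 0; n < maxSize && start + n < words.length; n++) {
                if (n > 0) {
                    sb.append(' ');
                }
                sb.append(words[start + n]);
                out.add(sb.toString());
            }
        }
        return out;
    }

    // True if the UTF-8 encoding of the candidate fits within the assumed limit.
    static boolean fitsTermLimit(String candidate) {
        return candidate.getBytes(StandardCharsets.UTF_8).length <= MAX_TERM_BYTES;
    }

    public static void main(String[] args) {
        for (String s : shingles("quick brown fox", 2)) {
            System.out.println(s + " (fits: " + fitsTermLimit(s) + ")");
        }
    }
}
```

With a page of OCR text as input, the whole page may be too large to store as one exact-text value, while each one- or two-word shingle produced this way is only a few bytes long.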
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cbeer/lucene-solr tokenizing-suggester-upstreamable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/308.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #308
    
----
commit c516bcaabbe6214ba4938859d6775ae7992fed0a
Author: Chris Beer <cabeer@...>
Date:   2018-01-16T21:29:51Z

    Add a suggester that operates on tokenized values from a field

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
