GitHub user cbeer opened a pull request: https://github.com/apache/lucene-solr/pull/308

    Add a suggester that operates on tokenized values from a field

    The `TokenizingSuggester` is suspiciously similar to the `AnalyzingInfixSuggester` (and presumably it could be merged into or extend the `AnalyzingInfixSuggester`), but with an additional feature (the `tokenizingAnalyzer`) that allows us to pre-tokenize suggestions into a manageable size (perhaps single words, shingles of multiple words, or even NLP-extracted noun phrases).

    Our use case is providing autocomplete suggestions for searching within the OCR text of a document (searching within is powered by highlighting), and we're dealing with some page-level OCR that can easily exceed the 32k size limit for the `AnalyzingInfixSuggester`'s `exacttext` string field.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cbeer/lucene-solr tokenizing-suggester-upstreamable

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/lucene-solr/pull/308.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #308

----
commit c516bcaabbe6214ba4938859d6775ae7992fed0a
Author: Chris Beer <cabeer@...>
Date:   2018-01-16T21:29:51Z

    Add a suggester that operates on tokenized values from a field

----
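
For readers unfamiliar with the idea of pre-tokenizing suggestions, here is a minimal sketch in plain Java. It is not the code from this pull request and does not use Lucene: the class `ShingleSketch`, its methods, and the whitespace split (standing in for whatever the `tokenizingAnalyzer` is configured to do) are all hypothetical names chosen for illustration. The 32766-byte constant assumes that the "32k size limit" mentioned above refers to Lucene's maximum indexed term length (`IndexWriter.MAX_TERM_LENGTH`).

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the pull request.
final class ShingleSketch {
    // Assumption: the "32k" limit is Lucene's maximum term length in bytes.
    static final int MAX_TERM_BYTES = 32766;

    /**
     * Splits text on whitespace and returns the distinct, lowercased
     * shingles of 1..maxShingleSize consecutive words, in first-seen order.
     */
    static Set<String> shingles(String text, int maxShingleSize) {
        Set<String> out = new LinkedHashSet<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return out;
        }
        String[] words = trimmed.toLowerCase(Locale.ROOT).split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder sb = new StringBuilder();
            // Grow the shingle one word at a time, emitting each length.
            for (int n = 0; n < maxShingleSize && start + n < words.length; n++) {
                if (n > 0) {
                    sb.append(' ');
                }
                sb.append(words[start + n]);
                out.add(sb.toString());
            }
        }
        return out;
    }

    /** True if the string's UTF-8 encoding fits within the assumed term limit. */
    static boolean fitsInTerm(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length <= MAX_TERM_BYTES;
    }
}
```

With this sketch, `ShingleSketch.shingles("the quick brown", 2)` yields the five candidates `the`, `the quick`, `quick`, `quick brown`, and `brown`. A full page of OCR text may fail `fitsInTerm` as a single value, while each of its shingles fits comfortably, which is the motivation described above.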