[
https://issues.apache.org/jira/browse/LUCENE-4845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13605868#comment-13605868
]
Robert Muir commented on LUCENE-4845:
-------------------------------------
{quote}
I think so ... but then I worry about the FST blowing up. I guess if we limit
how "deep" the infixing can work that would limit the FST size ... but I'd
rather not have that limit.
{quote}
But how is this any different than edge-ngrams up to a limit?
With words of <= 4 chars, this suggester avoids the typical bad complexity you
would get from an inverted index because the docids are pre-sorted in
weight-order, so it can early terminate.
But as soon as you type that 5th character, it can blow up. I'm not saying it's
likely, but it can happen due to particulars of the content: for example, if you
had place names and you typed Shangh... and this prefix matched millions and
millions of terms.
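
To make the concern concrete, here is a rough toy model in plain Java. It is not the code from the patch and uses no Lucene APIs; the class name, the `lookup` method, and the `postingsListsRead` counter are all made up for illustration. It assumes suggestion ids are assigned in descending weight order and that edge-ngrams are indexed up to 4 chars. A prefix of <= 4 chars reads one postings list and stops after k hits, while a longer prefix has to enumerate every term it matches and merge their postings before it can return the top k.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy model only: every name here is hypothetical, none of it is Lucene code. */
final class WeightSortedPrefixSketch {
    static final int MAX_NGRAM = 4; // assumed edge-ngram limit

    static final class Result {
        final List<String> hits;
        final int postingsListsRead; // rough stand-in for lookup cost
        Result(List<String> hits, int postingsListsRead) {
            this.hits = hits;
            this.postingsListsRead = postingsListsRead;
        }
    }

    // Ids are assigned in descending weight order, so id 0 is the heaviest suggestion.
    private final List<String> suggestions = new ArrayList<>();
    // Edge n-gram (length <= MAX_NGRAM) -> ids in ascending id order, i.e. weight order.
    private final Map<String, List<Integer>> ngramPostings = new HashMap<>();
    // Full token -> ids; the sorted map lets us enumerate all tokens sharing a prefix.
    private final TreeMap<String, List<Integer>> termPostings = new TreeMap<>();

    WeightSortedPrefixSketch(Map<String, Long> weighted) {
        List<Map.Entry<String, Long>> sorted = new ArrayList<>(weighted.entrySet());
        sorted.sort(Map.Entry.<String, Long>comparingByValue().reversed()
                .thenComparing(Map.Entry.<String, Long>comparingByKey()));
        for (Map.Entry<String, Long> e : sorted) {
            int id = suggestions.size();
            suggestions.add(e.getKey());
            for (String token : e.getKey().toLowerCase(Locale.ROOT).split("\\s+")) {
                if (token.isEmpty()) continue;
                add(termPostings, token, id);
                for (int n = 1; n <= Math.min(MAX_NGRAM, token.length()); n++) {
                    add(ngramPostings, token.substring(0, n), id);
                }
            }
        }
    }

    // Appends id unless it is already the last entry (ids arrive in ascending order).
    private static void add(Map<String, List<Integer>> index, String key, int id) {
        List<Integer> ids = index.computeIfAbsent(key, k -> new ArrayList<>());
        if (ids.isEmpty() || ids.get(ids.size() - 1) != id) ids.add(id);
    }

    Result lookup(String rawPrefix, int k) {
        String prefix = rawPrefix.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        if (prefix.length() <= MAX_NGRAM) {
            // One postings list, already in weight order: stop after k entries.
            List<Integer> ids = ngramPostings.getOrDefault(prefix, Collections.emptyList());
            for (int i = 0; i < ids.size() && hits.size() < k; i++) {
                hits.add(suggestions.get(ids.get(i)));
            }
            return new Result(hits, 1);
        }
        // Longer prefix: visit every matching term and merge its postings first.
        TreeSet<Integer> merged = new TreeSet<>();
        int lists = 0;
        for (Map.Entry<String, List<Integer>> e : termPostings.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) break;
            merged.addAll(e.getValue());
            lists++;
        }
        for (int id : merged) {
            if (hits.size() == k) break;
            hits.add(suggestions.get(id));
        }
        return new Result(hits, lists);
    }
}
```

In this model the number of postings lists read for a prefix longer than 4 chars grows with the number of distinct terms sharing that prefix, which is the case I am worried about.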
> Add AnalyzingInfixSuggester
> ---------------------------
>
> Key: LUCENE-4845
> URL: https://issues.apache.org/jira/browse/LUCENE-4845
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spellchecker
> Reporter: Michael McCandless
> Assignee: Michael McCandless
> Fix For: 5.0, 4.3
>
> Attachments: infixSuggest.png, LUCENE-4845.patch, LUCENE-4845.patch
>
>
> Our current suggester impls do prefix matching of the incoming text
> against all compiled suggestions, but in some cases it's useful to
> allow infix matching. E.g., Netflix does infix suggestions in their
> search box.
> I did a straightforward impl, just using a normal Lucene index, and
> using PostingsHighlighter to highlight matching tokens in the
> suggestions.
> I think this likely only works well when your suggestions have a
> strong prior ranking (weight input to build), e.g. Netflix knows
> the popularity of movies.