[ https://issues.apache.org/jira/browse/SOLR-2378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13015631#comment-13015631 ]
Michael McCandless commented on SOLR-2378:
------------------------------------------

Wow, those improvements are awesome: the FST has a 26.7X smaller RAM footprint and 18X faster lookups, though build time is 3.6X slower.

This is built on a composite reader, right? Does the build time include the time to enumerate the terms from MultiTermsEnum?

> FST-based Lookup (suggestions) for prefix matches.
> --------------------------------------------------
>
> Key: SOLR-2378
> URL: https://issues.apache.org/jira/browse/SOLR-2378
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Reporter: Dawid Weiss
> Assignee: Dawid Weiss
> Labels: lookup, prefix
> Fix For: 4.0
>
>
> Implement a subclass of Lookup based on finite state automata/transducers
> (Lucene FST package). This issue is for implementing a relatively basic
> prefix matcher; we will handle infixes and other types of input matches
> gradually. Impl. phases:
> - write a DFA based suggester effectively identical to ternary tree based
>   solution right now,
> - baseline benchmark against tern. tree (memory consumption, rebuilding
>   speed, indexing speed; reuse Andrzej's benchmark code)
> - modify DFA to encode term weights directly in the automaton (optimize for
>   onlyMostPopular case)
> - benchmark again
> - add infix suggestion support with prefix matches boosted higher (?)
> - benchmark again
> - modify the tutorial on the wiki [http://wiki.apache.org/solr/Suggester]
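
For readers unfamiliar with what the quoted description means by a "relatively basic prefix matcher" with term weights, here is a minimal sketch in Java. It is not the SOLR-2378 patch and does not use the Lucene FST package: it swaps in a plain, unminimized character trie, so it shows the lookup behaviour only and none of the memory savings discussed above. The class name `PrefixSuggester` and its `add` and `lookup` methods are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration: a plain trie, NOT the Lucene FST API or the actual patch. */
public class PrefixSuggester {

    /** One trie node; children are sorted so traversal order is deterministic. */
    private static final class Node {
        final TreeMap<Character, Node> children = new TreeMap<>();
        boolean isTerm;  // true if a complete term ends at this node
        long weight;     // term weight (e.g. popularity), meaningful only when isTerm is true
    }

    /** A matched term together with its weight, used only for ranking. */
    private static final class Match {
        final String term;
        final long weight;
        Match(String term, long weight) { this.term = term; this.weight = weight; }
    }

    private final Node root = new Node();

    /** Adds a term with its weight, overwriting any earlier weight for the same term. */
    public void add(String term, long weight) {
        Node node = root;
        for (int i = 0; i < term.length(); i++) {
            node = node.children.computeIfAbsent(term.charAt(i), c -> new Node());
        }
        node.isTerm = true;
        node.weight = weight;
    }

    /** Returns up to max terms that start with prefix, heaviest first, ties in term order. */
    public List<String> lookup(String prefix, int max) {
        // Walk down to the node that represents the prefix, if it exists.
        Node node = root;
        for (int i = 0; i < prefix.length(); i++) {
            node = node.children.get(prefix.charAt(i));
            if (node == null) {
                return Collections.emptyList();
            }
        }
        // Collect every term below that node, then rank by weight.
        List<Match> matches = new ArrayList<>();
        collect(node, new StringBuilder(prefix), matches);
        matches.sort((a, b) -> {
            int byWeight = Long.compare(b.weight, a.weight);
            return byWeight != 0 ? byWeight : a.term.compareTo(b.term);
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(max, matches.size()); i++) {
            result.add(matches.get(i).term);
        }
        return result;
    }

    /** Depth-first walk that records every complete term under the given node. */
    private static void collect(Node node, StringBuilder path, List<Match> out) {
        if (node.isTerm) {
            out.add(new Match(path.toString(), node.weight));
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, out);
            path.setLength(path.length() - 1);
        }
    }
}
```

This sketch collects all matches and sorts them at query time. The phase that encodes weights directly in the automaton is aimed at avoiding exactly that work for the onlyMostPopular case, which a trie like this one does not attempt.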