[ https://issues.apache.org/jira/browse/LUCENE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850545#action_12850545 ]
Michael McCandless commented on LUCENE-2351:
--------------------------------------------

The attached patch improves sneaky wildcard query "un*t" (on a 5M doc wikipedia index, matching 1058 terms --> 124623 docs) from 39.69 QPS -> 44.85 QPS (best of 5) on flex. But trunk is at 63.19 QPS so we still have more to do...

> optimize automatonquery
> -----------------------
>
>                 Key: LUCENE-2351
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2351
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Robert Muir
>            Priority: Minor
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2351.patch
>
>
> Mike found a few cases in flex where we have some bad behavior with
> automatonquery.
> The problem is similar to a database query planner, where sometimes simply
> doing a full table scan is faster than using an index.
> We can optimize automatonquery a little bit, and get better performance for
> fuzzy, wildcard, and regex queries.
> Here is a list of ideas:
> * create commonSuffixRef for infinite automata, not just really-bad linear
>   scan cases
> * do a null check rather than populating an empty commonSuffixRef
> * localize the 'linear' case to not seek, but instead scan, when ping-ponging
>   against loops in the state machine
> * add a mechanism to enable/disable the terms dict cache, e.g. we can disable
>   it for infinite cases, and maybe fuzzy N>1 also.
> * change the use of BitSet to OpenBitSet or long[] gen for path-tracking
> * optimize the backtracking code where it says /* String is good to go as-is
>   */, this need not be a full run(), I think...
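
For readers unfamiliar with the "long[] gen" idea in the path-tracking item of the list above, here is a minimal sketch of a generation-stamped visited set. It is only an illustration of the general technique, not the code in the attached patch: the class name GenerationVisitedSet and all of its methods are hypothetical, and automaton states are assumed to be numbered 0..numStates-1. The point is that starting a new pass costs one increment rather than clearing a bit set.

```java
// Hypothetical helper (not Lucene's actual code): records which automaton
// states have been visited during the current pass.
final class GenerationVisitedSet {
    // visited[s] holds the generation in which state s was last marked.
    private final long[] visited;
    // Starts at 1 so the zero-initialized array reads as "nothing marked".
    private long curGen = 1;

    GenerationVisitedSet(int numStates) {
        this.visited = new long[numStates];
    }

    /** Begins a new pass in O(1); every earlier mark becomes stale. */
    void nextGeneration() {
        curGen++;
    }

    /** Marks a state as visited in the current pass. */
    void mark(int state) {
        visited[state] = curGen;
    }

    /** Returns true only if the state was marked in the current pass. */
    boolean isMarked(int state) {
        return visited[state] == curGen;
    }
}
```

A caller would invoke nextGeneration() once per pass, then mark() and isMarked() while walking transitions. The trade-off is memory: one long per state instead of one bit, in exchange for never paying for a clear.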