[ 
https://issues.apache.org/jira/browse/LUCENE-2351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12850844#action_12850844
 ] 

Michael McCandless commented on LUCENE-2351:
--------------------------------------------

OOOH I like this approach!!  It makes the linear decision "local", and bounds 
(by linearUpperBound) the region so that we don't have to revisit the decision 
on every term.  And it enables efficiently using the suffix :)

And.... it's FAST!  With this fix, the hard query (un*t) on flex is 105 QPS 
(best of 5, on 5 M doc wikipedia index), vs 62 QPS on trunk.  Yay :)

> optimize automatonquery
> -----------------------
>
>                 Key: LUCENE-2351
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2351
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: Flex Branch
>            Reporter: Robert Muir
>            Priority: Minor
>             Fix For: Flex Branch
>
>         Attachments: LUCENE-2351.patch, LUCENE-2351.patch, 
> LUCENE-2351_infinite.patch, LUCENE-2351_infinite.patch
>
>
> Mike found a few cases in flex where we have some bad behavior with 
> automatonquery.
> The problem is similar to a database query planner, where sometimes simply 
> doing a full table scan is faster than using an index.
> We can optimize automatonquery a little bit, and get better performance for 
> fuzzy,wildcard,regex queries.
> Here is a list of ideas:
> * create commonSuffixRef for infinite automata, not just really-bad linear 
> scan cases
> * do a null check rather than populating an empty commonSuffixRef
> * localize the 'linear' case to not seek, but instead scan, when ping-ponging 
> against loops in the state machine
> * add a mechanism to enable/disable the terms dict cache, e.g. we can disable 
> it for infinite cases, and maybe fuzzy N>1 also.
> * change the use of BitSet to OpenBitSet or long[] gen for path-tracking
> * optimize the backtracking code where it says /* String is good to go as-is 
> */, this need not be a full run(), I think...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

Reply via email to