[ https://issues.apache.org/jira/browse/LUCENE-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12877145#action_12877145 ]
Edward Drapkin commented on LUCENE-2494:
----------------------------------------

That's MUCH better than what I had, kudos!

> Modify ParallelMultiSearcher to use a CompletionService instead of slowly polling for results
> ---------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-2494
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2494
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>         Environment: Irrelevant
>            Reporter: Edward Drapkin
>            Assignee: Simon Willnauer
>             Fix For: 3.1
>
>         Attachments: LUCENE-2494.patch, LUCENE-2494.patch
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> Right now, the parallel multi searcher creates an array/list of Future<V>
> representing each of the searchables that's being concurrently searched (and
> its corresponding search task).
> As it stands, once the tasks are all submitted to the executor, the array is
> iterated over, FIFO, and Future.get() is called on each element in turn. This
> works, but isn't ideal. It's entirely possible (and a situation I've run
> into) that one of the first searchables represents a large index that takes a
> long time to search, so the results of the other searchables can't be
> processed until the large index is done searching. In my case, we have two
> indexes with several million records that get searched ahead of some other
> indexes, the smallest of which has only a few tens of thousands of entries,
> and I didn't think it was ideal for the results of the other indexes to wait.
> I've modified ParallelMultiSearcher to use a CompletionService instead, so
> that results are processed in the order they are completed, rather than the
> order in which they are submitted. All the tests still pass, and to the best
> of my knowledge this won't break anything.
> This has several advantages:
> 1) Speed - the thread owning the executor doesn't have to wait for the first
> submitted task to finish in order to process the results of the other tasks,
> which may have finished first.
> 2) Removes several warnings (even if they are annotated away) due to the
> ugliness of typecasting generic arrays.
> 3) Decreases the complexity of the code in some cases, usually by removing
> the necessity of allocating and filling arrays.
> With a primed "cache" of searchables, I was getting 700-1200 ms per search,
> and using the same phrases, with this patch, I am now getting 400-500 ms per
> search :)
> Patch is attached.
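
For illustration, the completion-order processing described in the issue can be sketched roughly as follows. This is a minimal standalone sketch, not the attached patch: the class name CompletionOrderSketch, the fakeSearch helper, and the sleep times are hypothetical stand-ins for real searchables. Only the java.util.concurrent classes are real.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CompletionOrderSketch {

    /** Hypothetical stand-in for searching one sub-index; sleeps to simulate search cost. */
    static Callable<String> fakeSearch(final String name, final long millis) {
        return new Callable<String>() {
            public String call() throws InterruptedException {
                Thread.sleep(millis);
                return name;
            }
        };
    }

    /** Submits every task, then collects the results in the order the tasks finish. */
    static List<String> collectInCompletionOrder(ExecutorService executor,
                                                 List<Callable<String>> tasks)
            throws InterruptedException, ExecutionException {
        CompletionService<String> service = new ExecutorCompletionService<String>(executor);
        for (Callable<String> task : tasks) {
            service.submit(task);
        }
        List<String> results = new ArrayList<String>();
        for (int i = 0; i < tasks.size(); i++) {
            // take() blocks until any submitted task has finished, not a specific one
            results.add(service.take().get());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(3);
        try {
            List<Callable<String>> tasks = new ArrayList<Callable<String>>();
            tasks.add(fakeSearch("large", 600));   // submitted first, slowest
            tasks.add(fakeSearch("medium", 300));
            tasks.add(fakeSearch("small", 50));    // submitted last, fastest
            System.out.println(collectInCompletionOrder(executor, tasks));
        } finally {
            executor.shutdown();
        }
    }
}
```

With a FIFO loop over a list of Futures, the caller would block on "large" before touching the other two results. With take(), the faster tasks are normally handed back as soon as they finish, regardless of submission order.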