Michael McCandless created LUCENE-7792:
------------------------------------------

             Summary: Add optional concurrency to OfflineSorter
                 Key: LUCENE-7792
                 URL: https://issues.apache.org/jira/browse/LUCENE-7792
             Project: Lucene - Core
          Issue Type: Improvement
            Reporter: Michael McCandless
            Assignee: Michael McCandless
             Fix For: master (7.0), 6.6


OfflineSorter is a heavy operation and is really an embarrassingly concurrent
problem at heart: if you have enough hardware concurrency (e.g. fast SSDs,
multiple CPU cores), running it concurrently can be a big speedup.

E.g., after reading a partition from the input, one thread can sort and write
it, while another thread reads the next partition, etc.  Merging partitions can
also be done in the background.  Some things still cannot be concurrent, e.g.
the initial read from the input must be done by a single thread, as must the
final merge and the write to the final output.

I think I found a fairly non-invasive way to add optional concurrency to this 
class, by adding an optional ExecutorService to OfflineSorter's ctor (similar 
to IndexSearcher) and using futures to represent each partition as we sort, and 
creating Callable classes for sorting and merging partitions.
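
To make the idea concrete, here is a minimal sketch of the shape described
above. It is not the actual patch and not Lucene's OfflineSorter API: the class
ConcurrentSortSketch, the SortPartitionTask callable and the sort() method are
hypothetical names, and partitions are kept as in-memory lists instead of
temporary files. It only illustrates a single-threaded read, one future per
partition, an optional ExecutorService (null means "run on the calling
thread"), and a final merge done by the caller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** Hypothetical sketch, NOT Lucene's OfflineSorter: partitions live in memory, not in temp files. */
final class ConcurrentSortSketch {

  /** Stands in for "sort and write one partition"; here it only sorts a list in place. */
  static final class SortPartitionTask implements Callable<List<String>> {
    private final List<String> partition;

    SortPartitionTask(List<String> partition) {
      this.partition = partition;
    }

    @Override
    public List<String> call() {
      Collections.sort(partition);
      return partition;
    }
  }

  /** Cursor over one sorted partition, ordered by its current head element. */
  private static final class Cursor implements Comparable<Cursor> {
    private final Iterator<String> it;
    private String head;

    Cursor(Iterator<String> it) {
      this.it = it;
      this.head = it.next(); // callers only pass non-empty partitions
    }

    boolean advance() {
      if (!it.hasNext()) {
        return false;
      }
      head = it.next();
      return true;
    }

    @Override
    public int compareTo(Cursor other) {
      return head.compareTo(other.head);
    }
  }

  /** Sorts input; exec may be null, in which case every partition is sorted on the calling thread. */
  static List<String> sort(Iterable<String> input, int partitionSize, ExecutorService exec) {
    if (partitionSize < 1) {
      throw new IllegalArgumentException("partitionSize must be >= 1");
    }
    List<Future<List<String>>> futures = new ArrayList<>();
    List<String> current = new ArrayList<>();
    // The initial read stays single-threaded; full partitions are handed off for sorting.
    for (String item : input) {
      current.add(item);
      if (current.size() == partitionSize) {
        futures.add(submit(new SortPartitionTask(current), exec));
        current = new ArrayList<>();
      }
    }
    if (!current.isEmpty()) {
      futures.add(submit(new SortPartitionTask(current), exec));
    }

    // The final merge also runs on the calling thread: a k-way merge over the sorted partitions.
    PriorityQueue<Cursor> queue = new PriorityQueue<>();
    for (Future<List<String>> future : futures) {
      List<String> sorted;
      try {
        sorted = future.get(); // blocks until this partition has been sorted
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for a partition", e);
      } catch (ExecutionException e) {
        throw new IllegalStateException("sorting a partition failed", e.getCause());
      }
      if (!sorted.isEmpty()) {
        queue.add(new Cursor(sorted.iterator()));
      }
    }
    List<String> out = new ArrayList<>();
    while (!queue.isEmpty()) {
      Cursor smallest = queue.poll();
      out.add(smallest.head);
      if (smallest.advance()) {
        queue.add(smallest); // re-insert with its new head
      }
    }
    return out;
  }

  private static Future<List<String>> submit(SortPartitionTask task, ExecutorService exec) {
    // Without an executor, run inline and wrap the result in an already-completed future.
    return exec == null ? CompletableFuture.completedFuture(task.call()) : exec.submit(task);
  }
}
```

With this sketch a caller would pass something like
Executors.newFixedThreadPool(2) as the last argument and shut the pool down
afterwards, or pass null to keep everything on the calling thread. Background
merging of intermediate partitions, which the description above also mentions,
is left out to keep the example short.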




