Michael McCandless created LUCENE-7792:
------------------------------------------
Summary: Add optional concurrency to OfflineSorter
Key: LUCENE-7792
URL: https://issues.apache.org/jira/browse/LUCENE-7792
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: master (7.0), 6.6
Sorting with OfflineSorter is a heavy operation, yet at heart it is an
embarrassingly concurrent problem: given enough hardware concurrency (e.g. fast
SSDs, multiple CPU cores), running it concurrently can be a big speedup.
E.g., after reading a partition from the input, one thread can sort and write
it, while another thread reads the next partition, etc. Merging partitions can
also be done in the background. Some things still cannot be concurrent, e.g.
the initial read from the input must happen on a single thread, as must the
final merge and the write to the final output.
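
A minimal sketch of the pipeline described above, assuming in-memory lists of
integers stand in for on-disk partitions. The class and method names
(PipelinedSortSketch, sort, merge) are hypothetical and are not part of
Lucene. The calling thread is the only reader, each partition is sorted by a
worker while the reader moves on, and the final merge runs on the calling
thread again.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; not Lucene's OfflineSorter.
class PipelinedSortSketch {

  /** Sorts input by cutting it into partitions of at most partitionSize items. */
  static List<Integer> sort(List<Integer> input, int partitionSize) {
    if (partitionSize < 1) {
      throw new IllegalArgumentException("partitionSize must be >= 1");
    }
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      List<Future<List<Integer>>> sorted = new ArrayList<>();
      Iterator<Integer> reader = input.iterator(); // the single-threaded input read
      while (reader.hasNext()) {
        List<Integer> partition = new ArrayList<>(partitionSize);
        while (reader.hasNext() && partition.size() < partitionSize) {
          partition.add(reader.next());
        }
        // Hand the partition to a worker; this thread goes back to reading.
        sorted.add(pool.submit(() -> {
          Collections.sort(partition);
          return partition;
        }));
      }
      // Final merge: single-threaded, on the calling thread.
      List<List<Integer>> runs = new ArrayList<>();
      for (Future<List<Integer>> f : sorted) {
        runs.add(f.get());
      }
      return merge(runs);
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException(e);
    } finally {
      pool.shutdown();
    }
  }

  /** K-way merge of already sorted runs; heap entries are {value, run, position}. */
  static List<Integer> merge(List<List<Integer>> runs) {
    PriorityQueue<int[]> heap = new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
    for (int r = 0; r < runs.size(); r++) {
      if (!runs.get(r).isEmpty()) {
        heap.add(new int[] {runs.get(r).get(0), r, 0});
      }
    }
    List<Integer> out = new ArrayList<>();
    while (!heap.isEmpty()) {
      int[] head = heap.poll();
      out.add(head[0]);
      List<Integer> run = runs.get(head[1]);
      int next = head[2] + 1;
      if (next < run.size()) {
        heap.add(new int[] {run.get(next), head[1], next});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    System.out.println(sort(Arrays.asList(5, 3, 1, 4, 2, 0), 2));
  }
}
```

The pool size of 2 is arbitrary. Background merging of intermediate
partitions is left out here to keep the read/sort overlap easy to see.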
I think I found a fairly non-invasive way to add optional concurrency to this
class: add an optional ExecutorService to OfflineSorter's ctor (similar to
IndexSearcher), use futures to represent each partition as we sort, and create
Callable classes for sorting and merging partitions.
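
One way this design could look, as a rough sketch rather than the actual
patch: the class below takes an optional ExecutorService in its constructor,
represents every partition as a Future, and uses two Callable classes for
sorting and merging. All names (OptionalExecutorSortSketch, SortPartitionTask,
MergePartitionsTask) are assumptions made for illustration, and int arrays
stand in for on-disk partitions. With no executor, each task simply runs on
the calling thread.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

// Hypothetical illustration only; not Lucene's OfflineSorter API.
class OptionalExecutorSortSketch {
  private final ExecutorService exec; // null means "stay single-threaded"

  OptionalExecutorSortSketch() { this(null); }

  OptionalExecutorSortSketch(ExecutorService exec) { this.exec = exec; }

  /** Sorts one partition. */
  static final class SortPartitionTask implements Callable<int[]> {
    private final int[] partition;
    SortPartitionTask(int[] partition) { this.partition = partition; }
    @Override public int[] call() {
      Arrays.sort(partition);
      return partition;
    }
  }

  /** Merges two sorted partitions, waiting for both inputs first. */
  static final class MergePartitionsTask implements Callable<int[]> {
    private final Future<int[]> left;
    private final Future<int[]> right;
    MergePartitionsTask(Future<int[]> left, Future<int[]> right) {
      this.left = left;
      this.right = right;
    }
    @Override public int[] call() throws Exception {
      // Inputs are always submitted before the merge that consumes them; this
      // sketch relies on the pool starting tasks in submission order.
      int[] a = left.get();
      int[] b = right.get();
      int[] out = new int[a.length + b.length];
      int i = 0, j = 0, k = 0;
      while (i < a.length && j < b.length) out[k++] = a[i] <= b[j] ? a[i++] : b[j++];
      while (i < a.length) out[k++] = a[i++];
      while (j < b.length) out[k++] = b[j++];
      return out;
    }
  }

  private <T> Future<T> submit(Callable<T> task) {
    if (exec != null) return exec.submit(task);
    FutureTask<T> inline = new FutureTask<>(task);
    inline.run(); // no executor: do the work here, on the calling thread
    return inline;
  }

  int[] sort(int[] input, int partitionSize) {
    if (partitionSize < 1) throw new IllegalArgumentException("partitionSize must be >= 1");
    Deque<Future<int[]>> pending = new ArrayDeque<>();
    for (int start = 0; start < input.length; start += partitionSize) {
      int len = Math.min(partitionSize, input.length - start);
      pending.add(submit(new SortPartitionTask(Arrays.copyOfRange(input, start, start + len))));
    }
    if (pending.isEmpty()) return new int[0];
    while (pending.size() > 2) { // intermediate merges may run in the background
      Future<int[]> left = pending.poll();
      Future<int[]> right = pending.poll();
      pending.add(submit(new MergePartitionsTask(left, right)));
    }
    Future<int[]> first = pending.poll();
    Future<int[]> second = pending.poll(); // null when only one partition is left
    try {
      // The final merge stays on the calling thread.
      return second == null ? first.get() : new MergePartitionsTask(first, second).call();
    } catch (Exception e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(2);
    int[] data = {9, 4, 7, 1, 8, 2, 6, 3, 5, 0};
    System.out.println(Arrays.toString(new OptionalExecutorSortSketch(pool).sort(data, 3)));
    pool.shutdown();
  }
}
```

Passing no executor keeps the old single-threaded behavior, so existing
callers would not have to change; callers that want concurrency share a pool
they already own.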