[ 
https://issues.apache.org/jira/browse/LUCENE-4661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546927#comment-13546927
 ] 

Michael McCandless commented on LUCENE-4661:
--------------------------------------------

Marcus, I would stick with 3/1 ... but best would be to run experiments and see 
:)

Shawn, CMS will accept up to maxMergeCount merges, but if another merge then 
wants to kick off, CMS will pause the thread that "caused" that merge to be 
kicked off (i.e., pause the producers of segments).  So if maxMergeCount=4, then 
up to 4 merges will be queued up (with only one of them actually running, if 
maxThreadCount=1), but if your indexing thread(s) produce so many segments that 
a 5th merge now wants to run, they will be paused at that point, until one 
merge finishes and we are back to 4 queued merges.
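
To make the pausing behavior above concrete, here is a minimal toy model of it. 
This is only a sketch under stated assumptions, not Lucene's actual CMS code: 
the class MergeBackpressureModel, its simulate method, and the "tick" time unit 
are all hypothetical, every merge is assumed to cost the same number of ticks, 
and the single producer tries to start one merge per tick.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy, single-threaded model of merge backpressure (hypothetical, not Lucene code). */
public class MergeBackpressureModel {

    /** Summary of one simulated run. */
    public static final class Result {
        public final int totalTicks;   // ticks until all merges finished
        public final int stalledTicks; // ticks the producer spent paused
        public final int peakPending;  // most merges queued or running at once

        Result(int totalTicks, int stalledTicks, int peakPending) {
            this.totalTicks = totalTicks;
            this.stalledTicks = stalledTicks;
            this.peakPending = peakPending;
        }
    }

    /**
     * Each tick the producer tries to register one merge. If maxMergeCount
     * merges are already pending, the producer is paused for that tick.
     * Then the oldest maxThreadCount pending merges each do one tick of work.
     */
    public static Result simulate(int maxMergeCount, int maxThreadCount,
                                  int mergesToProduce, int ticksPerMerge) {
        List<Integer> pending = new ArrayList<>(); // remaining ticks per merge
        int produced = 0, ticks = 0, stalled = 0, peak = 0;

        while (produced < mergesToProduce || !pending.isEmpty()) {
            ticks++;
            if (produced < mergesToProduce) {
                if (pending.size() >= maxMergeCount) {
                    stalled++;                  // producer is paused this tick
                } else {
                    pending.add(ticksPerMerge); // merge accepted into the queue
                    produced++;
                }
            }
            peak = Math.max(peak, pending.size());

            // Only the first maxThreadCount merges actually run; walk backwards
            // so removing a finished merge does not shift unprocessed entries.
            int running = Math.min(maxThreadCount, pending.size());
            for (int i = running - 1; i >= 0; i--) {
                int left = pending.get(i) - 1;
                if (left == 0) {
                    pending.remove(i);
                } else {
                    pending.set(i, left);
                }
            }
        }
        return new Result(ticks, stalled, peak);
    }

    public static void main(String[] args) {
        Result r = simulate(4, 1, 6, 3);
        System.out.println("totalTicks=" + r.totalTicks
            + " stalledTicks=" + r.stalledTicks
            + " peakPending=" + r.peakPending);
    }
}
```

In this model the number of pending merges never exceeds maxMergeCount; 
lowering maxMergeCount just makes the producer spend more ticks paused, which 
is the trade-off to measure in a real experiment.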
                
> Reduce default maxMerge/ThreadCount for ConcurrentMergeScheduler
> ----------------------------------------------------------------
>
>                 Key: LUCENE-4661
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4661
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: 4.1, 5.0
>
>
> I think our current defaults (maxThreadCount=#cores/2,
> maxMergeCount=maxThreadCount+2) are too high ... I've frequently found
> merges falling behind and then slowing each other down when I index on
> a spinning-magnets drive.
> As a test, I indexed all of English Wikipedia with term-vectors (=
> heavy on merging), using 6 threads ... at the defaults
> (maxThreadCount=3, maxMergeCount=5, for my machine) it took 5288 sec
> to index & wait for merges & commit.  When I changed to
> maxThreadCount=1, maxMergeCount=2, indexing time dropped to 2902
> seconds (45% less time).  This is on a spinning-magnets disk ... basically
> spinning-magnets disks don't handle the concurrent IO well.
> Then I tested an OCZ Vertex 3 SSD: at the current defaults it took
> 1494 seconds and at maxThreadCount=1, maxMergeCount=2 it took 1795 sec
> (20% slower).  Net/net the SSD can handle merge concurrency just fine.
> I think we should change the defaults: spinning magnet drives are hurt
> by the current defaults more than SSDs are helped ... apps that know
> their IO system is fast can always increase the merge concurrency.
