[ 
https://issues.apache.org/jira/browse/LUCENE-5705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008387#comment-14008387
 ] 

Shai Erera commented on LUCENE-5705:
------------------------------------

Orthogonal to this issue, but it sounds like you're doing a large initial 
import from JDBC, maybe even a one-time import? In that case, it may be 
better to disable merges entirely during the import, then re-enable them and 
call maybeMerge (I'm not sure whether Solr has a command for that) with a 
MergeScheduler that allows more concurrent merges than the default. Just an 
idea...
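
A rough sketch of that idea at the Lucene level, for illustration only. It 
assumes a Lucene 5.x-or-later API on the classpath (on 4.8 the 
IndexWriterConfig constructor and the NoMergePolicy constant differ), the 
index path is hypothetical, and the thread count of 3 is an arbitrary 
illustrative value; only the maxMergeCount of 6 comes from the issue. Because 
the merge scheduler cannot be swapped on a live writer, the sketch closes the 
import writer and opens a second one for the merge phase. It does not show 
how Solr would drive this.

```java
import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NoMergePolicy;
import org.apache.lucene.index.TieredMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class BulkImportSketch {
  public static void main(String[] args) throws IOException {
    // Hypothetical index location, chosen only for this example.
    try (Directory dir = FSDirectory.open(Paths.get("/tmp/bulk-import-index"))) {

      // Phase 1: import with merging disabled, so no merge can stall
      // the thread that is reading from the JDBC ResultSet.
      IndexWriterConfig importConfig = new IndexWriterConfig(new StandardAnalyzer());
      importConfig.setMergePolicy(NoMergePolicy.INSTANCE);
      try (IndexWriter writer = new IndexWriter(dir, importConfig)) {
        // writer.addDocument(...) for every imported row goes here.
      }

      // Phase 2: reopen with merging enabled and a scheduler that
      // allows more concurrent merges than the default.
      ConcurrentMergeScheduler scheduler = new ConcurrentMergeScheduler();
      scheduler.setMaxMergesAndThreads(6, 3); // maxMergeCount, maxThreadCount
      IndexWriterConfig mergeConfig = new IndexWriterConfig(new StandardAnalyzer());
      mergeConfig.setMergePolicy(new TieredMergePolicy());
      mergeConfig.setMergeScheduler(scheduler);
      try (IndexWriter writer = new IndexWriter(dir, mergeConfig)) {
        // Ask the merge policy to select merges now; closing the
        // writer waits for the scheduled merges to finish.
        writer.maybeMerge();
      }
    }
  }
}
```

The trade-off is that the index holds many small segments until the second 
phase completes, so this only suits imports where nothing needs to search the 
index in the meantime.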

> ConcurrentMergeScheduler/maxMergeCount default is too low
> ---------------------------------------------------------
>
>                 Key: LUCENE-5705
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5705
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/other
>    Affects Versions: 4.8
>            Reporter: Shawn Heisey
>            Assignee: Shawn Heisey
>            Priority: Minor
>             Fix For: 4.9
>
>         Attachments: LUCENE-5705.patch, LUCENE-5705.patch, dih-example.patch
>
>
> The default value for maxMergeCount in ConcurrentMergeScheduler is 2.  This 
> causes problems for Solr's dataimport handler when very large imports are 
> done from a JDBC source.
> What happens is that when three merge tiers are scheduled at the same time, 
> the add/update thread will stop for several minutes while the largest merge 
> finishes.  In the meantime, the dataimporter JDBC connection to the database 
> will time out, and when the add/update thread resumes, the import will fail 
> because the ResultSet throws an exception.  Setting maxMergeCount to 6 
> eliminates this issue for virtually any size import -- although it is 
> theoretically possible to have that many simultaneous merge tiers, I've never 
> seen it.
> As long as maxThreadCount is properly set (the default value of 1 is 
> appropriate for most installations), I cannot think of a really good reason 
> that the default for maxMergeCount should be so low.  If someone does need 
> to strictly control the number of threads that get created, they can lower 
> maxMergeCount.  
> Perhaps someone with more experience knows of a really good reason to make 
> this default low?
> I'm not sure what the new default number should be, but I'd like to avoid 
> bikeshedding.  I don't think it should be Integer.MAX_VALUE.
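
To make the stall described in the quoted report concrete, here is a 
deliberately simplified model. It is written from the description alone and 
is not Lucene's actual code; the class and method names are hypothetical. It 
assumes the indexing thread is blocked whenever another merge is waiting 
while maxMergeCount merges are already in flight.

```java
public class MergeStallModel {
  /**
   * Simplified stall rule (an assumption based on the issue text):
   * the add/update thread blocks when a merge is pending and the
   * scheduler already has maxMergeCount merges in flight.
   */
  public static boolean indexingThreadStalls(
      int mergesInFlight, boolean mergePending, int maxMergeCount) {
    return mergePending && mergesInFlight >= maxMergeCount;
  }

  public static void main(String[] args) {
    // Three merge tiers at once: two in flight plus a third pending.
    System.out.println(indexingThreadStalls(2, true, 2)); // prints "true"
    // The same load with maxMergeCount raised to 6.
    System.out.println(indexingThreadStalls(2, true, 6)); // prints "false"
  }
}
```

Under this model, the default of 2 stalls as soon as a third merge is 
scheduled, while a value of 6 would stall only when a seventh merge arrives 
with six already in flight.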



