Re: What does "too many merges...stalling" in indexwriter log mean?

2013-07-12 Thread Shawn Heisey
On 7/12/2013 9:23 AM, Tom Burton-West wrote:
> Do you have any feeling for what gets traded off if we increase the
> maxMergeCount?
> 
> This is completely new for us because we are experimenting with indexing
> pages instead of whole documents.  Since our average document is about 370
> pages, this means that we have increased the number of documents we are
> asking Solr to index by a couple of orders of magnitude.  (On the other
> hand, the size of each document decreases by a couple of orders of
> magnitude.)
> I'm not sure why increasing the number of documents (and reducing their
> size) is causing more merges.  I'll have to investigate.

I'm not sure that you lose anything, really.  If everything is
proceeding normally before the "stalling" message is logged, I would not
expect a higher maxMergeCount to cause ANY problems.

I increased this value because when I did a full-import of millions of
documents from MySQL, I would reach the point where there were three
different levels of merges going on at once.  Because the default thread
count is one, only the largest merge was actually running; the others
were queued and waiting.

With three merges stacked up at once, I had passed the maxMergeCount
threshold, so *indexing* stopped.  It can take several minutes for a
very large merge to finish, so indexing stopped long enough that the
MySQL server would drop the connection established by the JDBC driver.
Once the merge finished and DIH tried to resume indexing, the connection
was gone and it would fail the entire import.

I have never seen more than three merge levels happening at once, so a
value of 6 is probably overkill, but shouldn't be a problem.  The true
goal is to make sure that indexing never stops, not to push the system
limits.  The maxThreadCount parameter should prevent I/O from becoming a
problem.
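
The interaction between the two settings can be sketched in a few lines
of Java.  This is a toy model under stated assumptions, not Lucene's
actual scheduler code: the class and method names are hypothetical, the
stall rule is simplified to "pending merges have reached maxMergeCount",
and the threshold of 3 is only an assumed stand-in for the default.

```java
// Toy model of the behaviour described above. All names are hypothetical;
// this is NOT Lucene's ConcurrentMergeScheduler implementation.
public class MergeStallSketch {

    /** Merges that actually make progress: capped by the merge thread limit. */
    public static int runningMerges(int pendingMerges, int maxThreadCount) {
        return Math.min(pendingMerges, maxThreadCount);
    }

    /** Merges that sit in the queue waiting for a free merge thread. */
    public static int waitingMerges(int pendingMerges, int maxThreadCount) {
        return pendingMerges - runningMerges(pendingMerges, maxThreadCount);
    }

    /** Simplified stall rule (assumption): indexing pauses once the number
     *  of pending merges reaches maxMergeCount. */
    public static boolean indexingStalls(int pendingMerges, int maxMergeCount) {
        return pendingMerges >= maxMergeCount;
    }

    public static void main(String[] args) {
        int pending = 3;         // three merge levels stacked up, as in the import above
        int maxThreadCount = 1;  // one merge thread, as suggested for spinning disks
        // 3 is an assumed default-like threshold; 6 is the value suggested in this thread.
        for (int maxMergeCount : new int[] {3, 6}) {
            System.out.printf("maxMergeCount=%d running=%d waiting=%d stalls=%b%n",
                    maxMergeCount,
                    runningMerges(pending, maxThreadCount),
                    waitingMerges(pending, maxThreadCount),
                    indexingStalls(pending, maxMergeCount));
        }
    }
}
```

With these assumed numbers, one merge runs and two wait in both cases,
but only the lower threshold stalls indexing.  Raising maxMergeCount
changes how many merges may queue before indexing pauses.  It does not
change how many run at once, which is why maxThreadCount still bounds
the I/O load.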

Thanks,
Shawn



Re: What does "too many merges...stalling" in indexwriter log mean?

2013-07-12 Thread Tom Burton-West
Thanks Shawn,

Do you have any feeling for what gets traded off if we increase the
maxMergeCount?

This is completely new for us because we are experimenting with indexing
pages instead of whole documents.  Since our average document is about 370
pages, this means that we have increased the number of documents we are
asking Solr to index by a couple of orders of magnitude.  (On the other
hand, the size of each document decreases by a couple of orders of
magnitude.)
I'm not sure why increasing the number of documents (and reducing their
size) is causing more merges.  I'll have to investigate.

Tom


On Thu, Jul 11, 2013 at 5:29 PM, Shawn Heisey  wrote:

> On 7/11/2013 1:47 PM, Tom Burton-West wrote:
>
>> We are seeing the message "too many merges...stalling"  in our indexwriter
>> log.   Is this something to be concerned about?  Does it mean we need to
>> tune something in our indexing configuration?
>>
>
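> It sounds like you've run into the maximum number of simultaneous merges,
> which I believe defaults to two, or maybe three.  The following config
> section in <indexConfig> will likely take care of the issue. This assumes
> 3.6 or later; I believe that on older versions, this goes in
> <indexDefaults>.
>
>   <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
>     <int name="maxThreadCount">1</int>
>     <int name="maxMergeCount">6</int>
>   </mergeScheduler>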
>
> Looking through the source code to confirm, this definitely seems like the
> case.  Increasing maxMergeCount is likely going to speed up your indexing,
> at least by a little bit.  A value of 6 is probably high enough for mere
> mortals, but you guys don't do anything small, so I won't begin to
> speculate what you'll need.
>
> If you are using spinning disks, you'll want maxThreadCount at 1.  If
> you're using SSD, then you can likely increase that value.
>
> Thanks,
> Shawn
>
>


Re: What does "too many merges...stalling" in indexwriter log mean?

2013-07-11 Thread Shawn Heisey

On 7/11/2013 1:47 PM, Tom Burton-West wrote:

> We are seeing the message "too many merges...stalling"  in our indexwriter
> log.   Is this something to be concerned about?  Does it mean we need to
> tune something in our indexing configuration?


It sounds like you've run into the maximum number of simultaneous 
merges, which I believe defaults to two, or maybe three.  The following 
config section in <indexConfig> will likely take care of the issue. 
This assumes 3.6 or later; I believe that on older versions, this goes 
in <indexDefaults>.


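  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxThreadCount">1</int>
    <int name="maxMergeCount">6</int>
  </mergeScheduler>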

Looking through the source code to confirm, this definitely seems like 
the case.  Increasing maxMergeCount is likely going to speed up your 
indexing, at least by a little bit.  A value of 6 is probably high 
enough for mere mortals, but you guys don't do anything small, so I 
won't begin to speculate what you'll need.


If you are using spinning disks, you'll want maxThreadCount at 1.  If 
you're using SSD, then you can likely increase that value.


Thanks,
Shawn



What does "too many merges...stalling" in indexwriter log mean?

2013-07-11 Thread Tom Burton-West
Hello,

We are seeing the message "too many merges...stalling"  in our indexwriter
log.   Is this something to be concerned about?  Does it mean we need to
tune something in our indexing configuration?

Tom