All that seems correct (except I think this is for node 6, not node 5). We don't delete documents, but we do some updates. The vast majority of documents get indexed into the large shards, but the smaller ones take some writes as well.
We aren't using virtualized hardware, and elasticsearch is the only thing running on the machines, no scheduled jobs, etc. I don't think anything external is interfering; in fact, overall disk i/o rate and operations on the machine go down quite a bit during the problematic period, which is consistent with your observation about things taking longer. I went back and checked all our collected metrics again. I noticed that even though heap usage and gc count look smooth during the period in question, gc time spent goes way up. Active indexing threads also go up, but since our ingest rate didn't go up I assumed this was a side effect. During a previous occurrence a few days ago on node5, I stopped all indexing activity for 15 minutes. Active merges and indexing requests went to zero as expected. Then I re-enabled indexing, and cpu/gc/active merges immediately went back up to the problematic rates. Overall I find it hard to tell what is a symptom vs. a root cause here.

A summary of what I think I know:

1. Every few days, cpu usage on one node goes way above the other nodes and doesn't recover. We've let a node run in the elevated cpu state for a day with no improvement.
2. It doesn't seem likely that it's data related. We use replicas=1, so every document also lives on another node, and no other nodes have issues.
3. It doesn't seem hardware related. We run on dedicated hardware with elasticsearch as the only thing running. The problem has also appeared on different nodes, and machine load is tied directly to the elasticsearch process.
4. During the problematic period: cpu usage, active merge threads, active bulk (indexing) threads, and gc time are all elevated.
5. During the problematic period: i/o operations and i/o throughput decrease.
6. Overall heap usage seems to increase smoothly, and the extra gc time seems to be spent on the new gen. Interestingly, the gc count didn't seem to increase.
7. In the hours beforehand, gc behavior of the problematic node was similar to the other nodes.
8. If I pause indexing, machine load quickly returns to normal, and outstanding merges and indexing requests complete. If I then restart indexing, the problem reoccurs immediately.
9. If I disable automatic refreshes, the problem disappears within an hour or so (the exact calls I used are sketched below).
10. Hot threads output shows merging as the hot threads.

The first few points make me think the increased active merges are perhaps a side effect, but the last three make me think merging is the root cause. The only additional things I can think of that may be relevant are:

1. Our documents can vary greatly in size; they average a couple of KB but can occasionally be several MB.
2. We do use language analysis plugins; perhaps one of these is acting up?
3. We eagerly load one field into the field data cache. But the cache size is ok and the overall heap behavior is ok, so I don't think this is the problem.

That's a lot of information, but I am not sure where to go next from here...
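For reference on points 9 and 10: these were done with the standard settings and hot threads APIs, roughly as below (host and index name are placeholders for our actual ones):

    # placeholders: adjust host/index; "-1" disables refresh, "1s" restores the default
    curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
      "index": { "refresh_interval": "-1" }
    }'

    # dump the hottest threads on each node
    curl 'http://localhost:9200/_nodes/hot_threads'

During the problematic period, the hot threads output on the affected node is dominated by merge threads.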
On Monday, July 7, 2014 8:23:20 AM UTC-7, Michael McCandless wrote:
>
> Indeed there are no big merges during that time ...
>
> I can see on node5, ~14:45 suddenly merges are taking a long time, refresh is taking much longer (4-5 seconds instead of < 0.4 sec), commit time goes up from < 0.5 sec to ~1-2 sec, etc., but other metrics are fine, e.g. total merging GB and the number of commits/refreshes are very low during this time.
>
> Each node has 2 biggish (~17 GB) shards and then ~50 tiny shards. The biggish shards are indexing at a very slow rate and only have ~1% deletions. Are you explicitly deleting docs?
>
> I suspect something is suddenly cutting into the IO perf of this box, and because merging/refreshing is so IO intensive, it causes these operations to run slower / backlog.
>
> Are there any scheduled jobs, e.g. backups/snapshots, that start up? Are you running on virtualized hardware?
>
> Mike McCandless
> http://blog.mikemccandless.com
>
> On Sun, Jul 6, 2014 at 8:23 PM, Kireet Reddy <[email protected]> wrote:
>
>> Just to reiterate, the problematic period is from 07/05 14:45 to 07/06 02:10. I included a couple hours before and after in the logs.
>>
>> On Sunday, July 6, 2014 5:17:06 PM UTC-7, Kireet Reddy wrote:
>>>
>>> They are linked below (node5 is the log of the normal node, node6 is the log of the problematic node).
>>>
>>> I don't think it was doing big merges, otherwise during the high load period the merges graph line would have had a "floor" > 0, similar to the time period after I disabled refresh. We don't do routing and use mostly default settings. I think the only settings we changed are:
>>>
>>> indices.memory.index_buffer_size: 50%
>>> index.translog.flush_threshold_ops: 50000
>>>
>>> We are running on a 6 cpu/12 core machine with a 32GB heap and 96GB of memory and 4 spinning disks.
>>>
>>> node 5 log (normal) <https://www.dropbox.com/s/uf76m58nf87mdmw/node5.zip>
>>> node 6 log (high load) <https://www.dropbox.com/s/w7qm2v9qpdttd69/node6.zip>
>>>
>>> On Sunday, July 6, 2014 4:23:19 PM UTC-7, Michael McCandless wrote:
>>>>
>>>> Can you post the IndexWriter infoStream output? I can see if anything stands out.
>>>>
>>>> Maybe it was just that this node was doing big merges? I.e., if you waited long enough, the other shards would eventually do their big merges too?
>>>>
>>>> Have you changed any default settings, do custom routing, etc.? Is there any reason to think that the docs that land on this node are "different" in any way?
>>>>
>>>> Mike McCandless
>>>> http://blog.mikemccandless.com
>>>>
>>>> On Sun, Jul 6, 2014 at 6:48 PM, Kireet Reddy <[email protected]> wrote:
>>>>
>>>>> From all the information I've collected, it seems to be the merging activity:
>>>>>
>>>>> 1. We capture the cluster stats into graphite, and the current merges stat seems to be about 10x higher on this node.
>>>>> 2. The problem has occurred on different physical machines, so a h/w issue seems unlikely. Once the problem starts it doesn't seem to stop. We have blown away the indices in the past and started indexing again after enabling more logging/stats.
>>>>> 3. I've stopped executing queries, so the only thing happening on the cluster is indexing.
>>>>> 4. Last night while the problem was ongoing, I disabled refresh (index.refresh_interval = -1) around 2:10am. Within 1 hour, the load returned to normal. The merge activity seemed to reduce; it looks like 2 very long running merges are executing but not much else.
>>>>> 5. I grepped an hour of logs on the 2 machines for "add merge=": 540 matches on the high load node and 420 on a normal node. I pulled the size value out of the log message, and the merges seemed to be much smaller on the high load node.
>>>>>
>>>>> I just created the indices a few days ago, so the shards of each index are balanced across the nodes. We have external metrics around document ingest rate and there was no spike during this time period.
>>>>>
>>>>> Thanks
>>>>> Kireet
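(Side note on point 5 above, in case the mechanics matter: the counts came from a quick grep over the hour of IndexWriter infoStream logs in question, roughly like the following. The file names are placeholders and the exact size field format depends on the Lucene version, so treat this as a sketch.)

    # file names are placeholders for the hour of infoStream logs on each node
    grep -c "add merge=" node6-hour.log    # ~540 on the high load node
    grep -c "add merge=" node5-hour.log    # ~420 on the normal node
    # rough look at the sizes of the segments being merged
    grep "add merge=" node6-hour.log | grep -o "size=[^ ]*" | sort | uniq -c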
>>>>> On Sunday, July 6, 2014 1:32:00 PM UTC-7, Michael McCandless wrote:
>>>>>>
>>>>>> It's perfectly normal/healthy for many small merges below the floor size to happen.
>>>>>>
>>>>>> I think you should first figure out why this node is different from the others. Are you sure it's merging CPU cost that's different?
>>>>>>
>>>>>> Mike McCandless
>>>>>> http://blog.mikemccandless.com
>>>>>>
>>>>>> On Sat, Jul 5, 2014 at 9:51 PM, Kireet Reddy <[email protected]> wrote:
>>>>>>
>>>>>>> We have a situation where one of the four nodes in our cluster seems to get caught up endlessly merging. However, it seems to be high CPU activity and not I/O constrained. I have enabled the IndexWriter info stream logs, and often it seems to do merges of quite small segments (100KB) that are much below the floor size (2MB). I suspect this is due to frequent refreshes and/or using lots of threads concurrently to do indexing. Is this true?
>>>>>>>
>>>>>>> My supposition is that this leads to the merge policy doing lots of merges of very small segments into another small segment, which will again require a merge to even reach the floor size. My index has 64 segments and 25 are below the floor size. I am wondering if there should be an exception to the maxMergesAtOnce parameter for the first level, so that many small segments could be merged at once in this case.
>>>>>>>
>>>>>>> I am considering changing the other parameters (wider tiers, lower floor size, more concurrent merges allowed), but these all seem to have side effects I may not necessarily want. Is there a good solution here?
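For completeness, the merge policy knobs mentioned in my original post above map to index-level settings. A rough sketch of how they could be changed (the values are illustrative, not settings we've decided on, and depending on the elasticsearch version some of these may need to be set at index creation rather than live):

    # illustrative values and placeholder host/index, not recommendations
    curl -XPUT 'http://localhost:9200/my_index/_settings' -d '{
      "index.merge.policy.floor_segment": "1mb",
      "index.merge.policy.segments_per_tier": 20,
      "index.merge.policy.max_merge_at_once": 20
    }'

"More concurrent merges allowed" would be a merge scheduler setting (index.merge.scheduler.max_thread_count) rather than part of the merge policy itself.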
