Indeed the hot threads on node5 didn't reveal anything unexpected: they are
busy merging and refreshing.  One thread was in SmartChineseAnalyzer...

Very strange that young gen GC is suddenly so costly.  Is this much more
time spent in GC than you ever see on healthy nodes?

If you restart the JVM on that node, does the problem immediately start
happening again?  (Hmm: does restarting the JVM cause the others to start
building new replicas... which would mess up this experiment).  Separately,
if you reboot the node and bring it back up, does the problem start
happening again?

Does vmstat/iostat reveal anything?  E.g. is the OS doing a lot of swapping
or something?  Do you mlockall your JVM heap (bootstrap.mlockall)?  What is
swappiness set to?
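
Something along these lines would show it (just a sketch, assuming a Linux
box; the host and port in the curl call are illustrative):

  vmstat 5                      # watch the si/so columns for swap activity
  iostat -x 5                   # per-device utilization and await times
  cat /proc/sys/vm/swappiness   # 0 or 1 is typical for a dedicated box

  # in elasticsearch.yml, then confirm it took effect after restart:
  bootstrap.mlockall: true
  curl 'localhost:9200/_nodes/process?pretty'   # look for "mlockall" : true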

Which Java / OS versions are you using?
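
For example:

  java -version
  uname -a
  cat /etc/*release   # or however your distro reports its version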

Mike McCandless

http://blog.mikemccandless.com


On Wed, Jul 9, 2014 at 5:13 PM, Kireet Reddy <[email protected]> wrote:

> Sorry, here it is:
>
> https://www.dropbox.com/sh/3s6m0bhz4eshi6m/AAABnRXFCLrCne-GLG1zvQP3a
>
> Also a couple of graphs of the memory usage.
>
>
> On Wednesday, July 9, 2014 2:10:49 PM UTC-7, Michael McCandless wrote:
>
>> Hmm link doesn't seem to work?
>>
>> Mike McCandless
>>
>> http://blog.mikemccandless.com
>>
>>
>> On Wed, Jul 9, 2014 at 4:06 PM, Kireet Reddy <[email protected]> wrote:
>>
>>>  Sorry, forgot the link
>>>
>>> https://www.dropbox.com/sh/3s6m0bhz4eshi6m/AAD8g3Ukq1UW0IbPV-a-CrBGa/1229.txt
>>>
>>> On Wednesday, July 9, 2014 1:05:56 PM UTC-7, Kireet Reddy wrote:
>>>>
>>>> The problem is happening again, this time on node 5. I have captured a
>>>> few hot thread requests here. I also included one from node 6 (which is
>>>> now fine). There are merge-related stacks, but it seems like everything
>>>> is taking a lot more cpu than usual. I did a few type=wait and
>>>> type=block dumps and the result was always 0% usage there. Also, young
>>>> gen gc activity has again gone crazy (old gen/heap size seems fine).
>>>> Would hot thread measurements be distorted if gc activity is very high?
>>>>
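>>>> For reference, the hot thread captures were made with the hot threads
>>>> API, roughly along these lines (host and port are illustrative):
>>>>
>>>> curl 'localhost:9200/_nodes/node5/hot_threads'
>>>> curl 'localhost:9200/_nodes/node5/hot_threads?type=wait'
>>>> curl 'localhost:9200/_nodes/node5/hot_threads?type=block'
>>>>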
>>>> It seems strange to me that this would only happen on one node while we
>>>> have replicas set to at least 1 for all our indices. It seems like the
>>>> problems should happen on a couple of nodes simultaneously.
>>>>
>>>> --Kireet
>>>>
>>>>
>>>> On Monday, July 7, 2014 3:51:35 PM UTC-7, Michael McCandless wrote:
>>>>
>>>>> Could you pull all hot threads next time the problem happens?
>>>>>
>>>>> Mike McCandless
>>>>>
>>>>> http://blog.mikemccandless.com
>>>>>
>>>>>
>>>>> On Mon, Jul 7, 2014 at 3:47 PM, Kireet Reddy <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> All that seems correct (except I think this is for node 6, not
>>>>>> node 5). We don't delete documents, but we do some updates. The vast
>>>>>> majority of documents get indexed into the large shards, but the
>>>>>> smaller ones take some writes as well.
>>>>>>
>>>>>> We aren't using virtualized hardware, and elasticsearch is the only
>>>>>> thing running on the machines; no scheduled jobs, etc. I don't think
>>>>>> something else is interfering; in fact, overall disk i/o rate and
>>>>>> operations on the machine go down quite a bit during the problematic
>>>>>> period, which is consistent with your observations about things taking
>>>>>> longer.
>>>>>>
>>>>>> I went back and checked all our collected metrics again. I noticed
>>>>>> that even though heap usage and gc count seem smooth during the period
>>>>>> in question, gc time spent goes way up. The number of active indexing
>>>>>> threads also goes up, but since our ingest rate didn't go up I assumed
>>>>>> this was a side effect. During a previous occurrence a few days ago on
>>>>>> node5, I stopped all indexing activity for 15 minutes. Active merges
>>>>>> and indexing requests went to zero as expected. Then I re-enabled
>>>>>> indexing, and cpu/gc/active merges immediately went back up to the
>>>>>> problematic rates.
>>>>>>
>>>>>> Overall it's pretty confusing to me what is a symptom vs. a root cause
>>>>>> here. A summary of what I think I know:
>>>>>>
>>>>>>    1. Every few days, cpu usage on a node goes way above the other
>>>>>>    nodes and doesn't recover. We've let the node run in the elevated
>>>>>>    cpu state for a day with no improvement.
>>>>>>    2. It doesn't seem likely that it's data related. We use replicas=1
>>>>>>    and no other nodes have issues.
>>>>>>    3. It doesn't seem hardware related. We run on dedicated h/w with
>>>>>>    elasticsearch being the only thing running. Also, the problem
>>>>>>    appears on various nodes, and machine load seems tied directly to
>>>>>>    the elasticsearch process.
>>>>>>    4. During the problematic period: cpu usage, active merge threads,
>>>>>>    active bulk (indexing) threads, and gc time are elevated.
>>>>>>    5. During the problematic period: i/o ops and i/o throughput
>>>>>>    decrease.
>>>>>>    6. Overall heap usage seems to increase smoothly; the extra gc time
>>>>>>    seems to be spent on the new gen. Interestingly, the gc count
>>>>>>    didn't seem to increase.
>>>>>>    7. In the hours beforehand, gc behavior of the problematic node was
>>>>>>    similar to the other nodes.
>>>>>>    8. If I pause indexing, machine load quickly returns to normal and
>>>>>>    merges and indexing requests complete. If I then restart indexing,
>>>>>>    the problem reoccurs immediately.
>>>>>>    9. If I disable automatic refreshes, the problem disappears within
>>>>>>    an hour or so.
>>>>>>    10. Hot threads show merging activity as the hot threads.
>>>>>>
>>>>>> The first few points make me think the increase in active merges is
>>>>>> perhaps a side effect, but then the last 3 make me think merging is
>>>>>> the root cause. The only additional things I can think of that may be
>>>>>> relevant are:
>>>>>>
>>>>>>    1. Our documents can vary greatly in size; they average a couple
>>>>>>    of KB but in rare cases can be several MB.
>>>>>>    2. We do use language analysis plugins; perhaps one of these is
>>>>>>    acting up?
>>>>>>    3. We eagerly load one field into the field data cache. But the
>>>>>>    cache size is ok and the overall heap behavior is ok, so I don't
>>>>>>    think this is the problem.
>>>>>>
>>>>>> That's a lot of information, but I am not sure where to go next from
>>>>>> here...
>>>>>>
>>>>>> On Monday, July 7, 2014 8:23:20 AM UTC-7, Michael McCandless wrote:
>>>>>>
>>>>>>> Indeed there are no big merges during that time ...
>>>>>>>
>>>>>>> I can see that on node5 at ~14:45 merges suddenly start taking a long
>>>>>>> time, refresh is taking much longer (4-5 seconds instead of < 0.4
>>>>>>> sec), commit time goes up from < 0.5 sec to ~1-2 sec, etc. But other
>>>>>>> metrics are fine, e.g. total merging GB and the number of
>>>>>>> commits/refreshes are very low during this time.
>>>>>>>
>>>>>>> Each node has 2 biggish (~17 GB) shards and then ~50 tiny shards.
>>>>>>>  The biggish shards are indexing at a very slow rate and only have ~1%
>>>>>>> deletions.  Are you explicitly deleting docs?
>>>>>>>
>>>>>>> I suspect something is suddenly cutting into the IO perf of this
>>>>>>> box, and because merging/refreshing is so IO intensive, it causes these
>>>>>>> operations to run slower / backlog.
>>>>>>>
>>>>>>> Are there any scheduled jobs, e.g. backups/snapshots, that start up?
>>>>>>>  Are you running on virtualized hardware?
>>>>>>>
>>>>>>>
>>>>>>> Mike McCandless
>>>>>>>
>>>>>>> http://blog.mikemccandless.com
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Jul 6, 2014 at 8:23 PM, Kireet Reddy <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>  Just to reiterate, the problematic period is from 07/05 14:45 to
>>>>>>>> 07/06 02:10. I included a couple hours before and after in the logs.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sunday, July 6, 2014 5:17:06 PM UTC-7, Kireet Reddy wrote:
>>>>>>>>>
>>>>>>>>> They are linked below (node5 is the log of the normal node, node6
>>>>>>>>> is the log of the problematic node).
>>>>>>>>>
>>>>>>>>> I don't think it was doing big merges; otherwise, during the
>>>>>>>>> high-load period, the merges graph line would have had a "floor" >
>>>>>>>>> 0, similar to the time period after I disabled refresh. We don't do
>>>>>>>>> routing and use mostly default settings. I think the only settings
>>>>>>>>> we changed are:
>>>>>>>>>
>>>>>>>>> indices.memory.index_buffer_size: 50%
>>>>>>>>> index.translog.flush_threshold_ops: 50000
>>>>>>>>>
>>>>>>>>> We are running on a 6 cpu/12 core machine with a 32GB heap and
>>>>>>>>> 96GB of memory, with 4 spinning disks.
>>>>>>>>>
>>>>>>>>> node 5 log (normal)
>>>>>>>>> <https://www.dropbox.com/s/uf76m58nf87mdmw/node5.zip>
>>>>>>>>> node 6 log (high load)
>>>>>>>>> <https://www.dropbox.com/s/w7qm2v9qpdttd69/node6.zip>
>>>>>>>>>
>>>>>>>>> On Sunday, July 6, 2014 4:23:19 PM UTC-7, Michael McCandless wrote:
>>>>>>>>>>
>>>>>>>>>> Can you post the IndexWriter infoStream output?  I can see if
>>>>>>>>>> anything stands out.
>>>>>>>>>>
>>>>>>>>>> Maybe it was just that this node was doing big merges?  I.e., if
>>>>>>>>>> you waited long enough, the other shards would eventually do their
>>>>>>>>>> big merges too?
>>>>>>>>>>
>>>>>>>>>> Have you changed any default settings, do custom routing, etc.?
>>>>>>>>>> Is there any reason to think that the docs that land on this node
>>>>>>>>>> are "different" in any way?
>>>>>>>>>>
>>>>>>>>>> Mike McCandless
>>>>>>>>>>
>>>>>>>>>> http://blog.mikemccandless.com
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Jul 6, 2014 at 6:48 PM, Kireet Reddy <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>  From all the information I’ve collected, it seems to be the
>>>>>>>>>>> merging activity:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>    1. We capture the cluster stats into graphite, and the
>>>>>>>>>>>    current merges stat seems to be about 10x higher on this node.
>>>>>>>>>>>    2. The problem has occurred on different physical machines, so
>>>>>>>>>>>    a h/w issue seems unlikely. Once the problem starts it doesn't
>>>>>>>>>>>    seem to stop. We have blown away the indices in the past and
>>>>>>>>>>>    started indexing again after enabling more logging/stats.
>>>>>>>>>>>    3. I've stopped executing queries, so the only thing that's
>>>>>>>>>>>    happening on the cluster is indexing.
>>>>>>>>>>>    4. Last night when the problem was ongoing, I disabled refresh
>>>>>>>>>>>    (index.refresh_interval = -1) around 2:10am; the exact call is
>>>>>>>>>>>    sketched after this list. Within 1 hour, the load returned to
>>>>>>>>>>>    normal. The merge activity seemed to reduce; it looked like 2
>>>>>>>>>>>    very long running merges were executing but not much else.
>>>>>>>>>>>    5. I grepped an hour of logs on the 2 machines for "add
>>>>>>>>>>>    merge=": it was 540 on the high load node and 420 on a normal
>>>>>>>>>>>    node. I pulled out the size value from the log message and the
>>>>>>>>>>>    merges seemed to be much smaller on the high load node.
>>>>>>>>>>>
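>>>>>>>>>>> The refresh change in item 4 was made with roughly the standard
>>>>>>>>>>> update-settings call (host and port are illustrative):
>>>>>>>>>>>
>>>>>>>>>>> curl -XPUT 'localhost:9200/_all/_settings' \
>>>>>>>>>>>   -d '{ "index" : { "refresh_interval" : "-1" } }'
>>>>>>>>>>>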
>>>>>>>>>>> I just created the indices a few days ago, so the shards of each
>>>>>>>>>>> index are balanced across the nodes. We have external metrics around
>>>>>>>>>>> document ingest rate and there was no spike during this time period.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Kireet
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sunday, July 6, 2014 1:32:00 PM UTC-7, Michael McCandless
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> It's perfectly normal/healthy for many small merges below the
>>>>>>>>>>>> floor size to happen.
>>>>>>>>>>>>
>>>>>>>>>>>> I think you should first figure out why this node is different
>>>>>>>>>>>> from the others. Are you sure it's merging CPU cost that's
>>>>>>>>>>>> different?
>>>>>>>>>>>>
>>>>>>>>>>>> Mike McCandless
>>>>>>>>>>>>
>>>>>>>>>>>> http://blog.mikemccandless.com
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jul 5, 2014 at 9:51 PM, Kireet Reddy <[email protected]
>>>>>>>>>>>> > wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> We have a situation where one of the four nodes in our cluster
>>>>>>>>>>>>> seems to get caught up endlessly merging. However, it seems to
>>>>>>>>>>>>> be high CPU activity and not I/O constrained. I have enabled
>>>>>>>>>>>>> the IndexWriter info stream logs, and oftentimes it seems to do
>>>>>>>>>>>>> merges of quite small segments (100KB) that are much below the
>>>>>>>>>>>>> floor size (2MB). I suspect this is due to frequent refreshes
>>>>>>>>>>>>> and/or using lots of threads concurrently to do indexing. Is
>>>>>>>>>>>>> this true?
>>>>>>>>>>>>>
>>>>>>>>>>>>> My supposition is that this is leading to the merge policy
>>>>>>>>>>>>> doing lots of merges of very small segments into another small
>>>>>>>>>>>>> segment, which will again require a merge to even reach the
>>>>>>>>>>>>> floor size. My index has 64 segments and 25 are below the floor
>>>>>>>>>>>>> size. I am wondering if there should be an exception to the
>>>>>>>>>>>>> maxMergesAtOnce parameter for the first level, so that many
>>>>>>>>>>>>> small segments could be merged at once in this case.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am considering changing the other parameters (wider tiers,
>>>>>>>>>>>>> lower floor size, more concurrent merges allowed), but these
>>>>>>>>>>>>> all seem to have side effects I may not necessarily want. Is
>>>>>>>>>>>>> there a good solution here?
>>>>>>>>>>>>>
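>>>>>>>>>>>>> For reference, the knobs I mean map to index settings along
>>>>>>>>>>>>> these lines (values shown are only the defaults as I understand
>>>>>>>>>>>>> them, not a recommendation):
>>>>>>>>>>>>>
>>>>>>>>>>>>> index.merge.policy.floor_segment: 2mb
>>>>>>>>>>>>> index.merge.policy.segments_per_tier: 10
>>>>>>>>>>>>> index.merge.policy.max_merge_at_once: 10
>>>>>>>>>>>>> index.merge.scheduler.max_thread_count: 3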