Removing the "-XX:+UseCMSInitiatingOccupancyOnly" flag extended the time before the JVM started full GCs from about 2 hours to about 7 hours in my cluster, but now it's back to constant full GCs. I'm out of ideas. Suggestions?

mike
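For reference, this is roughly how the CMS flags are set in a stock 1.x bin/elasticsearch.in.sh (a sketch; exact values may differ slightly by version):

    # Sketch of the stock CMS settings in bin/elasticsearch.in.sh (ES 1.x);
    # values may differ by version. With the occupancy-only flag removed,
    # the JVM falls back to its own heuristics for deciding when to start
    # a CMS cycle.
    JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
    JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
    JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

Lowering CMSInitiatingOccupancyFraction (for example to 65) makes CMS start earlier, at the cost of more frequent concurrent cycles.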
On Monday, June 23, 2014 10:25:20 AM UTC-4, Michael Hart wrote:
>
> My nodes are in Rackspace, so they are VMs, but they are configured without swap.
>
> I'm not entirely sure what the searches are up to; I'm going to investigate that further.
>
> I did correlate a rapid increase in heap used, number of segments (up from the norm of ~400 to 15,000), and consequently old GC counts when the cluster attempts to merge a 5GB segment. It seems that in spite of my really fast disks, the merge of a 5GB segment takes up to 15 minutes. I've made two changes this morning, namely set these:
>
> index.merge.scheduler.max_thread_count: 3
> index.merge.policy.max_merged_segment: 1gb
>
> The first is in the hope that while a large segment merge is underway, the two other threads can still keep the small segment merges going. The second is to keep the larger segment merges under control. I was ending up with two 5GB segments and a long tail of smaller ones. A quick model shows that by dropping this to 1GB I'll have 12 x 1GB segments and a similar long tail of smaller segments (about 50?).
>
> I've also enabled GC logging on one node. I'll leave it running for the day and tomorrow remove the "-XX:+UseCMSInitiatingOccupancyOnly" flag (used by default by elasticsearch) and see if there's any difference. I'll report back here in case this is of any use for anyone.
>
> thanks
> mike
>
> On Friday, June 20, 2014 6:31:54 PM UTC-4, Clinton Gormley wrote:
>
> * Do you have swap disabled? (any swap plays havoc with GC)
> * You said you're doing scan/scroll - how many documents are you retrieving at a time? Consider reducing the number
> * Are you running on a VM - that can cause you to swap even though your VM guest thinks that swap is disabled, or steal CPU (slowing down GC)
>
> Essentially, if all of the above are false, you shouldn't be getting slow GCs unless you're under very heavy memory pressure (and I see that your old gen is not too bad, so that doesn't look likely).
>
> On 20 June 2014 16:03, Michael Hart <[email protected]> wrote:
>
> Thanks. I do see the GC warnings in the logs, such as:
>
> [2014-06-19 20:17:06,603][WARN ][monitor.jvm] [redacted] [gc][old][179386][22718] duration [11.4s], collections [1]/[12.2s], total [11.4s]/[25.2m], memory [7.1gb]->[6.9gb]/[7.2gb], all_pools {[young] [158.7mb]->[7.4mb]/[266.2mb]}{[survivor] [32.4mb]->[0b]/[33.2mb]}{[old] [6.9gb]->[6.9gb]/[6.9gb]}
>
> CPU idle is around 50% when the merge starts, and drops to zero by the time that first GC old warning is logged. During recovery my SSDs sustain 2400 IOPS, and during yesterday's outage I only saw about 800 IOPS before ES died. While I can throw more hardware at it, I'd prefer to do some tuning first if possible.
>
> The reason I was thinking of adding more shards is that the largest segment is 4.9GB (just under the default maximum set by index.merge.policy.max_merged_segment). I suppose the other option is to reduce the index.merge.policy.max_merged_segment setting to something smaller, but I have no idea what the implications are.
>
> thoughts?
> mike
>
> On Friday, June 20, 2014 9:47:22 AM UTC-4, Ankush Jhalani wrote:
>
> Mike - The above sounds like it happened due to machines sending too many indexing requests and merging being unable to keep up. The usual suspects would be not enough CPU or disk bandwidth. This doesn't sound related to the memory constraints posted in the original issue of this thread. Do you see memory GC traces in the logs?
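A rough sketch of applying the two merge settings mentioned above without editing every node's config. The index name is hypothetical, and it is an assumption that both keys are accepted by the dynamic index update-settings API on this version; if they are rejected, the fallback is setting the same keys in elasticsearch.yml and restarting, as described above:

    # Hypothetical daily index name; substitute your own.
    # Assumption: these merge settings are accepted dynamically on this ES
    # version; otherwise put the same keys in elasticsearch.yml and restart.
    curl -XPUT 'http://localhost:9200/logs-2014.06.23/_settings' -d '{
      "index.merge.policy.max_merged_segment": "1gb",
      "index.merge.scheduler.max_thread_count": 3
    }'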
> On Friday, June 20, 2014 9:40:48 AM UTC-4, Michael Hart wrote:
>
> We're seeing the same thing. ES 1.1.0, JDK 7u55 on Ubuntu 12.04, 5 data nodes, 3 separate masters, all 15GB hosts with 7.5GB heaps; storage is SSD. The data set is ~1.6TB according to Marvel.
>
> Our daily indices are roughly 33GB in size, with 5 shards and 2 replicas. I'm still investigating what happened yesterday, but I do see in Marvel a large spike in the "Indices Current Merges" graph just before the node dies, and a corresponding increase in JVM heap. When heap hits 99%, everything grinds to a halt. Restarting the node "fixes" the issue, but this is the third or fourth time it's happened.
>
> I'm still researching how to deal with this, but a couple of things I am looking at are:
>
> - increase the number of shards so that the segment merges stay smaller (is that even a legitimate sentence?); I'm still reading through the Index Modules Merge page <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-merge.html> for more details.
> - look at store-level throttling <http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules-store.html#store-throttling>.
>
> I would love to get some feedback on my ramblings. If I find anything more I'll update this thread.
>
> cheers
> mike
>
> On Thursday, June 19, 2014 4:05:54 PM UTC-4, Bruce Ritchie wrote:
>
> Java 8 with G1GC perhaps? It'll have more overhead, but perhaps it'll be more consistent with respect to pauses.
>
> On Wednesday, June 18, 2014 2:02:24 PM UTC-4, Eric Brandes wrote:
>
> I'd just like to chime in with a "me too". Is the answer just more nodes? In my case this is happening every week or so.
>
> On Monday, April 21, 2014 9:04:33 PM UTC-5, Brian Flad wrote:
>
> My dataset currently is 100GB across a few "daily" indices (~5-6GB and 15 shards each). Data nodes are 12 CPU, 12GB RAM (6GB heap).
>
> On Mon, Apr 21, 2014 at 6:33 PM, Mark Walkom <[email protected]> wrote:
>
> How big are your data sets? How big are your nodes?
>
> Regards,
> Mark Walkom
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
> On 22 April 2014 00:32, Brian Flad <[email protected]> wrote:
>
> We're seeing the same behavior with 1.1.1, JDK 7u55, 3 master nodes (2 min master), and 5 data nodes. Interestingly, we see the repeated young GCs on only a node or two at a time. Cluster operations (such as recovering unassigned shards) grind to a halt. After restarting a GCing node, everything returns to normal operation in the cluster.
>
> Brian F
>
> On Wed, Apr 16, 2014 at 8:00 PM, Mark Walkom <[email protected]> wrote:
>
> In both your instances, if you can, have 3 master-eligible nodes, as it will reduce the likelihood of a split cluster because you will always have a majority quorum. Also look at discovery.zen.minimum_master_nodes to go with that. However, you may just be reaching the limit of your nodes, which means the best option is to add another node (which also neatly solves your split brain!).
>
> Ankush, it would help if you can update Java; most people recommend u25, but we run u51 with no problems.
>
> Regards,
> Mark Walkom
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
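On the store-level throttling and minimum_master_nodes points raised above, a minimal sketch of setting both dynamically through the cluster update-settings API (the values here are illustrative, not recommendations; "transient" settings are lost on a full cluster restart, so use "persistent" to keep them):

    # Sketch: node-level merge throttling plus the quorum setting,
    # applied live via the cluster update-settings API.
    # Values are illustrative only; 2 assumes 3 master-eligible nodes.
    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": {
        "indices.store.throttle.type": "merge",
        "indices.store.throttle.max_bytes_per_sec": "40mb",
        "discovery.zen.minimum_master_nodes": 2
      }
    }'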
> On 17 April 2014 07:31, Dominiek ter Heide <[email protected]> wrote:
>
> We are seeing the same issue here.
>
> Our environment:
>
> - 2 nodes
> - 30GB Heap allocated to ES
> - ~140GB of data
> - 639 indices, 10 shards per index
> - ~48M documents
>
> After starting ES everything is good, but after a couple of hours we see the heap build up towards 96% on one node and 80% on the other. We then see GC take a very long time on the 96% node:
>
> TOuKgmlzaVaFVA][elasticsearch1.trend1.bottlenose.com][inet[/192.99.45.125:9300]]])
> [2014-04-16 12:04:27,845][INFO ][discovery] [elasticsearch2.trend1] trend1/I3EHG_XjSayz2OsHyZpeZA
> [2014-04-16 12:04:27,850][INFO ][http] [elasticsearch2.trend1] bound_address {inet[/0.0.0.0:9200]}, publish_address {inet[/192.99.45.126:9200]}
> [2014-04-16 12:04:27,851][INFO ][node] [elasticsearch2.trend1] started
> [2014-04-16 12:04:32,669][INFO ][indices.store] [elasticsearch2.trend1] updating indices.store.throttle.max_bytes_per_sec from [20mb] to [1gb], note, type is [MERGE]
> [2014-04-16 12:04:32,669][INFO ][cluster.routing.allocation.decider] [elasticsearch2.trend1] updating [cluster.routing.allocation.node_initial_primaries_recoveries] from [4] to [50]
> [2014-04-16 12:04:32,670][INFO ][indices.recovery] [elasticsearch2.trend1] updating [indices.recovery.max_bytes_per_sec] from [200mb] to [2gb]
> [2014-04-16 12:04:32,670][INFO ][cluster.routing.allocation.decider] [elasticsearch2.trend1] updating [cluster.routing.allocation.node_initial_primaries_recoveries] from [4] to [50]
> [2014-04-16 12:04:32,670][INFO ][cluster.routing.allocation.decider] [elasticsearch2.trend1] updating [cluster.routing.allocation.node_initial_primaries_recoveries] from [4] to [50]
> [2014-04-16 15:25:21,409][WARN ][monitor.jvm] [elasticsearch2.trend1] [gc][old][11876][106] duration [1.1m], collections [1]/[1.1m], total [1.1m]/[1.4m], memory [28.7gb]->[22gb]/[29.9gb], all_pools {[young] [67.9mb]->[268.9mb]/[665.6mb]}{[survivor] [60.5mb]->[0b]/[83.1mb]}{[old] [28.6gb]->[21.8gb]/[29.1gb]}
> [2014-04-16 16:02:32,523][WARN ][monitor.jvm] [elasticsearch2.trend1] [gc][old][13996][144] duration [1.4m], collections [1]/[1.4m], total [1.4m]/[3m], memory [28.8gb]->[23.5gb]/[29.9gb], all_pools {[young] [21.8mb]->[238.2mb]/[665.6mb]}{[survivor] [82.4mb]->[0b]/[83.1mb]}{[old] [28.7gb]->[23.3gb]/[29.1gb]}
> [2014-04-16 16:14:12,386][WARN ][monitor.jvm] [elasticsearch2.trend1] [gc][old][14603][155] duration [1.3m], collections [2]/[1.3m], total [1.3m]/[4.4m], memory [29.2gb]->[23.9gb]/[29.9gb], all_pools {[young] [289mb]->[161.3mb]/[665.6mb]}{[survivor] [58.3mb]->[0b]/[83.1mb]}{[old] [28.8gb]->[23.8gb]/[29.1gb]}
> [2014-04-16 16:17:55,480][WARN ][monitor.jvm] [elasticsearch2.trend1] [gc][old][14745][158] duration [1.3m], collections [1]/[1.3m], total [1.3m]/[5.7m], memory [29.7gb]->[24.1gb]/[29.9gb], all_pools {[young] [633.8mb]->[149.7mb]/[665.6mb]}{[survivor] [68.6mb]->[0b]/[83.1mb]}{[old] [29gb]->[24gb]/[29.1gb]}
> [2014-04-16 16:21:17,950][WARN ][monitor.jvm] [elasticsearch2.trend1] [gc][old][14857][161] duration [1.4m], collections [1]/[1.4m], total [1.4m]/[7.2m], memory [28.6gb]->[24.5gb]/[29.9gb], all_pools {[young] <
>
> ...
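Since several posts in this thread describe watching heap climb in Marvel until a node locks up, a minimal sketch for sampling the same numbers (heap usage, old-gen collection counts and times) from the command line, assuming a node listening on localhost:9200:

    # Sketch: per-node JVM heap and GC statistics from the nodes stats API.
    curl -XGET 'http://localhost:9200/_nodes/stats/jvm?pretty'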
