Hi group,

Restarting an ES cluster triggers recovery, which is long-running and 
resource-intensive. I am looking for a way to reduce the runtime and load of 
a restart. I read that someone performs daily rolling restarts of his large 
ES cluster to keep the primary and replica shards 100% identical, so they 
recover quickly. But that sounds like a hack, not something you should be 
happy with as an SRE. And its impact on ES performance may be acceptable on 
a large cluster, but not on our 3-node cluster.

How I believe shard recovery works: if ES spots differences between a 
primary shard and its replica(s), it rebuilds the replica(s) as exact copies 
of the primary. Rebuilding causes lots of network traffic and disk I/O.
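For reference, on 1.x you can watch this happening via the cat APIs (a 
sketch; adjust host/port to your cluster, and note the available columns 
vary between versions):

```shell
# Watch ongoing shard recoveries (source/target node, stage, bytes moved).
curl -s 'http://localhost:9200/_cat/recovery?v'

# Per-shard segment counts and memory, to compare primaries vs replicas;
# this is roughly how the BEFORE/AFTER tables below were produced.
curl -s 'http://localhost:9200/_cat/shards?v&h=index,shard,prirep,docs,store,segments.count,segments.memory'
```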

We have a 3-node ES 1.0.1 cluster with 3k primary shards and 3k replica 
shards. During a recent restart (to reduce the heap size to 31G to get 
CompressedOops back), recovery of the 1st node took the longest (~6 hours), 
the 2nd less (~2 hours), and the 3rd was quick (<1 hour). I believe recovery 
becomes faster after each node, because each recovery leaves more replica 
shards as exact copies of their primary.
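One thing that may shorten the per-node part of this is disabling shard 
allocation around each node restart, so the cluster does not start 
rebuilding that node's replicas elsewhere while it is down. A sketch of the 
commonly documented 1.x rolling-restart procedure (the exact setting name 
has changed across versions, so check the docs for 1.0.1):

```shell
# Before stopping the node: stop the cluster from reallocating its shards.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "none" }
}'

# ... restart the node and wait for it to rejoin the cluster ...

# Re-enable allocation so the node's shards are recovered in place.
curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
  "transient": { "cluster.routing.allocation.enable": "all" }
}'
```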

I tried force-merging with an expensive max_num_segments=1, but the 
segments.count and segments.memory metrics of the same shards still differ 
between primary and replica. No luck. For the curious few I have included 
the before and after results below.
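For reference, on 1.x the force merge is the _optimize API; a merge down to 
one segment looks roughly like this (a sketch; adjust host and index name):

```shell
# Merge every shard of the index down to a single Lucene segment.
# This is I/O-heavy and the merged segments still may not be byte-identical
# between primary and replica, since each copy merges independently.
curl -XPOST 'http://localhost:9200/logstash-pro-oracle-2014.04.24/_optimize?max_num_segments=1'
```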

Any ideas?

Regards,
Renzo


BEFORE:
idx                            shard prirep docs store  segments.count segments.memory
logstash-pro-oracle-2014.04.24 0     p      1072 485592 8              14615
logstash-pro-oracle-2014.04.24 0     r      1072 449022 1              11958
logstash-pro-oracle-2014.04.24 1     p      1095 493774 7              14336
logstash-pro-oracle-2014.04.24 1     r      1095 459966 1              11988
logstash-pro-oracle-2014.04.24 2     p      1039 452078 5              13158
logstash-pro-oracle-2014.04.24 2     r      1039 458513 6              13480
logstash-pro-oracle-2014.04.24 3     p      1094 492753 8              14574
logstash-pro-oracle-2014.04.24 3     r      1094 483347 6              13850
logstash-pro-oracle-2014.04.24 4     p      1099 494740 8              14645
logstash-pro-oracle-2014.04.24 4     r      1099 488953 7              14251

AFTER:
idx                            shard prirep docs store  segments.count segments.memory
logstash-pro-oracle-2014.04.24 0     p      1072 449358 1              11958
logstash-pro-oracle-2014.04.24 0     r      1072 448884 1              11958
logstash-pro-oracle-2014.04.24 1     p      1095 460391 1              11980
logstash-pro-oracle-2014.04.24 1     r      1095 459918 1              11988 <-- rep is 8 bigger than its pri
logstash-pro-oracle-2014.04.24 2     p      1039 431341 1              11580
logstash-pro-oracle-2014.04.24 2     r      1039 431695 1              11572 <-- rep is 8 smaller than its pri
logstash-pro-oracle-2014.04.24 3     p      1094 457135 1              11907
logstash-pro-oracle-2014.04.24 3     r      1094 457970 1              11907
logstash-pro-oracle-2014.04.24 4     p      1099 457640 1              11957
logstash-pro-oracle-2014.04.24 4     r      1099 457165 1              11957

