Hi Ivan,

Thanks. Yes, I have been doing a flush before every cluster shutdown now.

I'm running ES 1.0 RC1. I have been doing rolling restarts because I have been unable to start all the nodes at nearly the same time and get them all to join, even after extending the timeout as I described. My theory is that the rolling restart itself is contributing to the shards being re-allocated, because the nodes that hold shards for the index may not appear soon enough.

Maybe the entry I made in elasticsearch.yml, exactly as I described it, isn't correct? I derived it from an ES source that described sending the command with curl, but I thought it would be better to put it directly in elasticsearch.yml.

I'll take a look at your link, thanks.

Tony
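P.S. For reference, the curl form I adapted the yml entry from was something along these lines (just a sketch from memory, not verified against 1.0 RC1):

    curl -XPUT 'http://localhost:9200/_cluster/settings' -d '{
      "transient": {
        "cluster.routing.allocation.disable_allocation": true
      }
    }'

As I understand it, a transient setting applied this way does not survive a full cluster restart, which is part of why I put the setting in elasticsearch.yml instead.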
On Friday, February 7, 2014 3:23:24 PM UTC-8, Ivan Brusic wrote:
> Shard allocation should never happen if disable_allocation is enabled.
> Which version are you using? Are you doing a rolling restart or a full
> cluster restart?
>
> Two things that might help. First is to execute a flush before restarting.
> I believe mismatched transaction states will label a shard as incorrect
> during a restart. Also play around with the recovery settings [1]. Try
> setting gateway.recover_after_nodes (disabled by default).
>
> [1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html#recover-after
>
> Cheers,
>
> Ivan
>
> On Fri, Feb 7, 2014 at 3:11 PM, Tony Su <[email protected]> wrote:
>
>> At first, I noticed what some have called "shard thrashing," i.e. during
>> startup shards are re-allocated as nodes come online.
>>
>> Have implemented the following by either creating new settings or
>> modifying existing settings in elasticsearch.yml:
>>
>> 1. Disable allocation altogether
>>
>> cluster.routing.allocation.disable_allocation: true
>>
>> 2. Avoid split-brain in the current 5-node cluster
>>
>> discovery.zen.minimum_master_nodes: 3
>>
>> 3. Increase the discovery timeout
>>
>> discovery.zen.ping.timeout: 100s
>>
>> Specific objective:
>> When the cluster restarts, try to force re-use of how the shards were
>> allocated before shutdown.
>>
>> Attempt:
>> - Tried to increase discovery.zen.minimum_master_nodes to 5 in a 5-node
>> cluster, with the idea that a node would refuse to become operational
>> until all 5 nodes in the cluster were recognized.
>>
>> Result:
>> Unfortunately, despite making this setting equal to the total number of
>> nodes in the cluster, I observed shard re-allocation once 4 of the 5 nodes
>> were up, without waiting for the fifth node to come online. And this is
>> with allocation disabled.
>>
>> I would like an opinion on whether what I'm trying to accomplish is even
>> possible:
>> - As much as possible, force a restarted cluster to use existing shards
>> as already allocated
>> - Start all nodes at once rather than rolling node starts, which
>> contributes to shard re-allocation.
>>
>> TIA,
>> Tony
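For anyone following along: the recovery settings from Ivan's link [1] would go in elasticsearch.yml along these lines. The values below are only my guesses for a 5-node cluster and I have not tested them yet:

    gateway.recover_after_nodes: 4
    gateway.recover_after_time: 5m
    gateway.expected_nodes: 5

As I read the docs, recovery is deferred until at least 4 nodes have joined, then waits up to another 5 minutes, but starts immediately once all 5 expected nodes are present. The flush before shutdown is just:

    curl -XPOST 'http://localhost:9200/_flush'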
