Hi all,

We have a 5-node Elasticsearch setup as follows:

   - 3 nodes running in data mode
   - 2 nodes running in master only mode
   
Each of the nodes has the following configuration options set in its 
elasticsearch.yml file:

> index.number_of_shards: 6
> index.number_of_replicas: 0
> index.mapper.dynamic: true
> action.auto_create_index: true
> action.disable_delete_all_indices: true


As I understand it, this config means that each index will have 6 primary 
shards and no replicas, so there is exactly one copy of every shard. We also 
have the following discovery options set:

> discovery.zen.minimum_master_nodes: 1
> discovery.zen.ping.multicast.enabled: false
> discovery.zen.ping.unicast.hosts: ["10.101.55.93[9300-9400]", 
> "10.242.22.126[9300-9400]", "10.144.77.94[9300-9400]", 
> "10.116.173.148[9300-9400]", "10.224.42.205[9300-9400]"]
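
For reference, the usual rule of thumb for discovery.zen.minimum_master_nodes 
is a majority (quorum) of the master-eligible nodes, i.e. (N / 2) + 1. Below is 
a minimal sketch of that arithmetic in Python. It assumes that the 2 
master-only nodes are the only master-eligible nodes in the cluster (the 
node.master / node.data settings are not shown above), and the function names 
are made up for illustration:

```python
def quorum(master_eligible_nodes):
    """Smallest majority of master-eligible nodes: floor(N / 2) + 1."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def allows_split_brain(master_eligible_nodes, minimum_master_nodes):
    """True if the setting is below a majority, so two masters could coexist."""
    return minimum_master_nodes < quorum(master_eligible_nodes)


# Assumption: only the 2 master-only nodes are master-eligible.
print(quorum(2))                 # majority of 2 master-eligible nodes
print(allows_split_brain(2, 1))  # our current setting of 1
```

With 2 master-eligible nodes the majority is 2, so our value of 1 is below it.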


We are also running Logstash, which indexes documents into this 
Elasticsearch cluster. Yesterday, I started noticing the following errors in 
the Logstash indexer log file:

> {:timestamp=>"2014-03-10T11:05:44.453000+0000", :message=>"Failed to index 
> an event, will retry", :exception=>
> org.elasticsearch.action.UnavailableShardsException: 
> [logstash-2014.03.10][1] [1] shardIt, [0] active : Timeout waiting for 
> [1m], request: index 
> {[logstash-2014.03.10][application][uMrD19E4QuKepoBq17ZKQA], 
> source[{"@source":"file://ip-10-118-115-235//mnt/deploy/apache-tomcat/logs/application.log","@tags":["shipped"],"@fields":{"environment":["production"],"service":["application"],"machine":["proxy1"],"timestamp":"03/10/14 06:03:07.292","thread":"http-8080-126","severity":"DEBUG","message":"ources.AccountServicesResource idx=0"},"@timestamp":"2014-03-10T06:03:07.292Z","@source_host":"ip-10-118-115-235","@source_path":"//mnt/deploy/apache-tomcat/logs/application.log","@message":"03/10/14 06:03:07.292 [http-8080-126] DEBUG ources.AccountServicesResource idx=0","@type":"application"}]}, 
> :event=>{"@source"=>"file://ip-10-118-115-235//mnt/deploy/apache-tomcat/logs/application.log", 
> "@tags"=>["shipped"], "@fields"=>{"environment"=>["production"], 
> "service"=>["application"], "machine"=>["proxy1"], "timestamp"=>"03/10/14 
> 06:03:07.292", "thread"=>"http-8080-126", "severity"=>"DEBUG", 
> "message"=>"ources.AccountServicesResource idx=0"}, 
> "@timestamp"=>"2014-03-10T06:03:07.292Z", 
> "@source_host"=>"ip-10-118-115-235", 
> "@source_path"=>"//mnt/deploy/apache-tomcat/logs/application.log", 
> "@message"=>"03/10/14 06:03:07.292 [http-8080-126] DEBUG 
> ources.AccountServicesResource idx=0", "@type"=>"application"}, 
> :level=>:warn} 


While debugging this issue, I noticed that the first IP in 
discovery.zen.ping.unicast.hosts was wrong: it did not point to any of the 
5 nodes in the cluster. I realized that the first IP should be that of one of 
the nodes configured as master, so I changed it to the correct one on all of 
the nodes and restarted ES. After that change, I no longer see the above error.
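
To catch this kind of mistake earlier, a check along the following lines 
might help. It is only a rough sketch: it parses entries in the 
"host[port-port]" form used in our unicast list and reports any host that is 
not in a known set of node addresses. The helper names are invented for this 
example, and the "actual" node set in the usage lines is hypothetical (it 
simply pretends the first entry is the stale one), so substitute the real 
addresses of your nodes:

```python
import re

# Entries as they appear in discovery.zen.ping.unicast.hosts.
UNICAST_HOSTS = [
    "10.101.55.93[9300-9400]",
    "10.242.22.126[9300-9400]",
    "10.144.77.94[9300-9400]",
    "10.116.173.148[9300-9400]",
    "10.224.42.205[9300-9400]",
]

# host, optionally followed by "[lo-hi]" (port range) or ":port".
HOST_RE = re.compile(
    r"^(?P<host>[^\[\]:\s]+)(?:\[(?P<lo>\d+)-(?P<hi>\d+)\]|:(?P<port>\d+))?$"
)


def parse_host(entry):
    """Split one unicast entry into (host, (low_port, high_port) or None)."""
    match = HOST_RE.match(entry.strip())
    if match is None:
        raise ValueError("unrecognized unicast host entry: %r" % entry)
    if match.group("lo") is not None:
        ports = (int(match.group("lo")), int(match.group("hi")))
    elif match.group("port") is not None:
        ports = (int(match.group("port")), int(match.group("port")))
    else:
        ports = None
    return match.group("host"), ports


def unknown_hosts(entries, actual_node_ips):
    """Return configured hosts that are not among the real node addresses."""
    hosts = [parse_host(entry)[0] for entry in entries]
    return [host for host in hosts if host not in actual_node_ips]


# Hypothetical node set for illustration only: every entry except the first.
actual_nodes = {parse_host(entry)[0] for entry in UNICAST_HOSTS[1:]}
print(unknown_hosts(UNICAST_HOSTS, actual_nodes))
```

This only compares addresses; it does not contact the nodes or verify that 
anything is listening on the transport ports.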

I have a question: given that the first IP was wrong, the cluster would 
have elected the only other node configured as master as the master. This 
means that there was at least one master in the cluster. So, for this 
exception to happen, could that master's metadata about the shards have been 
wrong?
