So far the only log message we've seen is: 
 
zen-disco-node_failed([CDPX-PRD-ELS4][lkquUBfHT1aXAO3-_tCNCg][cdpx-prd-els4][inet[10.9.64.142/10.9.64.142:9300]]{master=false}),
reason failed to ping, tried [5] times, each with maximum [1m] timeout

We have other data traversing the network that would be very sensitive to 
any latency or outages, and alerts that would fire if we had a network 
outage, so I am confident we don't have any network issues when this 
occurs.  Furthermore, we only ever see data nodes drop; the masters never 
drop.

Is there a recommended heap size for master-only nodes?  In addition, any 
recommendations on heap size for data nodes?  I assume the ping timeout 
could be caused by long GC pauses, since our data nodes have larger heaps?
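One way to test the GC theory is to turn on JVM GC logging and correlate pause timestamps with the node-drop messages. A minimal sketch for the Elasticsearch 1.x release line this thread is about (the heap size and log path here are assumptions, not recommendations):

```shell
# ES_HEAP_SIZE is the documented way to size the heap on Elasticsearch 1.x;
# 4g is an illustrative value -- dedicated masters usually need far less
# than data nodes.
export ES_HEAP_SIZE=4g

# Standard HotSpot GC-logging flags, so stop-the-world pauses can be
# matched against the zen-disco node-failure timestamps. The log path is
# an assumption -- adjust to your install.
export ES_JAVA_OPTS="-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/elasticsearch/gc.log"
```

If the gc.log shows multi-second pauses at the same moments the masters report "failed to ping", the heap is the likely culprit rather than the network.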

On Friday, April 25, 2014 5:49:44 PM UTC-6, Alexander Reelsen wrote:
>
> Hey,
>
> is there any reason in the logfile of the master node why it was 
> de-elected (a network outage as well)? Did you also give your master nodes 
> a huge heap, which could cause long pauses during GC?
>
>
> --Alex
>
>
> On Mon, Apr 21, 2014 at 5:51 PM, <[email protected]> wrote:
>
>> We currently are running dedicated master nodes but I believe they are 
>> also servicing queries.  I can change it such that queries only hit the 
>> data nodes and see if that eliminates the issue...
>>
>> On Monday, April 21, 2014 3:40:59 PM UTC-6, Binh Ly wrote:
>>>
>>> Other than network, is it possible that your nodes could sometimes be 
>>> overloaded such that they cannot respond immediately? If that's the case, 
>>> then you can probably get 3 nodes (servers), make them master-only nodes 
>>> (node.master: true, node.data: false). Set 
>>> discovery.zen.minimum_master_nodes: 2 for those 3 nodes. And then for the 
>>> rest of your other data nodes, make 
>>> them non-master eligible (node.master: false, node.data: true). This way 
>>> you have 3 nodes dedicated only to do cluster state/master tasks unimpeded 
>>> by load or anything else other than your network. Just don't run anything 
>>> else on them or send queries/indexing jobs to these 3 nodes. :)
>>>
>
>
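For reference, the split Binh describes can be sketched as two elasticsearch.yml fragments. The three settings are the ones he names; the cluster name and the file split into "master" and "data" variants are illustrative assumptions:

```yaml
# elasticsearch.yml on the 3 dedicated master-eligible nodes
cluster.name: my-cluster        # illustrative name
node.master: true               # eligible to be elected master
node.data: false                # holds no shards
discovery.zen.minimum_master_nodes: 2   # quorum of the 3 masters

# elasticsearch.yml on every data node
# cluster.name: my-cluster
# node.master: false            # never eligible for election
# node.data: true               # holds shards, serves queries/indexing
```

With this layout, long GC pauses or query load on the data nodes cannot stall master duties, since the masters do nothing but manage cluster state.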

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/e391a498-139a-4a8d-9c0c-7eb8402cfa89%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
