So far the only log message we've seen is:
zen-disco-node_failed([CDPX-PRD-ELS4][lkquUBfHT1aXAO3-_tCNCg][cdpx-prd-els4][inet[10.9.64.142/10.9.64.142:9300]]{master=false}),
reason failed to ping, tried [5] times, each with maximum [1m] timeout
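Assuming the retry count and timeout in that message come from the zen
fault-detection settings, the relevant elasticsearch.yml entries would look
roughly like this. The values are inferred from the log line, not copied
from our config:

```yaml
# Sketch only: values read off the log message above (assumption).
discovery.zen.fd.ping_retries: 5    # "tried [5] times"
discovery.zen.fd.ping_timeout: 1m   # "each with maximum [1m] timeout"
```

If every attempt runs to its full timeout, a node would have to be
unresponsive for roughly 5 x 1m = 5 minutes before the master drops it.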
Other traffic on the same network is very sensitive to latency and outages,
and we have alerts that would fire on a network outage, so I am confident
there are no network issues when this occurs. Furthermore, only data nodes
drop; the masters never do.
Is there a recommended heap size for master-only nodes? Any recommendations
on heap size for data nodes as well? Could the failed pings be timeouts
during GC pauses, since our data nodes have larger heaps?
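To be clear about what I mean by "masters only", this is the split from
Binh's suggestion quoted below, sketched as elasticsearch.yml fragments. The
settings are the ones named in that message; the two-document layout is only
for illustration:

```yaml
# On each of the 3 dedicated master nodes (no data, no queries or indexing)
node.master: true
node.data: false
discovery.zen.minimum_master_nodes: 2
---
# On each data node (not master-eligible)
node.master: false
node.data: true
```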
On Friday, April 25, 2014 5:49:44 PM UTC-6, Alexander Reelsen wrote:
>
> Hey,
>
> is there any reason given in the master node's logfile for why it was
> de-elected (a network outage as well)? Did you also give your master nodes
> a huge heap, which could cause long pauses during GC?
>
>
> --Alex
>
>
> On Mon, Apr 21, 2014 at 5:51 PM, <[email protected]> wrote:
>
>> We currently are running dedicated master nodes but I believe they are
>> also servicing queries. I can change it such that queries only hit the
>> data nodes and see if that eliminates the issue...
>>
>> On Monday, April 21, 2014 3:40:59 PM UTC-6, Binh Ly wrote:
>>>
>>> Other than network, is it possible that your nodes could sometimes be
>>> overloaded such that they cannot respond immediately? If that's the case,
>>> then you can probably get 3 nodes (servers), make them master-only nodes
>>> (node.master: true, node.data: false). Set
>>> discovery.zen.minimum_master_nodes: 2 for those 3 nodes. And then for the
>>> rest of your data nodes, make them non-master eligible
>>> (node.master: false, node.data: true). This way
>>> you have 3 nodes dedicated only to do cluster state/master tasks unimpeded
>>> by load or anything else other than your network. Just don't run anything
>>> else on them or send queries/indexing jobs to these 3 nodes. :)
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/4858a2da-5ceb-48f1-8cfe-fe460ab2dcce%40googlegroups.com.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>