Quick update:
Apparently the election notifications disappeared somewhere between the
datacenters (firewall) when the sockets were not used for some time. We
fixed this with zookeeper.tcpKeepAlive=true.
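For anyone hitting the same thing, a minimal sketch of one way to set the flag, assuming the stock zkServer.sh/zkEnv.sh scripts and a conf/java.env file (the file location is an assumption about the install, not something taken from our setup). zookeeper.tcpKeepAlive is a Java system property, so it is passed to the server JVM rather than put in zoo.cfg:

```shell
# conf/java.env -- sourced by zkEnv.sh at startup (assumed stock scripts).
# Turns on SO_KEEPALIVE for the sockets the quorum members use for leader election.
export SERVER_JVMFLAGS="${SERVER_JVMFLAGS:-} -Dzookeeper.tcpKeepAlive=true"
```

The flag only enables keepalive on the socket. How often probes are sent is decided by the OS. On Linux the first probe goes out after net.ipv4.tcp_keepalive_time, which defaults to 7200 seconds, so that value may need lowering if the firewall drops idle connections sooner.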
Regards,
Chris
On Wed, Aug 8, 2018 at 5:05 PM Andor Molnar
wrote:
> Some kind of a network
What should I do to get the most usable logs in this case?
Set the log level to debug and run kill -3 when it's failing?
On 11 September 2018 9:17:45 pm Andor Molnár wrote:
Erm.
Thanks for carrying out these tests Chris.
Have you by any chance - as Camille suggested - collected debug logs
from these tests?
Andor
On 09/11/2018 11:08 AM, Cee Tee wrote:
> Concluded a test with a 3.4.13 cluster, it shows the same behaviour.
>
> On Mon, Sep 3, 2018 at 4:56 PM Andor
Concluded a test with a 3.4.13 cluster, it shows the same behaviour.
On Mon, Sep 3, 2018 at 4:56 PM Andor Molnar
wrote:
> Thanks for testing Chris.
>
> So, if I understand you correctly, you're running the latest version from
> branch-3.5. Could we say that this is a 3.5-only problem?
> Have
I haven't noticed it in 3.4 back when we used it, but I can run a test to
confirm it. I will let you know in approximately one week.
Regards
Chris
On 3 September 2018 4:56:00 pm Andor Molnar wrote:
Thanks for testing Chris.
So, if I understand you correctly, you're running the latest version from
branch-3.5. Could we say that this is a 3.5-only problem?
Have you ever tested the same cluster with 3.4?
Regards,
Andor
On Tue, Aug 21, 2018 at 11:29 AM, Cee Tee wrote:
> I've tested the
I've tested the patch and let it run for 6 days. It did not help; the result
is still the same (the remaining ZKs form islands based on the datacenter
they are in).
I have mitigated it by doing a daily rolling restart.
Regards,
Chris
On Mon, Aug 13, 2018 at 2:06 PM Andor Molnar
wrote:
> Hi Chris,
>
> Would
Interesting, I will have a look at it.
Thanks
Chris
On 13 August 2018 2:06:55 pm Andor Molnar wrote:
Hi Chris,
Would you mind testing the following patch on your test clusters?
I'm not entirely sure, but the issue might be related.
https://issues.apache.org/jira/browse/ZOOKEEPER-2930
Regards,
Andor
On Wed, Aug 8, 2018 at 6:51 PM, Camille Fournier wrote:
> If you have the time and
Running 3.5.5
I managed to recreate it on the acceptance and test clusters today; it failed
on shutdown of the leader. Both had been running for over a week. After
restarting all ZooKeepers it runs fine no matter how many leader shutdowns I
throw at it.
On 8 August 2018 5:05:34 pm Andor Molnar wrote:
Some kind of a network split?
It looks like 1-2 and 3-4 were able to communicate with each other, but
connections timed out between these 2 splits. When 5 came back online it
started with supporters of (1,2) and later 3 and 4 also joined.
There was no such issue the day after.
Which version of
Actually, I have similar issues on my test and acceptance clusters, where
leader election fails if the cluster has been running for a couple of days.
If you stop/start the ZooKeepers once, they will work fine on further
disruptions that day. Not sure yet what the threshold is.
On 8 August 2018