Re: upgrade from 3.4.5 to 3.5.6

2020-03-24 Thread Szalay-Bekő Máté
Hi Kuldeep,

I just want to give you some background on our documentation.
The reason to upgrade to 3.4.6 first is to avoid the following error:

> 2013-01-30 11:32:10,663 [myid:2] - WARN [localhost/127.0.0.1:2784:QuorumCnxManager@349] - Invalid server id: -65536

This error is caused by a protocol change in how ZooKeeper server nodes
initiate connections for leader election. ZooKeeper 3.5 introduced a
protocol version (see ZOOKEEPER-107), and since then the first long value
sent in the initial message is no longer the server ID but the protocol
version (-65536). In ZooKeeper 3.4.6 we made the 3.4 line compatible with
this change, so those servers can parse both the old and the new protocol
format (see ZOOKEEPER-1633). The issue therefore only occurs when old
(3.4.0 - 3.4.5) and new (3.5.0+) ZooKeeper servers run together in the same
cluster, which is usually exactly what happens during a rolling upgrade.
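
To make the protocol change concrete, here is a minimal Python sketch of how
a receiving server might interpret the first long of an initial message. It
is not ZooKeeper's actual (Java) code: the function name, the
`understands_version` flag and the returned strings are assumptions made for
illustration. Only the value -65536 and the "server ID vs. protocol version"
distinction come from the explanation above.

```python
import struct

# Value that 3.5+ servers send first, per the explanation above.
PROTOCOL_VERSION = -65536


def classify_first_long(payload: bytes, understands_version: bool) -> str:
    """Interpret the first big-endian 8-byte long of an initial message.

    understands_version=False models an old (3.4.0 - 3.4.5) receiver,
    True models a receiver that knows about the protocol version.
    """
    (first,) = struct.unpack(">q", payload[:8])
    if first == PROTOCOL_VERSION:
        if understands_version:
            return "new-format message (protocol version marker)"
        # An old receiver treats the marker as a server ID and rejects it.
        return f"Invalid server id: {first}"
    return f"old-format message from server id {first}"


if __name__ == "__main__":
    new_style = struct.pack(">q", PROTOCOL_VERSION)
    print(classify_first_long(new_style, understands_version=False))
    print(classify_first_long(new_style, understands_version=True))
```

The old receiver has no way to tell the marker apart from a (nonsensical)
server ID, which is why the warning mentions -65536.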

The fact that you haven't seen any issues might be due to the order of
the servers. In ZooKeeper, connection initiation between servers during
leader election follows a specific rule. As far as I remember, the server
with the larger ID always 'wins the challenge', so it is possible that the
old server never had to parse an initial message (if it had the largest
ID), and this is why you haven't seen the issue. Also, a 3-node cluster
keeps working with only 2 nodes up, so you should check that all the
servers are actually part of the quorum.
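
The "2 out of 3 still works" point is just majority arithmetic. A small
Python sketch (the function name is hypothetical, not a ZooKeeper API) shows
why a cluster can look healthy while one server is silently left out:

```python
def has_quorum(servers_up: int, ensemble_size: int) -> bool:
    """Return True when strictly more than half of the ensemble is up."""
    return servers_up > ensemble_size // 2


if __name__ == "__main__":
    # A 3-node ensemble keeps serving with one node excluded ...
    print(has_quorum(2, 3))
    # ... but not with two nodes excluded.
    print(has_quorum(1, 3))
```

So "the cluster answers requests" does not prove that the old node joined;
each server has to be checked individually.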

I agree with Enrico and Norbert: the safest and most stable way is to
upgrade first to 3.4.latest, then go to 3.5.latest. Still, if you see no
sign of this specific issue (e.g. no "Invalid server id" in the log
files) and all three servers can handle traffic, then maybe you don't
need to upgrade to 3.4.latest first; it is your decision. You should
definitely test it first, as suggested by the others.
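
One way to check the log files for this symptom is a simple scan for the
warning text. The following Python sketch assumes nothing about where your
logs live; the function name is hypothetical and the log path is something
you would supply for your own deployment.

```python
from typing import Iterable, List

# Warning text quoted from the log excerpt above.
MARKER = "Invalid server id"


def find_invalid_server_id(lines: Iterable[str]) -> List[str]:
    """Return every log line that contains the 'Invalid server id' warning."""
    return [line.rstrip("\n") for line in lines if MARKER in line]


if __name__ == "__main__":
    import sys

    # Usage: python scan_log.py /path/to/your/zookeeper.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for hit in find_invalid_server_id(log):
            print(hit)
```

An empty result on every server is a hint, not a proof, that the mixed
cluster did not hit the issue.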

Kind regards,
Mate


Re: Zookeeper not listening on 2888 and appears nodes are not connecting to each other.

2020-03-24 Thread rld244
Thanks Mate.

I did get 3.5.7 to work with my existing Ansible role; I just had to
whitelist some 4lw commands to get ZK status to show up in my Solr Admin.
But I did read somewhere that the 4lw commands might be going away. I
wonder how this will affect Solr Cloud.

I'll tweak the role to include your suggestions for 3.6.1 and let you know
how it goes.

Thanks again!



--
Sent from: http://zookeeper-user.578899.n2.nabble.com/


Re: upgrade from 3.4.5 to 3.5.6

2020-03-24 Thread Norbert Kalmar
Hi,

That guide is for upgrading to 3.5.0, which was an alpha version. A lot
changed before the first stable release, 3.5.5, and in the releases after
it; some rolling upgrade issues were even fixed in 3.5.6.
This is a more up-to-date guide:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Upgrade+FAQ

If you have done your testing (with a prod snapshot!), then you can skip
the upgrade to the latest 3.4, but keep in mind that we make our
recommendations for a reason. There were issues reported and/or found
during testing. Some are fixed in 3.5.6, and some only happen under certain
conditions (IOException: No snapshot found - mentioned in the guide, fixed
in 3.5.6).

So it is up to you; I would still recommend doing a 3.4 upgrade first, if
it's feasible.

Regards,
Norbert



Re: upgrade from 3.4.5 to 3.5.6

2020-03-24 Thread Enrico Olivelli
On Tue, Mar 24, 2020 at 11:45, kuldeep singh wrote:
>
> Hi,
>
> Current Zookeeper version :- 3.4.5
> Upgraded version:- 3.5.6
>
> We are not going with 3.5.7. Our final decision is zookeeper version is
> 3.5.6

I suggest you move to 3.5.7; 3.5.6 is an older version. See this
problem, just as an example:
https://issues.apache.org/jira/browse/ZOOKEEPER-3644

> as per your reply first we need to move latest version of 3.4.x, like below
>
> 3.4.5 -> 3.4.14 -> 3.5.6 (Correct me if I am wrong here)

yes this is ok

>
> But if We are not facing any problem that i have shared you that we have
> set up of 3 node cluster where 2 node are on 3.5.6 version and 1 node on
> 3.4.5, Everything is running fine and didn't get any issue, So what other
> problem we can face if we directly move to 3.5.6
>

Maybe the answer is yes, if you are fine with having a window of
downtime for the ZK servers.
If peers are not able to talk to each other, the system cannot make progress,
but once all of the nodes are up and upgraded to 3.5.7 you will be okay.

Please test it in some staging environment, not directly in production :-)

Enrico



Re: upgrade from 3.4.5 to 3.5.6

2020-03-24 Thread kuldeep singh
Hi,

Current ZooKeeper version: 3.4.5
Target version: 3.5.6

We are not going with 3.5.7; our final decision is ZooKeeper version
3.5.6.
As per your reply, we first need to move to the latest 3.4.x version, like below:

3.4.5 -> 3.4.14 -> 3.5.6 (Correct me if I am wrong here)

But as I shared, we have not faced any problem: we have set up a 3-node
cluster where 2 nodes are on version 3.5.6 and 1 node is on 3.4.5.
Everything is running fine and we didn't hit any issue. So what other
problems could we face if we move directly to 3.5.6?

Thanks,
-
Kuldeep Singh Budania
Software Architect




Re: upgrade from 3.4.5 to 3.5.6

2020-03-24 Thread Enrico Olivelli
Hi
You have to upgrade to the latest 3.4.x ZooKeeper first, then upgrade to
3.5.7.
Everything should run well without issues.


Enrico



upgrade from 3.4.5 to 3.5.6

2020-03-24 Thread kuldeep singh
Hi Team,

We are upgrading ZooKeeper from 3.4.5 to 3.5.6. I have set up a 3-node
cluster where 2 nodes are on version 3.5.6 and 1 node is on 3.4.5.

Everything is running fine and I didn't hit any issue on my system.

But I found something on the Apache site saying that we first need to
upgrade to 3.4.6, and only then can we upgrade to 3.5.6. So is it mandatory
to go to 3.4.6 first?

*Upgrading to 3.5.0*

Upgrading a running ZooKeeper ensemble to 3.5.0 should be done only after
upgrading your ensemble to the 3.4.6 release. Note that this is only
necessary for rolling upgrades (if you're fine with shutting down the
system completely, you don't have to go through 3.4.6). If you attempt a
rolling upgrade without going through 3.4.6 (for example from 3.4.5), you
may get the following error:

2013-01-30 11:32:10,663 [myid:2] - INFO [localhost/127.0.0.1:2784:QuorumCnxManager$Listener@498] - Received connection request /127.0.0.1:60876

2013-01-30 11:32:10,663 [myid:2] - WARN [localhost/127.0.0.1:2784:QuorumCnxManager@349] - Invalid server id: -65536

During a rolling upgrade, each server is taken down in turn and rebooted
with the new 3.5.0 binaries. Before starting the server with 3.5.0
binaries, we highly recommend updating the configuration file so that all
server statements "server.x=..." contain client ports (see the section
Specifying the client port). As explained earlier you may leave the
configuration in a single file, as well as leave the
clientPort/clientPortAddress statements (although if you specify client
ports in the new format, these statements are now redundant).

Could you please let me know about this case? I would appreciate a quick response.

Thanks,
-
Kuldeep Singh Budania


Re: Zookeeper not listening on 2888 and appears nodes are not connecting to each other.

2020-03-24 Thread Szalay-Bekő Máté
There are multiple issues with your setup which got fixed recently.

The IllegalArgumentException at java.util.concurrent.ThreadPoolExecutor
suggests you were hitting this issue:
https://issues.apache.org/jira/browse/ZOOKEEPER-3758
This affects ZooKeeper 3.6.0 and we already fixed it, as Enrico mentioned.
Version 3.6.1 will solve this particular issue. Alternatively, you can set
the following config as a workaround:
multiAddress.reachabilityCheckEnabled=false (setting this won't be needed
in 3.6.1, but in your case it will most probably help in 3.6.0). This is a
3.6-specific issue and a 3.6-specific configuration parameter; you won't see
this problem in 3.5.

However... you are also using 0.0.0.0 in your server config, which is
actually not recommended since 3.5. This leads to another error when peers
wish to rejoin the quorum (see
https://issues.apache.org/jira/browse/ZOOKEEPER-2164). This was also fixed,
and the fix will be released in 3.5.8 and 3.6.1. As a workaround (and this
is actually not only a workaround but the more consistent and dynamic
re-config compatible way) you can use the following config on all three
servers:

quorumListenOnAllIPs=true
server.1=< fqdn of server 1 >:2888:3888
server.2=< fqdn of server 2 >:2888:3888
server.3=< fqdn of server 3 >:2888:3888


The 'quorumListenOnAllIPs' config above will have the same effect: all the
servers will listen on 0.0.0.0 locally. But it has the benefit that all the
members still have the same view of the cluster, so the re-join problem
should not happen here.
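
Since the same lines must appear on every server, it can help to generate
them from a single host list (for example from a deployment role). This is
only a sketch: the function name and the hostnames are hypothetical
placeholders, while the ports and the 'quorumListenOnAllIPs' setting are
the ones from the config above.

```python
from typing import Sequence


def render_quorum_config(
    hosts: Sequence[str], quorum_port: int = 2888, election_port: int = 3888
) -> str:
    """Render identical server.N lines for every member of the ensemble."""
    lines = ["quorumListenOnAllIPs=true"]
    # Server IDs start at 1 and follow the order of the host list.
    for myid, host in enumerate(hosts, start=1):
        lines.append(f"server.{myid}={host}:{quorum_port}:{election_port}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder FQDNs; replace with the real names of your servers.
    print(render_quorum_config(
        ["zk1.example.internal", "zk2.example.internal", "zk3.example.internal"]
    ))
```

Generating the block once and copying it everywhere keeps the "same view of
the cluster" property without hand-editing each node.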

I hope these changes will help.

Kind regards,
Mate

On Tue, Mar 24, 2020 at 2:06 AM rld244  wrote:

> Thanks for getting back to me Enrico.
>
> I'm working within an AWS VPC with all nodes on a private subnet and I
> think Ping isn't enabled, but I can use nc and specific ports to verify
> that the servers can talk to each other.
>
> Interestingly I just noticed that on the node with myid 3 zookeeper is
> listening on 2888 and can be reached by the other nodes on 2181, 2888 and
> 3888.
>
> I don't know if that helps.
>
> Maybe I'll try deploying 3.5.x instead.