Re: upgrade from 3.4.5 to 3.5.6
Hi Kuldeep,

I just want to give you some background on our documentation. The reason to upgrade to 3.4.6 first is to avoid the following error:

> 2013-01-30 11:32:10,663 [myid:2] - WARN [localhost/127.0.0.1:2784:QuorumCnxManager@349] - Invalid server id: -65536

This error is caused by a protocol change between ZooKeeper server nodes during connection initiation for leader election. In ZooKeeper 3.5 a protocol version was introduced (see ZOOKEEPER-107), and since then the first long value sent in the initial message is not the server ID but the protocol version (-65536). In ZooKeeper 3.4.6 we made the old 3.4 ZooKeepers backward compatible, so they are able to parse both the old and the new protocol format (see ZOOKEEPER-1633).

This issue happens only when you need to use old (3.4.0 - 3.4.5) and new (3.5.0+) ZooKeeper servers together in the same cluster. During a rolling upgrade it is usually the case that old and new ZooKeepers are present together.

The fact that you haven't seen any issues might be caused by the order of the servers. In ZooKeeper, connection initiation between the servers during leader election follows a specific rule. As far as I remember, the server with the larger ID always 'wins the challenge', so it is possible that the old server didn't need to parse any initial message (if it had the largest ID), and this is why you haven't seen the issue. Also, having 2 nodes up in a 3-node cluster still keeps the cluster working (so you should also check whether all the servers are part of the quorum).

I agree with Enrico and Norbert: the safest and most stable way is to upgrade first to 3.4.latest, then go to 3.5.latest. Still, if you don't see any sign of hitting this specific issue (e.g. no "Invalid server id" in the log files), and all three servers can handle traffic, then maybe you don't need to upgrade first to 3.4.latest. It is your decision. You should definitely test it first, as suggested by the others.
Kind regards,
Mate
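A minimal sketch of the parsing difference described in this message, not the actual ZooKeeper code. The function names are hypothetical, and the layout of the new-format message after the protocol version (server ID, address length, address) is an assumption made for illustration; only the -65536 value and the "first long is the server ID" behaviour come from the thread.

```python
import struct

# Value a 3.5+ server sends as the first long of its initial message.
PROTOCOL_VERSION = -65536


def read_first_long(message: bytes) -> int:
    """Read the leading signed 64-bit big-endian value of the message."""
    (value,) = struct.unpack(">q", message[:8])
    return value


def parse_like_3_4_5(message: bytes) -> int:
    """Old behaviour: the first long is always taken to be the server ID."""
    return read_first_long(message)


def parse_like_3_4_6(message: bytes) -> int:
    """Compatible behaviour: accept both formats, return the sender's ID."""
    first = read_first_long(message)
    if first != PROTOCOL_VERSION:
        return first  # old format: the first long is the server ID
    # New format (assumed layout): version, server ID, address length, address.
    (sid,) = struct.unpack(">q", message[8:16])
    return sid


def new_format_message(sid: int, address: str) -> bytes:
    """Build an illustrative new-format initial message."""
    encoded = address.encode("utf-8")
    return struct.pack(">qqi", PROTOCOL_VERSION, sid, len(encoded)) + encoded


def old_format_message(sid: int) -> bytes:
    """Build an illustrative old-format initial message (server ID only)."""
    return struct.pack(">q", sid)
```

With these helpers, a parser that assumes the old format reads -65536 as the sender's ID when it receives a new-format message, which is the value that shows up in the "Invalid server id" warning.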
Re: Zookeeper not listening on 2888 and appears nodes are not connecting to each other.
Thanks Mate.

I did get 3.5.7 to work with my existing Ansible role; I just had to whitelist some 4lw's to get the ZK status to show up in my Solr Admin. But I did read somewhere that the 4lw's might be going away. I wonder how this will affect Solr Cloud.

I'll tweak the role to include your suggestions for 3.6.1 and let you know how it goes.

Thanks again!

--
Sent from: http://zookeeper-user.578899.n2.nabble.com/
Re: upgrade from 3.4.5 to 3.5.6
Hi,

That guide is for upgrading to 3.5.0, which was an alpha version. A lot has changed since then for the first stable release, 3.5.5, and in the releases after it; some rolling upgrade issues were even fixed in 3.5.6.
This is a more up-to-date guide:
https://cwiki.apache.org/confluence/display/ZOOKEEPER/Upgrade+FAQ

If you have done your testing (with a prod snapshot!), then you can skip the upgrade to the latest 3.4, but keep in mind that we make our recommendations for a reason. There were issues reported and/or found during testing. Some are fixed in 3.5.6, and some only happen under certain conditions (IOException: No snapshot found - mentioned in the guide, fixed in 3.5.6).

So it is up to you. I would still recommend doing a 3.4 upgrade first, if it's feasible.

Regards,
Norbert
Re: upgrade from 3.4.5 to 3.5.6
On Tue, Mar 24, 2020 at 11:45 kuldeep singh wrote:
>
> Hi,
>
> Current Zookeeper version :- 3.4.5
> Upgraded version:- 3.5.6
>
> We are not going with 3.5.7. Our final decision is zookeeper version is
> 3.5.6

I suggest you move to 3.5.7; 3.5.6 is an older version. See this problem just as an example:
https://issues.apache.org/jira/browse/ZOOKEEPER-3644

> as per your reply first we need to move latest version of 3.4.x, like below
>
> 3.4.5 -> 3.4.14 -> 3.5.6 (Correct me if I am wrong here)

Yes, this is OK.

> But if We are not facing any problem that i have shared you that we have
> set up of 3 node cluster where 2 node are on 3.5.6 version and 1 node on
> 3.4.5, Everything is running fine and didn't get any issue, So what other
> problem we can face if we directly move to 3.5.6

Maybe the answer is yes, if you are fine with having a window of downtime for the ZK servers: if peers are not able to talk to each other, the system cannot make progress, but when all of the nodes are up and upgraded to 3.5.7 you will be okay.

Please test it in some staging environment, not directly in production :-)

Enrico
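The upgrade path confirmed in this message can be sketched as a small helper. This is only an illustration of the rule discussed in the thread (servers on 3.4.0 - 3.4.5 should pass through a 3.4.6+ release before a rolling upgrade to 3.5.0+); the function names are hypothetical, and the default intermediate release of 3.4.14 is simply the one mentioned above.

```python
def parse_version(text):
    """Turn '3.4.5' into (3, 4, 5) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def rolling_upgrade_steps(current, target, latest_3_4="3.4.14"):
    """Return the list of versions to roll through, in order."""
    cur = parse_version(current)
    tgt = parse_version(target)
    steps = [current]
    # Only 3.4.0 - 3.4.5 servers cannot parse the 3.5+ initial message.
    if (3, 4, 0) <= cur < (3, 4, 6) and tgt >= (3, 5, 0):
        steps.append(latest_3_4)
    steps.append(target)
    return steps


if __name__ == "__main__":
    print(" -> ".join(rolling_upgrade_steps("3.4.5", "3.5.6")))
```

The intermediate step only matters for rolling upgrades; with a full shutdown of the ensemble, old and new servers never have to talk to each other.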
Re: upgrade from 3.4.5 to 3.5.6
Hi,

Current ZooKeeper version: 3.4.5
Target version: 3.5.6

We are not going with 3.5.7. Our final decision is ZooKeeper version 3.5.6.
As per your reply, we first need to move to the latest 3.4.x version, like below:

3.4.5 -> 3.4.14 -> 3.5.6 (Correct me if I am wrong here)

But, as I shared, we are not facing any problem: we have set up a 3-node cluster where 2 nodes are on version 3.5.6 and 1 node is on 3.4.5. Everything is running fine and we didn't get any issue. So what other problems can we face if we move directly to 3.5.6?

Thanks,
-
Kuldeep Singh Budania
Software Architect
Re: upgrade from 3.4.5 to 3.5.6
Hi

You have to upgrade to the latest 3.4.x ZooKeeper first, then you can upgrade to 3.5.7.
All should run well without issues.

Enrico
upgrade from 3.4.5 to 3.5.6
Hi Team,

We are upgrading ZooKeeper from 3.4.5 to 3.5.6. I have set up a 3-node cluster where 2 nodes are on version 3.5.6 and 1 node is on 3.4.5.

Everything is running fine and I didn't get any issue on my system.

But I found something on the Apache site saying that we first need to upgrade to 3.4.6, and only then can we upgrade to 3.5.6. So is it mandatory to go to 3.4.6 first?

*Upgrading to 3.5.0*

Upgrading a running ZooKeeper ensemble to 3.5.0 should be done only after upgrading your ensemble to the 3.4.6 release. Note that this is only necessary for rolling upgrades (if you're fine with shutting down the system completely, you don't have to go through 3.4.6). If you attempt a rolling upgrade without going through 3.4.6 (for example from 3.4.5), you may get the following error:

2013-01-30 11:32:10,663 [myid:2] - INFO [localhost/127.0.0.1:2784:QuorumCnxManager$Listener@498] - Received connection request /127.0.0.1:60876

2013-01-30 11:32:10,663 [myid:2] - WARN [localhost/127.0.0.1:2784:QuorumCnxManager@349] - Invalid server id: -65536

During a rolling upgrade, each server is taken down in turn and rebooted with the new 3.5.0 binaries. Before starting the server with 3.5.0 binaries, we highly recommend updating the configuration file so that all server statements "server.x=..." contain client ports (see the section Specifying the client port). As explained earlier you may leave the configuration in a single file, as well as leave the clientPort/clientPortAddress statements (although if you specify client ports in the new format, these statements are now redundant).

Could you please let me know about this case? I would appreciate a quick response.

Thanks,
-
Kuldeep Singh Budania
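One way to check whether a mixed-version ensemble is hitting the error quoted above is to search the server logs for the WARN line. The sketch below is a hypothetical helper, assuming the log lines look like the two samples in this message; the regular expression and function name are illustrative and not part of ZooKeeper.

```python
import re

# Matches the WARN line from the excerpt and captures the reported ID.
INVALID_SID = re.compile(
    r"WARN\b.*QuorumCnxManager.*Invalid server id: (-?\d+)"
)


def find_invalid_server_ids(log_lines):
    """Return (line_number, reported_id) for each matching WARN line."""
    hits = []
    for number, line in enumerate(log_lines, start=1):
        match = INVALID_SID.search(line)
        if match:
            hits.append((number, int(match.group(1))))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python find_invalid_sid.py /path/to/zookeeper.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, sid in find_invalid_server_ids(handle):
            print(f"line {number}: invalid server id {sid}")
```

An empty result does not prove the upgrade is safe; it only shows that no old server has had to parse a new-format initial message so far.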
Re: Zookeeper not listening on 2888 and appears nodes are not connecting to each other.
There are multiple issues around your setup which were fixed recently.

The IllegalArgumentException at java.util.concurrent.ThreadPoolExecutor suggests you were hitting this issue: https://issues.apache.org/jira/browse/ZOOKEEPER-3758
This affects ZooKeeper 3.6.0 and we have already fixed it, as Enrico mentioned. Release 3.6.1 will solve this particular issue. Alternatively, you can set the following config as a workaround: multiAddress.reachabilityCheckEnabled=false (setting this won't be needed in 3.6.1, but in your case it will most probably help in 3.6.0).

This is a 3.6-specific issue and a 3.6-specific configuration parameter; you won't see this problem in 3.5.

However, you are also using 0.0.0.0 in your server config, which is not recommended since 3.5. This leads to another error when peers try to rejoin the quorum (see https://issues.apache.org/jira/browse/ZOOKEEPER-2164). This was also fixed, and the fix will be released in 3.5.8 and 3.6.1. As a workaround (and this is actually not only a workaround but the more consistent and dynamic re-config compatible way) you can use the following config on all three servers:

quorumListenOnAllIPs=true
server.1=< fqdn of server 1 >:2888:3888
server.2=< fqdn of server 2 >:2888:3888
server.3=< fqdn of server 3 >:2888:3888

The 'quorumListenOnAllIPs' config above will have the same effect: all the servers will listen on 0.0.0.0 locally. But it has the benefit that all the members still have the same view of the cluster, and the re-join problem should not happen here.

I hope these changes will help.

Kind regards,
Mate

On Tue, Mar 24, 2020 at 2:06 AM rld244 wrote:
> Thanks for getting back to me Enrico.
>
> I'm working within an AWS VPC with all nodes on a private subnet and I think
> Ping isn't enabled, but I can use nc and specific ports to verify that the
> servers can talk to each other.
>
> Interestingly I just noticed that on the node with myid 3 zookeeper is
> listening on 2888 and can be reached by the other nodes on 2181, 2888 and
> 3888.
>
> I don't know if that helps.
>
> Maybe I'll try deploying 3.5.x instead.
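The configuration advice in this message can be checked mechanically before a deployment, for example from an Ansible role. The following is a minimal sketch with hypothetical helper names; it only looks at zoo.cfg-style "key=value" text for server.N entries whose host part is 0.0.0.0 and for the quorumListenOnAllIPs setting, and it does not validate anything else.

```python
def parse_settings(cfg_text):
    """Parse zoo.cfg-style 'key=value' lines, skipping blanks and comments."""
    settings = {}
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def wildcard_server_entries(cfg_text):
    """Return the server.N keys whose host part is 0.0.0.0."""
    settings = parse_settings(cfg_text)
    return sorted(
        key
        for key, value in settings.items()
        if key.startswith("server.")
        and value.split(":", 1)[0].strip() == "0.0.0.0"
    )


def listens_on_all_ips(cfg_text):
    """Report whether quorumListenOnAllIPs is set to true."""
    value = parse_settings(cfg_text).get("quorumListenOnAllIPs", "false")
    return value.lower() == "true"
```

A config that returns an empty list from wildcard_server_entries and True from listens_on_all_ips matches the layout suggested above, where every member sees the same server list while still binding to all local addresses.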