I hate to say it, but I think 3.6.0 should release as is. It is impossible to *reliably* retrofit backwards compatibility / interoperability onto a release that was engineered from the beginning without that goal. Learn the lesson, set goals differently in the future.
On Tue, Feb 11, 2020 at 9:41 AM Szalay-Bekő Máté <szalay.beko.m...@gmail.com> wrote:

> FYI: I created these scripts for my local tests:
> https://github.com/symat/zk-rolling-upgrade-test
>
> For the long term I would also add a script that actually monitors the state of the quorum and also runs continuous traffic, not just 1-2 smoke tests after each restart. But I don't know how important this would be.
>
> On Tue, Feb 11, 2020 at 5:25 PM Enrico Olivelli <eolive...@gmail.com> wrote:
>
> > On Tue, Feb 11, 2020 at 17:17, Andor Molnar <an...@apache.org> wrote:
> > >
> > > The most obvious one that crosses my mind is one that I previously worked on:
> > >
> > > 1) run old version cluster,
> > > 2) connect to each node and run smoke tests,
> > > 3) restart one node with new code,
> > > 4) goto 2) until all nodes are upgraded
> > >
> > > I think this wouldn’t work in a “unit test”; we probably need a separate Jenkins job and a nice python script to do this.
> > >
> > > Andor
> > >
> > > > On 2020. Feb 11., at 16:38, Patrick Hunt <ph...@apache.org> wrote:
> > > >
> > > > Anyone have ideas how we could add testing for upgrade? Obviously something we're missing, esp. given its importance.
> >
> > I will send an email in the next few days with a proposal.
> > By the way, my idea is very similar to Andor's.
> >
> > Once we have an automated environment we can launch it from Jenkins.
> >
> > Enrico
> >
> > > > Patrick
> > > >
> > > > On Tue, Feb 11, 2020 at 12:40 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> > > >
> > > >> On Tue, Feb 11, 2020 at 09:12, Szalay-Bekő Máté <szalay.beko.m...@gmail.com> wrote:
> > > >>>
> > > >>> Hi All,
> > > >>>
> > > >>> about the question from Michael:
> > > >>>> Regarding the fix, can we just make 3.6.0 aware of the old protocol and
> > > >>>> speak old message format when it's talking to old server?
> > > >>>
> > > >>> In this particular case, it might be enough. The protocol change happened in the 'initial message' sent by the QuorumCnxManager. Maybe it is not a problem if the new servers cannot initiate channels to the old servers; maybe it is enough if these channels get initiated by the old servers only. I will test it quickly.
> > > >>>
> > > >>> However, I have no idea whether anything else changed in the quorum protocol between 3.5 and 3.6. In other cases it might not be enough for the new servers to understand the old messages, as the old servers can break by not understanding the messages from the new servers. Also, in the code currently (AFAIK) there is no generic knowledge of protocol versions: the servers do not store which protocol versions they can/should use to communicate with which particular other servers. Maybe we don't even need this, but I would feel better if we had more tests around these things.
> > > >>>
> > > >>> My suggestion for the long term:
> > > >>> - let's fix this particular issue now with 3.6.0 quickly (I will start doing this today)
> > > >>> - let's do some automation (backed by Jenkins) that will test a whole combination of different ZooKeeper upgrade paths by making rolling upgrades during some light traffic. Let's have a somewhat better definition of what we expect (e.g. the quorum is up, but some clients can get disconnected? What will happen to the ephemeral nodes? Do we want to gracefully close or transfer the user sessions before stopping the old server?) and let's see where this breaks. Just by checking the code, I don't think the quorum will always be up (e.g. between older 3.4 versions and 3.5).
> > > >>
> > > >> I am happy to work on this topic
> > > >>
> > > >>> - we need to update the Wiki about the working rolling upgrade paths and maybe about workarounds if needed
> > > >>> - we might need to do some fixes (adding backward-compatible versions and/or specific parameters that temporarily enforce the old protocol during the rolling upgrade and can be changed later to the new protocol by either dynamic reconfig or rolling restart)
> > > >>
> > > >> it would be much better for the 3.6 code to have some support for compatibility with 3.5 servers
> > > >> we can't require old code to be forward compatible, but we can make new code compatible to a certain extent with old code.
> > > >> If we can achieve this compatibility goal without a flag, that is better: users won't have to care about this part and can simply trust us
> > > >>
> > > >> The rollback story is also important, but maybe we are still not ready for it; in case of local changes to the store, it is better to have a clear design and plan and work towards a new release (3.7?)
> > > >>
> > > >> Enrico
> > > >>
> > > >>>
> > > >>> Depending on your comments, I am happy to create a few Jira tickets around these topics.
> > > >>>
> > > >>> Kind regards,
> > > >>> Mate
> > > >>>
> > > >>> ps. Enrico, sorry about your RC... I owe you a beer, let me know if you are near Budapest ;)
> > > >>>
> > > >>> On Tue, Feb 11, 2020 at 8:43 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> > > >>>
> > > >>>> Good.
> > > >>>>
> > > >>>> I will cancel the vote for 3.6.0rc2.
> > > >>>>
> > > >>>> I would appreciate it very much if Mate and his colleagues have time to work on a fix.
> > > >>>> Otherwise I will have cycles next week
> > > >>>>
> > > >>>> I would also like to spend my time setting up a few minimal integration tests for the upgrade story
> > > >>>>
> > > >>>> Enrico
> > > >>>>
> > > >>>> On Tue, Feb 11, 2020 at 07:30, Michael Han <h...@apache.org> wrote:
> > > >>>>
> > > >>>>> Kudos Enrico, very thorough work as the final gatekeeper of the release!
> > > >>>>>
> > > >>>>> Now with this, I'd like to *vote a -1* on the 3.6.0 RC2.
> > > >>>>>
> > > >>>>> I'd recommend we fix this issue for 3.6.0. ZooKeeper is one of the rare pieces of software that puts so much emphasis on compatibility that it just works on upgrade / downgrade, which is amazing. One guarantee we always had is that during a rolling upgrade the quorum will always be available, leading to no service interruption. It would be sad to lose such a capability given this is still a tractable problem.
> > > >>>>>
> > > >>>>> Regarding the fix, can we just make 3.6.0 aware of the old protocol and speak old message format when it's talking to old server? Basically, an ugly if else check against the protocol version should work and there is no need to have multiple passes in the rolling upgrade process.
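A minimal sketch of the "ugly if else" idea in the message above, written in Python rather than ZooKeeper's Java and not taken from the ZooKeeper code base. The wire layout (version, server id, payload length, payload), the `|` address separator, the value -65536 for the older version, and all function names are assumptions made for illustration; only -65535 comes from the error log quoted in this thread. It also leaves open how a new server would learn that a peer is old, which the thread notes is not tracked today.

```python
import struct

# -65535 is the value shown in the error log in this thread.
# -65536 for the older format is an assumption made for this sketch.
OLD_PROTOCOL_VERSION = -65536
NEW_PROTOCOL_VERSION = -65535

# Hypothetical header: version (int64), server id (int64), payload length (int32).
HEADER = ">qqi"


def encode_initial_message(version, sid, addresses):
    """Build an initial message in the format that `version` implies."""
    if version == OLD_PROTOCOL_VERSION:
        if len(addresses) != 1:
            raise ValueError("old format carries exactly one address")
        payload = addresses[0].encode("utf-8")
    elif version == NEW_PROTOCOL_VERSION:
        # Assumed new format: several addresses joined with '|'.
        payload = "|".join(addresses).encode("utf-8")
    else:
        raise ValueError(f"unrecognized protocol version {version}")
    return struct.pack(HEADER, version, sid, len(payload)) + payload


def decode_initial_message(data):
    """Accept both formats and return (version, sid, list_of_addresses)."""
    size = struct.calcsize(HEADER)
    version, sid, length = struct.unpack(HEADER, data[:size])
    payload = data[size:size + length].decode("utf-8")
    if version == OLD_PROTOCOL_VERSION:
        return version, sid, [payload]
    if version == NEW_PROTOCOL_VERSION:
        return version, sid, payload.split("|")
    raise ValueError(f"unrecognized protocol version {version}")


def version_for_peer(peer_is_old):
    """The if/else itself: speak the old format to peers known to be old."""
    return OLD_PROTOCOL_VERSION if peer_is_old else NEW_PROTOCOL_VERSION
```

With something of this shape, a new server could send `encode_initial_message(version_for_peer(True), sid, [addr])` to an old peer while still accepting either format on the receiving side.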
> > > >>>>>
> > > >>>>> On Mon, Feb 10, 2020 at 10:23 PM Enrico Olivelli <eolive...@gmail.com> wrote:
> > > >>>>>
> > > >>>>>> I suggest this plan:
> > > >>>>>> - release 3.6.0 now
> > > >>>>>> - improve the migration story; the flow outlined by Mate is interesting, but it will take time
> > > >>>>>>
> > > >>>>>> 3.6.0rc2 got enough binding votes so I am going to finalize the release this evening (within 8-10 hours) if no one comes out in the VOTE thread with a -1
> > > >>>>>>
> > > >>>>>> Enrico
> > > >>>>>>
> > > >>>>>> On Mon, Feb 10, 2020 at 19:33, Patrick Hunt <ph...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>> On Mon, Feb 10, 2020 at 3:38 AM Andor Molnar <an...@apache.org> wrote:
> > > >>>>>>>
> > > >>>>>>>> Hi,
> > > >>>>>>>>
> > > >>>>>>>> Answers inline.
> > > >>>>>>>>
> > > >>>>>>>>> In my experience when you are close to a release it is better not to make big changes. (I am among the approvers of that patch, so I am responsible for this change)
> > > >>>>>>>>
> > > >>>>>>>> Although this statement is acceptable to me, I don’t feel this patch should not have been merged into 3.6.0. Submission was preceded by a long argument with the MAPR folks, who originally wanted it merged into the 3.4 branch (considering the pace at which the ZooKeeper community is moving forward), and we reached an agreement to release it with 3.6.0.
> > > >>>>>>>>
> > > >>>>>>>> To make a long story short, this patch had been outstanding for ages without much attention from the community, and the contributors made a lot of effort to get it done before the release.
> > > >>>>>>>>
> > > >>>>>>>>> I would like to hear from people that have been in the community for a long time, then I am ready to complete the release process for 3.6.0rc2.
> > > >>>>>>>>
> > > >>>>>>>> Me too.
> > > >>>>>>>>
> > > >>>>>>>> I tend to accept the way rolling restart works now - as you described, Enrico - and given that the situation was pretty much the same between 3.4 and 3.5, I don’t feel we have to make additional changes.
> > > >>>>>>>>
> > > >>>>>>>> On the other hand, the fix that Mate suggested sounds quite cool; I’m also happy to work on getting it in.
> > > >>>>>>>>
> > > >>>>>>>> Fyi, Release Management page says the following:
> > > >>>>>>>>
> > > >>>>>>>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/ReleaseManagement
> > > >>>>>>>>
> > > >>>>>>>> "major.minor release of ZooKeeper must be backwards compatible with the previous minor release, major.(minor-1)"
> > > >>>>>>>>
> > > >>>>>>> Our users, direct and indirect, value the ability to migrate to newer versions - esp. as we drop support for older ones. Frictions such as this can be a reason to go elsewhere. I'm "pro" b/w compat - esp. given our published guidelines.
> > > >>>>>>>
> > > >>>>>>> Patrick
> > > >>>>>>>
> > > >>>>>>>> Andor
> > > >>>>>>>>
> > > >>>>>>>>> On 2020. Feb 10., at 11:32, Enrico Olivelli <eolive...@gmail.com> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>> Thank you Mate for checking and explaining this story.
> > > >>>>>>>>>
> > > >>>>>>>>> I find it very interesting that the cause is ZOOKEEPER-3188 as:
> > > >>>>>>>>> - it is the last "big patch" committed to 3.6 before starting the release process
> > > >>>>>>>>> - it is the cause of the failure of the first RC
> > > >>>>>>>>>
> > > >>>>>>>>> In my experience when you are close to a release it is better not to make big changes. (I am among the approvers of that patch, so I am responsible for this change)
> > > >>>>>>>>>
> > > >>>>>>>>> This is a pointer to the change for anyone who wants to understand the context better:
> > > >>>>>>>>>
> > > >>>>>>>>> https://github.com/apache/zookeeper/pull/1048/files#diff-7a209d890686bcba351d758b64b22a7dR11
> > > >>>>>>>>>
> > > >>>>>>>>> IIUC even for the upgrade from 3.4 to 3.5 the story was the same, and if this statement holds then I feel we can continue with this release.
> > > >>>>>>>>>
> > > >>>>>>>>> - Reverting ZOOKEEPER-3188 is not an option for me, it is too complex.
> > > >>>>>>>>> - Making 3.5 and 3.6 "compatible" can be very tricky and we do not have tools to certify this compatibility (at least not in the short term)
> > > >>>>>>>>>
> > > >>>>>>>>> I would like to hear from people that have been in the community for a long time, then I am ready to complete the release process for 3.6.0rc2.
> > > >>>>>>>>>
> > > >>>>>>>>> I will update the website and the release notes with a specific warning about the upgrade; we should also update the Wiki
> > > >>>>>>>>>
> > > >>>>>>>>> Enrico
> > > >>>>>>>>>
> > > >>>>>>>>> On Mon, Feb 10, 2020 at 11:17, Szalay-Bekő Máté <szalay.beko.m...@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>> Hi Enrico!
> > > >>>>>>>>>>
> > > >>>>>>>>>> This is caused by the different PROTOCOL_VERSION in the QuorumCnxManager. The protocol version was last changed in ZOOKEEPER-2186, first released in 3.4.7 and 3.5.1, to avoid some crashes / fix some bugs. Later I also changed the protocol version when the format of the initial message changed in ZOOKEEPER-3188. So the quorum protocol is actually not compatible in this case, and this is the 'expected' behavior if you upgrade e.g. from 3.4.6 to 3.4.7, or 3.4.6 to 3.5.5, or e.g. from 3.5.6 to 3.6.0.
> > > >>>>>>>>>>
> > > >>>>>>>>>> We had some discussion in the PR of ZOOKEEPER-3188 back then and came to the conclusion that it is not that bad, as there will be no data loss, as you wrote. The tricky thing is that during a rolling upgrade we should ensure both backward and forward compatibility to make sure that the old and the new part of the quorum can still speak to each other. The current solution (simply failing if the protocol versions mismatch) is simpler and still works just fine: as the servers are restarted one by one, the nodes with the old protocol version and the nodes with the new protocol version will form two partitions, but at any given time only one partition will have the quorum.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Still, thinking it through, as a side effect in these cases there will be a short time when neither partition has a quorum (when we have N servers with the old protocol version, N servers with the new protocol version, and there is one server just being restarted). I am not sure if we can accept this.
> > > >>>>>>>>>>
> > > >>>>>>>>>> For ZOOKEEPER-3188 we can add a small patch to make it possible to parse the initial message of the old protocol version with the new code. But I am not sure if it would be enough (as the old code will not be able to parse the new initial message).
> > > >>>>>>>>>>
> > > >>>>>>>>>> One option is to also make a patch for 3.5 to have a version which supports both protocol versions (let's say in 3.5.8). Then we can write in the release notes that if you need a rolling upgrade from any version since 3.4.7, then you have to first upgrade to 3.5.8 before upgrading to 3.6.0. We can even do the same thing on the 3.4 branch.
> > > >>>>>>>>>>
> > > >>>>>>>>>> But I am also new to the community... It would be great to hear the opinion of more experienced people.
> > > >>>>>>>>>> Whatever the decision will be, I am happy to make the changes.
> > > >>>>>>>>>>
> > > >>>>>>>>>> And sorry for breaking the RC (if we decide that this needs to be changed...). ZOOKEEPER-3188 was a complex patch.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Kind regards,
> > > >>>>>>>>>> Mate
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Mon, Feb 10, 2020 at 9:47 AM Enrico Olivelli <eolive...@gmail.com> wrote:
> > > >>>>>>>>>>
> > > >>>>>>>>>>> Hi,
> > > >>>>>>>>>>> even though we had enough binding +1 votes on 3.6.0rc2, before closing the VOTE for 3.6.0 I wanted to finish my tests, and I have run into an apparent blocker.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I am trying to upgrade a 3.5.6 cluster to 3.6.0, but it looks like peers are not able to talk to each other.
> > > >>>>>>>>>>> I have a cluster of 3: server1, server2 and server3.
> > > >>>>>>>>>>> When I upgrade server1 to 3.6.0rc2 I see this kind of error on the 3.5 nodes:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> 2020-02-10 09:35:07,745 [myid:3] - INFO [localhost/127.0.0.1:3334:QuorumCnxManager$Listener@918] - Received connection request 127.0.0.1:62591
> > > >>>>>>>>>>> 2020-02-10 09:35:07,746 [myid:3] - ERROR [localhost/127.0.0.1:3334:QuorumCnxManager@527] - org.apache.zookeeper.server.quorum.QuorumCnxManager$InitialMessage$InitialMessageException: Got unrecognized protocol version -65535
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Once I upgrade all of the peers the system is up and running, apparently without data loss.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> During the upgrade, as soon as I upgrade the first node, say server1, server1 is not able to accept connections from clients (error "Close of session 0x0 java.io.IOException: ZooKeeperServer not running"). This is expected: as long as it cannot talk with the other peers, it is practically partitioned away from the cluster.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> My questions are:
> > > >>>>>>>>>>> 1) is this expected? I can't remember protocol changes from 3.5 to 3.6, but 3.6 diverged from the 3.5 branch so long ago, and I was not in the community as a dev then, so I cannot tell
> > > >>>>>>>>>>> 2) is this a viable option for users? to have some temporary glitch during the upgrade and hope that the upgrade completes without trouble?
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> In theory as long as two servers are running the same major version (3.5 or 3.6) we have a quorum and the system is able to make progress and to serve clients.
> > > >>>>>>>>>>> I feel that this is quite dangerous, but I don't have enough context to understand how this problem is possible and when we decided to break compatibility.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The other option is that I am wrong in my test and I am messing up :-)
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> The other upgrade path I would like to see working like a charm is the upgrade from 3.4 to 3.6, as I see that as soon as we release 3.6 we should encourage users to move to 3.6 and not to 3.5.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Regards
> > > >>>>>>>>>>> Enrico
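
The partition behaviour discussed in this thread (old and new servers cannot talk, servers are restarted one at a time) can be illustrated with a small model. This is only a sketch under simplifying assumptions: it counts servers and ignores leader election time, client sessions and everything else a real ensemble does. The function names are made up for this example.

```python
def has_quorum(group_size, ensemble_size):
    """A group can form a quorum only with a strict majority of the ensemble."""
    return group_size > ensemble_size // 2


def rolling_upgrade_timeline(ensemble_size):
    """Return (old_up, new_up, quorum_available) for every phase of the upgrade.

    Assumes servers are restarted one at a time and that old and new
    servers cannot talk to each other at all.
    """
    timeline = []
    for upgraded in range(ensemble_size):
        old_up = ensemble_size - upgraded - 1
        # Phase 1: one old server is down, being restarted with new code.
        # Phase 2: that server has rejoined on the new side.
        for new_up in (upgraded, upgraded + 1):
            available = (has_quorum(old_up, ensemble_size)
                         or has_quorum(new_up, ensemble_size))
            timeline.append((old_up, new_up, available))
    return timeline


if __name__ == "__main__":
    for old_up, new_up, available in rolling_upgrade_timeline(3):
        print(f"old={old_up} new={new_up} quorum={'yes' if available else 'NO'}")
```

For a 3-node ensemble the model has exactly one phase without a quorum: one old server up, one new server up, and one server restarting. This is the "N old, N new, one restarting" window described in the thread, with N = 1.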