So, you guys saying we should upgrade to Netty 4 and backport the following patches too:
ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak ZOOKEEPER-3163 Use session map to improve the performance when closing session in Netty ZOOKEEPER-3146 Limit the maximum client connections per IP in NettyServerCnxnFactory ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep the same behavior and make the code easier to maintain I think it’s still deliverable with the stable release. Regarding the timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in 3.5.6. Is it acceptable to do such a big with a minor version change? Upgrading now isn't ideal either, but maybe somewhat less brutal. Regards, Andor > On 2019. Jan 8., at 2:10, Brian Nixon <brian.nixon...@gmail.com> wrote: > > Also worth consideration for the Netty fixes backport bundle: > > ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak > > Our experience with Netty has been positive so far though I don't think we > have enough info to declare it is stable. It makes a lot of sense to me for > 3.5 to use Netty 4, the biggest question is the timing. > > > On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote: > >> I got another one: >> >> ZOOKEEPER-3163 Use session map to improve the performance when closing >> session in Netty >> >> >> So, that doesn't seem too many. We can talk about backporting them, but >> I don't think they're super critical for the first stable release. >> >> Regarding 3.4, I need to validate the embedded scenario, but at least >> the Java 11 build is green. :) >> >> >> Andor >> >> >> >> On 1/4/19 18:08, Enrico Olivelli wrote: >>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar >>> <nkal...@cloudera.com.invalid> ha scritto: >>>> +1 for Netty 4 in 3.5 >>>> >>>> Pretty much all the pros and cons has been said before me. >>>> I would only add that this is not a new functionality that we wan't to >>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit of >>>> change unfortunately. >>>> >>>> Regards, >>>> Norbert >>>> >>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> wrote: >>>> >>>>> What are those patches exactly? >>>>> >>>>> Comparing the ported version of 3.5 with master I’ve only found 2 >> patches >>>>> which are missing: >>>>> >>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in >>>>> NettyServerCnxnFactory >>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep >>>>> the same behavior and make the code easier to maintain >>>>> >>>>> None of them are critical I would say. >>>>> Is there anything else I’m missing? >>> I have compared the histories and I think Andor you are right. >>> >>> master: >>> https://github.com/apache/zookeeper/commits/master >>> >>> branch-3.5: >>> https://github.com/apache/zookeeper/commits/branch-3.5 >>> >>> Sorry I was thinking to the amount of patches related to TLS stuff , >>> but they are not related to Netty 4. >>> >>> What about branch 3.4 ? >>> We don't have reconfig but the case "Start embedded ZK, stop it and >>> try to restart on the same port" should apply as well. >>> >>> Enrico >>> >>>>> Andor >>>>> >>>>> >>>>> >>>>> >>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eolive...@gmail.com> >> wrote: >>>>>> >>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar >>>>>> <an...@apache.org <mailto:an...@apache.org>> ha scritto: >>>>>>> Hi team / Enrico, >>>>>>> >>>>>>> I’d like to get feedback from the community on the following patch >>>>> (moving the discussion from GitHub to here): >>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 < >>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204> >>>>>>> https://github.com/apache/zookeeper/pull/753 < >>>>> https://github.com/apache/zookeeper/pull/753> >>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java 11: it >>>>> doesn’t properly close the underlying socket (probably not closing the >>>>> registered NIO selectors) and reconfig tests are unable to re-bind the >>>>> ports. This problem is similar that we already fixed in NIO with the >>>>> following patch: >>>>> >> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 >>>>> < >>>>> >> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 >>>>>>> The problem doesn’t show up on trunk which has been recently upgraded >>>>> to Netty 4. >>>>>>> Repro: >>>>>>> - Start embedded ZK, stop it and try to restart on the same port, or >>>>>>> - Start normal ensemble and reconfig to use different (client) port. >>>>> Then reconfig back to the original port which should fail. (that’s the >>>>> scenario which is covered in ReconfigTest) >>>>>>> I created the above patch (#753) to backport Netty 4 upgrade to 3.5 >> and >>>>> it fixes the problem with Java 11 (it doesn’t cause regression in the >>>>> pre-commit build either), but Enrico is having concerns about making >> such >>>>> big change before the release. >>>>>>> I tend to agree, but let’s see what are the options. >>>>>>> >>>>>>> Thoughts: >>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is >>>>> critical. >>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in Netty 3, >>>>> what can we do? >>>>>>> a) We cannot workaround from ZooKeeper itself and have to >> submit >>>>> a pull request for Netty. I think it’s quite unlikely that they will >> accept >>>>> the change given it’s not a security bug, but even if they did, only >> the >>>>> upgraded version of Netty 3 would work properly with ZooKeeper. Err. >>>>>>> b) We can workaround it from ZooKeeper: that could be option >> #1, >>>>> but I have a strong feeling about it’s not going to be the case. >>>>>>> - Shall we upgrade to Netty 4? - this is option #2 >>>>>>> >>>>>>> Please share your thoughts, maybe you know about an option #3. >>>>>> Thank you Andor >>>>>> >>>>>> I have thought more about this problem, and I have checked that Netty >>>>>> 3 is really dead/unmantained (last release in 2016). >>>>>> If I understand correctly there is no easy workaround (nothing without >>>>>> hacking Netty 3 internals) >>>>>> >>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully >>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support). >>>>>> The network stack is very important so it is better to have Netty 4 as >>>>>> foundation, I am thinking about security issues, we won't make an >>>>>> "hotfix" release with the switch to Netty 4 because there is a bad bug >>>>>> in Netty 3. >>>>>> So better to switch now. >>>>>> >>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes >>>>>> around Netty on master branch, we must be sure that what we are >>>>>> delivering in 3.5.5 is stable. >>>>>> >>>>>> We will also have to state clearly in the "release notes" that Netty >>>>>> version is changed, as this may have a non trivial impact to memory >>>>>> usage (i.e. Netty 4 uses more Direct memory by default) >>>>>> >>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take care >>>>>> of port all of the fixes around Netty 4 from master branch and we >>>>>> state the switch clearly in the release notes >>>>>> >>>>>> Enrico >>>>>> >>>>>>> Regards, >>>>>>> Andor >>>>> >>