Thank you guys for your support. I leave this open for the community to decide, but if none of the binding voters (committers, PMCs) respond in 1 week, I'll take responsibility and commit the patches. I know holidays are still going on, but 3.5 release is important we should make progress.
Thanks, Andor On 1/9/19 21:54, Norbert Kalmar wrote: > I agree with backporting Netty 4 and the related patches to 3.5.5. > +1 (non binding) (and I might have already voted) > > If it took so long to release a stable 3.5, I don't think we should leave a > known bug in the system, which has such an impact (not supporting current > LTS java). > The PRs shouldn't be hard to backport, no new development is required, and > unfortunately maven migration isn't finished yet, so it's not like we need > to postpone the release just because of Netty upgrade. That's my two cents > anyway. > > Regards, > Norbert > > On Wed, Jan 9, 2019 at 9:19 PM Enrico Olivelli <eolive...@gmail.com> wrote: > >> Il giorno mer 9 gen 2019 alle ore 21:16 Andor Molnar >> <an...@apache.org> ha scritto: >>> So, you guys saying we should upgrade to Netty 4 and backport the >> following patches too: >>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak >>> ZOOKEEPER-3163 Use session map to improve the performance when closing >> session in Netty >>> ZOOKEEPER-3146 Limit the maximum client connections per IP in >> NettyServerCnxnFactory >>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to keep >> the same behavior and make the code easier to maintain >>> I think it’s still deliverable with the stable release. Regarding the >> timing, let’s say we release 3.5.5 with Netty 3 and do the upgrade in >> 3.5.6. Is it acceptable to do such a big with a minor version change? >> >> I think we can do the upgrade in 3.5.5. >> We can't say Java 11 is supported if we don't have tests working. And >> Java 11 is the current LTS release from Oracle. >> >> If we release now 3.5.5 then we need to work for 3.5.6 and then >> release....it is a double work, we don't have so much resources :-( >> >> IMHO it is better to release 3.5.5 and start focusing on 3.6.0 >> >> Enrico >> >>> Upgrading now isn't ideal either, but maybe somewhat less brutal. >>> >>> Regards, >>> Andor >>> >>> >>> >>> >>>> On 2019. Jan 8., at 2:10, Brian Nixon <brian.nixon...@gmail.com> >> wrote: >>>> Also worth consideration for the Netty fixes backport bundle: >>>> >>>> ZOOKEEPER-3131 org.apache.zookeeper.server.WatchManager resource leak >>>> >>>> Our experience with Netty has been positive so far though I don't >> think we >>>> have enough info to declare it is stable. It makes a lot of sense to >> me for >>>> 3.5 to use Netty 4, the biggest question is the timing. >>>> >>>> >>>> On Fri, Jan 4, 2019 at 10:44 AM Andor Molnár <an...@apache.org> wrote: >>>> >>>>> I got another one: >>>>> >>>>> ZOOKEEPER-3163 Use session map to improve the performance when closing >>>>> session in Netty >>>>> >>>>> >>>>> So, that doesn't seem too many. We can talk about backporting them, >> but >>>>> I don't think they're super critical for the first stable release. >>>>> >>>>> Regarding 3.4, I need to validate the embedded scenario, but at least >>>>> the Java 11 build is green. :) >>>>> >>>>> >>>>> Andor >>>>> >>>>> >>>>> >>>>> On 1/4/19 18:08, Enrico Olivelli wrote: >>>>>> Il giorno ven 4 gen 2019 alle ore 17:04 Norbert Kalmar >>>>>> <nkal...@cloudera.com.invalid> ha scritto: >>>>>>> +1 for Netty 4 in 3.5 >>>>>>> >>>>>>> Pretty much all the pros and cons has been said before me. >>>>>>> I would only add that this is not a new functionality that we wan't >> to >>>>>>> backport. It's a criticall(ish?) bugfix, which requires quite a bit >> of >>>>>>> change unfortunately. >>>>>>> >>>>>>> Regards, >>>>>>> Norbert >>>>>>> >>>>>>> On Fri, Jan 4, 2019 at 4:36 PM Andor Molnar <an...@apache.org> >> wrote: >>>>>>>> What are those patches exactly? >>>>>>>> >>>>>>>> Comparing the ported version of 3.5 with master I’ve only found 2 >>>>> patches >>>>>>>> which are missing: >>>>>>>> >>>>>>>> ZOOKEEPER-3146 Limit the maximum client connections per IP in >>>>>>>> NettyServerCnxnFactory >>>>>>>> ZOOKEEPER-3177 Refactor request throttle logic in NIO and Netty to >> keep >>>>>>>> the same behavior and make the code easier to maintain >>>>>>>> >>>>>>>> None of them are critical I would say. >>>>>>>> Is there anything else I’m missing? >>>>>> I have compared the histories and I think Andor you are right. >>>>>> >>>>>> master: >>>>>> https://github.com/apache/zookeeper/commits/master >>>>>> >>>>>> branch-3.5: >>>>>> https://github.com/apache/zookeeper/commits/branch-3.5 >>>>>> >>>>>> Sorry I was thinking to the amount of patches related to TLS stuff , >>>>>> but they are not related to Netty 4. >>>>>> >>>>>> What about branch 3.4 ? >>>>>> We don't have reconfig but the case "Start embedded ZK, stop it and >>>>>> try to restart on the same port" should apply as well. >>>>>> >>>>>> Enrico >>>>>> >>>>>>>> Andor >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> On 2019. Jan 4., at 16:27, Enrico Olivelli <eolive...@gmail.com> >>>>> wrote: >>>>>>>>> Il giorno ven 4 gen 2019 alle ore 14:23 Andor Molnar >>>>>>>>> <an...@apache.org <mailto:an...@apache.org>> ha scritto: >>>>>>>>>> Hi team / Enrico, >>>>>>>>>> >>>>>>>>>> I’d like to get feedback from the community on the following >> patch >>>>>>>> (moving the discussion from GitHub to here): >>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204 < >>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-3204> >>>>>>>>>> https://github.com/apache/zookeeper/pull/753 < >>>>>>>> https://github.com/apache/zookeeper/pull/753> >>>>>>>>>> In a nutshell: looks like that Netty 3.10 is broken under Java >> 11: it >>>>>>>> doesn’t properly close the underlying socket (probably not closing >> the >>>>>>>> registered NIO selectors) and reconfig tests are unable to re-bind >> the >>>>>>>> ports. This problem is similar that we already fixed in NIO with >> the >>>>>>>> following patch: >>>>>>>> >> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 >>>>>>>> < >>>>>>>> >> https://github.com/apache/zookeeper/commit/c3babb94275ad667dc71c10dcb08a383a3c154c2 >>>>>>>>>> The problem doesn’t show up on trunk which has been recently >> upgraded >>>>>>>> to Netty 4. >>>>>>>>>> Repro: >>>>>>>>>> - Start embedded ZK, stop it and try to restart on the same >> port, or >>>>>>>>>> - Start normal ensemble and reconfig to use different (client) >> port. >>>>>>>> Then reconfig back to the original port which should fail. (that’s >> the >>>>>>>> scenario which is covered in ReconfigTest) >>>>>>>>>> I created the above patch (#753) to backport Netty 4 upgrade to >> 3.5 >>>>> and >>>>>>>> it fixes the problem with Java 11 (it doesn’t cause regression in >> the >>>>>>>> pre-commit build either), but Enrico is having concerns about >> making >>>>> such >>>>>>>> big change before the release. >>>>>>>>>> I tend to agree, but let’s see what are the options. >>>>>>>>>> >>>>>>>>>> Thoughts: >>>>>>>>>> - Do we have to fix this? - Yes. Java 11 is LTS and I the bug is >>>>>>>> critical. >>>>>>>>>> - Can we fix Netty 3? - Maybe. Let’s say we find the bug in >> Netty 3, >>>>>>>> what can we do? >>>>>>>>>> a) We cannot workaround from ZooKeeper itself and have to >>>>> submit >>>>>>>> a pull request for Netty. I think it’s quite unlikely that they >> will >>>>> accept >>>>>>>> the change given it’s not a security bug, but even if they did, >> only >>>>> the >>>>>>>> upgraded version of Netty 3 would work properly with ZooKeeper. >> Err. >>>>>>>>>> b) We can workaround it from ZooKeeper: that could be option >>>>> #1, >>>>>>>> but I have a strong feeling about it’s not going to be the case. >>>>>>>>>> - Shall we upgrade to Netty 4? - this is option #2 >>>>>>>>>> >>>>>>>>>> Please share your thoughts, maybe you know about an option #3. >>>>>>>>> Thank you Andor >>>>>>>>> >>>>>>>>> I have thought more about this problem, and I have checked that >> Netty >>>>>>>>> 3 is really dead/unmantained (last release in 2016). >>>>>>>>> If I understand correctly there is no easy workaround (nothing >> without >>>>>>>>> hacking Netty 3 internals) >>>>>>>>> >>>>>>>>> As soon as we will declare 3.5.5 "stable" the world will hopefully >>>>>>>>> abandon 3.4 and switch to 3.5 + Netty (because of SSL support). >>>>>>>>> The network stack is very important so it is better to have Netty >> 4 as >>>>>>>>> foundation, I am thinking about security issues, we won't make an >>>>>>>>> "hotfix" release with the switch to Netty 4 because there is a >> bad bug >>>>>>>>> in Netty 3. >>>>>>>>> So better to switch now. >>>>>>>>> >>>>>>>>> But Facebooks friends, expecially @ivmaykov did a lot of bugfixes >>>>>>>>> around Netty on master branch, we must be sure that what we are >>>>>>>>> delivering in 3.5.5 is stable. >>>>>>>>> >>>>>>>>> We will also have to state clearly in the "release notes" that >> Netty >>>>>>>>> version is changed, as this may have a non trivial impact to >> memory >>>>>>>>> usage (i.e. Netty 4 uses more Direct memory by default) >>>>>>>>> >>>>>>>>> So to recap my final opinion: +1 to switch to Netty 4 if we take >> care >>>>>>>>> of port all of the fixes around Netty 4 from master branch and we >>>>>>>>> state the switch clearly in the release notes >>>>>>>>> >>>>>>>>> Enrico >>>>>>>>> >>>>>>>>>> Regards, >>>>>>>>>> Andor