Hi devs,

I would like to revive this discussion.
In our ARM environment, we hit a compile error when building Flink 1.17. Flink 1.17 depends on flink-shaded 16.1, which uses Netty 4.1.82, and flink-shaded 16.1 fails to compile on ARM; as a result, we are unable to compile Flink 1.17. We have verified that flink-shaded compiles successfully on ARM with Netty 4.1.83 or later. I believe this is a critical bug that needs to be addressed first.

Taking into consideration the previous discussions regarding compatibility and the external connectors that depend on this version, I propose addressing the bug by only bumping flink-shaded's Netty to the next patch version (e.g., 4.1.83) rather than backporting FLINK-32032. This approach avoids introducing other changes, such as a higher Netty version (4.1.91), potential Guava version changes, Curator version changes, and so on.

WDYT?

Best,
Yuxin

Jing Ge <j...@ververica.com.invalid> wrote on Thu, Oct 5, 2023 at 14:39:

> Hi Chesnay,
>
> Thanks for joining this discussion and sharing your thoughts!
>
> > Connectors shouldn't depend on flink-shaded.
>
> Perfect! We are on the same page. If you read through the discussion,
> you will see that, currently, there are many connectors that depend on
> flink-shaded.
>
> > Connectors are small enough in scope that depending directly on
> > guava/jackson/etc. is a fine approach, and they have plenty of other
> > dependencies that they need to manage anyway; let's treat these the same
> > way.
>
> It would be even better if we could do that. Jira tickets[1] have been
> created.
>
> > As for class-loading, there has been a long-standing goal of each
> > connector being loaded in its own classloader. That still is the north
> > star and the only reasonable way to ensure that multiple connectors can
> > be safely used with SQL.
>
> What is the current status? Do we have any Jira ticket for that?
>
> Best regards,
> Jing
>
> [1] https://issues.apache.org/jira/browse/FLINK-33190
>
> On Wed, Oct 4, 2023 at 4:43 PM Chesnay Schepler <ches...@apache.org>
> wrote:
>
> > There is no "monolithic" flink-shaded dependency.
> > Connectors shouldn't depend on anything that Flink provides, but be
> > self-contained, as Martijn pointed out.
> >
> > > Connectors shouldn't depend on flink-shaded.
> >
> > The overhead and/or risks of doing/supporting that right now far
> > outweigh the benefits
> > (because we either have to encode the full version of all dependencies
> > into the package, or accept the risk of minor/patch dependency clashes).
> >
> > Connectors are small enough in scope that depending directly on
> > guava/jackson/etc. is a fine approach, and they have plenty of other
> > dependencies that they need to manage anyway; let's treat these the same
> > way.
> >
> > Naturally, this is also an argument against flink-shaded-connectors; on
> > top of that, we already experience repo creep, and managing releases is
> > difficult enough as-is.
> >
> > As for class-loading, there has been a long-standing goal of each
> > connector being loaded in its own classloader. That still is the north
> > star and the only reasonable way to ensure that multiple connectors can
> > be safely used with SQL.
> >
> > On 02/10/2023 18:32, Jing Ge wrote:
> > > Hi Sergey,
> > >
> > > Thanks for sharing your thoughts. It could help somewhat, but it
> > > doesn't get to the root of this issue.
> > >
> > > According to the documentation, flink-shaded is used to provide a single
> > > instance of a shaded dependency across sub-modules in the Flink repo.
> > > Shaded namespaces should be used where shaded dependencies are
> > > configured. After connectors were externalized, we ended up with more
> > > repos depending on one shaded jar, e.g. Guava.
> > > This is a "monolithic" dependency setup that makes it difficult to
> > > change the root (flink-shaded), because any changes to the root have to
> > > be propagated to all downstream repos. Even worse, not every downstream
> > > repo is known while the root is being modified.
> > >
> > > Since all externalized connectors have their own repos and are no longer
> > > sub-modules of Flink, I would suggest the following upgrade:
> > >
> > > 1. Connectors should use their own classloader instead of Flink's
> > > classloader. This will break the monolithic dependency. Connectors and
> > > Flink can use different versions of flink-shaded.
> > > 2. [optional] It would be even better if all connector repos depended on
> > > their own individual shaded repo, e.g. flink-connector-shaded;
> > > flink-shaded should only be used by Flink.
> > >
> > > WDYT?
> > >
> > > Best regards,
> > > Jing
> > >
> > > On Thu, Sep 14, 2023 at 11:28 PM Sergey Nuyanzin <snuyan...@gmail.com>
> > > wrote:
> > >
> > >> Yes, that's a reasonable question, thanks for raising it.
> > >>
> > >> I think this is not only about flink-shaded, but rather about
> > >> dependencies in general.
> > >>
> > >> I guess there is no rule of thumb, or at least I'm not aware of one.
> > >> Here are my thoughts:
> > >> 1. If bumping a dependency doesn't require breaking changes and passes
> > >> existing tests, then just bump it.
> > >> 2. In case there are breaking changes, we could consider doing this
> > >> within the next major release; for a minor release:
> > >>    a. try to answer the question whether it impacts Flink or not;
> > >>    b. in case it impacts Flink and the fix itself is relatively small,
> > >> then to avoid a breaking change we could copy classes with the fixes
> > >> into the Flink repo, like usually happens with Calcite-related fixes.
> > >> The problem with this approach is that I guess it will not work for
> > >> non-JVM deps like e.g. RocksDB;
> > >>    c.
> > >> in case there is no way to do it without breaking changes for a minor
> > >> release, then we probably need some sort of announcement motivating
> > >> users to move to another major version where the issue is fixed.
> > >>
> > >> Looking forward to seeing other opinions about that.
> > >>
> > >> On Wed, Sep 13, 2023 at 9:47 PM Jing Ge <j...@ververica.com.invalid>
> > >> wrote:
> > >>
> > >>> Hi Sergey,
> > >>>
> > >>> Thanks for doing the analysis and providing great insight. I did my
> > >>> own analysis and came to the same conclusion. I just wanted to use this
> > >>> example to kick off a discussion and check whether there is a common
> > >>> guideline or concept in the community for handling such cases, since it
> > >>> seems any bump might have a big impact. This time, we are kind of
> > >>> lucky. What if CVE-related code had been used in Flink? I do see it as
> > >>> an issue that upgrading a lib in the flink-shaded repo is not
> > >>> recommended because of the complexity. IMHO, we don't want to put the
> > >>> cart before the horse.
> > >>>
> > >>> Best regards,
> > >>> Jing
> > >>>
> > >>> On Wed, Sep 13, 2023 at 9:11 PM Sergey Nuyanzin <snuyan...@gmail.com>
> > >>> wrote:
> > >>>
> > >>>> Thanks for raising this.
> > >>>>
> > >>>> I would suggest double-checking whether it actually impacts Flink or
> > >>>> not.
> > >>>>
> > >>>> For instance, on one side, Calcite between 1.22.0 and 1.31.0 has a
> > >>>> critical CVE-2022-39135 [1];
> > >>>> on the other side, Flink does not use this functionality and is not
> > >>>> impacted by this.
> > >>>>
> > >>>> Regarding Guava:
> > >>>> after a closer look at Guava's high CVE you mentioned [2], and based
> > >>>> on the GitHub issue describing the problem (it has existed since
> > >>>> 2016) [3], there is a security issue with the class
> > >>>> com.google.common.io.FileBackedOutputStream.
> > >>>>
> > >>>> While looking through the source code for usages of
> > >>>> com.google.common.io.FileBackedOutputStream, I was not able to find
> > >>>> any.
> > >>>> I was also not able to find usage of
> > >>>> org.apache.flink.shaded.guava30.com.google.common.io.Files#createTempDir,
> > >>>> which was fixed in commit [4].
> > >>>> Nor was I able to find other Guava classes that use the problematic
> > >>>> one. Currently, I tend to think that it probably does not impact Flink
> > >>>> either. Please correct me if I'm wrong.
> > >>>>
> > >>>> [1] https://nvd.nist.gov/vuln/detail/CVE-2022-39135
> > >>>> [2] https://nvd.nist.gov/vuln/detail/CVE-2023-2976
> > >>>> [3] https://github.com/google/guava/issues/2575
> > >>>> [4] https://github.com/google/guava/commit/b3719763b9ca777b94554d7ea2d2b92e27ac6164
> > >>>>
> > >>>> On Wed, Sep 13, 2023 at 6:26 PM Jing Ge <j...@ververica.com.invalid>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Sergey, Hi Martijn,
> > >>>>>
> > >>>>> Thanks for the information and suggestions. That is rational.
> > >>>>>
> > >>>>> On one hand, we should not do the upgrade according to all your
> > >>>>> thoughts; on the other hand, Guava 30.1.1-jre has known CVEs[1] and
> > >>>>> the next closest version without CVEs is 32.0.1-jre. Without the
> > >>>>> upgrade, users might not be able to use Flink 1.17 in production, and
> > >>>>> Flink 1.18 has not been released yet. Do we have any known concept,
> > >>>>> rule, or process in the community to handle such a common CVE issue?
> > >>>>>
> > >>>>> Best regards,
> > >>>>> Jing
> > >>>>>
> > >>>>> [1] https://mvnrepository.com/artifact/com.google.guava/guava/30.1.1-jre
> > >>>>>
> > >>>>> On Wed, Sep 13, 2023 at 4:31 PM Martijn Visser <martijnvis...@apache.org>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> Hi Jing,
> > >>>>>>
> > >>>>>> Like Sergey said, making that change would break all connectors
> > >>>>>> that used Flink Shaded in the released versions, given that the
> > >>>>>> Guava version is part of the package name. I think that was the case
> > >>>>>> for Kafka, Pulsar, JDBC, PubSub, HBase, Cassandra, and maybe even
> > >>>>>> more. We shouldn't do this upgrade.
> > >>>>>>
> > >>>>>> Best regards,
> > >>>>>>
> > >>>>>> Martijn
> > >>>>>>
> > >>>>>> On Wed, Sep 13, 2023 at 2:54 PM Sergey Nuyanzin <snuyan...@gmail.com>
> > >>>>>> wrote:
> > >>>>>>
> > >>>>>>> Hi Jing,
> > >>>>>>>
> > >>>>>>> If I remember correctly, bumping Guava to another major version
> > >>>>>>> usually leads to a package rename,
> > >>>>>>> since the major version number is part of the package name...
> > >>>>>>> Not 100% sure, but this could be the reason for some potential
> > >>>>>>> breaking changes (we also faced that with connectors during the
> > >>>>>>> last flink-shaded update).
> > >>>>>>>
> > >>>>>>> On Wed, Sep 13, 2023 at 2:43 PM Jing Ge <j...@ververica.com.invalid>
> > >>>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hi Martijn,
> > >>>>>>>>
> > >>>>>>>> For example, bump Guava from 30.1.1-jre to 32.1.2-jre.
> > >>>>>>>>
> > >>>>>>>> Best regards,
> > >>>>>>>> Jing
> > >>>>>>>>
> > >>>>>>>> On Wed, Sep 13, 2023 at 2:37 PM Martijn Visser <martijnvis...@apache.org>
> > >>>>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi Jing,
> > >>>>>>>>>
> > >>>>>>>>> A PR to update the readme sounds like a good plan.
> > >>>>>>>>>
> > >>>>>>>>> I think it depends on what the expected updates for a
> > >>>>>>>>> flink-shaded 16.2 would be, since that version doesn't exist.
> > >>>>>>>>>
> > >>>>>>>>> Best regards,
> > >>>>>>>>>
> > >>>>>>>>> Martijn
> > >>>>>>>>>
> > >>>>>>>>> On Wed, Sep 13, 2023 at 1:48 PM Jing Ge <j...@ververica.com.invalid>
> > >>>>>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi Martijn,
> > >>>>>>>>>>
> > >>>>>>>>>> Thanks for your detailed reply. Appreciate it.
> > >>>>>>>>>>
> > >>>>>>>>>> > Flink-Shaded is usually only updated whenever a
> > >>>>>>>>>> > new Flink minor version is released and only at the beginning
> > >>>>>>>>>> > of the release cycle, so that there's enough time to stabilize
> > >>>>>>>>>> > Flink.
> > >>>>>>>>>>
> > >>>>>>>>>> This is the information I was looking for. It would help devs
> > >>>>>>>>>> understand the compatibility between different versions of Flink
> > >>>>>>>>>> and flink-shaded if it were described in the readme. If you
> > >>>>>>>>>> don't mind, I can create a PR and update it.
> > >>>>>>>>>>
> > >>>>>>>>>> Speaking of this rule, I have a follow-up question: do you have
> > >>>>>>>>>> any concerns if flink-shaded 16.2 were released and upgraded
> > >>>>>>>>>> (from 16.1 to 16.2) in Flink 1.17? Is there anything we should
> > >>>>>>>>>> pay attention to while releasing a new minor flink-shaded
> > >>>>>>>>>> version?
> > >>>>>>>>>>
> > >>>>>>>>>> Best regards,
> > >>>>>>>>>> Jing
> > >>>>>>>>>>
> > >>>>>>>>>> On Wed, Sep 13, 2023 at 9:01 AM Martijn Visser <martijnvis...@apache.org>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Hi Jing,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Flink Shaded exists so that Flink's internal usage of commonly
> > >>>>>>>>>>> used packages such as Guava and Jackson doesn't clash with
> > >>>>>>>>>>> different versions that users might use when creating a Flink
> > >>>>>>>>>>> application. When I did the upgrade of Flink Shaded, we already
> > >>>>>>>>>>> ran into a bunch of problems because a lot of the externalized
> > >>>>>>>>>>> connectors relied on Flink Shaded, which made it problematic to
> > >>>>>>>>>>> get a connector to work on both Flink 1.17 and Flink 1.18.
> > >>>>>>>>>>> There's been quite a lot of effort put into making sure that
> > >>>>>>>>>>> externalized connectors don't rely on Flink Shaded at all
> > >>>>>>>>>>> anymore, either by using their own versions of shaded artifacts
> > >>>>>>>>>>> (which was the case with the Pulsar connector) or by removing
> > >>>>>>>>>>> the dependency on Flink Shaded altogether and using regular
> > >>>>>>>>>>> Java.
> > >>>>>>>>>>>
> > >>>>>>>>>>> If you upgraded flink-shaded from 16.1 to 17.0 in Flink 1.17,
> > >>>>>>>>>>> you would break all externalized connectors that rely on Flink
> > >>>>>>>>>>> Shaded's Guava version, plus you would potentially impact the
> > >>>>>>>>>>> runtime given that there's a newer Netty version etc.
> > >>>>>>>>>>> Flink-Shaded is usually only updated whenever a new Flink
> > >>>>>>>>>>> minor version is released and only at the beginning of the
> > >>>>>>>>>>> release cycle, so that there's enough time to stabilize Flink.
> > >>>>>>>>>>>
> > >>>>>>>>>>> All in all, we shouldn't upgrade flink-shaded in Flink 1.17.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best regards,
> > >>>>>>>>>>>
> > >>>>>>>>>>> Martijn
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Tue, Sep 12, 2023 at 7:26 PM Jing Ge <j...@ververica.com.invalid>
> > >>>>>>>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>>> Hi Dev,
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Currently, Flink 1.17 is using flink-shaded 16.1 and Flink
> > >>>>>>>>>>>> 1.18 is using flink-shaded 17.0. Do we need to consider any
> > >>>>>>>>>>>> compatibility rules between them? E.g., is there any concern
> > >>>>>>>>>>>> with upgrading flink-shaded from 16.1 to 17.x for Flink 1.17?
> > >>>>>>>>>>>> Or are there some implicit dependency rules between them?
> > >>>>>>>>>>>> Looking forward to hearing from you.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Best regards,
> > >>>>>>>>>>>> Jing
> > >>>>>>>
> > >>>>>>> --
> > >>>>>>> Best regards,
> > >>>>>>> Sergey
> > >>>>
> > >>>> --
> > >>>> Best regards,
> > >>>> Sergey
> > >>
> > >> --
> > >> Best regards,
> > >> Sergey
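For concreteness, the patch-level Netty bump proposed at the top of the thread would amount to a one-property change in flink-shaded's build. The snippet below is only a sketch: the property and artifact names are illustrative and have not been checked against the actual flink-shaded 16.1 poms.

```xml
<!-- Illustrative sketch of the proposed flink-shaded change:
     bump only the Netty patch version that 16.1 pins. -->
<properties>
  <!-- was 4.1.82.Final, which reportedly fails to build on ARM -->
  <netty.version>4.1.83.Final</netty.version>
</properties>
<dependencies>
  <dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-all</artifactId>
    <version>${netty.version}</version>
  </dependency>
</dependencies>
```

Staying within the 4.1.x line keeps the shaded package contents binary-compatible for the connectors discussed above, which is the whole point of avoiding the FLINK-32032 backport.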
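Sergey's CVE impact analysis above boils down to scanning the source tree for references to the vulnerable class. A minimal sketch of that kind of check follows; the temporary directory and dummy file are stand-ins, and for a real check `SRC` would point at a Flink checkout.

```shell
# Stand-in source tree; for a real check, set SRC to a Flink checkout.
SRC=$(mktemp -d)
cat > "$SRC/Example.java" <<'EOF'
import com.google.common.collect.ImmutableList;
class Example {}
EOF

# Scan for direct references to the class behind CVE-2023-2976.
if grep -rln "FileBackedOutputStream" "$SRC"; then
  echo "usages found: review against CVE-2023-2976"
else
  echo "no direct usages found"
fi
```

This only finds direct references; as Sergey notes, transitive use through other Guava classes (e.g. `Files#createTempDir`) has to be checked separately.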