Sounds good, I agree with this solution. We can keep them in a profile and disable the profile once Hadoop resolves the CVEs. I feel we should resolve the Avro and Jettison CVEs and roll out 5.2.1, unless a newer released version of Hadoop already has them resolved. WDYT?
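
To make this concrete, here is a minimal sketch of what such a default-enabled profile could look like in the Phoenix pom. The profile id and the pinned Avro/Jettison versions below are illustrative assumptions on my part, not a final proposal:

  <profile>
    <!-- Hypothetical profile that pins transitive dependency versions
         carrying known CVEs in the Hadoop version we build against. -->
    <id>manage-transitive-cves</id>
    <activation>
      <activeByDefault>true</activeByDefault>
    </activation>
    <dependencyManagement>
      <dependencies>
        <!-- Illustrative versions; the real pins would be whatever
             releases resolve the open Avro and Jettison CVEs. -->
        <dependency>
          <groupId>org.apache.avro</groupId>
          <artifactId>avro</artifactId>
          <version>1.11.4</version>
        </dependency>
        <dependency>
          <groupId>org.codehaus.jettison</groupId>
          <artifactId>jettison</artifactId>
          <version>1.5.4</version>
        </dependency>
      </dependencies>
    </dependencyManagement>
  </profile>

Anyone building against a newer Hadoop could then turn it off, e.g. with

  mvn clean install -P '!manage-transitive-cves'

One caveat to keep in mind: activeByDefault profiles are automatically deactivated whenever any other profile in the same POM is activated on the command line, so activation via a negatable property (e.g. <name>!skip.cve.overrides</name>) may be more robust in practice.
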
On Thu, Sep 5, 2024 at 9:46 PM Istvan Toth <st...@cloudera.com.invalid> wrote:

> Thanks for your response, Viraj.
>
> Sure, managing transitive dependencies and using a newer Hadoop are not
> mutually exclusive.
>
> For old HBase versions like 2.1, even the newest officially supported
> Hadoop is going to be old.
>
> What do you think about my proposals on how to handle the transitive
> dependency management?
> I quite like the idea of putting those into a profile (which is enabled
> by default) so that they can be disabled if a newer Hadoop is used for
> building.
>
> Istvan
>
> On Thu, Sep 5, 2024 at 5:37 PM Viraj Jasani <vjas...@apache.org> wrote:
>
> > I agree with upgrading the Hadoop version, but it is helpful only if
> > the CVEs from indirect dependencies have been resolved by Hadoop.
> > Hadoop's release frequency is slow (similar to ours), so some CVE
> > fixes might still be in progress and take considerable time to get
> > released. In such cases, maybe we can also try to manage the
> > dependencies within the Phoenix pom. WDYT?
> >
> > On Wed, Sep 4, 2024 at 1:02 AM Istvan Toth <st...@apache.org> wrote:
> >
> > > Hi!
> > >
> > > In light of the recent discussions with the Trino team, we should
> > > think about improving our CVE stance.
> > >
> > > We are doing a good job of updating our direct dependencies, but we
> > > don't have a good solution for our indirect dependencies, which are
> > > included in our official shaded binary artifacts and show up in CVE
> > > scans.
> > >
> > > These basically all come from Hadoop, as HBase is also good at
> > > keeping its direct dependencies in order.
> > >
> > > The main problem is that, in order to maximize compatibility, we
> > > build our binaries with the oldest supported Hadoop major.minor
> > > version, which is often long EOL.
> > >
> > > I don't have experience with mixing Hadoop server and client
> > > versions, but I recall that Hadoop major versions are supposed to
> > > be backwards wire-compatible.
> > >
> > > The only binaries that include the Hadoop (and HBase) code are the
> > > phoenix-client-embedded and phoenix-client-lite uberjars, which are
> > > unfortunately the most widely used artifacts.
> > >
> > > My first proposal is to use the latest supported Hadoop version
> > > instead of the oldest one for each HBase profile. This should
> > > remove many of the existing CVEs from the binaries.
> > >
> > > Do you think this would cause backwards compatibility issues?
> > >
> > > If not, do you think that we should do this for 5.2.1?
> > >
> > > My second discussion point is dependencyManage-ing transitive
> > > dependency versions for the CVEs that are present in whatever
> > > Hadoop version we build with.
> > > This is a slippery slope, and I personally would like to minimize
> > > and compartmentalize it as much as possible.
> > >
> > > What is your opinion on this?
> > > Is this worth doing?
> > > Should we do it in the main code, or only for the uberjars (which
> > > would be a big testing problem)?
> > > Should we put those dependencyManage entries in a profile, so that
> > > they can be turned off when building with newer Hadoop versions?
> > >
> > > Looking forward to hearing your thoughts on this.
> > >
> > > Istvan
>
> --
> *István Tóth* | Sr. Staff Software Engineer
> *Email*: st...@cloudera.com
> cloudera.com <https://www.cloudera.com>