Hi!

In light of the recent discussions with the Trino team, we should think
about improving our CVE stance.

We are doing a good job of keeping our direct dependencies up to date,
but we don't have a good solution for our transitive (indirect)
dependencies, which are included in our official shaded binary
artifacts and show up in CVE scans.

These basically all come from Hadoop, as HBase also does a good job of
keeping its direct dependencies in order.

The main problem is that in order to maximize compatibility, we build
our binaries with the oldest supported Hadoop major.minor version,
which is often long past its EOL.

I don't have experience with mixing Hadoop server and client versions,
but I recall that Hadoop is supposed to preserve backward wire
compatibility within a major version.

The only binaries that include the Hadoop (and HBase) code are the
phoenix-client-embedded and phoenix-client-lite uberjars, which are
unfortunately the most widely used artifacts.

My first proposal is to use the latest supported Hadoop version instead of
the oldest one for each HBase profile. This should remove many of the
existing CVEs from the binaries.
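
To illustrate, in each HBase profile in the root pom we would bump the
Hadoop version property to the newest supported line instead of the
oldest one. The profile id, property name and version numbers below
are just placeholders, not our actual pom contents:

    <profile>
      <id>hbase-2.5</id>
      <properties>
        <!-- previously the oldest supported 3.x line; bump to the
             newest supported one -->
        <hadoop.version>3.4.1</hadoop.version>
      </properties>
    </profile>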

Do you think this would cause backwards compatibility issues?

If not, do you think that we should do this for 5.2.1?

My second discussion point is dependencyManage-ing transitive
dependency versions for the CVEs that are present in whatever Hadoop
version we build with. This is a slippery slope, and I personally
would like to minimize and compartmentalize it as much as possible.
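
To make it concrete, a single override would look something like the
snippet below. The artifact and version are a placeholder example
only, not a specific CVE fix:

    <dependencyManagement>
      <dependencies>
        <!-- force a patched version of a transitive dependency that
             Hadoop pulls in -->
        <dependency>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-databind</artifactId>
          <version>2.17.2</version>
        </dependency>
      </dependencies>
    </dependencyManagement>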

What is your opinion on this?
Is this worth doing?
Should we do it in the main code, or only for the uberjars (which would
be a big testing problem)?
Should we put those dependencyManagement entries in a profile, so that
they can be turned off when building with newer Hadoop versions?
(A rough sketch of that follows below.)
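
Something like the following is what I have in mind for the profile
variant (the profile id and the skip property are made up): the
overrides are active by default, but can be switched off from the
command line when building with a newer Hadoop that already ships
fixed versions:

    <profile>
      <id>cve-overrides</id>
      <activation>
        <!-- active unless -Dskip.cve.overrides is passed to Maven -->
        <property>
          <name>!skip.cve.overrides</name>
        </property>
      </activation>
      <dependencyManagement>
        <dependencies>
          <!-- same kind of pins as in the previous snippet -->
          <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.17.2</version>
          </dependency>
        </dependencies>
      </dependencyManagement>
    </profile>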

Looking forward to hearing your thoughts on this.

Istvan
