Hi!

In light of the recent discussions with the Trino team, we should think about improving our CVE stance.
We are doing a good job of keeping our direct dependencies up to date, but we don't have a good solution for the indirect dependencies that are bundled into our official shaded binary artifacts and show up in CVE scans. These come almost entirely from Hadoop, as HBase also keeps its direct dependencies in order. The main problem is that, in order to maximize compatibility, we build our binaries with the oldest supported Hadoop major.minor version, which is often long past EOL. I don't have experience with mixing Hadoop server and client versions, but I recall that Hadoop major versions are supposed to be backwards wire compatible. The only binaries that include the Hadoop (and HBase) code are the phoenix-client-embedded and phoenix-client-lite uberjars, which are unfortunately also our most widely used artifacts.

My first proposal is to build with the latest supported Hadoop version instead of the oldest one for each HBase profile. This should remove many of the existing CVEs from the binaries. Do you think this would cause backwards compatibility issues? If not, do you think we should do this for 5.2.1?

My second discussion point is adding dependencyManagement entries to override the transitive dependency versions responsible for the CVEs present in whatever Hadoop version we build with. This is a slippery slope, and I would personally like to minimize and compartmentalize it as much as possible. What is your opinion on this? Is it worth doing? Should we do it in the main code, or only for the uberjars (which would be a big testing problem)? Should we put those dependencyManagement entries in a profile, so that they can be turned off when building with newer Hadoop versions? I have appended rough sketches of both ideas below my signature.

Looking forward to hearing your thoughts on this,

Istvan
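
P.S. To make the first proposal concrete, here is roughly what I have in mind. The profile id, the property name, and the version numbers are illustrative only; I have not checked them against our actual POMs:

    <!-- Sketch only: each HBase profile would pin the newest Hadoop line
         we support instead of the oldest one. -->
    <profile>
      <id>hbase-2.5</id>
      <properties>
        <!-- previously the oldest supported (and long EOL) 3.x release;
             bumped to the newest 3.x we support -->
        <hadoop.version>3.3.6</hadoop.version>
      </properties>
    </profile>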
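
And a sketch of the second idea, with the pins isolated in a profile that is active by default but can be switched off. The profile id, the skip property, and the jackson-databind pin are made-up examples, not a vetted CVE list:

    <!-- Sketch only: the negated-value activation keeps the profile active
         unless -Dskip.cve.pins=true is passed, so the pins can be turned
         off when building against a newer Hadoop. -->
    <profile>
      <id>cve-dependency-pins</id>
      <activation>
        <property>
          <name>skip.cve.pins</name>
          <value>!true</value>
        </property>
      </activation>
      <dependencyManagement>
        <dependencies>
          <!-- example entry: force a patched jackson-databind that Hadoop
               pulls in transitively -->
          <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>2.15.2</version>
          </dependency>
        </dependencies>
      </dependencyManagement>
    </profile>

Something like "mvn clean verify -Dskip.cve.pins=true" would then build against whatever versions the newer Hadoop brings in.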