For reference, this is what I get when I run the OWASP dependency check on trunk:
[INFO] Writing HTML report to: /home/edward/hadoop/hadoop-common-project/hadoop-common/target/dependency-check-report.html
[WARNING] One or more dependencies were identified with known vulnerabilities in Apache Hadoop Common:

netty-transport-4.1.127.Final.jar (pkg:maven/io.netty/netty-transport@4.1.127.Final, cpe:2.3:a:netty:netty:4.1.127:*:*:*:*:*:*:*) : CVE-2025-67735
protobuf-java-2.5.0.jar (pkg:maven/com.google.protobuf/protobuf-java@2.5.0, cpe:2.3:a:google:protobuf-java:2.5.0:*:*:*:*:*:*:*) : CVE-2024-7254, CVE-2022-3171, CVE-2021-22569

See the dependency-check report for more details.

On Mon, Jan 12, 2026 at 11:30 AM Wei-Chiu Chuang <[email protected]> wrote:

> Hadoop 3.4 doesn't use protobuf 2.5 anymore.
> Our latest docker images are 1.2GB in size. If it leads to a smaller
> docker image footprint I'm all for it. Maybe remove some "optional"
> packages from the docker image also.
>
> On Wed, Dec 17, 2025 at 7:33 AM Edward Capriolo <[email protected]>
> wrote:
>
>> Hello friends,
>>
>> I have packed up hadoop a number of ways over the years.
>>
>> Lately, since everyone loves docker, I find my 80gb hard disk constantly
>> filled by bulky or bloated images.
>>
>> I have to force these bloated images to "hit the gym".
>>
>> https://hub.docker.com/u/ecapriolo
>> I have a spark, Zeppelin, and livy running on alpine and not much more
>> than the jre.
>>
>> I wanted to tackle hadoop core next.
>>
>> https://issues.apache.org/jira/browse/HADOOP-19756
>>
>> Few funny fake blockers.
>> 1) musl and the code in ticket above
>> 2) the old 2.5.0 protobuf
>> So many oss problems no one even bothers packaging that protoc version
>> for 6 years
>> 3) the rhel reliance on the nis libraries
>>
>> Next, I realize the hadoop "lean" package can't accommodate every case.
>> But the lean is like 500mb docs and 500mb jars :)
>> Timeline server and libs form 150mb. Test jars maybe 100 more. The native
>> libs outside libhadoop are 180mb. (If you are on alpine they are negligible
>> anyway)
>>
>> See the rm -rfs here.
>>
>> https://github.com/edwardcapriolo/edgy-ansible/tree/main/imaging/hadoop
>>
>> Anyway my goal is to have nice lean alpine based packages and more
>> advanced helm charts mirroring things I have done in ansible: 2 nn 3
>> journal nodes setup. 2 rms 3 zk etc.
>>
>
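
For anyone who wants to check a size breakdown like the one quoted above on their own build, here is a rough Python sketch. It is not taken from the edgy-ansible repo; the function names and the example path in the usage note are made up for illustration. It only sums file sizes under each top-level entry of whatever directory you point it at, skipping symlinks so nothing is counted twice:

```python
import os
import sys
from pathlib import Path


def size_by_top_level(root):
    """Return {entry_name: total_bytes} for each direct child of root.

    Hypothetical helper for illustration; symlinks are skipped.
    """
    root = Path(root)
    totals = {}
    for child in sorted(root.iterdir()):
        if child.is_symlink():
            continue
        if child.is_file():
            totals[child.name] = child.stat().st_size
            continue
        total = 0
        # os.walk does not descend into symlinked directories by default.
        for dirpath, _dirnames, filenames in os.walk(child):
            for name in filenames:
                path = Path(dirpath) / name
                if not path.is_symlink():
                    total += path.stat().st_size
        totals[child.name] = total
    return totals


def report(root):
    """Print entries under root, largest first, in decimal megabytes."""
    totals = size_by_top_level(root)
    for name, size in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{size / 1_000_000:10.1f} MB  {name}")


if __name__ == "__main__":
    # Defaults to the current directory when no path is given.
    report(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Saved as, say, sizes.py and run as `python sizes.py /opt/hadoop/share` (a hypothetical unpack location), it lists the largest directories first, which is one way to spot docs, test jars, or native libs worth trimming from an image.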
