I wanted to discuss what I am looking to do with Alpine. As you may know, I
am a bit long-winded, but there is a narrative here about what I am trying
to do with Hadoop and packaging.

When I got into Linux, Red Hat (at version 9?) split into Fedora and Red Hat
Enterprise Linux. What emerged from this situation: Fedora was a test bed,
so a given Fedora release would be the bleeding edge that fed into a later
RHEL. Also around that time CentOS became 'the' fork, and many businesses
were completely OK running CentOS. There were a number of people doing
extended packaging, like the DAG RPMs. There were only a few 'stubborn'
pieces of software that tried hard to "need" RHEL. Effectively you could
find an RPM from any of those places (RHEL, DAG, CentOS) and they would
'generally' work together.

SUSE and Gentoo were there, and Ubuntu came up, but none of them are my
bread and butter, so I cannot speak to them.

We all know that over the past few years "containers" have arrived. This
isn't a unique opinion of mine: there is a lot of "misguided" packaging into
containers, like rebuilding a 9 GB image to change a 3-line config. We
used to achieve "immutable deployments" by installing all the user software
into /opt/hadoop and ONLY altering that through Ansible.

I was in a regulated environment, and in those environments they are very
serious about OSS vulnerability scanning. Not just a scan once a quarter, or
once a year. Not just a scan and "asking nicely" to try to clean it up.
Constant scanning, and asking for impact assessments AND timelines for
remediation. (Note: I see the OWASP plugin is in Hadoop trunk, but it is
itself versions behind. I will send a PR.) The thing about these
environments:

1) Less is more!
2) It is easier to constantly upgrade than to explain.

Example: there is a CVE against the ZooKeeper in Hadoop. If you read about
the CVE, it only applies to the "optional" Admin Server:
CVE-2025-58457: Insufficient Permission Check
This vulnerability allows an *authorized* client with low-level privileges
to execute sensitive snapshot and restore commands on the Admin Server
without the required root (ALL) permissions. The primary risk is the
disclosure of the cluster state via unauthorized snapshots.

It takes more time to constantly read the sometimes cryptic CVE reports
and EXPLAIN to people that you are not affected than to keep patching!

Remember lesson #1 (less is more). The problem with having 4GB of stuff in
/usr/bin is that SOMETHING always has a vulnerability, and it is usually
something you don't use!

Bigger isn't better with Hadoop either. My 80GB SSD has, say, 25GB free, but
having a 5GB OverlayFS for the Hadoop build starts choking down how many
things I can test at once. Keeping the minimr cluster assembly "lean" has a
compounding effect, etc.


Onto Alpine. Sooo, in the enterprise they love Red Hat. It's a vendor.
You can pay them! CYA! However, RHEL isn't a container distro. It is a great
distro, but so expansive that the minimal install is maybe like 5GB. I am
sure that you can pare it down, but that is not its bread and butter.

With the vendor 'situation' involving CentOS (is it Rocky? is it Alma?), I
am personally "divesting" from 6GB "distros", mostly because of reason #1:
too much stuff that isn't useful and only creates vulnerabilities.

What I am doing is trying to push down into containers.
https://github.com/edwardcapriolo/edgy-ansible/tree/main/imaging/hadoop/compositions/ha_rm_zk_pki_tls

My goal is not to service "every" possible Hadoop the way a Cloudera
Manager can, only to provide these archetype setups. HA YARN is an
archetype; HA NN with 3 journal nodes is an archetype.

The downside of Alpine is musl: it forces you to break lots of
assumptions. However, it also finds other things: our "find -l", run when
the container starts to list the directory, is not portable to Alpine. It
is also forcing a revision of some C code which turns out not to be
portable anyway.

So I would ask the group: if I don't fall off a cliff and get the Alpine
support to a decent point, can it be added as one of the "official"
supported build envs? We could make this decision in 4 months or so.


Thanks,
Edward
