Re: GNU Coding Standards, automake, and the recent xz-utils backdoor

Eric Gallager Sun, 31 Mar 2024 11:51:44 -0700

On Sun, Mar 31, 2024 at 3:20 AM Jacob Bachmeyer <jcb62...@gmail.com> wrote:
>
> dherr...@tentpost.com wrote:
> > On 2024-03-30 18:25, Bruno Haible wrote:
> >> Eric Gallager wrote:
> >>>
> >>> Hm, so should automake's `distcheck` target be updated to perform
> >>> these checks as well, then?
> >>
> >> The first mentioned check cannot be automated. ...
> >>
> >> The second mentioned check could be done by the maintainer, ...
> >
> >
> > I agree that distcheck is good but not a cure-all.  Any static system
> > can be attacked when there is motive, and unit tests are easily gamed.
>
> The issue seems to be releases containing binary data for unit tests,
> instead of source or scripts to generate that data.  In this case, that
> binary data was used to smuggle in heavily obfuscated object code.
>
> The best analysis in one place that I have found so far is
> <URL:https://gynvael.coldwind.pl/?lang=en&id=782>.  In brief, grep is
> used to locate the main backdoor files by searching for marker strings.
> After running tests/files/bad-3-corrupt_lzma2.xz through tr(1), it
> becomes a /valid/ xz file that decompresses to a shell script that
> extracts a second shell script from part of the compressed data in
> tests/files/good-large_compressed.lzma and pipes it to a shell.  That
> second script has two major functions:  first, it searches the test
> files for four six-byte markers, and it then extracts and decrypts
> (using a simple RC4-alike implemented in Awk) the binary backdoor also
> found in tests/files/good-large_compressed.lzma.  The six-byte markers
> mark the beginning and end of raw LZMA2 streams obfuscated with a simple
> substitution cipher.  Any such streams found would be decompressed and
> read by the shell, but neither of the known crocked releases had any
> files containing those markers.  The binary backdoor is an x86-64 object
> that gets unpacked into liblzma_la-crc64-fast.o, unless m4/gettext.m4
> contains "dnl Convert it to C string syntax." which is a clever flag
> because almost no one checks that the m4 files in release tarballs
> actually match what the GNU project distributes.


Maybe this is something that the GNU project could start making
stronger recommendations about.
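
As a rough illustration of what such a check could look like, here is
a minimal Python sketch that compares the m4 files shipped in an
unpacked release against a reference copy by SHA-256. The function
names and the directory arguments are my own assumptions for the
example, not anything automake or gettext provides, and files with no
reference copy are skipped rather than flagged.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_mismatched_files(release_dir: str, reference_dir: str) -> list[str]:
    """List *.m4 files present in both directories whose contents differ."""
    mismatched = []
    for released in sorted(Path(release_dir).glob("*.m4")):
        reference = Path(reference_dir) / released.name
        # Files with no reference copy are skipped here; they need manual review.
        if reference.is_file() and sha256_of(released) != sha256_of(reference):
            mismatched.append(released.name)
    return mismatched
```

Something like find_mismatched_files("pkg-1.0/m4", "/usr/share/aclocal")
would be the intended use, with both paths being placeholders. A
mismatch is not proof of tampering, since the release may simply have
been made with a different gettext version, but it tells a reviewer
which files to read.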

> The object itself is just the backdoor and presumably provides the
> symbol _get_cpuid as its entrypoint, since the unpacker script patches
> the src/liblzma/check/crc{64,32}_fast.c files in a pipeline to add calls to
> that function and drops the compiled objects in .libs/.  Running make
> will then skip building those objects, since they are already
> up-to-date, and the backdoored objects get linked into the final binary.
>
> Commit 6e636819e8f070330d835fce46289a3ff72a7b89
> (<URL:https://git.tukaani.org/?p=xz.git;a=commitdiff;h=6e636819e8f070330d835fce46289a3ff72a7b89>)
> was an update to the backdoor.  The commit message is suspicious,
> claiming the use of "a constant seed" to generate reproducible test
> files, but /not/ declaring how the files were produced, which of course
> prevents reproducibility.
>
> > With a reproducible build system, multiple maintainers can "make dist"
> > and compare the output to cross-check for erroneous / malicious dist
> > environments.  Multiple signatures should be harder to compromise,
> > assuming each is independent and generally trustworthy.
>
> This can only work if a package /has/ multiple active maintainers.

Well, other people besides the maintainers can also run `make dist`
and `make distcheck`. My idea was to get end-users in the habit of
running `make distcheck` themselves before installing stuff. And if
that's too much to ask of end users, I'd also point out that there are
multiple kinds of maintainer: besides the upstream maintainer, there
are also usually separate distro maintainers. Even if there's only one
upstream maintainer, as was the case here, I still think that it would
be good to get distro maintainers in the habit of including `make
distcheck` as part of their own release process, before they accept
updates from upstream.
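
For the cross-checking part, a minimal sketch of how a distro
maintainer might compare their own `make dist` output against the
upstream tarball, assuming both are ordinary tar archives. It compares
file contents only and ignores timestamps, ownership, and non-regular
members such as symlinks; the function names are hypothetical.

```python
import hashlib
import tarfile


def tarball_digests(path: str) -> dict[str, str]:
    """Map each regular file in a tarball to the SHA-256 of its contents."""
    digests = {}
    with tarfile.open(path) as tar:  # default mode detects compression
        for member in tar:
            if member.isfile():
                data = tar.extractfile(member).read()
                digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def diff_tarballs(path_a: str, path_b: str) -> list[str]:
    """Return sorted names that are missing from one tarball or differ in content."""
    a, b = tarball_digests(path_a), tarball_digests(path_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result means the two tarballs carry the same files with the
same contents. Anything listed needs a human to look at it, and
differing autotools versions between the two build environments would
be one innocent explanation.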

>
> You also have a small misunderstanding here:  "make dist" prepares a
> (source) release tarball, not a binary build, so this is a
> closely-related issue but actually distinct from reproducible builds.
> Also easier to solve, since we only have to make the source tarball
> reproducible.
>
> > Maybe GNU should establish a cross-verification signing standard and
> > "dist verification service" that automates this process?  Point it to
> > a repo and tag, request a signed hash of the dist package...  Then
> > downstream projects could check package signatures from both the
> > maintainer and such third-party verifiers to check that nothing was
> > inserted outside of version control.
>
> Essentially, this would be an automated release building service:  upon
> request, make a Git checkout, run autogen.sh or equivalent, make dist,
> and publish or hash the result.  The problem is that an attacker who
> manages to gain commit access to a repository may be able to launch
> attacks on the release building service, since "make dist" can run
> scripts.  The service could probably mount the working filesystem noexec
> since preparing source releases should not require running (non-system)
> binaries and scripts can be run by directly feeding them into their
> interpreters even if the filesystem is mounted noexec, but this still
> leaves all available interpreters and system tools potentially available.
>

Well, it'd at least make things more difficult for the attacker, even
if it wouldn't stop them completely.
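
A cheaper check that needs no build service at all is to list what the
tarball contains that version control does not. Here is a minimal
sketch, assuming the tarball wraps everything in a single top-level
directory and that the caller supplies the set of tracked paths (for
example from `git ls-files` at the release tag). The function name is
hypothetical.

```python
import tarfile


def files_only_in_tarball(tarball: str, tracked: set[str]) -> list[str]:
    """Return tarball files (top-level directory stripped) absent from `tracked`."""
    extra = []
    with tarfile.open(tarball) as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Release tarballs usually wrap everything in "name-version/".
            _, _, relative = member.name.partition("/")
            if relative and relative not in tracked:
                extra.append(relative)
    return sorted(extra)
```

Generated files such as configure and Makefile.in will legitimately
show up in the result, so this only produces a list for review rather
than a verdict.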

>
> -- Jacob
