On Sun, Mar 31, 2024 at 3:20 AM Jacob Bachmeyer <jcb62...@gmail.com> wrote:
>
> dherr...@tentpost.com wrote:
> > On 2024-03-30 18:25, Bruno Haible wrote:
> >> Eric Gallager wrote:
> >>>
> >>> Hm, so should automake's `distcheck` target be updated to perform
> >>> these checks as well, then?
> >>
> >> The first mentioned check can not be automated. ...
> >>
> >> The second mentioned check could be done by the maintainer, ...
> >
> >
> > I agree that distcheck is good but not a cure all. Any static system
> > can be attacked when there is motive, and unit tests are easily gamed.
>
> The issue seems to be releases containing binary data for unit tests,
> instead of source or scripts to generate that data. In this case, that
> binary data was used to smuggle in heavily obfuscated object code.
>
> The best analysis in one place that I have found so far is
> <URL:https://gynvael.coldwind.pl/?lang=en&id=782>. In brief, grep is
> used to locate the main backdoor files by searching for marker strings.
> After running tests/files/bad-3-corrupt_lzma2.xz through tr(1), it
> becomes a /valid/ xz file that decompresses to a shell script that
> extracts a second shell script from part of the compressed data in
> tests/files/good-large_compressed.lzma and pipes it to a shell. That
> second script has two major functions: first, it searches the test
> files for four six-byte markers, and it then extracts and decrypts
> (using a simple RC4-alike implemented in Awk) the binary backdoor also
> found in tests/files/good-large_compressed.lzma. The six-byte markers
> mark beginning and end of raw LZMA2 streams obfuscated with a simple
> substitution cipher. Any such streams found would be decompressed and
> read by the shell, but neither of the known crocked releases had any
> files containing those markers. The binary backdoor is an x86-64 object
> that gets unpacked into liblzma_la-crc64-fast.o, unless m4/gettext.m4
> contains "dnl Convert it to C string syntax." which is a clever flag
> because about no one actually checks that those m4 files in release
> tarballs actually match what the GNU project distributes.

Maybe this is something that the GNU project could start making
stronger recommendations about.

> The object itself is just the backdoor and presumably provides the
> symbol _get_cpuid as its entrypoint, since the unpacker script patches
> the src/liblzma/check/crc{64,32}_fast.c files in a pipeline to add calls to
> that function and drops the compiled objects in .libs/. Running make
> will then skip building those objects, since they are already
> up-to-date, and the backdoored objects get linked into the final binary.
>
> Commit 6e636819e8f070330d835fce46289a3ff72a7b89
> (<URL:https://git.tukaani.org/?p=xz.git;a=commitdiff;h=6e636819e8f070330d835fce46289a3ff72a7b89>)
> was an update to the backdoor. The commit message is suspicious,
> claiming the use of "a constant seed" to generate reproducible test
> files, but /not/ declaring how the files were produced, which of course
> prevents reproducibility.
>
> > With a reproducible build system, multiple maintainers can "make dist"
> > and compare the output to cross-check for erroneous / malicious dist
> > environments. Multiple signatures should be harder to compromise,
> > assuming each is independent and generally trustworthy.
>
> This can only work if a package /has/ multiple active maintainers.

Well, other people besides the maintainers can also run `make dist`
and `make distcheck`. My idea was to get end-users in the habit of
running `make distcheck` themselves before installing stuff. And if
that's too much to ask of end users, I'd also point out that there are
multiple kinds of maintainer: besides the upstream maintainer, there
are also usually separate distro maintainers. Even if there's only 1
upstream maintainer, as was the case here, I still think that it would
be good to get distro maintainers in the habit of including `make
distcheck` as part of their own release process, before they accept
updates from upstream.

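To make the "compare the output" idea a bit more concrete, here is a
rough Python sketch of how two independently produced `make dist`
tarballs could be compared member by member. Comparing per-file content
hashes rather than the archives as a whole means differing timestamps
or owner fields in the tar headers do not count as mismatches. The
function names (`member_digests`, `compare_tarballs`) are made up for
illustration; nothing like this ships with automake as far as I know.

```python
import hashlib
import tarfile


def member_digests(tarball_path):
    """Map each regular file in the tarball to the SHA-256 of its contents."""
    digests = {}
    # "r:*" lets tarfile detect the compression (gzip, bzip2, xz) by itself.
    with tarfile.open(tarball_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # directories, symlinks etc. carry no file data
            data = tar.extractfile(member).read()
            digests[member.name] = hashlib.sha256(data).hexdigest()
    return digests


def compare_tarballs(path_a, path_b):
    """Return (only_in_a, only_in_b, differing) as sorted lists of names."""
    a = member_digests(path_a)
    b = member_digests(path_b)
    only_in_a = sorted(set(a) - set(b))
    only_in_b = sorted(set(b) - set(a))
    differing = sorted(name for name in set(a) & set(b) if a[name] != b[name])
    return only_in_a, only_in_b, differing
```

Three empty lists would mean both tarballs carry the same files with the
same contents; anything else is a list of names for a human to look at.
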
>
> You also have a small misunderstanding here: "make dist" prepares a
> (source) release tarball, not a binary build, so this is a
> closely-related issue but actually distinct from reproducible builds.
> Also easier to solve, since we only have to make the source tarball
> reproducible.
>
> > Maybe GNU should establish a cross-verification signing standard and
> > "dist verification service" that automates this process? Point it to
> > a repo and tag, request a signed hash of the dist package... Then
> > downstream projects could check package signatures from both the
> > maintainer and such third-party verifiers to check that nothing was
> > inserted outside of version control.
>
> Essentially, this would be an automated release building service: upon
> request, make a Git checkout, run autogen.sh or equivalent, make dist,
> and publish or hash the result. The problem is that an attacker who
> manages to gain commit access to a repository may be able to launch
> attacks on the release building service, since "make dist" can run
> scripts. The service could probably mount the working filesystem noexec
> since preparing source releases should not require running (non-system)
> binaries and scripts can be run by directly feeding them into their
> interpreters even if the filesystem is mounted noexec, but this still
> leaves all available interpreters and system tools potentially available.
>

Well, it'd at least make things more difficult for the attacker, even
if it wouldn't stop them completely.

>
> -- Jacob
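
P.S. On checking that "nothing was inserted outside of version
control": a minimal Python sketch of that comparison, assuming the
tarball unpacks into a single top-level "package-version/" directory
and that a checkout of the matching tag is already on disk. The name
`audit_tarball_against_tree` is hypothetical. Generated files such as
configure legitimately exist only in the tarball, so the first list is
material for manual review and not proof of tampering.

```python
import hashlib
import os
import tarfile


def audit_tarball_against_tree(tarball_path, tree_dir):
    """Compare a release tarball with a checked-out source tree.

    Returns (not_in_tree, modified): tarball files that are absent from
    the tree, and tarball files whose contents differ from the tree copy.
    """
    not_in_tree, modified = [], []
    with tarfile.open(tarball_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            # Assumption: drop the leading "package-version/" component.
            rel = member.name.split("/", 1)[-1]
            local = os.path.join(tree_dir, rel)
            data = tar.extractfile(member).read()
            if not os.path.isfile(local):
                not_in_tree.append(rel)
                continue
            with open(local, "rb") as fh:
                tree_digest = hashlib.sha256(fh.read()).digest()
            if tree_digest != hashlib.sha256(data).digest():
                modified.append(rel)
    return sorted(not_in_tree), sorted(modified)
```

The second list is the more interesting one: a file that is tracked in
the repository but differs in the tarball is what a modified m4 file
would look like.
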