On 31/03/2024 14:53, Christian Schneider wrote:
> But my main question is: I fail to see the difference whether I plant my malicious code in configure, configure.ac or *.c: Someone has to review the changes and notice the problem. And we have to trust the RMs. What am I missing?
As I understand it, the attack being discussed involved *code that was never committed to version control*. The bulk of the payload was committed as fake binary test artifacts, which are unlikely to be inspected but are harmless by themselves; the trigger that incorporated it into the binary was added *manually* between the automated build and the production of the signed release archive. So the theory is that if no human is involved in that step, no human can introduce a malicious change at that step: an exploit would have to be introduced somewhere in version-controlled, human-readable code, giving extra chances for it to be detected.

On 30/03/2024 18:24, Jakub Zelenka wrote:
> Do you think it would be different if the change happened in the distributed source file instead? I mean you could still modify tarball of the distributed file (e.g. hide somewhere in configure.ac or in our case more easily in less visible files like various Makefile.frag and similar). The only thing that you get by using just VCS files is that people could hash the distributed content of the files and compare it with the hash of the VCS files but does anyone do this sort of verification?
We already use a version control system built entirely on comparing hashes of source files. So given a signed tarball that claims to match the content of a signed tag, any user can trivially check out the tag, unpack the tarball on top of it, and run "git diff" to detect any anomalies. The question of who would do that in practice is a valid one, and something that I'm sure has been discussed elsewhere regarding reproducible binary builds.
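For illustration, here is a rough sketch of that check as a small Python script (not part of any existing process; the repository URL, tag, and tarball name are placeholders). It unpacks the tarball over a fresh checkout of the tag, then asks git for both modified and untracked files, since a file that exists only in the tarball would not appear in a plain diff:

#!/usr/bin/env python3
# Rough sketch (illustrative only): compare a release tarball against a
# signed git tag.  The repo URL, tag and tarball names are placeholders.
import subprocess
import sys
import tarfile
import tempfile

REPO = "https://github.com/php/php-src.git"  # hypothetical repository
TAG = "php-8.3.4"                            # hypothetical signed tag
TARBALL = "php-8.3.4.tar.gz"                 # hypothetical release tarball

def run(*cmd, cwd=None):
    return subprocess.run(cmd, cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

with tempfile.TemporaryDirectory() as tmp:
    # 1. Check out the tagged source tree.
    run("git", "clone", "--depth", "1", "--branch", TAG, REPO, tmp)

    # 2. Unpack the tarball on top of it, stripping the tarball's
    #    top-level directory so paths line up with the checkout.
    with tarfile.open(TARBALL) as tar:
        for member in tar.getmembers():
            if "/" not in member.name:
                continue
            member.name = member.name.split("/", 1)[1]
            tar.extract(member, tmp)

    # 3. Ask git what changed.  "git diff" catches edits to tracked
    #    files; "git status" also lists files that exist only in the
    #    tarball (e.g. generated files like configure), which are
    #    exactly the ones a reviewer cannot compare against the tag.
    diff = run("git", "diff", cwd=tmp)
    status = run("git", "status", "--porcelain", cwd=tmp)

    if diff or status:
        print("Tarball does not match the tag:\n" + status)
        sys.exit(1)
    print("Tarball matches the tag.")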
On 30/03/2024 15:35, Daniil Gentili wrote:
> Btw, I do not believe that "it would require end users to install autotools and bison in order to compile PHP from tarballs" is a valid reason to delay the patching of a serious attack vector ASAP.
As is always the case, there is a trade-off between security and convenience - in this case, between distributing something that's usable without large amounts of extra tooling (including, for some generated files, a copy of PHP itself) and distributing something that is 100% reviewable by humans.
Ultimately, 99.999% of users are not going to compile their own copy of PHP from source; they are going to trust some chain of providers to take the source, perform all the necessary build steps, and produce a binary. Removing generated files from the tarballs doesn't eliminate that need for trust; it just shifts more of it to organisations like Debian and Red Hat. And maybe that's a valid aim, because those organisations have more resources than us to build appropriate processes.
Making things reproducible attacks the same problem from a different angle: rather than placing more trust in one part of the chain, it allows multiple parallel chains, which should all give the same result. If builds from different sources start showing unexplained differences, they can be flagged automatically.
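To make that last point concrete, here is a minimal sketch (builder names and artifact paths are invented, not from this thread) of the kind of automated check reproducible builds enable: hash the artifact each independent chain produced and flag any disagreement for a human to explain.

#!/usr/bin/env python3
# Minimal sketch (illustrative only): flag differences between the same
# release built by independent chains.  Names and paths are hypothetical.
import hashlib
import sys
from pathlib import Path

ARTIFACTS = {
    "php.net": Path("builds/php.net/php-8.3.4-linux-x86_64.tar.gz"),
    "debian":  Path("builds/debian/php-8.3.4-linux-x86_64.tar.gz"),
    "redhat":  Path("builds/redhat/php-8.3.4-linux-x86_64.tar.gz"),
}

def sha256(path):
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

digests = {name: sha256(path) for name, path in ARTIFACTS.items()}
for name, digest in digests.items():
    print(f"{digest}  {name}")

# A fully reproducible build should be bit-identical across chains, so
# any disagreement is an unexplained difference worth investigating.
if len(set(digests.values())) > 1:
    print("MISMATCH: parallel build chains disagree", file=sys.stderr)
    sys.exit(1)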
Regards,

--
Rowan Tommins [IMSoP]