On 29.3.2024 23:31:26, Daniil Gentili wrote:
In light of the recent supply chain attack on xz/lzma, which led to a
backdoor in OpenSSH
(https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe
that it would be a good idea to remove the huge attack surface created
by the pre-generated autoconf build scripts and lexers shipped in the
release tarballs.
In particular, the xz supply chain attack injected the exploit with a
few obfuscated lines, manually added to the end of the pre-generated
configure script, which was only bundled in the tarballs.
Even if the exploits themselves were committed to the repo in the form
of test files, the code that actually injected the exploit in the
library was not committed to the repo, and was only present in the
pre-generated configure script in the tarball: this injection mode
makes sense, as extra files in the tarball not present in the git repo
would raise suspicions, but machine-generated configure scripts
containing hundreds of thousands of lines of code not present in the
upstream VCS are the norm, and are usually not checked before execution.
Specifically in the case of PHP, apart from the configure script, the
tarball also bundles generated lexer files which contain actual C
code, which is an additional attack vector. For example, here's the diff
between the tarball of the 8.3.4 release and the PHP-8.3.4 tag of the
git repo:
```
~ $ diff -r php-8.3.4 php-src -q
Only in php-src: .git
Files php-8.3.4/NEWS and php-src/NEWS differ
Files php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ
Only in php-8.3.4/Zend: zend_ini_parser.c
Only in php-8.3.4/Zend: zend_ini_parser.h
Only in php-8.3.4/Zend: zend_ini_parser.output
Only in php-8.3.4/Zend: zend_ini_scanner.c
Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
Only in php-8.3.4/Zend: zend_language_parser.c
Only in php-8.3.4/Zend: zend_language_parser.h
Only in php-8.3.4/Zend: zend_language_parser.output
Only in php-8.3.4/Zend: zend_language_scanner.c
Only in php-8.3.4/Zend: zend_language_scanner_defs.h
Only in php-8.3.4: configure
Files php-8.3.4/configure.ac and php-src/configure.ac differ
Only in php-8.3.4/ext/json: json_parser.tab.c
Only in php-8.3.4/ext/json: json_parser.tab.h
Only in php-8.3.4/ext/json: json_scanner.c
Only in php-8.3.4/ext/json: php_json_scanner_defs.h
Only in php-8.3.4/ext/pdo: pdo_sql_parser.c
Only in php-8.3.4/ext/phar: phar_path_check.c
Only in php-8.3.4/ext/standard: url_scanner_ex.c
Only in php-8.3.4/ext/standard: var_unserializer.c
Only in php-8.3.4/main: php_config.h.in
Files php-8.3.4/main/php_version.h and php-src/main/php_version.h differ
Only in php-8.3.4/pear: install-pear-nozlib.phar
Only in php-8.3.4/sapi/phpdbg: phpdbg_lexer.c
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.c
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.h
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
```
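A comparison like the one above could also be scripted, so that it can
be repeated for every release. The following is only a rough sketch in
Python, not part of any existing PHP tooling: the function name
compare_trees and its parameters are made up for illustration, and it
assumes the tarball has already been extracted next to a checkout of
the matching tag.

```python
import filecmp
import os


def compare_trees(tarball_dir, repo_dir, ignore=(".git",)):
    """Compare an extracted tarball against a git checkout.

    Returns three sorted lists of relative paths:
    (only in tarball, only in repo, present in both but different).
    """
    only_tarball, only_repo, differing = [], [], []

    def walk(cmp, prefix):
        only_tarball.extend(os.path.join(prefix, n) for n in cmp.left_only)
        only_repo.extend(os.path.join(prefix, n) for n in cmp.right_only)
        # dircmp's own diff_files relies on a shallow (stat-based) check,
        # so re-compare the common files byte by byte.
        _, mismatch, errors = filecmp.cmpfiles(
            cmp.left, cmp.right, cmp.common_files, shallow=False
        )
        # Files that could not be compared are reported as differing too.
        differing.extend(os.path.join(prefix, n) for n in mismatch + errors)
        for name, sub in cmp.subdirs.items():
            walk(sub, os.path.join(prefix, name))

    walk(filecmp.dircmp(tarball_dir, repo_dir, ignore=list(ignore)), "")
    return sorted(only_tarball), sorted(only_repo), sorted(differing)
```

With the proposal below applied, all three lists would be expected to
be empty for a release tarball and its tag.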
To prevent attacks from malevolent/compromised RMs, I propose
completely removing all autogenerated files from the release tarballs,
and ensuring that the tarball contents exactly match the contents of the
associated git tag (this also means removing the -dev suffix from the
version number in main/php_version.h, Zend/zend.h, configure.ac and
NEWS in the git tag).
Of course this means that users will have to generate the build
scripts when compiling PHP, as when installing PHP from the VCS repo.
I'm sending a copy of this email to secur...@php.net as well.
Hey Daniil,
You can also have a public CI (e.g. a GitHub action) generate the
artifacts, along with hash computation.
It should be a GitHub action which runs on tags. This makes it fully
verifiable: the code of the action that generates the artifacts,
including the hash computation, is public. Anyone who wants can
trivially trace this back.
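As a rough illustration of the hash step, here is a minimal sketch in
Python. It assumes SHA-256 as the digest, and the function names
sha256_of_file and verify_artifact are hypothetical, not part of any
existing release tooling; the expected hash would be whatever the
public CI run printed for that tag.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read until fh.read() returns b"" (end of file).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, expected_hex):
    """Check a downloaded artifact against a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Anyone can then rebuild the artifact from the tag, hash it the same
way, and compare the result with the published value.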
There's nothing in the tarballs which cannot be trivially automated and
made verifiable.
I don't think providing pre-generated files is fundamentally flawed;
what is primarily lacking is verifiability, which is also what enabled
the xz backdoor.
Bob