In light of the recent supply chain attack on xz/lzma, which led to a backdoor 
in OpenSSH (https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe 
it would be a good idea to remove the huge attack surface presented by the 
pre-generated autoconf build scripts and lexers shipped in the release 
tarballs.

In particular, the xz supply chain attack injected the exploit through a few 
obfuscated lines, manually added to the end of the pre-generated configure 
script, which was bundled only in the tarballs.

Although the exploits themselves were committed to the repo in the form of test 
files, the code that actually injected the exploit into the library was not 
committed to the repo, and was only present in the pre-generated configure 
script in the tarball. This injection mode makes sense: extra files in the 
tarball that are not present in the git repo would raise suspicion, but 
machine-generated configure scripts containing hundreds of thousands of lines 
of code not present in the upstream VCS are the norm, and are usually not 
checked before execution.

Specifically in the case of PHP, apart from the configure script, the tarball 
also bundles generated lexer and parser files containing actual C code, which 
is an additional attack vector. For example, here's the diff between the 
tarball of the 8.3.4 release and the PHP-8.3.4 tag in the git repo:

```
~ $ diff -r php-8.3.4 php-src -q
Only in php-src: .git
Files php-8.3.4/NEWS and php-src/NEWS differ
Files php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ
Only in php-8.3.4/Zend: zend_ini_parser.c
Only in php-8.3.4/Zend: zend_ini_parser.h
Only in php-8.3.4/Zend: zend_ini_parser.output
Only in php-8.3.4/Zend: zend_ini_scanner.c
Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
Only in php-8.3.4/Zend: zend_language_parser.c
Only in php-8.3.4/Zend: zend_language_parser.h
Only in php-8.3.4/Zend: zend_language_parser.output
Only in php-8.3.4/Zend: zend_language_scanner.c
Only in php-8.3.4/Zend: zend_language_scanner_defs.h
Only in php-8.3.4: configure
Files php-8.3.4/configure.ac and php-src/configure.ac differ
Only in php-8.3.4/ext/json: json_parser.tab.c
Only in php-8.3.4/ext/json: json_parser.tab.h
Only in php-8.3.4/ext/json: json_scanner.c
Only in php-8.3.4/ext/json: php_json_scanner_defs.h
Only in php-8.3.4/ext/pdo: pdo_sql_parser.c
Only in php-8.3.4/ext/phar: phar_path_check.c
Only in php-8.3.4/ext/standard: url_scanner_ex.c
Only in php-8.3.4/ext/standard: var_unserializer.c
Only in php-8.3.4/main: php_config.h.in
Files php-8.3.4/main/php_version.h and php-src/main/php_version.h differ
Only in php-8.3.4/pear: install-pear-nozlib.phar
Only in php-8.3.4/sapi/phpdbg: phpdbg_lexer.c
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.c
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.h
Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
```

To prevent attacks from malevolent/compromised RMs, I propose completely 
removing all autogenerated files from the release tarballs, and ensuring that 
their content exactly matches the content of the associated git tag (this also 
means removing the -dev suffix from the version number in main/php_version.h, 
Zend/zend.h, configure.ac and NEWS in the git tag).
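
If tarballs matched the tag exactly, anyone could check a release 
mechanically. A minimal sketch of such a check, assuming the tarball has 
already been unpacked next to a checkout of the tag (verify_release_tree is a 
hypothetical helper name, not an existing PHP tool):

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the unpacked tarball and the git
# checkout contain exactly the same files with the same content.
# The .git directory of the checkout is ignored.
verify_release_tree() {
    tarball_dir=$1
    checkout_dir=$2
    # diff exits 0 when the trees are identical and 1 when they differ;
    # -q prints only the names of differing or missing files.
    diff -r -q -x .git "$tarball_dir" "$checkout_dir"
}
```

With the directories from the listing above it could be called as 
`verify_release_tree php-8.3.4 php-src`; today that call would fail and print 
the differences already shown.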

Of course this means that users will have to generate the build scripts when 
compiling PHP, as when installing PHP from the VCS repo.
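
For illustration, the extra step would look roughly like what is already done 
for git checkouts. This is only a sketch: prepare_checkout is a hypothetical 
wrapper, and it assumes autoconf is installed (re2c and bison are needed later, 
when make regenerates the lexers and parsers).

```shell
#!/bin/sh
# Hypothetical wrapper: regenerate the build scripts inside a php-src tree
# and then run configure. The first argument is the source directory, any
# further arguments are passed to configure unchanged.
prepare_checkout() {
    src_dir=$1
    shift
    (
        cd "$src_dir" || exit 1
        # buildconf currently refuses to run on a non -dev version unless
        # --force is given, which would apply to a tag without -dev.
        ./buildconf --force || exit 1
        ./configure "$@"
    )
}
```

After `prepare_checkout php-src`, the usual `make` in that directory would 
build the generated C files locally instead of taking them from the tarball.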

I'm sending a copy of this email to secur...@php.net as well.
