Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-03-30 Thread Saki Takamachi
Another div case:
```

$num = new Number('0.01'); // scale 18
$num2 = new Number('3');

$div = $num->div($num2);
$div->getNumber(); // '0.00' scale 18

$div = $num->div($num2, 20);
$div->getNumber(); // '0.0033' scale 20
```

Regards.

Saki


Re: [PHP-DEV] [RFC] [Discussion] Support object type in BCMath

2024-03-30 Thread Saki Takamachi
Hi Jordan,

> For addition, it absolutely should expand scale like this, unless the 
> constructor also defines a default rounding type that is used in that 
> situation. All numbers, while arbitrary, will be finite, so addition will 
> always be exact and known based on inputs prior to calculation. 
> 
> Treating scale like this isn't more strict, it's confusing. For instance:
> 
> ```
> $numA = new Number('1.23', 2);
> $numB = new Number('1.23456', 5);
> 
> $expandedScale1 = $numA + $numB; // 2.46456
> $expandedScale2 = $numB + $numA; // 2.46456
> 
> $strictScale1 = $numA + $numB; // 2.46 assuming truncation
> $strictScale2 = $numB + $numA; // 2.46456
> ```
> 
> I ran into this same issue with operand ordering when I was writing my 
> operator overload RFC.
> 
> There are ways you could do the overload implementation that would get around 
> this for object + object operations, but it's also mathematically unsound and 
> probably unexpected for anyone who is going to the trouble of using an 
> arbitrary precision library.
> 
> Addition and subtraction should automatically use the largest scale from all 
> operands. Division and multiplication should require a specified scale.
> 
> Because of this, I'm not entirely sure that specifying a scale in the 
> constructor is actually a good thing. It is incredibly easy to create 
> situations, unless the implementation in C is VERY careful, where the operand 
> positions matter beyond the simple calculation. Multiplication is 
> commutative, but division is not. This would almost certainly lead to some 
> very difficult to track down bugs.
> 
> Putting scale in the constructor is similar to some of the examples of 
> "possible misuse cases of operator overloading" that I had to go over when I 
> was making my RFC. We definitely want to avoid that if possible for the first 
> number/math object that has operator overloads.

Your opinion may be reasonable given the original BCMath calculation order. 
That is, do you intend code like this?

Signature:
```
// public function __construct(string|int $number)
// public function getNumber(?int $scale = null): string
```

Add:
```
// public function add(Number|string|int $number): string

$num = new Number('1.23456');
$num2 = new Number('1.23');

$add = $num + $num2;
$add->getNumber(); // '2.46456'
$add->getNumber(1); // ‘2.4'

$add = $num->add($num2);
$add->getNumber(); // '2.46456'
$add->getNumber(1); // '2.4'
```

Div:
```
// public function div(Number|string|int $number, int $scaleExpansionLimit = 
10): string


// case 1
$num = new Number('0.0001');
$num2 = new Number('3');

$div = $num / $num2; // scale expansion limit is always 10
$div->getNumber(); // '0.3'

$div = $num->div($num2, 20);
$div->getNumber(); // '0.333'
$div->getNumber(7); // ‘0.333'


// case 2
$num = new Number('1.11');
$num2 = new Number('3');

$div = $num->div($num2, 3);
$div->getNumber(); // '0.370'
$div->getNumber(7); // ‘0.370'
```

Since the scale can be inferred for everything other than div, a special 
argument is given only for div.

Regards.

Saki

Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Jakub Zelenka
On Sat, Mar 30, 2024 at 5:46 PM Ben Ramsey  wrote:

> On Mar 30, 2024, at 07:03, Jakub Zelenka  wrote:
>
> 
> Hi,
>
> On Sat, Mar 30, 2024 at 7:08 AM Marco Pivetta  wrote:
>
>>
>>
>> On Sat, 30 Mar 2024, 05:19 Ben Ramsey,  wrote:
>>
>>> On Mar 29, 2024, at 20:20, Bob Weinand  wrote:
>>>
>>> 
>>> On 29.3.2024 23:31:26, Daniil Gentili wrote:
>>>
>>> In light of the recent supply chain attack in xz/lzma, leading to a
>>> backdoor in openSSH (
>>> https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe
>>> that it would be a good idea to remove the huge attack surface offered by
>>> the pre-generated autoconf build scripts and lexers, offered in the release
>>> tarballs.
>>>
>>> In particular, the xz supply chain attack injected the exploit with a
>>> few obfuscated lines, manually added to the end of the pre-generated
>>> configure script, that was only bundled in the tarballs.
>>>
>>> Even if the exploits themselves were committed to the repo in the form
>>> of test files, the code that actually injected the exploit in the library
>>> was not committed to the repo, and was only present in the pre-generated
>>> configure script in the tarball: this injection mode makes sense, as extra
>>> files in the tarball not present in the git repo would raise suspicions,
>>> but machine-generated configure scripts containing hundreds of thousands of
>>> lines of code not present in the upstream VCS are the norm, and are usually
>>> not checked before execution.
>>>
>>> Specifically in the case of PHP, along from the configure script, the
>>> tarball also bundles generated lexer files which contain actual C code,
>>> which is an additional attack vector, i.e. here's the diff between the
>>> tarball of the 8.3.4 release, and the PHP-8.3.4 tag on the git repo:
>>>
>>> ```
>>> ~ $ diff -r php-8.3.4 php-src -q
>>> Only in php-src:
>>> .git  Files
>>> php-8.3.4/NEWS and php-src/NEWS differ   Files
>>> php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ Only
>>> in php-8.3.4/Zend: zend_ini_parser.c
>>> Only in php-8.3.4/Zend: zend_ini_parser.h
>>> Only in php-8.3.4/Zend:
>>> zend_ini_parser.output Only in php-8.3.4/Zend:
>>> zend_ini_scanner.c
>>> Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
>>> Only in php-8.3.4/Zend:
>>> zend_language_parser.c Only in php-8.3.4/Zend:
>>> zend_language_parser.h Only in php-8.3.4/Zend:
>>> zend_language_parser.output
>>> Only in php-8.3.4/Zend: zend_language_scanner.c
>>> Only in php-8.3.4/Zend:
>>> zend_language_scanner_defs.h   Only in php-8.3.4:
>>> configure   Files php-8.3.4/
>>> configure.ac and php-src/configure.ac differ   Only in
>>> php-8.3.4/ext/json: json_parser.tab.c  Only in
>>> php-8.3.4/ext/json: json_parser.tab.h
>>> Only in php-8.3.4/ext/json: json_scanner.c
>>> Only in php-8.3.4/ext/json:
>>> php_json_scanner_defs.hOnly in php-8.3.4/ext/pdo:
>>> pdo_sql_parser.c
>>> Only in php-8.3.4/ext/phar:
>>> phar_path_check.c  Only in
>>> php-8.3.4/ext/standard: url_scanner_ex.c
>>> Only in php-8.3.4/ext/standard: var_unserializer.c
>>> Only in php-8.3.4/main: php_config.h.in
>>> Files php-8.3.4/main/php_version.h and php-src/main/php_version.h
>>> differ   Only in php-8.3.4/pear:
>>> install-pear-nozlib.phar   Only in
>>> php-8.3.4/sapi/phpdbg: phpdbg_lexer.c  Only in
>>> php-8.3.4/sapi/phpdbg: phpdbg_parser.c Only in
>>> php-8.3.4/sapi/phpdbg: phpdbg_parser.h
>>> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
>>> ```
>>>
>>> To prevent attacks from malevolent/compromised RMs, I propose completely
>>> removing all autogenerated files from the release tarballs, and ensuring
>>> their content exactly matches the content of the associated git tag (this
>>> means also removing the -dev prefix from the version number in
>>> main/php_version.h, Zend/zend.h, configure.ac and NEWS in the git tag).
>>>
>>> Of course this means that users will have to generate the build scripts
>>> when compiling PHP, as when installing PHP from the VCS repo.
>>>
>>> I'm sending a copy of this email to secur...@php.net as well.
>>>
>>> Hey Daniil,
>>>
>>> You can also have a public CI (i.e. a github action) generate the
>>> artifacts, along with hash computation.
>>> It should be a github action which runs on tags. This makes it fully
>>> verifiable; i.e. the code for the generation of action, including the hash.
>>> Anyone who wants can trivially trace this back.
>>>
>>> There's nothing in the tarballs which cannot be trivially automated and
>>> made verifiable.
>>>
>>> I don't think providing pre-generated files is fundamentally flawed, the
>>> primary lacking thing is verifiab

Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Jakub Zelenka
Hi,

On Sat, Mar 30, 2024 at 3:33 PM Daniil Gentili 
wrote:

> It is also pretty standard thing to distribute configure files (which is
> the file that probably matters most).
>
> The current standard way of distributing generated configure files in
> tarballs is precisely what allowed the xz supply chain attack to go
> unnoticed for so long.
>
>
Do you think it would be different if the change happened in the
distributed source file instead? I mean you could still modify tarball of
the distributed file (e.g. hide somewhere in configure.ac or in our case
more easily in less visible files like various Makefile.frag and similar).
The only thing that you get by using just VCS files is that people could
hash the distributed content of the files and compare it with the hash of
the VCS files but does anyone do this sort of verification?

If you meant using Git archive, then it's not a good idea because it
doesn't have a long term hash stability:
https://github.com/orgs/community/discussions/46034 .


> I strongly believe all projects using autotools, including PHP, should
> switch away from this "standard" way of doing things.
>
> Also don't forget that we need to also provide Windows builds which are
> binaries so we need some sort of verification of this type in any case.
>
> Of course, build reproducibility is a very good thing, but when a user
> downloads a binary, they're aware that they're getting a compiled blob
> which might contain injected malicious code (especially if there's no build
> reproducibility); when a user downloads a source tarball, there's a false
> sense of security rooted in the mistaken belief that the source code in the
> tarball matches the one distributed in the VCS, but in reality, the tarball
> also contains potentially malicious semi-compiled blobs, not present in the
> VCS.
>
>
> It would require compromising the CI as well as the download serves
> happening at the same time which seems to me like an impossible scenario.
>
> I misunderstood your original message, I thought you meant that there
> would be some new CI system hosted on downloads.php.net dedicated to
> verifying, not the current GHA CI system, which is configured on the public
> VCS.
>
> If GHA is used for verifying builds, that would make more sense, but then
> users would be required to check the status of a github pipeline to
> validate that the tarball was not compromised (or alternatively clone from
> source and re-generate the build scripts manually, or simply trust the
> release manager, which brings us to square 1).
>

As noted above, you would need to do much more involved verification if
only generated source codes were removed. With GHA it would be just failed
build and we could integrate notification (e.g. to Foundation Slack) so
more people could be easily aware if there is such problem.

Regards

Jakub


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Ben Ramsey
On Mar 30, 2024, at 07:03, Jakub Zelenka  wrote:Hi,On Sat, Mar 30, 2024 at 7:08 AM Marco Pivetta  wrote:On Sat, 30 Mar 2024, 05:19 Ben Ramsey,  wrote:On Mar 29, 2024, at 20:20, Bob Weinand  wrote:

  
  
On 29.3.2024 23:31:26, Daniil Gentili
  wrote:


  
  
  In light of
the recent supply chain attack in xz/lzma, leading to a backdoor
in openSSH
(https://www.openwall.com/lists/oss-security/2024/03/29/4), I
believe that it would be a good idea to remove the huge attack
surface offered by the pre-generated autoconf build scripts and
lexers, offered in the release tarballs. 
  
  In
particular, the xz supply chain attack injected the exploit with
a few obfuscated lines, manually added to the end of the
pre-generated configure script, that was only bundled in the
tarballs. 
  
  Even if the
exploits themselves were committed to the repo in the form of
test files, the code that actually injected the exploit in the
library was not committed to the repo, and was only present in
the pre-generated configure script in the tarball: this
injection mode makes sense, as extra files in the tarball not
present in the git repo would raise suspicions, but
machine-generated configure scripts containing hundreds of
thousands of lines of code not present in the upstream VCS are
the norm, and are usually not checked before execution. 
  
  Specifically
in the case of PHP, along from the configure script, the tarball
also bundles generated lexer files which contain actual C code,
which is an additional attack vector, i.e. here's the diff
between the tarball of the 8.3.4 release, and the PHP-8.3.4 tag
on the git repo: 
  
  ```
  
  ~ $ diff -r
php-8.3.4 php-src -q 
  Only in
php-src:
.git  Files
php-8.3.4/NEWS and php-src/NEWS
differ   Files php-8.3.4/Zend/zend.h
and php-src/Zend/zend.h differ Only in
php-8.3.4/Zend: zend_ini_parser.c 
  Only in
php-8.3.4/Zend: zend_ini_parser.h 
  Only in
php-8.3.4/Zend:
zend_ini_parser.output Only in
php-8.3.4/Zend: zend_ini_scanner.c 
  Only in
php-8.3.4/Zend: zend_ini_scanner_defs.h 
  Only in
php-8.3.4/Zend:
zend_language_parser.c Only in
php-8.3.4/Zend:
zend_language_parser.h Only in
php-8.3.4/Zend: zend_language_parser.output 
  Only in
php-8.3.4/Zend: zend_language_scanner.c 
  Only in
php-8.3.4/Zend:
zend_language_scanner_defs.h   Only in
php-8.3.4:
configure   Files
php-8.3.4/configure.ac and php-src/configure.ac
differ   Only in php-8.3.4/ext/json:
json_parser.tab.c  Only in
php-8.3.4/ext/json: json_parser.tab.h 
  Only in
php-8.3.4/ext/json: json_scanner.c 
  Only in
php-8.3.4/ext/json:
php_json_scanner_defs.h    Only in
php-8.3.4/ext/pdo: pdo_sql_parser.c 
  Only in
php-8.3.4/ext/phar:
phar_path_check.c  Only in
php-8.3.4/ext/standard: url_scanner_ex.c 
  Only in
php-8.3.4/ext/standard: var_unserializer.c 
  Only in
php-8.3.4/main: php_config.h.in 
  Files
php-8.3.4/main/php_version.h and php-src/main/php_version.h
differ   Only in php-8.3.4/pear:
install-pear-nozlib.phar   Only in
php-8.3.4/sapi/phpdbg:
phpdbg_lexer.c  Only in
php-8.3.4/sapi/phpdbg:
phpdbg_parser.c Only in
php-8.3.4/sapi/phpdbg: phpdbg_parser.h 
  Only in
php-8.3.4/sapi/phpdbg: phpdbg_parser.output 
  ```
  
  
  To prevent
attacks from malevolent/compromised RMs, I propose completely
removing all autogenerated files from the release tarballs, and
ensuring their content exactly matches the content of the
associated git tag (this means also removing the -dev prefix
from the version number in main/php_version.h, Zend/zend.h,
configure.ac and NEWS in the git tag). 
  
  Of course
this means that users will have to generate the build scripts
when compiling PHP, as when installing PHP from the VCS repo.
  
  
  I'm sending
a copy of this email to secur...@php.net as well.
Hey Daniil,

Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Daniil Gentili


That would break lots of tools as it requires extra dependencies so it 
is not something that would could in stable versions.


Btw, I do not believe that "it would require end users to install 
autotools and bison in order to compile PHP from tarballs" is valid 
reason to delay the patching of a serious attack vector ASAP.



Regards,

Daniil Gentili.


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Daniil Gentili

Hi,
It is also pretty standard thing to distribute configure files (which 
is the file that probably matters most).


The current standard way of distributing generated configure files in 
tarballs is precisely what allowed the xz supply chain attack to go 
unnoticed for so long.


I strongly believe all projects using autotools, including PHP, should 
switch away from this "standard" way of doing things.


Also don't forget that we need to also provide Windows builds which 
are binaries so we need some sort of verification of this type in any 
case.


Of course, build reproducibility is a very good thing, but when a user 
downloads a binary, they're aware that they're getting a compiled blob 
which might contain injected malicious code (especially if there's no 
build reproducibility); when a user downloads a source tarball, there's 
a false sense of security rooted in the mistaken belief that the source 
code in the tarball matches the one distributed in the VCS, but in 
reality, the tarball also contains potentially malicious semi-compiled 
blobs, not present in the VCS.


It would require compromising the CI as well as the download serves 
happening at the same time which seems to me like an impossible scenario.


I misunderstood your original message, I thought you meant that there 
would be some new CI system hosted on downloads.php.net dedicated to 
verifying, not the current GHA CI system, which is configured on the 
public VCS.


If GHA is used for verifying builds, that would make more sense, but 
then users would be required to check the status of a github pipeline to 
validate that the tarball was not compromised (or alternatively clone 
from source and re-generate the build scripts manually, or simply trust 
the release manager, which brings us to square 1).



Regards,

Daniil Gentili.


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Jakub Zelenka
On Sat, Mar 30, 2024 at 12:03 PM Jakub Zelenka  wrote:

> Hi,
>
> On Sat, Mar 30, 2024 at 7:08 AM Marco Pivetta  wrote:
>
>>
>>
>> On Sat, 30 Mar 2024, 05:19 Ben Ramsey,  wrote:
>>
>>> On Mar 29, 2024, at 20:20, Bob Weinand  wrote:
>>>
>>> 
>>> On 29.3.2024 23:31:26, Daniil Gentili wrote:
>>>
>>> In light of the recent supply chain attack in xz/lzma, leading to a
>>> backdoor in openSSH (
>>> https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe
>>> that it would be a good idea to remove the huge attack surface offered by
>>> the pre-generated autoconf build scripts and lexers, offered in the release
>>> tarballs.
>>>
>>> In particular, the xz supply chain attack injected the exploit with a
>>> few obfuscated lines, manually added to the end of the pre-generated
>>> configure script, that was only bundled in the tarballs.
>>>
>>> Even if the exploits themselves were committed to the repo in the form
>>> of test files, the code that actually injected the exploit in the library
>>> was not committed to the repo, and was only present in the pre-generated
>>> configure script in the tarball: this injection mode makes sense, as extra
>>> files in the tarball not present in the git repo would raise suspicions,
>>> but machine-generated configure scripts containing hundreds of thousands of
>>> lines of code not present in the upstream VCS are the norm, and are usually
>>> not checked before execution.
>>>
>>> Specifically in the case of PHP, along from the configure script, the
>>> tarball also bundles generated lexer files which contain actual C code,
>>> which is an additional attack vector, i.e. here's the diff between the
>>> tarball of the 8.3.4 release, and the PHP-8.3.4 tag on the git repo:
>>>
>>> ```
>>> ~ $ diff -r php-8.3.4 php-src -q
>>> Only in php-src:
>>> .git  Files
>>> php-8.3.4/NEWS and php-src/NEWS differ   Files
>>> php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ Only
>>> in php-8.3.4/Zend: zend_ini_parser.c
>>> Only in php-8.3.4/Zend: zend_ini_parser.h
>>> Only in php-8.3.4/Zend:
>>> zend_ini_parser.output Only in php-8.3.4/Zend:
>>> zend_ini_scanner.c
>>> Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
>>> Only in php-8.3.4/Zend:
>>> zend_language_parser.c Only in php-8.3.4/Zend:
>>> zend_language_parser.h Only in php-8.3.4/Zend:
>>> zend_language_parser.output
>>> Only in php-8.3.4/Zend: zend_language_scanner.c
>>> Only in php-8.3.4/Zend:
>>> zend_language_scanner_defs.h   Only in php-8.3.4:
>>> configure   Files php-8.3.4/
>>> configure.ac and php-src/configure.ac differ   Only in
>>> php-8.3.4/ext/json: json_parser.tab.c  Only in
>>> php-8.3.4/ext/json: json_parser.tab.h
>>> Only in php-8.3.4/ext/json: json_scanner.c
>>> Only in php-8.3.4/ext/json:
>>> php_json_scanner_defs.hOnly in php-8.3.4/ext/pdo:
>>> pdo_sql_parser.c
>>> Only in php-8.3.4/ext/phar:
>>> phar_path_check.c  Only in
>>> php-8.3.4/ext/standard: url_scanner_ex.c
>>> Only in php-8.3.4/ext/standard: var_unserializer.c
>>> Only in php-8.3.4/main: php_config.h.in
>>> Files php-8.3.4/main/php_version.h and php-src/main/php_version.h
>>> differ   Only in php-8.3.4/pear:
>>> install-pear-nozlib.phar   Only in
>>> php-8.3.4/sapi/phpdbg: phpdbg_lexer.c  Only in
>>> php-8.3.4/sapi/phpdbg: phpdbg_parser.c Only in
>>> php-8.3.4/sapi/phpdbg: phpdbg_parser.h
>>> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
>>> ```
>>>
>>> To prevent attacks from malevolent/compromised RMs, I propose completely
>>> removing all autogenerated files from the release tarballs, and ensuring
>>> their content exactly matches the content of the associated git tag (this
>>> means also removing the -dev prefix from the version number in
>>> main/php_version.h, Zend/zend.h, configure.ac and NEWS in the git tag).
>>>
>>> Of course this means that users will have to generate the build scripts
>>> when compiling PHP, as when installing PHP from the VCS repo.
>>>
>>> I'm sending a copy of this email to secur...@php.net as well.
>>>
>>> Hey Daniil,
>>>
>>> You can also have a public CI (i.e. a github action) generate the
>>> artifacts, along with hash computation.
>>> It should be a github action which runs on tags. This makes it fully
>>> verifiable; i.e. the code for the generation of action, including the hash.
>>> Anyone who wants can trivially trace this back.
>>>
>>> There's nothing in the tarballs which cannot be trivially automated and
>>> made verifiable.
>>>
>>> I don't think providing pre-generated files is fundamentally flawed, the
>>> primary lacking thing is verifiability. Which is also what enabled the xz
>>> backdoor

Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Jakub Zelenka
Hi

On Sat, Mar 30, 2024 at 1:39 PM Daniil Gentili 
wrote:

> Hi,
>
>
> >The idea is that we would setup worklfow on CI that would run on tag push
> and it would call (authenticated https request) downloads.php.net server
> that could do the actual build
>
> I strongly believe that source tarballs should contain *only* the source
> code contained in the VCS.
>
That would break lots of tools as it requires extra dependencies so it is
not something that would could in stable versions. It is also pretty
standard thing to distribute configure files (which is the file that
probably matters most). Also don't forget that we need to also provide
Windows builds which are binaries so we need some sort of verification of
this type in any case.

>
> Distributing "half-built" source code (even if it's generated by a CI, and
> especially by a build server on downloads.php.net, which can be
> compromised) defeats the reproducibility and transparency purposes of
> building from source.
>

It would require compromising the CI as well as the download serves
happening at the same time which seems to me like an impossible scenario.

Regards

Jakub


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Tim Düsterhus

Hi

On 3/30/24 14:20, Stanislav Malyshev wrote:

But does the release manager generate the files (and the tarball) in a
reproducible way?


I understand that's what ./scripts/dev/makedist and
./scripts/dev/genfiles do, but I suspect exact bits in resulting
configure and lexers may depend on the exact version of tools & utils
used. For upstream packagers like distros I'd likely recommend using
these tools directly anyway, and not rely on what's in the package.



I've made some improvements to the 'makedist' script last year to 
improve reproducibility [1], but they are not fully reproducible yet.


Notably the timestamps within the .tar archive are not reproducible yet: 
https://github.com/php/php-src/blob/186465b1ddcf203ddffb5d24bae897508c711586/scripts/dev/makedist#L169-L172


They are set to the time the script is run, but should probably be 
derived from the time of the current commit instead. Likewise the gzip 
call does not have the -n flag and thus also embeds a timestamp into the 
.tar.gz archive.


There are probably further bits that are not reproducible yet.

Best regards
Tim Düsterhus

[1] https://github.com/php/php-src/pull/10613 and 
https://github.com/php/php-src/pull/10615


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Daniil Gentili

Hi,


>The idea is that we would setup worklfow on CI that would run on tag 
push and it would call (authenticated https request) downloads.php.net 
 server that could do the actual build


I strongly believe that source tarballs should contain *only* the source 
code contained in the VCS.


Distributing "half-built" source code (even if it's generated by a CI, 
and especially by a build server on downloads.php.net, which can be 
compromised) defeats the reproducibility and transparency purposes of 
building from source.


> For upstream packagers like distros I'd likely recommend using these 
tools directly anyway, and not rely on what's in the package.


Distros like arch linux already re-generate the configure scripts from 
scratch, but I believe that no distinction should be made, *everyone* 
should get a tarball containing *only* the bare source code, without 
leaving to the user the choice to re-generate the build files, or use a 
potentially compromised build script.



Regards,

Daniil Gentili.


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Stanislav Malyshev

Hi!

On 3/30/24 1:27 AM, Sebastian Bergmann wrote:

Am 30.03.2024 um 05:17 schrieb Ben Ramsey:
This is also why our release managers sign the tarballs with their own 
GPG keys, after generating the artifacts. This verifies the release 
manager was the one who generated the files.


But does the release manager generate the files (and the tarball) in a 
reproducible way?


I understand that's what ./scripts/dev/makedist and 
./scripts/dev/genfiles do, but I suspect exact bits in resulting 
configure and lexers may depend on the exact version of tools & utils 
used. For upstream packagers like distros I'd likely recommend using 
these tools directly anyway, and not rely on what's in the package.


--
Stas Malyshev
smalys...@gmail.com


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Jakub Zelenka
Hi,

On Sat, Mar 30, 2024 at 7:08 AM Marco Pivetta  wrote:

>
>
> On Sat, 30 Mar 2024, 05:19 Ben Ramsey,  wrote:
>
>> On Mar 29, 2024, at 20:20, Bob Weinand  wrote:
>>
>> 
>> On 29.3.2024 23:31:26, Daniil Gentili wrote:
>>
>> In light of the recent supply chain attack in xz/lzma, leading to a
>> backdoor in openSSH (
>> https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe
>> that it would be a good idea to remove the huge attack surface offered by
>> the pre-generated autoconf build scripts and lexers, offered in the release
>> tarballs.
>>
>> In particular, the xz supply chain attack injected the exploit with a few
>> obfuscated lines, manually added to the end of the pre-generated configure
>> script, that was only bundled in the tarballs.
>>
>> Even if the exploits themselves were committed to the repo in the form of
>> test files, the code that actually injected the exploit in the library was
>> not committed to the repo, and was only present in the pre-generated
>> configure script in the tarball: this injection mode makes sense, as extra
>> files in the tarball not present in the git repo would raise suspicions,
>> but machine-generated configure scripts containing hundreds of thousands of
>> lines of code not present in the upstream VCS are the norm, and are usually
>> not checked before execution.
>>
>> Specifically in the case of PHP, along from the configure script, the
>> tarball also bundles generated lexer files which contain actual C code,
>> which is an additional attack vector, i.e. here's the diff between the
>> tarball of the 8.3.4 release, and the PHP-8.3.4 tag on the git repo:
>>
>> ```
>> ~ $ diff -r php-8.3.4 php-src -q
>> Only in php-src:
>> .git  Files
>> php-8.3.4/NEWS and php-src/NEWS differ   Files
>> php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ Only
>> in php-8.3.4/Zend: zend_ini_parser.c
>> Only in php-8.3.4/Zend: zend_ini_parser.h
>> Only in php-8.3.4/Zend:
>> zend_ini_parser.output Only in php-8.3.4/Zend:
>> zend_ini_scanner.c
>> Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
>> Only in php-8.3.4/Zend:
>> zend_language_parser.c Only in php-8.3.4/Zend:
>> zend_language_parser.h Only in php-8.3.4/Zend:
>> zend_language_parser.output
>> Only in php-8.3.4/Zend: zend_language_scanner.c
>> Only in php-8.3.4/Zend:
>> zend_language_scanner_defs.h   Only in php-8.3.4:
>> configure   Files php-8.3.4/
>> configure.ac and php-src/configure.ac differ   Only in
>> php-8.3.4/ext/json: json_parser.tab.c  Only in
>> php-8.3.4/ext/json: json_parser.tab.h
>> Only in php-8.3.4/ext/json: json_scanner.c
>> Only in php-8.3.4/ext/json:
>> php_json_scanner_defs.hOnly in php-8.3.4/ext/pdo:
>> pdo_sql_parser.c
>> Only in php-8.3.4/ext/phar:
>> phar_path_check.c  Only in
>> php-8.3.4/ext/standard: url_scanner_ex.c
>> Only in php-8.3.4/ext/standard: var_unserializer.c
>> Only in php-8.3.4/main: php_config.h.in
>> Files php-8.3.4/main/php_version.h and php-src/main/php_version.h
>> differ   Only in php-8.3.4/pear:
>> install-pear-nozlib.phar   Only in
>> php-8.3.4/sapi/phpdbg: phpdbg_lexer.c  Only in
>> php-8.3.4/sapi/phpdbg: phpdbg_parser.c Only in
>> php-8.3.4/sapi/phpdbg: phpdbg_parser.h
>> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
>> ```
>>
>> To prevent attacks from malevolent/compromised RMs, I propose completely
>> removing all autogenerated files from the release tarballs, and ensuring
>> their content exactly matches the content of the associated git tag (this
>> means also removing the -dev prefix from the version number in
>> main/php_version.h, Zend/zend.h, configure.ac and NEWS in the git tag).
>>
>> Of course this means that users will have to generate the build scripts
>> when compiling PHP, as when installing PHP from the VCS repo.
>>
>> I'm sending a copy of this email to secur...@php.net as well.
>>
>> Hey Daniil,
>>
>> You can also have a public CI (i.e. a github action) generate the
>> artifacts, along with hash computation.
>> It should be a github action which runs on tags. This makes it fully
>> verifiable; i.e. the code for the generation of action, including the hash.
>> Anyone who wants can trivially trace this back.
>>
>> There's nothing in the tarballs which cannot be trivially automated and
>> made verifiable.
>>
>> I don't think providing pre-generated files is fundamentally flawed, the
>> primary lacking thing is verifiability. Which is also what enabled the xz
>> backdoor.
>>
>> Bob
>>
>>
>> This is also why our release managers sign the tarballs with their own
>> GPG keys, after generating the artifacts. This verifies the release

Re: [PHP-DEV] [RFC] Invoke __callStatic when non-static public methods are called statically

2024-03-30 Thread Robert Landers
On Sat, Mar 30, 2024 at 1:06 AM 하늘아부지  wrote:
>
>
>
> 2024년 3월 30일 (토) 오전 2:15, Robert Landers 님이 작성:
>>
>> On Fri, Mar 29, 2024 at 3:41 AM 하늘아부지  wrote:
>> >
>> > Hello.
>> >
>> > I created a wiki for __callStatic related issues.
>> > Please see:
>> > https://wiki.php.net/rfc/complete_callstatc_magic
>> >
>> > I look forward to your interest and advice.
>> >
>> > Best Regards.
>> > Daddyofsky
>>
>> Hey there,
>>
>> Some general feedback:
>>
>> The RFC is a bit hard for me to follow, for example:
>>
>> > However, the IDE cannot find active method, and the Go to Definition 
>> > feature cannot be used.
>>
>> This sounds like a bug or feature request with your IDE, not a problem with 
>> PHP.
>>
>> > This code is very clear, aside from the fact that the method is not static.
>>
>> Clear to whom? As a developer this looks downright confusing. I can't
>> even guess what the actual behavior is. If you call a non-static
>> method statically, its currently an error. Why would this not be an
>> error?
>>
>> > The IDE recognizes active methods well and the Go to definition feature 
>> > also works properly.
>>
>> What is stopping the IDE to taking you to the __callStatic method?
>> That would be the correct behavior IMHO, not the implementation for
>> instance methods.
>>
>> > Even if there are only a few core methods, it cannot be made into a single 
>> > file.
>>
>> There is nothing stopping you from putting multiple classes in a file.
>>
>> > As can be seen in the above examples, the code becomes clearer, and 
>> > navigation through the IDE works much better.
>>
>> I completely disagree. It mixes concerns and makes spaghetti code into
>> incomprehensible code. Also, maybe you should take up IDE navigation
>> with your IDE?
>>
>> > Instead of throwing an error when a non-static public method is called 
>> > statically, the _ _callStatic method should be invoked.
>>
>> I completely agree with this, btw. Your examples could use some work
>> and shows all the reasons why it shouldn't call __callStatic(). A real
>> life example, that has nothing to do with a specific framework:
>>
>> When generating proxies for existing types, you often need to share
>> some state between the proxies. To do that, you put static
>> methods/properties on the proxy class and hope to the PHP Gods that
>> nobody will ever accidentally name something in their concrete class
>> with the name you chose for things. To help with that, you create some
>> kind of insane prefix. If __callStatic() were called ALWAYS in a
>> static context, even if a non-static method exists, then collisions
>> would literally be impossible. But at that point, why can't I just
>> write:
>>
>> class Test {
>>   public static function test(): string { return "hello world"; }
>>   public function test(): int { return random_int(); }
>> }
>>
>> ??? I feel like this is your real RFC and should be allowed if we're
>> allowed to __callStatic() to instance methods. I don't think it makes
>> sense to have one without the other, and then what you want with
>> __callStatic() comes naturally, instead of this weird and confusing
>> RFC.
>>
>>
>> Robert Landers
>> Software Engineer
>> Utrecht NL
>
>
> Hello.
>
> I'm not very familiar with documentation. It would be helpful if you could 
> suggest how to fix the problematic parts.
>
> Using the example of the IDE meant to illustrate that it should be intuitive 
> enough for the IDE to find it directly.
>
> Even currently, `__callStatic` is called in cases of non-static methods that 
> are not public methods. `__callStatic` already acts as a rule-breaker. It 
> seems like there's a desire for the magic method to be named as such without 
> actually causing much magic. The current `__callStatic` is like a magician 
> who can use high-level magic to hit targets behind doors or invisible targets 
> but can't use basic magic to hit a visible door.
>
> While it's good that the PHP kindly notifies errors, I think it would also be 
> beneficial to provide users with options to do other things, specifically 
> when they intentionally use `__callStatic`. That's the core point of this 
> proposal.
>
> If there are better examples or ways to explain, I would appreciate learning 
> about them.
>
> Daddyofsky

My main issue with the RFC comes down to the fact that it suggests
that this is valid:

// https://3v4l.org/0ufkt
class Test {
  public function test() {}
  public static function test() {}
}

Having two functions with the same name, one static and one non-static
is (currently) an error. Allowing callStatic to break this rule
without being able to write the above code, feels inconsistent. While
PHP is full of inconsistencies, adding more of them isn't the right
way. I'd much rather see static and non-static methods being able to
have the same name -- in which case it would make sense for callStatic
to work as described in the RFC.

Robert Landers
Software Engineer
Utrecht NL


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Sebastian Bergmann

Am 30.03.2024 um 05:17 schrieb Ben Ramsey:
This is also why our release managers sign the tarballs with their own GPG 
keys, after generating the artifacts. This verifies the release manager 
was the one who generated the files.


But does the release manager generate the files (and the tarball) in a 
reproducible way?


Re: [PHP-DEV] Consider removing autogenerated files from tarballs

2024-03-30 Thread Marco Pivetta
On Sat, 30 Mar 2024, 05:19 Ben Ramsey,  wrote:

> On Mar 29, 2024, at 20:20, Bob Weinand  wrote:
>
> 
> On 29.3.2024 23:31:26, Daniil Gentili wrote:
>
> In light of the recent supply chain attack in xz/lzma, leading to a
> backdoor in openSSH (
> https://www.openwall.com/lists/oss-security/2024/03/29/4), I believe that
> it would be a good idea to remove the huge attack surface offered by the
> pre-generated autoconf build scripts and lexers, offered in the release
> tarballs.
>
> In particular, the xz supply chain attack injected the exploit with a few
> obfuscated lines, manually added to the end of the pre-generated configure
> script, that was only bundled in the tarballs.
>
> Even if the exploits themselves were committed to the repo in the form of
> test files, the code that actually injected the exploit in the library was
> not committed to the repo, and was only present in the pre-generated
> configure script in the tarball: this injection mode makes sense, as extra
> files in the tarball not present in the git repo would raise suspicions,
> but machine-generated configure scripts containing hundreds of thousands of
> lines of code not present in the upstream VCS are the norm, and are usually
> not checked before execution.
>
> Specifically in the case of PHP, along from the configure script, the
> tarball also bundles generated lexer files which contain actual C code,
> which is an additional attack vector, i.e. here's the diff between the
> tarball of the 8.3.4 release, and the PHP-8.3.4 tag on the git repo:
>
> ```
> ~ $ diff -r php-8.3.4 php-src -q
> Only in php-src: .git
> Files php-8.3.4/NEWS and php-src/NEWS differ
> Files php-8.3.4/Zend/zend.h and php-src/Zend/zend.h differ
> Only in php-8.3.4/Zend: zend_ini_parser.c
> Only in php-8.3.4/Zend: zend_ini_parser.h
> Only in php-8.3.4/Zend: zend_ini_parser.output
> Only in php-8.3.4/Zend: zend_ini_scanner.c
> Only in php-8.3.4/Zend: zend_ini_scanner_defs.h
> Only in php-8.3.4/Zend: zend_language_parser.c
> Only in php-8.3.4/Zend: zend_language_parser.h
> Only in php-8.3.4/Zend: zend_language_parser.output
> Only in php-8.3.4/Zend: zend_language_scanner.c
> Only in php-8.3.4/Zend: zend_language_scanner_defs.h
> Only in php-8.3.4: configure
> Files php-8.3.4/configure.ac and php-src/configure.ac
> differ   Only in php-8.3.4/ext/json:
> json_parser.tab.c  Only in php-8.3.4/ext/json:
> json_parser.tab.h
> Only in php-8.3.4/ext/json: json_scanner.c
> Only in php-8.3.4/ext/json: php_json_scanner_defs.h
> Only in php-8.3.4/ext/pdo: pdo_sql_parser.c
> Only in php-8.3.4/ext/phar: phar_path_check.c
> Only in php-8.3.4/ext/standard: url_scanner_ex.c
> Only in php-8.3.4/ext/standard: var_unserializer.c
> Only in php-8.3.4/main: php_config.h.in
> Files php-8.3.4/main/php_version.h and php-src/main/php_version.h differ
> Only in php-8.3.4/pear: install-pear-nozlib.phar
> Only in php-8.3.4/sapi/phpdbg: phpdbg_lexer.c
> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.c
> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.h
> Only in php-8.3.4/sapi/phpdbg: phpdbg_parser.output
> ```
>
> To prevent attacks from malevolent/compromised RMs, I propose completely
> removing all autogenerated files from the release tarballs, and ensuring
> their content exactly matches the content of the associated git tag (this
> means also removing the -dev prefix from the version number in
> main/php_version.h, Zend/zend.h, configure.ac and NEWS in the git tag).
>
> Of course this means that users will have to generate the build scripts
> when compiling PHP, as when installing PHP from the VCS repo.
>
> I'm sending a copy of this email to secur...@php.net as well.
>
> Hey Daniil,
>
> You can also have a public CI (i.e. a github action) generate the
> artifacts, along with hash computation.
> It should be a github action which runs on tags. This makes it fully
> verifiable; i.e. the code for the generation of action, including the hash.
> Anyone who wants can trivially trace this back.
>
> There's nothing in the tarballs which cannot be trivially automated and
> made verifiable.
>
> I don't think providing pre-generated files is fundamentally flawed, the
> primary lacking thing is verifiability. Which is also what enabled the xz
> backdoor.
>
> Bob
>
>
> This is also why our release managers sign the tarballs with their own GPG
> keys, after generating the artifacts. This verifies the release manager was
> the one who generated the files.
>
> Cheers,
> Ben
>

Hey Ben,

I understand that the XZ project had signed releases too: that still means
that downstream consumers would need to trust the release managers anyway,
and reproduce the whole chain themselves.

I suppose that's part of OP's concern.