Re: [PHP-DEV] Reproducible Builds
On 29.11.2023 at 08:12, Derick Rethans wrote:
> Not really, as a hash doesn't directly tell me the date/time, and neither would it help in dev branches / checkouts where the latest changes haven't been committed yet.

I do not see how date/time help with seeing what was compiled.

-- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] What is the prevailing sentiment about extract() and compact() ?
> On 29 Nov 2023, at 09:58, Larry Garfield wrote:
>
> On Tue, Nov 28, 2023, at 7:49 PM, Juliette Reinders Folmer wrote:
>> L.S.,
>>
>> What with all the drives towards cleaner code, how do people feel nowadays about `extract()` and `compact()` still being supported ?
>>
>> Both have alternatives. The alternatives may be a little more cumbersome to type, but they also make the code more descriptive, lessen the risk of variable name collisions (though this can be handled via the $flags in extract), prevent surprises when a non-associative key would be included in an array and lessen security risks when used on untrusted data.
>
> *snip*
>
>> I can imagine these could be candidates for deprecation ? Or limited deprecation - only when used in the global namespace ?
>>
>> For now, I'm just wondering how people feel about these functions.
>>
>> Smile,
>> Juliette
>
> extract() has very limited use in some kinds of template engine, which use PHP require() as a template mechanism. I don't think compact() has any uses.
>
> I very recently was just reminded that these even exist, as I had to tell one of my developers to not use them. I think it was compact() he was trying to use. I vetoed it.
>
> I would not mind if they were removed, but I don't know how large the BC impact would be. They'd probably need a long deprecation period, just to be safe.
>
> --Larry Garfield

Hi,

While I think I understand the goal behind this, I think you're missing some factors here.

Regarding use-cases for compact(): the most common one I can think of from my work is passing multiple local variables as context to a logging function, but I'd be surprised if it's not also used to build faux hash structures too.

If your goal is to build an associative array (i.e. a poor man's hash) of known variable names, using compact() in PHP 8+ carries *less* risk of uncaught/unexpected errors than building it manually.

Passing an undefined name (e.g. due to a typo, or because it simply isn't defined) produces a warning regardless of whether you build the array manually or pass the name(s) to compact().

Providing an array key name that doesn't match the variable name (e.g. due to a typo, or a variable being renamed) will produce no error when building the array manually, but will produce a warning with compact().

IDEs (e.g. PHPStorm/IDEA+PHP plugin) can already understand that the names passed to compact() are variable names, and update them when a variable is renamed via the IDE. They simply cannot do the same for plain array keys.

Due to how variable scope works, the only way to re-implement compact() with the same key-typo-catching behaviour as a function in userland would be something that requires the user to pass the result of get_defined_vars() to every call.

So no, I don't think compact() should be deprecated. What I think *should* happen is to promote the current warning on undefined variables to an error, as per https://wiki.php.net/rfc/undefined_variable_error_promotion. Whether this is a foregone conclusion or not, I don't know, because that RFC doesn't mention compact() specifically.

extract(), as Larry points out, has historically been used by 'pure PHP' style template systems, in a manner that's generally "safe". Personally I'm less inclined to use this behaviour now (i.e. I'd prefer to access named & typed properties from a template than arbitrary local variable names), but I don't think that's enough of a case to remove it, because just like with compact(), by nature of how variable scope works, it's very difficult/impossible to re-implement this in userland in a way that's reusable and doesn't involve using worse constructs (e.g. eval'ing the result of a function).

I think there's possibly an argument to be made for improvements, such as changing the default mode of extract() to something besides EXTR_OVERWRITE, or having checks in place to prevent the overwrite of superglobals.

Cheers

Stephen
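
To make the key-typo point concrete, here is a minimal sketch, assuming PHP 8 (where compact() raises a warning for an undefined name). The function names `manual` and `viaCompact` and the misspelled key `colour` are made up for illustration; the error handler only collects messages so the difference can be inspected.

```php
<?php
// Collect diagnostics instead of printing them, so both cases can be compared.
$warnings = [];
set_error_handler(function (int $errno, string $errstr) use (&$warnings): bool {
    $warnings[] = $errstr;
    return true; // treat as handled
});

function manual(string $color, string $size): array
{
    // Typo in the key ('colour' instead of 'color'): silently accepted.
    return ['colour' => $color, 'size' => $size];
}

function viaCompact(string $color, string $size): array
{
    // Same typo, but compact() looks the name up as a variable and complains.
    return compact('colour', 'size');
}

$a = manual('blue', 'medium');
$countAfterManual = count($warnings);

$b = viaCompact('blue', 'medium');
$countAfterCompact = count($warnings);

restore_error_handler();
```

The manually built array carries the wrong key without any diagnostic, while the compact() call reports the unknown name and leaves it out of the result.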
Re: [PHP-DEV] Reproducible Builds
On 29 November 2023 00:48:28 GMT, Matthew Weier O'Phinney wrote:
>On Tue, Nov 28, 2023, 5:28 PM Derick Rethans wrote:
>
>> On 28 November 2023 17:28:18 GMT, Sebastian Bergmann wrote:
>>
>> >While we could probably replace __DATE__ and __TIME__ with SOURCE_DATE_EPOCH [3] [4], I cannot help but wonder whether having the date and time when the executable was built in the executable is actually useful. How attached are we to having the date and time of the build in the output of phpinfo(), "php -i", etc.?
>>
>> It is really useful for the development versions of PHP. Knowing whether you are running a PHP-dev from last week or last month is important.
>
>Would Marco's suggestion of using a git hash solve that? You'd then get both a reproducible build AND know when/what it was generated from.

Not really, as a hash doesn't directly tell me the date/time, and neither would it help in dev branches / checkouts where the latest changes haven't been committed yet.

cheers
Derick
Re: [PHP-DEV] Reproducible Builds
On 28.11.2023 at 19:40, Ilija Tovilo wrote:
> At least for core, enabled-by-default extensions, __DATE__ and __TIME__ seem to be the only variables. I can get reproducible builds by setting SOURCE_DATE_EPOCH.

Confirmed: I can get reproducible builds, too, by using Clang and setting SOURCE_DATE_EPOCH.
Re: [PHP-DEV] Reproducible Builds
On 29.11.2023 at 07:23, Sebastian Bergmann wrote:
> SOURCE_DATE_EPOCH=$(git log -1 --pretty=%cI) should do the trick.

What I meant to write was SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct), of course. Sorry for the noise.
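
The difference between the two format strings matters because SOURCE_DATE_EPOCH is expected to hold a plain count of seconds since the Unix epoch, which is what `%ct` prints, whereas `%cI` prints an ISO 8601 string. The following is only a sketch of how a consumer might treat the two values. `buildTimestamp` and `formatBuildDate` are hypothetical helpers, not functions from php-src, and the sample values are made up.

```php
<?php
// Hypothetical helper: choose the timestamp to embed in a build.
function buildTimestamp(?string $sourceDateEpoch, int $now): int
{
    // Accept only a plain run of digits (seconds since the Unix epoch).
    if ($sourceDateEpoch !== null && preg_match('/^[0-9]+$/D', $sourceDateEpoch) === 1) {
        return (int) $sourceDateEpoch;
    }
    // Anything else falls back to the wall clock, which is not reproducible.
    return $now;
}

// Hypothetical helper: render the timestamp in UTC, loosely in the
// spirit of the C __DATE__ and __TIME__ macros.
function formatBuildDate(int $timestamp): string
{
    return gmdate('M d Y H:i:s', $timestamp);
}

$fromCt = buildTimestamp('1701238980', 42);                // shape of %ct output
$fromCI = buildTimestamp('2023-11-29T07:23:00+01:00', 42); // shape of %cI output

echo formatBuildDate($fromCt), "\n";
```

With the `%cI`-style value the helper silently falls back to the supplied clock value (42 here), which is the kind of mistake the correction above avoids.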
Re: [PHP-DEV] Reproducible Builds
On 29.11.2023 at 01:54, Marco Pivetta wrote:
> Also, refs have a timestamp :-)

SOURCE_DATE_EPOCH=$(git log -1 --pretty=%cI) should do the trick.
Re: [PHP-DEV] What is the prevailing sentiment about extract() and compact() ?
On Tue, Nov 28, 2023, at 7:49 PM, Juliette Reinders Folmer wrote:
> L.S.,
>
> What with all the drives towards cleaner code, how do people feel nowadays about `extract()` and `compact()` still being supported ?
>
> Both have alternatives. The alternatives may be a little more cumbersome to type, but they also make the code more descriptive, lessen the risk of variable name collisions (though this can be handled via the $flags in extract), prevent surprises when a non-associative key would be included in an array and lessen security risks when used on untrusted data.

*snip*

> I can imagine these could be candidates for deprecation ? Or limited deprecation - only when used in the global namespace ?
>
> For now, I'm just wondering how people feel about these functions.
>
> Smile,
> Juliette

extract() has very limited use in some kinds of template engine, which use PHP require() as a template mechanism. I don't think compact() has any uses.

I very recently was just reminded that these even exist, as I had to tell one of my developers to not use them. I think it was compact() he was trying to use. I vetoed it.

I would not mind if they were removed, but I don't know how large the BC impact would be. They'd probably need a long deprecation period, just to be safe.

--Larry Garfield
[PHP-DEV] What is the prevailing sentiment about extract() and compact() ?
L.S.,

What with all the drives towards cleaner code, how do people feel nowadays about `extract()` and `compact()` still being supported ?

Both have alternatives. The alternatives may be a little more cumbersome to type, but they also make the code more descriptive, lessen the risk of variable name collisions (though this can be handled via the $flags in extract), prevent surprises when a non-associative key would be included in an array and lessen security risks when used on untrusted data.

    function foo() {
        $array = [
            'color' => 'blue',
            'size'  => 'medium',
        ];

        // Using extract.
        extract($array);
        var_dump($color);

        // Not using extract.
        var_dump($array['color']);

        $color = $array['color'];
        var_dump($color);
    }

    function bar( $color, $size ) {
        // Using compact.
        $array = compact('color', 'size');
        var_dump($array);

        // Not using compact.
        $array = [
            'color' => $color,
            'size'  => $size,
        ];
        var_dump($array);

        $array = [];
        foreach (['color', 'size'] as $name) {
            if (isset($$name)) {
                $array[$name] = $$name;
            }
        }
        var_dump($array);
    }

https://3v4l.org/JeHnY

I can imagine these could be candidates for deprecation ? Or limited deprecation - only when used in the global namespace ?

For now, I'm just wondering how people feel about these functions.

Smile,
Juliette
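
As a rough illustration of the $flags remark, the sketch below compares the default EXTR_OVERWRITE behaviour with EXTR_SKIP and EXTR_PREFIX_ALL on untrusted input. The function names, the `$isAdmin` variable and the `in` prefix are invented for this example.

```php
<?php
function renderDefault(array $input): string
{
    $isAdmin = false;
    extract($input); // default EXTR_OVERWRITE: an 'isAdmin' key replaces the local
    return $isAdmin ? 'admin' : 'guest';
}

function renderSkip(array $input): string
{
    $isAdmin = false;
    extract($input, EXTR_SKIP); // existing variables are left untouched
    return $isAdmin ? 'admin' : 'guest';
}

function renderPrefixed(array $input): string
{
    $isAdmin = false;
    extract($input, EXTR_PREFIX_ALL, 'in'); // creates $in_isAdmin instead
    return $isAdmin ? 'admin' : 'guest';
}

$untrusted = ['isAdmin' => true];

$default  = renderDefault($untrusted);
$skipped  = renderSkip($untrusted);
$prefixed = renderPrefixed($untrusted);
```

Only the default mode lets the array overwrite the local flag, which is the collision and security concern raised above.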
Re: [PHP-DEV] Reproducible Builds
On Wed, 29 Nov 2023 at 01:48, Matthew Weier O'Phinney <mweierophin...@gmail.com> wrote:
> On Tue, Nov 28, 2023, 5:28 PM Derick Rethans wrote:
>
> > On 28 November 2023 17:28:18 GMT, Sebastian Bergmann wrote:
> >
> > >While we could probably replace __DATE__ and __TIME__ with SOURCE_DATE_EPOCH [3] [4], I cannot help but wonder whether having the date and time when the executable was built in the executable is actually useful. How attached are we to having the date and time of the build in the output of phpinfo(), "php -i", etc.?
> >
> > It is really useful for the development versions of PHP. Knowing whether you are running a PHP-dev from last week or last month is important.
>
> Would Marco's suggestion of using a git hash solve that? You'd then get both a reproducible build AND know when/what it was generated from.

Also, refs have a timestamp :-)

Marco Pivetta
https://mastodon.social/@ocramius
https://ocramius.github.io/
Re: [PHP-DEV] Reproducible Builds
On Tue, Nov 28, 2023, 5:28 PM Derick Rethans wrote:
> On 28 November 2023 17:28:18 GMT, Sebastian Bergmann wrote:
>
> >While we could probably replace __DATE__ and __TIME__ with SOURCE_DATE_EPOCH [3] [4], I cannot help but wonder whether having the date and time when the executable was built in the executable is actually useful. How attached are we to having the date and time of the build in the output of phpinfo(), "php -i", etc.?
>
> It is really useful for the development versions of PHP. Knowing whether you are running a PHP-dev from last week or last month is important.

Would Marco's suggestion of using a git hash solve that? You'd then get both a reproducible build AND know when/what it was generated from.
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
> Use zend.script_encoding=sjis and zend.multibyte=true
>
> ❯ ~/php82/bin/php -d zend.script_encoding=sjis -d zend.multibyte=true deprecate_zend_scriptencoding.php
> array(7) {
>   ["biao_hex"]=>
>   string(6) "e8a1a8"
>   ["zend.multibyte"]=>
>   string(1) "1"
>   ["zend.script_encoding"]=>
>   string(4) "sjis"
>   ["zend.detect_unicode"]=>
>   string(1) "1"
>   ["mbstring.internal_encoding"]=>
>   string(0) ""
>   ["mbstring.func_overload"]=>
>   bool(false)
>   ["PHP_VERSION"]=>
>   string(5) "8.2.8"
> }

Strictly speaking, internal_encoding should be set as well:

❯ ~/php82/bin/php -d zend.script_encoding=sjis -d internal_encoding=sjis -d zend.multibyte=true deprecate_zend_scriptencoding.php
array(7) {
  ["biao_hex"]=>
  string(4) "955c"
  ["zend.multibyte"]=>
  string(1) "1"
  ["zend.script_encoding"]=>
  string(4) "sjis"
  ["zend.detect_unicode"]=>
  string(1) "1"
  ["mbstring.internal_encoding"]=>
  string(0) ""
  ["mbstring.func_overload"]=>
  bool(false)
  ["PHP_VERSION"]=>
  string(5) "8.2.8"
}

--
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
On Wed, 29 Nov 2023 at 9:04, Hans Henrik Bergan wrote:
> Do you have access to a project actually using Shift_JIS? Interesting!
> I thought they were practically unicorns / non-existent running PHP4,
>
> Can you run
> ```
> var_dump(array(
>     "biao_hex" => bin2hex("表"),
>     "zend.multibyte" => ini_get("zend.multibyte"),
>     "zend.script_encoding" => ini_get("zend.script_encoding"),
>     "zend.detect_unicode" => ini_get("zend.detect_unicode"),
>     "mbstring.internal_encoding" => ini_get("mbstring.internal_encoding"),
>     "mbstring.func_overload" => ini_get("mbstring.func_overload"),
>     "PHP_VERSION" => PHP_VERSION,
> ));
> ```

Hi, Hans

I tried the code above.

With no configuration:

❯ ~/php82/bin/php deprecate_zend_scriptencoding.php
PHP Parse error: syntax error, unexpected identifier "zend", expecting ")" in /Users/youkidearitai/deprecate_zend_scriptencoding.php on line 5

Parse error: syntax error, unexpected identifier "zend", expecting ")" in /Users/youkidearitai/deprecate_zend_scriptencoding.php on line 5

With zend.script_encoding=sjis and zend.multibyte=true:

❯ ~/php82/bin/php -d zend.script_encoding=sjis -d zend.multibyte=true deprecate_zend_scriptencoding.php
array(7) {
  ["biao_hex"]=>
  string(6) "e8a1a8"
  ["zend.multibyte"]=>
  string(1) "1"
  ["zend.script_encoding"]=>
  string(4) "sjis"
  ["zend.detect_unicode"]=>
  string(1) "1"
  ["mbstring.internal_encoding"]=>
  string(0) ""
  ["mbstring.func_overload"]=>
  bool(false)
  ["PHP_VERSION"]=>
  string(5) "8.2.8"
}

Therefore, zend.script_encoding and zend.multibyte are very important.

Regards
Yuya

--
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
actually scratch that, run ``` var_dump(array( "biao_hex" => bin2hex("表"), "zend.multibyte" => ini_get("zend.multibyte"), "zend.script_encoding" => ini_get("zend.script_encoding"), "zend.detect_unicode" => ini_get("zend.detect_unicode"), "mbstring.internal_encoding" => ini_get("mbstring.internal_encoding"), "mbstring.func_overload" => ini_get("mbstring.func_overload"), "PHP_VERSION" => PHP_VERSION, "raw_script_bytes" => bin2hex(file_get_contents(__FILE__)), )); ``` what do you get? On Wed, 29 Nov 2023 at 01:04, Hans Henrik Bergan wrote: > > Do you have access to a project actually using Shift_JIS? Interesting! > I thought they were practically unicorns / non-existent running PHP4, > > Can you run > ``` > var_dump(array( > "biao_hex" => bin2hex("表"), > "zend.multibyte" => ini_get("zend.multibyte"), > "zend.script_encoding" => ini_get("zend.script_encoding"), > "zend.detect_unicode" => ini_get("zend.detect_unicode"), > "mbstring.internal_encoding" => ini_get("mbstring.internal_encoding"), > "mbstring.func_overload" => ini_get("mbstring.func_overload"), > "PHP_VERSION" => PHP_VERSION, > )); > ``` > there? What do you get? > > On Wed, 29 Nov 2023 at 00:47, youkidearitai wrote: > > > > 2023年11月29日(水) 8:07 Hans Henrik Bergan : > > > > > > @youkidearitai right now the code specifically deals with > > > - UTF8: removing UTF8 BOM and removing `declare(encoding='UTF-8'); > > > - UTF16LE/UTF16BE/UTF32LE/UTF32BE: converting to UTF8 removing the BOM > > > and removing declare(encoding='...') > > > - ISO-8859-1: converting to UTF-8 and removing > > > declare(encoding='ISO-8859-1'), i couldn't really find information on > > > a ISO-8859-1 BOM, so to the best of my knowledge it does not exist > > > > > > it does not deal with any other encodings as of writing, but more can > > > be added if needed. > > > > > > > Hi, Hans > > > > I see. I understand the argument. > > At least, Japanese character encoding seems not using declare(encoding=...). 
> > > > Probably, we use zend_encoding implicitly. > > If delete zend_encoding, In SJIS (Shift_JIS) probably will occur 5c problem. > > > > For example is below: > > > > $val = "表"; // 表 is 0x955c, script see 0x5c22, therefore, Throw on Parse > > Error > > > > Please see about 5c problem > > https://blog.kano.ac/archive/posts/1654_5c-problem/ > > > > I would like to maintain backwards compatibility. zend_encoding seems > > can't delete. > > > > Regards > > Yuya > > > > -- > > --- > > Yuya Hamada (tekimen) > > - https://tekitoh-memdhoi.info > > - https://github.com/youkidearitai > > - > > > > -- > > PHP Internals - PHP Runtime Development Mailing List > > To unsubscribe, visit: https://www.php.net/unsub.php > > -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
Do you have access to a project actually using Shift_JIS? Interesting! I thought they were practically unicorns / non-existent running PHP4, Can you run ``` var_dump(array( "biao_hex" => bin2hex("表"), "zend.multibyte" => ini_get("zend.multibyte"), "zend.script_encoding" => ini_get("zend.script_encoding"), "zend.detect_unicode" => ini_get("zend.detect_unicode"), "mbstring.internal_encoding" => ini_get("mbstring.internal_encoding"), "mbstring.func_overload" => ini_get("mbstring.func_overload"), "PHP_VERSION" => PHP_VERSION, )); ``` there? What do you get? On Wed, 29 Nov 2023 at 00:47, youkidearitai wrote: > > 2023年11月29日(水) 8:07 Hans Henrik Bergan : > > > > @youkidearitai right now the code specifically deals with > > - UTF8: removing UTF8 BOM and removing `declare(encoding='UTF-8'); > > - UTF16LE/UTF16BE/UTF32LE/UTF32BE: converting to UTF8 removing the BOM > > and removing declare(encoding='...') > > - ISO-8859-1: converting to UTF-8 and removing > > declare(encoding='ISO-8859-1'), i couldn't really find information on > > a ISO-8859-1 BOM, so to the best of my knowledge it does not exist > > > > it does not deal with any other encodings as of writing, but more can > > be added if needed. > > > > Hi, Hans > > I see. I understand the argument. > At least, Japanese character encoding seems not using declare(encoding=...). > > Probably, we use zend_encoding implicitly. > If delete zend_encoding, In SJIS (Shift_JIS) probably will occur 5c problem. > > For example is below: > > $val = "表"; // 表 is 0x955c, script see 0x5c22, therefore, Throw on Parse Error > > Please see about 5c problem > https://blog.kano.ac/archive/posts/1654_5c-problem/ > > I would like to maintain backwards compatibility. zend_encoding seems > can't delete. 
> > Regards > Yuya > > -- > --- > Yuya Hamada (tekimen) > - https://tekitoh-memdhoi.info > - https://github.com/youkidearitai > - > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
On Wed, 29 Nov 2023 at 8:07, Hans Henrik Bergan wrote:
> @youkidearitai right now the code specifically deals with
> - UTF8: removing UTF8 BOM and removing `declare(encoding='UTF-8');`
> - UTF16LE/UTF16BE/UTF32LE/UTF32BE: converting to UTF8 removing the BOM and removing declare(encoding='...')
> - ISO-8859-1: converting to UTF-8 and removing declare(encoding='ISO-8859-1'), i couldn't really find information on a ISO-8859-1 BOM, so to the best of my knowledge it does not exist
>
> it does not deal with any other encodings as of writing, but more can be added if needed.

Hi, Hans

I see. I understand the argument.
At least, code written in Japanese character encodings does not seem to use declare(encoding=...).

We probably use zend_encoding implicitly.
If zend_encoding is removed, the 5c problem will probably occur in SJIS (Shift_JIS).

For example:

$val = "表"; // 表 is 0x955c in SJIS; the parser sees 0x5c22 (backslash, double quote), so a parse error is thrown

For details on the 5c problem, see
https://blog.kano.ac/archive/posts/1654_5c-problem/

I would like to maintain backwards compatibility. It seems zend_encoding cannot be removed.

Regards
Yuya

--
Yuya Hamada (tekimen)
- https://tekitoh-memdhoi.info
- https://github.com/youkidearitai
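
The byte values mentioned above can be checked with a small sketch. It assumes the mbstring extension is loaded and uses the `"\u{8868}"` escape so the result does not depend on how the example file itself is saved. The variable names are only illustrative.

```php
<?php
// "\u{8868}" is 表 encoded as UTF-8, regardless of this file's own encoding.
$utf8 = "\u{8868}";
$sjis = mb_convert_encoding($utf8, 'SJIS', 'UTF-8');

$utf8Hex = bin2hex($utf8); // bytes of the UTF-8 form
$sjisHex = bin2hex($sjis); // bytes of the Shift_JIS form

// The second (trail) byte of the Shift_JIS form is the ASCII backslash.
$trailIsBackslash = ($sjis[1] === '\\');

// Bytes a byte-oriented parser would see for the literal "表" in an SJIS file:
// quote, 0x95, backslash, quote. The backslash escapes the closing quote.
$literalHex = bin2hex('"' . $sjis . '"');

echo $utf8Hex, ' ', $sjisHex, ' ', $literalHex, "\n";
```

The two hex strings correspond to the `biao_hex` values reported elsewhere in this thread for the runs without and with `internal_encoding=sjis`.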
Re: [PHP-DEV] Reproducible Builds
On 28 November 2023 17:28:18 GMT, Sebastian Bergmann wrote:
>While we could probably replace __DATE__ and __TIME__ with SOURCE_DATE_EPOCH [3] [4], I cannot help but wonder whether having the date and time when the executable was built in the executable is actually useful. How attached are we to having the date and time of the build in the output of phpinfo(), "php -i", etc.?

It is really useful for the development versions of PHP. Knowing whether you are running a PHP-dev from last week or last month is important. For released versions, not so much.

cheers
Derick
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
@youkidearitai right now the code specifically deals with - UTF8: removing UTF8 BOM and removing `declare(encoding='UTF-8'); - UTF16LE/UTF16BE/UTF32LE/UTF32BE: converting to UTF8 removing the BOM and removing declare(encoding='...') - ISO-8859-1: converting to UTF-8 and removing declare(encoding='ISO-8859-1'), i couldn't really find information on a ISO-8859-1 BOM, so to the best of my knowledge it does not exist it does not deal with any other encodings as of writing, but more can be added if needed. On Tue, 28 Nov 2023 at 23:58, youkidearitai wrote: > > 2023年11月29日(水) 7:41 Hans Henrik Bergan : > > > > btw if we come to some consensus to my php2utf8.php script is actually > > worthwhile to expand on, i can volunteer to add more encodings (SJIS, > > BIG5, anything supported by mbstring), > > but it wouldn't surprise me if a better approach exist and the script > > should be rewritten entirely~ > > > > >add that what's special about UTF-8 isn't that it's "fixed-endian". > > > > should've added this to the last post, but the "zend.detect_unicode" > > ini-option is specifically to scan for BOMs, and BOMs are > > significantly less useful in fixed-endian encodings (like UTF8) than > > bi-endian encodings (like UTF16/UTF32) ^^ > > > > On Tue, 28 Nov 2023 at 21:47, Hans Henrik Bergan > > wrote: > > > > > > > What is the migration path for legacy code that use those directives? > > > > > > The migration path is to convert the legacy-encoding PHP files to UTF-8. > > > Luckily this can be largely automated, here is my attempt: > > > https://github.com/divinity76/php2utf8/blob/main/src/php2utf8.php > > > but that code definitely needs some proof-reading and additions - idk > > > if the approach used is even a good approach, it was just the first i > > > could think of, feel free to write one from scratch > > > > > > > > > >Can you share a little more details about how this works? 
> > > > > > I hope someone else can do that, but it allows PHP to parse and > > > execute scripts not written in UTF-8 and scripts utilizing > > > BOM/byte-order-masks. > > > > > > >add that what's special about UTF-8 isn't that it's "fixed-endian". > > > > > > one of multiple good things about UTF-8 is that it's fixed-endian, and > > > UTF8 don't need a BOM to specify endianess (unlike UTF16 and UTF32 > > > which are bi-endian, and a BOM helps identify endianess used~) > > > > > > >If the solution is as easy as just converting the encoding of the > > > source file, then why did we even need to have this setting at all? > > > Why did PHP parser support encodings that demanded the introduction of > > > > > > I've read your question but don't have an answer to it, hopefully > > > someone else knows. > > > > > > > > > On Tue, 28 Nov 2023 at 21:09, Claude Pache wrote: > > > > > > > > > > > > > > > > > Le 28 nov. 2023 à 20:56, Kamil Tekiela a écrit > > > > > : > > > > > > > > > >> Convert your PHP source files to UTF-8. > > > > > > > > > > If the solution is as easy as just converting the encoding of the > > > > > source file, then why did we even need to have this setting at all? > > > > > Why did PHP parser support encodings that demanded the introduction of > > > > > this declare? > > > > > > > > It is not necessary as simple: because your code base may contain > > > > literal strings, and changing the encoding of the source file will > > > > effectively change the contents of the strings. > > > > > > > > —Claude > > > > > > > > -- > > PHP Internals - PHP Runtime Development Mailing List > > To unsubscribe, visit: https://www.php.net/unsub.php > > > > Hi, Hans > > Is this convert PHP code from any encoding to UTF-8? > If correct, PHP code is coded various character encoding, > It is very difficult. > This is because it is not necessarily implemented in UTF-8. > > In the world, we have many character encoding. > PHP code will be difficult to unify. 
> > Regards > Yuya > > -- > --- > Yuya Hamada (tekimen) > - https://tekitoh-memdhoi.info > - https://github.com/youkidearitai > - > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
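
The BOM handling described at the top of this message could look roughly like the sketch below. It is not taken from php2utf8.php: `detectBom` and `toUtf8WithoutBom` are hypothetical helpers, the mbstring extension is assumed, and rewriting any `declare(encoding=...)` statement is left out.

```php
<?php
// Hypothetical helper: return [encoding, BOM length] or null when no BOM is present.
function detectBom(string $bytes): ?array
{
    // Longer signatures first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    $boms = [
        'UTF-32BE' => "\x00\x00\xFE\xFF",
        'UTF-32LE' => "\xFF\xFE\x00\x00",
        'UTF-8'    => "\xEF\xBB\xBF",
        'UTF-16BE' => "\xFE\xFF",
        'UTF-16LE' => "\xFF\xFE",
    ];
    foreach ($boms as $encoding => $bom) {
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            return [$encoding, strlen($bom)];
        }
    }
    return null; // 8-bit encodings such as ISO-8859-1 have no BOM
}

// Hypothetical helper: strip the BOM and convert the remainder to UTF-8.
function toUtf8WithoutBom(string $bytes): string
{
    $hit = detectBom($bytes);
    if ($hit === null) {
        return $bytes; // encoding cannot be inferred here; left untouched
    }
    [$encoding, $bomLength] = $hit;
    $body = substr($bytes, $bomLength);
    return $encoding === 'UTF-8' ? $body : mb_convert_encoding($body, 'UTF-8', $encoding);
}
```

Files without a BOM still need their encoding supplied from elsewhere, which is where the concern about the many encodings in use comes in.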
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
2023年11月29日(水) 7:41 Hans Henrik Bergan : > > btw if we come to some consensus to my php2utf8.php script is actually > worthwhile to expand on, i can volunteer to add more encodings (SJIS, > BIG5, anything supported by mbstring), > but it wouldn't surprise me if a better approach exist and the script > should be rewritten entirely~ > > >add that what's special about UTF-8 isn't that it's "fixed-endian". > > should've added this to the last post, but the "zend.detect_unicode" > ini-option is specifically to scan for BOMs, and BOMs are > significantly less useful in fixed-endian encodings (like UTF8) than > bi-endian encodings (like UTF16/UTF32) ^^ > > On Tue, 28 Nov 2023 at 21:47, Hans Henrik Bergan wrote: > > > > > What is the migration path for legacy code that use those directives? > > > > The migration path is to convert the legacy-encoding PHP files to UTF-8. > > Luckily this can be largely automated, here is my attempt: > > https://github.com/divinity76/php2utf8/blob/main/src/php2utf8.php > > but that code definitely needs some proof-reading and additions - idk > > if the approach used is even a good approach, it was just the first i > > could think of, feel free to write one from scratch > > > > > > >Can you share a little more details about how this works? > > > > I hope someone else can do that, but it allows PHP to parse and > > execute scripts not written in UTF-8 and scripts utilizing > > BOM/byte-order-masks. > > > > >add that what's special about UTF-8 isn't that it's "fixed-endian". > > > > one of multiple good things about UTF-8 is that it's fixed-endian, and > > UTF8 don't need a BOM to specify endianess (unlike UTF16 and UTF32 > > which are bi-endian, and a BOM helps identify endianess used~) > > > > >If the solution is as easy as just converting the encoding of the > > source file, then why did we even need to have this setting at all? 
> > Why did PHP parser support encodings that demanded the introduction of > > > > I've read your question but don't have an answer to it, hopefully > > someone else knows. > > > > > > On Tue, 28 Nov 2023 at 21:09, Claude Pache wrote: > > > > > > > > > > > > > Le 28 nov. 2023 à 20:56, Kamil Tekiela a écrit : > > > > > > > >> Convert your PHP source files to UTF-8. > > > > > > > > If the solution is as easy as just converting the encoding of the > > > > source file, then why did we even need to have this setting at all? > > > > Why did PHP parser support encodings that demanded the introduction of > > > > this declare? > > > > > > It is not necessary as simple: because your code base may contain literal > > > strings, and changing the encoding of the source file will effectively > > > change the contents of the strings. > > > > > > —Claude > > > > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > Hi, Hans Is this convert PHP code from any encoding to UTF-8? If correct, PHP code is coded various character encoding, It is very difficult. This is because it is not necessarily implemented in UTF-8. In the world, we have many character encoding. PHP code will be difficult to unify. Regards Yuya -- --- Yuya Hamada (tekimen) - https://tekitoh-memdhoi.info - https://github.com/youkidearitai - -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
On Tue, Nov 28, 2023 at 12:48 PM Hans Henrik Bergan wrote:
> >If the solution is as easy as just converting the encoding of the source file, then why did we even need to have this setting at all? Why did PHP parser support encodings that demanded the introduction of
>
> I've read your question but don't have an answer to it, hopefully someone else knows.

These settings predate the ubiquity of UTF-8, which did not begin to see widespread adoption until the mid-to-late 2000s, and did not reach ubiquity until the mid-2010s:
https://en.wikipedia.org/wiki/Popularity_of_text_encodings

mbstring.script_encoding was introduced with this commit and released in PHP 4.3 (renamed to zend.script_encoding in PHP 5.4):
https://github.com/php/php-src/commit/f30b722f14521fbad2fabe5fdcaa2b60fe97eebb

zend.detect_unicode was introduced in this commit, released with PHP 5.1:
https://github.com/php/php-src/commit/a8c6b992b8894763c59276c1142971aa9a314500

zend.multibyte was introduced with this commit, released with PHP 5.4:
https://github.com/php/php-src/commit/ab93d8c621645e05d6a6a431d52ac64eda956673

declare(encoding) appears to predate all of the PHP 4.0 tagged releases, including the pre-release ones.

- Mark Trapp
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
By the way, if we reach a consensus that my php2utf8.php script is actually worth expanding on, I can volunteer to add more encodings (SJIS, BIG5, anything supported by mbstring), but it wouldn't surprise me if a better approach exists and the script should be rewritten entirely.

> add that what's special about UTF-8 isn't that it's "fixed-endian".

I should have added this to my last post: the "zend.detect_unicode" ini option exists specifically to scan for BOMs, and BOMs are significantly less useful in fixed-endian encodings (like UTF-8) than in bi-endian encodings (like UTF-16/UTF-32).

On Tue, 28 Nov 2023 at 21:47, Hans Henrik Bergan wrote:
> > What is the migration path for legacy code that use those directives?
>
> The migration path is to convert the legacy-encoding PHP files to UTF-8.
> Luckily this can be largely automated, here is my attempt:
> https://github.com/divinity76/php2utf8/blob/main/src/php2utf8.php
>
> *snip*

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
> What is the migration path for legacy code that use those directives?

The migration path is to convert the legacy-encoding PHP files to UTF-8. Luckily this can be largely automated; here is my attempt:
https://github.com/divinity76/php2utf8/blob/main/src/php2utf8.php
That code definitely needs some proofreading and additions, though. I don't know whether the approach used is even a good one; it was just the first I could think of. Feel free to write one from scratch.

> Can you share a little more details about how this works?

I hope someone else can do that, but it allows PHP to parse and execute scripts not written in UTF-8 and scripts that start with a BOM (byte order mark).

> add that what's special about UTF-8 isn't that it's "fixed-endian".

One of multiple good things about UTF-8 is that it's fixed-endian, so UTF-8 doesn't need a BOM to specify endianness (unlike UTF-16 and UTF-32, which are bi-endian, and where a BOM helps identify the endianness used).

> If the solution is as easy as just converting the encoding of the
> source file, then why did we even need to have this setting at all?
> Why did PHP parser support encodings that demanded the introduction of
> this declare?

I've read your question but don't have an answer to it; hopefully someone else knows.

On Tue, 28 Nov 2023 at 21:09, Claude Pache wrote:
> > On 28 Nov 2023, at 20:56, Kamil Tekiela wrote:
> >
> >> Convert your PHP source files to UTF-8.
> >
> > If the solution is as easy as just converting the encoding of the
> > source file, then why did we even need to have this setting at all?
> > Why did PHP parser support encodings that demanded the introduction of
> > this declare?
>
> It is not necessarily that simple: your code base may contain literal
> strings, and changing the encoding of the source file will effectively change
> the contents of the strings.
>
> —Claude
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
> On 28 Nov 2023, at 20:56, Kamil Tekiela wrote:
>
>> Convert your PHP source files to UTF-8.
>
> If the solution is as easy as just converting the encoding of the
> source file, then why did we even need to have this setting at all?
> Why did PHP parser support encodings that demanded the introduction of
> this declare?

It is not necessarily that simple: your code base may contain literal strings, and changing the encoding of the source file will effectively change the contents of the strings.

—Claude
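To make the point about literal strings concrete, here is a minimal sketch. It assumes a source file containing the single-character literal "ボ" (U+30DC) and writes out, as explicit byte escapes, what that literal would hold before and after re-encoding the file. The variable names are made up for illustration; the Shift-JIS bytes are the ones quoted elsewhere in this thread.

```php
<?php
// The same one-character literal, spelled as the raw bytes that each
// source-file encoding would store on disk.
$literalInShiftJis = "\x83\x7B";     // Shift-JIS bytes for U+30DC
$literalInUtf8     = "\xE3\x83\x9C"; // UTF-8 bytes for U+30DC

// PHP strings are byte strings, so re-encoding the file changes the value.
var_dump(strlen($literalInShiftJis));              // int(2)
var_dump(strlen($literalInUtf8));                  // int(3)
var_dump($literalInShiftJis === $literalInUtf8);   // bool(false)
```

Anything that compares, hashes, or stores such a literal (array keys, database lookups, output sent to a client expecting the old encoding) would see a different value after conversion, which is why the migration cannot be purely mechanical.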
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
> Convert your PHP source files to UTF-8.

If the solution is as easy as just converting the encoding of the source file, then why did we even need to have this setting at all? Why did the PHP parser support encodings that demanded the introduction of this declare?
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
On Nov 28, 2023, at 11:12, Claude Pache wrote:
> On 28 Nov 2023, at 19:57, Hans Henrik Bergan wrote:
>> With the dominance of UTF-8 (a fixed-endian encoding), surely no new
>> code should utilize any of declare(encoding='...') / zend.multibyte /
>> zend.script_encoding / zend.detect_unicode.
>> I propose we deprecate all 4.
>
> What is the migration path for legacy code that use those directives?

Convert your PHP source files to UTF-8.

These directives are only required for code written in legacy multibyte encodings like Shift-JIS, Big5, or EUC-CN. (These encodings are primarily used for Chinese and Japanese text.)

These directives are not required for scripts which *process* text in these encodings. They're only required if the source code itself is in a legacy multibyte encoding, as those encodings can contain octets in the basic ASCII range (0x20 - 0x7f) within multibyte sequences. For example, the character "ボ" (U+30DC KATAKANA LETTER BO) is encoded in Shift-JIS as 83 7B, whose second octet would ordinarily represent the ASCII character "{". If this character appeared in a variable name, for instance, PHP would need to recognize that the "7B" does not represent an opening brace.

>> With the dominance of UTF-8 (a fixed-endian encoding)

I'll add that what's special about UTF-8 isn't that it's "fixed-endian". It's that UTF-8 only uses octets above 0x7F for characters outside the ASCII range, so the parser doesn't have to be specifically aware of UTF-8 encoding when processing text.
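The byte-level difference described above can be checked with a few lines of PHP. This is only an illustrative sketch: asciiByteOffsets() is a hypothetical helper, not part of PHP, and the two strings are the Shift-JIS and UTF-8 encodings of "ボ" written as explicit byte escapes.

```php
<?php
// Hypothetical helper: returns the offsets of all bytes in the ASCII range
// (0x00-0x7F), i.e. bytes a byte-oriented parser could mistake for syntax.
function asciiByteOffsets(string $bytes): array
{
    $offsets = [];
    for ($i = 0, $n = strlen($bytes); $i < $n; $i++) {
        if (ord($bytes[$i]) <= 0x7F) {
            $offsets[] = $i;
        }
    }
    return $offsets;
}

$boShiftJis = "\x83\x7B";     // "ボ" in Shift-JIS
$boUtf8     = "\xE3\x83\x9C"; // "ボ" in UTF-8

var_dump(strpos($boShiftJis, '{'));     // int(1): the trail byte is 0x7B
var_dump(strpos($boUtf8, '{'));         // bool(false)
var_dump(asciiByteOffsets($boShiftJis)); // one offset: 1
var_dump(asciiByteOffsets($boUtf8));     // empty array
```

In the Shift-JIS form a byte-oriented scanner finds a "{" that is not really there, while the UTF-8 form contains no ASCII-range byte at all.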
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
Hi Hans,

Can you share a little more detail about how this works? This is a pretty niche piece of functionality, so most people probably don't know what it is, how it works, or why it should no longer be used. Also, as Claude mentioned, what is the preferred alternative?

Regards,
Kamil
Re: [PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
> On 28 Nov 2023, at 19:57, Hans Henrik Bergan wrote:
>
> With the dominance of UTF-8 (a fixed-endian encoding), surely no new
> code should utilize any of declare(encoding='...') / zend.multibyte /
> zend.script_encoding / zend.detect_unicode.
> I propose we deprecate all 4.

Hi,

What is the migration path for legacy code that uses those directives?

—Claude
[PHP-DEV] Deprecate declare(encoding='...') + zend.multibyte + zend.script_encoding + zend.detect_unicode ?
With the dominance of UTF-8 (a fixed-endian encoding), surely no new code should utilize any of declare(encoding='...') / zend.multibyte / zend.script_encoding / zend.detect_unicode. I propose we deprecate all 4.
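For anyone wanting to see whether an installation currently relies on the three ini settings named in the proposal, a rough sketch along these lines may help. It only reads the effective values with ini_get(); it cannot detect declare(encoding='...') statements in source files, and the $settings variable is just an illustrative name.

```php
<?php
// Collect the current values of the three ini settings from the proposal.
// ini_get() returns false if a setting does not exist in this build.
$settings = [];
foreach (['zend.multibyte', 'zend.script_encoding', 'zend.detect_unicode'] as $name) {
    $settings[$name] = ini_get($name);
}

// Print each setting with its raw value so empty strings stay visible.
foreach ($settings as $name => $value) {
    printf("%s = %s\n", $name, var_export($value, true));
}
```

A non-empty zend.script_encoding or an enabled zend.multibyte would be a hint that some code on that machine is not stored as UTF-8.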
Re: [PHP-DEV] Reproducible Builds
On Tue, 28 Nov 2023 at 19:40, Ilija Tovilo wrote:
> That said, I wouldn't object to removing the date either.

Wishful thinking, but perhaps a Git ref of some sort would be a good replacement too, if the working copy is clean.

I wouldn't put too much weight on it, but that would certainly help people while jumping across branches, when trying out new RFCs, and it should be stable.

Marco Pivetta
https://mastodon.social/@ocramius
https://ocramius.github.io/
Re: [PHP-DEV] Reproducible Builds
Hi Sebastian

On Tue, Nov 28, 2023 at 6:28 PM Sebastian Bergmann wrote:
> I recently watched a video [1] that once again brought the topic of
> reproducible builds [2] to my attention.
> ...
> I have not yet checked whether usage of the __DATE__ and __TIME__ macros
> is the only thing that makes the compilation of PHP irreproducible, but no
> longer using them would be a good start on the path towards reproducible
> builds.

At least for core, enabled-by-default extensions, __DATE__ and __TIME__ seem to be the only variables. I can get reproducible builds by setting SOURCE_DATE_EPOCH.

> While we could probably replace __DATE__ and __TIME__ with
> SOURCE_DATE_EPOCH [3] [4], ...

Both GCC and Clang support SOURCE_DATE_EPOCH out of the box, setting __DATE__ and __TIME__ accordingly. MSVC (shockingly) does not. However, reproducible builds likely don't matter as much for Windows since we provide the binaries for it.

That said, I wouldn't object to removing the date either.

Ilija
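As an illustration of the SOURCE_DATE_EPOCH convention discussed here (not of how php-src or the compilers implement it), the following sketch shows the usual pattern: use the environment variable as the timestamp when it is set to a plain integer, and fall back to the current time otherwise. buildTimestamp() and formatBuildDate() are hypothetical names chosen for this example.

```php
<?php
// Pick the timestamp to embed in a build artifact: SOURCE_DATE_EPOCH when it
// is set to a non-negative integer, otherwise the current time.
function buildTimestamp(string|false $sourceDateEpoch): int
{
    if ($sourceDateEpoch !== false && preg_match('/^\d+$/', $sourceDateEpoch) === 1) {
        return (int) $sourceDateEpoch;
    }
    return time();
}

// Format in UTC so the result does not depend on the builder's time zone.
function formatBuildDate(int $timestamp): string
{
    return gmdate('Y-m-d H:i:s', $timestamp);
}

echo formatBuildDate(buildTimestamp(getenv('SOURCE_DATE_EPOCH'))), "\n";
```

With the variable set to the same value on two machines, both runs print the same string; without it, the output changes from one run to the next, which is exactly the irreproducibility being discussed.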
[PHP-DEV] Reproducible Builds
I recently watched a video [1] that once again brought the topic of reproducible builds [2] to my attention.

I believe that reproducible builds are becoming more and more important and that the build of the PHP interpreter/runtime should become reproducible.

Right now, compiling the same version of PHP's C sources in the same environment (using the same compiler, against the same dependencies, etc.) produces a different binary every time. "Different" meaning that the built artifacts, the "php" executable for the CLI SAPI, for example, are not bit-by-bit identical.

One obvious reason why this is the case is the fact that we use __DATE__ and __TIME__ in a couple of places. These preprocessor macros are expanded by the C compiler at compile-time to the current date and time. They are used in sapi/cli/php_cli.c, for instance, so that the output of "php -i" contains the date and time when the executable was compiled.

I have not yet checked whether usage of the __DATE__ and __TIME__ macros is the only thing that makes the compilation of PHP irreproducible, but no longer using them would be a good start on the path towards reproducible builds.

While we could probably replace __DATE__ and __TIME__ with SOURCE_DATE_EPOCH [3] [4], I cannot help but wonder whether having the date and time when the executable was built in the executable is actually useful. How attached are we to having the date and time of the build in the output of phpinfo(), "php -i", etc.?

AFAIK, the topic of reproducible builds was brought up in 2017 for the first and, before this email, only time [5]. There was a PR [6] that was merged into PHP 7.1 which introduced the use of SOURCE_DATE_EPOCH to define PHP_BUILD_DATE in configure.ac. Today, when I grep for SOURCE_DATE_EPOCH on the master branch, I do not find any usage of SOURCE_DATE_EPOCH anymore. Or PHP_BUILD_DATE, for that matter.

--
[1] https://media.ccc.de/v/camp2023-57236-reproducible_builds_the_first_ten_years
[2] https://reproducible-builds.org/
[3] https://reproducible-builds.org/specs/source-date-epoch/
[4] https://reproducible-builds.org/docs/source-date-epoch/
[5] https://externals.io/message/101327#101327
[6] https://github.com/php/php-src/pull/2965
Re: [PHP-DEV] Callable arguments cannot have default value
On 28/11/2023 09:54, Claude Pache wrote:
> The big problem with the `callable` type, is that it can be checked only at runtime. For instance:
>
> ```php
> function foo(callable $x) { }
> foo('strlen'); // ok
> foo('i_dont_exist'); // throws a TypeError
> ```

To expand on this example, and address the original question more explicitly, consider if we allowed this:

    function foo(callable $x = 'maybe_exists') { }

To decide whether that's a valid definition, the compiler needs to know whether 'maybe_exists' can be resolved to the name of a global function; but it might be defined in a different file, which hasn't been included yet (or, more generally, which isn't being compiled right now).

To allow the default, the engine would need to defer the validity check until the function is actually executed. This is how "new in initializers" works [https://wiki.php.net/rfc/new_in_initializers] and we can actually use that feature to implement a default for callable parameters:

```php
class WrappedCallable
{
    // Note: can't declare callable as the property type, but can as an explicit constructor parameter
    private $callable;

    public function __construct(callable $callable)
    {
        $this->callable = $callable;
    }

    public function __invoke(...$args)
    {
        return ($this->callable)(...$args);
    }
}

function test(callable $f = new WrappedCallable('strlen'))
{
    echo $f('hello');
}

test();
```

Using this wrapper, we can pass in any value which is itself valid in an initializer, including callables specified as 'funcname' or ['class', 'staticmethod'].

The trick is that we're not actually evaluating that value as a callable until we invoke test(), at which point the constructor of WrappedCallable performs the assertion that it's actually callable. So this compiles:

    function test(callable $f = new WrappedCallable('i_dont_exist'))
    {
        echo $f('hello');
    }

But will then error at run-time, *unless* a global function called i_dont_exist has been defined before that call.

It seems like it would be feasible for the engine to do something similar natively, creating an equivalent of WrappedCallable('i_dont_exist') using the first-class callable syntax:

    function test(callable $f = i_dont_exist(...))
    {
        echo $f('hello');
    }

Regards,

--
Rowan Tommins
[IMSoP]
Re: [PHP-DEV] Callable arguments cannot have default value
> On 28 Nov 2023, at 00:59, Sergii Shymko wrote:
>
> Hi,
>
> Wanted to bring up an inconsistent behavior of callable arguments compared to
> arguments of other types.
> Callable argument cannot have a default value (tested string or array types -
> both are not permitted).
> The same exact value works perfectly fine when passed dynamically, it just
> cannot be specified as a default.
> The workaround is to remove the type annotation which is obviously
> undesirable.
>
> Here’s an example:
> declare(strict_types=1);
> function test(callable $idGenerator = 'session_create_id') {
>     $id = $idGenerator();
>     // ...
> }
>
> The function/method declaration above produces the following error on all PHP
> versions:
> Fatal error: Cannot use string as default value for parameter $idGenerator of
> type callable in /tmp/preview on line 4
>
> Note that the exact same string argument can be passed without any issue:
> function test(callable $idGenerator) {…}
> test('session_create_id');
>
> Is there a specific architectural limitation causing this that's
> hard/impossible to overcome?
>
> I’m aware that class properties cannot be annotated with callable - another
> unfortunate limitation.
> Callable is not a real type like other primitive types which causes all these
> inconsistencies, correct?
> Callable properties (separate topic) may be a challenge, but can at least
> argument defaults be supported?
>
> Regards,
> Sergii Shymko

Hi Sergii,

The big problem with the `callable` type is that it can be checked only at runtime. For instance:

```php
function foo(callable $x) { }
foo('strlen'); // ok
foo('i_dont_exist'); // throws a TypeError
```

Another complication is that a value of the form `[ $class, $protected_or_private_method ]` may or may not be callable depending on whether the method is visible from the current scope. In other words, contrary to all other types, `callable` depends both on runtime state and on context.

Therefore, an argument of type `callable` cannot have a default value, because it is not known in advance whether the default value will be valid when used.

For the case of class properties, see https://wiki.php.net/rfc/typed_properties_v2#supported_types

—Claude
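The dependence on calling scope can be seen with is_callable() directly. The following is a small sketch with a made-up class (Vault and its methods are illustrative names only): the same array value is callable when checked from inside the class and not callable when checked from outside.

```php
<?php
class Vault
{
    private function secret(): string
    {
        return 'hidden';
    }

    // Checks the private method from inside the class scope.
    public function callableFromInside(): bool
    {
        return is_callable([$this, 'secret']);
    }
}

$vault = new Vault();

// The same [$object, 'method'] pair, checked from two different scopes.
$fromOutside = is_callable([$vault, 'secret']); // false: method is private here
$fromInside  = $vault->callableFromInside();    // true: visible inside Vault

var_dump($fromOutside, $fromInside);
```

A default value of this form would therefore be valid or invalid depending on where the function ends up being called, which a declaration-time check cannot know.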
Re: [PHP-DEV] Callable arguments cannot have default value
On Tue, Nov 28, 2023 at 12:59 AM Sergii Shymko wrote:
>
> Hi,
>
> Wanted to bring up an inconsistent behavior of callable arguments compared to
> arguments of other types.
> Callable argument cannot have a default value (tested string or array types -
> both are not permitted).
>
> *snip*

I stopped using "callable" a long time ago. These days I use \Closure and it works in all the same places (including properties). If you want to accept a callable string, you need to change the type to \Closure|string and verify it with `is_callable()`.

> Is there a specific architectural limitation causing this that's
> hard/impossible to overcome?

IIRC, default arguments must be compile-time constant, and this isn't, apparently:

    session_create_id(...)

You can also do something like this:

    function hello()
    {
        echo "hi\n";
    }

    class wrapper
    {
        public function __construct(public \Closure|string $closure)
        {
            is_callable($closure) ?: throw new InvalidArgumentException('closure must be callable');
        }

        public function __invoke()
        {
            return ($this->closure)();
        }
    }

    function test(wrapper|Closure $closure = new wrapper('hello'))
    {
        ($closure)();
    }

    test();
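The \Closure|string variant mentioned above, without a wrapper class, might look roughly like this. It is a sketch only: generateId() is a made-up function, and 'uniqid' stands in for 'session_create_id' so the example does not depend on the session extension.

```php
<?php
declare(strict_types=1);

// Accept either a Closure or a function name; a plain string is allowed as a
// default value, and the is_callable() check is deferred to call time.
function generateId(\Closure|string $idGenerator = 'uniqid'): string
{
    if (!is_callable($idGenerator)) {
        throw new InvalidArgumentException('idGenerator must be callable');
    }
    return (string) $idGenerator();
}

echo generateId(), "\n";                   // uses the string default
echo generateId(fn() => 'custom-id'), "\n"; // prints "custom-id"
```

The trade-off is that the signature no longer says "callable", so static analysers and readers only learn about the requirement from the runtime check.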