Re: [PHP-DEV] [RFC] User Defined Operator Overloads (v0.6)
On Tue, Dec 21, 2021, 6:37 AM Andreas Hennings wrote:

> In a class Matrix, you might want to implement three variations of the
> * operator:
> - Matrix * Matrix = Matrix.
> - Matrix * float = Matrix.
> - Matrix * Vector = Vector.
> Same for other classes and operators:
> - Money / float = Money
> - Money / Money = float
> - Distance * Distance = Area
> - Distance * float = Distance

These are bad examples and a nightmare to maintain, I think even more so with loosely typed languages. Matrix * float is better implemented as a method here.
Re: [PHP-DEV] Re: PHP-FPM process management woes
As Jakub mentioned, exactly one MINIT/MSHUTDOWN per process will only fix one of your problems, but I do like the step in that direction. I express my sympathy and condolences: I've had my own share of woes with the fact that env var and INI changes are not complete during MINIT due to fcgi protocol/php-fpm. It's very annoying for extensions! I always install all hooks that might be needed, and have runtime checks if they are actually enabled. Not ideal, and sadly one MINIT/MSHUTDOWN will not fix it. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
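The "install every hook up front, check at runtime" approach mentioned above can be sketched in plain C. This is only an illustration under stated assumptions, not code from any real extension: `engine_error_cb`, `ext_minit`, `ext_error_hook` and `mode_enabled` are hypothetical stand-ins for the engine's callback slot, the module init function, the extension's handler and the INI-backed setting.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an engine-owned error callback slot. */
typedef void (*error_cb_t)(const char *msg);

static error_cb_t engine_error_cb = NULL;   /* slot the engine calls on errors */
static error_cb_t previous_error_cb = NULL; /* handler that was there before ours */
static bool mode_enabled = false;           /* may change after MINIT (pool config, FCGI env) */
static int handled_by_extension = 0;
static int handled_by_default = 0;

/* Stand-in for the engine's built-in handler. */
void default_error_cb(const char *msg)
{
    (void)msg;
    handled_by_default++;
}

/* Always installed; decides at call time whether the extension is active. */
void ext_error_hook(const char *msg)
{
    if (!mode_enabled) {
        /* Feature is off right now: behave as if we were never installed. */
        if (previous_error_cb != NULL) {
            previous_error_cb(msg);
        }
        return;
    }
    (void)msg;
    handled_by_extension++;
}

/* MINIT analogue: installs the hook unconditionally, whatever the setting says. */
void ext_minit(void)
{
    previous_error_cb = engine_error_cb;
    engine_error_cb = ext_error_hook;
}
```

Because the hook is installed regardless of the setting, a later change to `mode_enabled` takes effect on the next call without any further initialisation. The cost is one extra branch on every call, which matches the "not ideal" caveat above.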
Re: [PHP-DEV] [RFC] User Defined Operator Overloads (v0.6)
On Mon, Dec 20, 2021 at 4:43 PM Andreas Hennings wrote:

> > The exact position of where that trade-off is 'worth it' is going to
> > be different for different people. But one of the areas where PHP is
> > 'losing ground' to Python is how Python is better at processing data
> > with maths, and part of that is how even trivial things, such as
> > complex numbers, are quite difficult to implement and/or use in
> > userland PHP.
>
> Could be interesting to look for examples in Python.
> I was not lucky so far, but there must be something..
> ...
> Btw it would be really interesting to find such a list of
> recommendations for Python.
> The reference you added,
> https://isocpp.org/wiki/faq/operator-overloading#op-ov-rules, is for
> C++, which is less comparable to PHP than Python is.

During my research phase of this RFC I was able to review many different takes on this from the Python space. Here is one example of a community discussion about it:

https://stackoverflow.com/questions/1552260/rules-of-thumb-for-when-to-use-operator-overloading-in-python

One of the interesting things here is that *most* of the warnings for Python users discussed at the link are actually *designed* to not be issues within this RFC. That's on purpose, of course: I tried to think about how some of the design issues in Python could be improved.

In Python, the + and += operators are implemented independently (Python has no ++ operator). In this RFC, you may only overload the + operator, and then the VM handles the appropriate surrounding logic for the other operators. For instance, with ++$obj or $obj++, you want to return either a reference or a copy, depending on whether it's a pre- or post-increment. In this RFC, the VM decides automatically when the ZVAL is returned, and a subordinate call to the opcode for + is made when appropriate. The reassignment += works similarly, with the ZVALs assigned automatically and a subordinate call made. This vastly reduces the surface for inconsistency.
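As a rough illustration of the paragraph above, the idea of deriving `+=` and the increments from a single user-supplied `+` can be sketched in C. This is not the RFC's implementation: `Money`, `money_add` and the `engine_*` helpers are hypothetical names, and treating an increment as "add one" is an assumption of this sketch.

```c
/* Hypothetical value type with exactly one user-supplied operation. */
typedef struct {
    long cents;
} Money;

/* The only "overload" the user writes: a + b, leaving both operands untouched. */
Money money_add(Money a, Money b)
{
    Money r = { a.cents + b.cents };
    return r;
}

/* Everything below is engine-side logic derived from money_add. */

/* $a += $b : compute a + b, then reassign the variable. */
void engine_assign_add(Money *a, Money b)
{
    *a = money_add(*a, b);
}

/* ++$a : add one, reassign, and yield the new value. */
Money engine_pre_inc(Money *a)
{
    Money one = { 1 };
    *a = money_add(*a, one);
    return *a;
}

/* $a++ : remember the old value, add one, reassign, and yield the old value. */
Money engine_post_inc(Money *a)
{
    Money old = *a;
    Money one = { 1 };
    *a = money_add(*a, one);
    return old;
}
```

Since all three derived forms route through `money_add`, they cannot disagree with `+`, which is the consistency argument made above.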
Another warning that is discussed is around overloading the == operator. A big reason for this is that the Python overloads do NOT require the == overload to return a particular type. Because of this, overloading the == operator can result in situations in Python where it is difficult to compare objects for equality. However, in this RFC the == operator can only be overloaded to return a boolean, so the semantic meaning of the operator remains the same. Though you could of course do something terrible and mutate the object during an equality comparison, you must return a boolean value, ensuring that the operator cannot be co-opted for other purposes easily. Additionally, the != and == cannot be independently implemented in this RFC, but can in Python.

In Python the inequality operators can be implemented independently: >, >=, <=, <. They *also* are not required to return a boolean value. In this RFC, independent overloads for the different comparisons are not provided. Instead, you must implement the <=> operator and return an int. Further, the int value you return is normalized to -1, 0, 1 within the engine. This ensures that someone could not repurpose the > operator to pull something out of a queue, for instance. (They could still repurpose >> to do so, but since the shift left and shift right operators are not often used in the context of boolean algebra in PHP, that's far less dangerous.) A future scope that I plan on working on is actually having an Ordering enum that must be returned by the <=> overload instead, which even more explicitly defines what sorts of states can be returned from this overload.

Many years of language design experience have gone into operator overloading across various languages. I wanted to at least try to take advantage of all this experience when writing this RFC. It's why I say that PHP will end up with the most restrictive operator overloads of any language I'm aware of.
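The normalization of the <=> result described above can be sketched in C. The names here are hypothetical (`user_compare` stands in for a user-written <=> overload, the `engine_*` functions for what the VM would derive from it), and the sketch says nothing about how the engine actually implements this.

```c
#include <stdbool.h>

/* Hypothetical user-written <=> overload: free to return any int. */
int user_compare(int a, int b)
{
    if (a < b) {
        return -42;
    }
    if (a > b) {
        return 7;
    }
    return 0;
}

/* Engine-side normalization: collapse any int to exactly -1, 0 or 1. */
int normalize_cmp(int raw)
{
    return (raw > 0) - (raw < 0);
}

/* What callers of <=> actually observe. */
int engine_spaceship(int a, int b)
{
    return normalize_cmp(user_compare(a, b));
}

/* The ordering comparisons are all derived from the normalized result,
 * so they always yield a boolean and always agree with each other. */
bool engine_lt(int a, int b) { return engine_spaceship(a, b) < 0; }
bool engine_le(int a, int b) { return engine_spaceship(a, b) <= 0; }
bool engine_gt(int a, int b) { return engine_spaceship(a, b) > 0; }
bool engine_ge(int a, int b) { return engine_spaceship(a, b) >= 0; }
```

Whatever magnitude `user_compare` returns, the caller can only ever see one of three states, so the return value cannot be used to smuggle out other data.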
There will still be pain points (returning union types is not an easy thing to eliminate without full compile time type resolution), but as far as buggy or problematic code, there's a lot about this RFC that works to prevent it. A determined programmer can still create problems, but I find this (personally) an uncompelling argument against the feature.

There are *many* features in PHP that a determined programmer can create problems with. The __get, __set, __call, and __callStatic magic methods can actually allow you to overload the *assignment* operator for certain contexts. The __toString magic method can already be used to mutate the object through a simple concatenation. The ArrayAccess interface forces you to deal with the union of *all* types (mixed), even when that doesn't make sense. And these are just the PHP features that in some way already interact with operators and objects in special circumstances.

Jordan
Re: [PHP-DEV] [RFC] User Defined Operator Overloads (v0.6)
On Tue, 21 Dec 2021 at 00:03, Dan Ackroyd wrote:
>
> On Fri, 17 Dec 2021 at 18:36, Stanislav Malyshev wrote:
> >
> > When reading
> > this code: $foo * $bar - how do I know which of the ways you took and
> > where should I look for the code that is responsible for it? When I see
> > $foo->times($bar) it's clear who's in charge and where I find the code.
> > Terse code is nice but not at the expense of making it write-only.
>
> Well, there's only two places to look with operator overloads, but
> yes you're right, using operator overloads for single operation is not
> a good example of how they make code easier to read. The more
> complicated example from the introduction to the RFC
> https://wiki.php.net/rfc/user_defined_operator_overloads#introduction
> shows how they make complex maths easier to read.

I think the example in the RFC is interesting, but not ideal to advertise the RFC. The example is with native scalar types and built-in operator implementations. (I don't know how GMP works internally, but for an average user of PHP it does not make sense to call this "overloaded".)

In fact, if we add overloaded operators as in the RFC, the example becomes less easy to read, because now we can no longer be sure by just looking at the snippet:
- Are those variables scalar values, or objects?
- Are the operators using the built-in implementation or some custom overloaded implementation? (depends on the operand types)
- Are the return values or intermediate values scalars or objects?

We need really good variable names, and/or other contextual information, to answer those questions.

That said, I am sure we _can_ find good examples. In this thread, people already mentioned Matrix/Vector, Money/Currency and Time/Duration. Others would be various numbers with physical measuring units.

>
> The exact position of where that trade-off is 'worth it' is going to
> be different for different people.
> But one of the areas where PHP is
> 'losing ground' to Python is how Python is better at processing data
> with maths, and part of that is how even trivial things, such as
> complex numbers, are quite difficult to implement and/or use in
> userland PHP.

Could be interesting to look for examples in Python. I was not lucky so far, but there must be something..

>
> Stanislav Malyshev wrote:
> > And again, what's the intuitive
> > difference between operators +=+@-+ and ++--=!* ?
>
> That's not part of the RFC.
>
> There's enough trade-offs to discuss already; people don't need to
> imagine more that aren't part of what is being proposed.
>
> > I have encountered
> > toolkits where the authors think it's cute to define "+" to mean
> > something that has nothing to do with mathematical addition
>
> Rather than leaving everyone to make the same mistakes again, this RFC
> might be improved by having a list of stuff that it really shouldn't
> be used for. At least then anyone who violates those guidelines does
> so at their own risk. Having guidelines would also help junior devs
> point out to more senior devs that "you're trying to be clever and the
> whole team is going to regret this".
>
> I started a 'Guidelines for operator overloads' here
> (https://github.com/Danack/GuidelinesForOperatorOverloads/blob/main/guidelines.md)
> - if anyone has horrible examples they'd like to add, PRs are
> welcome.

I think it is a good start.

I would avoid appealing to "common sense" or "logical sense" though. This can mean different things to different people, and is also somewhat tautological, like "do good things, avoid bad things". More meaningful terms can be "familiar", "expectations", "predictable", "non-ambiguous". (I see this language is coming from the C++ document, but either way I don't like it.)

Possible alternative language snippets:

For designing operators:
- Replicate familiar notations from the subject domain, e.g. maths, physics, commerce. (this has some overlap with the first point)
- Return the type and value that people expect based on their mental models.
- Use identifiers (class names, method names, variable names) from the same subject domain language that inspires the operators.
- Avoid ambiguity: If different people will have different expectations for return type and value, introduce well-named methods instead of overloaded operators.
- Completeness trade-off: Understand the full range of operators, and type combinations for the same operator, that is common in the subject domain. Then decide which of those should be supported with operator overloads, and which should be supported with methods instead.
- Take inspiration from code examples outside of PHP.

For using operators:
Use descriptive variable names, method names, other identifiers, and other hints (@var comments etc), so that the type and role of each variable and value can be easily understood. E.g. "$duration = $tStart - $tEnd;".

"If you provide constructive operators, they should not change their operands."

I think we should
[PHP-DEV] Re: PHP-FPM process management woes
Hi,

On Mon, Dec 20, 2021 at 6:41 PM Derick Rethans wrote:

> Literature since at least 2006 (Sara's Extending and Embedding PHP
> book), as well as numerous presentations that I have seen and given, and
> including the online PHP Internals Book
> (https://www.phpinternalsbook.com/php7/extensions_design/php_lifecycle.html),
> always expected that the MINIT (and MSHUTDOWN) functions are called once
> per worker process. However, PHP-FPM does not do that. It only calls
> extension's MINITs once in the main control process, but it *does* call
> an MSHUTDOWN for each worker process.
> In the past, this has already caused an issue where I couldn't create a
> monitoring thread in MINIT, and then wait for it to end in MSHUTDOWN, as
> MSHUTDOWN would wait on the same thread in each worker process, although
> only one was created in MINIT.
>
> The way how PHP-FPM handles this breaks the generally assumed "one
> MINIT, one MSHUTDOWN call" per process approach.

Yeah, this is a somewhat unfortunate asymmetry. It has some slight advantages, such as preloading being done just once instead of on each child init. However, there are most likely more disadvantages, like the ones above, and also the fact that MINIT is often run as root, so it's not ideal from the security point of view, especially when running 3rd party extensions.

> In particular, in the case of Xdebug bug #2051 it creates a problem with
> the following set-up:
>
> 1. php.ini has a `xdebug.mode=off`
> 2. the pool configuration has a `php_admin_value[xdebug.mode] = debug`
>    directive

So this is actually mainly about the fact that fpm_php_apply_defines runs after the MINIT. You might be able to tweak this, but I'm not sure it completely fixes the issue, as there's another way to overwrite INI that happens during the request. It uses the FCGI env PHP_ADMIN_VALUE (see fastcgi_ini_parser usage), so you might need to handle this case in your code anyway.
Basically you can't rely on the fact that INI stays the same, so you will probably have to add some logic to your RINIT to handle this.

> My suggestion for a fix would be to emulate what Apache always did:
>
> One MINIT/MSHUTDOWN in the main control process (I think it needed that
> to be able to implement php_admin_value and php_value), and then
> additionally also in each worker process.

As I said above, it probably won't fully fix your problem, but if you still want to try to tackle it and move the MINIT, the way I would do it is to try to separate the whole sapi init logic and call it from the child init as the first thing. If you want to experiment with using php_admin_value before the module MINIT, then it might be worth trying to put fpm_php_init_child (or just the defines) into the sapi startup callback before calling php_module_startup, so that the main INIs are loaded, but it might be a bit tricky to get the worker pool config; it might need some extra globals for it. But I'm not sure if that's going to work (e.g. whether the main INI files are loaded at the sapi startup stage) or what the impact on extensions will be. I'd probably have to check it more and try it. I think it's also something that could happen in the master branch only, as it's changing some fundamental bits in FPM...

Regards

Jakub
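The suggestion to "add some logic to your RINIT" can be sketched as a lazy catch-up step. This is a minimal illustration in plain C, not Xdebug's or PHP-FPM's code: all names (`ext_minit`, `ext_rinit`, `ext_base_init`, `mode_is_off`, `engine_error_cb`) are hypothetical stand-ins, and whether a given piece of MINIT work is actually safe to perform later in a real extension is an open assumption.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*error_cb_t)(const char *msg);

static error_cb_t engine_error_cb = NULL;  /* stand-in for the engine's callback slot */
static error_cb_t ext_old_error_cb = NULL; /* handler in place before the extension */
static error_cb_t ext_new_error_cb = NULL; /* the extension's own handler */
static bool ext_base_initialised = false;
static bool mode_is_off = true;            /* INI stand-in; may change between MINIT and RINIT */
static int ext_errors_seen = 0;

void ext_error_cb(const char *msg)
{
    (void)msg;
    ext_errors_seen++;
}

/* The set-up that MINIT would normally perform once. */
void ext_base_init(void)
{
    ext_old_error_cb = engine_error_cb;
    ext_new_error_cb = ext_error_cb;
    ext_base_initialised = true;
}

/* MINIT analogue: skipped entirely while the setting is "off". */
void ext_minit(void)
{
    if (mode_is_off) {
        return;
    }
    ext_base_init();
}

/* RINIT analogue: does not assume the setting is unchanged since MINIT. */
void ext_rinit(void)
{
    if (mode_is_off) {
        return;
    }
    if (!ext_base_initialised) {
        ext_base_init(); /* catch up on what MINIT skipped */
    }
    engine_error_cb = ext_new_error_cb;
}
```

The `ext_base_initialised` flag keeps the catch-up idempotent, so it costs one branch per request once the set-up has run.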
Re: [PHP-DEV] [RFC] User Defined Operator Overloads (v0.6)
On Fri, 17 Dec 2021 at 00:25, Larry Garfield wrote:
>
> On Thu, Dec 16, 2021, at 1:24 PM, Andreas Hennings wrote:
>
> > I see the distinction in overloading based on the object type on the
> > left, vs overloading based on parameter types.
> >
> > For a method call $a->f($b), the implementation of ->f() is chosen
> > based on the type of $a, but not $b.
> > For an operator call "$a + $b", with the system proposed here, again,
> > the implementation of "+" will be chosen based on the type of $a, but
> > not $b.
> > For native operator calls, the implementation is chosen based on the
> > types of $a and $b, but in general they are cast to the same type
> > before applying the operator.
> > For global function calls f($a, $b), the implementation is always the same.
> >
> > In a language with parameter-based overloading, the implementation can
> > be chosen based on the types of $a and $b.
> >
> > This brings me back to the "symmetry" concern.
> > In a call "$a->f($b)", it is very clear that the implementation is owned by
> > $a.
> > However, in an operator expression "$a + $b", it looks as if both
> > sides are on equal footing, whereas in reality $a "owns" the
> > implementation.
> >
> > Add to this that due to the weak typing and implicit casting,
> > developers could be completely misled by looking at an operator
> > invocation, if a value (in our case just the left side) has an
> > unexpected type in some edge cases.
> > Especially if it is not clear whether the value is a scalar or an object.
> > With a named method call, at least it is constrained to classes that
> > implement a method with that name.
>
> The RFC covers all of this, and the way it works around it. Absent method
> overloading (which I don't expect any time soon, especially given how
> vehemently Nikita is against it), it's likely the best we could do.
>
> > In a class Matrix, operator(Matrix $other): Matrix {} can be declared
> > to always return Matrix, and operator(float $factor): float {} can be
> > declared to always return float.
> > However, with a generic operator(mixed $other): Matrix|float {}, we
> > cannot natively declare when the return value will be Matrix or float.
> > (a tool like psalm could still do it)
>
> I... have no idea what you're talking about here. The RFC as currently
> written is not a "generic operator". It's
>
> operator *(Matrix $other, bool $left): Matrix
>
> The implementer can type both $other and the return however they want. That
> could be Matrix in both cases, or it could be Matrix|float, or whatever.
> That's... the same as every other return type we have now.

Basically the same as others have been saying in more recent comments.

In a class Matrix, you might want to implement three variations of the * operator:
- Matrix * Matrix = Matrix.
- Matrix * float = Matrix.
- Matrix * Vector = Vector.

Same for other classes and operators:
- Money / float = Money
- Money / Money = float
- Distance * Distance = Area
- Distance * float = Distance

Without parameter-based overloading, this needs union return types, IF we want to support all variations with operators:
- Matrix * (Matrix|float|Vector) = Matrix|Vector.
- Money / (Money|float) = float|Money
- Distance * (Distance|float) = Area|Distance

This gives you a return type with some ambiguity.

With methods, you could have different method names with dedicated return types. The naming can be awkward, so I am giving different possibilities here.
- Matrix->mulFloat(float) = Matrix->scale(float) = Matrix
- Matrix->mul(Matrix) = Matrix::product(Matrix, Matrix) = Matrix
- Matrix->mulVector(Vector) = Vector

To me, the best option seems to be a method name that somehow predicts the return type.
Possible solutions for the developer who is writing a Matrix class and who wants to use overloaded operators:
- Accept the ambiguity of the return type, and use tools like psalm to be more precise.
- Only use the * operator for one or two of the three variations (those that return Matrix), and introduce a regular method for the third:
  - Matrix * Matrix|float = Matrix
  - Matrix->mulVector(Vector) = Vector

This "concern" is not a complete blocker for the proposal. For math-related use cases like the above, the natural expectation to use operators can be so strong that we can live with some return type ambiguity.

-- Andreas
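The return type ambiguity discussed in this message can be made concrete with a small C sketch, where a tagged union plays the role of PHP's `Matrix|Vector` union type. Everything here is hypothetical and for illustration only: the 2x2 `Matrix`, the `Value` wrapper and the function names are assumptions of the sketch, not part of the RFC.

```c
typedef struct { double m[2][2]; } Matrix;
typedef struct { double v[2]; } Vector;

typedef enum { KIND_MATRIX, KIND_FLOAT, KIND_VECTOR } Kind;

/* Tagged union standing in for a union type such as Matrix|float|Vector. */
typedef struct {
    Kind kind;
    union {
        Matrix matrix;
        double scalar;
        Vector vector;
    } as;
} Value;

/* Dedicated functions: the return type is known from the name alone. */
Matrix matrix_scale(Matrix a, double f)
{
    Matrix r;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r.m[i][j] = a.m[i][j] * f;
    return r;
}

Matrix matrix_mul(Matrix a, Matrix b)
{
    Matrix r;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            r.m[i][j] = 0.0;
            for (int k = 0; k < 2; k++)
                r.m[i][j] += a.m[i][k] * b.m[k][j];
        }
    return r;
}

Vector matrix_mul_vector(Matrix a, Vector x)
{
    Vector r;
    for (int i = 0; i < 2; i++)
        r.v[i] = a.m[i][0] * x.v[0] + a.m[i][1] * x.v[1];
    return r;
}

/* Single "operator *" entry point: the result is Matrix or Vector depending
 * on the right operand, so every caller has to inspect out.kind. */
Value matrix_operator_mul(Matrix a, Value other)
{
    Value out;
    switch (other.kind) {
    case KIND_MATRIX:
        out.kind = KIND_MATRIX;
        out.as.matrix = matrix_mul(a, other.as.matrix);
        break;
    case KIND_FLOAT:
        out.kind = KIND_MATRIX;
        out.as.matrix = matrix_scale(a, other.as.scalar);
        break;
    default: /* KIND_VECTOR */
        out.kind = KIND_VECTOR;
        out.as.vector = matrix_mul_vector(a, other.as.vector);
        break;
    }
    return out;
}
```

A caller of `matrix_mul_vector` gets a `Vector` with no further checks, while a caller of `matrix_operator_mul` must branch on `kind` first. That branch is the cost that static analysis tools would otherwise have to absorb.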
Re: [PHP-DEV] [RFC] User Defined Operator Overloads (v0.6)
On Fri, 17 Dec 2021 at 18:36, Stanislav Malyshev wrote:
>
> When reading
> this code: $foo * $bar - how do I know which of the ways you took and
> where should I look for the code that is responsible for it? When I see
> $foo->times($bar) it's clear who's in charge and where I find the code.
> Terse code is nice but not at the expense of making it write-only.

Well, there are only two places to look with operator overloads, but yes you're right, using operator overloads for a single operation is not a good example of how they make code easier to read. The more complicated example from the introduction to the RFC https://wiki.php.net/rfc/user_defined_operator_overloads#introduction shows how they make complex maths easier to read.

The exact position of where that trade-off is 'worth it' is going to be different for different people. But one of the areas where PHP is 'losing ground' to Python is how Python is better at processing data with maths, and part of that is how even trivial things, such as complex numbers, are quite difficult to implement and/or use in userland PHP.

Stanislav Malyshev wrote:
> And again, what's the intuitive
> difference between operators +=+@-+ and ++--=!* ?

That's not part of the RFC.

There are enough trade-offs to discuss already; people don't need to imagine more that aren't part of what is being proposed.

> I have encountered
> toolkits where the authors think it's cute to define "+" to mean
> something that has nothing to do with mathematical addition

Rather than leaving everyone to make the same mistakes again, this RFC might be improved by having a list of stuff that it really shouldn't be used for. At least then anyone who violates those guidelines does so at their own risk. Having guidelines would also help junior devs point out to more senior devs that "you're trying to be clever and the whole team is going to regret this".
I started a 'Guidelines for operator overloads' here (https://github.com/Danack/GuidelinesForOperatorOverloads/blob/main/guidelines.md) - if anyone has horrible examples they'd like to add, PRs are welcome.

cheers
Dan
Ack
Re: [PHP-DEV] Surveying interest regarding CMake
On 20.12.2021 at 23:01, Horváth V. wrote:

> On 2021. 12. 20. 17:19, Pierre Joye wrote:
>
>> We may switch to vcpkg distributions, [...], or the current autoconf
>> php js port works too.
>
> Could you elaborate on what you mean by these?
>
> The reason why I prefer Conan here is because they provide pre-built
> binaries for common setups, so that makes builds go faster for the
> majority of the cases. I'm not against having a vcpkg manifest either
> way, since that's just a simple JSON file at the root of the project,
> but that will result in duplication and one more thing to maintain.

I assume that Pierre was referring to the Windows dependency libraries.

Anyhow, I suggest deferring the Windows support; it appears to be more important to have a CMake-based build system working for other platforms first, and then we can still figure out how to make that usable on Windows, too. In other words, eating the elephpant one bite at a time. :)

Christoph
Re: [PHP-DEV] Surveying interest regarding CMake
On 2021. 12. 20. 17:19, Pierre Joye wrote:

> We may switch to vcpkg distributions, [...], or the current autoconf
> php js port works too.

Could you elaborate on what you mean by these?

I prefer Conan here because it provides pre-built binaries for common setups, which makes builds faster in the majority of cases. I'm not against having a vcpkg manifest either way, since that's just a simple JSON file at the root of the project, but that will result in duplication and one more thing to maintain.
[PHP-DEV] PHP-FPM process management woes
Hi!

In the last few days I have been investigating Xdebug bug #2051: "Segfault on fatal error when setting xdebug.mode via php-fpm pool config" (https://bugs.xdebug.org/view.php?id=2051).

I have now tracked this down to unexpected behaviour in how PHP-FPM handles extension/module initialisation.

Literature since at least 2006 (Sara's Extending and Embedding PHP book), as well as numerous presentations that I have seen and given, and including the online PHP Internals Book (https://www.phpinternalsbook.com/php7/extensions_design/php_lifecycle.html), always expected that the MINIT (and MSHUTDOWN) functions are called once per worker process. However, PHP-FPM does not do that. It only calls extensions' MINITs once in the main control process, but it *does* call an MSHUTDOWN for each worker process.

In the past, this has already caused an issue where I couldn't create a monitoring thread in MINIT, and then wait for it to end in MSHUTDOWN, as MSHUTDOWN would wait on the same thread in each worker process, although only one was created in MINIT.

The way PHP-FPM handles this breaks the generally assumed "one MINIT, one MSHUTDOWN call" per process approach.

In particular, in the case of Xdebug bug #2051 it creates a problem with the following set-up:

1. php.ini has a `xdebug.mode=off`
2. the pool configuration has a `php_admin_value[xdebug.mode] = debug` directive

When PHP-FPM starts up, it calls Xdebug's MINIT, which checks the value of the `xdebug.mode` INI setting.
If it is set to `off` it does **nothing**, which means it also skips setting up handlers such as the zend_error_cb handlers in `xdebug_base_minit`::

    PHP_MINIT_FUNCTION(xdebug)
    {
        …
        if (XDEBUG_MODE_IS_OFF()) {
            return SUCCESS;
        }
        …
        xdebug_base_minit(INIT_FUNC_ARGS_PASSTHRU);
    }

Normally, MINIT sets the `xdebug_old_error_cb` and `xdebug_new_error_cb` handlers to something other than `NULL`::

    void xdebug_base_minit(INIT_FUNC_ARGS)
    {
        /* Record Zend and Xdebug error callbacks, the actual setting is done in
         * base on RINIT */
        xdebug_old_error_cb = zend_error_cb;
        xdebug_new_error_cb = xdebug_error_cb;

In each RINIT, which is called once per request, it also checks the INI setting, and if it is not `off`, uses some of the handlers that were set up in MINIT. Please note that PHP-FPM has changed the value of `xdebug.mode` in this worker process to `debug` (i.e., not `off`), (simplified)::

    PHP_RINIT_FUNCTION(xdebug)
    {
        …
        if (XDEBUG_MODE_IS_OFF()) {
            return SUCCESS;
        }
        …
        xdebug_base_rinit();
    }

    void xdebug_base_rinit()
    {
        xdebug_base_use_xdebug_error_cb();
    }

    void xdebug_base_use_xdebug_error_cb(void)
    {
        zend_error_cb = xdebug_new_error_cb;
    }

When an error now occurs, PHP calls `zend_error_cb`, but this is now unset (`NULL`): when MINIT was called, the `xdebug.mode` setting was still set to `off`, but by RINIT PHP-FPM had changed it to `debug`. Xdebug does not expect this setting to change, especially because it is marked as `PHP_INI_SYSTEM`.

In my opinion this is a bug (or two) in PHP-FPM, as:

- it does not follow the expected one MINIT/MSHUTDOWN cycle that, for example, Apache (1 and 2) uses, and which is documented in books and online material (and my memory)
- it disrespects `PHP_INI_SYSTEM`

My suggestion for a fix would be to emulate what Apache always did: One MINIT/MSHUTDOWN in the main control process (I think it needed that to be able to implement php_admin_value and php_value), and then additionally also in each worker process.
I did have a brief look at implementing this, but haven't managed to get it to work yet, mainly because I am unfamiliar with the PHP-FPM code at the moment.

cheers,
Derick

--
PHP 7.4 Release Manager
Host of PHP Internals News: https://phpinternals.news
Like Xdebug? Consider supporting me: https://xdebug.org/support
https://derickrethans.nl | https://xdebug.org | https://dram.io
twitter: @derickr and @xdebug
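The failure sequence described in this message can be replayed in a small standalone C simulation. It is a sketch under stated assumptions, not Xdebug's code: `sim_minit`, `sim_rinit`, `sim_raise_error` and `engine_error_cb` are hypothetical stand-ins for the MINIT and RINIT functions, the engine's error path and `zend_error_cb`, and the simulation reports the `NULL` callback instead of crashing on it.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*error_cb_t)(const char *msg);

static error_cb_t engine_error_cb = NULL; /* stand-in for zend_error_cb */
static error_cb_t old_error_cb = NULL;    /* stand-in for xdebug_old_error_cb */
static error_cb_t new_error_cb = NULL;    /* stand-in for xdebug_new_error_cb */
static bool mode_is_off = true;           /* php.ini value seen by the control process */
static int default_errors = 0;
static int ext_errors = 0;

void default_error_cb(const char *msg) { (void)msg; default_errors++; }
void ext_error_cb(const char *msg) { (void)msg; ext_errors++; }

/* Runs once in the control process, while the mode is still "off". */
void sim_minit(void)
{
    if (mode_is_off) {
        return; /* handlers are never recorded */
    }
    old_error_cb = engine_error_cb;
    new_error_cb = ext_error_cb;
}

/* Runs per request in the worker, after the pool config changed the mode. */
void sim_rinit(void)
{
    if (mode_is_off) {
        return;
    }
    engine_error_cb = new_error_cb; /* still NULL if MINIT bailed out early */
}

/* Returns false where the real engine would call through a NULL pointer. */
bool sim_raise_error(const char *msg)
{
    if (engine_error_cb == NULL) {
        return false;
    }
    engine_error_cb(msg);
    return true;
}
```

Running `sim_minit()` with the mode off, flipping `mode_is_off` to false, and then calling `sim_rinit()` leaves `engine_error_cb` as `NULL`, which is the state in which the real callback invocation would segfault.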
Re: [PHP-DEV] Surveying interest regarding CMake
On Fri, Dec 17, 2021, 10:23 PM Kalle Sommer Nielsen wrote:

> Hi
>
> On Fri, 17 Dec 2021 at 01:09, Horváth V. <friendlyan...@hotmail.com> wrote:
> > Yes, gradually phasing the current build system out is the most
> > pragmatic choice, although it will incur some extra maintenance cost for
> > the time it's still in use, but it's better to do something sooner than
> > later. This will also allow php-src to drop the Windows SDK altogether
> > that was recently moved to the GH org, because Microsoft stopped
> > maintenance.
>
> I feel it is important to clarify something here. The Windows SDK is
> still supported by Microsoft. The Binary SDK for PHP by Microsoft
> which is the SDK that manages dependencies and provides cross MSVC
> environment build scripts is no longer supported.
>
> You can still build PHP on Windows without the Binary SDK but you will
> have to manage the build dependencies in all their flavors manually
> without this Binary SDK for PHP.

We may switch to vcpkg distributions, which support CMake out of the box as well, or the current autoconf php js port works too.

best,