Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/18 15:55, Max Kellermann wrote: > Here's my RFC: https://wiki.php.net/rfc/include_cleanup Two weeks will be up the day after tomorrow, and there hasn't been any further discussion for more than a week, so I figure there is no further demand for discussing my RFC. I have replied to all emails here trying to explain my point, and today, I have updated the RFC to include the criticism expressed here. I have also prepared the vote doodles (four yes/no votes). Please tell me if you believe something is missing. If nobody objects, I'll announce the start of voting on February 1st. Three new relevant PRs I posted recently: - https://github.com/php/php-src/pull/10410 is a minimal PR for some include cleanup which also attempts to keep compatibility with "bad" extensions, e.g. by including errno.h from php.h - https://github.com/php/php-src/pull/10404 adds a few third-party extensions to the CI to detect accidental API breakages - https://github.com/php/php-src/pull/10472 removes existing #include comments which my work imitates, after three people expressed their unhappiness with the idea of comments on #include directives Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/20 14:49, Tim Düsterhus wrote: > A reasonable first step might be just *adding* all the missing '#include's > to '*.c' with one PR per ext/* directory Sounds like a good plan, but after the discussion about maintaining compatibility with bad out-of-tree code, I believe this should actually be the *last* step. Some in-tree extensions are buggy and need fixups (e.g. errno.h), but if I don't fix them now, I can more easily verify if my php_compat.h changes are good enough to avoid breakages in those in-tree extensions. Once the actual cleanup is merged, and we are confident enough that php_compat.h works, we may clean up the in-tree extensions, and then define a macro, say, PHP_NO_COMPAT, so all the compatibility tweaks for bad extensions can be avoided for certain already-fixed extensions. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/20 16:00, "Christoph M. Becker" wrote: > I'm afraid the worst that can actually happen is that this breaks a lot > of PHP extensions, SAPIs, etc. True, I meant out-of-tree code, too. I will add some extra care to ensure compatibility even with "bad" extensions, by re-adding removed #include lines to php[_compat].h, as suggested by Jakub. I agree that any breakages of out-of-tree code could (or should) be regarded a regression that needs to be fixed, no matter how bad that code is. I'm ready to take responsibility for eventual breakages and do my best to submit fixes quickly. > We do not know how many there are, and we cannot even check all PECL > extensions for practical reasons. It would be helpful to add a few selected extensions to the CI, so unintended API regressions are detected as early as possible. > So we need to be careful here. Understood, I will! Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 20.01.2023 at 13:20, Max Kellermann wrote: > I think you overestimate the gravity of the changes this RFC will do. > This isn't a code rewrite, this isn't fragile. If it builds, it's > okay. > > So the worst that can happen is a build breakage with some exotic > configuration (like the DTrace breakage my work caused), but these are > trivial to fix. If it happens, it demonstrates that PHP's CI is > incomplete and should be improved. I'm afraid the worst that can actually happen is that this breaks a lot of PHP extensions, SAPIs, etc. We do not know how many there are, and we cannot even check all PECL extensions for practical reasons. Unfortunately, many of the PECL extensions are barely maintained, and it is already not rare that some extensions are not compatible with a new PHP minor or major version during the RC phase; and some are not even compatible months after the new PHP version's GA. That hinders the general adaption of new PHP versions already (I've read that all to often: "please provide compatibility with PHP x.y, because otherwise we cannot upgrade"). So we need to be careful here. -- Christoph M. Becker -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
Hi On 1/20/23 11:52, Jakub Zelenka wrote: As it was said, this is problematic for bug fixes when merging up and it's extra hurdle for maintainers - read this will slow down the development. Can I do anything to help with those merges? Reduce the diff to absolute minimum to prevent potential conflict. I think it would be more acceptable if there was a plan that will get us there in multiple releases rather than do one big bang change. Then devs more likely remember what actually changed between releases and will be more careful when merging up. A reasonable first step might be just *adding* all the missing '#include's to '*.c' with one PR per ext/* directory - not yet touching Zend/ and not adding a reasoning comment, because existing code doesn't have the reasoning comment either. Touching only *.c files in ext/ should not have any effect on anything else and thus adding those '#include's should be safe with regard to possible breakage. It might also be argued that such a missing include is an actual (hidden) bug. These PRs thus should be easy enough to review with a quick glance. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
> > >> Well the issue with your approach is that 104 PR's is too much to review > and reviewing that in one go is not good either. So finding some better > separation and a plan forward could again help you to get this accepted. > > I meant 104 PR's are just too many small chunks to review. That said if you split it to multiple years, it might work too. In any case there should be some plan that doesn't involve merging of all changes in one go which is my main issue with this. The benefit is just too small to be able to contain it in a single minor version bump. I personally don't like the PHP-9.0 branch idea but if it's your preferred solution, you might propose that and see what others think. It's probably still better than getting that all in now IMO. Regards Jakub
Re: [PHP-DEV] RFC: rules for #include directives
On Fri, Jan 20, 2023 at 12:58 PM Max Kellermann wrote: > On 2023/01/20 13:41, Jakub Zelenka wrote: > > I think you should clarify in the RFC how this is going to be split and > > what's the time frame for getting that all in. > > The RFC is not the actual cleanup, it's for reaching an agreement on > whether it is a desirable goal to have minimal includes instead of the > chaotic mess we currently have. This was negated by Dmitry Stogov and > Derick Rethans (but by nobody else so far; the discussion here was > mostly about other aspects, but not about this basic questions). > > After this disagreement among PHP maintainers, Christoph M. Becker > suggested doing an RFC: > > https://github.com/php/php-src/pull/10220#issuecomment-1383789100 > > The splitting and time frame are out of the RFC's scope. I'll do > whatever reviewers ask me to do, that's why I splitted my first PRs > (after being asked to do so). > > I'm aware of all of this. What I'm trying to do is to help you to find a middle ground so the changes can be eventually accepted in some way. I think the main problem with this is the maintenance and stability impact and if you don't find answers for that, it will most likely not get accepted judging the mood of core devs about this. > > Your commits are separated by files and are really too small. I didn't > mean > > that. What I was talking about it is to do that single change and apply > all > > dependent changes on it so it still compiles / works and can be proposed > as > > a separate chunk of work. It means if you remove "zend_gc.h" from > "zend.h", > > then your PR would also contain changes that add "zend_gc.h" to > "zend_gc.c" > > and "zend_objects_API.h". So to get "zend.h" to this state in you PR, it > > would require 9 PR's if I count correctly. There would be some other > > changes after that obviously but it should be nowhere near to 104... > > I did it the other way round, an approach which makes much more sense > IMHO: > > (For example) I created a stgit patch which cleans up zend.h to only > include headers that are needed by zend.h itself. > > That broke the build of various other libraries, for example > "zend_gc.c". For me, this was clearly a sign that "zend_gc.c" is > buggy; adding the missing "#include" here had actually nothing to do > with removing one from "zend.h". So I popped the "zend.h" patch from > the stack, created a new patch to fix up "zend_gc.c". > > Back to the "zend.h" patch, build again, and create more patches to > fix libraries that got broken. > > That way, each patch fixes the bugs in one library, a bug that was > invisible before because of the header bloat, but gets later revealed > by header fixups. > > Each per-library patch is simple to review: it adds #includes to the > ".c" file which were missing there; the diff shows which are added, > and a reviewer can verify that those headers are indeed needed there. > The same patch may remove unneeded #includes from the ".h" file, which > the reviewer can also easily verify. > > Each per-library patch is buildable. You can go back and forth, "git > checkout" each of my commits and you have a working PHP source tree, > with commits that are self-contained, are reviewable and make sense. > > Your approach is harder to review, because there is nothing a reviewer > can do to verify the correctness, other than building PHP. They > cannot see whether the list of added #includes is really complete, > because they see only those that got added, but cannot see those that > are still missing. > > Well the issue with your approach is that 104 PR's is too much to review and reviewing that in one go is not good either. So finding some better separation and a plan forward could again help you to get this accepted. Regards Jakub
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/20 13:41, Jakub Zelenka wrote: > I think you should clarify in the RFC how this is going to be split and > what's the time frame for getting that all in. The RFC is not the actual cleanup, it's for reaching an agreement on whether it is a desirable goal to have minimal includes instead of the chaotic mess we currently have. This was negated by Dmitry Stogov and Derick Rethans (but by nobody else so far; the discussion here was mostly about other aspects, but not about this basic questions). After this disagreement among PHP maintainers, Christoph M. Becker suggested doing an RFC: https://github.com/php/php-src/pull/10220#issuecomment-1383789100 The splitting and time frame are out of the RFC's scope. I'll do whatever reviewers ask me to do, that's why I splitted my first PRs (after being asked to do so). > Your commits are separated by files and are really too small. I didn't mean > that. What I was talking about it is to do that single change and apply all > dependent changes on it so it still compiles / works and can be proposed as > a separate chunk of work. It means if you remove "zend_gc.h" from "zend.h", > then your PR would also contain changes that add "zend_gc.h" to "zend_gc.c" > and "zend_objects_API.h". So to get "zend.h" to this state in you PR, it > would require 9 PR's if I count correctly. There would be some other > changes after that obviously but it should be nowhere near to 104... I did it the other way round, an approach which makes much more sense IMHO: (For example) I created a stgit patch which cleans up zend.h to only include headers that are needed by zend.h itself. That broke the build of various other libraries, for example "zend_gc.c". For me, this was clearly a sign that "zend_gc.c" is buggy; adding the missing "#include" here had actually nothing to do with removing one from "zend.h". So I popped the "zend.h" patch from the stack, created a new patch to fix up "zend_gc.c". Back to the "zend.h" patch, build again, and create more patches to fix libraries that got broken. That way, each patch fixes the bugs in one library, a bug that was invisible before because of the header bloat, but gets later revealed by header fixups. Each per-library patch is simple to review: it adds #includes to the ".c" file which were missing there; the diff shows which are added, and a reviewer can verify that those headers are indeed needed there. The same patch may remove unneeded #includes from the ".h" file, which the reviewer can also easily verify. Each per-library patch is buildable. You can go back and forth, "git checkout" each of my commits and you have a working PHP source tree, with commits that are self-contained, are reviewable and make sense. Your approach is harder to review, because there is nothing a reviewer can do to verify the correctness, other than building PHP. They cannot see whether the list of added #includes is really complete, because they see only those that got added, but cannot see those that are still missing. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Fri, Jan 20, 2023 at 12:21 PM Max Kellermann wrote: > On 2023/01/20 12:50, Jakub Zelenka wrote: > > That's exactly it's too big PR that is changing too much which might > result > > exactly to the mentioned concerns about breaking some untested builds and > > it's very hard to track for dev what actually changed. > > This huge PR isn't meant to be merged in one go - as I said, the > previous PRs were smaller, and I'll submit smaller PRs once we agree > what should be in there. > > I think you should clarify in the RFC how this is going to be split and what's the time frame for getting that all in. > > > Then wait a month and propose removal "zend_gc.h" from > > "zend.h". Then wait another month, propose other include and so > > on. It will probably take years to get all changes in but in this > > way it's much safer, more in my opinion and. > > That won't take years, it will take decades. See, my draft PR has 104 > commits currently; if you split each so only one #include gets > touched, and wait a month, that's easily be 50 years - longer than the > PHP project has existed so far. > Your commits are separated by files and are really too small. I didn't mean that. What I was talking about it is to do that single change and apply all dependent changes on it so it still compiles / works and can be proposed as a separate chunk of work. It means if you remove "zend_gc.h" from "zend.h", then your PR would also contain changes that add "zend_gc.h" to "zend_gc.c" and "zend_objects_API.h". So to get "zend.h" to this state in you PR, it would require 9 PR's if I count correctly. There would be some other changes after that obviously but it should be nowhere near to 104... Jakub
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/20 12:50, Jakub Zelenka wrote: > That's exactly it's too big PR that is changing too much which might result > exactly to the mentioned concerns about breaking some untested builds and > it's very hard to track for dev what actually changed. This huge PR isn't meant to be merged in one go - as I said, the previous PRs were smaller, and I'll submit smaller PRs once we agree what should be in there. Submitting partial PRs and reordering commits is trivial for me, thanks to my stgit superpowers! > Yeah exactly all of that. I looked to the PR and it's combining lots of > different changes. It's mainly because all changes in "zend.h" and > "zend_API.h" are done in one go. I think it would be just better to propose > change of a single header replaced in them instead. For example just remove > "zend_alloc.h" from "zend.h" and all needed changes for that. (Remember that Derick Rethans specifically complained here that my work consists of too many small commits? Giggle.) > Then wait a month and propose removal "zend_gc.h" from > "zend.h". Then wait another month, propose other include and so > on. It will probably take years to get all changes in but in this > way it's much safer, more in my opinion and. That won't take years, it will take decades. See, my draft PR has 104 commits currently; if you split each so only one #include gets touched, and wait a month, that's easily be 50 years - longer than the PHP project has existed so far. Meanwhile, all the others will add new "unclean" code faster than I am allowed to clean up. It will produce the same amount of merge problems, only slower. I think you overestimate the gravity of the changes this RFC will do. This isn't a code rewrite, this isn't fragile. If it builds, it's okay. So the worst that can happen is a build breakage with some exotic configuration (like the DTrace breakage my work caused), but these are trivial to fix. If it happens, it demonstrates that PHP's CI is incomplete and should be improved. There is no worst-case scenario that would suggest artificially delaying this work by decades. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Fri, Jan 20, 2023 at 11:15 AM Max Kellermann wrote: > On 2023/01/20 11:52, Jakub Zelenka wrote: > > Reduce the diff to absolute minimum to prevent potential conflict. I > think > > it would be more acceptable if there was a plan that will get us there in > > multiple releases rather than do one big bang change. > > My draft PR currently contains 104 very small commits, one for each > library that I fixed. > > That's exactly it's too big PR that is changing too much which might result exactly to the mentioned concerns about breaking some untested builds and it's very hard to track for dev what actually changed. > > > That said I think it would be maybe more sensible what I mentioned > > above and that's to split the work to multiple steps over the years. > > I don't know what you mean by that. Cleaning up a code base is > incremental work; I have 104 of those splitted steps in my branch, and > many more steps may follow. This will take years, or rather: will > never be finished, because code can always be improved some more. > > Do you mean I should work more slowly? I should submit my patches > more slowly? In smaller PRs? Wait more between PRs? > > Yeah exactly all of that. I looked to the PR and it's combining lots of different changes. It's mainly because all changes in "zend.h" and "zend_API.h" are done in one go. I think it would be just better to propose change of a single header replaced in them instead. For example just remove "zend_alloc.h" from "zend.h" and all needed changes for that. Then wait a month and propose removal "zend_gc.h" from "zend.h". Then wait another month, propose other include and so on. It will probably take years to get all changes in but in this way it's much safer, more in my opinion and. I would also suggest to limit any re-ordering of headers as well as the mentioned comments so diff is minimal. By the way all of this is just my opinion but think that less invasive changes might more likely to get this accepted. Regards Jakub
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/20 11:52, Jakub Zelenka wrote: > Reduce the diff to absolute minimum to prevent potential conflict. I think > it would be more acceptable if there was a plan that will get us there in > multiple releases rather than do one big bang change. My draft PR currently contains 104 very small commits, one for each library that I fixed. If you like some of them, it's easy for me to submit a new PR with that selection, and then you can merge them. I'll keep the rest rebased, for some day you decide to merge the rest. That's up to you. I don't know what kind of "plan" I need to have - I clean up what needs to be cleaned up, but first, we need to decide what "there" means in "get us there". > Well we don't even know when 9.0 will be tagged so this could span to years > which might be waste of your resources just for the sake of headers. A lot of work is already done, and I keep this branch rebased on master, which is just as much work as merging, no real difference. No time saved. On the other hand, having an official branch to become 9.0 or whatever version with my patches already in, others can chime in and help with the code cleanup. > That said I think it would be maybe more sensible what I mentioned > above and that's to split the work to multiple steps over the years. I don't know what you mean by that. Cleaning up a code base is incremental work; I have 104 of those splitted steps in my branch, and many more steps may follow. This will take years, or rather: will never be finished, because code can always be improved some more. Do you mean I should work more slowly? I should submit my patches more slowly? In smaller PRs? Wait more between PRs? Before I submitted my work, I was specifically asked to submit smaller PRs with fewer commits, which I did, and the first 4 small PRs were merged, with more waiting in my queue (until my effort was stopped by Dmitry's veto). > I think it should be at least included with "php.h" to reduce the impact on > SAPI's and extensions. OK, noted, I'll amend my PR branch to move omitted #includes to php.h/php_compat.h to retain source compatibility. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/20 11:51, Michael Voříšek - ČVUT FJFI wrote: > I would highly appreciate if this could be done automatically, ie. > generate "the correct" includes and assert them by > https://github.com/php/php-src/blob/php-8.2.1/.github/actions/verify-generated-files/action.yml > CI. If you can generate the comments too, they can be asserted by CI > reliably then as well. Agree, it would be a good idea to integrate iwyu into the CI. (Of course, only if we agree that cleaning up those includes is a desired goal for the PHP project.) Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Fri, Jan 20, 2023 at 10:17 AM Max Kellermann wrote: > On 2023/01/20 10:53, Jakub Zelenka wrote: > > > As it was said, this is problematic for bug fixes when merging up > > and it's extra hurdle for maintainers - read this will slow down the > > development. > > Can I do anything to help with those merges? > > Reduce the diff to absolute minimum to prevent potential conflict. I think it would be more acceptable if there was a plan that will get us there in multiple releases rather than do one big bang change. Then devs more likely remember what actually changed between releases and will be more careful when merging up. > > Also this is not something I would like to see in a minor version bump > even > > though we can break ABI in that version but in general we try to > introduce > > as little changes as possible for extensions. If anything it should at > > least wait for the next major version bump when extension devs expect > more > > work to be done. > > What about: let's branch PHP 8.3 out, and declare master will become > 9.0. (Or create a new branch for 9.0 and let 8.3 stay in master; > branches names doesn't matter, only the idea matters.) > > I volunteer to do regular (daily?) merges from the 8.3 branch to the > 9.0 branch, and fix all the problems that may occur. > > Well we don't even know when 9.0 will be tagged so this could span to years which might be waste of your resources just for the sake of headers. Might be just easier to propose it before the actual major verison. That said I think it would be maybe more sensible what I mentioned above and that's to split the work to multiple steps over the years. > > I think comments are more useful in the actual header to explain > > what it is for which can be done in a few lines. It can be more > > detailed and doesn't need to be repeated... > > I agree with that part. More documentation for headers is useful. > > As I said in another reply, I feel those comments are only really > useful for those large and obscure catch-all headers such as > "zend_types.h". Check my "zend_result.h" commit - it makes lots of my > comments disappear, because suddenly the #include directive becomes > self-explaining. > I think if people are not sure what it does, they should just open the header and get explanation there so it should not be useful anywhere. > > > - some of the removed headers from libc should be kept as there's > > just a small benefit in removing them (e.g. errno.h) > > There is a benefit in removing them; improved compile times and > reduced namespace pollution (which is a theoretical benefit which is > probably not valued by all). > > But the benefit of keeping it is only source compatibility with buggy > extensions. > > And source compatibility with buggy extensions can be had by moving > those #include directives to "php_compat.h", with a way for "good" > extensions to opt out of the bloat. > > If you believe source compatibility even with "bad" extensions is > utterly important, I'll consider that in my submissions, and will > consider all build breakages as regressions that need to be fixed in > PHP (e.g. php_compat.h). I don't have a real opinion on that, I'll > accept whatever you decide and will help implement it. > > I think it should be at least included with "php.h" to reduce the impact on SAPI's and extensions. Regards Jakub
Re: [PHP-DEV] RFC: rules for #include directives
comments like #include "zend_execute.h" // for zend_vm_calc_used_stack() Maintaining "the correct" includes manually is hard. Maintaining the comments manually is even harder. In https://github.com/php/php-src/blob/a3a3f326bd32005dd68936edf0e01f98a2fbe163/CODING_STANDARDS.md?plain=1#L269 you proposed a tool to determine the needed includes. I would highly appreciate if this could be done automatically, ie. generate "the correct" includes and assert them by https://github.com/php/php-src/blob/php-8.2.1/.github/actions/verify-generated-files/action.yml CI. If you can generate the comments too, they can be asserted by CI reliably then as well. With kind regards / Mit freundlichen Grüßen / S přátelským pozdravem, Michael Voříšek On 20 Jan 2023 11:17, Max Kellermann wrote: On 2023/01/20 10:53, Jakub Zelenka wrote: I'm afraid it's too late for PHP. It's never too late. Don't give up PHP :-) PHP is an old code base. I want to help modernize it. As it was said, this is problematic for bug fixes when merging up and it's extra hurdle for maintainers - read this will slow down the development. Can I do anything to help with those merges? Also this is not something I would like to see in a minor version bump even though we can break ABI in that version but in general we try to introduce as little changes as possible for extensions. If anything it should at least wait for the next major version bump when extension devs expect more work to be done. What about: let's branch PHP 8.3 out, and declare master will become 9.0. (Or create a new branch for 9.0 and let 8.3 stay in master; branches names doesn't matter, only the idea matters.) I volunteer to do regular (daily?) merges from the 8.3 branch to the 9.0 branch, and fix all the problems that may occur. I think comments are more useful in the actual header to explain what it is for which can be done in a few lines. It can be more detailed and doesn't need to be repeated... I agree with that part. More documentation for headers is useful. As I said in another reply, I feel those comments are only really useful for those large and obscure catch-all headers such as "zend_types.h". Check my "zend_result.h" commit - it makes lots of my comments disappear, because suddenly the #include directive becomes self-explaining. - some of the removed headers from libc should be kept as there's just a small benefit in removing them (e.g. errno.h) There is a benefit in removing them; improved compile times and reduced namespace pollution (which is a theoretical benefit which is probably not valued by all). But the benefit of keeping it is only source compatibility with buggy extensions. And source compatibility with buggy extensions can be had by moving those #include directives to "php_compat.h", with a way for "good" extensions to opt out of the bloat. If you believe source compatibility even with "bad" extensions is utterly important, I'll consider that in my submissions, and will consider all build breakages as regressions that need to be fixed in PHP (e.g. php_compat.h). I don't have a real opinion on that, I'll accept whatever you decide and will help implement it. Max
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/20 10:53, Jakub Zelenka wrote: > I'm afraid it's too late for PHP. It's never too late. Don't give up PHP :-) PHP is an old code base. I want to help modernize it. > As it was said, this is problematic for bug fixes when merging up > and it's extra hurdle for maintainers - read this will slow down the > development. Can I do anything to help with those merges? > Also this is not something I would like to see in a minor version bump even > though we can break ABI in that version but in general we try to introduce > as little changes as possible for extensions. If anything it should at > least wait for the next major version bump when extension devs expect more > work to be done. What about: let's branch PHP 8.3 out, and declare master will become 9.0. (Or create a new branch for 9.0 and let 8.3 stay in master; branches names doesn't matter, only the idea matters.) I volunteer to do regular (daily?) merges from the 8.3 branch to the 9.0 branch, and fix all the problems that may occur. > I think comments are more useful in the actual header to explain > what it is for which can be done in a few lines. It can be more > detailed and doesn't need to be repeated... I agree with that part. More documentation for headers is useful. As I said in another reply, I feel those comments are only really useful for those large and obscure catch-all headers such as "zend_types.h". Check my "zend_result.h" commit - it makes lots of my comments disappear, because suddenly the #include directive becomes self-explaining. > - some of the removed headers from libc should be kept as there's > just a small benefit in removing them (e.g. errno.h) There is a benefit in removing them; improved compile times and reduced namespace pollution (which is a theoretical benefit which is probably not valued by all). But the benefit of keeping it is only source compatibility with buggy extensions. And source compatibility with buggy extensions can be had by moving those #include directives to "php_compat.h", with a way for "good" extensions to opt out of the bloat. If you believe source compatibility even with "bad" extensions is utterly important, I'll consider that in my submissions, and will consider all build breakages as regressions that need to be fixed in PHP (e.g. php_compat.h). I don't have a real opinion on that, I'll accept whatever you decide and will help implement it. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Wed, Jan 18, 2023 at 2:55 PM Max Kellermann wrote: > > Here's my RFC: https://wiki.php.net/rfc/include_cleanup > > I think that some logic behind this RFC makes sense but I would see it more like something for a new project. I'm afraid it's too late for PHP. At least doing that in one big step seems very painful and disturbing for not that much benefit. As it was said, this is problematic for bug fixes when merging up and it's extra hurdle for maintainers - read this will slow down the development. Also this is not something I would like to see in a minor version bump even though we can break ABI in that version but in general we try to introduce as little changes as possible for extensions. If anything it should at least wait for the next major version bump when extension devs expect more work to be done. Except that, there are few things that I don't like: - comments next to inclusion don't have any value especially in the way how it's done - it just puts few things that are in the header and sometimes that matches the header name (e.g. smart str...). I think comments are more useful in the actual header to explain what it is for which can be done in a few lines. It can be more detailed and doesn't need to be repeated... - some of the removed headers from libc should be kept as there's just a small benefit in removing them (e.g. errno.h) That said I think it makes sense but I think the disadvantages outweigh the advantages... Regads Jakub
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/16 17:43, Max Kellermann wrote: > I did, but in this early stage, there is no measurable speedup yet, > because there are still too many "fat" headers left that need to be > included everywhere, which in turn include too much. The many small > improvements I made so far drown in that noise. Meanwhile, my draft branch builds 2% faster (perf-stat/cycles: 328,246,125,055 -> 320,802,717,013; the measurement is stable enough to always see that 2% improvement). This is still in an early stage (most Zend headers have been cleaned up, but little else, no main, no extensions). There's much more to gain. Improved build times are the least important aspect of this code cleanup, but this effect does exist. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/18 18:51, Kamil Tekiela wrote: > - I am against forward struct declarations. I think they decrease code > readability and should be avoided. btw. if this opinion is shared by the majority of voters, I'll send a PR to remove all existing forward declarations from PHP. There are many, for example zend_object_handlers in Zend/zend_types.h added by https://github.com/php/php-src/commit/c80e82230b5 Though one very interesting commit to introduce a forward declaration is https://github.com/php/php-src/commit/f4cfaf36e23ca (forward-declared zend_ast) > - I am against putting comments on #includes. Same here - there are currently 47 #includes with comments (none of which were written by me, some dating back 24 years ago). I'll take care of removing them all, if it turns out that the majority of voters doesn't like them. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/18 18:51, Kamil Tekiela wrote: > > #include "zend_portability.h" // for BEGIN_EXTERN_C > What if in future the need for BEGIN_EXTERN_C disappears? Who is going to > remember to update the comment? (I just addressed that concern in reply to Rowan's email.) > As you said yourself, this refactoring has no practical effect on > compilation times. No, I did not say that. I said: "in this early stage, there is no measurable speedup yet" "Once I have disentangled and splitted those fat headers [...], the speedup will be measurable and perceptible." > Another advantage listed is "more correct code". Perhaps it is, but does it > mean that the current situation is likely to cause more frequent PHP bugs? > Would this refactoring help in finding PHP bugs more quicker? This isn't the kind of "correct" that (directly) prevents runtime bugs; it's the kind of "correct" that leads to having more readable code and thus making maintenance easier. If maintenance is easier, developers will likely produce fewer runtime bugs, but that's only an indirect effect. Maintenance becomes easier because it should be easier to find out which header to include to make a certain feature available. Right now, the headers are a big mess - when do I have to include zend.h, when do I have to include php.h and what is php_compat.h, and I included zend_types.h but why are certain types not available? Is there documentation about this? I found none... And PHP extensions I saw look like all people just try'n'error and copy'n'paste to "develop" code. > I mean things like a bug fix using a symbol that hasn't been used in > the file but is included through one of the more generic > headers. Once such a commit is merged into master, it may turn out > that the symbol lacks a new include. This adds unnecessary work for > bug fixers who will now have to verify this and find the appropriate > include during merges. The same can be said about all code changes - all code changes make merging more difficult; sometimes with merge conflicts, sometimes with header changes, sometimes with API changes, and most dangerously: semantic changes. Anyway, you care about the people who merge a bugfix to another branch. Yes, that may sometimes lead to a build problem. Only push a merge commit after you have verified that it builds. Each breakage is extra work. But in return, after the transition to "cleaner code", maintenance and development gets easier. There, you get the time back you invested earlier. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/18 18:06, Rowan Tommins wrote: > Then I guess you've never heard people quoting Robert "Uncle Bob" Martin, who > is well known for asserting that code should be self-documenting, and > therefore not need comments, saying things like: > > > The proper use of comments is to compensate for our failure to express > > ourself in code. Note that I used the word failure. I meant it. Comments > > are always failures. > > Many arguments could be had over whether that's going too far, but "too many > comments" can definitely be a bad thing. There is a tiny piece of truth in that, but it doesn't apply to #include directives. Yes, you should always try to write code in a way that's simple enough for others to understand. But #include directives cannot be written in such a way. Sometimes, you know why an #include is there, but sometimes, you cannot, and only a comment can explain it. > To take the example Dmitry gave: > > > #include "zend_portability.h" // for BEGIN_EXTERN_C > > What should I understand from this comment? That the reason for including this header was because we need BEGIN_EXTERN_C in this file, and BEGIN_EXTERN_C is provided by that header. Without the comment, this would be hard to find out. > - If I want some other symbol defined in zend_portability.h, do I need to > adjust the includes? No, all the symbols are already imported. I could adjust > the comment, but nothing will break if I don't. I don't think that matters much. If you use another symbol from zend_portability.h, you could update the comment, or not. Sure, if you do not, the comment would be incomplete. But that incomplete comment is still better than no comment. > - If I remove the last use of BEGIN_EXTERN_C in this file, can I remove the > include? No, because other symbols from it might be used, even if they > weren't when the comment was written. Removing the last use of BEGIN_EXTERN_C would give you a hint that maybe zend_portability.h does not need to be included anymore. If you care, you can decide to remove the #include and see if it still builds. If it doesn't there's apparently something else still needed from zend_portability.h, and the error message tells you what is needed. If you still care, now's your chance to copy'n'paste that identifier from the compiler's error message to the comment, replacing the old comment. Or just leave the old now-outdated comment. And let somebody else do it, eventually or never. But still, an outdated comment doesn't hurt anybody. It helps keeping the code clean (by being able to easily see #include candidates which can be removed), but if the comment turns out to be outdated, update it or just leave it. I mean, for the last 20+ years, nobody has cared for outdated and unnecessary #includes. Outdated and unnecessary #includes to have negative effects. And now some people pretend to care about outdated and unnecessary comments, which have no negative effects. > - Perhaps the name of the header doesn't give a clear enough idea of what > symbols it might contain, and what types of file should include it. If so, > concentrate on better naming, or better documentation of existing conventions. Yes yes yes! Concentrate on better naming - that's exactly what I'm about to do! And that will lead to FEWER #include comments. For example, many headers include zend_types.h but they only need zend_result. My PR contains a commit which moves this typedef to zend_result.h. Now I can change lots of "#include zend_types.h" to "#inclued zend_result.h", and can then omit the comment, because it's obvious why this header gets included. (Dmitry has complained already about this very commit.) > - Perhaps it is a justification added when the include was first added. If > so, put it in the commit message and PR summary. "Put it in the commit message" can be said about all code comments and all code documentation. I think that's wrong, because: - commit message shows only the initial reason why the #include was added, but unlike code comments, commit messages cannot ever be updated - looking up commit messages is more cumbersome than reading a code comment A commit message should describe why something was done, and how something was done; it is an important part of a software. But it should not replace documentation that helps understand the software. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
Hi On 1/18/23 21:37, Flávio Heleno wrote: This may be a silly question, but in that case, wouldn't #ifdef guards keep the compiler from including/parsing a.h twice? It's complicated. Modern compilers may include an optimization for include guards (https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html), but those rules may only be applied fairly narrowly. I just recompiled gammasection.c in ext/random [1] which is a fairly simple file containing just a handful of files and traced the compiler with 'strace' for the 'openat' syscalls: In total 2488 openat syscalls (many of them resulting in ENOENT, because of multiple include paths) were performed by the compiler. Multiple of my system headers were opened several times, because the include guard optimization could not be applied to them. I'm also seeing duplicated PHP headers: 20 "/usr/lib/gcc/x86_64-linux-gnu/9/include/stddef.h", 13 "/usr/include/x86_64-linux-gnu/bits/mathcalls-narrow.h", 11 "/usr/include/x86_64-linux-gnu/bits/wordsize.h", 8 "/usr/include/x86_64-linux-gnu/bits/mathcalls.h", 6 "/usr/include/x86_64-linux-gnu/bits/libc-header-start.h", 5 "/usr/lib/gcc/x86_64-linux-gnu/9/include/limits.h", 4 "/usr/include/x86_64-linux-gnu/bits/mathcalls-helper-functions.h", 3 "/usr/include/assert.h", 3 "./php-src/Zend/zend_hash.h", 3 "./php-src/Zend/zend.h", 2 "/usr/lib/gcc/x86_64-linux-gnu/9/include/stdarg.h", 2 "/usr/lib/gcc/x86_64-linux-gnu/9/include/mm_malloc.h", 2 "/usr/lib/gcc/x86_64-linux-gnu/9/include/emmintrin.h", 2 "/usr/include/x86_64-linux-gnu/bits/long-double.h", 2 "./php-src/Zend/zend_string.h", 2 "./php-src/Zend/zend_stream.h", 2 "./php-src/Zend/zend_stack.h", [1] https://github.com/php/php-src/blob/master/ext/random/gammasection.c When a.h is *not* required by any of b.h, c.h nor foo.c, I agree that it should *not* be included at all, but when any of them, Ideally the scope of each individual header would also be narrowed to reduce the number of required dependencies. Staying at the gammasection.c example from above: Ideally it should not be necessary to include the 20 kB zend_string.h twice, because the gammasection.c does not use strings at all. Likewise the 50 kB zend_hash.h is included three times and not used it all. With the amount of C files this adds up. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Wed, Jan 18, 2023 at 4:17 PM Tim Düsterhus wrote: > Hi > > On 1/18/23 18:51, Kamil Tekiela wrote: > > As you said yourself, this refactoring has no practical effect on > > It has no practical effect *yet*. The headers need to be untangled first > before actual optimization can happen. > > Or maybe have all > > ZEND headers included with a single header? > > That would go against the goal of reducing compile times. Much of the > time spent when compiling C is parsing megabytes of source code, because > deeply nested and/or unnecessary includes will bloat the amount of code > the toolchain will need to go through. > > Let me give a simple example: > > We have the following headers: > > a.h: 1 MB. > b.h: Includes a.h and is itself 100 kB in size. > c.h: Includes a.h and is itself 100 kB in size. > > foo.c: Includes b.h and c.h, but c.h is not actually used. Is itself 300 > kB in size. > > Now when compiling foo.c, a total of 2.5 MB of source code need to be > parsed, because a.h is included twice. If the unused include for c.h is > removed, then it will only be 1.4 MB in total. This multiplies quickly > for more complex include chains. I'd like to point to this commit once > more: > > > https://github.com/haproxy/haproxy/commit/340ef2502eae2a37781e460d3590982c0e437fbd > > Removing two headers in a single file reduced the total compile by > roughly 10%! > > Here's two more similar commits: > > > https://github.com/haproxy/haproxy/commit/e5983ffb3adbf71a8f286094b1c1afce6061d1f3 > > https://github.com/haproxy/haproxy/commit/1db546eecd3982ffc1ea92c2f542a3b01ce43137 > > Best regards > Tim Düsterhus > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > > Tim, This may be a silly question, but in that case, wouldn't #ifdef guards keep the compiler from including/parsing a.h twice? When a.h is *not* required by any of b.h, c.h nor foo.c, I agree that it should *not* be included at all, but when any of them, or even if all of them requires it, the guards should be enough to avoid redundant work afaik. Or am I missing anything here?
Re: [PHP-DEV] RFC: rules for #include directives
Hi On 1/18/23 18:51, Kamil Tekiela wrote: As you said yourself, this refactoring has no practical effect on It has no practical effect *yet*. The headers need to be untangled first before actual optimization can happen. Or maybe have all ZEND headers included with a single header? That would go against the goal of reducing compile times. Much of the time spent when compiling C is parsing megabytes of source code, because deeply nested and/or unnecessary includes will bloat the amount of code the toolchain will need to go through. Let me give a simple example: We have the following headers: a.h: 1 MB. b.h: Includes a.h and is itself 100 kB in size. c.h: Includes a.h and is itself 100 kB in size. foo.c: Includes b.h and c.h, but c.h is not actually used. Is itself 300 kB in size. Now when compiling foo.c, a total of 2.5 MB of source code need to be parsed, because a.h is included twice. If the unused include for c.h is removed, then it will only be 1.4 MB in total. This multiplies quickly for more complex include chains. I'd like to point to this commit once more: https://github.com/haproxy/haproxy/commit/340ef2502eae2a37781e460d3590982c0e437fbd Removing two headers in a single file reduced the total compile by roughly 10%! Here's two more similar commits: https://github.com/haproxy/haproxy/commit/e5983ffb3adbf71a8f286094b1c1afce6061d1f3 https://github.com/haproxy/haproxy/commit/1db546eecd3982ffc1ea92c2f542a3b01ce43137 Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
I am in both camps. I think some slight cleanup might be in order, but not how you are proposing it. First of all: - I am against forward struct declarations. I think they decrease code readability and should be avoided. - I am against putting comments on #includes. Comments are noise in code and often go out of date. Especially in a situation like this where the comment is physically far away from the code that introduced it. To take the example: > #include "zend_portability.h" // for BEGIN_EXTERN_C What if in future the need for BEGIN_EXTERN_C disappears? Who is going to remember to update the comment? As you said yourself, this refactoring has no practical effect on compilation times. But your RFC states this as one of the advantages. This is misleading. Another advantage listed is "more correct code". Perhaps it is, but does it mean that the current situation is likely to cause more frequent PHP bugs? Would this refactoring help in finding PHP bugs more quicker? Others have raised a valid point that PHP bugs are fixed in earlier versions and then merged up into master. What you are proposing is going to lead to issues with merging. And I don't mean merge conflicts, because includes rarely change in bug fixes. I mean things like a bug fix using a symbol that hasn't been used in the file but is included through one of the more generic headers. Once such a commit is merged into master, it may turn out that the symbol lacks a new include. This adds unnecessary work for bug fixers who will now have to verify this and find the appropriate include during merges. So maybe reduce scope of this refactoring? Do it only where it's strictly necessary. Don't use forward declarations or comments. Or maybe have all ZEND headers included with a single header? Refactoring needs to make developers' life easier and not just be done for the sake of following best-practice.
Re: [PHP-DEV] RFC: rules for #include directives
Regards, On 18 January 2023 16:07:52 GMT, Max Kellermann wrote: >This argument puzzles me most. I've never heard anybody criticize >some piece of code for having too MANY code comments, too MUCH >explanation. Then I guess you've never heard people quoting Robert "Uncle Bob" Martin, who is well known for asserting that code should be self-documenting, and therefore not need comments, saying things like: > The proper use of comments is to compensate for our failure to express > ourself in code. Note that I used the word failure. I meant it. Comments are > always failures. Many arguments could be had over whether that's going too far, but "too many comments" can definitely be a bad thing. To take the example Dmitry gave: > #include "zend_portability.h" // for BEGIN_EXTERN_C What should I understand from this comment? - If I want some other symbol defined in zend_portability.h, do I need to adjust the includes? No, all the symbols are already imported. I could adjust the comment, but nothing will break if I don't. - If I remove the last use of BEGIN_EXTERN_C in this file, can I remove the include? No, because other symbols from it might be used, even if they weren't when the comment was written. So why does this comment feel necessary? - Perhaps the name of the header doesn't give a clear enough idea of what symbols it might contain, and what types of file should include it. If so, concentrate on better naming, or better documentation of existing conventions. - Perhaps it is a justification added when the include was first added. If so, put it in the commit message and PR summary. I'm not the right person to have opinions on the rest of this discussion, but I can certainly understand the argument against this particular aspect. -- Rowan Tommins [IMSoP]
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/18 16:41, Derick Rethans wrote: > Including a "random" zend header also sounds like a great benefit. I > shouldn't need to care as an extension author which header enables which > internal function(s) for usage. So, in your opinion, choosing an arbitrary of the 95 Zend/zend*.h headers should make *all* functions available? I don't agree with that. Having a documented header such as "zend_all.h" (or just the already-existing "php.h") which really makes everything available is certainly a good idea. But making all 95 headers behave that way doesn't sound like a good design. > This is a very naive view of the world. You make this personal. Let's not. > When looking at the commit history, I saw more than a dozen commits > doing a small change. That is clutter. https://github.com/php/php-src/blob/master/CONTRIBUTING.md "Do not commit multiple files and dump all messages in one commit. If you modified several unrelated files, commit each group separately and provide a nice commit message for each one." Each of my commits is self-contained and cleans up one internal library. Exactly the way the PHP commit rules ask me to do. I even took extra care so every commit is buildable, even though I had to reorder them carefully to make that possible. You may or may not value my effort, but it is in line with PHP's rules. > Adding forwards declarations for zend_* structs, is clutter. That is a reasonable opinion, but not mine. > Adding comments that go out of date as to why a header is included is > also clutter. This argument puzzles me most. I've never heard anybody criticize some piece of code for having too MANY code comments, too MUCH explanation. Every piece of documentation/explanation eventually goes out of date, but that's not a good reason to never write any. > > Dmitry Stogov said it's "just a useless permutation", but I can't > > understand this argument either. > > It creates code-churn, which makes merging up fixes from older branches > harder. I've been rebasing my include cleanup patches for several weeks and have yet to see the first conflict. That's because bug fixes merged from the stable branches rarely ever affect #include directives. And even if there is some day a merge conflict, fixing it is trivial. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Mon, 16 Jan 2023, Max Kellermann wrote: > On 2023/01/16 13:38, Kamil Tekiela wrote: > > > What's the impact on other maintainers, especially extension > > maintainers? > > Other maintainers may need to determine which includes they really > need, instead of relying on everything always already there by > including one random zend_* header. Including a "random" zend header also sounds like a great benefit. I shouldn't need to care as an extension author which header enables which internal function(s) for usage. > Extension maintainers may see build failures because, for example, > they forgot to include errno.h but used "errno". This previously > worked because all PHP headers included errno.h even though there was > no reason to. These build failures are bugs in those extensions, and > revealing them is a good thing, even though it seems tedious at first. This is a very naive view of the world. You don't get to decide what is "good for other people". > > Do you see any downsides to your new approach? > … > Derick Rethans wrote my idea "adds nothing but clutter", but I guess > he should explain what bothers him; I don't comprehend this, because > from my perspective, I intend to remove clutter. When looking at the commit history, I saw more than a dozen commits doing a small change. That is clutter. Adding forwards declarations for zend_* structs, is clutter. Adding comments that go out of date as to why a header is included is also clutter. > Dmitry Stogov said it's "just a useless permutation", but I can't > understand this argument either. It creates code-churn, which makes merging up fixes from older branches harder. cheers, Derick -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
Hi On 1/18/23 15:55, Max Kellermann wrote: Here's my RFC: https://wiki.php.net/rfc/include_cleanup I hope that's correct this way - then we're now at step 5 "Listen to the feedback, and try to answer/resolve all questions" - so please give feedback :-) Don't forget to add it to the appropriate section in the RFC overview: https://wiki.php.net/rfc Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/18 13:11, "Christoph M. Becker" wrote: > > I registered the account "maxk" on the Wiki - please grant me RFC > > karma! > > Done. Best of luck! Thanks. Here's my RFC: https://wiki.php.net/rfc/include_cleanup I hope that's correct this way - then we're now at step 5 "Listen to the feedback, and try to answer/resolve all questions" - so please give feedback :-) -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 18.01.2023 at 12:58, Max Kellermann wrote: > On 2023/01/18 12:51, "G. P. B." wrote: >> So I think creating an official RFC is the only way. > > Okay then, I'm now at step 2 of https://wiki.php.net/rfc/howto > > I registered the account "maxk" on the Wiki - please grant me RFC > karma! Done. Best of luck! -- Christoph M. Becker -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/18 12:51, "G. P. B." wrote: > So I think creating an official RFC is the only way. Okay then, I'm now at step 2 of https://wiki.php.net/rfc/howto I registered the account "maxk" on the Wiki - please grant me RFC karma! Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Wed, 18 Jan 2023 at 09:01, Max Kellermann wrote: > On 2023/01/16 13:48, "G. P. B." wrote: > > Moreover, having those sorts of changes be RFCs seems counterproductive > as > > the only people who care about this are actual core and extensions > > developers and this opens the gate for petty RFCs to resolve coding style > > disagreements. > > How shall we proceed from here? Shall I create an official RFC or > not? > > George said no, which I understand; but I don't know what else to do > to produce a decision. > > I asked Dmitry to post his GitHub arguments in this thread, so you see > both sides of the story, and you can discuss his arguments. (I > already replied to him on GitHub, see > https://github.com/php/php-src/pull/10345) > I still don't think the RFC process is a good vehicle for those sorts of decisions, but it's the only process we have and there is some clear disagreement that needs to get resolved here. So I think creating an official RFC is the only way. Best regards, George P. Banyard
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/16 13:48, "G. P. B." wrote: > Moreover, having those sorts of changes be RFCs seems counterproductive as > the only people who care about this are actual core and extensions > developers and this opens the gate for petty RFCs to resolve coding style > disagreements. How shall we proceed from here? Shall I create an official RFC or not? George said no, which I understand; but I don't know what else to do to produce a decision. I asked Dmitry to post his GitHub arguments in this thread, so you see both sides of the story, and you can discuss his arguments. (I already replied to him on GitHub, see https://github.com/php/php-src/pull/10345) Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
Here are my comments about the last PR https://github.com/php/php-src/pull/10345 Max asked to repost them in this thread: In my opinion this should not be accepted. Despite the intention is good, this huge patch doesn't improve anything but adds new questions. - It introduces new include files (e.g. zend_char.h, what was wrong with zend_portability.h?) - It adds a lot of forward declarations (e.g. typedef struct _ ) - It adds almost useless comments that are not goigg to be kept as to date (e.g. #include "zend_portability.h" // for BEGIN_EXTERN_C, zend_portability.h defines all base macros for Zend engine) - zend.c adds a number of commented includes (is this some kind of TODO?) - > is included in many other *.h files instead of single zend_portability.h or zend_types.h. I see all these changes as a source permutation that don't add significant value. On the other hand: - this may introduce PHP build failures on some untested configurations (like it was with DTRACE) - we have to support several PHP branches at once and merging fixes between very different branches is going to introduce problems - this changes may break compilation of third-party software (e.g. xdebug). Not a huge problem, but it's not great to do this in a minor version update. On Mon, Jan 16, 2023 at 8:22 PM Levi Morrison via internals < internals@lists.php.net> wrote: > As commented yesterday on one of the PRs, I strongly agree with the > cleanups. Our code has implicit dependencies and if you reorder the > includes in the same file, you can break builds. That shouldn't ever > happen. > > I also agree that dtrace breaking is a failure of CI and testing, not > necessarily the patch (I haven't reviewed that PR carefully, maybe > there was some carelessness but if it's not tested, it's hard to blame > anyone). > > Lastly, I do not care for the comments which explain why a header is > included. My experience is that these comments bitrot quite quickly. > > Overall, I hope we can resume this effort. > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > >
Re: [PHP-DEV] RFC: rules for #include directives
As commented yesterday on one of the PRs, I strongly agree with the cleanups. Our code has implicit dependencies and if you reorder the includes in the same file, you can break builds. That shouldn't ever happen. I also agree that dtrace breaking is a failure of CI and testing, not necessarily the patch (I haven't reviewed that PR carefully, maybe there was some carelessness but if it's not tested, it's hard to blame anyone). Lastly, I do not care for the comments which explain why a header is included. My experience is that these comments bitrot quite quickly. Overall, I hope we can resume this effort. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/16 17:44, Tim Düsterhus wrote: > Sorry for the duplicate mail, but it just came to my mind checking the CI > build logs from before and after the revert: I wish that were true numbers, but they're not - that's just (extreme) noise. Later CI builds are back to shorter build times. I guess CI builds in the cloud aren't very useful for measuring build times. My measurements with "perf stat" on a (bare-metal) build machine that was idle did not produce a meaningful difference. The noise was smaller, but still too much to see a real difference. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
Hi On 1/16/23 17:36, Tim Düsterhus wrote: From my experience contributing to another C project (HAProxy), cleaning up the the includes can have a measurable impact on build times. See this commit for example: Sorry for the duplicate mail, but it just came to my mind checking the CI build logs from before and after the revert: - Before (x64 Release ZTS) https://github.com/php/php-src/actions/runs/3924250312/jobs/6708385711 After: https://github.com/php/php-src/actions/runs/3929927482/jobs/6719316931 9:26 minutes -> 9:52 minutes Before (x64 Debug NTS): https://github.com/php/php-src/actions/runs/3924250312/jobs/6708385677 After: https://github.com/php/php-src/actions/runs/3929927482/jobs/6719316800 4:14 minutes -> 5:11 minutes - So with the changes that already happened, Max managed to shave off 30 seconds (5%) for the x64 Release ZTS build and 1 minute (19%) for the x64 Debug NTS build. This difference would likely become even larger, as to my understanding the cleanup wasn't complete yet. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/16 17:25, Kamil Tekiela wrote: > Have you done any benchmarking in terms of build time? Is there any > tangible difference or is it only theoretical? I did, but in this early stage, there is no measurable speedup yet, because there are still too many "fat" headers left that need to be included everywhere, which in turn include too much. The many small improvements I made so far drown in that noise. Once I have disentangled and splitted those fat headers (e.g. move zend_result to a separate header to avoid including zend_types.h everywhere), the speedup will be measurable and perceptible. It's too early to prove it with numbers. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
Hi On 1/16/23 17:25, Kamil Tekiela wrote: Have you done any benchmarking in terms of build time? Is there any tangible difference or is it only theoretical? From my experience contributing to another C project (HAProxy), cleaning up the the includes can have a measurable impact on build times. See this commit for example: https://github.com/haproxy/haproxy/commit/340ef2502eae2a37781e460d3590982c0e437fbd They go even as far as regularly reordering object files by build time to optimally keep multi-core machines busy to shave off the last few milliseconds: https://github.com/haproxy/haproxy/commit/d2ff5dc3ebba1163749e2d874cce5892570f540a Even for incremental build having a hypothetical 100ms decrease in build time can make the difference between "this is fast enough to wait" and "this is so slow that my mind wanders around while waiting to build". And in CI which doesn't do incremental builds we won't needlessly burn CPU time that is better spent elsewhere. Thus I'm in favor of this cleanup. Best regards Tim Düsterhus -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
Have you done any benchmarking in terms of build time? Is there any tangible difference or is it only theoretical?
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/16 17:14, Chase Peeler wrote: > will that have any impact on the final product such as reduced binary sizes > or better performance? No, this kind of code cleanup must not affect the resulting binary at all, therefore no change to runtime behavior/performance. Though the build will be faster, because the C compiler has fewer includes (= less source code) to process in each compilation unit. -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Mon, Jan 16, 2023 at 7:54 AM Max Kellermann wrote: > On 2023/01/16 13:38, Kamil Tekiela wrote: > > Did you create an RFC already? Or is this RFC-like discussion? > > No. I followed https://wiki.php.net/rfc/howto and this is step 1. > Creating the actual RFC is step 3. > > (I'm new to PHP and this process - so far, I've only contributed PHP > code through GitHub, and I was surprised by today's sudden resistance > against my code cleanup by two PHP developers, after the others > welcomed by changes.) > > > In either case, if you are asking community to come to a decision, we > need > > to see some background. > > > Why do you want to change this? What's the benefit? > > For cleaner code, more readable code (less obscurity in huge amounts > of undocumented #includes), more correct code, less fragile code, > reduced compile times. > > > What's the impact on other maintainers, especially extension > > maintainers? > > Other maintainers may need to determine which includes they really > need, instead of relying on everything always already there by > including one random zend_* header. > > Extension maintainers may see build failures because, for example, > they forgot to include errno.h but used "errno". This previously > worked because all PHP headers included errno.h even though there was > no reason to. These build failures are bugs in those extensions, and > revealing them is a good thing, even though it seems tedious at first. > > > Do you see any downsides to your new approach? > > Like any code cleanup, this can result in regressions for parts of PHP > that are not covered by a test and not built with the CI, like the > DTrace regression yesterday. > > And code cleanups can easily reveal existing bugs, which is a good > thing, but those bugs can be hidden at first - e.g. failure to > explicitly include "php_config.h" will (silently) disable features and > break things. That may cause some temporary pain, but in the end will > result in better code, though some will believe that it isn't a worthy > goal or not worth the temporary pain. > > Derick Rethans wrote my idea "adds nothing but clutter", but I guess > he should explain what bothers him; I don't comprehend this, because > from my perspective, I intend to remove clutter. > > Dmitry Stogov said it's "just a useless permutation", but I can't > understand this argument either. > > Max > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > > For those of us that don't know the finer details of the build process, will that have any impact on the final product such as reduced binary sizes or better performance? I'm not saying that code cleanup isn't a good enough reason to do this on its own, just curious if there are other advantages beyond that. -- Chase Peeler chasepee...@gmail.com
Re: [PHP-DEV] RFC: rules for #include directives
On 2023/01/16 13:38, Kamil Tekiela wrote: > Did you create an RFC already? Or is this RFC-like discussion? No. I followed https://wiki.php.net/rfc/howto and this is step 1. Creating the actual RFC is step 3. (I'm new to PHP and this process - so far, I've only contributed PHP code through GitHub, and I was surprised by today's sudden resistance against my code cleanup by two PHP developers, after the others welcomed by changes.) > In either case, if you are asking community to come to a decision, we need > to see some background. > Why do you want to change this? What's the benefit? For cleaner code, more readable code (less obscurity in huge amounts of undocumented #includes), more correct code, less fragile code, reduced compile times. > What's the impact on other maintainers, especially extension > maintainers? Other maintainers may need to determine which includes they really need, instead of relying on everything always already there by including one random zend_* header. Extension maintainers may see build failures because, for example, they forgot to include errno.h but used "errno". This previously worked because all PHP headers included errno.h even though there was no reason to. These build failures are bugs in those extensions, and revealing them is a good thing, even though it seems tedious at first. > Do you see any downsides to your new approach? Like any code cleanup, this can result in regressions for parts of PHP that are not covered by a test and not built with the CI, like the DTrace regression yesterday. And code cleanups can easily reveal existing bugs, which is a good thing, but those bugs can be hidden at first - e.g. failure to explicitly include "php_config.h" will (silently) disable features and break things. That may cause some temporary pain, but in the end will result in better code, though some will believe that it isn't a worthy goal or not worth the temporary pain. Derick Rethans wrote my idea "adds nothing but clutter", but I guess he should explain what bothers him; I don't comprehend this, because from my perspective, I intend to remove clutter. Dmitry Stogov said it's "just a useless permutation", but I can't understand this argument either. Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php
Re: [PHP-DEV] RFC: rules for #include directives
On Mon, 16 Jan 2023 at 12:03, Max Kellermann wrote: > Hi, > > in the past weeks, I submitted four PRs for cleaning up the #includes > in the PHP code base: > > https://github.com/php/php-src/pull/10216 > https://github.com/php/php-src/pull/10220 > https://github.com/php/php-src/pull/10279 > https://github.com/php/php-src/pull/10300 > > I saw that the existing #include directives were inconsistent, > incomplete and bloated; things just worked by chance, not by design, > because there were a few headers which just included everything. I > wanted to help clean up this mess that had accumulated over two > decades. > > All PRs were welcomed by different reviewers and were merged; there > was just one minor criticism by Dmitry Stogov who thought the code > comments explaining many non-obvious #includes should be removed: > > https://github.com/php/php-src/pull/10216#issuecomment-1375140255 > > > I don't think we should include the comments like // for > > BEGIN_EXTERN_C (and similar). The are good for review only. I'm > > indifferent to these changes and don't object. > > Yesterday, when a DTrace-specific regression was reported > (https://github.com/php/php-src/pull/10220#issuecomment-1383035139), > after which Dmitry asked to revert the whole PR: > > https://github.com/php/php-src/pull/10220#issuecomment-1383658247 > > Instead, I submitted a trivial fix for the regression > (https://github.com/php/php-src/pull/10334), which was rejected by > Dmitry > (https://github.com/php/php-src/pull/10220#issuecomment-1383706602) > but confirmed by the original reporter > (https://github.com/php/php-src/pull/10220#issuecomment-1383802334). > > In spite of that, Dmitry demanded to revert all of my #include > cleanups > (https://github.com/php/php-src/pull/10220#issuecomment-1383739816): > > > I'm asking to revert all these include cleanup commits! This is > > just a useless permutation. e.g. 68ada76 adds typedef struct _* that > > we didn't need before. How is this clearly? > > ... which so far only Derick Rethans agreed to: > > https://github.com/php/php-src/pull/10220#issuecomment-1383784480 > > > FWIW, I agree with Dmitry here, and these should all be reverted. It > > adds nothing but clutter. > > Christoph M. Becker performed the revert and suggested doing an RFC > (https://github.com/php/php-src/pull/10220#issuecomment-1383789100) > and vote on it. > > So this is a first draft of my proposal which I'd like to discuss with > you: > > https://github.com/php/php-src/pull/10338 > > Max > As the person who has signed off on the PRs and merged them, I am in favour of these changes and think the revert is a total over reaction. Before merging the first set, it was asked ahead of time if those changes were OK, and multiple people were in favour of these changes. The fact the DTrace build was broken inadvertently and only found out today is a failure of our CI Build matrix and myself for not knowing this option nor compiling with it. Not of the changes. Moreover, having those sorts of changes be RFCs seems counterproductive as the only people who care about this are actual core and extensions developers and this opens the gate for petty RFCs to resolve coding style disagreements. Finally, some of the wording used in those PR replies are far from civil and can push away new contributors to the language. Best regards, George P. Banyard
Re: [PHP-DEV] RFC: rules for #include directives
Hi Max, Thanks for describing the situation. We do not often vote on implementation changes that don't affect PHP language itself. At first, I even considered these changes as "I don't care". However, they are turned up into huge patches that make troubles for PHP maintenance. As we didn't come to an agreement, the best decision should be done by the PHP community. In the current state of the implementation I'm personally against this, but I might change my opinion if this is targeted to PHP-9 and the implementation is done in a some better way. Thanks. Dmitry. On Mon, Jan 16, 2023 at 3:03 PM Max Kellermann wrote: > Hi, > > in the past weeks, I submitted four PRs for cleaning up the #includes > in the PHP code base: > > https://github.com/php/php-src/pull/10216 > https://github.com/php/php-src/pull/10220 > https://github.com/php/php-src/pull/10279 > https://github.com/php/php-src/pull/10300 > > I saw that the existing #include directives were inconsistent, > incomplete and bloated; things just worked by chance, not by design, > because there were a few headers which just included everything. I > wanted to help clean up this mess that had accumulated over two > decades. > > All PRs were welcomed by different reviewers and were merged; there > was just one minor criticism by Dmitry Stogov who thought the code > comments explaining many non-obvious #includes should be removed: > > https://github.com/php/php-src/pull/10216#issuecomment-1375140255 > > > I don't think we should include the comments like // for > > BEGIN_EXTERN_C (and similar). The are good for review only. I'm > > indifferent to these changes and don't object. > > Yesterday, when a DTrace-specific regression was reported > (https://github.com/php/php-src/pull/10220#issuecomment-1383035139), > after which Dmitry asked to revert the whole PR: > > https://github.com/php/php-src/pull/10220#issuecomment-1383658247 > > Instead, I submitted a trivial fix for the regression > (https://github.com/php/php-src/pull/10334), which was rejected by > Dmitry > (https://github.com/php/php-src/pull/10220#issuecomment-1383706602) > but confirmed by the original reporter > (https://github.com/php/php-src/pull/10220#issuecomment-1383802334). > > In spite of that, Dmitry demanded to revert all of my #include > cleanups > (https://github.com/php/php-src/pull/10220#issuecomment-1383739816): > > > I'm asking to revert all these include cleanup commits! This is > > just a useless permutation. e.g. 68ada76 adds typedef struct _* that > > we didn't need before. How is this clearly? > > ... which so far only Derick Rethans agreed to: > > https://github.com/php/php-src/pull/10220#issuecomment-1383784480 > > > FWIW, I agree with Dmitry here, and these should all be reverted. It > > adds nothing but clutter. > > Christoph M. Becker performed the revert and suggested doing an RFC > (https://github.com/php/php-src/pull/10220#issuecomment-1383789100) > and vote on it. > > So this is a first draft of my proposal which I'd like to discuss with > you: > > https://github.com/php/php-src/pull/10338 > > Max > > -- > PHP Internals - PHP Runtime Development Mailing List > To unsubscribe, visit: https://www.php.net/unsub.php > >
Re: [PHP-DEV] RFC: rules for #include directives
Hi, Did you create an RFC already? Or is this RFC-like discussion? In either case, if you are asking community to come to a decision, we need to see some background. Why do you want to change this? What's the benefit? What's the impact on other maintainers, especially extension maintainers? Do you see any downsides to your new approach? Regards, Kamil
[PHP-DEV] RFC: rules for #include directives
Hi, in the past weeks, I submitted four PRs for cleaning up the #includes in the PHP code base: https://github.com/php/php-src/pull/10216 https://github.com/php/php-src/pull/10220 https://github.com/php/php-src/pull/10279 https://github.com/php/php-src/pull/10300 I saw that the existing #include directives were inconsistent, incomplete and bloated; things just worked by chance, not by design, because there were a few headers which just included everything. I wanted to help clean up this mess that had accumulated over two decades. All PRs were welcomed by different reviewers and were merged; there was just one minor criticism by Dmitry Stogov who thought the code comments explaining many non-obvious #includes should be removed: https://github.com/php/php-src/pull/10216#issuecomment-1375140255 > I don't think we should include the comments like // for > BEGIN_EXTERN_C (and similar). The are good for review only. I'm > indifferent to these changes and don't object. Yesterday, when a DTrace-specific regression was reported (https://github.com/php/php-src/pull/10220#issuecomment-1383035139), after which Dmitry asked to revert the whole PR: https://github.com/php/php-src/pull/10220#issuecomment-1383658247 Instead, I submitted a trivial fix for the regression (https://github.com/php/php-src/pull/10334), which was rejected by Dmitry (https://github.com/php/php-src/pull/10220#issuecomment-1383706602) but confirmed by the original reporter (https://github.com/php/php-src/pull/10220#issuecomment-1383802334). In spite of that, Dmitry demanded to revert all of my #include cleanups (https://github.com/php/php-src/pull/10220#issuecomment-1383739816): > I'm asking to revert all these include cleanup commits! This is > just a useless permutation. e.g. 68ada76 adds typedef struct _* that > we didn't need before. How is this clearly? ... which so far only Derick Rethans agreed to: https://github.com/php/php-src/pull/10220#issuecomment-1383784480 > FWIW, I agree with Dmitry here, and these should all be reverted. It > adds nothing but clutter. Christoph M. Becker performed the revert and suggested doing an RFC (https://github.com/php/php-src/pull/10220#issuecomment-1383789100) and vote on it. So this is a first draft of my proposal which I'd like to discuss with you: https://github.com/php/php-src/pull/10338 Max -- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: https://www.php.net/unsub.php