On Tue, Jun 29, 2021 at 3:00 PM Rainer Jung <rainer.j...@kippdata.de> wrote:
>
> Am 29.06.2021 um 14:31 schrieb Stefan Eissing:
> > Can't really comment on the diff, but totally agree on the goal to
> > minimize the unresponsive time and make graceful less disruptive.
> >
> > So +1 for that.
>
> +1 on the intention as well.
Checked in trunk (r1892587 + r1892595).

> Not sure whether that means people would need more headroom in the
> scoreboard (which would probably warrant a sentence in CHANGES or docs
> about that) or whether it just means the duration during which that
> headroom is used changes (which I wouldn't care about).

The restart delay between stop and start is now minimal (no reload in
between), but the headroom needed does not change AIUI. We still have the
situation where connections (worker threads) are active for both the new
and old generations of children processes, and its duration depends mainly
on the actual lifetime of the connections. So the current tunings still
hold, I think.

What changes now is that, for both graceful and ungraceful restarts, the
main process fully consumes one CPU (to reload) while the children are
actively running (the old generation keeps accepting/processing connections
during the reload), whereas before the children were tearing down, which
eased the CPUs (but filled the socket backlogs, potentially until
exhaustion..). So there might be a greater overall load spike on reload
than before.

A note on the headroom while at it: mpm_event possibly consumes fewer
children (hence scoreboard slots) on restart, because when a child is dying
it stops (and thus no longer accounts for) the worker threads above the
number of remaining connections, which lets children of the new generation
be created accurately as the load requires. mpm_worker never stops threads
(this improvement never made it there AFAICT), so by accounting for
inactive threads as active it will end up creating more children of the new
generation as connections arrive (eventually reaching the limits earlier,
or blocking/waiting for worker threads when the new generation of children
is overflowed by incoming connections which the main process thinks are
evenly distributed across all the children, including the old
generation's). I don't know how hard or worthwhile it would be to align
mpm_worker with mpm_event on this, just a note..

Cheers;
Yann.
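
P.S. for anyone wondering what the "headroom" tuning above looks like in
practice, a purely illustrative mpm_event sizing might be something like
the following (the numbers are made up, not recommendations):

    # Illustrative mpm_event sizing only; adjust to your workload.
    ServerLimit         24     # process slots in the scoreboard
    ThreadLimit         64     # thread slots per process in the scoreboard
    ThreadsPerChild     64
    MaxRequestWorkers   1024   # 1024 / 64 = 16 children at full load, leaving
                               # 24 - 16 = 8 spare process slots as headroom
                               # for old-generation children that are still
                               # finishing connections across a (graceful)
                               # restart

The spare process slots beyond what MaxRequestWorkers / ThreadsPerChild
requires are what the old generation can keep occupying while the new
generation starts up.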