On Fri, Dec 3, 2021 at 6:41 PM Eric Covener <cove...@gmail.com> wrote:
>
> On Fri, Dec 3, 2021 at 11:23 AM Ruediger Pluem <rpl...@apache.org> wrote:
> >
> > On 12/3/21 2:25 PM, yla...@apache.org wrote:
> > > Author: ylavic
> > > Date: Fri Dec  3 13:25:51 2021
> > > New Revision: 1895553
> > >
> > > URL: http://svn.apache.org/viewvc?rev=1895553&view=rev
> > > Log:
> > > mpm_event: Follow up to r1894285: new MaxSpareThreads heuristics.
> > >
> > > When at MaxSpareThreads, instead of deferring the stop if we are close to
> > > active/server limit let's wait for the pending exits to complete.
> > >
> > > This way we always and accurately account for slow-to-exit processes to
> > > avoid filling up the scoreboard, whether at the limits or not.
> >
> > Just as a comment in case users report it: this can slow down process
> > reduction even when far away from the limits.
> >
> > Previously, each call to perform_idle_server_maintenance killed off one
> > process if there was one to kill from the spare-threads point of view.
> > Now it can take more calls, because the process killed by the previous
> > call to perform_idle_server_maintenance might not have died by the time
> > we return to perform_idle_server_maintenance, preventing us from killing
> > another one. Hence we won't have multiple processes dying in parallel
> > when we want to reduce the process count due to too many spare threads.
> > This can cause situations where, if we kill a slow-dying process first,
> > completely idle processes float around for quite some time.

Since connections are more or less evenly distributed across all the
processes, and each process handles all types of connections (w.r.t.
lifetime/timeout, hence slowness to exit), there is probably never a
single slow-to-exit process but rather all of them or none.
So if a process is slow to exit, I don't think we gain anything by
killing more of them quickly (unless we have room for that, see below):
we'd still have completely idle processes (the dying ones) for the same
amount of time, but they couldn't absorb a potential load increase
happening soon, whereas waiting for the dying processes before killing
more allows that.
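
In pseudo-code, the change amounts to something like this (a heavily
simplified sketch, not the literal event.c diff; names as in mpm_event):

    /* At MaxSpareThreads time in perform_idle_server_maintenance()
     * (simplified): after r1895553 we kill off one more child only
     * when no previously killed child is still exiting. */
    int pending_exits = retained->total_daemons - retained->active_daemons;
    if (pending_exits == 0) {
        /* Kill off one child, gracefully */
        ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
                           AP_MPM_PODX_GRACEFUL);
    }
    else {
        /* A child is still exiting; wait for it before killing more,
         * so slow-to-exit children can't pile up in the scoreboard. */
    }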

Killing them one at a time is possibly a bit too drastic though; what
would be a reasonable maximum number of dying processes?

>
> Could we base it on max_daemons_limit instead? In the current impl we
> might still have ample slack space in the SB.

IIUC the issue (by design) with max_daemons_limit is that it accounts
for the large holes left on graceful restart (when the old generation
stops), and those won't fill up until MaxSpareThreads (or
MaxRequestsPerChild) kicks in. So after a graceful restart it may not be
the appropriate metric for "how much room do we have?" at
MaxSpareThreads time.

How about (modulo brain fart):
    /* Allow killing off another child only if the scoreboard has room
     * for N more full generations of active children (i.e. N graceful
     * restarts), or if no previously killed child is still exiting. */
    const int N = 1; /* or 2, 3, 4.. */
    int avail_daemons = server_limit - retained->total_daemons;
    int have_room_for_N_restarts = (avail_daemons / N >= active_daemons_limit);
    int inactive_daemons = retained->total_daemons - retained->active_daemons;
    int do_kill = (have_room_for_N_restarts || inactive_daemons == 0);
    if (do_kill) {
        /* Kill off one child, gracefully */
        ap_mpm_podx_signal(retained->buckets[child_bucket].pod,
                           AP_MPM_PODX_GRACEFUL);
    }
    else {
        /* Wait for the inactive (exiting) daemons to settle down */
    }
?
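
For example (made-up numbers): with server_limit = 16,
active_daemons_limit = 5 and N = 1, at total_daemons = 8 we get
avail_daemons = 8 >= 5, so do_kill is set even while previously killed
children are still exiting; at total_daemons = 12 we get
avail_daemons = 4 < 5, so another child is killed only once
inactive_daemons is back to 0.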


Regards;
Yann.
