Re: [Bug 53555] Scoreboard full error with event/ssl

Stefan Fritsch Mon, 05 Sep 2016 15:03:06 -0700

On Tuesday, 14 June 2016 17:31:50 CEST Eric Covener wrote:
> On Wed, Apr 13, 2016 at 6:27 PM, Stefan Fritsch <[email protected]> wrote:
> > Maybe it would be better to remove the logic to re-use scoreboard
> > slots of processes which have already terminated some threads.
> > Instead, one could use the scoreboard area above (MaxRequestWorkers /
> > ThreadsPerChild) more aggressively, and maybe even allocate some
> > additional slack space at startup.
> 
> Been helping a user affected by this. They got here after tuning
> around some OOM issues and are stuck with long ProxyTimeout values
> (a good recipe for this problem).
> 
> Without understanding this very well, using the slack space (exclusively)
> does seem like a good idea to me.
> 
> > There could be these config knobs:
> > - the max number of fully active processes (ServerLimit)
> 
> We could also leave this as MaxRequestWorkers / ThreadsPerChild.
> The slack would be defined by the difference from ServerLimit.  This
> is also handy because it means ServerLimit really is the max
> active+exiting.


I have implemented this now. The patch is attached to the PR:
https://bz.apache.org/bugzilla/attachment.cgi?id=34201&action=diff

In addition, no more processes are told to gracefully stop due to
MaxSpareThreads if there are more running processes than MaxRequestWorkers /
ThreadsPerChild. Such a surplus implies that some processes are already
finishing gracefully but have not exited yet.
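
To illustrate the arithmetic, here is a rough sketch of the two rules described
above. It is not code from the patch: the function names and the way the
counters are passed in are made up for this mail, and the real MPM keeps this
state elsewhere. It only assumes that slack is ServerLimit minus
MaxRequestWorkers / ThreadsPerChild, and that idle cleanup is skipped while
more processes are running than that quotient.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers for illustration only; none of these names
 * come from the attached patch or from the httpd sources. */

/* Max number of fully active processes:
 * MaxRequestWorkers / ThreadsPerChild. */
static int active_process_limit(int max_request_workers,
                                int threads_per_child)
{
    assert(threads_per_child > 0);
    return max_request_workers / threads_per_child;
}

/* Scoreboard slots left over for gracefully finishing processes:
 * the difference between ServerLimit and the active limit. */
static int slack_slots(int server_limit, int max_request_workers,
                       int threads_per_child)
{
    int slack = server_limit
        - active_process_limit(max_request_workers, threads_per_child);
    return slack > 0 ? slack : 0;
}

/* Decide whether idle cleanup may tell one more process to stop.
 * If more processes are running than the active limit, some of them
 * must already be finishing gracefully, so no further process is
 * stopped because of MaxSpareThreads. */
static bool may_stop_for_max_spare(int running_processes,
                                   int idle_threads,
                                   int max_spare_threads,
                                   int max_request_workers,
                                   int threads_per_child)
{
    if (running_processes
        > active_process_limit(max_request_workers, threads_per_child)) {
        return false;
    }
    return idle_threads > max_spare_threads;
}
```

With MaxRequestWorkers 400, ThreadsPerChild 25 and ServerLimit 20, this gives
16 fully active processes and 4 slack slots. Idle cleanup would be suppressed
as soon as a 17th process is running.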

This condition can be refined, but the other suggestions below have not been 
implemented yet.

I don't think I will have more time to work on this in the next 3 weeks at 
least. But if anyone could take a peek at the patch and comment and/or test 
it, that would be nice.

Cheers,
Stefan

> > - the max number of gracefully finishing processes ("OldServerLimit"?)
> 
> Maybe better to count threads here after your other changes?
> 
> > - what to do if OldServerLimit is reached during a graceful restart:
> >   * reduce the max number of fully active processes accordingly
> >   * kill off old processes
> 
> Maybe OldServerLimit could be the # of exiting processes (or threads)
> that are not counted against the limits, rather than something that is
> binary (either breached or not)
> 
> GracefulShutdownTimeout could not be directly applied, but maybe we
> could have something similar applied only to get us back below
> OldServerLimit.
> 
> > - what to do if OldServerLimit is reached during normal operation:
> >   * stop doing idle-cleanup of fully active processes [X]
> >   * kill off old processes
> > 
> > Hmm. Now that I think of it, I think we should really do [X] if there
> > are too many old processes. The only question is how to define too
> > many.
> 
> I think OldServerLimit as more of a soft limit for countermeasures
> like this helps here too. The slack space itself is then the hard
> limit.
