On Tuesday, 14 June 2016 17:31:50 CEST Eric Covener wrote:
> On Wed, Apr 13, 2016 at 6:27 PM, Stefan Fritsch <[email protected]> wrote:
> > Maybe it would be better to remove the logic to re-use scoreboard
> > slots of processes which have already terminated some threads.
> > Instead, one could use the scoreboard area above (MaxRequestWorkers /
> > ThreadsPerChild) more aggressively, and maybe even allocate some
> > additional slack space at startup.
>
> Been helping a user affected by this, they got here after tuning
> around some OOM issues
> and are stuck with long ProxyTimeout values (a good recipe for this problem)
>
> Without understanding this very well, using the slack space (exclusively)
> does seem like a good idea to me.
>
> > There could be these config knobs:
> > - the max number of fully active processes (ServerLimit)
>
> We could also leave this as MaxRequestWorkers / ThreadsPerChild.
> The slack would be defined by the difference from ServerLimit. This
> is also handy because it means ServerLimit really is the max
> active+exiting.

I have implemented this now. The patch is attached to the PR:

https://bz.apache.org/bugzilla/attachment.cgi?id=34201&action=diff

In addition, no more processes are told to gracefully stop due to
MaxSpareThreads if there are more running processes than
MaxRequestWorkers / ThreadsPerChild. Such a surplus implies that some
processes are already finishing gracefully but have not finished yet.
This condition can be refined, but the other suggestions below have not
been implemented yet.

I don't think I will have more time to work on this in the next 3 weeks
at least. But if anyone could take a peek at the patch and comment
and/or test it, that would be nice.

Cheers,
Stefan

> > - the max number of gracefully finishing processes ("OldServerLimit"?)
>
> Maybe better to count threads here after your other changes?
>
> > - what to do if OldServerLimit is reached during a graceful restart:
> >   * reduce the max number of fully active processes accordingly
> >   * kill off old processes
>
> Maybe OldServerLimit could be the # of exiting processes (or threads)
> that are not counted against the limits, rather than something that is
> binary (either breached or not)
>
> GracefulShutdownTimeout could not be directly applied, but maybe we
> could have something similar applied only to get us back below
> OldServerLimit.
>
> > - what to do if OldServerLimit is reached during normal operation:
> >   * stop doing idle-cleanup of fully active processes [X]
> >   * kill off old processes
> >
> > Hmm. Now that I think of it, I think we should really do [X] if there
> > are too many old processes. The only question is how to define too
> > many.
>
> I think OldServerLimit as more of a soft limit for countermeasures
> like this helps here too. The slack space itself is then the hard
> limit.
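
For anyone who wants to see the slot arithmetic discussed above with
concrete numbers, here is a rough sketch. It is not taken from the patch:
the class, the helper name may_stop_idle_process, and the example values
are all made up for illustration. It only models the two rules from this
thread: the slack is ServerLimit minus (MaxRequestWorkers /
ThreadsPerChild), and idle cleanup due to MaxSpareThreads is skipped while
more processes are running than that quotient.

```python
from dataclasses import dataclass


@dataclass
class MpmConfig:
    """Illustrative stand-in for the three directives; not httpd code."""
    server_limit: int         # ServerLimit: total process slots (active + exiting)
    max_request_workers: int  # MaxRequestWorkers
    threads_per_child: int    # ThreadsPerChild

    @property
    def active_limit(self) -> int:
        # Max number of fully active processes.
        return self.max_request_workers // self.threads_per_child

    @property
    def slack(self) -> int:
        # Slots left over for processes that are still finishing gracefully.
        return self.server_limit - self.active_limit


def may_stop_idle_process(cfg: MpmConfig, running_processes: int) -> bool:
    """Hypothetical helper: permit MaxSpareThreads cleanup only when the
    number of running processes does not exceed the active limit. A surplus
    means some old processes are still exiting."""
    return running_processes <= cfg.active_limit


# Example values are assumptions chosen to make the numbers easy to follow.
cfg = MpmConfig(server_limit=24, max_request_workers=400, threads_per_child=25)
print(cfg.active_limit, cfg.slack)     # 16 active slots, 8 slack slots
print(may_stop_idle_process(cfg, 16))  # True: nothing is lingering
print(may_stop_idle_process(cfg, 18))  # False: 2 processes still exiting
```

With these values, ServerLimit really is the maximum of active plus
exiting processes, as Eric suggested, and the slack of 8 slots acts as the
hard limit for gracefully finishing processes.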
