https://bz.apache.org/bugzilla/show_bug.cgi?id=58243
Bug ID: 58243
Summary: rotatelogs goes infinite at startup
Product: Apache httpd-2
Version: 2.4.12
Hardware: PC
Status: NEW
Severity: normal
Priority: P2
Component: support
Assignee: [email protected]
Reporter: [email protected]
Created attachment 33000
--> https://bz.apache.org/bugzilla/attachment.cgi?id=33000&action=edit
rotatelogs cpu spike process tree
originally reported at ApacheLounge:
http://www.apachelounge.com/viewtopic.php?t=6707
when using the -p parameter, e.g.:
    ErrorLog "|bin/rotatelogs.exe -p maint/MaintainLogs.bat -l logs/error.%Y%m%d.log 86400"
    CustomLog "|bin/rotatelogs.exe -p maint/MaintainLogs.bat -l logs/access.%Y%m%d.log 86400" access
one instance of rotatelogs.exe will spike the CPU, caught in an infinite loop.
after whittling things down, I traced the cause of my issue to
rotatelogs.c::post_rotate, here:
    /* Collect any zombies from a previous run, but don't wait. */
    while (apr_proc_wait_all_procs(&proc, NULL, NULL, APR_NOWAIT, pool) == APR_CHILD_DONE)
        /* noop */;
looking at the implementation of said function, we see:
    if (waithow != APR_WAIT) {
        if (nChilds && nChilds == nActive) {
            /* All child processes are running */
            rv = APR_CHILD_NOTDONE;
            proc->pid = -1;
        }
        else {
            /* proc->pid contains the pid of the
             * exited processes
             */
            rv = APR_CHILD_DONE;
        }
    }
    if (nActive == 0) {
        rv = APR_CHILD_DONE;
        proc->pid = -1;
    }
    return rv;
I would expect nActive == 0 on an initial startup, in which case the call
returns APR_CHILD_DONE every time and the while loop never exits. So I am
confused why we're checking == APR_CHILD_DONE instead of != APR_CHILD_DONE,
i.e. loop again if children exist.
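to check that reasoning without building apr, here is a minimal stand-alone model of just the return logic quoted above. It is a sketch under assumptions, not the real APR code: sim_status, sim_wait_all_procs, spins_original and spins_guarded are hypothetical names, the child counts are held fixed, and the iteration cap exists only so the model terminates. The guarded variant (also stop when pid comes back as -1) is one possible loop shape for illustration, not a confirmed fix.

```c
#include <assert.h>

/* Hypothetical stand-ins for APR_CHILD_DONE / APR_CHILD_NOTDONE. */
typedef enum { SIM_CHILD_DONE, SIM_CHILD_NOTDONE } sim_status;

/* Models only the quoted return logic for the waithow != APR_WAIT path.
 * nChilds and nActive are passed in instead of being discovered. */
static sim_status sim_wait_all_procs(int nChilds, int nActive, int *pid)
{
    sim_status rv;

    if (nChilds && nChilds == nActive) {
        /* All child processes are running */
        rv = SIM_CHILD_NOTDONE;
        *pid = -1;
    }
    else {
        /* Placeholder pid standing in for "some exited process". */
        *pid = 1;
        rv = SIM_CHILD_DONE;
    }
    if (nActive == 0) {
        rv = SIM_CHILD_DONE;
        *pid = -1;
    }
    return rv;
}

/* Loop shape from post_rotate: repeat while the call reports DONE.
 * Returns how many times the loop body ran before stopping or
 * hitting the cap. */
static int spins_original(int nChilds, int nActive, int cap)
{
    int pid = 0, n = 0;

    assert(cap > 0);
    while (n < cap
           && sim_wait_all_procs(nChilds, nActive, &pid) == SIM_CHILD_DONE)
        n++;
    return n;
}

/* Illustrative variant: additionally stop when no pid was reported. */
static int spins_guarded(int nChilds, int nActive, int cap)
{
    int pid = 0, n = 0;

    assert(cap > 0);
    while (n < cap
           && sim_wait_all_procs(nChilds, nActive, &pid) == SIM_CHILD_DONE
           && pid != -1)
        n++;
    return n;
}
```

calling spins_original(0, 0, 1000) runs into the cap, which matches the spin seen at startup with no children, while spins_guarded(0, 0, 1000) stops immediately. With all children still running (e.g. nChilds == nActive == 2) both variants stop on the first call.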
with the original code, I got something like this:
httpd
|- rotatelogs (error) *cpu spike*
|- rotatelogs (access) no spike
|- httpd
|- rotatelogs (error) no spike
switching the code to != check yields:
httpd
|- rotatelogs (error) no spike
|- rotatelogs (access) no spike
|- httpd
|- rotatelogs (error) *cpu spike*
|- rotatelogs (access) no spike
so there is still something amiss. I did trap out to verify that the return
value when using != was APR_CHILD_NOTDONE and not some OS error.
I am not set up to compile the apr runtime, so I cannot further trap out values
for nChilds and nActive, but the only way we should be able to get
APR_CHILD_NOTDONE would be if nChilds > 0.
I've attached a screenshot of the process tree and it is perhaps worth noting
that there exists a conhost child process.
we don't appear to do anything with the list of processes, so I'm not entirely
sure what this call to apr_proc_wait_all_procs is meant to achieve,
i.e. why it's in a while{} vs. an if{} or some such thing.