On 23/10/2021 02.49, miim wrote:
I have a relatively simple module which is nonetheless causing Apache to
intermittently segfault.
I've added debugging trace messages to be sent to the error log, but the lack of anything
in the log at the time of the segfault leads me to think that the error log is not
flushed when a message is sent. For example, a segfault occurs at 00:18:04, last
previous request was at 00:15:36, so clearly the new request caused the segfault. But
not even the "Here I am at the handler entry point" (see below) gets into the
logfile before the server log reports a segfault taking down Apache.
    /* Retrieve the per-server configuration */
    mod_bc_config *bc_scfg = ap_get_module_config(r->server->module_config,
                                                  &bridcheck_module);
    if (bc_scfg->bc_logdebug & 0x0020000000000)
        ap_log_rerror(APLOG_MARK, APLOG_NOTICE, 0, r,
                      "mod_bridcheck: Enter bridcheck_handler");
I could turn on core dumping but (a) I am no expert at decoding core dumps and
(b) I don't want to dump this problem on somebody else.
So ... is there a way to force Apache to flush the error log before proceeding?
Hello,
I don't think this is a log flushing problem. When a segfault occurs,
death is sudden: the OS kills the process, which gets almost no chance
to handle the error itself.
I am almost 100% sure that if you don't see the message in the log,
then execution simply never reached it: the segfault happened earlier.
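
To illustrate the point, here is a minimal sketch in plain C (not httpd
code). It assumes the log line is written with an unbuffered write(2),
which is how a file-based error log is expected to behave; the helper
name line_survives_segfault and the idea of passing a scratch file path
are made up for this example. A child process writes one line and then
dies from SIGSEGV, and the parent checks whether the line reached the
file.

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that logs one line with write(2)
 * and then dies from SIGSEGV. Returns 1 if the parent finds the line
 * in the file afterwards, 0 if not, -1 on setup errors. */
int line_survives_segfault(const char *path)
{
    static const char msg[] = "Enter handler\n";
    char buf[sizeof msg] = {0};
    int status;
    ssize_t n;
    pid_t pid;
    int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);

    if (fd < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: unbuffered write, then die without any cleanup. */
        if (write(fd, msg, sizeof msg - 1) < 0)
            _exit(2);
        signal(SIGSEGV, SIG_DFL);
        raise(SIGSEGV);
        _exit(1); /* not reached */
    }

    /* Parent: the child must have been killed by a signal. */
    if (waitpid(pid, &status, 0) < 0 || !WIFSIGNALED(status)) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* Read the file back from the beginning and compare. */
    lseek(fd, 0, SEEK_SET);
    n = read(fd, buf, sizeof buf - 1);
    close(fd);
    unlink(path);

    return n == (ssize_t)(sizeof msg - 1)
        && memcmp(buf, msg, sizeof msg - 1) == 0;
}
```

Calling line_survives_segfault() with a path in /tmp should return 1:
data handed to the kernel with write(2) is not lost when the process is
killed. Only data still sitting in a user-space buffer (stdio, for
example) would disappear.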
In my opinion it is easier to learn four or five gdb commands than to
try to do anything at the moment the segfault occurs. The only way to
intercept the death of the process is to install a handler for the
SIGSEGV signal in your module (see "man signal" or "man sigaction"),
but there's not much you can safely do in a signal handler. As said, it
is much easier to enable core dumps and learn a few commands.
Here's how I do it typically:
Debian/Ubuntu distributions ship a file named envvars in /etc/apache2.
If you have such a distribution, edit it as I show below. If not, make
sure you get the same effect by other means.
I put the following two lines:
    ulimit -c unlimited
    echo 1 > /proc/sys/kernel/core_uses_pid
The first line is a shell builtin saying that there should be no size
limit on the core file. If you don't have /etc/apache2/envvars, then
execute this command in the shell from which you launch apache, so that
the apache process inherits the setting.
The second command instructs the kernel to add the process id to the
name of the core file. Thus, if two apache children dump core at the
same time, you get two different core files instead of a single file
into which the kernel writes both cores, making it unusable. If you
don't have /etc/apache2/envvars, you can execute this command in any
shell, but you need root privileges in order to write to
/proc/sys/kernel/core_uses_pid.
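
If you want to try this outside apache first, for instance in a small
test program that crashes on purpose, the same limit can be raised from
inside the process. This is only a sketch in plain C: the helper name
raise_core_limit is hypothetical, and it can only raise the soft limit
up to the hard limit the process inherited, so it is not a full
replacement for "ulimit -c unlimited" in the launching shell.

```c
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-size limit to the hard
 * limit. Returns 0 on success, -1 on failure (errno is set by
 * getrlimit() or setrlimit()). */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;

    /* An unprivileged process may not exceed its hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}
```

Call raise_core_limit() early in main(), before the code that is
expected to crash.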
Let us assume you now have the core file and its name is core.12345,
where 12345 is the process id of the apache child process that died.
Then I start gdb and execute the following commands at the gdb prompt:
    file /usr/sbin/apache2
    core-file core.12345
    thread apply all bt
The first command loads the apache executable.
The second command loads the core file.
The third command displays the call stacks of all threads of the
process (bt = backtrace).
You can switch between threads with the command
    thread N
where N is the numerical id of the thread you want to switch to.
Once you're in a thread, you can move up and down the call stack with
the commands "up" and "down". If you compiled your module with debug
symbols then you can inspect variables with the "print" command, e.g.
"print bc_scfg". If, for example, the segfault occurred somewhere in a
libc function, such as malloc, free, strcpy, etc, you may move up the
call chain to the caller of the libc function, to inspect its arguments.
Besides the necessary "-g" compiler switch for adding debugging symbols,
I typically add the "-fno-inline -O0" switches. This prevents any code
optimisation. When I execute step by step in a debugger (a live program,
obviously, not a core file), the instructions are really executed in the
order written in the program and not rearranged for speed.
You may also debug a live program. "Normal" programs are typically
launched directly in the debugger. This is not really advisable with
apache, because it forks. What I do is let apache start normally
("apache2ctl start" or "systemctl start apache2") and then attach the
debugger to a live apache child process. I launch gdb, then I execute
the following commands at the gdb prompt:
    attach N          (where N is the process id of the apache child)
    break my_handler  (set a breakpoint at one of my functions)
    cont              (let the process continue its execution until it
                       reaches the breakpoint and I get the command
                       prompt back)
When the breakpoint is reached I can inspect variables ("print
variable") and execute step by step ("step" and "next").
HTH,
Sorin