On 08/10/2017 10:22 PM, Florian Weimer wrote:
> * Andrew Haley:
> 
>> On 09/08/17 14:05, Andrew Roberts wrote:
>>> 2) It would be nice to see some sort of out of memory error, rather than 
>>> just an ICE.
>>
>> There's nothing we can do: the kernel killed us.  We can't emit any
>> message before we die.  (killed) tells you that we were killed, but
>> we don't know who done it.
> 
> The driver already prints a message.
> 
> The siginfo_t information should indicate that the signal originated
> from the kernel.  

OOC, where?  While a parent process can use "waitid" to get
a siginfo_t with information about the child exit, that siginfo_t
is not the same siginfo_t a signal handler would get as
argument if you could catch/intercept SIGKILL, which you can't
on Linux.  I.e., checking for, e.g., si_code == SI_KERNEL in
the siginfo filled in by waitid won't work, because that
siginfo_t has si_code values for SIGCHLD [CLD_EXITED/CLD_KILLED/etc.],
not for the signal that actually killed the process.

So waitid doesn't seem to give you any more useful information beyond
what you can already get using waitpid (which is what libiberty's
pex code in question uses) and WIFSIGNALED/WTERMSIG.
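
For comparison, a sketch of the waitpid route.  This is not the
libiberty pex code itself, just an illustration with a hypothetical
helper name, and again the child kills itself as a stand-in:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that dies from SIGKILL, reap it
   with waitpid, and store the terminating signal in *termsig.
   Returns 0 if the child was killed by a signal, -1 otherwise.  */
int
reap_killed_child_with_waitpid (int *termsig)
{
  pid_t pid = fork ();
  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      /* Child: stand-in for a process that gets SIGKILLed.  */
      kill (getpid (), SIGKILL);
      _exit (127);  /* Not reached.  */
    }

  int status;
  if (waitpid (pid, &status, 0) != pid)
    return -1;
  if (!WIFSIGNALED (status))
    return -1;

  /* The wait status only tells us which signal ended the child.  */
  *termsig = WTERMSIG (status);
  return 0;
}
```

The parent learns the same thing as with waitid: the child died from
SIGKILL, and nothing about the sender.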

> It seems that for SIGKILL, there are currently three
> causes in the kernel: the OOM killer, some apparently unreachable code
> in ptrace, and something cgroups-related.  The latter would likely
> take down the driver process, too, so a kernel-originated SIGKILL
> strongly points to the OOM killer.
> 
> But the kernel could definitely do better and set a flag for SIGKILL.

Meanwhile, maybe just having the driver check for SIGKILL and
enumerate likely causes would be better than the status quo.
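
Something along these lines, as a rough sketch only.  The function
names and the message wording are invented here and are not taken
from the driver sources:

```c
#include <signal.h>
#include <stdio.h>

/* Hypothetical helper: return a hint listing likely causes for a
   fatal signal, or NULL if we have nothing useful to add.  */
const char *
fatal_signal_hint (int signo)
{
  if (signo == SIGKILL)
    return "possibly the kernel OOM killer, a cgroup-related kill, "
           "or an explicit kill -9";
  return NULL;
}

/* Hypothetical helper: print a diagnostic for a subprocess PROG that
   was terminated by signal SIGNO, appending the hint if there is
   one.  */
void
report_fatal_signal (FILE *out, const char *prog, int signo)
{
  const char *hint = fatal_signal_hint (signo);

  fprintf (out, "%s: terminated by signal %d", prog, signo);
  if (hint != NULL)
    fprintf (out, " (%s)", hint);
  fputc ('\n', out);
}
```

The driver would call something like report_fatal_signal with the
WTERMSIG value it already has, so no new kernel interface is needed.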

Pedro Alves
