The Linux overcommit approach is evil anyway, because it's not deterministic enough in its consequences.  What _should_ happen is that the fork(), brk()/sbrk(), private mmap() of /dev/zero, etc. fail.  Every effort should be made to avoid failing COWs and the like, which constitute resource allocations for which the syscall has already returned successfully.
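
To make that concrete, here is a minimal sketch (assuming a 64-bit system, and that 64 GiB exceeds available RAM plus swap) of where the failure ought to surface.  Under strict accounting (which Linux only approximates with the vm.overcommit_memory=2 setting) the mmap() itself returns MAP_FAILED with ENOMEM; under overcommit the call may "succeed" and the shortfall only shows up when the pages are dirtied, possibly as a SIGKILL rather than any error return.

/*
 * Sketch only: where the failure should surface.  With strict accounting
 * the reservation fails at the mmap() call; with overcommit the call may
 * succeed and the process only finds out (or gets killed) when it dirties
 * the pages.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    size_t len = (size_t)64 << 30;      /* 64 GiB, assumed > RAM + swap */

    int fd = open("/dev/zero", O_RDWR);
    if (fd == -1) {
        perror("open /dev/zero");
        return 1;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) {
        perror("mmap");                 /* the deterministic outcome: ENOMEM here */
        close(fd);
        return 1;
    }

    memset(p, 0xa5, len);               /* with overcommit, trouble starts here */
    puts("every page got backed");

    munmap(p, len);
    close(fd);
    return 0;
}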

Of course, once in a while, no matter how careful the OS is, there's an uncorrectable error on a dirty user page, an I/O error on a fetch from backing store, an exec*() that fails after address-space teardown has begun (despite having already done some of the checks on the executable), or whatever, and the OS has no choice that results in consistent behavior other than to kill the process.  But there's really no excuse for an OS-initiated SIGKILL other than such a situation, IMO, and every remotely reasonable attempt should have been made to avoid rather than invite it, even at some cost in performance.  And in the event that such a thing is unavoidable, the logging functionality of the OS should already have the resources it needs to record (or at least display on the console) the timestamp, PID, execname, and reason.
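
About the only trace an application can recover of such a kill on its own is the termination signal a parent sees from waitpid(); the execname and the reason have to come from the OS's own logging, which is exactly the point.  A minimal sketch of that limitation (the parent here sends the SIGKILL itself, standing in for the kernel):

/*
 * Sketch: all a parent can reconstruct after the kernel kills a child
 * is the terminating signal in the waitpid() status; the reason is
 * only available if the OS itself logged it.
 */
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");                 /* the failure we want: at the syscall */
        return 1;
    }
    if (pid == 0) {
        pause();                        /* child: stand-in for the victim */
        _exit(0);
    }

    kill(pid, SIGKILL);                 /* stand-in for an OS-initiated SIGKILL */

    int status;
    if (waitpid(pid, &status, 0) == pid && WIFSIGNALED(status)) {
        /* Timestamp and PID are recoverable here; execname and reason are not. */
        fprintf(stderr, "%ld: pid %ld killed by signal %d (%s)\n",
                (long)time(NULL), (long)pid, WTERMSIG(status),
                strsignal(WTERMSIG(status)));
    }
    return 0;
}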

That's not to say that sloppy apps might not still fail to check error returns from syscalls or library functions that make syscalls, but that's not the OS's fault.
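
For what it's worth, the checking in question isn't exactly burdensome; a minimal sketch of what a non-sloppy app does (the particular calls are just illustrative):

/*
 * Sketch: the error checking sloppy apps skip.  Every allocation-like
 * call gets its return value tested, so an honest ENOMEM/EAGAIN from
 * the OS is handled instead of ignored.
 */
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    char *buf = malloc((size_t)1 << 20);        /* library call that may fail */
    if (buf == NULL) {
        fprintf(stderr, "malloc: %s\n", strerror(errno));
        return 1;
    }
    memset(buf, 0, (size_t)1 << 20);

    pid_t pid = fork();                         /* syscall that may fail */
    if (pid == -1) {
        fprintf(stderr, "fork: %s\n", strerror(errno));
        free(buf);
        return 1;
    }
    if (pid == 0)
        _exit(0);

    waitpid(pid, NULL, 0);
    free(buf);
    return 0;
}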
 
 