Dear Iván José Pulido Sánchez,
I have dealt with the oom-killer before and have a little experience with
it (kernel 2.6.18, CentOS 5 kernels). I found that when vm.overcommit_ratio
is set too high, say 90 or 100, the oom-killer ends up killing sshd and
many system services, so I keep vm.overcommit_ratio at 80 at most.
In my experience the oom-killer always needs careful tuning if one wants
it to work properly.
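To make the effect of the ratio concrete: as far as I know, it is only
consulted in strict overcommit mode (vm.overcommit_memory = 2), where the
commit limit is roughly swap plus overcommit_ratio percent of RAM (ignoring
huge pages). A minimal sketch of that arithmetic; the function name and the
sample sizes are made up for illustration:

```shell
# Approximate commit limit in kB for strict overcommit mode
# (vm.overcommit_memory = 2), ignoring huge pages:
#   limit = RAM * overcommit_ratio / 100 + swap
# commit_limit_kb is a hypothetical helper, not a system command.
commit_limit_kb() {
    ram_kb=$1
    swap_kb=$2
    ratio=$3
    echo $(( ram_kb * ratio / 100 + swap_kb ))
}

# Example with invented sizes: 4000000 kB RAM, 2000000 kB swap, ratio 80
commit_limit_kb 4000000 2000000 80   # prints 5200000
```

The current values on a node can be read with
"sysctl vm.overcommit_memory vm.overcommit_ratio".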
I found a possible answer to your "-17" question.
http://lwn.net/Articles/317814/ says:
"The more memory the process uses, the higher the score. The
longer a process is alive in the system, the smaller the score"
Since siesta is probably the longest-running process on the system, that
may be why it shows "-17" for oom_adj.
Possible solutions are also mentioned in http://lwn.net/Articles/317814/ :)
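
Nick's suggestion (quoted below) of checking the value and setting it by hand
could look roughly like this. It is only a sketch: show_oom_adj and
set_oom_adj are hypothetical helper names, the optional proc-root argument
exists only so they can be tried on a scratch directory, and the siesta
input/output file names in the comment are placeholders.

```shell
# Print "<pid> oom_adj=<value>" for one process.
show_oom_adj() {
    pid=$1
    proc_root=${2:-/proc}
    printf '%s oom_adj=%s\n' "$pid" "$(cat "$proc_root/$pid/oom_adj")"
}

# Write a new oom_adj value for one process.
set_oom_adj() {
    pid=$1
    value=$2
    proc_root=${3:-/proc}
    echo "$value" > "$proc_root/$pid/oom_adj"
}

# Hypothetical usage when starting a run by hand:
#   siesta < input.fdf > output.out &
#   pid=$!
#   show_oom_adj "$pid"
#   set_oom_adj "$pid" 0
```

With mpirun the pid of the launcher is not necessarily the pid of the siesta
processes themselves, so it is worth checking each siesta pid separately.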
Best Wishes,
Kuilin
On 5/13/12, Nick Papior Andersen <[email protected]> wrote:
> I have no real experience with the oom-killer. However, the value is not
> set by siesta itself. I believe it must come from openmpi or from the
> compiler. I have just checked my values of oom_adj, and they are all 0
> (also debian 6). So maybe it comes from a system setting on your side.
>
> Please check if it is siesta which is set to -17 or not, just to be sure of
> its origin.
>
> Nevertheless, to work around it you can catch the pid when you start
> siesta and set the oom_adj value manually.
> Do:
> echo 0 > /proc/<pid>/oom_adj
>
> Then you should be just fine.
>
> Kind regards Nick
>
>
> 2012/5/10 Iván Pulido Sanchez <[email protected]>
>
>> Hello,
>>
>> I've been having problems with Siesta on Linux (Debian) when it needs a
>> lot of RAM, especially when it takes all the available memory of the
>> system (RAM + swap). When that happens, the Linux oom-killer gets
>> invoked and starts killing processes. The following are the relevant
>> lines in the kernel log (kern.log):
>>
>> May 8 21:00:16 nodo4 kernel: [8217579.288562] siesta invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=-17
>> ...
>> May 8 21:00:16 nodo4 kernel: [8217579.306967] Out of memory: kill process 1975 (dbus-daemon) score 646 or a child
>> ...
>> May 8 21:00:16 nodo4 kernel: [8217579.856575] rsyslogd invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
>> ...
>> May 8 21:00:16 nodo4 kernel: [8217579.875127] Out of memory: kill process 1599 (rpc.statd) score 399 or a child
>> May 8 21:00:16 nodo4 kernel: [8217579.875934] Killed process 1599 (rpc.statd)
>> May 9 11:48:08 nodo4 kernel: imklog 4.6.2, log source = /proc/kmsg started.
>>
>> And that's how it ends (as you can see from the jump in the date).
>>
>> What worries me is the first of those lines. It says that siesta is
>> running with an oom_adj value of -17, meaning that it can't be killed
>> by the oom-killer. I don't know why Siesta is running with this value
>> for oom_adj.
>>
>> Processes like ssh or rpc.statd (NFS) shouldn't get killed before
>> Siesta; this is why the node "dies" when this happens.
>>
>> Here is my siesta version and/or configuration:
>>
>> Siesta Version: siesta-3.1
>> Architecture : x86_64-debian6
>> Compiler flags: mpif90 -g -O2
>> PARALLEL version
>>
>>
>> Any ideas, help, or suggestions are very much appreciated.
>>
>> Thanks
>>
>> --
>> Iván José Pulido Sánchez
>> Estudiante de Física
>> Universidad Nacional de Colombia
>>
>>
>