Thanks a lot for your suggestions. I'm going to put them into practice and
I'll let you know if anything else comes up.

On Sun, May 13, 2012 at 10:56 PM, kuilin lu <[email protected]> wrote:

> Dear Iván José Pulido Sánchez,
>
>     I have used the oom-killer before and have a little experience with it
> (kernel 2.6.18, CentOS 5 kernels). I found that when vm.overcommit_ratio is
> set too high, say 100 or 90, the oom-killer ends up killing sshd and many
> system services, so I set vm.overcommit_ratio = 80 at most.
>     In my experience, the oom-killer always needs careful tuning if one
> wants it to work properly.
>
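
A note on vm.overcommit_ratio: as far as the kernel documentation describes it, the ratio only enters the accounting when vm.overcommit_memory is set to 2 (strict mode). In that mode the commit limit is roughly swap plus that percentage of RAM. The sketch below only does that arithmetic. The function name commit_limit_kb and the example sizes are made up for illustration, and huge pages are ignored.

```shell
#!/bin/sh
# Sketch only: approximate commit limit in strict overcommit mode
# (vm.overcommit_memory = 2), ignoring huge pages.
# commit_limit_kb is a hypothetical helper, not an existing tool.
#
# Usage: commit_limit_kb RAM_KB SWAP_KB RATIO_PERCENT
commit_limit_kb() {
    # swap + RAM * ratio / 100, in integer kB
    echo $(( $2 + $1 * $3 / 100 ))
}
```

For example, `commit_limit_kb 8000000 2000000 80` prints 8400000. The figure the kernel actually uses can be compared against the CommitLimit line in /proc/meminfo.
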
>     I found a possible answer to your "-17" question.
> http://lwn.net/Articles/317814/ says:
>     "The more memory the process uses, the higher the score. The
> longer a process is alive in the system, the smaller the score"
>     Since siesta must be the longest-running process on the system, it
> will get "-17" for oom_adj.
>
>     Possible solutions are mentioned in http://lwn.net/Articles/317814/ too :)
>
> Best Wishes,
> Kuilin
>
>
> On 5/13/12, Nick Papior Andersen <[email protected]> wrote:
> > I have no real experience with the oom-killer. However, the value is not
> > set by siesta itself. I believe it must come from openmpi or from the
> > compiler. I have just checked my values of oom_adj, and they are all 0
> > (also on debian 6). So maybe it comes from a system setting on your side.
> >
> > Please check whether it is siesta that is set to -17, just to be sure of
> > its origin.
> >
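
On the origin of the -17: oom_adj is not computed by the kernel from memory use or run time (that is oom_score). It is a value written to /proc/<pid>/oom_adj, with -17 meaning the process is exempt from the oom-killer, and child processes inherit it from their parent. So one way to look for where it comes from is to walk up the parent chain. A minimal sketch, assuming a /proc layout like the one on Debian 6; the function names and the PROC_ROOT override are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch only: print oom_adj for a process and each of its ancestors.
# PROC_ROOT is a hypothetical override (defaults to the real /proc).
PROC_ROOT="${PROC_ROOT:-/proc}"

# status_field PID KEY: print the value of "KEY:" from /proc/PID/status.
status_field() {
    sed -n "s/^$2:[[:space:]]*//p" "$PROC_ROOT/$1/status" 2>/dev/null
}

# oom_adj_chain PID: one line per process, from PID up to the top of the tree.
oom_adj_chain() {
    pid="$1"
    while [ -n "$pid" ] && [ "$pid" -gt 0 ] && [ -r "$PROC_ROOT/$pid/oom_adj" ]; do
        echo "$pid $(status_field "$pid" Name) oom_adj=$(cat "$PROC_ROOT/$pid/oom_adj")"
        # Move on to the parent; the loop stops when PPid is 0 or unreadable.
        pid=$(status_field "$pid" PPid)
    done
}
```

Calling `oom_adj_chain <pid>` with the pid of a running siesta process prints the process itself first, then its parent, and so on. The first ancestor that already shows -17 is a reasonable place to start looking.
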
> > Nevertheless, to work around it, you can catch the pid when you start
> > siesta and set the oom_adj value manually.
> > Do:
> > echo 0 > /proc/<pid>/oom_adj
> >
> > Then you should be just fine.
> >
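
The manual step above can be wrapped in a small function that finds every running process with a given name and writes a new oom_adj for it, which is handy when a parallel run has several siesta processes. This is only a sketch: reset_oom_adj, pids_by_name and PROC_ROOT are hypothetical names, and on kernels from 2.6.36 on, /proc/<pid>/oom_score_adj is the preferred file.

```shell
#!/bin/sh
# Sketch only: reset oom_adj for all processes with a given name.
# PROC_ROOT is a hypothetical override (defaults to the real /proc).
PROC_ROOT="${PROC_ROOT:-/proc}"

# pids_by_name NAME: print the pid of every process whose Name field is NAME.
pids_by_name() {
    for s in "$PROC_ROOT"/[0-9]*/status; do
        [ -r "$s" ] || continue
        name=$(sed -n 's/^Name:[[:space:]]*//p' "$s" 2>/dev/null)
        if [ "$name" = "$1" ]; then
            d=${s%/status}      # strip the trailing /status
            echo "${d##*/}"     # keep only the pid directory name
        fi
    done
}

# reset_oom_adj NAME [VALUE]: write VALUE (default 0) to each match's oom_adj.
reset_oom_adj() {
    for pid in $(pids_by_name "$1"); do
        printf '%s\n' "${2:-0}" > "$PROC_ROOT/$pid/oom_adj" &&
            echo "pid $pid: oom_adj set to ${2:-0}"
    done
}
```

After starting the job, `reset_oom_adj siesta 0` would then make every siesta process a normal candidate for the oom-killer again. It has to be run by a user who is allowed to write to those files.
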
> > Kind regards Nick
> >
> >
> > 2012/5/10 Iván Pulido Sanchez <[email protected]>
> >
> >> Hello,
> >>
> >> I've been having problems with Siesta on Linux (Debian) when it needs a
> >> lot of RAM, especially when it takes all the available memory of the
> >> system (RAM + swap). When that happens, the Linux oom-killer gets invoked
> >> and starts killing processes. The following are the relevant lines in
> >> the kernel log (kern.log):
> >>
> >> May  8 21:00:16 nodo4 kernel: [8217579.288562] siesta invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=-17
> >> ...
> >> May  8 21:00:16 nodo4 kernel: [8217579.306967] Out of memory: kill process 1975 (dbus-daemon) score 646 or a child
> >> ...
> >> May  8 21:00:16 nodo4 kernel: [8217579.856575] rsyslogd invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
> >> ...
> >> May  8 21:00:16 nodo4 kernel: [8217579.875127] Out of memory: kill process 1599 (rpc.statd) score 399 or a child
> >> May  8 21:00:16 nodo4 kernel: [8217579.875934] Killed process 1599 (rpc.statd)
> >> May  9 11:48:08 nodo4 kernel: imklog 4.6.2, log source = /proc/kmsg started.
> >>
> >> And that's how it ends (as you can see from the jump in the date).
> >>
> >> Something in the first of those lines worries me. It says that siesta is
> >> running with an oom_adj value of -17, which means it can't be killed by
> >> the oom-killer. I don't know why Siesta is running with this value for
> >> oom_adj.
> >>
> >> Processes like ssh or rpc.statd (NFS) shouldn't get killed before
> >> Siesta; this is why the node "dies" when this happens.
> >>
> >> Here is my siesta version and/or configuration:
> >>
> >> Siesta Version:                                        siesta-3.1
> >> Architecture  : x86_64-debian6
> >> Compiler flags: mpif90 -g -O2
> >> PARALLEL version
> >>
> >>
> >> Any ideas, help, or suggestions are very much appreciated.
> >>
> >> Thanks
> >>
> >> --
> >> Iván José Pulido Sánchez
> >> Physics student
> >> Universidad Nacional de Colombia
> >>
> >>
> >
>



-- 
Iván José Pulido Sánchez
Physics student
Universidad Nacional de Colombia
