Hi,

oom_adj has been deprecated since the 2.6.36 kernel (see
http://code.google.com/p/chromium/issues/detail?id=65009 among others)
in favor of oom_score_adj, and the semantics have changed, i.e.:

    /proc/pid/oom_adj       range: -17..15
    /proc/pid/oom_score_adj range: -1000..1000

This means writing -17 to /proc/pid/oom_score_adj offers essentially
no OOM protection: on a -1000..1000 scale it is only a marginal
adjustment. To exempt the process, the value should be -1000.
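
To make the difference in scale concrete, here is a minimal sketch of how
a legacy oom_adj value could be mapped onto the oom_score_adj range. The
helper name oom_adj_to_score_adj() is hypothetical, and the linear scaling
is my understanding of what the kernel does when it translates legacy
oom_adj writes, so treat the exact formula as an assumption:

```c
/* Range limits for the two interfaces, as listed above. */
#define OOM_DISABLE        (-17)    /* oom_adj minimum: never kill */
#define OOM_ADJUST_MAX     15       /* oom_adj maximum */
#define OOM_SCORE_ADJ_MIN  (-1000)  /* oom_score_adj minimum: never kill */
#define OOM_SCORE_ADJ_MAX  1000     /* oom_score_adj maximum */

/*
 * Hypothetical helper: scale a legacy oom_adj value to the
 * oom_score_adj range. Assumed to mirror the kernel's own
 * translation; not taken from the slurm source.
 */
static int oom_adj_to_score_adj(int oom_adj)
{
    /* Clamp out-of-range input to the legacy limits. */
    if (oom_adj < OOM_DISABLE)
        oom_adj = OOM_DISABLE;
    if (oom_adj > OOM_ADJUST_MAX)
        oom_adj = OOM_ADJUST_MAX;

    /* 15 does not scale to 1000 linearly, so map the maximum explicitly. */
    if (oom_adj == OOM_ADJUST_MAX)
        return OOM_SCORE_ADJ_MAX;

    /* Linear scaling: -17 becomes -1000, 0 stays 0. */
    return (oom_adj * OOM_SCORE_ADJ_MAX) / -OOM_DISABLE;
}
```

With this mapping, a configured -17 becomes -1000 rather than being
written unchanged to the new file.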

I guess it's a matter of documentation and adjustments:
- SLURMD_OOM_ADJ and SLURMSTEPD_OOM_ADJ environment variables
- adjust set_oom_adj() to do "The Right Thing(Tm)" depending
   on whether oom_score_adj is present or not.
- doc/html/faq.shtml adjustments ...
- src/plugins/task/cgroup/task_cgroup_memory.c adjustments:
   it uses -17, but that may end up in oom_score_adj and hence
   provide no OOM protection ...
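
As a rough illustration of the set_oom_adj() item above, the following
sketch picks the file and the "never kill" value depending on whether
oom_score_adj exists. It is not the slurm implementation: the names
oom_protect_sketch() and write_value() are hypothetical, and the
directory is passed in (e.g. "/proc/self") only to keep the example
self-contained.

```c
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: write a string to a file, return 0 on success. */
static int write_value(const char *path, const char *value)
{
    FILE *fp = fopen(path, "w");
    int rc;

    if (!fp)
        return -1;
    rc = (fputs(value, fp) == EOF) ? -1 : 0;
    if (fclose(fp) != 0)
        rc = -1;
    return rc;
}

/*
 * Hypothetical sketch: request full OOM protection for the process
 * whose proc directory is proc_dir. Prefers oom_score_adj (-1000)
 * and falls back to the legacy oom_adj (-17) on older kernels.
 */
static int oom_protect_sketch(const char *proc_dir)
{
    char path[256];
    int n;

    n = snprintf(path, sizeof(path), "%s/oom_score_adj", proc_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    /* Newer interface present: use its own minimum value. */
    if (access(path, F_OK) == 0)
        return write_value(path, "-1000");

    /* Older kernel: only the legacy file exists. */
    n = snprintf(path, sizeof(path), "%s/oom_adj", proc_dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;
    return write_value(path, "-17");
}
```

A caller would use something like oom_protect_sketch("/proc/self") and
check for a return value of 0.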

And, yes, "I hate when they do that(Tm)" :)

FYI, I checked the slurm 2.3.4 source.
Maybe this is already handled in more recent slurm versions?


A+

-- 

-----------------------------------------------------------
      Michel Bourget - SGI - Linux Software Engineering
     "Past BIOS POST, everything else is extra" (travis)
-----------------------------------------------------------
