Hi Andrew,

Presently in oom_kill.c we calculate the badness score of a victim task from 
the task's current RSS counter value.
The RSS counter value of a task is usually '[Private (Dirty/Clean)] + [Shared 
(Dirty/Clean)]' for that task.
We have encountered a situation where the Private values are small but the 
Shared values are large, which makes the total RSS counter value large. 
In an OOM situation the task with the highest RSS value is then killed, but 
because its Private values are not large, the memory gained by killing this 
process is less than expected.

For example, take the use-case below, in which 3 processes are running in the 
system. 
Each of these processes mmaps a file that exists in the current directory and 
then copies data from this file into locally allocated buffers in a while(1) 
loop with some sleep. Of the 3 processes, 2 have mapped the file with 
MAP_SHARED and one has mapped it with MAP_PRIVATE.
I ran all 3 processes in the background and checked their RSS/PSS values with 
a user-space utility (a wrapper over cat /proc/pid/smaps).
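
The utility itself is not part of this mail. As a rough sketch of how such 
per-process totals can be derived, the Python below sums the per-mapping 
fields of /proc/<pid>/smaps. The function names summarize_smaps and 
summarize_pid are made up for illustration, and the sketch assumes the usual 
"Field:  <n> kB" line layout of smaps.

```python
import re

# Per-mapping fields of interest in /proc/<pid>/smaps, all reported in kB.
# The names are matched exactly, so lines such as "Pss_Anon:" are ignored.
_FIELD = re.compile(
    r"^(Rss|Pss|Shared_Clean|Shared_Dirty|Private_Clean|Private_Dirty):"
    r"\s+(\d+) kB$"
)

def summarize_smaps(text):
    """Sum Rss, Shared, Private and Pss (kB) over all mappings in smaps text."""
    totals = {"Rss": 0, "Shared": 0, "Private": 0, "Pss": 0}
    for line in text.splitlines():
        m = _FIELD.match(line.strip())
        if not m:
            continue
        name, kb = m.group(1), int(m.group(2))
        if name.startswith("Shared_"):
            totals["Shared"] += kb      # Shared_Clean + Shared_Dirty
        elif name.startswith("Private_"):
            totals["Private"] += kb     # Private_Clean + Private_Dirty
        else:
            totals[name] += kb          # Rss or Pss
    return totals

def summarize_pid(pid):
    """Hypothetical helper: read and summarize /proc/<pid>/smaps (Linux only)."""
    with open(f"/proc/{pid}/smaps") as f:
        return summarize_smaps(f.read())
```

For example, summarize_pid(213) would be expected to return a dict with the 
same four columns as the tables that follow.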
Before OOM, the memory consumed by these 3 processes is as follows (all 
processes run with oom_score_adj = 0):
====================================================
Comm : 1prg,  Pid : 213 (values in kB)
                Rss     Shared    Private        Pss
  Process :  375764     194596     181168     278460
====================================================
Comm : 3prg,  Pid : 217 (values in kB)
                Rss     Shared    Private        Pss
  Process :  305760         32     305728     305738
====================================================
Comm : 2prg,  Pid : 218 (values in kB)
                Rss     Shared    Private        Pss
  Process :  389980     194596     195384     292676
====================================================

Thus, with the present code design, process [2prg : 218] would be selected 
first as the bulkiest process to kill, because its RSS value is the highest. 
But if we kill this process, only ~195MB would be freed, compared to the 
expected ~389MB.
Thus selecting the victim task based on its RSS value is not an accurate 
design, and killing the selected process does not release the expected memory 
back to the system.
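
The arithmetic behind these figures can be sketched as follows. This is only 
an illustration using the numbers from the tables above; it assumes that the 
shared pages of the victim stay resident because another task still maps 
them, so only the Private part is returned to the system. The helper names 
are hypothetical.

```python
# Figures in kB, copied from the tables above: (rss, shared, private).
procs = {
    ("1prg", 213): (375764, 194596, 181168),
    ("3prg", 217): (305760, 32, 305728),
    ("2prg", 218): (389980, 194596, 195384),
}

def rss_victim(procs):
    """Pick the task with the largest RSS, as an RSS-based selection would."""
    return max(procs, key=lambda task: procs[task][0])

def freed_if_killed(entry):
    """Estimate reclaimed memory (kB) for one task.

    Assumption: shared pages remain mapped by some other task, so only
    the private pages are actually freed.
    """
    _rss, _shared, private = entry
    return private

victim = rss_victim(procs)               # ("2prg", 218)
freed = freed_if_killed(procs[victim])   # 195384 kB, i.e. ~195MB
shortfall = procs[victim][0] - freed     # 194596 kB that stays resident

print(victim, freed, shortfall)
```

The shortfall equals the Shared column of the victim, which is exactly the 
part of its RSS that the kill does not give back under this assumption.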

We need to select the victim task based on PSS instead of RSS, as the PSS 
value is calculated as
PSS value = [Private (Dirty/Clean)] + [Shared (Dirty/Clean) / no. of tasks 
sharing the pages]
For the above use-case it can also be seen that process [3prg : 217] has the 
largest PSS value, and by killing this process we gain the most memory 
(~305MB) compared to killing the process selected by RSS value.
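
As a quick check of the formula against the measured values, the sketch below 
recomputes PSS for the two MAP_SHARED processes and then picks the victim by 
the reported PSS. It assumes exactly 2 tasks share the file pages; the few kB 
of difference from the reported values presumably come from pages shared with 
more than two tasks. The names used are illustrative only.

```python
# Figures in kB, copied from the tables above.
procs = {
    "1prg": {"rss": 375764, "shared": 194596, "private": 181168, "pss": 278460},
    "3prg": {"rss": 305760, "shared": 32, "private": 305728, "pss": 305738},
    "2prg": {"rss": 389980, "shared": 194596, "private": 195384, "pss": 292676},
}

def pss_estimate(private_kb, shared_kb, sharers):
    """PSS = Private + Shared / number of tasks sharing the pages."""
    return private_kb + shared_kb // sharers

# Assumption: 1prg and 2prg are the only two tasks sharing the file pages.
est_1prg = pss_estimate(181168, 194596, 2)   # 278466, reported 278460
est_2prg = pss_estimate(195384, 194596, 2)   # 292682, reported 292676

# Select the victim by the reported PSS instead of RSS.
pss_victim = max(procs, key=lambda name: procs[name]["pss"])   # "3prg"
rss_victim = max(procs, key=lambda name: procs[name]["rss"])   # "2prg"

print(est_1prg, est_2prg, pss_victim, rss_victim)
```

With these numbers the PSS-based choice (3prg) has 305728 kB of private 
memory, against 195384 kB for the RSS-based choice (2prg).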

--
Regards,
Yogesh Gaur.