Hi Chandra,

The following command gets killed on our system by the memory controller
(both the E17-based version and a prior one):

dd bs=4096 count=250000 < /dev/zero > /bigfile

The class in which this command executes is set up as follows:

res=mem,guarantee=-2,limit=125000,total_guarantee=100,max_limit=100

The default class has the following:

res=mem,guarantee=-2,limit=-2,total_guarantee=322735,max_limit=322735
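
For scale, here is a rough sketch comparing the amount of data dd writes
with the class limit.  It assumes that limit is counted in pages and that
the page size is 4 KiB; neither is confirmed here, and the helper name
pages_written is only for illustration.

```python
# Assumptions (not confirmed in this mail): the class limit is expressed
# in pages, and the page size is 4 KiB.
PAGE_SIZE = 4096


def pages_written(block_size: int, count: int, page_size: int = PAGE_SIZE) -> int:
    """Pages needed to hold block_size * count bytes, rounded up."""
    total_bytes = block_size * count
    return -(-total_bytes // page_size)  # ceiling division


dd_pages = pages_written(4096, 250000)  # dd bs=4096 count=250000
class_limit = 125000                    # limit= value from the class setup

print(dd_pages, dd_pages / class_limit)
```

Under those assumptions the file data alone spans twice the class limit,
even though dd's own address space stays small.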

Does dd really consume that much memory, so that it must be killed?  I
think this is unlikely: its VIRT and RSS in the top output never grow
beyond a few megabytes.

Or is the memory controller still keeping track of pages that logically no
longer belong to the class?  Looking at the output of dmesg, I see the
following bef_shnk_cls and aft_shnk_cls debug messages:

Set mem shares to -2 125000 -1 -1
Check<bef_shnk_cls> /rcfs/taskclass/v_hog1: total=112500
Check<bef_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<bef_shnk_cls>(zone=1): act 1216, inae 102192 lact 1216 lina 102192
Check<bef_shnk_cls>(zone=2): act 661, inae 8181 lact 661 lina 8181
Check<aft_shnk_cls> /rcfs/taskclass/v_hog1: total=32529
Check<aft_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<aft_shnk_cls>(zone=1): act 1216, inae 22192 lact 1216 lina 22192
Check<aft_shnk_cls>(zone=2): act 661, inae 8181 lact 661 lina 8181
Check<bef_shnk_cls> /rcfs/taskclass/v_hog1: total=113746
Check<bef_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<bef_shnk_cls>(zone=1): act 1227, inae 22257 lact 1227 lina 22257
Check<bef_shnk_cls>(zone=2): act 661, inae 89322 lact 661 lina 89322
Check<aft_shnk_cls> /rcfs/taskclass/v_hog1: total=112576
Check<aft_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<aft_shnk_cls>(zone=1): act 1227, inae 22129 lact 1227 lina 22129
Check<aft_shnk_cls>(zone=2): act 663, inae 88260 lact 663 lina 88260
Check<bef_shnk_cls> /rcfs/taskclass/v_hog1: total=112577
Check<bef_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<bef_shnk_cls>(zone=1): act 1227, inae 22129 lact 1227 lina 22129
Check<bef_shnk_cls>(zone=2): act 663, inae 88261 lact 663 lina 88261
VM: killing process dd
Check<aft_shnk_cls> /rcfs/taskclass/v_hog1: total=119531
Check<aft_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<aft_shnk_cls>(zone=1): act 1229, inae 44417 lact 1229 lina 44417
Check<aft_shnk_cls>(zone=2): act 666, inae 72915 lact 666 lina 72915
Check<bef_shnk_cls> /rcfs/taskclass/v_hog1: total=119531
Check<bef_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<bef_shnk_cls>(zone=1): act 1229, inae 44417 lact 1229 lina 44417
Check<bef_shnk_cls>(zone=2): act 666, inae 72915 lact 666 lina 72915
Check<aft_shnk_cls> /rcfs/taskclass/v_hog1: total=69733
Check<aft_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0
Check<aft_shnk_cls>(zone=1): act 1294, inae 22288 lact 1294 lina 22288
Check<aft_shnk_cls>(zone=2): act 608, inae 45223 lact 608 lina 45223

Not sure how to interpret this exactly, but it seems to me that the bulk of
the pages are on the inae/lina lists.  Maybe the mem controller should be
more aggressive in cleaning out these lists before killing a process like dd.
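
To put a number on that, here is a minimal sketch that sums the per-zone
counters from the debug lines.  It assumes that "act" and "inae" are the
active and inactive page counts for each zone, which is my reading of the
output and not something I have checked against the code.  The sample lines
are the bef_shnk_cls snapshot just before the kill.

```python
import re

# Assumption: "act" is the active page count and "inae" the inactive page
# count for a zone; the lact/lina fields are ignored here.
ZONE_RE = re.compile(
    r"Check<(?:bef|aft)_shnk_cls>\(zone=(\d+)\): act (\d+), inae (\d+)"
)


def inactive_share(lines):
    """Return (active, inactive, inactive fraction) summed over zone lines."""
    active = inactive = 0
    for line in lines:
        m = ZONE_RE.search(line)
        if m:
            active += int(m.group(2))
            inactive += int(m.group(3))
    total = active + inactive
    return active, inactive, (inactive / total if total else 0.0)


sample = [
    "Check<bef_shnk_cls>(zone=0): act 0, inae 0 lact 0 lina 0",
    "Check<bef_shnk_cls>(zone=1): act 1227, inae 22129 lact 1227 lina 22129",
    "Check<bef_shnk_cls>(zone=2): act 663, inae 88261 lact 663 lina 88261",
]

active, inactive, share = inactive_share(sample)
print(active, inactive, round(share, 3))
```

For that snapshot, roughly 98% of the counted pages are on the inactive
side.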

Marc






