Re: [patch 0/7] improve memcg oom killer robustness v2

azurIt Wed, 04 Sep 2013 00:54:24 -0700

>On Mon, Sep 02, 2013 at 12:38:02PM +0200, azurIt wrote:
>> >>Hi azur,
>> >>
>> >>here is the x86-only rollup of the series for 3.2.
>> >>
>> >>Thanks!
>> >>Johannes
>> >>---
>> >
>> >
>> >Johannes,
>> >
>> >unfortunately, one problem arises: I (again) have a cgroup which cannot be 
>> >deleted :( It belongs to a user who had very high memory usage and was 
>> >reaching his limit very often. Do you need any info which I can gather now?
>
>Did the OOM killer go off in this group?
>




# cat /cgroups/cannot_rm_01/memory.oom_control 
oom_kill_disable 0
under_oom 1
#




>Was there a warning in the syslog ("Fixing unhandled memcg OOM
>context")?



I really don't know, because I don't know the exact day it happened. I only 
found out on 30.8., but it could have happened any time before that. Uptime on 
that server is 27 days, so I can grep all the syslog logs I have if that helps. 
I just need to find the original name of that cgroup, because I renamed it to 
'cannot_rm_01' so that my software will ignore it.
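
A minimal sketch of such a search, assuming the warning text quoted above 
appears verbatim in the logs. The function name find_memcg_warning and the 
log paths in the usage comment are hypothetical; adjust them to wherever 
syslog is actually stored on the server.

```shell
#!/bin/sh
# Hypothetical helper: search the given log files for the warning text
# quoted in this thread. -F treats the pattern as a fixed string,
# -H prefixes each match with the file name it came from.
find_memcg_warning() {
    grep -F -H 'Fixing unhandled memcg OOM context' "$@"
}

# Example call (paths are an assumption, not taken from the thread):
#   find_memcg_warning /var/log/syslog /var/log/syslog.1
```

Rotated logs that are already gzip-compressed would need to be decompressed 
first, or searched with a tool such as zgrep.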



>If it happens again, could you check if there are tasks left in the
>cgroup?  And provide /proc/<pid>/stack of the hung task trying to
>delete the cgroup?



# cat /cgroups/cannot_rm_01/tasks
#
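
The two checks shown so far (under_oom set to 1 and an empty tasks file) 
could be combined to look for other cgroups in the same state. This is only a 
sketch: the function name list_stuck_cgroups is hypothetical, it looks one 
directory level below the given root, and it assumes the memory.oom_control 
format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print every direct child cgroup of $1 that reports
# "under_oom 1" but has no tasks left.
# $1 = memcg mount root (e.g. /cgroups, as used in this message)
list_stuck_cgroups() {
    root=$1
    for ctl in "$root"/*/memory.oom_control; do
        [ -f "$ctl" ] || continue
        dir=${ctl%/memory.oom_control}
        # Read the tasks file instead of testing its size: files on a
        # cgroup filesystem report a size of 0 even when they have content.
        if grep -q '^under_oom 1$' "$ctl" && [ -z "$(cat "$dir/tasks")" ]; then
            echo "$dir"
        fi
    done
}

# Example call:
#   list_stuck_cgroups /cgroups
```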



>> Now I can definitely confirm that the problem is NOT fixed :( It happened 
>> again, but I don't have any data because I had already disabled all debug 
>> output.
>
>Which debug output?



Debug output from my own scripts, which are supposed to handle this situation 
and kill frozen processes. I have already reactivated it; it grabs the contents 
of 'stack' from all processes before killing them.
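
A rough sketch of what such a collection step might look like (this is not 
the actual script, which is not shown in this thread). The function name 
dump_cgroup_stacks, its arguments and the output layout are hypothetical; 
reading /proc/<pid>/stack normally requires root.

```shell
#!/bin/sh
# Hypothetical helper: save the kernel stack of every task in a cgroup.
# $1 = cgroup directory (must contain a 'tasks' file)
# $2 = directory to write <pid>.stack files into
# $3 = proc root, defaults to /proc (parameterised only for testing)
dump_cgroup_stacks() {
    cgroup_dir=$1
    out_dir=$2
    proc_root=${3:-/proc}
    mkdir -p "$out_dir" || return 1
    while read -r pid; do
        [ -n "$pid" ] || continue
        # A task may exit between reading 'tasks' and reading its stack,
        # so a missing or unreadable stack file is skipped silently.
        if [ -r "$proc_root/$pid/stack" ]; then
            cat "$proc_root/$pid/stack" > "$out_dir/$pid.stack" 2>/dev/null
        fi
    done < "$cgroup_dir/tasks"
}

# Example call, to be run before any processes are killed:
#   dump_cgroup_stacks /cgroups/<name> /var/tmp/stacks
```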



>Do you still have access to the syslog?



From that day (30.8.)? Yes.


>It's possible that, as your system does not deadlock on the OOMing
>cgroup anymore, you hit a separate bug...
>
>Thanks!
>
