Quoting Danijel Vargek, Continum ([email protected]):
> Hi all,
> 
> we are running an LXC host with several testing containers (14 at the
> moment). The host itself is on Ubuntu 14.04, with the 3.13.0-32 kernel.
> The containers are running Debian Wheezy.
> 
> From time to time the host machine crashes completely, probably due to
> containers eating up too much RAM. We already limited every container via
> cgroup (CPU and RAM), but still see this behaviour.
> 
> Our suspicion is that Java on some of the containers isn't correctly
> limited, which leads to the host machine crashing.
> 
> Has anybody had a similar experience, or is there something missing when
> limiting containers via cgroup?
> 
> This is the syslog entry for the last crash (host machine + one of the 
> containers):
> 
> #### HOST ####
> Aug 26 13:33:10 node04 kernel: [87282.555841] Modules linked in:<4>[87282.555841] Call Trace:
> Aug 26 13:33:10 node04 kernel: [87282.555841]  [<ffffffff811458c4>] perf_event_overflow+0x14/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841]  [<ffffffff8136e9ed>] ? __write_lock_failed+0xd/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841] ---[ end trace 71798cbdeee56afd ]---
> Aug 26 13:33:10 node04 kernel: [87304.156008] RAX: ffff881018a2b2e8 RBX: ffff880815771e28 RCX: 0000000000000006
> Aug 26 13:33:10 node04 kernel: [87304.156008] Stack:
> Aug 26 13:33:10 node04 kernel: [87304.156008]  [<ffffffff81152384>] pagefault_out_of_memory+0x14/0x80
> Aug 26 13:33:10 node04 kernel: [87304.156008]  [<ffffffff81727fda>] do_page_fault+0x1a/0x70
> Aug 26 13:33:18 node04 kernel: [87312.204006] FS:  00007f58c7c59700(0000) GS:ffff88103f940000(0000) knlGS:0000000000000000
> Aug 26 13:33:18 node04 kernel: [87312.204006] Stack:
> Aug 26 13:33:18 node04 kernel: [87312.204006]  ffff880ab78e3c70 ffffffff8160c3a8 ffff88103f914f00 ffff88103f914f00
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff810dba85>] smp_call_function_single+0xe5/0x190
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff810dbeb6>] smp_call_function_many+0x286/0x2d0
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff811814d5>] change_protection+0x65/0xb0
> Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff8172423c>] retint_signal+0x48/0x8c
> ##############
> 
> #### Container ####
> Aug 26 13:32:45 ff01 kernel: [87279.009427] CPU: 0 PID: 773 Comm: java Tainted: GF          O 3.13.0-32-generic #57-Ubuntu
> Aug 26 13:32:45 ff01 kernel: [87279.009442] Call Trace:
> Aug 26 13:32:45 ff01 kernel: [87279.009470]  [<ffffffff811b388c>] mem_cgroup_oom_synchronize+0x4fc/0x540

This (mem_cgroup_oom_synchronize) suggests to me that the container is
in fact correctly limited: java exceeds what the container is allowed to
use, and so is killed by the memory cgroup's OOM handling.
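
One way to double-check this from the host is to read the container's
memory cgroup counters. The following is only a minimal sketch, not
something taken from your setup: it assumes the cgroup v1 memory
controller is mounted at /sys/fs/cgroup/memory and that the container's
group lives under lxc/<name>. The container name "ff01" is a guess based
on the hostname in your log, and the helper names are made up for
illustration. A non-zero memory.failcnt means the limit has been hit at
least once.

```python
from pathlib import Path

# Assumption: cgroup v1 memory controller mount point (typical, not verified
# against the poster's host).
CGROUP_ROOT = Path("/sys/fs/cgroup/memory")

# Standard cgroup v1 memory controller files.
FIELDS = (
    "memory.limit_in_bytes",
    "memory.usage_in_bytes",
    "memory.max_usage_in_bytes",
    "memory.failcnt",
)


def read_memory_stats(cgroup_dir):
    """Return {file name: integer value} for each counter file that exists."""
    stats = {}
    for name in FIELDS:
        path = Path(cgroup_dir) / name
        if path.exists():
            stats[name] = int(path.read_text().strip())
    return stats


def limit_was_hit(stats):
    """True if the kernel counted at least one hit against the memory limit."""
    return stats.get("memory.failcnt", 0) > 0


if __name__ == "__main__":
    # Hypothetical container name, guessed from the log hostname.
    cgroup_dir = CGROUP_ROOT / "lxc" / "ff01"
    stats = read_memory_stats(cgroup_dir)
    if not stats:
        print("no memory cgroup files found in", cgroup_dir)
    for name, value in stats.items():
        print(name, value)
    print("limit hit at least once:", limit_was_hit(stats))
```

If failcnt is non-zero and max_usage_in_bytes sits at the limit, the
limit itself is working, and the host crash would need a different
explanation.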

> Aug 26 13:32:45 ff01 kernel: [87279.009502]  [<ffffffff81724448>] page_fault+0x28/0x30
> Aug 26 13:32:45 ff01 kernel: [87279.009650] [23109]     0 23109    32444      422      29        0             0 console-kit-dae
> Aug 26 13:32:45 ff01 kernel: [87279.009779] [  714]  1000   714     4999      182      12        0             0 wrapper-linux-x
> ###################
> 
> Please tell me if you need additional information.
> 
> Regards,
> Danijel Vargek
> 
> -- 
> Danijel Vargek
> Unix System Administrator
> 
> Continum AG
> Bismarckallee 7b-d
> D-79098 Freiburg i. Br.
> Tel.: +49 761 217111-77
> Fax.: +49 761 217111-99
> http://www.continum.net
> 
> Registered office: Freiburg im Breisgau
> Commercial register: Amtsgericht Freiburg, HRB 6866
> Executive Board: Volker T. Mueller
> Chairman of the Supervisory Board: Bernd Straub
> _______________________________________________
> lxc-users mailing list
> [email protected]
> http://lists.linuxcontainers.org/listinfo/lxc-users