Hi all, we are running an LXC host with several test containers (14 at the moment). The host itself runs Ubuntu 14.04 with kernel 3.13.0-32. The containers run Debian Wheezy.
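
For anyone who wants to compare numbers: the per-container memory limits mentioned below can be read back from the host. This is only a minimal sketch, assuming the cgroup v1 memory controller is mounted at /sys/fs/cgroup/memory and that LXC places containers under an lxc/ subdirectory there (the usual layout on Ubuntu 14.04, but an assumption here). The helper names read_int and container_memory_report are made up for illustration. The memory.memsw.* file only exists when swap accounting is enabled, so it may show up as None.

```python
import os

# Assumed cgroup v1 layout for LXC on Ubuntu 14.04; adjust if yours differs.
CGROUP_ROOT = "/sys/fs/cgroup/memory/lxc"

# Standard cgroup v1 memory controller files.
# memory.memsw.limit_in_bytes is only present when swap accounting is enabled.
FILES = (
    "memory.limit_in_bytes",
    "memory.memsw.limit_in_bytes",
    "memory.max_usage_in_bytes",
    "memory.failcnt",
)


def read_int(path):
    """Return the integer stored in a cgroup control file, or None if unreadable."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (OSError, ValueError):
        return None


def container_memory_report(root=CGROUP_ROOT):
    """Map each container directory under root to the values of FILES."""
    report = {}
    if not os.path.isdir(root):
        return report
    for name in sorted(os.listdir(root)):
        container_dir = os.path.join(root, name)
        if os.path.isdir(container_dir):
            report[name] = {
                f: read_int(os.path.join(container_dir, f)) for f in FILES
            }
    return report


if __name__ == "__main__":
    for name, values in container_memory_report().items():
        print(name, values)
```

A container whose memory.limit_in_bytes is set but whose memory.memsw.limit_in_bytes is missing or much larger can still push pages out to swap beyond its RAM limit, which is one thing worth ruling out.
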

From time to time the host machine crashes completely, probably because the containers are eating up too much RAM. We have already limited every container via cgroups (CPU and RAM), but we still see this behaviour. Our suspicion is that Java in some of the containers is not limited correctly, which ends up crashing the host machine.

Has anybody had a similar experience, or is there something we are missing when limiting containers via cgroups?

This is the syslog output from the last crash (host machine plus one of the containers):

#### HOST ####
Aug 26 13:33:10 node04 kernel: [87282.555841] Modules linked in:<4>[87282.555841] Call Trace:
Aug 26 13:33:10 node04 kernel: [87282.555841] [<ffffffff811458c4>] perf_event_overflow+0x14/0x20
Aug 26 13:33:10 node04 kernel: [87282.555841] [<ffffffff8136e9ed>] ? __write_lock_failed+0xd/0x20
Aug 26 13:33:10 node04 kernel: [87282.555841] ---[ end trace 71798cbdeee56afd ]---
Aug 26 13:33:10 node04 kernel: [87304.156008] RAX: ffff881018a2b2e8 RBX: ffff880815771e28 RCX: 0000000000000006
Aug 26 13:33:10 node04 kernel: [87304.156008] Stack:
Aug 26 13:33:10 node04 kernel: [87304.156008] [<ffffffff81152384>] pagefault_out_of_memory+0x14/0x80
Aug 26 13:33:10 node04 kernel: [87304.156008] [<ffffffff81727fda>] do_page_fault+0x1a/0x70
Aug 26 13:33:18 node04 kernel: [87312.204006] FS: 00007f58c7c59700(0000) GS:ffff88103f940000(0000) knlGS:0000000000000000
Aug 26 13:33:18 node04 kernel: [87312.204006] Stack:
Aug 26 13:33:18 node04 kernel: [87312.204006] ffff880ab78e3c70 ffffffff8160c3a8 ffff88103f914f00 ffff88103f914f00
Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff810dba85>] smp_call_function_single+0xe5/0x190
Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff810dbeb6>] smp_call_function_many+0x286/0x2d0
Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff811814d5>] change_protection+0x65/0xb0
Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff8172423c>] retint_signal+0x48/0x8c
##############

#### Container ####
Aug 26 13:32:45 ff01 kernel: [87279.009427] CPU: 0 PID: 773 Comm: java Tainted: GF O 3.13.0-32-generic #57-Ubuntu
Aug 26 13:32:45 ff01 kernel: [87279.009442] Call Trace:
Aug 26 13:32:45 ff01 kernel: [87279.009470] [<ffffffff811b388c>] mem_cgroup_oom_synchronize+0x4fc/0x540
Aug 26 13:32:45 ff01 kernel: [87279.009502] [<ffffffff81724448>] page_fault+0x28/0x30
Aug 26 13:32:45 ff01 kernel: [87279.009650] [23109] 0 23109 32444 422 29 0 0 console-kit-dae
Aug 26 13:32:45 ff01 kernel: [87279.009779] [ 714] 1000 714 4999 182 12 0 0 wrapper-linux-x
###################

Please tell me if you need additional information.

Regards,
Danijel Vargek

-- 
Danijel Vargek
Unix System Administrator
Continum AG
Bismarckallee 7b-d
D-79098 Freiburg i. Br.
Tel.: +49 761 217111-77
Fax.: +49 761 217111-99
http://www.continum.net

Registered office: Freiburg im Breisgau
Register court: Amtsgericht Freiburg, HRB 6866
Executive board: Volker T. Mueller
Chairman of the supervisory board: Bernd Straub
