Quoting Danijel Vargek, Continum ([email protected]):
> Hi all,
>
> we are running an LXC host with several testing containers (14 at the moment).
> The host itself is on Ubuntu 14.04, with the 3.13.0-32 kernel. The containers
> are running Debian Wheezy.
>
> From time to time the host machine completely crashes, probably due to
> containers eating up too much RAM. We have already limited every container
> via cgroup (CPU and RAM), but still see this behaviour.
>
> Our suspicion is that Java on some of the containers isn't correctly
> limited, which leads to the host machine crashing.
>
> Has anybody had a similar experience, or is there something missing when
> limiting containers via cgroup?
>
> This is the syslog entry for the last crash (host machine + one of the
> containers):
>
> #### HOST ####
> Aug 26 13:33:10 node04 kernel: [87282.555841] Modules linked
> in:<4>[87282.555841] Call Trace:
> Aug 26 13:33:10 node04 kernel: [87282.555841] [<ffffffff811458c4>]
> perf_event_overflow+0x14/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841] [<ffffffff8136e9ed>] ?
> __write_lock_failed+0xd/0x20
> Aug 26 13:33:10 node04 kernel: [87282.555841] ---[ end trace 71798cbdeee56afd
> ]---
> Aug 26 13:33:10 node04 kernel: [87304.156008] RAX: ffff881018a2b2e8 RBX:
> ffff880815771e28 RCX: 0000000000000006
> Aug 26 13:33:10 node04 kernel: [87304.156008] Stack:
> Aug 26 13:33:10 node04 kernel: [87304.156008] [<ffffffff81152384>]
> pagefault_out_of_memory+0x14/0x80
> Aug 26 13:33:10 node04 kernel: [87304.156008] [<ffffffff81727fda>]
> do_page_fault+0x1a/0x70
> Aug 26 13:33:18 node04 kernel: [87312.204006] FS: 00007f58c7c59700(0000)
> GS:ffff88103f940000(0000) knlGS:0000000000000000
> Aug 26 13:33:18 node04 kernel: [87312.204006] Stack:
> Aug 26 13:33:18 node04 kernel: [87312.204006] ffff880ab78e3c70
> ffffffff8160c3a8 ffff88103f914f00 ffff88103f914f00
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff810dba85>]
> smp_call_function_single+0xe5/0x190
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff810dbeb6>]
> smp_call_function_many+0x286/0x2d0
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff811814d5>]
> change_protection+0x65/0xb0
> Aug 26 13:33:18 node04 kernel: [87312.204006] [<ffffffff8172423c>]
> retint_signal+0x48/0x8c
> ##############
>
> #### Container ####
> Aug 26 13:32:45 ff01 kernel: [87279.009427] CPU: 0 PID: 773 Comm: java
> Tainted: GF O 3.13.0-32-generic #57-Ubuntu
> Aug 26 13:32:45 ff01 kernel: [87279.009442] Call Trace:
> Aug 26 13:32:45 ff01 kernel: [87279.009470] [<ffffffff811b388c>]
> mem_cgroup_oom_synchronize+0x4fc/0x540

This (mem_cgroup_oom_synchronize) suggests to me that in fact the
container is correctly limited. java exceeds what the container is
allowed to use, and so is killed.

> Aug 26 13:32:45 ff01 kernel: [87279.009502] [<ffffffff81724448>]
> page_fault+0x28/0x30
> Aug 26 13:32:45 ff01 kernel: [87279.009650] [23109] 0 23109 32444
> 422 29 0 0 console-kit-dae
> Aug 26 13:32:45 ff01 kernel: [87279.009779] [ 714] 1000 714 4999
> 182 12 0 0 wrapper-linux-x
> ###################
>
> Please tell me if you need additional information.
>
> Regards,
> Danijel Vargek
>
> --
> Danijel Vargek
> Systemadministrator Unix
>
> Continum AG
> Bismarckallee 7b-d
> D-79098 Freiburg i. Br.
> Tel.: +49 761 217111-77
> Fax.: +49 761 217111-99
> http://www.continum.net
>
> Sitz der Gesellschaft: Freiburg im Breisgau
> Registergericht: Amtsgericht Freiburg, HRB 6866
> Vorstand: Volker T. Mueller
> Vorsitzender d. Aufsichtsrats: Bernd Straub
> _______________________________________________
> lxc-users mailing list
> [email protected]
> http://lists.linuxcontainers.org/listinfo/lxc-users
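
For anyone wanting to apply the same reading to their own syslog, here is a rough Python sketch. It is only a heuristic built on the observation in the reply above: if mem_cgroup_oom_synchronize shows up as a frame in the call trace, the OOM was most likely handled by the memory cgroup rather than host-wide. The function names (trace_symbols, looks_like_cgroup_oom) and the regular expression are my own illustration, not part of LXC or the kernel, and the pattern assumes each call-trace frame sits on one unwrapped log line.

```python
import re

# Symbol the reply points at. Treating its presence in a call trace as
# "the memory cgroup limit was hit" is a heuristic, not a kernel guarantee.
MEMCG_SYMBOL = "mem_cgroup_oom_synchronize"

# Matches call-trace frames such as
#   [<ffffffff811b388c>] mem_cgroup_oom_synchronize+0x4fc/0x540
#   [<ffffffff8136e9ed>] ? __write_lock_failed+0xd/0x20
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?([A-Za-z0-9_.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def trace_symbols(log_text):
    """Return the function names found in kernel call-trace frames, in order."""
    return FRAME_RE.findall(log_text)


def looks_like_cgroup_oom(log_text):
    """True if the trace contains the memory cgroup OOM handler symbol."""
    return MEMCG_SYMBOL in trace_symbols(log_text)


if __name__ == "__main__":
    sample = (
        "Aug 26 13:32:45 ff01 kernel: [87279.009470] [<ffffffff811b388c>] "
        "mem_cgroup_oom_synchronize+0x4fc/0x540\n"
        "Aug 26 13:32:45 ff01 kernel: [87279.009502] [<ffffffff81724448>] "
        "page_fault+0x28/0x30\n"
    )
    print(trace_symbols(sample))
    print(looks_like_cgroup_oom(sample))
```

Note that the log excerpts in this thread are wrapped by the mail client, so they would need to be rejoined into single lines before feeding them to the sketch.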
