Hi all,

we are running an LXC host with several test containers (14 at the moment).
The host itself runs Ubuntu 14.04 with kernel 3.13.0-32. The containers
run Debian Wheezy.

From time to time the host machine crashes completely, probably because
containers are using up too much RAM. We have already limited every container
via cgroups (CPU and RAM), but still see this behaviour.

Our suspicion is that Java in some of the containers isn't limited correctly,
which leads to the host machine crashing.

Has anybody had a similar experience, or is there something missing when
limiting containers via cgroups?
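
For anyone who wants to compare against their own setup, here is a minimal
sketch of how the limits that are actually in effect could be read back on
the host. It assumes a cgroup v1 memory hierarchy mounted at
/sys/fs/cgroup/memory with containers under an lxc/ subdirectory; that path
is an assumption and may differ on other systems. The function names are
hypothetical. memory.memsw.limit_in_bytes only exists when swap accounting
is enabled, so a missing file is reported as None.

```python
import os

# Assumption: cgroup v1 memory controller, containers grouped under "lxc".
DEFAULT_BASE = "/sys/fs/cgroup/memory/lxc"


def read_int(path):
    """Return the integer stored in a cgroup control file, or None if unreadable."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def container_limits(base=DEFAULT_BASE):
    """Map container name -> memory limit, memory+swap limit and failure count."""
    result = {}
    if not os.path.isdir(base):
        return result
    for name in sorted(os.listdir(base)):
        cg = os.path.join(base, name)
        if not os.path.isdir(cg):
            continue  # skip the control files of the parent group itself
        result[name] = {
            "limit": read_int(os.path.join(cg, "memory.limit_in_bytes")),
            "memsw_limit": read_int(os.path.join(cg, "memory.memsw.limit_in_bytes")),
            "failcnt": read_int(os.path.join(cg, "memory.failcnt")),
        }
    return result


if __name__ == "__main__":
    for name, info in container_limits().items():
        print(name, info)
```

A container whose "memsw_limit" is None has a RAM limit but no combined
RAM+swap limit, which would be one thing to rule out.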

These are the syslog entries for the last crash (host machine + one of the
containers):

#### HOST ####
Aug 26 13:33:10 node04 kernel: [87282.555841] Modules linked in:<4>[87282.555841] Call Trace:
Aug 26 13:33:10 node04 kernel: [87282.555841]  [<ffffffff811458c4>] perf_event_overflow+0x14/0x20
Aug 26 13:33:10 node04 kernel: [87282.555841]  [<ffffffff8136e9ed>] ? __write_lock_failed+0xd/0x20
Aug 26 13:33:10 node04 kernel: [87282.555841] ---[ end trace 71798cbdeee56afd ]---
Aug 26 13:33:10 node04 kernel: [87304.156008] RAX: ffff881018a2b2e8 RBX: ffff880815771e28 RCX: 0000000000000006
Aug 26 13:33:10 node04 kernel: [87304.156008] Stack:
Aug 26 13:33:10 node04 kernel: [87304.156008]  [<ffffffff81152384>] pagefault_out_of_memory+0x14/0x80
Aug 26 13:33:10 node04 kernel: [87304.156008]  [<ffffffff81727fda>] do_page_fault+0x1a/0x70
Aug 26 13:33:18 node04 kernel: [87312.204006] FS:  00007f58c7c59700(0000) GS:ffff88103f940000(0000) knlGS:0000000000000000
Aug 26 13:33:18 node04 kernel: [87312.204006] Stack:
Aug 26 13:33:18 node04 kernel: [87312.204006]  ffff880ab78e3c70 ffffffff8160c3a8 ffff88103f914f00 ffff88103f914f00
Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff810dba85>] smp_call_function_single+0xe5/0x190
Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff810dbeb6>] smp_call_function_many+0x286/0x2d0
Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff811814d5>] change_protection+0x65/0xb0
Aug 26 13:33:18 node04 kernel: [87312.204006]  [<ffffffff8172423c>] retint_signal+0x48/0x8c
##############

#### Container ####
Aug 26 13:32:45 ff01 kernel: [87279.009427] CPU: 0 PID: 773 Comm: java Tainted: GF          O 3.13.0-32-generic #57-Ubuntu
Aug 26 13:32:45 ff01 kernel: [87279.009442] Call Trace:
Aug 26 13:32:45 ff01 kernel: [87279.009470]  [<ffffffff811b388c>] mem_cgroup_oom_synchronize+0x4fc/0x540
Aug 26 13:32:45 ff01 kernel: [87279.009502]  [<ffffffff81724448>] page_fault+0x28/0x30
Aug 26 13:32:45 ff01 kernel: [87279.009650] [23109]     0 23109    32444      422      29        0             0 console-kit-dae
Aug 26 13:32:45 ff01 kernel: [87279.009779] [  714]  1000   714     4999      182      12        0             0 wrapper-linux-x
###################
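
The two bracketed per-process lines in the container excerpt look like rows
of the OOM killer's task table. A minimal sketch of how such rows could be
turned into per-process memory figures follows. It assumes each syslog entry
sits on a single line, that the columns are pid, uid, tgid, total_vm, rss,
nr_ptes, swapents, oom_score_adj, name (the header line is not part of the
excerpt, so this order is an assumption), and that pages are 4 KiB. The
function name parse_oom_tasks is hypothetical.

```python
import re

PAGE_KIB = 4  # assumption: 4 KiB pages

# Assumed column order: [pid] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
TASK_RE = re.compile(
    r"\[\s*(?P<pid>\d+)\]\s+(?P<uid>\d+)\s+(?P<tgid>\d+)\s+(?P<total_vm>\d+)\s+"
    r"(?P<rss>\d+)\s+(?P<nr_ptes>\d+)\s+(?P<swapents>\d+)\s+"
    r"(?P<oom_score_adj>-?\d+)\s+(?P<name>\S+)\s*$"
)


def parse_oom_tasks(lines):
    """Return one dict per task-table row, sorted by resident memory (largest first)."""
    tasks = []
    for line in lines:
        m = TASK_RE.search(line)
        if m is None:
            continue  # not a task-table row (call trace, registers, ...)
        tasks.append({
            "pid": int(m.group("pid")),
            "uid": int(m.group("uid")),
            "name": m.group("name"),
            "rss_kib": int(m.group("rss")) * PAGE_KIB,
            "total_vm_kib": int(m.group("total_vm")) * PAGE_KIB,
        })
    return sorted(tasks, key=lambda t: t["rss_kib"], reverse=True)


SAMPLE = [
    "Aug 26 13:32:45 ff01 kernel: [87279.009442] Call Trace:",
    "Aug 26 13:32:45 ff01 kernel: [87279.009650] [23109]     0 23109    32444"
    "      422      29        0             0 console-kit-dae",
    "Aug 26 13:32:45 ff01 kernel: [87279.009779] [  714]  1000   714     4999"
    "      182      12        0             0 wrapper-linux-x",
]

if __name__ == "__main__":
    for task in parse_oom_tasks(SAMPLE):
        print(task["pid"], task["name"], task["rss_kib"], "KiB resident")
```

Run over the full task table of an OOM report, this would show which
processes in the container held the most resident memory when the limit
was hit.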

Please tell me if you need additional information.

Regards,
Danijel Vargek

-- 
Danijel Vargek
Systemadministrator Unix

Continum AG
Bismarckallee 7b-d
D-79098 Freiburg i. Br.
Tel.: +49 761 217111-77
Fax.: +49 761 217111-99
http://www.continum.net

Sitz der Gesellschaft: Freiburg im Breisgau
Registergericht: Amtsgericht Freiburg, HRB 6866
Vorstand: Volker T. Mueller
Vorsitzender d. Aufsichtsrats: Bernd Straub
_______________________________________________
lxc-users mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-users
