So after cat'ing around in /proc, I can see that the JVM is apparently blocked on disk I/O:
root@i-bde90adc:~# cat /proc/24689/status
> Name: java
> State: D (disk sleep)
> Tgid: 24689
> Pid: 24689
> PPid: 24629
> TracerPid: 0
> Uid: 997 997 997 997
> Gid: 997 997 997 997
> FDSize: 512
> Groups:
> VmPeak: 2198700 kB
> VmSize: 2196016 kB
> VmLck: 0 kB
> VmPin: 0 kB
> VmHWM: 305276 kB
> VmRSS: 296828 kB
> VmData: 2040128 kB
> VmStk: 156 kB
> VmExe: 4 kB
> VmLib: 32692 kB
> VmPTE: 1292 kB
> VmSwap: 0 kB
> Threads: 104
> SigQ: 0/180427
> SigPnd: 0000000000000000
> ShdPnd: 0000000000000000
> SigBlk: 0000000000000000
> SigIgn: 0000000000000000
> SigCgt: 2000000181005ccf
> CapInh: 0000000000000000
> CapPrm: 0000000000000000
> CapEff: 0000000000000000
> CapBnd: ffffffffffffffff
> Cpus_allowed: ffff
> Cpus_allowed_list: 0-15
> Mems_allowed: 00000000,00000001
> Mems_allowed_list: 0
> voluntary_ctxt_switches: 25
> nonvoluntary_ctxt_switches: 0

Mesos looks okay:

root@i-bde90adc:~# cat /proc/3684/status
> Name: lt-mesos-slave
> State: S (sleeping)
> Tgid: 3684
> Pid: 3684
> PPid: 1131
> TracerPid: 0
> Uid: 0 0 0 0
> Gid: 0 0 0 0
> FDSize: 256
> Groups:
> VmPeak: 1531868 kB
> VmSize: 1531864 kB
> VmLck: 0 kB
> VmPin: 0 kB
> VmHWM: 15272 kB
> VmRSS: 15272 kB
> VmData: 1402012 kB
> VmStk: 136 kB
> VmExe: 152 kB
> VmLib: 31824 kB
> VmPTE: 416 kB
> VmSwap: 0 kB
> Threads: 20
> SigQ: 1/180427
> SigPnd: 0000000000000000
> ShdPnd: 0000000000000000
> SigBlk: 0000000000000000
> SigIgn: 0000000000000000
> SigCgt: 0000000180000000
> CapInh: 0000000000000000
> CapPrm: ffffffffffffffff
> CapEff: ffffffffffffffff
> CapBnd: ffffffffffffffff
> Cpus_allowed: ffff
> Cpus_allowed_list: 0-15
> Mems_allowed: 00000000,00000001
> Mems_allowed_list: 0
> voluntary_ctxt_switches: 69
> nonvoluntary_ctxt_switches: 0

Is it possible that changing the state of the freezer from thawed -> frozen causes all disk I/O to stop? I'm not sure where to look to mitigate this.

On Fri, May 10, 2013 at 11:58 AM, Brenden Matthews <[email protected]> wrote:

> The kernel gives me a bunch of "soft lockup: CPU stuck!" and "task java blocked for more than 120 seconds". I think these are fairly generic errors and can be caused by any sort of race condition on the system. Strace also isn't helpful, as it doesn't tell me which resource the processes are stuck on.
>
> I'll try and check /proc/pid/status next time it happens. Other tools like lsof are unable to run.
>
> On Fri, May 10, 2013 at 11:49 AM, Vinod Kone <[email protected]> wrote:
>
>> ---------- Forwarded message ----------
>> From: Vinod Kone <[email protected]>
>> Date: Fri, May 10, 2013 at 11:25 AM
>> Subject: Re: OOM killer stuck when freezing cgroup
>> To: "[email protected]" <[email protected]>
>>
>> Anything in the kernel logs during this time?
>>
>> @vinodkone
>>
>> On Fri, May 10, 2013 at 11:13 AM, Benjamin Mahler <[email protected]> wrote:
>>
>> > Sorry about that miscommunication; I meant to say "from the kernel OOM killer", and you're right that the OOM shouldn't be the source of the problem here (just trying to make a mental note that our accounting is wrong). OOMs are indeed expected as part of normal operation on the slave.
>> >
>> > It's tough to say what the issue is, as I don't have any experience running on newer kernels like 3.2. I would recommend augmenting the logging in the cgroups::Freezer to try to nail down at which point the slave is blocking. In particular, in Freezer::freeze() and Freezer::watchFrozen().
>> >
>> > Also, do you have /proc/<slave_pid>/status output for the slave when it's frozen?
>> >
>> > It will be really useful to know if that is consistently the last log line you see before things lock up as well, as that helps narrow this down.
>> >
>> > On Fri, May 10, 2013 at 10:44 AM, Brenden Matthews <[email protected]> wrote:
>> >
>> > > This isn't a kernel OOM. I started the debugger and let it go overnight with no luck. It just doesn't want to budge. I'm aware of the memory limit issue, but I don't think that's the real problem here. Regardless of whether we run out of memory, the whole system shouldn't freeze when trying to kill the process. I'm not sure if this is a kernel problem or a Mesos problem, but I suspect that the issue isn't with the kernel. Walking through the Mesos code, I can't see anything that is obviously wrong either. I'm using my own branch from here: https://github.com/airbnb/mesos/tree/testing.
>> > >
>> > > My next step will be to run a bunch of slaves in GDB and then launch some tasks. Hopefully I can stop them and get a backtrace. I thought it might be worth asking about it here to see if anyone else has dealt with this and knows what's going on.
>> > >
>> > > These are EC2 instances running the latest Ubuntu Precise AMIs. Here's the `uname -a' from one:
>> > >
>> > > Linux i-b5dbc1d5 3.2.0-41-virtual #66-Ubuntu SMP Thu Apr 25 03:47:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
>> > >
>> > > On Fri, May 10, 2013 at 10:21 AM, Benjamin Mahler <[email protected]> wrote:
>> > >
>> > > > Hey Brenden, what kernel version are you running? What version of the code are you running? Are you seeing this on different machines? Can you confirm you're unable to attach to the slave with gdb (that would be the easiest way to figure out what's blocking in the slave)? Is that always the last line in the log before the processes get stuck?
>> > > >
>> > > > As an aside, it looks like we aren't properly setting the JVM heap size for TaskTrackers in the MesosScheduler for Hadoop. We try to estimate the JVM memory overhead, but it was perhaps not enough to avoid kernel OOMs.
>> > > >
>> > > > On Fri, May 10, 2013 at 9:57 AM, Brenden Matthews <[email protected]> wrote:
>> > > >
>> > > > > Hey folks,
>> > > > >
>> > > > > I'm bumping into a problem frequently. When an executor exceeds its memory limit and the executor is deemed in need of execution, the entire system becomes stuck (though I'm unsure as to whether it's stuck before or after attempting to freeze). Here's what the mesos slave log looks like:
>> > > > >
>> > > > > > I0510 04:22:37.935608 13622 cgroups_isolator.cpp:1023] OOM notifier is triggered for executor executor_Task_Tracker_446 of framework 201305100055-1471680778-5050-18199-0000 with uuid 14a21d5a-a670-40fd-91b1-54c42b9a30e0
>> > > > > > I0510 04:22:37.935925 13622 cgroups_isolator.cpp:1068] OOM detected for executor executor_Task_Tracker_446 of framework 201305100055-1471680778-5050-18199-0000 with uuid 14a21d5a-a670-40fd-91b1-54c42b9a30e0
>> > > > > > I0510 04:22:37.936982 13622 cgroups_isolator.cpp:1109] Memory limit exceeded: Requested: 2570MB Used: 2570MB
>> > > > > > MEMORY STATISTICS:
>> > > > > > cache 1682083840
>> > > > > > rss 1012756480
>> > > > > > mapped_file 1085440
>> > > > > > pgpgin 22863635
>> > > > > > pgpgout 22205715
>> > > > > > swap 0
>> > > > > > pgfault 11881912
>> > > > > > pgmajfault 304
>> > > > > > inactive_anon 0
>> > > > > > active_anon 1012756480
>> > > > > > inactive_file 1065136128
>> > > > > > active_file 616947712
>> > > > > > unevictable 0
>> > > > > > hierarchical_memory_limit 2694840320
>> > > > > > hierarchical_memsw_limit 9223372036854775807
>> > > > > > total_cache 1682083840
>> > > > > > total_rss 1012756480
>> > > > > > total_mapped_file 1085440
>> > > > > > total_pgpgin 22863635
>> > > > > > total_pgpgout 22205715
>> > > > > > total_swap 0
>> > > > > > total_pgfault 11881912
>> > > > > > total_pgmajfault 304
>> > > > > > total_inactive_anon 0
>> > > > > > total_active_anon 1012756480
>> > > > > > total_inactive_file 1065136128
>> > > > > > total_active_file 616947712
>> > > > > > total_unevictable 0
>> > > > > > I0510 04:22:37.965666 13622 cgroups_isolator.cpp:620] Killing executor executor_Task_Tracker_446 of framework 201305100055-1471680778-5050-18199-0000
>> > > > > > I0510 04:22:37.967418 13621 cgroups.cpp:1175] Trying to freeze cgroup /cgroup/mesos/framework_201305100055-1471680778-5050-18199-0000_executor_executor_Task_Tracker_446_tag_14a21d5a-a670-40fd-91b1-54c42b9a30e0
>> > > > >
>> > > > > After the last line in the log, both the executing task and the mesos process are stuck indefinitely. Usually the way to resolve this is `killall -9 java lt-mesos-slave' or a reboot.
>> > > > >
>> > > > > If I cat /cgroup/mesos/framework.../freezer.state, it shows that it's freezing. Since the entire system is stuck, I cannot attach a debugger to either the JVM or mesos.
>> > > > >
>> > > > > Any ideas? This problem is very frustrating and I'm stuck at this point.
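
To make the freezer discussion above concrete, here is roughly what the "Trying to freeze cgroup" step drives at the kernel interface (the cgroup v1 freezer). This is a simplified shell sketch, not the actual Mesos cgroups::Freezer code, and CGROUP is a placeholder for a path like the framework/executor path in the log:

    #!/bin/sh
    # Simplified sketch of a cgroup (v1) freeze; not the Mesos cgroups::Freezer
    # implementation. CGROUP is a placeholder for a path like the one in the log.
    CGROUP=/cgroup/mesos/framework_example_executor_example_tag_example

    # Ask the kernel to freeze every task in the cgroup.
    echo FROZEN > "$CGROUP/freezer.state"

    # The kernel reports FREEZING until every task has actually been frozen.
    # A task in uninterruptible (D) sleep, like the java process above, cannot
    # be frozen until its I/O completes, so this loop can spin indefinitely and
    # whatever is waiting on the freeze appears hung.
    while [ "$(cat "$CGROUP/freezer.state")" != "FROZEN" ]; do
        sleep 1
    done
    echo "cgroup frozen"

The wait loop is the point where a stuck freeze shows up: freezer.state keeps reporting FREEZING while a task sits in D state, which matches what was observed in this thread.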
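
Along the same lines, a minimal sketch of the inspection done earlier in the thread, generalized from a single PID to every task in the stuck cgroup; the path is again a placeholder:

    #!/bin/sh
    # Inspection sketch for a cgroup stuck in FREEZING; CGROUP is a placeholder.
    CGROUP=/cgroup/mesos/framework_example_executor_example_tag_example

    # THAWED, FREEZING, or FROZEN.
    echo "freezer.state: $(cat "$CGROUP/freezer.state")"

    # Name and scheduler state of every task in the cgroup. 'D (disk sleep)'
    # means uninterruptible sleep: the task can be neither frozen nor killed
    # until the I/O it is waiting on completes.
    for pid in $(cat "$CGROUP/tasks"); do
        echo "--- $pid ---"
        grep -E '^(Name|State):' "/proc/$pid/status"
    done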
