While running some fairly memory-hungry jobs I hit a state where wchan
in top was showing "fltamap" and the machine locked up (couldn't enter
DDB).
That must be this (top truncates the wait channel name):
	/* didn't work?  must be out of RAM.  sleep. */
	if (UVM_ET_ISNEEDSCOPY(ufi->entry)) {
		uvmfault_unlockmaps(ufi, TRUE);
		uvm_wait("fltamapcopy");
		continue;
	}
I was monitoring top to see if I was getting close to the available
memory and it reported plenty free.
Is there a way to tell when I'm getting close to this state, so I can
kill a job rather than crash the machine? (I have a few jobs that I
need to get run as quickly as possible.)
Memory: Real: 2920M/5017M act/tot Free: 11G Cache: 165M Swap: 0K/16G
  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
15768 sthen    -18    0 3000M 2192M sleep     fltamap  55:29 88.92% perl
25439 sthen    -18    0  377M  392M sleep     fltamap   3:08 87.94% perl
 8523 sthen    -18    0  306M  319M sleep     fltamap   1:04 83.35% perl
15304 _tomcat   54    0 1156M 2208K idle      thrslee  10:13 19.78% java
26520 sthen    -18    0  804K 2224K sleep     fltamap   0:03 14.89% top
27905 _postgre -18    0  147M 5520K sleep     fltamap   3:52 12.30% postgres
...