While running some fairly memory-hungry jobs I hit a state where the
wait channel (wchan) in top was showing "fltamap" and the machine locked
up (I couldn't even enter DDB).

That must be this (top truncates the wait channel name):

                /* didn't work?  must be out of RAM.  sleep. */
                if (UVM_ET_ISNEEDSCOPY(ufi->entry)) {
                        uvmfault_unlockmaps(ufi, TRUE);
                        uvm_wait("fltamapcopy");
                        continue;
                }

I was monitoring top to see if I was getting close to the available
memory and it reported plenty free.

Is there a way I can identify when I'm getting close to this state
so I can kill a job rather than crash the machine? (I have a few jobs
that I need to get run as quickly as possible.)


Memory: Real: 2920M/5017M act/tot Free: 11G Cache: 165M Swap: 0K/16G

  PID USERNAME PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
15768 sthen    -18    0 3000M 2192M sleep     fltamap  55:29 88.92% perl
25439 sthen    -18    0  377M  392M sleep     fltamap   3:08 87.94% perl
 8523 sthen    -18    0  306M  319M sleep     fltamap   1:04 83.35% perl
15304 _tomcat   54    0 1156M 2208K idle      thrslee  10:13 19.78% java
26520 sthen    -18    0  804K 2224K sleep     fltamap   0:03 14.89% top
27905 _postgre -18    0  147M 5520K sleep     fltamap   3:52 12.30% postgres
...
