> On 27/04/2016, at 12:48 PM, Jerry Jelinek <[email protected]> wrote:
> 
> Can you provide more information about what did not work when you set the 
> zone's memory cap?

This afternoon an enormously overloaded but completely 'stock' Debian 8 lx
zone, running on VMware Fusion, live-locked completely. It would ping but was
otherwise totally unresponsive. I've had the same problem on two physical
machines today, too. Both were 8GB machines with the zones capped at 4GB, with
4 and 8 physical cores, and the images were 20160330T234717Z and
20160414T011743Z. I was running a make -j8 of mesos.

I had iostat and vmstat running in the global zone. iostat shows...

   tty       lofi1        ramdisk1        sd0           sd1            cpu
 tin tout kps tps serv  kps tps serv  kps tps serv  kps tps serv   us sy dt id
   0  168   4   1    1   64  16    0  8155 520    2    0   0    0    2 21  0 76
   0  328   0   0    0    0   0    0  6101 485    2    0   0    0    3 24  0 73
   0  167   0   0    0    0   0    0  2473 309    3    0   0    0    3 19  0 78
   0  171   0   0    0    0   0    0  7583 401    2    0   0    0    2 21  0 77
   0  169   4   1    2   64  16    0  6717 523    2    0   0    0    3 23  0 75
   0  170   0   0    0    0   0    0  6933 497    3    0   0    0    4 33  0 63
   0  170   0   0    0    0   0    0  5944 467    3    0   0    0    5 39  0 56

This is the expected behaviour for something that's thrashing. Note in
particular how little CPU time userland is getting. From vmstat:

 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr lf rm s0 s1   in   sy   cs us sy id
 1 0 158 2131188 16344 17 652 788 1992 2192 0 416011 0 0 505 0 32474 660 75736 2 19 78
 0 0 158 2129876 16324 32 434 624 1996 2131 0 422930 0 0 432 0 28819 663 65128 2 20 78
 0 0 158 2128736 16336 25 623 1002 2213 2336 0 419447 0 0 445 0 30441 674 75889 4 29 67
 0 0 158 2127176 16320 32 655 905 2254 2579 0 424660 0 0 490 0 30580 666 74210 4 34 62
 1 0 158 2126084 16328 13 517 876 2064 2180 0 422137 0 0 747 0 37349 660 87374 3 23 75
 1 0 159 2124800 16344 20 607 829 2186 2338 0 414812 0 0 376 0 33842 672 78496 3 20 77
 0 0 159 2123412 16332 34 738 1155 2821 2925 0 420288 0 0 555 0 41055 659 95112 5 22 73
 0 0 159 2121728 16244 13 454 673 1597 1714 0 423595 0 0 405 0 25314 666 63283 3 36 61
 5 0 159 2120740 16348 66 2521 2331 1100 1268 0 346933 0 7 735 0 14111 707 66809 12 45 43
 5 0 159 2141400 25396 49 3875 4155 1876 1960 0 217768 0 0 1023 0 20443 699 105580 17 49 33
 0 0 159 2133576 16308 12 788 1296 2765 2977 0 440330 0 0 633 0 43350 659 107305 4 24 72
 0 0 159 2132144 16328 15 635 970 2148 2599 0 373470 0 0 498 0 33025 665 81605 4 31 66

Just 16MB free, which is presumably what is causing the thrashing in the first
place.
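
For anyone wanting to check the 16MB figure, here is a minimal sketch that
pulls it out of the first vmstat row quoted above. It assumes the free column
is reported in kilobytes and that sr is the page scanner's scan rate, as on
illumos vmstat. The field list just mirrors the header above, and the names
parse_vmstat_row, summarize and cpu_sy are my own, made up for illustration.

```python
# Column names copied from the vmstat header in this mail. The header has two
# "sy" columns (syscalls and system CPU); the second is renamed "cpu_sy" here
# purely so the dictionary keys stay unique (hypothetical name).
FIELDS = ["r", "b", "w", "swap", "free", "re", "mf", "pi", "po", "fr", "de",
          "sr", "lf", "rm", "s0", "s1", "in", "sy", "cs", "us", "cpu_sy", "id"]


def parse_vmstat_row(line):
    """Split one (un-wrapped) vmstat data row into a dict keyed by column."""
    values = [int(tok) for tok in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


def summarize(row):
    """Pick out the columns that matter for spotting thrashing."""
    return {
        "free_mb": row["free"] / 1024,  # assumes "free" is in kilobytes
        "scan_rate": row["sr"],         # pages scanned per second
        "user_cpu_pct": row["us"],
    }


# First data row from the vmstat output above, joined onto one line.
sample = ("1 0 158 2131188 16344 17 652 788 1992 2192 0 416011 0 0 "
          "505 0 32474 660 75736 2 19 78")

summary = summarize(parse_vmstat_row(sample))
print(f"free: {summary['free_mb']:.1f} MB, "
      f"scan rate: {summary['scan_rate']}, "
      f"user cpu: {summary['user_cpu_pct']}%")
```

The scan rate is the other number worth watching here: it stays in the
hundreds of thousands for the whole capture while free memory never recovers.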

While screeching to a halt is perhaps the expected behaviour, the unfortunate
part is that it takes down the global zone as well: the global zone becomes
unresponsive, and the machine is effectively lost.

In the name of science I ran it on Joyent's public infrastructure. It was quite 
stoic and made it to the end of the build, but running a make -j 32 gives you 
lots of

g++: internal compiler error: Bus error (program cc1plus)
Please submit a full bug report,
with preprocessed source if appropriate.
See <file:///usr/share/doc/gcc-4.9/README.Bugs> for instructions.
Makefile:6342: recipe for target 'slave/libmesos_no_3rdparty_la-http.lo' failed
make[2]: *** [slave/libmesos_no_3rdparty_la-http.lo] Error 1

and no further prompt. I did get a "the system is going down for poweroff"
notification, however. Presumably the Joyent zone is running on something
physically much larger, so the 'boom' was comparatively smaller.

I have an image of the crash in the public infrastructure but can't for the 
life of me find the button to make it public :(

Is there anything stupid I've done? I've repeated this on four separate 
platforms now so it's not some stupid hardware error. Do I just have excessive 
expectations?

Thanks,
Dave





