Welcome, Mr. Spanka.

In such cases it is important to understand what exactly
blocks the other processes outside the affected container.
It does not look like your issue is related to memory:
privvmpages messages and fail counters should not cause the described problem.
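If you want to double-check that, you can watch whether the fail counters keep
growing while the node is otherwise healthy. A minimal sketch (the per-field
output names are an assumption about what your vzlist build supports; CT 371 is
taken from your report):

  # current privvmpages usage and fail counter for CT 371
  vzlist -o ctid,privvmpages,privvmpages.f 371

  # or read the raw beancounters from the kernel (failcnt is the last column)
  cat /proc/user_beancounters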

Your container probably had some other activity too
and ate all of the disk I/O or the whole network bandwidth.
The container could also have consumed all of the CPU resources;
CPU is not limited by beancounters.
Do you have any such statistics on your node? Can you check them?
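For example, something like the following would show where the pressure comes
from. This is only a sketch; the /proc/bc path is an assumption about the
interfaces exposed by your kernel, and the CTID is just the one from your report:

  # node-wide disk and CPU load (iostat/vmstat come from sysstat/procps)
  iostat -x 5
  vmstat 5

  # per-container load average as reported by vzlist
  vzlist -o ctid,laverage

  # per-container I/O accounting exposed by the OpenVZ kernel (adjust the CTID)
  cat /proc/bc/371/ioacct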

On native Virtuozzo we have traffic shaping, which allows limiting a
container's outgoing traffic.
You should also be able to limit disk I/O for the affected container; ploop
allows it.
Container CPU can be limited too -- please check it.
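A minimal sketch of such limits with vzctl (CT 371 is taken from your report,
and the exact values are only placeholders for your setup):

  # limit disk I/O bandwidth and IOPS for the ploop-based container
  vzctl set 371 --iolimit 10M --iopslimit 300 --save

  # cap CPU usage at two full cores and lower the scheduling weight
  vzctl set 371 --cpulimit 200 --cpuunits 500 --save

  # shape outgoing traffic (needs TRAFFIC_SHAPING=yes in /etc/vz/vz.conf);
  # the rate is given as dev:class:Kbits
  vzctl set 371 --rate eth0:1:8192 --save

With --cpulimit the container gets a hard ceiling, while --cpuunits only
changes its relative weight when the node is under contention.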

Unfortunately we do not know what the 2.6.32-166 Proxmox kernel is; it is
probably based on one of our old kernels.
In general I would advise you to use the latest version of our OpenVZ kernel,
2.6.32-042stab113.10:
https://openvz.org/Download/kernel/rhel6-testing/042stab113.10

Thank you,
        Vasily Averin

PS. If you observe the problem again, please try to get the list of blocked
processes by using the Magic SysRq key (Alt+SysRq+W).
You can press it on a local console (if you have direct access to the affected
server) or trigger it via "echo w > /proc/sysrq-trigger" if you have a working
shell.

On 09.12.2015 17:31, Henry Spanka wrote:
> Hey OpenVZ users,
> 
> I’m currently encountering a weird issue but don’t know how to fix it.
> Some containers on our node are freaking out sometimes. That’s not the issue;
> it’s customer related.
> However, they almost take the node down. The host node has an I/O delay of
> 30-50% at that time and needs about 5 minutes to
> become stable again. SSH login is (almost) impossible. The shell is not running
> properly.
> 
> Dmesg reports the following:
> __ratelimit: 1892 callbacks suppressed
> Fatal resource shortage: privvmpages, UB 371.
> Fatal resource shortage: privvmpages, UB 371.
> Fatal resource shortage: privvmpages, UB 371.
> Fatal resource shortage: privvmpages, UB 371.
> 
> After taking a look at the bean counters of that container I got the 
> following:
> ----------------------------------------------------------------
> CT 371       | HELD Bar% Lim%| MAXH Bar% Lim%| BAR | LIM | FAIL
> -------------+---------------+---------------+-----+-----+------
>      kmemsize|45.8M   -    - | 223M   -    - |   - |   - |    -
>   lockedpages|   -    -    - |  32K   -    - |   4G|   4G|    -
>   privvmpages|3.25G  27%  27%|  12G 100% 100%|  12G|  12G|  303K
>      shmpages| 114M   -    - | 147M   -    - |   - |   - |    -
>       numproc| 127    -    - | 318    -    - |   - |   - |    -
>     physpages|1.08G   -   26%|   4G   -  100%|   - |   4G|    -
>   vmguarpages|   -    -    - |   -    -    - |   8G|   - |    -
>  oomguarpages|1008M  24%   - |2.45G  61%   - |   4G|   - |    -
>    numtcpsock|  31    -    - | 173    -    - |   - |   - |    -
>      numflock|  21    -    - |  48    -    - |   - |   - |    -
>        numpty|   -    -    - |   1    -    - |   - |   - |    -
>    numsiginfo|   -    -    - | 102    -    - |   - |   - |    -
>     tcpsndbuf|1.22M   -    - |16.6M   -    - |   - |   - |    -
>     tcprcvbuf| 496K   -    - | 2.7M   -    - |   - |   - |    -
>  othersockbuf|81.3K   -    - |5.72M   -    - |   - |   - |    -
>   dgramrcvbuf|   -    -    - | 117K   -    - |   - |   - |    -
>  numothersock|  66    -    - | 284    -    - |   - |   - |    -
>    dcachesize|24.8M   -    - | 178M   -    - |   - |   - |    -
>       numfile|2.69K   -    - |3.13K   -    - |   - |   - |    -
>     numiptent|  62    -    - |  62    -    - |   - |   - |    -
>     swappages| 367M   -    9%| 986M   -   24%|   - |   4G|    -
> 
> A failure count of 300,000 on privvmpages is not normal. However, I’m using
> vSwap and the RAM is limited to 12G.
> The node has 30GB of its 64GB free, so that’s not the issue.
> 
> Does anyone have a clue why the host is almost going down? A single container
> shouldn’t affect the host’s performance.
> 
> Currently running pve-kernel-2.6.32-43-pve (2.6.32-166) with vzctl 4.9-4.
> 
> Containers are using the ploop layout and reside on an LVM volume (root LVM
> partition).
> 
> Thank you for your time.
> -----------------------------------------------------------------------------------------
> 
> If you have any further questions, please let us know.
> 
> With best regards
> Henry Spanka | myVirtualserver Development Team
_______________________________________________
Users mailing list
[email protected]
https://lists.openvz.org/mailman/listinfo/users