Re: [Qemu-devel] Overcommiting cpu results in all vms offline

Stefan Priebe - Profihost AG Mon, 17 Sep 2018 00:02:11 -0700

Hi,

On 17.09.2018 at 08:38, Jack Wang wrote:
> Stefan Priebe - Profihost AG <s.pri...@profihost.ag> wrote on Sun, 16 Sep 2018 at 3:31 PM:
>>
>> Hello,
>>
>> while overcommitting CPU I had several situations where all VMs went offline
>> while two VMs saturated all cores.
>>
>> I believed all VMs would stay online but would just not be able to use all
>> their cores?
>>
>> My original idea was to automate live migration on high host load to move
>> VMs to another node, but that only makes sense if all VMs stay online.
>>
>> Is this expected? Is anything special needed to achieve this?
>>
>> Greets,
>> Stefan
>>
> Hi, Stefan,
> 
> Do you have any logs from when all the VMs went offline?
> Maybe the OOM killer played a role there?


After reviewing, I think this is memory related, but the OOM killer did not
play a role. All kvm processes were spinning, trying to get > 100% CPU, and I
was not even able to log in via ssh. After 5-10 minutes I was able to log in.

About 150 GB of memory was free.

Relevant settings (no local storage involved):
        vm.dirty_background_ratio: 3
        vm.dirty_ratio: 10
        vm.min_free_kbytes: 10567004

# cat /sys/kernel/mm/transparent_hugepage/defrag
always defer [defer+madvise] madvise never

# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
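
For anyone who wants to collect the same values in one go, here is a minimal
sketch. It assumes Python 3 on a Linux host and the usual /proc/sys/vm and
/sys/kernel/mm/transparent_hugepage paths; the helper names (active_choice,
read_text, kib_to_gib) are made up for illustration and are not part of any
tool mentioned in this thread.

```python
#!/usr/bin/env python3
"""Print the VM writeback and transparent hugepage settings quoted above."""
import re
from pathlib import Path

# Assumed locations: sysctl vm.* keys are exposed as files under /proc/sys/vm.
SYSCTL_KEYS = ["dirty_background_ratio", "dirty_ratio", "min_free_kbytes"]
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_choice(line):
    """Return the bracketed (active) value, e.g. 'always' from '[always] madvise never'."""
    match = re.search(r"\[([^\]]+)\]", line)
    return match.group(1) if match else None


def read_text(path):
    """Read a small procfs/sysfs file; return None if it is missing or unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def kib_to_gib(kib):
    """Convert a value in KiB (the unit of min_free_kbytes) to GiB."""
    return kib / (1024 * 1024)


def main():
    for key in SYSCTL_KEYS:
        value = read_text(f"/proc/sys/vm/{key}")
        print(f"vm.{key} = {value}")
        if key == "min_free_kbytes" and value is not None:
            print(f"  (about {kib_to_gib(int(value)):.1f} GiB kept free)")
    for name in ("enabled", "defrag"):
        raw = read_text(THP_DIR / name)
        print(f"transparent_hugepage/{name} = {active_choice(raw) if raw else None}")


if __name__ == "__main__":
    main()
```

With the numbers from this message, vm.min_free_kbytes = 10567004 works out to
roughly 10.1 GiB that the kernel tries to keep free, and the active THP defrag
mode parses as defer+madvise.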

After that I found the following traces on the host node:
https://pastebin.com/raw/0VhyQmAv

Thanks!

Greets,
Stefan
