[Yahoo-eng-team] [Bug 1950186] Re: Nova doesn't account for hugepages when scheduling VMs
This is not a bug, it is user error. When using hugepages, if you want to have non-hugepage guests on the same host, then you must set hw:mem_page_size=small or hw:mem_page_size=4k for all non-hugepage guests. We do not support memory oversubscription when using hw:mem_page_size, and setting it also gives the guest one implicit NUMA node.

We intentionally do not support mixing NUMA and non-NUMA guests on the same host, which is what happens if you do not use hw:mem_page_size=small. When hw:mem_page_size is not set, we do not do page-size/NUMA-node-aware scheduling. The reason you are having the current issue is that you are mixing NUMA and non-NUMA instances on the same host, which has never been supported in Nova. We may eventually support this in the distant future, but we have no plans to support it in Zed, and no one has proposed a way to support it upstream yet. It is a very non-trivial feature and would require us to effectively make all instances NUMA instances.

We cannot support mixing floating instances and NUMA-affined instances on the same host today due to how we do NUMA affinity and how that interacts with the kernel OOM reaper. Basically, the OOM reaper operates per NUMA node, not globally: if the kernel needs memory on NUMA node 0 and cannot free it there, it will kill processes to free memory on node 0 even if there is free memory on another NUMA node. That will often result in a NUMA-affined non-hugepage guest being killed when a floating guest is spawned and triggers an OOM event. That is not something we can allow to happen, as it is a multi-tenant issue, so we cannot support mixing NUMA and non-NUMA instances on the same host.

The workaround to use hugepage and non-hugepage guests on the same host is therefore to give all guests NUMA affinity by setting hw:mem_page_size.
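As a concrete illustration of the workaround described above, the extra spec can be set on every flavor that may land on the host (the flavor names here are placeholders, not names from this bug):

```shell
# Hugepage-backed guests: request large pages explicitly.
openstack flavor set hugepage.flavor --property hw:mem_page_size=large

# All other guests on the same host: pin them to small (4k) pages so
# they also get an implicit NUMA topology and page-size-aware scheduling.
openstack flavor set small.page.flavor --property hw:mem_page_size=small
```

With hw:mem_page_size set on every flavor that can land on the host, all guests become NUMA-affined and Nova accounts for page sizes when placing them.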
This is a well-known limitation and not a bug, so I'm closing this as Won't Fix.

** Changed in: nova
   Status: Confirmed => Won't Fix

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1950186

Title:
  Nova doesn't account for hugepages when scheduling VMs

Status in OpenStack Compute (nova):
  Won't Fix
[Yahoo-eng-team] [Bug 1950186] Re: Nova doesn't account for hugepages when scheduling VMs
** Package changed: nova (Ubuntu) => nova

Title:
  Nova doesn't account for hugepages when scheduling VMs

Status in OpenStack Compute (nova):
  Confirmed

Bug description:

  Description
  ===========
  When hugepages are enabled on the host it's possible to schedule VMs using more RAM than is available. On the node with the memory usage presented below, it was possible to schedule 6 instances using a total of 140G of memory with a non-hugepages-enabled flavor. The same machine has 188G of memory in total, of which 64G were reserved for hugepages. An additional ~4G were used for housekeeping, the OpenStack control plane, etc. This resulted in an overcommitment of roughly 20G. After running memory-intensive operations on the VMs, some of them were OOM-killed.

  $ cat /proc/meminfo | egrep "^(Mem|Huge)"   # on the compute node
  MemTotal:       197784792 kB
  MemFree:        115005288 kB
  MemAvailable:   116745612 kB
  HugePages_Total:      64
  HugePages_Free:       64
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:    1048576 kB
  Hugetlb:        67108864 kB

  $ os hypervisor show compute1 -c memory_mb -c memory_mb_used -c free_ram_mb
  +----------------+--------+
  | Field          | Value  |
  +----------------+--------+
  | free_ram_mb    | 29309  |
  | memory_mb      | 193149 |
  | memory_mb_used | 163840 |
  +----------------+--------+

  $ os host show compute1
  +----------+----------------------------+-----+-----------+---------+
  | Host     | Project                    | CPU | Memory MB | Disk GB |
  +----------+----------------------------+-----+-----------+---------+
  | compute1 | (total)                    |   0 |    193149 |     893 |
  | compute1 | (used_now)                 |  72 |    163840 |     460 |
  | compute1 | (used_max)                 |  72 |    147456 |     460 |
  | compute1 | some_project_id_was_here   |   2 |      4096 |      40 |
  | compute1 | another_anonymized_id_here |  70 |    143360 |     420 |
  +----------+----------------------------+-----+-----------+---------+

  $ os resource provider inventory list uuid_of_compute1_node
  +----------------+------------------+----------+----------+----------+-----------+--------+
  | resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total  |
  +----------------+------------------+----------+----------+----------+-----------+--------+
  | MEMORY_MB      |              1.0 |        1 |   193149 |    16384 |         1 | 193149 |
  | DISK_GB        |              1.0 |        1 |      893 |        0 |         1 |    893 |
  | PCPU           |              1.0 |        1 |       72 |        0 |         1 |     72 |
  +----------------+------------------+----------+----------+----------+-----------+--------+

  Steps to reproduce
  ==================
  1. Reserve a large part of memory for hugepages on the hypervisor.
  2. Create VMs using a flavor that uses a lot of memory that isn't backed by hugepages.
  3. Start memory-intensive operations on the VMs, e.g.:
     stress-ng --vm-bytes $(awk '/MemAvailable/{printf "%d", $2 * 0.98;}' < /proc/meminfo)k --vm-keep -m 1

  Expected result
  ===============
  Nova should not allow overcommitment and should be able to differentiate between hugepages and "normal" memory.

  Actual result
  =============
  Overcommitment resulting in OOM kills.

  Environment
  ===========
  nova-api-metadata     2:21.2.1-0ubuntu1~cloud0
  nova-common           2:21.2.1-0ubuntu1~cloud0
  nova-compute          2:21.2.1-0ubuntu1~cloud0
  nova-compute-kvm      2:21.2.1-0ubuntu1~cloud0
  nova-compute-libvirt  2:21.2.1-0ubuntu1~cloud0
  python3-nova          2:21.2.1-0ubuntu1~cloud0
  python3-novaclient    2:17.0.0-0ubuntu1~cloud0
  OS: Ubuntu 18.04.5 LTS
  Hypervisor: libvirt + KVM

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1950186/+subscriptions

--
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
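To make the accounting gap in the report concrete, here is a small sketch (not Nova code; the 16384 MB reserved value is taken from the inventory shown above) of why the MEMORY_MB inventory over-offers memory to non-hugepage guests when hugepage-reserved memory is not excluded from it:

```python
# Sketch of the accounting gap: the host's MemTotal includes memory set
# aside for hugepages, so an inventory derived from MemTotal still
# offers that memory to 4k-page guests unless it is explicitly reserved.

MEMINFO = """\
MemTotal:       197784792 kB
HugePages_Total:      64
Hugepagesize:    1048576 kB
"""

def parse_meminfo(text):
    """Parse 'Key:  value [kB]' lines into a dict of ints (kB or counts)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields[key.strip()] = int(rest.split()[0])
    return fields

info = parse_meminfo(MEMINFO)
total_mb = info["MemTotal"] // 1024                                   # 193149 MB
hugepage_mb = info["HugePages_Total"] * info["Hugepagesize"] // 1024  # 65536 MB

# What the inventory above offers (reserved = 16384 MB) versus what is
# really left for non-hugepage guests once the 64 x 1G pages are set aside:
schedulable = total_mb - 16384
really_available = total_mb - 16384 - hugepage_mb
print(schedulable - really_available)  # 65536 MB offered but not usable
```

Under these numbers the inventory offers 65536 MB more than non-hugepage guests can actually use, which matches the ~20G overcommit observed once other consumers are counted in.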
[Yahoo-eng-team] [Bug 1950186] Re: Nova doesn't account for hugepages when scheduling VMs
This can be reproduced on Focal/ussuri:

Computes:

$ os resource provider list
+--------------------------------------+-----------------------------------------------------+------------+--------------------------------------+----------------------+
| uuid                                 | name                                                | generation | root_provider_uuid                   | parent_provider_uuid |
+--------------------------------------+-----------------------------------------------------+------------+--------------------------------------+----------------------+
| ca3fa736-7e60-4365-9cc8-7afc78b53005 | juju-98fb61-zaza-d6f2c7825043-9.project.serverstack |          5 | ca3fa736-7e60-4365-9cc8-7afc78b53005 | None                 |
| 0605bd29-71d5-40ed-ab8f-eceeaaac59b5 | juju-98fb61-zaza-d6f2c7825043-8.project.serverstack |          4 | 0605bd29-71d5-40ed-ab8f-eceeaaac59b5 | None                 |
+--------------------------------------+-----------------------------------------------------+------------+--------------------------------------+----------------------+

Memory allocation ratio is 1:

$ openstack resource provider inventory list ca3fa736-7e60-4365-9cc8-7afc78b53005
+----------------+------------------+----------+----------+----------+-----------+-------+-------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | used  |
+----------------+------------------+----------+----------+----------+-----------+-------+-------+
| VCPU           |             16.0 |        1 |        8 |        0 |         1 |     8 |     2 |
| MEMORY_MB      |              1.0 |        1 |    16008 |     2048 |         1 | 16008 | 13960 |
| DISK_GB        |              1.0 |        1 |       77 |        0 |         1 |    77 |    20 |
+----------------+------------------+----------+----------+----------+-----------+-------+-------+

$ openstack resource provider inventory list 0605bd29-71d5-40ed-ab8f-eceeaaac59b5
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | used |
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| VCPU           |             16.0 |        1 |        8 |        0 |         1 |     8 |    0 |
| MEMORY_MB      |              1.0 |        1 |    16008 |     2048 |         1 | 16008 |    0 |
| DISK_GB        |              1.0 |        1 |       77 |        0 |         1 |    77 |    0 |
+----------------+------------------+----------+----------+----------+-----------+-------+------+

Hugepages: 1000 * 2M

root@juju-98fb61-zaza-d6f2c7825043-9:~# cat /proc/meminfo | grep -i huge
AnonHugePages:    622592 kB
ShmemHugePages:        0 kB
FileHugePages:         0 kB
HugePages_Total:    1000
HugePages_Free:     1000
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:         2048000 kB

root@juju-98fb61-zaza-d6f2c7825043-9:~# free -mh
              total   used   free   shared   buff/cache   available
Mem:           15Gi  3.5Gi   11Gi    1.0Mi        713Mi        11Gi
Swap:            0B     0B     0B

Host reserved memory is 2G:

$ juju config nova-compute reserved-host-memory
2048

Available memory for general (non-hugepage) use: I expect the memory available for VMs to be 16008 (total) - 2048 (reserved) - 2048 (hugepages) = 11912.

Flavor with 13960 MB of RAM (more than the expected available 11912):
$ os flavor show 14g-mem
+----------------------------+--------------------------------------+
| Field                      | Value                                |
+----------------------------+--------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                |
| OS-FLV-EXT-DATA:ephemeral  | 0                                    |
| access_project_ids         | None                                 |
| description                | None                                 |
| disk                       | 20                                   |
| id                         | 377de58b-7aa2-499d-9940-abf98aaa5a8a |
| name                       | 14g-mem                              |
| os-flavor-access:is_public | True                                 |
| properties                 |                                      |
| ram                        | 13960                                |
| rxtx_factor                | 1.0                                  |
| swap                       |                                      |
| vcpus                      | 2                                    |
+----------------------------+--------------------------------------+

## VM with flavor 14g-mem is scheduled correctly (expected: No valid host)

$ os server list -c ID -c Name -c Status -c "Flavor"