I'll try to reproduce and collect logs for a bug report. Thanks for the info.
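One thing I plan to capture while reproducing is the per-NUMA-node huge page state that Chris asked about (quoted below). I believe the standard Linux sysfs paths show the configured and free 2MB pages for each node - a rough sketch, assuming 2MB pages (adjust the hugepages-* directory for other page sizes):

    # configured and free 2MB huge pages, per NUMA node
    grep . /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages
    grep . /sys/devices/system/node/node*/hugepages/hugepages-2048kB/free_hugepages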
PCM

On Thu, Jun 9, 2016 at 9:43 AM Matt Riedemann <mrie...@linux.vnet.ibm.com> wrote:

> On 6/9/2016 6:15 AM, Paul Michali wrote:
> >
> > On Wed, Jun 8, 2016 at 11:21 PM Chris Friesen <chris.frie...@windriver.com> wrote:
> >
> > On 06/03/2016 12:03 PM, Paul Michali wrote:
> > > Thanks for the link Tim!
> > >
> > > Right now, I have two things I'm unsure about...
> > >
> > > One is that I had 1945 huge pages left (of size 2048k) and tried to create a VM
> > > with a small flavor (2GB), which should need 1024 pages, but Nova indicated that
> > > it wasn't able to find a host (and QEMU reported an allocation issue).
> > >
> > > The other is that VMs are not being evenly distributed on my two NUMA nodes, and
> > > instead are getting created all on one NUMA node. Not sure if that is expected
> > > (and whether setting mem_page_size to 2048 is the proper way).
> >
> > Just in case you haven't figured out the problem...
> >
> > Have you checked the per-host-NUMA-node 2MB huge page availability on your host?
> > If it's uneven, then that might explain what you're seeing.
> >
> > These are the observations/questions I have:
> >
> > 1) On the host, I was seeing 32768 huge pages of 2MB size. When I created VMs
> > (Cirros) using the small flavor, each VM was getting created on NUMA nodeid 0.
> > When it hit half of the available pages, I could no longer create any VMs (QEMU
> > saying no space). I'd like to understand why the assignment was always going to
> > nodeid 0, and to confirm that the huge pages are divided among the number of
> > NUMA nodes available.
> >
> > 2) I changed mem_page_size from 1024 to 2048 in the flavor, and then when VMs
> > were created, they were being evenly assigned to the two NUMA nodes, each using
> > 1024 huge pages. At this point I could create more than half, but when there were
> > 1945 pages left, it failed to create a VM. Did it fail because the mem_page_size
> > was 2048 and the available pages were 1945, even though we were only requesting
> > 1024 pages?
> >
> > 3) Related to #2, is there a relationship between mem_page_size, the allocation
> > of VMs to NUMA nodes, and the flavor size? IOW, if I use the medium flavor (4GB),
> > will I need a larger mem_page_size? (I'll play with this variation as soon as I
> > can.) This gets back to understanding how the scheduler determines how to assign
> > the VMs.
> >
> > 4) When the VM create failed due to QEMU failing the allocation, the VM went to
> > error state. I deleted the VM, but the neutron port was still there, and there
> > were no log messages indicating that a request was made to delete the port. Is
> > this expected (that the user would have to manually clean up the port)?
>
> When you hit this case, can you check if instance.host is set in the database
> before deleting the instance? I'm guessing what's happening is the instance didn't
> get assigned a host since it eventually ended up with NoValidHost, so when you go
> to delete it, it doesn't have a compute to send the delete to, so it deletes from
> the compute API, and we don't have the host binding details to delete the port.
>
> Although, when the spawn failed in the compute to begin with, we should have
> deallocated any networking that was created before kicking back to the scheduler -
> unless we don't go back to the scheduler if the instance is set to ERROR state.
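Replying inline here: when I reproduce I'll check whether instance.host is set before deleting. I'm assuming either of these would show it - the admin-only OS-EXT-SRV-ATTR:host field from nova show, or the instances table directly (MySQL backend assumed; <instance-uuid> is a placeholder):

    # host binding as reported by the compute API (admin view)
    nova show <instance-uuid> | grep 'OS-EXT-SRV-ATTR:host'
    # or straight from the Nova database
    mysql nova -e "SELECT uuid, host, node, vm_state FROM instances WHERE uuid = '<instance-uuid>';"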
>
> A bug report with a stack trace of the failure scenario when the instance goes to
> error state, plus the n-cpu logs, would probably help.
>
> >
> > 5) A coworker had hit the problem mentioned in #1, with exhaustion at the halfway
> > point. If she deletes a VM, and then changes the flavor to set mem_page_size to
> > 2048, should Nova start assigning all new VMs to the other NUMA node, until the
> > pool of huge pages is down to where the huge pages are for NUMA node 0, or will
> > it alternate between the available NUMA nodes (and run out when node 0's pool is
> > exhausted)?
> >
> > Thanks in advance!
> >
> > PCM
> >
> >
> > Chris
>
> --
>
> Thanks,
>
> Matt Riedemann
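In case the exact flavor tweak matters for #5: the page size is being set as a flavor extra spec, which I believe is the hw:mem_page_size key - for example (m1.small is just an example flavor name):

    # 2048 KiB pages, i.e. 2MB huge pages, for guests of this flavor
    nova flavor-key m1.small set hw:mem_page_size=2048

With that explicit, a 2GB small flavor wants 1024 pages from a single node, which I think is why the even split across the two nodes only showed up once the page size was set.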