On 06/18/2018 10:16 AM, Artom Lifshitz wrote:
For Rocky I'm trying to get live migration to work properly for
instances that have a NUMA topology.
A question that came up on one of the patches is how to handle
resource claims on the destination, or indeed whether to handle them at all.
The previous attempt's approach (call it A) was to use the
resource tracker. This is race-free and the "correct" way to do it,
but the code is pretty opaque and not easily reviewable, as evidenced
by its sitting in review purgatory for literally years.
A simpler approach (call it B) is to ignore resource claims entirely
for now and wait for NUMA in placement to land in order to handle it
that way. This is obviously race-prone and not the "correct" way of
doing it, but the code would be relatively easy to review.
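To make the trade-off concrete, here is a toy sketch of the check-then-commit race that approach B tolerates (the class and method names are illustrative, none of this is Nova code): two concurrent incoming migrations each read the destination's free pinned CPUs before either one records its usage, so both land on the same host CPUs.

```python
# Toy sketch of the race approach B leaves open -- illustrative only.
# Two incoming live migrations both inspect the destination's free pinned
# CPUs before either one commits, so both end up pinned to the same CPUs.

class DestHost:
    """A destination host tracking which CPUs are free for pinning."""

    def __init__(self, cpus):
        self.free = set(cpus)

    def pick_cpus(self, count):
        # Step 1: read free CPUs. Without a claim, nothing is reserved here.
        return set(sorted(self.free)[:count])

    def commit(self, cpus):
        # Step 2: the migration lands and the CPUs are marked as used.
        self.free -= cpus

dest = DestHost(cpus=[0, 1, 2, 3])

# Both migrations read free CPUs before either commits (the race window).
mig_a_cpus = dest.pick_cpus(2)
mig_b_cpus = dest.pick_cpus(2)
dest.commit(mig_a_cpus)
dest.commit(mig_b_cpus)

print(mig_a_cpus == mig_b_cpus)  # True: both instances share CPUs {0, 1}
```

Approach A closes this window by making the read and the reservation a single atomic claim (pick and reserve under the resource tracker's lock), so the second migration would be forced onto the remaining CPUs instead.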
For the longest time, live migration did not keep track of resources
(until it started updating placement allocations). The message to
operators was essentially "we're giving you this massive hammer, don't
break your fingers." Continuing to ignore resource claims for now is
just maintaining the status quo. In addition, there is value in
improving NUMA live migration *now*, even if the improvement is
incomplete because it's missing resource claims. "Best is the enemy of
good" and all that. Finally, making use of the resource tracker is
just work that we know will get thrown out once we start using
placement for NUMA resources.
For all those reasons, I would favor approach B, but I wanted to ask
the community for their thoughts.
Side question... does either approach touch PCI device management during
live migration?
I ask because the only workloads I've ever seen that pin guest vCPU
threads to specific host processors -- or make use of huge pages
consumed from a specific host NUMA node -- have also made use of SR-IOV
and/or PCI passthrough. 
If workloads that use PCI passthrough or SR-IOV VFs cannot be live
migrated (due to existing complications in the lower-level virt layers),
I don't see much of a point in spending lots of developer resources trying
to "fix" this situation when in the real world, only a mythical workload
that uses CPU pinning or huge pages but *doesn't* use PCI passthrough or
SR-IOV VFs would be helped by it.
[1] I know I'm only one person, but every workload I've seen that
requires pinned CPUs and/or huge pages is a VNF that has been
essentially an ASIC that a telco OEM/vendor has converted into software
and requires the same guarantees that the ASIC and custom hardware gave
the original hardware-based workload. These VNFs, every single one of
them, used either PCI passthrough or SR-IOV VFs to handle
latency-sensitive network I/O.
OpenStack Development Mailing List (not for usage questions)