On 09/27/2018 11:15 AM, Eric Fried wrote:
On 09/27/2018 07:37 AM, Matt Riedemann wrote:
On 9/27/2018 5:23 AM, Sylvain Bauza wrote:


On Thu, Sep 27, 2018 at 2:46 AM Matt Riedemann <mriede...@gmail.com> wrote:

     On 9/26/2018 5:30 PM, Sylvain Bauza wrote:
      > So, during this day, we also discussed NUMA affinity, and we said
      > that we could possibly use nested resource providers for NUMA cells
      > in Stein. But given we don't yet have a specific Placement API
      > query, NUMA affinity should still be handled by the
      > NUMATopologyFilter.
      > That said, when looking at how to use this filter for vGPUs, it
      > looks to me that I'd need to provide a new version of the NUMACell
      > object and modify the virt.hardware module. Are we also accepting
      > this (given it's temporary), or should we wait for the Placement
      > API support?
      >
      > Folks, what are your thoughts?

     I'm pretty sure we've said several times already that modeling NUMA
     in Placement is not something for which we're holding up the
     extraction.


It's not an extraction question. It's about knowing whether the Nova
folks would accept us modifying some o.vo object and module for a
temporary period, until the Placement API has the new query parameter.
Whether Placement is extracted or not isn't really the problem; it's
more about the time it will take for this query parameter ("numbered
request groups to be in the same subtree") to be implemented in the
Placement API.
The real problem we have with vGPUs is that if we don't have NUMA
affinity, performance would be around 10% worse for vGPUs (if the
pGPU isn't on the same NUMA cell as the pCPU). Not sure large
operators would accept that :(
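
[For illustration: granular "numbered" request groups already exist in
the Placement API (microversion 1.25), so a vGPU flavor could request
resources in separate groups today. What's missing is a way to require
those groups to be satisfied from the same subtree of the provider
tree, i.e. the same NUMA cell. A sketch of what such a request might
look like; the `same_subtree` parameter at the end is hypothetical and
does not exist in the API at this point, which is exactly the gap being
discussed:]

```
GET /allocation_candidates
    ?resources1=VGPU:1
    &resources2=VCPU:4,MEMORY_MB:8192
    &group_policy=none
    # hypothetical, not yet in the Placement API:
    &same_subtree=1,2
```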

-Sylvain

I don't know how close we are to having whatever we need for modeling
NUMA in the placement API, but I'll go out on a limb and assume we're
not close.

True story. We've been talking about ways to do this since (at least)
the Queens PTG, but haven't even landed on a decent design, let alone
talked about getting it specced, prioritized, and implemented. Since
full NRP support was going to be a prerequisite in any case, and our
Stein plate is full, Train is the earliest we could reasonably expect to
get the placement support going, let alone the nova side. So yeah...

Given that, if we have to do something within nova (via the
NUMATopologyFilter) for NUMA affinity for vGPUs, then I'd be OK with
that since it's short term like you said (although our "short term"
workarounds tend to last for many releases). Anyone who cares about
NUMA today already has to enable that scheduler filter anyway.
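
[For context: NUMATopologyFilter is not in nova's default filter list,
so operators who want NUMA-aware scheduling already opt in explicitly
in nova.conf. A minimal sketch; the other filters listed are just
illustrative, not a recommended set:]

```ini
[filter_scheduler]
# NUMATopologyFilter must be enabled explicitly; it is not a default.
enabled_filters = ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,NUMATopologyFilter
```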


+1 to this ^

Or, I don't know, maybe don't do anything and deal with the (maybe) 10% performance impact from the cross-NUMA main memory <-> CPU hit for post-processing of already parallel-processed GPU data.

In other words, like I've mentioned in numerous specs and in person, I really don't think this is a major problem; it's mostly something we're making a big deal about for no real reason.

-jay

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev