On Wed, Jul 11, 2018 at 06:03:08PM +0200, Pavel Hrdina wrote:
> To make it clear I'll summarize all the possible combinations and how it
> should work so we are on the same page.

originally: before commit [1]
now: after commit [1] (current master)
expect: what this patch series should fix


======= Non-NUMA guests =======


* Only one hugepage specified without any nodeset

    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB'/>
      </hugepages>
    </memoryBacking>

    This is correct, was always working and we should not change it.

    originally: works
    now: works
    expected: works


* Only one hugapage specified with nodeset

    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB' nodeset='0'/>
      </hugepages>
    </memoryBacking>

    This is questionable since there is no guest NUMA node specified,
    but on the other hand we can consider non-NUMA guest to have exactly
    one NUMA node.

    This was working but has been broken by commit [1]  which tried to
    fix a case where you are trying to specify non-existing NUMA node.

    Because of that assumption we can consider this as valid XML even
    though there is no NUMA node specified [2].  There are two possible
    solutions:

        - we can leave the XML intact

        - we can silently remove the 'nodeset' attribute to 'fix' the
          XML definition

    originally: works
    now: fails
    expect: works


    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB' nodeset='1'/>
      </hugepages>
    </memoryBacking>

    If the nodeset is != 0 it should newer work becuase there is no
    guest NUMA topology and even if we take into account the assumption
    that there is always one NUMA node it is still invalid.

    originally: works
    now: fails
    expect: fails


* One hugepage with specific nodeset and second default hugepage

    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB' nodeset='0'/>
        <page size='2048' unit='KiB'/>
      </hugepages>
    </memoryBacking>

    This was working but was 'fixed' by commit [1] because the code
    checks only the first hugepage and uses only the first hugepage.

    It should never worked because it doesn't make any sense, there
    is no guest NUMA node configured and even if we take into account
    the assumption that non-NUMA guest has one NUMA node there is need
    for the default page size.

    originally: works
    now: fails
    expect: fails


    There is yet another issue with the current state in libvirt, if
    you swap the order of pages:

    <memoryBacking>
      <hugepages>
        <page size='2048' unit='KiB'/>
        <page size='1048576' unit='KiB' nodeset='0'/>
      </hugepages>
    </memoryBacking>

    it will work even with current libvirt with the fix [1].  The reason
    is that code in qemuBuildMemPathStr() function takes into account
    only the first page size so it depends on the order of elements
    which is wrong.

    We should not allow any of these two configurations.  Setting
    nodeset to != 0 will should not make any difference.

    originally: works
    now: works
    expect: fails


====== NUMA guest ======


* Only one hugepage specified without any nodeset

    <memoryBacking>
      <hugepages>
        <page size='2048' unit='KiB'/>
      </hugepages>
    </memoryBacking>
    ...
    <cpu mode='host-passthrough' check='none'>
      <topology sockets='1' cores='4' threads='2'/>
      <numa>
        <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
        <cell id='1' cpus='4-7' memory='4194304' unit='KiB'/>
      </numa>
    </cpu>

    originally: works
    now: works
    expect: works


* Only one hugapage specified with nodeset

    <memoryBacking>
      <hugepages>
        <page size='2048' unit='KiB' nodeset='0'/>
      </hugepages>
    </memoryBacking>
    ...
    <cpu mode='host-passthrough' check='none'>
      <topology sockets='1' cores='4' threads='2'/>
      <numa>
        <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
        <cell id='1' cpus='4-7' memory='4194304' unit='KiB'/>
      </numa>
    </cpu>

    All possible combinations for the nodeset attribute are allowed
    if it always corresponds to existing guest NUMA node:

        <page size='2048' unit='KiB' nodeset='1'/>

        or

        <page size='2048' unit='KiB' nodeset='0,1'/>

    originally: works
    now: works
    expect: works


    <memoryBacking>
      <hugepages>
        <page size='2048' unit='KiB' nodeset='2'/>
      </hugepages>
    </memoryBacking>
    ...
    <cpu mode='host-passthrough' check='none'>
      <topology sockets='1' cores='4' threads='2'/>
      <numa>
        <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
        <cell id='1' cpus='4-7' memory='4194304' unit='KiB'/>
      </numa>
    </cpu>

    There is invalid guest NUMA node specified for the hugepage.

    originally: fails
    now: fails
    expect: fails

* One hugepage with specific nodeset and second default hugepage

    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB' nodeset='0'/>
        <page size='2048' unit='KiB'/>
      </hugepages>
    </memoryBacking>
    ...
    <cpu mode='host-passthrough' check='none'>
      <topology sockets='1' cores='4' threads='2'/>
      <numa>
        <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
        <cell id='1' cpus='4-7' memory='4194304' unit='KiB'/>
      </numa>
    </cpu>

    There are two guest NUMA nodes where we have default hugepage size
    configured and for NUMA node '0' we have a different size.

    originally: works
    now: works
    expect: works


    <memoryBacking>
      <hugepages>
        <page size='1048576' unit='KiB' nodeset='0,1'/>
        <page size='2048' unit='KiB'/>
      </hugepages>
    </memoryBacking>
    ...
    <cpu mode='host-passthrough' check='none'>
      <topology sockets='1' cores='4' threads='2'/>
      <numa>
        <cell id='0' cpus='0-3' memory='4194304' unit='KiB'/>
        <cell id='1' cpus='4-7' memory='4194304' unit='KiB'/>
      </numa>
    </cpu>

    originally: works
    now: works
    expect: fails ???

    In this situation all the guest NUMA nodes are covered by the
    hugepage size with specified nodeset attribute.  The default one
    is not used at all so is pointless here.
    
    The difference between this and non-NUMA guest is that if we change
    the order it will still work as expected, it doesn't depend on the
    order of elements.  However, we might consider is as invalid
    configuration because there is no need to have the default page size
    configured at all.


* Multiple combination of default and specific hugepage sizes

    There are some restriction if we use multiple page sizes:
        
        - There cannot be two different default hugepage sizes

        - Two different page elements cannot have the same guest NUMA
          node specified in nodeset attribute

        - hugepages are not allowed if memory backing allocation is set
          to 'ondemand' (not documented)

        - hugepages are not allowed if memory backing source is set to
          'anonymous' (not documented)


I hope that there is no other configuration that we need to care about.

Pavel

[1] <fa6bdf6afa878b8d7c5ed71664ee72be8967cdc5>
[2] <https://bugzilla.redhat.com/show_bug.cgi?id=1591235>

Attachment: signature.asc
Description: PGP signature

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Reply via email to