Re: [libvirt] [PATCH 2/4] conf: Move hugepage XML validation check out of qemu_command

2018-07-13 Thread Michal Privoznik
On 07/13/2018 02:02 PM, Pavel Hrdina wrote:
> On Wed, Jul 11, 2018 at 06:03:08PM +0200, Pavel Hrdina wrote:
>> To make it clear I'll summarize all the possible combinations and how it
>> should work so we are on the same page.
> 
> originally: before commit [1]
> now: after commit [1] (current master)
> expect: what this patch series should fix
> 
> 
> === Non-NUMA guests ===
> 
> 
> * Only one hugepage specified without any nodeset
> 
> 
>   
> 
>   
> 
> 
> This is correct, was always working and we should not change it.
> 
> originally: works
> now: works
> expect: works
> 
> 
> * Only one hugepage specified with nodeset
> 
> 
>   
> 
>   
> 
> 
> This is questionable since there is no guest NUMA node specified,
> but on the other hand we can consider non-NUMA guest to have exactly
> one NUMA node.
> 
> This was working but has been broken by commit [1], which tried to
> fix the case where a non-existent NUMA node is specified.
> 
> Because of that assumption we can consider this as valid XML even
> though there is no NUMA node specified [2].  There are two possible
> solutions:
> 
> - we can leave the XML intact
> 
> - we can silently remove the 'nodeset' attribute to 'fix' the
>   XML definition
> 
> originally: works
> now: fails
> expect: works
> 
> 
> 
>   
> 
>   
> 
> 
> If the nodeset is != 0 it should never work because there is no
> guest NUMA topology, and even if we take into account the assumption
> that there is always one NUMA node it is still invalid.
> 
> originally: works
> now: fails
> expect: fails
> 
> 
> * One hugepage with specific nodeset and second default hugepage
> 
> 
>   
> 
> 
>   
> 
> 
> This was working but was 'fixed' by commit [1] because the code
> checks only the first hugepage and uses only the first hugepage.
> 
> It should never have worked because it doesn't make any sense, there
> is no guest NUMA node configured and even if we take into account
> the assumption that a non-NUMA guest has one NUMA node there is no need
> for the default page size.
> 
> originally: works
> now: fails
> expect: fails
> 
> 
> There is yet another issue with the current state in libvirt, if
> you swap the order of pages:
> 
> 
>   
> 
> 
>   
> 
> 
> it will work even with current libvirt with the fix [1].  The reason
> is that code in qemuBuildMemPathStr() function takes into account
> only the first page size so it depends on the order of elements
> which is wrong.
> 
> We should not allow either of these two configurations.  Setting
> nodeset to != 0 should not make any difference.
> 
> originally: works
> now: works
> expect: fails
> 
> 
> === NUMA guests ===
> 
> 
> * Only one hugepage specified without any nodeset
> 
> 
>   
> 
>   
> 
> ...
> 
>   
>   
> 
> 
>   
> 
> 
> originally: works
> now: works
> expect: works
> 
> 
> * Only one hugepage specified with nodeset
> 
> 
>   
> 
>   
> 
> ...
> 
>   
>   
> 
> 
>   
> 
> 
> Any combination for the nodeset attribute is allowed as long as it
> always corresponds to an existing guest NUMA node:
> 
> 
> 
> or
> 
> 
> 
> originally: works
> now: works
> expect: works
> 
> 
> 
>   
> 
>   
> 
> ...
> 
>   
>   
> 
> 
>   
> 
> 
> An invalid guest NUMA node is specified for the hugepage.
> 
> originally: fails
> now: fails
> expect: fails
> 
> * One hugepage with specific nodeset and second default hugepage
> 
> 
>   
> 
> 
>   
> 
> ...
> 
>   
>   
> 
> 
>   
> 
> 
> There are two guest NUMA nodes where we have default hugepage size
> configured and for NUMA node '0' we have a different size.
> 
> originally: works
> now: works
> expect: works
> 
> 
> 
>   
> 
> 
>   
> 
> ...
> 
>   
>   
> 
> 
>   
> 
> 
> originally: works
> now: works
> expect: fails ???
> 
> In this situation all the guest NUMA nodes are covered by the
> hugepage size with specified nodeset attribute.  The default one
> is not used at all so is pointless here.
> 
> The difference between this and non-NUMA guest is that if we change
> the order it will still work as expected, it doesn't depend on the
> order of elements.  However, we might consider it an invalid
> configuration because there is no need to have the default page size
> configured at all.
> 

Re: [libvirt] [PATCH 2/4] conf: Move hugepage XML validation check out of qemu_command

2018-07-13 Thread Pavel Hrdina
On Wed, Jul 11, 2018 at 06:03:08PM +0200, Pavel Hrdina wrote:
> To make it clear I'll summarize all the possible combinations and how it
> should work so we are on the same page.

originally: before commit [1]
now: after commit [1] (current master)
expect: what this patch series should fix


=== Non-NUMA guests ===


* Only one hugepage specified without any nodeset


  

  


This is correct, was always working and we should not change it.

originally: works
now: works
expect: works


* Only one hugepage specified with nodeset


  

  


This is questionable since there is no guest NUMA node specified,
but on the other hand we can consider non-NUMA guest to have exactly
one NUMA node.

This was working but has been broken by commit [1], which tried to
fix the case where a non-existent NUMA node is specified.

Because of that assumption we can consider this as valid XML even
though there is no NUMA node specified [2].  There are two possible
solutions:

- we can leave the XML intact

- we can silently remove the 'nodeset' attribute to 'fix' the
  XML definition

originally: works
now: fails
expect: works



  

  


If the nodeset is != 0 it should never work because there is no
guest NUMA topology, and even if we take into account the assumption
that there is always one NUMA node it is still invalid.

originally: works
now: fails
expect: fails
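
A minimal sketch of the rule argued for above, assuming a nodeset can be
modelled as a 64-bit mask and that a guest without NUMA topology is treated
as having one implicit node 0.  The helper name and the mask representation
are hypothetical, not libvirt API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bit N of 'nodeset' stands for guest NUMA node N.
 * A mask of 0 models a page without a nodeset attribute (the default
 * page), which trivially passes this check. */
static bool
nodeset_is_valid(uint64_t nodeset, size_t numa_node_count)
{
    /* Assumption from this thread: a non-NUMA guest still has node 0. */
    size_t effective = numa_node_count > 0 ? numa_node_count : 1;

    assert(effective <= 64);

    /* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
    if (effective == 64)
        return true;

    /* Any bit at position >= effective names a node that does not exist. */
    return (nodeset >> effective) == 0;
}
```

Under these assumptions nodeset='0' passes on a non-NUMA guest while
nodeset='1' is rejected, which matches the "expect" lines above.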


* One hugepage with specific nodeset and second default hugepage


  


  


This was working but was 'fixed' by commit [1] because the code
checks only the first hugepage and uses only the first hugepage.

It should never have worked because it doesn't make any sense, there
is no guest NUMA node configured and even if we take into account
the assumption that a non-NUMA guest has one NUMA node there is no need
for the default page size.

originally: works
now: fails
expect: fails


There is yet another issue with the current state in libvirt, if
you swap the order of pages:


  


  


it will work even with current libvirt with the fix [1].  The reason
is that code in qemuBuildMemPathStr() function takes into account
only the first page size so it depends on the order of elements
which is wrong.

We should not allow either of these two configurations.  Setting
nodeset to != 0 should not make any difference.

originally: works
now: works
expect: fails
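
One way to avoid depending on element order is to scan every page element
for the one without a nodeset instead of looking only at the first entry.
The sketch below is illustrative only: the struct, the function name and the
page sizes in the note after it are assumptions, and it is not what
qemuBuildMemPathStr() currently does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one page element; not libvirt's own struct. */
struct hugepage {
    unsigned long long size_kib; /* page size in KiB */
    uint64_t nodeset;            /* bit N = guest node N; 0 = no nodeset */
};

/* Return the index of the default page (the entry without a nodeset),
 * -1 if there is none, or -2 if more than one entry has no nodeset. */
static int
find_default_page(const struct hugepage *pages, size_t npages)
{
    int found = -1;

    assert(pages != NULL || npages == 0);

    for (size_t i = 0; i < npages; i++) {
        if (pages[i].nodeset != 0)
            continue;           /* page bound to specific nodes */
        if (found >= 0)
            return -2;          /* second default page: ambiguous */
        found = (int)i;
    }
    return found;
}
```

With the entries {2048 KiB, node 0} and {1048576 KiB, no nodeset}, the
function picks the 1048576 KiB entry whichever of the two comes first, so a
later validity check sees the same input for both orderings.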


=== NUMA guests ===


* Only one hugepage specified without any nodeset


  

  

...

  
  


  


originally: works
now: works
expect: works


* Only one hugepage specified with nodeset


  

  

...

  
  


  


Any combination for the nodeset attribute is allowed as long as it
always corresponds to an existing guest NUMA node:



or



originally: works
now: works
expect: works



  

  

...

  
  


  


An invalid guest NUMA node is specified for the hugepage.

originally: fails
now: fails
expect: fails

* One hugepage with specific nodeset and second default hugepage


  


  

...

  
  


  


There are two guest NUMA nodes where we have default hugepage size
configured and for NUMA node '0' we have a different size.

originally: works
now: works
expect: works



  


  

...

  
  


  


originally: works
now: works
expect: fails ???

In this situation all the guest NUMA nodes are covered by the
hugepage size with specified nodeset attribute.  The default one
is not used at all so is pointless here.

The difference between this and non-NUMA guest is that if we change
the order it will still work as expected, it doesn't depend on the
order of elements.  However, we might consider it an invalid
configuration because there is no need to have the default page size
configured at all.
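
If we did decide to reject this case, the check could look roughly like the
sketch below.  It assumes each page's nodeset is available as a 64-bit mask
(0 meaning "no nodeset", i.e. the default page) and that the guest has
between 1 and 64 NUMA nodes; the function name is hypothetical:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Return true when a default page exists but every guest NUMA node is
 * already covered by pages that carry an explicit nodeset. */
static bool
default_page_is_redundant(const uint64_t *nodesets, size_t npages,
                          size_t numa_node_count)
{
    uint64_t covered = 0;
    uint64_t all;
    bool has_default = false;

    assert(numa_node_count >= 1 && numa_node_count <= 64);

    /* Mask with one bit set for each existing guest node. */
    all = numa_node_count == 64 ? UINT64_MAX
                                : (UINT64_C(1) << numa_node_count) - 1;

    for (size_t i = 0; i < npages; i++) {
        if (nodesets[i] == 0)
            has_default = true;     /* page without nodeset */
        else
            covered |= nodesets[i]; /* union of explicitly bound nodes */
    }

    return has_default && (covered & all) == all;
}
```

Because the function only builds a union over all elements, the result does
not change when the page elements are reordered.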


* Multiple combinations of default and specific hugepage sizes

There are some restrictions if we use multiple page sizes:

- There cannot be two different default hugepage sizes

- Two different page elements cannot have the same guest NUMA
  node specified in nodeset attribute

- hugepages are not allowed if memory backing allocation is set
  to 'ondemand' (not documented)

- hugepages are not allowed if memory backing source is set to
  'anonymous' (not do

Re: [libvirt] [PATCH 2/4] conf: Move hugepage XML validation check out of qemu_command

2018-07-11 Thread Pavel Hrdina
On Wed, Jul 11, 2018 at 05:47:58PM +0200, Michal Privoznik wrote:
> On 07/11/2018 05:25 PM, Pavel Hrdina wrote:
> > On Wed, Jul 11, 2018 at 05:05:07PM +0200, Michal Privoznik wrote:
> >> On 07/11/2018 10:22 AM, Pavel Hrdina wrote:
> >>> We can safely validate the hugepage nodeset attribute at a define time.
> >>> This validation is not done for already existing domains when the daemon
> >>> is restarted.
> >>>
> >>> All the changes to the tests are necessary because we move the error
> >>> from domain start into XML parse.
> >>>
> >>> Signed-off-by: Pavel Hrdina 
> >>> ---
> >>>  src/conf/domain_conf.c| 32 +
> >>>  src/qemu/qemu_command.c   | 34 ---
> >>>  .../seclabel-dynamic-none-relabel.xml |  2 +-
> >>>  tests/qemuxml2argvtest.c  | 16 +
> >>>  .../qemuxml2xmloutdata/hugepages-pages10.xml  | 30 
> >>>  tests/qemuxml2xmloutdata/hugepages-pages4.xml |  1 -
> >>>  tests/qemuxml2xmloutdata/hugepages-pages9.xml | 31 -
> >>>  .../seclabel-dynamic-none-relabel.xml |  2 +-
> >>>  tests/qemuxml2xmltest.c   |  3 --
> >>>  9 files changed, 43 insertions(+), 108 deletions(-)
> >>>  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages10.xml
> >>>  delete mode 12 tests/qemuxml2xmloutdata/hugepages-pages4.xml
> >>>  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages9.xml
> >>>
> >>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> >>> index 7396616eda..20d67e7854 100644
> >>> --- a/src/conf/domain_conf.c
> >>> +++ b/src/conf/domain_conf.c
> >>> @@ -6104,6 +6104,35 @@ virDomainDefLifecycleActionValidate(const 
> >>> virDomainDef *def)
> >>>  }
> >>>  
> >>>  
> >>> +static int
> >>> +virDomainDefMemtuneValidate(const virDomainDef *def)
> >>> +{
> >>> +const virDomainMemtune *mem = &(def->mem);
> >>> +size_t i;
> >>> +ssize_t pos = virDomainNumaGetNodeCount(def->numa) - 1;
> >>> +
> >>> +for (i = 0; i < mem->nhugepages; i++) {
> >>> +ssize_t nextBit;
> >>> +
> >>> +if (!mem->hugepages[i].nodemask) {
> >>> +/* This is the master hugepage to use. Skip it as it has no
> >>> + * nodemask anyway. */
> >>> +continue;
> >>> +}
> >>> +
> >>> +nextBit = virBitmapNextSetBit(mem->hugepages[i].nodemask, pos);
> >>> +if (nextBit >= 0) {
> >>
> >> I think it's fair to enable hugepages for node #0 which is always there
> >> (even if not configured in domain XML). Just try to run 'numactl -H'
> >> from a domain that has no <numa> in its XML.
> > 
> > Well yes, Linux always assumes that there is at least one NUMA node
> > but other systems might not consider it the same.
> 
> I don't think the assumption is limited to Linux only. Even Windows
> behaves the same. For instance the following example shows that on
> a non-NUMA machine there is NUMA node #0.
> 
> https://docs.microsoft.com/en-us/windows/desktop/Memory/allocating-memory-from-a-numa-node

Well if we can change the assumption into a fact I'm definitely for
that change to consider all guests to have at least one NUMA node.
I was trying to look up some documentation/specification but failed
to do so.

> 
> > 
> >>
> >>> +virReportError(VIR_ERR_XML_DETAIL,
> >>> +   _("hugepages: node %zd not found"),
> >>> +   nextBit);
> >>> +return -1;
> >>> +}
> >>> +}
> >>
> >> Also, I see that you're removing hugepages-pages9 test from xml2xml
> >> test. But that is needed only because you disallowed nodeset='0' for
> >> nonnuma domain. The real problem there is that the default page size has
> > 
> > That is already disallowed but only once you try to start such domain,
> > I'm just moving this check from start time to parse time.
> 
> Yes because we have a bug in the code. So when you introduced the test
> it was doomed to fail.

This test case should fail every time because it is an invalid
configuration.  You have a non-NUMA guest with two different pages
and also a specific node configured for one page.

> > 
> > If you look into qemuxml2argvtest.c you will see that hugepages-pages9
> > is expected to fail.
> > 
> >> no numa node to apply to, not nodeset='0'. I guess we need to check for
> >> that too (or do we want to?)
> > 
> > That is yet another issue that can be addressed but it should not
> > block this patch.
> 
> Well, maybe. I'm not saying your patches are wrong. Apart from allowing
> nodeset='0' (which I think we should do, but I don't have that much of a
> strong opinion there).

To make it clear I'll summarize all the possible combinations and how it
should work so we are on the same page.

Pavel


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 2/4] conf: Move hugepage XML validation check out of qemu_command

2018-07-11 Thread Michal Privoznik
On 07/11/2018 05:25 PM, Pavel Hrdina wrote:
> On Wed, Jul 11, 2018 at 05:05:07PM +0200, Michal Privoznik wrote:
>> On 07/11/2018 10:22 AM, Pavel Hrdina wrote:
>>> We can safely validate the hugepage nodeset attribute at a define time.
>>> This validation is not done for already existing domains when the daemon
>>> is restarted.
>>>
>>> All the changes to the tests are necessary because we move the error
>>> from domain start into XML parse.
>>>
>>> Signed-off-by: Pavel Hrdina 
>>> ---
>>>  src/conf/domain_conf.c| 32 +
>>>  src/qemu/qemu_command.c   | 34 ---
>>>  .../seclabel-dynamic-none-relabel.xml |  2 +-
>>>  tests/qemuxml2argvtest.c  | 16 +
>>>  .../qemuxml2xmloutdata/hugepages-pages10.xml  | 30 
>>>  tests/qemuxml2xmloutdata/hugepages-pages4.xml |  1 -
>>>  tests/qemuxml2xmloutdata/hugepages-pages9.xml | 31 -
>>>  .../seclabel-dynamic-none-relabel.xml |  2 +-
>>>  tests/qemuxml2xmltest.c   |  3 --
>>>  9 files changed, 43 insertions(+), 108 deletions(-)
>>>  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages10.xml
>>>  delete mode 12 tests/qemuxml2xmloutdata/hugepages-pages4.xml
>>>  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages9.xml
>>>
>>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
>>> index 7396616eda..20d67e7854 100644
>>> --- a/src/conf/domain_conf.c
>>> +++ b/src/conf/domain_conf.c
>>> @@ -6104,6 +6104,35 @@ virDomainDefLifecycleActionValidate(const 
>>> virDomainDef *def)
>>>  }
>>>  
>>>  
>>> +static int
>>> +virDomainDefMemtuneValidate(const virDomainDef *def)
>>> +{
>>> +const virDomainMemtune *mem = &(def->mem);
>>> +size_t i;
>>> +ssize_t pos = virDomainNumaGetNodeCount(def->numa) - 1;
>>> +
>>> +for (i = 0; i < mem->nhugepages; i++) {
>>> +ssize_t nextBit;
>>> +
>>> +if (!mem->hugepages[i].nodemask) {
>>> +/* This is the master hugepage to use. Skip it as it has no
>>> + * nodemask anyway. */
>>> +continue;
>>> +}
>>> +
>>> +nextBit = virBitmapNextSetBit(mem->hugepages[i].nodemask, pos);
>>> +if (nextBit >= 0) {
>>
>> I think it's fair to enable hugepages for node #0 which is always there
>> (even if not configured in domain XML). Just try to run 'numactl -H'
>> from a domain that has no <numa> in its XML.
> 
> Well yes, Linux always assumes that there is at least one NUMA node
> but other systems might not consider it the same.

I don't think the assumption is limited to Linux only. Even Windows
behaves the same. For instance the following example shows that on
a non-NUMA machine there is NUMA node #0.

https://docs.microsoft.com/en-us/windows/desktop/Memory/allocating-memory-from-a-numa-node

> 
>>
>>> +virReportError(VIR_ERR_XML_DETAIL,
>>> +   _("hugepages: node %zd not found"),
>>> +   nextBit);
>>> +return -1;
>>> +}
>>> +}
>>
>> Also, I see that you're removing hugepages-pages9 test from xml2xml
>> test. But that is needed only because you disallowed nodeset='0' for
>> nonnuma domain. The real problem there is that the default page size has
> 
> That is already disallowed but only once you try to start such domain,
> I'm just moving this check from start time to parse time.

Yes because we have a bug in the code. So when you introduced the test
it was doomed to fail.

> 
> If you look into qemuxml2argvtest.c you will see that hugepages-pages9
> is expected to fail.
> 
>> no numa node to apply to, not nodeset='0'. I guess we need to check for
>> that too (or do we want to?)
> 
> That is yet another issue that can be addressed but it should not
> block this patch.

Well, maybe. I'm not saying your patches are wrong. Apart from allowing
nodeset='0' (which I think we should do, but I don't have that much of a
strong opinion there).

Michal



Re: [libvirt] [PATCH 2/4] conf: Move hugepage XML validation check out of qemu_command

2018-07-11 Thread Pavel Hrdina
On Wed, Jul 11, 2018 at 05:05:07PM +0200, Michal Privoznik wrote:
> On 07/11/2018 10:22 AM, Pavel Hrdina wrote:
> > We can safely validate the hugepage nodeset attribute at a define time.
> > This validation is not done for already existing domains when the daemon
> > is restarted.
> > 
> > All the changes to the tests are necessary because we move the error
> > from domain start into XML parse.
> > 
> > Signed-off-by: Pavel Hrdina 
> > ---
> >  src/conf/domain_conf.c| 32 +
> >  src/qemu/qemu_command.c   | 34 ---
> >  .../seclabel-dynamic-none-relabel.xml |  2 +-
> >  tests/qemuxml2argvtest.c  | 16 +
> >  .../qemuxml2xmloutdata/hugepages-pages10.xml  | 30 
> >  tests/qemuxml2xmloutdata/hugepages-pages4.xml |  1 -
> >  tests/qemuxml2xmloutdata/hugepages-pages9.xml | 31 -
> >  .../seclabel-dynamic-none-relabel.xml |  2 +-
> >  tests/qemuxml2xmltest.c   |  3 --
> >  9 files changed, 43 insertions(+), 108 deletions(-)
> >  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages10.xml
> >  delete mode 12 tests/qemuxml2xmloutdata/hugepages-pages4.xml
> >  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages9.xml
> > 
> > diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> > index 7396616eda..20d67e7854 100644
> > --- a/src/conf/domain_conf.c
> > +++ b/src/conf/domain_conf.c
> > @@ -6104,6 +6104,35 @@ virDomainDefLifecycleActionValidate(const 
> > virDomainDef *def)
> >  }
> >  
> >  
> > +static int
> > +virDomainDefMemtuneValidate(const virDomainDef *def)
> > +{
> > +const virDomainMemtune *mem = &(def->mem);
> > +size_t i;
> > +ssize_t pos = virDomainNumaGetNodeCount(def->numa) - 1;
> > +
> > +for (i = 0; i < mem->nhugepages; i++) {
> > +ssize_t nextBit;
> > +
> > +if (!mem->hugepages[i].nodemask) {
> > +/* This is the master hugepage to use. Skip it as it has no
> > + * nodemask anyway. */
> > +continue;
> > +}
> > +
> > +nextBit = virBitmapNextSetBit(mem->hugepages[i].nodemask, pos);
> > +if (nextBit >= 0) {
> 
> I think it's fair to enable hugepages for node #0 which is always there
> (even if not configured in domain XML). Just try to run 'numactl -H'
> from a domain that has no <numa> in its XML.

Well yes, Linux always assumes that there is at least one NUMA node
but other systems might not consider it the same.

> 
> > +virReportError(VIR_ERR_XML_DETAIL,
> > +   _("hugepages: node %zd not found"),
> > +   nextBit);
> > +return -1;
> > +}
> > +}
> 
> Also, I see that you're removing hugepages-pages9 test from xml2xml
> test. But that is needed only because you disallowed nodeset='0' for
> nonnuma domain. The real problem there is that the default page size has

That is already disallowed but only once you try to start such domain,
I'm just moving this check from start time to parse time.

If you look into qemuxml2argvtest.c you will see that hugepages-pages9
is expected to fail.

> no numa node to apply to, not nodeset='0'. I guess we need to check for
> that too (or do we want to?)

That is yet another issue that can be addressed but it should not
block this patch.

Pavel



Re: [libvirt] [PATCH 2/4] conf: Move hugepage XML validation check out of qemu_command

2018-07-11 Thread Michal Privoznik
On 07/11/2018 10:22 AM, Pavel Hrdina wrote:
> We can safely validate the hugepage nodeset attribute at a define time.
> This validation is not done for already existing domains when the daemon
> is restarted.
> 
> All the changes to the tests are necessary because we move the error
> from domain start into XML parse.
> 
> Signed-off-by: Pavel Hrdina 
> ---
>  src/conf/domain_conf.c| 32 +
>  src/qemu/qemu_command.c   | 34 ---
>  .../seclabel-dynamic-none-relabel.xml |  2 +-
>  tests/qemuxml2argvtest.c  | 16 +
>  .../qemuxml2xmloutdata/hugepages-pages10.xml  | 30 
>  tests/qemuxml2xmloutdata/hugepages-pages4.xml |  1 -
>  tests/qemuxml2xmloutdata/hugepages-pages9.xml | 31 -
>  .../seclabel-dynamic-none-relabel.xml |  2 +-
>  tests/qemuxml2xmltest.c   |  3 --
>  9 files changed, 43 insertions(+), 108 deletions(-)
>  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages10.xml
>  delete mode 12 tests/qemuxml2xmloutdata/hugepages-pages4.xml
>  delete mode 100644 tests/qemuxml2xmloutdata/hugepages-pages9.xml
> 
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 7396616eda..20d67e7854 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -6104,6 +6104,35 @@ virDomainDefLifecycleActionValidate(const virDomainDef 
> *def)
>  }
>  
>  
> +static int
> +virDomainDefMemtuneValidate(const virDomainDef *def)
> +{
> +const virDomainMemtune *mem = &(def->mem);
> +size_t i;
> +ssize_t pos = virDomainNumaGetNodeCount(def->numa) - 1;
> +
> +for (i = 0; i < mem->nhugepages; i++) {
> +ssize_t nextBit;
> +
> +if (!mem->hugepages[i].nodemask) {
> +/* This is the master hugepage to use. Skip it as it has no
> + * nodemask anyway. */
> +continue;
> +}
> +
> +nextBit = virBitmapNextSetBit(mem->hugepages[i].nodemask, pos);
> +if (nextBit >= 0) {

I think it's fair to enable hugepages for node #0 which is always there
(even if not configured in domain XML). Just try to run 'numactl -H'
from a domain that has no <numa> in its XML.
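
To illustrate why the hunk above rejects nodeset='0' on a guest without
NUMA topology, here is a small stand-in for the bitmap search.  It assumes
the search returns the first set bit strictly after the given position,
which is how the quoted code appears to use it.  All names below are
hypothetical and the nodeset is simplified to a 64-bit mask:

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a "next set bit" search: return the position of the first
 * set bit strictly after 'pos', or -1 if there is none.  'pos' may be -1
 * to start the search from bit 0. */
static int
next_set_bit(uint64_t mask, int pos)
{
    assert(pos >= -1);

    for (int bit = pos + 1; bit < 64; bit++) {
        if (mask & (UINT64_C(1) << bit))
            return bit;
    }
    return -1;
}

/* Mirror of the check in the hunk: search after the last existing node.
 * A hit is a node that does not exist; -1 means the nodeset is fine. */
static int
first_missing_node(uint64_t nodeset, int numa_node_count)
{
    return next_set_bit(nodeset, numa_node_count - 1);
}
```

With zero configured nodes the search starts at position -1, so bit 0 is
reported as a missing node.  With one or more nodes configured, node 0 is
accepted.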

> +virReportError(VIR_ERR_XML_DETAIL,
> +   _("hugepages: node %zd not found"),
> +   nextBit);
> +return -1;
> +}
> +}

Also, I see that you're removing hugepages-pages9 test from xml2xml
test. But that is needed only because you disallowed nodeset='0' for
nonnuma domain. The real problem there is that the default page size has
no numa node to apply to, not nodeset='0'. I guess we need to check for
that too (or do we want to?)

> +
> +return 0;
> +}
> +

Michal
