On Wed, 14 Aug 2019 19:31:27 -0700 Dan Williams <dan.j.willi...@intel.com> wrote:
> On Wed, Aug 14, 2019 at 6:57 PM Tao Xu <tao3...@intel.com> wrote:
> >
> > On 8/15/2019 5:29 AM, Dan Williams wrote:
> > > On Tue, Aug 13, 2019 at 10:14 PM Tao Xu <tao3...@intel.com> wrote:
> > >>
> > >> On 8/14/2019 10:39 AM, Dan Williams wrote:
> > >>> On Tue, Aug 13, 2019 at 8:00 AM Igor Mammedov <imamm...@redhat.com>
> > >>> wrote:
> > >>>>
> > >>>> On Fri, 9 Aug 2019 14:57:25 +0800
> > >>>> Tao <tao3...@intel.com> wrote:
> > >>>>
> > >>>>> From: Tao Xu <tao3...@intel.com>
> > >>>>>
> > [...]
> > >>>>> +    for (i = 0; i < machine->numa_state->num_nodes; i++) {
> > >>>>> +        if (numa_info[i].initiator_valid &&
> > >>>>> +            !numa_info[numa_info[i].initiator].has_cpu) {
> > >>>>                          ^^^^^^^^^^^^^^^^^^^^^^ possible out of bounds read, see below
> > >>>>
> > >>>>> +            error_report("The initiator-id %"PRIu16 " of NUMA node %d"
> > >>>>> +                         " does not exist.", numa_info[i].initiator, i);
> > >>>>> +            error_printf("\n");
> > >>>>> +
> > >>>>> +            exit(1);
> > >>>>> +        }
> > >>>> it takes care only about nodes that have cpus or memory-only ones that
> > >>>> have initiator explicitly provided on CLI. And leaves possibility to have
> > >>>> memory-only nodes without initiator mixed with nodes that have
> > >>>> initiator.
> > >>>> Is it valid to have mixed configuration?
> > >>>> Should we forbid it?
> > >>>
> > >>> The spec talks about the "Proximity Domain for the Attached Initiator"
> > >>> field only being valid if the memory controller for the memory can be
> > >>> identified by an initiator id in the SRAT. So I expect the only way to
> > >>> define a memory proximity domain without this local initiator is to
> > >>> allow specifying a node-id that does not have an entry in the SRAT.
> > >>>
> > >> Hi Dan,
> > >>
> > >> So there may be a situation for the Attached Initiator field is not
> > >> valid? If true, I would allow user to input Initiator invalid.
> > >
> > > Yes it's something the OS needs to consider because the platform may
> > > not be able to meet the constraint that a single initiator is
> > > associated with the memory controller for a given memory target. In
> > > retrospect it would have been nice if the spec reserved 0xffffffff for
> > > this purpose, but it seems "not in SRAT" is the only way to identify
> > > memory that is not attached to any single initiator.
> > >
> > But as far as I know, QEMU can't emulate a NUMA node "not in SRAT". I am
> > wondering if it is effective only set Initiator invalid?
>
> You don't need to emulate a NUMA node not in SRAT. Just put a number
> in this HMAT entry larger than the largest proximity domain number
> found in the SRAT.
> >
>
>

So the behavior is really not defined in the spec (well, I wasn't able to
convince myself that the above behavior is in the spec).
In that case I'd go with a strict check for now and not allow an invalid
initiator. We can easily relax the check later and allow it to point to
nonsense, but there is no going the other way around.
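
To illustrate what a strict check could look like, here is a minimal
standalone sketch. It is not the patch code: NodeSketch, InitCheck and
check_initiators() are hypothetical names, and the struct only models the
three fields visible in the quoted hunk. It assumes that "strict" means
every node must carry an initiator, and that nodes with CPUs already have
their initiator set to their own id before the check runs. The range test
comes before the array lookup, which also avoids the out-of-bounds read
pointed out above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-node fields used in the quoted patch. */
typedef struct {
    bool has_cpu;          /* node contains at least one CPU */
    bool initiator_valid;  /* an initiator was supplied for this node */
    uint16_t initiator;    /* node id of the initiator */
} NodeSketch;

typedef enum {
    INIT_OK = 0,
    INIT_MISSING,       /* node has no initiator at all */
    INIT_OUT_OF_RANGE,  /* initiator id is not a configured node */
    INIT_NO_CPU         /* initiator node exists but has no CPU */
} InitCheck;

/*
 * Strict validation sketch: returns INIT_OK if every node has an
 * initiator that is a configured node with CPUs. On failure, stores
 * the index of the first offending node in *bad_node (if non-NULL).
 */
InitCheck check_initiators(const NodeSketch *nodes, size_t num_nodes,
                           size_t *bad_node)
{
    for (size_t i = 0; i < num_nodes; i++) {
        InitCheck res = INIT_OK;

        if (!nodes[i].initiator_valid) {
            /* Assumption: strict mode forbids mixed configurations. */
            res = INIT_MISSING;
        } else if (nodes[i].initiator >= num_nodes) {
            /* Range check first, so the lookup below stays in bounds. */
            res = INIT_OUT_OF_RANGE;
        } else if (!nodes[nodes[i].initiator].has_cpu) {
            res = INIT_NO_CPU;
        }

        if (res != INIT_OK) {
            if (bad_node) {
                *bad_node = i;
            }
            return res;
        }
    }
    return INIT_OK;
}
```

Relaxing this later would only mean dropping the INIT_MISSING or
INIT_OUT_OF_RANGE branch, whereas tightening a permissive check after
users depend on it would break existing command lines.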