Re: [PATCH v4 2/4] hw/riscv: spike: Allow creating multiple NUMA sockets

2020-05-29 Thread Anup Patel
On Thu, May 28, 2020 at 8:35 PM Igor Mammedov  wrote:
>
> On Thu, 28 May 2020 18:59:57 +0530
> Anup Patel  wrote:
>
> > We extend the RISC-V spike machine to allow creating a multi-socket machine.
> > Each RISC-V spike machine socket is a NUMA node with a set of HARTs,
> > a memory instance, and a CLINT instance. Other devices are shared
> > between all RISC-V spike machine sockets. We also update the generated
> > device tree accordingly.
> >
> > By default, NUMA multi-socket support is disabled for the RISC-V spike
> > machine. To enable it, users can use the "-numa" command-line options
> > of QEMU.
> >
> > Example1: For two NUMA nodes with 4 CPUs each, append the following
> > to the command-line options: "-smp 8 -numa node -numa node"
> >
> > Example2: For two NUMA nodes with 3 and 5 CPUs, append the following
> > to the command-line options: "-smp 8 -numa node,cpus=0-2 -numa node,cpus=3-7"
> it uses the legacy 'node,cpus=' syntax that I'm working on deprecating and removing.
> Please use the newer '-numa cpu=' instead; that might require enabling

Thanks for pointing out.
I will update this series so that the "-numa cpu" option works fine.
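For reference, the newer syntax declares the nodes first and then pins each CPU to a node explicitly. A sketch of what Example2 above might look like with it (the exact topology properties accepted, such as core-id, depend on what the machine's possible_cpu_arch_ids implementation exposes, so treat the property names here as assumptions):

```
qemu-system-riscv64 -M spike -smp 8 \
    -numa node,nodeid=0 -numa node,nodeid=1 \
    -numa cpu,node-id=0,core-id=0 \
    -numa cpu,node-id=0,core-id=1 \
    -numa cpu,node-id=0,core-id=2 \
    -numa cpu,node-id=1,core-id=3 \
    -numa cpu,node-id=1,core-id=4 \
    -numa cpu,node-id=1,core-id=5 \
    -numa cpu,node-id=1,core-id=6 \
    -numa cpu,node-id=1,core-id=7
```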

>  MachineClass::has_hotpluggable_cpus
> so users could get a list of possible CPUs (even if they are all coldplugged),
> and it might also lead to making -device your_cpu functional.

We are taking our first steps with NUMA in the RISC-V world, so we will
do the runtime CPU hotplug as a separate series. For this series, we will
stick with coldplugged NUMA CPUs.

>
> PS:
> the patch is too big, I suggest splitting it, at least the changes that introduce
> generic infrastructure:
>  > +mc->possible_cpu_arch_ids = spike_possible_cpu_arch_ids;
>  > +mc->cpu_index_to_instance_props = spike_cpu_index_to_props;
>  > +mc->get_default_cpu_node_id = spike_get_default_cpu_node_id;
>  > +mc->numa_mem_supported = true;

Good point, there is certainly quite a bit of code between the virt and
spike machines (related to NUMA/multi-socket) that can be shared.
I will factor out the common code as helper functions.

The major part of the patch rewrites the FDT generation, because the
existing code is not suitable for multiple sockets (or multiple NUMA nodes).

Regards,
Anup

>
> > Signed-off-by: Anup Patel 
> > ---
> >  hw/riscv/spike.c | 381 +--
> >  include/hw/riscv/spike.h |  15 +-
> >  2 files changed, 296 insertions(+), 100 deletions(-)
> >
> > diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> > index d5e0103d89..b1923442b6 100644
> > --- a/hw/riscv/spike.c
> > +++ b/hw/riscv/spike.c
> > @@ -64,9 +64,14 @@ static void create_fdt(SpikeState *s, const struct MemmapEntry *memmap,
> >  uint64_t mem_size, const char *cmdline)
> >  {
> >  void *fdt;
> > -int cpu;
> > -uint32_t *cells;
> > -char *nodename;
> > +unsigned long clint_addr;
> > +int i, j, idx, cpu, socket;
> > +MachineState *mc = MACHINE(s);
> > +uint32_t dist_matrix_size;
> > +uint32_t *clint_cells, *dist_matrix;
> > +uint32_t cpu_phandle, intc_phandle, phandle = 1;
> > +char *name, *mem_name, *clint_name, *clust_name;
> > +char *core_name, *cpu_name, *intc_name;
> >
> >  fdt = s->fdt = create_device_tree(&s->fdt_size);
> >  if (!fdt) {
> > @@ -88,68 +93,118 @@ static void create_fdt(SpikeState *s, const struct MemmapEntry *memmap,
> >  qemu_fdt_setprop_cell(fdt, "/soc", "#size-cells", 0x2);
> >  qemu_fdt_setprop_cell(fdt, "/soc", "#address-cells", 0x2);
> >
> > -nodename = g_strdup_printf("/memory@%lx",
> > -(long)memmap[SPIKE_DRAM].base);
> > -qemu_fdt_add_subnode(fdt, nodename);
> > -qemu_fdt_setprop_cells(fdt, nodename, "reg",
> > -memmap[SPIKE_DRAM].base >> 32, memmap[SPIKE_DRAM].base,
> > -mem_size >> 32, mem_size);
> > -qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> > -g_free(nodename);
> > -
> >  qemu_fdt_add_subnode(fdt, "/cpus");
> >  qemu_fdt_setprop_cell(fdt, "/cpus", "timebase-frequency",
> >  SIFIVE_CLINT_TIMEBASE_FREQ);
> >  qemu_fdt_setprop_cell(fdt, "/cpus", "#size-cells", 0x0);
> >  qemu_fdt_setprop_cell(fdt, "/cpus", "#address-cells", 0x1);
> > +qemu_fdt_add_subnode(fdt, "/cpus/cpu-map");
> > +
> > +for (socket = (s->num_socs - 1); socket >= 0; socket--) {
> > +clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
> > +qemu_fdt_add_subnode(fdt, clust_name);
> > +
> > +clint_cells =  g_new0(uint32_t, s->soc[socket].num_harts * 4);
> >
> > -for (cpu = s->soc.num_harts - 1; cpu >= 0; cpu--) {
> > -nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
> > -char *intc = g_strdup_printf("/cpus/cpu@%d/interrupt-controller", cpu);
> > -char *isa = riscv_isa_string(&s->soc.harts[cpu]);
> > -qemu_fdt_add_subnode(fdt, nodename);
> > +for (cpu = s->soc[socket].num_harts - 1; cpu >= 0; cpu--) {
> > +cpu_phandle = phandle++;
> > +
> > +cpu_name = 

Re: [PATCH v4 2/4] hw/riscv: spike: Allow creating multiple NUMA sockets

2020-05-28 Thread Igor Mammedov
On Thu, 28 May 2020 18:59:57 +0530
Anup Patel  wrote:

> We extend the RISC-V spike machine to allow creating a multi-socket machine.
> Each RISC-V spike machine socket is a NUMA node with a set of HARTs,
> a memory instance, and a CLINT instance. Other devices are shared
> between all RISC-V spike machine sockets. We also update the generated
> device tree accordingly.
> 
> By default, NUMA multi-socket support is disabled for the RISC-V spike
> machine. To enable it, users can use the "-numa" command-line options
> of QEMU.
> 
> Example1: For two NUMA nodes with 4 CPUs each, append the following
> to the command-line options: "-smp 8 -numa node -numa node"
> 
> Example2: For two NUMA nodes with 3 and 5 CPUs, append the following
> to the command-line options: "-smp 8 -numa node,cpus=0-2 -numa node,cpus=3-7"
it uses the legacy 'node,cpus=' syntax that I'm working on deprecating and removing.
Please use the newer '-numa cpu=' instead; that might require enabling
 MachineClass::has_hotpluggable_cpus
so users could get a list of possible CPUs (even if they are all coldplugged),
and it might also lead to making -device your_cpu functional.

PS:
the patch is too big, I suggest splitting it, at least the changes that introduce
generic infrastructure:
 > +mc->possible_cpu_arch_ids = spike_possible_cpu_arch_ids;
 > +mc->cpu_index_to_instance_props = spike_cpu_index_to_props;
 > +mc->get_default_cpu_node_id = spike_get_default_cpu_node_id;
 > +mc->numa_mem_supported = true;

> Signed-off-by: Anup Patel 
> ---
>  hw/riscv/spike.c | 381 +--
>  include/hw/riscv/spike.h |  15 +-
>  2 files changed, 296 insertions(+), 100 deletions(-)
> 
> diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
> index d5e0103d89..b1923442b6 100644
> --- a/hw/riscv/spike.c
> +++ b/hw/riscv/spike.c
> @@ -64,9 +64,14 @@ static void create_fdt(SpikeState *s, const struct MemmapEntry *memmap,
>  uint64_t mem_size, const char *cmdline)
>  {
>  void *fdt;
> -int cpu;
> -uint32_t *cells;
> -char *nodename;
> +unsigned long clint_addr;
> +int i, j, idx, cpu, socket;
> +MachineState *mc = MACHINE(s);
> +uint32_t dist_matrix_size;
> +uint32_t *clint_cells, *dist_matrix;
> +uint32_t cpu_phandle, intc_phandle, phandle = 1;
> +char *name, *mem_name, *clint_name, *clust_name;
> +char *core_name, *cpu_name, *intc_name;
>  
>  fdt = s->fdt = create_device_tree(&s->fdt_size);
>  if (!fdt) {
> @@ -88,68 +93,118 @@ static void create_fdt(SpikeState *s, const struct MemmapEntry *memmap,
>  qemu_fdt_setprop_cell(fdt, "/soc", "#size-cells", 0x2);
>  qemu_fdt_setprop_cell(fdt, "/soc", "#address-cells", 0x2);
>  
> -nodename = g_strdup_printf("/memory@%lx",
> -(long)memmap[SPIKE_DRAM].base);
> -qemu_fdt_add_subnode(fdt, nodename);
> -qemu_fdt_setprop_cells(fdt, nodename, "reg",
> -memmap[SPIKE_DRAM].base >> 32, memmap[SPIKE_DRAM].base,
> -mem_size >> 32, mem_size);
> -qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> -g_free(nodename);
> -
>  qemu_fdt_add_subnode(fdt, "/cpus");
>  qemu_fdt_setprop_cell(fdt, "/cpus", "timebase-frequency",
>  SIFIVE_CLINT_TIMEBASE_FREQ);
>  qemu_fdt_setprop_cell(fdt, "/cpus", "#size-cells", 0x0);
>  qemu_fdt_setprop_cell(fdt, "/cpus", "#address-cells", 0x1);
> +qemu_fdt_add_subnode(fdt, "/cpus/cpu-map");
> +
> +for (socket = (s->num_socs - 1); socket >= 0; socket--) {
> +clust_name = g_strdup_printf("/cpus/cpu-map/cluster%d", socket);
> +qemu_fdt_add_subnode(fdt, clust_name);
> +
> +clint_cells =  g_new0(uint32_t, s->soc[socket].num_harts * 4);
>  
> -for (cpu = s->soc.num_harts - 1; cpu >= 0; cpu--) {
> -nodename = g_strdup_printf("/cpus/cpu@%d", cpu);
> -char *intc = g_strdup_printf("/cpus/cpu@%d/interrupt-controller", cpu);
> -char *isa = riscv_isa_string(&s->soc.harts[cpu]);
> -qemu_fdt_add_subnode(fdt, nodename);
> +for (cpu = s->soc[socket].num_harts - 1; cpu >= 0; cpu--) {
> +cpu_phandle = phandle++;
> +
> +cpu_name = g_strdup_printf("/cpus/cpu@%d",
> +s->soc[socket].hartid_base + cpu);
> +qemu_fdt_add_subnode(fdt, cpu_name);
>  #if defined(TARGET_RISCV32)
> -qemu_fdt_setprop_string(fdt, nodename, "mmu-type", "riscv,sv32");
> +qemu_fdt_setprop_string(fdt, cpu_name, "mmu-type", "riscv,sv32");
>  #else
> -qemu_fdt_setprop_string(fdt, nodename, "mmu-type", "riscv,sv48");
> +qemu_fdt_setprop_string(fdt, cpu_name, "mmu-type", "riscv,sv48");
>  #endif
> -qemu_fdt_setprop_string(fdt, nodename, "riscv,isa", isa);
> -qemu_fdt_setprop_string(fdt, nodename, "compatible", "riscv");
> -qemu_fdt_setprop_string(fdt, nodename, "status", "okay");
> -qemu_fdt_setprop_cell(fdt, nodename, "reg", cpu);
> -qemu_fdt_setprop_string(fdt, nodename, "device_type", "cpu");
> -