Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On Tue, 2013-11-05 at 00:11 +1100, Alexey Kardashevskiy wrote:

> Question about english - is "the single RMA" equal to "the only RMA"?

Yes.

Ben.
Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On 11/05/2013 12:19 AM, Peter Maydell wrote:
> On 4 November 2013 13:11, Alexey Kardashevskiy wrote:
>> PAPR says in "Hypervisor Call Functions":
>>
>> "Logical addresses start at zero. When control is initially passed to the
>> OS from the platform, the first region is the
>> single RMA. The first region has logical region identifier of zero. This
>> first region is specified by the first address -
>> length pair of the “reg” property of the /memory node of the OF device tree."
>>
>>
>> Question about english - is "the single RMA" equal to "the only RMA"?
>
> No. "the single RMA" is weird English and to me implies
> that it's a term that's been defined earlier or at least
> that there is some surrounding context that would make it
> make more sense (eg some contrasting definition of
> "single RMA" vs "double RMA").

Oh. Some more context then:

"14.1.1 Real Mode Accesses

When the OS controlling an LPAR runs with address translation turned off
(MSR[DR] or MSR[IR] bit(s) =0) (real mode) the LPAR hardware translates the
memory addresses to an LPAR unique area known as the Real Mode Area (RMA).
When control is initially passed to the OS from the platform, the RMA starts
at the LPAR's logical address 0 and is the first logical memory block
reported in the LPAR’s device tree. In general, the RMA is a subset of the
LPAR's logical address space. Attempting a non relocated access beyond the
bounds of the RMA results in an storage interrupt (ISI/DSI depending upon
instruction or data reference). The RMA hardware translation scheme is
platform dependent. The options are given below."

--
Alexey
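
To make the PAPR wording concrete, here is a minimal C sketch of what "the first address - length pair of the reg property" means for a consumer of the device tree. It is only an illustration: it assumes the /memory node uses #address-cells = 2 and #size-cells = 2 (two big-endian 32-bit cells each), and the names mem_range, read_be64 and first_reg_pair are made up for this sketch; they are not SLOF or QEMU functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for one (address, length) pair. */
struct mem_range {
    uint64_t base;
    uint64_t size;
};

/* Device tree property values are stored big-endian. */
static uint64_t read_be64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++) {
        v = (v << 8) | p[i];
    }
    return v;
}

/*
 * Decode only the first (address, length) pair of a raw "reg" property.
 * ASSUMPTION: #address-cells = 2 and #size-cells = 2, so one pair
 * occupies 16 bytes. Returns 0 on success, -1 if the property is too short.
 */
static int first_reg_pair(const uint8_t *reg, size_t len,
                          struct mem_range *out)
{
    assert(reg != NULL && out != NULL);
    if (len < 16) {
        return -1;
    }
    out->base = read_be64(reg);
    out->size = read_be64(reg + 8);
    return 0;
}
```

With this reading, the RMA described in the quoted text is the range that first_reg_pair() returns, and anything described by later pairs or later memory nodes lies outside it.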
Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On 4 November 2013 13:11, Alexey Kardashevskiy wrote:
> PAPR says in "Hypervisor Call Functions":
>
> "Logical addresses start at zero. When control is initially passed to the
> OS from the platform, the first region is the
> single RMA. The first region has logical region identifier of zero. This
> first region is specified by the first address -
> length pair of the “reg” property of the /memory node of the OF device tree."
>
>
> Question about english - is "the single RMA" equal to "the only RMA"?

No. "the single RMA" is weird English and to me implies
that it's a term that's been defined earlier or at least
that there is some surrounding context that would make it
make more sense (eg some contrasting definition of
"single RMA" vs "double RMA").

-- PMM
Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On 11/04/2013 10:50 PM, Thomas Huth wrote:
> On Mon, 4 Nov 2013 12:28:12 +0100
> Alexander Graf wrote:
>
>>
>> On 04.11.2013, at 11:55, Benjamin Herrenschmidt
>> wrote:
>>
>>> On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
>>>> On 01.11.2013, at 11:21, Alexey Kardashevskiy wrote:
>>>>
>>>>> SLOF gets really confused if RTAS/device-tree and everything else
>>>>> what SLOF can use is not in the very first block of the very first
>>>>> memory node.
>>>>>
>>>>> This makes sure that the RMA area is where SLOF expects it to be.
>>>>>
>>>>> Cc: Benjamin Herrenschmidt
>>>>> Cc: Nikunj A Dadhania
>>>>> Signed-off-by: Alexey Kardashevskiy
>>>>> ---
>>>>>  hw/ppc/spapr.c | 8 +++-
>>>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>>>> index 09dc635..09a5d94 100644
>>>>> --- a/hw/ppc/spapr.c
>>>>> +++ b/hw/ppc/spapr.c
>>>>> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs
>>>>> *args)
>>>>>      int i;
>>>>>      MemoryRegion *sysmem = get_system_memory();
>>>>>      MemoryRegion *ram = g_new(MemoryRegion, 1);
>>>>> -    hwaddr rma_alloc_size;
>>>>> +    hwaddr rma_alloc_size, node0_size;
>>>>>      uint32_t initrd_base = 0;
>>>>>      long kernel_size = 0, initrd_size = 0;
>>>>>      long load_limit, rtas_limit, fw_size;
>>>>> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs
>>>>> *args)
>>>>>              spapr->rma_size = MIN(spapr->rma_size, 0x1000);
>>>>>          }
>>>>>      }
>>>>> +    /*
>>>>> +     * SLOF gets confused if RMA resides not in the first block
>>>>> +     * of the first memory node so let's fix it.
>>>>> +     */
>>>>> +    node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
>>>>> +    spapr->rma_size = MIN(spapr->rma_size, node0_size);
>>>>
>>>> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty
>>>> broken, especially on 970.
>>>>
>>>> Why does SLOF have any issues with NUMA memory nodes? It can just ignore
>>>> them, no?
>>>
>>> Because the only way SLOF knows about the RMA is by using the first
>>> "reg" entry of the first memory node and that's *all* SLOF knows about.
>>>
>>> If we start putting things like the DT, SLOF itself, etc... outside of
>>> that region, it will crash.
>
> Ok, the question is whether this is a bug in SLOF and should be fixed
> there or whether the RMA should really be limited to the RAM of the
> first node only.

PAPR says in "Hypervisor Call Functions":

"Logical addresses start at zero. When control is initially passed to the
OS from the platform, the first region is the
single RMA. The first region has logical region identifier of zero. This
first region is specified by the first address -
length pair of the “reg” property of the /memory node of the OF device tree."


Question about english - is "the single RMA" equal to "the only RMA"?

> Looking at the function spapr_populate_memory(), it seems there is
> already similar code there, so I assume the RMA should really be
> limited to that size:
>
>     /* memory node(s) */
>     node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
>     if (spapr->rma_size > node0_size) {
>         spapr->rma_size = node0_size;
>     }
>
> Maybe this piece of code could just be done earlier instead, before
> setting up the fdt_addr and rtas_addr variables, instead of adding the
> similar piece of code of this patch?
>
>>> So we "constrain" things to the rma that way.
>>>
>>> Creating 4M nodes makes no sense anyway
>>
>> So why don't we just use the "limit VRMA to 256MB" code always and error out
>> of node0 is smaller? I don't think SLOF can run with less than 256MB anyway.
>
> It's 128 MB nowadays ... there is even a define called MIN_RMA_SLOF for
> this in the code already.
>
>  Thomas
>

--
Alexey
Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On Mon, 4 Nov 2013 12:28:12 +0100
Alexander Graf wrote:

>
> On 04.11.2013, at 11:55, Benjamin Herrenschmidt
> wrote:
>
> > On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
> >> On 01.11.2013, at 11:21, Alexey Kardashevskiy wrote:
> >>
> >>> SLOF gets really confused if RTAS/device-tree and everything else
> >>> what SLOF can use is not in the very first block of the very first
> >>> memory node.
> >>>
> >>> This makes sure that the RMA area is where SLOF expects it to be.
> >>>
> >>> Cc: Benjamin Herrenschmidt
> >>> Cc: Nikunj A Dadhania
> >>> Signed-off-by: Alexey Kardashevskiy
> >>> ---
> >>>  hw/ppc/spapr.c | 8 +++-
> >>>  1 file changed, 7 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>> index 09dc635..09a5d94 100644
> >>> --- a/hw/ppc/spapr.c
> >>> +++ b/hw/ppc/spapr.c
> >>> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs
> >>> *args)
> >>>      int i;
> >>>      MemoryRegion *sysmem = get_system_memory();
> >>>      MemoryRegion *ram = g_new(MemoryRegion, 1);
> >>> -    hwaddr rma_alloc_size;
> >>> +    hwaddr rma_alloc_size, node0_size;
> >>>      uint32_t initrd_base = 0;
> >>>      long kernel_size = 0, initrd_size = 0;
> >>>      long load_limit, rtas_limit, fw_size;
> >>> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs
> >>> *args)
> >>>              spapr->rma_size = MIN(spapr->rma_size, 0x1000);
> >>>          }
> >>>      }
> >>> +    /*
> >>> +     * SLOF gets confused if RMA resides not in the first block
> >>> +     * of the first memory node so let's fix it.
> >>> +     */
> >>> +    node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> >>> +    spapr->rma_size = MIN(spapr->rma_size, node0_size);
> >> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty
> >> broken, especially on 970.
> >>
> >> Why does SLOF have any issues with NUMA memory nodes? It can just ignore
> >> them, no?
> >
> > Because the only way SLOF knows about the RMA is by using the first
> > "reg" entry of the first memory node and that's *all* SLOF knows about.
> >
> > If we start putting things like the DT, SLOF itself, etc... outside of
> > that region, it will crash.

Ok, the question is whether this is a bug in SLOF and should be fixed
there or whether the RMA should really be limited to the RAM of the
first node only.

Looking at the function spapr_populate_memory(), it seems there is
already similar code there, so I assume the RMA should really be
limited to that size:

    /* memory node(s) */
    node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
    if (spapr->rma_size > node0_size) {
        spapr->rma_size = node0_size;
    }

Maybe this piece of code could just be done earlier instead, before
setting up the fdt_addr and rtas_addr variables, instead of adding the
similar piece of code of this patch?

> > So we "constrain" things to the rma that way.
> >
> > Creating 4M nodes makes no sense anyway
>
> So why don't we just use the "limit VRMA to 256MB" code always and error out
> of node0 is smaller? I don't think SLOF can run with less than 256MB anyway.

It's 128 MB nowadays ... there is even a define called MIN_RMA_SLOF for
this in the code already.

 Thomas
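
The two ideas in this part of the thread (clamp the RMA to the first node, and refuse to start when the result is below what SLOF needs) can be combined in one check. The following is a rough, self-contained C sketch, not QEMU code: SKETCH_MIN_RMA_BYTES and pick_rma_size are hypothetical names, and the 128 MB figure is simply the value quoted above for MIN_RMA_SLOF.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMPTION: 128 MB, the minimum mentioned in the thread for SLOF. */
#define SKETCH_MIN_RMA_BYTES (128ULL << 20)

/*
 * Clamp the requested RMA size to the size of the first memory node,
 * then reject the configuration if the result is below the minimum.
 * Returns true and stores the usable size in *out on success.
 */
static bool pick_rma_size(uint64_t wanted, uint64_t node0_size,
                          uint64_t *out)
{
    assert(out != NULL);

    uint64_t clamped = wanted < node0_size ? wanted : node0_size;
    if (clamped < SKETCH_MIN_RMA_BYTES) {
        return false; /* e.g. a 4 MB first node: caller should error out */
    }
    *out = clamped;
    return true;
}
```

A caller would report a configuration error when pick_rma_size() returns false rather than silently continuing with an RMA that the firmware cannot fit into.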
Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On 04.11.2013, at 11:55, Benjamin Herrenschmidt wrote:

> On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
>> On 01.11.2013, at 11:21, Alexey Kardashevskiy wrote:
>>
>>> SLOF gets really confused if RTAS/device-tree and everything else
>>> what SLOF can use is not in the very first block of the very first
>>> memory node.
>>>
>>> This makes sure that the RMA area is where SLOF expects it to be.
>>>
>>> Cc: Benjamin Herrenschmidt
>>> Cc: Nikunj A Dadhania
>>> Signed-off-by: Alexey Kardashevskiy
>>> ---
>>>  hw/ppc/spapr.c | 8 +++-
>>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 09dc635..09a5d94 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>>>      int i;
>>>      MemoryRegion *sysmem = get_system_memory();
>>>      MemoryRegion *ram = g_new(MemoryRegion, 1);
>>> -    hwaddr rma_alloc_size;
>>> +    hwaddr rma_alloc_size, node0_size;
>>>      uint32_t initrd_base = 0;
>>>      long kernel_size = 0, initrd_size = 0;
>>>      long load_limit, rtas_limit, fw_size;
>>> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>>>              spapr->rma_size = MIN(spapr->rma_size, 0x1000);
>>>          }
>>>      }
>>> +    /*
>>> +     * SLOF gets confused if RMA resides not in the first block
>>> +     * of the first memory node so let's fix it.
>>> +     */
>>> +    node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
>>> +    spapr->rma_size = MIN(spapr->rma_size, node0_size);
>>
>> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty
>> broken, especially on 970.
>>
>> Why does SLOF have any issues with NUMA memory nodes? It can just ignore
>> them, no?
>
> Because the only way SLOF knows about the RMA is by using the first
> "reg" entry of the first memory node and that's *all* SLOF knows about.
>
> If we start putting things like the DT, SLOF itself, etc... outside of
> that region, it will crash.
>
> So we "constrain" things to the rma that way.
>
> Creating 4M nodes makes no sense anyway

So why don't we just use the "limit VRMA to 256MB" code always and error out
of node0 is smaller? I don't think SLOF can run with less than 256MB anyway.

Alex
Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
> On 01.11.2013, at 11:21, Alexey Kardashevskiy wrote:
>
> > SLOF gets really confused if RTAS/device-tree and everything else
> > what SLOF can use is not in the very first block of the very first
> > memory node.
> >
> > This makes sure that the RMA area is where SLOF expects it to be.
> >
> > Cc: Benjamin Herrenschmidt
> > Cc: Nikunj A Dadhania
> > Signed-off-by: Alexey Kardashevskiy
> > ---
> >  hw/ppc/spapr.c | 8 +++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 09dc635..09a5d94 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
> >      int i;
> >      MemoryRegion *sysmem = get_system_memory();
> >      MemoryRegion *ram = g_new(MemoryRegion, 1);
> > -    hwaddr rma_alloc_size;
> > +    hwaddr rma_alloc_size, node0_size;
> >      uint32_t initrd_base = 0;
> >      long kernel_size = 0, initrd_size = 0;
> >      long load_limit, rtas_limit, fw_size;
> > @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
> >              spapr->rma_size = MIN(spapr->rma_size, 0x1000);
> >          }
> >      }
> > +    /*
> > +     * SLOF gets confused if RMA resides not in the first block
> > +     * of the first memory node so let's fix it.
> > +     */
> > +    node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> > +    spapr->rma_size = MIN(spapr->rma_size, node0_size);
>
> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty
> broken, especially on 970.
>
> Why does SLOF have any issues with NUMA memory nodes? It can just ignore
> them, no?

Because the only way SLOF knows about the RMA is by using the first
"reg" entry of the first memory node and that's *all* SLOF knows about.

If we start putting things like the DT, SLOF itself, etc... outside of
that region, it will crash.

So we "constrain" things to the rma that way.

Creating 4M nodes makes no sense anyway

Cheers,
Ben.
Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
On 01.11.2013, at 11:21, Alexey Kardashevskiy wrote:

> SLOF gets really confused if RTAS/device-tree and everything else
> what SLOF can use is not in the very first block of the very first
> memory node.
>
> This makes sure that the RMA area is where SLOF expects it to be.
>
> Cc: Benjamin Herrenschmidt
> Cc: Nikunj A Dadhania
> Signed-off-by: Alexey Kardashevskiy
> ---
>  hw/ppc/spapr.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 09dc635..09a5d94 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>      int i;
>      MemoryRegion *sysmem = get_system_memory();
>      MemoryRegion *ram = g_new(MemoryRegion, 1);
> -    hwaddr rma_alloc_size;
> +    hwaddr rma_alloc_size, node0_size;
>      uint32_t initrd_base = 0;
>      long kernel_size = 0, initrd_size = 0;
>      long load_limit, rtas_limit, fw_size;
> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>              spapr->rma_size = MIN(spapr->rma_size, 0x1000);
>          }
>      }
> +    /*
> +     * SLOF gets confused if RMA resides not in the first block
> +     * of the first memory node so let's fix it.
> +     */
> +    node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> +    spapr->rma_size = MIN(spapr->rma_size, node0_size);

So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty
broken, especially on 970.

Why does SLOF have any issues with NUMA memory nodes? It can just ignore
them, no?

Alex
[Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node
SLOF gets really confused if RTAS/device-tree and everything else
what SLOF can use is not in the very first block of the very first
memory node.

This makes sure that the RMA area is where SLOF expects it to be.

Cc: Benjamin Herrenschmidt
Cc: Nikunj A Dadhania
Signed-off-by: Alexey Kardashevskiy
---
 hw/ppc/spapr.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 09dc635..09a5d94 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
     int i;
     MemoryRegion *sysmem = get_system_memory();
     MemoryRegion *ram = g_new(MemoryRegion, 1);
-    hwaddr rma_alloc_size;
+    hwaddr rma_alloc_size, node0_size;
     uint32_t initrd_base = 0;
     long kernel_size = 0, initrd_size = 0;
     long load_limit, rtas_limit, fw_size;
@@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
             spapr->rma_size = MIN(spapr->rma_size, 0x1000);
         }
     }
+    /*
+     * SLOF gets confused if RMA resides not in the first block
+     * of the first memory node so let's fix it.
+     */
+    node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
+    spapr->rma_size = MIN(spapr->rma_size, node0_size);

     /* We place the device tree and RTAS just below either the top of the RMA,
      * or just below 2GB, whichever is lowere, so that it can be
--
1.8.4.rc4
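
For readers who want to try the effect of the two added lines outside of QEMU, the clamp can be modelled as a small stand-alone C sketch. The expression for node0_size is taken from the patch; everything else (the function names, passing node_mem as a parameter, the MAX_NODES bound) is an assumption made only to keep the example self-contained.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 8 /* hypothetical bound, for this sketch only */

/*
 * Size of the first memory node: node_mem[0] when more than one NUMA
 * node is configured, otherwise all of RAM (same expression as the patch).
 */
static uint64_t first_node_size(int nb_numa_nodes, const uint64_t *node_mem,
                                uint64_t ram_size)
{
    assert(nb_numa_nodes >= 0 && nb_numa_nodes <= MAX_NODES);
    return (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
}

/* Limit the RMA so that it never extends past the first memory node. */
static uint64_t clamp_rma_size(uint64_t rma_size, int nb_numa_nodes,
                               const uint64_t *node_mem, uint64_t ram_size)
{
    uint64_t node0_size = first_node_size(nb_numa_nodes, node_mem, ram_size);
    return rma_size < node0_size ? rma_size : node0_size;
}
```

Called with two nodes of 4 MB and 1020 MB and a requested RMA of 256 MB, clamp_rma_size() returns 4 MB; with no NUMA nodes configured and 1 GB of RAM the requested 256 MB is left unchanged.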