Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Benjamin Herrenschmidt
On Tue, 2013-11-05 at 00:11 +1100, Alexey Kardashevskiy wrote:
> Question about english - is "the single RMA" equal to "the only RMA"?

Yes.

Ben.




Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Alexey Kardashevskiy
On 11/05/2013 12:19 AM, Peter Maydell wrote:
> On 4 November 2013 13:11, Alexey Kardashevskiy  wrote:
>> PAPR says in "Hypervisor Call Functions":
>>
>> "Logical addresses start at zero. When control is initially passed to the
>> OS from the platform, the first region is the
>> single RMA. The first region has logical region identifier of zero. This
>> first region is specified by the first address -
>> length pair of the “reg” property of the /memory node of the OF device tree."
>>
>>
>> Question about english - is "the single RMA" equal to "the only RMA"?
> 
> No. "the single RMA" is weird English and to me implies
> that it's a term that's been defined earlier or at least
> that there is some surrounding context that would make it
> make more sense (eg some contrasting definition of
> "single RMA" vs "double RMA").


Oh. Some more context then:

"14.1.1 Real Mode Accesses
When the OS controlling an LPAR runs with address translation turned off
(MSRDR or MSRIR bit(s) =0) (real mode)
the LPAR hardware translates the memory addresses to an LPAR unique area
known as the Real Mode Area (RMA).
When control is initially passed to the OS from the platform, the RMA
starts at the LPAR's logical address 0 and is the
first logical memory block reported in the LPAR’s device tree. In general,
the RMA is a subset of the LPAR's logical
address space. Attempting a non relocated access beyond the bounds of the
RMA results in an storage interrupt
(ISI/DSI depending upon instruction or data reference). The RMA hardware
translation scheme is platform dependent.
The options are given below."


-- 
Alexey



Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Peter Maydell
On 4 November 2013 13:11, Alexey Kardashevskiy  wrote:
> PAPR says in "Hypervisor Call Functions":
>
> "Logical addresses start at zero. When control is initially passed to the
> OS from the platform, the first region is the
> single RMA. The first region has logical region identifier of zero. This
> first region is specified by the first address -
> length pair of the “reg” property of the /memory node of the OF device tree."
>
>
> Question about english - is "the single RMA" equal to "the only RMA"?

No. "the single RMA" is weird English and to me implies
that it's a term that's been defined earlier or at least
that there is some surrounding context that would make it
make more sense (eg some contrasting definition of
"single RMA" vs "double RMA").

-- PMM



Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Alexey Kardashevskiy
On 11/04/2013 10:50 PM, Thomas Huth wrote:
> On Mon, 4 Nov 2013 12:28:12 +0100
> Alexander Graf  wrote:
> 
>>
>> On 04.11.2013, at 11:55, Benjamin Herrenschmidt  
>> wrote:
>>
>>> On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
 On 01.11.2013, at 11:21, Alexey Kardashevskiy  wrote:

> SLOF gets really confused if RTAS/device-tree and everything else
> what SLOF can use is not in the very first block of the very first
> memory node.
>
> This makes sure that the RMA area is where SLOF expects it to be.
>
> Cc: Benjamin Herrenschmidt 
> Cc: Nikunj A Dadhania 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> hw/ppc/spapr.c | 8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 09dc635..09a5d94 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs 
> *args)
>int i;
>MemoryRegion *sysmem = get_system_memory();
>MemoryRegion *ram = g_new(MemoryRegion, 1);
> -hwaddr rma_alloc_size;
> +hwaddr rma_alloc_size, node0_size;
>uint32_t initrd_base = 0;
>long kernel_size = 0, initrd_size = 0;
>long load_limit, rtas_limit, fw_size;
> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs 
> *args)
>spapr->rma_size = MIN(spapr->rma_size, 0x1000);
>}
>}
> +/*
> + * SLOF gets confused if RMA resides not in the first block
> + * of the first memory node so let's fix it.
> + */
> +node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> +spapr->rma_size = MIN(spapr->rma_size, node0_size);
 So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty 
 broken, especially on 970.

 Why does SLOF have any issues with NUMA memory nodes? It can just ignore 
 them, no?
>>>
>>> Because the only way SLOF knows about the RMA is by using the first
>>> "reg" entry of the first memory node and that's *all* SLOF knows about.
>>>
>>> If we start putting things like the DT, SLOF itself, etc... outside of
>>> that region, it will crash.
> 
> Ok, the question is whether this is a bug in SLOF and should be fixed
> there or whether the RMA should really be limited to the RAM of the
> first node only.


PAPR says in "Hypervisor Call Functions":

"Logical addresses start at zero. When control is initially passed to the
OS from the platform, the first region is the
single RMA. The first region has logical region identifier of zero. This
first region is specified by the first address -
length pair of the “reg” property of the /memory node of the OF device tree."


Question about english - is "the single RMA" equal to "the only RMA"?


> Looking at the function spapr_populate_memory(), it seems there is
> already similar code there, so I assume the RMA should really be
> limited to that size:
> 
> /* memory node(s) */
> node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> if (spapr->rma_size > node0_size) {
> spapr->rma_size = node0_size;
> }
> 
> Maybe this piece of code could just be done earlier instead, before
> setting up the fdt_addr and rtas_addr variables, instead of adding the
> similar piece of code of this patch?
> 
>>> So we "constrain" things to the rma that way.
>>>
>>> Creating 4M nodes makes no sense anyway
>>
>> So why don't we just use the "limit VRMA to 256MB" code always and error out 
>> of node0 is smaller? I don't think SLOF can run with less than 256MB anyway.
> 
> It's 128 MB nowadays ... there is even a define called MIN_RMA_SLOF for
> this in the code already.
> 
>  Thomas
> 


-- 
Alexey



Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Thomas Huth
On Mon, 4 Nov 2013 12:28:12 +0100
Alexander Graf  wrote:

> 
> On 04.11.2013, at 11:55, Benjamin Herrenschmidt  
> wrote:
> 
> > On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
> >> On 01.11.2013, at 11:21, Alexey Kardashevskiy  wrote:
> >> 
> >>> SLOF gets really confused if RTAS/device-tree and everything else
> >>> what SLOF can use is not in the very first block of the very first
> >>> memory node.
> >>> 
> >>> This makes sure that the RMA area is where SLOF expects it to be.
> >>> 
> >>> Cc: Benjamin Herrenschmidt 
> >>> Cc: Nikunj A Dadhania 
> >>> Signed-off-by: Alexey Kardashevskiy 
> >>> ---
> >>> hw/ppc/spapr.c | 8 +++-
> >>> 1 file changed, 7 insertions(+), 1 deletion(-)
> >>> 
> >>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> >>> index 09dc635..09a5d94 100644
> >>> --- a/hw/ppc/spapr.c
> >>> +++ b/hw/ppc/spapr.c
> >>> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs 
> >>> *args)
> >>>int i;
> >>>MemoryRegion *sysmem = get_system_memory();
> >>>MemoryRegion *ram = g_new(MemoryRegion, 1);
> >>> -hwaddr rma_alloc_size;
> >>> +hwaddr rma_alloc_size, node0_size;
> >>>uint32_t initrd_base = 0;
> >>>long kernel_size = 0, initrd_size = 0;
> >>>long load_limit, rtas_limit, fw_size;
> >>> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs 
> >>> *args)
> >>>spapr->rma_size = MIN(spapr->rma_size, 0x1000);
> >>>}
> >>>}
> >>> +/*
> >>> + * SLOF gets confused if RMA resides not in the first block
> >>> + * of the first memory node so let's fix it.
> >>> + */
> >>> +node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> >>> +spapr->rma_size = MIN(spapr->rma_size, node0_size);
> >> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty 
> >> broken, especially on 970.
> >> 
> >> Why does SLOF have any issues with NUMA memory nodes? It can just ignore 
> >> them, no?
> > 
> > Because the only way SLOF knows about the RMA is by using the first
> > "reg" entry of the first memory node and that's *all* SLOF knows about.
> > 
> > If we start putting things like the DT, SLOF itself, etc... outside of
> > that region, it will crash.

Ok, the question is whether this is a bug in SLOF and should be fixed
there or whether the RMA should really be limited to the RAM of the
first node only.

Looking at the function spapr_populate_memory(), it seems there is
already similar code there, so I assume the RMA should really be
limited to that size:

/* memory node(s) */
node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
if (spapr->rma_size > node0_size) {
spapr->rma_size = node0_size;
}

Maybe this piece of code could just be done earlier instead, before
setting up the fdt_addr and rtas_addr variables, instead of adding the
similar piece of code of this patch?

> > So we "constrain" things to the rma that way.
> > 
> > Creating 4M nodes makes no sense anyway
> 
> So why don't we just use the "limit VRMA to 256MB" code always and error out 
> of node0 is smaller? I don't think SLOF can run with less than 256MB anyway.

It's 128 MB nowadays ... there is even a define called MIN_RMA_SLOF for
this in the code already.

 Thomas




Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Alexander Graf

On 04.11.2013, at 11:55, Benjamin Herrenschmidt  
wrote:

> On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
>> On 01.11.2013, at 11:21, Alexey Kardashevskiy  wrote:
>> 
>>> SLOF gets really confused if RTAS/device-tree and everything else
>>> what SLOF can use is not in the very first block of the very first
>>> memory node.
>>> 
>>> This makes sure that the RMA area is where SLOF expects it to be.
>>> 
>>> Cc: Benjamin Herrenschmidt 
>>> Cc: Nikunj A Dadhania 
>>> Signed-off-by: Alexey Kardashevskiy 
>>> ---
>>> hw/ppc/spapr.c | 8 +++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
>>> index 09dc635..09a5d94 100644
>>> --- a/hw/ppc/spapr.c
>>> +++ b/hw/ppc/spapr.c
>>> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>>>int i;
>>>MemoryRegion *sysmem = get_system_memory();
>>>MemoryRegion *ram = g_new(MemoryRegion, 1);
>>> -hwaddr rma_alloc_size;
>>> +hwaddr rma_alloc_size, node0_size;
>>>uint32_t initrd_base = 0;
>>>long kernel_size = 0, initrd_size = 0;
>>>long load_limit, rtas_limit, fw_size;
>>> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
>>>spapr->rma_size = MIN(spapr->rma_size, 0x1000);
>>>}
>>>}
>>> +/*
>>> + * SLOF gets confused if RMA resides not in the first block
>>> + * of the first memory node so let's fix it.
>>> + */
>>> +node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
>>> +spapr->rma_size = MIN(spapr->rma_size, node0_size);
>> 
>> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty 
>> broken, especially on 970.
>> 
>> Why does SLOF have any issues with NUMA memory nodes? It can just ignore 
>> them, no?
> 
> Because the only way SLOF knows about the RMA is by using the first
> "reg" entry of the first memory node and that's *all* SLOF knows about.
> 
> If we start putting things like the DT, SLOF itself, etc... outside of
> that region, it will crash.
> 
> So we "constrain" things to the rma that way.
> 
> Creating 4M nodes makes no sense anyway

So why don't we just use the "limit VRMA to 256MB" code always and error out of 
node0 is smaller? I don't think SLOF can run with less than 256MB anyway.


Alex




Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Benjamin Herrenschmidt
On Mon, 2013-11-04 at 11:44 +0100, Alexander Graf wrote:
> On 01.11.2013, at 11:21, Alexey Kardashevskiy  wrote:
> 
> > SLOF gets really confused if RTAS/device-tree and everything else
> > what SLOF can use is not in the very first block of the very first
> > memory node.
> > 
> > This makes sure that the RMA area is where SLOF expects it to be.
> > 
> > Cc: Benjamin Herrenschmidt 
> > Cc: Nikunj A Dadhania 
> > Signed-off-by: Alexey Kardashevskiy 
> > ---
> > hw/ppc/spapr.c | 8 +++-
> > 1 file changed, 7 insertions(+), 1 deletion(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 09dc635..09a5d94 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
> > int i;
> > MemoryRegion *sysmem = get_system_memory();
> > MemoryRegion *ram = g_new(MemoryRegion, 1);
> > -hwaddr rma_alloc_size;
> > +hwaddr rma_alloc_size, node0_size;
> > uint32_t initrd_base = 0;
> > long kernel_size = 0, initrd_size = 0;
> > long load_limit, rtas_limit, fw_size;
> > @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
> > spapr->rma_size = MIN(spapr->rma_size, 0x1000);
> > }
> > }
> > +/*
> > + * SLOF gets confused if RMA resides not in the first block
> > + * of the first memory node so let's fix it.
> > + */
> > +node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> > +spapr->rma_size = MIN(spapr->rma_size, node0_size);
> 
> So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty 
> broken, especially on 970.
> 
> Why does SLOF have any issues with NUMA memory nodes? It can just ignore 
> them, no?

Because the only way SLOF knows about the RMA is by using the first
"reg" entry of the first memory node and that's *all* SLOF knows about.

If we start putting things like the DT, SLOF itself, etc... outside of
that region, it will crash.

So we "constrain" things to the rma that way.

Creating 4M nodes makes no sense anyway

Cheers,
Ben.





Re: [Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-04 Thread Alexander Graf

On 01.11.2013, at 11:21, Alexey Kardashevskiy  wrote:

> SLOF gets really confused if RTAS/device-tree and everything else
> what SLOF can use is not in the very first block of the very first
> memory node.
> 
> This makes sure that the RMA area is where SLOF expects it to be.
> 
> Cc: Benjamin Herrenschmidt 
> Cc: Nikunj A Dadhania 
> Signed-off-by: Alexey Kardashevskiy 
> ---
> hw/ppc/spapr.c | 8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 09dc635..09a5d94 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
> int i;
> MemoryRegion *sysmem = get_system_memory();
> MemoryRegion *ram = g_new(MemoryRegion, 1);
> -hwaddr rma_alloc_size;
> +hwaddr rma_alloc_size, node0_size;
> uint32_t initrd_base = 0;
> long kernel_size = 0, initrd_size = 0;
> long load_limit, rtas_limit, fw_size;
> @@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
> spapr->rma_size = MIN(spapr->rma_size, 0x1000);
> }
> }
> +/*
> + * SLOF gets confused if RMA resides not in the first block
> + * of the first memory node so let's fix it.
> + */
> +node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
> +spapr->rma_size = MIN(spapr->rma_size, node0_size);

So if I create a NUMA node of 4MB that will be my RMA? That sounds pretty 
broken, especially on 970.

Why does SLOF have any issues with NUMA memory nodes? It can just ignore them, 
no?


Alex




[Qemu-devel] [PATCH] spapr: make sure RMA is in first mode of first memory node

2013-11-01 Thread Alexey Kardashevskiy
SLOF gets really confused if RTAS/device-tree and everything else
what SLOF can use is not in the very first block of the very first
memory node.

This makes sure that the RMA area is where SLOF expects it to be.

Cc: Benjamin Herrenschmidt 
Cc: Nikunj A Dadhania 
Signed-off-by: Alexey Kardashevskiy 
---
 hw/ppc/spapr.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 09dc635..09a5d94 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1113,7 +1113,7 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 int i;
 MemoryRegion *sysmem = get_system_memory();
 MemoryRegion *ram = g_new(MemoryRegion, 1);
-hwaddr rma_alloc_size;
+hwaddr rma_alloc_size, node0_size;
 uint32_t initrd_base = 0;
 long kernel_size = 0, initrd_size = 0;
 long load_limit, rtas_limit, fw_size;
@@ -1154,6 +1154,12 @@ static void ppc_spapr_init(QEMUMachineInitArgs *args)
 spapr->rma_size = MIN(spapr->rma_size, 0x1000);
 }
 }
+/*
+ * SLOF gets confused if RMA resides not in the first block
+ * of the first memory node so let's fix it.
+ */
+node0_size = (nb_numa_nodes > 1) ? node_mem[0] : ram_size;
+spapr->rma_size = MIN(spapr->rma_size, node0_size);
 
 /* We place the device tree and RTAS just below either the top of the RMA,
  * or just below 2GB, whichever is lowere, so that it can be
-- 
1.8.4.rc4