Re: [Xen-devel] [RFC PATCH v3 15/24] ARM: NUMA: DT: Add CPU NUMA support

2017-07-25 Thread Vijay Kilari
Hi Julien,

On Mon, Jul 24, 2017 at 4:54 PM, Julien Grall  wrote:
> Hi Vijay,
>
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> For each cpu, update cpu_to_node[] with node id from
>> the numa-node-id DT property. Also, initialize cpu_to_node[]
>> with node 0.
>>
>> Add macros to access cpu_to_node[] information.
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>> v3: - Dropped numa_add_cpu declaration from asm-arm/numa.h
>> - Dropped stale declarations
>> - Call numa_add_cpu for cpu0
>> ---
>>  xen/arch/arm/numa/numa.c   | 21 +
>>  xen/arch/arm/setup.c   |  2 ++
>>  xen/arch/arm/smpboot.c | 25 -
>>  xen/include/asm-arm/numa.h |  7 +++
>>  xen/include/asm-x86/numa.h |  1 -
>>  xen/include/xen/numa.h |  1 +
>>  6 files changed, 55 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
>> index c00b92c..dc80aa5 100644
>> --- a/xen/arch/arm/numa/numa.c
>> +++ b/xen/arch/arm/numa/numa.c
>> @@ -22,11 +22,31 @@
>>
>>  static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
>>
>> +/*
>> + * Setup early cpu_to_node.
>> + */
>> +void __init init_cpu_to_node(void)
>> +{
>> +int i;
>> +
>> +for ( i = 0; i < NR_CPUS; i++ )
>> +numa_set_node(i, 0);
>> +}
>
>
> From the comment: "Setup early cpu_to_node". However this is not how you are
> using it.

Ok. I will update the comment.

>
> But I am not sure why it is even here...
>
>> +
>>  void numa_failed(void)
>>  {
>>  numa_off = true;
>>  init_dt_numa_distance();
>>  node_distance_fn = NULL;
>> +init_cpu_to_node();
>> +}
>> +
>> +void __init numa_set_cpu_node(int cpu, unsigned int nid)
>> +{
>> +if ( !node_isset(nid, processor_nodes_parsed) || nid >= MAX_NUMNODES )
>> +nid = 0;
>
>
> This looks wrong to me. If the node-id is invalid, why would you blindly set
> to 0?

In practice this condition should not be hit. I will make this function
return an error code when the nid is invalid.
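
A minimal sketch of what an error-returning variant could look like. The bitmap, the array sizes and the helper node_is_parsed() are hypothetical stand-ins for the Xen definitions, chosen so the fragment compiles on its own; this is not the code of a later revision. It also puts the range check before the bitmap lookup, so an out-of-range nid never indexes the mask.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Xen definitions (assumptions). */
#define MAX_NUMNODES 8
#define NR_CPUS      16

typedef unsigned char nodeid_t;

static unsigned long processor_nodes_parsed;   /* one bit per parsed node */
static nodeid_t cpu_to_node[NR_CPUS];

static bool node_is_parsed(unsigned int nid)
{
    return (processor_nodes_parsed >> nid) & 1UL;
}

/*
 * Reject an invalid node id instead of silently falling back to node 0.
 * The range check comes first so the bitmap is only consulted for a
 * node id that is known to be in range.
 */
static int numa_set_cpu_node(unsigned int cpu, unsigned int nid)
{
    if ( cpu >= NR_CPUS )
        return -EINVAL;

    if ( nid >= MAX_NUMNODES || !node_is_parsed(nid) )
        return -EINVAL;

    cpu_to_node[cpu] = (nodeid_t)nid;
    return 0;
}
```

The caller can then decide whether a bad node id should disable NUMA altogether or only skip that CPU.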

>
>
>> +
>> +numa_set_node(cpu, nid);
>>  }
>>
>>  uint8_t __node_distance(nodeid_t a, nodeid_t b)
>> @@ -49,6 +69,7 @@ void __init numa_init(void)
>>  int ret = 0;
>>
>>  nodes_clear(processor_nodes_parsed);
>> +init_cpu_to_node();
>>  init_dt_numa_distance();
>>
>>  if ( numa_off )
>> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
>> index a6d1499..b9c8b0d 100644
>> --- a/xen/arch/arm/setup.c
>> +++ b/xen/arch/arm/setup.c
>> @@ -787,6 +787,8 @@ void __init start_xen(unsigned long boot_phys_offset,
>>
>>  processor_id();
>>
>> +numa_add_cpu(0);
>> +
>>  smp_init_cpus();
>>  cpus = smp_get_max_cpus();
>>  printk(XENLOG_INFO "SMP: Allowing %u CPUs\n", cpus);
>> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
>> index 32e8722..fcf9afc 100644
>> --- a/xen/arch/arm/smpboot.c
>> +++ b/xen/arch/arm/smpboot.c
>> @@ -29,6 +29,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>
>
> Please use the alphabetical order.
>
>>  #include 
>>  #include 
>>  #include 
>> @@ -106,6 +107,7 @@ static void __init dt_smp_init_cpus(void)
>>  [0 ... NR_CPUS - 1] = MPIDR_INVALID
>>  };
>>  bool_t bootcpu_valid = 0;
>> +nodeid_t *cpu_to_nodemap;
>>  int rc;
>>
>>  mpidr = boot_cpu_data.mpidr.bits & MPIDR_HWID_MASK;
>> @@ -117,11 +119,18 @@ static void __init dt_smp_init_cpus(void)
>>  return;
>>  }
>>
>> +cpu_to_nodemap = xzalloc_array(nodeid_t, NR_CPUS);
>
>
> Why do you need to allocate cpu_to_nodemap? Would not it be easier to put it
> on the stack as we do for other variable?

This array records the node id of every CPU, indexed by CPU id, in a single
pass over all the CPUs. Later, when the logical CPU id mapping is set up, the
node mapping is applied by calling numa_set_cpu_node().

>
>> +if ( !cpu_to_nodemap )
>> +{
>> +printk(XENLOG_WARNING "Failed to allocate memory for cpu_to_nodemap\n");
>> +return;
>> +}
>> +
>>  dt_for_each_child_node( cpus, cpu )
>>  {
>>  const __be32 *prop;
>>  u64 addr;
>> -u32 reg_len;
>> +uint32_t reg_len, nid;
>>  register_t hwid;
>>
>>  if ( !dt_device_type_is_equal(cpu, "cpu") )
>> @@ -146,6 +155,15 @@ static void __init dt_smp_init_cpus(void)
>>  continue;
>>  }
>>
>> +if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
>> +{
>> +printk(XENLOG_WARNING "cpu node `%s`: numa-node-id not found\n",
>> +   dt_node_full_name(cpu));
>
>
> numa-node-id is not mandatory. So you would print a warning on all non-NUMA
> platforms. This is not what we want.

OK. I will drop this warning.
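
A rough sketch of the quiet fallback, assuming a simplified property table in place of the real device tree API. The types struct prop and struct dt_node and the helpers read_u32() and cpu_numa_node() are hypothetical; read_u32() only mirrors the "true on success" convention that the quoted patch relies on for dt_property_read_u32().

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a device tree node's properties. */
struct prop {
    const char *name;
    uint32_t value;
};

struct dt_node {
    const struct prop *props;
    size_t nr_props;
};

/* Returns true and fills *out when the named property exists. */
static bool read_u32(const struct dt_node *np, const char *name, uint32_t *out)
{
    for ( size_t i = 0; i < np->nr_props; i++ )
    {
        if ( strcmp(np->props[i].name, name) == 0 )
        {
            *out = np->props[i].value;
            return true;
        }
    }

    return false;
}

/* numa-node-id is optional: a missing property quietly means node 0. */
static uint32_t cpu_numa_node(const struct dt_node *cpu)
{
    uint32_t nid;

    if ( !read_u32(cpu, "numa-node-id", &nid) )
        nid = 0;

    return nid;
}
```

With this shape a non-NUMA platform boots without any extra log noise, and a NUMA platform still gets its node ids.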
>
>> +nid = 0;
>> +}
>> +
>> +cpu_to_nodemap[cpuidx] = nid;
>> +
>>  addr = dt_read_number(prop, dt_n_addr_cells(cpu));
>>
>>  hwid = addr;
>> @@ -224,6 +242,7 @@ static void __init 

Re: [Xen-devel] [RFC PATCH v3 13/24] ARM: NUMA: DT: Parse memory NUMA information

2017-07-21 Thread Vijay Kilari
Hi Julien,

On Thu, Jul 20, 2017 at 4:56 PM, Julien Grall  wrote:
>
>
> On 19/07/17 19:39, Julien Grall wrote:
>>>
>>>  cell = (const __be32 *)prop->data;
>>>  banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
>>>
>>> -for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS; i++ )
>>> +for ( i = 0; i < banks; i++ )
>>>  {
>>>  device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
>>>  if ( !size )
>>>  continue;
>>> -bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
>>> -bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
>>> -bootinfo.mem.nr_banks++;
>>> +if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks < NR_MEM_BANKS )
>>> +{
>>> +bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
>>> +bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
>>> +bootinfo.mem.nr_banks++;
>>> +}
>>
>>
>> This change should be split.
>
>
> I thought a bit more about this code during the week. I think it would be
> nicer to write:
>
> #ifdef CONFIG_NUMA
> dt_numa_process_memory_node(nid, start, size);
> #endif
>
> if ( !efi_enabled(EFI_BOOT) )
>   continue;

Shouldn't this be if ( efi_enabled(EFI_BOOT) )?
>
> if ( bootinfo.mem.nr_banks < NR_MEM_BANKS )

Shouldn't this be if ( bootinfo.mem.nr_banks >= NR_MEM_BANKS )?

>   break;
>
> bootinfo.mem.bank[];
> 
>
> Also, you may want to add a stub for dt_numa_process_memory_node rather than
> #ifdef in the code.
>
> Cheers,
>
> --
> Julien Grall
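
For reference, a self-contained sketch of the loop shape discussed above, with both conditions written the way the reply suggests (skip bank registration on EFI boot, stop once the bank array is full). The types, the NR_MEM_BANKS value, the call counter and process_banks() are hypothetical stand-ins so the fragment compiles alone; the NUMA hook is a plain function here, standing in for the stub-or-real function pair suggested in the review.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_MEM_BANKS 4           /* hypothetical small limit for the sketch */

struct membank {
    uint64_t start, size;
};

static struct {
    unsigned int nr_banks;
    struct membank bank[NR_MEM_BANKS];
} mem;

static unsigned int numa_calls;  /* counts calls to the NUMA hook */

/* With NUMA disabled this would be an empty static inline stub. */
static void dt_numa_process_memory_node(uint32_t nid, uint64_t start,
                                        uint64_t size)
{
    (void)nid; (void)start; (void)size;
    numa_calls++;
}

static void process_banks(const struct membank *regs, unsigned int banks,
                          uint32_t nid, bool efi_boot)
{
    for ( unsigned int i = 0; i < banks; i++ )
    {
        if ( !regs[i].size )
            continue;

        dt_numa_process_memory_node(nid, regs[i].start, regs[i].size);

        /* On EFI boot the memory map comes from UEFI, not from the DT. */
        if ( efi_boot )
            continue;

        if ( mem.nr_banks >= NR_MEM_BANKS )
            break;

        mem.bank[mem.nr_banks].start = regs[i].start;
        mem.bank[mem.nr_banks].size = regs[i].size;
        mem.nr_banks++;
    }
}
```

One consequence worth noting: once the bank array is full, the break also stops NUMA processing for the remaining ranges, whereas the loop in the posted patch keeps iterating over them.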

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH v3 02/24] x86: NUMA: Clean up: Fix coding styles and drop unused code

2017-07-20 Thread Vijay Kilari
On Thu, Jul 20, 2017 at 5:39 PM, Julien Grall <julien.gr...@arm.com> wrote:
>
>
> On 20/07/17 13:05, Vijay Kilari wrote:
>>
>> On Thu, Jul 20, 2017 at 4:30 PM, Julien Grall <julien.gr...@arm.com>
>> wrote:
>>>
>>> Hi Vijay,
>>>
>>>
>>> On 20/07/17 08:00, Vijay Kilari wrote:
>>>>
>>>>
>>>> On Wed, Jul 19, 2017 at 9:53 PM, Julien Grall <julien.gr...@arm.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hi Vijay,
>>>>>
>>>>> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>>>>
>>>>>> Fix coding style, trailing spaces, tabs in NUMA code.
>>>>>> Also drop unused macros and functions.
>>>>>> There is no functional change.
>>>>>>
>>>>>> Signed-off-by: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>>>> Reviewed-by: Wei Liu <wei.l...@citrix.com>
>>>>>> ---
>>>>>> v3: - Change commit message
>>>>>> - Changed VIRTUAL_BUG_ON to ASSERT
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Looking at the commit message you don't mention any renaming...
>>>>>
>>>>>> - Dropped useless inner parentheses for some macros
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> [...]
>>>>>
>>>>>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>>>>>> index 3cf26c2..c0de57b 100644
>>>>>> --- a/xen/include/asm-x86/numa.h
>>>>>> +++ b/xen/include/asm-x86/numa.h
>>>>>> @@ -1,8 +1,11 @@
>>>>>> -#ifndef _ASM_X8664_NUMA_H
>>>>>> +#ifndef _ASM_X8664_NUMA_H
>>>>>>  #define _ASM_X8664_NUMA_H 1
>>>>>>
>>>>>>  #include 
>>>>>>
>>>>>> +#define MAX_NUMNODES    NR_NODES
>>>>>> +#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I don't understand why this suddenly appears in the code when you moved
>>>>> away
>>>>> in patch #1 in xen/numa.h.
>>>>
>>>>
>>>>
>>>> Particularly MAX_NUMNODES required by this header file with this
>>>> patch changes for compilation.
>>>> Though I can include xen/numa.h here but xen/numa.h is including
>>>> asm/numa.h back.
>>>>
>>>> I will add separate patch for this defines movement and drop from
>>>> this patch.
>>>
>>>
>>>
>>> Why adding a separate patch? The code should not have been moved away in
>>> patch #1 as you did.
>>
>>
>> In patch#1 , I have not moved MAX_NUMNODES. It is kept in xen/numa.h file
>> In this patch, when VIRTUAL_BUG_ON is changed to ASSERT, in
>> asm-x86/numa.h,
>> it requires MAX_NUMNODES define. So I have moved it from xen/numa.h to
>> asm-x86/numa.h
>
>
> I am sorry but looked at your patch #1. You moved NR_NODE_MEMBLKS in patch
> #1 from asm-x86/numa.h to xen/numa.h. And then you moved it again here.
>
>>
>> So, I was thinking of  adding small patch to move both MAX_NUMNODES and
>> NR_NODE_MEMBLKS to asm-x86/numa.h
>
>
> Or better, you can do in xen/numa.h:
>
> #define MAX_NUMNODES ...
> #define NR_NODE_...
>
> #include 

But a compilation issue still comes from the code below,
where only asm/numa.h is included.

--- a/xen/include/asm-x86/irq.h
+++ b/xen/include/asm-x86/irq.h
@@ -4,7 +4,7 @@
 /* (C) 1992, 1993 Linus Torvalds, (C) 1997 Ingo Molnar */

 #include 
-#include 
+#include 
 #include 
 #include 
 #include 

>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 02/24] x86: NUMA: Clean up: Fix coding styles and drop unused code

2017-07-20 Thread Vijay Kilari
On Thu, Jul 20, 2017 at 4:30 PM, Julien Grall <julien.gr...@arm.com> wrote:
> Hi Vijay,
>
>
> On 20/07/17 08:00, Vijay Kilari wrote:
>>
>> On Wed, Jul 19, 2017 at 9:53 PM, Julien Grall <julien.gr...@arm.com>
>> wrote:
>>>
>>> Hi Vijay,
>>>
>>> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>>>
>>>>
>>>> From: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>>
>>>> Fix coding style, trailing spaces, tabs in NUMA code.
>>>> Also drop unused macros and functions.
>>>> There is no functional change.
>>>>
>>>> Signed-off-by: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>> Reviewed-by: Wei Liu <wei.l...@citrix.com>
>>>> ---
>>>> v3: - Change commit message
>>>> - Changed VIRTUAL_BUG_ON to ASSERT
>>>
>>>
>>>
>>> Looking at the commit message you don't mention any renaming...
>>>
>>>> - Dropped useless inner parentheses for some macros
>>>
>>>
>>>
>>> [...]
>>>
>>>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>>>> index 3cf26c2..c0de57b 100644
>>>> --- a/xen/include/asm-x86/numa.h
>>>> +++ b/xen/include/asm-x86/numa.h
>>>> @@ -1,8 +1,11 @@
>>>> -#ifndef _ASM_X8664_NUMA_H
>>>> +#ifndef _ASM_X8664_NUMA_H
>>>>  #define _ASM_X8664_NUMA_H 1
>>>>
>>>>  #include 
>>>>
>>>> +#define MAX_NUMNODES    NR_NODES
>>>> +#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>>>
>>>
>>>
>>> I don't understand why this suddenly appears in the code when you moved
>>> away
>>> in patch #1 in xen/numa.h.
>>
>>
>> Particularly MAX_NUMNODES required by this header file with this
>> patch changes for compilation.
>> Though I can include xen/numa.h here but xen/numa.h is including
>> asm/numa.h back.
>>
>> I will add separate patch for this defines movement and drop from
>> this patch.
>
>
> Why adding a separate patch? The code should not have been moved away in
> patch #1 as you did.

In patch #1, I have not moved MAX_NUMNODES; it is kept in xen/numa.h.
In this patch, when VIRTUAL_BUG_ON is changed to ASSERT in asm-x86/numa.h,
it requires the MAX_NUMNODES define, so I have moved it from xen/numa.h to
asm-x86/numa.h.

So I was thinking of adding a small patch to move both MAX_NUMNODES and
NR_NODE_MEMBLKS to asm-x86/numa.h.

And in the code movement patch, I will move them to xen/numa.h along with the
ASSERT code.

>
> But I still don't understand what is the exact error here... If it fails on
> this patch, likely this should have failed after applying patch #1. And
> *all* patch should be able to build without the rest of the series.

Yes, all patches are tested for compilation individually.

>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 09/24] NUMA: x86: Move common code from srat.c

2017-07-20 Thread Vijay Kilari
Hi Julien,

On Thu, Jul 20, 2017 at 4:47 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Move code from xen/arch/x86/srat.c to xen/common/numa.c
>> so that it can be used by other archs.
>>
>> Apart from moving the code the following changes are done
>>  - Coding style of code moved to numa.c is changed to xen style
>>  - {memory,processor}_nodes_parsed are made global and moved
>>to xen/nodemask.h
>>  - Few generic static functions in x86/srat.c are made
>>non-static
>>  - Functions moved from x86/srat.c to common/numa.c are made
>>non-static
>>  - numa_scan_nodes() is made as static function
>>  - compute_memnode_shift() and setup_node_bootmem() are made
>>static.
>
>
> You modify the coding style at the same time as moving the
> code. This makes it quite difficult to make sure that a mistake didn't slip into
> the new code. Can you please divide this patch into smaller chunks (i.e. moving
> code in smaller chunks) to ease the review?

OK. I will do so.

>
> We can think of merging all of them when committing it.
>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 13/24] ARM: NUMA: DT: Parse memory NUMA information

2017-07-20 Thread Vijay Kilari
On Thu, Jul 20, 2017 at 12:09 AM, Julien Grall  wrote:
> Hi Vijay,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Parse memory node and fetch numa-node-id information.
>> For each memory range, store in node_memblk_range[]
>> along with node id.
>>
>> When booting in UEFI mode, UEFI passes memory information
>> to Dom0 using EFI memory descriptor table and deletes the
>> memory nodes from the host DT. However to fetch the memory
>> numa node id, memory DT node should not be deleted by EFI stub.
>> With this patch, do not delete memory node from FDT.
>>
>> NUMA info of memory is extracted from process_memory_node()
>> instead of parsing the DT again during numa_init().
>
>
> This patch does too much and needs to be split. The splitting would be at
> least:
>
> - EFI mode change
> - Numa change

OK

>
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>> v3: - Set numa_off in numa_failed() and drop dt_numa variable
>> ---
>>  xen/arch/arm/bootfdt.c  | 25 +
>>  xen/arch/arm/efi/efi-boot.h | 25 -
>>  xen/arch/arm/numa/dt_numa.c | 32 
>>  xen/arch/arm/numa/numa.c|  5 +
>>  xen/include/asm-arm/numa.h  |  2 ++
>>  5 files changed, 60 insertions(+), 29 deletions(-)
>>
>> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
>> index 6e8251b..b3a132c 100644
>> --- a/xen/arch/arm/bootfdt.c
>> +++ b/xen/arch/arm/bootfdt.c
>> @@ -13,6 +13,8 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include 
>> +#include 
>
>
> Please add the headers in alphabetical order.
>
>>  #include 
>>  #include 
>>
>> @@ -146,6 +148,9 @@ static void __init process_memory_node(const void
>> *fdt, int node,
>>  const __be32 *cell;
>>  paddr_t start, size;
>>  u32 reg_cells = address_cells + size_cells;
>> +#ifdef CONFIG_NUMA
>> +uint32_t nid;
>> +#endif
>>
>>  if ( address_cells < 1 || size_cells < 1 )
>>  {
>> @@ -154,24 +159,36 @@ static void __init process_memory_node(const void
>> *fdt, int node,
>>  return;
>>  }
>>
>> +#ifdef CONFIG_NUMA
>> +nid = device_tree_get_u32(fdt, node, "numa-node-id", NR_NODE_MEMBLKS);
>
>
> Shouldn't you use MAX_NUMNODES rather than NR_NODE_MEMBLKS?
>
> Also, where is the sanity check?

OK
>
>> +#endif
>>  prop = fdt_get_property(fdt, node, "reg", NULL);
>>  if ( !prop )
>>  {
>>  printk("fdt: node `%s': missing `reg' property\n", name);
>> +#ifdef CONFIG_NUMA
>> +   numa_failed();
>
>
> This file is using soft-tab not hard one.
>
>> +#endif
>>  return;
>>  }
>>
>>  cell = (const __be32 *)prop->data;
>>  banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
>>
>> -for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS; i++ )
>> +for ( i = 0; i < banks; i++ )
>>  {
>>  device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
>>  if ( !size )
>>  continue;
>> -bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
>> -bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
>> -bootinfo.mem.nr_banks++;
>> +if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks < NR_MEM_BANKS )
>> +{
>> +bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
>> +bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
>> +bootinfo.mem.nr_banks++;
>> +}
>
>
> This change should be split.
>
>
>> +#ifdef CONFIG_NUMA
>> +dt_numa_process_memory_node(nid, start, size);
>> +#endif
>>  }
>>  }
>>
>> diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
>> index 56de26e..a8bde68 100644
>> --- a/xen/arch/arm/efi/efi-boot.h
>> +++ b/xen/arch/arm/efi/efi-boot.h
>> @@ -194,33 +194,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE
>> *sys_table,
>>  int status;
>>  u32 fdt_val32;
>>  u64 fdt_val64;
>> -int prev;
>>  int num_rsv;
>>
>> -/*
>> - * Delete any memory nodes present.  The EFI memory map is the only
>> - * memory description provided to Xen.
>> - */
>> -prev = 0;
>> -for (;;)
>> -{
>> -const char *type;
>> -int len;
>> -
>> -node = fdt_next_node(fdt, prev, NULL);
>> -if ( node < 0 )
>> -break;
>> -
>> -type = fdt_getprop(fdt, node, "device_type", &len);
>> -if ( type && strncmp(type, "memory", len) == 0 )
>> -{
>> -fdt_del_node(fdt, node);
>> -continue;
>> -}
>> -
>> -prev = node;
>> -}
>> -
>
>
> That chunk should move to the same patch as the EFI check.
>
ok
>
>> /*
>>  * Delete all memory reserve map entries. When booting via UEFI,
>>  * kernel will use the UEFI memory map to find reserved regions.
>> diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
>> index 963bb40..84030e7 100644

Re: [Xen-devel] [RFC PATCH v3 10/24] NUMA: Allow numa initialization with DT

2017-07-20 Thread Vijay Kilari
On Wed, Jul 19, 2017 at 11:28 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> The common code allows numa initialization only when
>> ACPI_NUMA config is enabled. Allow initialization when
>> NUMA config is enabled for DT.
>>
>> In this patch, along with acpi_numa, check for acpi_disabled
>> is added.
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/common/numa.c | 4 +---
>>  1 file changed, 1 insertion(+), 3 deletions(-)
>>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 74c4697..5e985d2 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -324,7 +324,7 @@ static int __init numa_scan_nodes(paddr_t start,
>> paddr_t end)
>>  for ( i = 0; i < MAX_NUMNODES; i++ )
>>  cutoff_node(i, start, end);
>>
>> -if ( acpi_numa <= 0 )
>> +if ( !acpi_disabled && acpi_numa <= 0 )
>
>
> I am struggling to understand this change. Likely you want a similar
> variable for DT to say NUMA is available or this has failed.

Yes, without this check for acpi_disabled, when booting with DT, the check
acpi_numa <= 0 is true and does not allow numa initialization.

>
> This also changes the semantics for x86 quite a bit because you will now
> continue if acpi_disabled and acpi_numa = 0. The code seems to allow it, but
> I don't know if we support it.

Yes, but prior to this patch, x86 assumed that acpi_disabled is
false by checking only for acpi_numa <= 0.

The other solution is to create an arch wrapper and call it from here.

Regards
Vijay
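
A possible shape for such an arch wrapper, purely as an illustration. The hook name arch_numa_info_available() is hypothetical, and using numa_off as the "DT parsing failed" indicator is an assumption based on the v3 changelog of patch 13; the three variables are local stand-ins for the Xen globals.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the Xen globals (assumptions). */
static bool acpi_disabled;   /* true when booting from the device tree */
static int  acpi_numa;       /* <= 0: no usable ACPI NUMA information */
static bool numa_off;        /* set when DT NUMA parsing failed */

/* Hypothetical hook: is firmware-provided NUMA information usable? */
static bool arch_numa_info_available(void)
{
    if ( acpi_disabled )
        return !numa_off;        /* DT boot: rely on the DT parsing result */

    return acpi_numa > 0;        /* ACPI boot: keep the existing check */
}
```

The common code would then bail out of numa_scan_nodes() when the hook returns false, which keeps the x86 behaviour for ACPI boots and gives DT its own failure indicator.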


>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 07/24] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA

2017-07-20 Thread Vijay Kilari
On Tue, Jul 18, 2017 at 11:36 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Right now CONFIG_NUMA is not enabled for ARM and
>> existing code in asm-arm/numa.h is for !CONFIG_NUMA.
>> Hence put this code under #ifndef CONFIG_NUMA.
>>
>> This helps these changes work when CONFIG_NUMA
>> is not enabled. Though CONFIG_NUMA is enabled by default,
>> manually disabling this option is possible and compilation
>> should go through. Hence these changes are kept under
>> !CONFIG_NUMA.
>
>
> This is still not true. It is not possible to disable CONFIG_NUMA from the
> Kconfig unless you hack it (just tried it)...
>
> As I said on v2, if you always enable NUMA why should we add code in Xen
> that get rotten? Either you allow NUMA to be disabled by the user or you
> drop this code.

The reason is that the next patch, #8, which does the code movement, moves
the generic code to the common header file xen/numa.h.
If we don't put these *existing* defines in asm-arm/numa.h under
#ifndef CONFIG_NUMA, compilation fails for ARM.

Is it OK to remove these defines in a separate patch after enabling
the NUMA config at the end of the patch series?

Let me know if you have a better approach.

>
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>> v3: - Dropped NODE_SHIFT define
>> ---
>>  xen/include/asm-arm/numa.h | 2 ++
>>  1 file changed, 2 insertions(+)
>>
>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>> index 53f99af..7f00a36 100644
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -3,6 +3,7 @@
>>
>>  typedef uint8_t nodeid_t;
>>
>> +#ifndef CONFIG_NUMA
>>  /* Fake one node for now. See also node_online_map. */
>>  #define cpu_to_node(cpu) 0
>>  #define node_to_cpumask(node)   (cpu_online_map)
>> @@ -16,6 +17,7 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define node_spanned_pages(nid) (total_pages)
>>  #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
>>  #define __node_distance(a, b) (20)
>> +#endif /* CONFIG_NUMA */
>>
>>  static inline unsigned int arch_get_dma_bitsize(void)
>>  {
>>
>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 12/24] ARM: NUMA: DT: Parse CPU NUMA information

2017-07-20 Thread Vijay Kilari
On Wed, Jul 19, 2017 at 11:56 PM, Julien Grall  wrote:
> Hi,
>
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Parse CPU node and fetch numa-node-id information.
>> For each node-id found, update nodemask_t mask.
>> Refer to Documentation/devicetree/bindings/numa.txt
>> in linux kernel.
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>> v3: - Parse cpu nodes under path /cpus
>> - Move changes to bootfdt.c as separate patch
>> - Set numa_off on dt_numa_init() failure
>> ---
>>  xen/arch/arm/Makefile   |  1 +
>>  xen/arch/arm/numa/Makefile  |  2 ++
>>  xen/arch/arm/numa/dt_numa.c | 77
>> +
>>  xen/arch/arm/numa/numa.c| 48 
>>  xen/arch/arm/setup.c|  4 +++
>>  xen/include/asm-arm/numa.h  | 10 +-
>>  6 files changed, 141 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 49e1fb2..a89be66 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
>>  subdir-y += platforms
>>  subdir-$(CONFIG_ARM_64) += efi
>>  subdir-$(CONFIG_ACPI) += acpi
>> +subdir-$(CONFIG_NUMA) += numa
>>
>>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>  obj-y += bootfdt.init.o
>> diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
>> new file mode 100644
>> index 000..3af3aff
>> --- /dev/null
>> +++ b/xen/arch/arm/numa/Makefile
>> @@ -0,0 +1,2 @@
>> +obj-y += dt_numa.o
>> +obj-y += numa.o
>> diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
>> new file mode 100644
>> index 000..963bb40
>> --- /dev/null
>> +++ b/xen/arch/arm/numa/dt_numa.c
>> @@ -0,0 +1,77 @@
>> +/*
>> + * OF NUMA Parsing support.
>> + *
>> + * Copyright (C) 2015 - 2016 Cavium Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>
>
> Again, this include should not be there as the device tree is not yet
> parsed.

I believe that below code needs this header file.

>
>> +#include 
>> +#include 
>
>
> Again, please order the includes alphabetically...
>
>
>> +
>> +/*
>> + * Even though we connect cpus to numa domains later in SMP
>> + * init, we need to know the node ids now for all cpus.
>> + */
>> +static int __init dt_numa_process_cpu_node(const void *fdt)
>> +{
>> +int node, offset;
>> +uint32_t nid;
>> +
>> +offset = fdt_path_offset(fdt, "/cpus");
>> +if ( offset < 0 )
>> +return -EINVAL;
>> +
>> +node = fdt_first_subnode(fdt, offset);
>> +if ( node == -FDT_ERR_NOTFOUND )
>> +return -EINVAL;
>> +
>> +do {
>> +if ( device_tree_type_matches(fdt, node, "cpu") )
>> +{
>> +nid = device_tree_get_u32(fdt, node, "numa-node-id",
>> MAX_NUMNODES);
>> +if ( nid >= MAX_NUMNODES )
>> +printk(XENLOG_WARNING
>> +   "NUMA: Node id %u exceeds maximum value\n", nid);
>> +else
>> +node_set(nid, processor_nodes_parsed);
>> +}
>> +
>> +offset = node;
>> +node = fdt_next_subnode(fdt, offset);
>> +} while (node != -FDT_ERR_NOTFOUND);
>> +
>> +return 0;
>> +}
>> +
>> +int __init dt_numa_init(void)
>> +{
>> +int ret;
>> +
>> +ret = dt_numa_process_cpu_node((void *)device_tree_flattened);
>> +
>> +return ret;
>
>
> return dt_numa_process_cpu_node();
>
> But I still don't understand why you can't parse the numa node
> directly in bootfdt.c as you do for the memory.

IIRC, I initially faced an issue with this approach. I will look into it again.

>
>
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
>> new file mode 100644
>> index 000..45cc418
>> --- /dev/null
>> +++ b/xen/arch/arm/numa/numa.c
>> @@ -0,0 +1,48 @@
>> +/*
>> + * ARM NUMA Implementation
>> + *
>> + * Copyright (C) 2016 - Cavium Inc.
>> + * Vijaya Kumar K 
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, 

Re: [Xen-devel] [RFC PATCH v3 08/24] NUMA: x86: Move numa code and make it generic

2017-07-20 Thread Vijay Kilari
On Wed, Jul 19, 2017 at 11:11 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Move code from xen/arch/x86/numa.c to xen/common/numa.c
>> so that it can be used by other archs.
>>
>> The following changes are done:
>> - A few generic static functions in x86/numa.c are made
>>   non-static in common/numa.c
>> - The generic contents of header file asm-x86/numa.h
>>   are moved to xen/numa.h.
>> - The header file includes are reordered and externs are
>>   dropped.
>> - Moved acpi_numa from asm-x86/acpi.h to xen/acpi.h
>> - Coding style of code moved to common/numa.c is changed
>>   to Xen style.
>> - numa_add_cpu() and numa_set_node() are moved to the header
>>   file, with inline stubs added for when CONFIG_NUMA
>>   is not enabled, because these functions are called from
>>   generic code without any config check.
>>
>> Also the node_online_map is defined in x86/numa.c for x86
>> and arm/smpboot.c for ARM. For x86 it is moved to x86/smpboot.c
>> If moved to common code the compilation fails because
>> common/numa.c is compiled only when NUMA is enabled.
>
>
> I would much prefer if this patch does one thing: Moving code. The rest
> should be split out to help review and allowing us to easily verify you only
> moved code...

Yes, this patch only moves code, apart from adding inline functions
for numa_add_cpu() and numa_set_node().

>
>> +#define NODE_DATA(nid)  (&(node_data[nid]))
>> +
>> +#define node_start_pfn(nid) NODE_DATA(nid)->node_start_pfn
>> +#define node_spanned_pages(nid) NODE_DATA(nid)->node_spanned_pages
>> +#define node_end_pfn(nid)   NODE_DATA(nid)->node_start_pfn + \
>> + NODE_DATA(nid)->node_spanned_pages
>> +
>> +void numa_add_cpu(int cpu);
>> +void numa_set_node(int cpu, nodeid_t node);
>> +#else
>> +static inline void numa_add_cpu(int cpu) { }
>> +static inline void numa_set_node(int cpu, nodeid_t node) { }
>
>
> I am not sure why you need to define stub at least for numa_set_node... I
> can't see use in non-NUMA code. I will comment about the numa_add_cpu later.

x86 uses it from setup.c. Yes, if we assume that NUMA is always enabled for x86,
I can drop the numa_set_node() inline function.
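
For readers unfamiliar with the pattern, a small sketch of the stub arrangement under discussion. The caller bring_up_cpu() and its counter are hypothetical; the point is only that generic code can call numa_add_cpu() without any #ifdef at the call site when CONFIG_NUMA is not defined.

```c
typedef unsigned char nodeid_t;

#ifdef CONFIG_NUMA
/* Real implementations live in the NUMA code when it is built. */
void numa_add_cpu(int cpu);
void numa_set_node(int cpu, nodeid_t node);
#else
/* Empty stubs: calls compile away when NUMA support is not built. */
static inline void numa_add_cpu(int cpu) { (void)cpu; }
static inline void numa_set_node(int cpu, nodeid_t node)
{
    (void)cpu;
    (void)node;
}
#endif

static int cpus_brought_up;      /* hypothetical bookkeeping for the sketch */

/* Hypothetical generic caller: no config check needed here. */
static void bring_up_cpu(int cpu)
{
    numa_add_cpu(cpu);
    cpus_brought_up++;
}
```

Whether a stub is needed for a given function depends only on whether non-NUMA code actually calls it, which is the question raised about numa_set_node().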

>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 05/24] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs

2017-07-20 Thread Vijay Kilari
On Wed, Jul 19, 2017 at 10:48 PM, Julien Grall <julien.gr...@arm.com> wrote:
>
>
> On 19/07/17 07:40, Vijay Kilari wrote:
>>
>> On Tue, Jul 18, 2017 at 8:59 PM, Wei Liu <wei.l...@citrix.com> wrote:
>>>
>>> On Tue, Jul 18, 2017 at 05:11:27PM +0530, vijay.kil...@gmail.com wrote:
>>>>
>>>> From: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>>
>>>> Add accessors for nodes[] and other static variables and
>>>> use those accessors. These variables are later accessed
>>>> outside the file when the code made generic in later
>>>> patches. However the coding style is not changed.
>>>>
>>>> Signed-off-by: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>> ---
>>>> v3: - Changed accessors parameter from int to unsigned int
>>>> - Updated commit message
>>>> - Fixed wrong indentation
>>>> ---
>>>>  xen/arch/x86/srat.c | 106
>>>> +++-
>>>>  1 file changed, 81 insertions(+), 25 deletions(-)
>>>>
>>>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>>>> index 535c9d7..42cca5a 100644
>>>> --- a/xen/arch/x86/srat.c
>>>> +++ b/xen/arch/x86/srat.c
>>>> @@ -41,6 +41,44 @@ static struct node
>>>> node_memblk_range[NR_NODE_MEMBLKS];
>>>>  static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>>>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
>>>>
>>>> +static struct node *get_numa_node(unsigned int id)
>>>> +{
>>>> + return &nodes[id];
>>>> +}
>>>> +
>>>> +static nodeid_t get_memblk_nodeid(unsigned int id)
>>>> +{
>>>> + return memblk_nodeid[id];
>>>> +}
>>>> +
>>>> +static nodeid_t *get_memblk_nodeid_map(void)
>>>> +{
>>>> + return &memblk_nodeid[0];
>>>> +}
>>>> +
>>>> +static struct node *get_node_memblk_range(unsigned int memblk)
>>>> +{
>>>> + return &node_memblk_range[memblk];
>>>> +}
>>>> +
>>>> +static int get_num_node_memblks(void)
>>>> +{
>>>> + return num_node_memblks;
>>>> +}
>>>
>>>
>>> They should all be inline functions. And maybe at once lift to a header
>>> and add proper prefix since you mention they are going to be used later.
>>
>>
>> Currently these are static variables in x86/srat.c file.
>> In patch #9 I move them to common/numa.c file and make these functions
>> non-static.
>>
>> If I lift them to header file and make inline, then I have to make these
>> as
>> global variables.
>
>
> As I said on v2, I am not sure to understand the usefulness of those
> accessors over global variables...

These are static variables. The accessors are added so that other
(arch-specific) files can access them. Otherwise I have to make them
global variables to use them outside of this file.

I am happy to make them global and make these accessors static inline
as suggested by Wei.
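
The trade-off discussed here can be sketched in a few lines of standalone C. This is only an illustration, not code from the series: the array size is made up and get_memblk_nodeid_checked() is a hypothetical helper. It shows that an accessor without any check does the same job as direct access to a global, and what a checked accessor would add.

```c
#define NR_NODE_MEMBLKS 16      /* illustrative size, not Xen's value */
#define NUMA_NO_NODE    0xFF

typedef unsigned char nodeid_t;

/* Global rather than static, so accessors in a header can reach it. */
nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];

/* Unchecked accessor: behaves exactly like writing memblk_nodeid[id]. */
static inline nodeid_t get_memblk_nodeid(unsigned int id)
{
    return memblk_nodeid[id];
}

/* Hypothetical checked variant: out-of-range ids yield NUMA_NO_NODE. */
static inline nodeid_t get_memblk_nodeid_checked(unsigned int id)
{
    return id < NR_NODE_MEMBLKS ? memblk_nodeid[id] : NUMA_NO_NODE;
}
```

Only the checked variant gives callers something the bare global cannot, which is the point of the objection about accessors without sanity checks.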

>
> You don't have any kind of sanity check, so they would do exactly the same
> job. The global variables would avoid so much churn.
>
> All the more so as you sometimes use globals and at other times static helpers...
>
> Cheers,
>
> --
> Julien Grall

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH v3 01/24] NUMA: Make number of NUMA nodes configurable

2017-07-20 Thread Vijay Kilari
On Wed, Jul 19, 2017 at 9:25 PM, Julien Grall <julien.gr...@arm.com> wrote:
> Hi Vijay,
>
>
> On 19/07/2017 08:00, Vijay Kilari wrote:
>>
>> On Tue, Jul 18, 2017 at 11:25 PM, Julien Grall <julien.gr...@arm.com>
>> wrote:
>>>
>>> Hi,
>>>
>>>
>>> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>>>
>>>>
>>>> From: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>>
>>>> Introduce NR_NODES config option to specify number
>>>> of NUMA nodes supported. By default value is set at
>>>> 64 for x86 and 8 for arm. Dropped NODES_SHIFT macro.
>>>>
>>>> Also move NR_NODE_MEMBLKS from asm-x86/acpi.h to xen/numa.h
>>>>
>>>> Signed-off-by: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>> ---
>>>>  xen/arch/Kconfig   | 7 +++
>>>>  xen/include/asm-x86/acpi.h | 1 -
>>>>  xen/include/asm-x86/numa.h | 2 --
>>>>  xen/include/xen/config.h   | 1 +
>>>>  xen/include/xen/numa.h | 7 ++-
>>>>  5 files changed, 10 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
>>>> index cf0acb7..9c2a4e2 100644
>>>> --- a/xen/arch/Kconfig
>>>> +++ b/xen/arch/Kconfig
>>>> @@ -6,3 +6,10 @@ config NR_CPUS
>>>> default "128" if ARM
>>>> ---help---
>>>>   Specifies the maximum number of physical CPUs which Xen will
>>>> support.
>>>> +
>>>> +config NR_NODES
>>>> +   int "Maximum number of NUMA nodes"
>>>> +   default "64" if X86
>>>> +   default "8" if ARM
>>>
>>>
>>>
>>> 3rd time I am asking it... Why the difference between x86 and ARM?
>>
>>
>> AFAIK, there is currently no ARM platform with more than 8 NUMA nodes.
>> ThunderX has only 2 nodes.
>> So I kept the value low for ARM to avoid unnecessary memory allocation.
>>
>> Do you want me to keep it the same as x86?
>
>
> Well, you say it is for saving memory allocation but you don't give any
> number on how much you can save by reducing the default from 64 to 8...
>
> Looking at it, MAX_NUMNODES is used for some static allocation and also for
> the bitmap nodemask_t.
>
> Because our bitmap is based on unsigned long, you would use the same
> quantity of memory for AArch64, for AArch32 the quantity will be divided by
> two. Still nodemask_t does not seem to be widely used.
>
> In the case of the static allocation, I spot ~40 bytes per NUMA node. So 8
> nodes will use ~320 bytes and 64 nodes ~2560.
>
> NUMA is likely going to be used in server, don't tell me you are 2k short in
> memory? If it is an issue it is better to think how to limit the number of
> static variable rather than putting a low limit here.
>
> For Embedded use case, they will likely want to put the default to 1 but I
> would not worry about them as they are likely going to tweak the Kconfig.

OK. I will set it to 64, the same as x86.
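
The arithmetic behind the estimate above can be written out as a small standalone C sketch. The figure of ~40 bytes per node is the rough number quoted in the review, not a measured value, and the function names are made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Rough figure quoted in the review: ~40 bytes of static data per node. */
#define EST_BYTES_PER_NODE 40u

/* Estimated static footprint for a given maximum node count. */
unsigned int est_static_bytes(unsigned int nr_nodes)
{
    return nr_nodes * EST_BYTES_PER_NODE;
}

/* Bytes taken by a bitmap of nr_nodes bits stored in unsigned longs. */
size_t nodemask_bytes(unsigned int nr_nodes)
{
    size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;
    size_t longs = (nr_nodes + bits_per_long - 1) / bits_per_long;

    return longs * sizeof(unsigned long);
}
```

Going from 8 to 64 nodes costs about 2560 - 320 = 2240 bytes of static data under this estimate. A 64-bit bitmap takes 8 bytes either way on a 64-bit build, and only shrinks to 4 bytes for 8 nodes on a 32-bit build.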

>
>>
>>>
>>> Also, you likely want to set to 1 if NUMA is not enabled.
>>
>>
>> I don't see any dependency of NR_NODES on the NUMA config.
>> So it is always set to the default value, isn't it?
>
>
> Well, what is the point to allow more than 1 node when NUMA is not
> supported?

In that case, I have to make NR_NODES depend on the NUMA config
and define it as 1 when the NUMA config is not set, as below.

diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index b73d459..a5d40f5 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -11,5 +11,6 @@ config NR_NODES
int "Maximum number of NUMA nodes"
+  range 1 254
default "64"
+   depends on NUMA
---help---
  Specifies the maximum number of NUMA nodes which Xen will support.
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index 604fd6d..eede1c4 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -10,6 +10,10 @@ extern int srat_rev;
 extern nodeid_t  cpu_to_node[NR_CPUS];
 extern cpumask_t node_to_cpumask[];

+#ifndef CONFIG_NUMA
+#define NR_NODES 1
+#endif
+
 #define MAX_NUMNODES    NR_NODES
 #define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
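
The compile-time effect of such a fallback can be sketched as follows. This is a standalone illustration rather than the proposed header: it folds the Kconfig mapping and the fallback into one #ifdef, and memblk_nodeid_sketch and the two helper functions are hypothetical names.

```c
/* Neither CONFIG_NUMA nor CONFIG_NR_NODES is defined in this sketch,
 * which models a build with NUMA disabled. */
#ifdef CONFIG_NUMA
#define NR_NODES CONFIG_NR_NODES
#else
#define NR_NODES 1
#endif

#define MAX_NUMNODES    NR_NODES
#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)

/* Stand-in for a table that is statically sized by the node limit. */
unsigned char memblk_nodeid_sketch[NR_NODE_MEMBLKS];

unsigned int max_numnodes(void)
{
    return MAX_NUMNODES;
}

unsigned int nr_memblk_slots(void)
{
    return sizeof(memblk_nodeid_sketch) / sizeof(memblk_nodeid_sketch[0]);
}
```

With NUMA disabled, every table sized by MAX_NUMNODES or NR_NODE_MEMBLKS collapses to one or two entries.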

>
> Not to mention that it is quite confusing for a user to be allowed to set
> the maximum number of nodes if the architecture does not support NUMA...
>
> For instance, this is the case today on ARM because, without this series, we
> don't support NUMA.
>
>
>>
>>>
>>>
>>>

Re: [Xen-devel] [RFC PATCH v3 02/24] x86: NUMA: Clean up: Fix coding styles and drop unused code

2017-07-20 Thread Vijay Kilari
On Wed, Jul 19, 2017 at 9:53 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Fix coding style, trailing spaces, tabs in NUMA code.
>> Also drop unused macros and functions.
>> There is no functional change.
>>
>> Signed-off-by: Vijaya Kumar K 
>> Reviewed-by: Wei Liu 
>> ---
>> v3: - Change commit message
>> - Changed VIRTUAL_BUG_ON to ASSERT
>
>
> Looking at the commit message you don't mention any renaming...
>
>> - Dropped useless inner parentheses for some macros
>
>
> [...]
>
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index 3cf26c2..c0de57b 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -1,8 +1,11 @@
>> -#ifndef _ASM_X8664_NUMA_H
>> +#ifndef _ASM_X8664_NUMA_H
>>  #define _ASM_X8664_NUMA_H 1
>>
>>  #include 
>>
>> +#define MAX_NUMNODES    NR_NODES
>> +#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>
>
> I don't understand why this suddenly appears in the code when you moved away
> in patch #1 in xen/numa.h.

With the changes in this patch, this header file needs MAX_NUMNODES
in order to compile.
I could include xen/numa.h here, but xen/numa.h includes
asm/numa.h back.

I will add a separate patch to move these defines and drop the move
from this patch.

>
> [...]
>
>
>> @@ -57,21 +55,23 @@ struct node_data {
>>
>>  extern struct node_data node_data[];
>>
>> -static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
>> -{
>> -   nodeid_t nid;
>> -   VIRTUAL_BUG_ON((paddr_to_pdx(addr) >> memnode_shift) >=
>> memnodemapsize);
>> -   nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
>> -   VIRTUAL_BUG_ON(nid >= MAX_NUMNODES || !node_data[nid]);
>> -   return nid;
>> -}
>> -
>> -#define NODE_DATA(nid) (&(node_data[nid]))
>> -
>> -#define node_start_pfn(nid)      (NODE_DATA(nid)->node_start_pfn)
>> -#define node_spanned_pages(nid)  (NODE_DATA(nid)->node_spanned_pages)
>> -#define node_end_pfn(nid)   (NODE_DATA(nid)->node_start_pfn + \
>> -NODE_DATA(nid)->node_spanned_pages)
>> +static inline __attribute_pure__ nodeid_t phys_to_nid(paddr_t addr)
>> +{
>> +   nodeid_t nid;
>> +
>> +   ASSERT((paddr_to_pdx(addr) >> memnode_shift) < memnodemapsize);
>> +   nid = memnodemap[paddr_to_pdx(addr) >> memnode_shift];
>> +   ASSERT(nid <= MAX_NUMNODES || !node_data[nid].node_start_pfn);
>> +
>> +   return nid;
>> +}
>> +
>> +#define NODE_DATA(nid)  (&(node_data[nid]))
>
>
> I understand Jan asked to remove the inner parentheses here. And you didn't
> do it. However ...
>
>> +
>> +#define node_start_pfn(nid) NODE_DATA(nid)->node_start_pfn
>> +#define node_spanned_pages(nid) NODE_DATA(nid)->node_spanned_pages
>> +#define node_end_pfn(nid)   NODE_DATA(nid)->node_start_pfn + \
>> + NODE_DATA(nid)->node_spanned_pages
>
>
> ... here it is totally wrong to remove the parentheses. Imagine you do:
>
> node_end_pfn(nid) * 2
>
> This will now turn into
>
> NODE_DATA(nid)->node_start_pfn + NODE_DATA(nid)->node_spanned_pages * 2
>
> The grouping is no longer correct, so only node_spanned_pages is multiplied
> and the computation is wrong. You should keep the outer parentheses
> *everywhere* for safety and remove only the inner ones in NODE_DATA.

OK.
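
The precedence problem described above can be reproduced with a minimal standalone sketch. It is not Xen code: the struct is a simplified stand-in, the values are made up, and the _unsafe/_safe macro names exist only for this illustration.

```c
/* Hypothetical, simplified stand-in for Xen's struct node_data. */
struct node_data {
    unsigned long node_start_pfn;
    unsigned long node_spanned_pages;
};

static struct node_data node_data[2] = {
    { .node_start_pfn = 100, .node_spanned_pages = 50 },
    { .node_start_pfn = 150, .node_spanned_pages = 50 },
};

/* Outer parentheses kept, redundant inner ones dropped. */
#define NODE_DATA(nid) (&node_data[nid])

/* As in the v3 patch: no outer parentheses around the sum. */
#define node_end_pfn_unsafe(nid) NODE_DATA(nid)->node_start_pfn + \
                                 NODE_DATA(nid)->node_spanned_pages

/* As requested in the review: outer parentheses restored. */
#define node_end_pfn_safe(nid) (NODE_DATA(nid)->node_start_pfn + \
                                NODE_DATA(nid)->node_spanned_pages)

/* For node 0 this expands to 100 + 50 * 2: only the last term doubles. */
unsigned long twice_end_unsafe(unsigned int nid)
{
    return node_end_pfn_unsafe(nid) * 2;
}

/* For node 0 this expands to (100 + 50) * 2. */
unsigned long twice_end_safe(unsigned int nid)
{
    return node_end_pfn_safe(nid) * 2;
}
```

The two functions differ only in the macro they use, yet they return different values for the same node.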

>
> This is also more than cosmetics and I think the reviewed-by from Wei should
> have been carried.

OK.

>
>>
>>  extern int valid_numa_range(u64 start, u64 end, nodeid_t node);
>>
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 6bba29e..3bb4afc 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -6,9 +6,6 @@
>>  #define NUMA_NO_NODE 0xFF
>>  #define NUMA_NO_DISTANCE 0xFF
>>
>> -#define MAX_NUMNODES    NR_NODES
>> -#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>> -
>
>
> See my comment above.
>
>>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>>
>>  #define domain_to_node(d) \
>>
>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 04/24] x86: NUMA: Rename and sanitize memnode shift code

2017-07-20 Thread Vijay Kilari
On Wed, Jul 19, 2017 at 10:42 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> memnode_shift variable is changed from int to unsigned int.
>> With this change, compute_memnode_shift() returns error value
>> instead of returning shift value. The memnode_shift is updated inside
>> compute_memnode_shift().
>>
>> Also, following changes are made
>>   - Rename compute_hash_shift to compute_memnode_shift
>>   - Update int to unsigned int for params in extract_lsb_from_nodes()
>>   - Return values of populate_memnodemap() is changed
>
>
> I am not sure I understand the rationale behind changing the return value
> of populate_memnodemap. This likely means a bit more description is needed
> in the commit message.

There is not much rationale behind it. As part of the cleanup, I made the
return values meaningful. Anyway, I will update the commit message.
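
One way to picture the interface change in this patch (an error code is returned and the unsigned shift is stored separately) is the standalone sketch below. It is an assumption-laden simplification, not the real algorithm: the shift is taken to be the lowest set bit across the node start addresses, and both function names are hypothetical.

```c
#include <errno.h>

/* The shift lives in a global, as in the patch description. */
static unsigned int memnode_shift;

/*
 * Hypothetical simplification: take the shift to be the position of the
 * lowest set bit across all node start addresses.  Returns 0 on success
 * or -EINVAL on failure; memnode_shift is written only on success.
 */
int compute_memnode_shift_sketch(const unsigned long *starts, unsigned int n)
{
    unsigned long bits = 0;
    unsigned int i, shift = 0;

    for ( i = 0; i < n; i++ )
        bits |= starts[i];

    if ( bits == 0 )
        return -EINVAL;     /* no nodes, or nothing to derive a shift from */

    while ( !(bits & 1) )
    {
        bits >>= 1;
        shift++;
    }

    memnode_shift = shift;
    return 0;
}

unsigned int get_memnode_shift_sketch(void)
{
    return memnode_shift;
}
```

Because the shift is unsigned, it cannot also carry a negative error value, so the status has to travel through the return value.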

>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 01/24] NUMA: Make number of NUMA nodes configurable

2017-07-19 Thread Vijay Kilari
On Tue, Jul 18, 2017 at 11:25 PM, Julien Grall  wrote:
> Hi,
>
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Introduce NR_NODES config option to specify number
>> of NUMA nodes supported. By default value is set at
>> 64 for x86 and 8 for arm. Dropped NODES_SHIFT macro.
>>
>> Also move NR_NODE_MEMBLKS from asm-x86/acpi.h to xen/numa.h
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/arch/Kconfig   | 7 +++
>>  xen/include/asm-x86/acpi.h | 1 -
>>  xen/include/asm-x86/numa.h | 2 --
>>  xen/include/xen/config.h   | 1 +
>>  xen/include/xen/numa.h | 7 ++-
>>  5 files changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
>> index cf0acb7..9c2a4e2 100644
>> --- a/xen/arch/Kconfig
>> +++ b/xen/arch/Kconfig
>> @@ -6,3 +6,10 @@ config NR_CPUS
>> default "128" if ARM
>> ---help---
>>   Specifies the maximum number of physical CPUs which Xen will
>> support.
>> +
>> +config NR_NODES
>> +   int "Maximum number of NUMA nodes"
>> +   default "64" if X86
>> +   default "8" if ARM
>
>
> 3rd time I am asking it... Why the difference between x86 and ARM?

AFAIK, there is currently no ARM platform with more than 8 NUMA nodes.
ThunderX has only 2 nodes.
So I kept the value low for ARM to avoid unnecessary memory allocation.

Do you want me to keep it the same as x86?

>
> Also, you likely want to set to 1 if NUMA is not enabled.

I don't see any dependency of NR_NODES on the NUMA config.
So it is always set to the default value, isn't it?

>
>
>> +   ---help---
>> + Specifies the maximum number of NUMA nodes which Xen will
>> support.
>> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
>> index 27ecc65..15be784 100644
>> --- a/xen/include/asm-x86/acpi.h
>> +++ b/xen/include/asm-x86/acpi.h
>> @@ -105,7 +105,6 @@ extern void acpi_reserve_bootmem(void);
>>
>>  extern s8 acpi_numa;
>>  extern int acpi_scan_nodes(u64 start, u64 end);
>> -#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>>
>>  #ifdef CONFIG_ACPI_SLEEP
>>
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index bada2c0..3cf26c2 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -3,8 +3,6 @@
>>
>>  #include 
>>
>> -#define NODES_SHIFT 6
>> -
>>  typedef u8 nodeid_t;
>>
>>  extern int srat_rev;
>> diff --git a/xen/include/xen/config.h b/xen/include/xen/config.h
>> index a1d0f97..0f1a029 100644
>> --- a/xen/include/xen/config.h
>> +++ b/xen/include/xen/config.h
>> @@ -81,6 +81,7 @@
>>
>>  /* allow existing code to work with Kconfig variable */
>>  #define NR_CPUS CONFIG_NR_CPUS
>> +#define NR_NODES CONFIG_NR_NODES
>>
>>  #ifndef CONFIG_DEBUG
>>  #define NDEBUG
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 7aef1a8..6bba29e 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -3,14 +3,11 @@
>>
>>  #include 
>>
>> -#ifndef NODES_SHIFT
>> -#define NODES_SHIFT 0
>> -#endif
>> -
>>  #define NUMA_NO_NODE 0xFF
>>  #define NUMA_NO_DISTANCE 0xFF
>>
>> -#define MAX_NUMNODES    (1 << NODES_SHIFT)
>> +#define MAX_NUMNODES    NR_NODES
>> +#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>>
>>  #define vcpu_to_node(v) (cpu_to_node((v)->processor))
>>
>>
>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 08/24] NUMA: x86: Move numa code and make it generic

2017-07-19 Thread Vijay Kilari
On Tue, Jul 18, 2017 at 11:46 PM, Julien Grall  wrote:
> Hi,
>
>
> On 18/07/17 16:29, Wei Liu wrote:
>>
>> On Tue, Jul 18, 2017 at 05:11:30PM +0530, vijay.kil...@gmail.com wrote:
>> [...]
>>>
>>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>>> new file mode 100644
>>> index 000..0381f1b
>>> --- /dev/null
>>> +++ b/xen/common/numa.c
>>> @@ -0,0 +1,487 @@
>>> +/*
>>> + * Common NUMA handling functions for x86 and arm.
>>> + * Original code extracted from arch/x86/numa.c
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms and conditions of the GNU General Public
>>> + * License, version 2, as published by the Free Software Foundation.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>>> + * GNU General Public License for more details.
>>> + *
>>> + * You should have received a copy of the GNU General Public License
>>> + * along with this program; If not, see .
>>> + */
>>> +
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +#include 
>>> +
>>
>>
>> Since you're moving code anyway, please sort the headers alphabetically.
>>
>>> +static int numa_setup(char *s);
>>> +custom_param("numa", numa_setup);
>>> +
>>> +struct node_data node_data[MAX_NUMNODES];
>>> +
>>> +/* Mapping from pdx to node id */
>>
>>
>> Is this comment applicable to ARM? Does arm has PDX?
>
>
> Yes ARM has PDX. For new architecture we expect the code to provide dummy
> helpers if they want to support NUMA.
>
>>
>>> +unsigned int memnode_shift;
>>> +
>>> +/*
>>> + * In case of numa init failure or numa off,
>>> + * memnode_shift is initialized to BITS_PER_LONG - 1. Hence allocate
>>> + * memnodemap[] of BITS_PER_LONG.
>>> + */
>>> +static typeof(*memnodemap) _memnodemap[BITS_PER_LONG];
>>> +unsigned long memnodemapsize;
>>> +uint8_t *memnodemap;
>>> +
>>> +nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
>>> +[0 ... NR_CPUS-1] = NUMA_NO_NODE
>>> +};
>>> +
>>> +cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>>> +
>>> +bool numa_off;
>>> +s8 acpi_numa = 0;
>>> +
>>> +int srat_disabled(void)
>>
>>
>> bool here.
>>
>> Should probably be done in a previous patch.
>
>
> Actually, the previous version had srat_disabled return bool. I am aware
> that Jan and I requested to keep acpi_numa as int, but I didn't find any
> request to move srat_disabled back to int. So can you explain why?

My bad. I dropped patch #4 from v2, but this change was part of patch
#4 and I missed it.

>
>>
>>> +
>>> +void __init numa_init_array(void)
>>> +{
>>> +int rr, i;
>>> +
>>> +/* There are unfortunately some poorly designed mainboards around
>>> +   that only connect memory to a single CPU. This breaks the 1:1
>>> cpu->node
>>> +   mapping. To avoid this fill in the mapping for all possible
>>> +   CPUs, as the number of CPUs is not known yet.
>>> +   We round robin the existing nodes. */
>>
>>
>> Please fix the coding style issue here.
>>
>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v3 05/24] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs

2017-07-19 Thread Vijay Kilari
On Tue, Jul 18, 2017 at 8:59 PM, Wei Liu  wrote:
> On Tue, Jul 18, 2017 at 05:11:27PM +0530, vijay.kil...@gmail.com wrote:
>> From: Vijaya Kumar K 
>>
>> Add accessors for nodes[] and other static variables and
>> use those accessors. These variables are later accessed
>> outside the file when the code made generic in later
>> patches. However the coding style is not changed.
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>> v3: - Changed accessors parameter from int to unsigned int
>> - Updated commit message
>> - Fixed wrong indentation
>> ---
>>  xen/arch/x86/srat.c | 106 
>> +++-
>>  1 file changed, 81 insertions(+), 25 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index 535c9d7..42cca5a 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -41,6 +41,44 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
>>  static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
>>
>> +static struct node *get_numa_node(unsigned int id)
>> +{
>> + return &nodes[id];
>> +}
>> +
>> +static nodeid_t get_memblk_nodeid(unsigned int id)
>> +{
>> + return memblk_nodeid[id];
>> +}
>> +
>> +static nodeid_t *get_memblk_nodeid_map(void)
>> +{
>> + return &memblk_nodeid[0];
>> +}
>> +
>> +static struct node *get_node_memblk_range(unsigned int memblk)
>> +{
>> + return &node_memblk_range[memblk];
>> +}
>> +
>> +static int get_num_node_memblks(void)
>> +{
>> + return num_node_memblks;
>> +}
>
> They should all be inline functions. And maybe at once lift to a header
> and add proper prefix since you mention they are going to be used later.

Currently these are static variables in x86/srat.c file.
In patch #9 I move them to common/numa.c file and make these functions
non-static.

If I lift them to header file and make inline, then I have to make these as
global variables.

Regards
Vijay



Re: [Xen-devel] [RFC PATCH v3 18/24] ACPI: Refactor acpi SRAT and SLIT table handling code

2017-07-19 Thread Vijay Kilari
On Tue, Jul 18, 2017 at 9:06 PM, Wei Liu  wrote:
> On Tue, Jul 18, 2017 at 05:11:40PM +0530, vijay.kil...@gmail.com wrote:
>> From: Vijaya Kumar K 
>>
>> SRAT handling code which is common across architectures
>> is moved from xen/arch/x86/srat.c to the new file
>> xen/drivers/acpi/srat.c. A new header file srat.h is
>> introduced.
>>
>> Other major changes are:
>> - Coding style of code moved is changed.
>> - Moved struct pxm2node from srat.c to srat.h
>> - Dropped {memory,processor}_nodes_parsed from x86/srat.c
>> - Dropped static on node_to_pxm() and moved to beginning of the file.
>> - Made some static functions as non-static
>> - acpi_node_distance() is introduced and called from __node_distance()
>> - Replaced distance constants with LOCAL/REMOTE_DISTANCE defines
>
> It would be nice if you could break these into individual patches.

Ok. I will split.

>
> [...]
>> +
>> +/*
>> + * A lot of BIOS fill in 10 (= no distance) everywhere. This messes
>> + * up the NUMA heuristics which wants the local node to have a smaller
>> + * distance than the others.
>> + * Do some quick checks here and only use the SLIT if it passes.
>> + */
>> +static int __init slit_valid(struct acpi_table_slit *slit)
>> +{
>> +int i, j;
>
> unsigned int

ok
>



Re: [Xen-devel] [RFC PATCH v3 00/24] ARM: Add Xen NUMA support

2017-07-19 Thread Vijay Kilari
On Tue, Jul 18, 2017 at 9:48 PM, Julien Grall  wrote:
> Hi,
>
> On 18/07/17 12:41, vijay.kil...@gmail.com wrote:
>>
>> This patch is tested on Thunderx platform.
>> No changes are made to x86 implementation only code is sanitized
>> and refactored. Hence only compilation tested for x86.
>>
>> This series is posted as RFC for the reason that it is not tested
>> on x86. Request some help from community in testing this series on x86.
>>
>> Code is shared at
>> https://github.com/vijaykilari/xen-numa/commits/rfc_v3
>
>
> Few months ago you sent a patch that was a pre-requisite for booting NUMA
> (see [1]). It has never been upstreamed, so is it still required?

yes, it is required and is merged

https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=c6fdc9696a6a6eac59bf9c81121d1f1cd5b88dcd

>
> Cheers,
>
> [1]
> https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg03823.html
>
> --
> Julien Grall



[Xen-devel] [RFC PATCH v3 23/24] NUMA: Move CONFIG_NUMA to common Kconfig

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

CONFIG_NUMA is defined in xen/drivers/acpi/Kconfig.
Move it to common/Kconfig and enable it by default.
Also, the NUMA feature uses PDX for the physical address to
memory node mapping. Hence make NUMA depend on
HAS_PDX.

Signed-off-by: Vijaya Kumar K 
---
 xen/common/Kconfig   | 4 
 xen/drivers/acpi/Kconfig | 3 ---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index dc8e876..6e421c7 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -41,6 +41,10 @@ config HAS_GDBSX
 config HAS_IOPORTS
bool
 
+config NUMA
+   def_bool y
+   depends on HAS_PDX
+
 config HAS_BUILD_ID
string
option env="XEN_HAS_BUILD_ID"
diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index b64d373..488372f 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -4,6 +4,3 @@ config ACPI
 
 config ACPI_LEGACY_TABLES_LOOKUP
bool
-
-config NUMA
-   bool
-- 
2.7.4




[Xen-devel] [RFC PATCH v3 14/24] ARM: NUMA: DT: Parse NUMA distance information

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Parse distance-matrix and fetch node distance information.
Store distance information in node_distance[].

Register dt_node_distance() function pointer with
the ARM numa code. This approach can be later used for
ACPI.
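
The parsing scheme described above can be sketched as standalone C that works on raw big-endian bytes instead of a real FDT. All names are illustrative and MAX_NUMNODES is fixed at 4 here. Unlike the patch, which ignores trailing cells, this sketch rejects input that is not a whole number of <nodea nodeb distance> triplets.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NUMNODES    4       /* illustrative; Xen derives it from NR_NODES */
#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20

static uint8_t node_distance[MAX_NUMNODES][MAX_NUMNODES];

/* Read one big-endian 32-bit cell from the raw property bytes. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Defaults: 10 on the diagonal, 20 everywhere else. */
static void init_distance(void)
{
    unsigned int i, j;

    for ( i = 0; i < MAX_NUMNODES; i++ )
        for ( j = 0; j < MAX_NUMNODES; j++ )
            node_distance[i][j] = (i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

static int set_distance(uint32_t a, uint32_t b, uint32_t d)
{
    if ( a >= MAX_NUMNODES || b >= MAX_NUMNODES || d > 255 )
        return -1;

    node_distance[a][b] = (uint8_t)d;
    return 0;
}

/* Parse <nodea nodeb distance> triplets; len is the length in bytes. */
int parse_distance_matrix(const uint8_t *data, size_t len)
{
    size_t off;

    /* Assumption of this sketch: only whole 12-byte triplets are accepted. */
    if ( len == 0 || len % 12 != 0 )
        return -1;

    init_distance();

    for ( off = 0; off < len; off += 12 )
    {
        uint32_t a = read_be32(data + off);
        uint32_t b = read_be32(data + off + 4);
        uint32_t d = read_be32(data + off + 8);

        if ( set_distance(a, b, d) )
            return -1;

        /* Default B->A to the A->B value; a later entry may override it. */
        if ( b > a )
            set_distance(b, a, d);
    }

    return 0;
}

uint8_t get_distance(unsigned int a, unsigned int b)
{
    if ( a >= MAX_NUMNODES || b >= MAX_NUMNODES )
        return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;

    return node_distance[a][b];
}
```

Pairs that the matrix never mentions keep the defaults of 10 (local) and 20 (remote).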

Signed-off-by: Vijaya Kumar K 
---
v3: - Moved __node_distance() declaration to common
  header file
- Use device_tree_node_compatible() instead of
  device_tree_node_matches()
- Dropped xen/errno.h inclusion
---
 xen/arch/arm/numa/dt_numa.c | 131 
 xen/arch/arm/numa/numa.c|  22 
 xen/include/asm-arm/numa.h  |   2 +
 xen/include/asm-x86/numa.h  |   1 -
 xen/include/xen/numa.h  |   3 +
 5 files changed, 158 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
index 84030e7..46c0346 100644
--- a/xen/arch/arm/numa/dt_numa.c
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -23,6 +23,48 @@
 #include 
 #include 
 
+static uint8_t node_distance[MAX_NUMNODES][MAX_NUMNODES];
+
+static uint8_t dt_node_distance(nodeid_t nodea, nodeid_t nodeb)
+{
+if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
+return nodea == nodeb ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+
+return node_distance[nodea][nodeb];
+}
+
+static int dt_numa_set_distance(uint32_t nodea, uint32_t nodeb,
+uint32_t distance)
+{
+   /* node_distance is uint8_t. Ensure distance is less than 255 */
+   if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES || distance > 255 )
+   return -EINVAL;
+
+   node_distance[nodea][nodeb] = distance;
+
+   return 0;
+}
+
+void init_dt_numa_distance(void)
+{
+int i, j;
+
+for ( i = 0; i < MAX_NUMNODES; i++ )
+{
+for ( j = 0; j < MAX_NUMNODES; j++ )
+{
+/*
+ * Initialize distance 10 for local distance and
+ * 20 for remote distance.
+ */
+if ( i  == j )
+node_distance[i][j] = LOCAL_DISTANCE;
+else
+node_distance[i][j] = REMOTE_DISTANCE;
+}
+}
+}
+
 /*
  * Even though we connect cpus to numa domains later in SMP
  * init, we need to know the node ids now for all cpus.
@@ -58,6 +100,76 @@ static int __init dt_numa_process_cpu_node(const void *fdt)
 return 0;
 }
 
+static int __init dt_numa_parse_distance_map(const void *fdt, int node,
+ const char *name,
+ uint32_t address_cells,
+ uint32_t size_cells)
+{
+const struct fdt_property *prop;
+const __be32 *matrix;
+int entry_count, len, i;
+
+printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
+
+prop = fdt_get_property(fdt, node, "distance-matrix", &len);
+if ( !prop )
+{
+printk(XENLOG_WARNING
+   "NUMA: No distance-matrix property in distance-map\n");
+
+return -EINVAL;
+}
+
+if ( len % sizeof(uint32_t) != 0 )
+{
+ printk(XENLOG_WARNING
+"distance-matrix in node is not a multiple of u32\n");
+
+return -EINVAL;
+}
+
+entry_count = len / sizeof(uint32_t);
+if ( entry_count <= 0 )
+{
+printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
+
+return -EINVAL;
+}
+
+matrix = (const __be32 *)prop->data;
+for ( i = 0; i + 2 < entry_count; i += 3 )
+{
+uint32_t nodea, nodeb, distance;
+
+nodea = dt_read_number(matrix, 1);
+matrix++;
+nodeb = dt_read_number(matrix, 1);
+matrix++;
+distance = dt_read_number(matrix, 1);
+matrix++;
+
+if ( dt_numa_set_distance(nodea, nodeb, distance) )
+{
+printk(XENLOG_WARNING
+   "NUMA: node-id out of range in distance matrix for [node%d -> node%d]\n",
+   nodea, nodeb);
+return -EINVAL;
+
+}
+printk(XENLOG_INFO "NUMA: distance[node%d -> node%d] = %d\n",
+   nodea, nodeb, distance);
+
+/*
+ * Set default distance of node B->A same as A->B.
+ * No need to check for return value of numa_set_distance.
+ */
+if ( nodeb > nodea )
+dt_numa_set_distance(nodeb, nodea, distance);
+}
+
+return 0;
+}
+
 void __init dt_numa_process_memory_node(uint32_t nid, paddr_t start,
paddr_t size)
 {
@@ -90,11 +202,30 @@ void __init dt_numa_process_memory_node(uint32_t nid, paddr_t start,
 return;
 }
 
+static int __init dt_numa_scan_distance_node(const void *fdt, int node,
+ const char *name, int depth,
+ uint32_t address_cells,
+ uint32_t size_cells, void *data)
+{
+if ( device_tree_node_compatible(fdt, node, 

[Xen-devel] [RFC PATCH v3 24/24] NUMA: Enable ACPI_NUMA config

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Add CONFIG_ACPI_NUMA to xen/drivers/acpi/Kconfig and
drop CONFIG_ACPI_NUMA set in asm-x86/config.h.

Signed-off-by: Vijaya Kumar K 
---
 xen/drivers/acpi/Kconfig | 4 
 xen/include/asm-x86/config.h | 1 -
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index 488372f..8e15428 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -4,3 +4,7 @@ config ACPI
 
 config ACPI_LEGACY_TABLES_LOOKUP
bool
+
+config ACPI_NUMA
+   def_bool y
+   depends on ACPI && NUMA
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index dc424f9..3e3cc36 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -34,7 +34,6 @@
 #define CONFIG_X86_L1_CACHE_SHIFT 7
 
 #define CONFIG_ACPI_SLEEP 1
-#define CONFIG_ACPI_NUMA 1
 #define CONFIG_ACPI_SRAT 1
 #define CONFIG_ACPI_CSTATE 1
 
-- 
2.7.4




[Xen-devel] [RFC PATCH v3 12/24] ARM: NUMA: DT: Parse CPU NUMA information

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Parse CPU node and fetch numa-node-id information.
For each node-id found, update nodemask_t mask.
Refer to Documentation/devicetree/bindings/numa.txt
in linux kernel.
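
The bookkeeping described above (one bit per node id found under /cpus) can be sketched with a plain bitmap. This is a simplified stand-in for nodemask_t with hypothetical names, not the Xen implementation. In the patch below, a CPU node without numa-node-id reads back as MAX_NUMNODES and is therefore rejected in the same way as an out-of-range id.

```c
#include <limits.h>

#define MAX_NUMNODES    64      /* illustrative upper bound */
#define BITS_PER_ULONG  (sizeof(unsigned long) * CHAR_BIT)
#define MASK_LONGS      ((MAX_NUMNODES + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Simplified stand-in for nodemask_t. */
struct nodemask_sketch {
    unsigned long bits[MASK_LONGS];
};

/* Record a node id read from a CPU node; reject ids that are out of range. */
int mask_set_node(struct nodemask_sketch *m, unsigned int nid)
{
    if ( nid >= MAX_NUMNODES )
        return -1;

    m->bits[nid / BITS_PER_ULONG] |= 1UL << (nid % BITS_PER_ULONG);
    return 0;
}

/* Return 1 if nid has been recorded, 0 otherwise. */
int mask_test_node(const struct nodemask_sketch *m, unsigned int nid)
{
    if ( nid >= MAX_NUMNODES )
        return 0;

    return (m->bits[nid / BITS_PER_ULONG] >> (nid % BITS_PER_ULONG)) & 1UL;
}
```

Setting the same node twice is harmless, which matters because several CPUs normally share one node id.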

Signed-off-by: Vijaya Kumar K 
---
v3: - Parse cpu nodes under path /cpus
- Move changes to bootfdt.c as separate patch
- Set numa_off on dt_numa_init() failure
---
 xen/arch/arm/Makefile   |  1 +
 xen/arch/arm/numa/Makefile  |  2 ++
 xen/arch/arm/numa/dt_numa.c | 77 +
 xen/arch/arm/numa/numa.c| 48 
 xen/arch/arm/setup.c|  4 +++
 xen/include/asm-arm/numa.h  | 10 +-
 6 files changed, 141 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 49e1fb2..a89be66 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
 subdir-y += platforms
 subdir-$(CONFIG_ARM_64) += efi
 subdir-$(CONFIG_ACPI) += acpi
+subdir-$(CONFIG_NUMA) += numa
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.init.o
diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
new file mode 100644
index 000..3af3aff
--- /dev/null
+++ b/xen/arch/arm/numa/Makefile
@@ -0,0 +1,2 @@
+obj-y += dt_numa.o
+obj-y += numa.o
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
new file mode 100644
index 000..963bb40
--- /dev/null
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -0,0 +1,77 @@
+/*
+ * OF NUMA Parsing support.
+ *
+ * Copyright (C) 2015 - 2016 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+ * Even though we connect cpus to numa domains later in SMP
+ * init, we need to know the node ids now for all cpus.
+ */
+static int __init dt_numa_process_cpu_node(const void *fdt)
+{
+int node, offset;
+uint32_t nid;
+
+offset = fdt_path_offset(fdt, "/cpus");
+if ( offset < 0 )
+return -EINVAL;
+
+node = fdt_first_subnode(fdt, offset);
+if ( node == -FDT_ERR_NOTFOUND )
+return -EINVAL;
+
+do {
+if ( device_tree_type_matches(fdt, node, "cpu") )
+{
+nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+if ( nid >= MAX_NUMNODES )
+printk(XENLOG_WARNING
+   "NUMA: Node id %u exceeds maximum value\n", nid);
+else
+node_set(nid, processor_nodes_parsed);
+}
+
+offset = node;
+node = fdt_next_subnode(fdt, offset);
+} while (node != -FDT_ERR_NOTFOUND);
+
+return 0;
+}
+
+int __init dt_numa_init(void)
+{
+int ret;
+
+ret = dt_numa_process_cpu_node((void *)device_tree_flattened);
+
+return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
new file mode 100644
index 000..45cc418
--- /dev/null
+++ b/xen/arch/arm/numa/numa.c
@@ -0,0 +1,48 @@
+/*
+ * ARM NUMA Implementation
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K 
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms and conditions of the GNU General Public
+ * License, version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+
+void __init numa_init(void)
+{
+int ret = 0;
+
+nodes_clear(processor_nodes_parsed);
+if ( numa_off )
+goto no_numa;
+
+ret = dt_numa_init();
+if ( ret )
+{
+numa_off = true;
+printk(XENLOG_WARNING "DT NUMA init failed\n");
+}
+
+no_numa:
+return;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 3b34855..a6d1499 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 

[Xen-devel] [RFC PATCH v3 09/24] NUMA: x86: Move common code from srat.c

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Move code from xen/arch/x86/srat.c to xen/common/numa.c
so that it can be used by other archs.

Apart from moving the code, the following changes are made:
 - Coding style of the code moved to numa.c is changed to Xen style
 - {memory,processor}_nodes_parsed are made global and declared
   in xen/nodemask.h
 - A few generic static functions in x86/srat.c are made
   non-static
 - Functions moved from x86/srat.c to common/numa.c are made
   non-static
 - numa_scan_nodes() is made static
 - compute_memnode_shift() and setup_node_bootmem() are made
   static

{memory,processor}_nodes_parsed are made global because they are
used across multiple source files, and adding helpers to access
these nodemask_t variables would be complex.

Signed-off-by: Vijaya Kumar K 
---
v3: - Move declaration of {memory,processor}_nodes_parsed to header
  file
- Drop redundant get_memblk() declaration
- numa_scan_nodes(), setup_node_bootmem(), compute_memnode_shift()
  are made as static function
---
 xen/arch/x86/srat.c| 151 +
 xen/common/numa.c  | 165 +++--
 xen/include/asm-x86/acpi.h |   2 -
 xen/include/asm-x86/numa.h |   2 -
 xen/include/xen/nodemask.h |   2 +
 xen/include/xen/numa.h |  13 ++--
 6 files changed, 174 insertions(+), 161 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 03bc37d..be2634a 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -23,10 +23,6 @@
 
 static struct acpi_table_slit *__read_mostly acpi_slit;
 
-static nodemask_t __initdata memory_nodes_parsed;
-static nodemask_t __initdata processor_nodes_parsed;
-static struct node __initdata nodes[MAX_NUMNODES];
-
 struct pxm2node {
unsigned int pxm;
nodeid_t node;
@@ -36,49 +32,8 @@ static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
 
 static unsigned int node_to_pxm(nodeid_t n);
 
-static int num_node_memblks;
-static struct node node_memblk_range[NR_NODE_MEMBLKS];
-static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
-static struct node *get_numa_node(unsigned int id)
-{
-   return &nodes[id];
-}
-
-static nodeid_t get_memblk_nodeid(unsigned int id)
-{
-   return memblk_nodeid[id];
-}
-
-static nodeid_t *get_memblk_nodeid_map(void)
-{
-   return &memblk_nodeid[0];
-}
-
-static struct node *get_node_memblk_range(unsigned int memblk)
-{
-   return &node_memblk_range[memblk];
-}
-
-static int get_num_node_memblks(void)
-{
-   return num_node_memblks;
-}
-
-static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
-{
-   if (nodeid >= NR_NODE_MEMBLKS)
-   return -EINVAL;
-
-   node_memblk_range[num_node_memblks].start = start;
-   node_memblk_range[num_node_memblks].end = start + size;
-   memblk_nodeid[num_node_memblks] = nodeid;
-   num_node_memblks++;
-
-   return 0;
-}
-
 static inline bool node_found(unsigned int idx, unsigned int pxm)
 {
return ((pxm2node[idx].pxm == pxm) &&
@@ -149,54 +104,7 @@ nodeid_t acpi_setup_node(unsigned int pxm)
return node;
 }
 
-int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
-{
-   int i;
-
-   for (i = 0; i < get_num_node_memblks(); i++) {
-   struct node *nd = get_node_memblk_range(i);
-
-   if (nd->start <= start && nd->end > end &&
-   get_memblk_nodeid(i) == node)
-   return 1;
-   }
-
-   return 0;
-}
-
-static int __init conflicting_memblks(paddr_t start, paddr_t end)
-{
-   int i;
-
-   for (i = 0; i < get_num_node_memblks(); i++) {
-   struct node *nd = get_node_memblk_range(i);
-   if (nd->start == nd->end)
-   continue;
-   if (nd->end > start && nd->start < end)
-   return i;
-   if (nd->end == end && nd->start == start)
-   return i;
-   }
-   return -1;
-}
-
-static void __init cutoff_node(nodeid_t i, paddr_t start, paddr_t end)
-{
-   struct node *nd = get_numa_node(i);
-
-   if (nd->start < start) {
-   nd->start = start;
-   if (nd->end < nd->start)
-   nd->start = nd->end;
-   }
-   if (nd->end > end) {
-   nd->end = end;
-   if (nd->start > nd->end)
-   nd->start = nd->end;
-   }
-}
-
-static void __init numa_failed(void)
+void __init numa_failed(void)
 {
int i;
printk(KERN_ERR "SRAT: SRAT not used.\n");
@@ -412,7 +320,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 
 /* Sanity check to catch more bad SRATs (they are amazingly common).
Make sure the PXMs cover all memory. */
-static bool __init arch_sanitize_nodes_memory(void)
+bool __init 

[Xen-devel] [RFC PATCH v3 16/24] ARM: NUMA: Add memory NUMA support

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Implement arch_sanitize_nodes_memory(), which looks at all banks
in bootinfo.mem and updates nodes[] with the corresponding node id.
Call the generic numa_scan_nodes() function with the RAM start and
end addresses; it takes care of computing memnodeshift and
populating memnodemap[] using the generic implementation.
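
As a rough illustration of the per-node range update described above, the
following standalone sketch grows one [start, end) span per node so that it
covers every memory bank assigned to that node. It is not the code from this
patch: struct bank, struct span, MAX_NODES and merge_banks() are hypothetical
names, and the node id is assumed to be stored with each bank rather than
looked up through get_mem_nodeid().

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 4                     /* hypothetical limit for this sketch */

struct span { uint64_t start, end; };   /* half-open range [start, end) */
struct bank { uint64_t start, size; int nid; };

/*
 * Grow each node's span so that it covers every bank assigned to it.
 * Returns false if a bank carries an out-of-range node id.
 */
static bool merge_banks(const struct bank *banks, unsigned int nr,
                        struct span spans[MAX_NODES])
{
    bool seen[MAX_NODES] = { false };
    unsigned int i;

    for ( i = 0; i < nr; i++ )
    {
        uint64_t start = banks[i].start;
        uint64_t end = start + banks[i].size;
        int nid = banks[i].nid;

        if ( nid < 0 || nid >= MAX_NODES )
            return false;

        if ( !seen[nid] )
        {
            /* First bank seen for this node: take its range as-is. */
            seen[nid] = true;
            spans[nid].start = start;
            spans[nid].end = end;
        }
        else
        {
            /* Later banks only ever widen the span. */
            if ( start < spans[nid].start )
                spans[nid].start = start;
            if ( end > spans[nid].end )
                spans[nid].end = end;
        }
    }

    return true;
}
```

Note that a span built this way may also cover holes between the banks of a
node; the sketch makes no attempt to detect spans of different nodes that
overlap.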

Signed-off-by: Vijaya Kumar K 
---
v3: - Dropped common code from asm-arm/numa.h
- Re-used numa_initmem_init() from common code.
---
 xen/arch/arm/numa/numa.c | 77 +++-
 xen/common/numa.c| 14 +
 xen/include/xen/numa.h   |  1 +
 3 files changed, 91 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index dc80aa5..85352dc 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
@@ -64,9 +65,66 @@ void register_node_distance(uint8_t (fn)(nodeid_t a, nodeid_t b))
 node_distance_fn = fn;
 }
 
+bool __init arch_sanitize_nodes_memory(void)
+{
+nodemask_t mem_nodes_parsed;
+int bank, nodeid;
+struct node *nd;
+paddr_t start, size, end;
+
+nodes_clear(mem_nodes_parsed);
+for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+{
+start = bootinfo.mem.bank[bank].start;
+size = bootinfo.mem.bank[bank].size;
+end = start + size;
+
+nodeid = get_mem_nodeid(start, end);
+if ( nodeid >= NUMA_NO_NODE )
+{
+printk(XENLOG_WARNING
+   "NUMA: node for mem bank start 0x%lx - 0x%lx not found\n",
+   start, end);
+
+return false;
+}
+
+nd = get_numa_node(nodeid);
+if ( !node_test_and_set(nodeid, mem_nodes_parsed) )
+{
+nd->start = start;
+nd->end = end;
+}
+else
+{
+if ( start < nd->start )
+nd->start = start;
+if ( nd->end < end )
+nd->end = end;
+}
+}
+
+return true;
+}
+
+static void __init numa_reset_numa_nodes(void)
+{
+int i;
+struct node *nd;
+
+for ( i = 0; i < MAX_NUMNODES; i++ )
+{
+nd = get_numa_node(i);
+nd->start = 0;
+nd->end = 0;
+}
+}
+
 void __init numa_init(void)
 {
-int ret = 0;
+int ret = 0, bank;
+paddr_t ram_start = ~0;
+paddr_t ram_end = 0;
 
 nodes_clear(processor_nodes_parsed);
 init_cpu_to_node();
@@ -83,6 +141,23 @@ void __init numa_init(void)
 }
 
 no_numa:
+for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+{
+paddr_t bank_start = bootinfo.mem.bank[bank].start;
+paddr_t bank_end = bank_start + bootinfo.mem.bank[bank].size;
+
+ram_start = min(ram_start, bank_start);
+ram_end = max(ram_end, bank_end);
+}
+
+/*
+ * In arch_sanitize_nodes_memory() we update nodes[] properly.
+ * Hence we reset the nodes[] before calling numa_scan_nodes().
+ */
+numa_reset_numa_nodes();
+
+numa_initmem_init(PFN_UP(ram_start), PFN_DOWN(ram_end));
+
 return;
 }
 
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 5e985d2..0f79a07 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -76,6 +76,20 @@ nodeid_t get_memblk_nodeid(unsigned int id)
 return memblk_nodeid[id];
 }
 
+int __init get_mem_nodeid(paddr_t start, paddr_t end)
+{
+unsigned int i;
+
+for ( i = 0; i < get_num_node_memblks(); i++ )
+{
+if ( start >= node_memblk_range[i].start &&
+ end <= node_memblk_range[i].end )
+return memblk_nodeid[i];
+}
+
+return -EINVAL;
+}
+
 static nodeid_t *get_memblk_nodeid_map(void)
 {
 return &memblk_nodeid[0];
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 8a306e7..a541eb7 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -70,6 +70,7 @@ struct node *get_numa_node(unsigned int id);
 nodeid_t get_memblk_nodeid(unsigned int memblk);
 struct node *get_node_memblk_range(unsigned int memblk);
 int numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size);
+int get_mem_nodeid(paddr_t start, paddr_t end);
 int get_num_node_memblks(void);
 bool arch_sanitize_nodes_memory(void);
 void numa_failed(void);
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v3 21/24] ARM: NUMA: ACPI: Extract proximity from SRAT table

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Register an SRAT entry handler for type
ACPI_SRAT_TYPE_GICC_AFFINITY to parse the SRAT table
and extract the proximity domain for all CPU IDs.
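
The lookup chain behind this handler can be pictured with a small standalone
sketch: an affinity entry names a processor UID, the UID is resolved to a
hardware id through the MADT-derived table, and the hardware id is then
recorded against a node. All names here (madt_entry, node_entry,
add_affinity(), node_of_hwid()) are hypothetical, and the node is assumed to
have been derived from the proximity domain already.

```c
#include <stdint.h>

#define MAX_ENTRIES   8            /* hypothetical table size */
#define NO_NODE       0xffu        /* stand-in for NUMA_NO_NODE */
#define HWID_INVALID  UINT64_MAX   /* stand-in for MPIDR_INVALID */

struct madt_entry { uint32_t uid; uint64_t hwid; };
struct node_entry { unsigned int node; uint64_t hwid; };

static struct node_entry node_map[MAX_ENTRIES];
static unsigned int node_map_len;

/* Resolve a processor UID to the hardware id recorded from the MADT. */
static uint64_t find_hwid(const struct madt_entry *madt, unsigned int n,
                          uint32_t uid)
{
    unsigned int i;

    for ( i = 0; i < n; i++ )
        if ( madt[i].uid == uid )
            return madt[i].hwid;

    return HWID_INVALID;
}

/* Handle one affinity entry; returns 0 on success, -1 if it is rejected. */
static int add_affinity(const struct madt_entry *madt, unsigned int n,
                        uint32_t uid, unsigned int node)
{
    uint64_t hwid;

    if ( node_map_len >= MAX_ENTRIES )
        return -1;

    hwid = find_hwid(madt, n, uid);
    if ( hwid == HWID_INVALID )
        return -1;

    node_map[node_map_len].node = node;
    node_map[node_map_len].hwid = hwid;
    node_map_len++;

    return 0;
}

/* Look up the node recorded for a hardware id. */
static unsigned int node_of_hwid(uint64_t hwid)
{
    unsigned int i;

    for ( i = 0; i < node_map_len; i++ )
        if ( node_map[i].hwid == hwid )
            return node_map[i].node;

    return NO_NODE;
}
```

The linear searches keep the sketch short; they are only reasonable for small
tables that are consulted during boot.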

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/acpi/boot.c  |   2 +
 xen/arch/arm/numa/acpi_numa.c | 124 +-
 xen/drivers/acpi/numa.c   |  15 +
 xen/include/acpi/actbl1.h |  17 +-
 xen/include/asm-arm/numa.h|   9 +++
 5 files changed, 165 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
index 889208a..4e28b16 100644
--- a/xen/arch/arm/acpi/boot.c
+++ b/xen/arch/arm/acpi/boot.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -117,6 +118,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 return;
 }
 
+numa_set_cpu_node(enabled_cpus, acpi_get_nodeid(mpidr));
 /* map the logical cpu id to cpu MPIDR */
 cpu_logical_map(enabled_cpus) = mpidr;
 
diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 341e20b7..95617f9 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -34,13 +34,63 @@ struct cpuid_to_hwid {
 uint64_t hwid;
 };
 
+/* Holds NODE to MPIDR mapping. */
+struct node_to_hwid {
+nodeid_t nodeid;
+uint64_t hwid;
+};
+
 #define PHYS_CPUID_INVALID 0xff
 
 /* Holds mapping of CPU id to MPIDR read from MADT */
 static struct cpuid_to_hwid __read_mostly cpuid_to_hwid_map[NR_CPUS] =
 { [0 ... NR_CPUS - 1] = {PHYS_CPUID_INVALID, MPIDR_INVALID} };
+static struct node_to_hwid __read_mostly node_to_hwid_map[NR_CPUS] =
+{ [0 ... NR_CPUS - 1] = {NUMA_NO_NODE, MPIDR_INVALID} };
+static unsigned int cpus_in_srat;
 static unsigned int num_cpuid_to_hwid;
 
+nodeid_t __init acpi_get_nodeid(uint64_t hwid)
+{
+unsigned int i;
+
+for ( i = 0; i < cpus_in_srat; i++ )
+{
+if ( node_to_hwid_map[i].hwid == hwid )
+return node_to_hwid_map[i].nodeid;
+}
+
+return NUMA_NO_NODE;
+}
+
+static uint64_t acpi_get_cpu_hwid(int cid)
+{
+unsigned int i;
+
+for ( i = 0; i < num_cpuid_to_hwid; i++ )
+{
+if ( cpuid_to_hwid_map[i].cpuid == cid )
+return cpuid_to_hwid_map[i].hwid;
+}
+
+return MPIDR_INVALID;
+}
+
+static void __init acpi_map_node_to_hwid(nodeid_t nodeid, uint64_t hwid)
+{
+if ( nodeid >= MAX_NUMNODES )
+{
+printk(XENLOG_WARNING
+   "ACPI: NUMA: nodeid out of range %d with MPIDR 0x%lx\n",
+   nodeid, hwid);
+numa_failed();
+return;
+}
+
+node_to_hwid_map[cpus_in_srat].nodeid = nodeid;
+node_to_hwid_map[cpus_in_srat].hwid = hwid;
+}
+
 static void __init acpi_map_cpu_to_hwid(uint32_t cpuid, uint64_t mpidr)
 {
 if ( mpidr == MPIDR_INVALID )
@@ -76,15 +126,87 @@ static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
 return 0;
 }
 
+/* Callback for Proximity Domain -> ACPI processor UID mapping */
+static void __init
+acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
+{
+int pxm, node;
+uint64_t mpidr;
+
+if ( srat_disabled() )
+return;
+
+if ( pa->header.length < sizeof(struct acpi_srat_gicc_affinity) )
+{
+printk(XENLOG_WARNING "SRAT: Invalid SRAT header length: %d\n",
+   pa->header.length);
+numa_failed();
+return;
+}
+
+if ( !(pa->flags & ACPI_SRAT_GICC_ENABLED) )
+return;
+
+if ( cpus_in_srat >= NR_CPUS )
+{
+printk(XENLOG_ERR
+   "SRAT: cpu_to_node_map[%d] is too small to fit all cpus\n",
+   NR_CPUS);
+return;
+}
+
+pxm = pa->proximity_domain;
+node = acpi_setup_node(pxm);
+if ( node == NUMA_NO_NODE )
+{
+numa_failed();
+return;
+}
+
+mpidr = acpi_get_cpu_hwid(pa->acpi_processor_uid);
+if ( mpidr == MPIDR_INVALID )
+{
+printk(XENLOG_ERR
+   "SRAT: PXM %d with ACPI ID %d has no valid MPIDR in MADT\n",
+   pxm, pa->acpi_processor_uid);
+numa_failed();
+return;
+}
+
+acpi_map_node_to_hwid(node, mpidr);
+node_set(node, processor_nodes_parsed);
+cpus_in_srat++;
+acpi_numa = 1;
+printk(XENLOG_INFO "SRAT: PXM %d -> MPIDR 0x%lx -> Node %d\n",
+   pxm, mpidr, node);
+}
+
 void __init acpi_map_uid_to_mpidr(void)
 {
 acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
 acpi_parse_madt_handler, NR_CPUS);
 }
 
+static int __init
+acpi_parse_gicc_affinity(struct acpi_subtable_header *header,
+ const unsigned long end)
+{
+   const struct acpi_srat_gicc_affinity *processor_affinity
+= (struct acpi_srat_gicc_affinity *)header;
+
+   if (!processor_affinity)
+   return -EINVAL;
+
+   acpi_table_print_srat_entry(header);
+   

[Xen-devel] [RFC PATCH v3 20/24] ACPI: Move arch specific SRAT parsing

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

SRAT's X2APIC_CPU_AFFINITY and CPU_AFFINITY types are not used
by ARM. Hence move the handling of these SRAT types to an
arch-specific file and handle them under arch_table_parse_srat().

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/numa/acpi_numa.c |  5 +
 xen/arch/x86/srat.c   | 44 +++
 xen/drivers/acpi/numa.c   | 43 ++
 xen/include/xen/acpi.h|  6 ++
 4 files changed, 57 insertions(+), 41 deletions(-)

diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index d9ad547..341e20b7 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -82,6 +82,11 @@ void __init acpi_map_uid_to_mpidr(void)
 acpi_parse_madt_handler, NR_CPUS);
 }
 
+void __init arch_table_parse_srat(void)
+{
+return;
+}
+
 void __init acpi_numa_arch_fixup(void) {}
 
 /*
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index d5caccf..a5fdedd 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -205,3 +205,47 @@ uint8_t __node_distance(nodeid_t a, nodeid_t b)
 }
 
 EXPORT_SYMBOL(__node_distance);
+
+static int __init
+acpi_parse_x2apic_affinity(struct acpi_subtable_header *header,
+  const unsigned long end)
+{
+   const struct acpi_srat_x2apic_cpu_affinity *processor_affinity
+   = container_of(header, struct acpi_srat_x2apic_cpu_affinity,
+  header);
+
+   if (!header)
+   return -EINVAL;
+
+   acpi_table_print_srat_entry(header);
+
+   /* let architecture-dependent part to do it */
+   acpi_numa_x2apic_affinity_init(processor_affinity);
+
+   return 0;
+}
+
+static int __init
+acpi_parse_processor_affinity(struct acpi_subtable_header *header,
+ const unsigned long end)
+{
+   const struct acpi_srat_cpu_affinity *processor_affinity
+   = container_of(header, struct acpi_srat_cpu_affinity, header);
+
+   if (!header)
+   return -EINVAL;
+
+   acpi_table_print_srat_entry(header);
+
+   acpi_numa_processor_affinity_init(processor_affinity);
+
+   return 0;
+}
+
+void __init arch_table_parse_srat(void)
+{
+   acpi_table_parse_srat(ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY,
+ acpi_parse_x2apic_affinity, 0);
+   acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
+ acpi_parse_processor_affinity, 0);
+}
diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
index 85f8917..0adc32c 100644
--- a/xen/drivers/acpi/numa.c
+++ b/xen/drivers/acpi/numa.c
@@ -120,43 +120,6 @@ static int __init acpi_parse_slit(struct acpi_table_header *table)
 }
 
 static int __init
-acpi_parse_x2apic_affinity(struct acpi_subtable_header *header,
-  const unsigned long end)
-{
-   const struct acpi_srat_x2apic_cpu_affinity *processor_affinity
-   = container_of(header, struct acpi_srat_x2apic_cpu_affinity,
-  header);
-
-   if (!header)
-   return -EINVAL;
-
-   acpi_table_print_srat_entry(header);
-
-   /* let architecture-dependent part to do it */
-   acpi_numa_x2apic_affinity_init(processor_affinity);
-
-   return 0;
-}
-
-static int __init
-acpi_parse_processor_affinity(struct acpi_subtable_header *header,
- const unsigned long end)
-{
-   const struct acpi_srat_cpu_affinity *processor_affinity
-   = container_of(header, struct acpi_srat_cpu_affinity, header);
-
-   if (!header)
-   return -EINVAL;
-
-   acpi_table_print_srat_entry(header);
-
-   /* let architecture-dependent part to do it */
-   acpi_numa_processor_affinity_init(processor_affinity);
-
-   return 0;
-}
-
-static int __init
 acpi_parse_memory_affinity(struct acpi_subtable_header *header,
   const unsigned long end)
 {
@@ -197,13 +160,11 @@ int __init acpi_numa_init(void)
 {
/* SRAT: Static Resource Affinity Table */
if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
-   acpi_table_parse_srat(ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY,
- acpi_parse_x2apic_affinity, 0);
-   acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
- acpi_parse_processor_affinity, 0);
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
  acpi_parse_memory_affinity,
  NR_NODE_MEMBLKS);
+   /* This call handles architecture dependent SRAT */
+   arch_table_parse_srat();
}
 
/* SLIT: System Locality Information Table */
diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
index 9409350..53795ff 100644

[Xen-devel] [RFC PATCH v3 22/24] ARM: NUMA: Initialize ACPI NUMA

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Call ACPI NUMA initialization under CONFIG_ACPI_NUMA.
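
The selection order that the numa_init() change in this patch appears to
implement can be sketched as a small standalone function: nothing is parsed
when NUMA is switched off, ACPI is used when it is enabled, and the device
tree is used only when ACPI is disabled. The names numa_pick_source(),
init_ok() and init_fail() are hypothetical, and the sketch assumes
CONFIG_ACPI_NUMA is set.

```c
#include <stdbool.h>

enum numa_source { NUMA_NONE, NUMA_FROM_ACPI, NUMA_FROM_DT };

/*
 * Decide which firmware description supplies the NUMA data.
 * acpi_init and dt_init return 0 on success and non-zero on failure.
 */
static enum numa_source numa_pick_source(bool numa_off, bool acpi_disabled,
                                         int (*acpi_init)(void),
                                         int (*dt_init)(void))
{
    if ( numa_off )
        return NUMA_NONE;

    /* With ACPI enabled, a failed ACPI parse does not fall back to DT. */
    if ( !acpi_disabled )
        return acpi_init() ? NUMA_NONE : NUMA_FROM_ACPI;

    return dt_init() ? NUMA_NONE : NUMA_FROM_DT;
}

/* Trivial stand-ins used only to exercise the sketch. */
static int init_ok(void)   { return 0; }
static int init_fail(void) { return 1; }
```

For example, numa_pick_source(false, true, init_fail, init_ok) yields
NUMA_FROM_DT, because the ACPI path is never attempted when ACPI is disabled.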

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/numa/acpi_numa.c | 27 ++-
 xen/arch/arm/numa/numa.c  | 15 +--
 xen/common/numa.c | 14 ++
 xen/include/asm-arm/numa.h|  1 +
 xen/include/xen/numa.h|  1 +
 5 files changed, 55 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 95617f9..68fff95 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -181,7 +181,7 @@ acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
pxm, mpidr, node);
 }
 
-void __init acpi_map_uid_to_mpidr(void)
+static void __init acpi_map_uid_to_mpidr(void)
 {
 acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
 acpi_parse_madt_handler, NR_CPUS);
@@ -209,6 +209,31 @@ void __init arch_table_parse_srat(void)
   acpi_parse_gicc_affinity, NR_CPUS);
 }
 
+bool_t __init arch_acpi_numa_init(void)
+{
+int ret;
+
+if ( !acpi_disabled )
+{
+/*
+ * If firmware has DT, process_memory_node() call
+ * would have added memory blocks. So reset it before
+ * ACPI numa init.
+ */
+numa_clear_memblks();
+nodes_clear(memory_nodes_parsed);
+acpi_map_uid_to_mpidr();
+ret = acpi_numa_init();
+if ( ret || srat_disabled() )
+return 1;
+
+/* Register acpi node_distance handler */
+register_node_distance(_node_distance);
+}
+
+return 0;
+}
+
 void __init acpi_numa_arch_fixup(void) {}
 
 /*
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 26aa4c0..68599c4 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -139,11 +139,22 @@ void __init numa_init(void)
 if ( numa_off )
 goto no_numa;
 
-ret = dt_numa_init();
+#ifdef CONFIG_ACPI_NUMA
+ret = arch_acpi_numa_init();
 if ( ret )
 {
 numa_off = true;
-printk(XENLOG_WARNING "DT NUMA init failed\n");
+printk(XENLOG_WARNING "ACPI NUMA init failed\n");
+}
+#endif
+if ( acpi_disabled )
+{
+ret = dt_numa_init();
+if ( ret )
+{
+numa_off = true;
+printk(XENLOG_WARNING "DT NUMA init failed\n");
+}
 }
 
 no_numa:
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 0f79a07..020bc19 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -76,6 +76,20 @@ nodeid_t get_memblk_nodeid(unsigned int id)
 return memblk_nodeid[id];
 }
 
+void __init numa_clear_memblks(void)
+{
+unsigned int i;
+
+for ( i = 0; i < get_num_node_memblks(); i++ )
+{
+node_memblk_range[i].start = 0;
+node_memblk_range[i].end = 0;
+memblk_nodeid[i] = NUMA_NO_NODE;
+}
+
+num_node_memblks = 0;
+}
+
 int __init get_mem_nodeid(paddr_t start, paddr_t end)
 {
 unsigned int i;
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index f0a50bd..ff10b31 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -20,6 +20,7 @@ static inline nodeid_t acpi_get_nodeid(uint64_t hwid)
 void numa_init(void);
 int dt_numa_init(void);
 void numa_set_cpu_node(int cpu, unsigned int nid);
+bool_t arch_acpi_numa_init(void);
 
 #else
 static inline void numa_init(void)
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index a541eb7..14a7a0c 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -75,6 +75,7 @@ int get_num_node_memblks(void);
 bool arch_sanitize_nodes_memory(void);
 void numa_failed(void);
 uint8_t __node_distance(nodeid_t a, nodeid_t b);
+void numa_clear_memblks(void);
 #else
 static inline void numa_add_cpu(int cpu) { }
 static inline void numa_set_node(int cpu, nodeid_t node) { }
-- 
2.7.4



[Xen-devel] [RFC PATCH v3 19/24] ARM: NUMA: Extract MPIDR from MADT table

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Parse the MADT table, extract the MPIDR for every CPU ID found in
the ACPI_MADT_TYPE_GENERIC_INTERRUPT entries and store it in
cpuid_to_hwid_map[].

This mapping is used by the SRAT table parsing to look up the MPIDR
of a CPU ID.

The MADT table is also parsed in arm/acpi/boot.c during SMP boot.
However, we cannot wait until SMP boot, because the SRAT table is
parsed much earlier, during numa_init. Hence the MADT is parsed
twice during boot: once in numa_init and once in SMP init.
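
A minimal standalone sketch of such a UID to hardware-id table is shown
below. The names (uid_hwid, map_uid_to_hwid(), hwid_of_uid()) are
hypothetical, and the bounds check on the table is an addition made for the
sketch rather than something taken from this patch.

```c
#include <stdint.h>

#define MAX_CPUS      8             /* hypothetical table size */
#define HWID_INVALID  UINT64_MAX    /* stand-in for MPIDR_INVALID */

struct uid_hwid { uint32_t uid; uint64_t hwid; };

static struct uid_hwid uid_map[MAX_CPUS];
static unsigned int uid_map_len;

/* Record one MADT GICC entry; returns 0 on success, -1 if it is rejected. */
static int map_uid_to_hwid(uint32_t uid, uint64_t hwid)
{
    if ( hwid == HWID_INVALID || uid_map_len >= MAX_CPUS )
        return -1;

    uid_map[uid_map_len].uid = uid;
    uid_map[uid_map_len].hwid = hwid;
    uid_map_len++;

    return 0;
}

/* Look up the hardware id recorded for a processor UID. */
static uint64_t hwid_of_uid(uint32_t uid)
{
    unsigned int i;

    for ( i = 0; i < uid_map_len; i++ )
        if ( uid_map[i].uid == uid )
            return uid_map[i].hwid;

    return HWID_INVALID;
}
```

Because the table is filled once and never modified afterwards, a later
lookup for a UID that was never recorded simply yields HWID_INVALID.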

Signed-off-by: Vijaya Kumar 
---
v3: - acpi_numa is set to -1 on numa failure.
---
 xen/arch/arm/numa/Makefile|  1 +
 xen/arch/arm/numa/acpi_numa.c | 94 +++
 xen/arch/arm/numa/numa.c  |  6 +++
 3 files changed, 101 insertions(+)

diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
index 3af3aff..b549459 100644
--- a/xen/arch/arm/numa/Makefile
+++ b/xen/arch/arm/numa/Makefile
@@ -1,2 +1,3 @@
 obj-y += dt_numa.o
 obj-y += numa.o
+obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
new file mode 100644
index 000..d9ad547
--- /dev/null
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -0,0 +1,94 @@
+/*
+ * ACPI based NUMA setup
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K 
+ *
+ * Reads the ACPI MADT and SRAT table to setup NUMA information.
+ * Contains Excerpts from x86 implementation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Holds CPUID to MPIDR mapping read from MADT table. */
+struct cpuid_to_hwid {
+uint32_t cpuid;
+uint64_t hwid;
+};
+
+#define PHYS_CPUID_INVALID 0xff
+
+/* Holds mapping of CPU id to MPIDR read from MADT */
+static struct cpuid_to_hwid __read_mostly cpuid_to_hwid_map[NR_CPUS] =
+{ [0 ... NR_CPUS - 1] = {PHYS_CPUID_INVALID, MPIDR_INVALID} };
+static unsigned int num_cpuid_to_hwid;
+
+static void __init acpi_map_cpu_to_hwid(uint32_t cpuid, uint64_t mpidr)
+{
+if ( mpidr == MPIDR_INVALID )
+{
+printk("Skip MADT cpu entry with invalid MPIDR\n");
+numa_failed();
+return;
+}
+
+cpuid_to_hwid_map[num_cpuid_to_hwid].hwid = mpidr;
+cpuid_to_hwid_map[num_cpuid_to_hwid].cpuid = cpuid;
+num_cpuid_to_hwid++;
+}
+
+static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
+  const unsigned long end)
+{
+uint64_t mpidr;
+struct acpi_madt_generic_interrupt *p =
+   container_of(header, struct acpi_madt_generic_interrupt, header);
+
+if ( BAD_MADT_ENTRY(p, end) )
+{
+/* MADT is invalid, we disable NUMA by calling numa_failed() */
+numa_failed();
+return -EINVAL;
+}
+
+acpi_table_print_madt_entry(header);
+mpidr = p->arm_mpidr & MPIDR_HWID_MASK;
+acpi_map_cpu_to_hwid(p->uid, mpidr);
+
+return 0;
+}
+
+void __init acpi_map_uid_to_mpidr(void)
+{
+acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
+acpi_parse_madt_handler, NR_CPUS);
+}
+
+void __init acpi_numa_arch_fixup(void) {}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 85352dc..26aa4c0 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
@@ -40,6 +41,11 @@ void numa_failed(void)
 init_dt_numa_distance();
 node_distance_fn = NULL;
 init_cpu_to_node();
+
+#ifdef CONFIG_ACPI_NUMA
+acpi_numa = -1;
+reset_pxm2node();
+#endif
 }
 
 void __init numa_set_cpu_node(int cpu, unsigned int nid)
-- 
2.7.4



[Xen-devel] [RFC PATCH v3 15/24] ARM: NUMA: DT: Add CPU NUMA support

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

For each cpu, update cpu_to_node[] with node id from
the numa-node-id DT property. Also, initialize cpu_to_node[]
with node 0.

Add macros to access cpu_to_node[] information.
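
The default-then-override scheme described above can be sketched in
isolation as follows. The arrays and helpers (cpu_node[], node_parsed[],
init_cpu_node(), set_cpu_node()) are hypothetical stand-ins for
cpu_to_node[], processor_nodes_parsed and the functions added by this patch.
Unlike the patch, the sketch tests the range of the node id before using it
as an index, which is a deliberate choice made here.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 4    /* hypothetical CPU count */
#define MAX_NODES      4    /* hypothetical node count */

static unsigned int cpu_node[NR_CPUS_SKETCH];
static bool node_parsed[MAX_NODES];

/* Start with every CPU on node 0. */
static void init_cpu_node(void)
{
    unsigned int i;

    for ( i = 0; i < NR_CPUS_SKETCH; i++ )
        cpu_node[i] = 0;
}

/* Assign a CPU to a node, falling back to node 0 for unknown node ids. */
static void set_cpu_node(unsigned int cpu, unsigned int nid)
{
    if ( cpu >= NR_CPUS_SKETCH )
        return;

    /* Range check first, so node_parsed[] is never indexed out of bounds. */
    if ( nid >= MAX_NODES || !node_parsed[nid] )
        nid = 0;

    cpu_node[cpu] = nid;
}
```

Whether silently falling back to node 0 is the right policy is a separate
question; returning an error to the caller would be an equally plausible
design.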

Signed-off-by: Vijaya Kumar K 
---
v3: - Dropped numa_add_cpu declaration from asm-arm/numa.h
- Dropped stale declarations
- Call numa_add_cpu for cpu0
---
 xen/arch/arm/numa/numa.c   | 21 +
 xen/arch/arm/setup.c   |  2 ++
 xen/arch/arm/smpboot.c | 25 -
 xen/include/asm-arm/numa.h |  7 +++
 xen/include/asm-x86/numa.h |  1 -
 xen/include/xen/numa.h |  1 +
 6 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index c00b92c..dc80aa5 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -22,11 +22,31 @@
 
 static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 
+/*
+ * Setup early cpu_to_node.
+ */
+void __init init_cpu_to_node(void)
+{
+int i;
+
+for ( i = 0; i < NR_CPUS; i++ )
+numa_set_node(i, 0);
+}
+
 void numa_failed(void)
 {
 numa_off = true;
 init_dt_numa_distance();
 node_distance_fn = NULL;
+init_cpu_to_node();
+}
+
+void __init numa_set_cpu_node(int cpu, unsigned int nid)
+{
+if ( !node_isset(nid, processor_nodes_parsed) || nid >= MAX_NUMNODES )
+nid = 0;
+
+numa_set_node(cpu, nid);
 }
 
 uint8_t __node_distance(nodeid_t a, nodeid_t b)
@@ -49,6 +69,7 @@ void __init numa_init(void)
 int ret = 0;
 
 nodes_clear(processor_nodes_parsed);
+init_cpu_to_node();
 init_dt_numa_distance();
 
 if ( numa_off )
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index a6d1499..b9c8b0d 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -787,6 +787,8 @@ void __init start_xen(unsigned long boot_phys_offset,
 
 processor_id();
 
+numa_add_cpu(0);
+
 smp_init_cpus();
 cpus = smp_get_max_cpus();
 printk(XENLOG_INFO "SMP: Allowing %u CPUs\n", cpus);
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 32e8722..fcf9afc 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -106,6 +107,7 @@ static void __init dt_smp_init_cpus(void)
 [0 ... NR_CPUS - 1] = MPIDR_INVALID
 };
 bool_t bootcpu_valid = 0;
+nodeid_t *cpu_to_nodemap;
 int rc;
 
 mpidr = boot_cpu_data.mpidr.bits & MPIDR_HWID_MASK;
@@ -117,11 +119,18 @@ static void __init dt_smp_init_cpus(void)
 return;
 }
 
+cpu_to_nodemap = xzalloc_array(nodeid_t, NR_CPUS);
+if ( !cpu_to_nodemap )
+{
+printk(XENLOG_WARNING "Failed to allocate memory for cpu_to_nodemap\n");
+return;
+}
+
 dt_for_each_child_node( cpus, cpu )
 {
 const __be32 *prop;
 u64 addr;
-u32 reg_len;
+uint32_t reg_len, nid;
 register_t hwid;
 
 if ( !dt_device_type_is_equal(cpu, "cpu") )
@@ -146,6 +155,15 @@ static void __init dt_smp_init_cpus(void)
 continue;
 }
 
+if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
+{
+printk(XENLOG_WARNING "cpu node `%s`: numa-node-id not found\n",
+   dt_node_full_name(cpu));
+nid = 0;
+}
+
+cpu_to_nodemap[cpuidx] = nid;
+
 addr = dt_read_number(prop, dt_n_addr_cells(cpu));
 
 hwid = addr;
@@ -224,6 +242,7 @@ static void __init dt_smp_init_cpus(void)
 {
 printk(XENLOG_WARNING "DT missing boot CPU MPIDR[23:0]\n"
"Using only 1 CPU\n");
+xfree(cpu_to_nodemap);
 return;
 }
 
@@ -233,7 +252,10 @@ static void __init dt_smp_init_cpus(void)
 continue;
 cpumask_set_cpu(i, &cpu_possible_map);
 cpu_logical_map(i) = tmp_map[i];
+numa_set_cpu_node(i, cpu_to_nodemap[i]);
 }
+
+xfree(cpu_to_nodemap);
 }
 
 void __init smp_init_cpus(void)
@@ -313,6 +335,7 @@ void start_secondary(unsigned long boot_phys_offset,
  */
 smp_wmb();
 
+numa_add_cpu(cpuid);
 /* Now report this CPU is up */
 cpumask_set_cpu(cpuid, &cpu_online_map);
 
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index d1dc83a..0d3146c 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -10,12 +10,19 @@ void init_dt_numa_distance(void);
 #ifdef CONFIG_NUMA
 void numa_init(void);
 int dt_numa_init(void);
+void numa_set_cpu_node(int cpu, unsigned int nid);
+
 #else
 static inline void numa_init(void)
 {
 return;
 }
 
+static inline void numa_set_cpu_node(int cpu, unsigned int nid)
+{
+return;
+}
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index ca0a2a6..fc4747f 

[Xen-devel] [RFC PATCH v3 17/24] ARM: NUMA: DT: Do not expose numa info to DOM0

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Delete numa-node-id and the distance map from the DOM0 DT
so that NUMA information is not exposed to DOM0.
This particularly helps to boot Node 1 devices
as if booting on Node 0.

However, this approach has a limitation: memory allocation
for the devices should be local.

Also, do not expose the NUMA distance map node to DOM0.
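
The filtering idea can be illustrated with a standalone sketch that works on
plain strings instead of real device tree nodes. The helpers
hide_property(), hide_node() and filter_properties() are hypothetical; only
the property name "numa-node-id" and the compatible string
"numa-distance-map-v1" come from this patch.

```c
#include <stdbool.h>
#include <string.h>

/* Property that is not copied into the guest device tree. */
static bool hide_property(const char *name)
{
    return strcmp(name, "numa-node-id") == 0;
}

/* Node, identified by its compatible string, that is skipped entirely. */
static bool hide_node(const char *compatible)
{
    return strcmp(compatible, "numa-distance-map-v1") == 0;
}

/*
 * Copy the names that may be exposed into kept[], which must have room
 * for n entries. Returns the number of names that were kept.
 */
static unsigned int filter_properties(const char *const *names,
                                      unsigned int n, const char **kept)
{
    unsigned int i, out = 0;

    for ( i = 0; i < n; i++ )
    {
        if ( hide_property(names[i]) )
            continue;
        kept[out++] = names[i];
    }

    return out;
}
```

Filtering by exact name keeps unrelated properties untouched, at the cost of
having to extend the list if further NUMA-related properties are introduced.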

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/domain_build.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 1bec4fa..a7d6d3a 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -425,6 +425,10 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo,
 }
 }
 
+/* Don't expose the property numa to the guest */
+if ( dt_property_name_is_equal(prop, "numa-node-id") )
+continue;
+
 /* Don't expose the property "xen,passthrough" to the guest */
 if ( dt_property_name_is_equal(prop, "xen,passthrough") )
 continue;
@@ -1177,6 +1181,11 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
 DT_MATCH_TYPE("memory"),
 /* The memory mapped timer is not supported by Xen. */
 DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
+/*
+ * NUMA info is not exposed to Dom0.
+ * So, skip distance-map information
+ */
+DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
 { /* sentinel */ },
 };
 static const struct dt_device_match timer_matches[] __initconst =
-- 
2.7.4



[Xen-devel] [RFC PATCH v3 11/24] ARM: fdt: Export and introduce new fdt functions

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Introduce new api device_tree_type_matches() to check for
device type. Also export device_tree_get_u32() and
device_tree_node_compatible()

These functions are later used for parsing NUMA information.
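
The check done by device_tree_type_matches() can be sketched without
libfdt. In the sketch below the function name type_matches() and the
plain `prop` pointer are assumptions made so that the code compiles
standalone; `prop` stands in for the result of the fdt_getprop() lookup
of "device_type".

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Standalone model of the check: `prop` stands in for the value that
 * fdt_getprop(fdt, node, "device_type", NULL) would return, or NULL if
 * the node has no such property.
 */
static bool type_matches(const char *prop, const char *match)
{
    if ( prop == NULL )
        return false;

    return strcmp(prop, match) == 0;
}
```

A node without a device_type property never matches, which is the same
outcome as the early return in the patch.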

Signed-off-by: Vijaya Kumar K 
---
v3: Export device_tree_node_compatible() instead of
device_tree_node_matches()
---
 xen/arch/arm/bootfdt.c  | 20 
 xen/include/asm-arm/setup.h |  5 +
 2 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index ea188a0..6e8251b 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -31,8 +31,8 @@ static bool_t __init device_tree_node_matches(const void *fdt, int node,
 && (name[match_len] == '@' || name[match_len] == '\0');
 }
 
-static bool_t __init device_tree_node_compatible(const void *fdt, int node,
- const char *match)
+bool_t __init device_tree_node_compatible(const void *fdt, int node,
+  const char *match)
 {
 int len, l;
 int mlen;
@@ -62,8 +62,20 @@ static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
 *size = dt_next_cell(size_cells, cell);
 }
 
-static u32 __init device_tree_get_u32(const void *fdt, int node,
-  const char *prop_name, u32 dflt)
+bool_t __init device_tree_type_matches(const void *fdt, int node,
+   const char *match)
+{
+const void *prop;
+
+prop = fdt_getprop(fdt, node, "device_type", NULL);
+if ( prop == NULL )
+return 0;
+
+return strcmp(prop, match) == 0 ? 1 : 0;
+}
+
+u32 __init device_tree_get_u32(const void *fdt, int node,
+   const char *prop_name, u32 dflt)
 {
 const struct fdt_property *prop;
 
diff --git a/xen/include/asm-arm/setup.h b/xen/include/asm-arm/setup.h
index 7ff2c34..fb78478 100644
--- a/xen/include/asm-arm/setup.h
+++ b/xen/include/asm-arm/setup.h
@@ -83,6 +83,11 @@ struct bootmodule *add_boot_module(bootmodule_kind kind,
 struct bootmodule *boot_module_find_by_kind(bootmodule_kind kind);
 const char * __init boot_module_kind_as_string(bootmodule_kind kind);
 
+u32 device_tree_get_u32(const void *fdt, int node, const char *prop_name,
+u32 dflt);
+bool_t device_tree_type_matches(const void *fdt, int node, const char *match);
+bool_t device_tree_node_compatible(const void *fdt, int node,
+   const char *match);
 #endif
 /*
  * Local variables:
-- 
2.7.4




[Xen-devel] [RFC PATCH v3 13/24] ARM: NUMA: DT: Parse memory NUMA information

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Parse the memory nodes and fetch the numa-node-id information.
Store each memory range in node_memblk_range[]
along with its node id.

When booting in UEFI mode, UEFI passes memory information
to Dom0 using the EFI memory descriptor table and deletes the
memory nodes from the host DT. However, to fetch the memory
NUMA node id, the memory DT node must not be deleted by the EFI stub.
With this patch, the memory node is no longer deleted from the FDT.

NUMA info of memory is extracted from process_memory_node()
instead of parsing the DT again during numa_init().
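
The per-bank walk that process_memory_node() performs over the "reg"
property can be sketched standalone as below. This is an illustration
under stated assumptions, not the Xen code: struct mem_range, read_be32(),
next_value() and parse_reg() are hypothetical names, the property is
passed as raw bytes, and each address or size is assumed to use 1 or 2
cells.

```c
#include <stddef.h>
#include <stdint.h>

struct mem_range {
    uint64_t start;
    uint64_t size;
    uint32_t nid;
};

/* Read one big-endian 32-bit cell from raw property bytes. */
static uint32_t read_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Combine `cells` cells (1 or 2 assumed) into one value; advance *p. */
static uint64_t next_value(unsigned int cells, const uint8_t **p)
{
    uint64_t v = 0;

    while ( cells-- )
    {
        v = (v << 32) | read_be32(*p);
        *p += 4;
    }

    return v;
}

/*
 * Walk a "reg" property of `len` bytes and record every non-empty bank
 * together with the node id of the enclosing memory node.
 * Returns the number of ranges stored in out[].
 */
static size_t parse_reg(const uint8_t *reg, size_t len,
                        unsigned int address_cells, unsigned int size_cells,
                        uint32_t nid, struct mem_range *out, size_t max)
{
    size_t banks, i, n = 0;

    if ( address_cells < 1 || size_cells < 1 )
        return 0;

    banks = len / ((address_cells + size_cells) * sizeof(uint32_t));

    for ( i = 0; i < banks && n < max; i++ )
    {
        uint64_t start = next_value(address_cells, &reg);
        uint64_t size = next_value(size_cells, &reg);

        if ( !size )
            continue;   /* skip empty banks, as process_memory_node() does */

        out[n].start = start;
        out[n].size = size;
        out[n].nid = nid;
        n++;
    }

    return n;
}
```

The node id is read once per memory node and attached to every bank of
that node, which mirrors how the patch passes `nid` to
dt_numa_process_memory_node() for each bank.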

Signed-off-by: Vijaya Kumar K 
---
v3: - Set numa_off in numa_failed() and drop dt_numa variable
---
 xen/arch/arm/bootfdt.c  | 25 +
 xen/arch/arm/efi/efi-boot.h | 25 -
 xen/arch/arm/numa/dt_numa.c | 32 
 xen/arch/arm/numa/numa.c|  5 +
 xen/include/asm-arm/numa.h  |  2 ++
 5 files changed, 60 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 6e8251b..b3a132c 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -13,6 +13,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
@@ -146,6 +148,9 @@ static void __init process_memory_node(const void *fdt, int node,
 const __be32 *cell;
 paddr_t start, size;
 u32 reg_cells = address_cells + size_cells;
+#ifdef CONFIG_NUMA
+uint32_t nid;
+#endif
 
 if ( address_cells < 1 || size_cells < 1 )
 {
@@ -154,24 +159,36 @@ static void __init process_memory_node(const void *fdt, int node,
 return;
 }
 
+#ifdef CONFIG_NUMA
+nid = device_tree_get_u32(fdt, node, "numa-node-id", NR_NODE_MEMBLKS);
+#endif
 prop = fdt_get_property(fdt, node, "reg", NULL);
 if ( !prop )
 {
 printk("fdt: node `%s': missing `reg' property\n", name);
+#ifdef CONFIG_NUMA
+   numa_failed();
+#endif
 return;
 }
 
 cell = (const __be32 *)prop->data;
 banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
 
-for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS; i++ )
+for ( i = 0; i < banks; i++ )
 {
 device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
 if ( !size )
 continue;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
-bootinfo.mem.nr_banks++;
+if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks < NR_MEM_BANKS )
+{
+bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
+bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
+bootinfo.mem.nr_banks++;
+}
+#ifdef CONFIG_NUMA
+dt_numa_process_memory_node(nid, start, size);
+#endif
 }
 }
 
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index 56de26e..a8bde68 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -194,33 +194,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
 int status;
 u32 fdt_val32;
 u64 fdt_val64;
-int prev;
 int num_rsv;
 
-/*
- * Delete any memory nodes present.  The EFI memory map is the only
- * memory description provided to Xen.
- */
-prev = 0;
-for (;;)
-{
-const char *type;
-int len;
-
-node = fdt_next_node(fdt, prev, NULL);
-if ( node < 0 )
-break;
-
-type = fdt_getprop(fdt, node, "device_type", &len);
-if ( type && strncmp(type, "memory", len) == 0 )
-{
-fdt_del_node(fdt, node);
-continue;
-}
-
-prev = node;
-}
-
/*
 * Delete all memory reserve map entries. When booting via UEFI,
 * kernel will use the UEFI memory map to find reserved regions.
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
index 963bb40..84030e7 100644
--- a/xen/arch/arm/numa/dt_numa.c
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -58,6 +58,38 @@ static int __init dt_numa_process_cpu_node(const void *fdt)
 return 0;
 }
 
+void __init dt_numa_process_memory_node(uint32_t nid, paddr_t start,
+   paddr_t size)
+{
+struct node *nd;
+int i;
+
+i = conflicting_memblks(start, start + size);
+if ( i < 0 )
+{
+ if ( numa_add_memblk(nid, start, size) )
+ {
+ printk(XENLOG_WARNING "DT: NUMA: node-id %u overflow \n", nid);
+ numa_failed();
+ return;
+ }
+}
+else
+{
+ nd = get_node_memblk_range(i);
+ printk(XENLOG_ERR
+"NUMA DT: node %u (%"PRIx64"-%"PRIx64") overlaps with %d (%"PRIx64"-%"PRIx64")\n",
+nid, start, start + size, i, nd->start, nd->end);
+
+ numa_failed();
+ return;
+}
+
+node_set(nid, memory_nodes_parsed);
+
+return;
+}
+
 int 

[Xen-devel] [RFC PATCH v3 07/24] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Right now CONFIG_NUMA is not enabled for ARM, and the
existing code in asm-arm/numa.h is for !CONFIG_NUMA.
Hence put this code under #ifndef CONFIG_NUMA.

This keeps the build working when CONFIG_NUMA
is not enabled. Though CONFIG_NUMA is enabled by default,
it can be disabled manually, and compilation
should still go through. Hence these definitions are kept under
!CONFIG_NUMA.

Signed-off-by: Vijaya Kumar K 
---
v3: - Dropped NODE_SHIFT define
---
 xen/include/asm-arm/numa.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 53f99af..7f00a36 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -3,6 +3,7 @@
 
 typedef uint8_t nodeid_t;
 
+#ifndef CONFIG_NUMA
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
@@ -16,6 +17,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_spanned_pages(nid) (total_pages)
 #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
 #define __node_distance(a, b) (20)
+#endif /* CONFIG_NUMA */
 
 static inline unsigned int arch_get_dma_bitsize(void)
 {
-- 
2.7.4




[Xen-devel] [RFC PATCH v3 18/24] ACPI: Refactor acpi SRAT and SLIT table handling code

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Move the SRAT handling code that is common across
architectures from xen/arch/x86/srat.c to the new file
xen/drivers/acpi/srat.c. A new header file, srat.h, is
introduced.

Other major changes are:
- Changed the coding style of the moved code.
- Moved struct pxm2node from srat.c to srat.h
- Dropped {memory,processor}_nodes_parsed from x86/srat.c
- Dropped static on node_to_pxm() and moved it to the beginning of the file.
- Made some static functions non-static
- Introduced acpi_node_distance(), called from __node_distance()
- Replaced distance constants with LOCAL/REMOTE_DISTANCE defines
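
A node distance lookup with LOCAL/REMOTE_DISTANCE fallbacks can be
sketched as below. The values 10 and 20, the function name
node_distance() and the flat matrix layout are assumptions of this sketch
rather than the code moved by this patch; 10 is the value ACPI SLIT uses
for a node's distance to itself.

```c
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE   10   /* assumed: distance of a node to itself */
#define REMOTE_DISTANCE  20   /* assumed: default for any other pair */

/*
 * Hypothetical model: `slit` is a row-major nodes x nodes matrix of
 * distances, or NULL when no SLIT table was found.
 */
static uint8_t node_distance(const uint8_t *slit, unsigned int nodes,
                             unsigned int a, unsigned int b)
{
    if ( slit == NULL || a >= nodes || b >= nodes )
        return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;

    return slit[a * nodes + b];
}
```

Using named constants for the fallback keeps the no-SLIT path readable
and avoids repeating the literals at every call site.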

Signed-off-by: Vijaya Kumar K 
---
v3: - Moved common function declarations from asm-x86/srat.h
---
 xen/arch/x86/dom0_build.c   |   1 +
 xen/arch/x86/mm.c   |   2 -
 xen/arch/x86/physdev.c  |   1 +
 xen/arch/x86/setup.c|   1 +
 xen/arch/x86/smpboot.c  |   1 +
 xen/arch/x86/srat.c | 246 +
 xen/arch/x86/x86_64/mm.c|   1 +
 xen/drivers/acpi/Makefile   |   1 +
 xen/drivers/acpi/srat.c | 298 
 xen/drivers/passthrough/vtd/iommu.c |   1 +
 xen/include/acpi/srat.h |  24 +++
 xen/include/asm-x86/numa.h  |   5 -
 12 files changed, 331 insertions(+), 251 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 0c125e6..04127e7 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 19f672d..5497621 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -135,8 +135,6 @@ l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
 #define PTE_UPDATE_WITH_CMPXCHG
 #endif
 
-paddr_t __read_mostly mem_hotplug;
-
 /* Private domain structs for DOMID_XEN and DOMID_IO. */
 struct domain *dom_xen, *dom_io, *dom_cow;
 
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 0eb4097..a73a954 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index db5df69..b957b96 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 168c9d4..ff4c7e1 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index be2634a..d5caccf 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -18,92 +18,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
-static struct acpi_table_slit *__read_mostly acpi_slit;
-
-struct pxm2node {
-   unsigned int pxm;
-   nodeid_t node;
-};
-static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
-   { [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
-
-static unsigned int node_to_pxm(nodeid_t n);
-
-static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
-
-static inline bool node_found(unsigned int idx, unsigned int pxm)
-{
-   return ((pxm2node[idx].pxm == pxm) &&
-   (pxm2node[idx].node != NUMA_NO_NODE));
-}
-
-static void reset_pxm2node(void)
-{
-   unsigned int i;
-
-   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-   pxm2node[i].node = NUMA_NO_NODE;
-}
-
-nodeid_t pxm_to_node(unsigned int pxm)
-{
-   unsigned int i;
-
-   if ((pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm))
-   return pxm2node[pxm].node;
-
-   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-   if (node_found(i, pxm))
-   return pxm2node[i].node;
-
-   return NUMA_NO_NODE;
-}
-
-nodeid_t acpi_setup_node(unsigned int pxm)
-{
-   nodeid_t node;
-   unsigned int idx;
-   static bool warned;
-   static unsigned int nodes_found;
-
-   BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
-
-   if (pxm < ARRAY_SIZE(pxm2node)) {
-   if (node_found(pxm, pxm))
-   return pxm2node[pxm].node;
-
-   /* Try to maintain indexing of pxm2node by pxm */
-   if (pxm2node[pxm].node == NUMA_NO_NODE) {
-   idx = pxm;
-   goto finish;
-   }
-   }
-
-   for (idx = 0; idx < ARRAY_SIZE(pxm2node); idx++)
-   if (pxm2node[idx].node == NUMA_NO_NODE)
-   goto finish;
-
-   if (!warned) {
-   printk(KERN_WARNING "SRAT: Too many proximity domains (%#x)\n",
-  pxm);
-   warned = 

[Xen-devel] [RFC PATCH v3 01/24] NUMA: Make number of NUMA nodes configurable

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Introduce the NR_NODES config option to specify the number
of NUMA nodes supported. The default value is
64 for x86 and 8 for ARM. Drop the NODES_SHIFT macro.

Also move NR_NODE_MEMBLKS from asm-x86/acpi.h to xen/numa.h
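
The resulting chain of defines can be sketched standalone as below. The
fallback define of CONFIG_NR_NODES stands in for the Kconfig-generated
value and the memblk_nodeid[] array is only an example consumer; both are
assumptions of this sketch.

```c
/* Stand-in for the Kconfig value; 8 is the ARM default from this patch. */
#ifndef CONFIG_NR_NODES
#define CONFIG_NR_NODES 8
#endif

#define NR_NODES        CONFIG_NR_NODES
#define MAX_NUMNODES    NR_NODES
#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)

/* Per-node tables can then be sized directly from the config value. */
static unsigned char memblk_nodeid[NR_NODE_MEMBLKS];
```

With the old (1 << NODES_SHIFT) form only powers of two could be
expressed; a plain count removes that restriction.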

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/Kconfig   | 7 +++
 xen/include/asm-x86/acpi.h | 1 -
 xen/include/asm-x86/numa.h | 2 --
 xen/include/xen/config.h   | 1 +
 xen/include/xen/numa.h | 7 ++-
 5 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index cf0acb7..9c2a4e2 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -6,3 +6,10 @@ config NR_CPUS
default "128" if ARM
---help---
  Specifies the maximum number of physical CPUs which Xen will support.
+
+config NR_NODES
+   int "Maximum number of NUMA nodes"
+   default "64" if X86
+   default "8" if ARM
+   ---help---
+ Specifies the maximum number of NUMA nodes which Xen will support.
diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
index 27ecc65..15be784 100644
--- a/xen/include/asm-x86/acpi.h
+++ b/xen/include/asm-x86/acpi.h
@@ -105,7 +105,6 @@ extern void acpi_reserve_bootmem(void);
 
 extern s8 acpi_numa;
 extern int acpi_scan_nodes(u64 start, u64 end);
-#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
 
 #ifdef CONFIG_ACPI_SLEEP
 
diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
index bada2c0..3cf26c2 100644
--- a/xen/include/asm-x86/numa.h
+++ b/xen/include/asm-x86/numa.h
@@ -3,8 +3,6 @@
 
 #include 
 
-#define NODES_SHIFT 6
-
 typedef u8 nodeid_t;
 
 extern int srat_rev;
diff --git a/xen/include/xen/config.h b/xen/include/xen/config.h
index a1d0f97..0f1a029 100644
--- a/xen/include/xen/config.h
+++ b/xen/include/xen/config.h
@@ -81,6 +81,7 @@
 
 /* allow existing code to work with Kconfig variable */
 #define NR_CPUS CONFIG_NR_CPUS
+#define NR_NODES CONFIG_NR_NODES
 
 #ifndef CONFIG_DEBUG
 #define NDEBUG
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7aef1a8..6bba29e 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -3,14 +3,11 @@
 
 #include 
 
-#ifndef NODES_SHIFT
-#define NODES_SHIFT 0
-#endif
-
 #define NUMA_NO_NODE 0xFF
 #define NUMA_NO_DISTANCE 0xFF
 
-#define MAX_NUMNODES    (1 << NODES_SHIFT)
+#define MAX_NUMNODES    NR_NODES
+#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
 
 #define vcpu_to_node(v) (cpu_to_node((v)->processor))
 
-- 
2.7.4




[Xen-devel] [RFC PATCH v3 00/24] ARM: Add Xen NUMA support

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

With this RFC patch series, NUMA support is added for the ARM platform.
Both DT and ACPI based NUMA support is added.
Only Xen is made aware of the NUMA platform. NUMA awareness for DOM0 is not
added.

As part of this series, the code under the x86 architecture is
reused by moving it into common files.
The new files xen/common/numa.c and xen/drivers/acpi/srat.c are
added.
For ARM-specific code, a new folder xen/arch/arm/numa is added, and the new
files numa.c, dt_numa.c and acpi_numa.c are introduced under this folder.

DT NUMA: The following major changes are performed
 - Dropped numa-node-id information from the Dom0 DT,
   so that Dom0 devices allocate from node 0 for
   devmalloc requests.
 - The memory DT node is not deleted by EFI. It is exposed to Xen
   to extract NUMA information.
 - On NUMA failure, fall back to non-NUMA booting,
   assuming all the memory and CPUs are under node 0.
 - CONFIG_NUMA is introduced.

ACPI NUMA:
 - MADT is parsed before parsing the SRAT table to extract
   the CPU_ID to MPIDR mapping info. In Linux, the MADT table is
   opened while parsing the SRAT table to extract the MPIDR. Parsing
   the MADT first avoids opening ACPI tables recursively.
 - SRAT table is parsed for ACPI_SRAT_TYPE_GICC_AFFINITY to extract
   proximity info and MPIDR from CPU_ID to MPIDR mapping table.
 - Parsing of SRAT table for ACPI_SRAT_TYPE_MEMORY_AFFINITY to extract
   memory proximity is reused from x86 arch.
 - Re-use SLIT parsing of x86 for node distance information.
 - CONFIG_ACPI_NUMA is introduced
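
The two-pass ACPI flow listed above (record CPU_ID to MPIDR pairs from
MADT first, then resolve them while walking SRAT) can be sketched as
below. All names, the struct layout and the MAX_CPUS bound are
hypothetical; the real tables are parsed through the ACPI table helpers,
which are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_CPUS 8   /* arbitrary bound for this sketch */

struct cpuid_to_hwid {
    uint32_t cpu_id;   /* ACPI processor UID from a MADT GICC entry */
    uint64_t mpidr;    /* hardware id of that CPU */
};

static struct cpuid_to_hwid cpu_map[MAX_CPUS];
static size_t cpu_map_count;

/* Pass 1: called for each MADT GICC entry, before SRAT is parsed. */
static bool record_madt_cpu(uint32_t cpu_id, uint64_t mpidr)
{
    if ( cpu_map_count >= MAX_CPUS )
        return false;

    cpu_map[cpu_map_count].cpu_id = cpu_id;
    cpu_map[cpu_map_count].mpidr = mpidr;
    cpu_map_count++;

    return true;
}

/* Pass 2: called for each SRAT GICC affinity entry to resolve the MPIDR. */
static bool lookup_mpidr(uint32_t cpu_id, uint64_t *mpidr)
{
    size_t i;

    for ( i = 0; i < cpu_map_count; i++ )
    {
        if ( cpu_map[i].cpu_id == cpu_id )
        {
            *mpidr = cpu_map[i].mpidr;
            return true;
        }
    }

    return false;
}
```

Because the mapping table is complete before SRAT is touched, the SRAT
handler never needs to open MADT itself.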

This series is tested on the ThunderX platform.
No functional changes are made to the x86 implementation; the code is only
sanitized and refactored. Hence it is only compile-tested for x86.

This series is posted as RFC because it is not tested
on x86. Help from the community in testing this series on x86 is requested.

Code is shared at
https://github.com/vijaykilari/xen-numa/commits/rfc_v3

v3: Major changes
 - Rebased to latest staging branch
 - Dropped patches 4 & 5 of v2.
 - Reused most of the x86 code like numa emulation, acpi
   memory node parsing by moving to common code.
 - Fixed cpu node parsing with DT.
 - Made NR_NODES configurable
 - Dropped hardcoding of memnodemap[] array size
 - Segregated new dt functions to single patch 11
 - Introduced new patch 10 for NUMA initialization.

v2: Major changes
  - Rebased to latest staging branch
  - Reworked x86 NUMA code and cleaned it up to the extent possible.
Patches 1 to 8 are created for this
  - Reworked DT and ACPI NUMA information extraction
  - Reused DT code for memory node processing to extract NUMA info.
  - Fixed issues with DT processing
  - Added arch specific processing of SRAT
  - Reworked on MADT and SRAT processing
  - Reworked on node distance
  - All ARM changes are moved under folder arch/arm/numa.
  - NUMA ACPI common changes are kept in drivers/acpi/srat.c

Vijaya Kumar K (24):
  NUMA: Make number of NUMA nodes configurable
  x86: NUMA: Clean up: Fix coding styles and drop unused code
  x86: NUMA: Fix datatypes and attributes
  x86: NUMA: Rename and sanitize memnode shift code
  x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs
  x86: NUMA: Rename some generic functions
  ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  NUMA: x86: Move numa code and make it generic
  NUMA: x86: Move common code from srat.c
  NUMA: Allow numa initialization with DT
  ARM: fdt: Export and introduce new fdt functions
  ARM: NUMA: DT: Parse CPU NUMA information
  ARM: NUMA: DT: Parse memory NUMA information
  ARM: NUMA: DT: Parse NUMA distance information
  ARM: NUMA: DT: Add CPU NUMA support
  ARM: NUMA: Add memory NUMA support
  ARM: NUMA: DT: Do not expose numa info to DOM0
  ACPI: Refactor acpi SRAT and SLIT table handling code
  ARM: NUMA: Extract MPIDR from MADT table
  ACPI: Move arch specific SRAT parsing
  ARM: NUMA: ACPI: Extract proximity from SRAT table
  ARM: NUMA: Initialize ACPI NUMA
  NUMA: Move CONFIG_NUMA to common Kconfig
  NUMA: Enable ACPI_NUMA config

 xen/arch/Kconfig|   7 +
 xen/arch/arm/Makefile   |   1 +
 xen/arch/arm/acpi/boot.c|   2 +
 xen/arch/arm/bootfdt.c  |  45 ++-
 xen/arch/arm/domain_build.c |   9 +
 xen/arch/arm/efi/efi-boot.h |  25 --
 xen/arch/arm/numa/Makefile  |   3 +
 xen/arch/arm/numa/acpi_numa.c   | 246 +
 xen/arch/arm/numa/dt_numa.c | 240 +
 xen/arch/arm/numa/numa.c| 188 ++
 xen/arch/arm/setup.c|   6 +
 xen/arch/arm/smpboot.c  |  25 +-
 xen/arch/x86/dom0_build.c   |   1 +
 xen/arch/x86/mm.c   |   2 -
 xen/arch/x86/numa.c | 463 +
 xen/arch/x86/physdev.c  |   1 +
 xen/arch/x86/setup.c|   1 +
 xen/arch/x86/smpboot.c  |   4 +-
 xen/arch/x86/srat.c | 405 --
 

[Xen-devel] [RFC PATCH v3 10/24] NUMA: Allow numa initialization with DT

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

The common code allows NUMA initialization only when the
ACPI_NUMA config is enabled. Allow initialization when the
NUMA config is enabled for DT.

In this patch, along with acpi_numa, check for acpi_disabled
is added.
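
The effect of the new condition in numa_scan_nodes() can be shown with a
small truth-table sketch. The function name scan_must_fail() is
hypothetical; it only models the expression added by this patch.

```c
#include <stdbool.h>

/*
 * Model of the early-exit test in numa_scan_nodes() after this patch.
 * The acpi_numa state is only consulted when ACPI is in use.
 */
static bool scan_must_fail(bool acpi_disabled, int acpi_numa)
{
    return !acpi_disabled && acpi_numa <= 0;
}
```

With ACPI disabled (a DT boot), the acpi_numa state no longer forces the
scan to fail, so the DT-provided information can be used.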

Signed-off-by: Vijaya Kumar K 
---
 xen/common/numa.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/xen/common/numa.c b/xen/common/numa.c
index 74c4697..5e985d2 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -324,7 +324,7 @@ static int __init numa_scan_nodes(paddr_t start, paddr_t end)
 for ( i = 0; i < MAX_NUMNODES; i++ )
 cutoff_node(i, start, end);
 
-if ( acpi_numa <= 0 )
+if ( !acpi_disabled && acpi_numa <= 0 )
 return -1;
 
 if ( !arch_sanitize_nodes_memory() )
@@ -430,11 +430,9 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 return;
 #endif
 
-#ifdef CONFIG_ACPI_NUMA
 if ( !numa_off &&
  !numa_scan_nodes(pfn_to_paddr(start_pfn), pfn_to_paddr(end_pfn)) )
 return;
-#endif
 
 printk(KERN_INFO "%s\n",
numa_off ? "NUMA turned off" : "No NUMA configuration found");
-- 
2.7.4




[Xen-devel] [RFC PATCH v3 03/24] x86: NUMA: Fix datatypes and attributes

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Change u{8,32,64} to uint{8,32,64}_t, and u64 to paddr_t
wherever applicable.
Fix the coding style of attributes.
Also changed
  - Some variables from int to unsigned int
  - Used pfn_to_paddr/paddr_to_pfn wherever required.
  - Alloc memnodemap[] of size BITS_PER_LONG.
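
The reason for preferring pfn_to_paddr/paddr_to_pfn over open-coded
shifts can be sketched as below. The macro bodies are modelled on the
usual Xen definitions but are written here as a standalone assumption,
with 4 KiB pages and a 64-bit paddr_t.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages assumed for this sketch */

typedef uint64_t paddr_t;

/* Widen before shifting so the high bits survive on 32-bit builds. */
#define pfn_to_paddr(pfn)  ((paddr_t)(pfn) << PAGE_SHIFT)
#define paddr_to_pfn(pa)   ((unsigned long)((pa) >> PAGE_SHIFT))
```

The cast inside pfn_to_paddr() replaces the explicit (u64) casts that the
old code had to repeat at each call site.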

Signed-off-by: Vijaya Kumar K 
---
v3: - Change unsigned to unsigned int
- Update commit message
- Drop changing memnode_shift as unsigned int
- Used pfn_to_paddr/paddr_to_pfn
- Alloc memnodemap[] of size BITS_PER_LONG
---
 xen/arch/x86/numa.c| 54 +-
 xen/arch/x86/srat.c| 64 +++---
 xen/include/asm-arm/numa.h |  2 +-
 xen/include/asm-x86/acpi.h |  2 +-
 xen/include/asm-x86/numa.h | 16 ++--
 5 files changed, 72 insertions(+), 66 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 444d7ad..aa4a7c1 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -25,11 +25,17 @@ struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
 int memnode_shift;
-static typeof(*memnodemap) _memnodemap[64];
+
+/*
+ * In case of numa init failure or numa off,
+ * memnode_shift is initialized to BITS_PER_LONG - 1. Hence allocate
+ * memnodemap[] of BITS_PER_LONG.
+ */
+static typeof(*memnodemap) _memnodemap[BITS_PER_LONG];
 unsigned long memnodemapsize;
-u8 *memnodemap;
+uint8_t *memnodemap;
 
-nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
+nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
 [0 ... NR_CPUS-1] = NUMA_NO_NODE
 };
 /*
@@ -38,7 +44,7 @@ nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
 nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
 [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
-cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
+cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
@@ -166,12 +172,12 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
 return shift;
 }
 /* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
+void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
 {
 unsigned long start_pfn, end_pfn;
 
-start_pfn = start >> PAGE_SHIFT;
-end_pfn = end >> PAGE_SHIFT;
+start_pfn = paddr_to_pfn(start);
+end_pfn = paddr_to_pfn(end);
 
 NODE_DATA(nodeid)->node_start_pfn = start_pfn;
 NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
@@ -201,19 +207,20 @@ void __init numa_init_array(void)
 }
 
 #ifdef CONFIG_NUMA_EMU
-static int numa_fake __initdata = 0;
+static unsigned int __initdata numa_fake;
 
 /* Numa emulation */
-static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
+static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
 {
-int i;
+unsigned int i;
 struct node nodes[MAX_NUMNODES];
-u64 sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
+uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
 
 /* Kludge needed for the hash function */
 if ( hweight64(sz) > 1 )
 {
-u64 x = 1;
+uint64_t x = 1;
+
 while ( (x << 1) < sz )
 x <<= 1;
 if ( x < sz / 2 )
@@ -225,9 +232,9 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 memset(&nodes,0,sizeof(nodes));
 for ( i = 0; i < numa_fake; i++ )
 {
-nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
+nodes[i].start = pfn_to_paddr(start_pfn) + i * sz;
 if ( i == numa_fake - 1 )
-sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
+sz = pfn_to_paddr(end_pfn) - nodes[i].start;
 nodes[i].end = nodes[i].start + sz;
 printk(KERN_INFO
"Faking node %d at %"PRIx64"-%"PRIx64" (%"PRIu64"MB)\n",
@@ -260,8 +267,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
- (u64)end_pfn << PAGE_SHIFT) )
+if ( !numa_off &&
+ !acpi_scan_nodes(pfn_to_paddr(start_pfn), pfn_to_paddr(end_pfn)) )
 return;
 #endif
 
@@ -269,8 +276,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
numa_off ? "NUMA turned off" : "No NUMA configuration found");
 
 printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
-   (u64)start_pfn << PAGE_SHIFT,
-   (u64)end_pfn << PAGE_SHIFT);
+   pfn_to_paddr(start_pfn), pfn_to_paddr(end_pfn));
 /* setup dummy node covering all memory */
 memnode_shift = BITS_PER_LONG - 1;
 memnodemap = _memnodemap;
@@ -279,8 +285,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 for ( i = 0; i < nr_cpu_ids; i++ )
 numa_set_node(i, 0);
 cpumask_copy(&node_to_cpumask[0], 

[Xen-devel] [RFC PATCH v3 04/24] x86: NUMA: Rename and sanitize memnode shift code

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

The memnode_shift variable is changed from int to unsigned int.
With this change, compute_memnode_shift() returns an error value
instead of returning the shift value. memnode_shift is updated inside
compute_memnode_shift().

Also, the following changes are made
  - Rename compute_hash_shift to compute_memnode_shift
  - Update int to unsigned int for params in extract_lsb_from_nodes()
  - Return values of populate_memnodemap() are changed
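
The new return convention of populate_memnodemap() (0 on success,
-ENOSPC when the map is too small, -EINVAL on overlap or when no node
could be mapped) can be sketched standalone as below. struct range,
NO_NODE and populate_map() are hypothetical names; ranges are given
directly in page-index units with an exclusive end, and the nodeids
remapping of the real function is left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NO_NODE 0xFF   /* marks an unassigned map slot */

struct range {
    unsigned long start;   /* first page index of the node */
    unsigned long end;     /* one past the last page index */
};

/*
 * Fill map[] so that map[idx >> shift] gives the node owning page idx.
 * Returns 0 if OK, -ENOSPC if map[] is too small for this shift,
 * -EINVAL if two nodes land in the same slot or no node was mapped.
 */
static int populate_map(uint8_t *map, size_t map_size,
                        const struct range *nodes, unsigned int numnodes,
                        unsigned int shift)
{
    unsigned int i;
    int res = -EINVAL;

    memset(map, NO_NODE, map_size);

    for ( i = 0; i < numnodes; i++ )
    {
        unsigned long s = nodes[i].start, e = nodes[i].end;

        if ( s >= e )
            continue;               /* empty node */
        if ( (e >> shift) >= map_size )
            return -ENOSPC;

        do {
            if ( map[s >> shift] != NO_NODE )
                return -EINVAL;     /* slot already owned by another node */
            map[s >> shift] = (uint8_t)i;
            s += 1UL << shift;
        } while ( s < e );

        res = 0;
    }

    return res;
}
```

Returning distinct errno values lets the caller pass the failure reason
up unchanged instead of collapsing everything to -1.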

Signed-off-by: Vijaya Kumar K 
---
v3:
  - Update int to unsigned int for params in extract_lsb_from_nodes()
  - Return values of populate_memnodemap() are changed
---
 xen/arch/x86/numa.c| 53 --
 xen/arch/x86/srat.c|  7 +++---
 xen/include/asm-x86/numa.h |  6 +++---
 3 files changed, 34 insertions(+), 32 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index aa4a7c1..2ea2ec0 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -24,7 +24,7 @@ custom_param("numa", numa_setup);
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
-int memnode_shift;
+unsigned int memnode_shift;
 
 /*
  * In case of numa init failure or numa off,
@@ -59,15 +59,16 @@ int srat_disabled(void)
 /*
  * Given a shift value, try to populate memnodemap[]
  * Returns :
- * 1 if OK
- * 0 if memnodmap[] too small (of shift too small)
- * -1 if node overlap or lost ram (shift too big)
+ * 0 if OK
+ * -ENOSPC if memnodmap[] too small (or shift too small)
+ * -EINVAL if node overlap or lost ram (shift too big)
  */
 static int __init populate_memnodemap(const struct node *nodes,
-  int numnodes, int shift, nodeid_t *nodeids)
+  unsigned int numnodes, unsigned int shift,
+  nodeid_t *nodeids)
 {
 unsigned long spdx, epdx;
-int i, res = -1;
+int i, res = -EINVAL;
 
 memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
 for ( i = 0; i < numnodes; i++ )
@@ -77,10 +78,10 @@ static int __init populate_memnodemap(const struct node *nodes,
 if ( spdx >= epdx )
 continue;
 if ( (epdx >> shift) >= memnodemapsize )
-return 0;
+return -ENOSPC;
 do {
 if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-return -1;
+return -EINVAL;
 
 if ( !nodeids )
 memnodemap[spdx >> shift] = i;
@@ -89,7 +90,7 @@ static int __init populate_memnodemap(const struct node *nodes,
 
 spdx += (1UL << shift);
 } while ( spdx < epdx );
-res = 1;
+res = 0;
 }
 
 return res;
@@ -105,7 +106,7 @@ static int __init allocate_cachealigned_memnodemap(void)
 printk(KERN_ERR
"NUMA: Unable to allocate Memory to Node hash map\n");
 memnodemapsize = 0;
-return -1;
+return -ENOMEM;
 }
 
 memnodemap = mfn_to_virt(mfn);
@@ -122,10 +123,10 @@ static int __init allocate_cachealigned_memnodemap(void)
  * The LSB of all start and end addresses in the node map is the value of the
  * maximum possible shift.
  */
-static int __init extract_lsb_from_nodes(const struct node *nodes,
- int numnodes)
+static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
+  unsigned int numnodes)
 {
-int i, nodes_used = 0;
+unsigned int i, nodes_used = 0;
 unsigned long spdx, epdx;
 unsigned long bitfield = 0, memtop = 0;
 
@@ -149,27 +150,30 @@ static int __init extract_lsb_from_nodes(const struct node *nodes,
 return i;
 }
 
-int __init compute_hash_shift(struct node *nodes, int numnodes,
-  nodeid_t *nodeids)
+int __init compute_memnode_shift(struct node *nodes, unsigned int numnodes,
+ nodeid_t *nodeids)
 {
-int shift;
+int ret;
+
+memnode_shift = extract_lsb_from_nodes(nodes, numnodes);
 
-shift = extract_lsb_from_nodes(nodes, numnodes);
 if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
 memnodemap = _memnodemap;
 else if ( allocate_cachealigned_memnodemap() )
-return -1;
-printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
+return -ENOMEM;
 
-if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
+printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", memnode_shift);
+
+ret = populate_memnodemap(nodes, numnodes, memnode_shift, nodeids);
+if ( ret )
 {
 printk(KERN_INFO "Your memory is not aligned you need to "
"rebuild your hypervisor with a bigger NODEMAPSIZE "
-   "shift=%d\n", shift);
-return -1;
+   "shift=%u\n", memnode_shift);
+return ret;
 }
 
-return shift;
+return 

[Xen-devel] [RFC PATCH v3 05/24] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Add accessors for nodes[] and other static variables and
use those accessors. These variables are accessed
outside the file once the code is made generic in later
patches. The coding style is not changed.
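
The accessor pattern can be sketched standalone as below. The array
sizes and the plain integer types are assumptions of this sketch, and
add_memblk() bounds the number of blocks rather than the node id, which
differs from numa_add_memblk() in this patch.

```c
#include <errno.h>
#include <stdint.h>

#define NR_NODE_MEMBLKS 16   /* arbitrary for this sketch */

struct node {
    uint64_t start, end;
};

/* File-local state, reachable from other files only through accessors. */
static struct node node_memblk_range[NR_NODE_MEMBLKS];
static uint8_t memblk_nodeid[NR_NODE_MEMBLKS];
static unsigned int num_node_memblks;

static struct node *get_node_memblk_range(unsigned int memblk)
{
    return &node_memblk_range[memblk];
}

static uint8_t get_memblk_nodeid(unsigned int memblk)
{
    return memblk_nodeid[memblk];
}

static unsigned int get_num_node_memblks(void)
{
    return num_node_memblks;
}

/* Sketch only: bounds the block count, unlike the patch's node id check. */
static int add_memblk(uint8_t nodeid, uint64_t start, uint64_t size)
{
    if ( num_node_memblks >= NR_NODE_MEMBLKS )
        return -ENOSPC;

    node_memblk_range[num_node_memblks].start = start;
    node_memblk_range[num_node_memblks].end = start + size;
    memblk_nodeid[num_node_memblks] = nodeid;
    num_node_memblks++;

    return 0;
}

/* Returns the index of the first overlapping block, or -1 if none. */
static int conflicting_memblks(uint64_t start, uint64_t end)
{
    unsigned int i;

    for ( i = 0; i < get_num_node_memblks(); i++ )
    {
        const struct node *nd = get_node_memblk_range(i);

        if ( nd->start == nd->end )
            continue;
        if ( nd->end > start && nd->start < end )
            return (int)i;
    }

    return -1;
}
```

Once callers go through the accessors, the arrays can move to a common
file without touching the call sites again.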

Signed-off-by: Vijaya Kumar K 
---
v3: - Changed accessors parameter from int to unsigned int
- Updated commit message
- Fixed wrong indentation
---
 xen/arch/x86/srat.c | 106 +++-
 1 file changed, 81 insertions(+), 25 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 535c9d7..42cca5a 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -41,6 +41,44 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
 static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
+static struct node *get_numa_node(unsigned int id)
+{
+   return &nodes[id];
+}
+
+static nodeid_t get_memblk_nodeid(unsigned int id)
+{
+   return memblk_nodeid[id];
+}
+
+static nodeid_t *get_memblk_nodeid_map(void)
+{
+   return &memblk_nodeid[0];
+}
+
+static struct node *get_node_memblk_range(unsigned int memblk)
+{
+   return &node_memblk_range[memblk];
+}
+
+static int get_num_node_memblks(void)
+{
+   return num_node_memblks;
+}
+
+static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
+{
+   if (nodeid >= NR_NODE_MEMBLKS)
+   return -EINVAL;
+
+   node_memblk_range[num_node_memblks].start = start;
+   node_memblk_range[num_node_memblks].end = start + size;
+   memblk_nodeid[num_node_memblks] = nodeid;
+   num_node_memblks++;
+
+   return 0;
+}
+
 static inline bool node_found(unsigned int idx, unsigned int pxm)
 {
return ((pxm2node[idx].pxm == pxm) &&
@@ -107,11 +145,11 @@ int valid_numa_range(paddr_t start, paddr_t end, nodeid_t 
node)
 {
int i;
 
-   for (i = 0; i < num_node_memblks; i++) {
-   struct node *nd = &node_memblk_range[i];
+   for (i = 0; i < get_num_node_memblks(); i++) {
+   struct node *nd = get_node_memblk_range(i);
 
if (nd->start <= start && nd->end > end &&
-   memblk_nodeid[i] == node )
+   get_memblk_nodeid(i) == node)
return 1;
}
 
@@ -122,8 +160,8 @@ static int __init conflicting_memblks(paddr_t start, 
paddr_t end)
 {
int i;
 
-   for (i = 0; i < num_node_memblks; i++) {
-   struct node *nd = &node_memblk_range[i];
+   for (i = 0; i < get_num_node_memblks(); i++) {
+   struct node *nd = get_node_memblk_range(i);
if (nd->start == nd->end)
continue;
if (nd->end > start && nd->start < end)
@@ -136,7 +174,8 @@ static int __init conflicting_memblks(paddr_t start, 
paddr_t end)
 
 static void __init cutoff_node(nodeid_t i, paddr_t start, paddr_t end)
 {
-   struct node *nd = &nodes[i];
+   struct node *nd = get_numa_node(i);
+
if (nd->start < start) {
nd->start = start;
if (nd->end < nd->start)
@@ -278,6 +317,7 @@ acpi_numa_memory_affinity_init(const struct 
acpi_srat_mem_affinity *ma)
unsigned int pxm;
nodeid_t node;
int i;
+   struct node *memblk;
 
if (srat_disabled())
return;
@@ -288,7 +328,7 @@ acpi_numa_memory_affinity_init(const struct 
acpi_srat_mem_affinity *ma)
if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
return;
 
-   if (num_node_memblks >= NR_NODE_MEMBLKS)
+   if (get_num_node_memblks() >= NR_NODE_MEMBLKS)
{
dprintk(XENLOG_WARNING,
 "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
@@ -310,27 +350,31 @@ acpi_numa_memory_affinity_init(const struct 
acpi_srat_mem_affinity *ma)
i = conflicting_memblks(start, end);
if (i < 0)
/* everything fine */;
-   else if (memblk_nodeid[i] == node) {
+   else if (get_memblk_nodeid(i) == node) {
bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
!test_bit(i, memblk_hotplug);
 
+   memblk = get_node_memblk_range(i);
+
printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with 
itself (%"PRIx64"-%"PRIx64")\n",
   mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
-  node_memblk_range[i].start, node_memblk_range[i].end);
+  memblk->start, memblk->end);
if (mismatch) {
bad_srat();
return;
}
} else {
+   memblk = get_node_memblk_range(i);
+
printk(KERN_ERR
   "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u 
(%"PRIx64"-%"PRIx64")\n",
-  pxm, start, 

[Xen-devel] [RFC PATCH v3 08/24] NUMA: x86: Move numa code and make it generic

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Move code from xen/arch/x86/numa.c to xen/common/numa.c
so that it can be used by other archs.

The following changes are done:
- A few generic static functions in x86/numa.c are made
  non-static in common/numa.c.
- The generic contents of header file asm-x86/numa.h
  are moved to xen/numa.h.
- The header file includes are reordered and externs are
  dropped.
- Moved acpi_numa from asm-x86/acpi.h to xen/acpi.h
- Coding style of code moved to common/numa.c is changed
  to Xen style.
- numa_add_cpu() and numa_set_node() are moved to the header
  file, with inline functions added for the case where CONFIG_NUMA
  is not enabled, because these functions are called from
  generic code without any config check.

Also, node_online_map is defined in x86/numa.c for x86
and arm/smpboot.c for ARM. For x86 it is moved to x86/smpboot.c.
If it were moved to common code, compilation would fail because
common/numa.c is compiled only when NUMA is enabled.

Signed-off-by: Vijaya Kumar K 
---
v3: - Moved acpi_numa variable
- acpi_setup_node declaration move is reverted.
- Dropped extern in header file
- Added inline declaration for numa_add_cpu() and
  numa_set_node() function based on CONFIG_NUMA
- Moved numa_initmem_init() to common code
- Moved common code from asm-x86/numa.h to xen/numa.h
- Moved node_online_map from numa.c to smpboot.c
---
 xen/arch/x86/numa.c | 459 +
 xen/arch/x86/smpboot.c  |   1 +
 xen/common/Makefile |   1 +
 xen/common/numa.c   | 487 
 xen/include/asm-x86/acpi.h  |   1 -
 xen/include/asm-x86/numa.h  |  56 -
 xen/include/asm-x86/setup.h |   1 -
 xen/include/xen/numa.h  |  64 ++
 8 files changed, 561 insertions(+), 509 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 44c2e08..654530b 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -10,323 +10,17 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
-#include 
-
-static int numa_setup(char *s);
-custom_param("numa", numa_setup);
-
-struct node_data node_data[MAX_NUMNODES];
-
-/* Mapping from pdx to node id */
-unsigned int memnode_shift;
 
 /*
- * In case of numa init failure or numa off,
- * memnode_shift is initialized to BITS_PER_LONG - 1. Hence allocate
- * memnodemap[] of BITS_PER_LONG.
- */
-static typeof(*memnodemap) _memnodemap[BITS_PER_LONG];
-unsigned long memnodemapsize;
-uint8_t *memnodemap;
-
-nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
-[0 ... NR_CPUS-1] = NUMA_NO_NODE
-};
-/*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
  */
 nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
 [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
-cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
-
-nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
-
-bool numa_off;
-s8 acpi_numa = 0;
-
-int srat_disabled(void)
-{
-return numa_off || acpi_numa < 0;
-}
-
-/*
- * Given a shift value, try to populate memnodemap[]
- * Returns :
- * 0 if OK
- * -ENOSPC if memnodmap[] too small (or shift too small)
- * -EINVAL if node overlap or lost ram (shift too big)
- */
-static int __init populate_memnodemap(const struct node *nodes,
-  unsigned int numnodes, unsigned int 
shift,
-  nodeid_t *nodeids)
-{
-unsigned long spdx, epdx;
-int i, res = -EINVAL;
-
-memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
-for ( i = 0; i < numnodes; i++ )
-{
-spdx = paddr_to_pdx(nodes[i].start);
-epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-if ( spdx >= epdx )
-continue;
-if ( (epdx >> shift) >= memnodemapsize )
-return -ENOSPC;
-do {
-if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-return -EINVAL;
-
-if ( !nodeids )
-memnodemap[spdx >> shift] = i;
-else
-memnodemap[spdx >> shift] = nodeids[i];
-
-spdx += (1UL << shift);
-} while ( spdx < epdx );
-res = 0;
-}
-
-return res;
-}
-
-static int __init allocate_cachealigned_memnodemap(void)
-{
-unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
-unsigned long mfn = alloc_boot_pages(size, 1);
-
-if ( !mfn )
-{
-printk(KERN_ERR
-   "NUMA: Unable to allocate Memory to Node hash map\n");
-memnodemapsize = 0;
-return -ENOMEM;
-}
-
-memnodemap = mfn_to_virt(mfn);
-mfn <<= PAGE_SHIFT;
-size <<= PAGE_SHIFT;
-printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
-   mfn, mfn + size);
-memnodemapsize = size / sizeof(*memnodemap);
-
-return 0;
-}
-
-/*
- * The LSB of all start and end addresses in the node map is the 

[Xen-devel] [RFC PATCH v3 02/24] x86: NUMA: Clean up: Fix coding styles and drop unused code

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Fix coding style, trailing spaces, tabs in NUMA code.
Also drop unused macros and functions.
There is no functional change.

Signed-off-by: Vijaya Kumar K 
Reviewed-by: Wei Liu 
---
v3: - Change commit message
- Changed VIRTUAL_BUG_ON to ASSERT
- Dropped useless inner parentheses for some macros
---
 xen/arch/x86/numa.c| 55 +
 xen/arch/x86/srat.c|  2 +-
 xen/include/asm-x86/numa.h | 56 +++---
 xen/include/xen/numa.h |  3 ---
 4 files changed, 54 insertions(+), 62 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index d45196fa..444d7ad 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -1,8 +1,8 @@
-/* 
+/*
  * Generic VM initialization for x86-64 NUMA setups.
  * Copyright 2002,2003 Andi Kleen, SuSE Labs.
  * Adapted for Xen: Ryan Harper 
- */ 
+ */
 
 #include 
 #include 
@@ -21,13 +21,6 @@
 static int numa_setup(char *s);
 custom_param("numa", numa_setup);
 
-#ifndef Dprintk
-#define Dprintk(x...)
-#endif
-
-/* from proto.h */
-#define round_up(x,y) ((((x)+(y))-1) & (~((y)-1)))
-
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
@@ -144,8 +137,9 @@ static int __init extract_lsb_from_nodes(const struct node 
*nodes,
 if ( nodes_used <= 1 )
 i = BITS_PER_LONG - 1;
 else
-i = find_first_bit(&bitfield, sizeof(unsigned long)*8);
+i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
 memnodemapsize = (memtop >> i) + 1;
+
 return i;
 }
 
@@ -173,7 +167,7 @@ int __init compute_hash_shift(struct node *nodes, int 
numnodes,
 }
 /* initialize NODE_DATA given nodeid and start/end */
 void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
-{ 
+{
 unsigned long start_pfn, end_pfn;
 
 start_pfn = start >> PAGE_SHIFT;
@@ -183,7 +177,7 @@ void __init setup_node_bootmem(nodeid_t nodeid, u64 start, 
u64 end)
 NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
 
 node_set_online(nodeid);
-} 
+}
 
 void __init numa_init_array(void)
 {
@@ -214,7 +208,7 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 {
 int i;
 struct node nodes[MAX_NUMNODES];
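-u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
+u64 sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
 
 /* Kludge needed for the hash function */
 if ( hweight64(sz) > 1 )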
@@ -222,21 +216,22 @@ static int __init numa_emulation(u64 start_pfn, u64 
end_pfn)
 u64 x = 1;
 while ( (x << 1) < sz )
 x <<= 1;
-if ( x < sz/2 )
-printk(KERN_ERR "Numa emulation unbalanced. Complain to 
maintainer\n");
+if ( x < sz / 2 )
+printk(KERN_ERR
+   "Numa emulation unbalanced. Complain to maintainer\n");
 sz = x;
 }
 
 memset(&nodes,0,sizeof(nodes));
 for ( i = 0; i < numa_fake; i++ )
 {
-nodes[i].start = (start_pfn<> 20);
 node_set_online(i);
 }
@@ -256,7 +251,7 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 #endif
 
 void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
-{ 
+{
 int i;
 
 #ifdef CONFIG_NUMA_EMU
@@ -291,7 +286,7 @@ void __init numa_initmem_init(unsigned long start_pfn, 
unsigned long end_pfn)
 void numa_add_cpu(int cpu)
 {
 cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-} 
+}
 
 void numa_set_node(int cpu, nodeid_t node)
 {
@@ -299,23 +294,23 @@ void numa_set_node(int cpu, nodeid_t node)
 }
 
 /* [numa=off] */
-static __init int numa_setup(char *opt) 
-{ 
-if ( !strncmp(opt,"off",3) )
+static __init int numa_setup(char *opt)
+{
+if ( !strncmp(opt, "off", 3) )
 numa_off = true;
-if ( !strncmp(opt,"on",2) )
+if ( !strncmp(opt, "on", 2) )
 numa_off = false;
 #ifdef CONFIG_NUMA_EMU
 if ( !strncmp(opt, "fake=", 5) )
 {
 numa_off = false;
-numa_fake = simple_strtoul(opt+5,NULL,0);
+numa_fake = simple_strtoul(opt + 5, NULL, 0);
 if ( numa_fake >= MAX_NUMNODES )
 numa_fake = MAX_NUMNODES;
 }
 #endif
 #ifdef CONFIG_ACPI_NUMA
-if ( !strncmp(opt,"noacpi",6) )
+if ( !strncmp(opt,"noacpi", 6) )
 {
 

[Xen-devel] [RFC PATCH v3 06/24] x86: NUMA: Rename some generic functions

2017-07-18 Thread vijay . kilari
From: Vijaya Kumar K 

Rename some functions in the ACPI code as follows:
 - Rename setup_node to acpi_setup_node
 - Rename bad_srat to numa_failed
 - Rename nodes_cover_memory to arch_sanitize_nodes_memory
   and changed return type to bool
 - Rename acpi_scan_nodes to numa_scan_nodes

Also introduce reset_pxm2node() to reset pxm2node variable.
This avoids exporting pxm2node.

Signed-off-by: Vijaya Kumar K 
---
 v3: Changed return type of arch_sanitize_nodes_memory
---
 xen/arch/x86/numa.c|  2 +-
 xen/arch/x86/smpboot.c |  2 +-
 xen/arch/x86/srat.c| 55 ++
 xen/arch/x86/x86_64/mm.c   |  2 +-
 xen/include/asm-x86/acpi.h |  2 +-
 xen/include/asm-x86/numa.h |  2 +-
 6 files changed, 36 insertions(+), 29 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 2ea2ec0..44c2e08 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -271,7 +271,7 @@ void __init numa_initmem_init(unsigned long start_pfn, 
unsigned long end_pfn)
 
 #ifdef CONFIG_ACPI_NUMA
 if ( !numa_off &&
- !acpi_scan_nodes(pfn_to_paddr(start_pfn), pfn_to_paddr(end_pfn)) )
+ !numa_scan_nodes(pfn_to_paddr(start_pfn), pfn_to_paddr(end_pfn)) )
 return;
 #endif
 
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 8d91f6c..78af0d2 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -957,7 +957,7 @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t 
pxm)
 
 if ( !srat_disabled() )
 {
-nodeid_t node = setup_node(pxm);
+nodeid_t node = acpi_setup_node(pxm);
 
 if ( node == NUMA_NO_NODE )
 {
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 42cca5a..03bc37d 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -85,6 +85,14 @@ static inline bool node_found(unsigned int idx, unsigned int 
pxm)
(pxm2node[idx].node != NUMA_NO_NODE));
 }
 
+static void reset_pxm2node(void)
+{
+   unsigned int i;
+
+   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
+   pxm2node[i].node = NUMA_NO_NODE;
+}
+
 nodeid_t pxm_to_node(unsigned int pxm)
 {
unsigned int i;
@@ -99,7 +107,7 @@ nodeid_t pxm_to_node(unsigned int pxm)
return NUMA_NO_NODE;
 }
 
-nodeid_t setup_node(unsigned int pxm)
+nodeid_t acpi_setup_node(unsigned int pxm)
 {
nodeid_t node;
unsigned int idx;
@@ -188,15 +196,14 @@ static void __init cutoff_node(nodeid_t i, paddr_t start, 
paddr_t end)
}
 }
 
-static void __init bad_srat(void)
+static void __init numa_failed(void)
 {
int i;
printk(KERN_ERR "SRAT: SRAT not used.\n");
acpi_numa = -1;
for (i = 0; i < MAX_LOCAL_APIC; i++)
apicid_to_node[i] = NUMA_NO_NODE;
-   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-   pxm2node[i].node = NUMA_NO_NODE;
+   reset_pxm2node();
mem_hotplug = 0;
 }
 
@@ -252,7 +259,7 @@ acpi_numa_x2apic_affinity_init(const struct 
acpi_srat_x2apic_cpu_affinity *pa)
if (srat_disabled())
return;
if (pa->header.length < sizeof(struct acpi_srat_x2apic_cpu_affinity)) {
-   bad_srat();
+   numa_failed();
return;
}
if (!(pa->flags & ACPI_SRAT_CPU_ENABLED))
@@ -263,9 +270,9 @@ acpi_numa_x2apic_affinity_init(const struct 
acpi_srat_x2apic_cpu_affinity *pa)
}
 
pxm = pa->proximity_domain;
-   node = setup_node(pxm);
+   node = acpi_setup_node(pxm);
if (node == NUMA_NO_NODE) {
-   bad_srat();
+   numa_failed();
return;
}
 
@@ -286,7 +293,7 @@ acpi_numa_processor_affinity_init(const struct 
acpi_srat_cpu_affinity *pa)
if (srat_disabled())
return;
if (pa->header.length != sizeof(struct acpi_srat_cpu_affinity)) {
-   bad_srat();
+   numa_failed();
return;
}
if (!(pa->flags & ACPI_SRAT_CPU_ENABLED))
@@ -297,9 +304,9 @@ acpi_numa_processor_affinity_init(const struct 
acpi_srat_cpu_affinity *pa)
pxm |= pa->proximity_domain_hi[1] << 16;
pxm |= pa->proximity_domain_hi[2] << 24;
}
-   node = setup_node(pxm);
+   node = acpi_setup_node(pxm);
if (node == NUMA_NO_NODE) {
-   bad_srat();
+   numa_failed();
return;
}
apicid_to_node[pa->apic_id] = node;
@@ -322,7 +329,7 @@ acpi_numa_memory_affinity_init(const struct 
acpi_srat_mem_affinity *ma)
if (srat_disabled())
return;
if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
-   bad_srat();
+   numa_failed();
return;
}
if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
@@ -332,7 +339,7 @@ acpi_numa_memory_affinity_init(const struct 
acpi_srat_mem_affinity *ma)
{
   

Re: [Xen-devel] [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions

2017-07-11 Thread Vijay Kilari
Hi Jan,

 Sorry for the late reply.

On Fri, Jun 30, 2017 at 7:35 PM, Jan Beulich  wrote:
  03/28/17 5:54 PM >>>
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -53,15 +53,15 @@ int srat_disabled(void)
>>  /*
>>   * Given a shift value, try to populate memnodemap[]
>>   * Returns :
>> - * 1 if OK
>> - * 0 if memnodmap[] too small (of shift too small)
>> - * -1 if node overlap or lost ram (shift too big)
>> + * 0 if OK
>> + * -EINVAL if memnodmap[] too small (of shift too small)
>> + * OR if node overlap or lost ram (shift too big)
>
> It may not matter too much, but you're making things actually worse to
> the caller, as it now can't distinguish the two failure modes anymore.
> Also, if you already touch it, please also correct the apparent typo
> ("of" quite likely meant to be "or"). But what I consider most problematic
> is that you convert ...

OK. I propose to return different error values for the two failure modes:
-ENOMEM if memnodemap[] is too small, and
-EINVAL if nodes overlap or RAM is lost.

But in any case it does not matter much, and I can drop this change.
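
A minimal sketch of the two distinct error returns proposed above, assuming
a simplified map indexed by plain page numbers rather than pdx values. The
names populate_map, struct range and NO_NODE are illustrative stand-ins and
not the Xen code:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define NO_NODE 0xFF    /* hypothetical stand-in for NUMA_NO_NODE */

/* Hypothetical memory range, [start, end) in page units. */
struct range {
    unsigned long start, end;
};

/*
 * Fill map[] so that map[page >> shift] names the node owning that page.
 * Returns 0 on success, -ENOMEM if map[] is too small for the given shift,
 * and -EINVAL if two ranges share a slot or no range is usable.
 */
static int populate_map(unsigned char *map, size_t map_size,
                        const struct range *r, unsigned int n,
                        unsigned int shift)
{
    int res = -EINVAL;
    unsigned int i;

    memset(map, NO_NODE, map_size);
    for ( i = 0; i < n; i++ )
    {
        unsigned long s = r[i].start, e = r[i].end;

        if ( s >= e )
            continue;                   /* empty range, skip it */
        if ( ((e - 1) >> shift) >= map_size )
            return -ENOMEM;             /* map too small (or shift too small) */
        do {
            if ( map[s >> shift] != NO_NODE )
                return -EINVAL;         /* slot already owned: overlap */
            map[s >> shift] = (unsigned char)i;
            s += 1UL << shift;
        } while ( s < e );
        res = 0;
    }

    return res;
}
```

With separate codes, a caller can tell "retry with a bigger map" apart from
"the node layout itself is bad", which is the distinction the review asks to
preserve.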

> ... what is an error case so far to a success one.
>
>> @@ -116,10 +116,10 @@ static int __init 
>> allocate_cachealigned_memnodemap(void)
>>   * The LSB of all start and end addresses in the node map is the value of 
>> the
>>   * maximum possible shift.
>>   */
>> -static int __init extract_lsb_from_nodes(const struct node *nodes,
>> - int numnodes)
>> +static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
>> +  int numnodes)
>
> Why would you convert the return type to unsigned, but not also that of the
> bogusly signed parameter?

The return type is changed because the type of memnode_shift is changed
from int to unsigned int.

I will change the int parameter to unsigned int.
Apart from that, I see that the variable 'i' in extract_lsb_from_nodes() is int.
This needs to be changed to unsigned int as well.

>
>> @@ -143,27 +143,27 @@ static int __init extract_lsb_from_nodes(const struct 
>> node *nodes,
>>  return i;
>>  }
>>
>> -int __init compute_hash_shift(struct node *nodes, int numnodes,
>> -  nodeid_t *nodeids)
>> +int __init compute_memnode_shift(struct node *nodes, int numnodes,
>> + nodeid_t *nodeids, unsigned int *shift)
>
> I'm not in favor of returning the shift count via pointer when it can easily
> be returned by value.

OK.

>
>>  {
>> -int shift;
>> +*shift = extract_lsb_from_nodes(nodes, numnodes);
>>
>> -shift = extract_lsb_from_nodes(nodes, numnodes);
>>  if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
>>  memnodemap = _memnodemap;
>>  else if ( allocate_cachealigned_memnodemap() )
>> -return -1;
>> -printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
>> +return -ENOMEM;
>> +
>> +printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", *shift);
>>
>> -if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
>> +if ( populate_memnodemap(nodes, numnodes, *shift, nodeids) )
>>  {
>>  printk(KERN_INFO "Your memory is not aligned you need to "
>> "rebuild your hypervisor with a bigger NODEMAPSIZE "
>> -   "shift=%d\n", shift);
>> -return -1;
>> +   "shift=%u\n", *shift);
>> +return -EINVAL;
>
> So you make populate_memnodemap() return proper error values, but then discard
> it and uniformly use -EINVAL here. If you mean the function to simply return a
> success/failure indicator, make it return bool. Otherwise use the error value
> it return (even if right now it can only ever be -EINVAL).

OK. I will drop this change and keep compute_hash_shift() return -1 or
shift value.

Regards
Vijay

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig

2017-06-15 Thread Vijay Kilari
On Wed, May 31, 2017 at 4:07 PM, Jan Beulich  wrote:
 On 31.05.17 at 12:18,  wrote:
>> On 31/05/17 11:04, Jan Beulich wrote:
>> On 28.03.17 at 17:53,  wrote:
>>>> --- a/xen/common/Kconfig
>>>> +++ b/xen/common/Kconfig
>>>> @@ -41,6 +41,10 @@ config HAS_GDBSX
>>>>  config HAS_IOPORTS
>>>> bool
>>>>
>>>> +config NUMA
>>>> +   def_bool y
>>>> +   depends on HAS_PDX
>>>
>>> What makes necessary this dependency?
>>
>> IIRC, this is because the numa code is using PDX helpers.
>
> Well, these helpers should have 1:1 translation equivalents for
> the non-PDX case; I don't see the need for the dependency.

PDX is necessary. Without it, Xen fails to compile for ARM.
IMO, there is no equivalent non-PDX support available.

As PDX is a mandatory config, I propose to remove this dependency from
the NUMA config. OK?

>
>>>> --- a/xen/drivers/acpi/Kconfig
>>>> +++ b/xen/drivers/acpi/Kconfig
>>>> @@ -4,6 +4,3 @@ config ACPI
>>>>
>>>>  config ACPI_LEGACY_TABLES_LOOKUP
>>>> bool
>>>> -
>>>> -config NUMA
>>>> -   bool
>>>
>>> This makes clear that so far this is an option which architectures
>>> are expected to select. I think we want it to remain that way, but
>>> if we didn't you should remove the existing select(s).
>>>
>>> Also, does it really matter much whether this is under drivers/acpi/
>>> or common/? After all ACPI appears to be a prereq on ARM too.
>>
>> ACPI is not a prereq for NUMA. You can use it with Device Tree too.
>
> Oh, okay. That should be said in the commit message then.
>
> Jan
>



Re: [Xen-devel] [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c

2017-05-10 Thread Vijay Kilari
 and

On Mon, May 8, 2017 at 10:36 PM, Julien Grall  wrote:
> Hi Vijay,
>
>
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Move code from xen/arch/x86/srat.c to xen/common/numa.c
>> so that it can be used by other archs.
>> Few generic static functions in x86/srat.c are made
>> non-static common/numa.c
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/arch/x86/srat.c| 152
>> ++---
>>  xen/common/numa.c  | 146
>> +++
>>  xen/include/asm-x86/acpi.h |   3 -
>>  xen/include/asm-x86/numa.h |   2 -
>>  xen/include/xen/numa.h |  14 +
>>  5 files changed, 164 insertions(+), 153 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index 2cc87a3..55947bb 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -23,9 +23,8 @@
>>
>>  static struct acpi_table_slit *__read_mostly acpi_slit;
>>
>> -static nodemask_t __initdata memory_nodes_parsed;
>> -static nodemask_t __initdata processor_nodes_parsed;
>> -static struct node __initdata nodes[MAX_NUMNODES];
>> +extern nodemask_t processor_nodes_parsed;
>> +extern nodemask_t memory_nodes_parsed;
>
>
> On v1, Jan clearly NAK to changes like this. Declarations belong in header
> files. It is a different variable compare to v1, but I would have expected
> you to apply what he said everywhere...

OK, I will move these to header files.

One more change that I made is to move these from static to global,
because creating accessor functions around these nodemask_t variables is
tricky: the macros (defined in nodemask.h) do not take pointer parameters.

I will add a comment.
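
A rough sketch of the difficulty described above, assuming nodemask.h follows
the Linux-derived pattern in which the macros take the mask object by name and
take its address internally. The toy nodemask_t, the *_ptr helpers and
get_processor_nodes_parsed() are hypothetical and only illustrate the shape:

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_NODES      64
#define BITS_PER_WORD  (sizeof(unsigned long) * CHAR_BIT)
#define NODE_WORDS     ((MAX_NODES + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Toy bitmap type standing in for the real nodemask_t. */
typedef struct {
    unsigned long bits[NODE_WORDS];
} nodemask_t;

static void node_set_ptr(unsigned int node, nodemask_t *dst)
{
    dst->bits[node / BITS_PER_WORD] |= 1UL << (node % BITS_PER_WORD);
}

static bool node_isset_ptr(unsigned int node, const nodemask_t *src)
{
    return (src->bits[node / BITS_PER_WORD] >> (node % BITS_PER_WORD)) & 1UL;
}

/* The macros expect an lvalue, not a pointer: they apply '&' themselves. */
#define node_set(node, dst)    node_set_ptr((node), &(dst))
#define node_isset(node, src)  node_isset_ptr((node), &(src))

static nodemask_t processor_nodes_parsed;

/* Hypothetical accessor: returns a pointer, so callers must dereference. */
static nodemask_t *get_processor_nodes_parsed(void)
{
    return &processor_nodes_parsed;
}
```

Under these assumptions an accessor is still usable, but every call site has
to write node_set(nid, *get_processor_nodes_parsed()), which is the
awkwardness that motivated making the variables global instead.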

>
> [...]
>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> index 207ebd8..1789bba 100644
>> --- a/xen/common/numa.c
>> +++ b/xen/common/numa.c
>> @@ -32,6 +32,8 @@
>>  static int numa_setup(char *s);
>>  custom_param("numa", numa_setup);
>>
>> +nodemask_t __initdata memory_nodes_parsed;
>> +nodemask_t __initdata processor_nodes_parsed;
>>  struct node_data node_data[MAX_NUMNODES];
>>
>>  /* Mapping from pdx to node id */
>> @@ -47,6 +49,10 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>>
>>  static bool numa_off = 0;
>>  static bool acpi_numa = 1;
>> +static int num_node_memblks;
>> +static struct node node_memblk_range[NR_NODE_MEMBLKS];
>> +static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>> +static struct node __initdata nodes[MAX_NUMNODES];
>
>
> It would make sense to keep those variables together with
> {memory,processor}_nodes_parsed.

ok
>
> [...]
>
>> +int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
>> +{
>> +int i;
>> +
>> +for (i = 0; i < get_num_node_memblks(); i++) {
>
>
> common/numa.c is using Xen coding style whilst arch/x86/srat.c is using
> Linux coding style.
>
> You decided to validly switch to soft tab, making quite difficult to check
> if this patch is only code movement. But you did not go far enough and fix
> the coding style of the code moved.
>
> Please do it properly and not half of it. For simplicity I would be OK that
> it is done in this patch. But this needs to be clearly written in the commit
> message.

I will mention in the commit message the coding style changes made in the
destination file compared to the source file.

>
>> +struct node *nd = get_node_memblk_range(i);
>> +
>> +if (nd->start <= start && nd->end > end &&
>> +get_memblk_nodeid(i) == node)
>> +return 1;
>> +}
>> +
>> +return 0;
>> +}
>
>
> [...]
>
>
>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>> index 421e8b7..7cff220 100644
>> --- a/xen/include/asm-x86/numa.h
>> +++ b/xen/include/asm-x86/numa.h
>> @@ -47,8 +47,6 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define node_end_pfn(nid)   (NODE_DATA(nid)->node_start_pfn + \
>>   NODE_DATA(nid)->node_spanned_pages)
>>
>> -extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
>> -
>>  void srat_parse_regions(uint64_t addr);
>>  extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
>>  unsigned int arch_get_dma_bitsize(void);
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index eed40af..ee53526 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -13,6 +13,7 @@
>>  #define NUMA_NO_DISTANCE 0xFF
>>
>>  #define MAX_NUMNODES(1 << NODES_SHIFT)
>> +#define NR_NODE_MEMBLKS (MAX_NUMNODES * 2)
>>
>>  struct node {
>>  paddr_t start;
>> @@ -28,6 +29,19 @@ extern nodeid_t acpi_setup_node(unsigned int pxm);
>>  extern void srat_detect_node(int cpu);
>>  extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t
>> end);
>>  extern void init_cpu_to_node(void);
>> +extern int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node);
>> +extern int conflicting_memblks(paddr_t start, 

Re: [Xen-devel] [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information

2017-05-09 Thread Vijay Kilari
On Mon, May 8, 2017 at 11:01 PM, Julien Grall  wrote:
> Hi Vijay,
>
> The title likely needs to have the word device-tree/DT in it.
>
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Parse CPU node and fetch numa-node-id information.
>> For each node-id found, update nodemask_t mask.
>> Refer to /Documentation/devicetree/bindings/numa.txt.
>
>
> In which repository?
>
>
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/arch/arm/Makefile   |  1 +
>>  xen/arch/arm/bootfdt.c  | 16 --
>>  xen/arch/arm/numa/Makefile  |  2 ++
>>  xen/arch/arm/numa/dt_numa.c | 78
>> +
>>  xen/arch/arm/numa/numa.c| 50 +
>>  xen/arch/arm/setup.c|  4 +++
>>  xen/include/asm-arm/numa.h  | 10 +-
>>  xen/include/asm-arm/setup.h |  4 ++-
>>  8 files changed, 161 insertions(+), 4 deletions(-)
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 0ce94a8..d13b79f 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
>>  subdir-y += platforms
>>  subdir-$(CONFIG_ARM_64) += efi
>>  subdir-$(CONFIG_ACPI) += acpi
>> +subdir-$(CONFIG_NUMA) += numa
>>
>>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
>>  obj-y += bootfdt.init.o
>> diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
>> index ea188a0..1f876f0 100644
>> --- a/xen/arch/arm/bootfdt.c
>> +++ b/xen/arch/arm/bootfdt.c
>> @@ -62,8 +62,20 @@ static void __init device_tree_get_reg(const __be32
>> **cell, u32 address_cells,
>>  *size = dt_next_cell(size_cells, cell);
>>  }
>>
>> -static u32 __init device_tree_get_u32(const void *fdt, int node,
>> -  const char *prop_name, u32 dflt)
>> +bool_t __init device_tree_type_matches(const void *fdt, int node,
>> +   const char *match)
>> +{
>> +const void *prop;
>> +
>> +prop = fdt_getprop(fdt, node, "device_type", NULL);
>> +if ( prop == NULL )
>> +return 0;
>> +
>> +return strcmp(prop, match) == 0 ? 1 : 0;
>> +}
>> +
>
>
> This change is not explained in the patch and does not belong to it anyway.

OK.
>
>> +u32 __init device_tree_get_u32(const void *fdt, int node,
>> +   const char *prop_name, u32 dflt)
>
>
> Ditto. I would recommend to read [1] for tips to break down a patch.
>
>
>>  {
>>  const struct fdt_property *prop;
>>
>> diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
>> new file mode 100644
>> index 000..3af3aff
>> --- /dev/null
>> +++ b/xen/arch/arm/numa/Makefile
>> @@ -0,0 +1,2 @@
>> +obj-y += dt_numa.o
>> +obj-y += numa.o
>> diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
>> new file mode 100644
>> index 000..66c6efb
>> --- /dev/null
>> +++ b/xen/arch/arm/numa/dt_numa.c
>> @@ -0,0 +1,78 @@
>> +/*
>> + * OF NUMA Parsing support.
>> + *
>> + * Copyright (C) 2015 - 2016 Cavium Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>
>
> This is already included by xen/mm.h
>
>> +#include 
>> +#include 
>> +#include 
>
>
> Please order the include.
>
>> +
>> +extern nodemask_t processor_nodes_parsed;
>
>
> See my comment on patch #11. I may have missed some of them and am hoping
> you will fix all the occurrences in the next version.
>
>
>> +
>> +/*
>> + * Even though we connect cpus to numa domains later in SMP
>> + * init, we need to know the node ids now for all cpus.
>> + */
>> +static int __init dt_numa_process_cpu_node(const void *fdt, int node,
>> +   const char *name,
>> +   uint32_t address_cells,
>> +   uint32_t size_cells)
>> +{
>> +uint32_t nid;
>> +
>> +nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
>> +
>> +if ( nid >= MAX_NUMNODES )
>> +printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n",
>> nid);
>> +else
>> +node_set(nid, processor_nodes_parsed);
>> +
>> +return 0;
>> +}
>> +
>> +static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
>> +const char *name, int depth,
>> + 

Re: [Xen-devel] [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic

2017-05-09 Thread Vijay Kilari
On Mon, May 8, 2017 at 10:21 PM, Julien Grall  wrote:
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> new file mode 100644
>> index 000..207ebd8
>> --- /dev/null
>> +++ b/xen/common/numa.c
>> @@ -0,0 +1,488 @@
>> +/*
>> + * Common NUMA handling functions for x86 and arm.
>> + * Original code extracted from arch/x86/numa.c
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +static int numa_setup(char *s);
>> +custom_param("numa", numa_setup);
>> +
>> +struct node_data node_data[MAX_NUMNODES];
>> +
>> +/* Mapping from pdx to node id */
>> +unsigned int memnode_shift;
>> +static typeof(*memnodemap) _memnodemap[64];
>
>
> Also, you move the hardcoded 64 here. But have you checked it is valid for
> ARM?
>
> Regardless that, this sounds like something that should be turned into a
> define and require a comment.

64 is good enough. This _memnodemap is used when NUMA has failed or is off,
in which case memnode_shift is 63 (BITS_PER_LONG - 1).

So every phys_to_nid() conversion will index within the limits of _memnodemap[].
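
The bound argued above can be checked with a small sketch. It assumes the
lookup index is simply pdx >> memnode_shift; memnode_index() and
FALLBACK_SLOTS are illustrative names, not identifiers from the patch:

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG   ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define FALLBACK_SLOTS  64  /* size of the static fallback map under discussion */

/* Index into the node map for a given pdx and shift. */
static size_t memnode_index(unsigned long pdx, unsigned int shift)
{
    return pdx >> shift;
}
```

With shift set to BITS_PER_LONG - 1, only the top bit of pdx survives the
shift, so the index is 0 or 1 for any input and a 64-entry array is never
overrun. This also suggests the fallback array is larger than strictly
required, which may be why the review asks for a named constant and a comment.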

>
> Cheers,
>
> --
> Julien Grall



Re: [Xen-devel] [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic

2017-05-09 Thread Vijay Kilari
On Mon, May 8, 2017 at 10:11 PM, Julien Grall  wrote:
> Hi Vijay,
>
>
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 3bdab9a..33c6806 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -10,286 +10,13 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>>  #include 
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>> -#include 
>> -
>> -static int numa_setup(char *s);
>> -custom_param("numa", numa_setup);
>> -
>> -struct node_data node_data[MAX_NUMNODES];
>> -
>> -/* Mapping from pdx to node id */
>> -unsigned int memnode_shift;
>> -static typeof(*memnodemap) _memnodemap[64];
>> -unsigned long memnodemapsize;
>> -uint8_t *memnodemap;
>> -
>> -nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
>> -[0 ... NR_CPUS-1] = NUMA_NO_NODE
>> -};
>> -/*
>> - * Keep BIOS's CPU2node information, should not be used for memory
>> allocaion
>> - */
>> -nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>> -[0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>> -};
>
>
> Why this is moved in this patch from here to x86/srat.c?

This is x86 specific. I will make a separate patch for this
move.

>
> [...]
>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index 7cf4771..2cc87a3 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -27,6 +27,13 @@ static nodemask_t __initdata memory_nodes_parsed;
>>  static nodemask_t __initdata processor_nodes_parsed;
>>  static struct node __initdata nodes[MAX_NUMNODES];
>>
>> +/*
>> + * Keep BIOS's CPU2node information, should not be used for memory
>> allocaion
>> + */
>> +nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
>> +[0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
>> +};
>> +
>
>
> This does not belong to this patch...
Ok
>
>>  struct pxm2node {
>> unsigned int pxm;
>> nodeid_t node;
>
>
> [...]
>
>
>> diff --git a/xen/common/numa.c b/xen/common/numa.c
>> new file mode 100644
>> index 000..207ebd8
>> --- /dev/null
>> +++ b/xen/common/numa.c
>> @@ -0,0 +1,488 @@
>> +/*
>> + * Common NUMA handling functions for x86 and arm.
>> + * Original code extracted from arch/x86/numa.c
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms and conditions of the GNU General Public
>> + * License, version 2, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see .
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>
>
> Whilst you are moving this in a newfile, please order the includes.

I understood that you don't want any code changes in a code movement
patch.

>
> [...]
>
>> +static unsigned int __init extract_lsb_from_nodes(const struct node
>> *nodes,
>> +  int numnodes)
>> +{
>> +unsigned int i, nodes_used = 0;
>> +unsigned long spdx, epdx;
>> +unsigned long bitfield = 0, memtop = 0;
>> +
>> +for ( i = 0; i < numnodes; i++ )
>> +{
>> +spdx = paddr_to_pdx(nodes[i].start);
>> +epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
>> +if ( spdx >= epdx )
>> +continue;
>> +bitfield |= spdx;
>> +nodes_used++;
>> +if ( epdx > memtop )
>> +memtop = epdx;
>> +}
>> +if ( nodes_used <= 1 )
>> +i = BITS_PER_LONG - 1;
>> +else
>> +i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
>> +
>
>
> It is interesting to see that newline was added in the process of moving the
> code.

OK.
>
>> +memnodemapsize = (memtop >> i) + 1;
>
>
> []
>
>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>> index 922fbd8..eed40af 100644
>> --- a/xen/include/xen/numa.h
>> +++ b/xen/include/xen/numa.h
>> @@ -14,6 +14,21 @@
>>
>>  #define MAX_NUMNODES    (1 << NODES_SHIFT)
>>
>> +struct node {
>> +paddr_t start;
>> +paddr_t end;
>> +};
>> +
>> +extern int compute_memnode_shift(struct node *nodes, int numnodes,
>> + nodeid_t *nodeids, unsigned int *shift);
>> +extern void numa_init_array(void);
>> +extern bool_t srat_disabled(void);
>> +extern void numa_set_node(int cpu, nodeid_t node);
>> +extern nodeid_t acpi_setup_node(unsigned int pxm);
>> +extern void srat_detect_node(int cpu);
>> +extern void setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t
>> end);
>> +extern void init_cpu_to_node(void);
>
>
> Can you please be consistent with this file and drop 

Re: [Xen-devel] [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA

2017-05-09 Thread Vijay Kilari
On Mon, May 8, 2017 at 9:28 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Right now CONFIG_NUMA is not enabled for ARM and
>> existing code in asm-arm/numa.h is for !CONFIG_NUMA.
>> Hence put this code under #ifndef CONFIG_NUMA.
>>
>> This help to make this changes work when CONFIG_NUMA
>> is not enabled.
>
>
> But you always turn NUMA on by default (see patch #24) and there is no
> possibility to turn off NUMA.

Yes, at the end of the series we enable NUMA by default.
But the intermediate patches of this patch series fail to compile.

>
>>
>> Also define NODES_SHIFT macro for ARM to value 2.
>> This limits number of NUMA nodes supported to 4.
>> There is not hard restrictions on this value set to 2.
>
>
> Again, why only 2 when x86 is supporting 6?
>
> Furthermore, this is not related to this patch itself and should be part of
> separate patch.
>
> Lastly, why don't you move that to a Kconfig allowing the user to configure
> the number of Nodes?

ok

>
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/include/asm-arm/numa.h | 5 +
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
>> index 53f99af..924bfc0 100644
>> --- a/xen/include/asm-arm/numa.h
>> +++ b/xen/include/asm-arm/numa.h
>> @@ -3,6 +3,10 @@
>>
>>  typedef uint8_t nodeid_t;
>>
>> +/* Limit number of NUMA nodes supported to 4 */
>> +#define NODES_SHIFT 2
>
>
> Why this is not covered by CONFIG_NUMA?

The below define is used in generic code irrespective of CONFIG_NUMA

#define MAX_NUMNODES    (1 << NODES_SHIFT)
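
To put numbers on this, a small sketch using the two shift values mentioned
in this thread (2 for ARM in this patch, 6 for x86). MAX_NUMNODES_FOR is a
hypothetical helper macro used only for illustration, not a Xen macro:

```c
/* Number of nodes representable for a given NODES_SHIFT value. */
#define MAX_NUMNODES_FOR(shift) (1U << (shift))

/* Values taken from the discussion above. */
enum {
    ARM_NODES_SHIFT = 2, /* proposed in this patch */
    X86_NODES_SHIFT = 6, /* value quoted for x86 */
};
```

So a shift of 2 limits the build to 4 nodes, while 6 allows 64.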

>
>> +
>> +#ifndef CONFIG_NUMA
>>  /* Fake one node for now. See also node_online_map. */
>>  #define cpu_to_node(cpu) 0
>>  #define node_to_cpumask(node)   (cpu_online_map)
>> @@ -16,6 +20,7 @@ static inline __attribute__((pure)) nodeid_t
>> phys_to_nid(paddr_t addr)
>>  #define node_spanned_pages(nid) (total_pages)
>>  #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
>>  #define __node_distance(a, b) (20)
>> +#endif /* CONFIG_NUMA */
>>
>>  static inline unsigned int arch_get_dma_bitsize(void)
>>  {
>>
>
> Cheers,
>
> --
> Julien Grall


Re: [Xen-devel] [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs

2017-05-09 Thread Vijay Kilari
On Mon, May 8, 2017 at 8:09 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Add accessor for nodes[] and other static variables and
>
>
> s/accessor/accessors/
>
>> used those accessors.
>
>
> Also, I am not sure to understand the usefulness of those accessors over a
> global variable.

These are static variables which need to be accessed from other files and
are later moved to a generic file.

>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/arch/x86/srat.c | 108
>> +++-
>>  1 file changed, 82 insertions(+), 26 deletions(-)
>>
>> diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
>> index ccacbcd..983e1d8 100644
>> --- a/xen/arch/x86/srat.c
>> +++ b/xen/arch/x86/srat.c
>> @@ -41,7 +41,45 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
>>  static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
>>  static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
>>
>> -static inline bool node_found(unsigned idx, unsigned pxm)
>> +static struct node *get_numa_node(int id)
>
>
> unsigned int.
OK
>
>> +{
>> +   return &nodes[id];
>> +}
>> +
>> +static nodeid_t get_memblk_nodeid(int id)
>
>
> unsigned int.
>
>> +{
>> +   return memblk_nodeid[id];
>> +}
>> +
>> +static nodeid_t *get_memblk_nodeid_map(void)
>> +{
>> +   return &memblk_nodeid[0];
>> +}
>> +
>> +static struct node *get_node_memblk_range(int memblk)
>
>
> unsigned int.
>
>> +{
>> +   return &node_memblk_range[memblk];
>> +}
>> +
>> +static int get_num_node_memblks(void)
>> +{
>> +   return num_node_memblks;
>> +}
>> +
>> +static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start,
>> uint64_t size)
>> +{
>> +   if (nodeid >= NR_NODE_MEMBLKS)
>> +   return -EINVAL;
>> +
>> +   node_memblk_range[num_node_memblks].start = start;
>> +   node_memblk_range[num_node_memblks].end = start + size;
>> +   memblk_nodeid[num_node_memblks] = nodeid;
>> +   num_node_memblks++;
>> +
>> +   return 0;
>> +}
>> +
>> +static inline bool node_found(unsigned int idx, unsigned int pxm)
>
>
> Please don't make unrelated change in the same patch. In this case I don't
> see why you switch from "unsigned" to "unsigned int".
>
>>  {
>> return ((pxm2node[idx].pxm == pxm) &&
>> (pxm2node[idx].node != NUMA_NO_NODE));
>> @@ -107,11 +145,11 @@ int valid_numa_range(paddr_t start, paddr_t end,
>> nodeid_t node)
>>  {
>> int i;
>>
>> -   for (i = 0; i < num_node_memblks; i++) {
>> -   struct node *nd = &node_memblk_range[i];
>> +   for (i = 0; i < get_num_node_memblks(); i++) {
>> +   struct node *nd = get_node_memblk_range(i);
>>
>> if (nd->start <= start && nd->end > end &&
>> -   memblk_nodeid[i] == node )
>> +   get_memblk_nodeid(i) == node)
>
>
> Why the indentation changed here?

OK. I will move these changes into other patches.

>
>
>> return 1;
>> }
>>
>> @@ -122,8 +160,8 @@ static int __init conflicting_memblks(paddr_t start,
>> paddr_t end)
>>  {
>> int i;
>>
>> -   for (i = 0; i < num_node_memblks; i++) {
>> -   struct node *nd = &node_memblk_range[i];
>> +   for (i = 0; i < get_num_node_memblks(); i++) {
>> +   struct node *nd = get_node_memblk_range(i);
>> if (nd->start == nd->end)
>> continue;
>> if (nd->end > start && nd->start < end)
>> @@ -136,7 +174,8 @@ static int __init conflicting_memblks(paddr_t start,
>> paddr_t end)
>>
>>  static void __init cutoff_node(int i, paddr_t start, paddr_t end)
>>  {
>> -   struct node *nd = &nodes[i];
>> +   struct node *nd = get_numa_node(i);
>> +
>> if (nd->start < start) {
>> nd->start = start;
>> if (nd->end < nd->start)
>> @@ -278,6 +317,7 @@ acpi_numa_memory_affinity_init(const struct
>> acpi_srat_mem_affinity *ma)
>> unsigned pxm;
>> nodeid_t node;
>> int i;
>> +   struct node *memblk;
>>
>> if (srat_disabled())
>> return;
>> @@ -288,7 +328,7 @@ acpi_numa_memory_affinity_init(const struct
>> acpi_srat_mem_affinity *ma)
>> if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
>> return;
>>
>> -   if (num_node_memblks >= NR_NODE_MEMBLKS)
>> +   if (get_num_node_memblks() >= NR_NODE_MEMBLKS)
>> {
>> dprintk(XENLOG_WARNING,
>>  "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
>> @@ -310,27 +350,31 @@ acpi_numa_memory_affinity_init(const struct
>> acpi_srat_mem_affinity *ma)
>> i = conflicting_memblks(start, end);
>> if (i < 0)
>> /* everything fine */;
>> -   else if (memblk_nodeid[i] == node) {
>> +   else if (get_memblk_nodeid(i) == node) {
>> bool 

Re: [Xen-devel] [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables

2017-05-02 Thread Vijay Kilari
On Tue, Apr 25, 2017 at 9:13 PM, Jan Beulich <jbeul...@suse.com> wrote:
>>>> On 25.04.17 at 17:14, <julien.gr...@arm.com> wrote:
>> On 25/04/17 15:54, Vijay Kilari wrote:
>>> On Tue, Apr 25, 2017 at 5:58 PM, Julien Grall <julien.gr...@arm.com> wrote:
>>>>>>>
>>>>>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>>>>>> below
>>>>>>> call has check srat_disabled() before proceeding fails.
>>>>>>
>>>>>>
>>>>>>
>>>>>> My understanding is on x86 acpi_numa is disabled by default and will be
>>>>>> enabled if they are able to parse the SRAT. So why are you changing the
>>>>>> behavior for x86?
>>>>>
>>>>>
>>>>> acpi_numa = 0 means it is enabled by default on x86.
>>>>
>>>>
>>>> In acpi_scan_nodes:
>>>>
>>>> if (acpi_numa <= 0)
>>>>   return -1;
>>>>
>>>> So it does not seem that 0 means enabled.
>>>
>>> IMO, In x86
>>>  -1 means disabled
>>>   0 enabled but not numa initialized
>>>   1 enabled and numa initialized.
>>>
>>> I clubbed 0 & 1.
>>
>>  From your description 0 and 1 have different meaning, so I don't see
>> how you can merge them that easily without any explanation.
>>
>> Anyway, I will leave x86 maintainers give their opinion here.
>
> I'm pretty certain this needs to remain a tristate.

OK. I will drop this patch from this series; it can be fixed
outside this series.
BTW, any review comments on the remaining patches?


Re: [Xen-devel] [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables

2017-04-25 Thread Vijay Kilari
On Tue, Apr 25, 2017 at 5:58 PM, Julien Grall <julien.gr...@arm.com> wrote:
>
>
> On 25/04/17 13:20, Vijay Kilari wrote:
>>
>> On Tue, Apr 25, 2017 at 5:34 PM, Julien Grall <julien.gr...@arm.com>
>> wrote:
>>>
>>> Hello Vijay,
>>>
>>> On 25/04/17 07:54, Vijay Kilari wrote:
>>>>
>>>>
>>>> On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall <julien.gr...@arm.com>
>>>> wrote:
>>>>>
>>>>>
>>>>> Hi Vijay,
>>>>>
>>>>> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> From: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>>>>
>>>>>> Add accessor functions for acpi_numa and numa_off static
>>>>>> variables. Init value of acpi_numa is set 1 instead of 0.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Please explain why you change the acpi_numa value from 0 to 1.
>>>>
>>>>
>>>>
>>>> previously acpi_numa was s8 and are using 0 and -1 values. I have made
>>>> it bool and set
>>>> the initial value to 1.
>>>
>>>
>>>
>>> Are you sure? With a quick grep I spot it sounds like acpi_numa can have
>>> the
>>> value 0, -1, 1.
>>>
>>
>> Hmm.. But I don't see use of having 0, -1 and 1. But I don't see any use
>> of
>> having 3 values to say enable or disable.
>
>
> Then explain why in the commit message and don't let people discover. If you
> have not done it, I would recommend to read:
>
> https://wiki.xenproject.org/wiki/Submitting_Xen_Project_Patches
>
>>
>>>>
>>>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>>>> below
>>>> call has check srat_disabled() before proceeding fails.
>>>
>>>
>>>
>>> My understanding is on x86 acpi_numa is disabled by default and will be
>>> enabled if they are able to parse the SRAT. So why are you changing the
>>> behavior for x86?
>>
>>
>> acpi_numa = 0 means it is enabled by default on x86.
>
>
> In acpi_scan_nodes:
>
> if (acpi_numa <= 0)
>   return -1;
>
> So it does not seem that 0 means enabled.

IMO, on x86:
 -1 means disabled
  0 means enabled but NUMA not initialized
  1 means enabled and NUMA initialized.

I clubbed 0 & 1.

I was referring to the code below in x86, where acpi_numa = 0 means
srat_disabled() returns false, which means acpi_numa is implicitly enabled.

int srat_disabled(void)
{
    return numa_off || acpi_numa < 0;
}

When I changed acpi_numa to bool, I initialized it to 1 and changed
the code as below.

bool srat_disabled(void)
{
    return numa_off || acpi_numa == 0;
}

Also, srat_disabled() is used in acpi_numa_memory_affinity_init(), which is
called from acpi_numa_init() before acpi_scan_nodes() is called.
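
To make the three-versus-two state question concrete, here is a small
self-contained sketch. The enum names and the scan_nodes_gate() helper are
hypothetical; the conditions themselves are the ones quoted in this thread
(acpi_numa < 0 in srat_disabled(), acpi_numa <= 0 in acpi_scan_nodes()):

```c
#include <stdbool.h>

/* The three states as described in this thread (names are hypothetical). */
enum acpi_numa_state {
    ACPI_NUMA_DISABLED = -1, /* disabled */
    ACPI_NUMA_UNPARSED = 0,  /* enabled, NUMA not initialized yet */
    ACPI_NUMA_PARSED   = 1,  /* enabled and NUMA initialized */
};

/* Original x86 check: only -1 stops SRAT parsing. */
static bool srat_disabled_tristate(bool numa_off, int acpi_numa)
{
    return numa_off || acpi_numa < 0;
}

/* Gate quoted from acpi_scan_nodes(): both 0 and -1 bail out with -1. */
static int scan_nodes_gate(int acpi_numa)
{
    return (acpi_numa <= 0) ? -1 : 0;
}

/* Bool variant from the patch: only two states remain. */
static bool srat_disabled_bool(bool numa_off, bool acpi_numa)
{
    return numa_off || !acpi_numa;
}
```

State 0 passes the srat_disabled() check but is still rejected by the
acpi_scan_nodes() gate. A bool cannot express that combination, which is the
behaviour change being questioned.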

>
>>
>>>
>>>>
>>>> acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>>>> {
>>>> 
>>>>
>>>> if ( srat_disabled() )
>>>> return;
>>>>
>>>> }
>>>>
>>>>>
>>>>> Also, I am not sure to understand the benefits of those helpers. Why do
>>>>> you
>>>>> need them? Why not using the global variable directly, this will avoid
>>>>> to
>>>>> call a function just for returning a value...
>>>>
>>>>
>>>>
>>>> These helpers are used by both common code and arch specific code later.
>>>> Hence made these global variables as static and added helpers
>>>
>>>
>>>
>>> The variables were global so they could already be used anywhere.
>>>
>>> Furthermore, those helpers are not even inlined, so everytime you want to
>>> access read acpi_numa you have to do a branch then a memory access.
>>>
>>> So what is the point to do that?
>>
>>
>> I agree with making inline. But I don't think there is any harm in making
>> them
>> static and adding helpers.
>
>
> But why? Why don't you keep the code as it is? You modify code without any
> justification and not for the better.
>
> Cheers,
>
> --
> Julien Grall


Re: [Xen-devel] [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables

2017-04-25 Thread Vijay Kilari
On Tue, Apr 25, 2017 at 5:34 PM, Julien Grall <julien.gr...@arm.com> wrote:
> Hello Vijay,
>
> On 25/04/17 07:54, Vijay Kilari wrote:
>>
>> On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall <julien.gr...@arm.com>
>> wrote:
>>>
>>> Hi Vijay,
>>>
>>> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>>>
>>>>
>>>> From: Vijaya Kumar K <vijaya.ku...@cavium.com>
>>>>
>>>> Add accessor functions for acpi_numa and numa_off static
>>>> variables. Init value of acpi_numa is set 1 instead of 0.
>>>
>>>
>>>
>>> Please explain why you change the acpi_numa value from 0 to 1.
>>
>>
>> previously acpi_numa was s8 and are using 0 and -1 values. I have made
>> it bool and set
>> the initial value to 1.
>
>
> Are you sure? With a quick grep I spot it sounds like acpi_numa can have the
> value 0, -1, 1.
>

Hmm.. But I don't see any use in having three values (0, -1 and 1) just to
say enabled or disabled.

>>
>> By setting 1, we are enabling acpi_numa by default. If not enabled, the
>> below
>> call has check srat_disabled() before proceeding fails.
>
>
> My understanding is on x86 acpi_numa is disabled by default and will be
> enabled if they are able to parse the SRAT. So why are you changing the
> behavior for x86?

acpi_numa = 0 means it is enabled by default on x86.

>
>>
>> acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
>> {
>> 
>>
>> if ( srat_disabled() )
>> return;
>>
>> }
>>
>>>
>>> Also, I am not sure to understand the benefits of those helpers. Why do
>>> you
>>> need them? Why not using the global variable directly, this will avoid to
>>> call a function just for returning a value...
>>
>>
>> These helpers are used by both common code and arch specific code later.
>> Hence made these global variables as static and added helpers
>
>
> The variables were global so they could already be used anywhere.
>
> Furthermore, those helpers are not even inlined, so everytime you want to
> access read acpi_numa you have to do a branch then a memory access.
>
> So what is the point to do that?

I agree with making them inline. But I don't think there is any harm in
making the variables static and adding helpers.
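
For reference, a sketch of what an inlined read accessor could look like.
This is not the code from the patch: it assumes the variable stays visible
through an extern declaration, since a static inline helper in a header
cannot read a variable that is static to one .c file.

```c
#include <stdbool.h>

/* Would live in exactly one .c file in a real tree. */
bool numa_off;

/* Header part: declaration plus an inline read accessor (hypothetical). */
extern bool numa_off;

static inline bool is_numa_off(void)
{
    /* Compiles down to a plain load of the flag, with no function call. */
    return numa_off;
}
```

This keeps a single read path for callers while avoiding the extra call that
an out-of-line helper would add.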

>
>
>>>> diff --git a/xen/include/asm-x86/acpi.h b/xen/include/asm-x86/acpi.h
>>>> index a766688..9298d42 100644
>>>> --- a/xen/include/asm-x86/acpi.h
>>>> +++ b/xen/include/asm-x86/acpi.h
>>>> @@ -103,7 +103,6 @@ extern void acpi_reserve_bootmem(void);
>>>>
>>>>  #define ARCH_HAS_POWER_INIT1
>>>>
>>>> -extern s8 acpi_numa;
>>>>  extern int acpi_scan_nodes(u64 start, u64 end);
>>>>  #define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
>>>>
>>>> diff --git a/xen/include/asm-x86/numa.h b/xen/include/asm-x86/numa.h
>>>> index bb22bff..ae5768b 100644
>>>> --- a/xen/include/asm-x86/numa.h
>>>> +++ b/xen/include/asm-x86/numa.h
>>>> @@ -30,10 +30,7 @@ extern nodeid_t pxm_to_node(unsigned int pxm);
>>>>
>>>>  extern void numa_add_cpu(int cpu);
>>>>  extern void numa_init_array(void);
>>>> -extern bool_t numa_off;
>>>> -
>>>> -
>>>> -extern int srat_disabled(void);
>>>> +extern bool srat_disabled(void);
>>>>  extern void numa_set_node(int cpu, nodeid_t node);
>>>>  extern nodeid_t setup_node(unsigned int pxm);
>>>>  extern void srat_detect_node(int cpu);
>>>> diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
>>>> index 7aef1a8..7f6d090 100644
>>>> --- a/xen/include/xen/numa.h
>>>> +++ b/xen/include/xen/numa.h
>>>> @@ -18,4 +18,7 @@
>>>>(((d)->vcpu != NULL && (d)->vcpu[0] != NULL) \
>>>> ? vcpu_to_node((d)->vcpu[0]) : NUMA_NO_NODE)
>>>>
>>>> +bool is_numa_off(void);
>>>> +bool get_acpi_numa(void);
>>>> +void set_acpi_numa(bool val);
>>>
>>>
>>>
>>> I am not sure to understand why you add those helpers directly here in
>>> xen/numa.h. IHMO, This should belong to the moving code patches.
>>
>>
To keep the code movement patches doing only code movement, I have exported
them in the patch where they are defined.
>
>
> Sorry but what are prototypes if not code?
>
> The point of moving the prototypes in the patch which move the actual
> declarations is helping the reviewers to check if you correctly moved
> everything.

I am ok if it helps in review.

>
>
>>
>>>
>>>
>>>>  #endif /* _XEN_NUMA_H */
>>>>
>>>
>>> --
>>> Julien Grall
>
>
> --
> Julien Grall


Re: [Xen-devel] [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function

2017-04-25 Thread Vijay Kilari
On Thu, Apr 20, 2017 at 9:42 PM, Julien Grall  wrote:
> Hi Vijay,
>
>
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Split numa_initmem_init() so that the numa fallback code is moved
>> as separate function which is generic.
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/arch/x86/numa.c | 29 +
>>  1 file changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 6b794a7..0888d53 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -268,21 +268,10 @@ static int __init numa_emulation(uint64_t start_pfn,
>> uint64_t end_pfn)
>>  }
>>  #endif
>>
>> -void __init numa_initmem_init(unsigned long start_pfn, unsigned long
>> end_pfn)
>> +static void __init numa_dummy_init(unsigned long start_pfn, unsigned long
>> end_pfn)
>>  {
>>  int i;
>>
>> -#ifdef CONFIG_NUMA_EMU
>> -if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
>> -return;
>> -#endif
>> -
>> -#ifdef CONFIG_ACPI_NUMA
>> -if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn <<
>> PAGE_SHIFT,
>> - (uint64_t)end_pfn << PAGE_SHIFT) )
>> -return;
>> -#endif
>> -
>>  printk(KERN_INFO "%s\n",
>> is_numa_off() ? "NUMA turned off" : "No NUMA configuration
>> found");
>>
>> @@ -301,6 +290,22 @@ void __init numa_initmem_init(unsigned long
>> start_pfn, unsigned long end_pfn)
>>  (paddr_t)end_pfn << PAGE_SHIFT);
>>  }
>>
>> +void __init numa_initmem_init(unsigned long start_pfn, unsigned long
>> end_pfn)
>> +{
>> +#ifdef CONFIG_NUMA_EMU
>> +if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
>> +return;
>> +#endif
>
>
> I am not sure where to comment about it in this series, so I will say it
> here.
>
> As asked on v1, why don't you consider fake NUMA? This would help to test
> the series on non-NUMA platform.

I have not tested the non-NUMA case with this series. Agreed, these two
lines should be added
to numa_initmem_init() for ARM (xen/arch/arm/numa/numa.c).
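
For context, a minimal sketch of what fake NUMA does with a memory range,
loosely following the loop in the quoted numa_emulation(): split the range
into equal chunks and let the last node absorb the remainder. The function
name, the error handling and PAGE_SHIFT = 12 are assumptions, and the
power-of-two rounding ("kludge needed for the hash function") is left out:

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

struct node {
    uint64_t start;
    uint64_t end;
};

/*
 * Split [start_pfn, end_pfn) into 'count' fake nodes.
 * Returns 0 on success, -1 on invalid arguments.
 */
static int split_fake_nodes(uint64_t start_pfn, uint64_t end_pfn,
                            unsigned int count, struct node *nodes)
{
    uint64_t base, top, sz;
    unsigned int i;

    if ( count == 0 || end_pfn <= start_pfn )
        return -1;

    base = start_pfn << PAGE_SHIFT;
    top = end_pfn << PAGE_SHIFT;
    sz = (top - base) / count;

    for ( i = 0; i < count; i++ )
    {
        nodes[i].start = base + (uint64_t)i * sz;
        /* The last node takes whatever is left after integer division. */
        nodes[i].end = (i == count - 1) ? top : nodes[i].start + sz;
    }

    return 0;
}
```

Something along these lines would allow exercising the NUMA paths on a
platform with a single memory node.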

>
>> +
>> +#ifdef CONFIG_ACPI_NUMA
>> +if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn <<
>> PAGE_SHIFT,
>> + (uint64_t)end_pfn << PAGE_SHIFT) )
>> +return;
>> +#endif
>> +
>> +numa_dummy_init(start_pfn, end_pfn);
>> +}
>> +
>>  void numa_add_cpu(int cpu)
>>  {
>>  cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
>>
>
> Cheers,
>
> --
>  Julien Grall


Re: [Xen-devel] [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables

2017-04-25 Thread Vijay Kilari
On Thu, Apr 20, 2017 at 9:29 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 28/03/17 16:53, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> Add accessor functions for acpi_numa and numa_off static
>> variables. Init value of acpi_numa is set 1 instead of 0.
>
>
> Please explain why you change the acpi_numa value from 0 to 1.

Previously acpi_numa was an s8 and used the values 0 and -1. I have made
it a bool and set
the initial value to 1.

Setting it to 1 enables acpi_numa by default. If it is not enabled, the
call below bails out, because it checks srat_disabled() before proceeding.

acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
{


    if ( srat_disabled() )
        return;

}

>
> Also, I am not sure to understand the benefits of those helpers. Why do you
> need them? Why not using the global variable directly, this will avoid to
> call a function just for returning a value...

These helpers are used by both common code and arch specific code later.
Hence I made these global variables static and added helpers.

>
>> Also return value of srat_disabled is changed to bool.
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/arch/x86/numa.c| 43
>> +++
>>  xen/arch/x86/setup.c   |  2 +-
>>  xen/arch/x86/srat.c| 12 ++--
>>  xen/include/asm-x86/acpi.h |  1 -
>>  xen/include/asm-x86/numa.h |  5 +
>>  xen/include/xen/numa.h |  3 +++
>>  6 files changed, 42 insertions(+), 24 deletions(-)
>>
>> diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
>> index 964fc5a..6b794a7 100644
>> --- a/xen/arch/x86/numa.c
>> +++ b/xen/arch/x86/numa.c
>> @@ -42,12 +42,27 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
>>
>>  nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
>>
>> -bool numa_off = 0;
>> -s8 acpi_numa = 0;
>> +static bool numa_off = 0;
>> +static bool acpi_numa = 1;
>
>
> Please don't mix 0/1 and bool. Instead use false/true.

OK.
>
>
>>
>> -int srat_disabled(void)
>> +bool is_numa_off(void)
>>  {
>> -return numa_off || acpi_numa < 0;
>> +return numa_off;
>> +}
>> +
>> +bool get_acpi_numa(void)
>> +{
>> +return acpi_numa;
>> +}
>> +
>> +void set_acpi_numa(bool_t val)
>> +{
>> +acpi_numa = val;
>> +}
>> +
>> +bool srat_disabled(void)
>> +{
>> +return numa_off || acpi_numa == 0;
>>  }
>>
>>  /*
>> @@ -202,13 +217,17 @@ void __init numa_init_array(void)
>>
>>  #ifdef CONFIG_NUMA_EMU
>>  static int __initdata numa_fake = 0;
>> +static int get_numa_fake(void)
>> +{
>> +return numa_fake;
>> +}
>>
>>  /* Numa emulation */
>>  static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
>>  {
>>  int i;
>>  struct node nodes[MAX_NUMNODES];
>> -uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
>> +uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) /
>> get_numa_fake();
>>
>>  /* Kludge needed for the hash function */
>>  if ( hweight64(sz) > 1 )
>> @@ -223,10 +242,10 @@ static int __init numa_emulation(uint64_t start_pfn,
>> uint64_t end_pfn)
>>  }
>>
>>  memset(&nodes,0,sizeof(nodes));
>> -for ( i = 0; i < numa_fake; i++ )
>> +for ( i = 0; i < get_numa_fake(); i++ )
>>  {
>>  nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
>> -if ( i == numa_fake - 1 )
>> +if ( i == get_numa_fake() - 1 )
>>  sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
>>  nodes[i].end = nodes[i].start + sz;
>>  printk(KERN_INFO
>> @@ -235,7 +254,7 @@ static int __init numa_emulation(uint64_t start_pfn,
>> uint64_t end_pfn)
>> (nodes[i].end - nodes[i].start) >> 20);
>>  node_set_online(i);
>>  }
>> -if ( compute_memnode_shift(nodes, numa_fake, NULL, &memnode_shift) )
>> +if ( compute_memnode_shift(nodes, get_numa_fake(), NULL,
>> &memnode_shift) )
>>  {
>>  memnode_shift = 0;
>>  printk(KERN_ERR "No NUMA hash function found. Emulation
>> disabled.\n");
>> @@ -254,18 +273,18 @@ void __init numa_initmem_init(unsigned long
>> start_pfn, unsigned long end_pfn)
>>  int i;
>>
>>  #ifdef CONFIG_NUMA_EMU
>> -if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
>> +if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
>>  return;
>>  #endif
>>
>>  #ifdef CONFIG_ACPI_NUMA
>> -if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
>> +if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn <<
>> PAGE_SHIFT,
>>   (uint64_t)end_pfn << PAGE_SHIFT) )
>>  return;
>>  #endif
>>
>>  printk(KERN_INFO "%s\n",
>> -   numa_off ? "NUMA turned off" : "No NUMA configuration found");
>> +   is_numa_off() ? "NUMA turned off" : "No NUMA configuration
>> found");
>>
>>  printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
>> (uint64_t)start_pfn << PAGE_SHIFT,
>> @@ -312,7 +331,7 @@ 

Re: [Xen-devel] [PATCH v7 00/34] arm64: Dom0 ITS emulation

2017-04-11 Thread Vijay Kilari
Hi Andre,

On Sat, Apr 8, 2017 at 3:37 AM, Andre Przywara  wrote:
> Hi,
>
> an only slightly modified repost of the last version.
> I added the Reviewed-by: and Acked-by: tags from Stefano and Julien
> and rebased on top of the latest staging tree:
> commit 89216c7999eb5b8558bfac7d61ae0d5ab844ce3f
> Author: Dario Faggioli 
> Date:   Fri Apr 7 18:57:14 2017 +0200
>
> xen: credit1: treat pCPUs more evenly during balancing.
>
> Other than that and one typo and comment fix the first 10 patches have
> not been changed.
> I dropped the addition of the GIC_IRQ_GUEST_LPI_PENDING bit in patch 12/36
> and followed Stefano's suggestion, which led to the removal of former
> patches 17/36 and 23/36 and some simplification in later patches.
> I haven't been able to address most review comments from the last part of
> v5 yet, but will definitely still fix them.
> Detailed changelog below.
>
> Cheers,
> Andre
>
> --
> This series adds support for emulation of an ARM GICv3 ITS interrupt
> controller. For hardware which relies on the ITS to provide interrupts for
> its peripherals this code is needed to get a machine booted into Dom0 at
> all. ITS emulation for DomUs is only really useful with PCI passthrough,
> which is not yet available for ARM. It is expected that this feature
> will be co-developed with the ITS DomU code. However this code drop here
> considered DomU emulation already, to keep later architectural changes
> to a minimum.
>
> This is a technical preview version to allow early testing of the feature.
> Things not (properly) addressed in this release:
> - The MOVALL command is not emulated. In our case there is really nothing
> to do here. We might need to revisit this in the future for DomU support.
> - The INVALL command might need some rework to be more efficient. Currently
> we iterate over all mapped LPIs, which might take a bit longer.
> - Indirect tables are not supported. This affects both the host and the
> virtual side.
> - The command queue locking is currently suboptimal and should be made more
> fine-grained in the future, if possible.
> - We need to properly investigate the possible interaction when devices get
> removed. This requires to properly clean up and remove any associated
> resources like pending or in-flight LPIs, for instance.
>
>
> Some generic design principles:
>
> * The current GIC code statically allocates structures for each supported
> IRQ (both for the host and the guest), which due to the potentially
> millions of LPI interrupts is not feasible to copy for the ITS.
> So we refrain from introducing the ITS as a first class Xen interrupt
> controller, also we don't hold struct irq_desc's or struct pending_irq's
> for each possible LPI.
> Fortunately LPIs are only interesting to guests, so we get away with
> storing only the virtual IRQ number and the guest VCPU for each allocated
> host LPI, which can be stashed into one uint64_t. This data is stored in
> a two-level table, which is both memory efficient and quick to access.
> We hook into the existing IRQ handling and VGIC code to avoid accessing
> the normal structures, providing alternative methods for getting the
> needed information (priority, is enabled?) for LPIs.
> Whenever a guest maps a device, we allocate the maximum required number
> of struct pending_irq's, so that any triggering LPI can find its data
> structure. Upon the guest actually mapping the LPI, this pointer to the
> corresponding pending_irq gets entered into a radix tree, so that it can
> be quickly looked up.
>
> * On the guest side we (later will) have to deal with malicious guests
> trying to hog Xen with mapping requests for a lot of LPIs, for instance.
> As the ITS actually uses system memory for storing status information,
> we use this memory (which the guest has to provide) to naturally limit
> a guest. Whenever we need information from any of the ITS tables, we
> temporarily map them (which is cheap on arm64) and copy the required data.
>
> * An obvious approach to handling some guest ITS commands would be to
> propagate them to the host, for instance to map devices and LPIs and
> to enable or disable LPIs.
> However this (later with DomU support) will create an attack vector, as
> a malicious guest could try to fill the host command queue with
> propagated commands.
> So we try to avoid this situation: Dom0 sending a device mapping (MAPD)
> command is the only time we allow queuing commands to the host ITS command
> queue, as this seems to be the only reliable way of getting the
> required information at the moment. However, at the same time we already
> map all events to LPIs and enable them. This avoids sending commands
> later at runtime, as we can deal with mappings and LPI enabling/disabling
> internally.
>
> To accommodate the tech preview nature of this feature at the moment, there
> is a Kconfig option to enable it. Also it is supported on arm64 only, 
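
The design notes quoted above describe keeping only the virtual IRQ number
and the guest VCPU for each host LPI in one uint64_t, held in a two-level
table. The following is a minimal sketch of that idea, not the code from the
series: every toy_* name, the field widths, the 512 entries per page and the
16-page limit are assumptions chosen for illustration, and a plain 64-bit
store stands in for whatever atomic write the hypervisor would use.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_LPI_OFFSET        8192u  /* LPI INTIDs start at 8192 on GICv3 */
#define TOY_ENTRIES_PER_PAGE  512u   /* assumed: 4 KiB page / 8-byte entry */
#define TOY_NR_PAGES          16u    /* assumed toy limit for the first level */

/* One 64-bit word per host LPI; the field widths are assumptions. */
union toy_host_lpi {
    uint64_t data;
    struct {
        uint32_t virt_lpi;
        uint16_t dom_id;
        uint16_t vcpu_id;
    } f;
};

/* First level: pointers to second-level pages, allocated on demand. */
static union toy_host_lpi *toy_pages[TOY_NR_PAGES];

static union toy_host_lpi *toy_entry(uint32_t host_lpi, int allocate)
{
    uint32_t idx, page;

    if (host_lpi < TOY_LPI_OFFSET)
        return NULL;

    idx = host_lpi - TOY_LPI_OFFSET;
    page = idx / TOY_ENTRIES_PER_PAGE;
    if (page >= TOY_NR_PAGES)
        return NULL;

    if (!toy_pages[page]) {
        if (!allocate)
            return NULL;
        toy_pages[page] = calloc(TOY_ENTRIES_PER_PAGE,
                                 sizeof(union toy_host_lpi));
        if (!toy_pages[page])
            return NULL;
    }

    return &toy_pages[page][idx % TOY_ENTRIES_PER_PAGE];
}

/* Record which domain/VCPU/virtual LPI a host LPI is forwarded to. */
int toy_update(uint32_t host_lpi, uint16_t dom_id, uint16_t vcpu_id,
               uint32_t virt_lpi)
{
    union toy_host_lpi *slot = toy_entry(host_lpi, 1), tmp;

    assert(sizeof(union toy_host_lpi) == sizeof(uint64_t));
    if (!slot)
        return -1;

    tmp.f.virt_lpi = virt_lpi;
    tmp.f.dom_id = dom_id;
    tmp.f.vcpu_id = vcpu_id;
    slot->data = tmp.data;  /* single 64-bit store of the whole entry */

    return 0;
}

/* Copy the entry for a host LPI; fails if its page was never allocated. */
int toy_lookup(uint32_t host_lpi, union toy_host_lpi *out)
{
    union toy_host_lpi *slot = toy_entry(host_lpi, 0);

    if (!slot)
        return -1;

    out->data = slot->data;
    return 0;
}
```

Only pages that actually contain a mapped LPI get allocated, which is where
the memory saving over a flat per-IRQ array would come from.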

[Xen-devel] [PATCH v5] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP_VIRT region

2017-04-06 Thread vijay . kilari
From: Vijaya Kumar K 

On ARM platforms with NUMA, a panic is triggered from init_node_heap()
while initializing the second memory node, when virt_to_mfn() is called
on a DIRECTMAP_VIRT region address, because that address is not
mapped.

The check virt_to_mfn() here is used to know whether the max MFN is
part of the direct mapping. The max MFN is found by calling virt_to_mfn
on end address of DIRECTMAP_VIRT region, which is DIRECTMAP_VIRT_END.

On ARM64, all RAM is currently direct mapped in Xen and virt_to_mfn
uses the hardware for address translation. So if the virtual address
is not mapped, a translation fault is raised.

In this patch, instead of calling virt_to_mfn(), arch helper
arch_mfn_in_directmap() is introduced.

On ARM64 this arch helper will return true, because currently all RAM
is direct mapped in Xen.
On ARM32, only a limited amount of RAM, called xenheap, is always mapped
and the DIRECTMAP_VIRT region is not mapped. Hence return false.
For x86 this helper does virt_to_mfn.

Signed-off-by: Vijaya Kumar K 
Reviewed-by: Jan Beulich 
---
v5: - Rewritten commit message.
- Update comments
- Dropped extra brackets
---
 xen/common/page_alloc.c|  9 ++---
 xen/include/asm-arm/arm32/mm.h | 23 +++
 xen/include/asm-arm/arm64/mm.h | 23 +++
 xen/include/asm-arm/mm.h   |  8 
 xen/include/asm-x86/mm.h   | 11 +++
 5 files changed, 67 insertions(+), 7 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 68dba19..9e41fb4 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -520,9 +520,6 @@ static unsigned long init_node_heap(int node, unsigned long mfn,
 unsigned long needed = (sizeof(**_heap) +
 sizeof(**avail) * NR_ZONES +
 PAGE_SIZE - 1) >> PAGE_SHIFT;
-#ifdef DIRECTMAP_VIRT_END
-unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
-#endif
 int i, j;
 
 if ( !first_node_initialised )
@@ -532,9 +529,8 @@ static unsigned long init_node_heap(int node, unsigned long mfn,
 first_node_initialised = 1;
 needed = 0;
 }
-#ifdef DIRECTMAP_VIRT_END
 else if ( *use_tail && nr >= needed &&
-  (mfn + nr) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_in_directmap(mfn + nr) &&
   (!xenheap_bits ||
!((mfn + nr - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
@@ -543,7 +539,7 @@ static unsigned long init_node_heap(int node, unsigned long mfn,
   PAGE_SIZE - sizeof(**avail) * NR_ZONES;
 }
 else if ( nr >= needed &&
-  (mfn + needed) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_in_directmap(mfn + needed) &&
   (!xenheap_bits ||
!((mfn + needed - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
@@ -552,7 +548,6 @@ static unsigned long init_node_heap(int node, unsigned long mfn,
   PAGE_SIZE - sizeof(**avail) * NR_ZONES;
 *use_tail = 0;
 }
-#endif
 else if ( get_order_from_bytes(sizeof(**_heap)) ==
   get_order_from_pages(needed) )
 {
diff --git a/xen/include/asm-arm/arm32/mm.h b/xen/include/asm-arm/arm32/mm.h
new file mode 100644
index 000..6861249
--- /dev/null
+++ b/xen/include/asm-arm/arm32/mm.h
@@ -0,0 +1,23 @@
+#ifndef __ARM_ARM32_MM_H__
+#define __ARM_ARM32_MM_H__
+
+/*
+ * Only a limited amount of RAM, called xenheap, is always mapped on ARM32.
+ * For convenience always return false.
+ */
+static inline bool arch_mfn_in_directmap(unsigned long mfn)
+{
+return false;
+}
+
+#endif /* __ARM_ARM32_MM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/arm64/mm.h b/xen/include/asm-arm/arm64/mm.h
new file mode 100644
index 000..d0a3be7
--- /dev/null
+++ b/xen/include/asm-arm/arm64/mm.h
@@ -0,0 +1,23 @@
+#ifndef __ARM_ARM64_MM_H__
+#define __ARM_ARM64_MM_H__
+
+/*
+ * On ARM64, all the RAM is currently direct mapped in Xen.
+ * Hence return always true.
+ */
+static inline bool arch_mfn_in_directmap(unsigned long mfn)
+{
+return true;
+}
+
+#endif /* __ARM_ARM64_MM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 4892155..0fef612 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -6,6 +6,14 @@
 #include 
 #include 
 
+#if defined(CONFIG_ARM_32)
+# include 
+#elif defined(CONFIG_ARM_64)
+# include 
+#else
+# error "unknown ARM variant"
+#endif
+
 /* Align Xen to a 2 MiB boundary. */
 #define XEN_PADDR_ALIGN (1 << 21)
 
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 
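
The shape of the change can be sketched outside the hypervisor as follows.
This is a toy model under stated assumptions, not the patch itself: the
toy_* names and TOY_DIRECTMAP_LAST_MFN are hypothetical, the bounded variant
only mirrors the removed "(mfn + nr) <= (virt_to_mfn(eva - 1) + 1)" test, and
the xenheap_bits condition of init_node_heap() is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Per-architecture predicate, standing in for arch_mfn_in_directmap(). */
typedef bool (*toy_directmap_check_t)(unsigned long mfn);

/* Hypothetical last MFN covered by a bounded direct map. */
#define TOY_DIRECTMAP_LAST_MFN 0xfffffUL

/* ARM32 model: only the xenheap is always mapped, so always say no. */
bool toy_arm32_mfn_in_directmap(unsigned long mfn)
{
    (void)mfn;
    return false;
}

/* ARM64 model: all RAM is direct mapped, so always say yes. */
bool toy_arm64_mfn_in_directmap(unsigned long mfn)
{
    (void)mfn;
    return true;
}

/* Bounded model: compare against the end of the direct map. */
bool toy_bounded_mfn_in_directmap(unsigned long mfn)
{
    return mfn <= TOY_DIRECTMAP_LAST_MFN + 1;
}

/*
 * Simplified form of the "place the node heap at the tail of the range"
 * decision: enough pages, and the end of the range is in the direct map.
 */
bool toy_heap_fits_at_tail(toy_directmap_check_t in_directmap,
                           unsigned long mfn, unsigned long nr,
                           unsigned long needed)
{
    return nr >= needed && in_directmap(mfn + nr);
}
```

With the predicate behind a helper, the common allocator code no longer has
to translate an address that may not be mapped.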

Re: [Xen-devel] [PATCH v4] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-04-03 Thread Vijay Kilari
Hi Julien,

On Mon, Apr 3, 2017 at 3:31 PM, Julien Grall  wrote:
> Hi Vijay,
>
> On 28/03/17 13:35, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> On ARM64, virt_to_mfn uses the hardware for address
>> translation. So if the virtual address is not mapped translation
>> fault is raised. On ARM64, DIRECTMAP_VIRT region is direct mapped.
>
>
> You are stating the obvious: a DIRECTMAP_VIRT region is, as the name says,
> direct mapped. What matters is that all the RAM is mapped in Xen on ARM64.
>
>>
>> On ARM platforms with NUMA, While initializing second memory node,
>
>
> s/While/while/
>
>> panic is triggered from init_node_heap() when virt_to_mfn()
>> is called for DIRECTMAP_VIRT region address.
>> Here the check is made to ensure that MFN less than max MFN mapped.
>
>
> "The check is here to know whether the MFN is part of the direct mapping".
>
>> The max MFN is found by calling virt_to_mfn of DIRECTMAP_VIRT_END
>> region.
>
>
> DIRECTMAP_VIRT_END is the end of the region not a region.
>
>> Since DIRECMAP_VIRT region is not mapped to any virtual address
>
>
> s/DIRECMAP_VIRT/DIRECTMAP_VIRT/
>
>> on ARM, it fails.
>>
>> In this patch, instead of calling virt_to_mfn(), arch helper
>> arch_mfn_in_directmap() is introduced. On ARM64 this arch helper
>> will return true, whereas on ARM DIRECTMAP_VIRT region is not directly
>> mapped
>> only xenheap region is directly mapped.
>
>
> As said before, there is no DIRECTMAP_VIRT region on ARM. Not all the RAM
> is mapped in Xen, only the xenheap.
>
>> So on ARM return false always.
>
>
> I am OK if you always return false on ARM. But you need to explain why not
> return is_xen_heap_mfn(...);
>
>> For x86 this helper does virt_to_mfn.
>>
>> Signed-off-by: Vijaya Kumar K 
>> ---
>>  xen/common/page_alloc.c|  7 ++-
>>  xen/include/asm-arm/arm32/mm.h | 20 
>>  xen/include/asm-arm/arm64/mm.h | 20 
>>  xen/include/asm-arm/mm.h   |  8 
>>  xen/include/asm-x86/mm.h   | 11 +++
>>  5 files changed, 61 insertions(+), 5 deletions(-)
>>
>> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
>> index 42c20cb..c4ffb31 100644
>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -520,9 +520,6 @@ static unsigned long init_node_heap(int node, unsigned
>> long mfn,
>>  unsigned long needed = (sizeof(**_heap) +
>>  sizeof(**avail) * NR_ZONES +
>>  PAGE_SIZE - 1) >> PAGE_SHIFT;
>> -#ifdef DIRECTMAP_VIRT_END
>> -unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
>> -#endif
>>  int i, j;
>>
>>  if ( !first_node_initialised )
>> @@ -534,7 +531,7 @@ static unsigned long init_node_heap(int node, unsigned
>> long mfn,
>>  }
>>  #ifdef DIRECTMAP_VIRT_END
>
>
> Sorry I didn't spot that before. Why do we keep the #ifdef here given that
> the check is arch specific now?
>
>>  else if ( *use_tail && nr >= needed &&
>> -  (mfn + nr) <= (virt_to_mfn(eva - 1) + 1) &&
>> +  arch_mfn_in_directmap(mfn + nr) &&
>>(!xenheap_bits ||
>> !((mfn + nr - 1) >> (xenheap_bits - PAGE_SHIFT))) )
>>  {
>> @@ -543,7 +540,7 @@ static unsigned long init_node_heap(int node, unsigned
>> long mfn,
>>PAGE_SIZE - sizeof(**avail) * NR_ZONES;
>>  }
>>  else if ( nr >= needed &&
>> -  (mfn + needed) <= (virt_to_mfn(eva - 1) + 1) &&
>> +  arch_mfn_in_directmap(mfn + needed) &&
>>(!xenheap_bits ||
>> !((mfn + needed - 1) >> (xenheap_bits - PAGE_SHIFT))) )
>>  {
> >> diff --git a/xen/include/asm-arm/arm32/mm.h b/xen/include/asm-arm/arm32/mm.h
>> new file mode 100644
>> index 000..e93d9df
>> --- /dev/null
>> +++ b/xen/include/asm-arm/arm32/mm.h
>> @@ -0,0 +1,20 @@
>> +#ifndef __ARM_ARM32_MM_H__
>> +#define __ARM_ARM32_MM_H__
>> +
>> +/* On ARM only xenheap memory is directly mapped. Hence return false. */
>
>
> By reading this comment some people will wonder why you don't check whether
> the mfn is in xenheap then. As mentioned above, I am ok if you always return
> false here. But you need to explain why.

Is this ok?

"On ARM32, all the RAM is not mapped by Xen, instead it is mapped by xenheap.
So DIRECTMAP_VIRT region is not mapped.
Hence we return always false when mfn is checked on DIRECTMAP_VIRT region."

>
>
>> +static inline bool arch_mfn_in_directmap(unsigned long mfn)
>> +{
>> +return false;
>> +}
>> +
>> +#endif /* __ARM_ARM32_MM_H__ */
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * tab-width: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
> >> diff --git a/xen/include/asm-arm/arm64/mm.h b/xen/include/asm-arm/arm64/mm.h
>> new file mode 100644
>> index 000..36ee9c8
>> --- /dev/null
>> +++ 

Re: [Xen-devel] [PATCH v3 19/26] ARM: vITS: handle MAPTI command

2017-04-01 Thread Vijay Kilari
On Fri, Mar 31, 2017 at 11:35 PM, Andre Przywara  wrote:
> The MAPTI command associates a DeviceID/EventID pair with a LPI/CPU
> pair and actually instantiates LPI interrupts.
> We connect the already allocated host LPI to this virtual LPI, so that
> any triggering IRQ on the host can be quickly forwarded to a guest.
> Besides entering the VCPU and the virtual LPI number in the respective
> host LPI entry, we also initialize and add the already allocated
> struct pending_irq to our radix tree, so that we can now easily find it
> by its virtual LPI number.
> This exports the vgic_init_pending_irq() function for that purpose.
>
> Signed-off-by: Andre Przywara 
> ---
>  xen/arch/arm/gic-v3-its.c| 74 
> 
>  xen/arch/arm/gic-v3-lpi.c| 16 +
>  xen/arch/arm/vgic-v3-its.c   | 36 +--
>  xen/arch/arm/vgic.c  |  2 +-
>  xen/include/asm-arm/gic_v3_its.h |  6 
>  xen/include/asm-arm/vgic.h   |  1 +
>  6 files changed, 132 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 8db2a09..39f16b2 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -747,6 +747,80 @@ restart:
>  spin_unlock(&d->arch.vgic.its_devices_lock);
>  }
>
> +/* Must be called with the its_device_lock held. */
> +static struct its_devices *get_its_device(struct domain *d, paddr_t doorbell,
> +  uint32_t devid)
> +{
> +struct rb_node *node = d->arch.vgic.its_devices.rb_node;
> +struct its_devices *dev;
> +
> +while (node)
> +{
> +int cmp;
> +
> +dev = rb_entry(node, struct its_devices, rbnode);
> +cmp = compare_its_guest_devices(dev, doorbell, devid);
> +
> +if ( !cmp )
> +return dev;
> +
> +if ( cmp > 0 )
> +node = node->rb_left;
> +else
> +node = node->rb_right;
> +}
> +
> +return NULL;
> +}
> +
> +static uint32_t get_host_lpi(struct its_devices *dev, uint32_t eventid)
> +{
> +uint32_t host_lpi = 0;
> +
> +if ( dev && (eventid < dev->eventids) )
> +{
> +host_lpi = dev->host_lpi_blocks[eventid / LPI_BLOCK] +
> +   (eventid % LPI_BLOCK);
> +if ( !is_lpi(host_lpi) )
> +host_lpi = 0;
> +}
> +
> +return host_lpi;
> +}
> +
> +/*
> + * Connects the event ID for an already assigned device to the given VCPU/vLPI
> + * pair. The corresponding physical LPI is already mapped on the host side
> + * (when assigning the physical device to the guest), so we just connect the
> + * target VCPU/vLPI pair to that interrupt to inject it properly if it fires.
> + */
> +struct pending_irq *gicv3_assign_guest_event(struct domain *d,
> + paddr_t doorbell_address,
> + uint32_t devid, uint32_t eventid,
> + struct vcpu *v, uint32_t virt_lpi)
> +{
> +struct its_devices *dev;
> +struct pending_irq *pirq = NULL;
> +uint32_t host_lpi = 0;
> +
> +spin_lock(&d->arch.vgic.its_devices_lock);
> +dev = get_its_device(d, doorbell_address, devid);
> +if ( dev )
> +{
> +host_lpi = get_host_lpi(dev, eventid);
> +pirq = &dev->pend_irqs[eventid];
> +}
> +spin_unlock(&d->arch.vgic.its_devices_lock);
> +
> +if ( !host_lpi || !pirq )
> +return NULL;
> +
> +gicv3_lpi_update_host_entry(host_lpi, d->domain_id,
> +v ? v->vcpu_id : -1, virt_lpi);
> +
> +return pirq;
> +}
> +
>  /* Scan the DT for any ITS nodes and create a list of host ITSes out of it. */
>  void gicv3_its_dt_init(const struct dt_device_node *node)
>  {
> diff --git a/xen/arch/arm/gic-v3-lpi.c b/xen/arch/arm/gic-v3-lpi.c
> index 2301d53..a6b728e 100644
> --- a/xen/arch/arm/gic-v3-lpi.c
> +++ b/xen/arch/arm/gic-v3-lpi.c
> @@ -178,6 +178,22 @@ void do_LPI(unsigned int lpi)
>  rcu_unlock_domain(d);
>  }
>
> +void gicv3_lpi_update_host_entry(uint32_t host_lpi, int domain_id,
> + unsigned int vcpu_id, uint32_t virt_lpi)
> +{
> +union host_lpi *hlpip, hlpi;
> +
> +host_lpi -= LPI_OFFSET;
> +
> +hlpip = _data.host_lpis[host_lpi / HOST_LPIS_PER_PAGE][host_lpi % 
> HOST_LPIS_PER_PAGE];
> +
> +hlpi.virt_lpi = virt_lpi;
> +hlpi.dom_id = domain_id;
> +hlpi.vcpu_id = vcpu_id;
> +
> +write_u64_atomic(&hlpip->data, hlpi.data);
> +}
> +
>  static int gicv3_lpi_allocate_pendtable(uint64_t *reg)
>  {
>  uint64_t val;
> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
> index 36b44f2..d9dce3f 100644
> --- a/xen/arch/arm/vgic-v3-its.c
> +++ b/xen/arch/arm/vgic-v3-its.c
> @@ -258,8 +258,8 @@ static bool read_itte(struct virt_its *its, uint32_t devid, uint32_t evid,
>  }
>
>  
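
The event-to-host-LPI translation in get_host_lpi() above relies on host LPIs
being handed out in fixed-size blocks. A minimal stand-alone sketch of that
arithmetic follows; the toy_* names are hypothetical, the block size of 32 is
an assumption (the quoted patch only shows the LPI_BLOCK name), and the
"is this an LPI" test is reduced to the GICv3 rule that LPI INTIDs start at
8192.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LPI_BLOCK  32u    /* assumed number of LPIs per allocated block */
#define TOY_FIRST_LPI  8192u  /* LPI INTIDs start at 8192 on GICv3 */

struct toy_its_device {
    uint32_t eventids;                /* number of events for this device */
    const uint32_t *host_lpi_blocks;  /* first host LPI of each block */
};

/* Return the host LPI backing an event, or 0 if there is none. */
uint32_t toy_get_host_lpi(const struct toy_its_device *dev, uint32_t eventid)
{
    uint32_t host_lpi;

    if (!dev || eventid >= dev->eventids)
        return 0;

    /* Block index selects the base, the remainder is the offset in it. */
    host_lpi = dev->host_lpi_blocks[eventid / TOY_LPI_BLOCK] +
               (eventid % TOY_LPI_BLOCK);

    return host_lpi >= TOY_FIRST_LPI ? host_lpi : 0;
}
```

Blocks do not need to be contiguous: in this model event 33 of a device whose
second block starts at 8320 resolves to host LPI 8321.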

Re: [Xen-devel] [PATCH v3 06/26] ARM: GICv3 ITS: introduce device mapping

2017-04-01 Thread Vijay Kilari
Hi Andre,

On Fri, Mar 31, 2017 at 11:35 PM, Andre Przywara  wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into an rbtree to
> later be able to iterate over it or to easily teardown guests.
>
> Signed-off-by: Andre Przywara 
> ---
>  xen/arch/arm/gic-v3-its.c| 227 
> +++
>  xen/arch/arm/vgic-v3.c   |   4 +
>  xen/include/asm-arm/domain.h |   3 +
>  xen/include/asm-arm/gic_v3_its.h |  23 
>  4 files changed, 257 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 1ac598f..295f7dc 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -21,6 +21,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -32,6 +34,18 @@
>
>  LIST_HEAD(host_its_list);
>
> +struct its_devices {
> +struct rb_node rbnode;
> +struct host_its *hw_its;
> +void *itt_addr;
> +paddr_t guest_doorbell; /* Identifies the virtual ITS */
> +uint32_t host_devid;
> +uint32_t guest_devid;
> +uint32_t eventids;  /* Number of event IDs (MSIs) */
> +uint32_t *host_lpi_blocks;  /* Which LPIs are used on the host */
> +struct pending_irq *pend_irqs;  /* One struct per event */
> +};
> +
>  bool gicv3_its_host_has_its(void)
>  {
>  return !list_empty(&host_its_list);
> @@ -151,6 +165,26 @@ static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
>  return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> + uint8_t size_bits, paddr_t itt_addr, bool valid)
> +{
> +uint64_t cmd[4];
> +
> +if ( valid )
> +{
> +ASSERT(size_bits < 32);
> +ASSERT(!(itt_addr & ~GENMASK_ULL(51, 8)));
> +}
> +cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +cmd[1] = size_bits;
> +cmd[2] = itt_addr;
> +if ( valid )
> +cmd[2] |= GITS_VALID_BIT;
> +cmd[3] = 0x00;
> +
> +return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(unsigned int cpu)
>  {
> @@ -376,6 +410,7 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
>  hw_its->devid_bits = min(hw_its->devid_bits, max_its_device_bits);
>  if ( reg & GITS_TYPER_PTA )
>  hw_its->flags |= HOST_ITS_USES_PTA;
> +hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
>
>  for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>  {
> @@ -432,6 +467,197 @@ int gicv3_its_init(void)
>  return 0;
>  }
>
> +static int remove_mapped_guest_device(struct its_devices *dev)
> +{
> +int ret;
> +
> +if ( dev->hw_its )
> +{
> +/* MAPD also discards all events with this device ID. */
> +int ret = its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
> +if ( ret )
> +return ret;
> +}
> +
> +ret = gicv3_its_wait_commands(dev->hw_its);
> +if ( ret )
> +return ret;
> +
> +xfree(dev->itt_addr);
> +xfree(dev->pend_irqs);
> +xfree(dev);
> +
> +return 0;
> +}
> +
> +static struct host_its *gicv3_its_find_by_doorbell(paddr_t doorbell_address)
> +{
> +struct host_its *hw_its;
> +
> +list_for_each_entry(hw_its, &host_its_list, entry)
> +{
> +if ( hw_its->addr + ITS_DOORBELL_OFFSET == doorbell_address )
> +return hw_its;
> +}
> +
> +return NULL;
> +}
> +
> +static int compare_its_guest_devices(struct its_devices *dev,
> + paddr_t doorbell, uint32_t devid)
> +{
> +if ( dev->guest_doorbell < doorbell )
> +return -1;
> +
> +if ( dev->guest_doorbell > doorbell )
> +return 1;
> +
> +if ( dev->guest_devid < devid )
> +return -1;
> +
> +if ( dev->guest_devid > devid )
> +return 1;
> +
> +return 0;
> +}
> +
> +/*
> + * Map a hardware device, identified by a certain host ITS and its device ID
> + * to domain d, a guest ITS (identified by its doorbell address) and device ID.
> + * Also provide the number of events (MSIs) needed for that device.
> + * This does not check if this particular hardware device is already mapped
> + * at another domain, it is expected that this would be done by the caller.
> + */
> +int gicv3_its_map_guest_device(struct domain *d,
> +   paddr_t host_doorbell, uint32_t host_devid,
> +   paddr_t guest_doorbell, uint32_t guest_devid,
> +   uint32_t nr_events, bool valid)
> +{
> +void *itt_addr = NULL;
> +   
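
The MAPD encoding used by its_send_cmd_mapd() above can be tried in
isolation. The sketch below follows the v3 version of the function but
returns an error instead of asserting; the toy_* names are hypothetical, and
the constants (command number 0x08, valid flag in bit 63 of the third
doubleword, ITT address in bits [51:8]) are stated here as assumptions taken
from the GICv3 ITS command layout rather than from the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_GITS_CMD_MAPD   0x08ULL
#define TOY_GITS_VALID_BIT  (1ULL << 63)
/* Bits [51:8]: the ITT address must be 256-byte aligned and below 2^52. */
#define TOY_ITT_ADDR_MASK   (((1ULL << 52) - 1) & ~((1ULL << 8) - 1))

/* Fill the four 64-bit words of a MAPD command; returns 0 or -1. */
int toy_encode_mapd(uint64_t cmd[4], uint32_t deviceid, uint8_t size_bits,
                    uint64_t itt_addr, bool valid)
{
    /* Arguments are only checked when a mapping is being established. */
    if (valid && (size_bits >= 32 || (itt_addr & ~TOY_ITT_ADDR_MASK)))
        return -1;

    cmd[0] = TOY_GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
    cmd[1] = size_bits;
    cmd[2] = itt_addr;
    if (valid)
        cmd[2] |= TOY_GITS_VALID_BIT;
    cmd[3] = 0;

    return 0;
}
```

Calling it with valid set to false and zero for the size and address gives
the unmap form that remove_mapped_guest_device() sends.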

Re: [Xen-devel] [PATCH v2 06/27] ARM: GICv3 ITS: introduce device mapping

2017-03-30 Thread Vijay Kilari
Hi Andre,

On Thu, Mar 16, 2017 at 4:50 PM, Andre Przywara  wrote:
> The ITS uses device IDs to map LPIs to a device. Dom0 will later use
> those IDs, which we directly pass on to the host.
> For this we have to map each device that Dom0 may request to a host
> ITS device with the same identifier.
> Allocate the respective memory and enter each device into an rbtree to
> later be able to iterate over it or to easily teardown guests.
>
> Signed-off-by: Andre Przywara 
> ---
>  xen/arch/arm/gic-v3-its.c| 207 
> +++
>  xen/arch/arm/vgic-v3.c   |   3 +
>  xen/include/asm-arm/domain.h |   3 +
>  xen/include/asm-arm/gic_v3_its.h |  18 
>  4 files changed, 231 insertions(+)
>
> diff --git a/xen/arch/arm/gic-v3-its.c b/xen/arch/arm/gic-v3-its.c
> index 5c11b0d..60b15b5 100644
> --- a/xen/arch/arm/gic-v3-its.c
> +++ b/xen/arch/arm/gic-v3-its.c
> @@ -21,6 +21,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -32,6 +34,17 @@
>
>  LIST_HEAD(host_its_list);
>
> +struct its_devices {
> +struct rb_node rbnode;
> +struct host_its *hw_its;
> +void *itt_addr;
> +paddr_t guest_doorbell;
> +uint32_t host_devid;
> +uint32_t guest_devid;
> +uint32_t eventids;
> +uint32_t *host_lpis;
> +};
> +
>  bool gicv3_its_host_has_its(void)
>  {
>  return !list_empty(&host_its_list);
> @@ -149,6 +162,24 @@ static int its_send_cmd_mapc(struct host_its *its, uint32_t collection_id,
>  return its_send_command(its, cmd);
>  }
>
> +static int its_send_cmd_mapd(struct host_its *its, uint32_t deviceid,
> + uint8_t size_bits, paddr_t itt_addr, bool valid)
> +{
> +uint64_t cmd[4];
> +
> +if ( valid )
> +{
> +ASSERT(size_bits < 32);
> +ASSERT(!(itt_addr & ~GENMASK(51, 8)));
> +}
> +cmd[0] = GITS_CMD_MAPD | ((uint64_t)deviceid << 32);
> +cmd[1] = valid ? size_bits : 0x00;
> +cmd[2] = valid ? (itt_addr | GITS_VALID_BIT) : 0x00;
> +cmd[3] = 0x00;
> +
> +return its_send_command(its, cmd);
> +}
> +
>  /* Set up the (1:1) collection mapping for the given host CPU. */
>  int gicv3_its_setup_collection(unsigned int cpu)
>  {
> @@ -379,6 +410,7 @@ static int gicv3_its_init_single_its(struct host_its *hw_its)
>  devid_bits = min(devid_bits, max_its_device_bits);
>  if ( reg & GITS_TYPER_PTA )
>  hw_its->flags |= HOST_ITS_USES_PTA;
> +hw_its->itte_size = GITS_TYPER_ITT_SIZE(reg);
>
>  for ( i = 0; i < GITS_BASER_NR_REGS; i++ )
>  {
> @@ -428,6 +460,180 @@ int gicv3_its_init(void)
>  return 0;
>  }
>
> +static int remove_mapped_guest_device(struct its_devices *dev)
> +{
> +int ret;
> +
> +if ( dev->hw_its )
> +{
> +int ret = its_send_cmd_mapd(dev->hw_its, dev->host_devid, 0, 0, false);
> +if ( ret )
> +return ret;
> +}
> +
> +ret = gicv3_its_wait_commands(dev->hw_its);
> +if ( ret )
> +return ret;
> +
> +xfree(dev->itt_addr);
> +xfree(dev);
> +
> +return 0;
> +}
> +
> +static struct host_its *gicv3_its_find_by_doorbell(paddr_t doorbell_address)
> +{
> +struct host_its *hw_its;
> +
> +list_for_each_entry(hw_its, &host_its_list, entry)
> +{
> +if ( hw_its->addr + ITS_DOORBELL_OFFSET == doorbell_address )
> +return hw_its;
> +}
> +
> +return NULL;
> +}
> +
> +static int compare_its_guest_devices(struct its_devices *dev,
> + paddr_t doorbell, uint32_t devid)
> +{
> +if ( dev->guest_doorbell < doorbell )
> +return -1;
> +
> +if ( dev->guest_doorbell > doorbell )
> +return 1;
> +
> +if ( dev->guest_devid < devid )
> +return -1;
> +
> +if ( dev->guest_devid > devid )
> +return 1;
> +
> +return 0;
> +}
> +
> +/*
> + * Map a hardware device, identified by a certain host ITS and its device ID
> + * to domain d, a guest ITS (identified by its doorbell address) and device ID.
> + * Also provide the number of events (MSIs) needed for that device.
> + * This does not check if this particular hardware device is already mapped
> + * at another domain, it is expected that this would be done by the caller.
> + */
> +int gicv3_its_map_guest_device(struct domain *d,
> +   paddr_t host_doorbell, uint32_t host_devid,
> +   paddr_t guest_doorbell, uint32_t guest_devid,
> +   uint32_t nr_events, bool valid)
> +{
> +void *itt_addr = NULL;
> +struct host_its *hw_its;
> +struct its_devices *dev = NULL, *temp;
> +struct rb_node **new = &d->arch.vgic.its_devices.rb_node, *parent = NULL;
> +int ret = -ENOENT;
> +
> +hw_its = gicv3_its_find_by_doorbell(host_doorbell);
> +if ( !hw_its )
> +return ret;
> +
> +/* check for already existing mappings */
> +
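
The ordering imposed by compare_its_guest_devices() above (guest doorbell
first, then guest device ID) is what makes the per-domain lookup work. The
sketch below shows the same two-level key with a sorted array and the
standard qsort()/bsearch() calls, which is a deliberate substitution for the
red-black tree used in the patch; all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct toy_its_device {
    uint64_t guest_doorbell;  /* identifies the virtual ITS */
    uint32_t guest_devid;     /* device ID as seen by the guest */
};

/* Order by doorbell address first, then by device ID. */
static int toy_compare(const void *a, const void *b)
{
    const struct toy_its_device *x = a, *y = b;

    if (x->guest_doorbell != y->guest_doorbell)
        return x->guest_doorbell < y->guest_doorbell ? -1 : 1;

    if (x->guest_devid != y->guest_devid)
        return x->guest_devid < y->guest_devid ? -1 : 1;

    return 0;
}

void toy_sort_devices(struct toy_its_device *devs, size_t n)
{
    qsort(devs, n, sizeof(*devs), toy_compare);
}

/* Binary search over an array previously sorted by toy_sort_devices(). */
const struct toy_its_device *toy_find_device(const struct toy_its_device *devs,
                                             size_t n, uint64_t doorbell,
                                             uint32_t devid)
{
    struct toy_its_device key = { doorbell, devid };

    return bsearch(&key, devs, n, sizeof(*devs), toy_compare);
}
```

A tree keeps insertion and removal cheap as guests map and unmap devices; the
array form is used here only because it needs nothing beyond the C library.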

[Xen-devel] [RFC PATCH v2 23/25] ARM: NUMA: Initialize ACPI NUMA

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Call ACPI NUMA initialization under CONFIG_ACPI_NUMA.

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/numa/acpi_numa.c | 28 +++-
 xen/arch/arm/numa/numa.c  |  6 ++
 xen/common/numa.c | 14 ++
 xen/include/asm-arm/numa.h|  1 +
 xen/include/xen/numa.h|  1 +
 5 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 8f51ed0..574ed45 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -29,6 +29,7 @@
 #include 
 
 extern nodemask_t processor_nodes_parsed;
+extern nodemask_t memory_nodes_parsed;
 
 /* Holds CPUID to MPIDR mapping read from MADT table. */
 struct cpuid_to_hwid {
@@ -183,7 +184,7 @@ acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
pxm, mpidr, node);
 }
 
-void __init acpi_map_uid_to_mpidr(void)
+static void __init acpi_map_uid_to_mpidr(void)
 {
 acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
 acpi_parse_madt_handler, NR_CPUS);
@@ -211,6 +212,31 @@ void __init arch_table_parse_srat(void)
   acpi_parse_gicc_affinity, NR_CPUS);
 }
 
+bool_t __init arch_acpi_numa_init(void)
+{
+int ret;
+
+if ( !acpi_disabled )
+{
+/*
+ * If firmware has DT, process_memory_node() call
+ * would have added memory blocks. So reset it before
+ * ACPI numa init.
+ */
+numa_clear_memblks();
+nodes_clear(memory_nodes_parsed);
+acpi_map_uid_to_mpidr();
+ret = acpi_numa_init();
+if ( ret || srat_disabled() )
+return 1;
+
+/* Register acpi node_distance handler */
+register_node_distance(_node_distance);
+}
+
+return 0;
+}
+
 void __init acpi_numa_arch_fixup(void) {}
 
 /*
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 958085c..b5556c6 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -152,12 +152,18 @@ void __init numa_init(void)
 if ( is_numa_off() )
 goto no_numa;
 
+#ifdef CONFIG_ACPI_NUMA
+ret = arch_acpi_numa_init();
+if ( ret )
+printk(XENLOG_WARNING "ACPI NUMA init failed\n");
+#else
 if ( !dt_numa )
 goto no_numa;
 
 ret = dt_numa_init();
 if ( ret )
 printk(XENLOG_WARNING "DT NUMA init failed\n");
+#endif
 
 for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
 {
diff --git a/xen/common/numa.c b/xen/common/numa.c
index f2ac726..aca2386 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -84,6 +84,20 @@ nodeid_t get_memblk_nodeid(int id)
 return memblk_nodeid[id];
 }
 
+void __init numa_clear_memblks(void)
+{
+unsigned int i;
+
+for ( i = 0; i < get_num_node_memblks(); i++ )
+{
+node_memblk_range[i].start = 0;
+node_memblk_range[i].end = 0;
+memblk_nodeid[i] = NUMA_NO_NODE;
+}
+
+num_node_memblks = 0;
+}
+
 int __init get_mem_nodeid(paddr_t start, paddr_t end)
 {
 unsigned int i;
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 1d4dc98..f932ba3 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -24,6 +24,7 @@ static inline nodeid_t acpi_get_nodeid(uint64_t hwid)
 
 #ifdef CONFIG_NUMA
 extern void numa_init(void);
+extern bool_t arch_acpi_numa_init(void);
 extern int dt_numa_init(void);
 extern void numa_set_cpu_node(int cpu, unsigned int nid);
 extern void numa_add_cpu(int cpu);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index c3b4adc..6c885bd 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -59,4 +59,5 @@ void set_acpi_numa(bool val);
 int get_numa_fake(void);
 extern int numa_emulation(uint64_t start_pfn, uint64_t end_pfn);
 extern void numa_dummy_init(uint64_t start_pfn, uint64_t end_pfn);
+extern void numa_clear_memblks(void);
 #endif /* _XEN_NUMA_H */
-- 
2.7.4




[Xen-devel] [RFC PATCH v2 14/25] ARM: NUMA: Parse NUMA distance information

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Parse distance-matrix and fetch node distance information.
Store distance information in node_distance[].

Register dt_node_distance() function pointer with
the ARM numa code. This approach can be later used for
ACPI.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/bootfdt.c  |   4 +-
 xen/arch/arm/numa/dt_numa.c | 133 
 xen/arch/arm/numa/numa.c|  21 +++
 xen/include/asm-arm/numa.h  |   3 +
 xen/include/asm-arm/setup.h |   2 +
 5 files changed, 161 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 993760a..c72300c 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -17,8 +17,8 @@
 #include 
 #include 
 
-static bool_t __init device_tree_node_matches(const void *fdt, int node,
-  const char *match)
+bool_t __init device_tree_node_matches(const void *fdt, int node,
+   const char *match)
 {
 const char *name;
 size_t match_len;
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
index 593c647..c2dcfa1 100644
--- a/xen/arch/arm/numa/dt_numa.c
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -27,6 +27,48 @@
 extern nodemask_t processor_nodes_parsed;
 extern nodemask_t memory_nodes_parsed;
 
+static uint8_t node_distance[MAX_NUMNODES][MAX_NUMNODES];
+
+static uint8_t dt_node_distance(nodeid_t nodea, nodeid_t nodeb)
+{
+if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES )
+return nodea == nodeb ? LOCAL_DISTANCE : REMOTE_DISTANCE;
+
+return node_distance[nodea][nodeb];
+}
+
+static int dt_numa_set_distance(uint32_t nodea, uint32_t nodeb,
+uint32_t distance)
+{
+   /* node_distance is uint8_t. Ensure distance does not exceed 255 */
+   if ( nodea >= MAX_NUMNODES || nodeb >= MAX_NUMNODES || distance > 255 )
+   return -EINVAL;
+
+   node_distance[nodea][nodeb] = distance;
+
+   return 0;
+}
+
+void init_dt_numa_distance(void)
+{
+int i, j;
+
+for ( i = 0; i < MAX_NUMNODES; i++ )
+{
+for ( j = 0; j < MAX_NUMNODES; j++ )
+{
+/*
+ * Initialize distance 10 for local distance and
+ * 20 for remote distance.
+ */
+if ( i  == j )
+node_distance[i][j] = LOCAL_DISTANCE;
+else
+node_distance[i][j] = REMOTE_DISTANCE;
+}
+}
+}
+
 /*
  * Even though we connect cpus to numa domains later in SMP
  * init, we need to know the node ids now for all cpus.
@@ -48,6 +90,76 @@ static int __init dt_numa_process_cpu_node(const void *fdt, int node,
 return 0;
 }
 
+static int __init dt_numa_parse_distance_map(const void *fdt, int node,
+ const char *name,
+ uint32_t address_cells,
+ uint32_t size_cells)
+{
+const struct fdt_property *prop;
+const __be32 *matrix;
+int entry_count, len, i;
+
+printk(XENLOG_INFO "NUMA: parsing numa-distance-map\n");
+
+prop = fdt_get_property(fdt, node, "distance-matrix", &len);
+if ( !prop )
+{
+printk(XENLOG_WARNING
+   "NUMA: No distance-matrix property in distance-map\n");
+
+return -EINVAL;
+}
+
+if ( len % sizeof(uint32_t) != 0 )
+{
+ printk(XENLOG_WARNING
+"distance-matrix in node is not a multiple of u32\n");
+
+return -EINVAL;
+}
+
+entry_count = len / sizeof(uint32_t);
+if ( entry_count <= 0 )
+{
+printk(XENLOG_WARNING "NUMA: Invalid distance-matrix\n");
+
+return -EINVAL;
+}
+
+matrix = (const __be32 *)prop->data;
+for ( i = 0; i + 2 < entry_count; i += 3 )
+{
+uint32_t nodea, nodeb, distance;
+
+nodea = dt_read_number(matrix, 1);
+matrix++;
+nodeb = dt_read_number(matrix, 1);
+matrix++;
+distance = dt_read_number(matrix, 1);
+matrix++;
+
+if ( dt_numa_set_distance(nodea, nodeb, distance) )
+{
+printk(XENLOG_WARNING
+   "NUMA: node-id out of range in distance matrix for [node%d -> node%d]\n",
+   nodea, nodeb);
+return -EINVAL;
+
+}
+printk(XENLOG_INFO "NUMA: distance[node%d -> node%d] = %d\n",
+   nodea, nodeb, distance);
+
+/*
+ * Set default distance of node B->A same as A->B.
+ * No need to check the return value of dt_numa_set_distance().
+ */
+if ( nodeb > nodea )
+dt_numa_set_distance(nodeb, nodea, distance);
+}
+
+return 0;
+}
+
 static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
 const char *name, int depth,
 uint32_t address_cells,
@@ 
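
The distance-map handling in the hunk above can be illustrated outside
of Xen. What follows is a minimal standalone sketch, not the patch
itself: it assumes the property payload is a flat array of big-endian
32-bit cells laid out as <nodea nodeb distance> triplets, and the names
be32_cell(), distance_init(), distance_set(), distance_parse() and
MAX_NODES are hypothetical. Unlike the loop in the patch, which silently
skips a trailing partial triplet, this sketch rejects such input.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES       4    /* assumption: small illustrative limit */
#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20

static uint8_t distance_tbl[MAX_NODES][MAX_NODES];

/* Decode one big-endian 32-bit cell, as stored in a flattened device tree. */
static uint32_t be32_cell(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Default table: 10 on the diagonal, 20 everywhere else. */
static void distance_init(void)
{
    for ( size_t i = 0; i < MAX_NODES; i++ )
        for ( size_t j = 0; j < MAX_NODES; j++ )
            distance_tbl[i][j] = (i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/* Store one entry; reject out-of-range node ids or distances. */
static int distance_set(uint32_t a, uint32_t b, uint32_t d)
{
    if ( a >= MAX_NODES || b >= MAX_NODES || d > 255 )
        return -1;

    distance_tbl[a][b] = (uint8_t)d;
    return 0;
}

/* Parse <nodea nodeb distance> triplets. Returns 0 on success, -1 on error. */
static int distance_parse(const uint8_t *data, size_t len)
{
    size_t cells;

    if ( len == 0 || len % 4 != 0 )
        return -1;

    cells = len / 4;
    if ( cells % 3 != 0 )   /* stricter than the patch: no partial triplets */
        return -1;

    for ( size_t i = 0; i < cells; i += 3 )
    {
        uint32_t a = be32_cell(data + 4 * i);
        uint32_t b = be32_cell(data + 4 * (i + 1));
        uint32_t d = be32_cell(data + 4 * (i + 2));

        if ( distance_set(a, b, d) )
            return -1;

        /* Mirror A->B into B->A as a default; a later entry may override it. */
        if ( b > a )
            distance_set(b, a, d);
    }

    return 0;
}
```

Calling distance_init() before distance_parse() mirrors the order used
in the series, where the defaults are set up first and the DT values
overwrite them.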

[Xen-devel] [RFC PATCH v2 21/25] ACPI: Move arch specific SRAT parsing

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

SRAT's X2APIC_CPU_AFFINITY and CPU_AFFINITY types are not used
by ARM. Hence move the handling of these SRAT types to an arch
specific file and handle them under arch_table_parse_srat().

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/numa/acpi_numa.c |  5 +
 xen/arch/x86/srat.c   | 44 +++
 xen/drivers/acpi/numa.c   | 43 ++
 xen/include/xen/acpi.h|  6 ++
 4 files changed, 57 insertions(+), 41 deletions(-)

diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 45b3d35..6fd937d 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -82,6 +82,11 @@ void __init acpi_map_uid_to_mpidr(void)
 acpi_parse_madt_handler, NR_CPUS);
 }
 
+void __init arch_table_parse_srat(void)
+{
+return;
+}
+
 void __init acpi_numa_arch_fixup(void) {}
 
 /*
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 760df7f..2c79329 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -214,3 +214,47 @@ uint8_t __node_distance(nodeid_t a, nodeid_t b)
 }
 
 EXPORT_SYMBOL(__node_distance);
+
+static int __init
+acpi_parse_x2apic_affinity(struct acpi_subtable_header *header,
+  const unsigned long end)
+{
+   const struct acpi_srat_x2apic_cpu_affinity *processor_affinity
+   = container_of(header, struct acpi_srat_x2apic_cpu_affinity,
+  header);
+
+   if (!header)
+   return -EINVAL;
+
+   acpi_table_print_srat_entry(header);
+
+   /* let architecture-dependent part to do it */
+   acpi_numa_x2apic_affinity_init(processor_affinity);
+
+   return 0;
+}
+
+static int __init
+acpi_parse_processor_affinity(struct acpi_subtable_header *header,
+ const unsigned long end)
+{
+   const struct acpi_srat_cpu_affinity *processor_affinity
+   = container_of(header, struct acpi_srat_cpu_affinity, header);
+
+   if (!header)
+   return -EINVAL;
+
+   acpi_table_print_srat_entry(header);
+
+   acpi_numa_processor_affinity_init(processor_affinity);
+
+   return 0;
+}
+
+void __init arch_table_parse_srat(void)
+{
+   acpi_table_parse_srat(ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY,
+ acpi_parse_x2apic_affinity, 0);
+   acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
+ acpi_parse_processor_affinity, 0);
+}
diff --git a/xen/drivers/acpi/numa.c b/xen/drivers/acpi/numa.c
index 85f8917..0adc32c 100644
--- a/xen/drivers/acpi/numa.c
+++ b/xen/drivers/acpi/numa.c
@@ -120,43 +120,6 @@ static int __init acpi_parse_slit(struct acpi_table_header *table)
 }
 
 static int __init
-acpi_parse_x2apic_affinity(struct acpi_subtable_header *header,
-  const unsigned long end)
-{
-   const struct acpi_srat_x2apic_cpu_affinity *processor_affinity
-   = container_of(header, struct acpi_srat_x2apic_cpu_affinity,
-  header);
-
-   if (!header)
-   return -EINVAL;
-
-   acpi_table_print_srat_entry(header);
-
-   /* let architecture-dependent part to do it */
-   acpi_numa_x2apic_affinity_init(processor_affinity);
-
-   return 0;
-}
-
-static int __init
-acpi_parse_processor_affinity(struct acpi_subtable_header *header,
- const unsigned long end)
-{
-   const struct acpi_srat_cpu_affinity *processor_affinity
-   = container_of(header, struct acpi_srat_cpu_affinity, header);
-
-   if (!header)
-   return -EINVAL;
-
-   acpi_table_print_srat_entry(header);
-
-   /* let architecture-dependent part to do it */
-   acpi_numa_processor_affinity_init(processor_affinity);
-
-   return 0;
-}
-
-static int __init
 acpi_parse_memory_affinity(struct acpi_subtable_header *header,
   const unsigned long end)
 {
@@ -197,13 +160,11 @@ int __init acpi_numa_init(void)
 {
/* SRAT: Static Resource Affinity Table */
if (!acpi_table_parse(ACPI_SIG_SRAT, acpi_parse_srat)) {
-   acpi_table_parse_srat(ACPI_SRAT_TYPE_X2APIC_CPU_AFFINITY,
- acpi_parse_x2apic_affinity, 0);
-   acpi_table_parse_srat(ACPI_SRAT_TYPE_CPU_AFFINITY,
- acpi_parse_processor_affinity, 0);
acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
  acpi_parse_memory_affinity,
  NR_NODE_MEMBLKS);
+   /* This call handles architecture dependant SRAT */
+   arch_table_parse_srat();
}
 
/* SLIT: System Locality Information Table */
diff --git a/xen/include/xen/acpi.h b/xen/include/xen/acpi.h
index 30ec0ee..0726524 100644

[Xen-devel] [RFC PATCH v2 22/25] ARM: NUMA: Extract proximity from SRAT table

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Register an SRAT entry handler for type
ACPI_SRAT_TYPE_GICC_AFFINITY to parse the SRAT table
and extract the proximity domain for all CPU IDs.

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/acpi/boot.c  |   2 +
 xen/arch/arm/numa/acpi_numa.c | 126 +-
 xen/drivers/acpi/numa.c   |  15 +
 xen/include/acpi/actbl1.h |  17 +-
 xen/include/asm-arm/numa.h|   9 +++
 xen/include/xen/numa.h|   4 ++
 6 files changed, 171 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/acpi/boot.c b/xen/arch/arm/acpi/boot.c
index 889208a..835c44e 100644
--- a/xen/arch/arm/acpi/boot.c
+++ b/xen/arch/arm/acpi/boot.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -117,6 +118,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
 return;
 }
 
+numa_set_node(enabled_cpus, acpi_get_nodeid(mpidr));
 /* map the logical cpu id to cpu MPIDR */
 cpu_logical_map(enabled_cpus) = mpidr;
 
diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
index 6fd937d..8f51ed0 100644
--- a/xen/arch/arm/numa/acpi_numa.c
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -28,19 +28,71 @@
 #include 
 #include 
 
+extern nodemask_t processor_nodes_parsed;
+
 /* Holds CPUID to MPIDR mapping read from MADT table. */
 struct cpuid_to_hwid {
 uint32_t cpuid;
 uint64_t hwid;
 };
 
+/* Holds NODE to MPIDR mapping. */
+struct node_to_hwid {
+nodeid_t nodeid;
+uint64_t hwid;
+};
+
 #define PHYS_CPUID_INVALID 0xff
 
 /* Holds mapping of CPU id to MPIDR read from MADT */
 static struct cpuid_to_hwid __read_mostly cpuid_to_hwid_map[NR_CPUS] =
 { [0 ... NR_CPUS - 1] = {PHYS_CPUID_INVALID, MPIDR_INVALID} };
+static struct node_to_hwid __read_mostly node_to_hwid_map[NR_CPUS] =
+{ [0 ... NR_CPUS - 1] = {NUMA_NO_NODE, MPIDR_INVALID} };
+static unsigned int cpus_in_srat;
 static unsigned int num_cpuid_to_hwid;
 
+nodeid_t __init acpi_get_nodeid(uint64_t hwid)
+{
+unsigned int i;
+
+for ( i = 0; i < cpus_in_srat; i++ )
+{
+if ( node_to_hwid_map[i].hwid == hwid )
+return node_to_hwid_map[i].nodeid;
+}
+
+return NUMA_NO_NODE;
+}
+
+static uint64_t acpi_get_cpu_hwid(int cid)
+{
+unsigned int i;
+
+for ( i = 0; i < num_cpuid_to_hwid; i++ )
+{
+if ( cpuid_to_hwid_map[i].cpuid == cid )
+return cpuid_to_hwid_map[i].hwid;
+}
+
+return MPIDR_INVALID;
+}
+
+static void __init acpi_map_node_to_hwid(nodeid_t nodeid, uint64_t hwid)
+{
+if ( nodeid >= MAX_NUMNODES )
+{
+printk(XENLOG_WARNING
+   "ACPI: NUMA: nodeid out of range %d with MPIDR 0x%lx\n",
+   nodeid, hwid);
+numa_failed();
+return;
+}
+
+node_to_hwid_map[cpus_in_srat].nodeid = nodeid;
+node_to_hwid_map[cpus_in_srat].hwid = hwid;
+}
+
 static void __init acpi_map_cpu_to_hwid(uint32_t cpuid, uint64_t mpidr)
 {
 if ( mpidr == MPIDR_INVALID )
@@ -76,15 +128,87 @@ static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
 return 0;
 }
 
+/* Callback for Proximity Domain -> ACPI processor UID mapping */
+static void __init
+acpi_numa_gicc_affinity_init(const struct acpi_srat_gicc_affinity *pa)
+{
+int pxm, node;
+uint64_t mpidr;
+
+if ( srat_disabled() )
+return;
+
+if ( pa->header.length < sizeof(struct acpi_srat_gicc_affinity) )
+{
+printk(XENLOG_WARNING "SRAT: Invalid SRAT header length: %d\n",
+   pa->header.length);
+numa_failed();
+return;
+}
+
+if ( !(pa->flags & ACPI_SRAT_GICC_ENABLED) )
+return;
+
+if ( cpus_in_srat >= NR_CPUS )
+{
+printk(XENLOG_ERR
+   "SRAT: cpu_to_node_map[%d] is too small to fit all cpus\n",
+   NR_CPUS);
+return;
+}
+
+pxm = pa->proximity_domain;
+node = acpi_setup_node(pxm);
+if ( node == NUMA_NO_NODE )
+{
+numa_failed();
+return;
+}
+
+mpidr = acpi_get_cpu_hwid(pa->acpi_processor_uid);
+if ( mpidr == MPIDR_INVALID )
+{
+printk(XENLOG_ERR
+   "SRAT: PXM %d with ACPI ID %d has no valid MPIDR in MADT\n",
+   pxm, pa->acpi_processor_uid);
+numa_failed();
+return;
+}
+
+acpi_map_node_to_hwid(node, mpidr);
+node_set(node, processor_nodes_parsed);
+cpus_in_srat++;
+set_acpi_numa(1);
+printk(XENLOG_INFO "SRAT: PXM %d -> MPIDR 0x%lx -> Node %d\n",
+   pxm, mpidr, node);
+}
+
 void __init acpi_map_uid_to_mpidr(void)
 {
 acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
 acpi_parse_madt_handler, NR_CPUS);
 }
 
+static int __init
+acpi_parse_gicc_affinity(struct acpi_subtable_header *header,
+ const unsigned long end)
+{
+   const struct acpi_srat_gicc_affinity 

[Xen-devel] [RFC PATCH v2 20/25] ARM: NUMA: Extract MPIDR from MADT table

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Parse the MADT table, extract the MPIDR for all CPU IDs
found in ACPI_MADT_TYPE_GENERIC_INTERRUPT entries
and store them in cpuid_to_hwid_map[].

This mapping is used by the SRAT table parsing to
look up the MPIDR of a CPU ID.

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/numa/Makefile|  1 +
 xen/arch/arm/numa/acpi_numa.c | 94 +++
 xen/arch/arm/numa/numa.c  |  6 +++
 3 files changed, 101 insertions(+)

diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
index 3af3aff..b549459 100644
--- a/xen/arch/arm/numa/Makefile
+++ b/xen/arch/arm/numa/Makefile
@@ -1,2 +1,3 @@
 obj-y += dt_numa.o
 obj-y += numa.o
+obj-$(CONFIG_ACPI_NUMA) += acpi_numa.o
diff --git a/xen/arch/arm/numa/acpi_numa.c b/xen/arch/arm/numa/acpi_numa.c
new file mode 100644
index 000..45b3d35
--- /dev/null
+++ b/xen/arch/arm/numa/acpi_numa.c
@@ -0,0 +1,94 @@
+/*
+ * ACPI based NUMA setup
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K 
+ *
+ * Reads the ACPI MADT and SRAT table to setup NUMA information.
+ * Contains Excerpts from x86 implementation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/* Holds CPUID to MPIDR mapping read from MADT table. */
+struct cpuid_to_hwid {
+uint32_t cpuid;
+uint64_t hwid;
+};
+
+#define PHYS_CPUID_INVALID 0xff
+
+/* Holds mapping of CPU id to MPIDR read from MADT */
+static struct cpuid_to_hwid __read_mostly cpuid_to_hwid_map[NR_CPUS] =
+{ [0 ... NR_CPUS - 1] = {PHYS_CPUID_INVALID, MPIDR_INVALID} };
+static unsigned int num_cpuid_to_hwid;
+
+static void __init acpi_map_cpu_to_hwid(uint32_t cpuid, uint64_t mpidr)
+{
+if ( mpidr == MPIDR_INVALID )
+{
+printk("Skip MADT cpu entry with invalid MPIDR\n");
+numa_failed();
+return;
+}
+
+cpuid_to_hwid_map[num_cpuid_to_hwid].hwid = mpidr;
+cpuid_to_hwid_map[num_cpuid_to_hwid].cpuid = cpuid;
+num_cpuid_to_hwid++;
+}
+
+static int __init acpi_parse_madt_handler(struct acpi_subtable_header *header,
+  const unsigned long end)
+{
+uint64_t mpidr;
+struct acpi_madt_generic_interrupt *p =
+   container_of(header, struct acpi_madt_generic_interrupt, header);
+
+if ( BAD_MADT_ENTRY(p, end) )
+{
+/* Though MADT is invalid, we disable NUMA by calling numa_failed() */
+numa_failed();
+return -EINVAL;
+}
+
+acpi_table_print_madt_entry(header);
+mpidr = p->arm_mpidr & MPIDR_HWID_MASK;
+acpi_map_cpu_to_hwid(p->uid, mpidr);
+
+return 0;
+}
+
+void __init acpi_map_uid_to_mpidr(void)
+{
+acpi_table_parse_madt(ACPI_MADT_TYPE_GENERIC_INTERRUPT,
+acpi_parse_madt_handler, NR_CPUS);
+}
+
+void __init acpi_numa_arch_fixup(void) {}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 891d304..958085c 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 
@@ -69,6 +70,11 @@ void numa_failed(void)
 init_dt_numa_distance();
 node_distance_fn = NULL;
 init_cpu_to_node();
+
+#ifdef CONFIG_ACPI_NUMA
+set_acpi_numa(0);
+reset_pxm2node();
+#endif
 }
 
 int __init arch_sanitize_nodes_memory(void)
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
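
The UID to MPIDR table built from the MADT can be sketched in isolation.
This is a minimal illustration under stated assumptions, not the code
from the patch: MAX_CPUS stands in for NR_CPUS, HWID_INVALID stands in
for MPIDR_INVALID, and uid_map_add() and uid_map_lookup() are
hypothetical names. The sketch also adds a bounds check on the table
index, which acpi_map_cpu_to_hwid() in the patch does not perform
itself.

```c
#include <stdint.h>

#define MAX_CPUS     8            /* assumption: stand-in for NR_CPUS */
#define HWID_INVALID UINT64_MAX   /* assumption: stand-in for MPIDR_INVALID */

struct uid_hwid {
    uint32_t uid;    /* ACPI processor UID from the MADT entry */
    uint64_t hwid;   /* masked MPIDR value */
};

static struct uid_hwid uid_map[MAX_CPUS];
static unsigned int uid_map_len;

/* Record one MADT entry. Returns 0 on success, -1 if it is rejected. */
static int uid_map_add(uint32_t uid, uint64_t hwid)
{
    if ( hwid == HWID_INVALID || uid_map_len >= MAX_CPUS )
        return -1;

    uid_map[uid_map_len].uid = uid;
    uid_map[uid_map_len].hwid = hwid;
    uid_map_len++;

    return 0;
}

/* Linear search by UID; returns HWID_INVALID when the UID is unknown. */
static uint64_t uid_map_lookup(uint32_t uid)
{
    for ( unsigned int i = 0; i < uid_map_len; i++ )
        if ( uid_map[i].uid == uid )
            return uid_map[i].hwid;

    return HWID_INVALID;
}
```

A linear search is adequate here because the table is only consulted
during boot-time table parsing.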


[Xen-devel] [RFC PATCH v2 17/25] ARM: NUMA: Add fallback on NUMA failure

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

On NUMA initialization failure, reset all the
NUMA structures to emulate a single node.

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/numa/numa.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 7583a40..891d304 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 
@@ -164,7 +165,12 @@ void __init numa_init(void)
 if ( !ret )
 ret = numa_initmem_init(ram_start, ram_end);
 
+if ( !ret )
+return;
+
 no_numa:
+numa_dummy_init(PFN_UP(ram_start),PFN_DOWN(ram_end));
+
 return;
 }
 
-- 
2.7.4
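
The intended effect of the fallback, as described in the commit message,
can be sketched as follows. This is only an illustration under
assumptions: it does not reproduce numa_dummy_init(), it assumes 4 KiB
pages, and cpu_node[], node0, nr_nodes and single_node_fallback() are
hypothetical names. The PFN_UP()/PFN_DOWN() rounding matches the
arguments used in the hunk above, so that only whole pages inside the
RAM range are covered.

```c
#include <stdint.h>

#define PAGE_SHIFT  12   /* assumption: 4 KiB pages */
#define PFN_UP(x)   (((x) + ((uint64_t)1 << PAGE_SHIFT) - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define MAX_CPUS    8    /* assumption: stand-in for NR_CPUS */

struct node_span {
    uint64_t start_pfn;
    uint64_t end_pfn;
};

static uint8_t cpu_node[MAX_CPUS];
static struct node_span node0;
static unsigned int nr_nodes;

/* Collapse the topology to one node covering all of RAM. */
static void single_node_fallback(uint64_t ram_start, uint64_t ram_end)
{
    for ( unsigned int i = 0; i < MAX_CPUS; i++ )
        cpu_node[i] = 0;

    /* Round inwards so partially covered pages are excluded. */
    node0.start_pfn = PFN_UP(ram_start);
    node0.end_pfn = PFN_DOWN(ram_end);
    nr_nodes = 1;
}
```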




[Xen-devel] [RFC PATCH v2 25/25] NUMA: Enable ACPI_NUMA config

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Add CONFIG_ACPI_NUMA to xen/drivers/acpi/Kconfig and
drop the CONFIG_ACPI_NUMA definition from asm-x86/config.h.

Signed-off-by: Vijaya Kumar K 
---
 xen/drivers/acpi/Kconfig | 4 
 xen/include/asm-x86/config.h | 1 -
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index 488372f..8e15428 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -4,3 +4,7 @@ config ACPI
 
 config ACPI_LEGACY_TABLES_LOOKUP
bool
+
+config ACPI_NUMA
+   def_bool y
+   depends on ACPI && NUMA
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index b9a6d94..cc27a52 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -37,7 +37,6 @@
 #define CONFIG_X86_L1_CACHE_SHIFT 7
 
 #define CONFIG_ACPI_SLEEP 1
-#define CONFIG_ACPI_NUMA 1
 #define CONFIG_ACPI_SRAT 1
 #define CONFIG_ACPI_CSTATE 1
 
-- 
2.7.4




[Xen-devel] [RFC PATCH v2 24/25] NUMA: Move CONFIG_NUMA to common Kconfig

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

CONFIG_NUMA is defined in xen/drivers/acpi/Kconfig.
Move it to common/Kconfig and enable it by default.
Also, the NUMA feature uses PDX for the physical address
to memory node mapping. Hence make NUMA depend on
HAS_PDX.

Signed-off-by: Vijaya Kumar K 
---
 xen/common/Kconfig   | 4 
 xen/drivers/acpi/Kconfig | 3 ---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index 5334be3..d6b8a40 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -41,6 +41,10 @@ config HAS_GDBSX
 config HAS_IOPORTS
bool
 
+config NUMA
+   def_bool y
+   depends on HAS_PDX
+
 config HAS_BUILD_ID
string
option env="XEN_HAS_BUILD_ID"
diff --git a/xen/drivers/acpi/Kconfig b/xen/drivers/acpi/Kconfig
index b64d373..488372f 100644
--- a/xen/drivers/acpi/Kconfig
+++ b/xen/drivers/acpi/Kconfig
@@ -4,6 +4,3 @@ config ACPI
 
 config ACPI_LEGACY_TABLES_LOOKUP
bool
-
-config NUMA
-   bool
-- 
2.7.4




[Xen-devel] [RFC PATCH v2 19/25] ACPI: Refactor acpi SRAT and SLIT table handling code

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

SRAT handling code which is common across architectures
is moved from xen/arch/x86/srat.c to a new file,
xen/drivers/acpi/srat.c. A new header file, srat.h, is
introduced.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/dom0_build.c   |   1 +
 xen/arch/x86/mm.c   |   2 -
 xen/arch/x86/physdev.c  |   1 +
 xen/arch/x86/setup.c|   1 +
 xen/arch/x86/smpboot.c  |   1 +
 xen/arch/x86/srat.c | 250 +-
 xen/arch/x86/x86_64/mm.c|   1 +
 xen/drivers/acpi/Makefile   |   1 +
 xen/drivers/acpi/srat.c | 299 
 xen/drivers/passthrough/vtd/iommu.c |   1 +
 xen/include/acpi/srat.h |  24 +++
 xen/include/asm-x86/mm.h|   1 -
 xen/include/asm-x86/numa.h  |   4 -
 xen/include/xen/mm.h|   2 +
 xen/include/xen/numa.h  |   1 -
 15 files changed, 333 insertions(+), 257 deletions(-)

diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 20221b5..c131a81 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index a6b2649..ebabb0c 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -137,8 +137,6 @@ l1_pgentry_t __section(".bss.page_aligned") __aligned(PAGE_SIZE)
 #define PTE_UPDATE_WITH_CMPXCHG
 #endif
 
-paddr_t __read_mostly mem_hotplug;
-
 /* Private domain structs for DOMID_XEN and DOMID_IO. */
 struct domain *dom_xen, *dom_io, *dom_cow;
 
diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c
index 81cd6c9..ecc0daf 100644
--- a/xen/arch/x86/physdev.c
+++ b/xen/arch/x86/physdev.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 4410e53..d29fd1a 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 203733e..7dc06e4 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 55947bb..760df7f 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -18,14 +18,12 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
-static struct acpi_table_slit *__read_mostly acpi_slit;
-
 extern nodemask_t processor_nodes_parsed;
 extern nodemask_t memory_nodes_parsed;
-
 /*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
  */
@@ -33,87 +31,6 @@ nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
 [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
 
-struct pxm2node {
-   unsigned int pxm;
-   nodeid_t node;
-};
-static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
-   { [0 ... MAX_NUMNODES - 1] = {.node = NUMA_NO_NODE} };
-
-static unsigned node_to_pxm(nodeid_t n);
-
-static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
-
-static inline bool node_found(unsigned int idx, unsigned int pxm)
-{
-   return ((pxm2node[idx].pxm == pxm) &&
-   (pxm2node[idx].node != NUMA_NO_NODE));
-}
-
-static void reset_pxm2node(void)
-{
-   unsigned int i;
-
-   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-   pxm2node[i].node = NUMA_NO_NODE;
-}
-
-nodeid_t pxm_to_node(unsigned int pxm)
-{
-   unsigned int i;
-
-   if ((pxm < ARRAY_SIZE(pxm2node)) && node_found(pxm, pxm))
-   return pxm2node[pxm].node;
-
-   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-   if (node_found(i, pxm))
-   return pxm2node[i].node;
-
-   return NUMA_NO_NODE;
-}
-
-nodeid_t acpi_setup_node(unsigned int pxm)
-{
-   nodeid_t node;
-   unsigned int idx;
-   static bool warned;
-   static unsigned int nodes_found;
-
-   BUILD_BUG_ON(MAX_NUMNODES >= NUMA_NO_NODE);
-
-   if (pxm < ARRAY_SIZE(pxm2node)) {
-   if (node_found(pxm, pxm))
-   return pxm2node[pxm].node;
-
-   /* Try to maintain indexing of pxm2node by pxm */
-   if (pxm2node[pxm].node == NUMA_NO_NODE) {
-   idx = pxm;
-   goto finish;
-   }
-   }
-
-   for (idx = 0; idx < ARRAY_SIZE(pxm2node); idx++)
-   if (pxm2node[idx].node == NUMA_NO_NODE)
-   goto finish;
-
-   if (!warned) {
-   printk(KERN_WARNING "SRAT: Too many proximity domains (%#x)\n",
-  pxm);
-   warned = 1;
-   }
-
-   return NUMA_NO_NODE;
-
- finish:
-   node = 

[Xen-devel] [RFC PATCH v2 12/25] ARM: NUMA: Parse CPU NUMA information

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Parse CPU node and fetch numa-node-id information.
For each node-id found, update nodemask_t mask.
Refer to /Documentation/devicetree/bindings/numa.txt.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/Makefile   |  1 +
 xen/arch/arm/bootfdt.c  | 16 --
 xen/arch/arm/numa/Makefile  |  2 ++
 xen/arch/arm/numa/dt_numa.c | 78 +
 xen/arch/arm/numa/numa.c| 50 +
 xen/arch/arm/setup.c|  4 +++
 xen/include/asm-arm/numa.h  | 10 +-
 xen/include/asm-arm/setup.h |  4 ++-
 8 files changed, 161 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 0ce94a8..d13b79f 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -3,6 +3,7 @@ subdir-$(CONFIG_ARM_64) += arm64
 subdir-y += platforms
 subdir-$(CONFIG_ARM_64) += efi
 subdir-$(CONFIG_ACPI) += acpi
+subdir-$(CONFIG_NUMA) += numa
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
 obj-y += bootfdt.init.o
diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index ea188a0..1f876f0 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -62,8 +62,20 @@ static void __init device_tree_get_reg(const __be32 **cell, u32 address_cells,
 *size = dt_next_cell(size_cells, cell);
 }
 
-static u32 __init device_tree_get_u32(const void *fdt, int node,
-  const char *prop_name, u32 dflt)
+bool_t __init device_tree_type_matches(const void *fdt, int node,
+   const char *match)
+{
+const void *prop;
+
+prop = fdt_getprop(fdt, node, "device_type", NULL);
+if ( prop == NULL )
+return 0;
+
+return strcmp(prop, match) == 0 ? 1 : 0;
+}
+
+u32 __init device_tree_get_u32(const void *fdt, int node,
+   const char *prop_name, u32 dflt)
 {
 const struct fdt_property *prop;
 
diff --git a/xen/arch/arm/numa/Makefile b/xen/arch/arm/numa/Makefile
new file mode 100644
index 000..3af3aff
--- /dev/null
+++ b/xen/arch/arm/numa/Makefile
@@ -0,0 +1,2 @@
+obj-y += dt_numa.o
+obj-y += numa.o
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
new file mode 100644
index 000..66c6efb
--- /dev/null
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -0,0 +1,78 @@
+/*
+ * OF NUMA Parsing support.
+ *
+ * Copyright (C) 2015 - 2016 Cavium Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+extern nodemask_t processor_nodes_parsed;
+
+/*
+ * Even though we connect cpus to numa domains later in SMP
+ * init, we need to know the node ids now for all cpus.
+ */
+static int __init dt_numa_process_cpu_node(const void *fdt, int node,
+   const char *name,
+   uint32_t address_cells,
+   uint32_t size_cells)
+{
+uint32_t nid;
+
+nid = device_tree_get_u32(fdt, node, "numa-node-id", MAX_NUMNODES);
+
+if ( nid >= MAX_NUMNODES )
+printk(XENLOG_WARNING "NUMA: Node id %u exceeds maximum value\n", nid);
+else
+node_set(nid, processor_nodes_parsed);
+
+return 0;
+}
+
+static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
+const char *name, int depth,
+uint32_t address_cells,
+uint32_t size_cells, void *data)
+{
+if ( device_tree_type_matches(fdt, node, "cpu") )
+return dt_numa_process_cpu_node(fdt, node, name, address_cells,
+size_cells);
+
+return 0;
+}
+
+int __init dt_numa_init(void)
+{
+int ret;
+
+ret = device_tree_for_each_node((void *)device_tree_flattened,
+dt_numa_scan_cpu_node, NULL);
+return ret;
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
new file mode 100644
index 000..c1c7c35
--- /dev/null
+++ b/xen/arch/arm/numa/numa.c
@@ -0,0 +1,50 @@
+/*
+ * ARM NUMA Implementation
+ *
+ * Copyright (C) 2016 - Cavium Inc.
+ * Vijaya Kumar K 
+ *
+ * This program 

[Xen-devel] [RFC PATCH v2 15/25] ARM: NUMA: Add CPU NUMA support

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

For each cpu, update cpu_to_node[] with node id from
the numa-node-id DT property. Also, initialize cpu_to_node[]
with node 0.

Add macros to access cpu_to_node[] information.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/numa/numa.c   | 21 +
 xen/arch/arm/smpboot.c | 25 -
 xen/include/asm-arm/numa.h | 24 
 3 files changed, 69 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index 0ee89da..eef5870 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -28,6 +28,25 @@ static uint8_t (*node_distance_fn)(nodeid_t a, nodeid_t b);
 extern nodemask_t processor_nodes_parsed;
 static bool_t dt_numa = 1;
 
+/*
+ * Setup early cpu_to_node.
+ */
+void __init init_cpu_to_node(void)
+{
+int i;
+
+for ( i = 0; i < NR_CPUS; i++ )
+numa_set_node(i, 0);
+}
+
+void __init numa_set_cpu_node(int cpu, unsigned int nid)
+{
+if ( !node_isset(nid, processor_nodes_parsed) || nid >= MAX_NUMNODES )
+nid = 0;
+
+numa_set_node(cpu, nid);
+}
+
 uint8_t __node_distance(nodeid_t a, nodeid_t b)
 {
 if ( node_distance_fn != NULL);
@@ -48,6 +67,7 @@ void numa_failed(void)
 dt_numa = 0;
 init_dt_numa_distance();
 node_distance_fn = NULL;
+init_cpu_to_node();
 }
 
 void __init numa_init(void)
@@ -55,6 +75,7 @@ void __init numa_init(void)
 int ret = 0;
 
 nodes_clear(processor_nodes_parsed);
+init_cpu_to_node();
 init_dt_numa_distance();
 if ( is_numa_off() )
 goto no_numa;
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 32e8722..bf7ddaf 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -29,6 +29,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -106,6 +107,7 @@ static void __init dt_smp_init_cpus(void)
 [0 ... NR_CPUS - 1] = MPIDR_INVALID
 };
 bool_t bootcpu_valid = 0;
+nodeid_t *cpu_to_nodemap;
 int rc;
 
 mpidr = boot_cpu_data.mpidr.bits & MPIDR_HWID_MASK;
@@ -117,11 +119,18 @@ static void __init dt_smp_init_cpus(void)
 return;
 }
 
+cpu_to_nodemap = xzalloc_array(nodeid_t, NR_CPUS);
+if ( !cpu_to_nodemap )
+{
+printk(XENLOG_WARNING "Failed to allocate memory for cpu_to_nodemap\n");
+return;
+}
+
 dt_for_each_child_node( cpus, cpu )
 {
 const __be32 *prop;
 u64 addr;
-u32 reg_len;
+u32 reg_len, nid;
 register_t hwid;
 
 if ( !dt_device_type_is_equal(cpu, "cpu") )
@@ -146,6 +155,15 @@ static void __init dt_smp_init_cpus(void)
 continue;
 }
 
+if ( !dt_property_read_u32(cpu, "numa-node-id", &nid) )
+{
+printk(XENLOG_WARNING "cpu node `%s`: numa-node-id not found\n",
+   dt_node_full_name(cpu));
+nid = 0;
+}
+
+cpu_to_nodemap[cpuidx] = nid;
+
 addr = dt_read_number(prop, dt_n_addr_cells(cpu));
 
 hwid = addr;
@@ -224,6 +242,7 @@ static void __init dt_smp_init_cpus(void)
 {
 printk(XENLOG_WARNING "DT missing boot CPU MPIDR[23:0]\n"
"Using only 1 CPU\n");
+xfree(cpu_to_nodemap);
 return;
 }
 
@@ -233,7 +252,10 @@ static void __init dt_smp_init_cpus(void)
 continue;
 cpumask_set_cpu(i, &cpu_possible_map);
 cpu_logical_map(i) = tmp_map[i];
+numa_set_cpu_node(i, cpu_to_nodemap[i]);
 }
+
+xfree(cpu_to_nodemap);
 }
 
 void __init smp_init_cpus(void)
@@ -313,6 +335,7 @@ void start_secondary(unsigned long boot_phys_offset,
  */
 smp_wmb();
 
+numa_add_cpu(cpuid);
 /* Now report this CPU is up */
 cpumask_set_cpu(cpuid, &cpu_online_map);
 
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index c390a0e..65bdd5e 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -14,12 +14,36 @@ extern uint8_t __node_distance(nodeid_t a, nodeid_t b);
 #ifdef CONFIG_NUMA
 extern void numa_init(void);
 extern int dt_numa_init(void);
+extern void numa_set_cpu_node(int cpu, unsigned int nid);
+extern void numa_add_cpu(int cpu);
+
+extern nodeid_t  cpu_to_node[NR_CPUS];
+extern cpumask_t node_to_cpumask[];
+/* Simple perfect hash to map pdx to node numbers */
+extern unsigned int memnode_shift;
+extern uint8_t *memnodemap;
+
+#define cpu_to_node(cpu) (cpu_to_node[cpu])
+#define parent_node(node)(node)
+#define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
+#define node_to_cpumask(node)(node_to_cpumask[node])
+
 #else
 static inline void numa_init(void)
 {
 return;
 }
 
+static inline void numa_set_cpu_node(int cpu, unsigned int nid)
+{
+return;
+}
+
+static inline void numa_add_cpu(int cpu)
+{
+ return;
+}
+
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
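
On the node id check in numa_set_cpu_node(): the review of v3 questions
silently falling back to node 0, and the author's reply is to return an
error code instead. A minimal sketch of that direction follows. It is
not the reworked patch; nodes_parsed, cpu_node[] and set_cpu_node() are
hypothetical stand-ins for processor_nodes_parsed, cpu_to_node[] and
numa_set_cpu_node(). The range check is done before the bitmask test so
that an out-of-range id is never used as a bit index.

```c
#include <stdint.h>

#define MAX_NODES 4   /* assumption: stand-in for MAX_NUMNODES */
#define MAX_CPUS  8   /* assumption: stand-in for NR_CPUS */

static uint32_t nodes_parsed;        /* bit n is set once node n was seen */
static uint8_t cpu_node[MAX_CPUS];

/* Returns 0 on success, -1 when the cpu or node id is not usable. */
static int set_cpu_node(unsigned int cpu, unsigned int nid)
{
    if ( cpu >= MAX_CPUS )
        return -1;

    /* Bounds check first, then test membership in the parsed set. */
    if ( nid >= MAX_NODES || !(nodes_parsed & (1u << nid)) )
        return -1;

    cpu_node[cpu] = (uint8_t)nid;

    return 0;
}
```

The caller can then decide whether an error disables NUMA or only skips
the CPU, rather than having the helper pick node 0 on its own.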
 

[Xen-devel] [RFC PATCH v2 16/25] ARM: NUMA: Add memory NUMA support

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Implement arch_sanitize_nodes_memory(), which walks all banks
in bootinfo.mem and updates nodes[] with the corresponding node id.
Call the generic numa_scan_nodes() with the RAM start and
end addresses; it computes memnode_shift and populates
memnodemap[] using the generic implementation.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/numa/numa.c   | 79 +-
 xen/common/numa.c  | 14 
 xen/include/asm-arm/numa.h | 19 +++
 xen/include/xen/numa.h |  1 +
 4 files changed, 112 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/numa/numa.c b/xen/arch/arm/numa/numa.c
index eef5870..7583a40 100644
--- a/xen/arch/arm/numa/numa.c
+++ b/xen/arch/arm/numa/numa.c
@@ -70,9 +70,74 @@ void numa_failed(void)
 init_cpu_to_node();
 }
 
+int __init arch_sanitize_nodes_memory(void)
+{
+nodemask_t mem_nodes_parsed;
+int bank, nodeid;
+struct node *nd;
+paddr_t start, size, end;
+
+nodes_clear(mem_nodes_parsed);
+for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+{
+start = bootinfo.mem.bank[bank].start;
+size = bootinfo.mem.bank[bank].size;
+end = start + size;
+
+nodeid = get_mem_nodeid(start, end);
+if ( nodeid >= NUMA_NO_NODE )
+{
+printk(XENLOG_WARNING
+   "NUMA: node for mem bank start 0x%lx - 0x%lx not found\n",
+   start, end);
+
+return 0;
+}
+
+nd = get_numa_node(nodeid);
+if ( !node_test_and_set(nodeid, mem_nodes_parsed) )
+{
+nd->start = start;
+nd->end = end;
+}
+else
+{
+if ( start < nd->start )
+nd->start = start;
+if ( nd->end < end )
+nd->end = end;
+}
+}
+
+return 1;
+}
+
+static bool_t __init numa_initmem_init(paddr_t ram_start, paddr_t ram_end)
+{
+int i;
+struct node *nd;
+/*
+ * In arch_sanitize_nodes_memory() we update nodes[] properly.
+ * Hence we reset the nodes[] before calling numa_scan_nodes().
+ */
+for ( i = 0; i < MAX_NUMNODES; i++ )
+{
+nd = get_numa_node(i);
+nd->start = 0;
+nd->end = 0;
+}
+
+if ( !numa_scan_nodes(ram_start, ram_end) )
+return 0;
+
+return 1;
+}
+
 void __init numa_init(void)
 {
-int ret = 0;
+int ret = 0, bank;
+paddr_t ram_start = ~0;
+paddr_t ram_end = 0;
 
 nodes_clear(processor_nodes_parsed);
 init_cpu_to_node();
@@ -87,6 +152,18 @@ void __init numa_init(void)
 if ( ret )
 printk(XENLOG_WARNING "DT NUMA init failed\n");
 
+for ( bank = 0 ; bank < bootinfo.mem.nr_banks; bank++ )
+{
+paddr_t bank_start = bootinfo.mem.bank[bank].start;
+paddr_t bank_end = bank_start + bootinfo.mem.bank[bank].size;
+
+ram_start = min(ram_start, bank_start);
+ram_end = max(ram_end, bank_end);
+}
+
+if ( !ret )
+ret = numa_initmem_init(ram_start, ram_end);
+
 no_numa:
 return;
 }
diff --git a/xen/common/numa.c b/xen/common/numa.c
index 1789bba..f2ac726 100644
--- a/xen/common/numa.c
+++ b/xen/common/numa.c
@@ -84,6 +84,20 @@ nodeid_t get_memblk_nodeid(int id)
 return memblk_nodeid[id];
 }
 
+int __init get_mem_nodeid(paddr_t start, paddr_t end)
+{
+unsigned int i;
+
+for ( i = 0; i < get_num_node_memblks(); i++ )
+{
+if ( start >= node_memblk_range[i].start &&
+ end <= node_memblk_range[i].end )
+return memblk_nodeid[i];
+}
+
+return -EINVAL;
+}
+
 nodeid_t *get_memblk_nodeid_map(void)
 {
 return &memblk_nodeid[0];
diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 65bdd5e..85fbbe8 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -1,6 +1,8 @@
 #ifndef __ARCH_ARM_NUMA_H
 #define __ARCH_ARM_NUMA_H
 
+#include 
+
 typedef uint8_t nodeid_t;
 
 /* Limit number of NUMA nodes supported to 4 */
@@ -28,6 +30,23 @@ extern uint8_t *memnodemap;
 #define node_to_first_cpu(node)  (__ffs(node_to_cpumask[node]))
 #define node_to_cpumask(node)(node_to_cpumask[node])
 
+static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
+{
+return memnodemap[paddr_to_pdx(addr) >> memnode_shift];
+}
+
+struct node_data {
+unsigned long node_start_pfn;
+unsigned long node_spanned_pages;
+};
+
+extern struct node_data node_data[];
+#define NODE_DATA(nid)  (&(node_data[nid]))
+
+#define node_start_pfn(nid) (NODE_DATA(nid)->node_start_pfn)
+#define node_spanned_pages(nid) (NODE_DATA(nid)->node_spanned_pages)
+#define node_end_pfn(nid)   (NODE_DATA(nid)->node_start_pfn + \
+
 #else
 static inline void numa_init(void)
 {
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index ee53526..b40a841 100644
--- a/xen/include/xen/numa.h
+++ 

[Xen-devel] [RFC PATCH v2 18/25] ARM: NUMA: Do not expose numa info to DOM0

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Delete the numa-node-id property and the distance map from the DOM0 DT
so that NUMA information is not exposed to DOM0.
This particularly helps to boot Node 1 devices
as if they were booting on Node 0.

However, this approach has a limitation in cases where memory allocation
for the devices should be node-local.

Also, do not expose the NUMA distance-map node to DOM0.

Signed-off-by: Vijaya Kumar 
---
 xen/arch/arm/domain_build.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index de59e5f..4a7e645 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -421,6 +421,10 @@ static int write_properties(struct domain *d, struct kernel_info *kinfo,
 }
 }
 
+/* Don't expose the property "numa-node-id" to the guest */
+if ( dt_property_name_is_equal(prop, "numa-node-id") )
+continue;
+
 /* Don't expose the property "xen,passthrough" to the guest */
 if ( dt_property_name_is_equal(prop, "xen,passthrough") )
 continue;
@@ -1173,6 +1177,11 @@ static int handle_node(struct domain *d, struct kernel_info *kinfo,
 DT_MATCH_TYPE("memory"),
 /* The memory mapped timer is not supported by Xen. */
 DT_MATCH_COMPATIBLE("arm,armv7-timer-mem"),
+/*
+ * NUMA info is not exposed to Dom0.
+ * So, skip distance-map information
+ */
+DT_MATCH_COMPATIBLE("numa-distance-map-v1"),
 { /* sentinel */ },
 };
 static const struct dt_device_match timer_matches[] __initconst =
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
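
For illustration, the filtering idea in the patch above can be sketched as a
standalone name check. This is a minimal sketch and not the Xen code: the
hidden_props table and the helpers skip_property() and count_exposed() are
hypothetical names, and only the two property names come from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of host DT properties that are hidden from Dom0. */
static const char *const hidden_props[] = {
    "numa-node-id",
    "xen,passthrough",
};

/* Return true when a property with this name should not be copied. */
static bool skip_property(const char *name)
{
    size_t i;

    for ( i = 0; i < sizeof(hidden_props) / sizeof(hidden_props[0]); i++ )
        if ( strcmp(name, hidden_props[i]) == 0 )
            return true;

    return false;
}

/* Count how many of the given property names would reach the Dom0 DT. */
static size_t count_exposed(const char *const *names, size_t n)
{
    size_t i, kept = 0;

    for ( i = 0; i < n; i++ )
        if ( !skip_property(names[i]) )
            kept++;

    return kept;
}
```

A table keeps the skip list in one place; the patch itself uses one explicit
dt_property_name_is_equal() test per property.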


[Xen-devel] [RFC PATCH v2 13/25] ARM: NUMA: Parse memory NUMA information

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Parse the memory nodes and fetch the numa-node-id information.
Store each memory range in node_memblk_range[]
along with its node id.

When booting in UEFI mode, UEFI passes memory information
to Dom0 using the EFI memory descriptor table and deletes the
memory nodes from the host DT. However, to fetch the memory
NUMA node id, the memory DT nodes must not be deleted by the EFI stub.
With this patch, the memory nodes are no longer deleted from the FDT.

The NUMA info of memory is extracted in process_memory_node()
instead of parsing the DT again during numa_init().
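
A minimal standalone sketch of the bookkeeping described above, assuming a
small fixed table: each range is recorded with its node id unless it overlaps
a range that is already recorded. The names blks, blk_node, find_conflict()
and add_block() are hypothetical; they only mirror the roles of
node_memblk_range[], memblk_nodeid[], conflicting_memblks() and
numa_add_memblk().

```c
#include <stdint.h>

#define NR_BLKS 8   /* hypothetical table size */

struct range {
    uint64_t start;
    uint64_t end;   /* exclusive */
};

static struct range blks[NR_BLKS];
static unsigned int blk_node[NR_BLKS];
static int nr_blks;

/* Return the index of a recorded range overlapping [start, end), or -1. */
static int find_conflict(uint64_t start, uint64_t end)
{
    int i;

    for ( i = 0; i < nr_blks; i++ )
    {
        if ( blks[i].start == blks[i].end )
            continue;   /* ignore empty ranges */
        if ( blks[i].end > start && blks[i].start < end )
            return i;
    }

    return -1;
}

/* Record [start, start + size) for node nid; -1 when full or overlapping. */
static int add_block(unsigned int nid, uint64_t start, uint64_t size)
{
    if ( nr_blks >= NR_BLKS || find_conflict(start, start + size) >= 0 )
        return -1;

    blks[nr_blks].start = start;
    blks[nr_blks].end = start + size;
    blk_node[nr_blks] = nid;
    nr_blks++;

    return 0;
}
```

Adjacent ranges do not conflict because the end address is exclusive.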

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/arm/bootfdt.c  | 24 
 xen/arch/arm/efi/efi-boot.h | 25 -
 xen/arch/arm/numa/dt_numa.c | 33 +
 xen/arch/arm/numa/numa.c|  9 +
 xen/include/asm-arm/numa.h  |  2 ++
 5 files changed, 64 insertions(+), 29 deletions(-)

diff --git a/xen/arch/arm/bootfdt.c b/xen/arch/arm/bootfdt.c
index 1f876f0..993760a 100644
--- a/xen/arch/arm/bootfdt.c
+++ b/xen/arch/arm/bootfdt.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -146,6 +147,9 @@ static void __init process_memory_node(const void *fdt, int node,
 const __be32 *cell;
 paddr_t start, size;
 u32 reg_cells = address_cells + size_cells;
+#ifdef CONFIG_NUMA
+uint32_t nid;
+#endif
 
 if ( address_cells < 1 || size_cells < 1 )
 {
@@ -154,24 +158,36 @@ static void __init process_memory_node(const void *fdt, int node,
 return;
 }
 
+#ifdef CONFIG_NUMA
+nid = device_tree_get_u32(fdt, node, "numa-node-id", NR_NODE_MEMBLKS);
+#endif
 prop = fdt_get_property(fdt, node, "reg", NULL);
 if ( !prop )
 {
 printk("fdt: node `%s': missing `reg' property\n", name);
+#ifdef CONFIG_NUMA
+   numa_failed();
+#endif
 return;
 }
 
 cell = (const __be32 *)prop->data;
 banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
 
-for ( i = 0; i < banks && bootinfo.mem.nr_banks < NR_MEM_BANKS; i++ )
+for ( i = 0; i < banks; i++ )
 {
 device_tree_get_reg(&cell, address_cells, size_cells, &start, &size);
 if ( !size )
 continue;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
-bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
-bootinfo.mem.nr_banks++;
+if ( !efi_enabled(EFI_BOOT) && bootinfo.mem.nr_banks < NR_MEM_BANKS )
+{
+bootinfo.mem.bank[bootinfo.mem.nr_banks].start = start;
+bootinfo.mem.bank[bootinfo.mem.nr_banks].size = size;
+bootinfo.mem.nr_banks++;
+}
+#ifdef CONFIG_NUMA
+dt_numa_process_memory_node(nid, start, size);
+#endif
 }
 }
 
diff --git a/xen/arch/arm/efi/efi-boot.h b/xen/arch/arm/efi/efi-boot.h
index e1e447a..07fe178 100644
--- a/xen/arch/arm/efi/efi-boot.h
+++ b/xen/arch/arm/efi/efi-boot.h
@@ -194,33 +194,8 @@ EFI_STATUS __init fdt_add_uefi_nodes(EFI_SYSTEM_TABLE *sys_table,
 int status;
 u32 fdt_val32;
 u64 fdt_val64;
-int prev;
 int num_rsv;
 
-/*
- * Delete any memory nodes present.  The EFI memory map is the only
- * memory description provided to Xen.
- */
-prev = 0;
-for (;;)
-{
-const char *type;
-int len;
-
-node = fdt_next_node(fdt, prev, NULL);
-if ( node < 0 )
-break;
-
-type = fdt_getprop(fdt, node, "device_type", &len);
-if ( type && strncmp(type, "memory", len) == 0 )
-{
-fdt_del_node(fdt, node);
-continue;
-}
-
-prev = node;
-}
-
/*
 * Delete all memory reserve map entries. When booting via UEFI,
 * kernel will use the UEFI memory map to find reserved regions.
diff --git a/xen/arch/arm/numa/dt_numa.c b/xen/arch/arm/numa/dt_numa.c
index 66c6efb..593c647 100644
--- a/xen/arch/arm/numa/dt_numa.c
+++ b/xen/arch/arm/numa/dt_numa.c
@@ -25,6 +25,7 @@
 #include 
 
 extern nodemask_t processor_nodes_parsed;
+extern nodemask_t memory_nodes_parsed;
 
 /*
  * Even though we connect cpus to numa domains later in SMP
@@ -59,6 +60,38 @@ static int __init dt_numa_scan_cpu_node(const void *fdt, int node,
 return 0;
 }
 
+void __init dt_numa_process_memory_node(uint32_t nid, paddr_t start,
+   paddr_t size)
+{
+struct node *nd;
+int i;
+
+i = conflicting_memblks(start, start + size);
+if ( i < 0 )
+{
+ if ( numa_add_memblk(nid, start, size) )
+ {
+ printk(XENLOG_WARNING "DT: NUMA: node-id %u overflow \n", nid);
+ numa_failed();
+ return;
+ }
+}
+else
+{
+ nd = get_node_memblk_range(i);
+ printk(XENLOG_ERR
+"NUMA DT: node %u (%"PRIx64"-%"PRIx64") overlaps with %d (%"PRIx64"-%"PRIx64")\n",
+nid, start, start + size, i, nd->start, nd->end);
+
+ 

[Xen-devel] [RFC PATCH v2 10/25] x86: NUMA: Move numa code and make it generic

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Move code from xen/arch/x86/numa.c to xen/common/numa.c
so that it can be used by other archs.
A few generic static functions in x86/numa.c are made
non-static in common/numa.c.

The generic contents of the header file asm-x86/numa.h
are moved to xen/numa.h.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/numa.c| 456 --
 xen/arch/x86/srat.c|   7 +
 xen/common/Makefile|   1 +
 xen/common/numa.c  | 488 +
 xen/include/asm-x86/numa.h |  15 --
 xen/include/xen/numa.h |  18 ++
 6 files changed, 514 insertions(+), 471 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 3bdab9a..33c6806 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -10,286 +10,13 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
-#include 
-
-static int numa_setup(char *s);
-custom_param("numa", numa_setup);
-
-struct node_data node_data[MAX_NUMNODES];
-
-/* Mapping from pdx to node id */
-unsigned int memnode_shift;
-static typeof(*memnodemap) _memnodemap[64];
-unsigned long memnodemapsize;
-uint8_t *memnodemap;
-
-nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
-[0 ... NR_CPUS-1] = NUMA_NO_NODE
-};
-/*
- * Keep BIOS's CPU2node information, should not be used for memory allocaion
- */
-nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
-[0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
-};
-cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-static bool numa_off = 0;
-static bool acpi_numa = 1;
-
-bool is_numa_off(void)
-{
-return numa_off;
-}
-
-bool get_acpi_numa(void)
-{
-return acpi_numa;
-}
-
-void set_acpi_numa(bool_t val)
-{
-acpi_numa = val;
-}
-
-bool srat_disabled(void)
-{
-return numa_off || acpi_numa == 0;
-}
-
-/*
- * Given a shift value, try to populate memnodemap[]
- * Returns :
- * 0 if OK
- * -EINVAL if memnodmap[] too small (of shift too small)
- * OR if node overlap or lost ram (shift too big)
- */
-static int __init populate_memnodemap(const struct node *nodes, int numnodes,
-  unsigned int shift, nodeid_t *nodeids)
-{
-unsigned long spdx, epdx;
-int i, res = -EINVAL;
-
-memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
-for ( i = 0; i < numnodes; i++ )
-{
-spdx = paddr_to_pdx(nodes[i].start);
-epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-if ( spdx >= epdx )
-continue;
-if ( (epdx >> shift) >= memnodemapsize )
-return 0;
-do {
-if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-return -EINVAL;
-
-if ( !nodeids )
-memnodemap[spdx >> shift] = i;
-else
-memnodemap[spdx >> shift] = nodeids[i];
-
-spdx += (1UL << shift);
-} while ( spdx < epdx );
-res = 0;
-}
-
-return res;
-}
-
-static int __init allocate_cachealigned_memnodemap(void)
-{
-unsigned long size = PFN_UP(memnodemapsize * sizeof(*memnodemap));
-unsigned long mfn = alloc_boot_pages(size, 1);
-
-if ( !mfn )
-{
-printk(KERN_ERR
-   "NUMA: Unable to allocate Memory to Node hash map\n");
-memnodemapsize = 0;
-return -ENOMEM;
-}
-
-memnodemap = mfn_to_virt(mfn);
-mfn <<= PAGE_SHIFT;
-size <<= PAGE_SHIFT;
-printk(KERN_DEBUG "NUMA: Allocated memnodemap from %lx - %lx\n",
-   mfn, mfn + size);
-memnodemapsize = size / sizeof(*memnodemap);
-
-return 0;
-}
-
-/*
- * The LSB of all start and end addresses in the node map is the value of the
- * maximum possible shift.
- */
-static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
-  int numnodes)
-{
-unsigned int i, nodes_used = 0;
-unsigned long spdx, epdx;
-unsigned long bitfield = 0, memtop = 0;
-
-for ( i = 0; i < numnodes; i++ )
-{
-spdx = paddr_to_pdx(nodes[i].start);
-epdx = paddr_to_pdx(nodes[i].end - 1) + 1;
-if ( spdx >= epdx )
-continue;
-bitfield |= spdx;
-nodes_used++;
-if ( epdx > memtop )
-memtop = epdx;
-}
-if ( nodes_used <= 1 )
-i = BITS_PER_LONG - 1;
-else
-i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
-memnodemapsize = (memtop >> i) + 1;
-
-return i;
-}
-
-int __init compute_memnode_shift(struct node *nodes, int numnodes,
- nodeid_t *nodeids, unsigned int *shift)
-{
-*shift = extract_lsb_from_nodes(nodes, numnodes);
-
-if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
-memnodemap = _memnodemap;
-else if ( allocate_cachealigned_memnodemap() )
-return -ENOMEM;
-
-

[Xen-devel] [RFC PATCH v2 11/25] x86: NUMA: Move common code from srat.c

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Move code from xen/arch/x86/srat.c to xen/common/numa.c
so that it can be used by other archs.
A few generic static functions in x86/srat.c are made
non-static in common/numa.c.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/srat.c| 152 ++---
 xen/common/numa.c  | 146 +++
 xen/include/asm-x86/acpi.h |   3 -
 xen/include/asm-x86/numa.h |   2 -
 xen/include/xen/numa.h |  14 +
 5 files changed, 164 insertions(+), 153 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 2cc87a3..55947bb 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -23,9 +23,8 @@
 
 static struct acpi_table_slit *__read_mostly acpi_slit;
 
-static nodemask_t __initdata memory_nodes_parsed;
-static nodemask_t __initdata processor_nodes_parsed;
-static struct node __initdata nodes[MAX_NUMNODES];
+extern nodemask_t processor_nodes_parsed;
+extern nodemask_t memory_nodes_parsed;
 
 /*
  * Keep BIOS's CPU2node information, should not be used for memory allocaion
@@ -43,49 +42,8 @@ static struct pxm2node __read_mostly pxm2node[MAX_NUMNODES] =
 
 static unsigned node_to_pxm(nodeid_t n);
 
-static int num_node_memblks;
-static struct node node_memblk_range[NR_NODE_MEMBLKS];
-static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
-static struct node *get_numa_node(int id)
-{
-   return &nodes[id];
-}
-
-static nodeid_t get_memblk_nodeid(int id)
-{
-   return memblk_nodeid[id];
-}
-
-static nodeid_t *get_memblk_nodeid_map(void)
-{
-   return &memblk_nodeid[0];
-}
-
-static struct node *get_node_memblk_range(int memblk)
-{
-   return &node_memblk_range[memblk];
-}
-
-static int get_num_node_memblks(void)
-{
-   return num_node_memblks;
-}
-
-static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
-{
-   if (nodeid >= NR_NODE_MEMBLKS)
-   return -EINVAL;
-
-   node_memblk_range[num_node_memblks].start = start;
-   node_memblk_range[num_node_memblks].end = start + size;
-   memblk_nodeid[num_node_memblks] = nodeid;
-   num_node_memblks++;
-
-   return 0;
-}
-
 static inline bool node_found(unsigned int idx, unsigned int pxm)
 {
return ((pxm2node[idx].pxm == pxm) &&
@@ -156,54 +114,7 @@ nodeid_t acpi_setup_node(unsigned int pxm)
return node;
 }
 
-int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
-{
-   int i;
-
-   for (i = 0; i < get_num_node_memblks(); i++) {
-   struct node *nd = get_node_memblk_range(i);
-
-   if (nd->start <= start && nd->end > end &&
-   get_memblk_nodeid(i) == node)
-   return 1;
-   }
-
-   return 0;
-}
-
-static int __init conflicting_memblks(paddr_t start, paddr_t end)
-{
-   int i;
-
-   for (i = 0; i < get_num_node_memblks(); i++) {
-   struct node *nd = get_node_memblk_range(i);
-   if (nd->start == nd->end)
-   continue;
-   if (nd->end > start && nd->start < end)
-   return i;
-   if (nd->end == end && nd->start == start)
-   return i;
-   }
-   return -1;
-}
-
-static void __init cutoff_node(int i, paddr_t start, paddr_t end)
-{
-   struct node *nd = get_numa_node(i);
-
-   if (nd->start < start) {
-   nd->start = start;
-   if (nd->end < nd->start)
-   nd->start = nd->end;
-   }
-   if (nd->end > end) {
-   nd->end = end;
-   if (nd->start > nd->end)
-   nd->start = nd->end;
-   }
-}
-
-static void __init numa_failed(void)
+void __init numa_failed(void)
 {
int i;
printk(KERN_ERR "SRAT: SRAT not used.\n");
@@ -419,7 +330,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
 
 /* Sanity check to catch more bad SRATs (they are amazingly common).
Make sure the PXMs cover all memory. */
-static int __init arch_sanitize_nodes_memory(void)
+int __init arch_sanitize_nodes_memory(void)
 {
int i;
 
@@ -516,61 +427,6 @@ void __init srat_parse_regions(uint64_t addr)
pfn_pdx_hole_setup(mask >> PAGE_SHIFT);
 }
 
-/* Use the information discovered above to actually set up the nodes. */
-int __init numa_scan_nodes(uint64_t start, uint64_t end)
-{
-   int i;
-   nodemask_t all_nodes_parsed;
-   struct node *memblks;
-   nodeid_t *nodeids;
-
-   /* First clean up the node list */
-   for (i = 0; i < MAX_NUMNODES; i++)
-   cutoff_node(i, start, end);
-
-   if (get_acpi_numa() == 0)
-   return -1;
-
-   if (!arch_sanitize_nodes_memory()) {
-   numa_failed();
-   return -1;
-   }
-
-   memblks = get_node_memblk_range(0);
-   

[Xen-devel] [RFC PATCH v2 08/25] x86: NUMA: Sanitize node distance

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Introduce acpi_node_distance() and call it from __node_distance().
This helps to implement an arch-specific __node_distance().
Also introduce the LOCAL_DISTANCE and REMOTE_DISTANCE macros.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/srat.c| 13 +
 xen/include/xen/numa.h |  2 ++
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 3ade36d..7cf4771 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -221,9 +221,9 @@ static int __init slit_valid(struct acpi_table_slit *slit)
for (j = 0; j < d; j++)  {
uint8_t val = slit->entry[d*i + j];
if (i == j) {
-   if (val != 10)
+   if (val != LOCAL_DISTANCE)
return 0;
-   } else if (val <= 10)
+   } else if (val <= LOCAL_DISTANCE)
return 0;
}
}
@@ -576,13 +576,13 @@ static unsigned node_to_pxm(nodeid_t n)
return 0;
 }
 
-uint8_t __node_distance(nodeid_t a, nodeid_t b)
+static uint8_t acpi_node_distance(nodeid_t a, nodeid_t b)
 {
unsigned index;
uint8_t slit_val;
 
if (!acpi_slit)
-   return a == b ? 10 : 20;
+   return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;
index = acpi_slit->locality_count * node_to_pxm(a);
slit_val = acpi_slit->entry[index + node_to_pxm(b)];
 
@@ -593,4 +593,9 @@ uint8_t __node_distance(nodeid_t a, nodeid_t b)
return slit_val;
 }
 
+uint8_t __node_distance(nodeid_t a, nodeid_t b)
+{
+   return acpi_node_distance(a, b);
+}
+
 EXPORT_SYMBOL(__node_distance);
diff --git a/xen/include/xen/numa.h b/xen/include/xen/numa.h
index 7f6d090..922fbd8 100644
--- a/xen/include/xen/numa.h
+++ b/xen/include/xen/numa.h
@@ -8,6 +8,8 @@
 #endif
 
 #define NUMA_NO_NODE 0xFF
+#define LOCAL_DISTANCE   10
+#define REMOTE_DISTANCE  20
 #define NUMA_NO_DISTANCE 0xFF
 
 #define MAX_NUMNODES(1 << NODES_SHIFT)
-- 
2.7.4
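
As a hedged illustration of how the two macros are used, the sketch below
restates the fallback and the table check from the patch in standalone form.
The helper names default_distance() and distance_table_valid() are
hypothetical, and a plain row-major byte array stands in for the ACPI SLIT.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20

/* Fallback used when no distance table is available. */
static uint8_t default_distance(unsigned int a, unsigned int b)
{
    return a == b ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/*
 * Check an n x n row-major distance table: every diagonal entry must equal
 * LOCAL_DISTANCE and every other entry must be strictly larger.
 */
static bool distance_table_valid(const uint8_t *entry, unsigned int n)
{
    unsigned int i, j;

    for ( i = 0; i < n; i++ )
    {
        for ( j = 0; j < n; j++ )
        {
            uint8_t val = entry[n * i + j];

            if ( i == j )
            {
                if ( val != LOCAL_DISTANCE )
                    return false;
            }
            else if ( val <= LOCAL_DISTANCE )
                return false;
        }
    }

    return true;
}
```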




[Xen-devel] [RFC PATCH v2 09/25] ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Right now CONFIG_NUMA is not enabled for ARM, and the
existing code in asm-arm/numa.h is for !CONFIG_NUMA.
Hence put this code under #ifndef CONFIG_NUMA.

This keeps the existing code working when CONFIG_NUMA
is not enabled.

Also define the NODES_SHIFT macro for ARM with value 2.
This limits the number of NUMA nodes supported to 4.
There is no hard restriction behind the choice of 2.

Signed-off-by: Vijaya Kumar K 
---
 xen/include/asm-arm/numa.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/xen/include/asm-arm/numa.h b/xen/include/asm-arm/numa.h
index 53f99af..924bfc0 100644
--- a/xen/include/asm-arm/numa.h
+++ b/xen/include/asm-arm/numa.h
@@ -3,6 +3,10 @@
 
 typedef uint8_t nodeid_t;
 
+/* Limit number of NUMA nodes supported to 4 */
+#define NODES_SHIFT 2
+
+#ifndef CONFIG_NUMA
 /* Fake one node for now. See also node_online_map. */
 #define cpu_to_node(cpu) 0
 #define node_to_cpumask(node)   (cpu_online_map)
@@ -16,6 +20,7 @@ static inline __attribute__((pure)) nodeid_t phys_to_nid(paddr_t addr)
 #define node_spanned_pages(nid) (total_pages)
 #define node_start_pfn(nid) (pdx_to_pfn(frametable_base_pdx))
 #define __node_distance(a, b) (20)
+#endif /* CONFIG_NUMA */
 
 static inline unsigned int arch_get_dma_bitsize(void)
 {
-- 
2.7.4
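
The relation between NODES_SHIFT and the node limit can be checked with a
small standalone sketch. MAX_NUMNODES is defined here the same way as in
xen/include/xen/numa.h; node_id_in_range() is a hypothetical helper used only
for this example.

```c
#include <stdbool.h>

#define NODES_SHIFT  2
#define MAX_NUMNODES (1 << NODES_SHIFT)

/* Return true when nid can index an array of MAX_NUMNODES entries. */
static bool node_id_in_range(unsigned int nid)
{
    return nid < MAX_NUMNODES;
}
```

With NODES_SHIFT set to 2, valid node ids are 0 to 3.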


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 07/25] x86: NUMA: Rename some generic functions

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Rename some functions in the ACPI code as follows:
 - Rename setup_node to acpi_setup_node
 - Rename bad_srat to numa_failed
 - Rename nodes_cover_memory to arch_sanitize_nodes_memory
 - Rename acpi_scan_nodes to numa_scan_nodes

Also introduce reset_pxm2node() to reset pxm2node variable.
This avoids exporting pxm2node.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/numa.c|  2 +-
 xen/arch/x86/smpboot.c |  2 +-
 xen/arch/x86/srat.c| 51 ++
 xen/arch/x86/x86_64/mm.c   |  2 +-
 xen/include/asm-x86/acpi.h |  2 +-
 xen/include/asm-x86/numa.h |  2 +-
 6 files changed, 34 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 0888d53..3bdab9a 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -298,7 +298,7 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+if ( !is_numa_off() && !numa_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
  (uint64_t)end_pfn << PAGE_SHIFT) )
 return;
 #endif
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 82559ed..203733e 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -959,7 +959,7 @@ int cpu_add(uint32_t apic_id, uint32_t acpi_id, uint32_t pxm)
 
 if ( !srat_disabled() )
 {
-nodeid_t node = setup_node(pxm);
+nodeid_t node = acpi_setup_node(pxm);
 
 if ( node == NUMA_NO_NODE )
 {
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 983e1d8..3ade36d 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -85,6 +85,14 @@ static inline bool node_found(unsigned int idx, unsigned int pxm)
(pxm2node[idx].node != NUMA_NO_NODE));
 }
 
+static void reset_pxm2node(void)
+{
+   unsigned int i;
+
+   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
+   pxm2node[i].node = NUMA_NO_NODE;
+}
+
 nodeid_t pxm_to_node(unsigned int pxm)
 {
unsigned int i;
@@ -99,7 +107,7 @@ nodeid_t pxm_to_node(unsigned int pxm)
return NUMA_NO_NODE;
 }
 
-nodeid_t setup_node(unsigned pxm)
+nodeid_t acpi_setup_node(unsigned int pxm)
 {
nodeid_t node;
unsigned int idx;
@@ -188,15 +196,14 @@ static void __init cutoff_node(int i, paddr_t start, paddr_t end)
}
 }
 
-static void __init bad_srat(void)
+static void __init numa_failed(void)
 {
int i;
printk(KERN_ERR "SRAT: SRAT not used.\n");
set_acpi_numa(0);
for (i = 0; i < MAX_LOCAL_APIC; i++)
apicid_to_node[i] = NUMA_NO_NODE;
-   for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
-   pxm2node[i].node = NUMA_NO_NODE;
+   reset_pxm2node();
mem_hotplug = 0;
 }
 
@@ -252,7 +259,7 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
if (srat_disabled())
return;
if (pa->header.length < sizeof(struct acpi_srat_x2apic_cpu_affinity)) {
-   bad_srat();
+   numa_failed();
return;
}
if (!(pa->flags & ACPI_SRAT_CPU_ENABLED))
@@ -263,9 +270,9 @@ acpi_numa_x2apic_affinity_init(const struct acpi_srat_x2apic_cpu_affinity *pa)
}
 
pxm = pa->proximity_domain;
-   node = setup_node(pxm);
+   node = acpi_setup_node(pxm);
if (node == NUMA_NO_NODE) {
-   bad_srat();
+   numa_failed();
return;
}
 
@@ -286,7 +293,7 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
if (srat_disabled())
return;
if (pa->header.length != sizeof(struct acpi_srat_cpu_affinity)) {
-   bad_srat();
+   numa_failed();
return;
}
if (!(pa->flags & ACPI_SRAT_CPU_ENABLED))
@@ -297,9 +304,9 @@ acpi_numa_processor_affinity_init(const struct acpi_srat_cpu_affinity *pa)
pxm |= pa->proximity_domain_hi[1] << 16;
pxm |= pa->proximity_domain_hi[2] << 24;
}
-   node = setup_node(pxm);
+   node = acpi_setup_node(pxm);
if (node == NUMA_NO_NODE) {
-   bad_srat();
+   numa_failed();
return;
}
apicid_to_node[pa->apic_id] = node;
@@ -322,7 +329,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
if (srat_disabled())
return;
if (ma->header.length != sizeof(struct acpi_srat_mem_affinity)) {
-   bad_srat();
+   numa_failed();
return;
}
if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
@@ -332,7 +339,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
{
dprintk(XENLOG_WARNING,
 "Too many 

[Xen-devel] [RFC PATCH v2 06/25] x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Add accessors for nodes[] and other static variables, and
use those accessors.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/srat.c | 108 +++-
 1 file changed, 82 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index ccacbcd..983e1d8 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -41,7 +41,45 @@ static struct node node_memblk_range[NR_NODE_MEMBLKS];
 static nodeid_t memblk_nodeid[NR_NODE_MEMBLKS];
 static __initdata DECLARE_BITMAP(memblk_hotplug, NR_NODE_MEMBLKS);
 
-static inline bool node_found(unsigned idx, unsigned pxm)
+static struct node *get_numa_node(int id)
+{
+   return &nodes[id];
+}
+
+static nodeid_t get_memblk_nodeid(int id)
+{
+   return memblk_nodeid[id];
+}
+
+static nodeid_t *get_memblk_nodeid_map(void)
+{
+   return &memblk_nodeid[0];
+}
+
+static struct node *get_node_memblk_range(int memblk)
+{
+   return &node_memblk_range[memblk];
+}
+
+static int get_num_node_memblks(void)
+{
+   return num_node_memblks;
+}
+
+static int __init numa_add_memblk(nodeid_t nodeid, paddr_t start, uint64_t size)
+{
+   if (nodeid >= NR_NODE_MEMBLKS)
+   return -EINVAL;
+
+   node_memblk_range[num_node_memblks].start = start;
+   node_memblk_range[num_node_memblks].end = start + size;
+   memblk_nodeid[num_node_memblks] = nodeid;
+   num_node_memblks++;
+
+   return 0;
+}
+
+static inline bool node_found(unsigned int idx, unsigned int pxm)
 {
return ((pxm2node[idx].pxm == pxm) &&
(pxm2node[idx].node != NUMA_NO_NODE));
@@ -107,11 +145,11 @@ int valid_numa_range(paddr_t start, paddr_t end, nodeid_t node)
 {
int i;
 
-   for (i = 0; i < num_node_memblks; i++) {
-   struct node *nd = &node_memblk_range[i];
+   for (i = 0; i < get_num_node_memblks(); i++) {
+   struct node *nd = get_node_memblk_range(i);
 
if (nd->start <= start && nd->end > end &&
-   memblk_nodeid[i] == node )
+   get_memblk_nodeid(i) == node)
return 1;
}
 
@@ -122,8 +160,8 @@ static int __init conflicting_memblks(paddr_t start, paddr_t end)
 {
int i;
 
-   for (i = 0; i < num_node_memblks; i++) {
-   struct node *nd = &node_memblk_range[i];
+   for (i = 0; i < get_num_node_memblks(); i++) {
+   struct node *nd = get_node_memblk_range(i);
if (nd->start == nd->end)
continue;
if (nd->end > start && nd->start < end)
@@ -136,7 +174,8 @@ static int __init conflicting_memblks(paddr_t start, paddr_t end)
 
 static void __init cutoff_node(int i, paddr_t start, paddr_t end)
 {
-   struct node *nd = &nodes[i];
+   struct node *nd = get_numa_node(i);
+
if (nd->start < start) {
nd->start = start;
if (nd->end < nd->start)
@@ -278,6 +317,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
unsigned pxm;
nodeid_t node;
int i;
+   struct node *memblk;
 
if (srat_disabled())
return;
@@ -288,7 +328,7 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
return;
 
-   if (num_node_memblks >= NR_NODE_MEMBLKS)
+   if (get_num_node_memblks() >= NR_NODE_MEMBLKS)
{
dprintk(XENLOG_WARNING,
 "Too many numa entry, try bigger NR_NODE_MEMBLKS \n");
@@ -310,27 +350,31 @@ acpi_numa_memory_affinity_init(const struct acpi_srat_mem_affinity *ma)
i = conflicting_memblks(start, end);
if (i < 0)
/* everything fine */;
-   else if (memblk_nodeid[i] == node) {
+   else if (get_memblk_nodeid(i) == node) {
bool mismatch = !(ma->flags & ACPI_SRAT_MEM_HOT_PLUGGABLE) !=
!test_bit(i, memblk_hotplug);
 
+   memblk = get_node_memblk_range(i);
+
 printk("%sSRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with itself (%"PRIx64"-%"PRIx64")\n",
   mismatch ? KERN_ERR : KERN_WARNING, pxm, start, end,
-  node_memblk_range[i].start, node_memblk_range[i].end);
+  memblk->start, memblk->end);
if (mismatch) {
bad_srat();
return;
}
} else {
+   memblk = get_node_memblk_range(i);
+
 printk(KERN_ERR
    "SRAT: PXM %u (%"PRIx64"-%"PRIx64") overlaps with PXM %u (%"PRIx64"-%"PRIx64")\n",
-  pxm, start, end, node_to_pxm(memblk_nodeid[i]),
-  node_memblk_range[i].start, node_memblk_range[i].end);
+  pxm, start, end, node_to_pxm(get_memblk_nodeid(i)),

[Xen-devel] [RFC PATCH v2 02/25] x86: NUMA: Fix datatypes and attributes

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Change u{8,32,64} to uint{8,32,64}_t and bool_t to bool.
Fix the coding style of attributes.
Also change memnode_shift to unsigned int.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/numa.c| 40 +--
 xen/arch/x86/srat.c| 52 +++---
 xen/include/asm-arm/numa.h |  2 +-
 xen/include/asm-x86/numa.h | 17 ---
 4 files changed, 56 insertions(+), 55 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 8ee2302..8ed31cb 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -24,12 +24,12 @@ custom_param("numa", numa_setup);
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
-int memnode_shift;
+unsigned int memnode_shift;
 static typeof(*memnodemap) _memnodemap[64];
 unsigned long memnodemapsize;
-u8 *memnodemap;
+uint8_t *memnodemap;
 
-nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
+nodeid_t __read_mostly cpu_to_node[NR_CPUS] = {
 [0 ... NR_CPUS-1] = NUMA_NO_NODE
 };
 /*
@@ -38,11 +38,11 @@ nodeid_t cpu_to_node[NR_CPUS] __read_mostly = {
 nodeid_t apicid_to_node[MAX_LOCAL_APIC] = {
 [0 ... MAX_LOCAL_APIC-1] = NUMA_NO_NODE
 };
-cpumask_t node_to_cpumask[MAX_NUMNODES] __read_mostly;
+cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-bool_t numa_off = 0;
+bool numa_off = 0;
 s8 acpi_numa = 0;
 
 int srat_disabled(void)
@@ -166,7 +166,7 @@ int __init compute_hash_shift(struct node *nodes, int numnodes,
 return shift;
 }
 /* initialize NODE_DATA given nodeid and start/end */
-void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
+void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
 {
 unsigned long start_pfn, end_pfn;
 
@@ -201,19 +201,19 @@ void __init numa_init_array(void)
 }
 
 #ifdef CONFIG_NUMA_EMU
-static int numa_fake __initdata = 0;
+static int __initdata numa_fake = 0;
 
 /* Numa emulation */
-static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
+static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
 {
 int i;
 struct node nodes[MAX_NUMNODES];
-u64 sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
+uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
 
 /* Kludge needed for the hash function */
 if ( hweight64(sz) > 1 )
 {
-u64 x = 1;
+uint64_t x = 1;
 while ( (x << 1) < sz )
 x <<= 1;
 if ( x < sz / 2 )
@@ -260,8 +260,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-if ( !numa_off && !acpi_scan_nodes((u64)start_pfn << PAGE_SHIFT,
- (u64)end_pfn << PAGE_SHIFT) )
+if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+ (uint64_t)end_pfn << PAGE_SHIFT) )
 return;
 #endif
 
@@ -269,8 +269,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
numa_off ? "NUMA turned off" : "No NUMA configuration found");
 
 printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
-   (u64)start_pfn << PAGE_SHIFT,
-   (u64)end_pfn << PAGE_SHIFT);
+   (uint64_t)start_pfn << PAGE_SHIFT,
+   (uint64_t)end_pfn << PAGE_SHIFT);
 /* setup dummy node covering all memory */
 memnode_shift = BITS_PER_LONG - 1;
 memnodemap = _memnodemap;
@@ -279,8 +279,8 @@ void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
 for ( i = 0; i < nr_cpu_ids; i++ )
 numa_set_node(i, 0);
 cpumask_copy(&node_to_cpumask[0], cpumask_of(0));
-setup_node_bootmem(0, (u64)start_pfn << PAGE_SHIFT,
-(u64)end_pfn << PAGE_SHIFT);
+setup_node_bootmem(0, (paddr_t)start_pfn << PAGE_SHIFT,
+(paddr_t)end_pfn << PAGE_SHIFT);
 }
 
 void numa_add_cpu(int cpu)
@@ -294,7 +294,7 @@ void numa_set_node(int cpu, nodeid_t node)
 }
 
 /* [numa=off] */
-static __init int numa_setup(char *opt)
+static int __init numa_setup(char *opt)
 {
 if ( !strncmp(opt,"off",3) )
 numa_off = 1;
@@ -339,7 +339,7 @@ void __init init_cpu_to_node(void)
 
 for ( i = 0; i < nr_cpu_ids; i++ )
 {
-u32 apicid = x86_cpu_to_apicid[i];
+uint32_t apicid = x86_cpu_to_apicid[i];
 if ( apicid == BAD_APICID )
 continue;
 node = apicid < MAX_LOCAL_APIC ? apicid_to_node[apicid] : NUMA_NO_NODE;
@@ -380,7 +380,7 @@ static void dump_numa(unsigned char key)
 const struct vnuma_info *vnuma;
 
 printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
-   (u32)(now >> 32), (u32)now);
+   (uint32_t)(now >> 32), (uint32_t)now);
 
 for_each_online_node ( i )
 {
@@ -507,7 +507,7 @@ static void dump_numa(unsigned char key)
 rcu_read_unlock(&domlist_read_lock);
 }
 
-static __init 

[Xen-devel] [RFC PATCH v2 00/25] ARM: Add Xen NUMA support

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

This RFC patch series adds NUMA support for the ARM platform.
Both DT-based and ACPI-based NUMA are supported.
Only Xen is made NUMA aware; NUMA awareness is not added
to DOM0.

As part of this series, the x86 NUMA code is reused by moving
it into common files.
New files xen/common/numa.c and xen/drivers/acpi/srat.c are
added.
For ARM-specific code, a new folder xen/arch/arm/numa is added, and new
files numa.c, dt_numa.c and acpi_numa.c are introduced under this folder.

DT NUMA: The following major changes are performed
 - Dropped numa-node-id information from the Dom0 DT,
   so that Dom0 devices allocate from node 0 for
   devmalloc requests.
 - Memory DT is not deleted by EFI. It is exposed to Xen
   to extract numa information.
 - On NUMA failure, fall back to non-NUMA booting,
   assuming all the memory and CPUs are under node 0.
 - CONFIG_NUMA is introduced.
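
The fallback item in the list above can be pictured with a small
standalone sketch. This is not the Xen code: DEMO_NR_CPUS,
demo_cpu_to_node, demo_set_node() and demo_numa_fallback() are
hypothetical names, used only to show every CPU ending up on node 0
once NUMA setup fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_CPUS 8              /* hypothetical CPU count for the sketch */

static unsigned char demo_cpu_to_node[DEMO_NR_CPUS];
static bool demo_numa_off;

/* Record which node a CPU belongs to. */
static void demo_set_node(size_t cpu, unsigned char node)
{
    assert(cpu < DEMO_NR_CPUS);
    demo_cpu_to_node[cpu] = node;
}

/* On a NUMA setup failure, turn NUMA off and place every CPU on node 0. */
static void demo_numa_fallback(void)
{
    size_t cpu;

    demo_numa_off = true;
    for ( cpu = 0; cpu < DEMO_NR_CPUS; cpu++ )
        demo_set_node(cpu, 0);
}
```

Calling demo_numa_fallback() after a failed parse leaves the system
looking like a single-node machine to any code that consults the table.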

ACPI NUMA:
 - MADT is parsed before the SRAT table to extract the
   CPU_ID to MPIDR mapping info. In Linux, the MADT table is
   opened while parsing the SRAT table to extract the MPIDR.
   Parsing the MADT first avoids opening ACPI tables recursively.
 - SRAT table is parsed for ACPI_SRAT_TYPE_GICC_AFFINITY to extract
   proximity info, with the MPIDR taken from the CPU_ID to MPIDR mapping table.
 - SRAT table is parsed for ACPI_SRAT_TYPE_MEMORY_AFFINITY to extract
   memory proximity.
 - Re-use SLIT parsing of x86 for node distance information.
 - CONFIG_ACPI_NUMA is introduced
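
The two-pass idea in the list above (MADT first, SRAT second) can be
sketched as a small lookup table. This is only an illustration under
assumed names: struct demo_cpuid_mpidr, demo_record_madt_entry() and
demo_mpidr_of() are hypothetical and do not come from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_CPUS      4            /* hypothetical table size */
#define DEMO_INVALID_MPIDR UINT64_MAX   /* hypothetical "not found" marker */

struct demo_cpuid_mpidr {
    uint32_t cpu_id;
    uint64_t mpidr;
};

static struct demo_cpuid_mpidr demo_map[DEMO_MAX_CPUS];
static size_t demo_map_entries;

/* Pass 1 (MADT): remember the MPIDR that belongs to each CPU id. */
static int demo_record_madt_entry(uint32_t cpu_id, uint64_t mpidr)
{
    assert(mpidr != DEMO_INVALID_MPIDR);

    if ( demo_map_entries >= DEMO_MAX_CPUS )
        return -1;

    demo_map[demo_map_entries].cpu_id = cpu_id;
    demo_map[demo_map_entries].mpidr = mpidr;
    demo_map_entries++;

    return 0;
}

/* Pass 2 (SRAT): look the MPIDR up instead of reopening the MADT. */
static uint64_t demo_mpidr_of(uint32_t cpu_id)
{
    size_t i;

    for ( i = 0; i < demo_map_entries; i++ )
        if ( demo_map[i].cpu_id == cpu_id )
            return demo_map[i].mpidr;

    return DEMO_INVALID_MPIDR;
}
```

Because the table is complete before the SRAT walk starts, the SRAT
handler only needs demo_mpidr_of() and never has to open a second table.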

No changes are made to the x86 implementation; the code is only sanitized
and refactored. Hence it is only compile tested on x86.

Code is shared at
https://github.com/vijaykilari/xen-numa/commits/rfc_2

v2: Major changes
  - Rebased to latest staging branch
  - Reworked on x86 NUMA code and cleanup to possible extent.
Patches 1 to 8 are created for this
  - Reworked on DT and ACPI NUMA extracting information
  - Reused DT code for memory node processing to extract NUMA info.
  - Fixed issues with DT processing
  - Added arch specific processing of SRAT
  - Reworked on MADT and SRAT processing
  - Reworked on node distance
  - All ARM changes are moved under folder arch/arm/numa.
  - NUMA ACPI common changes are kept in drivers/acpi/srat.c

Vijaya Kumar K (25):
  x86: NUMA: Clean up: Drop trailing spaces
  x86: NUMA: Fix datatypes and attributes
  x86: NUMA: Rename and sanitize some common functions
  x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake
variables
  x86: NUMA: Move generic dummy_numa_init to separate function
  x86: NUMA: Add accessors for nodes[] and node_memblk_range[] structs
  x86: NUMA: Rename some generic functions
  x86: NUMA: Sanitize node distance
  ARM: NUMA: Add existing ARM numa code under CONFIG_NUMA
  x86: NUMA: Move numa code and make it generic
  x86: NUMA: Move common code from srat.c
  ARM: NUMA: Parse CPU NUMA information
  ARM: NUMA: Parse memory NUMA information
  ARM: NUMA: Parse NUMA distance information
  ARM: NUMA: Add CPU NUMA support
  ARM: NUMA: Add memory NUMA support
  ARM: NUMA: Add fallback on NUMA failure
  ARM: NUMA: Do not expose numa info to DOM0
  ACPI: Refactor acpi SRAT and SLIT table handling code
  ARM: NUMA: Extract MPIDR from MADT table
  ACPI: Move arch specific SRAT parsing
  ARM: NUMA: Extract proximity from SRAT table
  ARM: NUMA: Initialize ACPI NUMA
  NUMA: Move CONFIG_NUMA to common Kconfig
  NUMA: Enable ACPI_NUMA config

 xen/arch/arm/Makefile   |   1 +
 xen/arch/arm/acpi/boot.c|   2 +
 xen/arch/arm/bootfdt.c  |  44 ++-
 xen/arch/arm/domain_build.c |   9 +
 xen/arch/arm/efi/efi-boot.h |  25 --
 xen/arch/arm/numa/Makefile  |   3 +
 xen/arch/arm/numa/acpi_numa.c   | 249 ++
 xen/arch/arm/numa/dt_numa.c | 244 +
 xen/arch/arm/numa/numa.c| 196 +++
 xen/arch/arm/setup.c|   4 +
 xen/arch/arm/smpboot.c  |  25 +-
 xen/arch/x86/dom0_build.c   |   1 +
 xen/arch/x86/mm.c   |   2 -
 xen/arch/x86/numa.c | 454 +
 xen/arch/x86/physdev.c  |   1 +
 xen/arch/x86/setup.c|   3 +-
 xen/arch/x86/smpboot.c  |   3 +-
 xen/arch/x86/srat.c | 412 --
 xen/arch/x86/x86_64/mm.c|   3 +-
 xen/common/Kconfig  |   4 +
 xen/common/Makefile |   1 +
 xen/common/numa.c   | 662 
 xen/drivers/acpi/Kconfig|   5 +-
 xen/drivers/acpi/Makefile   |   1 +
 xen/drivers/acpi/numa.c |  58 +---
 xen/drivers/acpi/srat.c | 299 
 xen/drivers/passthrough/vtd/iommu.c |   1 +
 xen/include/acpi/actbl1.h   |  17 +-
 xen/include/acpi/srat.h |  24 ++
 xen/include/asm-arm/numa.h  |  73 +++-
 xen/include/asm-arm/setup.h | 

[Xen-devel] [RFC PATCH v2 03/25] x86: NUMA: Rename and sanitize some common functions

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

The following changes are made:
 - Rename compute_hash_shift to compute_memnode_shift
   and return error values instead of the shift.
 - Change the prototypes of populate_memnodemap()
   and extract_lsb_from_nodes()
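
The interface change described above (an error code as the return
value, the shift handed back through a pointer) can be sketched as
follows. This is a simplified standalone illustration, not the patch
itself: struct demo_range and demo_compute_shift() are hypothetical,
and the shift here is simply the lowest set bit across all range
boundaries.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_range {
    unsigned long start, end;
};

/*
 * Store in *shift the largest shift that keeps every range boundary aligned.
 * Returns 0 on success, or -EINVAL when no boundary has any bit set
 * (in which case *shift is left untouched).
 */
static int demo_compute_shift(const struct demo_range *r, size_t n,
                              unsigned int *shift)
{
    unsigned long bits = 0;
    unsigned int s = 0;
    size_t i;

    assert(shift != NULL);

    for ( i = 0; i < n; i++ )
        bits |= r[i].start | r[i].end;

    if ( bits == 0 )
        return -EINVAL;

    /* Count trailing zero bits: that is the common alignment. */
    while ( !(bits & 1UL) )
    {
        bits >>= 1;
        s++;
    }

    *shift = s;

    return 0;
}
```

Separating the two values lets the shift be unsigned while callers
test the return value for failure, instead of overloading a negative
shift as the error indicator.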

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/numa.c| 47 +++---
 xen/arch/x86/srat.c|  7 +++
 xen/include/asm-x86/numa.h |  4 ++--
 3 files changed, 28 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 8ed31cb..964fc5a 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -53,15 +53,15 @@ int srat_disabled(void)
 /*
  * Given a shift value, try to populate memnodemap[]
  * Returns :
- * 1 if OK
- * 0 if memnodmap[] too small (of shift too small)
- * -1 if node overlap or lost ram (shift too big)
+ * 0 if OK
+ * -EINVAL if memnodmap[] too small (of shift too small)
+ * OR if node overlap or lost ram (shift too big)
  */
-static int __init populate_memnodemap(const struct node *nodes,
-  int numnodes, int shift, nodeid_t 
*nodeids)
+static int __init populate_memnodemap(const struct node *nodes, int numnodes,
+  unsigned int shift, nodeid_t *nodeids)
 {
 unsigned long spdx, epdx;
-int i, res = -1;
+int i, res = -EINVAL;
 
 memset(memnodemap, NUMA_NO_NODE, memnodemapsize * sizeof(*memnodemap));
 for ( i = 0; i < numnodes; i++ )
@@ -74,7 +74,7 @@ static int __init populate_memnodemap(const struct node 
*nodes,
 return 0;
 do {
 if ( memnodemap[spdx >> shift] != NUMA_NO_NODE )
-return -1;
+return -EINVAL;
 
 if ( !nodeids )
 memnodemap[spdx >> shift] = i;
@@ -83,7 +83,7 @@ static int __init populate_memnodemap(const struct node 
*nodes,
 
 spdx += (1UL << shift);
 } while ( spdx < epdx );
-res = 1;
+res = 0;
 }
 
 return res;
@@ -99,7 +99,7 @@ static int __init allocate_cachealigned_memnodemap(void)
 printk(KERN_ERR
"NUMA: Unable to allocate Memory to Node hash map\n");
 memnodemapsize = 0;
-return -1;
+return -ENOMEM;
 }
 
 memnodemap = mfn_to_virt(mfn);
@@ -116,10 +116,10 @@ static int __init allocate_cachealigned_memnodemap(void)
  * The LSB of all start and end addresses in the node map is the value of the
  * maximum possible shift.
  */
-static int __init extract_lsb_from_nodes(const struct node *nodes,
- int numnodes)
+static unsigned int __init extract_lsb_from_nodes(const struct node *nodes,
+  int numnodes)
 {
-int i, nodes_used = 0;
+unsigned int i, nodes_used = 0;
 unsigned long spdx, epdx;
 unsigned long bitfield = 0, memtop = 0;
 
@@ -143,27 +143,27 @@ static int __init extract_lsb_from_nodes(const struct 
node *nodes,
 return i;
 }
 
-int __init compute_hash_shift(struct node *nodes, int numnodes,
-  nodeid_t *nodeids)
+int __init compute_memnode_shift(struct node *nodes, int numnodes,
+ nodeid_t *nodeids, unsigned int *shift)
 {
-int shift;
+*shift = extract_lsb_from_nodes(nodes, numnodes);
 
-shift = extract_lsb_from_nodes(nodes, numnodes);
 if ( memnodemapsize <= ARRAY_SIZE(_memnodemap) )
 memnodemap = _memnodemap;
 else if ( allocate_cachealigned_memnodemap() )
-return -1;
-printk(KERN_DEBUG "NUMA: Using %d for the hash shift.\n", shift);
+return -ENOMEM;
+
+printk(KERN_DEBUG "NUMA: Using %u for the hash shift.\n", *shift);
 
-if ( populate_memnodemap(nodes, numnodes, shift, nodeids) != 1 )
+if ( populate_memnodemap(nodes, numnodes, *shift, nodeids) )
 {
 printk(KERN_INFO "Your memory is not aligned you need to "
"rebuild your hypervisor with a bigger NODEMAPSIZE "
-   "shift=%d\n", shift);
-return -1;
+   "shift=%u\n", *shift);
+return -EINVAL;
 }
 
-return shift;
+return 0;
 }
 /* initialize NODE_DATA given nodeid and start/end */
 void __init setup_node_bootmem(nodeid_t nodeid, paddr_t start, paddr_t end)
@@ -235,8 +235,7 @@ static int __init numa_emulation(uint64_t start_pfn, 
uint64_t end_pfn)
(nodes[i].end - nodes[i].start) >> 20);
 node_set_online(i);
 }
-memnode_shift = compute_hash_shift(nodes, numa_fake, NULL);
-if ( memnode_shift < 0 )
+if ( compute_memnode_shift(nodes, numa_fake, NULL, &memnode_shift) )
 {
 memnode_shift = 0;
 printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 800a7c3..2d0c047 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -470,10 +470,9 @@ int __init 

[Xen-devel] [RFC PATCH v2 04/25] x86: NUMA: Add accessors for acpi_numa, numa_off and numa_fake variables

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Add accessor functions for the acpi_numa and numa_off static
variables. The initial value of acpi_numa is set to 1 instead of 0.
Also, the return value of srat_disabled() is changed to bool.
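
The accessor pattern described above can be sketched in isolation as
below. The demo_ names are hypothetical stand-ins for the functions in
the patch; the sketch only shows file-local state being reached through
small functions, with the ACPI flag starting out enabled.

```c
#include <stdbool.h>

/* File-local state: no other file can touch these directly. */
static bool demo_numa_off = false;
static bool demo_acpi_numa = true;   /* starts enabled, mirroring the patch */

static bool demo_is_numa_off(void)
{
    return demo_numa_off;
}

static bool demo_get_acpi_numa(void)
{
    return demo_acpi_numa;
}

static void demo_set_acpi_numa(bool val)
{
    demo_acpi_numa = val;
}

/* SRAT is unusable if NUMA is off or ACPI NUMA has been disabled. */
static bool demo_srat_disabled(void)
{
    return demo_numa_off || !demo_acpi_numa;
}
```

With a bool flag the old tri-state (-1, 0, 1) collapses to two states,
which is why the initial value has to be "enabled" for the
srat_disabled() check to keep its meaning.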

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/numa.c| 43 +++
 xen/arch/x86/setup.c   |  2 +-
 xen/arch/x86/srat.c| 12 ++--
 xen/include/asm-x86/acpi.h |  1 -
 xen/include/asm-x86/numa.h |  5 +
 xen/include/xen/numa.h |  3 +++
 6 files changed, 42 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 964fc5a..6b794a7 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -42,12 +42,27 @@ cpumask_t __read_mostly node_to_cpumask[MAX_NUMNODES];
 
 nodemask_t __read_mostly node_online_map = { { [0] = 1UL } };
 
-bool numa_off = 0;
-s8 acpi_numa = 0;
+static bool numa_off = 0;
+static bool acpi_numa = 1;
 
-int srat_disabled(void)
+bool is_numa_off(void)
 {
-return numa_off || acpi_numa < 0;
+return numa_off;
+}
+
+bool get_acpi_numa(void)
+{
+return acpi_numa;
+}
+
+void set_acpi_numa(bool_t val)
+{
+acpi_numa = val;
+}
+
+bool srat_disabled(void)
+{
+return numa_off || acpi_numa == 0;
 }
 
 /*
@@ -202,13 +217,17 @@ void __init numa_init_array(void)
 
 #ifdef CONFIG_NUMA_EMU
 static int __initdata numa_fake = 0;
+static int get_numa_fake(void)
+{
+return numa_fake;
+}
 
 /* Numa emulation */
 static int __init numa_emulation(uint64_t start_pfn, uint64_t end_pfn)
 {
 int i;
 struct node nodes[MAX_NUMNODES];
-uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
+uint64_t sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / get_numa_fake();
 
 /* Kludge needed for the hash function */
 if ( hweight64(sz) > 1 )
@@ -223,10 +242,10 @@ static int __init numa_emulation(uint64_t start_pfn, 
uint64_t end_pfn)
 }
 
 memset(&nodes,0,sizeof(nodes));
-for ( i = 0; i < numa_fake; i++ )
+for ( i = 0; i < get_numa_fake(); i++ )
 {
 nodes[i].start = (start_pfn << PAGE_SHIFT) + i * sz;
-if ( i == numa_fake - 1 )
+if ( i == get_numa_fake() - 1 )
 sz = (end_pfn << PAGE_SHIFT) - nodes[i].start;
 nodes[i].end = nodes[i].start + sz;
 printk(KERN_INFO
@@ -235,7 +254,7 @@ static int __init numa_emulation(uint64_t start_pfn, 
uint64_t end_pfn)
(nodes[i].end - nodes[i].start) >> 20);
 node_set_online(i);
 }
-if ( compute_memnode_shift(nodes, numa_fake, NULL, &memnode_shift) )
+if ( compute_memnode_shift(nodes, get_numa_fake(), NULL, &memnode_shift) )
 {
 memnode_shift = 0;
 printk(KERN_ERR "No NUMA hash function found. Emulation disabled.\n");
@@ -254,18 +273,18 @@ void __init numa_initmem_init(unsigned long start_pfn, 
unsigned long end_pfn)
 int i;
 
 #ifdef CONFIG_NUMA_EMU
-if ( numa_fake && !numa_emulation(start_pfn, end_pfn) )
+if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
 return;
 #endif
 
 #ifdef CONFIG_ACPI_NUMA
-if ( !numa_off && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
  (uint64_t)end_pfn << PAGE_SHIFT) )
 return;
 #endif
 
 printk(KERN_INFO "%s\n",
-   numa_off ? "NUMA turned off" : "No NUMA configuration found");
+   is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
 
 printk(KERN_INFO "Faking a node at %016"PRIx64"-%016"PRIx64"\n",
(uint64_t)start_pfn << PAGE_SHIFT,
@@ -312,7 +331,7 @@ static int __init numa_setup(char *opt)
 if ( !strncmp(opt,"noacpi",6) )
 {
 numa_off = 0;
-acpi_numa = -1;
+acpi_numa = 0;
 }
 #endif
 
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index 1cd290e..4410e53 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -241,7 +241,7 @@ void srat_detect_node(int cpu)
 node_set_online(node);
 numa_set_node(cpu, node);
 
-if ( opt_cpu_info && acpi_numa > 0 )
+if ( opt_cpu_info && get_acpi_numa() )
 printk("CPU %d APIC %d -> Node %d\n", cpu, apicid, node);
 }
 
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index 2d0c047..ccacbcd 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -153,7 +153,7 @@ static void __init bad_srat(void)
 {
int i;
printk(KERN_ERR "SRAT: SRAT not used.\n");
-   acpi_numa = -1;
+   set_acpi_numa(0);
for (i = 0; i < MAX_LOCAL_APIC; i++)
apicid_to_node[i] = NUMA_NO_NODE;
for (i = 0; i < ARRAY_SIZE(pxm2node); i++)
@@ -232,7 +232,7 @@ acpi_numa_x2apic_affinity_init(const struct 
acpi_srat_x2apic_cpu_affinity *pa)
 
apicid_to_node[pa->apic_id] = node;
node_set(node, processor_nodes_parsed);
-   acpi_numa = 1;
+   set_acpi_numa(1);
printk(KERN_INFO "SRAT: PXM %u -> APIC %08x -> Node %u\n",
 

[Xen-devel] [RFC PATCH v2 05/25] x86: NUMA: Move generic dummy_numa_init to separate function

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Split numa_initmem_init() so that the NUMA fallback code is moved
into a separate, generic function.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/numa.c | 29 +
 1 file changed, 17 insertions(+), 12 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 6b794a7..0888d53 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -268,21 +268,10 @@ static int __init numa_emulation(uint64_t start_pfn, 
uint64_t end_pfn)
 }
 #endif
 
-void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
+static void __init numa_dummy_init(unsigned long start_pfn, unsigned long 
end_pfn)
 {
 int i;
 
-#ifdef CONFIG_NUMA_EMU
-if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
-return;
-#endif
-
-#ifdef CONFIG_ACPI_NUMA
-if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
- (uint64_t)end_pfn << PAGE_SHIFT) )
-return;
-#endif
-
 printk(KERN_INFO "%s\n",
is_numa_off() ? "NUMA turned off" : "No NUMA configuration found");
 
@@ -301,6 +290,22 @@ void __init numa_initmem_init(unsigned long start_pfn, 
unsigned long end_pfn)
 (paddr_t)end_pfn << PAGE_SHIFT);
 }
 
+void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
+{
+#ifdef CONFIG_NUMA_EMU
+if ( get_numa_fake() && !numa_emulation(start_pfn, end_pfn) )
+return;
+#endif
+
+#ifdef CONFIG_ACPI_NUMA
+if ( !is_numa_off() && !acpi_scan_nodes((uint64_t)start_pfn << PAGE_SHIFT,
+ (uint64_t)end_pfn << PAGE_SHIFT) )
+return;
+#endif
+
+numa_dummy_init(start_pfn, end_pfn);
+}
+
 void numa_add_cpu(int cpu)
 {
 cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-- 
2.7.4


___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


[Xen-devel] [RFC PATCH v2 01/25] x86: NUMA: Clean up: Drop trailing spaces

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

Fix coding style, trailing spaces, tabs in NUMA code.
Also drop unused macros and functions.

Signed-off-by: Vijaya Kumar K 
---
 xen/arch/x86/numa.c| 47 +-
 xen/arch/x86/srat.c|  2 +-
 xen/include/asm-x86/numa.h | 43 --
 3 files changed, 38 insertions(+), 54 deletions(-)

diff --git a/xen/arch/x86/numa.c b/xen/arch/x86/numa.c
index 6f4d438..8ee2302 100644
--- a/xen/arch/x86/numa.c
+++ b/xen/arch/x86/numa.c
@@ -1,8 +1,8 @@
-/* 
+/*
  * Generic VM initialization for x86-64 NUMA setups.
  * Copyright 2002,2003 Andi Kleen, SuSE Labs.
  * Adapted for Xen: Ryan Harper 
- */ 
+ */
 
 #include 
 #include 
@@ -21,13 +21,6 @@
 static int numa_setup(char *s);
 custom_param("numa", numa_setup);
 
-#ifndef Dprintk
-#define Dprintk(x...)
-#endif
-
-/* from proto.h */
-#define round_up(x,y) ((((x)+(y))-1) & (~((y)-1)))
-
 struct node_data node_data[MAX_NUMNODES];
 
 /* Mapping from pdx to node id */
@@ -144,8 +137,9 @@ static int __init extract_lsb_from_nodes(const struct node 
*nodes,
 if ( nodes_used <= 1 )
 i = BITS_PER_LONG - 1;
 else
-i = find_first_bit(&bitfield, sizeof(unsigned long)*8);
+i = find_first_bit(&bitfield, sizeof(unsigned long) * 8);
 memnodemapsize = (memtop >> i) + 1;
+
 return i;
 }
 
@@ -173,7 +167,7 @@ int __init compute_hash_shift(struct node *nodes, int 
numnodes,
 }
 /* initialize NODE_DATA given nodeid and start/end */
 void __init setup_node_bootmem(nodeid_t nodeid, u64 start, u64 end)
-{ 
+{
 unsigned long start_pfn, end_pfn;
 
 start_pfn = start >> PAGE_SHIFT;
@@ -183,7 +177,7 @@ void __init setup_node_bootmem(nodeid_t nodeid, u64 start, 
u64 end)
 NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
 
 node_set_online(nodeid);
-} 
+}
 
 void __init numa_init_array(void)
 {
@@ -214,7 +208,7 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 {
 int i;
 struct node nodes[MAX_NUMNODES];
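-u64 sz = ((end_pfn - start_pfn)<<PAGE_SHIFT) / numa_fake;
+u64 sz = ((end_pfn - start_pfn) << PAGE_SHIFT) / numa_fake;
 
 /* Kludge needed for the hash function */
 if ( hweight64(sz) > 1 )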
@@ -222,21 +216,22 @@ static int __init numa_emulation(u64 start_pfn, u64 
end_pfn)
 u64 x = 1;
 while ( (x << 1) < sz )
 x <<= 1;
-if ( x < sz/2 )
-printk(KERN_ERR "Numa emulation unbalanced. Complain to 
maintainer\n");
+if ( x < sz / 2 )
+printk(KERN_ERR
+   "Numa emulation unbalanced. Complain to maintainer\n");
 sz = x;
 }
 
 memset(&nodes,0,sizeof(nodes));
 for ( i = 0; i < numa_fake; i++ )
 {
-nodes[i].start = (start_pfn<> 20);
 node_set_online(i);
 }
@@ -256,7 +251,7 @@ static int __init numa_emulation(u64 start_pfn, u64 end_pfn)
 #endif
 
 void __init numa_initmem_init(unsigned long start_pfn, unsigned long end_pfn)
-{ 
+{
 int i;
 
 #ifdef CONFIG_NUMA_EMU
@@ -291,7 +286,7 @@ void __init numa_initmem_init(unsigned long start_pfn, 
unsigned long end_pfn)
 void numa_add_cpu(int cpu)
 {
 cpumask_set_cpu(cpu, &node_to_cpumask[cpu_to_node(cpu)]);
-} 
+}
 
 void numa_set_node(int cpu, nodeid_t node)
 {
@@ -299,8 +294,8 @@ void numa_set_node(int cpu, nodeid_t node)
 }
 
 /* [numa=off] */
-static __init int numa_setup(char *opt) 
-{ 
+static __init int numa_setup(char *opt)
+{
 if ( !strncmp(opt,"off",3) )
 numa_off = 1;
 if ( !strncmp(opt,"on",2) )
@@ -323,7 +318,7 @@ static __init int numa_setup(char *opt)
 #endif
 
 return 1;
-} 
+}
 
 /*
  * Setup early cpu_to_node.
@@ -385,7 +380,7 @@ static void dump_numa(unsigned char key)
 const struct vnuma_info *vnuma;
 
 printk("'%c' pressed -> dumping numa info (now-0x%X:%08X)\n", key,
-   (u32)(now>>32), (u32)now);
+   (u32)(now >> 32), (u32)now);
 
 for_each_online_node ( i )
 {
diff --git a/xen/arch/x86/srat.c b/xen/arch/x86/srat.c
index d86783e..d270b75 100644
--- a/xen/arch/x86/srat.c
+++ b/xen/arch/x86/srat.c
@@ -7,7 +7,7 @@
  * Called from acpi_numa_init while reading the SRAT and SLIT tables.
  * Assumes all memory regions belonging to a single proximity domain
  * are in one chunk. Holes between them will be 

[Xen-devel] [PATCH v4] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-28 Thread vijay . kilari
From: Vijaya Kumar K 

On ARM64, virt_to_mfn() uses the hardware for address
translation, so a translation fault is raised if the virtual
address is not mapped. On ARM64, the DIRECTMAP_VIRT region is direct mapped.

On ARM platforms with NUMA, a panic is triggered from
init_node_heap() while initializing the second memory node, when
virt_to_mfn() is called for a DIRECTMAP_VIRT region address.
The check there ensures that the MFN is below the max MFN mapped.
The max MFN is found by calling virt_to_mfn() on DIRECTMAP_VIRT_END.
Since the DIRECTMAP_VIRT region is not mapped to any virtual address
on ARM, it fails.

This patch introduces the arch helper arch_mfn_in_directmap()
instead of calling virt_to_mfn() directly. On ARM64 the helper
returns true. On 32-bit ARM the DIRECTMAP_VIRT region is not directly
mapped, only the xenheap region is, so the helper always returns false.
On x86 the helper uses virt_to_mfn().
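
The shape of the change can be sketched in one file as below. This is
an illustration under stated assumptions, not the patch: the DEMO_ARCH
switch stands in for the real per-architecture headers,
DEMO_DIRECTMAP_MAX_MFN is a made-up limit for the x86-like case, and
demo_can_use_node_memory() is a hypothetical, much reduced version of
the check in init_node_heap().

```c
#include <stdbool.h>

/* Hypothetical build-time switch standing in for per-arch headers. */
#define DEMO_ARCH_ARM32 1
#define DEMO_ARCH_ARM64 2
#define DEMO_ARCH_X86   3

#ifndef DEMO_ARCH
#define DEMO_ARCH DEMO_ARCH_X86
#endif

/* Made-up directmap limit, used only by the x86-like variant. */
#define DEMO_DIRECTMAP_MAX_MFN 0x100000UL

#if DEMO_ARCH == DEMO_ARCH_ARM32
/* Only the xenheap is directly mapped: never claim directmap coverage. */
static inline bool demo_mfn_in_directmap(unsigned long mfn)
{
    (void)mfn;
    return false;
}
#elif DEMO_ARCH == DEMO_ARCH_ARM64
/* The whole region is direct mapped: always covered. */
static inline bool demo_mfn_in_directmap(unsigned long mfn)
{
    (void)mfn;
    return true;
}
#else
/* x86-like: compare against the top of the directmap. */
static inline bool demo_mfn_in_directmap(unsigned long mfn)
{
    return mfn <= DEMO_DIRECTMAP_MAX_MFN;
}
#endif

/* Common code: may the node's own memory hold its bookkeeping data? */
static bool demo_can_use_node_memory(unsigned long mfn, unsigned long nr,
                                     unsigned long needed)
{
    return nr >= needed && demo_mfn_in_directmap(mfn + needed);
}
```

The common function never performs an address translation itself, so
an architecture where that translation could fault simply supplies a
helper that does not translate.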

Signed-off-by: Vijaya Kumar K 
---
 xen/common/page_alloc.c|  7 ++-
 xen/include/asm-arm/arm32/mm.h | 20 
 xen/include/asm-arm/arm64/mm.h | 20 
 xen/include/asm-arm/mm.h   |  8 
 xen/include/asm-x86/mm.h   | 11 +++
 5 files changed, 61 insertions(+), 5 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 42c20cb..c4ffb31 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -520,9 +520,6 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
 unsigned long needed = (sizeof(**_heap) +
 sizeof(**avail) * NR_ZONES +
 PAGE_SIZE - 1) >> PAGE_SHIFT;
-#ifdef DIRECTMAP_VIRT_END
-unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
-#endif
 int i, j;
 
 if ( !first_node_initialised )
@@ -534,7 +531,7 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
 }
 #ifdef DIRECTMAP_VIRT_END
 else if ( *use_tail && nr >= needed &&
-  (mfn + nr) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_in_directmap(mfn + nr) &&
   (!xenheap_bits ||
!((mfn + nr - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
@@ -543,7 +540,7 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
   PAGE_SIZE - sizeof(**avail) * NR_ZONES;
 }
 else if ( nr >= needed &&
-  (mfn + needed) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_in_directmap(mfn + needed) &&
   (!xenheap_bits ||
!((mfn + needed - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
diff --git a/xen/include/asm-arm/arm32/mm.h b/xen/include/asm-arm/arm32/mm.h
new file mode 100644
index 000..e93d9df
--- /dev/null
+++ b/xen/include/asm-arm/arm32/mm.h
@@ -0,0 +1,20 @@
+#ifndef __ARM_ARM32_MM_H__
+#define __ARM_ARM32_MM_H__
+
+/* On ARM only xenheap memory is directly mapped. Hence return false. */
+static inline bool arch_mfn_in_directmap(unsigned long mfn)
+{
+return false;
+}
+
+#endif /* __ARM_ARM32_MM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/arm64/mm.h b/xen/include/asm-arm/arm64/mm.h
new file mode 100644
index 000..36ee9c8
--- /dev/null
+++ b/xen/include/asm-arm/arm64/mm.h
@@ -0,0 +1,20 @@
+#ifndef __ARM_ARM64_MM_H__
+#define __ARM_ARM64_MM_H__
+
+/* On ARM64 DIRECTMAP_VIRT region is directly mapped. Hence return true */
+static inline bool arch_mfn_in_directmap(unsigned long mfn)
+{
+return true;
+}
+
+#endif /* __ARM_ARM64_MM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 4892155..0fef612 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -6,6 +6,14 @@
 #include 
 #include 
 
+#if defined(CONFIG_ARM_32)
+# include <asm/arm32/mm.h>
+#elif defined(CONFIG_ARM_64)
+# include <asm/arm64/mm.h>
+#else
+# error "unknown ARM variant"
+#endif
+
 /* Align Xen to a 2 MiB boundary. */
 #define XEN_PADDR_ALIGN (1 << 21)
 
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index e22603c..efae611 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -602,4 +602,15 @@ extern const char zero_page[];
 /* Build a 32bit PSE page table using 4MB pages. */
 void write_32bit_pse_identmap(uint32_t *l2);
 
+/*
+ * x86 maps DIRECTMAP_VIRT to physical memory. Get the mfn for directmap
+ * memory region.
+ */
+static inline bool arch_mfn_in_directmap(unsigned long mfn)
+{
+unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
+
+return (mfn <= (virt_to_mfn(eva - 1) + 1));
+}
+
 #endif /* __ASM_X86_MM_H__ */
-- 
2.7.4



Re: [Xen-devel] ARM: SMMUv3 support

2017-03-27 Thread Vijay Kilari
On Mon, Mar 27, 2017 at 10:00 PM, Goel, Sameer <sg...@codeaurora.org> wrote:
> Hi,
>  I am working on adding this support. The work is in initial stages and will 
> target ACPI systems to start with. Do you have a specific requirement? Or 
> even better: want to help with DT testing ? :)

Thanks Sameer. I don't have any specific requirement. I am also
looking at ACPI support.
Please share your RFC patches so that I can test them on our platform.

> Thanks,
> Sameer
>
> On 3/20/2017 11:58 PM, Vijay Kilari wrote:
>> Hi,
>>
>>  Is there any effort put by anyone to get SMMUv3 support in Xen for 
>> ARM64?.
>> Would be glad to know.
>>
>> Regards
>> Vijay
>>
>> ___
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> https://lists.xen.org/xen-devel
>>
>
> --
> Qualcomm Innovation Center, Inc.
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project.



Re: [Xen-devel] [PATCH v3] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-27 Thread Vijay Kilari
Hi Jan,

  Thanks for the review.

On Mon, Mar 27, 2017 at 12:50 PM, Jan Beulich  wrote:
 On 27.03.17 at 09:10,  wrote:
>> @@ -254,7 +262,6 @@ static inline int gvirt_to_maddr(vaddr_t va, paddr_t 
>> *pa, unsigned int flags)
>>  #define virt_to_mfn(va)   (virt_to_maddr(va) >> PAGE_SHIFT)
>>  #define mfn_to_virt(mfn)  (maddr_to_virt((paddr_t)(mfn) << PAGE_SHIFT))
>>
>> -
>>  /* Convert between Xen-heap virtual addresses and page-info structures. */
>
> If I was an ARM maintainer, I'd object to such a stray change (even
> if generally it looks good to me to remove double blank lines).

Hmm. That crept in from a previous commit change. I will take care of it.

>
>> @@ -374,6 +375,17 @@ perms_strictly_increased(uint32_t old_flags, uint32_t 
>> new_flags)
>>  return ((of | (of ^ nf)) == nf);
>>  }
>>
>> +/*
>> + * x86 maps DIRECTMAP_VIRT to physical memory. Get the mfn for directmap
>> + * memory region.
>> + */
>> +static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
>
> I'm pretty convinced it has been pointed out to you that we use
> plain bool nowadays. Also the function name looks overly long to
> me. How about arch_mfn_in_directmap()?

This name is fine with me.

>
>> +{
>> +unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
>> +
>> +return (mfn <= (virt_to_mfn(eva - 1) + 1)) ? true : false;
>
> There's absolutely no need for conditional expressions like this. The
> result of the comparison is fine as is for a function with a boolean
> result (and that was already the case back when we were still using
> bool_t).
OK
>
> Jan
>
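
The review point above about conditional expressions can be shown with
two equivalent functions. Both names are hypothetical; the sketch only
illustrates that a comparison already yields a value suitable for a
bool return.

```c
#include <stdbool.h>

/* Redundant form: the ternary adds nothing to the comparison result. */
static bool demo_below_limit_verbose(unsigned long mfn, unsigned long limit)
{
    return (mfn <= limit) ? true : false;
}

/* Preferred form: return the comparison result directly. */
static bool demo_below_limit(unsigned long mfn, unsigned long limit)
{
    return mfn <= limit;
}
```

Both functions return the same value for every input, so the shorter
form loses nothing.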



[Xen-devel] [PATCH v3] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-27 Thread vijay . kilari
From: Vijaya Kumar K 

On ARM64, virt_to_mfn() uses the hardware for address
translation, so a translation fault is raised if the virtual
address is not mapped. On ARM64, the DIRECTMAP_VIRT region is direct mapped.

On ARM platforms with NUMA, a panic is triggered from
init_node_heap() while initializing the second memory node, when
virt_to_mfn() is called for a DIRECTMAP_VIRT region address.
The check there ensures that the MFN is below the max MFN mapped.
The max MFN is found by calling virt_to_mfn() on DIRECTMAP_VIRT_END.
Since the DIRECTMAP_VIRT region is not mapped to any virtual address
on ARM, it fails.

This patch introduces the arch helper arch_mfn_below_directmap_max_mfn()
instead of calling virt_to_mfn() directly. On ARM64 the helper
returns true. On 32-bit ARM the DIRECTMAP_VIRT region is not directly
mapped, only the xenheap region is, so the helper always returns false.
On x86 the helper uses virt_to_mfn().

Signed-off-by: Vijaya Kumar K 

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 42c20cb..85322cd 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -520,9 +520,6 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
 unsigned long needed = (sizeof(**_heap) +
 sizeof(**avail) * NR_ZONES +
 PAGE_SIZE - 1) >> PAGE_SHIFT;
-#ifdef DIRECTMAP_VIRT_END
-unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
-#endif
 int i, j;
 
 if ( !first_node_initialised )
@@ -534,7 +531,7 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
 }
 #ifdef DIRECTMAP_VIRT_END
 else if ( *use_tail && nr >= needed &&
-  (mfn + nr) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_below_directmap_max_mfn(mfn + nr) &&
   (!xenheap_bits ||
!((mfn + nr - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
@@ -543,7 +540,7 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
   PAGE_SIZE - sizeof(**avail) * NR_ZONES;
 }
 else if ( nr >= needed &&
-  (mfn + needed) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_below_directmap_max_mfn(mfn + needed) &&
   (!xenheap_bits ||
!((mfn + needed - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
diff --git a/xen/include/asm-arm/arm32/mm.h b/xen/include/asm-arm/arm32/mm.h
new file mode 100644
index 000..85b2388
--- /dev/null
+++ b/xen/include/asm-arm/arm32/mm.h
@@ -0,0 +1,20 @@
+#ifndef __ARM_ARM32_MM_H__
+#define __ARM_ARM32_MM_H__
+
+/* On ARM only xenheap memory is directly mapped. Hence return false. */
+static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
+{
+return false;
+}
+
+#endif /* __ARM_ARM32_MM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/arm64/mm.h b/xen/include/asm-arm/arm64/mm.h
new file mode 100644
index 000..98c6fc7
--- /dev/null
+++ b/xen/include/asm-arm/arm64/mm.h
@@ -0,0 +1,20 @@
+#ifndef __ARM_ARM64_MM_H__
+#define __ARM_ARM64_MM_H__
+
+/* On ARM64 DIRECTMAP_VIRT region is directly mapped. Hence return true */
+static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
+{
+return true;
+}
+
+#endif /* __ARM_ARM64_MM_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 4892155..5a802cc 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -6,6 +6,14 @@
 #include 
 #include 
 
+#if defined(CONFIG_ARM_32)
+# include <asm/arm32/mm.h>
+#elif defined(CONFIG_ARM_64)
+# include <asm/arm64/mm.h>
+#else
+# error "unknown ARM variant"
+#endif
+
 /* Align Xen to a 2 MiB boundary. */
 #define XEN_PADDR_ALIGN (1 << 21)
 
@@ -254,7 +262,6 @@ static inline int gvirt_to_maddr(vaddr_t va, paddr_t *pa, 
unsigned int flags)
 #define virt_to_mfn(va)   (virt_to_maddr(va) >> PAGE_SHIFT)
 #define mfn_to_virt(mfn)  (maddr_to_virt((paddr_t)(mfn) << PAGE_SHIFT))
 
-
 /* Convert between Xen-heap virtual addresses and page-info structures. */
 static inline struct page_info *virt_to_page(const void *v)
 {
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 46faffc..e0c31b6 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -18,6 +18,7 @@
 #ifndef __ASSEMBLY__
 # include 
 # include 
+# include 
 #endif
 
 #include 
@@ -374,6 +375,17 @@ perms_strictly_increased(uint32_t old_flags, uint32_t 
new_flags)
 return ((of | (of ^ nf)) == nf);
 }
 
+/*
+ * x86 maps DIRECTMAP_VIRT to physical memory. Get the mfn for directmap
+ * memory region.
+ */
+static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
+{
+unsigned long eva = min(DIRECTMAP_VIRT_END, 

[Xen-devel] ARM: SMMUv3 support

2017-03-20 Thread Vijay Kilari
Hi,

 Is anyone working on SMMUv3 support in Xen for ARM64?
I would be glad to know.

Regards
Vijay

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


Re: [Xen-devel] [PATCH v2 14/27] ARM: vGICv3: introduce basic ITS emulation bits

2017-03-20 Thread Vijay Kilari
On Thu, Mar 16, 2017 at 9:55 PM, Shanker Donthineni
 wrote:
> Hi Andre,
>
>
> On 03/16/2017 06:20 AM, Andre Przywara wrote:
>> Create a new file to hold the emulation code for the ITS widget.
>> For now we emulate the memory mapped ITS registers and provide a stub
>> to introduce the ITS command handling framework (but without actually
>> emulating any commands at this time).
>>
>> Signed-off-by: Andre Przywara 
>> ---
>>  xen/arch/arm/Makefile |   1 +
>>  xen/arch/arm/vgic-v3-its.c| 487 
>> ++
>>  xen/arch/arm/vgic-v3.c|   9 -
>>  xen/include/asm-arm/gic_v3_defs.h |  19 ++
>>  4 files changed, 507 insertions(+), 9 deletions(-)
>>  create mode 100644 xen/arch/arm/vgic-v3-its.c
>>
>> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
>> index 02a8737..e7ce2c83 100644
>> --- a/xen/arch/arm/Makefile
>> +++ b/xen/arch/arm/Makefile
>> @@ -47,6 +47,7 @@ obj-y += traps.o
>>  obj-y += vgic.o
>>  obj-y += vgic-v2.o
>>  obj-$(CONFIG_HAS_GICV3) += vgic-v3.o
>> +obj-$(CONFIG_HAS_ITS) += vgic-v3-its.o
>>  obj-y += vm_event.o
>>  obj-y += vtimer.o
>>  obj-y += vpsci.o
>> diff --git a/xen/arch/arm/vgic-v3-its.c b/xen/arch/arm/vgic-v3-its.c
>> new file mode 100644
>> index 000..5337638
>> --- /dev/null
>> +++ b/xen/arch/arm/vgic-v3-its.c
>> @@ -0,0 +1,487 @@
>> +/*
>> + * xen/arch/arm/vgic-v3-its.c
>> + *
>> + * ARM Interrupt Translation Service (ITS) emulation
>> + *
>> + * Andre Przywara 
>> + * Copyright (c) 2016,2017 ARM Ltd.
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License as published by
>> + * the Free Software Foundation; under version 2 of the License.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; If not, see .
>> + */
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +
>> +/* Data structure to describe a virtual ITS */
>> +struct virt_its {
>> +struct domain *d;
>> +spinlock_t vcmd_lock;   /* protects the virtual command buffer */
>> +uint64_t cbaser;
>> +uint64_t *cmdbuf;
>> +int cwriter;
>> +int creadr;
>> +spinlock_t its_lock;/* protects the collection and device 
>> tables */
>> +uint64_t baser0, baser1;
>> +uint16_t *coll_table;
>> +int max_collections;
>> +uint64_t *dev_table;
>> +int max_devices;
>> +bool enabled;
>> +};
>> +
>> +/*
>> + * An Interrupt Translation Table Entry: this is indexed by a
>> + * DeviceID/EventID pair and is located in guest memory.
>> + */
>> +struct vits_itte
>> +{
>> +uint32_t vlpi;
>> +uint16_t collection;
>> +};
>> +
>> +/**
>> + * Functions that handle ITS commands *
>> + **/
>> +
>> +static uint64_t its_cmd_mask_field(uint64_t *its_cmd,
>> +   int word, int shift, int size)
>> +{
>> +return (le64_to_cpu(its_cmd[word]) >> shift) & (BIT(size) - 1);
>> +}
>> +
>> +#define its_cmd_get_command(cmd)its_cmd_mask_field(cmd, 0,  0,  8)
>> +#define its_cmd_get_deviceid(cmd)   its_cmd_mask_field(cmd, 0, 32, 32)
>> +#define its_cmd_get_size(cmd)   its_cmd_mask_field(cmd, 1,  0,  5)
>> +#define its_cmd_get_id(cmd) its_cmd_mask_field(cmd, 1,  0, 32)
>> +#define its_cmd_get_physical_id(cmd)its_cmd_mask_field(cmd, 1, 32, 32)
>> +#define its_cmd_get_collection(cmd) its_cmd_mask_field(cmd, 2,  0, 16)
>> +#define its_cmd_get_target_addr(cmd)its_cmd_mask_field(cmd, 2, 16, 32)
>> +#define its_cmd_get_validbit(cmd)   its_cmd_mask_field(cmd, 2, 63,  1)
>> +
>> +#define ITS_CMD_BUFFER_SIZE(baser)  ((((baser) & 0xff) + 1) << 12)
>> +
>> +static int vgic_its_handle_cmds(struct domain *d, struct virt_its *its,
>> +uint32_t writer)
>> +{
>> +uint64_t *cmdptr;
>> +
>> +if ( !its->cmdbuf )
>> +return -1;
>> +
>> +if ( writer >= ITS_CMD_BUFFER_SIZE(its->cbaser) )
>> +return -1;
>> +
>> +spin_lock(&its->vcmd_lock);
>> +
>> +while ( its->creadr != writer )
>> +{
>> +cmdptr = its->cmdbuf + (its->creadr / sizeof(*its->cmdbuf));
>> +switch (its_cmd_get_command(cmdptr))
>> +{
>> +case GITS_CMD_SYNC:
>> +/* We handle ITS commands synchronously, so we ignore SYNC. */
>> + break;
>> +default:
>> 
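For reference, the field extraction used by the quoted its_cmd_get_*() macros can be tried outside Xen with a small model. This is only a sketch under assumptions: the cmd_* names are made up for illustration, the command words are taken to be in host byte order already (so there is no le64_to_cpu()), and the mask is built with a plain shift instead of Xen's BIT(). The shifts and widths are copied from the macros in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* One ITS command is four 64-bit words (32 bytes). */
#define ITS_CMD_WORDS 4

/*
 * Extract 'size' bits starting at bit 'shift' from word 'word' of a command.
 * Illustrative model only: host byte order is assumed.
 */
static uint64_t cmd_field(const uint64_t *cmd, int word, int shift, int size)
{
    assert(word >= 0 && word < ITS_CMD_WORDS);
    assert(shift >= 0 && size > 0 && size < 64 && shift + size <= 64);

    return (cmd[word] >> shift) & ((UINT64_C(1) << size) - 1);
}

/* Same word/shift/size triples as the its_cmd_get_*() macros in the patch. */
#define cmd_get_command(cmd)    cmd_field(cmd, 0,  0,  8)
#define cmd_get_deviceid(cmd)   cmd_field(cmd, 0, 32, 32)
#define cmd_get_collection(cmd) cmd_field(cmd, 2,  0, 16)
#define cmd_get_validbit(cmd)   cmd_field(cmd, 2, 63,  1)

/* Queue size in bytes: bits [7:0] of CBASER hold the number of 4 KiB pages minus one. */
#define CMD_BUFFER_SIZE(baser)  ((((baser) & 0xff) + 1) << 12)
```

With a command whose first word is (0x1234 << 32) | 0x05, cmd_get_command() yields 0x05 and cmd_get_deviceid() yields 0x1234; a CBASER size field of 0 corresponds to a single 4 KiB page.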

Re: [Xen-devel] [PATCH v2] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-14 Thread Vijay Kilari
On Tue, Mar 14, 2017 at 9:02 PM, Julien Grall  wrote:
> Hello Vijay,
>
> On 13/03/17 11:43, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> On ARM, virt_to_mfn uses the hardware for address
>> translation. So if the virtual address is not mapped translation
>> fault is raised. On ARM, DIRECTMAP_VIRT region is direct mapped.
>
>
> This is not true. As I said before, all the memory will be direct mapped on
> ARM64 but not on ARM32.
>
> For ARM32, only xenheap will be direct mapped. So you may want to return
> is_xenheap_mfn(...). Or even return false in all the case. Either is fine by
> me, but it would need to be explained in the code.

Is this OK?

/*
 * On ARM64 DIRECTMAP_VIRT region is directly mapped. Hence return true;
 * On ARM only xenheap memory is directly mapped. Hence return false.
 */
static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
{
#ifdef CONFIG_ARM_64
return true;
#else
return false;
#endif
}
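
The two ARM32 options mentioned in the review (an is_xenheap_mfn()-style check, or false in all cases) could be compared with a standalone model like the one below. This is a sketch only: the model_xenheap_* bounds and the function names are hypothetical and not Xen symbols. Note also that the boot allocator passes mfn + needed, i.e. one past the last frame, so a real range check would have to decide how to treat that end bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical xenheap bounds, standing in for whatever the arch code tracks. */
static unsigned long model_xenheap_mfn_start = 0x80000UL;
static unsigned long model_xenheap_mfn_end   = 0x90000UL; /* exclusive */

/* ARM32, option 1: only frames inside the xenheap count as direct mapped. */
static inline bool arm32_below_directmap_xenheap(unsigned long mfn)
{
    assert(model_xenheap_mfn_start < model_xenheap_mfn_end);

    return mfn >= model_xenheap_mfn_start && mfn < model_xenheap_mfn_end;
}

/* ARM32, option 2: never claim a direct mapping, so the caller falls back. */
static inline bool arm32_below_directmap_never(unsigned long mfn)
{
    (void)mfn;
    return false;
}

/* ARM64: all memory is direct mapped, so the check always passes. */
static inline bool arm64_below_directmap(unsigned long mfn)
{
    (void)mfn;
    return true;
}
```

Option 2 is the simplest, at the cost of always taking the xenheap allocation path on ARM32.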


Re: [Xen-devel] [PATCH v2] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-13 Thread Vijay Kilari
On Mon, Mar 13, 2017 at 7:12 PM, Jan Beulich  wrote:
 On 13.03.17 at 12:43,  wrote:
>> --- a/xen/include/asm-arm/mm.h
>> +++ b/xen/include/asm-arm/mm.h
>> @@ -260,6 +260,13 @@ static inline int gvirt_to_maddr(vaddr_t va, paddr_t 
>> *pa, unsigned int flags)
>>  #define virt_to_mfn(va)   (virt_to_maddr(va) >> PAGE_SHIFT)
>>  #define mfn_to_virt(mfn)  (maddr_to_virt((paddr_t)(mfn) << PAGE_SHIFT))
>>
>> +/*
>> + * On ARM DIRECTMAP_VIRT region is directly mapped. Hence return true;
>> + */
>> +static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
>> +{
>> +return 1;
>> +}
>
> bool and true respectively, please (also on the x86 side).

OK
>
>> --- a/xen/include/asm-x86/page.h
>> +++ b/xen/include/asm-x86/page.h
>> @@ -18,6 +18,7 @@
>>  #ifndef __ASSEMBLY__
>>  # include 
>>  # include 
>> +# include 
>
> Why?

Compilation fails because min() is not defined.

>
>> @@ -374,6 +375,21 @@ perms_strictly_increased(uint32_t old_flags, uint32_t 
>> new_flags)
>>  return ((of | (of ^ nf)) == nf);
>>  }
>>
>> +/*
>> + * x86 maps DIRECTMAP_VIRT to physical memory. Get the mfn for directmap
>> + * memory region.
>> + */
>> +static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
>> +{
>> +#ifdef DIRECTMAP_VIRT_END
>
> The symbol is always defined on x86 afaics - no need for the #ifdef.
ok.
>
> Jan
>


[Xen-devel] [PATCH v2] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-13 Thread vijay . kilari
From: Vijaya Kumar K 

On ARM, virt_to_mfn uses the hardware for address
translation. So if the virtual address is not mapped, a translation
fault is raised. On ARM, the DIRECTMAP_VIRT region is direct mapped.

On ARM with NUMA, while initializing the second memory node,
a panic is triggered from init_node_heap() when virt_to_mfn()
is called for a DIRECTMAP_VIRT region address.
Here a check is made to ensure that the MFN is less than the max MFN mapped.
The max MFN is found by calling virt_to_mfn on the end of the DIRECTMAP_VIRT
region. Since the DIRECTMAP_VIRT region is not mapped to any virtual address
on ARM, it fails.

In this patch, instead of calling virt_to_mfn(), the arch helper
arch_mfn_below_directmap_max_mfn() is introduced. On ARM this arch helper
always returns 1; on x86 it does the virt_to_mfn comparison.

Signed-off-by: Vijaya Kumar K 
---
 xen/common/page_alloc.c|  7 ++-
 xen/include/asm-arm/mm.h   |  7 +++
 xen/include/asm-x86/page.h | 16 
 3 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 42c20cb..85322cd 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -520,9 +520,6 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
 unsigned long needed = (sizeof(**_heap) +
 sizeof(**avail) * NR_ZONES +
 PAGE_SIZE - 1) >> PAGE_SHIFT;
-#ifdef DIRECTMAP_VIRT_END
-unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
-#endif
 int i, j;
 
 if ( !first_node_initialised )
@@ -534,7 +531,7 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
 }
 #ifdef DIRECTMAP_VIRT_END
 else if ( *use_tail && nr >= needed &&
-  (mfn + nr) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_below_directmap_max_mfn(mfn + nr) &&
   (!xenheap_bits ||
!((mfn + nr - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
@@ -543,7 +540,7 @@ static unsigned long init_node_heap(int node, unsigned long 
mfn,
   PAGE_SIZE - sizeof(**avail) * NR_ZONES;
 }
 else if ( nr >= needed &&
-  (mfn + needed) <= (virt_to_mfn(eva - 1) + 1) &&
+  arch_mfn_below_directmap_max_mfn(mfn + needed) &&
   (!xenheap_bits ||
!((mfn + needed - 1) >> (xenheap_bits - PAGE_SHIFT))) )
 {
diff --git a/xen/include/asm-arm/mm.h b/xen/include/asm-arm/mm.h
index 60ccbf3..f0c90c2 100644
--- a/xen/include/asm-arm/mm.h
+++ b/xen/include/asm-arm/mm.h
@@ -260,6 +260,13 @@ static inline int gvirt_to_maddr(vaddr_t va, paddr_t *pa, 
unsigned int flags)
 #define virt_to_mfn(va)   (virt_to_maddr(va) >> PAGE_SHIFT)
 #define mfn_to_virt(mfn)  (maddr_to_virt((paddr_t)(mfn) << PAGE_SHIFT))
 
+/*
+ * On ARM DIRECTMAP_VIRT region is directly mapped. Hence return true;
+ */
+static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
+{
+return 1;
+}
 
 /* Convert between Xen-heap virtual addresses and page-info structures. */
 static inline struct page_info *virt_to_page(const void *v)
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 46faffc..3ea5310 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -18,6 +18,7 @@
 #ifndef __ASSEMBLY__
 # include 
 # include 
+# include 
 #endif
 
 #include 
@@ -374,6 +375,21 @@ perms_strictly_increased(uint32_t old_flags, uint32_t 
new_flags)
 return ((of | (of ^ nf)) == nf);
 }
 
+/*
+ * x86 maps DIRECTMAP_VIRT to physical memory. Get the mfn for directmap
+ * memory region.
+ */
+static inline bool_t arch_mfn_below_directmap_max_mfn(unsigned long mfn)
+{
+#ifdef DIRECTMAP_VIRT_END
+unsigned long eva = min(DIRECTMAP_VIRT_END, HYPERVISOR_VIRT_END);
+
+return mfn <= (virt_to_mfn(eva - 1) + 1);
+#else
+return 0;
+#endif
+}
+
 #endif /* !__ASSEMBLY__ */
 
 #define PAGE_ALIGN(x) (((x) + PAGE_SIZE - 1) & PAGE_MASK)
-- 
2.7.4
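
The decision that init_node_heap() makes after this change can be modelled in isolation as follows. This is a sketch under assumptions, not the Xen code: model_directmap_last_mfn, choose_placement() and the enum are invented for illustration, the first-node case is left out, and the x86-style helper is reduced to a comparison against a fixed bound rather than a real virt_to_mfn() call.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12

/* Hypothetical: last MFN covered by the direct map (x86 derives it via virt_to_mfn). */
static unsigned long model_directmap_last_mfn = 0xfffffUL;

/* Model of the arch helper: 'mfn' is an end bound, one past the last frame used. */
static bool model_arch_mfn_below_directmap_max_mfn(unsigned long mfn)
{
    return mfn <= model_directmap_last_mfn + 1;
}

/* Where the per-node heap metadata ends up. */
enum heap_placement { PLACE_TAIL, PLACE_HEAD, PLACE_XENHEAP };

static enum heap_placement choose_placement(unsigned long mfn, unsigned long nr,
                                            unsigned long needed, bool use_tail,
                                            unsigned int xenheap_bits)
{
    assert(needed > 0);

    /* Prefer the tail of the region if it is big enough and direct mapped. */
    if ( use_tail && nr >= needed &&
         model_arch_mfn_below_directmap_max_mfn(mfn + nr) &&
         (!xenheap_bits ||
          !((mfn + nr - 1) >> (xenheap_bits - MODEL_PAGE_SHIFT))) )
        return PLACE_TAIL;

    /* Otherwise try the head of the region under the same conditions. */
    if ( nr >= needed &&
         model_arch_mfn_below_directmap_max_mfn(mfn + needed) &&
         (!xenheap_bits ||
          !((mfn + needed - 1) >> (xenheap_bits - MODEL_PAGE_SHIFT))) )
        return PLACE_HEAD;

    /* Not reachable through the direct map: fall back to a xenheap allocation. */
    return PLACE_XENHEAP;
}
```

If the helper always returns false, every call ends in PLACE_XENHEAP; if it always returns true, only the size and xenheap_bits conditions decide.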



Re: [Xen-devel] [RFC PATCH v1 16/21] ARM: NUMA: Extract proximity from SRAT table

2017-03-10 Thread Vijay Kilari
On Fri, Mar 3, 2017 at 8:52 PM, Jan Beulich  wrote:
 On 03.03.17 at 16:16,  wrote:
>> On Fri, Mar 3, 2017 at 8:22 PM, Julien Grall  wrote:
>>> int __init acpi_numa_init(void)
>>> {
>>> if (!acpi_parse_table()) {
>>> acpi_table_parse_srat(TYPE_CPU_AFFINITY);
>>
>> This is not defined for ARM. We have to make this also arch specific.
>> So all arch specific code from xen/drivers/acpi/numa.c should be moved
>> to arch specific to xen/arch/x86/srat.c
>
> There surely is a way to specify processor affinity on ARM?

On ARM, we use the ACPI_SRAT_TYPE_GICC_AFFINITY entry type in the SRAT
to extract the CPU to proximity domain mapping.
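
As a rough illustration of that extraction, the sketch below walks a buffer of SRAT subtables and collects the enabled GICC affinity entries. The layout is my reading of the ACPI specification's GICC Affinity Structure (type 3, length 18, then 32-bit proximity domain, ACPI processor UID, flags and clock domain); the SRAT_* constants, struct cpu_pxm and collect_gicc_affinity() are made-up names, not Xen or ACPICA symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SRAT_TYPE_GICC_AFFINITY 3    /* value of ACPI_SRAT_TYPE_GICC_AFFINITY */
#define SRAT_GICC_LENGTH        18   /* 2-byte header + four 32-bit fields */
#define SRAT_GICC_ENABLED       1u   /* flags bit 0 */

struct cpu_pxm {
    uint32_t acpi_uid;  /* ACPI processor UID of the CPU */
    uint32_t pxm;       /* proximity domain it belongs to */
};

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk the subtables in [buf, buf + len) and store up to max_out enabled
 * GICC affinity entries in 'out'. Returns the number of entries stored.
 */
static size_t collect_gicc_affinity(const uint8_t *buf, size_t len,
                                    struct cpu_pxm *out, size_t max_out)
{
    size_t off = 0, n = 0;

    assert(buf != NULL && out != NULL);

    while ( off + 2 <= len )
    {
        uint8_t type = buf[off];
        uint8_t sublen = buf[off + 1];

        if ( sublen < 2 || off + sublen > len )
            break;  /* malformed subtable: stop parsing */

        if ( type == SRAT_TYPE_GICC_AFFINITY && sublen >= SRAT_GICC_LENGTH )
        {
            uint32_t pxm   = read_le32(buf + off + 2);
            uint32_t uid   = read_le32(buf + off + 6);
            uint32_t flags = read_le32(buf + off + 10);

            if ( (flags & SRAT_GICC_ENABLED) && n < max_out )
            {
                out[n].acpi_uid = uid;
                out[n].pxm = pxm;
                n++;
            }
        }

        off += sublen;
    }

    return n;
}
```

Entries of other types (for example memory affinity) are skipped by their length field, and disabled GICC entries are ignored.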


Re: [Xen-devel] [PATCH v1] boot allocator: Use arch helper for virt_to_mfn on DIRECTMAP

2017-03-10 Thread Vijay Kilari
On Fri, Mar 10, 2017 at 3:09 PM, Julien Grall  wrote:
> Hello,
>
> It is not the first time I am saying this. Please CC *all* the maintainers
> of the code you modify. Give a look at scripts/get_maintainers.pl.

I got the maintainers below when I ran the script on the complete patch:

ubuntu@ubuntu:~/xen$ ./scripts/get_maintainer.pl
outgoing/0001-boot-allocator-Use-arch-helper-for-virt_to_mfn-on-DI.patch
Stefano Stabellini 
Julien Grall 
Jan Beulich 
Andrew Cooper 
xen-devel@lists.xen.org
ubuntu@ubuntu:~/xen$

But I think you are seeing a different (fuller) maintainer list with
./scripts/get_maintainer.pl -f xen/common/page_alloc.c

>
> On 03/10/2017 07:32 AM, vijay.kil...@gmail.com wrote:
>>
>> From: Vijaya Kumar K 
>>
>> On ARM, virt_to_mfn uses the hardware for address
>> translation. So if the virtual address is not mapped translation
>> fault is raised.
>> On ARM with NUMA, While initializing second memory node,
>> panic is triggered from init_node_heap() when virt_to_mfn()
>> is called for DIRECTMAP_VIRT region address.
>>
>> The init_node_heap() makes a check on MFN passed to ensure that
>> MFN less than max MFN. For this, check is made against virt_to_mfn of
>> DIRECTMAP_VIRT_END region. Since DIRECTMAP_VIRT region is not mapped
>> to any physical memory on ARM, it fails.
>>
>> In this patch, instead of calling virt_to_mfn(), arch helper
>> arch_directmap_virt_to_mfn() is introduced. For ARM this arch helper
>> will return 0 and for x86 this helper does virt_to_mfn.
>
>
> I don't understand why you return 0 for ARM. It will prevent the code to
> optimize the case where all the node memory is in the direct mapped region.
> Instead it will allocate extra page in xenheap.
>
> On the previous discussion [1], it has been said that on ARM64 all the
> memory is currently direct mapped. So this check should *always* be true and
> not false. It was suggested to move the whole check in arch specific code.
>
> If this suggestion does not fit, please explain why. Similarly you need to
> justify why you return 0 for ARM because so far it looks a random value.

Thanks for pointing that out.
I was misled by your statement "On ARM64, all the memory is direct
mapped so far, so this check will always be false". Sorry, I missed
your later reply.

>
> Regards,
>
> [1] https://lists.xen.org/archives/html/xen-devel/2017-01/msg00575.html
>
> --
> Julien Grall

