'zero' cannot be treated as the default here, since node 0 exists.
We may need to make -1 the default here, meaning get_current_node(), i.e. the
node where the code executes.
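
For example, a minimal sketch of the params-struct approach with -1 as the
"current node" default (the constant, field and helper names below are made up
for illustration, not current ODP API):

#include <string.h>
#include <odp.h>

/* Hypothetical sketch only: the constant, field and helper are not part of
 * the current ODP API; they just make the -1 default concrete. */
#define ODP_NODE_CURRENT (-1)   /* node where the calling thread runs */

typedef struct {
        odp_pool_param_t pool;  /* existing pool parameters */
        int node;               /* NUMA node, ODP_NODE_CURRENT by default */
} my_numa_pool_param_t;

static inline void my_numa_pool_param_init(my_numa_pool_param_t *p)
{
        memset(p, 0, sizeof(*p));
        /* zeroing alone cannot be the default, since node 0 is a valid node */
        p->node = ODP_NODE_CURRENT;
}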


From: Bill Fischofer <[email protected]>
Sent: Monday, May 11, 2015 4:47 PM
To: Jacob, Jerin
Cc: Gábor Sándor Enyedi; Savolainen, Petri (Nokia - FI/Espoo); Zoltan Kiss; 
[email protected]
Subject: Re: [lng-odp] NUMA aware memory allocation?
  

The structs already exist and we've adopted the convention that zeros are 
interpreted as default values, so that way those cases are covered.


On Mon, May 11, 2015 at 4:53 AM, Jacob, Jerin  <[email protected]> 
wrote:

Either way is fine with me. The only concern I have with adding the extra info
to the appropriate odp_xxx_params_t is that non-NUMA applications (the most
likely case) would need to fill the structure with some default value all the
time.


From: Bill Fischofer <[email protected]>
Sent: Friday, May 8, 2015 11:56 PM
To: Jacob, Jerin
Cc: Gábor Sándor Enyedi; Savolainen, Petri (Nokia - FI/Espoo); Zoltan Kiss;  
[email protected]


Subject: Re: [lng-odp] NUMA aware memory allocation?
 

Good points. However, rather than having odp_..._onnode() variants, I think
encoding the extra info in an appropriate odp_xxx_params_t structure would be
more consistent with how we've been shaping the APIs. That way it doesn't
require separate API calls to handle the variants.


On Fri, May 8, 2015 at 10:11 AM, Jacob, Jerin  <[email protected]> 
wrote:

From a multi-node ODP implementation / application usage perspective, we need
to consider how to expose the HW resources in each node. Resources could be
CPUs, memory and any HW-accelerated blocks for packet processing.


In the case of the CPU resource, we could keep the current API model: APIs for
querying how many CPU resources are available in each node and starting
specific work on selected CPUs using odp_cpu_mask_t. Let the implementation
take care of pinning/exposing the number of cores for ODP on each node.

In the case of the memory resource, IMO odp_shm_reserve() can be extended to
allocate from a specific node.

In the case of HW-accelerated block resources, IMO we should add a node
parameter when creating the handles.


IMO, Gábor Sándor Enyedi's example could be visualized like this on a
multi-node ODP:


- local_pool = odp_pool_create()  // create a local pool
- odp_pktio_open(.., local_pool)  // open local node pktio and attach it to the
  local pool

- remote_pool = odp_pool_create_onnode(node, ...)  // create a remote pool, as
  packets need to go to the remote node's DDR
- odp_pktio_open_onnode(node, ..., remote_pool)  // open remote node pktio with
  the remote pool

- odp_cpu_count()
- create a cpu mask and launch work on the local node

- odp_cpu_count(node)  // get the number of workers available on the remote node
- create a cpu mask and launch work on the remote node
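
To make that concrete, a sketch of what the proposed signatures could look
like (every declaration below is hypothetical, not an existing ODP call; names
and argument order are only illustrative):

#include <stdint.h>
#include <odp.h>

/* Hypothetical NUMA-aware variants, not part of the ODP API */
int odp_node_count(void);                        /* enumerate nodes */

odp_shm_t odp_shm_reserve_onnode(int node, const char *name, uint64_t size,
                                 uint64_t align, uint32_t flags);

odp_pool_t  odp_pool_create_onnode(int node, const char *name,
                                   odp_pool_param_t *params);
odp_pktio_t odp_pktio_open_onnode(int node, const char *dev, odp_pool_t pool);

int odp_cpu_count_onnode(int node);              /* workers available on node */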


From: Bill Fischofer <[email protected]>
Sent: Friday, May 8, 2015 7:43 PM
To: Gábor Sándor Enyedi
Cc: Savolainen, Petri (Nokia - FI/Espoo); Jacob, Jerin; Zoltan Kiss;   
[email protected]


Subject: Re: [lng-odp] NUMA aware memory allocation?
 

Thanks, that's good info. So in this case is it sufficient to say that the
memory used for odp_pool_create() is the one associated with the thread that
executes the create call?  Presumably then, when a packet arrives and is
assigned to a CoS that points to that pool, events from that pool are sent to
queues that are only scheduled to the corresponding cores that have fast
access to that pool.  Right now queues have an odp_schedule_group_t but that's
still fairly rudimentary.  It sounds like we might want to point the queue at
the pool for scheduling purposes so that it would inherit the NUMA
considerations you mention.
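
A hypothetical illustration of that last idea, only to make it concrete
(odp_queue_param_t has no such field today):

#include <odp.h>

/* Hypothetical: a queue that carries a pool reference so the scheduler can
 * restrict it to cores with fast access to that pool's memory. */
typedef struct {
        /* ... existing odp_queue_param_t fields ... */
        odp_pool_t pool;  /* queue inherits this pool's NUMA placement */
} numa_queue_param_t;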


On Fri, May 8, 2015 at 9:00 AM, Gábor Sándor Enyedi  
<[email protected]> wrote:

For me and for now the use case is very simple: we have an x86 machine with
two Xeon CPUs (dual socket). Each of the CPUs has its own memory and its own
PCI Express bus, as usual. First I want to write only some test code, but
later we may want to port our high-speed OF soft switch to ODP (right now it's
on DPDK). We want to assign a correct core to each interface, and each socket
must use its own copy of the forwarding data in its own memory. In our
experience, if we accidentally assigned a bad core to an interface, we could
see as much as a ~50% performance drop, so NUMA is essential.
Based on the above, something similar to DPDK's rte_malloc() (and its
variants) plus a NUMA-aware buffer pool create would be enough for us for now.
Later we want to investigate other architectures... but I don't know those
use cases yet.
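
Roughly the kind of DPDK calls meant here, for reference (signatures as in
recent DPDK releases, from memory; check rte_malloc.h / rte_mbuf.h):

#include <rte_malloc.h>
#include <rte_mbuf.h>
#include <rte_memory.h>

static void *fwd_table;
static struct rte_mempool *pkt_pool;

/* Place the forwarding data and the packet pool on an explicit socket */
static int setup_on_socket(int socket_id)
{
        fwd_table = rte_malloc_socket("fwd_table", 1024 * 1024,
                                      RTE_CACHE_LINE_SIZE, socket_id);

        pkt_pool = rte_pktmbuf_pool_create("pkt_pool", 8192, 256, 0,
                                           RTE_MBUF_DEFAULT_BUF_SIZE,
                                           socket_id);

        return (fwd_table != NULL && pkt_pool != NULL) ? 0 : -1;
}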

Gabor






On 05/08/2015 03:35 PM, Bill Fischofer wrote:

Insofar as possible, the mechanics of NUMA should be the responsibility of the 
ODP implementation, rather than the application, since that way the application 
retains maximum portability.


However, from an ODP API perspective, I think we need to be mindful of NUMA 
considerations to give implementations the necessary "hooks" to properly 
support the NUMA aspects of their platform.  This is why ODP APIs need to be 
careful about what addressability assumptions they make.


If Gábor or Jerin can list a couple of specific relevant cases, I think that
will help focus the discussion and get us off to a good start.


On Fri, May 8, 2015 at 8:26 AM, Savolainen, Petri (Nokia - FI/Espoo) 
<[email protected]> wrote:
 Hi,

ODP is OS agnostic and thus thread management (e.g. thread creation and pinning 
to physical cores) and NUMA awareness should happen mostly outside of ODP APIs.

For example, NUMA could be visible in ODP APIs this way:
* Add odp_cpumask_xxx() calls that indicate NUMA dependency between CPUs (just 
for information)
* Add a way to identify groups of threads which frequently share resources 
(memory and handles) within the group
* Give the thread group as a hint (parameter) to various ODP calls that create 
shared resources. The implementation can use the information to allocate 
resources "near" the threads in the group. However, the user is responsible 
for grouping the threads and mapping/pinning those onto physical CPUs in a way 
that enables NUMA-aware optimizations.
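
To make that last bullet concrete, a purely hypothetical sketch (none of these
names exist in ODP; they only illustrate the hint idea):

#include <stdint.h>
#include <odp.h>

/* Hypothetical thread-group hint */
typedef int odp_thread_group_t;

/* The application groups threads it has already created and pinned itself */
odp_thread_group_t odp_thread_group_create(const int thread_ids[], int num);

/* Resource creation takes the group as a placement hint; the implementation
 * may allocate "near" the CPUs those threads are pinned to. */
odp_shm_t odp_shm_reserve_grouped(const char *name, uint64_t size,
                                  uint64_t align, uint32_t flags,
                                  odp_thread_group_t group);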


-Petri





> -----Original Message-----
> From: lng-odp [mailto:[email protected]] On Behalf Of ext
> Gábor Sándor Enyedi
> Sent: Friday, May 08, 2015 10:48 AM
> To: Jerin Jacob; Zoltan Kiss
> Cc: [email protected]
> Subject: Re: [lng-odp] NUMA aware memory allocation?
>
> Hi,
>
> Thanks. So, is the workaround for now to start the threads, and do all
> the memory reservation on the thread? And to call odp_shm_reserve()
> instead of simple malloc() calls? Can I use multiple buffer pools, one
> for each thread or interface?
> BR,
>
> Gabor
>
> P.s.: Do you know when this issue in the API will be fixed (e.g. in the next
> release or whatever)?
>
> On 05/08/2015 09:06 AM, Jerin Jacob wrote:
> > On Thu, May 07, 2015 at 05:00:54PM +0100, Zoltan Kiss wrote:
> >
> >> Hi,
> >>
> >> I'm not aware of any such interface, but others with more knowledge can
> >> comment on it. The ODP-DPDK implementation creates buffer pools on the
> >> NUMA node where the pool create function was actually called.
> > The current ODP spec is not NUMA aware. We need an API to support node
> > enumeration and an explicit node parameter to alloc/free resources from a
> > specific node, like odp_shm_reserve_onnode(node, ...), while keeping the
> > existing odp_shm_reserve() allocating on the node where the current code
> > runs.
> >
> >
> >> Regards,
> >>
> >> Zoli
> >>
> >> On 07/05/15 16:32, Gábor Sándor Enyedi wrote:
> >>> Hi!
> >>>
> >>> I just started to test ODP, trying to write my first application, but
> >>> found a problem: if I want to write NUMA aware code, how should I
> >>> allocate memory close to a given thread? I mean, I know there is
> >>> libnuma, but should I use it? I guess not, but I cannot find memory
> >>> allocation functions in ODP. Is there a function similar to
> >>> numa_alloc_onnode()?
> >>> Thanks,
> >>>
> >>> Gabor