On Mon, May 11, 2015 at 12:46:15PM +0000, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> Hi,
>
> In general, odp_xxx_param_t should be designed so that memset(&param, 0,
> sizeof(odp_xxx_param_t)) gives the default behavior. Also if param is a
> pointer, param == NULL can be defined as the default.

This is currently not the case for odp_queue_param_t. The odp_schedule_*_t
types within that structure are defined by the platform, and linux-generic
currently uses non-zero defaults. param == NULL is obviously only useful if
you want default behaviour for all of the elements in the structure.

--
Stuart.

>
> Anyway, special calls for local vs remote configuration should be avoided. I
> think that a typical ODP application (e.g. all our examples) would consist of
> a control thread, which would first set up all resources for the worker
> threads and then create/launch/pin/monitor those threads. So, workers would
> not necessarily create the resources they use. Also, the control thread
> itself may not be pinned and may run on any available core (OS kernel
> decides).
>
> Direct usage of physical IDs should be minimized in the API. When
> virtualization is added into the picture, physical node/core/port/etc IDs are
> not relevant any more. The user decides which physical nodes/cores run a VM,
> which threads are pinned to which guest OS cpu IDs, which threads share
> resources, ...
>
> ODP application or implementation cannot directly select physical resources,
> but needs some information from the user to do the "right" thing, e.g.
> - user has configured
>   - guest OS CPUs 3 and 6 to the same NUMA node 1
>   - shared memory area "shm_0" to locate on a DDR connected to node 0
>   - shared memory area "shm_1" to locate on a DDR connected to node 1
>   - "eth1" and "eth2" to be 10 GE NIC interfaces connected to node 1
> - user launches an app and passes the above information to it
> - app main thread
>   - creates two worker threads and pins those to cpu IDs 3 and 6
>   - reserves shared memory from "shm_0" for logs, etc control communication
>     (not local to workers)
>   - reserves shared memory from "shm_1" for worker's shared data (local to
>     workers)
>   - opens pktio interfaces "eth1" and "eth2" (local to workers)
>   - kicks workers to start
>
> So, some more information may need to flow from user to implementation, but
> no direct physical IDs from the application. Either we extend the named and
> preconfigured resources concept from pktio to other (physically located)
> resources, or add parameters which describe what is needed. Named resources
> are exact: "send packets out from eth0" vs "send packets out from an
> interface nearest to the thread". Similarly e.g. memory may need exact
> location/properties vs. implementation always selecting the fastest.
>
>
> -Petri
>
>
> > -----Original Message-----
> > From: ext Jacob, Jerin [mailto:[email protected]]
> > Sent: Monday, May 11, 2015 12:54 PM
> > To: Bill Fischofer
> > Cc: Gábor Sándor Enyedi; Savolainen, Petri (Nokia - FI/Espoo); Zoltan
> > Kiss; [email protected]
> > Subject: Re: [lng-odp] NUMA aware memory allocation?
> >
> >
> > Either way is fine with me. Only concern I have with adding extra info in
> > appropriate odp_xxx_params_t is that non-NUMA applications (most likely
> > case) need to fill the structure with some default value all the time.
> >
> >
> > From: Bill Fischofer <[email protected]>
> > Sent: Friday, May 8, 2015 11:56 PM
> > To: Jacob, Jerin
> > Cc: Gábor Sándor Enyedi; Savolainen, Petri (Nokia - FI/Espoo); Zoltan
> > Kiss; [email protected]
> > Subject: Re: [lng-odp] NUMA aware memory allocation?
> >
> >
> > Good points, however rather than having odp_..._onnode() variants, I think
> > encoding the extra info in an appropriate odp_xxx_params_t structure would
> > be more consistent with how we've been shaping the APIs. That way it
> > doesn't require separate API calls to handle the variants.
> >
> >
> > On Fri, May 8, 2015 at 10:11 AM, Jacob, Jerin
> > <[email protected]> wrote:
> >
> > From a multi node ODP implementation / application usage perspective,
> > we need to consider how we can expose the HW resources in each node.
> > Resources could be cpus, memory and any hw accelerated blocks for packet
> > processing.
> >
> >
> > In case of CPU resource, we could take the current API model, i.e. APIs
> > for querying how many cpu resources are available in each node and starting
> > specific work on selected cpus using odp_cpu_mask_t. Let the implementation
> > take care of pinning/exposing the number of cores for ODP on each node.
> >
> > In case of memory resource, IMO odp_shm_reserve can be extended to allocate
> > from a specific node.
> >
> > In case of hw accelerated block resources, IMO we should add a node
> > parameter while creating the handles.
> >
> >
> > IMO, Gábor Sándor Enyedi's example may be visualized like this on multi
> > node ODP
> >
> >
> > -local_pool = odp_pool_create() // create a local pool
> > -odp_pktio_open(..,local_pool) // open local node pktio and attach to local pool
> >
> > -remote_pool = odp_pool_create_onnode(node...)
> > // create a remote pool as packets need to go to remote node DDR
> > -odp_pktio_open_onnode(node,...,remote_pool) // open remote node pktio with remote pool
> >
> > -odp_cpu_count()
> > -create cpu mask and launch work on local node
> >
> > -odp_cpu_count(node) // to get number of workers available on remote node
> > -create cpu mask and launch work on remote node
> >
> >
> > From: Bill Fischofer <[email protected]>
> > Sent: Friday, May 8, 2015 7:43 PM
> > To: Gábor Sándor Enyedi
> > Cc: Savolainen, Petri (Nokia - FI/Espoo); Jacob, Jerin; Zoltan Kiss; lng-
> > [email protected]
> >
> >
> > Subject: Re: [lng-odp] NUMA aware memory allocation?
> >
> >
> > Thanks, that's good info. So in this case is it sufficient to say that the
> > memory used for odp_pool_create() is the one associated with the thread
> > that executes the create call? Presumably then when a packet arrives and
> > is assigned to a CoS that points to that pool then events from that pool
> > are sent to queues that are only scheduled to the corresponding cores that
> > have fast access to that pool. Right now queues have an
> > odp_schedule_group_t but that's still fairly rudimentary. It sounds like
> > we might want to point the queue at the pool for scheduling purposes so
> > that it would inherit the NUMA considerations you mention.
> >
> >
> > On Fri, May 8, 2015 at 9:00 AM, Gábor Sándor Enyedi
> > <[email protected]> wrote:
> >
> > For me and for now the use-case is very simple: we have an x86 with two
> > Xeon CPU-s (dual socket) in it. Each of the CPU-s has its own memory and
> > own PCIExpress bus, as usual. First, I want to make only some test code,
> > but later we may want to port our high speed OF soft switch to ODP (now,
> > it's on DPDK). We want to assign a correct core for each interface, and
> > each slot must use its own copy of forwarding data in its own memory.
> > We have the experience that if we accidentally assigned a bad core to an
> > interface, we could get even about 50% performance drop, so NUMA is
> > essential.
> > Based on the previous, for us something similar to that used in DPDK's
> > rte_malloc (and its variants) and a NUMA aware buffer pool create was
> > enough for now. Later we want to investigate other architectures... but I
> > don't know the use-cases yet.
> >
> > Gabor
> >
> >
> > On 05/08/2015 03:35 PM, Bill Fischofer wrote:
> >
> > Insofar as possible, the mechanics of NUMA should be the responsibility of
> > the ODP implementation, rather than the application, since that way the
> > application retains maximum portability.
> >
> >
> > However, from an ODP API perspective, I think we need to be mindful of
> > NUMA considerations to give implementations the necessary "hooks" to
> > properly support the NUMA aspects of their platform. This is why ODP APIs
> > need to be careful about what addressability assumptions they make.
> >
> >
> > If Gábor or Jerin can list a couple of specific relevant cases I think
> > that will help in focusing the discussion and get us off to a good start.
> >
> >
> > On Fri, May 8, 2015 at 8:26 AM, Savolainen, Petri (Nokia - FI/Espoo)
> > <[email protected]> wrote:
> > Hi,
> >
> > ODP is OS agnostic and thus thread management (e.g. thread creation and
> > pinning to physical cores) and NUMA awareness should happen mostly outside
> > of ODP APIs.
> >
> > For example, NUMA could be visible in ODP APIs this way:
> > * Add odp_cpumask_xxx() calls that indicate NUMA dependency between CPUs
> > (just for information)
> > * Add a way to identify groups of threads which frequently share resources
> > (memory and handles) within the group
> > * Give the thread group as a hint (parameter) to various ODP calls that
> > create shared resources. Implementation can use the information to
> > allocate resources "near" to the threads in the group.
> > However, the user is responsible to group the threads and map/pin those
> > into physical CPUs in a way that enables NUMA aware optimizations.
> >
> >
> > -Petri
> >
> >
> > > -----Original Message-----
> > > From: lng-odp [mailto:[email protected]] On Behalf Of ext
> > > Gábor Sándor Enyedi
> > > Sent: Friday, May 08, 2015 10:48 AM
> > > To: Jerin Jacob; Zoltan Kiss
> > > Cc: [email protected]
> > > Subject: Re: [lng-odp] NUMA aware memory allocation?
> > >
> > > Hi,
> > >
> > > Thanks. So, is the workaround for now to start the threads, and do all
> > > the memory reservation on the thread? And to call odp_shm_reserve()
> > > instead of simple malloc() calls? Can I use multiple buffer pools, one
> > > for each thread or interface?
> > > BR,
> > >
> > > Gabor
> > >
> > > P.s.: Do you know when will this issue in the API be fixed (e.g. in next
> > > release or whatever)?
> > >
> > > On 05/08/2015 09:06 AM, Jerin Jacob wrote:
> > > > On Thu, May 07, 2015 at 05:00:54PM +0100, Zoltan Kiss wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I'm not aware of any such interface, but others with more knowledge can
> > > >> comment about it. The ODP-DPDK implementation creates buffer pools on the
> > > >> NUMA node where the pool create function was actually called.
> > > > current ODP spec is not NUMA aware. We need to have API to support nodes
> > > > enumeration and
> > > > explicit node parameter to alloc/free resource from specific node like
> > > > odp_shm_reserve_onnode(node, ...)
> > > > and while keeping existing API odp_shm_reserve() allocated on node where
> > > > the current code runs
> > > >
> > > >
> > > >> Regards,
> > > >>
> > > >> Zoli
> > > >>
> > > >> On 07/05/15 16:32, Gábor Sándor Enyedi wrote:
> > > >>> Hi!
> > > >>>
> > > >>> I just started to test ODP, trying to write my first application, but
> > > >>> found a problem: if I want to write NUMA aware code, how should I
> > > >>> allocate memory close to a given thread?
> > > >>> I mean, I know there is libnuma, but should I use it? I guess not,
> > > >>> but I cannot find memory allocation functions in ODP. Is there a
> > > >>> function similar to numa_alloc_onnode()?
> > > >>> Thanks,
> > > >>>
> > > >>> Gabor
> > > >>> _______________________________________________
> > > >>> lng-odp mailing list
> > > >>> [email protected]
> > > >>> https://lists.linaro.org/mailman/listinfo/lng-odp
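
The parameter convention discussed at the top of this thread (an all-zero
odp_xxx_param_t, or param == NULL, meaning default behaviour) could look
roughly like the C sketch below. Everything in it is hypothetical:
xxx_param_t, xxx_param_init() and xxx_create() are illustration-only names,
not ODP APIs, and the use_node/node fields are just one assumed way to carry
placement info. The init helper is there for the case Stuart raises, where a
platform's defaults may not be all-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parameter struct, for illustration only; not an ODP type. */
typedef struct {
    int use_node; /* 0: implementation chooses placement (the default) */
    int node;     /* only read when use_node != 0 */
    int prio;     /* 0: default priority */
} xxx_param_t;

/* Hypothetical init helper: the one place that knows the defaults.
 * With all-zero defaults it is just a memset; a platform with non-zero
 * defaults would assign them here instead. */
static void xxx_param_init(xxx_param_t *param)
{
    memset(param, 0, sizeof(*param));
}

/* Hypothetical create call. Returns the node the resource would be placed
 * on, or -1 when placement is left to the implementation. */
static int xxx_create(const xxx_param_t *param)
{
    xxx_param_t defaults;

    if (param == NULL) {
        /* param == NULL is defined as "all defaults" */
        xxx_param_init(&defaults);
        param = &defaults;
    }

    if (param->use_node) {
        assert(param->node >= 0);
        return param->node;
    }

    return -1;
}
```

With this shape a non-NUMA caller can pass NULL (or a struct that has only
been through xxx_param_init()) and never touch the placement fields, which
is the concern Jerin raises above; only a caller that sets use_node has to
know about nodes at all.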
