From: EXT Mike Holmes [mailto:[email protected]]
Sent: Wednesday, May 11, 2016 4:54 AM
To: Bill Fischofer <[email protected]>
Cc: Savolainen, Petri (Nokia - FI/Espoo) <[email protected]>; LNG ODP Mailman List <[email protected]>
Subject: Re: [lng-odp] ODP Addressing Model
On 10 May 2016 at 20:33, Bill Fischofer <[email protected]> wrote:

On Tue, May 10, 2016 at 3:17 AM, Savolainen, Petri (Nokia - FI/Espoo) <[email protected]> wrote:

From a portability and best practices standpoint it therefore seems that we should encourage all global resource allocation to be performed by the application during initialization, after it has called odp_init_global() but prior to creating other threads.

I think this is too strict in general. Some implementations may be this strict, and some applications may be static enough to create all resources in the main thread at init time, but in general the recommendation should be looser:

• The “initialization” thread is the one calling odp_init_global()
• The init thread creates all other ODP threads of an instance, and does so only after it has itself called odp_init_global() and odp_init_local()

This seems like a reasonable recommendation; however, we may wish to revisit it down the road to cover situations where additional resources are brought online to handle peak loads. For example, currently the number of CPUs available to an application is static and odp_cpumask_all_available() does not change. In a VNF environment one can imagine wanting to add processing resources to an existing VNF to handle peak loads, in which case more CPUs and associated worker threads would need to be brought up dynamically. Obviously we're not going to do that now, but whatever conventions we establish should scale to that sort of more dynamic future environment.

This is possible, since those CPUs/threads are added after odp_init_global(). The system (and ODP) would know the maximum number of CPUs (and other resources) and could prepare for the maximum from the start. An implementation (under Linux) should allocate all internal shared memory (e.g. memory for shm, pools, queues, and other resources and their handles) from Linux during odp_init_global().
This way all ODP handles and pointers to ODP-managed memory may be shared between all ODP threads, regardless of whether processes or pthreads are used as ODP threads. Naturally, all non-ODP-managed handles and memory are outside this scope (e.g. global variables of an application are visible or not visible to other threads as specified by the thread type).

I don't see how this is a less onerous assumption than the suggestion that all pools and shms be allocated by the initial thread prior to creating other threads. First, it's hard to see how this suggestion can be implemented without API changes, and since the Monarch API is frozen, any move in this direction would seem to be post-Monarch.

The difference is that the implementation would do the memory allocations at init time, instead of the application. When the application calls odp_shm_reserve(), no system call or new memory mapping is needed; the implementation just slices off memory it has already allocated from the system.

Agree with Bill. Any API change to Monarch will now require the SC to vote on approving the changes; the procedure would be for the proposing member company to raise it and get approval.

This does not propose any API change. It's a change that could be made in the odp-linux implementation (in process mode at least).

Also, as noted above, such a structure would seem to prevent future dynamic resource extensibility, since applications would have to know in advance the upper limits of all resources they might require. Since pools and shms have differing attributes (shm_flags, or the different types of pools and attributes expressible via the odp_pool_param_t struct), this would seem to be at best an awkward requirement.

We already have a precedent in the timers. We state that applications are expected to call odp_timer_pool_create() for each timer pool needed and then call odp_timer_pool_start().
I think the simplest, safest, and most portable rule to state is that ODP applications should assume that resources created by ODP APIs are created as shareable objects in the address space of the caller. If the application is using a single-address-space threading model (what odp-linux supports in Monarch), then by this rule ODP objects and their derived addresses are automatically sharable among all other threads sharing that address space. If a Linux-style process model is being used, then again it's clear that sharing is possible if the resource is created prior to forking, so that the shared object is present in both address spaces by standard Linux fork() semantics. If some other threading model is being used by some other implementation, then this rule would still be a useful portable design guideline for applications to follow. I don't see the need to stipulate beyond this (at least for Monarch).

I think you still need to define “resource/object” and its “creation time”. E.g. if a worker thread creates a queue (later in the program), it must be able to share the queue handle with all other threads (so that others may enqueue to it), no matter the thread type. If an implementation uses indexes as handles, there's no problem. If an implementation uses pointers as handles, there would be problems with processes, depending on when the memory for the object was allocated (odp_init_global() vs odp_queue_create()) and when forks happen.

Wouldn't it be simplest to define that all ODP handles and addresses are sharable, and then leave it to individual implementations to define:

· What operating systems they support (odp-linux == Linux)
· What thread models of those OSes they support (odp-linux == pthread and process)
· What the constraints of that support are (odp-linux == “… when using processes, processes must be forked from the same parent, which has called odp_init_global() before the fork … max shm memory size available is X GB.
… ”)

I don't expect that all implementations support both models. Also, it's not an issue if an application needs to be modified when ported between operating systems or thread models (that is a major effort by definition anyway). Our validation suite is an exception, since it needs to be able to verify both models (in Linux). As for how an RTOS or Windows implementation would be validated: the implementer would need to port the suite on top of their OS / threading model (updating the startup code at minimum).

-Petri

The user may pass maximum resource usage numbers (max shm usage, max pool memory usage, max number of queues, etc.) at build time and/or run time (odp_init_global()).

-Petri

_______________________________________________
lng-odp mailing list
[email protected]
https://lists.linaro.org/mailman/listinfo/lng-odp

--
Mike Holmes
Technical Manager - Linaro Networking Group
Linaro.org │ Open source software for ARM SoCs
"Work should be fun and collaborative, the rest follows"
