On 2 June 2016 at 11:08, Christophe Milard <[email protected]>
wrote:

> since V2: Update following Barry and Bill's comments
> since V1: Update following arch call 31 may 2016
>
> This is an attempt to sum up the discussions around the thread/process
> topic that have been happening over the last few weeks.
> Sorry for the formalism of this mail, but it seems we need accuracy here...
>
> This summary is organized as follows:
>
> It is a set of statements, each of them expecting a separate answer
> from you. When no specific ODP version is specified, the statement
> regards the "ultimate" goal (i.e. what we want eventually to achieve).
> Each statement is prefixed with:
>   - a statement number for further reference (e.g. S1)
>   - a status word (one of 'agreed', 'open', or 'closed').
> Agreed statements expect a yes/no answer: 'yes' meaning that you
> acknowledge that this is your understanding of the agreement and will
> not nack an implementation based on this statement. You can comment
> after a yes, but your comment will not block any implementation based
> on the agreed statement. A 'no' implies that the statement does not
> reflect your understanding of the agreement, or you refuse the
> proposal.
> Any 'no' received on an 'agreed' statement will push it back as 'open'.
> Open statements are fully open for further discussion.
>
> S1  -agreed: an ODP thread is an OS/platform concurrent execution
> environment object (as opposed to an ODP object). No more specific
> definition is given by the ODP API itself.
>
> Barry: YES
> ---------------------------
>
> S2  -agreed: each ODP implementation must state what is allowed to be
> used as an ODP thread for that specific implementation: a linux-based
> implementation, for instance, will have to state whether odp threads
> can be linux pthreads, linux processes, or both, or any other type of
> concurrent execution environment. ODP implementations can put any
> restriction they wish on what an ODP thread is allowed to be. This
> should be documented in the ODP implementation documentation.
>
> Barry: YES
> ---------------------------
>
> S3  -agreed: in the linux generic ODP implementation an odp thread will
> be either:
>         * a linux process that is a descendant of (or the same as) the
> odp instantiation process.
>         * a pthread 'member' of a linux process that is a descendant of
> (or the same as) the odp instantiation process.
>
> Barry: YES
> ---------------------------
>
> S4  -agreed: for monarch, the linux generic ODP implementation only
> supports odp threads that are pthreads within the instantiation process.
>
> Barry: YES
> ---------------------------
>
> S5  -agreed: whether multiple instances of ODP can be run on the same
> machine is left as an implementation decision. The ODP implementation
> document should state what is supported; any restriction is allowed.
>
> Barry: YES
> ---------------------------
>
> S6  -agreed: the l-g odp implementation will support multiple odp
> instances whose instantiation processes are different and not
> ancestors/descendants of each other. Different instances of ODP will,
> of course, be restricted in sharing common OS resources (the total
> amount of memory available for each ODP instance may decrease as the
> number of instances increases, access to a network interface will
> probably be granted to the first instance grabbing the interface and
> denied to the others... other rules may apply when sharing other
> common OS resources).
> ---------------------------
>
> S7  -agreed: the l-g odp implementation will not support multiple ODP
> instances initiated from the same linux process (calling
> odp_init_global multiple times).
> As an illustration, this means that a single process P is not allowed
> to execute the following calls (in any order):
> instance1 = odp_global_init()
> instance2 = odp_global_init()
> pthread_create (and, in that thread, run odp_local_init(instance1) )
> pthread_create (and, in that thread, run odp_local_init(instance2) )
> -------------------
>
> S8  -agreed: the l-g odp implementation will not support multiple ODP
> instances initiated from related linux processes (descendant/ancestor
> of each other), i.e. ODP 'sub-instances' are not supported. As an
> illustration, this means that the following is not supported:
> instance1 = odp_global_init()
> pthread_create (and, in that thread, run odp_local_init(instance1) )
> if (fork()==0) {
>     instance2 = odp_global_init()
>     pthread_create (and, in that thread, run odp_local_init(instance2) )
> }
>
> --------------------
> S9  -agreed: the odp instance passed as parameter to odp_local_init()
> must always be one of the odp_instances returned by odp_global_init().
>
> Barry: YES
> ---------------------------
>
> S10 -agreed: for l-g, given S7 and S8, and due to S3, the odp_instance
> an odp thread can attach to is completely determined by the thread's
> ancestry, making the odp_instance parameter of odp_init_local
> redundant. The odp l-g implementation guide will highlight this
> redundancy, but will stress that even in this case the parameter to
> odp_local_init() still has to be set correctly, as it is used
> internally by the implementation.
>
> Barry: I think so
> Bill: This practice also ensures that applications behave unchanged if
> and when multi-instance support is added, so I don't think we need to
> be apologetic about this parameter requirement.
> ---------------------------
>
> S11 -agreed: at odp_global_init() time, the application will provide
> three sets of cpus (i.e. three cpu masks):
>         -the control cpu mask
>         -the worker cpu mask
>         -the odp service cpu mask (i.e. the set of cpus odp may take
> for its own usage)
>
> Bill: I think we should clarify that the service mask is not part of
> ODP Monarch. That's something that we'll probably introduce in Tiger
> Moth. As such, the notion of service CPUs is hidden/internal to
> Monarch implementations.
>
> Barry: YES (though the service cpu mask is post monarch)
> ---------------------------
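To make the three-mask idea concrete, here is a minimal POSIX sketch (not ODP API: the real masks would be built with the implementation's own cpumask type and handed to odp_init_global(); the cpu numbering below is an arbitrary assumption) of carving cpus into three disjoint sets:

```c
#define _GNU_SOURCE
#include <sched.h>

/* Carve cpu 0 plus the next n_workers + n_service cpus into three
 * disjoint sets: control, worker and (post-Monarch) odp service cpus.
 * Returns 0 on success, -1 if the sets would overlap. */
static int split_cpus(cpu_set_t *control, cpu_set_t *worker,
                      cpu_set_t *service, int n_workers, int n_service)
{
	int cpu = 0;
	cpu_set_t tmp;

	CPU_ZERO(control);
	CPU_ZERO(worker);
	CPU_ZERO(service);

	CPU_SET(cpu++, control);	/* cpu 0: control thread(s) */
	for (int i = 0; i < n_workers; i++)
		CPU_SET(cpu++, worker);	/* next n_workers cpus: workers */
	for (int i = 0; i < n_service; i++)
		CPU_SET(cpu++, service);/* cpus odp may take for itself */

	/* The three masks must be pairwise disjoint. */
	CPU_AND(&tmp, control, worker);
	if (CPU_COUNT(&tmp))
		return -1;
	CPU_AND(&tmp, worker, service);
	if (CPU_COUNT(&tmp))
		return -1;
	CPU_AND(&tmp, control, service);
	if (CPU_COUNT(&tmp))
		return -1;
	return 0;
}
```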
>
> S12 -agreed: the odp implementation may return an error at
> odp_init_global() call time if the number of cpus in the odp service
> mask (or their 'position') does not match the ODP implementation's needs.
>
> Barry: YES
> ---------------------------
>
> S13 -agreed: the application is fully responsible for pinning its own
> odp threads to different cpus, and this is done directly through OS
> system calls, or via helper functions (as opposed to ODP API calls).
> This pinning should be done among the cpus in the worker cpu mask
> or the control cpu mask.
>
> Barry: YES, but I support the existence of helper functions to do this
> – including the
> important case of pinning the main thread
> ---------------------------
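As an illustration of S13, an odp thread can pin itself with plain OS calls; an odph_* helper could wrap exactly this. A Linux-specific sketch (not ODP API), which also covers the main thread since it pins the calling thread:

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to a single cpu using plain OS calls.
 * The cpu chosen should come from the worker or control cpu mask.
 * Returns 0 on success, an errno value on failure. */
static int pin_self_to_cpu(int cpu)
{
	cpu_set_t set;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	/* Works for any thread, including the main thread. */
	return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```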
>
> S14 -agreed: whether more than one odp thread can be pinned to the
> same cpu is left as an implementation choice (and the answer to that
> question can be different for the service, worker and control
> threads). This choice should be well documented in the implementation
> user manual.
>
> Barry: YES
> ---------------------------
>
> S15 -agreed: the odp implementation is responsible for pinning its own
> service threads among the cpus in the odp service cpu mask.
>
> Barry: YES,  in principle – BUT be aware that currently the l-g ODP
> implementation
> (and perhaps many others) cannot call the helper functions (unless
> inlined),
> so this internal pinning may not be well coordinated with the helpers.
>
> ---------------------------
>
> S16 -open: why does the odp implementation need to know the control and
> worker masks? If S13 is true, shouldn't these two masks be part of the
> helpers only? (meaning that S11 is wrong)
>
> Barry: Currently it probably doesn't NEED them, but perhaps in the
> future, with some new APIs and capabilities, it might benefit from
> this information, and so I would leave them in.
> ---------------------------
>
> S17 -open: should masks passed as parameter to odp_global_init() have the
> same "namespace" as those used internally within ODP?
>
> Barry: YES
> ---------------------------
>
> S18 -agreed: ODP handles are valid over the whole ODP instance, i.e.
> any odp handle remains valid among all the odp threads of the ODP
> instance regardless of the odp thread type (process, thread or
> whatever): an ODP thread A can pass an odp handle to another ODP
> thread B (using any kind of IPC), and B can use the handle.
>
> -----------------
>
> S19 -open: any pointer retrieved by an ODP call (such as
> odp_*_get_addr()) follows the rules defined by the OS, with the
> possible exception defined in S21. For the linux generic ODP
> implementation, this means that pointers are fully shareable when
> using pthreads, and that pointers to shared mem areas will be
> shareable as long as the fork() happens after the shm_reserve().
>
> Barry: NO. Disagree.  I would prefer to see a consistent ODP answer on
> this topic, and in
> particular I don’t even believe that most OS’s “have rules defining …”,
> since
> in fact one can make programs run under Linux which share pointers
> regardless of the ordering of fork() calls.  Most OSes have lots of
> (continually evolving) capabilities
> in the category of sharing memory and so “following the rules of the OS”
> is not
> well defined.
> Instead, I prefer a simpler rule.  Memory reserved using the special
> flag is guaranteed
> to use the same addresses across processes, and all other pointers are
> not guaranteed
> to be the same nor guaranteed to be different, so the ODP programmer
> should avoid
> any such assumptions for maximum portability.  But of course programmers
> often
> only consider a subset of possible targets (e.g. how many programmers
> consider porting
> to an 8-bit CPU or a machine with a 36-bit word length), and so they
> may happily take advantage
> of certain non-guaranteed assumptions.
>
> ---------------------
>
> S20 -open: by default, shmem addresses (returned by shm_get_addr())
> follow the OS rules, as defined by S19.
>
The question is which OS rules apply (an OS can have different rules for
different OS objects, e.g. memory regions allocated using malloc and mmap
will behave differently). I think the answer depends on how ODP shmem
objects are implemented. Only the ODP implementation knows how ODP shmem objects
are created (e.g. use some OS system call, manipulate the page tables
directly). So essentially the sharability of pointers is ODP implementation
specific (although ODP implementations on the same OS can be expected to
behave the same). Conclusion: we actually don't specify anything at all
here, it is completely up to the ODP implementation.

What is required/expected by ODP applications? If we don't make
applications happy, ODP is unlikely to succeed.
I think many applications are happy with a single-process thread model
where all memory is shared and pointers can be shared freely.
I hear of some applications that require a multi-process thread model; I
expect that those applications also want to be able to share memory and
pointers freely between processes, at least for memory that was specifically
allocated to be shared (so-called shared memory regions; what is otherwise
the purpose of such regions?).
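A minimal POSIX illustration (not ODP API) of the "fork() after reserve" case discussed here: a pointer into a MAP_SHARED region mapped before fork() remains valid, at the same virtual address, in the child, so both processes see each other's writes through it:

```c
#define _GNU_SOURCE
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/* Map a shared region, then fork: the child inherits the mapping at
 * the same virtual address, so a pointer into it is valid in both
 * processes. Returns 0 if the parent observes the child's write,
 * -1 otherwise. */
static int shared_pointer_demo(void)
{
	int ok;
	pid_t pid;
	int *p = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
		      MAP_SHARED | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;

	*p = 0;
	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {		/* child: same pointer, same memory */
		*p = 42;
		_exit(0);
	}
	waitpid(pid, NULL, 0);	/* parent: sees the child's write */
	ok = (*p == 42) ? 0 : -1;
	munmap(p, sizeof(int));
	return ok;
}
```

Had the region been mapped after the fork(), nothing would guarantee the same address in both processes, which is exactly the gap the S21 flag is meant to close.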


> Barry: Disagree with the same comments as in S19.
>
> ---------------------
>
> S21 -open: shm will support an extra flag at shm_reserve() call time:
> SHM_XXX. The usage of this flag will allocate shared memory guaranteed
> to be located at the same virtual address in all odp threads of the
> odp_instance. Pointers to this shared memory type are therefore fully
> sharable, even between odp threads running in different VA spaces (e.g.
> processes). The amount of memory which can be allocated using this
> flag can be limited to any value by the ODP implementation, down to
> zero bytes, meaning that some odp implementations may not support this
> option at all; shm_reserve() will return an error in this case. The
> usage of this flag by the application is therefore not recommended.
> The ODP implementation may require a hint about the size of this area
> at odp_init_global() call time.
>
> Barry: Mostly agree, except for the comment about the special flag not
> being recommended.
>
Agree. Some/many applications will want to share memory between
threads/processes and must be able to do so. Some ODP platforms may have
limitations on the amount of memory (if any) that can be shared and may
thus fail to run certain applications. Such is life. I don't see a problem
with that. Possibly we should remove the phrase "not recommended" and just
state that portability may be limited.



> ------------------
>
> S22 -open: please put your name suggestions for this SHM_XXX flag here :-).
>
SHM_I_REALLY_WANT_TO_SHARE_THIS_MEMORY


>
> S23 -open: the rules above define relatively well the behaviour of
> pointers retrieved by calls to odp_shm_get_addr(). But many points
> need to be defined regarding pointers to other ODP objects: what is
> the validity of a pointer to a packet, for instance? If process A
> creates a packet pool P, then forks B and C, and B allocates a packet
> from P and retrieves a pointer to it... Is this pointer valid in A and
> C? In the current l-g implementation, it will be... Is this behaviour
>
Perhaps we need the option to specify the
I_REALLY_WANT_TO_SHARE_THIS_MEMORY flag when creating all types of ODP
pools?
An ODP implementation can always fail to create such a pool if the
sharability requirement can not be satisfied.

> something we wish to enforce on any odp implementation? What about
> other objects: buffers, atomics ... Some clear rule has to be defined
>
Allocation of locations used for atomic operations is the responsibility of
the application which can (and must) choose a suitable type of memory.

> here... How things behave, and whether this behaviour is part of the ODP
> API or just specific to each implementation...
>
It is better that sharability is an explicit requirement from the
application. It should be specified as a flag parameter to the different
calls that create/allocate regions of memory (shmem, different types of
pools).


> Barry:
> Again refer to S19 answer.  Specifically it is about what is
> GUARANTEED regarding
> pointer validity, not whether the pointers in certain cases will happen to
> be
> the same.  So for your example, the pointer is not guaranteed to be
> valid in A and C,
> but the programmer might well believe that for all the ODP platforms
> and implementations
> they expect to run on, this is very likely to be the case, in which
> case we can’t stop them
> from constraining their program’s portability – no more than requiring
> them to be able to
> port to a ternary (3-valued “bit”) architecture.
> ---------------------
>
> Thanks for your feedback!
> _______________________________________________
> lng-odp mailing list
> [email protected]
> https://lists.linaro.org/mailman/listinfo/lng-odp
>