Hi, thanks Bill. I now understand the ODP thread concept more deeply, and that in embedded applications developers are involved in target-platform tuning and optimization.
May I ask about a small example: say we have a data-plane app consisting of 3 ODP threads, and we would like to install and run it on 2 platforms:

- Platform A: 2 cores.
- Platform B: 10 cores.

Question: which of the assumptions below matches the current ODP programming model?

*1.* The application developer writes target-platform-specific code to specify that:

- On platform A, run thread (0) on core (0) and threads (1, 2) on core (1).
- On platform B, run thread (0) on core (0), let thread (1) scale out into 8 duplicate instances on cores (1~8), and run thread (2) on core (9).

Installing and running on a different platform requires the platform-specific code above and recompilation for the target.

*2.* The application developer writes code that specifies only:

- Threads (0, 2) do not scale out.
- Thread (1) can scale out (up to a limit N?).
- Platform A has 3 cores available (as a command-line parameter?).
- Platform B has 10 cores available (as a command-line parameter?).

Installing and running on a different platform may not require recompilation; ODP intelligently arranges the threads according to the information provided.

Last question: in some cases, such as power-save mode, the set of available cores shrinks; would ODP intelligently re-arrange the ODP threads dynamically at runtime?

Thanks and Best Regards, Yi

On 5 May 2016 at 18:50, Bill Fischofer <[email protected]> wrote:

> I've added this to the agenda for Monday's call; however, I suggest we
> continue the dialog here as well as background.
>
> Regarding thread pinning, there's always been a tradeoff on that. On the
> one hand, dedicating cores to threads is ideal for scale-out in many-core
> systems; however, ODP does not require many-core environments to work
> effectively, so ODP APIs enable but do not require or assume that cores
> are dedicated to threads. That's really a question of application design
> and fit to the particular platform it's running on. In embedded
> environments you'll likely see this model more, since the application
> knows which platform it's being targeted for.
> In VNF environments, by contrast, you're more likely to see a blend where
> applications will take advantage of however many cores are available to
> them but will still run without dedicated cores in environments with more
> modest resources.
>
> On Wed, May 4, 2016 at 9:45 PM, Yi He <[email protected]> wrote:
>
>> Hi, thanks Mike and Bill,
>>
>> From your clear summary, can we put it into several TO-DO decisions (we
>> can discuss them on the next ARCH call):
>>
>> 1. How to address the precise semantics of the existing timing APIs
>>    (odp_cpu_xxx) as they relate to processor locality.
>>
>>    - *A:* guarantee it by adding a constraint to the ODP thread concept:
>>      every ODP thread shall be deployed and pinned on one CPU core.
>>      - A sub-question: my understanding is that application programmers
>>        only need to specify the available CPU sets for control/worker
>>        threads, and it is up to ODP to arrange the threads onto each CPU
>>        core at launch, right?
>>    - *B:* guarantee it by adding new APIs to disable/enable CPU migration.
>>    - Then document it clearly in the user's guide or API documentation.
>>
>> 2. Understand the requirement to have both processor-local and
>>    system-wide timing APIs:
>>
>>    - There are some APIs available in time.h (odp_time_local(), etc.).
>>    - We can have a thread to understand the relationship, usage
>>      scenarios, and constraints of the APIs in time.h and cpu.h.
>>
>> Best Regards, Yi
>>
>> On 4 May 2016 at 23:32, Bill Fischofer <[email protected]> wrote:
>>
>>> I think there are two fallouts from this discussion. First, there is
>>> the question of the precise semantics of the existing timing APIs as
>>> they relate to processor locality. Applications such as profiling tests,
>>> to the extent that they use APIs that have processor-local semantics,
>>> must ensure that the thread(s) using these APIs are pinned for the
>>> duration of the measurement.
>>> The other point is the one that Petri brought up about having other
>>> APIs that provide timing information based on wall time or other metrics
>>> that are not processor-local. While these may not have the same
>>> performance characteristics, they would be independent of thread
>>> migration considerations.
>>>
>>> Of course, all this depends on exactly what one is trying to measure.
>>> Since thread migration is not free, allowing such activity may or may
>>> not be relevant to what is being measured, so ODP probably wants to have
>>> both processor-local and system-wide timing APIs. We just need to be
>>> sure they are specified precisely so that applications know how to use
>>> them properly.
>>>
>>> On Wed, May 4, 2016 at 10:23 AM, Mike Holmes <[email protected]> wrote:
>>>
>>>> It sounded like the arch call was leaning towards documenting that on
>>>> odp-linux the application must ensure that odp_threads are pinned to
>>>> cores when launched.
>>>> This is a restriction that some platforms may not need to make, vs. the
>>>> idea that a piece of ODP code can use these APIs to ensure the behavior
>>>> it needs without knowledge of or reliance on the wider system.
>>>>
>>>> On 4 May 2016 at 01:45, Yi He <[email protected]> wrote:
>>>>
>>>>> Establishing a performance profiling environment guarantees meaningful
>>>>> and consistent consecutive invocations of the odp_cpu_xxx() APIs.
>>>>> After profiling is done, it restores the execution environment to its
>>>>> multi-core optimized state.
>>>>>
>>>>> Signed-off-by: Yi He <[email protected]>
>>>>> ---
>>>>>  include/odp/api/spec/cpu.h | 31 +++++++++++++++++++++++++++++++
>>>>>  1 file changed, 31 insertions(+)
>>>>>
>>>>> diff --git a/include/odp/api/spec/cpu.h b/include/odp/api/spec/cpu.h
>>>>> index 2789511..0bc9327 100644
>>>>> --- a/include/odp/api/spec/cpu.h
>>>>> +++ b/include/odp/api/spec/cpu.h
>>>>> @@ -27,6 +27,21 @@ extern "C" {
>>>>>
>>>>>
>>>>>  /**
>>>>> + * @typedef odp_profiler_t
>>>>> + * ODP performance profiler handle
>>>>> + */
>>>>> +
>>>>> +/**
>>>>> + * Setup a performance profiling environment
>>>>> + *
>>>>> + * A performance profiling environment guarantees meaningful and consistency of
>>>>> + * consecutive invocations of the odp_cpu_xxx() APIs.
>>>>> + *
>>>>> + * @return performance profiler handle
>>>>> + */
>>>>> +odp_profiler_t odp_profiler_start(void);
>>>>> +
>>>>> +/**
>>>>>  * CPU identifier
>>>>>  *
>>>>>  * Determine CPU identifier on which the calling is running. CPU numbering is
>>>>> @@ -170,6 +185,22 @@ uint64_t odp_cpu_cycles_resolution(void);
>>>>>  void odp_cpu_pause(void);
>>>>>
>>>>>  /**
>>>>> + * Stop the performance profiling environment
>>>>> + *
>>>>> + * Stop performance profiling and restore the execution environment to its
>>>>> + * multi-core optimized state, won't preserve meaningful and consistency of
>>>>> + * consecutive invocations of the odp_cpu_xxx() APIs anymore.
>>>>> + *
>>>>> + * @param profiler performance profiler handle
>>>>> + *
>>>>> + * @retval 0 on success
>>>>> + * @retval <0 on failure
>>>>> + *
>>>>> + * @see odp_profiler_start()
>>>>> + */
>>>>> +int odp_profiler_stop(odp_profiler_t profiler);
>>>>> +
>>>>> +/**
>>>>>  * @}
>>>>>  */
>>>>>
>>>>> --
>>>>> 1.9.1
>>>>>
>>>>> _______________________________________________
>>>>> lng-odp mailing list
>>>>> [email protected]
>>>>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>>>
>>>> --
>>>> Mike Holmes
>>>> Technical Manager - Linaro Networking Group
>>>> Linaro.org <http://www.linaro.org/> │ Open source software for ARM SoCs
>>>> "Work should be fun and collaborative, the rest follows"
