Hi, thanks Bill

I now understand the ODP thread concept more deeply, and that in embedded
applications developers are involved in tuning/optimizing for the target
platform.

May I work through a small example: say we have a data-plane application
which includes 3 ODP threads, and we would like to install and run it on 2
platforms.

   - Platform A: 2 cores.
   - Platform B: 10 cores.

Question: which of the two assumptions below matches the current ODP
programming model?

*1.* The application developer writes target-platform-specific code
specifying that:

On platform A, run thread (0) on core (0) and threads (1, 2) on core (1).
On platform B, run thread (0) on core (0), let thread (1) scale out into 8
instances on cores (1~8), and run thread (2) on core (9).

Installing and running on a different platform requires the platform-specific
code above and recompilation for the target.

*2.* The application developer writes code to specify:

Threads (0, 2) do not scale out.
Thread (1) can scale out (up to a limit N?).
Platform A has 2 cores available (given as a command-line parameter?).
Platform B has 10 cores available (given as a command-line parameter?).

Installing and running on a different platform may not require recompilation;
ODP intelligently arranges the threads according to the information provided.

Last question: in cases where the set of available cores shrinks, such as a
power-save mode, would ODP intelligently re-arrange the ODP threads
dynamically at runtime?

Thanks and Best Regards, Yi

On 5 May 2016 at 18:50, Bill Fischofer <[email protected]> wrote:

> I've added this to the agenda for Monday's call; however, I suggest we
> continue the dialog here as well, as background.
>
> Regarding thread pinning, there's always been a tradeoff there. On the one
> hand, dedicating cores to threads is ideal for scale-out in many-core
> systems; however, ODP does not require many-core environments to work
> effectively, so ODP APIs enable, but do not require or assume, that cores
> are dedicated to threads. That's really a question of application design
> and fit to the particular platform it's running on. In embedded
> environments you'll likely see this model more often, since the application
> knows which platform it's being targeted for. In VNF environments, by
> contrast, you're more likely to see a blend, where applications will take
> advantage of however many cores are available but will still run without
> dedicated cores in environments with more modest resources.
>
> On Wed, May 4, 2016 at 9:45 PM, Yi He <[email protected]> wrote:
>
>> Hi, thanks Mike and Bill,
>>
>> From your clear summary, can we turn it into several TO-DO decisions? (We
>> can discuss them on the next ARCH call):
>>
>>    1. How to address the precise semantics of the existing timing
>>    APIs (odp_cpu_xxx) as they relate to processor locality.
>>
>>
>>    - *A:* guarantee it by adding a constraint to the ODP thread concept:
>>    every ODP thread shall be deployed and pinned to one CPU core.
>>       - A sub-question: my understanding is that application programmers
>>       only need to specify the available CPU sets for control/worker
>>       threads, and it is up to ODP to arrange the threads onto the CPU
>>       cores at launch, right?
>>    - *B:* guarantee it by adding new APIs to disable/enable CPU migration.
>>    - Either way, document it clearly in the user's guide or API
>>    documentation.
>>
>>
>>    2. Understand the requirement to have both processor-local and
>>    system-wide timing APIs:
>>
>>
>>    - There are some APIs available in time.h (odp_time_local(), etc.).
>>    - We can start a discussion thread to understand the relationship,
>>    usage scenarios, and constraints of the APIs in time.h and cpu.h.
>>
>> Best Regards, Yi
>>
>> On 4 May 2016 at 23:32, Bill Fischofer <[email protected]> wrote:
>>
>>> I think there are two fallouts from this discussion.  First, there is
>>> the question of the precise semantics of the existing timing APIs as they
>>> relate to processor locality. Applications such as profiling tests, to the
>>> extent that they use APIs that have processor-local semantics, must ensure
>>> that the thread(s) using these APIs are pinned for the duration of the
>>> measurement.
>>>
>>> The other point is the one that Petri brought up about having other APIs
>>> that provide timing information based on wall time or other metrics that
>>> are not processor-local.  While these may not have the same performance
>>> characteristics, they would be independent of thread migration
>>> considerations.
>>>
>>> Of course all this depends on exactly what one is trying to measure.
>>> Since thread migration is not free, allowing such activity may or may not
>>> be relevant to what is being measured, so ODP probably wants to have both
>>> processor-local and system-wide timing APIs.  We just need to be sure they
>>> are specified precisely so that applications know how to use them properly.
>>>
>>> On Wed, May 4, 2016 at 10:23 AM, Mike Holmes <[email protected]>
>>> wrote:
>>>
>>>> It sounded like the arch call was leaning towards documenting that on
>>>> odp-linux the application must ensure that odp_threads are pinned to
>>>> cores when launched.
>>>> This is a restriction that some platforms may not need to make, vs. the
>>>> idea that a piece of ODP code can use these APIs to ensure the behavior
>>>> it needs without knowledge of, or reliance on, the wider system.
>>>>
>>>> On 4 May 2016 at 01:45, Yi He <[email protected]> wrote:
>>>>
>>>>> Establishing a performance profiling environment guarantees the
>>>>> meaningfulness and consistency of consecutive invocations of the
>>>>> odp_cpu_xxx() APIs. After profiling is done, it restores the execution
>>>>> environment to its multi-core optimized state.
>>>>>
>>>>> Signed-off-by: Yi He <[email protected]>
>>>>> ---
>>>>>  include/odp/api/spec/cpu.h | 31 +++++++++++++++++++++++++++++++
>>>>>  1 file changed, 31 insertions(+)
>>>>>
>>>>> diff --git a/include/odp/api/spec/cpu.h b/include/odp/api/spec/cpu.h
>>>>> index 2789511..0bc9327 100644
>>>>> --- a/include/odp/api/spec/cpu.h
>>>>> +++ b/include/odp/api/spec/cpu.h
>>>>> @@ -27,6 +27,21 @@ extern "C" {
>>>>>
>>>>>
>>>>>  /**
>>>>> + * @typedef odp_profiler_t
>>>>> + * ODP performance profiler handle
>>>>> + */
>>>>> +
>>>>> +/**
>>>>> + * Setup a performance profiling environment
>>>>> + *
>>>>> + * A performance profiling environment guarantees meaningful and
>>>>> consistency of
>>>>> + * consecutive invocations of the odp_cpu_xxx() APIs.
>>>>> + *
>>>>> + * @return performance profiler handle
>>>>> + */
>>>>> +odp_profiler_t odp_profiler_start(void);
>>>>> +
>>>>> +/**
>>>>>   * CPU identifier
>>>>>   *
>>>>>   * Determine CPU identifier on which the calling is running. CPU
>>>>> numbering is
>>>>> @@ -170,6 +185,22 @@ uint64_t odp_cpu_cycles_resolution(void);
>>>>>  void odp_cpu_pause(void);
>>>>>
>>>>>  /**
>>>>> + * Stop the performance profiling environment
>>>>> + *
>>>>> + * Stop performance profiling and restore the execution environment
>>>>> to its
>>>>> + * multi-core optimized state, won't preserve meaningful and
>>>>> consistency of
>>>>> + * consecutive invocations of the odp_cpu_xxx() APIs anymore.
>>>>> + *
>>>>> + * @param profiler  performance profiler handle
>>>>> + *
>>>>> + * @retval 0 on success
>>>>> + * @retval <0 on failure
>>>>> + *
>>>>> + * @see odp_profiler_start()
>>>>> + */
>>>>> +int odp_profiler_stop(odp_profiler_t profiler);
>>>>> +
>>>>> +/**
>>>>>   * @}
>>>>>   */
>>>>>
>>>>> --
>>>>> 1.9.1
>>>>>
>>>>> _______________________________________________
>>>>> lng-odp mailing list
>>>>> [email protected]
>>>>> https://lists.linaro.org/mailman/listinfo/lng-odp
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Mike Holmes
>>>> Technical Manager - Linaro Networking Group
>>>> Linaro.org <http://www.linaro.org/> │ Open source software for ARM SoCs
>>>> "Work should be fun and collaborative, the rest follows"
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
