Oops!  Sorry folks - thought I had already edited the subject lines of
the actual patches but that must have been before finally getting a
version thru check-odp.... resending :-\


On Mon, Feb 15, 2016 at 1:49 PM, Gary S. Robertson
<[email protected]> wrote:
> Current versions of ODP linux-generic use sched_getaffinity() and / or
> pthread_getaffinity_np() to obtain counts and sets of CPUs for use by
> ODP tasks.  This method returns inappropriate results when the
> underlying kernel is compiled with NO_HZ_FULL support, as are the
> current LNG kernels. (See Linaro BUG 2027 for details.)
>
> In the process of correcting this erroneous behavior in ODP linux-generic
> it was discovered that some of the validation and performance tests
> were using deprecated methods for determining counts of available CPUs.
> This led to these tests hanging in barrier_wait() because their barriers
> were initialized to expect more tasks than available worker CPUs could
> support.  Since these tests are used to validate ODP linux-generic code
> changes, fixing the tests became a prerequisite to submitting patches
> to address BUG 2027... so patch 1 in this series addresses the test
> faults.  These test changes were confirmed to work properly whether the
> bug 2027 fix is in place or not - they do not depend on the bug fix.
>
> As to the bug fix itself, the 'getaffinity' CPU detection logic is
> replaced by code which mines the sysfs pseudo-filesystem (/sys) for info
> about the CPUs detected on the system at boot time.  Additionally,
> the new CPU detection method uses this information to map 'primary'
> CPUs separately from any 'thread sibling' CPUs (AKA 'hyperthreads')
> sharing the same hardware core.  Since some core hardware may be
> shared between 'thread siblings', worker tasks running on a given
> CPU may experience performance degradation if other tasks are
> scheduled on 'thread sibling' CPUs from the same hardware core.
> The new CPU detection logic attempts to mitigate this by omitting
> the 'thread siblings' of worker CPUs from the CPUs available to ODP.
> Control tasks expect to share resources, so any thread siblings
> found for 'primary' CPUs designated as control CPUs are included in
> the corresponding cpumask.
>
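
A minimal sketch of the 'primary' versus 'thread sibling' test described
above, not the code from the patch. It assumes the kernel exposes
/sys/devices/system/cpu/cpuN/topology/thread_siblings_list and that a CPU
counts as 'primary' when it is the lowest-numbered entry in its own list.
The function names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Parse the first CPU number from a sysfs CPU list such as "0,4" or "2-3".
 * Returns -1 if the string does not start with a non-negative number. */
int first_cpu_in_list(const char *list)
{
	char *end;
	long cpu = strtol(list, &end, 10);

	return (end == list || cpu < 0) ? -1 : (int)cpu;
}

/* Assumption: a CPU is 'primary' when it is the lowest-numbered CPU
 * in its own thread sibling list. */
int is_primary_cpu(int cpu, const char *sibling_list)
{
	return first_cpu_in_list(sibling_list) == cpu;
}

/* Read the thread sibling list for one CPU into buf.
 * Returns 0 on success, -1 if the file is missing or unreadable. */
int read_sibling_list(int cpu, char *buf, size_t len)
{
	char path[128];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/devices/system/cpu/cpu%d/topology/thread_siblings_list",
		 cpu);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	return 0;
}
```

A caller would loop over the CPUs found under /sys, call
read_sibling_list() for each, and pass the result to is_primary_cpu().
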
> As a consequence of the 'thread sibling' adjustments, on a system with
> 'hyperthreaded' CPUs present, the default cpumasks returned by the
> new CPU detection logic will have an increased number of control CPUs
> and a decreased number of worker CPUs... but will be better optimized
> for worker performance.  On a system without 'hyperthread' CPUs, the
> new CPU detection logic will default to allocating CPU 0 for control
> and the rest for workers - just as the old logic was expected to do.
>
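
To illustrate the effect on the default masks, here is a rough sketch that
uses a plain 64-bit mask rather than odp_cpumask_t. It assumes CPU 0 is the
control CPU in both cases and that first_sibling[] has already been filled
in from sysfs. The function and array names are hypothetical and not taken
from the patch.

```c
#include <stdint.h>

/* first_sibling[i] is the lowest-numbered CPU sharing a core with CPU i
 * (equal to i itself when CPU i is a 'primary' CPU). */
void build_default_masks(const int *first_sibling, int num_cpus,
			 uint64_t *control, uint64_t *worker)
{
	int cpu;

	*control = 0;
	*worker = 0;
	for (cpu = 0; cpu < num_cpus && cpu < 64; cpu++) {
		if (first_sibling[cpu] == 0)
			/* CPU 0 plus any thread siblings of CPU 0 */
			*control |= UINT64_C(1) << cpu;
		else if (first_sibling[cpu] == cpu)
			/* every other primary CPU becomes a worker */
			*worker |= UINT64_C(1) << cpu;
		/* thread siblings of worker CPUs land in neither mask */
	}
}
```

With four CPUs paired as (0,2) and (1,3), this puts CPUs 0 and 2 in the
control mask and only CPU 1 in the worker mask. With no thread siblings it
falls back to CPU 0 for control and CPUs 1-3 for workers.
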
> Per a suggestion by Petri S., the cpumask generation is moved from the
> odp_cpumask_default_control() and odp_cpumask_default_worker()
> functions into the odp_init_global() function.
> The odp_cpumask_default*() functions now use the cpumasks supplied by
> odp_init_global() without modification (except for returning only
> the number of CPUs requested or the number available - whichever is
> fewer - when asked to provide a default worker cpumask).
>
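
The "requested or available, whichever is fewer" rule can be sketched as
below. This is only an illustration on a plain 64-bit mask, not the
odp_cpumask_default_worker() implementation. Treating a request of zero as
"all available" is an assumption here, and the function name is
hypothetical.

```c
#include <stdint.h>

/* Copy at most 'num' CPUs from 'avail' into '*mask', lowest-numbered first.
 * Assumption: num <= 0 means "all available CPUs".
 * Returns the number of CPUs actually placed in '*mask'. */
int default_worker_mask(uint64_t avail, int num, uint64_t *mask)
{
	int cpu, count = 0;

	*mask = 0;
	for (cpu = 0; cpu < 64; cpu++) {
		if (!(avail & (UINT64_C(1) << cpu)))
			continue;	/* CPU not available to workers */
		if (num > 0 && count >= num)
			break;		/* request already satisfied */
		*mask |= UINT64_C(1) << cpu;
		count++;
	}
	return count;
}
```

A test that sizes its barrier from the returned count, rather than from the
count it asked for, cannot end up waiting on tasks that were never started.
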
> As a bonus, the bug fix's cpumask generation logic also reduces the
> number of changes required in order to support an anticipated
> ODP API change which would allow ODP to accept and use control and
> worker cpumasks specified by an external entity such as a provisioning
> executive or a command-line argument parser.
>
> Gary S. Robertson (2):
>   Correct worker count calculation in tests
>   Make cpu detection work with NO_HZ_FULL
>
>  platform/linux-generic/include/odp_internal.h |  31 ++--
>  platform/linux-generic/odp_cpumask_task.c     |  45 +++--
>  platform/linux-generic/odp_init.c             | 230 +++++++++++++++++++++++++-
>  platform/linux-generic/odp_system_info.c      |  14 +-
>  test/api_test/odp_common.c                    |   4 +-
>  test/api_test/odp_ring_test.c                 |   4 +-
>  test/performance/odp_atomic.c                 |   9 +-
>  test/validation/cpumask/cpumask.c             |   2 +-
>  test/validation/shmem/shmem.c                 |   5 +-
>  test/validation/timer/timer.c                 |  16 +-
>  10 files changed, 298 insertions(+), 62 deletions(-)
>
> --
> 1.9.1
>