Re: [lng-odp] [EXT] Re: ODP1.15 with gcc-linaro-5.3.1

2017-10-18 Thread Brian Brooks
checking for GCC atomic builtins... yes
checking whether -latomic is needed for 64-bit atomic built-ins... no
checking whether -latomic is needed for 128-bit atomic built-ins... yes

So, GCC 5.3.1 will not lower __atomic/__sync builtins on a 128-bit
data type to machine instructions; it instead emits a call to a
function (which is assumed to be provided by libatomic).

There is some timer and scheduler code that makes use of atomics on
128-bit types.
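
As a rough illustration (a sketch, not code from the ODP tree), GCC's
__atomic_always_lock_free builtin reports at compile time whether
atomics of a given size are always lowered to inline instructions.
When they are not, the compiler may emit a call such as
__atomic_exchange_16, which libatomic provides. The helper names below
are hypothetical, and the 128-bit result depends on the compiler
version and target flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers: true when the compiler guarantees that atomics on
 * an object of this size are always lowered to inline instructions. When
 * the answer is false, GCC may emit a library call (for example
 * __atomic_exchange_16) that is resolved by linking with -latomic. */
static inline bool inline_atomics_int(void)
{
	return __atomic_always_lock_free(sizeof(int), 0);
}

static inline bool inline_atomics_128(void)
{
#ifdef __SIZEOF_INT128__
	return __atomic_always_lock_free(sizeof(__int128), 0);
#else
	return false; /* no 128-bit integer type on this target */
#endif
}

/* Atomic exchange on an int-sized location. This needs no library support
 * on targets where inline_atomics_int() is true. */
static inline int exchange_int(int *loc, int val)
{
	assert(loc != NULL);
	assert(inline_atomics_int());
	return __atomic_exchange_n(loc, val, __ATOMIC_RELAXED);
}
```

Calling inline_atomics_128() on the target toolchain gives a quick hint
about whether the 128-bit configure check is likely to report "yes".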

Can you enable libatomic in the crosstools environment?


On Wed, Oct 18, 2017 at 11:51 AM, Liron Himi <lir...@marvell.com> wrote:
> Hi Brian,
>
> I attached full configure output.
>
> Liron
>
> -----Original Message-----
> From: Brian Brooks [mailto:brian.bro...@linaro.org]
> Sent: Wednesday, October 18, 2017 18:50
> To: Liron Himi <lir...@marvell.com>
> Cc: lng-odp@lists.linaro.org
> Subject: [EXT] Re: [lng-odp] ODP1.15 with gcc-linaro-5.3.1
>
> External Email
>
> --
> Hi Liron,
>
> Can you paste a full copy of the ./configure output?
>
> Brian
>
> On Wed, Oct 18, 2017 at 9:58 AM, Liron Himi <lir...@marvell.com> wrote:
>> Hi,
>>
>> We are using 'gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu' as our 
>> tool-chain.
>> When I compile ODP1.15 with it I get a lot of:
>> 'libtool: warning: library 
>> '/home/userlab/work/crosstools/gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/lib64/libatomic.la'
>>  was moved.'
>>
>> The main problem is that we have another package that uses our ODP
>> outcome and it doesn't compile due to
>> '/bin/sed: can't read 
>> /home/tcwg-buildslave/workspace/tcwg-make-release/label/docker-trusty-amd64-tcwg/target/aarch64-linux-gnu/_build/builds/destdir/x86_64-unknown-linux-gnu/aarch64-linux-gnu/lib/../lib64/libatomic.la:
>>  No such file or directory
>> libtool:   error: 
>> '/home/tcwg-buildslave/workspace/tcwg-make-release/label/docker-trusty-amd64-tcwg/target/aarch64-linux-gnu/_build/builds/destdir/x86_64-unknown-linux-gnu/aarch64-linux-gnu/lib/../lib64/libatomic.la'
>>  is not a valid libtool archive'
>>
>> I notice that it is related to added lines (compared to ODP1.11) in 
>> 'linux-generic/m4/configure.m4'.
>> dnl Check whether -latomic is needed
>> use_libatomic=no
>>
>> AC_MSG_CHECKING(whether -latomic is needed for 64-bit atomic built-ins)
>> AC_LINK_IFELSE(
>>   [AC_LANG_SOURCE([[
>> static int loc;
>> int main(void)
>> {
>> int prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
>> return 0;
>> }
>> ]])],
>>   [AC_MSG_RESULT(no)],
>>   [AC_MSG_RESULT(yes)
>>AC_CHECK_LIB(
>>  [atomic], [__atomic_exchange_8],
>>  [use_libatomic=yes],
>>  [AC_MSG_FAILURE([__atomic_exchange_8 is not available])])
>>   ])
>>
>> AC_MSG_CHECKING(whether -latomic is needed for 128-bit atomic built-ins)
>> AC_LINK_IFELSE(
>>   [AC_LANG_SOURCE([[
>> static __int128 loc;
>> int main(void)
>> {
>> __int128 prev;
>> prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
>> return 0;
>> }
>> ]])],
>>   [AC_MSG_RESULT(no)],
>>   [AC_MSG_RESULT(yes)
>>AC_CHECK_LIB(
>>  [atomic], [__atomic_exchange_16],
>>  [use_libatomic=yes],
>>  [AC_MSG_CHECKING([cannot detect support for 128-bit atomics])])
>>   ])
>>
>> if test "x$use_libatomic" = "xyes"; then
>>   ATOMIC_LIBS="-latomic"
>> fi
>> AC_SUBST([ATOMIC_LIBS])
>>
>>
>> Is there anything I can do with 5.3.1 other than removing the new lines in 
>> configure.m4?
>>
>> Thanks,
>> Liron


Re: [lng-odp] ODP1.15 with gcc-linaro-5.3.1

2017-10-18 Thread Brian Brooks
Hi Liron,

Can you paste a full copy of the ./configure output?

Brian

On Wed, Oct 18, 2017 at 9:58 AM, Liron Himi  wrote:
> Hi,
>
> We are using 'gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu' as our 
> tool-chain.
> When I compile ODP1.15 with it I get a lot of:
> 'libtool: warning: library 
> '/home/userlab/work/crosstools/gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu/aarch64-linux-gnu/lib64/libatomic.la'
>  was moved.'
>
> The main problem is that we have another package that uses our ODP outcome 
> and it doesn't compile due to
> '/bin/sed: can't read 
> /home/tcwg-buildslave/workspace/tcwg-make-release/label/docker-trusty-amd64-tcwg/target/aarch64-linux-gnu/_build/builds/destdir/x86_64-unknown-linux-gnu/aarch64-linux-gnu/lib/../lib64/libatomic.la:
>  No such file or directory
> libtool:   error: 
> '/home/tcwg-buildslave/workspace/tcwg-make-release/label/docker-trusty-amd64-tcwg/target/aarch64-linux-gnu/_build/builds/destdir/x86_64-unknown-linux-gnu/aarch64-linux-gnu/lib/../lib64/libatomic.la'
>  is not a valid libtool archive'
>
> I notice that it is related to added lines (compared to ODP1.11) in 
> 'linux-generic/m4/configure.m4'.
> dnl Check whether -latomic is needed
> use_libatomic=no
>
> AC_MSG_CHECKING(whether -latomic is needed for 64-bit atomic built-ins)
> AC_LINK_IFELSE(
>   [AC_LANG_SOURCE([[
> static int loc;
> int main(void)
> {
> int prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
> return 0;
> }
> ]])],
>   [AC_MSG_RESULT(no)],
>   [AC_MSG_RESULT(yes)
>AC_CHECK_LIB(
>  [atomic], [__atomic_exchange_8],
>  [use_libatomic=yes],
>  [AC_MSG_FAILURE([__atomic_exchange_8 is not available])])
>   ])
>
> AC_MSG_CHECKING(whether -latomic is needed for 128-bit atomic built-ins)
> AC_LINK_IFELSE(
>   [AC_LANG_SOURCE([[
> static __int128 loc;
> int main(void)
> {
> __int128 prev;
> prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
> return 0;
> }
> ]])],
>   [AC_MSG_RESULT(no)],
>   [AC_MSG_RESULT(yes)
>AC_CHECK_LIB(
>  [atomic], [__atomic_exchange_16],
>  [use_libatomic=yes],
>  [AC_MSG_CHECKING([cannot detect support for 128-bit atomics])])
>   ])
>
> if test "x$use_libatomic" = "xyes"; then
>   ATOMIC_LIBS="-latomic"
> fi
> AC_SUBST([ATOMIC_LIBS])
>
>
> Is there anything I can do with 5.3.1 other than removing the new lines in 
> configure.m4?
>
> Thanks,
> Liron


Re: [lng-odp] Moving scalable scheduler to master

2017-10-12 Thread Brian Brooks
This code is primarily contained within its own files, so I don't see
how this mitigates any issues (merge conflicts) with merging it to
master.

On Wed, Oct 11, 2017 at 6:14 PM, Bill Fischofer
 wrote:
> I've looked over the code and the biggest issue surrounds the use of
> #ifdefs in open code. The issue is that the scheduler behaves
> significantly different based on whether it's running on AArch64 vs.
> other architectures. This means that code coverage is dependent on the
> target platform.
>
> From a modularization standpoint we're going to need to split this up
> so that the architecture paths are more dynamic.
>
> The key file for setting this up is
> platform/linux-generic/include/odp_schedule_scalable_config.h where
> the key lines are:
>
> /*
> * Split queue producer/consumer metadata into separate cache lines.
> * This is beneficial on e.g. Cortex-A57 but not so much on A53.
> */
> #define CONFIG_SPLIT_PRODCONS
> /*
> * Use locks to protect queue (ring buffer) and scheduler state updates
> * On x86, this decreases overhead noticeably.
> */
> #if !defined(__arm__) && !defined(__aarch64__)
> #define CONFIG_QSCHST_LOCK
> /* Keep all ring buffer/qschst data together when using locks */
> #undef CONFIG_SPLIT_PRODCONS
> #endif
>
> An example problematic section is:
>
> #ifndef CONFIG_QSCHST_LOCK
> /* The scheduler is the only entity that performs the dequeue from a queue. */
> static void
> sched_update_deq(sched_elem_t *q,
> uint32_t actual,
> bool atomic) __attribute__((always_inline));
> static inline void
> sched_update_deq(sched_elem_t *q,
> uint32_t actual, bool atomic)
> {
> qschedstate_t oss, nss;
> uint32_t ticket;
>
> ...
> }
> #endif
>
> #ifdef CONFIG_QSCHST_LOCK
> static void
> sched_update_deq_sc(sched_elem_t *q,
>   uint32_t actual,
>   bool atomic)
> __attribute__((always_inline));
> static inline void
> sched_update_deq_sc(sched_elem_t *q,
>  uint32_t actual, bool atomic)
> {
> qschedstate_t oss, nss;
> uint32_t ticket;
>
> ...
> }
> #endif
>
> Where two different routines are defined depending on the architecture
> and then the rest of the code refers to one or the other, again using
> #ifdefs:
>
> static inline void _schedule_release_atomic(sched_scalable_thread_state_t *ts)
> {
> #ifdef CONFIG_QSCHST_LOCK
> sched_update_deq_sc(ts->atomq, ts->dequeued, true);
> ODP_ASSERT(ts->atomq->qschst.cur_ticket != ts->ticket);
> ODP_ASSERT(ts->atomq->qschst.cur_ticket ==
>  ts->atomq->qschst.nxt_ticket);
> #else
> sched_update_deq(ts->atomq, ts->dequeued, true);
> #endif
> ts->atomq = NULL;
> ts->ticket = TICKET_INVALID;
> }
>
> In general, we shouldn't have name variants and this sort of
> conditional compilation. If sched_update_deq() should behave
> differently based on architecture then that should be contained within
> that routine rather than cascading into the main code. Ideally these
> variants should be factored into an arch subdirectory.
>
> #ifdefs also cause problems with maintenance because the compiler
> doesn't see all code paths (only those selected by the #ifdefs). So I
> may inadvertently make a change on one code path that breaks the other
> code path but I don't detect that.
>
> A way to avoid this problem is to replace the #ifdefs with inlined
> functions. For example instead of:
>
> #ifdef PREDICATE
> some code
> #else
> some other code
> #endif
>
> Using a function:
>
> if (predicate_function()) {
> some code
> } else {
> some other code
> }
>
> We can have predicate_function() evaluate to 0 or 1 statically based
> on the target architecture so there's no loss of efficiency compared
> to #ifdefs but all code is always compiled so a single compile pass
> can check all configuration variations for errors.
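
A minimal sketch of the predicate-function idea, assuming hypothetical
names throughout (config_qschst_lock(), sketch_stats_t and the two
stand-in update routines are not from the ODP tree). The preprocessor
test is confined to one small function; both variants are always
compiled and type-checked, and the optimizer can drop the untaken
branch because the predicate folds to a constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate replacing "#ifdef CONFIG_QSCHST_LOCK" at the
 * call sites. It evaluates to a compile-time constant. */
static inline bool config_qschst_lock(void)
{
#if !defined(__arm__) && !defined(__aarch64__)
	return true;
#else
	return false;
#endif
}

/* Stand-ins for the two variants; the real bodies are elided in the mail. */
typedef struct {
	uint32_t locked_updates;
	uint32_t lockfree_updates;
} sketch_stats_t;

static inline void update_deq_locked(sketch_stats_t *s)
{
	s->locked_updates++;
}

static inline void update_deq_lockfree(sketch_stats_t *s)
{
	s->lockfree_updates++;
}

/* Single entry point: callers no longer carry their own #ifdef. */
static inline void update_deq(sketch_stats_t *s)
{
	assert(s != NULL);
	if (config_qschst_lock())
		update_deq_locked(s);
	else
		update_deq_lockfree(s);
}
```

With this shape, an edit that breaks update_deq_lockfree() fails to
compile on x86 as well, even though that branch is never taken there.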


Re: [lng-odp] generic core + HW specific drivers

2017-10-03 Thread Brian Brooks
The approach taken by Vulkan project:
https://github.com/KhronosGroup/Vulkan-LoaderAndValidationLayers/blob/master/loader/LoaderAndLayerInterface.md

On Tue, Oct 3, 2017 at 6:52 PM, Dmitry Eremin-Solenikov
 wrote:
> On 03/10/17 11:12, Savolainen, Petri (Nokia - FI/Espoo) wrote:
>>> -----Original Message-----
>>> From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Ola
>>> Liljedahl
>>> Sent: Friday, September 29, 2017 8:47 PM
>>> To: lng-odp@lists.linaro.org
>>> Subject: [lng-odp] generic core + HW specific drivers
>>>
>>> olli@vubuntu:~$ dpkg --get-selections | grep xorg
>>> xorg install
>>> xorg-docs-core install
>>> xserver-xorg install
>>> xserver-xorg-core install
>>> xserver-xorg-input-all install
>>> xserver-xorg-input-evdev install
>>> xserver-xorg-input-libinput install
>>> xserver-xorg-input-synaptics install
>>> xserver-xorg-input-wacom install
>>> xserver-xorg-video-all install
>>> xserver-xorg-video-amdgpu install
>>> xserver-xorg-video-ati install
>>> xserver-xorg-video-fbdev install
>>> xserver-xorg-video-intel install
>>> xserver-xorg-video-mach64 install
>>> xserver-xorg-video-neomagic install
>>> xserver-xorg-video-nouveau install
>>> xserver-xorg-video-openchrome install
>>> xserver-xorg-video-qxl install
>>> xserver-xorg-video-r128 install
>>> xserver-xorg-video-radeon install
>>> xserver-xorg-video-savage install
>>> xserver-xorg-video-siliconmotion install
>>> xserver-xorg-video-sisusb install
>>> xserver-xorg-video-tdfx install
>>> xserver-xorg-video-trident install
>>> xserver-xorg-video-vesa install
>>> xserver-xorg-video-vmware install
>>>
>>> So let's rename ODP Cloud to ODP Core.
>>>
>>> -- Ola
>>
>>
>> DPDK packages in Ubuntu 17.05 (https://packages.ubuntu.com/artful/dpdk) 
>> include many HW dependent packages
>>
>> ...
>> librte-pmd-fm10k17.05 (= 17.05.2-0ubuntu1) [amd64, i386]  <<< Intel Red Rock 
>> Canyon net driver, provided only for x86
>> librte-pmd-i40e17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-ixgbe17.05 (= 17.05.2-0ubuntu1) [not ppc64el]
>> librte-pmd-kni17.05 (= 17.05.2-0ubuntu1) [not i386]
>> librte-pmd-lio17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-nfp17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-null-crypto17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-null17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-octeontx-ssovf17.05 (= 17.05.2-0ubuntu1)   <<< OcteonTX SSO 
>> eventdev driver files as a package
>> librte-pmd-pcap17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-qede17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-ring17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-sfc-efx17.05 (= 17.05.2-0ubuntu1) [amd64]
>> librte-pmd-skeleton-event17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-sw-event17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-tap17.05 (= 17.05.2-0ubuntu1)
>> librte-pmd-thunderx-nicvf17.05 (= 17.05.2-0ubuntu1)  <<< ThunderX net driver 
>> files as a package
>> ...
>>
>>
>> So, we should be able to deliver ODP as a set of HW independent and HW 
>> specific packages (libraries). For example, minimal install would include 
>> only odp, odp-linux and odp-test-suite, but when on arm64 (and especially 
>> when on ThunderX) odp-thunderx would be installed also. The trick would be 
>> how to select odp-thunderx installed files (also headers) as default when 
>> application is built or run on ThunderX (and not on another arm64).
>>
>> Package:
>> * odp (only generic folders and documentation, no implementation)
>>   * depends on: odp-linux, odp-test-suite
>>   * recommends: odp-linux, odp-dpdk, odp-thunderx, odp-dpaa2, ...
>> * odp-linux
>>   * depends on: odp, openssl
>>   * recommends: dpdk, netmap, ...
>> * odp-dpdk
>>   * depends on: odp, dpdk
>> * odp-thunderx [arm64]
>>   * depends on: odp, ...
>> * odp-test-suite
>>   * depends on: odp
>
> I suppose this would satisfy distribution requirements. Especially if we
> can merge all platforms to the same repo. It might be reasonable to
> optionally follow the model of OpenCL ICD (installable client drivers),
> when there is a frontend library moderating access to several backend
> libraries. Each driver can be built as a stand-alone libopencl or as
> ICD, being picked up by frontend libopencl1. Modular ODP might well be
> just one of these drivers, serving the case when single platform
> combines different hardware instances.
>
> --
> With best wishes
> Dmitry
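
The frontend/backend split discussed above could look roughly like the
following sketch. Everything here is hypothetical (the sketch_backend_t
interface, the probe functions and the backend names); it only shows a
frontend walking a priority-ordered list of backends and picking the
first one whose hardware is present, with a generic fallback last.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend interface: each platform library would export one. */
typedef struct {
	const char *name;
	int (*probe)(void); /* nonzero when this backend's hardware is present */
} sketch_backend_t;

static int probe_never(void)  { return 0; }
static int probe_always(void) { return 1; }

/* Backends in priority order: hardware-specific first, generic last. */
static const sketch_backend_t sketch_backends[] = {
	{ "hw-specific", probe_never  }, /* pretend the hardware is absent */
	{ "generic",     probe_always },
};

/* Frontend: return the first backend whose probe succeeds, or NULL. */
static const sketch_backend_t *sketch_select_backend(void)
{
	size_t n = sizeof(sketch_backends) / sizeof(sketch_backends[0]);

	for (size_t i = 0; i < n; i++) {
		assert(sketch_backends[i].probe != NULL);
		if (sketch_backends[i].probe())
			return &sketch_backends[i];
	}
	return NULL;
}
```

A real loader would discover backends at run time (for example from
installed shared libraries) rather than from a static table.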


Re: [lng-odp] Future of per-arch-platform ABI spec files

2017-09-20 Thread Brian Brooks
On 09/19 12:03:20, Dmitry Eremin-Solenikov wrote:
> Hello,
> 
> I have been poking around per-arch-platform ABI spec files.
> Currently all architectures just include default specs. Do we have any
> particular use case for these separate files? Otherwise I'd suggest to
> drop them completely and just leave 'default' in place.

include/odp/arch is ripe for deduplication. Feel free to add me to the
review if you take it up.

> -- 
> With best wishes
> Dmitry


Re: [lng-odp] Compiler Barrier API

2017-09-11 Thread Brian Brooks
On Sun, Sep 10, 2017 at 10:33 PM, Bill Fischofer
<bill.fischo...@linaro.org> wrote:
> Before we consider adding new synchronization APIs, we need a clearly
> defined use case for why a portable application would need this. ODP
> implementations may make use of such things based on their knowledge
> of the platform architecture, but ODP applications (should) have no
> such platform-specific dependencies. So we need a platform-independent
> use case.

There's a disconnect here. odp_compiler_internal.h implies
platform/xxx/include/odp_compiler_internal.h. That would be the
location to add a compiler barrier macro for implementation use only.

I think there's enough consensus in this thread to say that there's no
need to add an ODP API for a compiler barrier. This same rationale
could be extended to all shared memory synchronization algorithms
(deprecate them from ODP API) since they are not Control / Data Plane
objects and are implemented using atomic primitives (which are an
abstraction layer provided by the compiler). Perhaps they should be
moved to a odp-helper-sync library that depends on an
odp-helper-atomic library.
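
For reference, an implementation-internal compiler barrier is usually a
one-line macro. The sketch below assumes GCC/Clang inline assembly
syntax; the macro name and the publish() helper are hypothetical and
not part of ODP.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical implementation-internal macro (GCC/Clang syntax). It emits
 * no instruction; the "memory" clobber only stops the compiler from caching
 * memory values in registers or moving memory accesses across this point.
 * It does not constrain reordering done by the CPU. */
#define SKETCH_COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

/* Write data, then a flag, keeping the two stores in program order in the
 * generated code. Another CPU may still observe them out of order unless a
 * hardware barrier or a release store is used. */
static inline void publish(uint32_t *data, uint32_t *flag, uint32_t value)
{
	assert(data != flag);
	*data = value;
	SKETCH_COMPILER_BARRIER();
	*flag = 1;
}
```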

> On Sun, Sep 10, 2017 at 10:28 PM, Brian Brooks <brian.bro...@linaro.org> 
> wrote:
>> Hi Andriy,
>>
>> On Wed, Sep 6, 2017 at 12:24 PM, Andriy Berestovskyy
>> <andriy.berestovs...@caviumnetworks.com> wrote:
>>> Hey Brian,
>>> You are right, there are no compiler barriers on master, I just was on
>>> api-next branch:
>>>
>>> https://git.linaro.org/lng/odp.git/tree/platform/linux-generic/arch/arm/odp_atomic.h?h=api-next#n56
>>
>> How about a #define for compiler barrier in odp_compiler_internal.h?
>>
>>> Andriy
>>>
>>>
>>> On 05.09.2017 17:06, Brian Brooks wrote:
>>>>
>>>> I don't see a compiler barrier in the odp.git repo. Perhaps 'nop', but
>>>> this acts as more than a pure compiler barrier?
>>>>
>>>>
>>>> On Tue, Sep 5, 2017 at 8:23 AM, Andriy Berestovskyy
>>>> <andriy.berestovs...@caviumnetworks.com> wrote:
>>>>>
>>>>> Hey Petri,
>>>>>
>>>>> On 05.09.2017 14:17, Savolainen, Petri (Nokia - FI/Espoo) wrote:
>>>>>>
>>>>>>
>>>>>> I think compiler barrier is too weak for writing portable code, since it
>>>>>> does not guarantee that the CPU would not re-order the instructions.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> No, compiler barrier does not guarantee CPU order. Though, sometimes we
>>>>> do
>>>>> not need such a guarantee.
>>>>>
>>>>> Compiler barriers are in the same league with volatiles and provide the
>>>>> same
>>>>> weak guarantees: eventually the effect will be observed and code will not
>>>>> be
>>>>> optimized out by the compiler.
>>>>>
>>>>>
>>>>>
>>>>>> ODP implementation has ":::memory" in context of inline assembly
>>>>>> instructions, but that's when we are writing code against specific ISA
>>>>>> and
>>>>>> thus know which amount of synchronization is needed.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> The barrier is quite rare, so it might remain in form of asm volatile...
>>>>>
>>>>>
>>>>> Andriy


Re: [lng-odp] Compiler Barrier API

2017-09-10 Thread Brian Brooks
Hi Andriy,

On Wed, Sep 6, 2017 at 12:24 PM, Andriy Berestovskyy
<andriy.berestovs...@caviumnetworks.com> wrote:
> Hey Brian,
> You are right, there are no compiler barriers on master, I just was on
> api-next branch:
>
> https://git.linaro.org/lng/odp.git/tree/platform/linux-generic/arch/arm/odp_atomic.h?h=api-next#n56

How about a #define for compiler barrier in odp_compiler_internal.h?

> Andriy
>
>
> On 05.09.2017 17:06, Brian Brooks wrote:
>>
>> I don't see a compiler barrier in the odp.git repo. Perhaps 'nop', but
>> this acts as more than a pure compiler barrier?
>>
>>
>> On Tue, Sep 5, 2017 at 8:23 AM, Andriy Berestovskyy
>> <andriy.berestovs...@caviumnetworks.com> wrote:
>>>
>>> Hey Petri,
>>>
>>> On 05.09.2017 14:17, Savolainen, Petri (Nokia - FI/Espoo) wrote:
>>>>
>>>>
>>>> I think compiler barrier is too weak for writing portable code, since it
>>>> does not guarantee that the CPU would not re-order the instructions.
>>>
>>>
>>>
>>>
>>> No, compiler barrier does not guarantee CPU order. Though, sometimes we
>>> do
>>> not need such a guarantee.
>>>
>>> Compiler barriers are in the same league with volatiles and provide the
>>> same
>>> weak guarantees: eventually the effect will be observed and code will not
>>> be
>>> optimized out by the compiler.
>>>
>>>
>>>
>>>> ODP implementation has ":::memory" in context of inline assembly
>>>> instructions, but that's when we are writing code against specific ISA
>>>> and
>>>> thus know which amount of synchronization is needed.
>>>
>>>
>>>
>>>
>>> The barrier is quite rare, so it might remain in form of asm volatile...
>>>
>>>
>>> Andriy


Re: [lng-odp] Supporting ODP_PKTIO_OP_MT_SAFE

2017-09-10 Thread Brian Brooks
Honnappa,

Could your proposal be simplified to: MT-safe pktio should be
deprecated because it is not a common use case. Applications will
either use MT-unsafe pktio or the MT-safe scheduler.

> 1) Polling method - in which one pkt I/O will be created for each receive 
> worker thread. In this case, support for ODP_PKTIO_OP_MT_SAFE is not required.

Absence of MT-safe does not require 1:1 mapping of thread to pktio. It
just means that it is the application's responsibility to ensure
exclusive access to a pktio.
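
To make that concrete, here is a sketch of an application that shares
one non-MT-safe queue between threads by supplying the exclusion
itself. The types and functions are hypothetical stand-ins written with
C11 atomics, not ODP API calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a pktio queue opened without MT-safe support:
 * its receive function must never run on two threads at once. */
typedef struct {
	uint32_t next_seq;
} unsafe_rxq_t;

static uint32_t unsafe_rxq_recv(unsafe_rxq_t *q)
{
	return q->next_seq++;
}

/* Application-owned wrapper that provides the exclusive access. */
typedef struct {
	atomic_flag busy;
	unsafe_rxq_t q;
} shared_rxq_t;

static uint32_t shared_rxq_recv(shared_rxq_t *s)
{
	uint32_t seq;

	assert(s != NULL);
	while (atomic_flag_test_and_set_explicit(&s->busy,
						 memory_order_acquire))
		; /* spin until the current holder releases the queue */
	seq = unsafe_rxq_recv(&s->q);
	atomic_flag_clear_explicit(&s->busy, memory_order_release);
	return seq;
}
```

An application that dedicates one queue per thread needs no wrapper at
all, which is the case the polling method in the proposal describes.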

> for high throughput packet I/Os [..] we do not need to support 
> ODP_PKTIO_OP_MT_SAFE
> We could keep the support for ODP_PKTIO_OP_MT_SAFE for other pkt I/Os.

This would introduce an undesirable leaky abstraction.

BB

On Sun, Sep 10, 2017 at 12:40 PM, Bill Fischofer
 wrote:
> We can discuss this during tomorrow's ARCH call, and probably further
> at Connect. MT Safe is the default behavior and its opposite ("MT
> Unsafe") was added as a potential optimization when applications
> assure implementations that only a single thread will be polling a
> PktIn queue or adding to a Pktout queue.
>
> Ideally, we'd like to retire all application I/O polling and use the
> scheduler exclusively, but that's a longer-term goal. For now
> we have both.
>
> On Sun, Sep 10, 2017 at 8:11 AM, Honnappa Nagarahalli
>  wrote:
>> Hi,
>> I think there are 2 ways in which pkt I/O will be used:
>>
>> 1) Polling method - in which one pkt I/O will be created for each
>> receive worker thread. In this case, support for ODP_PKTIO_OP_MT_SAFE
>> is not required.
>> 2) Event method - the scheduler is used to receive packets. In this
>> case the scheduler will provide the exclusive access to a pkt I/O.
>> Again in this case support for ODP_PKTIO_OP_MT_SAFE is not required.
>>
>> I am thinking, for high throughput packet I/Os such as dpdk or ODP
>> native drivers (in the future), we do not need to support
>> ODP_PKTIO_OP_MT_SAFE. The odp_pktio_open API call can return an error
>> if ODP_PKTIO_OP_MT_SAFE is asked for.
>>
>> We could keep the support for ODP_PKTIO_OP_MT_SAFE for other pkt I/Os.
>>
>> This will save space in cache for the locks as well as instruction cycles.
>>
>> Thank you,
>> Honnappa


Re: [lng-odp] [PATCH] linux-gen: barrier: Use correct memory ordering

2017-09-06 Thread Brian Brooks
ping

On Sat, Aug 26, 2017 at 9:52 AM, Bill Fischofer
<bill.fischo...@linaro.org> wrote:
> On Sat, Aug 26, 2017 at 12:40 AM, Brian Brooks <brian.bro...@arm.com> wrote:
>
>> Memory accesses that happen-before, in program order, a call to
>> odp_barrier_wait() cannot be reordered to after the call. Similarly,
>> memory accesses that happen-after, in program order, a call to
>> odp_barrier_wait() cannot be reordered to before the call.
>>
>> The current implementation of barriers uses sequentially consistent
>> fences on either side of odp_barrier_wait().
>>
>> The correct memory ordering for barriers is release upon entering
>> odp_barrier_wait(), to prevent reordering to after the barrier, and
>> acquire upon exiting odp_barrier_wait(), to prevent reordering to
>> before the barrier.
>>
>> The measurable performance difference is negligible on weakly ordered
>> architectures such as ARM, so the highlight of this change is correctness.
>>
>> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
>>
>
> Reviewed-by: Bill Fischofer <bill.fischo...@linaro.org>
>
>
>> ---
>>  platform/linux-generic/odp_barrier.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/platform/linux-generic/odp_barrier.c
>> b/platform/linux-generic/odp_barrier.c
>> index 5eb354de..f70bdbf8 100644
>> --- a/platform/linux-generic/odp_barrier.c
>> +++ b/platform/linux-generic/odp_barrier.c
>> @@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>> uint32_t count;
>> int wasless;
>>
>> -   odp_mb_full();
>> +   odp_mb_release();
>>
>> count   = odp_atomic_fetch_inc_u32(&barrier->bar);
>> wasless = count < barrier->count;
>> @@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
>> odp_cpu_pause();
>> }
>>
>> -   odp_mb_full();
>> +   odp_mb_acquire();
>>  }
>> --
>> 2.14.1
>>
>>


Re: [lng-odp] Compiler Barrier API

2017-09-05 Thread Brian Brooks
I don't see a compiler barrier in the odp.git repo. Perhaps 'nop', but
this acts as more than a pure compiler barrier?


On Tue, Sep 5, 2017 at 8:23 AM, Andriy Berestovskyy
 wrote:
> Hey Petri,
>
> On 05.09.2017 14:17, Savolainen, Petri (Nokia - FI/Espoo) wrote:
>>
>> I think compiler barrier is too weak for writing portable code, since it
>> does not guarantee that the CPU would not re-order the instructions.
>
>
>
> No, compiler barrier does not guarantee CPU order. Though, sometimes we do
> not need such a guarantee.
>
> Compiler barriers are in the same league with volatiles and provide the same
> weak guarantees: eventually the effect will be observed and code will not be
> optimized out by the compiler.
>
>
>
>> ODP implementation has ":::memory" in context of inline assembly
>> instructions, but that's when we are writing code against specific ISA and
>> thus know which amount of synchronization is needed.
>
>
>
> The barrier is quite rare, so it might remain in form of asm volatile...
>
>
> Andriy


[lng-odp] [PATCH] linux-gen: barrier: Use correct memory ordering

2017-08-25 Thread Brian Brooks
Memory accesses that happen-before, in program order, a call to
odp_barrier_wait() cannot be reordered to after the call. Similarly,
memory accesses that happen-after, in program order, a call to
odp_barrier_wait() cannot be reordered to before the call.

The current implementation of barriers uses sequentially consistent
fences on either side of odp_barrier_wait().

The correct memory ordering for barriers is release upon entering
odp_barrier_wait(), to prevent reordering to after the barrier, and
acquire upon exiting odp_barrier_wait(), to prevent reordering to
before the barrier.

The measurable performance difference is negligible on weakly ordered
architectures such as ARM, so the highlight of this change is correctness.

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 platform/linux-generic/odp_barrier.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/platform/linux-generic/odp_barrier.c 
b/platform/linux-generic/odp_barrier.c
index 5eb354de..f70bdbf8 100644
--- a/platform/linux-generic/odp_barrier.c
+++ b/platform/linux-generic/odp_barrier.c
@@ -34,7 +34,7 @@ void odp_barrier_wait(odp_barrier_t *barrier)
uint32_t count;
int wasless;
 
-   odp_mb_full();
+   odp_mb_release();
 
count   = odp_atomic_fetch_inc_u32(&barrier->bar);
wasless = count < barrier->count;
@@ -48,5 +48,5 @@ void odp_barrier_wait(odp_barrier_t *barrier)
odp_cpu_pause();
}
 
-   odp_mb_full();
+   odp_mb_acquire();
 }
-- 
2.14.1
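
The release-on-entry, acquire-on-exit ordering described in the commit
message can be sketched with C11 atomics as follows. This is a
simplified, hypothetical one-shot barrier (it cannot be reused, unlike
odp_barrier_t), and the sketch_* names are not ODP functions; the
fences stand in for odp_mb_release() and odp_mb_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical one-shot barrier; it cannot be waited on a second time. */
typedef struct {
	uint32_t count;  /* number of threads that must arrive */
	atomic_uint bar; /* number of threads that have arrived */
} sketch_barrier_t;

static void sketch_barrier_init(sketch_barrier_t *b, uint32_t count)
{
	assert(count > 0);
	b->count = count;
	atomic_init(&b->bar, 0);
}

static void sketch_barrier_wait(sketch_barrier_t *b)
{
	/* Release: earlier accesses cannot be reordered to after arrival. */
	atomic_thread_fence(memory_order_release);

	atomic_fetch_add_explicit(&b->bar, 1, memory_order_relaxed);
	while (atomic_load_explicit(&b->bar, memory_order_relaxed) < b->count)
		; /* spin until every thread has arrived */

	/* Acquire: later accesses cannot be reordered to before departure. */
	atomic_thread_fence(memory_order_acquire);
}
```

The counter operations themselves are relaxed; the two fences carry the
ordering, so writes made before the barrier by any thread are visible
to every thread after it.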



Re: [lng-odp] merge master to api-next

2017-07-27 Thread Brian Brooks
On 07/27 14:04:24, Maxim Uvarov wrote:
> Looks like everything is fine:
> 
> https://github.com/Linaro/odp/pull/71
> 
> Brian do you want to add your review to final patch?  If no more objection
> I'm merging this and will work on merging other patches in the queue.

For ca08c5c810, Signed-off-by: Brian Brooks <brian.bro...@arm.com>

> Maxim.
> 
> 
> 
> On 26 July 2017 at 22:33, Maxim Uvarov <maxim.uva...@linaro.org> wrote:
> 
> > Updated pull request. Let's see what test will show.
> >
> > Maxim.
> >
> > On 07/26/17 22:19, Dmitry Eremin-Solenikov wrote:
> > > On 26/07/17 22:04, Bill Fischofer wrote:
> > >> I guess another question is why doesn't this issue show up in api-next
> > >> without this patch series or in master by itself?
> > >
> > > Because we do not (yet) have cross-compilation testing in api-next in
> > > Travis. And master branch doesn't have this code also.
> > >
> > >>
> > >> On Wed, Jul 26, 2017 at 1:55 PM, Dmitry Eremin-Solenikov
> > >> <dmitry.ereminsoleni...@linaro.org
> > >> <mailto:dmitry.ereminsoleni...@linaro.org>> wrote:
> > >>
> > >> Brian,
> > >>
> > >>
> > >> On 20 July 2017 at 00:23, Brian Brooks <brian.bro...@arm.com
> > >> <mailto:brian.bro...@arm.com>> wrote:
> > >> > On 07/26 10:29:41, Dmitry Eremin-Solenikov wrote:
> > >> >> On 26/07/17 01:00, Maxim Uvarov wrote:
> > >> >> > Merge request is:
> > >> >> >
> > >> >> > https://github.com/Linaro/odp/pull/71
> > >> <https://github.com/Linaro/odp/pull/71>
> > >> >> >
> > >> >> > 2 arm build fails with errors.
> > >> >>
> > >> >> Both are due to scalable scheduler code.
> > >> >
> > >> > Neither are due to scalable scheduler. They are actually due to
> > compiling
> > >> > an ARMv8 instruction into the binary when the compiler is
> > targeting ARMv7a.
> > >>
> > >> Sorry, this looked to me like the code that went in with the
> > >> scheduler. Please
> > >> excuse me.
> > >>
> > >> > I suggest we enable CONFIG_WFE only for ARMv8:
> > >>
> > >> Hmm. Wasn't this code also tested for ARMv7?
> > >>
> > >> >   diff --git a/platform/linux-generic/arch/arm/odp_cpu.h
> > b/platform/linux-generic/arch/arm/odp_cpu.h
> > >> >   index 8ef50da4..72c81020 100644
> > >> >   --- a/platform/linux-generic/arch/arm/odp_cpu.h
> > >> >   +++ b/platform/linux-generic/arch/arm/odp_cpu.h
> > >> >   @@ -38,7 +38,9 @@
> > >> > * more scalable) and enables the CPU to enter a sleep state
> > (lower power
> > >> > * consumption).
> > >> > */
> > >> >   +#ifdef __aarch64__
> > >> >#define CONFIG_WFE
> > >> >   +#endif
> > >> >
> > >> >static inline void dmb(void)
> > >> >{
> > >> >
> > >> > Can you test with this?
> > >>
> > >> I did not test it, but hopefully it should fix the issue.
> > >>
> > >> >
> > >> >> >
> > >> >> > 1.
> > >> >> >
> > >> >> > https://travis-ci.org/muvarov/odp/jobs/257478037
> > >> <https://travis-ci.org/muvarov/odp/jobs/257478037>
> > >> >> >   CC   odp_rwlock.lo
> > >> >> > :1:2: error: instruction requires: armv8
> > >> >> > sevl
> > >> >> >
> > >> >> > 2.
> > >> >> >
> > >> >> > https://travis-ci.org/muvarov/odp/jobs/257478036
> > >> <https://travis-ci.org/muvarov/odp/jobs/257478036>
> > >> >> >   CC   odp_schedule.lo
> > >> >> > /tmp/ccRdUDmO.s: Assembler messages:
> > >> >> > /tmp/ccRdUDmO.s:1281: Error: selected processor does not
> > >> support Thumb
> > >> >> > mode `sevl'
> > >> >> > /tmp/ccRdUDmO.s:2306: Error: selected processor does not
> > >> support Thumb
> > >> >> > mode `sevl'
> > >> >> > /tmp/ccRdUDmO.s:2481: Error: selected processor does not
> > >> support Thumb
> > >> >> > mode `sevl'
> > >> >> > /tmp/ccRdUDmO.s:3217: Error: selected processor does not
> > >> support Thumb
> > >> >> > mode `sevl'
> > >> >> > /tmp/ccRdUDmO.s:3635: Error: selected processor does not
> > >> support Thumb
> > >> >> > mode `sevl'
> > >> >> > /tmp/ccRdUDmO.s:3873: Error: selected processor does not
> > >> support Thumb
> > >> >> > mode `sevl'
> > >> >> > make[1]: *** [odp_queue_scalable.lo] Error 1
> > >> >> > make[1]: *** Waiting for unfinished jobs
> > >> >> > make[1]: Leaving directory
> > >> >> > `/home/travis/build/muvarov/odp/platform/linux-generic'
> > >> >> > make: *** [all-recursive] Error 1
> > >>
> > >> --
> > >> With best wishes
> > >> Dmitry
> > >>
> > >>
> > >
> > >
> >
> >


Re: [lng-odp] merge master to api-next

2017-07-26 Thread Brian Brooks
On 07/26 10:29:41, Dmitry Eremin-Solenikov wrote:
> On 26/07/17 01:00, Maxim Uvarov wrote:
> > Merge request is:
> > 
> > https://github.com/Linaro/odp/pull/71
> > 
> > 2 arm build fails with errors.
> 
> Both are due to scalable scheduler code.

Neither are due to scalable scheduler. They are actually due to compiling
an ARMv8 instruction into the binary when the compiler is targeting ARMv7a.

I suggest we enable CONFIG_WFE only for ARMv8:

  diff --git a/platform/linux-generic/arch/arm/odp_cpu.h 
b/platform/linux-generic/arch/arm/odp_cpu.h
  index 8ef50da4..72c81020 100644
  --- a/platform/linux-generic/arch/arm/odp_cpu.h
  +++ b/platform/linux-generic/arch/arm/odp_cpu.h
  @@ -38,7 +38,9 @@
* more scalable) and enables the CPU to enter a sleep state (lower power
* consumption).
*/
  +#ifdef __aarch64__
   #define CONFIG_WFE
  +#endif

   static inline void dmb(void)
   {

Can you test with this?
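
For context, a sketch of the kind of wait loop such a guard protects,
assuming GCC/Clang inline assembly. The sketch_* names are hypothetical
and this is not the code in odp_cpu.h; it only shows the SEVL/WFE path
being compiled for AArch64 and a plain spin used everywhere else,
including ARMv7-A.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__aarch64__)
/* ARMv8-A only: SEVL does not exist on ARMv7-A, hence the guard. */
static inline void sketch_sevl(void)
{
	__asm__ volatile("sevl" ::: "memory");
}

static inline void sketch_wfe(void)
{
	__asm__ volatile("wfe" ::: "memory");
}

/* Load-acquire exclusive: arms the exclusive monitor so that a later
 * store to *addr generates the event that wakes WFE. */
static inline uint32_t sketch_ll32(uint32_t *addr)
{
	uint32_t val;

	__asm__ volatile("ldaxr %w0, [%1]"
			 : "=&r"(val) : "r"(addr) : "memory");
	return val;
}
#endif

/* Block until *addr equals expected. */
static inline void sketch_wait_until_equal(uint32_t *addr, uint32_t expected)
{
	assert(addr != 0);
#if defined(__aarch64__)
	sketch_sevl(); /* make the first WFE fall through */
	do {
		sketch_wfe();
	} while (sketch_ll32(addr) != expected);
#else
	while (__atomic_load_n(addr, __ATOMIC_ACQUIRE) != expected)
		; /* plain spin on targets without the WFE path */
#endif
}
```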

> > 
> > 1.
> > 
> > https://travis-ci.org/muvarov/odp/jobs/257478037
> >   CC   odp_rwlock.lo
> > :1:2: error: instruction requires: armv8
> > sevl
> > 
> > 2.
> > 
> > https://travis-ci.org/muvarov/odp/jobs/257478036
> >   CC   odp_schedule.lo
> > /tmp/ccRdUDmO.s: Assembler messages:
> > /tmp/ccRdUDmO.s:1281: Error: selected processor does not support Thumb
> > mode `sevl'
> > /tmp/ccRdUDmO.s:2306: Error: selected processor does not support Thumb
> > mode `sevl'
> > /tmp/ccRdUDmO.s:2481: Error: selected processor does not support Thumb
> > mode `sevl'
> > /tmp/ccRdUDmO.s:3217: Error: selected processor does not support Thumb
> > mode `sevl'
> > /tmp/ccRdUDmO.s:3635: Error: selected processor does not support Thumb
> > mode `sevl'
> > /tmp/ccRdUDmO.s:3873: Error: selected processor does not support Thumb
> > mode `sevl'
> > make[1]: *** [odp_queue_scalable.lo] Error 1
> > make[1]: *** Waiting for unfinished jobs
> > make[1]: Leaving directory
> > `/home/travis/build/muvarov/odp/platform/linux-generic'
> > make: *** [all-recursive] Error 1
> > 
> 
> 
> -- 
> With best wishes
> Dmitry


Re: [lng-odp] [API-NEXT PATCHv7 1/4] api: timer: add odp_timer_capability() api

2017-07-24 Thread Brian Brooks
Reviewed-by: Brian Brooks <brian.bro...@arm.com>

On Wed, Jul 19, 2017 at 10:43 PM, Kevin Wang <kevin.w...@arm.com> wrote:
> Currently, user needs to decide the timer resolution before creating
> a timer pool. But sometimes it will cause timer overrun as the system
> can't support such high resolution.
> So a new API is required to expose the timer capability to the user.
>
> Signed-off-by: Kevin Wang <kevin.w...@arm.com>
> ---
>  include/odp/api/spec/timer.h | 27 +++
>  1 file changed, 27 insertions(+)
>
> diff --git a/include/odp/api/spec/timer.h b/include/odp/api/spec/timer.h
> index 75f9db9..b76f565 100644
> --- a/include/odp/api/spec/timer.h
> +++ b/include/odp/api/spec/timer.h
> @@ -108,6 +108,33 @@ typedef struct {
>  } odp_timer_pool_param_t;
>
>  /**
> + * Timer capability
> + */
> +typedef struct {
> +   /** Highest timer resolution in nanoseconds.
> +*
> +*  This defines the highest resolution supported by a timer.
> +*  It's the minimum valid value for 'res_ns' timer pool
> +*  parameter.
> +*/
> +   uint64_t highest_res_ns;
> +} odp_timer_capability_t;
> +
> +/**
> + * Query timer capabilities
> + *
> + * Outputs timer capabilities on success.
> + *
> + * @param  clk_src  Clock source for timers
> + * @param[out] capa Pointer to capability structure for output
> + *
> + * @retval 0 on success
> + * @retval <0 on failure
> + */
> +int odp_timer_capability(odp_timer_clk_src_t clk_src,
> +odp_timer_capability_t *capa);
> +
> +/**
>   * Create a timer pool
>   *
>   * The use of pool name is optional. Unique names are not required.
> --
> 2.7.4
>


Re: [lng-odp] [PATCH v2 1/3] example: ipfragaddress: fix compilation with clang

2017-07-14 Thread Brian Brooks
On 07/12 16:00:04, Github ODP bot wrote:
> From: Dmitry Eremin-Solenikov 
> 
> Clang 3.8 is stricter than GCC wrt register allocation vs 128-bit
> variables. Sometimes it can not understand using 128-bit var in place of
> 64-bit register resulting in the following errors:
> 
> /odp_ipfragreass_atomics_arm.h:18:51: error: value size does not match
> register
>   size specified by the constraint and modifier
>   [-Werror,-Wasm-operand-widths]
> __asm__ volatile("ldaxp %0, %H0, [%1]" : "=" (old)
> ^
> ./odp_ipfragreass_atomics_arm.h:18:27: note: use constraint modifier "w"
> __asm__ volatile("ldaxp %0, %H0, [%1]" : "=" (old)
> 
> Explicitly pass low and high parts of 128-bit variable in separate
> assembly parameters.
> 
> Signed-off-by: Dmitry Eremin-Solenikov 
> ---
> /** Email created from pull request 73 (lumag:cross-2)
>  ** https://github.com/Linaro/odp/pull/73
>  ** Patch: https://github.com/Linaro/odp/pull/73.patch
>  ** Base sha: 7fc6d27e937b57b31360b07028388c811f8300dc
>  ** Merge commit sha: 3dd283d7e61ba21beed2607ef200584754f22d76
>  **/
>  example/ipfragreass/odp_ipfragreass_atomics_arm.h | 19 +--
>  1 file changed, 13 insertions(+), 6 deletions(-)
> 
> diff --git a/example/ipfragreass/odp_ipfragreass_atomics_arm.h 
> b/example/ipfragreass/odp_ipfragreass_atomics_arm.h
> index 99c37a77..94ccd00e 100644
> --- a/example/ipfragreass/odp_ipfragreass_atomics_arm.h
> +++ b/example/ipfragreass/odp_ipfragreass_atomics_arm.h
> @@ -13,26 +13,33 @@
>  static inline __int128 lld(__int128 *var, int mo)
>  {
>   __int128 old;
> + uint64_t lo, hi;
>  
>   if (mo == __ATOMIC_ACQUIRE)
> - __asm__ volatile("ldaxp %0, %H0, [%1]" : "=" (old)
> + __asm__ volatile("ldaxp %0, %1, [%2]" : "=r" (lo), "=r" (hi)

The LDAXP instruction performs two loads into two 64-bit registers. The order
of these loads is not specified, so either of the output operands can be
written to before the instruction finishes. So, the early clobber modifier '&'
should be used not just on the first output operand (lo) but also the last (hi).

Please see platform/linux-generic/arch/arm/odp_llsc.h for reference.
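
A minimal sketch of what I mean, assuming little-endian AArch64 and a
compiler with __int128 support. lld_sketch() is a made-up name, not the
function in the patch or in odp_llsc.h, and the non-AArch64 branch exists
only so the snippet compiles elsewhere:

```c
#include <stdint.h>

/* Hypothetical helper: acquire load-exclusive of a 128-bit value with
 * BOTH outputs marked early clobber. */
static inline __int128 lld_sketch(__int128 *var)
{
	uint64_t lo, hi;

#if defined(__aarch64__)
	/* "=&r" on lo and hi: neither output may share a register with the
	 * address input, because either half may be written first. */
	__asm__ volatile("ldaxp %0, %1, [%2]"
			 : "=&r" (lo), "=&r" (hi)
			 : "r" (var)
			 : "memory");
#else
	/* Fallback for other targets: a plain, non-exclusive load. */
	lo = (uint64_t)*var;
	hi = (uint64_t)(*var >> 64);
#endif
	/* Reassemble through an unsigned type to avoid shifting into the
	 * sign bit. */
	return (__int128)(((unsigned __int128)hi << 64) | lo);
}
```

Without '&' on hi, the compiler is free to allocate hi to the same register
that holds var.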

>: "r" (var) : "memory");
>   else /* mo == __ATOMIC_RELAXED */
> - __asm__ volatile("ldxp %0, %H0, [%1]" : "=" (old)
> + __asm__ volatile("ldxp %0, %1, [%2]" : "=r" (lo), "=r" (hi)
>: "r" (var) : );
> + old = hi;
> + old <<= 64;
> + old |= lo;
> +
>   return old;
> +
>  }
>  
>  static inline uint32_t scd(__int128 *var, __int128 neu, int mo)
>  {
>   uint32_t ret;
> + uint64_t lo = neu, hi = neu >> 64;
>  
>   if (mo == __ATOMIC_RELEASE)
> - __asm__ volatile("stlxp %w0, %1, %H1, [%2]" : "=" (ret)
> -  : "r" (neu), "r" (var) : "memory");
> + __asm__ volatile("stlxp %w0, %1, %2, [%3]" : "=" (ret)
> +  : "r" (lo), "r" (hi), "r" (var) : "memory");
>   else /* mo == __ATOMIC_RELAXED */
> - __asm__ volatile("stxp %w0, %1, %H1, [%2]" : "=" (ret)
> -  : "r" (neu), "r" (var) : );
> + __asm__ volatile("stxp %w0, %1, %2, [%3]" : "=" (ret)
> +  : "r" (lo), "r" (hi), "r" (var) : "memory");
>   return ret;
>  }
>  
> 


Re: [lng-odp] questions about buffer allocation in linux-generic

2017-07-11 Thread Brian Brooks
On 07/07 19:23:10, Bill Fischofer wrote:
> On Fri, Jul 7, 2017 at 4:55 PM, Brian Brooks <brian.bro...@linaro.org> wrote:
> > Why is a buffer's "user area" not adjacent (virtually) to the buffer
> > header itself?
> > If by design, then a simpler way is to let the user manage that memory
> > and pass a pointer that gets associated with the buffer (context,
> > usr_ptr, cookie, ...).
> 
> ODP does not mandate where any metadata is with respect to the rest of
> a packet since that would impose undesirable restrictions on
> implementation flexibility, especially for platforms that make use of
> HW accelerators. In such implementations the main packet data may be
> managed by HW while the metadata is handled by SW, so questions of
> adjacency are left up to the implementation.
> 
> In odp-linux, since this area is optional and variable sized, there's
> no way to make it adjacent and still have compile-time access to
> various internal packet fields that can be held at fixed offsets.

The fixed-sized user area per buffer can be adjacent to the remaining
buffer headers, even if the pool is dynamically allocated. This is a
simpler form of implementing buffer alignment. You store an offset or
a pointer to where the object starts.
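
Roughly what I have in mind, as a sketch only. The struct and function names
are invented for illustration and are not the linux-generic types; align is
assumed to be a power of two:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-buffer header that records where its user area starts. */
struct buf_hdr_sketch {
	uint32_t uarea_off;  /* byte offset from header start to user area */
	uint32_t uarea_size; /* fixed user area size for this pool */
};

/* Round n up to the next multiple of align (align must be a power of 2). */
static size_t align_up(size_t n, size_t align)
{
	return (n + align - 1) & ~(align - 1);
}

/* Place the user area directly after the header and return the number of
 * bytes needed for header plus user area. */
static size_t buf_layout(struct buf_hdr_sketch *hdr, size_t uarea_size,
			 size_t align)
{
	size_t off = align_up(sizeof(*hdr), align);

	hdr->uarea_off = (uint32_t)off;
	hdr->uarea_size = (uint32_t)uarea_size;
	return off + uarea_size;
}

/* Finding the user area is one add on the header pointer. */
static void *buf_uarea(struct buf_hdr_sketch *hdr)
{
	return (uint8_t *)hdr + hdr->uarea_off;
}
```

The header stays at a fixed offset, so compile-time access to header fields
is unaffected; only the user area position is computed at pool creation.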

Was there a reason why this design (not placing the user area adjacent
to the buffer itself) was chosen during implementation?

> >
> > Why are the sizes here rounded up to the nearest cache line?
> > Alignment bytes are allocated, which is enough to place the object at
> > the requested alignment.
> 
> Keeping this area in its own cache line(s) means that this optional
> area doesn't impinge on the cache footprint consumed by the rest of
> the header. This can be fine-tuned if measurements show that this
> decision is sub-optimal.
> 
> >
> > sizeof(odp_packet_hdr_t) bytes are allocated per buffer regardless of
> > the pool type.
> 
> Currently we don't support optional metadata that's not in the main
> odp_packet_hdr_t. As we refine the odp_mbuf definition we may want to
> split out optional metadata areas (e.g., in support of event sub-types
> like ODP_PACKET_IPSEC, etc.). That's a tuning detail.

We want to allocate a timeout header adjacent to the buffer header for
timeout pools. A timeout header is much smaller than a packet header.
Currently, no matter what pool type, we always place a packet header
below the buffer header. This is design, not tuning.

> >
> > odp_pool_t opaque type in linux-generic can be used as a pointer to
> > pool_t instead of the u32 pool id.  This can save a level of
> > indirection by avoiding looking up the pool in the array of pools.
> > Opaque pointers are sizeof(void *), so it is not saving any space to
> > store a u32 there.
> 
> This is one of the changes Petri is introducing in his latest buffer
> restructure patch. Again, these are tuning optimizations since the API
> remains opaque.

Right, these are not API questions.


[lng-odp] questions about buffer allocation in linux-generic

2017-07-07 Thread Brian Brooks
Why is a buffer's "user area" not adjacent (virtually) to the buffer
header itself?
If by design, then a simpler way is to let the user manage that memory
and pass a pointer that gets associated with the buffer (context,
usr_ptr, cookie, ...).

Why are the sizes here rounded up to the nearest cache line?
Alignment bytes are allocated, which is enough to place the object at
the requested alignment.

sizeof(odp_packet_hdr_t) bytes are allocated per buffer regardless of
the pool type.

odp_pool_t opaque type in linux-generic can be used as a pointer to
pool_t instead of the u32 pool id.  This can save a level of
indirection by avoiding looking up the pool in the array of pools.
Opaque pointers are sizeof(void *), so it is not saving any space to
store a u32 there.


Re: [lng-odp] [PATCH] doc: userguide: add portability and usage info for odp time apis

2017-06-29 Thread Brian Brooks
On 06/29 16:21:47, Maxim Uvarov wrote:
> Hello Bill,
> 
> patch is good. Please see my notes below, which I think are reasonable.
> 
> 
> On 02/14/17 01:47, Bill Fischofer wrote:
> > Clarify and expand on portability and performance considerations
> > regarding the use of the ODP time APIs in fulfillment of JIRA
> 
> fulfilment
> 
> > issue https://projects.linaro.org/browse/ODP-575
> > 
> > Signed-off-by: Bill Fischofer 
> > ---
> >  doc/users-guide/users-guide.adoc | 32 +---
> >  1 file changed, 21 insertions(+), 11 deletions(-)
> > 
> > diff --git a/doc/users-guide/users-guide.adoc 
> > b/doc/users-guide/users-guide.adoc
> > index 41c57d1c..05bade8c 100755
> > --- a/doc/users-guide/users-guide.adoc
> > +++ b/doc/users-guide/users-guide.adoc
> > @@ -362,31 +362,41 @@ PktIOs are represented by handles of abstract type 
> > `odp_pktio_t`.
> >  
> >  === Time
> >  The time API is used to measure time intervals and track time flow of an
> > -application and presents a convenient way to get access to a time source.
> > -The time API consists of two main parts: local time API and global time 
> > API.
> > +application and presents a convenient way to get access to an
> > +implementation-defined time source. The time API consists of two main 
> > parts:
> > +local time API and global time API.
> >  
> >   Local time
> > -The local time API is designed to be used within one thread and can be 
> > faster
> > -than the global time API. The local time API cannot be used between 
> > threads as
> > -time consistency is not guaranteed, and in some cases that's enough.
> > -So, local time stamps are local to the calling thread and must not be 
> > shared
> > -with other threads. Current local time can be read with `odp_time_local()`.
> > +The local time API is designed to be used within one thread and obtaining
> > +local time may be more efficient in some implementations than global
> > +time.
> 
> 
> The local time API is designed to be used within one odp worker which is
> bound to a specific cpu core.
> 
> 
> > Local time stamps are local to the calling thread and should not be
> > +shared with other threads, as local time is not guaranteed to be consistent
> > +between threads. Current local time can be read with `odp_time_local()`.
> >  
> >   Global time
> >  The global time API is designed to be used for tracking time between 
> > threads.
> > -So, global time stamps can be shared between threads. Current global time 
> > can
> > -be read with `odp_time_global()`.
> > +So, global time stamps may safely be shared between threads. Current global
> > +time can be read with `odp_time_global()`.
> 
> also odp workers and control threads.
> 
> >  
> > -Both, local and global time is not wrapped during the application life 
> > cycle.
> > +Both local and global time is not wrapped during the application life 
> > cycle.
> >  The time API includes functions to operate with time, such as 
> > `odp_time_diff()`,
> >  `odp_time_sum()`, `odp_time_cmp()`, conversion functions like
> >  `odp_time_to_ns()`, `odp_time_local_from_ns()`, 
> > `odp_time_global_from_ns()`.
> >  To get rate of time source `odp_time_local_res()`, `odp_time_global_res()`
> >  are used. To wait, `odp_time_wait_ns()` and `odp_time_wait_until()` are 
> > used,
> > -during witch a thread potentially busy loop the entire wait time.
> > +during which a thread potentially busy loops the entire wait time.
> >  
> >  The `odp_time_t` opaque type represents local or global timestamps.
> >  
> > + Portability Considerations
> > +The ODP Time APIs are designed to permit high-precision relative time
> > +measurement within an ODP application. No attempt is made to correlate an
> > +`odp_time_t` object with "wall time" or any other external time reference.
> > +As defined by the ODP specification, `odp_time_t` values are required to
> > +be unique over a span of at least 10 years. Most implementations will 
> > choose
> > +to implement time values using 64-bit values, whose wrap times exceed 500
> > +years, making wrapping concerns not relevant to ODP applications.
> > +
> 
> Yes that is good addition. That means that odp time is not adjusted (ptp
> or ntp) and can not go backwards. I.e. each next call you will get
> updated incremented value. No time adjustments.

I was confused by the bit about no correlation with wall time. What are the
Time APIs good for then? ;) If it is more accurate to say something about
clock drift, synchronization, thread vs. cpu time then that would be much 
better!

> Best regards,
> Maxim.
> 
> >  === Timer
> >  Timers are how ODP applications measure and respond to the passage of time.
> >  Timers are drawn from specialized pools called timer pools that have their
> > 
> 


Re: [lng-odp] [PATCHv2] linux-gen: scheduler: modular scheduler interface

2017-06-29 Thread Brian Brooks
On 06/29 12:08:49, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> 
> 
> > -Original Message-
> > From: Brian Brooks [mailto:brian.bro...@arm.com]
> > Sent: Wednesday, June 28, 2017 5:17 PM
> > To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com>
> > Cc: Joyce Kong <joyce.k...@arm.com>; lng-odp@lists.linaro.org
> > Subject: Re: [lng-odp] [PATCHv2] linux-gen: scheduler: modular scheduler
> > interface
> > 
> > On 06/28 07:24:08, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> > >
> > >
> > > > -Original Message-
> > > > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> > Joyce
> > > > Kong
> > > > Sent: Wednesday, June 28, 2017 5:14 AM
> > > > To: lng-odp@lists.linaro.org
> > > > Cc: Joyce Kong <joyce.k...@arm.com>
> > > > Subject: [lng-odp] [PATCHv2] linux-gen: scheduler: modular scheduler
> > > > interface
> > > >
> > > > The modular scheduler interface in odp_schedule_if.h includes
> > functions
> > > > from pktio and queue. It needs to be cleaned out. The pktio/queue
> > related
> > > > functions should be moved to pktio/queue internal header file.
> > >
> > > Sched_cb_xxx() functions are the interface from a scheduler towards
> > other parts of the system. So, those calls are not in the scheduler
> > interface file by mistake.
> > 
> > Generally speaking, function declarations should exist in the .h
> > file of the .c file that contains the definitions. These functions
> > are defined in queue.c and called from schedule.c. The declarations
> > should be in queue.h. It can be as simple as that.
> 
> We try to enforce a tight scheduler interface (== currently 
> odp_schedule_if.h). I decided to create only one file for simplicity. There 
> could be additional files to define each output direction *interface* from a 
> scheduler (queue_if_for_scheduler.h, pktio_if_for_scheduler.h, 
> timer_if_for_scheduler.h) which includes only those functions that a 
> scheduler may use. Moving things back to odp_xxx_internal.h would bring back 
> the problem of clearly defining what functions (or types) a scheduler may 
> use. Without clear interface file, each developer gets creative and access 
> different parts of the system from different schedulers with different (e.g. 
> locking) expectations ... and we are back in the original mess.

It doesn't need to be that tight, controlled, or unfamiliar.

  sched_cb_num_queues()
  sched_cb_queue_prio()
  sched_cb_queue_grp()
  sched_cb_is_ordered()
  sched_cb_is_atomic()
  sched_cb_queue_handle()

are safe if called from any other component, and s/sched_cb/queue/.

  sched_cb_queue_destroy_finalize()
  sched_cb_queue_deq_multi()

can be something like:

  struct sched_queue_ops {
          void (*sched_queue_destroy)(u32 qi);
          int  (*sched_queue_deq_multi)(u32 qi, ..);
  };

  void register_sched_queue_op(struct sched_queue_ops *ops)
  {
          ..
  }
  }
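
Filled out, the registration idea above might look like the following. This
is a sketch under assumptions: the event array is typed as void * instead of
odp_event_t to keep it standalone, and sched_deq() plus the dummy_* functions
are invented for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ops table; member names follow the sketch in this message. */
struct sched_queue_ops {
	void (*sched_queue_destroy)(uint32_t qi);
	int  (*sched_queue_deq_multi)(uint32_t qi, void *ev[], int num);
};

/* Ops currently registered by the queue implementation, if any. */
static const struct sched_queue_ops *queue_ops;

static void register_sched_queue_op(const struct sched_queue_ops *ops)
{
	queue_ops = ops;
}

/* Scheduler side: the call goes through a function pointer, so here the
 * 'cb' naming would actually fit. */
static int sched_deq(uint32_t qi, void *ev[], int num)
{
	if (queue_ops == NULL || queue_ops->sched_queue_deq_multi == NULL)
		return -1;
	return queue_ops->sched_queue_deq_multi(qi, ev, num);
}

/* Dummy queue implementation used only to exercise the table. */
static void dummy_destroy(uint32_t qi)
{
	(void)qi;
}

static int dummy_deq_multi(uint32_t qi, void *ev[], int num)
{
	(void)qi;
	(void)ev;
	return num; /* pretend that num events were dequeued */
}

static const struct sched_queue_ops dummy_ops = {
	.sched_queue_destroy = dummy_destroy,
	.sched_queue_deq_multi = dummy_deq_multi,
};
```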

I don't think this needs to be done for pktio because pktio
itself is already an abstraction layer over multiple pktio
implementations. I'm not sure the existing queue abstraction
layer supports the above functions, but they could. So, scratch
the above example.

The pktio functions just need to be renamed and moved to internal
pktio.h file since they are defined there and called by the scheduler.
It won't be the end of the world if some other component calls
pktin_poll(). That would certainly be interesting and covered in
code reviews. Perhaps it even makes sense to do that in a different
software architecture.

> > And, usually 'cb' or 'callback' is used in naming a function that is
> > called through a function pointer. So, the naming is off here as well
> > since these functions are always called via a direct function call.
> 
> We need common prefixes for input and output direction scheduler interface 
> calls. I did pick up sched_cb_ to make distinction from sched_fn, which is 
> the input direction. Those are just names. 

That is combining the interfaces of two components into one and
calling it the scheduler interface.

> > 
> > If you have to caution the user as to why something is written the
> > way it is, and that it is not a mistake, it better be a juicy topic.
> > Following best practices such as placing function declarations in
> > the .h file of the .c file that they are defined in is not a juicy
> > topic.
> 
> 
> The patch claims to make the interface cleaner, but it does the opposite by 
> moving interface functions back into internal header files. How do you define 
> a clean interface, if the included file exposes e.g. pktio in

Re: [lng-odp] [PATCHv2] linux-gen: scheduler: modular scheduler interface

2017-06-28 Thread Brian Brooks
On 06/28 10:13:40, Joyce Kong wrote:
> 
> The modular scheduler interface in odp_schedule_if.h includes functions
> from pktio and queue. It needs to be cleaned out. The pktio/queue related
> functions should be moved to pktio/queue internal header file.
> 
> Signed-off-by: Joyce Kong <joyce.k...@arm.com>

Reviewed-by: Brian Brooks <brian.bro...@arm.com>

> ---
>  platform/linux-generic/include/odp_packet_io_internal.h |  5 +
>  platform/linux-generic/include/odp_queue_internal.h | 12 +++-
>  platform/linux-generic/include/odp_schedule_if.h| 15 ---
>  platform/linux-generic/odp_schedule.c   |  2 +-
>  platform/linux-generic/odp_schedule_iquery.c|  2 +-
>  platform/linux-generic/odp_schedule_sp.c|  2 ++
>  6 files changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/platform/linux-generic/include/odp_packet_io_internal.h 
> b/platform/linux-generic/include/odp_packet_io_internal.h
> index 89bb6f3..70bc44d 100644
> --- a/platform/linux-generic/include/odp_packet_io_internal.h
> +++ b/platform/linux-generic/include/odp_packet_io_internal.h
> @@ -262,6 +262,11 @@ int single_recv_queue(pktio_entry_t *entry, int index, 
> odp_packet_t packets[],
>  int single_send_queue(pktio_entry_t *entry, int index,
>   const odp_packet_t packets[], int num);
> 
> +/* Interface for the scheduler */
> +int sched_cb_pktin_poll(int pktio_index, int num_queue, int index[]);
> +void sched_cb_pktio_stop_finalize(int pktio_index);
> +int sched_cb_num_pktio(void);
> +
>  extern const pktio_if_ops_t netmap_pktio_ops;
>  extern const pktio_if_ops_t dpdk_pktio_ops;
>  extern const pktio_if_ops_t sock_mmsg_pktio_ops;
> diff --git a/platform/linux-generic/include/odp_queue_internal.h 
> b/platform/linux-generic/include/odp_queue_internal.h
> index 560f826..452ecab 100644
> --- a/platform/linux-generic/include/odp_queue_internal.h
> +++ b/platform/linux-generic/include/odp_queue_internal.h
> @@ -20,7 +20,6 @@ extern "C" {
> 
>  #include 
>  #include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -94,6 +93,17 @@ int queue_deq_multi(queue_entry_t *queue, odp_buffer_hdr_t 
> *buf_hdr[], int num);
>  void queue_lock(queue_entry_t *queue);
>  void queue_unlock(queue_entry_t *queue);
> 
> +/* Interface for the scheduler */
> +int sched_cb_num_queues(void);
> +int sched_cb_queue_prio(uint32_t queue_index);
> +int sched_cb_queue_grp(uint32_t queue_index);
> +int sched_cb_queue_is_ordered(uint32_t queue_index);
> +int sched_cb_queue_is_atomic(uint32_t queue_index);
> +odp_queue_t sched_cb_queue_handle(uint32_t queue_index);
> +void sched_cb_queue_destroy_finalize(uint32_t queue_index);
> +int sched_cb_queue_deq_multi(uint32_t queue_index, odp_event_t ev[], int 
> num);
> +int sched_cb_queue_empty(uint32_t queue_index);
> +
>  static inline uint32_t queue_to_id(odp_queue_t handle)
>  {
> return _odp_typeval(handle) - 1;
> diff --git a/platform/linux-generic/include/odp_schedule_if.h 
> b/platform/linux-generic/include/odp_schedule_if.h
> index 530d157..8dae081 100644
> --- a/platform/linux-generic/include/odp_schedule_if.h
> +++ b/platform/linux-generic/include/odp_schedule_if.h
> @@ -11,7 +11,6 @@
>  extern "C" {
>  #endif
> 
> -#include 
>  #include 
>  #include 
> 
> @@ -60,20 +59,6 @@ typedef struct schedule_fn_t {
>  /* Interface towards the scheduler */
>  extern const schedule_fn_t *sched_fn;
> 
> -/* Interface for the scheduler */
> -int sched_cb_pktin_poll(int pktio_index, int num_queue, int index[]);
> -void sched_cb_pktio_stop_finalize(int pktio_index);
> -int sched_cb_num_pktio(void);
> -int sched_cb_num_queues(void);
> -int sched_cb_queue_prio(uint32_t queue_index);
> -int sched_cb_queue_grp(uint32_t queue_index);
> -int sched_cb_queue_is_ordered(uint32_t queue_index);
> -int sched_cb_queue_is_atomic(uint32_t queue_index);
> -odp_queue_t sched_cb_queue_handle(uint32_t queue_index);
> -void sched_cb_queue_destroy_finalize(uint32_t queue_index);
> -int sched_cb_queue_deq_multi(uint32_t queue_index, odp_event_t ev[], int 
> num);
> -int sched_cb_queue_empty(uint32_t queue_index);
> -
>  /* API functions */
>  typedef struct {
> uint64_t (*schedule_wait_time)(uint64_t);
> diff --git a/platform/linux-generic/odp_schedule.c 
> b/platform/linux-generic/odp_schedule.c
> index c4567d8..d33fac2 100644
> --- a/platform/linux-generic/odp_schedule.c
> +++ b/platform/linux-generic/odp_schedule.c
> @@ -6,7 +6,6

Re: [lng-odp] [PATCHv2] linux-gen: scheduler: modular scheduler interface

2017-06-28 Thread Brian Brooks
On 06/28 07:24:08, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> 
> 
> > -Original Message-
> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Joyce
> > Kong
> > Sent: Wednesday, June 28, 2017 5:14 AM
> > To: lng-odp@lists.linaro.org
> > Cc: Joyce Kong 
> > Subject: [lng-odp] [PATCHv2] linux-gen: scheduler: modular scheduler
> > interface
> > 
> > The modular scheduler interface in odp_schedule_if.h includes functions
> > from pktio and queue. It needs to be cleaned out. The pktio/queue related
> > functions should be moved to pktio/queue internal header file.
> 
> Sched_cb_xxx() functions are the interface from a scheduler towards other 
> parts of the system. So, those calls are not in the scheduler  interface file 
> by mistake.

Generally speaking, function declarations should exist in the .h
file of the .c file that contains the definitions. These functions
are defined in queue.c and called from schedule.c. The declarations
should be in queue.h. It can be as simple as that.

And, usually 'cb' or 'callback' is used in naming a function that is
called through a function pointer. So, the naming is off here as well
since these functions are always called via a direct function call.

If you have to caution the user as to why something is written the
way it is, and that it is not a mistake, it better be a juicy topic.
Following best practices such as placing function declarations in
the .h file of the .c file that they are defined in is not a juicy
topic.

> If we now agree that queues and scheduler come as a package, tighter 
> integration between queue.c and 3 original schedulers could be done. But 
> preferably through another header file than odp_queue_internal.h, since that 
> exposes internals of the queue block.
> 
> This should *not* be done for pktio. It's the sched -> pktio interface as of 
> today.
> 
> I can look into this, since I'd inline as much as possible of those 
> sched_cb_queue_xxx() functions anyway. Also this work should not affect the 
> ARM scheduler, right?
> 
> -Petri
> 
>  
> 


[lng-odp] [PATCH] build: fix 64-bit atomics detection

2017-06-27 Thread Brian Brooks
Use uint64_t instead of int type.

This resolves ipfragreass build breakage with clang on 32-bit systems.

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 platform/linux-generic/m4/configure.m4 | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/platform/linux-generic/m4/configure.m4 
b/platform/linux-generic/m4/configure.m4
index e1197f60..c8164b8c 100644
--- a/platform/linux-generic/m4/configure.m4
+++ b/platform/linux-generic/m4/configure.m4
@@ -34,10 +34,11 @@ use_libatomic=no
 AC_MSG_CHECKING(whether -latomic is needed for 64-bit atomic built-ins)
 AC_LINK_IFELSE(
   [AC_LANG_SOURCE([[
-static int loc;
+#include <stdint.h>
+static uint64_t loc;
 int main(void)
 {
-int prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
+uint64_t prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
 return 0;
 }
 ]])],
-- 
2.13.1
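
For reference, a standalone sketch of what the probe is meant to exercise.
The function name probe_64bit_exchange() is made up for this note and the
code is not part of the patch. With `int`, the builtin operates on a 4-byte
object on the usual targets, so the check never touched 64-bit atomics:

```c
#include <stdint.h>

static uint64_t loc;

/* Exchanges a full 64-bit object. On a 32-bit target the compiler may emit
 * a call into libatomic here, which is what the link test has to detect. */
static uint64_t probe_64bit_exchange(void)
{
	return __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
}
```

If this links without -latomic, the 64-bit builtins are lowered inline;
otherwise configure should add the library.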



Re: [lng-odp] [PATCH] linux-gen: time: use true hz

2017-06-27 Thread Brian Brooks
On 06/27 07:43:19, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> 
> 
> > -Original Message-
> > From: lng-odp [mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of Brian
> > Brooks
> > Sent: Monday, June 26, 2017 9:21 PM
> > To: lng-odp@lists.linaro.org
> > Subject: [lng-odp] [PATCH] linux-gen: time: use true hz
> > 
> > Use true hz value instead of dividing by 10.
> > 
> > The architected timer in ARM SoCs generally has a lower
> > frequency than IA TSC. It is detrimental to divide the
> > hertz by ten in this case. This is causing time validation
> > failures on ARM systems where the architected timer runs
> > at 50MHz.
> > 
> > Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> > ---
> >  platform/linux-generic/odp_time.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> > 
> > diff --git a/platform/linux-generic/odp_time.c b/platform/linux-
> > generic/odp_time.c
> > index 2bbe5666..a831cc51 100644
> > --- a/platform/linux-generic/odp_time.c
> > +++ b/platform/linux-generic/odp_time.c
> > @@ -102,9 +102,7 @@ static inline odp_time_t time_hw_cur(void)
> > 
> >  static inline uint64_t time_hw_res(void)
> >  {
> > -   /* Promise a bit lower resolution than average cycle counter
> > -* frequency */
> > -   return global.hw_freq_hz / 10;
> > +   return global.hw_freq_hz;
> >  }
> > 
> >  static inline uint64_t time_hw_to_ns(odp_time_t time)
> > --
> > 2.13.1
> 
> 
> Did you test this change on x86? This should cause test failures on x86.

Yes, tests passed on x86 2.8GHz.

> You should fix the time test suite also. The suite converts Hz resolution to 
> ns resolution here:
> 
> static void time_test_res(time_res_cb time_res, uint64_t *res)
> {
>   uint64_t rate;
> 
>   rate = time_res();
>   CU_ASSERT(rate > MIN_TIME_RATE);
>   CU_ASSERT(rate < MAX_TIME_RATE);
> 
>   *res = ODP_TIME_SEC_IN_NS / rate;
>   if (ODP_TIME_SEC_IN_NS % rate)
>   (*res)++;
> }
> 
> When TSC resolution is >1GHz, the ns resolution is zero and there is no 
> margin for rounding errors. 

Actually, ns resolution is one.
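
To make the arithmetic concrete, here is a small standalone version of the
rounding in time_test_res(). SEC_IN_NS stands in for ODP_TIME_SEC_IN_NS
(assumed to be 10^9), and res_ns_from_hz() is a name made up for this sketch:

```c
#include <stdint.h>

/* Stand-in for ODP_TIME_SEC_IN_NS; assumed to be one second in ns. */
#define SEC_IN_NS 1000000000ULL

/* Convert a rate in Hz to a resolution in ns, rounding up, as the test
 * suite does. */
static uint64_t res_ns_from_hz(uint64_t rate)
{
	uint64_t res = SEC_IN_NS / rate;

	if (SEC_IN_NS % rate)
		res++;
	return res;
}
```

For a 2.8GHz rate the integer division gives 0 and the non-zero remainder
bumps the result to 1. A 50MHz timer gives 20ns, and the same timer reported
as hz / 10 gives 200ns.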

> static void time_test_conversion(time_from_ns_cb time_from_ns, uint64_t res)
> {
>   uint64_t ns1, ns2;
>   odp_time_t time;
>   uint64_t upper_limit, lower_limit;
> 
>   ns1 = 100;
>   time = time_from_ns(ns1);
> 
>   ns2 = odp_time_to_ns(time);
> 
>   /* need to check within arithmetic tolerance that the same
>* value in ns is returned after conversions */
>   upper_limit = ns1 + res;
>   lower_limit = ns1 - res;
>   CU_ASSERT((ns2 <= upper_limit) && (ns2 >= lower_limit)); 
> 
> 
> The error margin needs to be changed from ns resolution to something else. I 
> think it should be just a percentage from the original value (floating point 
> math) e.g. +-20% throughout the test suite.

As long as the tick values read from the CPU are converted to nanoseconds
using the true frequency of the tick counter (true for code in odp_time.c),
I think reporting a resolution of 1ns for >1GHz might be fine.

> -Petri
> 
> 


[lng-odp] [API-NEXT PATCH v10 5/6] linux-gen: sched scalable: add scalable scheduler

2017-06-23 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Kevin Wang <kevin.w...@arm.com>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
---
 platform/linux-generic/Makefile.am |7 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 .../linux-generic/include/odp_config_internal.h|   15 +-
 .../include/odp_queue_scalable_internal.h  |  104 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  139 ++
 .../include/odp_schedule_scalable_config.h |   52 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1022 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1980 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  345 
 14 files changed, 3849 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_queue_scalable_internal.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable.h
 create mode 100644 
platform/linux-generic/include/odp_schedule_scalable_config.h
 create mode 100644 
platform/linux-generic/include/odp_schedule_scalable_ordered.h
 create mode 100644 platform/linux-generic/odp_queue_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable_ordered.c

diff --git a/platform/linux-generic/Makefile.am 
b/platform/linux-generic/Makefile.am
index 19e2241b..82ab4642 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -185,9 +185,13 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_pool_internal.h \
  ${srcdir}/include/odp_posix_extensions.h \
  ${srcdir}/include/odp_queue_internal.h \
+ ${srcdir}/include/odp_queue_scalable_internal.h \
  ${srcdir}/include/odp_ring_internal.h \
  ${srcdir}/include/odp_queue_if.h \
  ${srcdir}/include/odp_schedule_if.h \
+ ${srcdir}/include/odp_schedule_scalable.h \
+ ${srcdir}/include/odp_schedule_scalable_config.h \
+ ${srcdir}/include/odp_schedule_scalable_ordered.h \
  ${srcdir}/include/odp_sorted_list_internal.h \
  ${srcdir}/include/odp_shm_internal.h \
  ${srcdir}/include/odp_time_internal.h \
@@ -259,12 +263,15 @@ __LIB__libodp_linux_la_SOURCES = \
   odp_pool.c \
   odp_queue.c \
   odp_queue_if.c \
+  odp_queue_scalable.c \
   odp_rwlock.c \
   odp_rwlock_recursive.c \
   odp_schedule.c \
   odp_schedule_if.c \
   odp_schedule_sp.c \
   odp_schedule_iquery.c \
+  odp_schedule_scalable.c \
+  odp_schedule_scalable_ordered.c \
   odp_shared_memory.c \
   odp_sorted_list.c \
   odp_spinlock.c \
diff --git a/platform/linux-generic/include/odp/api/plat/schedule_types.h b/platform/linux-generic/include/odp/api/plat/schedule_types.h
index 535fd6d0..4e75f9ee 100644
--- a/platform/linux-generic/include/odp/api/plat/schedule_types.h
+++ b/platform/linux-generic/include/odp/api/plat/schedule_types.h
@@ -18,6 +18,8 @@
 extern "C" {
 #endif
 
+#include 
+
 /** @addtogroup odp_scheduler
  *  @{
  */
@@ -44,7 +46,7 @@ typedef int odp_schedule_sync_t;
 typedef int odp_schedule_group_t;
 
 /* These must be kept in sync with thread_globals_t in odp_thread.c */
-#define ODP_SCHED_GROUP_INVALID -1
+#define ODP_SCHED_GROUP_INVALID ((odp_schedule_group_t)-1)
 #define ODP_SCHED_GROUP_ALL 0
 #define ODP_SCHED_GROUP_WORKER  1
 #define ODP_SCHED_GROUP_CONTROL 2
diff --git a/platform/linux-generic/include/odp_config_internal.h b/platform/linux-generic/include/odp_config_internal.h
index dadd59e7..3cff0045 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -7,10 +7,6 @@
 #ifndef ODP_CONFIG_INTERNAL_H_
 #define ODP_CONFIG_INTERNAL_H_
 
-#ifdef __cplusplus
-extern "C" {
-#endif
-
 /*
  * Maximum number of pools
  */
@@ -22,6 +18,13 @@ extern "C" {
 #define ODP_CONFIG_QUEUES 1024
 
 /*
+ * Maximum queue depth. Maximum number of elements that can be stored in a
+ * queue. This value is used only when the size is not explicitly provided
+ * durin

[lng-odp] [API-NEXT PATCH v10 2/6] linux-gen: sched scalable: add arch files

2017-06-23 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 platform/linux-generic/Makefile.am   |  17 ++
 platform/linux-generic/arch/arm/odp_atomic.h | 212 +++
 platform/linux-generic/arch/arm/odp_cpu.h|  71 +++
 platform/linux-generic/arch/arm/odp_cpu_idling.h |  53 +
 platform/linux-generic/arch/arm/odp_llsc.h   | 253 +++
 platform/linux-generic/arch/default/odp_cpu.h|  43 
 platform/linux-generic/arch/mips64/odp_cpu.h |  43 
 platform/linux-generic/arch/powerpc/odp_cpu.h|  43 
 platform/linux-generic/arch/x86/odp_cpu.h|  43 
 9 files changed, 778 insertions(+)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu_idling.h
 create mode 100644 platform/linux-generic/arch/arm/odp_llsc.h
 create mode 100644 platform/linux-generic/arch/default/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/mips64/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/powerpc/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/x86/odp_cpu.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 2293f7f0..09e25166 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -8,6 +8,7 @@ AM_CFLAGS +=  -I$(srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include/odp/arch/@ARCH_ABI@
 AM_CFLAGS +=  -I$(top_builddir)/include
+AM_CFLAGS +=  -I$(top_srcdir)/arch/@ARCH_DIR@
 AM_CFLAGS +=  -Iinclude
 AM_CFLAGS +=  -DSYSCONFDIR=\"@sysconfdir@\"
 AM_CFLAGS +=  -D_ODP_PKTIO_IPC
@@ -198,6 +199,22 @@ noinst_HEADERS = \
  ${srcdir}/include/protocols/udp.h \
  ${srcdir}/Makefile.inc
 
+if ARCH_IS_ARM
+noinst_HEADERS += ${srcdir}/arch/arm/odp_atomic.h \
+ ${srcdir}/arch/arm/odp_cpu.h \
+ ${srcdir}/arch/arm/odp_cpu_idling.h \
+ ${srcdir}/arch/arm/odp_llsc.h
+endif
+if ARCH_IS_MIPS64
+noinst_HEADERS += ${srcdir}/arch/mips64/odp_cpu.h
+endif
+if ARCH_IS_POWERPC
+noinst_HEADERS += ${srcdir}/arch/powerpc/odp_cpu.h
+endif
+if ARCH_IS_X86
+noinst_HEADERS += ${srcdir}/arch/x86/odp_cpu.h
+endif
+
 __LIB__libodp_linux_la_SOURCES = \
   _fdserver.c \
   _ishm.c \
diff --git a/platform/linux-generic/arch/arm/odp_atomic.h b/platform/linux-generic/arch/arm/odp_atomic.h
new file mode 100644
index ..3a21a47b
--- /dev/null
+++ b/platform/linux-generic/arch/arm/odp_atomic.h
@@ -0,0 +1,212 @@
+/* Copyright (c) 2017, ARM Limited. All rights reserved.
+ *
+ * Copyright (c) 2017, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+#define PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_CPU_H
+#error This file should not be included directly, please include odp_cpu.h
+#endif
+
+#ifdef CONFIG_DMBSTR
+
+#define atomic_store_release(loc, val, ro) \
+do {   \
+   _odp_release_barrier(ro);   \
+   __atomic_store_n(loc, val, __ATOMIC_RELAXED);   \
+} while (0)
+
+#else
+
+#define atomic_store_release(loc, val, ro) \
+   __atomic_store_n(loc, val, __ATOMIC_RELEASE)
+
+#endif  /* CONFIG_DMBSTR */
+
+#ifdef __aarch64__
+
+#define HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
+#define HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
+(mo) == __ATOMIC_SEQ_CST)
+
+#define LL_MO(mo) (HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
+#define SC_MO(mo) (HAS_RLS((mo)) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED)
+
+#ifndef __ARM_FEATURE_QRDMX /* Feature only available in v8.1a and beyond */
+static inline bool
+__lockfree_compare_exchange_16(register __int128 *var, __int128 *exp,
+  register __int128 neu, bool weak, int mo_success,
+  int mo_failure)
+{
+   (void)weak; /* Always do strong CAS or we can't perform atomic read */
+   /* Ignore memory ordering for failure, memory order for
+* success must be stronger or equal. */
+   (void)mo_failure;
+   register __int128 old;
+   register __int128 expected;
+   int ll_mo = LL_MO(mo_success);
+   int sc_mo = SC_MO(mo_success);
+
+   expected = *exp;
+   __asm__ volatile("" ::: "memory");
+   do {
+   /* Atomicity of LLD is not guaranteed */
+   old = lld(var, ll_mo);
+   /* Must write back neu or old to verify atomicity of LLD */
+   } while (

[lng-odp] [API-NEXT PATCH v10 4/6] linux-gen: sched scalable: add a concurrent queue

2017-06-23 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
---
 platform/linux-generic/Makefile.am   |   1 +
 platform/linux-generic/include/odp_llqueue.h | 311 +++
 2 files changed, 312 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_llqueue.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 39322dc1..19e2241b 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -170,6 +170,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_errno_define.h \
  ${srcdir}/include/odp_forward_typedefs_internal.h \
  ${srcdir}/include/odp_internal.h \
+ ${srcdir}/include/odp_llqueue.h \
  ${srcdir}/include/odp_name_table_internal.h \
  ${srcdir}/include/odp_packet_internal.h \
  ${srcdir}/include/odp_packet_io_internal.h \
diff --git a/platform/linux-generic/include/odp_llqueue.h b/platform/linux-generic/include/odp_llqueue.h
new file mode 100644
index ..99b12e66
--- /dev/null
+++ b/platform/linux-generic/include/odp_llqueue.h
@@ -0,0 +1,311 @@
+/* Copyright (c) 2017, ARM Limited. All rights reserved.
+ *
+ * Copyright (c) 2017, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef ODP_LLQUEUE_H_
+#define ODP_LLQUEUE_H_
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+/**
+ * Linked list queues
+ */
+
+struct llqueue;
+struct llnode;
+
+static struct llnode *llq_head(struct llqueue *llq);
+static void llqueue_init(struct llqueue *llq);
+static void llq_enqueue(struct llqueue *llq, struct llnode *node);
+static struct llnode *llq_dequeue(struct llqueue *llq);
+static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
+static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
+static odp_bool_t llq_on_queue(struct llnode *node);
+
+/**
+ * The implementation(s)
+ */
+
+#define SENTINEL ((void *)~(uintptr_t)0)
+
+#ifdef CONFIG_LLDSCD
+/* Implement queue operations using double-word LL/SC */
+
+/* The scalar equivalent of a double pointer */
+#if __SIZEOF_PTRDIFF_T__ == 4
+typedef uint64_t dintptr_t;
+#endif
+#if __SIZEOF_PTRDIFF_T__ == 8
+typedef __int128 dintptr_t;
+#endif
+
+struct llnode {
+   struct llnode *next;
+};
+
+union llht {
+   struct {
+   struct llnode *head, *tail;
+   } st;
+   dintptr_t ui;
+};
+
+struct llqueue {
+   union llht u;
+};
+
+static inline struct llnode *llq_head(struct llqueue *llq)
+{
+   return __atomic_load_n(&llq->u.st.head, __ATOMIC_RELAXED);
+}
+
+static inline void llqueue_init(struct llqueue *llq)
+{
+   llq->u.st.head = NULL;
+   llq->u.st.tail = NULL;
+}
+
+static inline void llq_enqueue(struct llqueue *llq, struct llnode *node)
+{
+   union llht old, neu;
+
+   ODP_ASSERT(node->next == NULL);
+   node->next = SENTINEL;
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   neu.st.head = old.st.head == NULL ? node : old.st.head;
+   neu.st.tail = node;
+   } while (odp_unlikely(scd(&llq->u.ui, neu.ui, __ATOMIC_RELEASE)));
+   if (old.st.tail != NULL) {
+   /* List was not empty */
+   ODP_ASSERT(old.st.tail->next == SENTINEL);
+   old.st.tail->next = node;
+   }
+}
+
+static inline struct llnode *llq_dequeue(struct llqueue *llq)
+{
+   struct llnode *head;
+   union llht old, neu;
+
+   /* llq_dequeue() may be used in a busy-waiting fashion
+* Read head using plain load to avoid disturbing remote LL/SC
+*/
+   head = __atomic_load_n(&llq->u.st.head, __ATOMIC_ACQUIRE);
+   if (head == NULL)
+   return NULL;
+   /* Read head->next before LL to minimize cache miss latency
+* in LL/SC below
+*/
+   (void)__atomic_load_n(&head->next, __ATOMIC_RELAXED);
+
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   if (odp_unlikely(old.st.head == NULL)) {
+   /* Empty list */
+   return NULL;
+   } else if (odp_unlikely(old.st.head == old.st.tail)) {
+   /* Single-element in list */
+   neu.st.head = NULL;
+   neu.st.tail = NULL;
+   } else {
+   /* Multi-element list, dequeue head */
+   struct llnode *next;
+ 

[lng-odp] [API-NEXT PATCH v10 3/6] linux-gen: sched scalable: add a bitset

2017-06-23 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 platform/linux-generic/Makefile.am  |   1 +
 platform/linux-generic/include/odp_bitset.h | 212 
 2 files changed, 213 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_bitset.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 09e25166..39322dc1 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -159,6 +159,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_atomic_internal.h \
  ${srcdir}/include/odp_buffer_inlines.h \
  ${srcdir}/include/odp_bitmap_internal.h \
+ ${srcdir}/include/odp_bitset.h \
  ${srcdir}/include/odp_buffer_internal.h \
  ${srcdir}/include/odp_classification_datamodel.h \
  ${srcdir}/include/odp_classification_inlines.h \
diff --git a/platform/linux-generic/include/odp_bitset.h b/platform/linux-generic/include/odp_bitset.h
new file mode 100644
index ..4b7dd6d6
--- /dev/null
+++ b/platform/linux-generic/include/odp_bitset.h
@@ -0,0 +1,212 @@
+/* Copyright (c) 2017, ARM Limited. All rights reserved.
+ *
+ * Copyright (c) 2017, Linaro Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef _ODP_BITSET_H_
+#define _ODP_BITSET_H_
+
+#include 
+
+#include 
+
+/**
+ * bitset abstract data type
+ */
+/* This could be a struct of scalars to support larger bit sets */
+
+/*
+ * Size of atomic bit set. This limits the max number of threads,
+ * scheduler groups and reorder windows. On ARMv8/64-bit and x86-64, the
+ * (lock-free) max is 128
+ */
+
+/* Find a suitable data type that supports lock-free atomic operations */
+#if defined(__aarch64__) && defined(__SIZEOF_INT128__) && \
+   __SIZEOF_INT128__ == 16
+#define LOCKFREE16
+typedef __int128 bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT128__)
+
+#elif __GCC_ATOMIC_LLONG_LOCK_FREE == 2 && \
+   __SIZEOF_LONG_LONG__ != __SIZEOF_LONG__
+typedef unsigned long long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG_LONG__)
+
+#elif __GCC_ATOMIC_LONG_LOCK_FREE == 2 && __SIZEOF_LONG__ != __SIZEOF_INT__
+typedef unsigned long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG__)
+
+#elif __GCC_ATOMIC_INT_LOCK_FREE == 2
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+
+#else
+/* Target does not support lock-free atomic operations */
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+#endif
+
+#if ATOM_BITSET_SIZE <= 32
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1UL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsl(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor32(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 64
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1ULL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsll(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor64(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 128
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   if (bit < 64)
+   return 1ULL << bit;
+   else
+   return (unsigned __int128)(1ULL << (bit - 64)) << 64;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   if ((uint64_t)b != 0)
+   return __builtin_ffsll((uint64_t)b);
+   else if ((b >> 64) != 0)
+   return __builtin_ffsll((uint64_t)(b >> 64)) + 64;
+   else
+   return 0;
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor128(bs, mo);
+}
+
+#else
+#error Unsupported size of bit sets (ATOM_BITSET_SIZE)
+#endif
+
+/* Atomic load with memory ordering */
+static inline bitset_t atom_bitset_load(bitset_t *bs, int mo)
+{
+#ifdef LOCKFREE16
+   return __lockfree_load_16(bs, mo);
+#else
+   return __atomic_load_n(bs, mo);
+#endif
+}
+
+/* Atomic bit set with memory ordering */
+static inline void atom_bitset_set(bitset_t *bs, uint32_t bit, int mo)
+{
+#ifdef LOCKFREE16
+

[lng-odp] [API-NEXT PATCH v10 6/6] travis: add scalable scheduler in CI

2017-06-23 Thread Brian Brooks
From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>

Added running tests with scalable scheduler to CI

Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
---
 .travis.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.travis.yml b/.travis.yml
index 89f028d1..35bab77f 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -62,6 +62,7 @@ env:
 - CONF="--disable-abi-compat"
 - CONF="--enable-schedule-sp"
 - CONF="--enable-schedule-iquery"
+- CONF="--enable-schedule-scalable"
 
 install:
 - echo 1000 | sudo tee /proc/sys/vm/nr_hugepages
-- 
2.13.1



[lng-odp] [API-NEXT PATCH v10 0/6] A scalable software scheduler

2017-06-23 Thread Brian Brooks
This work derives from Ola Liljedahl's prototype [1] which introduced a
scalable scheduler design based primarily on lock-free algorithms and
data structures designed to decrease contention. A thread searches
through a data structure containing only queues that are both non-empty
and allowed to be scheduled to that thread. Strict priority scheduling is
respected, and (W)RR scheduling may be used within queues of the same priority.
Lastly, pre-scheduling or stashing is not employed since it is optional
functionality that can be implemented in the application.
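
To illustrate the search over "non-empty and allowed" queues, here is a
minimal sketch. It is not the data structure from the patch series: it
assumes at most 64 queues tracked in a single 64-bit word, and the names
qmask_t, mark_nonempty(), mark_empty() and pick_queue() are made up for
this example. The lowest set bit stands in for the highest priority.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type: one bit per queue, bit N set means queue N is non-empty */
typedef uint64_t qmask_t;

/* Publish that queue q now holds at least one event */
static inline void mark_nonempty(qmask_t *ready, unsigned int q)
{
	assert(q < 64);
	__atomic_fetch_or(ready, (qmask_t)1 << q, __ATOMIC_RELEASE);
}

/* Publish that queue q has been drained */
static inline void mark_empty(qmask_t *ready, unsigned int q)
{
	assert(q < 64);
	__atomic_fetch_and(ready, ~((qmask_t)1 << q), __ATOMIC_RELEASE);
}

/*
 * Return the lowest-numbered queue that is both non-empty and allowed for
 * the calling thread, or -1 if there is none. 'allowed' is the per-thread
 * mask of queues this thread may be scheduled from.
 */
static inline int pick_queue(qmask_t *ready, qmask_t allowed)
{
	qmask_t candidates = __atomic_load_n(ready, __ATOMIC_ACQUIRE) & allowed;

	/* __builtin_ffsll() returns 0 for an empty set, else 1 + bit index */
	return __builtin_ffsll((long long)candidates) - 1;
}
```

A thread only ever inspects bits that are set in both masks, so empty or
disallowed queues cost nothing during the search.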

In addition to scalable ring buffers, the algorithm also uses unbounded
concurrent queues. LL/SC and CAS variants exist in cases where the absence of
the ABA problem cannot be proved, and also in cases where the compiler's atomic
built-ins may not be lowered to the desired instruction(s). Finally, a version
of the algorithm that uses locks is also provided.
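
For readers unfamiliar with the ABA concern, the sketch below shows one
common way a CAS-based structure can guard against it: pairing the value
with a tag that changes on every update. This is a generic illustration
and not code from the series. It assumes nodes are identified by 32-bit
indices so that index and tag fit in one 64-bit CAS; NNODES, NIL, push()
and pop() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define NNODES 8
#define NIL 0xFFFFFFFFu /* "no node" marker */

static uint32_t next_idx[NNODES]; /* per-node link to the node below it */
static uint64_t top = NIL;        /* low 32 bits: index, high 32 bits: tag */

static inline uint64_t pack(uint32_t idx, uint32_t tag)
{
	return ((uint64_t)tag << 32) | idx;
}

/* Push node idx onto the stack; the tag is bumped on every update */
static void push(uint32_t idx)
{
	uint64_t old = __atomic_load_n(&top, __ATOMIC_RELAXED);
	uint64_t neu;

	assert(idx < NNODES);
	do {
		__atomic_store_n(&next_idx[idx], (uint32_t)old,
				 __ATOMIC_RELAXED);
		neu = pack(idx, (uint32_t)(old >> 32) + 1);
	} while (!__atomic_compare_exchange_n(&top, &old, neu, 1,
					      __ATOMIC_RELEASE,
					      __ATOMIC_RELAXED));
}

/* Pop the top node and return its index, or NIL if the stack is empty */
static uint32_t pop(void)
{
	uint64_t old = __atomic_load_n(&top, __ATOMIC_ACQUIRE);
	uint64_t neu;
	uint32_t idx;

	do {
		idx = (uint32_t)old;
		if (idx == NIL)
			return NIL;
		/* If idx was popped and re-pushed meanwhile, the tag differs
		 * and the CAS below fails even though the index matches. */
		neu = pack(__atomic_load_n(&next_idx[idx], __ATOMIC_RELAXED),
			   (uint32_t)(old >> 32) + 1);
	} while (!__atomic_compare_exchange_n(&top, &old, neu, 1,
					      __ATOMIC_ACQUIRE,
					      __ATOMIC_ACQUIRE));
	return idx;
}
```

With LL/SC the tag is unnecessary, because the store-conditional fails if
the location was written at all since the load-linked.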

Use --enable-schedule-scalable to conditionally compile this scheduler
into the library.

[1] https://lists.linaro.org/pipermail/lng-odp/2016-September/025682.html

On checkpatch.pl:
 - [2/6] and [5/6] have checkpatch.pl issues that are superfluous

v10:
 - Rebase against fixes for conditional compilation of arch sources
 - Add Linaro copyright
 - Support legacy compilers that do not support ARM ACLE
 - Remove inclusion of odp_schedule_config.h
 - Revert driver shm block change
 - Use ordered lock count #define

v9:
 - Include patch to enable scalable scheduler in Travis CI
 - Fix 'make distcheck'

v8:
 - Reword commit messages

v7:
 - Rebase against new modular queue interface
 - Duplicate arch files under mips64 and powerpc
 - Fix sched->order_lock()
 - Loop until all deferred events have been enqueued
 - Implement ord_enq_multi()
 - Fix ordered_lock/unlock
 - Revert stylistic changes
 - Add default xfactor
 - Remove changes to odp_sched_latency
 - Remove ULL suffix to alleviate Clang build

v6:
 - Move conversions into scalable scheduler to alleviate #ifdefs
 - Remove unnecessary prefetch
 - Fix ARMv8 build

v5:
 - Allocate cache aligned memory using shm pool APIs
 - Move more code to scalable scheduler specific files
 - Remove CONFIG_SPLIT_READWRITE
 - Fix 'make distcheck' issue

v4:
 - Fix a couple more checkpatch.pl issues

v3:
 - Only conditionally compile scalable scheduler and queue
 - Move some code to arch/ dir
 - Use a single shm block for queues instead of block-per-queue
 - De-interleave odp_llqueue.h
 - Use compiler macros to determine ATOM_BITSET_SIZE
 - Incorporated queue size changes
 - Dropped 'ODP_' prefix on config and moved to other files
 - Dropped a few patches that were sent independently to the list

v2:
 - Move ARMv8 issues and other fixes into separate patches
 - Abstract away some #ifdefs
 - Fix some checkpatch.pl warnings

Brian Brooks (5):
  test: odp_pktio_ordered: add queue size
  linux-gen: sched scalable: add arch files
  linux-gen: sched scalable: add a bitset
  linux-gen: sched scalable: add a concurrent queue
  linux-gen: sched scalable: add scalable scheduler

Honnappa Nagarahalli (1):
  travis: add scalable scheduler in CI

 .travis.yml|1 +
 platform/linux-generic/Makefile.am |   26 +
 platform/linux-generic/arch/arm/odp_atomic.h   |  212 +++
 platform/linux-generic/arch/arm/odp_cpu.h  |   71 +
 platform/linux-generic/arch/arm/odp_cpu_idling.h   |   53 +
 platform/linux-generic/arch/arm/odp_llsc.h |  253 +++
 platform/linux-generic/arch/default/odp_cpu.h  |   43 +
 platform/linux-generic/arch/mips64/odp_cpu.h   |   43 +
 platform/linux-generic/arch/powerpc/odp_cpu.h  |   43 +
 platform/linux-generic/arch/x86/odp_cpu.h  |   43 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 platform/linux-generic/include/odp_bitset.h|  212 +++
 .../linux-generic/include/odp_config_internal.h|   15 +-
 platform/linux-generic/include/odp_llqueue.h   |  311 +++
 .../include/odp_queue_scalable_internal.h  |  104 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  139 ++
 .../include/odp_schedule_scalable_config.h |   52 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1022 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1980 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  345 
 test/common_plat/performance/odp_pktio_ordered.c   |4 +
 26 files changed, 5157 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/

Re: [lng-odp] [PATCH v4] example: add IPv4 fragmentation/reassembly example

2017-06-23 Thread Brian Brooks
On 06/23 22:50:09, Maxim Uvarov wrote:
> clang -m32 on x86 failed.

On 32-bit machines -latomic is needed when compiling with
Clang. It is not needed with GCC.

I suspect the atomics checking in
/platform/linux-generic/m4/configure.ac is always using
GCC when compiling these mini sources. It needs to be using
${CC}. Is that possible?
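
For reference, a mini source of the kind such a check would compile and
link could look like the sketch below. This is not the test from the ODP
configure scripts; the function name cas64_probe() is made up. If the
compiler selected by ${CC} does not expand the 64-bit built-in inline,
linking fails with an undefined reference to __atomic_compare_exchange_8
unless -latomic is passed, which is the error in the log below.

```c
#include <stdint.h>

/* Hypothetical probe word; starts at 1 so the first CAS can succeed */
static uint64_t probe_word = 1;

/*
 * Try a strong 64-bit compare-and-swap from 1 to 2.
 * Returns 0 if the CAS succeeded and the new value is visible,
 * 1 otherwise (for example when called a second time).
 */
static int cas64_probe(void)
{
	uint64_t expected = 1;

	if (!__atomic_compare_exchange_n(&probe_word, &expected, 2, 0,
					 __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
		return 1;

	return __atomic_load_n(&probe_word, __ATOMIC_SEQ_CST) == 2 ? 0 : 1;
}
```

A configure check would call this from main() and attempt the link once
without and once with -latomic, using the same compiler and CFLAGS
(including -m32) as the real build.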

> Joe, do you want to take a look at it?
> 
> [...truncated 76.24 KB...]
>   cc: clang
>   cc version: 3.5.0
>   cppflags:   
>   am_cppflags:
> -I<https://ci.linaro.org/job/odp-tool-check/build_type=clang_and_m32_on_64,label=docker-jessie-amd64/ws/check-odp/installed/x86_64-i386/openssl-OpenSSL_1_0_1h/include>
> -I<https://ci.linaro.org/job/odp-tool-check/build_type=clang_and_m32_on_64,label=docker-jessie-amd64/ws/check-odp/installed/x86_64-i386/cunit-2.1-3/include>
>   am_cxxflags:-std=c++11
>   cflags: -m32
>   am_cflags:   -pthread  -DIMPLEMENTATION_NAME=odp-linux
> -DODP_DEBUG_PRINT=0 -DODPH_DEBUG_PRINT=0 -DODP_DEBUG=0 -W -Wall -Werror
> -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations
> -Wold-style-definition -Wpointer-arith -Wcast-align -Wnested-externs
> -Wcast-qual -Wformat-nonliteral -Wformat-security -Wundef
> -Wwrite-strings -std=c99  -mcx16
>   ldflags:-m32
>   am_ldflags:   -pthread -lrt
> -L<https://ci.linaro.org/job/odp-tool-check/build_type=clang_and_m32_on_64,label=docker-jessie-amd64/ws/check-odp/installed/x86_64-i386/openssl-OpenSSL_1_0_1h/lib>
> -L<https://ci.linaro.org/job/odp-tool-check/build_type=clang_and_m32_on_64,label=docker-jessie-amd64/ws/check-odp/installed/x86_64-i386/cunit-2.1-3/lib>
>   libs:   -lcunit -lcrypto
>   defs:   -DHAVE_CONFIG_H
>   static libraries:   yes
>   shared libraries:   yes
>   ABI compatible: yes
>   Deprecated APIs:no
>   cunit:  yes
>   test_vald:  yes
>   test_perf:  yes
>   test_perf_proc: yes
>   test_cpp:   no
>   test_helper:yes
>   test_example:   yes
>   user_guides:no
> 
>   CC   odp_ipfragreass-odp_ipfragreass.o
>   CC   odp_ipfragreass-odp_ipfragreass_fragment.o
>   CC   odp_ipfragreass-odp_ipfragreass_helpers.o
>   CC   odp_ipfragreass-odp_ipfragreass_reassemble.o
>   CCLD odp_ipfragreass
> odp_ipfragreass-odp_ipfragreass_reassemble.o: In function
> `atomic_strong_cas_dblptr':
> odp_ipfragreass_reassemble.c:(.text+0x89f): undefined reference to
> `__atomic_compare_exchange_8'
> clang: error: linker command failed with exit code 1 (use -v to see
> invocation)
> Makefile:695: recipe for target 'odp_ipfragreass' failed
> 
> On 06/13/17 17:44, Joe Savage wrote:
> >>>>> Joe,
> >>>>>
> >>>>> can you please make it work with clang? I sent a patch to ml before. It
> >>>>> might still apply, so you can review it.
> >>>>> https://travis-ci.org/muvarov/odp/jobs/223572921
> >>>>>
> >>>>> the goal is to find a good combination of the -mcx16 and -latomic
> >>>>> flags. We also need to test that it still works on ARM, since there
> >>>>> is no -mcx16 flag there.
> >>>>>
> >>>>> Maxim.
> >>>>
> >>>> Hey Maxim,
> >>>>
> >>>> As we discussed previously, from my end the example should work perfectly
> >>>> with clang, it is just a matter of passing the right flags to the
> >>>> compiler through the project infrastructure. As such, you should be in
> >>>> a much better position to evaluate this than I. Looking at the
> >>>> descriptions of the patches alone, however, it looks like your patch
> >>>> combined with Brian Brooks'
> >>>> "configure.ac: fix native Clang build on ARMv8" should resolve any
> >>>> compilation issues here.
> >>>>
> >>>> Joe
> >>>>
> >>>
> >>> Ok, then if you want I can update my patch and send it together with yours
> >>> as patchset.
> >>>
> >>> Maxim.
> >>
> >> Sure -- if that's how you'd like the patches to be grouped, go for it.
> > 
> > Any progress on this, Maxim?
> > 
> 


Re: [lng-odp] [PATCH v2] build: fix conditional compilation of sources

2017-06-22 Thread Brian Brooks
Hi Maxim,

Can you please land this in api-next as well as master since
we need to rebase the scalable scheduler patch series against
this patch?

Thanks,
Brian

On 06/22 17:05:38, Brian Brooks wrote:
> Explicitly add all arch//* files to respective _SOURCES
> variables instead of using @ARCH_DIR@ substitution.
> 
> This patch fixes the broken build for ARM, PPC, and MIPS
> introduced by [1] and the similar issue reported while
> testing [2].
> 
> From the Automake manual [3]:
> 
>   You can't put a configure substitution (e.g., '@FOO@' or
>   '$(FOO)' where FOO is defined via AC_SUBST) into a _SOURCES
>   variable. The reason for this is a bit hard to explain, but
>   suffice to say that it simply won't work.
> 
> Here be dragons..
> 
> [1] https://lists.linaro.org/pipermail/lng-odp/2017-April/030324.html
> [2] https://lists.linaro.org/pipermail/lng-odp/2017-June/031598.html
> [3] https://www.gnu.org/software/automake/manual/html_node/Conditional-Sources.html
> 
> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> Reviewed-by: Kevin Wang <kevin.w...@arm.com>
> Reviewed-by: Yi He <yi...@arm.com>
> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> ---
> 
> v2:
>  - Alphabetize x86 sources (Maxim)
> 
>  configure.ac   |  3 +++
>  platform/linux-generic/Makefile.am | 40 ++
>  2 files changed, 35 insertions(+), 8 deletions(-)
> 
> diff --git a/configure.ac b/configure.ac
> index 46c7bbd2..45812f66 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -225,6 +225,9 @@ AM_CONDITIONAL([HAVE_DOXYGEN], [test "x${DOXYGEN}" = "xdoxygen"])
>  AM_CONDITIONAL([user_guide], [test "x${user_guides}" = "xyes" ])
>  AM_CONDITIONAL([HAVE_MSCGEN], [test "x${MSCGEN}" = "xmscgen"])
>  AM_CONDITIONAL([helper_linux], [test x$helper_linux = xyes ])
> +AM_CONDITIONAL([ARCH_IS_ARM], [test "x${ARCH_DIR}" = "xarm"])
> +AM_CONDITIONAL([ARCH_IS_MIPS64], [test "x${ARCH_DIR}" = "xmips64"])
> +AM_CONDITIONAL([ARCH_IS_POWERPC], [test "x${ARCH_DIR}" = "xpowerpc"])
>  AM_CONDITIONAL([ARCH_IS_X86], [test "x${ARCH_DIR}" = "xx86"])
>  
>  ##
> diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
> index 8dcdebd2..989a65b6 100644
> --- a/platform/linux-generic/Makefile.am
> +++ b/platform/linux-generic/Makefile.am
> @@ -63,8 +63,20 @@ odpapiinclude_HEADERS = \
> $(srcdir)/include/odp/api/time.h \
> $(srcdir)/include/odp/api/timer.h \
> $(srcdir)/include/odp/api/traffic_mngr.h \
> -   $(srcdir)/include/odp/api/version.h \
> -   $(srcdir)/arch/@ARCH_DIR@/odp/api/cpu_arch.h
> +   $(srcdir)/include/odp/api/version.h
> +
> +if ARCH_IS_ARM
> +odpapiinclude_HEADERS += $(srcdir)/arch/arm/odp/api/cpu_arch.h
> +endif
> +if ARCH_IS_MIPS64
> +odpapiinclude_HEADERS += $(srcdir)/arch/mips64/odp/api/cpu_arch.h
> +endif
> +if ARCH_IS_POWERPC
> +odpapiinclude_HEADERS += $(srcdir)/arch/powerpc/odp/api/cpu_arch.h
> +endif
> +if ARCH_IS_X86
> +odpapiinclude_HEADERS += $(srcdir)/arch/x86/odp/api/cpu_arch.h
> +endif
>  
>  odpapiplatincludedir= $(includedir)/odp/api/plat
>  odpapiplatinclude_HEADERS = \
> @@ -217,20 +229,32 @@ __LIB__libodp_linux_la_SOURCES = \
>  odp_timer_wheel.c \
>  odp_traffic_mngr.c \
>  odp_version.c \
> -odp_weak.c \
> -arch/@ARCH_DIR@/odp_cpu_arch.c \
> -arch/@ARCH_DIR@/odp_sysinfo_parse.c
> -
> -__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
> +odp_weak.c
>  
> +if ARCH_IS_ARM
> +__LIB__libodp_linux_la_SOURCES += arch/arm/odp_cpu_arch.c \
> +   arch/arm/odp_sysinfo_parse.c
> +endif
> +if ARCH_IS_MIPS64
> +__LIB__libodp_linux_la_SOURCES += arch/mips64/odp_cpu_arch.c \
> +   arch/mips64/odp_sysinfo_parse.c
> +endif
> +if ARCH_IS_POWERPC
> +__LIB__libodp_linux_la_SOURCES += arch/powerpc/odp_cpu_arch.c \
> +   arch/powerpc/odp_sysinfo_parse.c
> +endif
>  if ARCH_IS_X86
> -__LIB__libodp_linux_la_SOURCES += arch/@ARCH_DIR@/cpu_flags.c
> +__LIB__libodp_linux_la_SOURCES += arch/x86/cpu_flags.c \
> +   arch/x86/odp_cpu_arch.c \
> +   arch/x86/odp_sysinfo_parse.c
>  endif
>  
>  if HAVE_PCAP
>  __LIB__libodp_linux_la_SOURCES += pktio/pcap.c
>  endif
>  
> +__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
> +
> # Create symlink for ABI header files. Application does not need to use the arch
>  # specific include path for installed files.
>  install-data-hook:
> -- 
> 2.13.1
> 


[lng-odp] [PATCH v2] build: fix conditional compilation of sources

2017-06-22 Thread Brian Brooks
Explicitly add all arch//* files to respective _SOURCES
variables instead of using @ARCH_DIR@ substitution.

This patch fixes the broken build for ARM, PPC, and MIPS
introduced by [1] and the similar issue reported while
testing [2].

From the Automake manual [3]:

  You can't put a configure substitution (e.g., '@FOO@' or
  '$(FOO)' where FOO is defined via AC_SUBST) into a _SOURCES
  variable. The reason for this is a bit hard to explain, but
  suffice to say that it simply won't work.

Here be dragons..

[1] https://lists.linaro.org/pipermail/lng-odp/2017-April/030324.html
[2] https://lists.linaro.org/pipermail/lng-odp/2017-June/031598.html
[3] https://www.gnu.org/software/automake/manual/html_node/Conditional-Sources.html

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Kevin Wang <kevin.w...@arm.com>
Reviewed-by: Yi He <yi...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---

v2:
 - Alphabetize x86 sources (Maxim)

 configure.ac   |  3 +++
 platform/linux-generic/Makefile.am | 40 ++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/configure.ac b/configure.ac
index 46c7bbd2..45812f66 100644
--- a/configure.ac
+++ b/configure.ac
@@ -225,6 +225,9 @@ AM_CONDITIONAL([HAVE_DOXYGEN], [test "x${DOXYGEN}" = "xdoxygen"])
 AM_CONDITIONAL([user_guide], [test "x${user_guides}" = "xyes" ])
 AM_CONDITIONAL([HAVE_MSCGEN], [test "x${MSCGEN}" = "xmscgen"])
 AM_CONDITIONAL([helper_linux], [test x$helper_linux = xyes ])
+AM_CONDITIONAL([ARCH_IS_ARM], [test "x${ARCH_DIR}" = "xarm"])
+AM_CONDITIONAL([ARCH_IS_MIPS64], [test "x${ARCH_DIR}" = "xmips64"])
+AM_CONDITIONAL([ARCH_IS_POWERPC], [test "x${ARCH_DIR}" = "xpowerpc"])
 AM_CONDITIONAL([ARCH_IS_X86], [test "x${ARCH_DIR}" = "xx86"])
 
 ##
diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 8dcdebd2..989a65b6 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -63,8 +63,20 @@ odpapiinclude_HEADERS = \
  $(srcdir)/include/odp/api/time.h \
  $(srcdir)/include/odp/api/timer.h \
  $(srcdir)/include/odp/api/traffic_mngr.h \
- $(srcdir)/include/odp/api/version.h \
- $(srcdir)/arch/@ARCH_DIR@/odp/api/cpu_arch.h
+ $(srcdir)/include/odp/api/version.h
+
+if ARCH_IS_ARM
+odpapiinclude_HEADERS += $(srcdir)/arch/arm/odp/api/cpu_arch.h
+endif
+if ARCH_IS_MIPS64
+odpapiinclude_HEADERS += $(srcdir)/arch/mips64/odp/api/cpu_arch.h
+endif
+if ARCH_IS_POWERPC
+odpapiinclude_HEADERS += $(srcdir)/arch/powerpc/odp/api/cpu_arch.h
+endif
+if ARCH_IS_X86
+odpapiinclude_HEADERS += $(srcdir)/arch/x86/odp/api/cpu_arch.h
+endif
 
 odpapiplatincludedir= $(includedir)/odp/api/plat
 odpapiplatinclude_HEADERS = \
@@ -217,20 +229,32 @@ __LIB__libodp_linux_la_SOURCES = \
   odp_timer_wheel.c \
   odp_traffic_mngr.c \
   odp_version.c \
-  odp_weak.c \
-  arch/@ARCH_DIR@/odp_cpu_arch.c \
-  arch/@ARCH_DIR@/odp_sysinfo_parse.c
-
-__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
+  odp_weak.c
 
+if ARCH_IS_ARM
+__LIB__libodp_linux_la_SOURCES += arch/arm/odp_cpu_arch.c \
+ arch/arm/odp_sysinfo_parse.c
+endif
+if ARCH_IS_MIPS64
+__LIB__libodp_linux_la_SOURCES += arch/mips64/odp_cpu_arch.c \
+ arch/mips64/odp_sysinfo_parse.c
+endif
+if ARCH_IS_POWERPC
+__LIB__libodp_linux_la_SOURCES += arch/powerpc/odp_cpu_arch.c \
+ arch/powerpc/odp_sysinfo_parse.c
+endif
 if ARCH_IS_X86
-__LIB__libodp_linux_la_SOURCES += arch/@ARCH_DIR@/cpu_flags.c
+__LIB__libodp_linux_la_SOURCES += arch/x86/cpu_flags.c \
+ arch/x86/odp_cpu_arch.c \
+ arch/x86/odp_sysinfo_parse.c
 endif
 
 if HAVE_PCAP
 __LIB__libodp_linux_la_SOURCES += pktio/pcap.c
 endif
 
+__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
+
 # Create symlink for ABI header files. Application does not need to use the arch
 # specific include path for installed files.
 install-data-hook:
-- 
2.13.1



Re: [lng-odp] [PATCH] build: fix conditional compilation of sources

2017-06-22 Thread Brian Brooks
On 06/22 19:44:45, Maxim Uvarov wrote:
> On 06/22/17 19:19, Brian Brooks wrote:
> > On 06/22 19:06:01, Maxim Uvarov wrote:
> >> On 06/22/17 17:17, Brian Brooks wrote:
> >>> On 06/22 11:13:57, Maxim Uvarov wrote:
> >>>> On 22 June 2017 at 06:24, Brian Brooks <brian.bro...@arm.com> wrote:
> >>>>
> >>>>> Explicitly add all arch/<arch>/* files to respective _SOURCES
> >>>>> variables instead of using @ARCH_DIR@ substitution.
> >>>>>
> >>>>> This patch fixes the broken build for ARM, PPC, and MIPS
> >>>>> introduced by [1] and the similar issue reported while
> >>>>> testing [2].
> >>>>>
> >>>>> From the Automake manual [3]:
> >>>>>
> >>>>>   You can't put a configure substitution (e.g., '@FOO@' or
> >>>>>   '$(FOO)' where FOO is defined via AC_SUBST) into a _SOURCES
> >>>>>   variable. The reason for this is a bit hard to explain, but
> >>>>>   suffice to say that it simply won't work.
> >>>>>
> >>
> >>
> >> not clear why $(srcdir) works and $(ARCH_DIR) does not.
> >>
> >> I changed this in your patch and it works well:
> >>
> >> -odpapiinclude_HEADERS += $(srcdir)/arch/x86/odp/api/cpu_arch.h
> >> +odpapiinclude_HEADERS += $(srcdir)/arch/$(ARCH_DIR)/odp/api/cpu_arch.h
> > 
> > Tried it on ARM and it breaks. The Automake manual excerpt (above)
> > explicitly states that you cannot use a configure substitution in _SOURCES
> > (and evidently the same holds for _HEADERS). As you point out, this probably
> > applies only to user-defined substitutions (e.g. AC_SUBST in configure.ac),
> > not to preset output variables (e.g. srcdir).
> > 
> 
> ok thanks. then only one comment for alphabetical reorder.

They are in alphabetical order according to arch: A > M > P > X
Do you see something else?

> I think we can try to add arm-qemu to travis to also capture such bugs.

I thought there were ARM machines in LNG CI? Is there a way for users
to trigger a CI run over there before submitting a patch?

> Maxim.
> 
> 
> >> Maxim.
> >>
> >>
> >>>>> Here be dragons..
> >>>>>
> >>>>> [1] https://lists.linaro.org/pipermail/lng-odp/2017-April/030324.html
> >>>>> [2] https://lists.linaro.org/pipermail/lng-odp/2017-June/031598.html
> >>>>> [3] https://www.gnu.org/software/automake/manual/html_node/Conditional-Sources.html
> >>>>>
> >>>>> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> >>>>> ---
> >>>>>  configure.ac   |  3 +++
> >>>>>  platform/linux-generic/Makefile.am | 40 ++
> >>>>> 
> >>>>>  2 files changed, 35 insertions(+), 8 deletions(-)
> >>>>>
> >>>>> diff --git a/configure.ac b/configure.ac
> >>>>> index 46c7bbd2..45812f66 100644
> >>>>> --- a/configure.ac
> >>>>> +++ b/configure.ac
> >>>>> @@ -225,6 +225,9 @@ AM_CONDITIONAL([HAVE_DOXYGEN], [test "x${DOXYGEN}" =
> >>>>> "xdoxygen"])
> >>>>>  AM_CONDITIONAL([user_guide], [test "x${user_guides}" = "xyes" ])
> >>>>>  AM_CONDITIONAL([HAVE_MSCGEN], [test "x${MSCGEN}" = "xmscgen"])
> >>>>>  AM_CONDITIONAL([helper_linux], [test x$helper_linux = xyes ])
> >>>>> +AM_CONDITIONAL([ARCH_IS_ARM], [test "x${ARCH_DIR}" = "xarm"])
> >>>>> +AM_CONDITIONAL([ARCH_IS_MIPS64], [test "x${ARCH_DIR}" = "xmips64"])
> >>>>> +AM_CONDITIONAL([ARCH_IS_POWERPC], [test "x${ARCH_DIR}" = "xpowerpc"])
> >>>>>  AM_CONDITIONAL([ARCH_IS_X86], [test "x${ARCH_DIR}" = "xx86"])
> >>>>>
> >>>>>  
> >>>>> ##
> >>>>> diff --git a/platform/linux-generic/Makefile.am 
> >>>>> b/platform/linux-generic/
> >>>>> Makefile.am
> >>>>> index 8dcdebd2..385c5304 100644
> >>>>> --- a/platform/linux-generic/Makefile.am
> >>>>> +++ b/platform/linux-generic/Makefile.am
> >>>>> @@ -63,8 +63,20 @@ odpapiinclude_HEADERS = \
> >>>>>  

Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue

2017-06-22 Thread Brian Brooks
> > > The first is built only for ARM and the second for the rest. Would there
> > >be a way to build both always ?
> > For ARMv7a and ARMv8a, you could build both versions. You really want to
> > use the LL/SC version on these architectures.
> > 
> > For architectures without double-word LL/SC, only the lock-based version
> > can be built.
> 
> 
> You could *compile* the lock version always. It's based on locks, not on arch 
> specific instructions.

That would require an abstraction layer consisting of function pointers
pointing to one of the two implementations. On architectures without support
for LLD/SCD, there would only be one implementation.

This could make sense if... you were benchmarking *many* different concurrent
queue implementations and wanted to keep the benchmark code extremely succinct
and were willing to pay for function pointers. But that is not the case here.

This code is deliberately written to be static inline and conditionally compiled.
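
To make the trade-off concrete, here is a rough sketch of the two approaches.
Every name in it (cq_t, cq_enq_*, HAVE_LLSC) is hypothetical and not taken from
the patch, and the function bodies are placeholders rather than real queue code.

```c
#include <stdint.h>

/* Hypothetical single-slot "queue"; stands in for the real data structure. */
typedef struct {
	uint64_t slot;
} cq_t;

/* Placeholder bodies: the real variants would use LL/SC or a lock. */
static inline int cq_enq_llsc(cq_t *q, uint64_t v) { q->slot = v; return 0; }
static inline int cq_enq_lock(cq_t *q, uint64_t v) { q->slot = v; return 0; }

/* Option A: pick one variant at compile time. The call is resolved
 * statically and can be inlined. HAVE_LLSC is an assumed macro. */
static inline int cq_enq(cq_t *q, uint64_t v)
{
#ifdef HAVE_LLSC
	return cq_enq_llsc(q, v);
#else
	return cq_enq_lock(q, v);
#endif
}

/* Option B: build every variant and dispatch through a pointer at run
 * time. Each call is now indirect and generally cannot be inlined. */
typedef int (*cq_enq_fn)(cq_t *q, uint64_t v);

static int cq_enq_indirect(cq_enq_fn fn, cq_t *q, uint64_t v)
{
	return fn(q, v);
}
```

Option A is the shape the patch uses; option B is what always compiling both
versions would push us towards.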

> -Petri
> 


Re: [lng-odp] [PATCH] build: fix conditional compilation of sources

2017-06-22 Thread Brian Brooks
On 06/22 19:06:01, Maxim Uvarov wrote:
> On 06/22/17 17:17, Brian Brooks wrote:
> > On 06/22 11:13:57, Maxim Uvarov wrote:
> >> On 22 June 2017 at 06:24, Brian Brooks <brian.bro...@arm.com> wrote:
> >>
> >>> Explicitly add all arch/<arch>/* files to respective _SOURCES
> >>> variables instead of using @ARCH_DIR@ substitution.
> >>>
> >>> This patch fixes the broken build for ARM, PPC, and MIPS
> >>> introduced by [1] and the similar issue reported while
> >>> testing [2].
> >>>
> >>> From the Automake manual [3]:
> >>>
> >>>   You can't put a configure substitution (e.g., '@FOO@' or
> >>>   '$(FOO)' where FOO is defined via AC_SUBST) into a _SOURCES
> >>>   variable. The reason for this is a bit hard to explain, but
> >>>   suffice to say that it simply won't work.
> >>>
> 
> 
> not clear why $(srcdir) works and $(ARCH_DIR) does not.
> 
> I changed this in your patch and it works well:
> 
> -odpapiinclude_HEADERS += $(srcdir)/arch/x86/odp/api/cpu_arch.h
> +odpapiinclude_HEADERS += $(srcdir)/arch/$(ARCH_DIR)/odp/api/cpu_arch.h

Tried it on ARM and it breaks. The Automake manual excerpt (above)
explicitly states that you cannot use a configure substitution in _SOURCES
(and evidently the same holds for _HEADERS). As you point out, this probably
applies only to user-defined substitutions (e.g. AC_SUBST in configure.ac),
not to preset output variables (e.g. srcdir).

> Maxim.
> 
> 
> >>> Here be dragons..
> >>>
> >>> [1] https://lists.linaro.org/pipermail/lng-odp/2017-April/030324.html
> >>> [2] https://lists.linaro.org/pipermail/lng-odp/2017-June/031598.html
> >>> [3] https://www.gnu.org/software/automake/manual/html_node/Conditional-Sources.html
> >>>
> >>> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> >>> ---
> >>>  configure.ac   |  3 +++
> >>>  platform/linux-generic/Makefile.am | 40 ++
> >>> 
> >>>  2 files changed, 35 insertions(+), 8 deletions(-)
> >>>
> >>> diff --git a/configure.ac b/configure.ac
> >>> index 46c7bbd2..45812f66 100644
> >>> --- a/configure.ac
> >>> +++ b/configure.ac
> >>> @@ -225,6 +225,9 @@ AM_CONDITIONAL([HAVE_DOXYGEN], [test "x${DOXYGEN}" =
> >>> "xdoxygen"])
> >>>  AM_CONDITIONAL([user_guide], [test "x${user_guides}" = "xyes" ])
> >>>  AM_CONDITIONAL([HAVE_MSCGEN], [test "x${MSCGEN}" = "xmscgen"])
> >>>  AM_CONDITIONAL([helper_linux], [test x$helper_linux = xyes ])
> >>> +AM_CONDITIONAL([ARCH_IS_ARM], [test "x${ARCH_DIR}" = "xarm"])
> >>> +AM_CONDITIONAL([ARCH_IS_MIPS64], [test "x${ARCH_DIR}" = "xmips64"])
> >>> +AM_CONDITIONAL([ARCH_IS_POWERPC], [test "x${ARCH_DIR}" = "xpowerpc"])
> >>>  AM_CONDITIONAL([ARCH_IS_X86], [test "x${ARCH_DIR}" = "xx86"])
> >>>
> >>>  
> >>> ##
> >>> diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/
> >>> Makefile.am
> >>> index 8dcdebd2..385c5304 100644
> >>> --- a/platform/linux-generic/Makefile.am
> >>> +++ b/platform/linux-generic/Makefile.am
> >>> @@ -63,8 +63,20 @@ odpapiinclude_HEADERS = \
> >>>   $(srcdir)/include/odp/api/time.h \
> >>>   $(srcdir)/include/odp/api/timer.h \
> >>>   $(srcdir)/include/odp/api/traffic_mngr.h \
> >>> - $(srcdir)/include/odp/api/version.h \
> >>> - $(srcdir)/arch/@ARCH_DIR@/odp/api/cpu_arch.h
> >>> + $(srcdir)/include/odp/api/version.h
> >>> +
> >>> +if ARCH_IS_ARM
> >>> +odpapiinclude_HEADERS += $(srcdir)/arch/arm/odp/api/cpu_arch.h
> >>> +endif
> >>> +if ARCH_IS_MIPS64
> >>> +odpapiinclude_HEADERS += $(srcdir)/arch/mips64/odp/api/cpu_arch.h
> >>> +endif
> >>> +if ARCH_IS_POWERPC
> >>> +odpapiinclude_HEADERS += $(srcdir)/arch/powerpc/odp/api/cpu_arch.h
> >>> +endif
> >>> +if ARCH_IS_X86
> >>> +odpapiinclude_HEADERS += $(srcdir)/arch/x86/odp/api/cpu_arch.h
> >>> +endif
> >>>
> >>>
> >>
> >> If - else is better to be her

Re: [lng-odp] [API-NEXT PATCH v4] timer: allow timer processing to run on worker cores

2017-06-22 Thread Brian Brooks
On 06/22 18:30:47, Maxim Uvarov wrote:
> On 06/22/17 17:55, Brian Brooks wrote:
> > On 06/22 10:27:01, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> >> I was asking to make sure that performance impact has been checked also
> >> when timers are not used, e.g. l2fwd performance before and after the
> >> change. It would also be appropriate to test impact in the worst case:
> >> l2fwd type application + a periodic 1sec timeout. Timer is on, but
> >> timeouts come very infrequently (compared to packets).
> >>
> >> It seems that no performance tests were run, although the change affects
> >> performance of many applications (e.g. OFP has high packet rate with
> >> timers). Configuration options should be set with defaults that are an
> >> acceptable trade-off between packet processing performance and timeout
> >> accuracy.
> > 
> > If timers are not used, the overhead is just checking a RO variable
> > (post global init). If timers are used, CONFIG_ parameters have been
> > provided. The defaults for these parameters came from the work to
> > drastically reduce jitter of timer processing which is documented
> > here [1] and presented at Linaro Connect here [2].
> > 
> > If you speculate that these defaults might need to be changed, e.g.
> > l2fwd, we welcome collaboration and data. But, this is not a blocking
> > issue for this patch right now.
> > 
> > [1] https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing
> > [2] http://connect.linaro.org/resource/bud17/bud17-320/
> > 
> 
> 1) we have all adjustable configs here
> ./platform/linux-generic/include/odp_config_internal.h
> maybe these also need to be there.

We had to move the scalable scheduler CONFIG_ parameters into
include/odp_schedule_scalable_config.h because it was decided that placing
component-specific CONFIG_ parameters in odp_config_internal.h was not allowed.

> 2) Do we need something special in CI to check different config values?

No, the two CONFIG_ in this patch are related to timing. So, they do not
affect things like conditional compilation or enable/disable functionality.

> 3) Why it's compile time config values and not run time?

It is simpler.

> Maxim.
> 
> 
> >> -Petri
> >>
> >>
> >> From: Maxim Uvarov [mailto:maxim.uva...@linaro.org] 
> >> Sent: Thursday, June 22, 2017 11:22 AM
> >> To: Honnappa Nagarahalli <honnappa.nagaraha...@linaro.org>
> >> Cc: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com>; 
> >> lng-odp-forward <lng-odp@lists.linaro.org>
> >> Subject: Re: [lng-odp] [API-NEXT PATCH v4] timer: allow timer processing 
> >> to run on worker cores
> >>
> >> Petri, do you want to test performance before patch inclusion?
> >> Maxim.
> >>
> >> On 21 June 2017 at 21:52, Honnappa Nagarahalli 
> >> <mailto:honnappa.nagaraha...@linaro.org> wrote:
> >> We have not run any performance application. In our Linaro connect
> >> meeting, we presented numbers on how it improves the timer resolution.
> >> At this point, there is enough configuration options to control the
> >> effect of calling timer in the scheduler. For applications that do not
> >> want to use the timer, there should not be any change. For
> >> applications that use timers non-frequently, the check frequency can
> >> be controlled via the provided configuration options.
> >>
> >> On 20 June 2017 at 02:34, Savolainen, Petri (Nokia - FI/Espoo)
> >> <mailto:petri.savolai...@nokia.com> wrote:
> >>> Do you have some performance numbers? E.g. how much this slows down an 
> >>> application which does not use timers (e.g. l2fwd), or an application 
> >>> that uses only few, non-frequent timeouts?
> >>>
> >>> Additionally, init.h/feature.h is not yet in api-next - so this would not 
> >>> build yet.
> >>>
> >>>
> >>> -Petri
> >>>
> >>>
> >>>> -Original Message-
> >>>> From: lng-odp [mailto:mailto:lng-odp-boun...@lists.linaro.org] On Behalf 
> >>>> Of
> >>>> Honnappa Nagarahalli
> >>>> Sent: Tuesday, June 20, 2017 7:07 AM
> >>>> To: Bill Fischofer <mailto:bill.fischo...@linaro.org>
> >>>> Cc: lng-odp-forward <mailto:lng-odp@lists.linaro.org>
> >>>> Subject: Re: [lng-odp] [API-NEXT PATCH v4] timer: allow timer processing
> >>>> to run on worker cores
> >>>>
> >>>> Are you saying we should be good to merge this now?
> >>>>
> >>>> On 19 June 2017 at 17:42, Bill Fischofer 
> >>>> <mailto:bill.fischo...@linaro.org>
> >>>> wrote:
> >>>>> On Mon, Jun 19, 2017 at 4:19 PM, Honnappa Nagarahalli
> >>>>> <mailto:honnappa.nagaraha...@linaro.org> wrote:
> >>>>>> Hi Bill/Maxim,
> >>>>>>  I do not see any further comments, can we merge this to api-next?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Honnappa
> >>>
> >>>
> >>
> 


Re: [lng-odp] [API-NEXT PATCH v4] timer: allow timer processing to run on worker cores

2017-06-22 Thread Brian Brooks
On 06/22 10:27:01, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> I was asking to make sure that performance impact has been checked also when
> timers are not used, e.g. l2fwd performance before and after the change. It
> would also be appropriate to test impact in the worst case: l2fwd type
> application + a periodic 1sec timeout. Timer is on, but timeouts come very
> infrequently (compared to packets).
> 
> It seems that no performance tests were run, although the change affects
> performance of many applications (e.g. OFP has high packet rate with timers).
> Configuration options should be set with defaults that are an acceptable
> trade-off between packet processing performance and timeout accuracy.

If timers are not used, the overhead is just checking a RO variable
(post global init). If timers are used, CONFIG_ parameters have been
provided. The defaults for these parameters came from the work to
drastically reduce jitter of timer processing which is documented
here [1] and presented at Linaro Connect here [2].

If you speculate that these defaults might need to be changed, e.g.
l2fwd, we welcome collaboration and data. But, this is not a blocking
issue for this patch right now.

[1] https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing
[2] http://connect.linaro.org/resource/bud17/bud17-320/
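
As a rough illustration of the overhead described above (not the code in the
patch): the flag, the interval macro and the function names below are all
assumptions made up for this sketch, and the interval is counted in schedule
iterations only to keep the example short. The actual CONFIG_ parameters are
timing-related and may work differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed tunable; the patch's actual CONFIG_ names and defaults differ. */
#define EXAMPLE_TIMER_POLL_INTERVAL 4

/* Written once during global init, read-only afterwards. */
static bool timers_in_use;

/* Counts how often timer processing ran (stand-in for real work). */
static uint64_t timer_polls;

static void example_global_init(bool use_timers)
{
	timers_in_use = use_timers;
}

/* Called on every schedule iteration. */
static inline void example_schedule_hook(uint64_t iteration)
{
	if (!timers_in_use)
		return; /* the only cost when timers are unused */

	if (iteration % EXAMPLE_TIMER_POLL_INTERVAL == 0)
		timer_polls++; /* stand-in for real timer processing */
}
```

With timers unused the hook is one load and one predictable branch. With timers
in use, the interval bounds how often the more expensive path runs.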

> -Petri
> 
> 
> From: Maxim Uvarov [mailto:maxim.uva...@linaro.org] 
> Sent: Thursday, June 22, 2017 11:22 AM
> To: Honnappa Nagarahalli 
> Cc: Savolainen, Petri (Nokia - FI/Espoo) ; 
> lng-odp-forward 
> Subject: Re: [lng-odp] [API-NEXT PATCH v4] timer: allow timer processing to 
> run on worker cores
> 
> Petri, do you want to test performance before patch inclusion?
> Maxim.
> 
> On 21 June 2017 at 21:52, Honnappa Nagarahalli 
>  wrote:
> We have not run any performance application. In our Linaro connect
> meeting, we presented numbers on how it improves the timer resolution.
> At this point, there is enough configuration options to control the
> effect of calling timer in the scheduler. For applications that do not
> want to use the timer, there should not be any change. For
> applications that use timers non-frequently, the check frequency can
> be controlled via the provided configuration options.
> 
> On 20 June 2017 at 02:34, Savolainen, Petri (Nokia - FI/Espoo)
>  wrote:
> > Do you have some performance numbers? E.g. how much this slows down an 
> > application which does not use timers (e.g. l2fwd), or an application that 
> > uses only few, non-frequent timeouts?
> >
> > Additionally, init.h/feature.h is not yet in api-next - so this would not 
> > build yet.
> >
> >
> > -Petri
> >
> >
> >> -Original Message-
> >> From: lng-odp [mailto:mailto:lng-odp-boun...@lists.linaro.org] On Behalf Of
> >> Honnappa Nagarahalli
> >> Sent: Tuesday, June 20, 2017 7:07 AM
> >> To: Bill Fischofer 
> >> Cc: lng-odp-forward 
> >> Subject: Re: [lng-odp] [API-NEXT PATCH v4] timer: allow timer processing
> >> to run on worker cores
> >>
> >> Are you saying we should be good to merge this now?
> >>
> >> On 19 June 2017 at 17:42, Bill Fischofer 
> >> wrote:
> >> > On Mon, Jun 19, 2017 at 4:19 PM, Honnappa Nagarahalli
> >> >  wrote:
> >> >> Hi Bill/Maxim,
> >> >>     I do not see any further comments, can we merge this to api-next?
> >> >>
> >> >> Thanks,
> >> >> Honnappa
> >
> >
> 


Re: [lng-odp] [PATCH] build: fix conditional compilation of sources

2017-06-22 Thread Brian Brooks
On 06/22 11:13:57, Maxim Uvarov wrote:
> On 22 June 2017 at 06:24, Brian Brooks <brian.bro...@arm.com> wrote:
> 
> > Explicitly add all arch/<arch>/* files to respective _SOURCES
> > variables instead of using @ARCH_DIR@ substitution.
> >
> > This patch fixes the broken build for ARM, PPC, and MIPS
> > introduced by [1] and the similar issue reported while
> > testing [2].
> >
> > From the Automake manual [3]:
> >
> >   You can't put a configure substitution (e.g., '@FOO@' or
> >   '$(FOO)' where FOO is defined via AC_SUBST) into a _SOURCES
> >   variable. The reason for this is a bit hard to explain, but
> >   suffice to say that it simply won't work.
> >
> > Here be dragons..
> >
> > [1] https://lists.linaro.org/pipermail/lng-odp/2017-April/030324.html
> > [2] https://lists.linaro.org/pipermail/lng-odp/2017-June/031598.html
> > [3] https://www.gnu.org/software/automake/manual/html_node/Conditional-Sources.html
> >
> > Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> > ---
> >  configure.ac   |  3 +++
> >  platform/linux-generic/Makefile.am | 40 ++
> > 
> >  2 files changed, 35 insertions(+), 8 deletions(-)
> >
> > diff --git a/configure.ac b/configure.ac
> > index 46c7bbd2..45812f66 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -225,6 +225,9 @@ AM_CONDITIONAL([HAVE_DOXYGEN], [test "x${DOXYGEN}" =
> > "xdoxygen"])
> >  AM_CONDITIONAL([user_guide], [test "x${user_guides}" = "xyes" ])
> >  AM_CONDITIONAL([HAVE_MSCGEN], [test "x${MSCGEN}" = "xmscgen"])
> >  AM_CONDITIONAL([helper_linux], [test x$helper_linux = xyes ])
> > +AM_CONDITIONAL([ARCH_IS_ARM], [test "x${ARCH_DIR}" = "xarm"])
> > +AM_CONDITIONAL([ARCH_IS_MIPS64], [test "x${ARCH_DIR}" = "xmips64"])
> > +AM_CONDITIONAL([ARCH_IS_POWERPC], [test "x${ARCH_DIR}" = "xpowerpc"])
> >  AM_CONDITIONAL([ARCH_IS_X86], [test "x${ARCH_DIR}" = "xx86"])
> >
> >  
> > ##
> > diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/
> > Makefile.am
> > index 8dcdebd2..385c5304 100644
> > --- a/platform/linux-generic/Makefile.am
> > +++ b/platform/linux-generic/Makefile.am
> > @@ -63,8 +63,20 @@ odpapiinclude_HEADERS = \
> >   $(srcdir)/include/odp/api/time.h \
> >   $(srcdir)/include/odp/api/timer.h \
> >   $(srcdir)/include/odp/api/traffic_mngr.h \
> > - $(srcdir)/include/odp/api/version.h \
> > - $(srcdir)/arch/@ARCH_DIR@/odp/api/cpu_arch.h
> > + $(srcdir)/include/odp/api/version.h
> > +
> > +if ARCH_IS_ARM
> > +odpapiinclude_HEADERS += $(srcdir)/arch/arm/odp/api/cpu_arch.h
> > +endif
> > +if ARCH_IS_MIPS64
> > +odpapiinclude_HEADERS += $(srcdir)/arch/mips64/odp/api/cpu_arch.h
> > +endif
> > +if ARCH_IS_POWERPC
> > +odpapiinclude_HEADERS += $(srcdir)/arch/powerpc/odp/api/cpu_arch.h
> > +endif
> > +if ARCH_IS_X86
> > +odpapiinclude_HEADERS += $(srcdir)/arch/x86/odp/api/cpu_arch.h
> > +endif
> >
> >
> 
> If - else is better to be here. Something like:
> 
> ifeq ($ARCH_IS_ARM), 1)
> ..
> 
> else ifeq ($ARCH_IS_MIPS64, 1)
> 
> else
>  unsupported
> endif
> 
> 
> It will be more nice if it would be:
> ifeq ($ARCH, arm)
> ..
> else ifeq ($ARCH, mips)

Can't do this because ifeq, ifneq, ifdef, and ifndef are Makefile conditionals,
not Automake conditionals.

>  odpapiplatincludedir= $(includedir)/odp/api/plat
> >  odpapiplatinclude_HEADERS = \
> > @@ -217,20 +229,32 @@ __LIB__libodp_linux_la_SOURCES = \
> >odp_timer_wheel.c \
> >odp_traffic_mngr.c \
> >odp_version.c \
> > -  odp_weak.c \
> > -  arch/@ARCH_DIR@/odp_cpu_arch.c \
> > -  arch/@ARCH_DIR@/odp_sysinfo_parse.c
> > -
> > -__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
> > +  odp_weak.c
> >
> > +if ARCH_IS_ARM
> > +__LIB__libodp_linux_la_SOURCES += arch/arm/odp_cpu_arch.c \
> > + arch/arm/odp_sysinfo_parse.c
> > +endif
> > +if ARCH_IS_MIPS64
> > +__LIB__libodp_linux_la_SOURCES += arch/mips64/odp_cpu_arch.c \
> &

Re: [lng-odp] [PATCH API-NEXT v1 1/1] linux-generic: crypto: adapt HMAC code to OpenSSL 1.1.x

2017-06-22 Thread Brian Brooks
On 06/15 11:00:07, Github ODP bot wrote:
> From: Dmitry Eremin-Solenikov <dmitry.ereminsoleni...@linaro.org>
> 
> OpenSSL 1.1.x has changed HMAC API in an incompatible way. Let's adapt
> to it by providing version-dependent wrapper around HMAC calculation.

I am using OpenSSL 1.1.x on some machines.

Reviewed-and-tested-by: Brian Brooks <brian.bro...@arm.com>

> Signed-off-by: Dmitry Eremin-Solenikov <dmitry.ereminsoleni...@linaro.org>
> ---
> /** Email created from pull request 51 (lumag:hmac-1.1.x)
>  ** https://github.com/Linaro/odp/pull/51
>  ** Patch: https://github.com/Linaro/odp/pull/51.patch
>  ** Base sha: 4f97e500a097928e308a415c32a88465adc5f5cc
>  ** Merge commit sha: 31e0b980e18e6b8761600e5ab0f4aadbf88bdbac
>  **/
>  platform/linux-generic/odp_crypto.c | 43 
> +
>  1 file changed, 34 insertions(+), 9 deletions(-)
> 
> diff --git a/platform/linux-generic/odp_crypto.c 
> b/platform/linux-generic/odp_crypto.c
> index 6fc1907d..68fc5658 100644
> --- a/platform/linux-generic/odp_crypto.c
> +++ b/platform/linux-generic/odp_crypto.c
> @@ -128,20 +128,18 @@ null_crypto_routine(odp_crypto_op_param_t *param 
> ODP_UNUSED,
>  }
>  
>  static
> -void packet_hmac(odp_crypto_op_param_t *param,
> -  odp_crypto_generic_session_t *session,
> -  uint8_t *hash)
> +void packet_hmac_calculate(HMAC_CTX *ctx,
> +odp_crypto_op_param_t *param,
> +odp_crypto_generic_session_t *session,
> +uint8_t *hash)
>  {
>   odp_packet_t pkt = param->out_pkt;
>   uint32_t offset = param->auth_range.offset;
>   uint32_t len   = param->auth_range.length;
> - HMAC_CTX ctx;
>  
>   ODP_ASSERT(offset + len <= odp_packet_len(pkt));
>  
> - /* Hash it */
> - HMAC_CTX_init(&ctx);
> - HMAC_Init_ex(&ctx,
> + HMAC_Init_ex(ctx,
>session->auth.key,
>session->auth.key_length,
>session->auth.evp_md,
> @@ -152,14 +150,41 @@ void packet_hmac(odp_crypto_op_param_t *param,
>   void *mapaddr = odp_packet_offset(pkt, offset, &seglen, NULL);
>   uint32_t maclen = len > seglen ? seglen : len;
>  
> - HMAC_Update(&ctx, mapaddr, maclen);
> + HMAC_Update(ctx, mapaddr, maclen);
>   offset  += maclen;
>   len -= maclen;
>   }
>  
> - HMAC_Final(&ctx, hash, NULL);
> + HMAC_Final(ctx, hash, NULL);
> +}
> +
> +#if OPENSSL_VERSION_NUMBER < 0x10100000L
> +static
> +void packet_hmac(odp_crypto_op_param_t *param,
> +  odp_crypto_generic_session_t *session,
> +  uint8_t *hash)
> +{
> + HMAC_CTX ctx;
> +
> + /* Hash it */
> + HMAC_CTX_init(&ctx);
> + packet_hmac_calculate(&ctx, param, session, hash);
>   HMAC_CTX_cleanup(&ctx);
>  }
> +#else
> +static
> +void packet_hmac(odp_crypto_op_param_t *param,
> +  odp_crypto_generic_session_t *session,
> +  uint8_t *hash)
> +{
> + HMAC_CTX *ctx;
> +
> + /* Hash it */
> + ctx = HMAC_CTX_new();
> + packet_hmac_calculate(ctx, param, session, hash);
> + HMAC_CTX_free(ctx);
> +}
> +#endif
>  
>  static
>  odp_crypto_alg_err_t auth_gen(odp_crypto_op_param_t *param,
> 


[lng-odp] [PATCH] build: fix conditional compilation of sources

2017-06-21 Thread Brian Brooks
Explicitly add all arch/<arch>/* files to respective _SOURCES
variables instead of using @ARCH_DIR@ substitution.

This patch fixes the broken build for ARM, PPC, and MIPS
introduced by [1] and the similar issue reported while
testing [2].

From the Automake manual [3]:

  You can't put a configure substitution (e.g., '@FOO@' or
  '$(FOO)' where FOO is defined via AC_SUBST) into a _SOURCES
  variable. The reason for this is a bit hard to explain, but
  suffice to say that it simply won't work.

Here be dragons..

[1] https://lists.linaro.org/pipermail/lng-odp/2017-April/030324.html
[2] https://lists.linaro.org/pipermail/lng-odp/2017-June/031598.html
[3] https://www.gnu.org/software/automake/manual/html_node/Conditional-Sources.html

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 configure.ac   |  3 +++
 platform/linux-generic/Makefile.am | 40 ++
 2 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/configure.ac b/configure.ac
index 46c7bbd2..45812f66 100644
--- a/configure.ac
+++ b/configure.ac
@@ -225,6 +225,9 @@ AM_CONDITIONAL([HAVE_DOXYGEN], [test "x${DOXYGEN}" = "xdoxygen"])
 AM_CONDITIONAL([user_guide], [test "x${user_guides}" = "xyes" ])
 AM_CONDITIONAL([HAVE_MSCGEN], [test "x${MSCGEN}" = "xmscgen"])
 AM_CONDITIONAL([helper_linux], [test x$helper_linux = xyes ])
+AM_CONDITIONAL([ARCH_IS_ARM], [test "x${ARCH_DIR}" = "xarm"])
+AM_CONDITIONAL([ARCH_IS_MIPS64], [test "x${ARCH_DIR}" = "xmips64"])
+AM_CONDITIONAL([ARCH_IS_POWERPC], [test "x${ARCH_DIR}" = "xpowerpc"])
 AM_CONDITIONAL([ARCH_IS_X86], [test "x${ARCH_DIR}" = "xx86"])
 
 ##
diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 8dcdebd2..385c5304 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -63,8 +63,20 @@ odpapiinclude_HEADERS = \
  $(srcdir)/include/odp/api/time.h \
  $(srcdir)/include/odp/api/timer.h \
  $(srcdir)/include/odp/api/traffic_mngr.h \
- $(srcdir)/include/odp/api/version.h \
- $(srcdir)/arch/@ARCH_DIR@/odp/api/cpu_arch.h
+ $(srcdir)/include/odp/api/version.h
+
+if ARCH_IS_ARM
+odpapiinclude_HEADERS += $(srcdir)/arch/arm/odp/api/cpu_arch.h
+endif
+if ARCH_IS_MIPS64
+odpapiinclude_HEADERS += $(srcdir)/arch/mips64/odp/api/cpu_arch.h
+endif
+if ARCH_IS_POWERPC
+odpapiinclude_HEADERS += $(srcdir)/arch/powerpc/odp/api/cpu_arch.h
+endif
+if ARCH_IS_X86
+odpapiinclude_HEADERS += $(srcdir)/arch/x86/odp/api/cpu_arch.h
+endif
 
 odpapiplatincludedir= $(includedir)/odp/api/plat
 odpapiplatinclude_HEADERS = \
@@ -217,20 +229,32 @@ __LIB__libodp_linux_la_SOURCES = \
   odp_timer_wheel.c \
   odp_traffic_mngr.c \
   odp_version.c \
-  odp_weak.c \
-  arch/@ARCH_DIR@/odp_cpu_arch.c \
-  arch/@ARCH_DIR@/odp_sysinfo_parse.c
-
-__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
+  odp_weak.c
 
+if ARCH_IS_ARM
+__LIB__libodp_linux_la_SOURCES += arch/arm/odp_cpu_arch.c \
+ arch/arm/odp_sysinfo_parse.c
+endif
+if ARCH_IS_MIPS64
+__LIB__libodp_linux_la_SOURCES += arch/mips64/odp_cpu_arch.c \
+ arch/mips64/odp_sysinfo_parse.c
+endif
+if ARCH_IS_POWERPC
+__LIB__libodp_linux_la_SOURCES += arch/powerpc/odp_cpu_arch.c \
+ arch/powerpc/odp_sysinfo_parse.c
+endif
 if ARCH_IS_X86
-__LIB__libodp_linux_la_SOURCES += arch/@ARCH_DIR@/cpu_flags.c
+__LIB__libodp_linux_la_SOURCES += arch/x86/odp_cpu_arch.c \
+ arch/x86/odp_sysinfo_parse.c \
+ arch/x86/cpu_flags.c
 endif
 
 if HAVE_PCAP
 __LIB__libodp_linux_la_SOURCES += pktio/pcap.c
 endif
 
+__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
+
 # Create symlink for ABI header files. Application does not need to use the arch
 # specific include path for installed files.
 install-data-hook:
-- 
2.13.1



Re: [lng-odp] [PATCH] linux-gen: time: fix ARM compile for GCC 4.8

2017-06-21 Thread Brian Brooks
#if -> #ifdef
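
To spell that out: "#if __aarch64__" evaluates an identifier that 32-bit ARM
compilers never define, which is what -Werror=undef rejects in the log below.
"#ifdef" only tests whether the macro exists. A minimal sketch (the function
name is made up for illustration and is not in odp_cpu_arch.c):

```c
/* Returns 1 when compiled for AArch64, 0 otherwise.
 * "#if __aarch64__" would evaluate an undefined macro on 32-bit ARM,
 * which -Wundef reports (and -Werror=undef turns into an error).
 * "#ifdef" only asks whether the macro is defined, so it is safe on
 * every target. */
static inline int built_for_aarch64(void)
{
#ifdef __aarch64__
	return 1;
#else
	return 0;
#endif
}
```

An equivalent spelling is "#if defined(__aarch64__)", which is handy when the
test has to be combined with other conditions.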

On Wed, Jun 21, 2017 at 3:27 PM, Maxim Uvarov  wrote:
> odp check with ARCH=arm fails after this patch:
>
>   CC   arch/arm/odp_cpu_arch.lo
> arch/arm/odp_cpu_arch.c: In function 'cpu_global_time':
> arch/arm/odp_cpu_arch.c:71:5: error: "__aarch64__" is not defined
> [-Werror=undef]
>  #if __aarch64__
>  ^
> arch/arm/odp_cpu_arch.c: In function 'cpu_global_time_freq':
> arch/arm/odp_cpu_arch.c:91:5: error: "__aarch64__" is not defined
> [-Werror=undef]
>  #if __aarch64__
>  ^
> cc1: all warnings being treated as errors
>
> implementation_name:odp-linux
> host:   arm-unknown-linux-gnueabihf
> ARCH_DIRarm
> ARCH_ABIarm32-linux
> with_platform:  linux-generic
> helper_linux:   no
> prefix: /opt/Linaro/check-odp-v3.git/new-build
> sysconfdir: ${prefix}/etc
> libdir: ${exec_prefix}/lib
> includedir: ${prefix}/include
> testdir:${exec_prefix}/lib/odp/tests
> WITH_ARCH:  arm
>
> cc: arm-linux-gnueabihf-gcc
> cc version: 5.3.1
>
>
>
>
> On 06/21/17 15:42, Bill Fischofer wrote:
>> I've confirmed this is benign on x86. Brian: Please review for ARM.
>>
>> On Wed, Jun 21, 2017 at 6:48 AM, Petri Savolainen
>>  wrote:
>>> Use __aarch64__ instead of __ARM_ARCH, since it's backwards
>>> compatible between GCC versions.
>>>
>>> Fixes bug https://bugs.linaro.org/show_bug.cgi?id=3066
>>>
>>> Signed-off-by: Petri Savolainen 
>>> ---
>>>  platform/linux-generic/arch/arm/odp_cpu_arch.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/platform/linux-generic/arch/arm/odp_cpu_arch.c 
>>> b/platform/linux-generic/arch/arm/odp_cpu_arch.c
>>> index 91d439d9..fd1b3ed6 100644
>>> --- a/platform/linux-generic/arch/arm/odp_cpu_arch.c
>>> +++ b/platform/linux-generic/arch/arm/odp_cpu_arch.c
>>> @@ -68,7 +68,7 @@ int cpu_has_global_time(void)
>>>
>>>  uint64_t cpu_global_time(void)
>>>  {
>>> -#if __ARM_ARCH == 8
>>> +#if __aarch64__
>>> uint64_t cntvct;
>>>
>>> /*
>>> @@ -88,7 +88,7 @@ uint64_t cpu_global_time(void)
>>>
>>>  uint64_t cpu_global_time_freq(void)
>>>  {
>>> -#if __ARM_ARCH == 8
>>> +#if __aarch64__
>>> uint64_t cntfrq;
>>>
>>> __asm__ volatile("mrs %0, cntfrq_el0" : "=r"(cntfrq) : : );
>>> --
>>> 2.13.0
>>>
>


Re: [lng-odp] [API-NEXT PATCH v9 0/6] A scalable software scheduler

2017-06-21 Thread Brian Brooks
On 06/20 19:38:59, Bill Fischofer wrote:
> Now that master has been merged back into api-next I can confirm that
> make distcheck fails for 64-bit systems as well, so I need to withdraw
> my earlier reviewed-and-tested-by until that is fixed.

This appears to be a problem in pristine upstream as well. "make distcheck"
is broken on ARM (and likely all non-x86 archs).

The issue is that in platform/linux-generic/Makefile.am "if ARCH_IS_X86"
is always true. I have not yet found a workaround..

> On Mon, Jun 19, 2017 at 6:13 PM, Bill Fischofer
> <bill.fischo...@linaro.org> wrote:
> > Looks like I posted a wee bit too soon. On a 32-bit system:
> >
> > bill@Ub16-32:~/linaro/armschedv9$ make distcheck
> > make  dist-gzip am__post_remove_distdir='@:'
> > make[1]: Entering directory '/home/bill/linaro/armschedv9'
> > if test -d "opendataplane-1.14.0.0"; then find
> > "opendataplane-1.14.0.0" -type d ! -perm -200 -exec chmod u+w {} ';'
> > && rm -rf "opendataplane-1.14.0.0" || { sleep 5 && rm -rf
> > "opendataplane-1.14.0.0"; }; else :; fi
> > test -d "opendataplane-1.14.0.0" || mkdir "opendataplane-1.14.0.0"
> >  (cd platform/linux-generic && make
> > top_distdir=../../opendataplane-1.14.0.0
> > distdir=../../opendataplane-1.14.0.0/platform/linux-generic \
> >  am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: 
> > distdir)
> > make[2]: Entering directory
> > '/home/bill/linaro/armschedv9/platform/linux-generic'
> > make[2]: *** No rule to make target 'arch/x86/odp_atomic.h', needed by
> > 'distdir'.  Stop.
> > make[2]: Leaving directory 
> > '/home/bill/linaro/armschedv9/platform/linux-generic'
> > Makefile:603: recipe for target 'distdir' failed
> > make[1]: *** [distdir] Error 1
> > make[1]: Leaving directory '/home/bill/linaro/armschedv9'
> > Makefile:702: recipe for target 'dist' failed
> > make: *** [dist] Error 2
> >
> > On Mon, Jun 19, 2017 at 6:11 PM, Bill Fischofer
> > <bill.fischo...@linaro.org> wrote:
> >> For the v9 series:
> >>
> >> Reviewed-and-tested-by: Bill Fischofer <bill.fischo...@linaro.org>
> >>
> >> I also verified that there are no conflicts between this series and
> >> Petri's queue cleanup patch, so this can apply and run just fine on
> >> top of it. Maxim should be able to merge both tomorrow.
> >>
> >> On Mon, Jun 19, 2017 at 2:12 PM, Brian Brooks <brian.bro...@arm.com> wrote:
> >>> This work derives from Ola Liljedahl's prototype [1] which introduced a
> >>> scalable scheduler design based on primarily lock-free algorithms and
> >>> data structures designed to decrease contention. A thread searches
> >>> through a data structure containing only queues that are both non-empty
> >>> and allowed to be scheduled to that thread. Strict priority scheduling is
> >>> respected, and (W)RR scheduling may be used within queues of the same 
> >>> priority.
> >>> Lastly, pre-scheduling or stashing is not employed since it is optional
> >>> functionality that can be implemented in the application.
> >>>
> >>> In addition to scalable ring buffers, the algorithm also uses unbounded
> >>> concurrent queues. LL/SC and CAS variants exist in cases where absence of
> >>> the ABA problem cannot be proved, and also in cases where the compiler's
> >>> atomic built-ins may not be lowered to the desired instruction(s). Finally,
> >>> a version of the algorithm that uses locks is also provided.
> >>>
> >>> Use --enable-schedule-scalable to conditionally compile this scheduler
> >>> into the library.
> >>>
> >>> [1] https://lists.linaro.org/pipermail/lng-odp/2016-September/025682.html
> >>>
> >>> On checkpatch.pl:
> >>>  - [2/6] and [5/6] have checkpatch.pl issues that are superfluous
> >>>
> >>> v9:
> >>>  - Include patch to enable scalable scheduler in Travis CI
> >>>  - Fix 'make distcheck'
> >>>
> >>> v8:
> >>>  - Reword commit messages
> >>>
> >>> v7:
> >>>  - Rebase against new modular queue interface
> >>>  - Duplicate arch files under mips64 and powerpc
> >>>  - Fix sched->order_lock()
> >>>  - Loop until all deferred events have been enqueued
> >>>  - Implement ord_enq_multi()
> >>>  - Fix ordere

Re: [lng-odp] [PATCH] linux-gen: time: fix ARM compile for GCC 4.8

2017-06-21 Thread Brian Brooks
Reviewed-by: Brian Brooks <brian.bro...@arm.com>

ACLE [1] support was added in GCC 4.9. GCC 4.8 does define __aarch64__,
so this change should be safe.

[1] 
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053c/IHI0053C_acle_2_0.pdf
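
For illustration only, the difference between the two guards can be
sketched as below. This is not ODP code: the helper names
acle_arch_version() and target_is_aarch64() are hypothetical. The sketch
relies on the fact that an undefined macro evaluates to 0 inside #if, so
"#if __ARM_ARCH == 8" is silently false on a compiler that does not
provide the ACLE macros, whereas __aarch64__ is predefined by the
compiler for AArch64 targets.

```c
#include <assert.h>

/* Hypothetical helper: value of the ACLE macro, or 0 when the compiler
 * does not provide it (pre-ACLE compiler or non-ARM target). */
static int acle_arch_version(void)
{
#ifdef __ARM_ARCH
	return __ARM_ARCH;
#else
	return 0;
#endif
}

/* Hypothetical helper: 1 when compiled for AArch64, 0 otherwise.
 * The guard does not depend on ACLE macros being available. */
static int target_is_aarch64(void)
{
#if defined(__aarch64__)
	/* If the ACLE macro is present at all, it must report v8 or later */
	assert(acle_arch_version() == 0 || acle_arch_version() >= 8);
	return 1;
#else
	return 0;
#endif
}
```

With a pre-ACLE compiler targeting AArch64, acle_arch_version() would
return 0 while target_is_aarch64() still returns 1, which is the case
the patch addresses.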

On 06/21 14:48:36, Petri Savolainen wrote:
> Use __aarch64__ instead of __ARM_ARCH, since it's backwards
> compatible between GCC versions.
> 
> Fixes bug https://bugs.linaro.org/show_bug.cgi?id=3066
> 
> Signed-off-by: Petri Savolainen <petri.savolai...@linaro.org>
> ---
>  platform/linux-generic/arch/arm/odp_cpu_arch.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/platform/linux-generic/arch/arm/odp_cpu_arch.c b/platform/linux-generic/arch/arm/odp_cpu_arch.c
> index 91d439d9..fd1b3ed6 100644
> --- a/platform/linux-generic/arch/arm/odp_cpu_arch.c
> +++ b/platform/linux-generic/arch/arm/odp_cpu_arch.c
> @@ -68,7 +68,7 @@ int cpu_has_global_time(void)
>  
>  uint64_t cpu_global_time(void)
>  {
> -#if __ARM_ARCH == 8
> +#if __aarch64__
>   uint64_t cntvct;
>  
>   /*
> @@ -88,7 +88,7 @@ uint64_t cpu_global_time(void)
>  
>  uint64_t cpu_global_time_freq(void)
>  {
> -#if __ARM_ARCH == 8
> +#if __aarch64__
>   uint64_t cntfrq;
>  
>   __asm__ volatile("mrs %0, cntfrq_el0" : "=r"(cntfrq) : : );
> -- 
> 2.13.0
> 


[lng-odp] [API-NEXT PATCH v9 5/6] linux-gen: sched scalable: add scalable scheduler

2017-06-19 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Kevin Wang <kevin.w...@arm.com>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
---
 platform/linux-generic/Makefile.am |7 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 .../linux-generic/include/odp_config_internal.h|   17 +-
 .../include/odp_queue_scalable_internal.h  |  102 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h |   55 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1020 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1978 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  347 
 14 files changed, 3848 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_queue_scalable_internal.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_config.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_ordered.h
 create mode 100644 platform/linux-generic/odp_queue_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable_ordered.c

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 90cc4ca6..067d3965 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -172,9 +172,13 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_pool_internal.h \
  ${srcdir}/include/odp_posix_extensions.h \
  ${srcdir}/include/odp_queue_internal.h \
+ ${srcdir}/include/odp_queue_scalable_internal.h \
  ${srcdir}/include/odp_ring_internal.h \
  ${srcdir}/include/odp_queue_if.h \
  ${srcdir}/include/odp_schedule_if.h \
+ ${srcdir}/include/odp_schedule_scalable.h \
+ ${srcdir}/include/odp_schedule_scalable_config.h \
+ ${srcdir}/include/odp_schedule_scalable_ordered.h \
  ${srcdir}/include/odp_sorted_list_internal.h \
  ${srcdir}/include/odp_shm_internal.h \
  ${srcdir}/include/odp_time_internal.h \
@@ -237,12 +241,15 @@ __LIB__libodp_linux_la_SOURCES = \
   odp_pool.c \
   odp_queue.c \
   odp_queue_if.c \
+  odp_queue_scalable.c \
   odp_rwlock.c \
   odp_rwlock_recursive.c \
   odp_schedule.c \
   odp_schedule_if.c \
   odp_schedule_sp.c \
   odp_schedule_iquery.c \
+  odp_schedule_scalable.c \
+  odp_schedule_scalable_ordered.c \
   odp_shared_memory.c \
   odp_sorted_list.c \
   odp_spinlock.c \
diff --git a/platform/linux-generic/include/odp/api/plat/schedule_types.h b/platform/linux-generic/include/odp/api/plat/schedule_types.h
index 535fd6d0..4e75f9ee 100644
--- a/platform/linux-generic/include/odp/api/plat/schedule_types.h
+++ b/platform/linux-generic/include/odp/api/plat/schedule_types.h
@@ -18,6 +18,8 @@
 extern "C" {
 #endif
 
+#include 
+
 /** @addtogroup odp_scheduler
  *  @{
  */
@@ -44,7 +46,7 @@ typedef int odp_schedule_sync_t;
 typedef int odp_schedule_group_t;
 
 /* These must be kept in sync with thread_globals_t in odp_thread.c */
-#define ODP_SCHED_GROUP_INVALID -1
+#define ODP_SCHED_GROUP_INVALID ((odp_schedule_group_t)-1)
 #define ODP_SCHED_GROUP_ALL 0
 #define ODP_SCHED_GROUP_WORKER  1
 #define ODP_SCHED_GROUP_CONTROL 2
diff --git a/platform/linux-generic/include/odp_config_internal.h b/platform/linux-generic/include/odp_config_internal.h
index dadd59e7..6cc844f3 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -7,9 +7,7 @@
 #ifndef ODP_CONFIG_INTERNAL_H_
 #define ODP_CONFIG_INTERNAL_H_
 
-#ifdef __cplusplus
-extern "C" {
-#endif
+#include 
 
 /*
  * Maximum number of pools
@@ -22,6 +20,13 @@ extern "C" {
 #define ODP_CONFIG_QUEUES 1024
 
 /*
+ * Maximum queue depth. Maximum number of elements that can be stored in a
+ * queue. This value is used only when the size is not explicitly provided
+ * durin

[lng-odp] [API-NEXT PATCH v9 2/6] linux-gen: sched scalable: add arch files

2017-06-19 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 configure.ac |   1 +
 platform/linux-generic/Makefile.am   |   8 +
 platform/linux-generic/arch/arm/odp_atomic.h | 210 +++
 platform/linux-generic/arch/arm/odp_cpu.h|  65 ++
 platform/linux-generic/arch/arm/odp_cpu_idling.h |  51 +
 platform/linux-generic/arch/arm/odp_llsc.h   | 249 +++
 platform/linux-generic/arch/default/odp_cpu.h|  41 
 platform/linux-generic/arch/mips64/odp_cpu.h |  41 
 platform/linux-generic/arch/powerpc/odp_cpu.h|  41 
 platform/linux-generic/arch/x86/odp_cpu.h|  41 
 10 files changed, 748 insertions(+)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu_idling.h
 create mode 100644 platform/linux-generic/arch/arm/odp_llsc.h
 create mode 100644 platform/linux-generic/arch/default/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/mips64/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/powerpc/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/x86/odp_cpu.h

diff --git a/configure.ac b/configure.ac
index fe36ce16..6f7357bc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -226,6 +226,7 @@ AM_CONDITIONAL([user_guide], [test "x${user_guides}" = "xyes" ])
 AM_CONDITIONAL([HAVE_MSCGEN], [test "x${MSCGEN}" = "xmscgen"])
 AM_CONDITIONAL([helper_linux], [test x$helper_linux = xyes ])
 AM_CONDITIONAL([ARCH_IS_X86], [test "x${ARCH_DIR}" = "xx86"])
+AM_CONDITIONAL([ARCH_IS_ARM], [test "x${ARCH_DIR}" = "xarm"])
 
 ##
 # Setup doxygen documentation
diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 58c73767..4690a650 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -8,6 +8,7 @@ AM_CFLAGS +=  -I$(srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include/odp/arch/@ARCH_ABI@
 AM_CFLAGS +=  -I$(top_builddir)/include
+AM_CFLAGS +=  -I$(top_srcdir)/arch/@ARCH_DIR@
 AM_CFLAGS +=  -Iinclude
 AM_CFLAGS +=  -DSYSCONFDIR=\"@sysconfdir@\"
 AM_CFLAGS +=  -D_ODP_PKTIO_IPC
@@ -183,8 +184,15 @@ noinst_HEADERS = \
  ${srcdir}/include/protocols/ipsec.h \
  ${srcdir}/include/protocols/tcp.h \
  ${srcdir}/include/protocols/udp.h \
+ ${srcdir}/arch/@ARCH_DIR@/odp_cpu.h \
  ${srcdir}/Makefile.inc
 
+if ARCH_IS_ARM
+noinst_HEADERS += ${srcdir}/arch/@ARCH_DIR@/odp_atomic.h \
+ ${srcdir}/arch/@ARCH_DIR@/odp_cpu_idling.h \
+ ${srcdir}/arch/@ARCH_DIR@/odp_llsc.h
+endif
+
 __LIB__libodp_linux_la_SOURCES = \
   _fdserver.c \
   _ishm.c \
diff --git a/platform/linux-generic/arch/arm/odp_atomic.h b/platform/linux-generic/arch/arm/odp_atomic.h
new file mode 100644
index ..0ddd8a11
--- /dev/null
+++ b/platform/linux-generic/arch/arm/odp_atomic.h
@@ -0,0 +1,210 @@
+/* Copyright (c) 2017, ARM Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier:BSD-3-Clause
+ */
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+#define PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_CPU_H
+#error This file should not be included directly, please include odp_cpu.h
+#endif
+
+#ifdef CONFIG_DMBSTR
+
+#define atomic_store_release(loc, val, ro) \
+do {   \
+   _odp_release_barrier(ro);   \
+   __atomic_store_n(loc, val, __ATOMIC_RELAXED);   \
+} while (0)
+
+#else
+
+#define atomic_store_release(loc, val, ro) \
+   __atomic_store_n(loc, val, __ATOMIC_RELEASE)
+
+#endif  /* CONFIG_DMBSTR */
+
+#if __ARM_ARCH == 8
+
+#define HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
+#define HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
+(mo) == __ATOMIC_SEQ_CST)
+
+#define LL_MO(mo) (HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
+#define SC_MO(mo) (HAS_RLS((mo)) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED)
+
+#ifndef __ARM_FEATURE_QRDMX /* Feature only available in v8.1a and beyond */
+static inline bool
+__lockfree_compare_exchange_16(register __int128 *var, __int128 *exp,
+  register __int128 neu, bool weak, int mo_success,
+  int mo_failure)
+{
+   (void)weak; /* Always do strong CAS or we can't perform atomic read */
+   /* Ignore mem

[lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue

2017-06-19 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
---
 platform/linux-generic/Makefile.am   |   1 +
 platform/linux-generic/include/odp_llqueue.h | 309 +++
 2 files changed, 310 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_llqueue.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index b869fd4b..90cc4ca6 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -157,6 +157,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_errno_define.h \
  ${srcdir}/include/odp_forward_typedefs_internal.h \
  ${srcdir}/include/odp_internal.h \
+ ${srcdir}/include/odp_llqueue.h \
  ${srcdir}/include/odp_name_table_internal.h \
  ${srcdir}/include/odp_packet_internal.h \
  ${srcdir}/include/odp_packet_io_internal.h \
diff --git a/platform/linux-generic/include/odp_llqueue.h b/platform/linux-generic/include/odp_llqueue.h
new file mode 100644
index ..758af490
--- /dev/null
+++ b/platform/linux-generic/include/odp_llqueue.h
@@ -0,0 +1,309 @@
+/* Copyright (c) 2017, ARM Limited.
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier:BSD-3-Clause
+ */
+
+#ifndef ODP_LLQUEUE_H_
+#define ODP_LLQUEUE_H_
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+/**
+ * Linked list queues
+ */
+
+struct llqueue;
+struct llnode;
+
+static struct llnode *llq_head(struct llqueue *llq);
+static void llqueue_init(struct llqueue *llq);
+static void llq_enqueue(struct llqueue *llq, struct llnode *node);
+static struct llnode *llq_dequeue(struct llqueue *llq);
+static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
+static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
+static odp_bool_t llq_on_queue(struct llnode *node);
+
+/**
+ * The implementation(s)
+ */
+
+#define SENTINEL ((void *)~(uintptr_t)0)
+
+#ifdef CONFIG_LLDSCD
+/* Implement queue operations using double-word LL/SC */
+
+/* The scalar equivalent of a double pointer */
+#if __SIZEOF_PTRDIFF_T__ == 4
+typedef uint64_t dintptr_t;
+#endif
+#if __SIZEOF_PTRDIFF_T__ == 8
+typedef __int128 dintptr_t;
+#endif
+
+struct llnode {
+   struct llnode *next;
+};
+
+union llht {
+   struct {
+   struct llnode *head, *tail;
+   } st;
+   dintptr_t ui;
+};
+
+struct llqueue {
+   union llht u;
+};
+
+static inline struct llnode *llq_head(struct llqueue *llq)
+{
+   return __atomic_load_n(&llq->u.st.head, __ATOMIC_RELAXED);
+}
+
+static inline void llqueue_init(struct llqueue *llq)
+{
+   llq->u.st.head = NULL;
+   llq->u.st.tail = NULL;
+}
+
+static inline void llq_enqueue(struct llqueue *llq, struct llnode *node)
+{
+   union llht old, neu;
+
+   ODP_ASSERT(node->next == NULL);
+   node->next = SENTINEL;
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   neu.st.head = old.st.head == NULL ? node : old.st.head;
+   neu.st.tail = node;
+   } while (odp_unlikely(scd(&llq->u.ui, neu.ui, __ATOMIC_RELEASE)));
+   if (old.st.tail != NULL) {
+   /* List was not empty */
+   ODP_ASSERT(old.st.tail->next == SENTINEL);
+   old.st.tail->next = node;
+   }
+}
+
+static inline struct llnode *llq_dequeue(struct llqueue *llq)
+{
+   struct llnode *head;
+   union llht old, neu;
+
+   /* llq_dequeue() may be used in a busy-waiting fashion
+* Read head using plain load to avoid disturbing remote LL/SC
+*/
+   head = __atomic_load_n(&llq->u.st.head, __ATOMIC_ACQUIRE);
+   if (head == NULL)
+   return NULL;
+   /* Read head->next before LL to minimize cache miss latency
+* in LL/SC below
+*/
+   (void)__atomic_load_n(&head->next, __ATOMIC_RELAXED);
+
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   if (odp_unlikely(old.st.head == NULL)) {
+   /* Empty list */
+   return NULL;
+   } else if (odp_unlikely(old.st.head == old.st.tail)) {
+   /* Single-element in list */
+   neu.st.head = NULL;
+   neu.st.tail = NULL;
+   } else {
+   /* Multi-element list, dequeue head */
+   struct llnode *next;
+   /* Wait until llq_enqueue() has w

[lng-odp] [API-NEXT PATCH v9 3/6] linux-gen: sched scalable: add a bitset

2017-06-19 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 platform/linux-generic/Makefile.am  |   1 +
 platform/linux-generic/include/odp_bitset.h | 210 
 2 files changed, 211 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_bitset.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 4690a650..b869fd4b 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -146,6 +146,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_atomic_internal.h \
  ${srcdir}/include/odp_buffer_inlines.h \
  ${srcdir}/include/odp_bitmap_internal.h \
+ ${srcdir}/include/odp_bitset.h \
  ${srcdir}/include/odp_buffer_internal.h \
  ${srcdir}/include/odp_classification_datamodel.h \
  ${srcdir}/include/odp_classification_inlines.h \
diff --git a/platform/linux-generic/include/odp_bitset.h b/platform/linux-generic/include/odp_bitset.h
new file mode 100644
index ..69b1a8dc
--- /dev/null
+++ b/platform/linux-generic/include/odp_bitset.h
@@ -0,0 +1,210 @@
+/* Copyright (c) 2017, ARM Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef _ODP_BITSET_H_
+#define _ODP_BITSET_H_
+
+#include 
+
+#include 
+
+/**
+ * bitset abstract data type
+ */
+/* This could be a struct of scalars to support larger bit sets */
+
+/*
+ * Size of atomic bit set. This limits the max number of threads,
+ * scheduler groups and reorder windows. On ARMv8/64-bit and x86-64, the
+ * (lock-free) max is 128
+ */
+
+/* Find a suitable data type that supports lock-free atomic operations */
+#if defined(__ARM_ARCH) &&  __ARM_ARCH == 8 &&  __ARM_64BIT_STATE == 1 && \
+   defined(__SIZEOF_INT128__) && __SIZEOF_INT128__ == 16
+#define LOCKFREE16
+typedef __int128 bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT128__)
+
+#elif __GCC_ATOMIC_LLONG_LOCK_FREE == 2 && \
+   __SIZEOF_LONG_LONG__ != __SIZEOF_LONG__
+typedef unsigned long long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG_LONG__)
+
+#elif __GCC_ATOMIC_LONG_LOCK_FREE == 2 && __SIZEOF_LONG__ != __SIZEOF_INT__
+typedef unsigned long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG__)
+
+#elif __GCC_ATOMIC_INT_LOCK_FREE == 2
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+
+#else
+/* Target does not support lock-free atomic operations */
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+#endif
+
+#if ATOM_BITSET_SIZE <= 32
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1UL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsl(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor32(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 64
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1ULL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsll(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor64(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 128
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   if (bit < 64)
+   return 1ULL << bit;
+   else
+   return (unsigned __int128)(1ULL << (bit - 64)) << 64;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   if ((uint64_t)b != 0)
+   return __builtin_ffsll((uint64_t)b);
+   else if ((b >> 64) != 0)
+   return __builtin_ffsll((uint64_t)(b >> 64)) + 64;
+   else
+   return 0;
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor128(bs, mo);
+}
+
+#else
+#error Unsupported size of bit sets (ATOM_BITSET_SIZE)
+#endif
+
+/* Atomic load with memory ordering */
+static inline bitset_t atom_bitset_load(bitset_t *bs, int mo)
+{
+#ifdef LOCKFREE16
+   return __lockfree_load_16(bs, mo);
+#else
+   return __atomic_load_n(bs, mo);
+#endif
+}
+
+/* Atomic bit set with memory ordering */
+static inline void atom_bitset_set(bitset_t *bs, uint32_t bit, int mo)
+{
+#ifdef LOCKFREE16
+  
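
(The archived patch is truncated above.) As a rough illustration of the
128-bit bitset_mask()/bitset_ffs() pair shown in the patch, here is a
minimal standalone sketch. It is not the patch's code: the names
bitset128_t, bitset128_mask() and bitset128_ffs() are hypothetical, it
assumes a GCC or Clang target that provides __int128, and it uses the
unsigned type so that the right shift is a logical one.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's bitset_t on 64-bit targets */
typedef unsigned __int128 bitset128_t;

/* Return a bit set with only 'bit' set (0 <= bit < 128) */
static inline bitset128_t bitset128_mask(uint32_t bit)
{
	assert(bit < 128);
	return (bitset128_t)1 << bit;
}

/* First-bit-set with ffs() semantics: 1-based index of the least
 * significant set bit, or 0 when no bit is set. */
static inline uint32_t bitset128_ffs(bitset128_t b)
{
	uint64_t lo = (uint64_t)b;         /* bits 0..63 */
	uint64_t hi = (uint64_t)(b >> 64); /* bits 64..127 */

	if (lo != 0)
		return (uint32_t)__builtin_ffsll((long long)lo);
	if (hi != 0)
		return (uint32_t)__builtin_ffsll((long long)hi) + 64;
	return 0;
}
```

The lower half is searched first so that the lowest set bit wins, and
the upper half only contributes when the lower 64 bits are all zero.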

[lng-odp] [API-NEXT PATCH v9 6/6] travis: add scalable scheduler in CI

2017-06-19 Thread Brian Brooks
From: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>

Added running tests with scalable scheduler to CI

Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
---
 .travis.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.travis.yml b/.travis.yml
index 1bc82b3c..8463c9fe 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -61,6 +61,7 @@ env:
 - CONF="--disable-abi-compat"
 - CONF="--enable-schedule-sp"
 - CONF="--enable-schedule-iquery"
+- CONF="--enable-schedule-scalable"
 
 install:
 - echo 1000 | sudo tee /proc/sys/vm/nr_hugepages
-- 
2.13.1



[lng-odp] [API-NEXT PATCH v9 0/6] A scalable software scheduler

2017-06-19 Thread Brian Brooks
This work derives from Ola Liljedahl's prototype [1] which introduced a
scalable scheduler design based on primarily lock-free algorithms and
data structures designed to decrease contention. A thread searches
through a data structure containing only queues that are both non-empty
and allowed to be scheduled to that thread. Strict priority scheduling is
respected, and (W)RR scheduling may be used within queues of the same priority.
Lastly, pre-scheduling or stashing is not employed since it is optional
functionality that can be implemented in the application.

In addition to scalable ring buffers, the algorithm also uses unbounded
concurrent queues. LL/SC and CAS variants exist in cases where absence of
the ABA problem cannot be proved, and also in cases where the compiler's atomic
built-ins may not be lowered to the desired instruction(s). Finally, a version
of the algorithm that uses locks is also provided.
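
To make the CAS variant concrete, a retry loop built on the GCC/Clang
__atomic built-ins might look like the sketch below. This is an
illustration under assumptions, not code from the series: the helper
name cas_bit_set() is hypothetical, and it operates on a plain 64-bit
word rather than on the scheduler's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: atomically set one bit in *word using a CAS
 * retry loop and return the value observed before the update. */
static inline uint64_t cas_bit_set(uint64_t *word, uint32_t bit)
{
	uint64_t old, neu;

	assert(bit < 64);
	old = __atomic_load_n(word, __ATOMIC_RELAXED);
	do {
		neu = old | (UINT64_C(1) << bit);
		/* On failure the built-in refreshes 'old' with the current
		 * contents of *word, so the loop recomputes 'neu'. */
	} while (!__atomic_compare_exchange_n(word, &old, neu,
					      true /* weak */,
					      __ATOMIC_RELEASE,
					      __ATOMIC_RELAXED));
	return old;
}
```

A CAS compares values only, so it cannot tell whether the word changed
from A to B and back to A between the load and the exchange. That is
harmless for a bit mask like this one, but it matters for pointer-based
queues, which is where an LL/SC variant is the safer choice.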

Use --enable-schedule-scalable to conditionally compile this scheduler
into the library.

[1] https://lists.linaro.org/pipermail/lng-odp/2016-September/025682.html

On checkpatch.pl:
 - [2/6] and [5/6] have checkpatch.pl issues that are superfluous

v9:
 - Include patch to enable scalable scheduler in Travis CI
 - Fix 'make distcheck'

v8:
 - Reword commit messages

v7:
 - Rebase against new modular queue interface
 - Duplicate arch files under mips64 and powerpc
 - Fix sched->order_lock()
 - Loop until all deferred events have been enqueued
 - Implement ord_enq_multi()
 - Fix ordered_lock/unlock
 - Revert stylistic changes
 - Add default xfactor
 - Remove changes to odp_sched_latency
 - Remove ULL suffix to alleviate Clang build

v6:
 - Move conversions into scalable scheduler to alleviate #ifdefs
 - Remove unnecessary prefetch
 - Fix ARMv8 build

v5:
 - Allocate cache aligned memory using shm pool APIs
 - Move more code to scalable scheduler specific files
 - Remove CONFIG_SPLIT_READWRITE
 - Fix 'make distcheck' issue

v4:
 - Fix a couple more checkpatch.pl issues

v3:
 - Only conditionally compile scalable scheduler and queue
 - Move some code to arch/ dir
 - Use a single shm block for queues instead of block-per-queue
 - De-interleave odp_llqueue.h
 - Use compiler macros to determine ATOM_BITSET_SIZE
 - Incorporated queue size changes
 - Dropped 'ODP_' prefix on config and moved to other files
 - Dropped a few patches that were sent independently to the list

v2:
 - Move ARMv8 issues and other fixes into separate patches
 - Abstract away some #ifdefs
 - Fix some checkpatch.pl warnings

Brian Brooks (5):
  test: odp_pktio_ordered: add queue size
  linux-gen: sched scalable: add arch files
  linux-gen: sched scalable: add a bitset
  linux-gen: sched scalable: add a concurrent queue
  linux-gen: sched scalable: add scalable scheduler

Honnappa Nagarahalli (1):
  travis: add scalable scheduler in CI

 .travis.yml|1 +
 configure.ac   |1 +
 platform/linux-generic/Makefile.am |   17 +
 platform/linux-generic/arch/arm/odp_atomic.h   |  210 +++
 platform/linux-generic/arch/arm/odp_cpu.h  |   65 +
 platform/linux-generic/arch/arm/odp_cpu_idling.h   |   51 +
 platform/linux-generic/arch/arm/odp_llsc.h |  249 +++
 platform/linux-generic/arch/default/odp_cpu.h  |   41 +
 platform/linux-generic/arch/mips64/odp_cpu.h   |   41 +
 platform/linux-generic/arch/powerpc/odp_cpu.h  |   41 +
 platform/linux-generic/arch/x86/odp_cpu.h  |   41 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 platform/linux-generic/include/odp_bitset.h|  210 +++
 .../linux-generic/include/odp_config_internal.h|   17 +-
 platform/linux-generic/include/odp_llqueue.h   |  309 +++
 .../include/odp_queue_scalable_internal.h  |  102 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h |   55 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1020 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1978 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  347 
 test/common_plat/performance/odp_pktio_ordered.c   |4 +
 27 files changed, 5122 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu_idling.h
 create mode 100644 platform/linux-generic/arch/arm/odp_llsc.h
 create mode 100644 platform/linux-generic/arch/default/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/mips64/odp_cpu.h
 cre

[lng-odp] [API-NEXT PATCH v9 1/6] test: odp_pktio_ordered: add queue size

2017-06-19 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 test/common_plat/performance/odp_pktio_ordered.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/test/common_plat/performance/odp_pktio_ordered.c b/test/common_plat/performance/odp_pktio_ordered.c
index 4bb0bef9..50bfef51 100644
--- a/test/common_plat/performance/odp_pktio_ordered.c
+++ b/test/common_plat/performance/odp_pktio_ordered.c
@@ -91,6 +91,9 @@
 /** Maximum number of pktio queues per interface */
 #define MAX_QUEUES 32
 
+/** Seems to need at least 8192 elements per queue */
+#define QUEUE_SIZE 8192
+
 /** Maximum number of pktio interfaces */
 #define MAX_PKTIOS 8
 
@@ -1232,6 +1235,7 @@ int main(int argc, char *argv[])
qparam.sched.prio = ODP_SCHED_PRIO_DEFAULT;
qparam.sched.sync = ODP_SCHED_SYNC_ATOMIC;
qparam.sched.group = ODP_SCHED_GROUP_ALL;
+   qparam.size   = QUEUE_SIZE;
 
gbl_args->flow_qcontext[i][j].idx = i;
gbl_args->flow_qcontext[i][j].input_queue = 0;
-- 
2.13.1



[lng-odp] [API-NEXT PATCH v8 5/5] linux-gen: sched scalable: add scalable scheduler

2017-06-18 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Kevin Wang <kevin.w...@arm.com>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
---
 platform/linux-generic/Makefile.am |7 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 .../linux-generic/include/odp_config_internal.h|   17 +-
 .../include/odp_queue_scalable_internal.h  |  102 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h |   55 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1020 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1978 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  347 
 14 files changed, 3848 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_queue_scalable_internal.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_config.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_ordered.h
 create mode 100644 platform/linux-generic/odp_queue_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable_ordered.c

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 3cb7511b..570760ba 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -171,9 +171,13 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_pool_internal.h \
  ${srcdir}/include/odp_posix_extensions.h \
  ${srcdir}/include/odp_queue_internal.h \
+ ${srcdir}/include/odp_queue_scalable_internal.h \
  ${srcdir}/include/odp_ring_internal.h \
  ${srcdir}/include/odp_queue_if.h \
  ${srcdir}/include/odp_schedule_if.h \
+ ${srcdir}/include/odp_schedule_scalable.h \
+ ${srcdir}/include/odp_schedule_scalable_config.h \
+ ${srcdir}/include/odp_schedule_scalable_ordered.h \
  ${srcdir}/include/odp_sorted_list_internal.h \
  ${srcdir}/include/odp_shm_internal.h \
  ${srcdir}/include/odp_time_internal.h \
@@ -230,12 +234,15 @@ __LIB__libodp_linux_la_SOURCES = \
   odp_pool.c \
   odp_queue.c \
   odp_queue_if.c \
+  odp_queue_scalable.c \
   odp_rwlock.c \
   odp_rwlock_recursive.c \
   odp_schedule.c \
   odp_schedule_if.c \
   odp_schedule_sp.c \
   odp_schedule_iquery.c \
+  odp_schedule_scalable.c \
+  odp_schedule_scalable_ordered.c \
   odp_shared_memory.c \
   odp_sorted_list.c \
   odp_spinlock.c \
diff --git a/platform/linux-generic/include/odp/api/plat/schedule_types.h b/platform/linux-generic/include/odp/api/plat/schedule_types.h
index 535fd6d0..4e75f9ee 100644
--- a/platform/linux-generic/include/odp/api/plat/schedule_types.h
+++ b/platform/linux-generic/include/odp/api/plat/schedule_types.h
@@ -18,6 +18,8 @@
 extern "C" {
 #endif
 
+#include 
+
 /** @addtogroup odp_scheduler
  *  @{
  */
@@ -44,7 +46,7 @@ typedef int odp_schedule_sync_t;
 typedef int odp_schedule_group_t;
 
 /* These must be kept in sync with thread_globals_t in odp_thread.c */
-#define ODP_SCHED_GROUP_INVALID -1
+#define ODP_SCHED_GROUP_INVALID ((odp_schedule_group_t)-1)
 #define ODP_SCHED_GROUP_ALL 0
 #define ODP_SCHED_GROUP_WORKER  1
 #define ODP_SCHED_GROUP_CONTROL 2
diff --git a/platform/linux-generic/include/odp_config_internal.h b/platform/linux-generic/include/odp_config_internal.h
index dadd59e7..6cc844f3 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -7,9 +7,7 @@
 #ifndef ODP_CONFIG_INTERNAL_H_
 #define ODP_CONFIG_INTERNAL_H_
 
-#ifdef __cplusplus
-extern "C" {
-#endif
+#include 
 
 /*
  * Maximum number of pools
@@ -22,6 +20,13 @@ extern "C" {
 #define ODP_CONFIG_QUEUES 1024
 
 /*
+ * Maximum queue depth. Maximum number of elements that can be stored in a
+ * queue. This value is used only when the size is not explicitly provided
+ * durin

[lng-odp] [API-NEXT PATCH v8 2/5] linux-gen: sched scalable: add arch files

2017-06-18 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 platform/linux-generic/Makefile.am   |   2 +
 platform/linux-generic/arch/arm/odp_atomic.h | 210 +++
 platform/linux-generic/arch/arm/odp_cpu.h|  65 ++
 platform/linux-generic/arch/arm/odp_cpu_idling.h |  51 +
 platform/linux-generic/arch/arm/odp_llsc.h   | 249 +++
 platform/linux-generic/arch/default/odp_cpu.h|  41 
 platform/linux-generic/arch/mips64/odp_cpu.h |  41 
 platform/linux-generic/arch/powerpc/odp_cpu.h|  41 
 platform/linux-generic/arch/x86/odp_cpu.h|  41 
 9 files changed, 741 insertions(+)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu_idling.h
 create mode 100644 platform/linux-generic/arch/arm/odp_llsc.h
 create mode 100644 platform/linux-generic/arch/default/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/mips64/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/powerpc/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/x86/odp_cpu.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 58c73767..b95e691b 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -8,6 +8,7 @@ AM_CFLAGS +=  -I$(srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include/odp/arch/@ARCH_ABI@
 AM_CFLAGS +=  -I$(top_builddir)/include
+AM_CFLAGS +=  -I$(top_srcdir)/arch/@ARCH_DIR@
 AM_CFLAGS +=  -Iinclude
 AM_CFLAGS +=  -DSYSCONFDIR=\"@sysconfdir@\"
 AM_CFLAGS +=  -D_ODP_PKTIO_IPC
@@ -183,6 +184,7 @@ noinst_HEADERS = \
  ${srcdir}/include/protocols/ipsec.h \
  ${srcdir}/include/protocols/tcp.h \
  ${srcdir}/include/protocols/udp.h \
+ ${srcdir}/arch/@ARCH_DIR@/odp_cpu.h \
  ${srcdir}/Makefile.inc
 
 __LIB__libodp_linux_la_SOURCES = \
diff --git a/platform/linux-generic/arch/arm/odp_atomic.h b/platform/linux-generic/arch/arm/odp_atomic.h
new file mode 100644
index ..0ddd8a11
--- /dev/null
+++ b/platform/linux-generic/arch/arm/odp_atomic.h
@@ -0,0 +1,210 @@
+/* Copyright (c) 2017, ARM Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier:BSD-3-Clause
+ */
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+#define PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_CPU_H
+#error This file should not be included directly, please include odp_cpu.h
+#endif
+
+#ifdef CONFIG_DMBSTR
+
+#define atomic_store_release(loc, val, ro) \
+do {   \
+   _odp_release_barrier(ro);   \
+   __atomic_store_n(loc, val, __ATOMIC_RELAXED);   \
+} while (0)
+
+#else
+
+#define atomic_store_release(loc, val, ro) \
+   __atomic_store_n(loc, val, __ATOMIC_RELEASE)
+
+#endif  /* CONFIG_DMBSTR */
+
+#if __ARM_ARCH == 8
+
+#define HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
+#define HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
+(mo) == __ATOMIC_SEQ_CST)
+
+#define LL_MO(mo) (HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
+#define SC_MO(mo) (HAS_RLS((mo)) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED)
+
+#ifndef __ARM_FEATURE_QRDMX /* Feature only available in v8.1a and beyond */
+static inline bool
+__lockfree_compare_exchange_16(register __int128 *var, __int128 *exp,
+  register __int128 neu, bool weak, int mo_success,
+  int mo_failure)
+{
+   (void)weak; /* Always do strong CAS or we can't perform atomic read */
+   /* Ignore memory ordering for failure, memory order for
+* success must be stronger or equal. */
+   (void)mo_failure;
+   register __int128 old;
+   register __int128 expected;
+   int ll_mo = LL_MO(mo_success);
+   int sc_mo = SC_MO(mo_success);
+
+   expected = *exp;
+   __asm__ volatile("" ::: "memory");
+   do {
+   /* Atomicity of LLD is not guaranteed */
+   old = lld(var, ll_mo);
+   /* Must write back neu or old to verify atomicity of LLD */
+   } while (odp_unlikely(scd(var, old == expected ? neu : old, sc_mo)));
+   *exp = old; /* Always update, atomically read value */
+   return old == expected;
+}
+
+static inline __int128 __lockfree_exchange_16(__int128 *var, __int128 neu,
+ int mo)
+{
+   register __int128 old;
+   int ll_mo = LL_MO(mo);
+   int sc_mo = SC_MO(mo);
+
+   do {
+   /* Atomicity o

[lng-odp] [API-NEXT PATCH v8 4/5] linux-gen: sched scalable: add a concurrent queue

2017-06-18 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
---
 platform/linux-generic/Makefile.am   |   1 +
 platform/linux-generic/include/odp_llqueue.h | 309 +++
 2 files changed, 310 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_llqueue.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index b95e691b..3cb7511b 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -156,6 +156,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_errno_define.h \
  ${srcdir}/include/odp_forward_typedefs_internal.h \
  ${srcdir}/include/odp_internal.h \
+ ${srcdir}/include/odp_llqueue.h \
  ${srcdir}/include/odp_name_table_internal.h \
  ${srcdir}/include/odp_packet_internal.h \
  ${srcdir}/include/odp_packet_io_internal.h \
diff --git a/platform/linux-generic/include/odp_llqueue.h b/platform/linux-generic/include/odp_llqueue.h
new file mode 100644
index ..758af490
--- /dev/null
+++ b/platform/linux-generic/include/odp_llqueue.h
@@ -0,0 +1,309 @@
+/* Copyright (c) 2017, ARM Limited.
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier:BSD-3-Clause
+ */
+
+#ifndef ODP_LLQUEUE_H_
+#define ODP_LLQUEUE_H_
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+/**
+ * Linked list queues
+ */
+
+struct llqueue;
+struct llnode;
+
+static struct llnode *llq_head(struct llqueue *llq);
+static void llqueue_init(struct llqueue *llq);
+static void llq_enqueue(struct llqueue *llq, struct llnode *node);
+static struct llnode *llq_dequeue(struct llqueue *llq);
+static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
+static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
+static odp_bool_t llq_on_queue(struct llnode *node);
+
+/**
+ * The implementation(s)
+ */
+
+#define SENTINEL ((void *)~(uintptr_t)0)
+
+#ifdef CONFIG_LLDSCD
+/* Implement queue operations using double-word LL/SC */
+
+/* The scalar equivalent of a double pointer */
+#if __SIZEOF_PTRDIFF_T__ == 4
+typedef uint64_t dintptr_t;
+#endif
+#if __SIZEOF_PTRDIFF_T__ == 8
+typedef __int128 dintptr_t;
+#endif
+
+struct llnode {
+   struct llnode *next;
+};
+
+union llht {
+   struct {
+   struct llnode *head, *tail;
+   } st;
+   dintptr_t ui;
+};
+
+struct llqueue {
+   union llht u;
+};
+
+static inline struct llnode *llq_head(struct llqueue *llq)
+{
+   return __atomic_load_n(&llq->u.st.head, __ATOMIC_RELAXED);
+}
+
+static inline void llqueue_init(struct llqueue *llq)
+{
+   llq->u.st.head = NULL;
+   llq->u.st.tail = NULL;
+}
+
+static inline void llq_enqueue(struct llqueue *llq, struct llnode *node)
+{
+   union llht old, neu;
+
+   ODP_ASSERT(node->next == NULL);
+   node->next = SENTINEL;
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   neu.st.head = old.st.head == NULL ? node : old.st.head;
+   neu.st.tail = node;
+   } while (odp_unlikely(scd(&llq->u.ui, neu.ui, __ATOMIC_RELEASE)));
+   if (old.st.tail != NULL) {
+   /* List was not empty */
+   ODP_ASSERT(old.st.tail->next == SENTINEL);
+   old.st.tail->next = node;
+   }
+}
+
+static inline struct llnode *llq_dequeue(struct llqueue *llq)
+{
+   struct llnode *head;
+   union llht old, neu;
+
+   /* llq_dequeue() may be used in a busy-waiting fashion
+* Read head using plain load to avoid disturbing remote LL/SC
+*/
+   head = __atomic_load_n(&llq->u.st.head, __ATOMIC_ACQUIRE);
+   if (head == NULL)
+   return NULL;
+   /* Read head->next before LL to minimize cache miss latency
+* in LL/SC below
+*/
+   (void)__atomic_load_n(&head->next, __ATOMIC_RELAXED);
+
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   if (odp_unlikely(old.st.head == NULL)) {
+   /* Empty list */
+   return NULL;
+   } else if (odp_unlikely(old.st.head == old.st.tail)) {
+   /* Single-element in list */
+   neu.st.head = NULL;
+   neu.st.tail = NULL;
+   } else {
+   /* Multi-element list, dequeue head */
+   struct llnode *next;
+   /* Wait until llq_enqueue() has w

[lng-odp] [API-NEXT PATCH v8 3/5] linux-gen: sched scalable: add a bitset

2017-06-18 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 platform/linux-generic/include/odp_bitset.h | 210 
 1 file changed, 210 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_bitset.h

diff --git a/platform/linux-generic/include/odp_bitset.h b/platform/linux-generic/include/odp_bitset.h
new file mode 100644
index ..69b1a8dc
--- /dev/null
+++ b/platform/linux-generic/include/odp_bitset.h
@@ -0,0 +1,210 @@
+/* Copyright (c) 2017, ARM Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef _ODP_BITSET_H_
+#define _ODP_BITSET_H_
+
+#include 
+
+#include 
+
+/**
+ * bitset abstract data type
+ */
+/* This could be a struct of scalars to support larger bit sets */
+
+/*
+ * Size of atomic bit set. This limits the max number of threads,
+ * scheduler groups and reorder windows. On ARMv8/64-bit and x86-64, the
+ * (lock-free) max is 128
+ */
+
+/* Find a suitable data type that supports lock-free atomic operations */
+#if defined(__ARM_ARCH) &&  __ARM_ARCH == 8 &&  __ARM_64BIT_STATE == 1 && \
+   defined(__SIZEOF_INT128__) && __SIZEOF_INT128__ == 16
+#define LOCKFREE16
+typedef __int128 bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT128__)
+
+#elif __GCC_ATOMIC_LLONG_LOCK_FREE == 2 && \
+   __SIZEOF_LONG_LONG__ != __SIZEOF_LONG__
+typedef unsigned long long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG_LONG__)
+
+#elif __GCC_ATOMIC_LONG_LOCK_FREE == 2 && __SIZEOF_LONG__ != __SIZEOF_INT__
+typedef unsigned long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG__)
+
+#elif __GCC_ATOMIC_INT_LOCK_FREE == 2
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+
+#else
+/* Target does not support lock-free atomic operations */
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+#endif
+
+#if ATOM_BITSET_SIZE <= 32
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1UL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsl(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor32(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 64
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1ULL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsll(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor64(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 128
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   if (bit < 64)
+   return 1ULL << bit;
+   else
+   return (unsigned __int128)(1ULL << (bit - 64)) << 64;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   if ((uint64_t)b != 0)
+   return __builtin_ffsll((uint64_t)b);
+   else if ((b >> 64) != 0)
+   return __builtin_ffsll((uint64_t)(b >> 64)) + 64;
+   else
+   return 0;
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor128(bs, mo);
+}
+
+#else
+#error Unsupported size of bit sets (ATOM_BITSET_SIZE)
+#endif
+
+/* Atomic load with memory ordering */
+static inline bitset_t atom_bitset_load(bitset_t *bs, int mo)
+{
+#ifdef LOCKFREE16
+   return __lockfree_load_16(bs, mo);
+#else
+   return __atomic_load_n(bs, mo);
+#endif
+}
+
+/* Atomic bit set with memory ordering */
+static inline void atom_bitset_set(bitset_t *bs, uint32_t bit, int mo)
+{
+#ifdef LOCKFREE16
+   (void)__lockfree_fetch_or_16(bs, bitset_mask(bit), mo);
+#else
+   (void)__atomic_fetch_or(bs, bitset_mask(bit), mo);
+#endif
+}
+
+/* Atomic bit clear with memory ordering */
+static inline void atom_bitset_clr(bitset_t *bs, uint32_t bit, int mo)
+{
+#ifdef LOCKFREE16
+   (void)__lockfree_fetch_and_16(bs, ~bitset_mask(bit), mo);
+#else
+   (void)__atomic_fetch_and(bs, ~bitset_mask(bit), mo);
+#endif
+}
+
+/* Atomic exchange with memory ordering */
+static inline bitset_t atom_bitset_xchg(bitset_t *bs, bitset_t neu, int mo)
+{
+#ifdef LOCKFREE16
+   return __lockfree_exchange_16(bs, neu, mo);
+#else
+   return __atomic_exchange_n(bs, neu, mo);
+#endif

[lng-odp] [API-NEXT PATCH v8 1/5] test: odp_pktio_ordered: add queue size

2017-06-18 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 test/common_plat/performance/odp_pktio_ordered.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/test/common_plat/performance/odp_pktio_ordered.c b/test/common_plat/performance/odp_pktio_ordered.c
index 4bb0bef9..50bfef51 100644
--- a/test/common_plat/performance/odp_pktio_ordered.c
+++ b/test/common_plat/performance/odp_pktio_ordered.c
@@ -91,6 +91,9 @@
 /** Maximum number of pktio queues per interface */
 #define MAX_QUEUES 32
 
+/** Seems to need at least 8192 elements per queue */
+#define QUEUE_SIZE 8192
+
 /** Maximum number of pktio interfaces */
 #define MAX_PKTIOS 8
 
@@ -1232,6 +1235,7 @@ int main(int argc, char *argv[])
qparam.sched.prio = ODP_SCHED_PRIO_DEFAULT;
qparam.sched.sync = ODP_SCHED_SYNC_ATOMIC;
qparam.sched.group = ODP_SCHED_GROUP_ALL;
+   qparam.size   = QUEUE_SIZE;
 
gbl_args->flow_qcontext[i][j].idx = i;
gbl_args->flow_qcontext[i][j].input_queue = 0;
-- 
2.13.1



[lng-odp] [API-NEXT PATCH v8 0/5] A scalable software scheduler

2017-06-18 Thread Brian Brooks
This work derives from Ola Liljedahl's prototype [1] which introduced a
scalable scheduler design based primarily on lock-free algorithms and
data structures designed to decrease contention. A thread searches
through a data structure containing only queues that are both non-empty
and allowed to be scheduled to that thread. Strict priority scheduling is
respected, and (W)RR scheduling may be used within queues of the same priority.
Lastly, pre-scheduling or stashing is not employed since it is optional
functionality that can be implemented in the application.
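
To illustrate the queue search described above, here is a minimal sketch. It
is not code from this patch set. It assumes a hypothetical layout with one
64-bit mask of non-empty queues per priority level, a per-thread 'allowed'
mask, and priority 0 as the highest. The names prio_masks and pick_queue()
are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_PRIO 8 /* hypothetical number of priority levels */

/* Hypothetical state: in nonempty[p], bit i is set when queue i at
 * priority p currently holds at least one event. */
struct prio_masks {
    uint64_t nonempty[NUM_PRIO];
};

/*
 * Strict priority selection: scan from the highest priority (assumed to
 * be 0 here) and return the lowest-numbered queue that is both non-empty
 * and allowed for the calling thread. Returns -1 if nothing is eligible.
 */
static inline int pick_queue(struct prio_masks *pm, uint64_t allowed,
                             int *prio_out)
{
    assert(pm != NULL && prio_out != NULL);

    for (int prio = 0; prio < NUM_PRIO; prio++) {
        uint64_t eligible =
            __atomic_load_n(&pm->nonempty[prio], __ATOMIC_ACQUIRE) &
            allowed;

        if (eligible != 0) {
            *prio_out = prio;
            /* ffsll() is 1-based, queue indexes are 0-based */
            return __builtin_ffsll((long long)eligible) - 1;
        }
    }
    return -1;
}
```

A real scheduler would also need (W)RR rotation within a priority level and
a way to claim the chosen queue atomically; both are left out of this sketch.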

In addition to scalable ring buffers, the algorithm also uses unbounded
concurrent queues. LL/SC and CAS variants exist in cases where absence of the
ABA problem cannot be proved, and also in cases where the compiler's atomic
built-ins may not be lowered to the desired instruction(s). Finally, a version
of the algorithm that uses locks is also provided.
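
For readers unfamiliar with the ABA problem mentioned above, the sketch below
shows one common CAS-side mitigation: pairing the value with a version tag so
that a stale compare-and-swap fails. This is a generic illustration and not
the approach taken in odp_llqueue.h, which uses double-word LL/SC. The index
based node pool and the names pack(), stack_push() and stack_pop() are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define NIL 0xFFFFFFFFu /* hypothetical "no node" index */
#define NUM_NODES 16    /* hypothetical pool size */

/* Hypothetical node pool: nodes are linked by index, not by pointer */
static uint32_t next_idx[NUM_NODES];

/* The head packs {tag:32, index:32} so one 64-bit CAS covers both */
static inline uint64_t pack(uint32_t tag, uint32_t idx)
{
    return ((uint64_t)tag << 32) | idx;
}

static inline void stack_push(uint64_t *head, uint32_t idx)
{
    uint64_t old = __atomic_load_n(head, __ATOMIC_RELAXED);
    uint64_t neu;

    assert(idx < NUM_NODES);
    do {
        /* Link the new node in front of the current top */
        __atomic_store_n(&next_idx[idx], (uint32_t)old,
                         __ATOMIC_RELAXED);
        /* Bump the tag on every update (it wraps after 2^32) */
        neu = pack((uint32_t)(old >> 32) + 1, idx);
    } while (!__atomic_compare_exchange_n(head, &old, neu, 1,
                                          __ATOMIC_RELEASE,
                                          __ATOMIC_RELAXED));
}

static inline uint32_t stack_pop(uint64_t *head)
{
    uint64_t old = __atomic_load_n(head, __ATOMIC_ACQUIRE);
    uint64_t neu;

    do {
        uint32_t idx = (uint32_t)old;

        if (idx == NIL)
            return NIL; /* empty */
        /* If the top node was popped and pushed again in the
         * meantime, the tag differs and the CAS below fails. */
        neu = pack((uint32_t)(old >> 32) + 1,
                   __atomic_load_n(&next_idx[idx], __ATOMIC_RELAXED));
    } while (!__atomic_compare_exchange_n(head, &old, neu, 1,
                                          __ATOMIC_ACQUIRE,
                                          __ATOMIC_ACQUIRE));
    return (uint32_t)old;
}
```

An empty stack is initialised with pack(0, NIL). With LL/SC the tag is not
needed, because the store-conditional fails whenever the location was written
after the load-linked, even if the same value was written back.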

Use --enable-schedule-scalable to conditionally compile this scheduler
into the library.

[1] https://lists.linaro.org/pipermail/lng-odp/2016-September/025682.html

On checkpatch.pl:
 - [2/5] and [5/5] have checkpatch.pl issues that are superfluous

v8:
 - Reword commit messages

v7:
 - Rebase against new modular queue interface
 - Duplicate arch files under mips64 and powerpc
 - Fix sched->order_lock()
 - Loop until all deferred events have been enqueued
 - Implement ord_enq_multi()
 - Fix ordered_lock/unlock
 - Revert stylistic changes
 - Add default xfactor
 - Remove changes to odp_sched_latency
 - Remove ULL suffix to alleviate Clang build

v6:
 - Move conversions into scalable scheduler to alleviate #ifdefs
 - Remove unnecessary prefetch
 - Fix ARMv8 build

v5:
 - Allocate cache aligned memory using shm pool APIs
 - Move more code to scalable scheduler specific files
 - Remove CONFIG_SPLIT_READWRITE
 - Fix 'make distcheck' issue

v4:
 - Fix a couple more checkpatch.pl issues

v3:
 - Only conditionally compile scalable scheduler and queue
 - Move some code to arch/ dir
 - Use a single shm block for queues instead of block-per-queue
 - De-interleave odp_llqueue.h
 - Use compiler macros to determine ATOM_BITSET_SIZE
 - Incorporated queue size changes
 - Dropped 'ODP_' prefix on config and moved to other files
 - Dropped a few patches that were sent independently to the list

v2:
 - Move ARMv8 issues and other fixes into separate patches
 - Abstract away some #ifdefs
 - Fix some checkpatch.pl warnings

Brian Brooks (5):
  test: odp_pktio_ordered: add queue size
  Add arch/ files
  Add a bitset
  Add a concurrent queue
  Add scalable scheduler

 platform/linux-generic/Makefile.am |   10 +
 platform/linux-generic/arch/arm/odp_atomic.h   |  210 +++
 platform/linux-generic/arch/arm/odp_cpu.h  |   65 +
 platform/linux-generic/arch/arm/odp_cpu_idling.h   |   51 +
 platform/linux-generic/arch/arm/odp_llsc.h |  249 +++
 platform/linux-generic/arch/default/odp_cpu.h  |   41 +
 platform/linux-generic/arch/mips64/odp_cpu.h   |   41 +
 platform/linux-generic/arch/powerpc/odp_cpu.h  |   41 +
 platform/linux-generic/arch/x86/odp_cpu.h  |   41 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 platform/linux-generic/include/odp_bitset.h|  210 +++
 .../linux-generic/include/odp_config_internal.h|   17 +-
 platform/linux-generic/include/odp_llqueue.h   |  309 +++
 .../include/odp_queue_scalable_internal.h  |  102 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h |   55 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1020 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1978 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  347 
 test/common_plat/performance/odp_pktio_ordered.c   |4 +
 25 files changed, 5113 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu_idling.h
 create mode 100644 platform/linux-generic/arch/arm/odp_llsc.h
 create mode 100644 platform/linux-generic/arch/default/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/mips64/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/powerpc/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/x86/odp_cpu.h
 create mode 100644 platform/linux-generic/include/odp_bitset.h
 create mode 100644 platform/linux-generic/include/odp_llqueue.h
 create mode 100644 platform/linux-generic/include/odp_queue_scalable_internal.h
 create mode 100644 platform/linux-generic/incl

Re: [lng-odp] [API-NEXT PATCH v7 0/5] A scalable software scheduler

2017-06-18 Thread Brian Brooks
On 06/16 22:42:04, Maxim Uvarov wrote:
> On 06/14/17 04:21, Brian Brooks wrote:
> > Brian Brooks (5):
> >   test: odp_pktio_ordered: add queue size
> >   Add arch/ files
> >   Add a bitset
> >   Add a concurrent queue
> >   Add scalable scheduler
> 
> Please rename the commits to follow the existing commit style.

Sure, will do.

> Has to be:
> linux-gen: sched scalable: add arch files
> linux-gen: sched scalable: add a bitset
> etc.
> 
> I.e. git log --oneline should follow the existing style.
> 
> Thank you,
> Maxim.


[lng-odp] [API-NEXT PATCH v4] timer: allow timer processing to run on worker cores

2017-06-14 Thread Brian Brooks
Run timer pool processing on worker cores if the application hints
that the scheduler will be used. This reduces the latency and jitter
of the point at which timer pool processing begins. See [1] for details.

[1] https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---

** There is a false positive checkpatch.pl warning **

v4:
 - Rebase against Bill's feature init patch

v3:
 - Add rate limiting by scheduling rounds

v2:
 - Reword 'worker_timers' to 'use_scheduler'
 - Use time instead of ticks
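
The rate limiting mentioned in the changelog (by scheduling rounds and by
time) could look roughly like the sketch below. It is not the _timer_run()
implementation from this patch. The struct timer_rl and timer_run_due()
names are hypothetical, and the caller is assumed to supply the current
time in nanoseconds.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-thread rate limiter state */
struct timer_rl {
    uint64_t min_ns;      /* minimum time between timer pool checks */
    uint32_t min_rounds;  /* minimum scheduling rounds between checks */
    uint64_t last_run_ns; /* time of the last check */
    uint32_t rounds;      /* rounds seen since the last time check */
};

/*
 * Called once per scheduling round. Returns 1 when timer pools should
 * be processed now, 0 otherwise. The cheap round counter is tested
 * first so that the clock is only consulted every min_rounds rounds.
 */
static inline int timer_run_due(struct timer_rl *rl, uint64_t now_ns)
{
    assert(rl != NULL);

    if (++rl->rounds < rl->min_rounds)
        return 0;
    rl->rounds = 0;

    if (now_ns - rl->last_run_ns < rl->min_ns)
        return 0;
    rl->last_run_ns = now_ns;
    return 1;
}
```

In the patch the corresponding limits are the compile-time constants
CONFIG_TIMER_RUN_RATELIMIT_NS and CONFIG_TIMER_RUN_RATELIMIT_ROUNDS. They are
struct fields here only to keep the sketch easy to exercise.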

 platform/linux-generic/include/odp_internal.h  |   2 +-
 .../linux-generic/include/odp_timer_internal.h |  24 +
 platform/linux-generic/odp_init.c  |   2 +-
 platform/linux-generic/odp_schedule.c  |   3 +
 platform/linux-generic/odp_schedule_iquery.c   |   3 +
 platform/linux-generic/odp_schedule_sp.c   |   3 +
 platform/linux-generic/odp_timer.c | 112 +++--
 7 files changed, 138 insertions(+), 11 deletions(-)

diff --git a/platform/linux-generic/include/odp_internal.h b/platform/linux-generic/include/odp_internal.h
index 90e2a629..404792cf 100644
--- a/platform/linux-generic/include/odp_internal.h
+++ b/platform/linux-generic/include/odp_internal.h
@@ -108,7 +108,7 @@ int odp_queue_term_global(void);
 int odp_crypto_init_global(void);
 int odp_crypto_term_global(void);
 
-int odp_timer_init_global(void);
+int odp_timer_init_global(const odp_init_t *params);
 int odp_timer_term_global(void);
 int odp_timer_disarm_all(void);
 
diff --git a/platform/linux-generic/include/odp_timer_internal.h b/platform/linux-generic/include/odp_timer_internal.h
index 91b12c54..67ee9fef 100644
--- a/platform/linux-generic/include/odp_timer_internal.h
+++ b/platform/linux-generic/include/odp_timer_internal.h
@@ -20,6 +20,12 @@
 #include 
 #include 
 
+/* Minimum number of nanoseconds between checking timer pools. */
+#define CONFIG_TIMER_RUN_RATELIMIT_NS 100
+
+/* Minimum number of scheduling rounds between checking timer pools. */
+#define CONFIG_TIMER_RUN_RATELIMIT_ROUNDS 1
+
 /**
  * Internal Timeout header
  */
@@ -35,4 +41,22 @@ typedef struct {
odp_timer_t timer;
 } odp_timeout_hdr_t;
 
+/*
+ * Whether to run timer pool processing 'inline' (on worker cores) or in
+ * background threads (thread-per-timerpool).
+ *
+ * If the application will use both scheduler and timer this flag is set
+ * to true, otherwise false. The application conveys this information via
+ * the 'not_used' bits in odp_init_t which are passed to odp_init_global().
+ */
+extern odp_bool_t inline_timers;
+
+unsigned _timer_run(void);
+
+/* Static inline wrapper to minimize modification of schedulers. */
+static inline unsigned timer_run(void)
+{
+   return inline_timers ? _timer_run() : 0;
+}
+
 #endif
diff --git a/platform/linux-generic/odp_init.c b/platform/linux-generic/odp_init.c
index 62a1fbc2..8c17cbb0 100644
--- a/platform/linux-generic/odp_init.c
+++ b/platform/linux-generic/odp_init.c
@@ -241,7 +241,7 @@ int odp_init_global(odp_instance_t *instance,
}
stage = PKTIO_INIT;
 
-   if (odp_timer_init_global()) {
+   if (odp_timer_init_global(params)) {
ODP_ERR("ODP timer init failed.\n");
goto init_failed;
}
diff --git a/platform/linux-generic/odp_schedule.c b/platform/linux-generic/odp_schedule.c
index 011d4dc4..04d09981 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels  */
 #define NUM_PRIO 8
@@ -998,6 +999,8 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
int ret;
 
while (1) {
+   timer_run();
+
ret = do_schedule(out_queue, out_ev, max_num);
 
if (ret)
diff --git a/platform/linux-generic/odp_schedule_iquery.c b/platform/linux-generic/odp_schedule_iquery.c
index bdf1a460..f7c411f6 100644
--- a/platform/linux-generic/odp_schedule_iquery.c
+++ b/platform/linux-generic/odp_schedule_iquery.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels */
 #define NUM_SCHED_PRIO 8
@@ -719,6 +720,8 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
odp_time_t next, wtime;
 
while (1) {
+   timer_run();
+
count = do_schedule(out_queue, out_ev, max_num);
 
if (count)
diff --git a/platform/linux-generic/odp_schedule_sp.c b/platform/linux-generic/odp_schedule_sp.c
index 91d70e3a..252d128d 100644
--- a/platform/linux-generic/odp_schedule_sp.c
+++ b/platform/linux-generic/odp_schedule_sp.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #in

[lng-odp] [API-NEXT PATCH v7 5/5] Add scalable scheduler

2017-06-13 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Kevin Wang <kevin.w...@arm.com>
Signed-off-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
---
 platform/linux-generic/Makefile.am |7 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 .../linux-generic/include/odp_config_internal.h|   17 +-
 .../include/odp_queue_scalable_internal.h  |  102 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h |   55 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1020 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1978 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  347 
 14 files changed, 3848 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/include/odp_queue_scalable_internal.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_config.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable_ordered.h
 create mode 100644 platform/linux-generic/odp_queue_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable.c
 create mode 100644 platform/linux-generic/odp_schedule_scalable_ordered.c

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 3cb7511b..570760ba 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -171,9 +171,13 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_pool_internal.h \
  ${srcdir}/include/odp_posix_extensions.h \
  ${srcdir}/include/odp_queue_internal.h \
+ ${srcdir}/include/odp_queue_scalable_internal.h \
  ${srcdir}/include/odp_ring_internal.h \
  ${srcdir}/include/odp_queue_if.h \
  ${srcdir}/include/odp_schedule_if.h \
+ ${srcdir}/include/odp_schedule_scalable.h \
+ ${srcdir}/include/odp_schedule_scalable_config.h \
+ ${srcdir}/include/odp_schedule_scalable_ordered.h \
  ${srcdir}/include/odp_sorted_list_internal.h \
  ${srcdir}/include/odp_shm_internal.h \
  ${srcdir}/include/odp_time_internal.h \
@@ -230,12 +234,15 @@ __LIB__libodp_linux_la_SOURCES = \
   odp_pool.c \
   odp_queue.c \
   odp_queue_if.c \
+  odp_queue_scalable.c \
   odp_rwlock.c \
   odp_rwlock_recursive.c \
   odp_schedule.c \
   odp_schedule_if.c \
   odp_schedule_sp.c \
   odp_schedule_iquery.c \
+  odp_schedule_scalable.c \
+  odp_schedule_scalable_ordered.c \
   odp_shared_memory.c \
   odp_sorted_list.c \
   odp_spinlock.c \
diff --git a/platform/linux-generic/include/odp/api/plat/schedule_types.h b/platform/linux-generic/include/odp/api/plat/schedule_types.h
index 535fd6d0..4e75f9ee 100644
--- a/platform/linux-generic/include/odp/api/plat/schedule_types.h
+++ b/platform/linux-generic/include/odp/api/plat/schedule_types.h
@@ -18,6 +18,8 @@
 extern "C" {
 #endif
 
+#include 
+
 /** @addtogroup odp_scheduler
  *  @{
  */
@@ -44,7 +46,7 @@ typedef int odp_schedule_sync_t;
 typedef int odp_schedule_group_t;
 
 /* These must be kept in sync with thread_globals_t in odp_thread.c */
-#define ODP_SCHED_GROUP_INVALID -1
+#define ODP_SCHED_GROUP_INVALID ((odp_schedule_group_t)-1)
 #define ODP_SCHED_GROUP_ALL 0
 #define ODP_SCHED_GROUP_WORKER  1
 #define ODP_SCHED_GROUP_CONTROL 2
diff --git a/platform/linux-generic/include/odp_config_internal.h b/platform/linux-generic/include/odp_config_internal.h
index dadd59e7..6cc844f3 100644
--- a/platform/linux-generic/include/odp_config_internal.h
+++ b/platform/linux-generic/include/odp_config_internal.h
@@ -7,9 +7,7 @@
 #ifndef ODP_CONFIG_INTERNAL_H_
 #define ODP_CONFIG_INTERNAL_H_
 
-#ifdef __cplusplus
-extern "C" {
-#endif
+#include 
 
 /*
  * Maximum number of pools
@@ -22,6 +20,13 @@ extern "C" {
 #define ODP_CONFIG_QUEUES 1024
 
 /*
+ * Maximum queue depth. Maximum number of elements that can be stored in a
+ * queue. This value is used only when the size is not explicitly provided
+ * durin

[lng-odp] [API-NEXT PATCH v7 2/5] Add arch/ files

2017-06-13 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 platform/linux-generic/Makefile.am   |   2 +
 platform/linux-generic/arch/arm/odp_atomic.h | 210 +++
 platform/linux-generic/arch/arm/odp_cpu.h|  65 ++
 platform/linux-generic/arch/arm/odp_cpu_idling.h |  51 +
 platform/linux-generic/arch/arm/odp_llsc.h   | 249 +++
 platform/linux-generic/arch/default/odp_cpu.h|  41 
 platform/linux-generic/arch/mips64/odp_cpu.h |  41 
 platform/linux-generic/arch/powerpc/odp_cpu.h|  41 
 platform/linux-generic/arch/x86/odp_cpu.h|  41 
 9 files changed, 741 insertions(+)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu_idling.h
 create mode 100644 platform/linux-generic/arch/arm/odp_llsc.h
 create mode 100644 platform/linux-generic/arch/default/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/mips64/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/powerpc/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/x86/odp_cpu.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index 58c73767..b95e691b 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -8,6 +8,7 @@ AM_CFLAGS +=  -I$(srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include
 AM_CFLAGS +=  -I$(top_srcdir)/include/odp/arch/@ARCH_ABI@
 AM_CFLAGS +=  -I$(top_builddir)/include
+AM_CFLAGS +=  -I$(top_srcdir)/arch/@ARCH_DIR@
 AM_CFLAGS +=  -Iinclude
 AM_CFLAGS +=  -DSYSCONFDIR=\"@sysconfdir@\"
 AM_CFLAGS +=  -D_ODP_PKTIO_IPC
@@ -183,6 +184,7 @@ noinst_HEADERS = \
  ${srcdir}/include/protocols/ipsec.h \
  ${srcdir}/include/protocols/tcp.h \
  ${srcdir}/include/protocols/udp.h \
+ ${srcdir}/arch/@ARCH_DIR@/odp_cpu.h \
  ${srcdir}/Makefile.inc
 
 __LIB__libodp_linux_la_SOURCES = \
diff --git a/platform/linux-generic/arch/arm/odp_atomic.h b/platform/linux-generic/arch/arm/odp_atomic.h
new file mode 100644
index ..0ddd8a11
--- /dev/null
+++ b/platform/linux-generic/arch/arm/odp_atomic.h
@@ -0,0 +1,210 @@
+/* Copyright (c) 2017, ARM Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier:BSD-3-Clause
+ */
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+#define PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_ATOMIC_H
+
+#ifndef PLATFORM_LINUXGENERIC_ARCH_ARM_ODP_CPU_H
+#error This file should not be included directly, please include odp_cpu.h
+#endif
+
+#ifdef CONFIG_DMBSTR
+
+#define atomic_store_release(loc, val, ro) \
+do {   \
+   _odp_release_barrier(ro);   \
+   __atomic_store_n(loc, val, __ATOMIC_RELAXED);   \
+} while (0)
+
+#else
+
+#define atomic_store_release(loc, val, ro) \
+   __atomic_store_n(loc, val, __ATOMIC_RELEASE)
+
+#endif  /* CONFIG_DMBSTR */
+
+#if __ARM_ARCH == 8
+
+#define HAS_ACQ(mo) ((mo) != __ATOMIC_RELAXED && (mo) != __ATOMIC_RELEASE)
+#define HAS_RLS(mo) ((mo) == __ATOMIC_RELEASE || (mo) == __ATOMIC_ACQ_REL || \
+(mo) == __ATOMIC_SEQ_CST)
+
+#define LL_MO(mo) (HAS_ACQ((mo)) ? __ATOMIC_ACQUIRE : __ATOMIC_RELAXED)
+#define SC_MO(mo) (HAS_RLS((mo)) ? __ATOMIC_RELEASE : __ATOMIC_RELAXED)
+
+#ifndef __ARM_FEATURE_QRDMX /* Feature only available in v8.1a and beyond */
+static inline bool
+__lockfree_compare_exchange_16(register __int128 *var, __int128 *exp,
+  register __int128 neu, bool weak, int mo_success,
+  int mo_failure)
+{
+   (void)weak; /* Always do strong CAS or we can't perform atomic read */
+   /* Ignore memory ordering for failure, memory order for
+* success must be stronger or equal. */
+   (void)mo_failure;
+   register __int128 old;
+   register __int128 expected;
+   int ll_mo = LL_MO(mo_success);
+   int sc_mo = SC_MO(mo_success);
+
+   expected = *exp;
+   __asm__ volatile("" ::: "memory");
+   do {
+   /* Atomicity of LLD is not guaranteed */
+   old = lld(var, ll_mo);
+   /* Must write back neu or old to verify atomicity of LLD */
+   } while (odp_unlikely(scd(var, old == expected ? neu : old, sc_mo)));
+   *exp = old; /* Always update, atomically read value */
+   return old == expected;
+}
+
+static inline __int128 __lockfree_exchange_16(__int128 *var, __int128 neu,
+ int mo)
+{
+   register __int128 old;
+   int ll_mo = LL_MO(mo);
+   int sc_mo = SC_MO(mo);
+
+   do {
+   /* Atomicity o

[lng-odp] [API-NEXT PATCH v7 4/5] Add a concurrent queue

2017-06-13 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
---
 platform/linux-generic/Makefile.am   |   1 +
 platform/linux-generic/include/odp_llqueue.h | 309 +++
 2 files changed, 310 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_llqueue.h

diff --git a/platform/linux-generic/Makefile.am 
b/platform/linux-generic/Makefile.am
index b95e691b..3cb7511b 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -156,6 +156,7 @@ noinst_HEADERS = \
  ${srcdir}/include/odp_errno_define.h \
  ${srcdir}/include/odp_forward_typedefs_internal.h \
  ${srcdir}/include/odp_internal.h \
+ ${srcdir}/include/odp_llqueue.h \
  ${srcdir}/include/odp_name_table_internal.h \
  ${srcdir}/include/odp_packet_internal.h \
  ${srcdir}/include/odp_packet_io_internal.h \
diff --git a/platform/linux-generic/include/odp_llqueue.h 
b/platform/linux-generic/include/odp_llqueue.h
new file mode 100644
index ..758af490
--- /dev/null
+++ b/platform/linux-generic/include/odp_llqueue.h
@@ -0,0 +1,309 @@
+/* Copyright (c) 2017, ARM Limited.
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier:BSD-3-Clause
+ */
+
+#ifndef ODP_LLQUEUE_H_
+#define ODP_LLQUEUE_H_
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+/**
+ * Linked list queues
+ */
+
+struct llqueue;
+struct llnode;
+
+static struct llnode *llq_head(struct llqueue *llq);
+static void llqueue_init(struct llqueue *llq);
+static void llq_enqueue(struct llqueue *llq, struct llnode *node);
+static struct llnode *llq_dequeue(struct llqueue *llq);
+static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
+static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
+static odp_bool_t llq_on_queue(struct llnode *node);
+
+/**
+ * The implementation(s)
+ */
+
+#define SENTINEL ((void *)~(uintptr_t)0)
+
+#ifdef CONFIG_LLDSCD
+/* Implement queue operations using double-word LL/SC */
+
+/* The scalar equivalent of a double pointer */
+#if __SIZEOF_PTRDIFF_T__ == 4
+typedef uint64_t dintptr_t;
+#endif
+#if __SIZEOF_PTRDIFF_T__ == 8
+typedef __int128 dintptr_t;
+#endif
+
+struct llnode {
+   struct llnode *next;
+};
+
+union llht {
+   struct {
+   struct llnode *head, *tail;
+   } st;
+   dintptr_t ui;
+};
+
+struct llqueue {
+   union llht u;
+};
+
+static inline struct llnode *llq_head(struct llqueue *llq)
+{
+   return __atomic_load_n(&llq->u.st.head, __ATOMIC_RELAXED);
+}
+
+static inline void llqueue_init(struct llqueue *llq)
+{
+   llq->u.st.head = NULL;
+   llq->u.st.tail = NULL;
+}
+
+static inline void llq_enqueue(struct llqueue *llq, struct llnode *node)
+{
+   union llht old, neu;
+
+   ODP_ASSERT(node->next == NULL);
+   node->next = SENTINEL;
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   neu.st.head = old.st.head == NULL ? node : old.st.head;
+   neu.st.tail = node;
+   } while (odp_unlikely(scd(&llq->u.ui, neu.ui, __ATOMIC_RELEASE)));
+   if (old.st.tail != NULL) {
+   /* List was not empty */
+   ODP_ASSERT(old.st.tail->next == SENTINEL);
+   old.st.tail->next = node;
+   }
+}
+
+static inline struct llnode *llq_dequeue(struct llqueue *llq)
+{
+   struct llnode *head;
+   union llht old, neu;
+
+   /* llq_dequeue() may be used in a busy-waiting fashion
+* Read head using plain load to avoid disturbing remote LL/SC
+*/
+   head = __atomic_load_n(&llq->u.st.head, __ATOMIC_ACQUIRE);
+   if (head == NULL)
+   return NULL;
+   /* Read head->next before LL to minimize cache miss latency
+* in LL/SC below
+*/
+   (void)__atomic_load_n(&head->next, __ATOMIC_RELAXED);
+
+   do {
+   old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+   if (odp_unlikely(old.st.head == NULL)) {
+   /* Empty list */
+   return NULL;
+   } else if (odp_unlikely(old.st.head == old.st.tail)) {
+   /* Single-element in list */
+   neu.st.head = NULL;
+   neu.st.tail = NULL;
+   } else {
+   /* Multi-element list, dequeue head */
+   struct llnode *next;
+   /* Wait until llq_enqueue() has w

[lng-odp] [API-NEXT PATCH v7 3/5] Add a bitset

2017-06-13 Thread Brian Brooks
Signed-off-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 platform/linux-generic/include/odp_bitset.h | 210 
 1 file changed, 210 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_bitset.h

diff --git a/platform/linux-generic/include/odp_bitset.h 
b/platform/linux-generic/include/odp_bitset.h
new file mode 100644
index ..69b1a8dc
--- /dev/null
+++ b/platform/linux-generic/include/odp_bitset.h
@@ -0,0 +1,210 @@
+/* Copyright (c) 2017, ARM Limited
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier: BSD-3-Clause
+ */
+
+#ifndef _ODP_BITSET_H_
+#define _ODP_BITSET_H_
+
+#include 
+
+#include 
+
+/**
+ * bitset abstract data type
+ */
+/* This could be a struct of scalars to support larger bit sets */
+
+/*
+ * Size of atomic bit set. This limits the max number of threads,
+ * scheduler groups and reorder windows. On ARMv8/64-bit and x86-64, the
+ * (lock-free) max is 128
+ */
+
+/* Find a suitable data type that supports lock-free atomic operations */
+#if defined(__ARM_ARCH) &&  __ARM_ARCH == 8 &&  __ARM_64BIT_STATE == 1 && \
+   defined(__SIZEOF_INT128__) && __SIZEOF_INT128__ == 16
+#define LOCKFREE16
+typedef __int128 bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT128__)
+
+#elif __GCC_ATOMIC_LLONG_LOCK_FREE == 2 && \
+   __SIZEOF_LONG_LONG__ != __SIZEOF_LONG__
+typedef unsigned long long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG_LONG__)
+
+#elif __GCC_ATOMIC_LONG_LOCK_FREE == 2 && __SIZEOF_LONG__ != __SIZEOF_INT__
+typedef unsigned long bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_LONG__)
+
+#elif __GCC_ATOMIC_INT_LOCK_FREE == 2
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+
+#else
+/* Target does not support lock-free atomic operations */
+typedef unsigned int bitset_t;
+#define ATOM_BITSET_SIZE (CHAR_BIT * __SIZEOF_INT__)
+#endif
+
+#if ATOM_BITSET_SIZE <= 32
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1UL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsl(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor32(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 64
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   return 1ULL << bit;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   return __builtin_ffsll(b);
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor64(bs, mo);
+}
+
+#elif ATOM_BITSET_SIZE <= 128
+
+static inline bitset_t bitset_mask(uint32_t bit)
+{
+   if (bit < 64)
+   return 1ULL << bit;
+   else
+   return (unsigned __int128)(1ULL << (bit - 64)) << 64;
+}
+
+/* Return first-bit-set with StdC ffs() semantics */
+static inline uint32_t bitset_ffs(bitset_t b)
+{
+   if ((uint64_t)b != 0)
+   return __builtin_ffsll((uint64_t)b);
+   else if ((b >> 64) != 0)
+   return __builtin_ffsll((uint64_t)(b >> 64)) + 64;
+   else
+   return 0;
+}
+
+/* Load-exclusive with memory ordering */
+static inline bitset_t bitset_monitor(bitset_t *bs, int mo)
+{
+   return monitor128(bs, mo);
+}
+
+#else
+#error Unsupported size of bit sets (ATOM_BITSET_SIZE)
+#endif
+
+/* Atomic load with memory ordering */
+static inline bitset_t atom_bitset_load(bitset_t *bs, int mo)
+{
+#ifdef LOCKFREE16
+   return __lockfree_load_16(bs, mo);
+#else
+   return __atomic_load_n(bs, mo);
+#endif
+}
+
+/* Atomic bit set with memory ordering */
+static inline void atom_bitset_set(bitset_t *bs, uint32_t bit, int mo)
+{
+#ifdef LOCKFREE16
+   (void)__lockfree_fetch_or_16(bs, bitset_mask(bit), mo);
+#else
+   (void)__atomic_fetch_or(bs, bitset_mask(bit), mo);
+#endif
+}
+
+/* Atomic bit clear with memory ordering */
+static inline void atom_bitset_clr(bitset_t *bs, uint32_t bit, int mo)
+{
+#ifdef LOCKFREE16
+   (void)__lockfree_fetch_and_16(bs, ~bitset_mask(bit), mo);
+#else
+   (void)__atomic_fetch_and(bs, ~bitset_mask(bit), mo);
+#endif
+}
+
+/* Atomic exchange with memory ordering */
+static inline bitset_t atom_bitset_xchg(bitset_t *bs, bitset_t neu, int mo)
+{
+#ifdef LOCKFREE16
+   return __lockfree_exchange_16(bs, neu, mo);
+#else
+   return __atomic_exchange_n(bs, neu, mo);
+#endif

[lng-odp] [API-NEXT PATCH v7 0/5] A scalable software scheduler

2017-06-13 Thread Brian Brooks
This work derives from Ola Liljedahl's prototype [1] which introduced a
scalable scheduler design based on primarily lock-free algorithms and
data structures designed to decrease contention. A thread searches
through a data structure containing only queues that are both non-empty
and allowed to be scheduled to that thread. Strict priority scheduling is
respected, and (W)RR scheduling may be used within queues of the same priority.
Lastly, pre-scheduling or stashing is not employed since it is optional
functionality that can be implemented in the application.

In addition to scalable ring buffers, the algorithm also uses unbounded
concurrent queues. LL/SC and CAS variants exist in cases where absence of the
ABA problem cannot be proved, and also in cases where the compiler's atomic
built-ins may not be lowered to the desired instruction(s). Finally, a version
of the algorithm that uses locks is also provided.
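
To illustrate why the ABA problem matters for CAS-based structures, here is
a minimal sketch of a CAS-based LIFO. It is not the llqueue from this series;
the names struct node, struct stack, stack_push() and stack_pop() are
hypothetical and exist only for this illustration. Only GCC's __atomic
built-ins are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types, for illustration only */
struct node {
    struct node *next;
};

struct stack {
    struct node *top;
};

/* Push with a CAS loop. Push by itself is not exposed to ABA: the CAS
 * only needs 'top' to still hold the value that was linked into n->next. */
static inline void stack_push(struct stack *s, struct node *n)
{
    struct node *old = __atomic_load_n(&s->top, __ATOMIC_RELAXED);

    assert(n != NULL);
    do {
        n->next = old;
        /* On failure, 'old' is refreshed with the current top */
    } while (!__atomic_compare_exchange_n(&s->top, &old, n,
                                          true, /* weak */
                                          __ATOMIC_RELEASE,
                                          __ATOMIC_RELAXED));
}

/* Pop with a CAS loop. This is where ABA can occur: between reading
 * old->next and the CAS, other threads may pop 'old', change the list,
 * and push 'old' back. The CAS then succeeds although old->next is stale.
 * LL/SC avoids this because any intervening write fails the store. */
static inline struct node *stack_pop(struct stack *s)
{
    struct node *old = __atomic_load_n(&s->top, __ATOMIC_ACQUIRE);

    while (old != NULL &&
           !__atomic_compare_exchange_n(&s->top, &old, old->next,
                                        true, /* weak */
                                        __ATOMIC_ACQUIRE,
                                        __ATOMIC_ACQUIRE))
        ;
    return old;
}
```

Single-threaded use behaves like an ordinary stack; the hazard described in
the stack_pop() comment only appears when nodes are recycled concurrently.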

Use --enable-schedule-scalable to conditionally compile this scheduler
into the library.

[1] https://lists.linaro.org/pipermail/lng-odp/2016-September/025682.html

On checkpatch.pl:
 - [2/5] and [5/5] have checkpatch.pl issues that are superfluous

v7:
 - Rebase against new modular queue interface
 - Duplicate arch files under mips64 and powerpc
 - Fix sched->order_lock()
 - Loop until all deferred events have been enqueued
 - Implement ord_enq_multi()
 - Fix ordered_lock/unlock
 - Revert stylistic changes
 - Add default xfactor
 - Remove changes to odp_sched_latency
 - Remove ULL suffix to alleviate Clang build

v6:
 - Move conversions into scalable scheduler to alleviate #ifdefs
 - Remove unnecessary prefetch
 - Fix ARMv8 build

v5:
 - Allocate cache aligned memory using shm pool APIs
 - Move more code to scalable scheduler specific files
 - Remove CONFIG_SPLIT_READWRITE
 - Fix 'make distcheck' issue

v4:
 - Fix a couple more checkpatch.pl issues

v3:
 - Only conditionally compile scalable scheduler and queue
 - Move some code to arch/ dir
 - Use a single shm block for queues instead of block-per-queue
 - De-interleave odp_llqueue.h
 - Use compiler macros to determine ATOM_BITSET_SIZE
 - Incorporated queue size changes
 - Dropped 'ODP_' prefix on config and moved to other files
 - Dropped a few patches that were sent independently to the list

v2:
 - Move ARMv8 issues and other fixes into separate patches
 - Abstract away some #ifdefs
 - Fix some checkpatch.pl warnings

Brian Brooks (5):
  test: odp_pktio_ordered: add queue size
  Add arch/ files
  Add a bitset
  Add a concurrent queue
  Add scalable scheduler

 platform/linux-generic/Makefile.am |   10 +
 platform/linux-generic/arch/arm/odp_atomic.h   |  210 +++
 platform/linux-generic/arch/arm/odp_cpu.h  |   65 +
 platform/linux-generic/arch/arm/odp_cpu_idling.h   |   51 +
 platform/linux-generic/arch/arm/odp_llsc.h |  249 +++
 platform/linux-generic/arch/default/odp_cpu.h  |   41 +
 platform/linux-generic/arch/mips64/odp_cpu.h   |   41 +
 platform/linux-generic/arch/powerpc/odp_cpu.h  |   41 +
 platform/linux-generic/arch/x86/odp_cpu.h  |   41 +
 .../include/odp/api/plat/schedule_types.h  |4 +-
 platform/linux-generic/include/odp_bitset.h|  210 +++
 .../linux-generic/include/odp_config_internal.h|   17 +-
 platform/linux-generic/include/odp_llqueue.h   |  309 +++
 .../include/odp_queue_scalable_internal.h  |  102 +
 platform/linux-generic/include/odp_schedule_if.h   |2 +-
 .../linux-generic/include/odp_schedule_scalable.h  |  137 ++
 .../include/odp_schedule_scalable_config.h |   55 +
 .../include/odp_schedule_scalable_ordered.h|  132 ++
 platform/linux-generic/m4/odp_schedule.m4  |   55 +-
 platform/linux-generic/odp_queue_if.c  |8 +
 platform/linux-generic/odp_queue_scalable.c| 1020 ++
 platform/linux-generic/odp_schedule_if.c   |6 +
 platform/linux-generic/odp_schedule_scalable.c | 1978 
 .../linux-generic/odp_schedule_scalable_ordered.c  |  347 
 test/common_plat/performance/odp_pktio_ordered.c   |4 +
 25 files changed, 5113 insertions(+), 22 deletions(-)
 create mode 100644 platform/linux-generic/arch/arm/odp_atomic.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/arm/odp_cpu_idling.h
 create mode 100644 platform/linux-generic/arch/arm/odp_llsc.h
 create mode 100644 platform/linux-generic/arch/default/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/mips64/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/powerpc/odp_cpu.h
 create mode 100644 platform/linux-generic/arch/x86/odp_cpu.h
 create mode 100644 platform/linux-generic/include/odp_bitset.h
 create mode 100644 platform/linux-generic/include/odp_llqueue.h
 create mode 100644 platform/linux-generic/include/odp_queue_scalable_internal.h
 create mode 100644 platform/linux-generic/include/odp_schedule_scalable.h
 cre

[lng-odp] [API-NEXT PATCH v7 1/5] test: odp_pktio_ordered: add queue size

2017-06-13 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 test/common_plat/performance/odp_pktio_ordered.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/test/common_plat/performance/odp_pktio_ordered.c 
b/test/common_plat/performance/odp_pktio_ordered.c
index 4bb0bef9..50bfef51 100644
--- a/test/common_plat/performance/odp_pktio_ordered.c
+++ b/test/common_plat/performance/odp_pktio_ordered.c
@@ -91,6 +91,9 @@
 /** Maximum number of pktio queues per interface */
 #define MAX_QUEUES 32
 
+/** Seems to need at least 8192 elements per queue */
+#define QUEUE_SIZE 8192
+
 /** Maximum number of pktio interfaces */
 #define MAX_PKTIOS 8
 
@@ -1232,6 +1235,7 @@ int main(int argc, char *argv[])
qparam.sched.prio = ODP_SCHED_PRIO_DEFAULT;
qparam.sched.sync = ODP_SCHED_SYNC_ATOMIC;
qparam.sched.group = ODP_SCHED_GROUP_ALL;
+   qparam.size   = QUEUE_SIZE;
 
gbl_args->flow_qcontext[i][j].idx = i;
gbl_args->flow_qcontext[i][j].input_queue = 0;
-- 
2.13.1



[lng-odp] [PATCH v5 1/2] build: GCC 7 fixes

2017-06-08 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 DEPENDENCIES   |  5 ++--
 configure.ac   | 13 ++
 pkgconfig/libodp-linux.pc.in   |  2 +-
 platform/linux-generic/Makefile.am |  2 ++
 platform/linux-generic/m4/configure.m4 | 45 ++
 5 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/DEPENDENCIES b/DEPENDENCIES
index a194cad1..7bcbd5eb 100644
--- a/DEPENDENCIES
+++ b/DEPENDENCIES
@@ -8,13 +8,14 @@ Prerequisites for building the OpenDataPlane (ODP) API
 
automake
autoconf
+   autoconf-archive
libtool
 
On Debian/Ubuntu systems:
-   $ sudo apt-get install automake autoconf libtool
+   $ sudo apt-get install automake autoconf autoconf-archive libtool
 
On CentOS/RedHat/Fedora systems:
-   $ sudo yum install automake autoconf libtool
+   $ sudo yum install automake autoconf autoconf-archive libtool
 
 3. Required libraries
 
diff --git a/configure.ac b/configure.ac
index 7569ebe0..6351878a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -300,6 +300,19 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations 
-Wold-style-definition -Wpointer-
 ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual 
-Wformat-nonliteral"
 ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
 ODP_CFLAGS="$ODP_CFLAGS -std=c99"
+
+dnl Use -Werror in the checks below since Clang emits a warning instead of
+dnl an error when it encounters an unknown warning option.
+AX_CHECK_COMPILE_FLAG([-Wimplicit-fallthrough=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"],
+  [], [-Werror])
+AX_CHECK_COMPILE_FLAG([-Wformat-truncation=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-truncation=0"],
+  [], [-Werror])
+AX_CHECK_COMPILE_FLAG([-Wformat-overflow=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-overflow=0"],
+  [], [-Werror])
+
 # Extra flags for example to suppress certain warning types
 ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
 
diff --git a/pkgconfig/libodp-linux.pc.in b/pkgconfig/libodp-linux.pc.in
index 0769b214..61770175 100644
--- a/pkgconfig/libodp-linux.pc.in
+++ b/pkgconfig/libodp-linux.pc.in
@@ -7,5 +7,5 @@ Name: libodp-linux
 Description: The ODP packet processing engine
 Version: @PKGCONFIG_VERSION@
 Libs: -L${libdir} -lodp-linux
-Libs.private:
+Libs.private: @ATOMIC_LIBS@
 Cflags: -I${includedir}
diff --git a/platform/linux-generic/Makefile.am 
b/platform/linux-generic/Makefile.am
index 69fdf8b9..00ce80d7 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -219,6 +219,8 @@ __LIB__libodp_linux_la_SOURCES = \
   arch/@ARCH_DIR@/odp_cpu_arch.c \
   arch/@ARCH_DIR@/odp_sysinfo_parse.c
 
+__LIB__libodp_linux_la_LIBADD = $(ATOMIC_LIBS)
+
 if HAVE_PCAP
 __LIB__libodp_linux_la_SOURCES += pktio/pcap.c
 endif
diff --git a/platform/linux-generic/m4/configure.m4 
b/platform/linux-generic/m4/configure.m4
index a2a25408..6a429f1d 100644
--- a/platform/linux-generic/m4/configure.m4
+++ b/platform/linux-generic/m4/configure.m4
@@ -28,6 +28,51 @@ AC_LINK_IFELSE(
 echo "Use newer version. For gcc > 4.7.0"
 exit -1)
 
+dnl Check whether -latomic is needed
+use_libatomic=no
+
+AC_MSG_CHECKING(whether -latomic is needed for 64-bit atomic built-ins)
+AC_LINK_IFELSE(
+  [AC_LANG_SOURCE([[
+static int loc;
+int main(void)
+{
+int prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
+return 0;
+}
+]])],
+  [AC_MSG_RESULT(no)],
+  [AC_MSG_RESULT(yes)
+   AC_CHECK_LIB(
+ [atomic], [__atomic_exchange_8],
+ [use_libatomic=yes],
+ [AC_MSG_FAILURE([__atomic_exchange_8 is not available])])
+  ])
+
+AC_MSG_CHECKING(whether -latomic is needed for 128-bit atomic built-ins)
+AC_LINK_IFELSE(
+  [AC_LANG_SOURCE([[
+static __int128 loc;
+int main(void)
+{
+__int128 prev;
+prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
+return 0;
+}
+]])],
+  [AC_MSG_RESULT(no)],
+  [AC_MSG_RESULT(yes)
+   AC_CHECK_LIB(
+ [atomic], [__atomic_exchange_16],
+ [use_libatomic=yes],
+ [AC_MSG_FAILURE([cannot detect support for 128-bit atomics])])
+  ])
+
+if test "x$use_libatomic" = "xyes"; then
+  ATOMIC_LIBS="-latomic"
+fi
+AC_SUBST([ATOMIC_LIBS])
+
 m4_include([platform/linux-generic/m4/odp_pthread.m4])
 m4_include([platform/linux-generic/m4/odp_openssl.m4])
 m4_include([platform/linux-generic/m4/odp_pcap.m4])
-- 
2.13.0



[lng-odp] [PATCH v5 0/2] GCC 7 fixes

2017-06-08 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027
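
For readers unfamiliar with the new diagnostics, here is a small sketch of
the two code patterns involved. It is not taken from ODP; weight() and
format_name() are hypothetical helpers. GCC 7 enables -Wimplicit-fallthrough
through -Wextra, and -Wformat-truncation and -Wformat-overflow through
-Wformat; the "=0" forms used in this series switch those checks off.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper with an intentional case fall through. Without the
 * marker comment, GCC 7 with -Wextra warns at the 'case 1:' label. */
static int weight(int kind)
{
    int w = 0;

    switch (kind) {
    case 2:
        w += 10;
        /* fall through */
    case 1:
        w += 1;
        break;
    default:
        break;
    }
    return w;
}

/* Hypothetical helper that formats into a caller-supplied buffer and
 * reports truncation. Returns 1 if the output did not fit (or on error),
 * 0 otherwise. Checking the snprintf() result like this is the kind of
 * handling the format-truncation analysis looks for. */
static int format_name(char *dst, size_t len, const char *prefix, int idx)
{
    int n;

    assert(dst != NULL && len > 0);
    n = snprintf(dst, len, "%s-%d", prefix, idx);
    /* snprintf() returns the length it would have written */
    return n < 0 || (size_t)n >= len;
}
```

Whether a given call site actually triggers a warning depends on what the
compiler can prove about buffer and argument sizes, so this is only a sketch
of the patterns, not a reproduction of the ODP warnings.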


v5:
 - Make -latomic only a libodp.a/so dependency (Dmitry)

v4:
 - Add -Werror to AX_CHECK_COMPILE_FLAG for Clang (Bill)

v3:
 - Split into multiple patch files (Maxim)
 - Disable warnings in favor of patching right now (Dmitry)
 - Improve libatomic detection in autoconf (Dmitry)
 - Add autoconf-archive to DEPENDENCIES file

v2:
 - Add Bug id to commit message (Bill)


Brian Brooks (2):
  build: GCC 7 fixes
  pktio: GCC 7 fixes

 DEPENDENCIES  |  5 +--
 configure.ac  | 13 
 pkgconfig/libodp-linux.pc.in  |  2 +-
 platform/linux-generic/Makefile.am|  2 ++
 platform/linux-generic/m4/configure.m4| 45 +++
 test/common_plat/validation/api/pktio/pktio.c |  4 ++-
 6 files changed, 67 insertions(+), 4 deletions(-)

-- 
2.13.0



[lng-odp] [PATCH v5 2/2] pktio: GCC 7 fixes

2017-06-08 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 test/common_plat/validation/api/pktio/pktio.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/test/common_plat/validation/api/pktio/pktio.c 
b/test/common_plat/validation/api/pktio/pktio.c
index 11fe974f..4d8d2cc7 100644
--- a/test/common_plat/validation/api/pktio/pktio.c
+++ b/test/common_plat/validation/api/pktio/pktio.c
@@ -1429,7 +1429,9 @@ int pktio_check_statistics_counters(void)
 void pktio_test_statistics_counters(void)
 {
odp_pktio_t pktio_rx, pktio_tx;
-   odp_pktio_t pktio[MAX_NUM_IFACES];
+   odp_pktio_t pktio[MAX_NUM_IFACES] = {
+   ODP_PKTIO_INVALID, ODP_PKTIO_INVALID
+   };
odp_packet_t pkt;
odp_packet_t tx_pkt[1000];
uint32_t pkt_seq[1000];
-- 
2.13.0



[lng-odp] [API-NEXT v3 2/2] timer: allow timer processing to run on worker cores

2017-06-08 Thread Brian Brooks
Run timer pool processing on worker cores if the application hints
that the scheduler will be used. This reduces the latency and jitter
of the point at which timer pool processing begins. See [1] for details.

[1] 
https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---

v3:
 - Add rate limiting by scheduling rounds

v2:
 - Reword 'worker_timers' to 'use_scheduler'
 - Use ODP Time instead of ticks


 include/odp/api/spec/init.h|  5 ++
 platform/linux-generic/include/odp_internal.h  |  1 +
 .../linux-generic/include/odp_timer_internal.h | 11 +++
 platform/linux-generic/odp_init.c  |  8 +-
 platform/linux-generic/odp_schedule.c  |  4 +-
 platform/linux-generic/odp_schedule_iquery.c   |  3 +
 platform/linux-generic/odp_schedule_sp.c   |  4 +
 platform/linux-generic/odp_timer.c | 98 --
 8 files changed, 122 insertions(+), 12 deletions(-)

diff --git a/include/odp/api/spec/init.h b/include/odp/api/spec/init.h
index 154cdf8f..44950893 100644
--- a/include/odp/api/spec/init.h
+++ b/include/odp/api/spec/init.h
@@ -153,6 +153,11 @@ typedef struct odp_init_t {
odp_log_func_t log_fn;
/** Replacement for the default abort fn */
odp_abort_func_t abort_fn;
+   /** Whether the application will ever call odp_schedule() or not.
+
+   Default: true
+   */
+   odp_bool_t use_scheduler;
 } odp_init_t;
 
 /**
diff --git a/platform/linux-generic/include/odp_internal.h 
b/platform/linux-generic/include/odp_internal.h
index 90e2a629..26a5ffd3 100644
--- a/platform/linux-generic/include/odp_internal.h
+++ b/platform/linux-generic/include/odp_internal.h
@@ -51,6 +51,7 @@ struct odp_global_data_s {
odp_cpumask_t worker_cpus;
int num_cpus_installed;
config_t configuration;
+   odp_bool_t use_scheduler;
 };
 
 enum init_stage {
diff --git a/platform/linux-generic/include/odp_timer_internal.h 
b/platform/linux-generic/include/odp_timer_internal.h
index 91b12c54..cd09a0fe 100644
--- a/platform/linux-generic/include/odp_timer_internal.h
+++ b/platform/linux-generic/include/odp_timer_internal.h
@@ -20,6 +20,15 @@
 #include 
 #include 
 
+/* Minimum number of nanoseconds between calls to _timer_run() per thread. */
+#define CONFIG_TIMER_RUN_RATELIMIT_NS 100
+
+/*
+ * Minimum number of scheduling rounds between calls to _timer_run()
+ * per thread.
+ */
+#define CONFIG_TIMER_RUN_RATELIMIT_ROUNDS 1
+
 /**
  * Internal Timeout header
  */
@@ -35,4 +44,6 @@ typedef struct {
odp_timer_t timer;
 } odp_timeout_hdr_t;
 
+unsigned timer_run(void);
+
 #endif
diff --git a/platform/linux-generic/odp_init.c 
b/platform/linux-generic/odp_init.c
index 685e02fa..cbec83c3 100644
--- a/platform/linux-generic/odp_init.c
+++ b/platform/linux-generic/odp_init.c
@@ -4,11 +4,13 @@
  * SPDX-License-Identifier: BSD-3-Clause
  */
 #include 
-#include 
 #include 
-#include 
+
 #include 
+#include 
 #include 
+
+#include 
 #include 
 #include 
 #include 
@@ -159,12 +161,14 @@ int odp_init_global(odp_instance_t *instance,
enum init_stage stage = NO_INIT;
odp_global_data.log_fn = odp_override_log;
odp_global_data.abort_fn = odp_override_abort;
+   odp_global_data.use_scheduler = true;
 
if (params != NULL) {
if (params->log_fn != NULL)
odp_global_data.log_fn = params->log_fn;
if (params->abort_fn != NULL)
odp_global_data.abort_fn = params->abort_fn;
+   odp_global_data.use_scheduler = params->use_scheduler;
}
 
cleanup_files(_ODP_TMPDIR, odp_global_data.main_pid);
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index c4567d81..f8ff315c 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels  */
 #define NUM_PRIO 8
@@ -988,7 +989,6 @@ static inline int do_schedule(odp_queue_t *out_queue, 
odp_event_t out_ev[],
return 0;
 }
 
-
 static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
 odp_event_t out_ev[],
 unsigned int max_num)
@@ -998,6 +998,8 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t 
wait,
int ret;
 
while (1) {
+   timer_run();
+
ret = do_schedule(out_queue, out_ev, max_num);
 
if (ret)
diff --git a/platform/linux-generic/odp_schedule_iquery.c 
b/platform/linux-generic/odp_schedule_iquery.c
index 75470aff..e865650a 100644
--- a/platform/linux-generic/odp_schedule_ique

[lng-odp] [API-NEXT v3 1/2] timer: organize #include

2017-06-08 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Kevin Wang <kevin.w...@arm.com>
---
 platform/linux-generic/odp_timer.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/platform/linux-generic/odp_timer.c 
b/platform/linux-generic/odp_timer.c
index cf610bfa..4539ea48 100644
--- a/platform/linux-generic/odp_timer.c
+++ b/platform/linux-generic/odp_timer.c
@@ -22,29 +22,23 @@
 #include 
 
 #include 
+#include 
+#include 
+#include 
 #include 
+#include 
+#include 
 #include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
-#include 
 #include 
-#include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -52,6 +46,13 @@
 #include 
 #include 
 #include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
 
 #define TMO_UNUSED   ((uint64_t)0x)
-- 
2.13.0



Re: [lng-odp] [API-NEXT v2 2/2] timer: allow timer processing to run on worker cores

2017-06-08 Thread Brian Brooks
On 06/08 08:08:52, Savolainen, Petri (Nokia - FI/Espoo) wrote:
> It's always a trade-off between performance (of other than timeout events) 
> and timeout accuracy. There are variety of ODP applications, which means that 
> it's hard to find single timer poll strategy which fit all. Optimally, 
> polling would adapt to the timer usage.
> 
> Consider application that handles 1M packets / sec per cpu and requests 
> timeouts with a period of 10 or 100ms. In practice, it would be enough to 
> check the time and run timer code, only on every 1k or 10k events (packets). 
> Especially, if time stamp read is a system call, it might not be a great idea 
> to do that for every 1us (for every packet or every schedule() call).

Agree with analysis, but the major assumption here is that the traffic
is deterministic. Perhaps there are studies on the characteristics or level
of determinism of traffic or link utilization of boxes in the field?
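
To make the dependency on traffic rate concrete, here is a small sketch of
the arithmetic. check_interval_ns() is a hypothetical helper, not ODP code:
it computes how much time passes between timer checks when the check is
triggered by an event count rather than by the clock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: time between timer checks, in nanoseconds, when
 * the timer code runs once per 'events_per_check' events and events
 * arrive at 'events_per_sec'. */
static uint64_t check_interval_ns(uint64_t events_per_check,
                                  uint64_t events_per_sec)
{
    assert(events_per_sec != 0);
    return events_per_check * UINT64_C(1000000000) / events_per_sec;
}
```

At 1M events/s, checking every 1k events gives a 1 ms interval, which suits
a 10 ms timeout period. If the rate drops to 10k events/s, the same setting
stretches the interval to 100 ms, so the accuracy of event-counted polling
follows the traffic rate.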

> -Petri
> 
> 
> > -Original Message-
> > From: Bogdan Pricope [mailto:bogdan.pric...@linaro.org]
> > Sent: Thursday, June 08, 2017 9:22 AM
> > To: Brian Brooks <brian.bro...@arm.com>
> > Cc: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com>;
> > lng-odp@lists.linaro.org
> > Subject: Re: [lng-odp] [API-NEXT v2 2/2] timer: allow timer processing to
> > run on worker cores
> > 
> > Brian, during yesterday meeting you mentioned about modifying this
> > patch to call timer processing function (_timer_run()) after fixed (or
> > dynamically determined) number of loops:
> >  - If you meant application loops (calls to odp_schedule()), this is
> > wrong for the reasons presented in my prev. email
> >  - If you meant scheduler internal loops (calls to do_schedule() from
> > inside schedule_loop()), this may work.
> > 
> > On 7 June 2017 at 20:02, Brian Brooks <brian.bro...@arm.com> wrote:
> > > On 06/07 17:39:16, Bogdan Pricope wrote:
> > >> In OFP we have this use case: N-1 cores are doing packet processing in
> > >> direct pktin mode; core 0 is doing odp_schedule() to process timers
> > >> (ARP entries expiration) and maybe other events.
> > >>
> > >> To take from this:
> > >> -  If scheduler + timers are used, may not be use on all cores
> > >> -  If scheduler + timers are used, odp_schedule() may not be
> > >> called at the rate of the packets
> > >>
> > >>
> > >> So, is not a good idea to condition timer processing to the number of
> > >> odp_schedule() calls.
> > >
> > > Can you please elaborate on how you reached this conclusion? I did not
> > > understand.
> > >
> > >> On 7 June 2017 at 10:30, Savolainen, Petri (Nokia - FI/Espoo)
> > >> <petri.savolai...@nokia.com> wrote:
> > >> >
> > >> >>
> > >> >> diff --git a/include/odp/api/spec/init.h
> > b/include/odp/api/spec/init.h
> > >> >> index 154cdf8f..44950893 100644
> > >> >> --- a/include/odp/api/spec/init.h
> > >> >> +++ b/include/odp/api/spec/init.h
> > >> >> @@ -153,6 +153,11 @@ typedef struct odp_init_t {
> > >> >>   odp_log_func_t log_fn;
> > >> >>   /** Replacement for the default abort fn */
> > >> >>   odp_abort_func_t abort_fn;
> > >> >> + /** Whether the application will ever call odp_schedule() or
> > >> >> not.
> > >> >> +
> > >> >> + Default: true
> > >> >> + */
> > >> >> + odp_bool_t use_scheduler;
> > >> >>  } odp_init_t;
> > >> >
> > >> >
> > >> > This is an API change. It should be in a separate patch and the
> > >> > subject should be "api: init: ...".
> > >> >
> > >> > Anyway, Bill already sent a proposal that is closer to what I
> > >> > described earlier. The init struct should have an odp_feature_t bit
> > >> > field for feature / unused_feature selection.
> > >> >
> > >> > -Petri


Re: [lng-odp] [API-NEXT v2 2/2] timer: allow timer processing to run on worker cores

2017-06-08 Thread Brian Brooks
On 06/08 09:21:35, Bogdan Pricope wrote:
> Brian, during yesterday's meeting you mentioned modifying this
> patch to call the timer processing function (_timer_run()) after a fixed (or
> dynamically determined) number of loops:
>  - If you meant application loops (calls to odp_schedule()), this is
> wrong for the reasons presented in my previous email.
>  - If you meant scheduler internal loops (calls to do_schedule() from
> inside schedule_loop()), this may work.

We are in sync. I am thinking of scheduler internal loops.

I implemented this yesterday as a build config (no #ifdefs!) and the
compiler will optimize out the rate limiting code if _timer_run() should
be called on every scheduler internal loop. So, best of both worlds: no
penalty if rate limiting by scheduler internal loops is not used.
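
To illustrate the idea, here is a minimal sketch of rate limiting by
scheduler internal loops with a build-time constant instead of #ifdefs.
It is not the code from the patch: CONFIG_TIMER_RUN_EVERY_N_LOOPS,
fake_timer_run() and maybe_timer_run() are hypothetical names made up
for this example.

```c
#include <stdint.h>

/* Hypothetical build-time knob: run timer processing once every N
 * scheduler internal loops. 1 means "on every loop". */
#define CONFIG_TIMER_RUN_EVERY_N_LOOPS 1U

/* Stand-in for the real timer processing function (assumption). */
static unsigned int timer_runs;

static void fake_timer_run(void)
{
	timer_runs++;
}

/* Meant to be called at the top of each scheduler internal loop. */
static inline void maybe_timer_run(uint32_t *loop_count)
{
	/* The outer condition is a compile-time constant, so with N == 1
	 * the compiler can drop the counter logic entirely. */
	if (CONFIG_TIMER_RUN_EVERY_N_LOOPS > 1U) {
		if (++(*loop_count) < CONFIG_TIMER_RUN_EVERY_N_LOOPS)
			return;
		*loop_count = 0;
	}

	fake_timer_run();
}
```

With the knob set to 1 the counter is never touched and every loop
iteration reaches the timer call; a larger value trades timer latency
for fewer calls.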

> On 7 June 2017 at 20:02, Brian Brooks <brian.bro...@arm.com> wrote:
> > On 06/07 17:39:16, Bogdan Pricope wrote:
> >> In OFP we have this use case: N-1 cores are doing packet processing in
> >> direct pktin mode; core 0 is doing odp_schedule() to process timers
> >> (ARP entries expiration) and maybe other events.
> >>
> >> To take from this:
> >> -  If scheduler + timers are used, they may not be used on all cores
> >> -  If scheduler + timers are used, odp_schedule() may not be
> >> called at the rate of the packets
> >>
> >>
> >> So, it is not a good idea to condition timer processing on the number of
> >> odp_schedule() calls.
> >
> > Can you please elaborate on how you reached this conclusion? I did not
> > understand.
> >
> >> On 7 June 2017 at 10:30, Savolainen, Petri (Nokia - FI/Espoo)
> >> <petri.savolai...@nokia.com> wrote:
> >> >
> >> >>
> >> >> diff --git a/include/odp/api/spec/init.h b/include/odp/api/spec/init.h
> >> >> index 154cdf8f..44950893 100644
> >> >> --- a/include/odp/api/spec/init.h
> >> >> +++ b/include/odp/api/spec/init.h
> >> >> @@ -153,6 +153,11 @@ typedef struct odp_init_t {
> >> >>   odp_log_func_t log_fn;
> >> >>   /** Replacement for the default abort fn */
> >> >>   odp_abort_func_t abort_fn;
> >> >> + /** Whether the application will ever call odp_schedule() or
> >> >> not.
> >> >> +
> >> >> + Default: true
> >> >> + */
> >> >> + odp_bool_t use_scheduler;
> >> >>  } odp_init_t;
> >> >
> >> >
> >> > This is an API change. It should be in a separate patch and the subject
> >> > should be "api: init: ...".
> >> >
> >> > Anyway, Bill already sent a proposal that is closer to what I described
> >> > earlier. The init struct should have an odp_feature_t bit field for feature /
> >> > unused_feature selection.
> >> >
> >> > -Petri


Re: [lng-odp] [PATCH v4 1/2] build: GCC 7 fixes

2017-06-08 Thread Brian Brooks
On 06/08 16:00:21, Dmitry Eremin-Solenikov wrote:
> On 08.06.2017 06:40, Brian Brooks wrote:
> > The GCC 7 series introduces changes that expose ODP compilation
> > issues. These include case statement fall through warnings, and
> > stricter checks on potential string overflows and other semantic
> > analysis.
> > 
> > Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027
> > 
> > Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > ---
> >  DEPENDENCIES   |  5 ++--
> >  configure.ac   | 13 ++
> >  platform/linux-generic/m4/configure.m4 | 44 
> > ++
> >  3 files changed, 60 insertions(+), 2 deletions(-)
> > 
> > diff --git a/DEPENDENCIES b/DEPENDENCIES
> > index a194cad1..7bcbd5eb 100644
> > --- a/DEPENDENCIES
> > +++ b/DEPENDENCIES
> > @@ -8,13 +8,14 @@ Prerequisites for building the OpenDataPlane (ODP) API
> >  
> > automake
> > autoconf
> > +   autoconf-archive
> > libtool
> >  
> > On Debian/Ubuntu systems:
> > -   $ sudo apt-get install automake autoconf libtool
> > +   $ sudo apt-get install automake autoconf autoconf-archive libtool
> >  
> > On CentOS/RedHat/Fedora systems:
> > -   $ sudo yum install automake autoconf libtool
> > +   $ sudo yum install automake autoconf autoconf-archive libtool
> 
> As with ax_pthread, it might be easier to just import corresponding .m4
> file.

I'd prefer we just require autoconf-archive and remove the other duplicated
m4 files from this repo. checkpatch.pl is an example of why copying files
is generally not a good practice.

> >  
> >  3. Required libraries
> >  
> > diff --git a/configure.ac b/configure.ac
> > index 7569ebe0..6351878a 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -300,6 +300,19 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations 
> > -Wold-style-definition -Wpointer-
> >  ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual 
> > -Wformat-nonliteral"
> >  ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
> >  ODP_CFLAGS="$ODP_CFLAGS -std=c99"
> > +
> > +dnl Use -Werror in the checks below since Clang emits a warning instead of
> > +dnl an error when it encounters an unknown warning option.
> > +AX_CHECK_COMPILE_FLAG([-Wimplicit-fallthrough=0],
> > +  [ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"],
> > +  [], [-Werror])
> > +AX_CHECK_COMPILE_FLAG([-Wformat-truncation=0],
> > +  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-truncation=0"],
> > +  [], [-Werror])
> > +AX_CHECK_COMPILE_FLAG([-Wformat-overflow=0],
> > +  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-overflow=0"],
> > +  [], [-Werror])
> > +
> >  # Extra flags for example to suppress certain warning types
> >  ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
> >  
> > diff --git a/platform/linux-generic/m4/configure.m4 
> > b/platform/linux-generic/m4/configure.m4
> > index a2a25408..be04da8a 100644
> > --- a/platform/linux-generic/m4/configure.m4
> > +++ b/platform/linux-generic/m4/configure.m4
> > @@ -28,6 +28,50 @@ AC_LINK_IFELSE(
> >  echo "Use newer version. For gcc > 4.7.0"
> >  exit -1)
> >  
> > +dnl Check whether -latomic is needed
> > +use_libatomic=no
> > +
> > +AC_MSG_CHECKING(whether -latomic is needed for 64-bit atomic built-ins)
> > +AC_LINK_IFELSE(
> > +  [AC_LANG_SOURCE([[
> > +static int loc;
> > +int main(void)
> > +{
> > +int prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
> > +return 0;
> > +}
> > +]])],
> > +  [AC_MSG_RESULT(no)],
> > +  [AC_MSG_RESULT(yes)
> > +   AC_CHECK_LIB(
> > + [atomic], [__atomic_exchange_8],
> > + [use_libatomic=yes],
> > + [AC_MSG_FAILURE([__atomic_exchange_8 is not available])])
> > +  ])
> > +
> > +AC_MSG_CHECKING(whether -latomic is needed for 128-bit atomic built-ins)
> > +AC_LINK_IFELSE(
> > +  [AC_LANG_SOURCE([[
> > +static __int128 loc;
> > +int main(void)
> > +{
> > +__int128 prev;
> > +prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
> > +return 0;
> > +}
> > +]])],
> > +  [AC_MSG_RESULT(no)],
> > +  [AC_MSG_RESULT(yes)
> > +   AC_CHECK_LIB(
> > + [atomic], [__atomic_exchange_16],
> > + [use_libatomic=yes],
> > + [AC_MSG_FAILURE([cannot detect support for 128-bit atomics])])
> > +  ])
> > +
> > +if test "x$use_libatomic" = "xyes"; then
> > +  AM_LDFLAGS="$AM_LDFLAGS -latomic"
> > +fi
> 
> Could you please change this to ATOMIC_LDFLAGS (see PR #45)

I see. Yes.

> > +
> >  m4_include([platform/linux-generic/m4/odp_pthread.m4])
> >  m4_include([platform/linux-generic/m4/odp_openssl.m4])
> >  m4_include([platform/linux-generic/m4/odp_pcap.m4])
> > 
> 
> 
> -- 
> With best wishes
> Dmitry


[lng-odp] [PATCH] arch: arm: add CPU global time

2017-06-07 Thread Brian Brooks
Expose ARMv8 Generic Timer through internal CPU global time functions.

Signed-off-by: Brian Brooks <brian.bro...@arm.com>

---

v2:
 - Add text to explain the usage of the ARM architected timer (Petri)


 platform/linux-generic/arch/arm/odp_cpu_arch.c | 38 +-
 1 file changed, 37 insertions(+), 1 deletion(-)

diff --git a/platform/linux-generic/arch/arm/odp_cpu_arch.c 
b/platform/linux-generic/arch/arm/odp_cpu_arch.c
index c31f9084..91d439d9 100644
--- a/platform/linux-generic/arch/arm/odp_cpu_arch.c
+++ b/platform/linux-generic/arch/arm/odp_cpu_arch.c
@@ -50,15 +50,51 @@ uint64_t odp_cpu_cycles_resolution(void)
 
 int cpu_has_global_time(void)
 {
-   return 0;
+   uint64_t hz = cpu_global_time_freq();
+
+   /*
+* The system counter portion of the architected timer must
+* provide a uniform view of system time to all processing
+* elements in the system. This should hold true even for
+* heterogeneous SoCs.
+*
+* Determine whether the system has 'global time' by checking
+* whether a read of the architected timer frequency sys reg
+* returns a sane value. Sane is considered to be within
+* 1MHz and 6GHz (1us and .1667ns period).
+*/
+   return hz >= 1000000 && hz <= 6000000000;
 }
 
 uint64_t cpu_global_time(void)
 {
+#if __ARM_ARCH == 8
+   uint64_t cntvct;
+
+   /*
+* To be consistent with other architectures, do not issue a
+* serializing instruction, e.g. ISB, before reading this
+* sys reg.
+*/
+
+   /* Memory clobber to minimize optimization around load from sys reg. */
+   __asm__ volatile("mrs %0, cntvct_el0" : "=r"(cntvct) : : "memory");
+
+   return cntvct;
+#else
return 0;
+#endif
 }
 
 uint64_t cpu_global_time_freq(void)
 {
+#if __ARM_ARCH == 8
+   uint64_t cntfrq;
+
+   __asm__ volatile("mrs %0, cntfrq_el0" : "=r"(cntfrq) : : );
+
+   return cntfrq;
+#else
return 0;
+#endif
 }
-- 
2.13.0
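
For readers following the sanity check described in the comment above,
here is a minimal, architecture-independent sketch of the range test
(1 MHz to 6 GHz) plus a tick-to-nanosecond conversion. It does not read
any system register; freq_is_sane() and ticks_to_ns() are hypothetical
helper names for this example and are not part of the patch.

```c
#include <stdint.h>

#define HZ_MIN UINT64_C(1000000)     /* 1 MHz, 1 us period */
#define HZ_MAX UINT64_C(6000000000)  /* 6 GHz, about 0.1667 ns period */

/* Return non-zero if a reported counter frequency looks plausible. */
static int freq_is_sane(uint64_t hz)
{
	return hz >= HZ_MIN && hz <= HZ_MAX;
}

/* Convert counter ticks to nanoseconds. Assumes freq_is_sane(hz). */
static uint64_t ticks_to_ns(uint64_t ticks, uint64_t hz)
{
	/* Split into whole seconds and a remainder so that the
	 * multiplication by 1e9 cannot overflow for sane frequencies. */
	uint64_t sec = ticks / hz;
	uint64_t rem = ticks % hz;

	return sec * UINT64_C(1000000000) +
	       (rem * UINT64_C(1000000000)) / hz;
}
```

A frequency of 0, which is what the non-ARMv8 fallback returns, fails
the range test, so callers would see "no global time" in that case.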



[lng-odp] [PATCH v4 0/2] GCC 7 fixes

2017-06-07 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Brian Brooks (2):
  build: GCC 7 fixes
  pktio: GCC 7 fixes

 DEPENDENCIES  |  5 +--
 configure.ac  | 13 
 platform/linux-generic/m4/configure.m4| 44 +++
 test/common_plat/validation/api/pktio/pktio.c |  4 ++-
 4 files changed, 63 insertions(+), 3 deletions(-)

--

v4:
 - Add -Werror to AX_CHECK_COMPILE_FLAG for Clang (Bill)

v3:
 - Split into multiple patch files (Maxim)
 - Disable warnings in favor of patching right now (Dmitry)
 - Improve libatomic detection in autoconf (Dmitry)
 - Add autoconf-archive to DEPENDENCIES file

v2:
 - Add Bug id to commit message (Bill)

2.13.0
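
As background for the two warning classes this series disables, here is
a small sketch of source-level patterns that, as far as I know, GCC 7
accepts without -Wimplicit-fallthrough or -Wformat-truncation
diagnostics at their default levels. The functions copy_name() and
classify() are hypothetical and are not taken from ODP.

```c
#include <stdio.h>

/* Copy src into dst. Return 0 on success, -1 if the output was
 * truncated. Checking the snprintf() return value is what makes the
 * truncation explicit to both the reader and the compiler. */
static int copy_name(char *dst, size_t dst_len, const char *src)
{
	int n = snprintf(dst, dst_len, "%s", src);

	return (n < 0 || (size_t)n >= dst_len) ? -1 : 0;
}

/* Intentional fall through marked with a comment. */
static int classify(int level)
{
	int flags = 0;

	switch (level) {
	case 2:
		flags |= 2;
		/* fall through */
	case 1:
		flags |= 1;
		break;
	default:
		break;
	}

	return flags;
}
```

This is the "patch the code" alternative to passing
-Wimplicit-fallthrough=0 and -Wformat-truncation=0; the series chooses
to disable the warnings for now instead.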



[lng-odp] [PATCH v4 1/2] build: GCC 7 fixes

2017-06-07 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 DEPENDENCIES   |  5 ++--
 configure.ac   | 13 ++
 platform/linux-generic/m4/configure.m4 | 44 ++
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/DEPENDENCIES b/DEPENDENCIES
index a194cad1..7bcbd5eb 100644
--- a/DEPENDENCIES
+++ b/DEPENDENCIES
@@ -8,13 +8,14 @@ Prerequisites for building the OpenDataPlane (ODP) API
 
automake
autoconf
+   autoconf-archive
libtool
 
On Debian/Ubuntu systems:
-   $ sudo apt-get install automake autoconf libtool
+   $ sudo apt-get install automake autoconf autoconf-archive libtool
 
On CentOS/RedHat/Fedora systems:
-   $ sudo yum install automake autoconf libtool
+   $ sudo yum install automake autoconf autoconf-archive libtool
 
 3. Required libraries
 
diff --git a/configure.ac b/configure.ac
index 7569ebe0..6351878a 100644
--- a/configure.ac
+++ b/configure.ac
@@ -300,6 +300,19 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations 
-Wold-style-definition -Wpointer-
 ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual 
-Wformat-nonliteral"
 ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
 ODP_CFLAGS="$ODP_CFLAGS -std=c99"
+
+dnl Use -Werror in the checks below since Clang emits a warning instead of
+dnl an error when it encounters an unknown warning option.
+AX_CHECK_COMPILE_FLAG([-Wimplicit-fallthrough=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"],
+  [], [-Werror])
+AX_CHECK_COMPILE_FLAG([-Wformat-truncation=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-truncation=0"],
+  [], [-Werror])
+AX_CHECK_COMPILE_FLAG([-Wformat-overflow=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-overflow=0"],
+  [], [-Werror])
+
 # Extra flags for example to suppress certain warning types
 ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
 
diff --git a/platform/linux-generic/m4/configure.m4 
b/platform/linux-generic/m4/configure.m4
index a2a25408..be04da8a 100644
--- a/platform/linux-generic/m4/configure.m4
+++ b/platform/linux-generic/m4/configure.m4
@@ -28,6 +28,50 @@ AC_LINK_IFELSE(
 echo "Use newer version. For gcc > 4.7.0"
 exit -1)
 
+dnl Check whether -latomic is needed
+use_libatomic=no
+
+AC_MSG_CHECKING(whether -latomic is needed for 64-bit atomic built-ins)
+AC_LINK_IFELSE(
+  [AC_LANG_SOURCE([[
+static int loc;
+int main(void)
+{
+int prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
+return 0;
+}
+]])],
+  [AC_MSG_RESULT(no)],
+  [AC_MSG_RESULT(yes)
+   AC_CHECK_LIB(
+ [atomic], [__atomic_exchange_8],
+ [use_libatomic=yes],
+ [AC_MSG_FAILURE([__atomic_exchange_8 is not available])])
+  ])
+
+AC_MSG_CHECKING(whether -latomic is needed for 128-bit atomic built-ins)
+AC_LINK_IFELSE(
+  [AC_LANG_SOURCE([[
+static __int128 loc;
+int main(void)
+{
+__int128 prev;
+prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
+return 0;
+}
+]])],
+  [AC_MSG_RESULT(no)],
+  [AC_MSG_RESULT(yes)
+   AC_CHECK_LIB(
+ [atomic], [__atomic_exchange_16],
+ [use_libatomic=yes],
+ [AC_MSG_FAILURE([cannot detect support for 128-bit atomics])])
+  ])
+
+if test "x$use_libatomic" = "xyes"; then
+  AM_LDFLAGS="$AM_LDFLAGS -latomic"
+fi
+
 m4_include([platform/linux-generic/m4/odp_pthread.m4])
 m4_include([platform/linux-generic/m4/odp_openssl.m4])
 m4_include([platform/linux-generic/m4/odp_pcap.m4])
-- 
2.13.0



[lng-odp] [PATCH v4 2/2] pktio: GCC 7 fixes

2017-06-07 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 test/common_plat/validation/api/pktio/pktio.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/test/common_plat/validation/api/pktio/pktio.c 
b/test/common_plat/validation/api/pktio/pktio.c
index 11fe974f..4d8d2cc7 100644
--- a/test/common_plat/validation/api/pktio/pktio.c
+++ b/test/common_plat/validation/api/pktio/pktio.c
@@ -1429,7 +1429,9 @@ int pktio_check_statistics_counters(void)
 void pktio_test_statistics_counters(void)
 {
odp_pktio_t pktio_rx, pktio_tx;
-   odp_pktio_t pktio[MAX_NUM_IFACES];
+   odp_pktio_t pktio[MAX_NUM_IFACES] = {
+   ODP_PKTIO_INVALID, ODP_PKTIO_INVALID
+   };
odp_packet_t pkt;
odp_packet_t tx_pkt[1000];
uint32_t pkt_seq[1000];
-- 
2.13.0
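
A side note on the initializer used in this fix: in C, array elements
without an explicit initializer are zero-initialized, so a brace list
with two entries only covers the first two slots. The sketch below shows
that behavior and a loop-based alternative that stays correct if the
array grows. HANDLE_INVALID and init_handles() are hypothetical stand-ins
for this example, not ODP definitions.

```c
#include <stddef.h>

/* Hypothetical "invalid handle" value for the example. */
#define HANDLE_INVALID (-1)

/* Set every element to the invalid value, regardless of array size. */
static void init_handles(int *handles, size_t num)
{
	size_t i;

	for (i = 0; i < num; i++)
		handles[i] = HANDLE_INVALID;
}
```

For example, given `int h[4] = { HANDLE_INVALID, HANDLE_INVALID };` the
elements h[2] and h[3] are 0 until init_handles(h, 4) is called.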



Re: [lng-odp] [API-NEXT v2 0/2] inline timers

2017-06-07 Thread Brian Brooks
LGTM. Updates from GitHub sometimes propagate to the mailing list, but I
did not see one for the update that you made today.

On Wed, Jun 7, 2017 at 2:11 PM, Bill Fischofer
<bill.fischo...@linaro.org> wrote:
> I've posted pull request https://github.com/Linaro/odp/pull/46 to add
> the odp_feature_t type and its usage in odp_init_global.
>
> On Tue, Jun 6, 2017 at 2:09 PM, Brian Brooks <brian.bro...@arm.com> wrote:
>> Run timer pool processing on worker cores if the application hints
>> that the scheduler will be used. This reduces the latency and jitter
>> of the point at which timer pool processing begins. See [1] for details.
>>
>> [1] 
>> https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing
>>
>> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
>>
>> v2:
>>  - Reword 'worker_timers' to 'use_scheduler'
>>  - Use ODP Time instead of ticks
>>
>> Brian Brooks (2):
>>   timer: organize #include
>>   timer: allow timer processing to run on worker cores
>>
>>  include/odp/api/spec/init.h|   5 +
>>  platform/linux-generic/include/odp_internal.h  |   1 +
>>  .../linux-generic/include/odp_timer_internal.h |   5 +
>>  platform/linux-generic/odp_init.c  |   8 +-
>>  platform/linux-generic/odp_schedule.c  |   4 +-
>>  platform/linux-generic/odp_schedule_iquery.c   |   3 +
>>  platform/linux-generic/odp_schedule_sp.c   |   4 +
>>  platform/linux-generic/odp_timer.c | 116 
>> +
>>  8 files changed, 122 insertions(+), 24 deletions(-)
>>
>> --
>> 2.13.0
>>


Re: [lng-odp] [PATCHv2] linux-gen: improve conversion between buf_hdr_t and packet_t

2017-06-07 Thread Brian Brooks
On 06/07 15:19:08, Joyce Kong wrote:
> Signed-off-by: Joyce Kong <joyce.k...@arm.com>

Reviewed-by: Brian Brooks <brian.bro...@arm.com>

> ---
>  platform/linux-generic/include/odp_packet_internal.h | 10 ++
>  platform/linux-generic/odp_packet_io.c   |  8 +++-
>  platform/linux-generic/pktio/loop.c  |  4 ++--
>  3 files changed, 15 insertions(+), 7 deletions(-)
> 
> diff --git a/platform/linux-generic/include/odp_packet_internal.h 
> b/platform/linux-generic/include/odp_packet_internal.h
> index 0a9f177..6edacec 100644
> --- a/platform/linux-generic/include/odp_packet_internal.h
> +++ b/platform/linux-generic/include/odp_packet_internal.h
> @@ -168,6 +168,16 @@ static inline odp_packet_t 
> packet_handle(odp_packet_hdr_t *pkt_hdr)
>   return (odp_packet_t)pkt_hdr;
>  }
>  
> +static inline odp_buffer_hdr_t *packet_to_buf_hdr(odp_packet_t pkt)
> +{
> + return &odp_packet_hdr(pkt)->buf_hdr;
> +}
> +
> +static inline odp_packet_t packet_from_buf_hdr(odp_buffer_hdr_t *buf_hdr)
> +{
> + return (odp_packet_t)(odp_packet_hdr_t *)buf_hdr;
> +}
> +
>  static inline void copy_packet_parser_metadata(odp_packet_hdr_t *src_hdr,
>  odp_packet_hdr_t *dst_hdr)
>  {
> diff --git a/platform/linux-generic/odp_packet_io.c 
> b/platform/linux-generic/odp_packet_io.c
> index 50a000e..5451967 100644
> --- a/platform/linux-generic/odp_packet_io.c
> +++ b/platform/linux-generic/odp_packet_io.c
> @@ -552,7 +552,6 @@ static inline int pktin_recv_buf(odp_pktin_queue_t queue,
>   odp_packet_t packets[num];
>   odp_packet_hdr_t *pkt_hdr;
>   odp_buffer_hdr_t *buf_hdr;
> - odp_buffer_t buf;
>   int i;
>   int pkts;
>   int num_rx = 0;
> @@ -562,8 +561,7 @@ static inline int pktin_recv_buf(odp_pktin_queue_t queue,
>   for (i = 0; i < pkts; i++) {
>   pkt = packets[i];
>   pkt_hdr = odp_packet_hdr(pkt);
> - buf = _odp_packet_to_buffer(pkt);
> - buf_hdr = buf_hdl_to_hdr(buf);
> + buf_hdr = packet_to_buf_hdr(pkt);
>  
>   if (pkt_hdr->p.input_flags.dst_queue) {
>   queue_entry_t *dst_queue;
> @@ -582,7 +580,7 @@ static inline int pktin_recv_buf(odp_pktin_queue_t queue,
>  
>  int pktout_enqueue(queue_entry_t *qentry, odp_buffer_hdr_t *buf_hdr)
>  {
> - odp_packet_t pkt = _odp_packet_from_buffer(buf_hdr->handle.handle);
> + odp_packet_t pkt = packet_from_buf_hdr(buf_hdr);
>   int len = 1;
>   int nbr;
>  
> @@ -612,7 +610,7 @@ int pktout_enq_multi(queue_entry_t *qentry, 
> odp_buffer_hdr_t *buf_hdr[],
>   return nbr;
>  
>   for (i = 0; i < num; ++i)
> - pkt_tbl[i] = _odp_packet_from_buffer(buf_hdr[i]->handle.handle);
> + pkt_tbl[i] = packet_from_buf_hdr(buf_hdr[i]);
>  
>   nbr = odp_pktout_send(qentry->s.pktout, pkt_tbl, num);
>   return nbr;
> diff --git a/platform/linux-generic/pktio/loop.c 
> b/platform/linux-generic/pktio/loop.c
> index 61e98ad..c956f48 100644
> --- a/platform/linux-generic/pktio/loop.c
> +++ b/platform/linux-generic/pktio/loop.c
> @@ -81,7 +81,7 @@ static int loopback_recv(pktio_entry_t *pktio_entry, int 
> index ODP_UNUSED,
>   for (i = 0; i < nbr; i++) {
>   uint32_t pkt_len;
>  
> - pkt = _odp_packet_from_buffer(odp_hdr_to_buf(hdr_tbl[i]));
> + pkt = packet_from_buf_hdr(hdr_tbl[i]);
>   pkt_len = odp_packet_len(pkt);
>  
>  
> @@ -162,7 +162,7 @@ static int loopback_send(pktio_entry_t *pktio_entry, int 
> index ODP_UNUSED,
>   len = QUEUE_MULTI_MAX;
>  
>   for (i = 0; i < len; ++i) {
> - hdr_tbl[i] = buf_hdl_to_hdr(_odp_packet_to_buffer(pkt_tbl[i]));
> + hdr_tbl[i] = packet_to_buf_hdr(pkt_tbl[i]);
>   bytes += odp_packet_len(pkt_tbl[i]);
>   }
>  
> -- 
> 2.7.4
> 
> 
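
The conversion helpers in this patch rely on the buffer header being
embedded in the packet header, so a handle can be turned into a header
pointer with casts rather than a lookup. Below is a minimal sketch of
that layout trick with hypothetical types (buf_hdr_t, pkt_hdr_t, pkt_t);
it assumes the buffer header is the first member of the packet header,
which this example enforces itself and does not claim about ODP.

```c
#include <stdint.h>

/* Hypothetical headers: the buffer header is the first member of the
 * packet header, so both share the same address. */
typedef struct {
	uint32_t size;
} buf_hdr_t;

typedef struct {
	buf_hdr_t buf_hdr;
	uint32_t frame_len;
} pkt_hdr_t;

/* Opaque handle type; its value is the address of the packet header. */
typedef struct pkt_handle_s *pkt_t;

static inline pkt_hdr_t *pkt_hdr(pkt_t pkt)
{
	return (pkt_hdr_t *)(uintptr_t)pkt;
}

static inline buf_hdr_t *pkt_to_buf_hdr(pkt_t pkt)
{
	return &pkt_hdr(pkt)->buf_hdr;
}

static inline pkt_t pkt_from_buf_hdr(buf_hdr_t *buf_hdr)
{
	return (pkt_t)(uintptr_t)(pkt_hdr_t *)buf_hdr;
}
```

Both directions are pure pointer arithmetic, which is why the patch can
drop the intermediate odp_buffer_t handle in the receive and send paths.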


Re: [lng-odp] [PATCH v3 0/2] GCC 7 fixes

2017-06-07 Thread Brian Brooks
On 06/07 07:03:31, Bill Fischofer wrote:
> Sorry but a bit more testing means I need to rescind my review. This
> patch seems to break clang compilation:

You're right, which is a bit strange because this macro comes from the
autoconf-archive package. I will see if there is a workaround.

> bill@Ubuntu64:~/linaro/gcc7fix$ make
> Making all in platform/linux-generic
> make[1]: Entering directory '/home/bill/linaro/gcc7fix/platform/linux-generic'
>   CC   _fdserver.lo
> error: unknown warning option '-Wimplicit-fallthrough=0'; did you mean
>   '-Wimplicit-fallthrough'? [-Werror,-Wunknown-warning-option]
> error: unknown warning option '-Wformat-truncation=0'
>   [-Werror,-Wunknown-warning-option]
> error: unknown warning option '-Wformat-overflow=0'; did you mean
>   '-Wshift-overflow'? [-Werror,-Wunknown-warning-option]
> Makefile:920: recipe for target '_fdserver.lo' failed
> make[1]: *** [_fdserver.lo] Error 1
> make[1]: Leaving directory '/home/bill/linaro/gcc7fix/platform/linux-generic'
> Makefile:492: recipe for target 'all-recursive' failed
> make: *** [all-recursive] Error 1
> 
> The new options should only apply to GCC-based builds, ./configure
> CC=clang should not use these extensions.
> 
> On Tue, Jun 6, 2017 at 7:49 AM, Bill Fischofer
> <bill.fischo...@linaro.org> wrote:
> > Sorry, my bad. The gcc-7 was installed as gcc-7 and I forgot to
> > specify CC=gcc-7 on ./configure. Correcting that allows everything to
> > compile cleanly.
> >
> > For this series:
> >
> > Reviewed-and-tested-by: Bill Fischofer <bill.fischo...@linaro.org>
> >
> > On Mon, Jun 5, 2017 at 10:36 PM, Brian Brooks <brian.bro...@arm.com> wrote:
> >> On 06/05 22:29:22, Bill Fischofer wrote:
> >>> On Mon, Jun 5, 2017 at 9:32 PM, Brian Brooks <brian.bro...@arm.com> wrote:
> >>> > On 06/05 18:40:02, Bill Fischofer wrote:
> >>> >> After installing a copy of GCC 7, it looks like this patch is an
> >>> >> incomplete fix. With this patch applied older GCC 6.3.0 continues to
> >>> >> work fine, but GCC 7.0.1 generates the following errors:
> >>> >>
> >>> >> Making all in platform/linux-generic
> >>> >> make[1]: Entering directory 
> >>> >> '/home/bill/linaro/gcc7fix/platform/linux-generic'
> >>> >>   CC   pktio/ipc.lo
> >>> >> pktio/ipc.c: In function ‘ipc_close’:
> >>> >> pktio/ipc.c:698:33: error: ‘%s’ directive output may be truncated
> >>> >> writing up to 255 bytes into a region of size 32
> >>> >> [-Werror=format-truncation=]
> >>> >>snprintf(name, sizeof(name), "%s", dev);
> >>> >
> >>> > It looks like ./bootstrap might not have been run after applying the 
> >>> > patch?
> >>>
> >>> I just retried the build on a clean directory with ./bootstrap and
> >>> ./configure and get the same errors.
> >>
> >> Can you post the bottom part of ./configure output?
> >>
> >> Here is what I see:
> >>
> >> cc: gcc
> >> cc version: 7.1.1
> >> cppflags:
> >> am_cppflags:
> >> am_cxxflags:-std=c++11
> >> cflags: -g -O2
> >> am_cflags:   -pthread -DHAVE_PCAP  
> >> -DIMPLEMENTATION_NAME=odp-linux -DODP_DEBUG_PRINT=0 -DODPH_DEBUG_PRINT=0 
> >> -DODP_DEBUG=0 -W -Wall -Werror -Wstrict-prototypes -Wmissing-prototypes 
> >> -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-align 
> >> -Wnested-externs -Wcast-qual -Wformat-nonliteral -Wformat-security -Wundef 
> >> -Wwrite-strings -std=c99 -Wimplicit-fallthrough=0 -Wformat-truncation=0 
> >> -Wformat-overflow=0  -mcx16
> >> ldflags:
> >> am_ldflags:  -latomic  -pthread -lrt
> >> libs:   -lrt -lcunit -lcrypto   -lpcap
> >> defs:   -DHAVE_CONFIG_H
> >> static libraries:   yes
> >> shared libraries:   yes
> >> ABI compatible: yes
> >> cunit:  yes
> >> test_vald:  yes
> >> test_perf:  yes
> >> test_perf_proc: yes
> >> test_cpp:   yes
> >> test_helper:yes
> >> test_example:   yes
> >> user_guides:no


Re: [lng-odp] [API-NEXT v2 2/2] timer: allow timer processing to run on worker cores

2017-06-07 Thread Brian Brooks
On 06/07 17:39:16, Bogdan Pricope wrote:
> In OFP we have this use case: N-1 cores are doing packet processing in
> direct pktin mode; core 0 is doing odp_schedule() to process timers
> (ARP entries expiration) and maybe other events.
> 
> To take from this:
> -  If scheduler + timers are used, they may not be used on all cores
> -  If scheduler + timers are used, odp_schedule() may not be
> called at the rate of the packets
> 
> 
> So, it is not a good idea to condition timer processing on the number of
> odp_schedule() calls.

Can you please elaborate on how you reached this conclusion? I did not
understand.

> On 7 June 2017 at 10:30, Savolainen, Petri (Nokia - FI/Espoo)
> <petri.savolai...@nokia.com> wrote:
> >
> >>
> >> diff --git a/include/odp/api/spec/init.h b/include/odp/api/spec/init.h
> >> index 154cdf8f..44950893 100644
> >> --- a/include/odp/api/spec/init.h
> >> +++ b/include/odp/api/spec/init.h
> >> @@ -153,6 +153,11 @@ typedef struct odp_init_t {
> >>   odp_log_func_t log_fn;
> >>   /** Replacement for the default abort fn */
> >>   odp_abort_func_t abort_fn;
> >> + /** Whether the application will ever call odp_schedule() or
> >> not.
> >> +
> >> + Default: true
> >> + */
> >> + odp_bool_t use_scheduler;
> >>  } odp_init_t;
> >
> >
> > This is an API change. It should be in a separate patch and the subject
> > should be "api: init: ...".
> >
> > Anyway, Bill already sent a proposal that is closer to what I described
> > earlier. The init struct should have an odp_feature_t bit field for feature /
> > unused_feature selection.
> >
> > -Petri


[lng-odp] [API-NEXT v2 2/2] timer: allow timer processing to run on worker cores

2017-06-06 Thread Brian Brooks
Run timer pool processing on worker cores if the application hints
that the scheduler will be used. This reduces the latency and jitter
of the point at which timer pool processing begins. See [1] for details.

[1] 
https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing

Signed-off-by: Brian Brooks <brian.bro...@arm.com>

---
 include/odp/api/spec/init.h|  5 ++
 platform/linux-generic/include/odp_internal.h  |  1 +
 .../linux-generic/include/odp_timer_internal.h |  5 ++
 platform/linux-generic/odp_init.c  |  8 +-
 platform/linux-generic/odp_schedule.c  |  4 +-
 platform/linux-generic/odp_schedule_iquery.c   |  3 +
 platform/linux-generic/odp_schedule_sp.c   |  4 +
 platform/linux-generic/odp_timer.c | 91 +++---
 8 files changed, 109 insertions(+), 12 deletions(-)

diff --git a/include/odp/api/spec/init.h b/include/odp/api/spec/init.h
index 154cdf8f..44950893 100644
--- a/include/odp/api/spec/init.h
+++ b/include/odp/api/spec/init.h
@@ -153,6 +153,11 @@ typedef struct odp_init_t {
odp_log_func_t log_fn;
/** Replacement for the default abort fn */
odp_abort_func_t abort_fn;
+   /** Whether the application will ever call odp_schedule() or not.
+
+   Default: true
+   */
+   odp_bool_t use_scheduler;
 } odp_init_t;
 
 /**
diff --git a/platform/linux-generic/include/odp_internal.h 
b/platform/linux-generic/include/odp_internal.h
index 90e2a629..26a5ffd3 100644
--- a/platform/linux-generic/include/odp_internal.h
+++ b/platform/linux-generic/include/odp_internal.h
@@ -51,6 +51,7 @@ struct odp_global_data_s {
odp_cpumask_t worker_cpus;
int num_cpus_installed;
config_t configuration;
+   odp_bool_t use_scheduler;
 };
 
 enum init_stage {
diff --git a/platform/linux-generic/include/odp_timer_internal.h 
b/platform/linux-generic/include/odp_timer_internal.h
index 91b12c54..eda50430 100644
--- a/platform/linux-generic/include/odp_timer_internal.h
+++ b/platform/linux-generic/include/odp_timer_internal.h
@@ -20,6 +20,9 @@
 #include 
 #include 
 
+/* Minimum number of nanoseconds between calls to _timer_run(). */
+#define CONFIG_TIMER_RUN_RATELIMIT_PERIOD 100
+
 /**
  * Internal Timeout header
  */
@@ -35,4 +38,6 @@ typedef struct {
odp_timer_t timer;
 } odp_timeout_hdr_t;
 
+int timer_run(void);
+
 #endif
diff --git a/platform/linux-generic/odp_init.c 
b/platform/linux-generic/odp_init.c
index 685e02fa..cbec83c3 100644
--- a/platform/linux-generic/odp_init.c
+++ b/platform/linux-generic/odp_init.c
@@ -4,11 +4,13 @@
  * SPDX-License-Identifier: BSD-3-Clause
  */
 #include 
-#include 
 #include 
-#include 
+
 #include 
+#include 
 #include 
+
+#include 
 #include 
 #include 
 #include 
@@ -159,12 +161,14 @@ int odp_init_global(odp_instance_t *instance,
enum init_stage stage = NO_INIT;
odp_global_data.log_fn = odp_override_log;
odp_global_data.abort_fn = odp_override_abort;
+   odp_global_data.use_scheduler = true;
 
if (params != NULL) {
if (params->log_fn != NULL)
odp_global_data.log_fn = params->log_fn;
if (params->abort_fn != NULL)
odp_global_data.abort_fn = params->abort_fn;
+   odp_global_data.use_scheduler = params->use_scheduler;
}
 
cleanup_files(_ODP_TMPDIR, odp_global_data.main_pid);
diff --git a/platform/linux-generic/odp_schedule.c 
b/platform/linux-generic/odp_schedule.c
index c4567d81..f8ff315c 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels  */
 #define NUM_PRIO 8
@@ -988,7 +989,6 @@ static inline int do_schedule(odp_queue_t *out_queue, 
odp_event_t out_ev[],
return 0;
 }
 
-
 static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
 odp_event_t out_ev[],
 unsigned int max_num)
@@ -998,6 +998,8 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t 
wait,
int ret;
 
while (1) {
+   timer_run();
+
ret = do_schedule(out_queue, out_ev, max_num);
 
if (ret)
diff --git a/platform/linux-generic/odp_schedule_iquery.c 
b/platform/linux-generic/odp_schedule_iquery.c
index 75470aff..e865650a 100644
--- a/platform/linux-generic/odp_schedule_iquery.c
+++ b/platform/linux-generic/odp_schedule_iquery.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels */
 #define NUM_SCHED_PRIO 8
@@ -719,6 +720,8 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t 
wait,
odp_time_t next, wtime;
 
while (1) {
+   timer_run();
+
count = do_schedule(out_queue, out_ev, max_num);
 
  

[lng-odp] [API-NEXT v2 0/2] inline timers

2017-06-06 Thread Brian Brooks
Run timer pool processing on worker cores if the application hints
that the scheduler will be used. This reduces the latency and jitter
of the point at which timer pool processing begins. See [1] for details.

[1] 
https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing

Signed-off-by: Brian Brooks <brian.bro...@arm.com>

v2:
 - Reword 'worker_timers' to 'use_scheduler'
 - Use ODP Time instead of ticks

Brian Brooks (2):
  timer: organize #include
  timer: allow timer processing to run on worker cores

 include/odp/api/spec/init.h|   5 +
 platform/linux-generic/include/odp_internal.h  |   1 +
 .../linux-generic/include/odp_timer_internal.h |   5 +
 platform/linux-generic/odp_init.c  |   8 +-
 platform/linux-generic/odp_schedule.c  |   4 +-
 platform/linux-generic/odp_schedule_iquery.c   |   3 +
 platform/linux-generic/odp_schedule_sp.c   |   4 +
 platform/linux-generic/odp_timer.c | 116 +
 8 files changed, 122 insertions(+), 24 deletions(-)

-- 
2.13.0
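
The idea in the cover letter above, running timer pool processing from the scheduling loop on worker cores, can be sketched in C roughly as follows. This is only an illustration and not the code from the patch: inline_timer_t, inline_timer_run() and the RATELIMIT_NS value are hypothetical, and a real implementation would scan ODP timer pools instead of incrementing a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rate limit: minimum nanoseconds between two timer scans. */
#define RATELIMIT_NS 100u

typedef struct {
	uint64_t last_run_ns; /* time of the most recent scan */
	unsigned int scans;   /* number of scans performed so far */
} inline_timer_t;

/*
 * Called at the top of every schedule loop iteration.
 * Performs a scan only if at least RATELIMIT_NS has elapsed since the
 * previous one. Returns true when a scan was performed.
 */
static bool inline_timer_run(inline_timer_t *t, uint64_t now_ns)
{
	assert(t != NULL);

	if (now_ns - t->last_run_ns < RATELIMIT_NS)
		return false; /* too soon, stay on the fast path */

	t->last_run_ns = now_ns;
	t->scans++; /* placeholder for the actual timer pool scan */
	return true;
}
```

A worker would pass the current time on each iteration; calls that arrive sooner than RATELIMIT_NS after the previous scan return immediately, which keeps the extra cost in the schedule loop small.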



[lng-odp] [API-NEXT v2 1/2] timer: organize #include

2017-06-06 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Kevin Wang <kevin.w...@arm.com>
---
 platform/linux-generic/odp_timer.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/platform/linux-generic/odp_timer.c b/platform/linux-generic/odp_timer.c
index cf610bfa..4539ea48 100644
--- a/platform/linux-generic/odp_timer.c
+++ b/platform/linux-generic/odp_timer.c
@@ -22,29 +22,23 @@
 #include 
 
 #include 
+#include 
+#include 
+#include 
 #include 
+#include 
+#include 
 #include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
-#include 
 #include 
-#include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -52,6 +46,13 @@
 #include 
 #include 
 #include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
 
 #define TMO_UNUSED   ((uint64_t)0x)
-- 
2.13.0



[lng-odp] [API-NEXT] arch: arm: add CPU global time

2017-06-06 Thread Brian Brooks
Expose ARMv8 Generic Timer through internal CPU global time functions.

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 platform/linux-generic/arch/arm/odp_cpu_arch.c | 30 +-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/platform/linux-generic/arch/arm/odp_cpu_arch.c b/platform/linux-generic/arch/arm/odp_cpu_arch.c
index c31f9084..ceba9d1f 100644
--- a/platform/linux-generic/arch/arm/odp_cpu_arch.c
+++ b/platform/linux-generic/arch/arm/odp_cpu_arch.c
@@ -50,15 +50,43 @@ uint64_t odp_cpu_cycles_resolution(void)
 
 int cpu_has_global_time(void)
 {
-   return 0;
+   uint64_t hz = cpu_global_time_freq();
+
+   return hz >= 100 && hz <= 60;
 }
 
 uint64_t cpu_global_time(void)
 {
+#if __ARM_ARCH == 8
+   uint64_t cntvct;
+
+   /*
+* The system counter must provide a uniform view of system time
+* to all processing elements in the system. This should hold true
+* for heterogeneous SoCs.
+*
+* Be consistent with other architectures and don't issue a serializing
+* instruction, e.g. ISB.
+*/
+
+   /* Memory clobber to minimize optimization around load from sys reg. */
+   __asm__ volatile("mrs %0, cntvct_el0" : "=r"(cntvct) : : "memory");
+
+   return cntvct;
+#else
return 0;
+#endif
 }
 
 uint64_t cpu_global_time_freq(void)
 {
+#if __ARM_ARCH == 8
+   uint64_t cntfrq;
+
+   __asm__ volatile("mrs %0, cntfrq_el0" : "=r"(cntfrq) : : );
+
+   return cntfrq;
+#else
return 0;
+#endif
 }
-- 
2.13.0



Re: [lng-odp] [PATCH v3 0/2] GCC 7 fixes

2017-06-05 Thread Brian Brooks
On 06/05 22:29:22, Bill Fischofer wrote:
> On Mon, Jun 5, 2017 at 9:32 PM, Brian Brooks <brian.bro...@arm.com> wrote:
> > On 06/05 18:40:02, Bill Fischofer wrote:
> >> After installing a copy of GCC 7, it looks like this patch is an
> >> incomplete fix. With this patch applied older GCC 6.3.0 continues to
> >> work fine, but GCC 7.0.1 generates the following errors:
> >>
> >> Making all in platform/linux-generic
> >> make[1]: Entering directory '/home/bill/linaro/gcc7fix/platform/linux-generic'
> >>   CC   pktio/ipc.lo
> >> pktio/ipc.c: In function ‘ipc_close’:
> >> pktio/ipc.c:698:33: error: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 32 [-Werror=format-truncation=]
> >>snprintf(name, sizeof(name), "%s", dev);
> >
> > It looks like ./bootstrap might not have been run after applying the patch?
> 
> I just retried the build on a clean directory with ./bootstrap and
> ./configure and get the same errors.

Can you post the bottom part of ./configure output?

Here is what I see:

cc:                     gcc
cc version:             7.1.1
cppflags:
am_cppflags:
am_cxxflags:            -std=c++11
cflags:                 -g -O2
am_cflags:              -pthread -DHAVE_PCAP -DIMPLEMENTATION_NAME=odp-linux -DODP_DEBUG_PRINT=0 -DODPH_DEBUG_PRINT=0 -DODP_DEBUG=0 -W -Wall -Werror -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wold-style-definition -Wpointer-arith -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral -Wformat-security -Wundef -Wwrite-strings -std=c99 -Wimplicit-fallthrough=0 -Wformat-truncation=0 -Wformat-overflow=0 -mcx16
ldflags:
am_ldflags:             -latomic -pthread -lrt
libs:                   -lrt -lcunit -lcrypto -lpcap
defs:                   -DHAVE_CONFIG_H
static libraries:       yes
shared libraries:       yes
ABI compatible:         yes
cunit:                  yes
test_vald:              yes
test_perf:              yes
test_perf_proc:         yes
test_cpp:               yes
test_helper:            yes
test_example:           yes
user_guides:            no


Re: [lng-odp] [PATCH v3 0/2] GCC 7 fixes

2017-06-05 Thread Brian Brooks
On 06/05 18:40:02, Bill Fischofer wrote:
> After installing a copy of GCC 7, it looks like this patch is an
> incomplete fix. With this patch applied older GCC 6.3.0 continues to
> work fine, but GCC 7.0.1 generates the following errors:
> 
> Making all in platform/linux-generic
> make[1]: Entering directory '/home/bill/linaro/gcc7fix/platform/linux-generic'
>   CC   pktio/ipc.lo
> pktio/ipc.c: In function ‘ipc_close’:
> pktio/ipc.c:698:33: error: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size 32 [-Werror=format-truncation=]
>snprintf(name, sizeof(name), "%s", dev);

It looks like ./bootstrap might not have been run after applying the patch?


[lng-odp] [PATCH v3 1/2] build: GCC 7 fixes

2017-06-05 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 DEPENDENCIES   |  5 ++--
 configure.ac   |  8 +++
 platform/linux-generic/m4/configure.m4 | 44 ++
 3 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/DEPENDENCIES b/DEPENDENCIES
index a194cad1..7bcbd5eb 100644
--- a/DEPENDENCIES
+++ b/DEPENDENCIES
@@ -8,13 +8,14 @@ Prerequisites for building the OpenDataPlane (ODP) API
 
automake
autoconf
+   autoconf-archive
libtool
 
On Debian/Ubuntu systems:
-   $ sudo apt-get install automake autoconf libtool
+   $ sudo apt-get install automake autoconf autoconf-archive libtool
 
On CentOS/RedHat/Fedora systems:
-   $ sudo yum install automake autoconf libtool
+   $ sudo yum install automake autoconf autoconf-archive libtool
 
 3. Required libraries
 
diff --git a/configure.ac b/configure.ac
index 7569ebe0..e61d94ca 100644
--- a/configure.ac
+++ b/configure.ac
@@ -300,6 +300,14 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations -Wold-style-definition -Wpointer-
 ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral"
 ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
 ODP_CFLAGS="$ODP_CFLAGS -std=c99"
+
+AX_CHECK_COMPILE_FLAG([-Wimplicit-fallthrough=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"])
+AX_CHECK_COMPILE_FLAG([-Wformat-truncation=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-truncation=0"])
+AX_CHECK_COMPILE_FLAG([-Wformat-overflow=0],
+  [ODP_CFLAGS="$ODP_CFLAGS -Wformat-overflow=0"])
+
 # Extra flags for example to suppress certain warning types
 ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
 
diff --git a/platform/linux-generic/m4/configure.m4 b/platform/linux-generic/m4/configure.m4
index a2a25408..be04da8a 100644
--- a/platform/linux-generic/m4/configure.m4
+++ b/platform/linux-generic/m4/configure.m4
@@ -28,6 +28,50 @@ AC_LINK_IFELSE(
 echo "Use newer version. For gcc > 4.7.0"
 exit -1)
 
+dnl Check whether -latomic is needed
+use_libatomic=no
+
+AC_MSG_CHECKING(whether -latomic is needed for 64-bit atomic built-ins)
+AC_LINK_IFELSE(
+  [AC_LANG_SOURCE([[
+static int loc;
+int main(void)
+{
+int prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
+return 0;
+}
+]])],
+  [AC_MSG_RESULT(no)],
+  [AC_MSG_RESULT(yes)
+   AC_CHECK_LIB(
+ [atomic], [__atomic_exchange_8],
+ [use_libatomic=yes],
+ [AC_MSG_FAILURE([__atomic_exchange_8 is not available])])
+  ])
+
+AC_MSG_CHECKING(whether -latomic is needed for 128-bit atomic built-ins)
+AC_LINK_IFELSE(
+  [AC_LANG_SOURCE([[
+static __int128 loc;
+int main(void)
+{
+__int128 prev;
+prev = __atomic_exchange_n(&loc, 7, __ATOMIC_RELAXED);
+return 0;
+}
+]])],
+  [AC_MSG_RESULT(no)],
+  [AC_MSG_RESULT(yes)
+   AC_CHECK_LIB(
+ [atomic], [__atomic_exchange_16],
+ [use_libatomic=yes],
+ [AC_MSG_FAILURE([cannot detect support for 128-bit atomics])])
+  ])
+
+if test "x$use_libatomic" = "xyes"; then
+  AM_LDFLAGS="$AM_LDFLAGS -latomic"
+fi
+
 m4_include([platform/linux-generic/m4/odp_pthread.m4])
 m4_include([platform/linux-generic/m4/odp_openssl.m4])
 m4_include([platform/linux-generic/m4/odp_pcap.m4])
-- 
2.13.0



[lng-odp] [PATCH v3 0/2] GCC 7 fixes

2017-06-05 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Brian Brooks (2):
  build: GCC 7 fixes
  pktio: GCC 7 fixes

 DEPENDENCIES  |  5 +--
 configure.ac  |  8 +
 platform/linux-generic/m4/configure.m4| 44 +++
 test/common_plat/validation/api/pktio/pktio.c |  4 ++-
 4 files changed, 58 insertions(+), 3 deletions(-)

--

v3:
 - Split into multiple patch files (Maxim)
 - Disable warnings in favor of patching right now (Dmitry)
 - Improve libatomic detection in autoconf (Dmitry)
 - Add autoconf-archive to DEPENDENCIES file

v2:
 - Add Bug id to commit message (Bill)

2.13.0



[lng-odp] [PATCH v3 2/2] pktio: GCC 7 fixes

2017-06-05 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
---
 test/common_plat/validation/api/pktio/pktio.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/test/common_plat/validation/api/pktio/pktio.c b/test/common_plat/validation/api/pktio/pktio.c
index 11fe974f..4d8d2cc7 100644
--- a/test/common_plat/validation/api/pktio/pktio.c
+++ b/test/common_plat/validation/api/pktio/pktio.c
@@ -1429,7 +1429,9 @@ int pktio_check_statistics_counters(void)
 void pktio_test_statistics_counters(void)
 {
odp_pktio_t pktio_rx, pktio_tx;
-   odp_pktio_t pktio[MAX_NUM_IFACES];
+   odp_pktio_t pktio[MAX_NUM_IFACES] = {
+   ODP_PKTIO_INVALID, ODP_PKTIO_INVALID
+   };
odp_packet_t pkt;
odp_packet_t tx_pkt[1000];
uint32_t pkt_seq[1000];
-- 
2.13.0



Re: [lng-odp] [PATCH] Fixes for GCC 7

2017-06-05 Thread Brian Brooks
On 06/02 20:34:21, Dmitry Eremin-Solenikov wrote:
> Or just
> 
> AC_LINK_IFELSE([AC_LANG_CALL([], [your_atomic_func])], [ATOMIC_LIBS=""],
>[AC_CHECK_LIB([atomic], [your_atomic_func], [ATOMIC_LIBS="-latomic"],
>   [AC_MSG_FAILURE([your_atomic_func is not available])])])
> AC_SUBST([ATOMIC_LIBS])

This appears to be it except AC_LANG_SOURCE needs to be used instead of
AC_LANG_CALL due to type issues (e.g. autoconf assumes the declaration
to be: char __atomic_exchange_16() which fails).
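
As a complement to the link test discussed here, a C program can also ask the compiler directly whether atomics of a given size are always lock-free. The sketch below uses the GCC built-in __atomic_always_lock_free(), whose size argument must be a compile-time constant. The function names are made up for illustration. Treat the result as a rough hint only: when the built-in reports false for a size, GCC will typically emit a library call (for example __atomic_exchange_16) that has to be satisfied by libatomic, but the configure-time link test remains the authoritative check.

```c
#include <stdbool.h>

/* True when the compiler guarantees lock-free atomics on 4-byte objects. */
static bool lock_free_4_bytes(void)
{
	return __atomic_always_lock_free(4, 0);
}

/*
 * True when the compiler guarantees lock-free atomics on 16-byte objects.
 * On many toolchains this is false, in which case 128-bit __atomic
 * built-ins are usually lowered to out-of-line library calls.
 */
static bool lock_free_16_bytes(void)
{
	return __atomic_always_lock_free(16, 0);
}
```

Neither function performs an atomic operation itself, so this snippet links without -latomic regardless of the answer it gives.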


Re: [lng-odp] [PATCH] Fixes for GCC 7

2017-06-05 Thread Brian Brooks
On 06/02 20:09:54, Dmitry Eremin-Solenikov wrote:
> On 02.06.2017 18:34, Brian Brooks wrote:
> > On 06/02 10:39:18, Dmitry Eremin-Solenikov wrote:
> >> On 01.06.2017 22:05, Brian Brooks wrote:
> >>> Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> >>> Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
> >>> Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> >>> ---
> >>>  configure.ac  | 5 +
> >>>  platform/linux-generic/m4/configure.m4| 4 
> >>>  platform/linux-generic/pktio/ipc.c| 6 --
> >>>  platform/linux-generic/pktio/sysfs.c  | 2 +-
> >>>  test/common_plat/validation/api/pktio/pktio.c | 4 +++-
> >>>  5 files changed, 17 insertions(+), 4 deletions(-)
> >>>
> >>> diff --git a/configure.ac b/configure.ac
> >>> index 7569ebe0..5eabe4d4 100644
> >>> --- a/configure.ac
> >>> +++ b/configure.ac
> >>> @@ -300,6 +300,11 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations -Wold-style-definition -Wpointer-
> >>>  ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral"
> >>>  ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
> >>>  ODP_CFLAGS="$ODP_CFLAGS -std=c99"
> >>> +
> >>> +if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
> >>> +  ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"
> >>> +fi
> >>> +
> >>
> >> Shouldn't Wimplicit-fallthrough=2 be enough? Where are you hitting the
> >> warning?
> > 
> > Not every fallthrough is commented.
> > 
> > Please read the manual if you would like to know more:
> > https://gcc.gnu.org/onlinedocs/gcc-7.1.0/gcc/Warning-Options.html#Warning-Options
> 
> So, it would be better to add necessary comments in my opinion.
> The warning is useful.

Agree warnings are good, but disabling the warning will keep the build sane
for those who are using GCC 7 now. When more people upgrade, the additional
warnings can be turned on and fixed in bulk.
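
For context on what "adding the necessary comments" would look like, here is a minimal sketch of an intentional fall through that GCC 7 accepts at its default -Wimplicit-fallthrough level, because the comment matches one of the patterns the compiler recognizes. The function classify() and its weights are invented for illustration and do not come from ODP.

```c
/* Map a priority to a weight; case 0 deliberately shares case 1's work. */
static int classify(int prio)
{
	int weight = 0;

	switch (prio) {
	case 0:
		weight += 4;
		/* fall through */
	case 1:
		weight += 2;
		break;
	default:
		weight = 1;
		break;
	}

	return weight;
}
```

Without the comment, a build that uses -W (an older spelling of -Wextra) together with -Werror, as ODP does, would fail on the transition from case 0 to case 1 under GCC 7.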

> >>>  # Extra flags for example to suppress certain warning types
> >>>  ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
> >>>  
> >>> diff --git a/platform/linux-generic/m4/configure.m4 b/platform/linux-generic/m4/configure.m4
> >>> index a2a25408..3e2978b5 100644
> >>> --- a/platform/linux-generic/m4/configure.m4
> >>> +++ b/platform/linux-generic/m4/configure.m4
> >>> @@ -28,6 +28,10 @@ AC_LINK_IFELSE(
> >>>  echo "Use newer version. For gcc > 4.7.0"
> >>>  exit -1)
> >>>  
> >>> +if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
> >>> +  AM_LDFLAGS="$AM_LDFLAGS -latomic"
> >>> +fi
> >>> +
> >>
> >> This should be replaced with proper AC_CHECK_LIB or AC_SEARCH_LIBS
> > 
> > I don't think so. The link to libatomic is needed based on the compiler
> > version, not based on whether a program compiles with -latomic or not
> > which is AC_CHECK_LIB behavior. If you disagree, please show me how it
> > can be done.
> > 
> > This is also a very simple (3 line) solution.
> 
> Simple solution:
> 
> AC_SEARCH_LIBS([your_atomic_func], [atomic])
> 
> Cleaner solution:
> 
> AC_LINK_IFELSE([AC_LANG_CALL([], [your_atomic_func])], [ATOMIC_LIBS=""],
> [OLDLIBS=$LIBS
> LIBS="$LIBS -latomic"
> AC_LINK_IFELSE([AC_LANG_CALL([], [your_atomic_func])],
> [ATOMIC_LIBS="-latomic"],
> [AC_MSG_FAILURE([your_atomic_func is not available])])
> LIBS=$OLDLIBS])
> AC_SUBST([ATOMIC_LIBS])
> 
> Then you can use $(ATOMIC_LIBS) when you need to use your_atomic_function().
> 
> 
> >>>  m4_include([platform/linux-generic/m4/odp_pthread.m4])
> >>>  m4_include([platform/linux-generic/m4/odp_openssl.m4])
> >>>  m4_include([platform/linux-generic/m4/odp_pcap.m4])
> >>> diff --git a/platform/linux-generic/pktio/ipc.c b/platform/linux-generic/pktio/ipc.c
> >>> index 06175e5a..29c3a546 100644
> >>> --- a/platform/linux-generic/pktio/ipc.c
> >>> +++ b/platform/linux-generic/pktio/ipc.c
> >>> @@ -694,8 +694,10 @@ static int ipc_close(pktio_entry_t *pktio_entry)
> >>>  
> >>>   if (sscanf(dev, "ipc:%d:%s", , tail) == 2)
> >>>   snprintf(name, sizeof(name), "ipc:%s", tail);
> >>> - else
> >>> - snprintf(name, sizeof(name), "%s", dev);
> >>> + else {
> >>> + strncpy(name, dev, sizeof(name));
> >>> + name[sizeof(name) - 1] = '\0';
> >>> + }
> >>
> >> Why?
> > 
> > New -Wformat-truncation=level behavior. Please read the manual if you'd like
> > to know more.
> 
> I'd suggest instead to disable -Wformat-truncation.

Agree, if it is possible.
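
Besides switching to strncpy() or disabling the warning, a third option is to keep snprintf() and check its return value. According to the GCC manual, -Wformat-truncation at level 1 only warns about calls whose return value is unused, so consuming the result both silences the diagnostic and lets the caller react to truncation. The helper copy_name() below is a hypothetical sketch, not code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Copy src into dst (capacity dst_len) and report whether it fit.
 * snprintf() returns the length the full output would have had, so a
 * value >= dst_len means the copy was truncated. dst is always
 * NUL-terminated when dst_len is non-zero.
 */
static bool copy_name(char *dst, size_t dst_len, const char *src)
{
	int n = snprintf(dst, dst_len, "%s", src);

	return n >= 0 && (size_t)n < dst_len;
}
```

A caller such as ipc_close() could then log or handle an over-long device name rather than silently working with a shortened one.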


Re: [lng-odp] [PATCH] Fixes for GCC 7

2017-06-05 Thread Brian Brooks
On 06/02 23:27:08, Maxim Uvarov wrote:
> On 06/02/17 18:36, Brian Brooks wrote:
> > On 06/02 15:07:48, Maxim Uvarov wrote:
> >> I think this patch has to be split into several patches. Having a patch which
> >> corrects unrelated things is strange and makes it hard to merge/cherry-pick.
> > 
> > They are all related to things that break the build with GCC 7. It's
> > unnecessary and extra complexity to split it up into more than one patch.
> > The single patch is small and easily reviewable anyway.
> > 
> 
> Idea here is the following. Different odp implementations can inherit
> specific parts of mainline ODP to their code. For example changes to
> configure.ac might be sufficient for odp-dpdk and change to fix pktio
> name is not. Other platform developers prefer to take entire commit
> from linux-generic with git cherry-pick  command.

Aha, so split into build related changes and source related changes?

> Maxim.
> 
> 
> 
> 
> >>
> >> Maxim.
> 


[lng-odp] [PATCH v2] Fixes for GCC 7

2017-06-02 Thread Brian Brooks
The GCC 7 series introduces changes that expose ODP compilation
issues. These include case statement fall through warnings, and
stricter checks on potential string overflows and other semantic
analysis.

Fixes: https://bugs.linaro.org/show_bug.cgi?id=3027

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 configure.ac  | 5 +
 platform/linux-generic/m4/configure.m4| 4 
 platform/linux-generic/pktio/ipc.c| 6 --
 platform/linux-generic/pktio/sysfs.c  | 2 +-
 test/common_plat/validation/api/pktio/pktio.c | 4 +++-
 5 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index 7569ebe0..5eabe4d4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -300,6 +300,11 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations -Wold-style-definition -Wpointer-
 ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral"
 ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
 ODP_CFLAGS="$ODP_CFLAGS -std=c99"
+
+if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
+  ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"
+fi
+
 # Extra flags for example to suppress certain warning types
 ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
 
diff --git a/platform/linux-generic/m4/configure.m4 b/platform/linux-generic/m4/configure.m4
index a2a25408..3e2978b5 100644
--- a/platform/linux-generic/m4/configure.m4
+++ b/platform/linux-generic/m4/configure.m4
@@ -28,6 +28,10 @@ AC_LINK_IFELSE(
 echo "Use newer version. For gcc > 4.7.0"
 exit -1)
 
+if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
+  AM_LDFLAGS="$AM_LDFLAGS -latomic"
+fi
+
 m4_include([platform/linux-generic/m4/odp_pthread.m4])
 m4_include([platform/linux-generic/m4/odp_openssl.m4])
 m4_include([platform/linux-generic/m4/odp_pcap.m4])
diff --git a/platform/linux-generic/pktio/ipc.c b/platform/linux-generic/pktio/ipc.c
index 06175e5a..29c3a546 100644
--- a/platform/linux-generic/pktio/ipc.c
+++ b/platform/linux-generic/pktio/ipc.c
@@ -694,8 +694,10 @@ static int ipc_close(pktio_entry_t *pktio_entry)
 
if (sscanf(dev, "ipc:%d:%s", , tail) == 2)
snprintf(name, sizeof(name), "ipc:%s", tail);
-   else
-   snprintf(name, sizeof(name), "%s", dev);
+   else {
+   strncpy(name, dev, sizeof(name));
+   name[sizeof(name) - 1] = '\0';
+   }
 
/* unlink this pktio info for both master and slave */
odp_shm_free(pktio_entry->s.ipc.pinfo_shm);
diff --git a/platform/linux-generic/pktio/sysfs.c b/platform/linux-generic/pktio/sysfs.c
index be0822dd..6e9bc59b 100644
--- a/platform/linux-generic/pktio/sysfs.c
+++ b/platform/linux-generic/pktio/sysfs.c
@@ -43,7 +43,7 @@ static int sysfs_get_val(const char *fname, uint64_t *val)
 int sysfs_stats(pktio_entry_t *pktio_entry,
odp_pktio_stats_t *stats)
 {
-   char fname[256];
+   char fname[300];
const char *dev = pktio_entry->s.name;
int ret = 0;
 
diff --git a/test/common_plat/validation/api/pktio/pktio.c b/test/common_plat/validation/api/pktio/pktio.c
index 11fe974f..4d8d2cc7 100644
--- a/test/common_plat/validation/api/pktio/pktio.c
+++ b/test/common_plat/validation/api/pktio/pktio.c
@@ -1429,7 +1429,9 @@ int pktio_check_statistics_counters(void)
 void pktio_test_statistics_counters(void)
 {
odp_pktio_t pktio_rx, pktio_tx;
-   odp_pktio_t pktio[MAX_NUM_IFACES];
+   odp_pktio_t pktio[MAX_NUM_IFACES] = {
+   ODP_PKTIO_INVALID, ODP_PKTIO_INVALID
+   };
odp_packet_t pkt;
odp_packet_t tx_pkt[1000];
uint32_t pkt_seq[1000];
-- 
2.13.0



Re: [lng-odp] [PATCH] Fixes for GCC 7

2017-06-02 Thread Brian Brooks
On 06/01 22:30:16, Bill Fischofer wrote:
> On Thu, Jun 1, 2017 at 9:48 PM, Brian Brooks <brian.bro...@arm.com> wrote:
> > On 06/01 15:00:28, Bill Fischofer wrote:
> >> If this is a bug fix it should reference a Bug that describes in more
> >> detail what is being fixed.
> >
> > Can you elaborate?
> >
> > The subject line "Fixes for GCC 7" is sufficient.
> 
> If this is fixing a bug the commit log should reference the bugzilla
> entry associated with that bug.

There is no bugzilla or jira. This is solving a minor problem that I have
because I am using a newer GCC release on ARM and x86.


Re: [lng-odp] [PATCH] Fixes for GCC 7

2017-06-02 Thread Brian Brooks
On 06/02 15:07:48, Maxim Uvarov wrote:
> I think this patch has to be split into several patches. Having a patch which
> corrects unrelated things is strange and makes it hard to merge/cherry-pick.

They are all related to things that break the build with GCC 7. It's
unnecessary and extra complexity to split it up into more than one patch.
The single patch is small and easily reviewable anyway.

> 
> Maxim.


Re: [lng-odp] [PATCH] Fixes for GCC 7

2017-06-02 Thread Brian Brooks
On 06/02 10:39:18, Dmitry Eremin-Solenikov wrote:
> On 01.06.2017 22:05, Brian Brooks wrote:
> > Signed-off-by: Brian Brooks <brian.bro...@arm.com>
> > Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
> > Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
> > ---
> >  configure.ac  | 5 +
> >  platform/linux-generic/m4/configure.m4| 4 
> >  platform/linux-generic/pktio/ipc.c| 6 --
> >  platform/linux-generic/pktio/sysfs.c  | 2 +-
> >  test/common_plat/validation/api/pktio/pktio.c | 4 +++-
> >  5 files changed, 17 insertions(+), 4 deletions(-)
> > 
> > diff --git a/configure.ac b/configure.ac
> > index 7569ebe0..5eabe4d4 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -300,6 +300,11 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations -Wold-style-definition -Wpointer-
> >  ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral"
> >  ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
> >  ODP_CFLAGS="$ODP_CFLAGS -std=c99"
> > +
> > +if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
> > +  ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"
> > +fi
> > +
> 
> Shouldn't Wimplicit-fallthrough=2 be enough? Where are you hitting the
> warning?

Not every fallthrough is commented.

Please read the manual if you would like to know more:
https://gcc.gnu.org/onlinedocs/gcc-7.1.0/gcc/Warning-Options.html#Warning-Options

> >  # Extra flags for example to suppress certain warning types
> >  ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
> >  
> > diff --git a/platform/linux-generic/m4/configure.m4 b/platform/linux-generic/m4/configure.m4
> > index a2a25408..3e2978b5 100644
> > --- a/platform/linux-generic/m4/configure.m4
> > +++ b/platform/linux-generic/m4/configure.m4
> > @@ -28,6 +28,10 @@ AC_LINK_IFELSE(
> >  echo "Use newer version. For gcc > 4.7.0"
> >  exit -1)
> >  
> > +if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
> > +  AM_LDFLAGS="$AM_LDFLAGS -latomic"
> > +fi
> > +
> 
> This should be replaced with proper AC_CHECK_LIB or AC_SEARCH_LIBS

I don't think so. The link to libatomic is needed based on the compiler
version, not based on whether a program compiles with -latomic or not
which is AC_CHECK_LIB behavior. If you disagree, please show me how it
can be done.

This is also a very simple (3 line) solution.

> >  m4_include([platform/linux-generic/m4/odp_pthread.m4])
> >  m4_include([platform/linux-generic/m4/odp_openssl.m4])
> >  m4_include([platform/linux-generic/m4/odp_pcap.m4])
> > diff --git a/platform/linux-generic/pktio/ipc.c b/platform/linux-generic/pktio/ipc.c
> > index 06175e5a..29c3a546 100644
> > --- a/platform/linux-generic/pktio/ipc.c
> > +++ b/platform/linux-generic/pktio/ipc.c
> > @@ -694,8 +694,10 @@ static int ipc_close(pktio_entry_t *pktio_entry)
> >  
> > if (sscanf(dev, "ipc:%d:%s", , tail) == 2)
> > snprintf(name, sizeof(name), "ipc:%s", tail);
> > -   else
> > -   snprintf(name, sizeof(name), "%s", dev);
> > +   else {
> > +   strncpy(name, dev, sizeof(name));
> > +   name[sizeof(name) - 1] = '\0';
> > +   }
> 
> Why?

New -Wformat-truncation=level behavior. Please read the manual if you'd like
to know more.

> >  
> > /* unlink this pktio info for both master and slave */
> > odp_shm_free(pktio_entry->s.ipc.pinfo_shm);
> 
> 
> -- 
> With best wishes
> Dmitry


[lng-odp] [PATCH] Fixes for GCC 7

2017-06-01 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>
---
 configure.ac  | 5 +
 platform/linux-generic/m4/configure.m4| 4 
 platform/linux-generic/pktio/ipc.c| 6 --
 platform/linux-generic/pktio/sysfs.c  | 2 +-
 test/common_plat/validation/api/pktio/pktio.c | 4 +++-
 5 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/configure.ac b/configure.ac
index 7569ebe0..5eabe4d4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -300,6 +300,11 @@ ODP_CFLAGS="$ODP_CFLAGS -Wmissing-declarations -Wold-style-definition -Wpointer-
 ODP_CFLAGS="$ODP_CFLAGS -Wcast-align -Wnested-externs -Wcast-qual -Wformat-nonliteral"
 ODP_CFLAGS="$ODP_CFLAGS -Wformat-security -Wundef -Wwrite-strings"
 ODP_CFLAGS="$ODP_CFLAGS -std=c99"
+
+if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
+  ODP_CFLAGS="$ODP_CFLAGS -Wimplicit-fallthrough=0"
+fi
+
 # Extra flags for example to suppress certain warning types
 ODP_CFLAGS="$ODP_CFLAGS $ODP_CFLAGS_EXTRA"
 
diff --git a/platform/linux-generic/m4/configure.m4 b/platform/linux-generic/m4/configure.m4
index a2a25408..3e2978b5 100644
--- a/platform/linux-generic/m4/configure.m4
+++ b/platform/linux-generic/m4/configure.m4
@@ -28,6 +28,10 @@ AC_LINK_IFELSE(
 echo "Use newer version. For gcc > 4.7.0"
 exit -1)
 
+if test "${CC}" == "gcc" -a ${CC_VERSION_MAJOR} -ge 7; then
+  AM_LDFLAGS="$AM_LDFLAGS -latomic"
+fi
+
 m4_include([platform/linux-generic/m4/odp_pthread.m4])
 m4_include([platform/linux-generic/m4/odp_openssl.m4])
 m4_include([platform/linux-generic/m4/odp_pcap.m4])
diff --git a/platform/linux-generic/pktio/ipc.c b/platform/linux-generic/pktio/ipc.c
index 06175e5a..29c3a546 100644
--- a/platform/linux-generic/pktio/ipc.c
+++ b/platform/linux-generic/pktio/ipc.c
@@ -694,8 +694,10 @@ static int ipc_close(pktio_entry_t *pktio_entry)
 
if (sscanf(dev, "ipc:%d:%s", , tail) == 2)
snprintf(name, sizeof(name), "ipc:%s", tail);
-   else
-   snprintf(name, sizeof(name), "%s", dev);
+   else {
+   strncpy(name, dev, sizeof(name));
+   name[sizeof(name) - 1] = '\0';
+   }
 
/* unlink this pktio info for both master and slave */
odp_shm_free(pktio_entry->s.ipc.pinfo_shm);
diff --git a/platform/linux-generic/pktio/sysfs.c b/platform/linux-generic/pktio/sysfs.c
index be0822dd..6e9bc59b 100644
--- a/platform/linux-generic/pktio/sysfs.c
+++ b/platform/linux-generic/pktio/sysfs.c
@@ -43,7 +43,7 @@ static int sysfs_get_val(const char *fname, uint64_t *val)
 int sysfs_stats(pktio_entry_t *pktio_entry,
odp_pktio_stats_t *stats)
 {
-   char fname[256];
+   char fname[300];
const char *dev = pktio_entry->s.name;
int ret = 0;
 
diff --git a/test/common_plat/validation/api/pktio/pktio.c b/test/common_plat/validation/api/pktio/pktio.c
index 11fe974f..4d8d2cc7 100644
--- a/test/common_plat/validation/api/pktio/pktio.c
+++ b/test/common_plat/validation/api/pktio/pktio.c
@@ -1429,7 +1429,9 @@ int pktio_check_statistics_counters(void)
 void pktio_test_statistics_counters(void)
 {
odp_pktio_t pktio_rx, pktio_tx;
-   odp_pktio_t pktio[MAX_NUM_IFACES];
+   odp_pktio_t pktio[MAX_NUM_IFACES] = {
+   ODP_PKTIO_INVALID, ODP_PKTIO_INVALID
+   };
odp_packet_t pkt;
odp_packet_t tx_pkt[1000];
uint32_t pkt_seq[1000];
-- 
2.13.0



[lng-odp] [API-NEXT PATCH 6/6] timer: allow timer processing to run on worker cores

2017-05-28 Thread Brian Brooks
Use 'worker_timers' option in global initialization to run timer
processing from within the schedule call instead of on background
threads. This option reduces the latency and jitter of the time
when timer pool processing begins. See [1] for details.

[1] https://docs.google.com/document/d/1sY7rOxqCNu-bMqjBiT5_keAIohrX1ZW-eL0oGLAQ4OM/edit?usp=sharing

Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
---
 include/odp/api/spec/init.h|  8 ++
 .../linux-generic/include/odp_timer_internal.h |  7 ++
 platform/linux-generic/odp_init.c  |  9 ++-
 platform/linux-generic/odp_schedule.c  |  5 +-
 platform/linux-generic/odp_schedule_iquery.c   |  4 +
 platform/linux-generic/odp_schedule_sp.c   |  4 +
 platform/linux-generic/odp_timer.c | 90 +++---
 7 files changed, 115 insertions(+), 12 deletions(-)

diff --git a/include/odp/api/spec/init.h b/include/odp/api/spec/init.h
index 154cdf8f..9df99d4a 100644
--- a/include/odp/api/spec/init.h
+++ b/include/odp/api/spec/init.h
@@ -153,6 +153,14 @@ typedef struct odp_init_t {
odp_log_func_t log_fn;
/** Replacement for the default abort fn */
odp_abort_func_t abort_fn;
+   /** Allow timer maintenance work to run on worker cores.
+
+   When enabled, work will run from within odp_schedule().
+   When disabled, work will run in background threads.
+
+   Default: disabled (false)
+   */
+   odp_bool_t worker_timers;
 } odp_init_t;
 
 /**
diff --git a/platform/linux-generic/include/odp_timer_internal.h b/platform/linux-generic/include/odp_timer_internal.h
index 91b12c54..4a1bbeb3 100644
--- a/platform/linux-generic/include/odp_timer_internal.h
+++ b/platform/linux-generic/include/odp_timer_internal.h
@@ -20,6 +20,9 @@
 #include 
 #include 
 
+/* Minimum number of nanoseconds between calls to _timer_run(). */
+#define CONFIG_TIMER_RUN_RATELIMIT_PERIOD 100
+
 /**
  * Internal Timeout header
  */
@@ -35,4 +38,8 @@ typedef struct {
odp_timer_t timer;
 } odp_timeout_hdr_t;
 
+odp_bool_t worker_timers;
+
+int timer_run(void);
+
 #endif
diff --git a/platform/linux-generic/odp_init.c b/platform/linux-generic/odp_init.c
index 685e02fa..f08f221c 100644
--- a/platform/linux-generic/odp_init.c
+++ b/platform/linux-generic/odp_init.c
@@ -4,11 +4,14 @@
  * SPDX-License-Identifier: BSD-3-Clause
  */
 #include 
-#include 
 #include 
-#include 
+
 #include 
+#include 
 #include 
+#include 
+
+#include 
 #include 
 #include 
 #include 
@@ -165,6 +168,8 @@ int odp_init_global(odp_instance_t *instance,
odp_global_data.log_fn = params->log_fn;
if (params->abort_fn != NULL)
odp_global_data.abort_fn = params->abort_fn;
+   if (params->worker_timers)
+   worker_timers = true;
}
 
cleanup_files(_ODP_TMPDIR, odp_global_data.main_pid);
diff --git a/platform/linux-generic/odp_schedule.c b/platform/linux-generic/odp_schedule.c
index f680ac47..0c634f62 100644
--- a/platform/linux-generic/odp_schedule.c
+++ b/platform/linux-generic/odp_schedule.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels  */
 #define NUM_PRIO 8
@@ -985,7 +986,6 @@ static inline int do_schedule(odp_queue_t *out_queue, odp_event_t out_ev[],
return 0;
 }
 
-
 static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
 odp_event_t out_ev[],
 unsigned int max_num)
@@ -995,6 +995,9 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
int ret;
 
while (1) {
+   if (worker_timers)
+   (void)timer_run();
+
ret = do_schedule(out_queue, out_ev, max_num);
 
if (ret)
diff --git a/platform/linux-generic/odp_schedule_iquery.c b/platform/linux-generic/odp_schedule_iquery.c
index b8a40011..67457d8f 100644
--- a/platform/linux-generic/odp_schedule_iquery.c
+++ b/platform/linux-generic/odp_schedule_iquery.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Number of priority levels */
 #define NUM_SCHED_PRIO 8
@@ -718,6 +719,9 @@ static int schedule_loop(odp_queue_t *out_queue, uint64_t wait,
 	odp_time_t next, wtime;
 
 	while (1) {
+		if (worker_timers)
+			(void)timer_run();
+
 		count = do_schedule(out_queue, out_ev, max_num);
 
 		if (count)
diff --git a/platform/linux-generic/odp_schedule_sp.c b/platform/linux-generic/odp_schedule_sp.c
index 0fd4d87d..bef9ec01 100644
--- a/platform/linux-generic/odp_schedule_sp.c
+++ b/platform/linux-generic/odp_schedule_sp.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define NUM_THREAD ODP_THREAD_COUNT_MAX
 #define NUM_QUEUE ODP_CONFIG
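
For readers following the hunks above: the schedulers call timer_run() on every pass of schedule_loop() when worker_timers is set, and CONFIG_TIMER_RUN_RATELIMIT_PERIOD is described as the minimum number of nanoseconds between calls to _timer_run(). The body of timer_run() is not shown in this part of the patch, so the following is only a minimal sketch of how such a rate limit could work. The names poll_state_t, expensive_scan() and poll_timers() are hypothetical and are not taken from ODP, and the current time is passed in explicitly so the behaviour is easy to check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as CONFIG_TIMER_RUN_RATELIMIT_PERIOD in the patch (ns). */
#define RATELIMIT_PERIOD_NS 100

/* Hypothetical per-thread state; not part of the patch. */
typedef struct {
	uint64_t last_run_ns; /* time of the last scan */
	bool has_run;         /* false until the first scan */
	unsigned int scans;   /* number of scans actually performed */
} poll_state_t;

/* Stand-in for the expensive work (scanning timer pools). */
static void expensive_scan(poll_state_t *st)
{
	st->scans++;
}

/*
 * Cheap check meant to be called on every scheduler iteration.
 * Assumes now_ns never goes backwards.
 * Returns 1 if the scan ran, 0 if it was skipped by the rate limit.
 */
static int poll_timers(poll_state_t *st, uint64_t now_ns)
{
	if (st->has_run && now_ns - st->last_run_ns < RATELIMIT_PERIOD_NS)
		return 0;

	st->last_run_ns = now_ns;
	st->has_run = true;
	expensive_scan(st);
	return 1;
}
```

With this shape, a caller that polls at 1000 ns, 1050 ns and 1100 ns scans on the first and third calls only; the second call costs one comparison.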

[lng-odp] [API-NEXT PATCH 4/6] timer: organize #include

2017-05-28 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Kevin Wang <kevin.w...@arm.com>
---
 platform/linux-generic/odp_timer.c | 25 +
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/platform/linux-generic/odp_timer.c b/platform/linux-generic/odp_timer.c
index cf610bfa..4539ea48 100644
--- a/platform/linux-generic/odp_timer.c
+++ b/platform/linux-generic/odp_timer.c
@@ -22,29 +22,23 @@
 #include 
 
 #include 
+#include 
+#include 
+#include 
 #include 
+#include 
+#include 
 #include 
-#include 
-#include 
 #include 
-#include 
-#include 
-#include 
 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
 #include 
-#include 
-#include 
 #include 
-#include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
@@ -52,6 +46,13 @@
 #include 
 #include 
 #include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 #include 
 
 #define TMO_UNUSED   ((uint64_t)0x)
-- 
2.13.0



[lng-odp] [API-NEXT PATCH 5/6] api: timer: add odp_timer_pool_res()

2017-05-28 Thread Brian Brooks
Signed-off-by: Brian Brooks <brian.bro...@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljed...@arm.com>
Reviewed-by: Kevin Wang <kevin.w...@arm.com>
---
 include/odp/api/spec/timer.h  | 9 +
 platform/linux-generic/odp_timer.c| 5 +
 test/common_plat/validation/api/timer/timer.c | 2 ++
 3 files changed, 16 insertions(+)

diff --git a/include/odp/api/spec/timer.h b/include/odp/api/spec/timer.h
index 75f9db98..633cbe47 100644
--- a/include/odp/api/spec/timer.h
+++ b/include/odp/api/spec/timer.h
@@ -196,6 +196,15 @@ int odp_timer_pool_info(odp_timer_pool_t tpid,
odp_timer_pool_info_t *info);
 
 /**
+ * Get resolution from timer pool
+ *
+ * @param tpid Timer pool identifier
+ *
+ * @return Timeout resolution in nanoseconds
+ */
+uint64_t odp_timer_pool_res(odp_timer_pool_t tpid);
+
+/**
  * Allocate a timer
  *
  * Create a timer (allocating all necessary resources e.g. timeout event) from
diff --git a/platform/linux-generic/odp_timer.c b/platform/linux-generic/odp_timer.c
index 4539ea48..80dce873 100644
--- a/platform/linux-generic/odp_timer.c
+++ b/platform/linux-generic/odp_timer.c
@@ -842,6 +842,11 @@ int odp_timer_pool_info(odp_timer_pool_t tpid,
return 0;
 }
 
+uint64_t odp_timer_pool_res(odp_timer_pool_t tpid)
+{
+   return tpid->param.res_ns;
+}
+
 uint64_t odp_timer_pool_to_u64(odp_timer_pool_t tpid)
 {
return _odp_pri(tpid);
diff --git a/test/common_plat/validation/api/timer/timer.c b/test/common_plat/validation/api/timer/timer.c
index b7d84c64..0fb4631e 100644
--- a/test/common_plat/validation/api/timer/timer.c
+++ b/test/common_plat/validation/api/timer/timer.c
@@ -529,6 +529,8 @@ void timer_test_odp_timer_all(void)
CU_ASSERT(tpinfo.param.max_tmo == MAX);
CU_ASSERT(strcmp(tpinfo.name, NAME) == 0);
 
+   CU_ASSERT(odp_timer_pool_res(tp) == RES);
+
LOG_DBG("Timer pool handle: %" PRIu64 "\n", odp_timer_pool_to_u64(tp));
LOG_DBG("#timers..: %u\n", NTIMERS);
LOG_DBG("Tmo range: %u ms (%" PRIu64 " ticks)\n", RANGE_MS,
-- 
2.13.0
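
As an illustration of what a caller might do with the new accessor: odp_timer_pool_res() returns the pool's timeout resolution in nanoseconds, which lets code convert a duration into whole resolution periods without filling in a full odp_timer_pool_info_t. The sketch below does not use the ODP headers. timer_pool_t, timer_pool_res() and ns_to_periods() are stand-ins invented for this example; only the param.res_ns field access mirrors the patch.

```c
#include <stdint.h>

/* Stand-in types; the real odp_timer_pool_t is an opaque handle. */
typedef struct {
	uint64_t res_ns; /* timeout resolution in nanoseconds */
} pool_param_t;

typedef struct {
	pool_param_t param;
} timer_pool_t;

/* Same shape as the accessor added by the patch. */
static uint64_t timer_pool_res(const timer_pool_t *tp)
{
	return tp->param.res_ns;
}

/*
 * Hypothetical helper: round a duration up to whole resolution
 * periods. Assumes the pool resolution is non-zero.
 */
static uint64_t ns_to_periods(const timer_pool_t *tp, uint64_t ns)
{
	uint64_t res = timer_pool_res(tp);

	return (ns + res - 1) / res;
}
```

Rounding up is a choice made for this example so that a timeout is never shorter than requested; a caller that prefers truncation would divide directly.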


