Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
On Thu, Jun 22, 2017 at 11:46 AM, Brian Brooks wrote:
>> > > The first is built only for ARM and the second for the rest. Would there
>> > > be a way to build both always ?
>> > For ARMv7a and ARMv8a, you could build both versions. You really want to
>> > use the LL/SC version on these architectures.
>> >
>> > For architectures without double-word LL/SC, only the lock-based version
>> > can be built.
>>
>> You could *compile* the lock version always. It's based on locks, not on
>> arch specific instructions.
>
> That would require an abstraction layer consisting of function pointers
> pointing to one of the two implementations. On architectures without support
> for LLD/SCD, there would only be one implementation.
>
> This could make sense if... you were benchmarking *many* different concurrent
> queue implementations and wanted to keep the benchmark code extremely succinct
> and were willing to pay for function pointers. But that is not the case here.
>
> This code is deliberately written to be static inline and conditionally
> compiled.

We're going to need to think about how to make microarchitecture
distinctions like this dynamic since we want a single binary to be
distributed to all ARM architectures. I agree that such dynamism is outside
the scope of this series, but it's something we're going to need as the
cloud work progresses.

>
>> -Petri
>>
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
> > > The first is built only for ARM and the second for the rest. Would there
> > > be a way to build both always ?
> > For ARMv7a and ARMv8a, you could build both versions. You really want to
> > use the LL/SC version on these architectures.
> >
> > For architectures without double-word LL/SC, only the lock-based version
> > can be built.
>
> You could *compile* the lock version always. It's based on locks, not on arch
> specific instructions.

That would require an abstraction layer consisting of function pointers
pointing to one of the two implementations. On architectures without support
for LLD/SCD, there would only be one implementation.

This could make sense if... you were benchmarking *many* different concurrent
queue implementations and wanted to keep the benchmark code extremely succinct
and were willing to pay for function pointers. But that is not the case here.

This code is deliberately written to be static inline and conditionally
compiled.

> -Petri
>
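For readers following the thread, the trade-off described above can be sketched roughly as follows. This is only an illustration: every `demo_` name and the `DEMO_HAVE_LLSC` macro are hypothetical and do not come from the patch, and the single stand-in implementation omits all locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the queue types; not the patch's definitions. */
struct demo_node {
	struct demo_node *next;
};

struct demo_queue {
	struct demo_node *head, *tail;
};

/* Stand-in for one implementation. Locking is omitted to keep the sketch
 * short, so this is NOT safe for concurrent use as written.
 */
static void demo_enqueue_impl(struct demo_queue *q, struct demo_node *n)
{
	assert(n != NULL);
	n->next = NULL;
	if (q->tail != NULL)
		q->tail->next = n;
	else
		q->head = n;
	q->tail = n;
}

static struct demo_node *demo_dequeue_impl(struct demo_queue *q)
{
	struct demo_node *n = q->head;

	if (n != NULL) {
		q->head = n->next;
		if (q->head == NULL)
			q->tail = NULL;
	}
	return n;
}

/* Option A: run-time selection. Every call goes through a pointer, which
 * the compiler generally cannot inline.
 */
struct demo_ops {
	void (*enqueue)(struct demo_queue *q, struct demo_node *n);
	struct demo_node *(*dequeue)(struct demo_queue *q);
};

static const struct demo_ops demo_ops_impl = {
	demo_enqueue_impl, demo_dequeue_impl
};

static const struct demo_ops *demo_ops = &demo_ops_impl;

/* Option B: compile-time selection. Exactly one variant exists in the
 * binary and calls can be inlined.
 */
#ifdef DEMO_HAVE_LLSC
/* an LL/SC-based demo_enqueue()/demo_dequeue() would be defined here */
#else
static inline void demo_enqueue(struct demo_queue *q, struct demo_node *n)
{
	demo_enqueue_impl(q, n);
}

static inline struct demo_node *demo_dequeue(struct demo_queue *q)
{
	return demo_dequeue_impl(q);
}
#endif
```

With option A a caller writes `demo_ops->enqueue(&q, &n)`; with option B it writes `demo_enqueue(&q, &n)`. Option A allows one binary to carry several variants at the cost of an indirect call; option B is the static inline, conditionally compiled shape described in the message above.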
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
> -----Original Message-----
> From: Ola Liljedahl [mailto:ola.liljed...@arm.com]
> Sent: Wednesday, June 21, 2017 7:31 PM
> To: Savolainen, Petri (Nokia - FI/Espoo) <petri.savolai...@nokia.com>;
> Brian Brooks <brian.bro...@arm.com>; lng-odp@lists.linaro.org
> Cc: nd <n...@arm.com>
> Subject: Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable:
> add a concurrent queue
>
> On 20/06/2017, 15:12, "Savolainen, Petri (Nokia - FI/Espoo)"
> <petri.savolai...@nokia.com> wrote:
>
> >> +++ b/platform/linux-generic/include/odp_llqueue.h
> >> @@ -0,0 +1,309 @@
> >> +/* Copyright (c) 2017, ARM Limited.
> >> + * All rights reserved.
> >> + *
> >> + * SPDX-License-Identifier:BSD-3-Clause
> >> + */
> >> +
> >> +#ifndef ODP_LLQUEUE_H_
> >> +#define ODP_LLQUEUE_H_
> >> +
> >> +#include
> >> +#include
> >> +#include
> >> +
> >> +#include
> >> +#include
> >> +#include
> >> +
> >> +#include
> >> +#include
> >> +
> >> +/**
> >> + * Linked list queues
> >> + */
> >> +
> >> +struct llqueue;
> >> +struct llnode;
> >> +
> >> +static struct llnode *llq_head(struct llqueue *llq);
> >> +static void llqueue_init(struct llqueue *llq);
> >> +static void llq_enqueue(struct llqueue *llq, struct llnode *node);
> >> +static struct llnode *llq_dequeue(struct llqueue *llq);
> >> +static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
> >> +static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
> >> +static odp_bool_t llq_on_queue(struct llnode *node);
> >> +
> >> +/**
> >> + * The implementation(s)
> >> + */
> >> +
> >> +#define SENTINEL ((void *)~(uintptr_t)0)
> >> +
> >> +#ifdef CONFIG_LLDSCD
> >> +/* Implement queue operations using double-word LL/SC */
> >
> >> +
> >> +#else
> >> +/* Implement queue operations protected by a spin lock */
> >> +
> >
> >There's a lot of ifdef'ed code in this file, basically two full parallel
> >implementations.
> This horse has been flogged before on the mailing list.

Nothing has changed in our ifdef policy. The less ifdef'ed code, the better.
This patch set introduces about 60 new #ifdef/#if/#ifndefs (not counting
header file guards).

> > The first is built only for ARM and the second for the rest. Would there
> >be a way to build both always ?
> For ARMv7a and ARMv8a, you could build both versions. You really want to
> use the LL/SC version on these architectures.
>
> For architectures without double-word LL/SC, only the lock-based version
> can be built.

You could *compile* the lock version always. It's based on locks, not on arch
specific instructions.

-Petri
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
On 06/21/17 21:57, Honnappa Nagarahalli wrote:
> I cannot make a decision on this topic. This is what I have been told
> and I am also told that Linaro is in sync on this. I will initiate
> some discussions on this internally. Bill, do you want to check in
> Linaro on this?
>

we will check this first.

> I suggest we go ahead with the patch and agree to change it when we
> have a decision.
>

and we can not go ahead until everything is resolved.

Maxim.

> On 21 June 2017 at 12:37, Bill Fischofer wrote:
>> It's perfectly fine to have an ARM copyright as the original
>> contributor, but it also needs a Linaro copyright to be incorporated
>> into ODP. See traffic_mngr.c for an example of this.
>>
>> On Wed, Jun 21, 2017 at 9:27 AM, Honnappa Nagarahalli
>> wrote:
>>> Hi Maxim,
>>> This is a new file added by us. Hence the ARM copyright.
>>> Thanks,
>>> Honnappa
>>>
>>> On 21 June 2017 at 09:06, Maxim Uvarov wrote:
>>>> On 06/19/17 22:12, Brian Brooks wrote:
>>>>>
>>>>> --- /dev/null
>>>>> +++ b/platform/linux-generic/include/odp_llqueue.h
>>>>> @@ -0,0 +1,309 @@
>>>>> +/* Copyright (c) 2017, ARM Limited.
>>>>> + * All rights reserved.
>>>>> + *
>>>>> + * SPDX-License-Identifier:BSD-3-Clause
>>>>> + */
>>>>> +
>>>>
>>>> Has to be Linaro copyright.
>>>>
>>>> Maxim.
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
I cannot make a decision on this topic. This is what I have been told
and I am also told that Linaro is in sync on this. I will initiate
some discussions on this internally. Bill, do you want to check in
Linaro on this?

I suggest we go ahead with the patch and agree to change it when we
have a decision.

On 21 June 2017 at 12:37, Bill Fischofer wrote:
> It's perfectly fine to have an ARM copyright as the original
> contributor, but it also needs a Linaro copyright to be incorporated
> into ODP. See traffic_mngr.c for an example of this.
>
> On Wed, Jun 21, 2017 at 9:27 AM, Honnappa Nagarahalli
> wrote:
>> Hi Maxim,
>> This is a new file added by us. Hence the ARM copyright.
>> Thanks,
>> Honnappa
>>
>> On 21 June 2017 at 09:06, Maxim Uvarov wrote:
>>> On 06/19/17 22:12, Brian Brooks wrote:
>>>>
>>>> --- /dev/null
>>>> +++ b/platform/linux-generic/include/odp_llqueue.h
>>>> @@ -0,0 +1,309 @@
>>>> +/* Copyright (c) 2017, ARM Limited.
>>>> + * All rights reserved.
>>>> + *
>>>> + * SPDX-License-Identifier:BSD-3-Clause
>>>> + */
>>>> +
>>>
>>>
>>> Has to be Linaro copyright.
>>>
>>> Maxim.
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
It's perfectly fine to have an ARM copyright as the original
contributor, but it also needs a Linaro copyright to be incorporated
into ODP. See traffic_mngr.c for an example of this.

On Wed, Jun 21, 2017 at 9:27 AM, Honnappa Nagarahalli wrote:
> Hi Maxim,
> This is a new file added by us. Hence the ARM copyright.
> Thanks,
> Honnappa
>
> On 21 June 2017 at 09:06, Maxim Uvarov wrote:
>> On 06/19/17 22:12, Brian Brooks wrote:
>>>
>>> --- /dev/null
>>> +++ b/platform/linux-generic/include/odp_llqueue.h
>>> @@ -0,0 +1,309 @@
>>> +/* Copyright (c) 2017, ARM Limited.
>>> + * All rights reserved.
>>> + *
>>> + * SPDX-License-Identifier:BSD-3-Clause
>>> + */
>>> +
>>
>>
>> Has to be Linaro copyright.
>>
>> Maxim.
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
On 20/06/2017, 15:12, "Savolainen, Petri (Nokia - FI/Espoo)" wrote:

>> +++ b/platform/linux-generic/include/odp_llqueue.h
>> @@ -0,0 +1,309 @@
>> +/* Copyright (c) 2017, ARM Limited.
>> + * All rights reserved.
>> + *
>> + * SPDX-License-Identifier:BSD-3-Clause
>> + */
>> +
>> +#ifndef ODP_LLQUEUE_H_
>> +#define ODP_LLQUEUE_H_
>> +
>> +#include
>> +#include
>> +#include
>> +
>> +#include
>> +#include
>> +#include
>> +
>> +#include
>> +#include
>> +
>> +/**
>> + * Linked list queues
>> + */
>> +
>> +struct llqueue;
>> +struct llnode;
>> +
>> +static struct llnode *llq_head(struct llqueue *llq);
>> +static void llqueue_init(struct llqueue *llq);
>> +static void llq_enqueue(struct llqueue *llq, struct llnode *node);
>> +static struct llnode *llq_dequeue(struct llqueue *llq);
>> +static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
>> +static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
>> +static odp_bool_t llq_on_queue(struct llnode *node);
>> +
>> +/**
>> + * The implementation(s)
>> + */
>> +
>> +#define SENTINEL ((void *)~(uintptr_t)0)
>> +
>> +#ifdef CONFIG_LLDSCD
>> +/* Implement queue operations using double-word LL/SC */
>
>> +
>> +#else
>> +/* Implement queue operations protected by a spin lock */
>> +
>
>There's a lot of ifdef'ed code in this file, basically two full parallel
>implementations.
This horse has been flogged before on the mailing list.

> The first is built only for ARM and the second for the rest. Would there
>be a way to build both always ?
For ARMv7a and ARMv8a, you could build both versions. You really want to
use the LL/SC version on these architectures.

For architectures without double-word LL/SC, only the lock-based version
can be built.

>
>-Petri
>
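As a rough illustration of why the lock-based variant can be built on any architecture, here is a minimal sketch that uses only C11 atomics for the spin lock. It is not the code from the patch: the struct layout and the `llq_lock()`/`llq_unlock()` helpers are assumptions made for this example, the `SENTINEL` marking of queued nodes is left out, and only four of the declared operations are shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct llnode {
	struct llnode *next;
};

/* Assumed layout for this sketch: head and tail guarded by a spin lock */
struct llqueue {
	struct llnode *head, *tail;
	atomic_flag lock;
};

/* Hypothetical helpers, not part of the patch */
static inline void llq_lock(struct llqueue *llq)
{
	while (atomic_flag_test_and_set_explicit(&llq->lock,
						 memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static inline void llq_unlock(struct llqueue *llq)
{
	atomic_flag_clear_explicit(&llq->lock, memory_order_release);
}

static inline void llqueue_init(struct llqueue *llq)
{
	llq->head = NULL;
	llq->tail = NULL;
	atomic_flag_clear(&llq->lock);
}

static inline struct llnode *llq_head(struct llqueue *llq)
{
	struct llnode *head;

	llq_lock(llq);
	head = llq->head;
	llq_unlock(llq);
	return head;
}

static inline void llq_enqueue(struct llqueue *llq, struct llnode *node)
{
	assert(node->next == NULL);
	llq_lock(llq);
	if (llq->tail != NULL)
		llq->tail->next = node; /* append to non-empty list */
	else
		llq->head = node;       /* first element */
	llq->tail = node;
	llq_unlock(llq);
}

static inline struct llnode *llq_dequeue(struct llqueue *llq)
{
	struct llnode *node;

	llq_lock(llq);
	node = llq->head;
	if (node != NULL) {
		llq->head = node->next;
		if (llq->head == NULL)
			llq->tail = NULL; /* list became empty */
		node->next = NULL;
	}
	llq_unlock(llq);
	return node;
}
```

Nothing here depends on double-word LL/SC, so it compiles wherever C11 `<stdatomic.h>` is available. The trade-off is that every operation serializes on the one lock.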
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
I think we have talked about this earlier. We have been directed from
our executives that we should use ARM copyright if a member has done
this work and Linaro/LNG/FF/Bill are in sync on this.

Thanks,
Honnappa

On 21 June 2017 at 10:10, Maxim Uvarov wrote:
> On 06/21/17 17:27, Honnappa Nagarahalli wrote:
>>
>> Hi Maxim,
>> This is a new file added by us. Hence the ARM copyright.
>> Thanks,
>> Honnappa
>>
>
> Was this work done as Linaro member/assignee? If yes - it has to have Linaro
> copyright as I know.
>
> Maxim.
>
>
>> On 21 June 2017 at 09:06, Maxim Uvarov wrote:
>>>
>>> On 06/19/17 22:12, Brian Brooks wrote:
>>>>
>>>> --- /dev/null
>>>> +++ b/platform/linux-generic/include/odp_llqueue.h
>>>> @@ -0,0 +1,309 @@
>>>> +/* Copyright (c) 2017, ARM Limited.
>>>> + * All rights reserved.
>>>> + *
>>>> + * SPDX-License-Identifier:BSD-3-Clause
>>>> + */
>>>> +
>>>
>>>
>>>
>>> Has to be Linaro copyright.
>>>
>>> Maxim.
>
>
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
On 06/21/17 17:27, Honnappa Nagarahalli wrote:
> Hi Maxim,
> This is a new file added by us. Hence the ARM copyright.
> Thanks,
> Honnappa

Was this work done as Linaro member/assignee? If yes - it has to have Linaro
copyright as I know.

Maxim.

> On 21 June 2017 at 09:06, Maxim Uvarov wrote:
>> On 06/19/17 22:12, Brian Brooks wrote:
>>> --- /dev/null
>>> +++ b/platform/linux-generic/include/odp_llqueue.h
>>> @@ -0,0 +1,309 @@
>>> +/* Copyright (c) 2017, ARM Limited.
>>> + * All rights reserved.
>>> + *
>>> + * SPDX-License-Identifier:BSD-3-Clause
>>> + */
>>> +
>>
>> Has to be Linaro copyright.
>>
>> Maxim.
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
On 06/19/17 22:12, Brian Brooks wrote:
> Signed-off-by: Ola Liljedahl
> Reviewed-by: Brian Brooks
> ---
>  platform/linux-generic/Makefile.am           |   1 +
>  platform/linux-generic/include/odp_llqueue.h | 309 +++
>  2 files changed, 310 insertions(+)
>  create mode 100644 platform/linux-generic/include/odp_llqueue.h
>
> diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
> index b869fd4b..90cc4ca6 100644
> --- a/platform/linux-generic/Makefile.am
> +++ b/platform/linux-generic/Makefile.am
> @@ -157,6 +157,7 @@ noinst_HEADERS = \
>  		  ${srcdir}/include/odp_errno_define.h \
>  		  ${srcdir}/include/odp_forward_typedefs_internal.h \
>  		  ${srcdir}/include/odp_internal.h \
> +		  ${srcdir}/include/odp_llqueue.h \
>  		  ${srcdir}/include/odp_name_table_internal.h \
>  		  ${srcdir}/include/odp_packet_internal.h \
>  		  ${srcdir}/include/odp_packet_io_internal.h \
> diff --git a/platform/linux-generic/include/odp_llqueue.h b/platform/linux-generic/include/odp_llqueue.h
> new file mode 100644
> index ..758af490
> --- /dev/null
> +++ b/platform/linux-generic/include/odp_llqueue.h
> @@ -0,0 +1,309 @@
> +/* Copyright (c) 2017, ARM Limited.
> + * All rights reserved.
> + *
> + * SPDX-License-Identifier:BSD-3-Clause
> + */
> +
> +#ifndef ODP_LLQUEUE_H_
> +#define ODP_LLQUEUE_H_
> +
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +
> +/**
> + * Linked list queues
> + */
> +
> +struct llqueue;
> +struct llnode;
> +
> +static struct llnode *llq_head(struct llqueue *llq);
> +static void llqueue_init(struct llqueue *llq);
> +static void llq_enqueue(struct llqueue *llq, struct llnode *node);
> +static struct llnode *llq_dequeue(struct llqueue *llq);
> +static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
> +static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
> +static odp_bool_t llq_on_queue(struct llnode *node);
> +
> +/**
> + * The implementation(s)
> + */
> +
> +#define SENTINEL ((void *)~(uintptr_t)0)
> +
> +#ifdef CONFIG_LLDSCD
> +/* Implement queue operations using double-word LL/SC */
> +
> +/* The scalar equivalent of a double pointer */
> +#if __SIZEOF_PTRDIFF_T__ == 4
> +typedef uint64_t dintptr_t;
> +#endif
> +#if __SIZEOF_PTRDIFF_T__ == 8
> +typedef __int128 dintptr_t;
> +#endif
> +
> +struct llnode {
> +	struct llnode *next;
> +};
> +
> +union llht {
> +	struct {
> +		struct llnode *head, *tail;
> +	} st;
> +	dintptr_t ui;

you use dintptr_t only once. It's here. No need for typedef. Remove the ifdef
above. Place it here and change to #else if. And the code will be 2 times
smaller.

> +};
> +
> +struct llqueue {
> +	union llht u;
> +};

that does not look clear to me. 'struct llqueue' and 'union llht' are the
same thing. After that you refer to it as 'struct llqueue' in function
arguments and 'union llht' inside the function. Like below...

> +
> +static inline struct llnode *llq_head(struct llqueue *llq)
> +{
> +	return __atomic_load_n(&llq->u.st.head, __ATOMIC_RELAXED);
> +}
> +
> +static inline void llqueue_init(struct llqueue *llq)
> +{
> +	llq->u.st.head = NULL;
> +	llq->u.st.tail = NULL;
> +}
> +
> +static inline void llq_enqueue(struct llqueue *llq, struct llnode *node)
> +{
> +	union llht old, neu;
> +

here. btw neu has to be new, right?

> +	ODP_ASSERT(node->next == NULL);
> +	node->next = SENTINEL;
> +	do {
> +		old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
> +		neu.st.head = old.st.head == NULL ? node : old.st.head;
> +		neu.st.tail = node;
> +	} while (odp_unlikely(scd(&llq->u.ui, neu.ui, __ATOMIC_RELEASE)));
> +	if (old.st.tail != NULL) {
> +		/* List was not empty */
> +		ODP_ASSERT(old.st.tail->next == SENTINEL);
> +		old.st.tail->next = node;
> +	}
> +}
> +
> +static inline struct llnode *llq_dequeue(struct llqueue *llq)
> +{
> +	struct llnode *head;
> +	union llht old, neu;
> +
> +	/* llq_dequeue() may be used in a busy-waiting fashion
> +	 * Read head using plain load to avoid disturbing remote LL/SC
> +	 */
> +	head = __atomic_load_n(&llq->u.st.head, __ATOMIC_ACQUIRE);
> +	if (head == NULL)
> +		return NULL;
> +	/* Read head->next before LL to minimize cache miss latency
> +	 * in LL/SC below
> +	 */
> +	(void)__atomic_load_n(&head->next, __ATOMIC_RELAXED);
> +
> +	do {
> +		old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
> +		if (odp_unlikely(old.st.head == NULL)) {
> +			/* Empty list */
> +			return NULL;
> +		} else
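
The review comment about `dintptr_t` in this message could look roughly like the sketch below. This is one reading of the suggestion, not code from the thread: the scalar type is selected directly inside the union, the `_Static_assert` is a hypothetical check added for illustration, and it assumes GCC or Clang (for `__SIZEOF_PTRDIFF_T__` and `__int128`).

```c
#include <assert.h>
#include <stdint.h>

struct llnode;

union llht {
	struct {
		struct llnode *head, *tail;
	} st;
	/* Scalar wide enough to cover both pointers, chosen in place
	 * instead of through a separate dintptr_t typedef
	 */
#if __SIZEOF_PTRDIFF_T__ == 4
	uint64_t ui;
#else
	__int128 ui;
#endif
};

/* Hypothetical compile-time check, not part of the patch */
_Static_assert(sizeof(((union llht *)0)->ui) == 2 * sizeof(struct llnode *),
	       "ui must span head and tail");
```

This drops one `#if`/`#endif` pair and the typedef. Whether that is preferable to a named type is the style question raised in the review.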
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
Hi Maxim,
This is a new file added by us. Hence the ARM copyright.
Thanks,
Honnappa

On 21 June 2017 at 09:06, Maxim Uvarov wrote:
> On 06/19/17 22:12, Brian Brooks wrote:
>>
>> --- /dev/null
>> +++ b/platform/linux-generic/include/odp_llqueue.h
>> @@ -0,0 +1,309 @@
>> +/* Copyright (c) 2017, ARM Limited.
>> + * All rights reserved.
>> + *
>> + * SPDX-License-Identifier:BSD-3-Clause
>> + */
>> +
>
>
> Has to be Linaro copyright.
>
> Maxim.
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
On 06/19/17 22:12, Brian Brooks wrote:
> --- /dev/null
> +++ b/platform/linux-generic/include/odp_llqueue.h
> @@ -0,0 +1,309 @@
> +/* Copyright (c) 2017, ARM Limited.
> + * All rights reserved.
> + *
> + * SPDX-License-Identifier:BSD-3-Clause
> + */
> +

Has to be Linaro copyright.

Maxim.
Re: [lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
> +++ b/platform/linux-generic/include/odp_llqueue.h
> @@ -0,0 +1,309 @@
> +/* Copyright (c) 2017, ARM Limited.
> + * All rights reserved.
> + *
> + * SPDX-License-Identifier:BSD-3-Clause
> + */
> +
> +#ifndef ODP_LLQUEUE_H_
> +#define ODP_LLQUEUE_H_
> +
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +#include
> +
> +#include
> +#include
> +
> +/**
> + * Linked list queues
> + */
> +
> +struct llqueue;
> +struct llnode;
> +
> +static struct llnode *llq_head(struct llqueue *llq);
> +static void llqueue_init(struct llqueue *llq);
> +static void llq_enqueue(struct llqueue *llq, struct llnode *node);
> +static struct llnode *llq_dequeue(struct llqueue *llq);
> +static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
> +static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
> +static odp_bool_t llq_on_queue(struct llnode *node);
> +
> +/**
> + * The implementation(s)
> + */
> +
> +#define SENTINEL ((void *)~(uintptr_t)0)
> +
> +#ifdef CONFIG_LLDSCD
> +/* Implement queue operations using double-word LL/SC */

> +
> +#else
> +/* Implement queue operations protected by a spin lock */
> +

There's a lot of ifdef'ed code in this file, basically two full parallel
implementations. The first is built only for ARM and the second for the rest.
Would there be a way to build both always ?

-Petri
[lng-odp] [API-NEXT PATCH v9 4/6] linux-gen: sched scalable: add a concurrent queue
Signed-off-by: Ola Liljedahl
Reviewed-by: Brian Brooks
---
 platform/linux-generic/Makefile.am           |   1 +
 platform/linux-generic/include/odp_llqueue.h | 309 +++
 2 files changed, 310 insertions(+)
 create mode 100644 platform/linux-generic/include/odp_llqueue.h

diff --git a/platform/linux-generic/Makefile.am b/platform/linux-generic/Makefile.am
index b869fd4b..90cc4ca6 100644
--- a/platform/linux-generic/Makefile.am
+++ b/platform/linux-generic/Makefile.am
@@ -157,6 +157,7 @@ noinst_HEADERS = \
 		  ${srcdir}/include/odp_errno_define.h \
 		  ${srcdir}/include/odp_forward_typedefs_internal.h \
 		  ${srcdir}/include/odp_internal.h \
+		  ${srcdir}/include/odp_llqueue.h \
 		  ${srcdir}/include/odp_name_table_internal.h \
 		  ${srcdir}/include/odp_packet_internal.h \
 		  ${srcdir}/include/odp_packet_io_internal.h \
diff --git a/platform/linux-generic/include/odp_llqueue.h b/platform/linux-generic/include/odp_llqueue.h
new file mode 100644
index ..758af490
--- /dev/null
+++ b/platform/linux-generic/include/odp_llqueue.h
@@ -0,0 +1,309 @@
+/* Copyright (c) 2017, ARM Limited.
+ * All rights reserved.
+ *
+ * SPDX-License-Identifier:BSD-3-Clause
+ */
+
+#ifndef ODP_LLQUEUE_H_
+#define ODP_LLQUEUE_H_
+
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+#include
+#include
+
+/**
+ * Linked list queues
+ */
+
+struct llqueue;
+struct llnode;
+
+static struct llnode *llq_head(struct llqueue *llq);
+static void llqueue_init(struct llqueue *llq);
+static void llq_enqueue(struct llqueue *llq, struct llnode *node);
+static struct llnode *llq_dequeue(struct llqueue *llq);
+static odp_bool_t llq_dequeue_cond(struct llqueue *llq, struct llnode *exp);
+static odp_bool_t llq_cond_rotate(struct llqueue *llq, struct llnode *node);
+static odp_bool_t llq_on_queue(struct llnode *node);
+
+/**
+ * The implementation(s)
+ */
+
+#define SENTINEL ((void *)~(uintptr_t)0)
+
+#ifdef CONFIG_LLDSCD
+/* Implement queue operations using double-word LL/SC */
+
+/* The scalar equivalent of a double pointer */
+#if __SIZEOF_PTRDIFF_T__ == 4
+typedef uint64_t dintptr_t;
+#endif
+#if __SIZEOF_PTRDIFF_T__ == 8
+typedef __int128 dintptr_t;
+#endif
+
+struct llnode {
+	struct llnode *next;
+};
+
+union llht {
+	struct {
+		struct llnode *head, *tail;
+	} st;
+	dintptr_t ui;
+};
+
+struct llqueue {
+	union llht u;
+};
+
+static inline struct llnode *llq_head(struct llqueue *llq)
+{
+	return __atomic_load_n(&llq->u.st.head, __ATOMIC_RELAXED);
+}
+
+static inline void llqueue_init(struct llqueue *llq)
+{
+	llq->u.st.head = NULL;
+	llq->u.st.tail = NULL;
+}
+
+static inline void llq_enqueue(struct llqueue *llq, struct llnode *node)
+{
+	union llht old, neu;
+
+	ODP_ASSERT(node->next == NULL);
+	node->next = SENTINEL;
+	do {
+		old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+		neu.st.head = old.st.head == NULL ? node : old.st.head;
+		neu.st.tail = node;
+	} while (odp_unlikely(scd(&llq->u.ui, neu.ui, __ATOMIC_RELEASE)));
+	if (old.st.tail != NULL) {
+		/* List was not empty */
+		ODP_ASSERT(old.st.tail->next == SENTINEL);
+		old.st.tail->next = node;
+	}
+}
+
+static inline struct llnode *llq_dequeue(struct llqueue *llq)
+{
+	struct llnode *head;
+	union llht old, neu;
+
+	/* llq_dequeue() may be used in a busy-waiting fashion
+	 * Read head using plain load to avoid disturbing remote LL/SC
+	 */
+	head = __atomic_load_n(&llq->u.st.head, __ATOMIC_ACQUIRE);
+	if (head == NULL)
+		return NULL;
+	/* Read head->next before LL to minimize cache miss latency
+	 * in LL/SC below
+	 */
+	(void)__atomic_load_n(&head->next, __ATOMIC_RELAXED);
+
+	do {
+		old.ui = lld(&llq->u.ui, __ATOMIC_RELAXED);
+		if (odp_unlikely(old.st.head == NULL)) {
+			/* Empty list */
+			return NULL;
+		} else if (odp_unlikely(old.st.head == old.st.tail)) {
+			/* Single-element in list */
+			neu.st.head = NULL;
+			neu.st.tail = NULL;
+		} else {
+			/* Multi-element list, dequeue head */
+			struct llnode *next;
+			/* Wait until llq_enqueue() has written true next
+			 * pointer
+			 */