Re: [PATCH -v2 4/6] locking, arch: Update spin_unlock_wait()
On 5/27/2016 5:05 AM, Peter Zijlstra wrote:
> On Thu, May 26, 2016 at 05:10:36PM -0400, Chris Metcalf wrote:
> > On 5/26/2016 10:19 AM, Peter Zijlstra wrote:
> > > --- a/arch/tile/lib/spinlock_32.c
> > > +++ b/arch/tile/lib/spinlock_32.c
> > > @@ -72,10 +72,14 @@ void arch_spin_unlock_wait(arch_spinlock
> > >  	if (next == curr)
> > >  		return;
> > > 
> > > +	smp_rmb();
> > > +
> > >  	/* Wait until the current locker has released the lock. */
> > >  	do {
> > >  		delay_backoff(iterations++);
> > >  	} while (READ_ONCE(lock->current_ticket) == curr);
> > > +
> > > +	smp_acquire__after_ctrl_dep();
> > >  }
> > >  EXPORT_SYMBOL(arch_spin_unlock_wait);
> > > 
> > > --- a/arch/tile/lib/spinlock_64.c
> > > +++ b/arch/tile/lib/spinlock_64.c
> > > @@ -72,10 +72,14 @@ void arch_spin_unlock_wait(arch_spinlock
> > >  	if (arch_spin_next(val) == curr)
> > >  		return;
> > > 
> > > +	smp_rmb();
> > > +
> > >  	/* Wait until the current locker has released the lock. */
> > >  	do {
> > >  		delay_backoff(iterations++);
> > >  	} while (arch_spin_current(READ_ONCE(lock->lock)) == curr);
> > > +
> > > +	smp_acquire__after_ctrl_dep();
> > >  }
> > >  EXPORT_SYMBOL(arch_spin_unlock_wait);
> > 
> > The smp_rmb() are unnecessary for tile. We READ_ONCE next/curr from the
> > lock and compare them, so we know the load(s) are complete. There's no
> > microarchitectural speculation going on so that's that. Then we READ_ONCE
> > the next load on the lock from within the wait loop, so our load/load
> > ordering is guaranteed.
> 
> Does TILE never speculate reads? Because in that case the control
> dependency already provides a full load->load,store barrier and you'd
> want smp_acquire__after_ctrl_dep() to be a barrier() instead of
> smp_rmb().

Yes, that's a good point. I didn't look at the definition of
smp_acquire__after_ctrl_dep(), but it certainly sounds like that's
exactly a compiler barrier for tile. There is no load speculation
performed.

The only out-of-order stuff that happens is in the memory subsystem:
stores will become visible in arbitrary order, and loads will arrive in
arbitrary order, but as soon as the result of a load is used in any
other kind of instruction, the instruction issue will halt until the
pending load(s) for the instruction operands are available.

-- 
Chris Metcalf, Mellanox Technologies
http://www.mellanox.com

-- 
To unsubscribe from this list: send the line "unsubscribe netfilter-devel"
in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
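[Editor's note: a userspace model of the ticket-lock wait discussed above may help readers outside arch/tile. This is a sketch with illustrative names and C11 atomics; the real kernel code uses READ_ONCE(), delay_backoff() and smp_acquire__after_ctrl_dep() rather than atomic_thread_fence().]

```c
#include <stdatomic.h>
#include <stdbool.h>

struct ticket_lock {
	_Atomic unsigned next_ticket;    /* ticket handed to the next locker */
	_Atomic unsigned current_ticket; /* ticket currently being served */
};

static bool ticket_is_locked(struct ticket_lock *l)
{
	/* The lock is held while a ticket has been handed out but not served. */
	return atomic_load_explicit(&l->next_ticket, memory_order_relaxed) !=
	       atomic_load_explicit(&l->current_ticket, memory_order_relaxed);
}

static void ticket_unlock_wait(struct ticket_lock *l)
{
	unsigned curr = atomic_load_explicit(&l->current_ticket,
					     memory_order_relaxed);
	unsigned next = atomic_load_explicit(&l->next_ticket,
					     memory_order_relaxed);

	if (next == curr)	/* not locked: nothing to wait for */
		return;

	/* Spin until the current holder's ticket is retired. */
	while (atomic_load_explicit(&l->current_ticket,
				    memory_order_relaxed) == curr)
		;

	/* Upgrade the control dependency on the final load to acquire
	 * ordering, mirroring smp_acquire__after_ctrl_dep() above. */
	atomic_thread_fence(memory_order_acquire);
}
```

On an architecture that never speculates loads, as described for tile, the acquire fence can degrade to a pure compiler barrier because the control dependency already orders the subsequent accesses.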
[PATCH nf] netfilter: nf_tables: validate NFTA_SET_TABLE parameter
If the NFTA_SET_TABLE parameter is missing and the NLM_F_DUMP flag is
not set, then a NULL pointer dereference is triggered in
nf_tables_set_lookup() because ctx.table is NULL.

Signed-off-by: Phil Turnbull
---
 net/netfilter/nf_tables_api.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index c444929..569ba6a 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2647,6 +2647,8 @@ static int nf_tables_getset(struct net *net, struct sock *nlsk,
 		/* Only accept unspec with dump */
 		if (nfmsg->nfgen_family == NFPROTO_UNSPEC)
 			return -EAFNOSUPPORT;
+	if (!nla[NFTA_SET_TABLE])
+		return -EINVAL;
 
 	set = nf_tables_set_lookup(ctx.table, nla[NFTA_SET_NAME]);
 	if (IS_ERR(set))
-- 
2.8.0
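[Editor's note: the defensive pattern this fix applies — validate a mandatory attribute before dereferencing the context it establishes — can be sketched in isolation. All names below are illustrative, not the kernel API.]

```c
#include <errno.h>
#include <stddef.h>

struct table { int dummy; };	/* stand-in for the nftables table context */

/* Reject the request when a mandatory attribute is absent instead of
 * dereferencing a NULL context pointer, mirroring the added check. */
static int getset_lookup(const struct table *table, const char *name_attr)
{
	if (table == NULL)	/* NFTA_SET_TABLE missing: no table context */
		return -EINVAL;
	if (name_attr == NULL)	/* NFTA_SET_NAME missing */
		return -EINVAL;
	return 0;		/* the real code would do the set lookup here */
}
```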
[PATCH 1/2 libnftnl] set_elem: Copy user data memory
All attributes are passed by copy, so user data should be copied too.

Signed-off-by: Carlos Falgueras García
---
 src/set_elem.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/set_elem.c b/src/set_elem.c
index 990be24..b9c7e1e 100644
--- a/src/set_elem.c
+++ b/src/set_elem.c
@@ -125,8 +125,14 @@ void nftnl_set_elem_set(struct nftnl_set_elem *s, uint16_t attr,
 		s->timeout = *((uint64_t *)data);
 		break;
 	case NFTNL_SET_ELEM_USERDATA: /* NFTA_SET_ELEM_USERDATA */
-		s->user.data = (void *)data;
-		s->user.len = data_len;
+		if (s->user.data != NULL)
+			xfree(s->user.data);
+
+		s->user.data = malloc(data_len);
+		if (!s->user.data)
+			return;
+		memcpy(s->user.data, data, data_len);
+		s->user.len = data_len;
 		break;
 	default:
 		return;
-- 
2.8.2
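[Editor's note: the copy-on-set semantics this patch introduces can be sketched standalone. Names are illustrative; unlike the patch, this version allocates before freeing, so a failed malloc leaves the old value intact rather than leaving the element empty.]

```c
#include <stdlib.h>
#include <string.h>

struct elem {
	void *data;
	size_t len;
};

/* The setter owns its own copy of the caller's buffer instead of
 * aliasing it, so the caller may free or reuse the buffer afterwards. */
static int elem_set_userdata(struct elem *e, const void *data, size_t len)
{
	void *copy = malloc(len);

	if (copy == NULL)
		return -1;	/* old value preserved on allocation failure */
	memcpy(copy, data, len);

	free(e->data);		/* free(NULL) is a no-op, no check needed */
	e->data = copy;
	e->len = len;
	return 0;
}
```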
[PATCH 2/2 nft] set_elem: Use libnftnl/udata to store set element comment
The set element comment is stored in nftnl_set_elem->user.data using the
libnftnl/udata infrastructure. This allows storing multiple variable-length
user data items in a set element.

Signed-off-by: Carlos Falgueras García
---
 src/netlink.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 4 deletions(-)

diff --git a/src/netlink.c b/src/netlink.c
index b0dcb90..f82d4fa 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include <libnftnl/udata.h>
 #include
 #include
 #include
@@ -208,6 +209,7 @@ static struct nftnl_set_elem *alloc_nftnl_setelem(const struct expr *expr)
 	const struct expr *elem, *key, *data;
 	struct nftnl_set_elem *nlse;
 	struct nft_data_linearize nld;
+	struct nftnl_udata_buf *udbuf;
 
 	nlse = nftnl_set_elem_alloc();
 	if (nlse == NULL)
@@ -228,9 +230,18 @@ static struct nftnl_set_elem *alloc_nftnl_setelem(const struct expr *expr)
 	if (elem->timeout)
 		nftnl_set_elem_set_u64(nlse, NFTNL_SET_ELEM_TIMEOUT,
 				       elem->timeout);
-	if (elem->comment)
+	if (elem->comment) {
+		udbuf = nftnl_udata_buf_alloc(NFT_USERDATA_MAXLEN);
+		if (!udbuf)
+			memory_allocation_error();
+		if (!nftnl_udata_put_strz(udbuf, UDATA_TYPE_COMMENT,
+					  elem->comment))
+			memory_allocation_error();
 		nftnl_set_elem_set(nlse, NFTNL_SET_ELEM_USERDATA,
-				   elem->comment, strlen(elem->comment) + 1);
+				   nftnl_udata_buf_data(udbuf),
+				   nftnl_udata_buf_len(udbuf));
+		nftnl_udata_buf_free(udbuf);
+	}
 
 	if (data != NULL) {
 		netlink_gen_data(data, &nld);
@@ -1421,6 +1432,38 @@ static struct expr *netlink_parse_concat_elem(const struct datatype *dtype,
 	return concat;
 }
 
+static int parse_udata_cb(const struct nftnl_udata *attr, void *data)
+{
+	unsigned char *value = nftnl_udata_get(attr);
+	uint8_t type = nftnl_udata_type(attr);
+	uint8_t len = nftnl_udata_len(attr);
+	const struct nftnl_udata **tb = data;
+
+	switch (type) {
+	case UDATA_TYPE_COMMENT:
+		if (value[len - 1] != '\0')
+			return -1;
+		break;
+	default:
+		return 0;
+	}
+	tb[type] = attr;
+	return 0;
+}
+
+static char *udata_get_comment(const void *data, uint32_t data_len)
+{
+	const struct nftnl_udata *tb[UDATA_TYPE_MAX + 1] = {};
+
+	if (nftnl_udata_parse(data, data_len, parse_udata_cb, tb) < 0)
+		return NULL;
+
+	if (!tb[UDATA_TYPE_COMMENT])
+		return NULL;
+
+	return xstrdup(nftnl_udata_get(tb[UDATA_TYPE_COMMENT]));
+}
+
 static int netlink_delinearize_setelem(struct nftnl_set_elem *nlse,
				       const struct set *set)
 {
@@ -1457,8 +1500,7 @@ static int netlink_delinearize_setelem(struct nftnl_set_elem *nlse,
 		uint32_t len;
 
 		data = nftnl_set_elem_get(nlse, NFTNL_SET_ELEM_USERDATA, &len);
-		expr->comment = xmalloc(len);
-		memcpy((char *)expr->comment, data, len);
+		expr->comment = udata_get_comment(data, len);
 	}
 	if (nftnl_set_elem_is_set(nlse, NFT_SET_ELEM_ATTR_EXPR)) {
 		const struct nftnl_expr *nle;
-- 
2.8.2
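[Editor's note: the type/length/value walk that nftnl_udata_parse() performs for udata_get_comment() can be sketched without libnftnl. Each attribute is a 1-byte type, a 1-byte length, then `len` payload bytes; all names below are illustrative, not the libnftnl API.]

```c
#include <stdint.h>
#include <string.h>

typedef int (*udata_cb)(uint8_t type, const uint8_t *value, uint8_t len,
			void *ctx);

/* Walk a buffer of TLV attributes, invoking the callback on each one.
 * Stops with -1 on a truncated attribute or a callback error. */
static int udata_walk(const uint8_t *buf, uint32_t buflen,
		      udata_cb cb, void *ctx)
{
	uint32_t off = 0;

	while (off + 2 <= buflen) {
		uint8_t type = buf[off];
		uint8_t len = buf[off + 1];
		int ret;

		if (off + 2 + len > buflen)	/* truncated attribute */
			return -1;
		ret = cb(type, buf + off + 2, len, ctx);
		if (ret < 0)
			return ret;
		off += 2 + (uint32_t)len;
	}
	return 0;
}

#define UDATA_TYPE_COMMENT_SKETCH 0	/* illustrative type value */

struct comment_ctx {
	const char *comment;
};

/* Callback mirroring parse_udata_cb(): remember the comment attribute
 * and insist that it is NUL-terminated, as the patch does. */
static int comment_cb(uint8_t type, const uint8_t *value, uint8_t len,
		      void *ctx)
{
	struct comment_ctx *c = ctx;

	if (type != UDATA_TYPE_COMMENT_SKETCH)
		return 0;	/* ignore unknown attribute types */
	if (len == 0 || value[len - 1] != '\0')
		return -1;	/* comment must be NUL-terminated */
	c->comment = (const char *)value;
	return 0;
}
```

Validating the terminator inside the callback is what makes it safe for udata_get_comment() to hand the payload straight to xstrdup().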
Re: [PATCH -v2 4/6] locking, arch: Update spin_unlock_wait()
On Thu, May 26, 2016 at 05:10:36PM -0400, Chris Metcalf wrote:
> On 5/26/2016 10:19 AM, Peter Zijlstra wrote:
> >--- a/arch/tile/lib/spinlock_32.c
> >+++ b/arch/tile/lib/spinlock_32.c
> >@@ -72,10 +72,14 @@ void arch_spin_unlock_wait(arch_spinlock
> > 	if (next == curr)
> > 		return;
> >+	smp_rmb();
> >+
> > 	/* Wait until the current locker has released the lock. */
> > 	do {
> > 		delay_backoff(iterations++);
> > 	} while (READ_ONCE(lock->current_ticket) == curr);
> >+
> >+	smp_acquire__after_ctrl_dep();
> > }
> > EXPORT_SYMBOL(arch_spin_unlock_wait);
> >--- a/arch/tile/lib/spinlock_64.c
> >+++ b/arch/tile/lib/spinlock_64.c
> >@@ -72,10 +72,14 @@ void arch_spin_unlock_wait(arch_spinlock
> > 	if (arch_spin_next(val) == curr)
> > 		return;
> >+	smp_rmb();
> >+
> > 	/* Wait until the current locker has released the lock. */
> > 	do {
> > 		delay_backoff(iterations++);
> > 	} while (arch_spin_current(READ_ONCE(lock->lock)) == curr);
> >+
> >+	smp_acquire__after_ctrl_dep();
> > }
> > EXPORT_SYMBOL(arch_spin_unlock_wait);
> 
> The smp_rmb() are unnecessary for tile. We READ_ONCE next/curr from the
> lock and compare them, so we know the load(s) are complete. There's no
> microarchitectural speculation going on so that's that. Then we READ_ONCE
> the next load on the lock from within the wait loop, so our load/load
> ordering is guaranteed.

Does TILE never speculate reads? Because in that case the control
dependency already provides a full load->load,store barrier and you'd
want smp_acquire__after_ctrl_dep() to be a barrier() instead of
smp_rmb().
Re: [PATCH -v2 4/6] locking, arch: Update spin_unlock_wait()
On Fri, May 27, 2016 at 08:46:49AM +0200, Martin Schwidefsky wrote:
> > This fixes a number of spin_unlock_wait() users that (not
> > unreasonably) rely on this.
> 
> All that is missing is an smp_rmb(), no?

Indeed.

> > --- a/arch/s390/include/asm/spinlock.h
> > +++ b/arch/s390/include/asm/spinlock.h
> > @@ -95,8 +95,7 @@ static inline void arch_spin_unlock(arch
> > 
> >  static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
> >  {
> > -	while (arch_spin_is_locked(lock))
> > -		arch_spin_relax(lock);
> > +	smp_cond_load_acquire(&lock->lock, !VAL);
> >  }
> > 
> >  /*
> 
> This change adds the smp_rmb() at the end of the waiting loop, but
> it also replaces arch_spin_relax() alias arch_lock_relax() with a
> cpu_relax(). This is not good, these two functions do *very* different
> things. cpu_relax() does an undirected yield with diagnose 0x44 but
> only if the system is non-SMT. arch_lock_relax() does an additional
> cpu_is_preempted() to test if the target cpu is running and does a
> directed yield with diagnose 0x9c.
> 
> Why can't we just add the smp_rmb() to the arch_spin_unlock_wait()?

We can; I forgot about the special cpu_relax on s390, will fix.
Re: [PATCH -v2 4/6] locking, arch: Update spin_unlock_wait()
On Thu, 26 May 2016 16:19:26 +0200
Peter Zijlstra wrote:

> This patch updates/fixes all spin_unlock_wait() implementations.
>
> The update is in semantics; where it previously was only a control
> dependency, we now upgrade to a full load-acquire to match the
> store-release from the spin_unlock() we waited on. This ensures that
> when spin_unlock_wait() returns, we're guaranteed to observe the full
> critical section we waited on.
>
> This fixes a number of spin_unlock_wait() users that (not
> unreasonably) rely on this.

All that is missing is an smp_rmb(), no?

> --- a/arch/s390/include/asm/spinlock.h
> +++ b/arch/s390/include/asm/spinlock.h
> @@ -95,8 +95,7 @@ static inline void arch_spin_unlock(arch
> 
>  static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
>  {
> -	while (arch_spin_is_locked(lock))
> -		arch_spin_relax(lock);
> +	smp_cond_load_acquire(&lock->lock, !VAL);
>  }
> 
>  /*

This change adds the smp_rmb() at the end of the waiting loop, but
it also replaces arch_spin_relax() alias arch_lock_relax() with a
cpu_relax(). This is not good, these two functions do *very* different
things. cpu_relax() does an undirected yield with diagnose 0x44 but
only if the system is non-SMT. arch_lock_relax() does an additional
cpu_is_preempted() to test if the target cpu is running and does a
directed yield with diagnose 0x9c.

Why can't we just add the smp_rmb() to the arch_spin_unlock_wait()?

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.
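[Editor's note: the semantics of smp_cond_load_acquire(&lock->lock, !VAL) debated above can be modeled in userspace. A sketch with C11 atomics and illustrative names; the kernel primitive additionally uses cpu_relax() or, as Martin notes for s390, a directed-yield variant while spinning.]

```c
#include <stdatomic.h>

/* Spin with relaxed loads until the word becomes zero, then promote the
 * final load to acquire ordering so the waiter observes everything the
 * releasing CPU did inside its critical section. */
static unsigned cond_load_acquire_until_zero(_Atomic unsigned *p)
{
	unsigned v;

	while ((v = atomic_load_explicit(p, memory_order_relaxed)) != 0)
		;	/* a real implementation would relax or yield here */

	atomic_thread_fence(memory_order_acquire);
	return v;
}
```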