Re: [PATCH v4.14-stable] sch_netem: restore skb->dev after dequeuing from the rbtree

2018-10-30 Thread Eduardo Valentin
Greg,

On Thu, Oct 18, 2018 at 03:43:48PM -0700, David Miller wrote:
> From: Christoph Paasch 
> Date: Thu, 18 Oct 2018 13:38:40 -0700
> 
> > Upstream commit bffa72cf7f9d ("net: sk_buff rbnode reorg") got
> > backported as commit 6b921536f170 ("net: sk_buff rbnode reorg") into the
> > v4.14.x-tree.
> > 
> > However, the backport does not include the changes in sch_netem.c
> > 
> > We need these, as otherwise the skb->dev pointer is not set when
> > dequeueing from the netem rbtree, resulting in a panic:
>  ...
> > Fixes: 6b921536f170 ("net: sk_buff rbnode reorg")
> > Cc: Stephen Hemminger 
> > Cc: Eric Dumazet 
> > Cc: Soheil Hassas Yeganeh 
> > Cc: Wei Wang 
> > Cc: Willem de Bruijn 
> > Signed-off-by: Christoph Paasch 
> > ---
> > 
> > Notes:
> > This patch should only make it into v4.14-stable as that's the only 
> > branch where
> > the offending commit has been backported to.
> 
> Greg, please queue up.

Are you planing to queue this one ?

Looks to me it was a miss on the backport.

It seams that the backport was touching different files, and missed the change
on net/sched/sch_netem.c. So, to me, even if this patch may not follow the
strictly the rules of stable, as it is not a patch in upstream, seams to be a 
needed change, even if it is specific to stable linux-4.14.y.

> 

-- 
All the best,
Eduardo Valentin


Re: [PATCH bpf] bpf: use per htab salt for bucket hash

2018-08-23 Thread Eduardo Valentin
On Wed, Aug 22, 2018 at 11:49:37PM +0200, Daniel Borkmann wrote:
> All BPF hash and LRU maps currently have a known and global seed
> we feed into jhash() which is 0. This is suboptimal, thus fix it
> by generating a random seed upon hashtab setup time which we can
> later on feed into jhash() on lookup, update and deletions.
> 
> Fixes: 0f8e4bd8a1fc8 ("bpf: add hashtable type of eBPF maps")
> Signed-off-by: Daniel Borkmann 
> Acked-by: Alexei Starovoitov 

Reviewed-by: Eduardo Valentin 


> ---
>  kernel/bpf/hashtab.c | 23 +--
>  1 file changed, 13 insertions(+), 10 deletions(-)
> 
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 04b8eda..03cc59e 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include "percpu_freelist.h"
>  #include "bpf_lru_list.h"
> @@ -41,6 +42,7 @@ struct bpf_htab {
>   atomic_t count; /* number of elements in this hashtable */
>   u32 n_buckets;  /* number of hash buckets */
>   u32 elem_size;  /* size of each element in bytes */
> + u32 hashrnd;
>  };
>  
>  /* each htab element is struct htab_elem + key + value */
> @@ -371,6 +373,7 @@ static struct bpf_map *htab_map_alloc(union bpf_attr 
> *attr)
>   if (!htab->buckets)
>   goto free_htab;
>  
> + htab->hashrnd = get_random_int();
>   for (i = 0; i < htab->n_buckets; i++) {
>   INIT_HLIST_NULLS_HEAD(&htab->buckets[i].head, i);
>   raw_spin_lock_init(&htab->buckets[i].lock);
> @@ -402,9 +405,9 @@ static struct bpf_map *htab_map_alloc(union bpf_attr 
> *attr)
>   return ERR_PTR(err);
>  }
>  
> -static inline u32 htab_map_hash(const void *key, u32 key_len)
> +static inline u32 htab_map_hash(const void *key, u32 key_len, u32 hashrnd)
>  {
> - return jhash(key, key_len, 0);
> + return jhash(key, key_len, hashrnd);
>  }
>  
>  static inline struct bucket *__select_bucket(struct bpf_htab *htab, u32 hash)
> @@ -470,7 +473,7 @@ static void *__htab_map_lookup_elem(struct bpf_map *map, 
> void *key)
>  
>   key_size = map->key_size;
>  
> - hash = htab_map_hash(key, key_size);
> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>  
>   head = select_bucket(htab, hash);
>  
> @@ -597,7 +600,7 @@ static int htab_map_get_next_key(struct bpf_map *map, 
> void *key, void *next_key)
>   if (!key)
>   goto find_first_elem;
>  
> - hash = htab_map_hash(key, key_size);
> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>  
>   head = select_bucket(htab, hash);
>  
> @@ -824,7 +827,7 @@ static int htab_map_update_elem(struct bpf_map *map, void 
> *key, void *value,
>  
>   key_size = map->key_size;
>  
> - hash = htab_map_hash(key, key_size);
> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>  
>   b = __select_bucket(htab, hash);
>   head = &b->head;
> @@ -880,7 +883,7 @@ static int htab_lru_map_update_elem(struct bpf_map *map, 
> void *key, void *value,
>  
>   key_size = map->key_size;
>  
> - hash = htab_map_hash(key, key_size);
> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>  
>   b = __select_bucket(htab, hash);
>   head = &b->head;
> @@ -945,7 +948,7 @@ static int __htab_percpu_map_update_elem(struct bpf_map 
> *map, void *key,
>  
>   key_size = map->key_size;
>  
> - hash = htab_map_hash(key, key_size);
> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>  
>   b = __select_bucket(htab, hash);
>   head = &b->head;
> @@ -998,7 +1001,7 @@ static int __htab_lru_percpu_map_update_elem(struct 
> bpf_map *map, void *key,
>  
>   key_size = map->key_size;
>  
> - hash = htab_map_hash(key, key_size);
> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>  
>   b = __select_bucket(htab, hash);
>   head = &b->head;
> @@ -1071,7 +1074,7 @@ static int htab_map_delete_elem(struct bpf_map *map, 
> void *key)
>  
>   key_size = map->key_size;
>  
> - hash = htab_map_hash(key, key_size);
> +     hash = htab_map_hash(key, key_size, htab->hashrnd);
>   b = __select_bucket(htab, hash);
>   head = &b->head;
>  
> @@ -1103,7 +1106,7 @@ static int htab_lru_map_delete_elem(struct bpf_map 
> *map, void *key)
>  
>   key_size = map->key_size;
>  
> - hash = htab_map_hash(key, key_size);
> + hash = htab_map_hash(key, key_size, htab->hashrnd);
>   b = __select_bucket(htab, hash);
>   head = &b->head;
>  
> -- 
> 2.9.5
> 
> 

-- 
All the best,
Eduardo Valentin


Re: [PATCH v2 0/2] enable hires timer to timeout datagram socket

2017-09-08 Thread Eduardo Valentin
Hello,

On Fri, Sep 08, 2017 at 10:26:45AM -0700, David Miller wrote:
> From: David Woodhouse 
> Date: Fri, 08 Sep 2017 18:23:22 +0100
> 
> > I don't know that anyone's ever tried saying "show me the chapter and
> > verse of the documentation"
> 
> Do you know why I brought this up?  Because the person I am replying
> to told me that the syscall documentation should have suggested this
> or that.
> 
> That's why.

:-) My intention was for sure not to upset anybody.

Just to reiterate, the point of patch is simple, there was a change in behavior 
in the system call from one kernel version to the other. As I mentioned, I 
agree that the userspace could use other means to achieve the same, but still 
the system call behavior has changed.

> 
> So let's concentrate on the other aspects of my reply, ok?

I agree. I would prefer to understand here what is the technical reason not to 
accept these patches other than "use other system calls".


-- 
All the best,
Eduardo Valentin


Re: [PATCH v2 0/2] enable hires timer to timeout datagram socket

2017-09-08 Thread Eduardo Valentin
David,

On Tue, Aug 22, 2017 at 09:30:30PM -0700, David Miller wrote:
> From: Vallish Vaidyeshwara 
> Date: Wed, 23 Aug 2017 00:10:25 +
> 
> > I am submitting 2 patch series to enable hires timer to timeout
> > datagram sockets (AF_UNIX & AF_INET domain) and test code to test
> > timeout accuracy on these sockets.
> 
> This is not reasonable.
> 
> If you want high resolution events with real guarantees, please use
> the kernel interfaces which provide this as explained to you as
> feedback by other reviewers.

I understand the kernel provides other interfaces.

> 
> I'm not applying this, sorry.

However, this is a clear, the system call, from the net subsystem,  has changed 
in behavior across kernel versions. From application / userspace perspective, 
changing the system call without clear documentation or deprecation path, to 
me, looks like breaking userspace, isn't it?

If the correct recommendation is to use different system calls this should have 
been mentioned in system call documentation before changing its behavior, not 
expect the user to figure out after kernel release/upgrade, right?


-- 
All the best,
Eduardo Valentin


Re: [PATCH 1/1] bridge: mdb: fix leak on complete_info ptr on fail path

2017-07-12 Thread Eduardo Valentin
On Tue, Jul 11, 2017 at 08:02:33PM -0700, David Miller wrote:
> From: Eduardo Valentin 
> Date: Tue, 11 Jul 2017 14:55:12 -0700
> 
> > We currently get the following kmemleak report:
> > unreferenced object 0x8800039d9820 (size 32):
> >   comm "softirq", pid 0, jiffies 4295212383 (age 792.416s)
> >   hex dump (first 32 bytes):
> > 00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00  
> > 00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff  
> >   backtrace:
> > [] kmemleak_alloc+0x4a/0xa0
> > [] kmem_cache_alloc_trace+0xb8/0x1c0
> > [] __br_mdb_notify+0x2a3/0x300 [bridge]
> > [] br_mdb_notify+0x6e/0x70 [bridge]
> > [] br_multicast_add_group+0x109/0x150 [bridge]
> > [] br_ip6_multicast_add_group+0x58/0x60 [bridge]
> > [] br_multicast_rcv+0x1d5/0xdb0 [bridge]
> > [] br_handle_frame_finish+0xcf/0x510 [bridge]
> > [] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter]
> > [] br_nf_hook_thresh+0x48/0xb0 [br_netfilter]
> > [] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 
> > [br_netfilter]
> > [] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter]
> > [] br_nf_pre_routing+0x197/0x3d0 [br_netfilter]
> > [] nf_iterate+0x52/0x60
> > [] nf_hook_slow+0x5c/0xb0
> > [] br_handle_frame+0x1a4/0x2c0 [bridge]
> > 
> > This happens when switchdev_port_obj_add() fails. This patch
> > frees complete_info object in the fail path.
> 
> Applied, thanks.
> 

Thanks!

> I'm so glad I pushed back on your original patch :-)

man, me too !! :-)

> 
> > Cc: stable  # v4.9+
> 
> Please do not add stable tags to networking patches, I queue up and
> submit networking -stable changes myself upon request which I am doing
> in this case as well.

Oh, I see. I won't copy stable next time and I will request you to queue, when 
applicable.

> 
> Thanks.

-- 
All the best,
Eduardo Valentin


[PATCH 1/1] bridge: mdb: fix leak on complete_info ptr on fail path

2017-07-11 Thread Eduardo Valentin
We currently get the following kmemleak report:
unreferenced object 0x8800039d9820 (size 32):
  comm "softirq", pid 0, jiffies 4295212383 (age 792.416s)
  hex dump (first 32 bytes):
00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00  
00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff  
  backtrace:
[] kmemleak_alloc+0x4a/0xa0
[] kmem_cache_alloc_trace+0xb8/0x1c0
[] __br_mdb_notify+0x2a3/0x300 [bridge]
[] br_mdb_notify+0x6e/0x70 [bridge]
[] br_multicast_add_group+0x109/0x150 [bridge]
[] br_ip6_multicast_add_group+0x58/0x60 [bridge]
[] br_multicast_rcv+0x1d5/0xdb0 [bridge]
[] br_handle_frame_finish+0xcf/0x510 [bridge]
[] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter]
[] br_nf_hook_thresh+0x48/0xb0 [br_netfilter]
[] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 
[br_netfilter]
[] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter]
[] br_nf_pre_routing+0x197/0x3d0 [br_netfilter]
[] nf_iterate+0x52/0x60
[] nf_hook_slow+0x5c/0xb0
[] br_handle_frame+0x1a4/0x2c0 [bridge]

This happens when switchdev_port_obj_add() fails. This patch
frees complete_info object in the fail path.

Cc: stable  # v4.9+
Reviewed-by: Vallish Vaidyeshwara 
Signed-off-by: Eduardo Valentin 
---
 net/bridge/br_mdb.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index b084548..c1030f8 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -323,7 +323,8 @@ static void __br_mdb_notify(struct net_device *dev, struct 
net_bridge_port *p,
__mdb_entry_to_br_ip(entry, &complete_info->ip);
mdb.obj.complete_priv = complete_info;
mdb.obj.complete = br_mdb_complete;
-   switchdev_port_obj_add(port_dev, &mdb.obj);
+   if (switchdev_port_obj_add(port_dev, &mdb.obj))
+   kfree(complete_info);
}
} else if (port_dev && type == RTM_DELMDB) {
switchdev_port_obj_del(port_dev, &mdb.obj);
-- 
2.7.4



Re: [PATCH 1/1] bridge: mdb: report complete_info ptr as not a kmemleak

2017-07-11 Thread Eduardo Valentin
On Wed, Jul 05, 2017 at 09:05:14AM -0700, Eduardo Valentin wrote:
> On Tue, Jul 04, 2017 at 01:48:42AM -0700, David Miller wrote:
> > From: Eduardo Valentin 
> > Date: Mon, 3 Jul 2017 10:06:34 -0700
> > 
> > > We currently get the following kmemleak report:
> >  ...
> > > This patch flags the complete_info ptr object as not a leak as it will
> > > get freed when .complete_priv() is called,
> > 
> > We don't call .complete_priv().  We call .complete().
> 
> True, I can fix this wording in the commit next version.
> 
> > 
> >  for the br mdb case, it
> > > will be freed at br_mdb_complete().
> > > 
> > > Cc: stable  # v4.9+
> > > Reviewed-by: Vallish Vaidyeshwara 
> > > Signed-off-by: Eduardo Valentin 
> > 
> > I don't understand why kmemleak cannot see this.
> > 
> > We store the pointer globally when we do the switchdev_port_obv_add()
> > call and it should be able to see that.
> 
> I see and agree. But even then I get these reports consistently on RTM_NEWMDB 
> type.
> This is why I am proposing marking as a non-kmemleak.
> 
> To me, this is only a leak if .complete() never gets called.

Ok. So, in fact, I believe the problem I am seeing in more when 
switchdev_port_obj_add() fails.

I will send a patch for that.

> 
> 
> -- 
> All the best,
> Eduardo Valentin

-- 
All the best,
Eduardo Valentin


Re: [PATCH 1/1] bridge: mdb: report complete_info ptr as not a kmemleak

2017-07-05 Thread Eduardo Valentin
On Tue, Jul 04, 2017 at 01:48:42AM -0700, David Miller wrote:
> From: Eduardo Valentin 
> Date: Mon, 3 Jul 2017 10:06:34 -0700
> 
> > We currently get the following kmemleak report:
>  ...
> > This patch flags the complete_info ptr object as not a leak as it will
> > get freed when .complete_priv() is called,
> 
> We don't call .complete_priv().  We call .complete().

True, I can fix this wording in the commit next version.

> 
>  for the br mdb case, it
> > will be freed at br_mdb_complete().
> > 
> > Cc: stable  # v4.9+
> > Reviewed-by: Vallish Vaidyeshwara 
> > Signed-off-by: Eduardo Valentin 
> 
> I don't understand why kmemleak cannot see this.
> 
> We store the pointer globally when we do the switchdev_port_obv_add()
> call and it should be able to see that.

I see and agree. But even then I get these reports consistently on RTM_NEWMDB 
type.
This is why I am proposing marking as a non-kmemleak.

To me, this is only a leak if .complete() never gets called.


-- 
All the best,
Eduardo Valentin


[PATCH 1/1] bridge: mdb: report complete_info ptr as not a kmemleak

2017-07-03 Thread Eduardo Valentin
We currently get the following kmemleak report:
unreferenced object 0x8800039d9820 (size 32):
  comm "softirq", pid 0, jiffies 4295212383 (age 792.416s)
  hex dump (first 32 bytes):
00 0c e0 03 00 88 ff ff ff 02 00 00 00 00 00 00  
00 00 00 01 ff 11 00 02 86 dd 00 00 ff ff ff ff  
  backtrace:
[] kmemleak_alloc+0x4a/0xa0
[] kmem_cache_alloc_trace+0xb8/0x1c0
[] __br_mdb_notify+0x2a3/0x300 [bridge]
[] br_mdb_notify+0x6e/0x70 [bridge]
[] br_multicast_add_group+0x109/0x150 [bridge]
[] br_ip6_multicast_add_group+0x58/0x60 [bridge]
[] br_multicast_rcv+0x1d5/0xdb0 [bridge]
[] br_handle_frame_finish+0xcf/0x510 [bridge]
[] br_nf_hook_thresh.part.27+0xb/0x10 [br_netfilter]
[] br_nf_hook_thresh+0x48/0xb0 [br_netfilter]
[] br_nf_pre_routing_finish_ipv6+0x109/0x1d0 
[br_netfilter]
[] br_nf_pre_routing_ipv6+0xd0/0x14c [br_netfilter]
[] br_nf_pre_routing+0x197/0x3d0 [br_netfilter]
[] nf_iterate+0x52/0x60
[] nf_hook_slow+0x5c/0xb0
[] br_handle_frame+0x1a4/0x2c0 [bridge]

This patch flags the complete_info ptr object as not a leak as it will
get freed when .complete_priv() is called, for the br mdb case, it
will be freed at br_mdb_complete().

Cc: stable  # v4.9+
Reviewed-by: Vallish Vaidyeshwara 
Signed-off-by: Eduardo Valentin 
---
 net/bridge/br_mdb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index b084548..1c81546 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -319,6 +320,8 @@ static void __br_mdb_notify(struct net_device *dev, struct 
net_bridge_port *p,
if (port_dev && type == RTM_NEWMDB) {
complete_info = kmalloc(sizeof(*complete_info), GFP_ATOMIC);
if (complete_info) {
+   /* This pointer is freed in br_mdb_complete() */
+   kmemleak_not_leak(complete_info);
complete_info->port = p;
__mdb_entry_to_br_ip(entry, &complete_info->ip);
mdb.obj.complete_priv = complete_info;
-- 
2.7.4