On Mon, 2018-02-12 at 13:58 -0800, Yonghong Song wrote:
> There is a memory leak happening in lpm_trie map_free callback
> function trie_free. The trie structure itself does not get freed.
> 
> Also, trie_free function did not do synchronize_rcu before freeing
> various data structures. This is incorrect as some rcu_read_lock
> region(s) for lookup, update, delete or get_next_key may not complete yet.
> The fix is to add synchronize_rcu in the beginning of trie_free.
> The useless spin_lock is removed from this function as well.
> 
> Fixes: b95a5c4db09b ("bpf: add a longest prefix match trie map 
> implementation")
> Reported-by: Mathieu Malaterre <ma...@debian.org>
> Reported-by: Alexei Starovoitov <a...@kernel.org>
> Tested-by: Mathieu Malaterre <ma...@debian.org>
> Signed-off-by: Yonghong Song <y...@fb.com>
> ---
>  kernel/bpf/lpm_trie.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/lpm_trie.c b/kernel/bpf/lpm_trie.c
> index 7b469d1..9b41ea4 100644
> --- a/kernel/bpf/lpm_trie.c
> +++ b/kernel/bpf/lpm_trie.c
> @@ -555,7 +555,12 @@ static void trie_free(struct bpf_map *map)
>       struct lpm_trie_node __rcu **slot;
>       struct lpm_trie_node *node;
>  
> -     raw_spin_lock(&trie->lock);
> +     /* at this point bpf_prog->aux->refcnt == 0 and this map->refcnt == 0,
> +      * so the programs (can be more than one that used this map) were
> +      * disconnected from events. Wait for outstanding programs to complete
> +      * update/lookup/delete/get_next_key and free the trie.
> +      */
> +     synchronize_rcu();
>  

Please do not do that.

Use kfree_rcu() instead (adding one struct rcu_head in struct lpm_trie)

>       /* Always start at the root and walk down to a node that has no
>        * children. Then free that node, nullify its reference in the parent
> @@ -588,7 +593,7 @@ static void trie_free(struct bpf_map *map)
>       }
>  
>  unlock:
> -     raw_spin_unlock(&trie->lock);
> +     kfree(trie);
>  }
>  
>  static int trie_get_next_key(struct bpf_map *map, void *_key, void 
> *_next_key)

Reply via email to