[PATCH v3] lib/flex_proportions.c: cleanup __fprop_inc_percpu_max
If the given type has a fraction smaller than max_frac/FPROP_FRAC_BASE,
the code can call __fprop_inc_percpu() directly, which is easier to
understand.

After this patch, fprop_reflect_period_percpu() will be called twice,
but the second call returns quickly on the pl->period == p->period
test, so it should not cause a significant performance penalty.

Thanks for Jan's guidance.

Signed-off-by: Tan Hu
Reviewed-by: Jan Kara
---
 lib/flex_proportions.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/flex_proportions.c b/lib/flex_proportions.c
index 7852bfff5..451543937 100644
--- a/lib/flex_proportions.c
+++ b/lib/flex_proportions.c
@@ -266,8 +266,7 @@ void __fprop_inc_percpu_max(struct fprop_global *p,
 		if (numerator >
 		    (((u64)denominator) * max_frac) >> FPROP_FRAC_SHIFT)
 			return;
-	} else
-		fprop_reflect_period_percpu(p, pl);
-	percpu_counter_add_batch(&pl->events, 1, PROP_BATCH);
-	percpu_counter_add(&p->events, 1);
+	}
+
+	__fprop_inc_percpu(p, pl);
 }
-- 
2.19.1
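
The claim that the second fprop_reflect_period_percpu() call is cheap can
be illustrated with a small user-space model. This is only a sketch: the
model_* structures and functions are hypothetical stand-ins, plain
integers replace the per-cpu counters, locking is omitted, and the model
assumes the fraction check reflects the period before reading the
counters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins for fprop_global / fprop_local_percpu. */
struct model_global { unsigned int period; uint64_t events; };
struct model_local  { unsigned int period; uint64_t events; int age_calls; };

/* Age the local count only when it lags behind the global period. */
void model_reflect_period(struct model_global *p, struct model_local *pl)
{
	unsigned int period;

	assert(p && pl);
	period = p->period;
	if (pl->period == period)	/* fast path: already up to date */
		return;

	/* Halve the count once per missed period; zero it if too old. */
	if (period - pl->period < 64)
		pl->events >>= (period - pl->period);
	else
		pl->events = 0;
	pl->period = period;
	pl->age_calls++;		/* model-only: count slow-path hits */
}

/* Models __fprop_inc_percpu(): reflect the period, then count the event. */
void model_inc(struct model_global *p, struct model_local *pl)
{
	model_reflect_period(p, pl);
	pl->events++;
	p->events++;
}

/* Models the patched __fprop_inc_percpu_max() without the max_frac test. */
void model_inc_max(struct model_global *p, struct model_local *pl,
		   int check_fraction)
{
	if (check_fraction)
		model_reflect_period(p, pl);	/* assumed part of the fraction check */
	model_inc(p, pl);			/* second reflect takes the fast path */
}
```

In this model, a local counter that is two periods behind is aged once
by the first call; the second call inside model_inc() only compares the
two period values and returns.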
[PATCH v2] lib/flex_proportions.c: cleanup __fprop_inc_percpu_max
If the given type has a fraction smaller than max_frac/FPROP_FRAC_BASE,
the code can call __fprop_inc_percpu() directly, which is easier to
understand.

After this patch, fprop_reflect_period_percpu() will be called twice,
but the second call returns quickly on the pl->period == p->period
test, so it should not cause a significant performance penalty.

Thanks for Jan's guidance.

Signed-off-by: Tan Hu
---
 lib/flex_proportions.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/flex_proportions.c b/lib/flex_proportions.c
index 7852bfff5..451543937 100644
--- a/lib/flex_proportions.c
+++ b/lib/flex_proportions.c
@@ -266,8 +266,7 @@ void __fprop_inc_percpu_max(struct fprop_global *p,
 		if (numerator >
 		    (((u64)denominator) * max_frac) >> FPROP_FRAC_SHIFT)
 			return;
-	} else
-		fprop_reflect_period_percpu(p, pl);
-	percpu_counter_add_batch(&pl->events, 1, PROP_BATCH);
-	percpu_counter_add(&p->events, 1);
+	}
+
+	__fprop_inc_percpu(p, pl);
 }
-- 
2.19.1
[PATCH] lib/flex_proportions.c: aging counts when fraction smaller than max_frac/FPROP_FRAC_BASE
If the given type has a fraction smaller than max_frac/FPROP_FRAC_BASE,
__fprop_inc_percpu_max() should follow the design formula and age the
fraction too.

Signed-off-by: Tan Hu
---
 lib/flex_proportions.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/lib/flex_proportions.c b/lib/flex_proportions.c
index 7852bfff5..451543937 100644
--- a/lib/flex_proportions.c
+++ b/lib/flex_proportions.c
@@ -266,8 +266,7 @@ void __fprop_inc_percpu_max(struct fprop_global *p,
 		if (numerator >
 		    (((u64)denominator) * max_frac) >> FPROP_FRAC_SHIFT)
 			return;
-	} else
-		fprop_reflect_period_percpu(p, pl);
-	percpu_counter_add_batch(&pl->events, 1, PROP_BATCH);
-	percpu_counter_add(&p->events, 1);
+	}
+
+	__fprop_inc_percpu(p, pl);
 }
-- 
2.19.1
[PATCH] proc: only export statistics of softirqs for online cpus
Only export softirq statistics for online CPUs, as /proc/interrupts
does; this makes the output clearer.

Signed-off-by: Tan Hu
---
 fs/proc/softirqs.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/proc/softirqs.c b/fs/proc/softirqs.c
index 12901dc..b7e6a0f 100644
--- a/fs/proc/softirqs.c
+++ b/fs/proc/softirqs.c
@@ -12,13 +12,13 @@ static int show_softirqs(struct seq_file *p, void *v)
 	int i, j;
 
 	seq_puts(p, "");
-	for_each_possible_cpu(i)
+	for_each_online_cpu(i)
 		seq_printf(p, "CPU%-8d", i);
 	seq_putc(p, '\n');
 
 	for (i = 0; i < NR_SOFTIRQS; i++) {
 		seq_printf(p, "%12s:", softirq_to_name[i]);
-		for_each_possible_cpu(j)
+		for_each_online_cpu(j)
 			seq_printf(p, " %10u", kstat_softirqs_cpu(i, j));
 		seq_putc(p, '\n');
 	}
-- 
1.8.3.1
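
To show the effect on the output width, here is a rough user-space
sketch of one row of the table. It assumes a plain bitmask in place of
the kernel cpumask; MODEL_NR_CPUS and model_show_row() are hypothetical
names used only for illustration. The format strings are the ones from
the patch context.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MODEL_NR_CPUS 8	/* hypothetical: number of "possible" CPUs */

/*
 * Format one softirq row into buf, emitting a column only for CPUs whose
 * bit is set in mask. Returns the number of columns written.
 */
int model_show_row(char *buf, size_t len, const char *name,
		   const unsigned int counts[MODEL_NR_CPUS], unsigned int mask)
{
	size_t off;
	int cols = 0;
	int cpu;

	assert(buf && len > 0);
	off = (size_t)snprintf(buf, len, "%12s:", name);
	for (cpu = 0; cpu < MODEL_NR_CPUS && off < len; cpu++) {
		if (!(mask & (1u << cpu)))
			continue;	/* CPU not in the mask: no column */
		off += (size_t)snprintf(buf + off, len - off, " %10u",
					counts[cpu]);
		cols++;
	}
	return cols;
}
```

Passing a mask with every bit set models the old for_each_possible_cpu()
behaviour, where every possible CPU gets a column. Passing only the bits
of the online CPUs models the patched behaviour.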
[PATCH v3] ipvs: fix race between ip_vs_conn_new() and ip_vs_del_dest()
We came across an infinite loop in ipvs when using ipvs in a docker env.

When ipvs receives a new packet and cannot find an ipvs connection, it
creates a new connection. If the dest is then unavailable (i.e.
IP_VS_DEST_F_AVAILABLE is not set), the packet is dropped silently.

But if the dropped packet is the first packet of this connection, the
connection control timer never has a chance to start and the ipvs
connection cannot be released. This leads to a memory leak, or to an
infinite loop in cleanup_net() when the net namespace is released, like
this:

    ip_vs_conn_net_cleanup at a0a9f31a [ip_vs]
    __ip_vs_cleanup at a0a9f60a [ip_vs]
    ops_exit_list at 81567a49
    cleanup_net at 81568b40
    process_one_work at 810a851b
    worker_thread at 810a9356
    kthread at 810b0b6f
    ret_from_fork at 81697a18

race condition:

    CPU1                                   CPU2
    ip_vs_in()
      ip_vs_conn_new()
                                           ip_vs_del_dest()
                                             __ip_vs_unlink_dest()
                                               ~IP_VS_DEST_F_AVAILABLE
      cp->dest && !IP_VS_DEST_F_AVAILABLE
      __ip_vs_conn_put
    ...
    cleanup_net  ---> infinite looping

Fix this by checking whether the timer has already started.

Signed-off-by: Tan Hu
Reviewed-by: Jiang Biao
---
v2: fix use-after-free in CONN_ONE_PACKET case suggested by Julian Anastasov
v3: remove trailing whitespace for patch checking

 net/netfilter/ipvs/ip_vs_core.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 0679dd1..a17104f 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -1972,13 +1972,20 @@ static int ip_vs_in_icmp_v6(struct netns_ipvs *ipvs, struct sk_buff *skb,
 
 	if (cp->dest && !(cp->dest->flags & IP_VS_DEST_F_AVAILABLE)) {
 		/* the destination server is not available */
 
-		if (sysctl_expire_nodest_conn(ipvs)) {
+		__u32 flags = cp->flags;
+
+		/* when timer already started, silently drop the packet.*/
+		if (timer_pending(&cp->timer))
+			__ip_vs_conn_put(cp);
+		else
+			ip_vs_conn_put(cp);
+
+		if (sysctl_expire_nodest_conn(ipvs) &&
+		    !(flags & IP_VS_CONN_F_ONE_PACKET)) {
 			/* try to expire the connection immediately */
 			ip_vs_conn_expire_now(cp);
 		}
-		/* don't restart its timer, and silently
-		   drop the packet. */
-		__ip_vs_conn_put(cp);
+
 		return NF_DROP;
 	}
-- 
1.8.3.1
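
The leak described above can be sketched with a minimal user-space
model. Everything named model_* is hypothetical, and the model assumes
the simplest reading of the two helpers: __ip_vs_conn_put() only drops
the reference, while ip_vs_conn_put() also arms the connection timer.
Locking and the ONE_PACKET case are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct ip_vs_conn: a refcount and a timer flag. */
struct model_conn {
	int refcnt;
	bool timer_pending;
};

/* Models __ip_vs_conn_put(): drop the reference, leave the timer alone. */
void model_conn_put_notimer(struct model_conn *cp)
{
	assert(cp && cp->refcnt > 0);
	cp->refcnt--;
}

/* Models ip_vs_conn_put(): arm the expiry timer, then drop the reference. */
void model_conn_put(struct model_conn *cp)
{
	assert(cp && cp->refcnt > 0);
	cp->timer_pending = true;
	cp->refcnt--;
}

/* Drop path before the patch: never arms the timer. */
void model_drop_before(struct model_conn *cp)
{
	model_conn_put_notimer(cp);
}

/* Drop path after the patch: arm the timer unless it is already running. */
void model_drop_after(struct model_conn *cp)
{
	if (cp->timer_pending)
		model_conn_put_notimer(cp);
	else
		model_conn_put(cp);
}

/* In this model a connection is only ever released by its timer. */
bool model_will_expire(const struct model_conn *cp)
{
	return cp->timer_pending && cp->refcnt == 0;
}
```

For a freshly created connection (refcnt 1, no timer), the old drop path
leaves a connection that no timer will ever release, which is the state
cleanup_net() would wait on. The patched path arms the timer first, so
the connection can expire.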