Re: [tip:sched/urgent] sched: Fix crash in sched_init_numa()

2016-02-08 Thread Raghavendra K T
On 01/19/2016 07:08 PM, tip-bot for Raghavendra K T wrote: Commit-ID: 9c03ee147193645be4c186d3688232fa438c57c7 Gitweb: http://git.kernel.org/tip/9c03ee147193645be4c186d3688232fa438c57c7 Author: Raghavendra K T AuthorDate: Sat, 16 Jan 2016 00:31:23 +0530 Committer: Ingo Molnar

[PATCH] blk-mq: Avoid memoryless numa node encoded in hctx numa_node

2015-12-02 Thread Raghavendra K T
use local_memory_node(), which is guaranteed to have memory. local_memory_node() is a no-op on architectures that do not support memoryless nodes. Signed-off-by: Raghavendra K T --- block/blk-mq-cpumap.c | 2 +- block/blk-mq.c| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) Validated

[PATCH] blk-mq: Reuse hardware context cpumask for tags

2015-12-02 Thread Raghavendra K T
hctx->cpumask is already populated, so let the tag cpumask follow it instead of going through a new for loop. Signed-off-by: Raghavendra K T --- block/blk-mq.c | 9 + 1 file changed, 1 insertion(+), 8 deletions(-) Nish had suggested putting cpumask_copy after WARN_ON (instead

Re: BUG: Unable to handle kernel paging request for data at address __percpu_counter_add

2015-11-29 Thread Raghavendra K T
On 11/24/2015 02:43 AM, Tejun Heo wrote: Hello, On Thu, Nov 19, 2015 at 03:54:35PM +0530, Raghavendra K T wrote: While I was creating thousands of docker containers on a power8 baremetal system (config: 4.3.0 kernel, 1TB RAM, 20 cores = 160 cpus), after creating around 5600 containers I hit

Re: BUG: Unable to handle kernel paging request for data at address __percpu_counter_add

2015-11-23 Thread Raghavendra K T
On 11/24/2015 02:43 AM, Tejun Heo wrote: Hello, On Thu, Nov 19, 2015 at 03:54:35PM +0530, Raghavendra K T wrote: While I was creating thousands of docker containers on a power8 baremetal system (config: 4.3.0 kernel, 1TB RAM, 20 cores = 160 cpus), after creating around 5600 containers I hit

BUG: Unable to handle kernel paging request for data at address __percpu_counter_add

2015-11-19 Thread Raghavendra K T
Hi, While I was creating thousands of docker containers on a power8 baremetal system (config: 4.3.0 kernel, 1TB RAM, 20 cores = 160 cpus), after creating around 5600 containers I hit the below problem. [This is looking similar to https://bugzilla.kernel.org/show_bug.cgi?id=101011, but kernel had Re

Re: [PATCH] block-mq:Fix the null memory access while setting tags cpumask

2015-10-13 Thread Raghavendra K T
On 10/13/2015 10:17 PM, Jeff Moyer wrote: Raghavendra K T writes: In nr_hw_queues > 1 cases, when a certain number of cpus are onlined or offlined, resulting in a change to the request_queue map in the block-mq layer, we see the kernel dumping like: What version is that patch against? This prob

[PATCH] block-mq:Fix the null memory access while setting tags cpumask

2015-10-12 Thread Raghavendra K T
new mapping does not cause problem. That is also fixed with this change. This problem is originally found in powervm which had 160 cpus (SMT8), 128 nr_hw_queues. The dump was easily reproduced with offlining last core and it has been a blocker issue because cpu hotplug is a common case for DLPAR.

Re: [PATCH RFC 0/5] powerpc:numa Add serial nid support

2015-10-06 Thread Raghavendra K T
On 10/06/2015 03:55 PM, Michael Ellerman wrote: On Sun, 2015-09-27 at 23:59 +0530, Raghavendra K T wrote: Problem description: Powerpc has sparse node numbering, i.e. on a 4 node system nodes are numbered (possibly) as 0,1,16,17. At a lower level, we map the chipid got from device tree is

Re: [RFC, 1/5] powerpc:numa Add numa_cpu_lookup function to update lookup table

2015-10-06 Thread Raghavendra K T
On 10/06/2015 03:47 PM, Michael Ellerman wrote: On Sun, 2015-27-09 at 18:29:09 UTC, Raghavendra K T wrote: We access numa_cpu_lookup_table array directly in all the places to read/update numa cpu lookup information. Instead use a helper function to update. This is helpful in changing the way

Re: [PATCH RFC 0/5] powerpc:numa Add serial nid support

2015-09-29 Thread Raghavendra K T
On 09/30/2015 01:16 AM, Denis Kirjanov wrote: On 9/29/15, Raghavendra K T wrote: On 09/28/2015 10:34 PM, Nishanth Aravamudan wrote: On 28.09.2015 [13:44:42 +0300], Denis Kirjanov wrote: On 9/27/15, Raghavendra K T wrote: Problem description: Powerpc has sparse node numbering, i.e. on a 4

Re: [PATCH RFC 3/5] powerpc:numa create 1:1 mappaing between chipid and nid

2015-09-29 Thread Raghavendra K T
On 09/28/2015 11:05 PM, Nishanth Aravamudan wrote: On 27.09.2015 [23:59:11 +0530], Raghavendra K T wrote: Once we have made the distinction between nid and chipid create a 1:1 mapping between them. This makes compacting the nids easy later. No functionality change. Signed-off-by: Raghavendra

Re: [PATCH RFC 0/5] powerpc:numa Add serial nid support

2015-09-29 Thread Raghavendra K T
On 09/28/2015 11:04 PM, Nishanth Aravamudan wrote: On 27.09.2015 [23:59:08 +0530], Raghavendra K T wrote: [...] 2) Map the sparse chipid got from device tree to a serial nid at kernel level (The idea proposed in this series). Pro: It is more natural to handle at kernel level than at lower

Re: [PATCH RFC 4/5] powerpc:numa Add helper functions to maintain chipid to nid mapping

2015-09-29 Thread Raghavendra K T
On 09/28/2015 11:02 PM, Nishanth Aravamudan wrote: On 27.09.2015 [23:59:12 +0530], Raghavendra K T wrote: Create arrays that maps serial nids and sparse chipids. Note: My original idea had only two arrays of chipid to nid map. Final code is inspired by driver/acpi/numa.c that maps a proximity

Re: [PATCH RFC 3/5] powerpc:numa create 1:1 mappaing between chipid and nid

2015-09-29 Thread Raghavendra K T
On 09/28/2015 10:58 PM, Nishanth Aravamudan wrote: On 27.09.2015 [23:59:11 +0530], Raghavendra K T wrote: Once we have made the distinction between nid and chipid create a 1:1 mapping between them. This makes compacting the nids easy later. Didn't the previous patch just do the opposi

Re: [PATCH RFC 2/5] powerpc:numa Rename functions referring to nid as chipid

2015-09-29 Thread Raghavendra K T
On 09/28/2015 10:57 PM, Nishanth Aravamudan wrote: On 27.09.2015 [23:59:10 +0530], Raghavendra K T wrote: There is no change in the functionality Signed-off-by: Raghavendra K T --- arch/powerpc/mm/numa.c | 42 +- 1 file changed, 21 insertions(+), 21

Re: [PATCH RFC 0/5] powerpc:numa Add serial nid support

2015-09-29 Thread Raghavendra K T
On 09/28/2015 10:34 PM, Nishanth Aravamudan wrote: On 28.09.2015 [13:44:42 +0300], Denis Kirjanov wrote: On 9/27/15, Raghavendra K T wrote: Problem description: Powerpc has sparse node numbering, i.e. on a 4 node system nodes are numbered (possibly) as 0,1,16,17. At a lower level, we map the

Re: [PATCH RFC 1/5] powerpc:numa Add numa_cpu_lookup function to update lookup table

2015-09-27 Thread Raghavendra K T
On 09/27/2015 11:59 PM, Raghavendra K T wrote: We access numa_cpu_lookup_table array directly in all the places to read/update numa cpu lookup information. Instead use a helper function to update. This is helpful in changing the way numa<-->cpu mapping in single place when needed. Thi

[PATCH RFC 3/5] powerpc:numa create 1:1 mappaing between chipid and nid

2015-09-27 Thread Raghavendra K T
Once we have made the distinction between nid and chipid create a 1:1 mapping between them. This makes compacting the nids easy later. No functionality change. Signed-off-by: Raghavendra K T --- arch/powerpc/mm/numa.c | 36 +--- 1 file changed, 29 insertions

[PATCH RFC 0/5] powerpc:numa Add serial nid support

2015-09-27 Thread Raghavendra K T
: cleanup patches patch 4: Adds helper function to map nid and chipid patch 5: Uses the mapping to get serial nid Raghavendra K T (5): powerpc:numa Add numa_cpu_lookup function to update lookup table powerpc:numa Rename functions referring to nid as chipid powerpc:numa create 1:1 mappaing

[PATCH RFC 5/5] powerpc:numa Use chipid to nid mapping to get serial numa node ids

2015-09-27 Thread Raghavendra K T
and cpus 2) Running the tests from numactl source. 3) Creating 1000s of docker containers stressing the system Signed-off-by: Raghavendra K T --- arch/powerpc/mm/numa.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm

[PATCH RFC 1/5] powerpc:numa Add numa_cpu_lookup function to update lookup table

2015-09-27 Thread Raghavendra K T
nality. Signed-off-by: Raghavendra K T --- arch/powerpc/include/asm/mmzone.h | 2 +- arch/powerpc/kernel/smp.c | 10 +- arch/powerpc/mm/numa.c| 28 +--- 3 files changed, 23 insertions(+), 17 deletions(-) diff --git a/arch/powerpc/include/asm/mmzon

[PATCH RFC 4/5] powerpc:numa Add helper functions to maintain chipid to nid mapping

2015-09-27 Thread Raghavendra K T
in first unused nid easily by knowing first unset bit in the mask. No change in functionality. Signed-off-by: Raghavendra K T --- arch/powerpc/mm/numa.c | 48 +++- 1 file changed, 47 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/mm/numa.c b

[PATCH RFC 2/5] powerpc:numa Rename functions referring to nid as chipid

2015-09-27 Thread Raghavendra K T
There is no change in the functionality Signed-off-by: Raghavendra K T --- arch/powerpc/mm/numa.c | 42 +- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index d5e6eee..f84ed2f 100644 --- a

Re: [PATCH V2 2/2] powerpc:numa Do not allocate bootmem memory for non existing nodes

2015-09-22 Thread Raghavendra K T
* Michael Ellerman [2015-09-22 15:29:03]: > On Tue, 2015-09-15 at 07:38 +0530, Raghavendra K T wrote: > > > > ... nothing > > Sure this patch looks obvious, but please give me a changelog that proves > you've thought about it thoroughly. > > For example is i

Re: [PATCH V2 2/2] powerpc:numa Do not allocate bootmem memory for non existing nodes

2015-09-22 Thread Raghavendra K T
On 09/22/2015 10:59 AM, Michael Ellerman wrote: On Tue, 2015-09-15 at 07:38 +0530, Raghavendra K T wrote: ... nothing Sure this patch looks obvious, but please give me a changelog that proves you've thought about it thoroughly. For example is it OK to use for_each_node() at this poi

Re: [PATCH V2 1/2] mm: Replace nr_node_ids for loop with for_each_node in list lru

2015-09-14 Thread Raghavendra K T
On 09/15/2015 07:38 AM, Raghavendra K T wrote: The functions used in the patch are in slowpath, which gets called whenever alloc_super is called during mounts. Though this should not make difference for the architectures with sequential numa node ids, for the powerpc which can potentially have

[PATCH V2 0/2] Replace nr_node_ids for loop with for_each_node

2015-09-14 Thread Raghavendra K T
p (Vladimir) - Add comment that node 0 should always be present (Vladimir) Raghavendra K T (2): mm: Replace nr_node_ids for loop with for_each_node in list lru powerpc:numa Do not allocate bootmem memory for non existing nodes arch/powerpc/mm/numa.c | 2 +- mm/list_lru.c

[PATCH V2 2/2] powerpc:numa Do not allocate bootmem memory for non existing nodes

2015-09-14 Thread Raghavendra K T
Signed-off-by: Raghavendra K T --- arch/powerpc/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 8b9502a..8d8a541 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -80,7 +80,7 @@ static void

[PATCH V2 1/2] mm: Replace nr_node_ids for loop with for_each_node in list lru

2015-09-14 Thread Raghavendra K T
numa ids, 0,1,16,17 is common), this patch saves some unnecessary allocations for non-existing numa nodes. Even without that saving, the patch perhaps makes the code more readable. [ Take memcg_aware check outside for_each loop: Vladimir] Signed-off-by: Raghavendra K T --- mm/list_lru.c | 34

Re: [PATCH 1/2] mm: Replace nr_node_ids for loop with for_each_node in list lru

2015-09-14 Thread Raghavendra K T
On 09/14/2015 05:34 PM, Vladimir Davydov wrote: On Mon, Sep 14, 2015 at 05:09:31PM +0530, Raghavendra K T wrote: On 09/14/2015 02:30 PM, Vladimir Davydov wrote: On Wed, Sep 09, 2015 at 12:01:46AM +0530, Raghavendra K T wrote: The functions used in the patch are in slowpath, which gets called

Re: [PATCH 1/2] mm: Replace nr_node_ids for loop with for_each_node in list lru

2015-09-14 Thread Raghavendra K T
On 09/14/2015 02:30 PM, Vladimir Davydov wrote: Hi, On Wed, Sep 09, 2015 at 12:01:46AM +0530, Raghavendra K T wrote: The functions used in the patch are in slowpath, which gets called whenever alloc_super is called during mounts. Though this should not make difference for the architectures

[PATCH 1/2] mm: Replace nr_node_ids for loop with for_each_node in list lru

2015-09-08 Thread Raghavendra K T
numa ids, 0,1,16,17 is common), this patch saves some unnecessary allocations for non-existing numa nodes. Even without that saving, the patch perhaps makes the code more readable. Signed-off-by: Raghavendra K T --- mm/list_lru.c | 23 +++ 1 file changed, 15 insertions(+), 8 deletions

[PATCH 2/2] powerpc:numa Do not allocate bootmem memory for non existing nodes

2015-09-08 Thread Raghavendra K T
Signed-off-by: Raghavendra K T --- arch/powerpc/mm/numa.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 8b9502a..8d8a541 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -80,7 +80,7 @@ static void

[PATCH 0/2] Replace nr_node_ids for loop with for_each_node

2015-09-08 Thread Raghavendra K T
loop with for_each_node so that allocations happen only for existing numa nodes. Please note that, though there are many places where nr_node_ids is used, current patchset uses for_each_node only for slowpath to avoid find_next_bit traversal. Raghavendra K T (2): mm: Replace nr_node_ids for loop with for_ea

[PATCH RFC V4 1/2] net: Introduce helper functions to get the per cpu data

2015-08-29 Thread Raghavendra K T
Signed-off-by: Raghavendra K T --- include/net/ip.h | 10 ++ net/ipv4/af_inet.c | 41 +++-- 2 files changed, 37 insertions(+), 14 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index d5fe9f2..93bf12e 100644 --- a/include/net/ip.h

[PATCH RFC V4 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-29 Thread Raghavendra K T
[kernel.kallsyms] [k] veth_stats_one changes/ideas suggested: Using buffer in stack (Eric), Usage of memset (David), Using memcpy in place of unaligned_put (Joe). Signed-off-by: Raghavendra K T --- net/ipv6/addrconf.c | 26 -- 1 file changed, 16 insertions(+), 10 deletions

[PATCH RFC V4 0/2] Optimize the snmp stat aggregation for large cpus

2015-08-29 Thread Raghavendra K T
cache-misses: 1.41 % Please let me know if you have suggestions/comments. Thanks Eric, Joe and David for the comments. Raghavendra K T (2): net: Introduce helper functions to get the per cpu data net: Optimize snmp stat aggregation by walking all the percpu data at

Re: [PATCH RFC V3 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-29 Thread Raghavendra K T
On 08/29/2015 08:51 PM, Joe Perches wrote: On Sat, 2015-08-29 at 07:32 -0700, Eric Dumazet wrote: On Sat, 2015-08-29 at 14:37 +0530, Raghavendra K T wrote: static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib, - int items, int bytes

[PATCH RFC V3 1/2] net: Introduce helper functions to get the per cpu data

2015-08-29 Thread Raghavendra K T
Signed-off-by: Raghavendra K T --- include/net/ip.h | 10 ++ net/ipv4/af_inet.c | 41 +++-- 2 files changed, 37 insertions(+), 14 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index d5fe9f2..93bf12e 100644 --- a/include/net/ip.h

[PATCH RFC V3 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-29 Thread Raghavendra K T
[kernel.kallsyms] [k] veth_stats_one changes/ideas suggested: Using buffer in stack (Eric), Usage of memset (David), Using memcpy in place of unaligned_put (Joe). Signed-off-by: Raghavendra K T --- net/ipv6/addrconf.c | 22 +- 1 file changed, 13 insertions(+), 9 deletions(-) Changes

[PATCH RFC V3 0/2] Optimize the snmp stat aggregation for large cpus

2015-08-29 Thread Raghavendra K T
.47% docker docker[.] strings.FieldsFunc cache-misses: 1.41 % Please let me know if you have suggestions/comments. Thanks Eric, Joe and David for comments on V1 and V2. Raghavendra K T (2): net: Introduc

Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-29 Thread Raghavendra K T
On 08/29/2015 10:41 AM, David Miller wrote: From: Raghavendra K T Date: Sat, 29 Aug 2015 08:27:15 +0530 resending the patch with memset. Please let me know if you want to resend all the patches. Do not post patches as replies to existing discussion threads. Instead, make a new, fresh

Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-29 Thread Raghavendra K T
On 08/29/2015 08:56 AM, Eric Dumazet wrote: On Sat, 2015-08-29 at 08:27 +0530, Raghavendra K T wrote: /* Use put_unaligned() because stats may not be aligned for u64. */ put_unaligned(items, &stats[0]); for (i = 1; i < items; i++) - put_un

Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-28 Thread Raghavendra K T
* David Miller [2015-08-28 11:24:13]: > From: Raghavendra K T > Date: Fri, 28 Aug 2015 12:09:52 +0530 > > > On 08/28/2015 12:08 AM, David Miller wrote: > >> From: Raghavendra K T > >> Date: Wed, 26 Aug 2015 23:07:33 +0530 > >> > &

Re: [PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-27 Thread Raghavendra K T
On 08/28/2015 12:08 AM, David Miller wrote: From: Raghavendra K T Date: Wed, 26 Aug 2015 23:07:33 +0530 @@ -4641,10 +4647,12 @@ static inline void __snmp6_fill_stats64(u64 *stats, void __percpu *mib, static void snmp6_fill_stats(u64 *stats, struct inet6_dev *idev, int attrtype

[PATCH RFC V2 1/2] net: Introduce helper functions to get the per cpu data

2015-08-26 Thread Raghavendra K T
Signed-off-by: Raghavendra K T --- include/net/ip.h | 10 ++ net/ipv4/af_inet.c | 41 +++-- 2 files changed, 37 insertions(+), 14 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index d5fe9f2..93bf12e 100644 --- a/include/net/ip.h

[PATCH RFC V2 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-26 Thread Raghavendra K T
[kernel.kallsyms] [k] _raw_spin_lock Signed-off-by: Raghavendra K T --- net/ipv6/addrconf.c | 18 +- 1 file changed, 13 insertions(+), 5 deletions(-) Change in V2: - Allocate stat calculation buffer on the stack (Eric) Thanks David and Eric for comments on V1 and as both of them

[PATCH RFC V2 0/2] Optimize the snmp stat aggregation for large cpus

2015-08-26 Thread Raghavendra K T
lease let me know if you have suggestions/comments. Thanks Eric and David for comments on V1. Raghavendra K T (2): net: Introduce helper functions to get the per cpu data net: Optimize snmp stat aggregation by walking all the percpu data at once include/net/ip.h| 10 +++

Re: [PATCH RFC 0/2] Optimize the snmp stat aggregation for large cpus

2015-08-26 Thread Raghavendra K T
On 08/26/2015 07:39 PM, Eric Dumazet wrote: On Wed, 2015-08-26 at 15:55 +0530, Raghavendra K T wrote: On 08/26/2015 04:37 AM, David Miller wrote: From: Raghavendra K T Date: Tue, 25 Aug 2015 13:24:24 +0530 Please let me know if you have suggestions/comments. Like Eric Dumazet said the

Re: [PATCH RFC 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-26 Thread Raghavendra K T
On 08/25/2015 09:30 PM, Eric Dumazet wrote: On Tue, 2015-08-25 at 21:17 +0530, Raghavendra K T wrote: On 08/25/2015 07:58 PM, Eric Dumazet wrote: This is a great idea, but kcalloc()/kmalloc() can fail and you'll crash the whole kernel at this point. Good catch, and my bad. Though s

Re: [PATCH RFC 0/2] Optimize the snmp stat aggregation for large cpus

2015-08-26 Thread Raghavendra K T
On 08/26/2015 04:37 AM, David Miller wrote: From: Raghavendra K T Date: Tue, 25 Aug 2015 13:24:24 +0530 Please let me know if you have suggestions/comments. Like Eric Dumazet said the idea is good but needs some adjustments. You might want to see whether a per-cpu work buffer works for

Re: [PATCH RFC 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-25 Thread Raghavendra K T
On 08/25/2015 09:30 PM, Eric Dumazet wrote: On Tue, 2015-08-25 at 21:17 +0530, Raghavendra K T wrote: On 08/25/2015 07:58 PM, Eric Dumazet wrote: This is a great idea, but kcalloc()/kmalloc() can fail and you'll crash the whole kernel at this point. Good catch, and my bad. Though s

Re: [PATCH RFC 0/2] Optimize the snmp stat aggregation for large cpus

2015-08-25 Thread Raghavendra K T
On 08/25/2015 08:03 PM, Eric Dumazet wrote: On Tue, 2015-08-25 at 13:24 +0530, Raghavendra K T wrote: While creating 1000 containers, perf is showing lot of time spent in snmp_fold_field on a large cpu system. The current patch tries to improve by reordering the statistics gathering. Please

Re: [PATCH RFC 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-25 Thread Raghavendra K T
On 08/25/2015 07:58 PM, Eric Dumazet wrote: On Tue, 2015-08-25 at 13:24 +0530, Raghavendra K T wrote: Docker container creation time linearly increased from around 1.6 sec to 7.5 sec (at 1000 containers) and perf data showed 50% overhead in snmp_fold_field. Reason: currently __snmp6_fill_stats64

[PATCH RFC 2/2] net: Optimize snmp stat aggregation by walking all the percpu data at once

2015-08-25 Thread Raghavendra K T
[kernel.kallsyms] [k] _raw_spin_lock Signed-off-by: Raghavendra K T --- net/ipv6/addrconf.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c index 21c2c81..2ec905f 100644 --- a/net/ipv6/addrconf.c +++ b/net/ipv6/addrconf.c

[PATCH RFC 1/2] net: Introduce helper functions to get the per cpu data

2015-08-25 Thread Raghavendra K T
Signed-off-by: Raghavendra K T --- include/net/ip.h | 10 ++ net/ipv4/af_inet.c | 41 +++-- 2 files changed, 37 insertions(+), 14 deletions(-) diff --git a/include/net/ip.h b/include/net/ip.h index d5fe9f2..93bf12e 100644 --- a/include/net/ip.h

[PATCH RFC 0/2] Optimize the snmp stat aggregation for large cpus

2015-08-25 Thread Raghavendra K T
[.] strings.FieldsFunc 2.96% docker docker[.] backtrace_qsort cache-misses: 1.38 % Please let me know if you have suggestions/comments. Raghavendra K T (2): net:

Re: [PATCH v2 5/6] locking/pvqspinlock: Opportunistically defer kicking to unlock time

2015-07-14 Thread Raghavendra K T
On 07/15/2015 07:43 AM, Waiman Long wrote: Performing CPU kicking at lock time can be a bit faster if there is no kick-ahead. On the other hand, deferring it to unlock time is preferable when kick-ahead can be performed or when the VM guest has so few vCPUs that a vCPU may be kicked twice

Re: [PATCH v2] selinux: reduce locking overhead in inode_free_security()

2015-06-13 Thread Raghavendra K T
On 06/13/2015 04:05 AM, Waiman Long wrote: On 06/12/2015 08:31 AM, Stephen Smalley wrote: On 06/12/2015 02:26 AM, Raghavendra K T wrote: On 06/12/2015 03:01 AM, Waiman Long wrote: The inode_free_security() function just took the superblock's isec_lock before checking and trying to remov

Re: [PATCH v2] selinux: reduce locking overhead in inode_free_security()

2015-06-11 Thread Raghavendra K T
On 06/12/2015 03:01 AM, Waiman Long wrote: The inode_free_security() function just took the superblock's isec_lock before checking and trying to remove the inode security struct from the linked list. In many cases, the list was empty and so the lock taking is wasteful as no useful work is done. O

Re: [PATCH v5 0/4] idle memory tracking

2015-06-09 Thread Raghavendra K T
On 06/09/2015 01:05 AM, Andrew Morton wrote: On Sun, 7 Jun 2015 11:41:15 +0530 Raghavendra KT wrote: On Tue, May 12, 2015 at 7:04 PM, Vladimir Davydov wrote: Hi, This patch set introduces a new user API for tracking user memory pages that have not been used for a given period of time. The

Re: [RFC] arm: Add for atomic half word exchange

2015-06-01 Thread Raghavendra K T
On 06/02/2015 11:19 AM, Sarbojit Ganguly wrote: I made the CONFIG_ARCH_MULTI_V6=y and CONFIG_CPU_V6K=y CONFIG_CPU_32v6=y CONFIG_CPU_32v6K=y and compiled 4.0.4 with the patch. Result is a compilation success. Regards, Sarbojit Hi Sarbojit, I am not familiar with the implications of setting t

Re: [PATCH] x86/spinlocks: Fix regression in spinlock contention detection

2015-05-06 Thread Raghavendra K T
On 05/05/2015 09:45 AM, Tahsin Erdogan wrote: A spinlock is regarded as contended when there is at least one waiter. Currently, the code that checks whether there are any waiters rely on tail value being greater than head. However, this is not true if tail reaches the max value and wraps back to

Re: [PATCH] x86/spinlocks: Fix regression in spinlock contention detection

2015-05-05 Thread Raghavendra K T
On 05/05/2015 07:33 PM, Tahsin Erdogan wrote: The conversion to signed happens with types shorter than int (__ticket_t is either u8 or u16). By changing Raghavendra's program to use unsigned short int, you can see the problem: #include #define LOCK_INC 2 int main() {

Re: [PATCH] x86/spinlocks: Fix regression in spinlock contention detection

2015-05-05 Thread Raghavendra K T
On 05/05/2015 09:02 PM, Waiman Long wrote: On 05/05/2015 11:25 AM, Raghavendra K T wrote: On 05/05/2015 07:33 PM, Tahsin Erdogan wrote: The conversion to signed happens with types shorter than int (__ticket_t is either u8 or u16). By changing Raghavendra's program to use unsigned shor

Re: [PATCH] x86/spinlocks: Fix regression in spinlock contention detection

2015-05-05 Thread Raghavendra K T
On 05/05/2015 02:47 PM, Peter Zijlstra wrote: On Mon, May 04, 2015 at 09:15:31PM -0700, Tahsin Erdogan wrote: A spinlock is regarded as contended when there is at least one waiter. Currently, the code that checks whether there are any waiters rely on tail value being greater than head. However,

Re: [PATCH 0/9] qspinlock stuff -v15

2015-03-26 Thread Raghavendra K T
of patches works fine. Feel free to add Tested-by: Raghavendra K T #kvm pv As far as performance is concerned (with my 16core +ht machine having 16vcpu guests [ even w/ , w/o the lfsr hash patchset ]), I do not see any significant observations to report, though I understand that we could see much

Re: [PATCH 9/9] qspinlock,x86,kvm: Implement KVM support for paravirt qspinlock

2015-03-20 Thread Raghavendra K T
On 03/20/2015 02:38 AM, Waiman Long wrote: On 03/19/2015 06:01 AM, Peter Zijlstra wrote: [...] You are probably right. The initial apply_paravirt() was done before the SMP boot. Subsequent ones were at kernel module load time. I put a counter in the __native_queue_spin_unlock() and it registere

Re: [PATCH for stable] x86/spinlocks/paravirt: Fix memory corruption on unlock

2015-02-24 Thread Raghavendra K T
On 02/24/2015 08:50 PM, Greg KH wrote: On Tue, Feb 24, 2015 at 03:47:37PM +0100, Ingo Molnar wrote: * Greg KH wrote: On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: Paravirt spinlock clears slowpath flag after doing unlock. As explained by Linus currently it does

Re: [PATCH for stable] x86/spinlocks/paravirt: Fix memory corruption on unlock

2015-02-24 Thread Raghavendra K T
On 02/24/2015 08:17 PM, Ingo Molnar wrote: * Greg KH wrote: On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: Paravirt spinlock clears slowpath flag after doing unlock. As explained by Linus currently it does: prev = *lock; add_smp(&

[PATCH for stable] x86/spinlocks/paravirt: Fix memory corruption on unlock

2015-02-24 Thread Raghavendra K T
0.02 dbench 1x -1.77 dbench 2x -0.63 [Jeremy: hinted missing TICKET_LOCK_INC for kick] [Oleg: Moving slowpath flag to head, ticket_equals idea] [PeterZ: Detailed changelog] Reported-by: Sasha Levin Suggested-by: Linus Torvalds Signed-off-by: Raghavendra K T Review

[tip:locking/core] x86/spinlocks/paravirt: Fix memory corruption on unlock

2015-02-18 Thread tip-bot for Raghavendra K T
Commit-ID: d6abfdb2022368d8c6c4be3f11a06656601a6cc2 Gitweb: http://git.kernel.org/tip/d6abfdb2022368d8c6c4be3f11a06656601a6cc2 Author: Raghavendra K T AuthorDate: Fri, 6 Feb 2015 16:44:11 +0530 Committer: Ingo Molnar CommitDate: Wed, 18 Feb 2015 14:53:49 +0100 x86/spinlocks/paravirt

Re: [Xen-devel] [PATCH V5] x86 spinlock: Fix memory corruption on completing completions

2015-02-17 Thread Raghavendra K T
On 02/16/2015 10:17 PM, David Vrabel wrote: On 15/02/15 17:30, Raghavendra K T wrote: --- a/arch/x86/xen/spinlock.c +++ b/arch/x86/xen/spinlock.c @@ -41,7 +41,7 @@ static u8 zero_stats; static inline void check_zero(void) { u8 ret; - u8 old = ACCESS_ONCE(zero_stats

Re: [PATCH V5] x86 spinlock: Fix memory corruption on completing completions

2015-02-15 Thread Raghavendra K T
On 02/15/2015 09:47 PM, Oleg Nesterov wrote: Well, I regret I mentioned the lack of barrier after enter_slowpath ;) On 02/15, Raghavendra K T wrote: @@ -46,7 +46,8 @@ static __always_inline bool static_key_false(struct static_key *key); static inline void __ticket_enter_slowpath

Re: [PATCH V5] x86 spinlock: Fix memory corruption on completing completions

2015-02-15 Thread Raghavendra K T
* Raghavendra K T [2015-02-15 11:25:44]: Resending the V5 with smp_mb__after_atomic() change without bumping up revision ---8<--- >From 0b9ecde30e3bf5b5b24009fd2ac5fc7ac4b81158 Mon Sep 17 00:00:00 2001 From: Raghavendra K T Date: Fri, 6 Feb 2015 16:44:11 +0530 Subject: [PATCH RESEND V

Re: [PATCH V5] x86 spinlock: Fix memory corruption on completing completions

2015-02-14 Thread Raghavendra K T
On 02/15/2015 11:25 AM, Raghavendra K T wrote: Paravirt spinlock clears slowpath flag after doing unlock. As explained by Linus currently it does: prev = *lock; add_smp(&lock->tickets.head, TICKET_LOCK_INC); /* add_smp() is a

[PATCH V5] x86 spinlock: Fix memory corruption on completing completions

2015-02-14 Thread Raghavendra K T
0.02 dbench 1x -1.77 dbench 2x -0.63 [Jeremy: hinted missing TICKET_LOCK_INC for kick] [Oleg: Moving slowpath flag to head, ticket_equals idea] [PeterZ: Detailed changelog] Reported-by: Sasha Levin Suggested-by: Linus Torvalds Signed-off-by: Raghavendra K

Re: [PATCH V4] x86 spinlock: Fix memory corruption on completing completions

2015-02-14 Thread Raghavendra K T
On 02/13/2015 09:02 PM, Oleg Nesterov wrote: On 02/13, Raghavendra K T wrote: @@ -164,7 +161,7 @@ static inline int arch_spin_is_locked(arch_spinlock_t *lock) { struct __raw_tickets tmp = READ_ONCE(lock->tickets); - return tmp.tail != tmp.head; + return tmp.t

[PATCH V4] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
0.02 dbench 1x -1.77 dbench 2x -0.63 [Jeremy: hinted missing TICKET_LOCK_INC for kick] [Oleg: Moving slowpath flag to head, ticket_equals idea] [PeterZ: Detailed changelog] Reported-by: Sasha Levin Suggested-by: Linus Torvalds Signed-off-by: Raghavendra K

Re: [PATCH V3] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
On 02/12/2015 08:30 PM, Peter Zijlstra wrote: On Thu, Feb 12, 2015 at 05:17:27PM +0530, Raghavendra K T wrote: [...] Linus suggested that we should not do any writes to lock after unlock(), and we can move slowpath clearing to fastpath lock. So this patch implements the fix with: 1. Moving

Re: [PATCH V3] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
On 02/12/2015 07:32 PM, Oleg Nesterov wrote: Damn, sorry for noise, forgot to mention... On 02/12, Raghavendra K T wrote: +static inline void __ticket_check_and_clear_slowpath(arch_spinlock_t *lock, + __ticket_t head) +{ + if (head

Re: [PATCH V3] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
On 02/12/2015 07:20 PM, Oleg Nesterov wrote: On 02/12, Raghavendra K T wrote: @@ -191,8 +189,7 @@ static inline void arch_spin_unlock_wait(arch_spinlock_t *lock) * We need to check "unlocked" in a loop, tmp.head == head * can be false positive

Re: [PATCH V3] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
On 02/12/2015 07:07 PM, Oleg Nesterov wrote: On 02/12, Raghavendra K T wrote: @@ -772,7 +773,8 @@ __visible void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want) * check again make sure it didn't become free while * we weren't looking. */

[PATCH V3] x86 spinlock: Fix memory corruption on completing completions

2015-02-12 Thread Raghavendra K T
x -0.63 [Jeremy: hinted missing TICKET_LOCK_INC for kick] [Oleg: Moving slowpath flag to head, ticket_equals idea] Reported-by: Sasha Levin Suggested-by: Linus Torvalds Signed-off-by: Raghavendra K T --- arch/x86/include/asm/spinlock.h | 87 - ar

Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Raghavendra K T
On 02/11/2015 11:08 PM, Oleg Nesterov wrote: On 02/11, Raghavendra K T wrote: On 02/10/2015 06:56 PM, Oleg Nesterov wrote: In this case __ticket_check_and_clear_slowpath() really needs to cmpxchg the whole .head_tail. Plus obviously more boring changes. This needs a separate patch even _if_

Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-11 Thread Raghavendra K T
On 02/10/2015 06:56 PM, Oleg Nesterov wrote: On 02/10, Raghavendra K T wrote: On 02/10/2015 06:23 AM, Linus Torvalds wrote: add_smp(&lock->tickets.head, TICKET_LOCK_INC); if (READ_ONCE(lock->tickets.tail) & TICKET_SLOWPATH_FLAG) .. into something like

Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-10 Thread Raghavendra K T
On 02/10/2015 06:23 AM, Linus Torvalds wrote: On Mon, Feb 9, 2015 at 4:02 AM, Peter Zijlstra wrote: On Mon, Feb 09, 2015 at 03:04:22PM +0530, Raghavendra K T wrote: So we have 3 choices, 1. xadd 2. continue with current approach. 3. a read before unlock and also after that. For the truly

Re: [PATCH V2] x86 spinlock: Fix memory corruption on completing completions

2015-02-09 Thread Raghavendra K T
Ccing Davidlohr, (sorry that I got confused with similar address in cc list). On 02/09/2015 08:44 PM, Oleg Nesterov wrote: On 02/09, Raghavendra K T wrote: +static inline void __ticket_check_and_clear_slowpath(arch_spinlock_t *lock) +{ + arch_spinlock_t old, new; + __ticket_t diff

[PATCH V2] x86 spinlock: Fix memory corruption on completing completions

2015-02-09 Thread Raghavendra K T
ll could be set when somebody does arch_trylock. Handle that too by ignoring slowpath flag during lock availability check. [Jeremy: hinted missing TICKET_LOCK_INC for kick] Reported-by: Sasha Levin Suggested-by: Linus Torvalds Signed-off-by: Raghavendra K T --- ar

Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-09 Thread Raghavendra K T
On 02/09/2015 05:32 PM, Peter Zijlstra wrote: On Mon, Feb 09, 2015 at 03:04:22PM +0530, Raghavendra K T wrote: So we have 3 choices, 1. xadd 2. continue with current approach. 3. a read before unlock and also after that. For the truly paranoid we have probe_kernel_address(), suppose the lock

Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-09 Thread Raghavendra K T
On 02/09/2015 02:44 AM, Jeremy Fitzhardinge wrote: On 02/06/2015 06:49 AM, Raghavendra K T wrote: [...] Linus suggested that we should not do any writes to lock after unlock(), and we can move slowpath clearing to fastpath lock. Yep, that seems like a sound approach. Current approach

Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-08 Thread Raghavendra K T
On 02/07/2015 12:27 AM, Sasha Levin wrote: On 02/06/2015 09:49 AM, Raghavendra K T wrote: Paravirt spinlock clears slowpath flag after doing unlock. As explained by Linus currently it does: prev = *lock; add_smp(&lock->tickets.head, TICKET_L

Re: [PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-08 Thread Raghavendra K T
On 02/06/2015 09:55 PM, Linus Torvalds wrote: On Fri, Feb 6, 2015 at 6:49 AM, Raghavendra K T wrote: Paravirt spinlock clears slowpath flag after doing unlock. [ fix edited out ] So I'm not going to be applying this for 3.19, because it's much too late and the patch is too scary

Re: sched: memory corruption on completing completions

2015-02-06 Thread Raghavendra K T
On 02/06/2015 12:18 PM, Raghavendra K T wrote: On 02/06/2015 04:27 AM, Linus Torvalds wrote: On Thu, Feb 5, 2015 at 2:37 PM, Davidlohr Bueso wrote: It is possible that the paravirt spinlocks could be saved by: - moving the clearing of TICKET_SLOWPATH_FLAG into the fastpath locking code

[PATCH] x86 spinlock: Fix memory corruption on completing completions

2015-02-06 Thread Raghavendra K T
ll could be set when somebody does arch_trylock. Handle that too by ignoring slowpath flag during lock availability check. Reported-by: Sasha Levin Suggested-by: Linus Torvalds Signed-off-by: Raghavendra K T --- arch/x86/include/asm/spinlock.h | 70 - 1 file chang

Re: sched: memory corruption on completing completions

2015-02-06 Thread Raghavendra K T
On 02/06/2015 04:07 AM, Davidlohr Bueso wrote: On Thu, 2015-02-05 at 13:34 -0800, Linus Torvalds wrote: On Thu, Feb 5, 2015 at 1:02 PM, Sasha Levin wrote: Interestingly enough, according to that article this behaviour seems to be "by design": Oh, it's definitely by design, it's just that th

Re: sched: memory corruption on completing completions

2015-02-05 Thread Raghavendra K T
On 02/06/2015 04:27 AM, Linus Torvalds wrote: On Thu, Feb 5, 2015 at 2:37 PM, Davidlohr Bueso wrote: It is possible that the paravirt spinlocks could be saved by: - moving the clearing of TICKET_SLOWPATH_FLAG into the fastpath locking code. Ouch, to avoid deadlocks they explicitly need th

Re: [PATCH v14 08/11] qspinlock, x86: Rename paravirt_ticketlocks_enabled

2015-01-21 Thread Raghavendra K T
On 01/21/2015 01:42 AM, Waiman Long wrote: This patch renames the paravirt_ticketlocks_enabled static key to a more generic paravirt_spinlocks_enabled name. Signed-off-by: Waiman Long Signed-off-by: Peter Zijlstra --- Reviewed-by: Raghavendra K T -- To unsubscribe from this list: send the

Re: [PATCH] cpusets: Make cpus_allowed and mems_allowed masks hotplug invariant

2014-10-09 Thread Raghavendra K T
On 10/09/2014 10:42 AM, Preeti U Murthy wrote: Hi Raghu, remove_tasks_in_empty_cpuset() is called on the legacy hierarchy when the cpuset becomes empty, hence we require it. But you are right, it's not called on the default hierarchy. My point was if legacy hierarchy follows unified hierarchy
