Re: kmem_cache_alloc panic in 3.10+

2014-01-31 Thread Eric Dumazet
On Fri, 2014-01-31 at 15:43 -0800, dormando wrote:
> cmpxchg_double()? that's not related to the 62713c4b fix right?
>
> I'll see what I can do.. it's going to take a long time to iterate on this
> though.
Don't know about this commit. I was more thinking about

Re: kmem_cache_alloc panic in 3.10+

2014-01-31 Thread dormando
On Fri, 31 Jan 2014, David Rientjes wrote:
> On Fri, 31 Jan 2014, dormando wrote:
>
> > > CONFIG_SLUB_DEBUG_ON will definitely be slower but can help to identify
> > > any possible corruption issues.
> > >
> > > I'm wondering if you have CONFIG_MEMCG enabled and are actually allocating
> > > slab

Re: kmem_cache_alloc panic in 3.10+

2014-01-31 Thread David Rientjes
On Fri, 31 Jan 2014, dormando wrote:
> > CONFIG_SLUB_DEBUG_ON will definitely be slower but can help to identify
> > any possible corruption issues.
> >
> > I'm wondering if you have CONFIG_MEMCG enabled and are actually allocating
> > slab in a non-root memcg? What does /proc/self/cgroup say?
>
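
For reference, the /proc/self/cgroup question above can be answered mechanically. The following is only a sketch, assuming the cgroup v1 line format `hierarchy-ID:controller-list:path` that kernels of this era expose; `memory_cgroup_path` and `in_non_root_memcg` are hypothetical helper names, not anything from the thread.

```python
def memory_cgroup_path(cgroup_text):
    """Return the cgroup path of the 'memory' controller, or None if absent.

    Assumes cgroup v1 lines of the form 'hierarchy-ID:controller-list:path'.
    """
    for line in cgroup_text.splitlines():
        parts = line.strip().split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        _hierarchy_id, controllers, path = parts
        if "memory" in controllers.split(","):
            return path
    return None


def in_non_root_memcg(cgroup_text):
    """True if the memory controller is mounted and the path is not the root '/'."""
    path = memory_cgroup_path(cgroup_text)
    return path is not None and path != "/"


if __name__ == "__main__":
    try:
        with open("/proc/self/cgroup") as f:
            text = f.read()
    except OSError:
        text = ""  # not on Linux, or procfs is unavailable
    print("memory cgroup path:", memory_cgroup_path(text))
    print("non-root memcg:", in_non_root_memcg(text))
```

A result of `/` for the memory controller means the process sits in the root memcg; any other path means slab allocations from that process may be charged to a non-root memcg, which is the case being asked about.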

Re: kmem_cache_alloc panic in 3.10+

2014-01-31 Thread David Rientjes
On Thu, 30 Jan 2014, dormando wrote:
> > > I really wonder... it looks like a possible in SLUB. (might be already
> > > fixed)
> > >
> > > Could you try using SLAB instead ?
> >
> > try CONFIG_SLUB_DEBUG_ON=y ? it should catch double free and other things.
> >
>
> Any slowdowns/issues with that?

Re: kmem_cache_alloc panic in 3.10+

2014-01-30 Thread dormando
> On Thu, Jan 30, 2014 at 6:16 PM, Eric Dumazet wrote:
> > On Wed, 2014-01-29 at 23:05 -0800, dormando wrote:
> >
> >> We hit the routing code fairly hard. Any hints for what to look at or how
> >> to instrument it? Or if it's fixed already? It's a real pain to iterate
> >> since it takes ~30

Re: kmem_cache_alloc panic in 3.10+

2014-01-30 Thread Alexei Starovoitov
On Thu, Jan 30, 2014 at 6:16 PM, Eric Dumazet wrote:
> On Wed, 2014-01-29 at 23:05 -0800, dormando wrote:
>
>> We hit the routing code fairly hard. Any hints for what to look at or how
>> to instrument it? Or if it's fixed already? It's a real pain to iterate
>> since it takes ~30 days to crash,

Re: kmem_cache_alloc panic in 3.10+

2014-01-30 Thread Eric Dumazet
On Wed, 2014-01-29 at 23:05 -0800, dormando wrote:
> We hit the routing code fairly hard. Any hints for what to look at or how
> to instrument it? Or if it's fixed already? It's a real pain to iterate
> since it takes ~30 days to crash, usually. Sometimes.
I really wonder... it looks like a

Re: kmem_cache_alloc panic in 3.10+

2014-01-29 Thread dormando
> > On Sat, 2014-01-18 at 00:44 -0800, dormando wrote:
> > > Hello again!
> > >
> > > We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least
> > > (trying newer stables now, but I can't tell if it was fixed, and it takes
> > > weeks to reproduce).
> > >
> > > Unfortunately I can

Re: kmem_cache_alloc panic in 3.10+

2014-01-18 Thread dormando
> On Sat, 2014-01-18 at 00:44 -0800, dormando wrote:
> > Hello again!
> >
> > We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least
> > (trying newer stables now, but I can't tell if it was fixed, and it takes
> > weeks to reproduce).
> >
> > Unfortunately I can only get 8k

Re: kmem_cache_alloc panic in 3.10+

2014-01-18 Thread Eric Dumazet
On Sat, 2014-01-18 at 08:29 -0800, Eric Dumazet wrote:
> Hmm...
>
> Some dst seems to be destroyed twice. This likely screws slab allocator.
>
> Please try following untested patch :
Forget it, after some coffee it no longer makes sense ;)

Re: kmem_cache_alloc panic in 3.10+

2014-01-18 Thread Eric Dumazet
On Sat, 2014-01-18 at 00:44 -0800, dormando wrote:
> Hello again!
>
> We've had a rare crash that's existed between 3.10.0 and 3.10.15 at least
> (trying newer stables now, but I can't tell if it was fixed, and it takes
> weeks to reproduce).
>
> Unfortunately I can only get 8k back from pstore.
