From: Bob Picco
Date: Tue, 24 Mar 2015 10:57:53 -0400
> Seems solid with 2.6.39 on M7-4. Jalapeño is happy with current sparc.git.
Thanks for all the testing, it's been integrated into the -stable
queues as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the
David Miller wrote: [Mon Mar 23 2015, 12:25:30PM EDT]
> From: David Miller
> Date: Sun, 22 Mar 2015 22:19:06 -0400 (EDT)
>
> > I'll work on a fix.
>
> Ok, here is what I committed. David et al., let me know if you still
> see the crashes with this applied.
>
> Of course, I'll queue this
From: Bob Picco bpi...@meloft.net
Date: Tue, 24 Mar 2015 10:57:53 -0400
Seems solid with 2.6.39 on M7-4. Jalapeño is happy with current sparc.git.
Thanks for all the testing, it's been integrated into the -stable
queues as well.
David Miller wrote: [Mon Mar 23 2015, 12:25:30PM EDT]
From: David Miller da...@davemloft.net
Date: Sun, 22 Mar 2015 22:19:06 -0400 (EDT)
I'll work on a fix.
Ok, here is what I committed. David et al., let me know if you still
see the crashes with this applied.
Of course, I'll
On 3/23/15 1:35 PM, David Miller wrote:
From: David Ahern
Date: Mon, 23 Mar 2015 11:34:34 -0600
seems like a formality at this point, but this resolves the panic on
the M7-based ldom and baremetal. The T5-8 failed to boot, but it could
be a different problem.
Specifically, does the T5-8
From: "John Stoffel"
Date: Mon, 23 Mar 2015 15:56:02 -0400
>> "David" == David Miller writes:
>
> David> From: "John Stoffel"
> David> Date: Mon, 23 Mar 2015 12:51:03 -0400
>
>>> Would it make sense to have some memmove()/memcopy() tests on bootup
>>> to catch problems like this? I know
> "David" == David Miller writes:
David> From: "John Stoffel"
David> Date: Mon, 23 Mar 2015 12:51:03 -0400
>> Would it make sense to have some memmove()/memcopy() tests on bootup
>> to catch problems like this? I know this is a strange case, and
>> probably not too common, but how hard
From: Linus Torvalds
Date: Mon, 23 Mar 2015 12:47:49 -0700
> On Mon, Mar 23, 2015 at 12:08 PM, David Miller wrote:
>>
>> Sure you could do that in C, but I really want to avoid using memcpy()
>> if dst and src overlap in any way at all.
>>
>> Said another way, I don't want to codify that "64"
On Mon, Mar 23, 2015 at 12:08 PM, David Miller wrote:
>
> Sure you could do that in C, but I really want to avoid using memcpy()
> if dst and src overlap in any way at all.
>
> Said another way, I don't want to codify that "64" thing. The next
> chip could do 128 byte initializing stores.
But
From: David Ahern
Date: Mon, 23 Mar 2015 11:34:34 -0600
> seems like a formality at this point, but this resolves the panic on
> the M7-based ldom and baremetal. The T5-8 failed to boot, but it could
> be a different problem.
Specifically, does the T5-8 boot without my patch applied?
From: "John Stoffel"
Date: Mon, 23 Mar 2015 12:51:03 -0400
> Would it make sense to have some memmove()/memcopy() tests on bootup
> to catch problems like this? I know this is a strange case, and
> probably not too common, but how hard would it be to wire up tests
> that go through 1 to 128
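The kind of boot-time check suggested above could look roughly like the sketch below. It is only an illustration in user-space C, not kernel code: the names `selftest_move`, `ref_move`, `naive_forward_copy` and the 128-byte bound are assumptions made for the example. It compares a candidate copy routine against a reference that copies through a temporary buffer, for every length from 1 to 128 and every relative placement of dst and src within that range.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_LEN = 128, BUF = 3 * MAX_LEN };

typedef void *(*move_fn)(void *, const void *, size_t);

/* Fill a buffer with a pattern in which neighbouring bytes differ. */
static void fill(unsigned char *buf)
{
    for (size_t i = 0; i < BUF; i++)
        buf[i] = (unsigned char)(i * 7 + 3);
}

/* Reference move: goes through a temporary, so overlap cannot matter. */
static void ref_move(unsigned char *dst, const unsigned char *src, size_t n)
{
    unsigned char tmp[MAX_LEN];

    for (size_t i = 0; i < n; i++)
        tmp[i] = src[i];
    for (size_t i = 0; i < n; i++)
        dst[i] = tmp[i];
}

/* Negative control: a forward-only byte copy, wrong when dst > src overlap. */
static void *naive_forward_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

/* Returns the number of (length, placement) combinations that mismatch. */
static unsigned selftest_move(move_fn fn)
{
    static unsigned char got[BUF], want[BUF];
    unsigned failures = 0;

    for (size_t len = 1; len <= MAX_LEN; len++) {
        /* src is fixed at offset MAX_LEN; dst slides from below to above it */
        for (size_t dst = 0; dst <= 2 * MAX_LEN; dst++) {
            fill(got);
            fill(want);
            fn(got + dst, got + MAX_LEN, len);
            ref_move(want + dst, want + MAX_LEN, len);
            if (memcmp(got, want, BUF) != 0)
                failures++;
        }
    }
    return failures;
}
```

Calling `selftest_move(memmove)` should report zero failures on a correct implementation, while `selftest_move(naive_forward_copy)` reports a non-zero count, which shows the harness can actually catch an overlap bug.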
From: Linus Torvalds
Date: Mon, 23 Mar 2015 10:00:02 -0700
> Maybe the code could be something like
>
> void *memmove(void *dst, const void *src, size_t n)
> {
> // non-overlapping cases
> if (src + n <= dst)
> return memcpy(dst, src, n);
> if (dst +
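For comparison, here is a minimal sketch of a C memmove() that only hands off to memcpy() when the two regions do not overlap at all, which is the constraint David Miller asks for elsewhere in the thread, and that also returns early on zero-length calls. It is an assumption-laden illustration, not the code that was committed: `sketch_memmove` is a hypothetical name, the overlapping cases fall back to plain byte loops, and address wrap-around in `sa + n` is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper for illustration; not the kernel's memmove(). */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uintptr_t da = (uintptr_t)dst;
    uintptr_t sa = (uintptr_t)src;

    /* Zero-length (or identical-pointer) calls have nothing to do. */
    if (n == 0 || da == sa)
        return dst;

    /* Non-overlapping regions: memcpy() is safe regardless of how it copies. */
    if (sa + n <= da || da + n <= sa)
        return memcpy(dst, src, n);

    if (da < sa) {
        /* Overlap with dst below src: copy forward, one byte at a time. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* Overlap with dst above src: copy backward, one byte at a time. */
        for (size_t i = n; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

The byte loops are slow but make no assumption about cache-line size, so nothing like the "64" constant needs to be written into the routine.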
On 3/23/15 10:25 AM, David Miller wrote:
[PATCH] sparc64: Fix several bugs in memmove().
Firstly, handle zero length calls properly. Believe it or not there
are a few of these happening during early boot.
Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src.
On Mon, Mar 23, 2015 at 9:25 AM, David Miller wrote:
>
> Ok, here is what I committed.
So I wonder - looking at that assembly, I get the feeling that it
isn't any better code than gcc could generate from simple C code.
Would it perhaps be better to turn memmove() into C?
That's particularly
David>
David> [PATCH] sparc64: Fix several bugs in memmove().
David> Firstly, handle zero length calls properly. Believe it or not there
David> are a few of these happening during early boot.
David> Next, we can't just drop to a memcpy() call in the forward copy case
From: David Miller
Date: Sun, 22 Mar 2015 22:19:06 -0400 (EDT)
> I'll work on a fix.
Ok, here is what I committed. David et al., let me know if you still
see the crashes with this applied.
Of course, I'll queue this up for -stable as well.
Thanks!
[PATCH] sparc64: Fix
From: John Stoffel j...@stoffel.org
Date: Mon, 23 Mar 2015 12:51:03 -0400
Would it make sense to have some memmove()/memcopy() tests on bootup
to catch problems like this? I know this is a strange case, and
probably not too common, but how hard would it be to wire up tests
that go through 1
David == David Miller da...@davemloft.net writes:
David From: John Stoffel j...@stoffel.org
David Date: Mon, 23 Mar 2015 12:51:03 -0400
Would it make sense to have some memmove()/memcopy() tests on bootup
to catch problems like this? I know this is a strange case, and
probably not too
From: John Stoffel j...@stoffel.org
Date: Mon, 23 Mar 2015 15:56:02 -0400
David == David Miller da...@davemloft.net writes:
David From: John Stoffel j...@stoffel.org
David Date: Mon, 23 Mar 2015 12:51:03 -0400
Would it make sense to have some memmove()/memcopy() tests on bootup
to catch
David
David [PATCH] sparc64: Fix several bugs in memmove().
David Firstly, handle zero length calls properly. Believe it or not there
David are a few of these happening during early boot.
David Next, we can't just drop to a memcpy() call in the forward copy case
David
On Mon, Mar 23, 2015 at 12:08 PM, David Miller da...@davemloft.net wrote:
Sure you could do that in C, but I really want to avoid using memcpy()
if dst and src overlap in any way at all.
Said another way, I don't want to codify that 64 thing. The next
chip could do 128 byte initializing
From: Linus Torvalds torva...@linux-foundation.org
Date: Mon, 23 Mar 2015 12:47:49 -0700
On Mon, Mar 23, 2015 at 12:08 PM, David Miller da...@davemloft.net wrote:
Sure you could do that in C, but I really want to avoid using memcpy()
if dst and src overlap in any way at all.
Said another
On 3/23/15 1:35 PM, David Miller wrote:
From: David Ahern david.ah...@oracle.com
Date: Mon, 23 Mar 2015 11:34:34 -0600
seems like a formality at this point, but this resolves the panic on
the M7-based ldom and baremetal. The T5-8 failed to boot, but it could
be a different problem.
From: Linus Torvalds torva...@linux-foundation.org
Date: Mon, 23 Mar 2015 10:00:02 -0700
Maybe the code could be something like
void *memmove(void *dst, const void *src, size_t n)
{
// non-overlapping cases
if (src + n <= dst)
return memcpy(dst, src,
From: David Ahern david.ah...@oracle.com
Date: Mon, 23 Mar 2015 11:34:34 -0600
seems like a formality at this point, but this resolves the panic on
the M7-based ldom and baremetal. The T5-8 failed to boot, but it could
be a different problem.
Specifically, does the T5-8 boot without my patch
From: David Miller da...@davemloft.net
Date: Sun, 22 Mar 2015 22:19:06 -0400 (EDT)
I'll work on a fix.
Ok, here is what I committed. David et al., let me know if you still
see the crashes with this applied.
Of course, I'll queue this up for -stable as well.
Thanks!
On Mon, Mar 23, 2015 at 9:25 AM, David Miller da...@davemloft.net wrote:
Ok, here is what I committed.
So I wonder - looking at that assembly, I get the feeling that it
isn't any better code than gcc could generate from simple C code.
Would it perhaps be better to turn memmove() into C?
On 3/23/15 10:25 AM, David Miller wrote:
[PATCH] sparc64: Fix several bugs in memmove().
Firstly, handle zero length calls properly. Believe it or not there
are a few of these happening during early boot.
Next, we can't just drop to a memcpy() call in the forward copy case
where dst <= src.
Never mind, I think I figured out the problem.
It's the cache initializing stores: we can't do overlapping
copies where dst <= src in all cases because of them.
A store to an address that is a multiple of the cache line size
(which for these instructions is 64 bytes) clears that whole line.
But when we're
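The hazard described here can be reproduced with a toy model. The sketch below is not the sparc64 memcpy(); it is a simplified simulation under stated assumptions: memory is a byte array indexed by offset, the copy moves one 8-byte word per step, and `init_store` (a hypothetical helper) zeroes the whole 64-byte line whenever it writes to a line-aligned offset. All function names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { LINE = 64, WORD = 8 };

/* Toy cache-initializing store: a write to a line-aligned offset first
 * zeroes the whole line instead of preserving its old contents. */
static void init_store(unsigned char *mem, size_t off, const unsigned char *word)
{
    if (off % LINE == 0)
        memset(mem + off, 0, LINE);
    memcpy(mem + off, word, WORD);
}

/* Forward, word-at-a-time copy within one buffer using init_store().
 * n must be a multiple of WORD; dst and src are offsets into mem. */
static void toy_forward_copy(unsigned char *mem, size_t dst, size_t src, size_t n)
{
    unsigned char word[WORD];

    for (size_t i = 0; i < n; i += WORD) {
        memcpy(word, mem + src + i, WORD);  /* load one source word */
        init_store(mem, dst + i, word);     /* store it, maybe clearing a line */
    }
}

/* Returns 1 if the toy copy of 128 bytes from offset `distance` down to
 * offset 0 gives the same result as memmove(), 0 otherwise. */
static int toy_copy_is_correct(size_t distance)
{
    unsigned char a[512], b[512];

    for (size_t i = 0; i < sizeof(a); i++)
        a[i] = b[i] = (unsigned char)(i % 251 + 1);  /* never zero */

    toy_forward_copy(a, 0, distance, 128);
    memmove(b, b + distance, 128);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

In this model, `toy_copy_is_correct(8)` fails because the first store clears source bytes that have not been loaded yet, while distances of 64 bytes or more, such as `toy_copy_is_correct(64)`, copy correctly.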
From: David Ahern
Date: Sun, 22 Mar 2015 18:03:30 -0600
> On 3/22/15 5:54 PM, David Miller wrote:
>>> I just put it on 4.0.0-rc4 and ditto -- problem goes away, so it
>>> clearly suggests the memcpy or memmove are the root cause.
>>
>> Thanks, didn't notice that.
>>
>> So, something is amuck.
>
On 3/22/15 5:54 PM, David Miller wrote:
I just put it on 4.0.0-rc4 and ditto -- problem goes away, so it
clearly suggests the memcpy or memmove are the root cause.
Thanks, didn't notice that.
So, something is amuck.
to continue to refine the problem ... I modified only the memmove lines
From: Linus Torvalds
Date: Sun, 22 Mar 2015 16:49:51 -0700
> On Sun, Mar 22, 2015 at 3:23 PM, David Miller wrote:
>>
>> Yes, using VIS how we do is alright, and in fact I did an audit of
>> this about 1 year ago. This is another one of those "if this is
>> wrong, so much stuff would break"
>
From: David Ahern
Date: Sun, 22 Mar 2015 17:35:49 -0600
> I don't know if you caught Bob's message; he has a hack to bypass
> memcpy and memmove in mm/slab.c and use a for loop to move entries. With
> the hack he is not seeing the problem.
>
> This is the hack:
>
> +static void move_entries(void
On Sun, Mar 22, 2015 at 3:23 PM, David Miller wrote:
>
> Yes, using VIS how we do is alright, and in fact I did an audit of
> this about 1 year ago. This is another one of those "if this is
> wrong, so much stuff would break"
Maybe. But it does seem like Bob Picco has narrowed it down to
On 3/22/15 4:23 PM, David Miller wrote:
I don't even know which version of memcpy ends up being used on M7.
Some of them do things like use VIS. I can follow some regular sparc
asm, there's no way I'm even *looking* at that. Is it really ok to use
VIS registers in random contexts?
Yes, using
From: Linus Torvalds
Date: Sun, 22 Mar 2015 12:47:08 -0700
> Which was why I was asking how sure you are that memcpy *always*
> copies from low to high.
Yeah I'm pretty sure.
> I don't even know which version of memcpy ends up being used on M7.
> Some of them do things like use VIS. I can
On Sun, Mar 22, 2015 at 10:36 AM, David Miller wrote:
>
> And they end up using that byte-at-a-time code, since SLAB and SLUB
> do memmove() calls of the form:
>
> memmove(X + N, X, LEN);
Actually, the common case in slab is overlapping but of the form
memmove(p, p+x, len);
which
David Miller wrote: [Sun Mar 22 2015, 01:36:03PM EDT]
> From: Linus Torvalds
> Date: Sat, 21 Mar 2015 11:49:12 -0700
>
> > Davem? I don't read sparc assembly, so I'm *really* not going to try
> > to verify that (a) all the memcpy implementations always copy
> > low-to-high and (b) that I
From: Linus Torvalds
Date: Sat, 21 Mar 2015 11:49:12 -0700
> Davem? I don't read sparc assembly, so I'm *really* not going to try
> to verify that (a) all the memcpy implementations always copy
> low-to-high and (b) that I even read the address comparisons in
> memmove.S right.
All of the sparc
From: Linus Torvalds torva...@linux-foundation.org
Date: Sun, 22 Mar 2015 16:49:51 -0700
On Sun, Mar 22, 2015 at 3:23 PM, David Miller da...@davemloft.net wrote:
Yes, using VIS how we do is alright, and in fact I did an audit of
this about 1 year ago. This is another one of those if this is
From: Linus Torvalds torva...@linux-foundation.org
Date: Sat, 21 Mar 2015 11:49:12 -0700
Davem? I don't read sparc assembly, so I'm *really* not going to try
to verify that (a) all the memcpy implementations always copy
low-to-high and (b) that I even read the address comparisons in
memmove.S
David Miller wrote: [Sun Mar 22 2015, 01:36:03PM EDT]
From: Linus Torvalds torva...@linux-foundation.org
Date: Sat, 21 Mar 2015 11:49:12 -0700
Davem? I don't read sparc assembly, so I'm *really* not going to try
to verify that (a) all the memcpy implementations always copy
On Sun, Mar 22, 2015 at 10:36 AM, David Miller da...@davemloft.net wrote:
And they end up using that byte-at-a-time code, since SLAB and SLUB
do memmove() calls of the form:
memmove(X + N, X, LEN);
Actually, the common case in slab is overlapping but of the form
memmove(p, p+x,
From: David Ahern david.ah...@oracle.com
Date: Sun, 22 Mar 2015 17:35:49 -0600
I don't know if you caught Bob's message; he has a hack to bypass
memcpy and memmove in mm/slab.c and use a for loop to move entries. With
the hack he is not seeing the problem.
This is the hack:
+static void
From: Linus Torvalds torva...@linux-foundation.org
Date: Sun, 22 Mar 2015 12:47:08 -0700
Which was why I was asking how sure you are that memcpy *always*
copies from low to high.
Yeah I'm pretty sure.
I don't even know which version of memcpy ends up being used on M7.
Some of them do things
From: David Ahern david.ah...@oracle.com
Date: Sun, 22 Mar 2015 18:03:30 -0600
On 3/22/15 5:54 PM, David Miller wrote:
I just put it on 4.0.0-rc4 and ditto -- problem goes away, so it
clearly suggests the memcpy or memmove are the root cause.
Thanks, didn't notice that.
So, something is
Never mind, I think I figured out the problem.
It's the cache initializing stores: we can't do overlapping
copies where dst <= src in all cases because of them.
A store to an address that is a multiple of the cache line size
(which for these instructions is 64 bytes) clears that whole line.
But when we're doing
On Sun, Mar 22, 2015 at 3:23 PM, David Miller da...@davemloft.net wrote:
Yes, using VIS how we do is alright, and in fact I did an audit of
this about 1 year ago. This is another one of those "if this is
wrong, so much stuff would break"
Maybe. But it does seem like Bob Picco has narrowed it
On Sat, Mar 21, 2015 at 10:45 AM, David Ahern wrote:
>
> You raise a lot of valid questions and something to look into. But if the
> root cause were such a fundamental issue (CPU memory ordering, compiler bug,
> etc) why would it only occur on this one code path -- free with SLAB and
> NUMA --
On 3/20/15 6:47 PM, Linus Torvalds wrote:
Here's another data point: If I disable NUMA I don't see the problem.
Performance drops, but no NULL pointer splats which would have been panics.
So the NUMA case triggers the per-node "n->shared" logic, which
*should* be protected by "n->list_lock".
On 3/20/15 6:47 PM, Linus Torvalds wrote:
Here's another data point: If I disable NUMA I don't see the problem.
Performance drops, but no NULL pointer splats which would have been panics.
So the NUMA case triggers the per-node "n->shared" logic, which
*should* be protected by "n->list_lock". Maybe
On Sat, Mar 21, 2015 at 10:45 AM, David Ahern david.ah...@oracle.com wrote:
You raise a lot of valid questions and something to look into. But if the
root cause were such a fundamental issue (CPU memory ordering, compiler bug,
etc) why would it only occur on this one code path -- free with
On Fri, Mar 20, 2015 at 5:18 PM, David Ahern wrote:
> On 3/20/15 4:49 PM, David Ahern wrote:
>>
>> I did ask around and apparently this bug is hit only with the new M7
>> processors. DaveM: that's why you are not hitting this.
Quite frankly, this smells even more like an architecture bug. It
On 3/20/15 6:34 PM, David Rientjes wrote:
On Fri, 20 Mar 2015, David Ahern wrote:
Here's another data point: If I disable NUMA I don't see the problem.
Performance drops, but no NULL pointer splats which would have been panics.
The 128 cpu ldom with NUMA enabled shows the problem every single
On Fri, 20 Mar 2015, David Ahern wrote:
> Here's another data point: If I disable NUMA I don't see the problem.
> Performance drops, but no NULL pointer splats which would have been panics.
>
> The 128 cpu ldom with NUMA enabled shows the problem every single time I do a
> kernel compile (-j
On 3/20/15 4:49 PM, David Ahern wrote:
On 3/20/15 3:17 PM, Linus Torvalds wrote:
In other words, if I read that sparc asm right (and it is very likely
that I do *not*), then "objp" is NULL, and that's why you crash.
That does appear to be why. I put a WARN_ON before
clear_obj_pfmemalloc() if
On 3/20/15 3:17 PM, Linus Torvalds wrote:
In other words, if I read that sparc asm right (and it is very likely
that I do *not*), then "objp" is NULL, and that's why you crash.
That does appear to be why. I put a WARN_ON before
clear_obj_pfmemalloc() if objpp[i] is NULL. I got 2 splats during
On Fri, Mar 20, 2015 at 8:07 AM, David Ahern wrote:
> Instruction DUMP: 86230003 8730f00d 8728f006 <d658c007> 8600c007 8e0ac008
> 2ac1c002 c658e030 d458e028
Ok, so it's d658c007 that faults, which is that
ldx [ %g3 + %g7 ], %o3
instruction.
Looking at your objdump:
> free_block():
>
From: David Ahern
Date: Fri, 20 Mar 2015 13:54:09 -0600
> Interesting. With -j <64 and talking softly it completes. But -j 128
> and higher always ends in a panic.
Please share more details of your configuration.
On 03/20/2015 09:58 AM, Linus Torvalds wrote:
> 128 cpu's is still "unusual", of course, but by no means unheard of,
> and I'd have expected others to report it too if it was easy to
> trigger on x86-64.
FWIW, I configured a kernel with SLAB and kicked off a bunch of compiles
on a 160-thread
On 3/20/15 1:47 PM, David Miller wrote:
From: David Ahern
Date: Fri, 20 Mar 2015 12:05:05 -0600
DaveM: do you mind if I submit a patch to change the default for sparc
to SLUB?
I think we're jumping the gun about all of this, and doing anything
with default Kconfig settings would be entirely
From: David Ahern
Date: Fri, 20 Mar 2015 12:05:05 -0600
> DaveM: do you mind if I submit a patch to change the default for sparc
> to SLUB?
I think we're jumping the gun about all of this, and doing anything
with default Kconfig settings would be entirely premature until we
know what the real
From: Linus Torvalds
Date: Fri, 20 Mar 2015 09:58:25 -0700
> 128 cpu's is still "unusual"
As unusual as the system I do all of my kernel builds on :-)
On 3/20/15 12:53 PM, Linus Torvalds wrote:
SLUB should definitely be considered a stable allocator. It's the
default allocator for at least Fedora, and that presumably means all
of Redhat.
SuSE seems to use SLAB still, though, so it must be getting lots of
testing on x86 too.
Did you test
On Fri, Mar 20, 2015 at 11:05 AM, David Ahern wrote:
>
> Evidently, it is a well known problem internally that goes back to at least
> 2.6.39.
>
> To this point I have not paid attention to the allocators. At what point is
> SLUB considered stable for large systems? Is 2.6.39 stable?
SLUB should
On 3/20/15 10:58 AM, Linus Torvalds wrote:
That said, SLAB is probably also almost unheard of in high-CPU
configurations, since slub has all the magical unlocked lists etc for
scalability. So maybe it's a generic SLAB bug, and nobody with lots of
CPU's is testing SLAB.
Evidently, it is a well
On Fri, Mar 20, 2015 at 9:53 AM, David Ahern wrote:
>
> I haven't tried 3.19 yet. Just backed up to 3.18 and it shows the same
> problem. And I can reproduce the 4.0 crash in a 128 cpu ldom (VM).
Ok, so if 3.18 also has it, then trying 3.19 is pointless, this is
obviously an old problem. Which
On 3/20/15 10:48 AM, Linus Torvalds wrote:
[ Added Davem and the sparc mailing list, since it happens on sparc
and that just makes me suspicious ]
On Fri, Mar 20, 2015 at 8:07 AM, David Ahern wrote:
I can easily reproduce the panic below doing a kernel build with make -j N,
N=128, 256, etc.
[ Added Davem and the sparc mailing list, since it happens on sparc
and that just makes me suspicious ]
On Fri, Mar 20, 2015 at 8:07 AM, David Ahern wrote:
> I can easily reproduce the panic below doing a kernel build with make -j N,
> N=128, 256, etc. This is a 1024 cpu system running
On Fri, Mar 20, 2015 at 5:18 PM, David Ahern david.ah...@oracle.com wrote:
On 3/20/15 4:49 PM, David Ahern wrote:
I did ask around and apparently this bug is hit only with the new M7
processors. DaveM: that's why you are not hitting this.
Quite frankly, this smells even more like an
On 3/20/15 4:49 PM, David Ahern wrote:
On 3/20/15 3:17 PM, Linus Torvalds wrote:
In other words, if I read that sparc asm right (and it is very likely
that I do *not*), then objp is NULL, and that's why you crash.
That does appear to be why. I put a WARN_ON before
clear_obj_pfmemalloc() if
On Fri, 20 Mar 2015, David Ahern wrote:
Here's another data point: If I disable NUMA I don't see the problem.
Performance drops, but no NULL pointer splats which would have been panics.
The 128 cpu ldom with NUMA enabled shows the problem every single time I do a
kernel compile (-j 128).
On Fri, Mar 20, 2015 at 11:05 AM, David Ahern david.ah...@oracle.com wrote:
Evidently, it is a well known problem internally that goes back to at least
2.6.39.
To this point I have not paid attention to the allocators. At what point is
SLUB considered stable for large systems? Is 2.6.39
On Fri, Mar 20, 2015 at 9:53 AM, David Ahern david.ah...@oracle.com wrote:
I haven't tried 3.19 yet. Just backed up to 3.18 and it shows the same
problem. And I can reproduce the 4.0 crash in a 128 cpu ldom (VM).
Ok, so if 3.18 also has it, then trying 3.19 is pointless, this is
obviously an
[ Added Davem and the sparc mailing list, since it happens on sparc
and that just makes me suspicious ]
On Fri, Mar 20, 2015 at 8:07 AM, David Ahern david.ah...@oracle.com wrote:
I can easily reproduce the panic below doing a kernel build with make -j N,
N=128, 256, etc. This is a 1024 cpu
On 3/20/15 10:48 AM, Linus Torvalds wrote:
[ Added Davem and the sparc mailing list, since it happens on sparc
and that just makes me suspicious ]
On Fri, Mar 20, 2015 at 8:07 AM, David Ahern david.ah...@oracle.com wrote:
I can easily reproduce the panic below doing a kernel build with make -j
From: David Ahern david.ah...@oracle.com
Date: Fri, 20 Mar 2015 12:05:05 -0600
DaveM: do you mind if I submit a patch to change the default for sparc
to SLUB?
I think we're jumping the gun about all of this, and doing anything
with default Kconfig settings would be entirely premature until we
On 03/20/2015 09:58 AM, Linus Torvalds wrote:
128 cpu's is still unusual, of course, but by no means unheard of,
and I'd have expected others to report it too if it was easy to
trigger on x86-64.
FWIW, I configured a kernel with SLAB and kicked off a bunch of compiles
on a 160-thread x86_64
From: Linus Torvalds torva...@linux-foundation.org
Date: Fri, 20 Mar 2015 09:58:25 -0700
128 cpu's is still unusual
As unusual as the system I do all of my kernel builds on :-)
On 3/20/15 1:47 PM, David Miller wrote:
From: David Ahern david.ah...@oracle.com
Date: Fri, 20 Mar 2015 12:05:05 -0600
DaveM: do you mind if I submit a patch to change the default for sparc
to SLUB?
I think we're jumping the gun about all of this, and doing anything
with default Kconfig
From: David Ahern david.ah...@oracle.com
Date: Fri, 20 Mar 2015 13:54:09 -0600
Interesting. With -j <64 and talking softly it completes. But -j 128
and higher always ends in a panic.
Please share more details of your configuration.
On Fri, Mar 20, 2015 at 8:07 AM, David Ahern david.ah...@oracle.com wrote:
Instruction DUMP: 86230003 8730f00d 8728f006 d658c007 8600c007 8e0ac008
2ac1c002 c658e030 d458e028
Ok, so it's d658c007 that faults, which is that
ldx [ %g3 + %g7 ], %o3
instruction.
Looking at your
On 3/20/15 3:17 PM, Linus Torvalds wrote:
In other words, if I read that sparc asm right (and it is very likely
that I do *not*), then objp is NULL, and that's why you crash.
That does appear to be why. I put a WARN_ON before
clear_obj_pfmemalloc() if objpp[i] is NULL. I got 2 splats during