Re: [lkp-robot] [x86/kconfig] 81d3871900: BUG:unable_to_handle_kernel

2017-10-13 Thread Andy Lutomirski
On Fri, Oct 13, 2017 at 12:09 PM, Linus Torvalds
 wrote:
> On Fri, Oct 13, 2017 at 6:56 AM, Andrey Ryabinin
>  wrote:
>>
>> This could be fixed by a s/vmovdqa/vmovdqu change like below, but maybe
>> the right fix would be to align the data properly?
>
> I suspect anything that has the SHA extensions should also do
> unaligned loads efficiently. The whole "aligned only" model is broken.
> It's just doing two loads from the state pointer, there's likely no
> point in trying to align it.
>
> So your patch looks fine, but maybe somebody could add the required
> alignment to the sha256 context allocation (which I don't know where
> it is).

IIRC if we try the latter, then we'll risk hitting the #*!&@% gcc bug
that mostly prevents 16-byte alignment from working on GCC before 4.8
or so.  That way lies debugging disasters.


Re: [lkp-robot] [x86/kconfig] 81d3871900: BUG:unable_to_handle_kernel

2017-10-13 Thread Linus Torvalds
On Fri, Oct 13, 2017 at 6:56 AM, Andrey Ryabinin
 wrote:
>
> This could be fixed by a s/vmovdqa/vmovdqu change like below, but maybe
> the right fix would be to align the data properly?

I suspect anything that has the SHA extensions should also do
unaligned loads efficiently. The whole "aligned only" model is broken.
It's just doing two loads from the state pointer, there's likely no
point in trying to align it.

So your patch looks fine, but maybe somebody could add the required
alignment to the sha256 context allocation (which I don't know where
it is).

But yeah, that other SLOB panic looks unrelated to this.

   Linus


Re: [lkp-robot] [x86/kconfig] 81d3871900: BUG:unable_to_handle_kernel

2017-10-13 Thread Josh Poimboeuf
On Fri, Oct 13, 2017 at 04:56:43PM +0300, Andrey Ryabinin wrote:
> On 10/13/2017 07:45 AM, Josh Poimboeuf wrote:
> > On Thu, Oct 12, 2017 at 12:05:04PM -0500, Christopher Lameter wrote:
> >> On Wed, 11 Oct 2017, Josh Poimboeuf wrote:
> >>
> >>> I failed to add the slab maintainers to CC on the last attempt.  Trying
> >>> again.
> >>
> >>
> >> Hmmm... Yea. SLOB is rarely used and tested. Good illustration of a simple
> >> allocator and the K&R mechanism that was used in the early kernels.
> >>
> >>> Adding the slub maintainers.  Is slob still supposed to work?
> >>
> >> Have not seen anyone using it in a decade or so.
> >>
> >> Does the same config with SLUB and slub_debug on the commandline run
> >> cleanly?
> >>
> >>> I have no idea how that crypto panic could be related to slob, but
> >>> at least it goes away when I switch to slub.
> >>
> >> Can you run SLUB with full debug? specify slub_debug on the commandline or
> >> set CONFIG_SLUB_DEBUG_ON
> > 
> > Oddly enough, with CONFIG_SLUB+slub_debug, I get the same crypto panic I
> > got with CONFIG_SLOB.  The trapping instruction is:
> > 
> >   vmovdqa 0x140(%rdi),%xmm0
> 
> 
> It's an unaligned access. Look at %rdi. vmovdqa requires 16-byte alignment.
> Apparently, something fed kmalloc()'ed data here. But kmalloc() guarantees
> only sizeof(unsigned long) alignment. slub_debug changes slub's object
> layout, so what happened to be 16-byte aligned without slub_debug may
> become 8-byte aligned with slub_debug on.
> 
>
> > I'll try to bisect it tomorrow.  It at least goes back to v4.10.
> 
> Probably no point. I bet this bug was always here (since this code was added).
> 
> This could be fixed by a s/vmovdqa/vmovdqu change like below, but maybe
> the right fix would be to align the data properly?
> 
> ---
>  arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
> index 8fe6338bcc84..7fd5d9b568c7 100644
> --- a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
> +++ b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
> @@ -155,8 +155,8 @@ LABEL skip_ %I
>  .endr
>  
>   # Find min length
> - vmovdqa _lens+0*16(state), %xmm0
> - vmovdqa _lens+1*16(state), %xmm1
> + vmovdqu _lens+0*16(state), %xmm0
> + vmovdqu _lens+1*16(state), %xmm1
>  
>   vpminud %xmm1, %xmm0, %xmm2 # xmm2 has {D,C,B,A}
>   vpalignr $8, %xmm2, %xmm3, %xmm3# xmm3 has {x,x,D,C}
> @@ -176,8 +176,8 @@ LABEL skip_ %I
>   vpsubd  %xmm2, %xmm0, %xmm0
>   vpsubd  %xmm2, %xmm1, %xmm1
>  
> - vmovdqa %xmm0, _lens+0*16(state)
> - vmovdqa %xmm1, _lens+1*16(state)
> + vmovdqu %xmm0, _lens+0*16(state)
> + vmovdqu %xmm1, _lens+1*16(state)
>  
>   # "state" and "args" are the same address, arg1
>   # len is arg2
> -- 
> 2.13.6

Makes sense.  I can confirm that the above patch fixes the panic.

-- 
Josh


Re: [lkp-robot] [x86/kconfig] 81d3871900: BUG:unable_to_handle_kernel

2017-10-13 Thread Andrey Ryabinin
On 10/13/2017 07:45 AM, Josh Poimboeuf wrote:
> On Thu, Oct 12, 2017 at 12:05:04PM -0500, Christopher Lameter wrote:
>> On Wed, 11 Oct 2017, Josh Poimboeuf wrote:
>>
>>> I failed to add the slab maintainers to CC on the last attempt.  Trying
>>> again.
>>
>>
>> Hmmm... Yea. SLOB is rarely used and tested. Good illustration of a simple
>> allocator and the K&R mechanism that was used in the early kernels.
>>
>>> Adding the slub maintainers.  Is slob still supposed to work?
>>
>> Have not seen anyone using it in a decade or so.
>>
>> Does the same config with SLUB and slub_debug on the commandline run
>> cleanly?
>>
>>> I have no idea how that crypto panic could be related to slob, but
>>> at least it goes away when I switch to slub.
>>
>> Can you run SLUB with full debug? specify slub_debug on the commandline or
>> set CONFIG_SLUB_DEBUG_ON
> 
> Oddly enough, with CONFIG_SLUB+slub_debug, I get the same crypto panic I
> got with CONFIG_SLOB.  The trapping instruction is:
> 
>   vmovdqa 0x140(%rdi),%xmm0


It's an unaligned access. Look at %rdi. vmovdqa requires 16-byte alignment.
Apparently, something fed kmalloc()'ed data here. But kmalloc() guarantees only
sizeof(unsigned long) alignment. slub_debug changes slub's object layout, so
what happened to be 16-byte aligned without slub_debug may become 8-byte
aligned with slub_debug on.

   
> I'll try to bisect it tomorrow.  It at least goes back to v4.10.

Probably no point. I bet this bug was always here (since this code was added).

This could be fixed by a s/vmovdqa/vmovdqu change like below, but maybe the
right fix would be to align the data properly?

---
 arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
index 8fe6338bcc84..7fd5d9b568c7 100644
--- a/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
+++ b/arch/x86/crypto/sha256-mb/sha256_mb_mgr_flush_avx2.S
@@ -155,8 +155,8 @@ LABEL skip_ %I
 .endr
 
# Find min length
-   vmovdqa _lens+0*16(state), %xmm0
-   vmovdqa _lens+1*16(state), %xmm1
+   vmovdqu _lens+0*16(state), %xmm0
+   vmovdqu _lens+1*16(state), %xmm1
 
vpminud %xmm1, %xmm0, %xmm2 # xmm2 has {D,C,B,A}
vpalignr $8, %xmm2, %xmm3, %xmm3# xmm3 has {x,x,D,C}
@@ -176,8 +176,8 @@ LABEL skip_ %I
vpsubd  %xmm2, %xmm0, %xmm0
vpsubd  %xmm2, %xmm1, %xmm1
 
-   vmovdqa %xmm0, _lens+0*16(state)
-   vmovdqa %xmm1, _lens+1*16(state)
+   vmovdqu %xmm0, _lens+0*16(state)
+   vmovdqu %xmm1, _lens+1*16(state)
 
# "state" and "args" are the same address, arg1
# len is arg2
-- 
2.13.6



> I'm
> not really sure whether this panic is related to SLUB or SLOB at all.
> (Though the original panic reported upthread by the kernel test robot
> *does* look SLOB related.)
> 
>   general protection fault:  [#1] PREEMPT SMP
>   Modules linked in:
>   CPU: 0 PID: 58 Comm: kworker/0:1 Not tainted 4.13.0 #81
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
>   Workqueue: crypto mcryptd_flusher
>   task: 880139108040 task.stack: c982c000
>   RIP: 0010:skip_7+0x0/0x67
>   RSP: 0018:c982fd88 EFLAGS: 00010246
>   RAX: 88013834172c RBX: f7654321 RCX: 0003
>   RDX:  RSI: 81d254f9 RDI: 8801381b1a88
>   RBP: c982fd90 R08:  R09: 0001
>   R10: 0001 R11:  R12: 82392260
>   R13: 88013a7e6500 R14: fffb80f5 R15: 
>   FS:  () GS:88013a60() knlGS:
>   CS:  0010 DS:  ES:  CR0: 80050033
>   CR2: 7f88491ef914 CR3: 01e11000 CR4: 001406f0
>   Call Trace:
>sha256_ctx_mgr_flush+0x28/0x30
>sha256_mb_flusher+0x53/0x120
>mcryptd_flusher+0xc4/0xf0
>process_one_work+0x253/0x6b0
>worker_thread+0x4d/0x3b0
>? preempt_count_sub+0x9b/0x100
>kthread+0x133/0x150
>? process_one_work+0x6b0/0x6b0
>? kthread_create_on_node+0x70/0x70
>ret_from_fork+0x2a/0x40
>   Code: 89 87 30 01 00 00 c7 87 58 01 00 00 ff ff ff ff 48 83 bf a0 01 00 00 
> 00 75 11 48 89 87 38 01 00 00 c7 87 5c 01 00 00 ff ff ff ff  f9 6f 87 40 
> 01 00 00 c5 f9 6f 8f 50 01 00 00 c4 e2 79 3b d1
>   RIP: skip_7+0x0/0x67 RSP: c982fd88
>   ---[ end trace d89a1613b7d1b8bc ]---
>   BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:33
>   in_atomic(): 1, irqs_disabled(): 0, pid: 58, name: kworker/0:1
>   INFO: lockdep is turned off.
>   Preemption disabled at:
>   [] kernel_fpu_begin+0x13/0x20
>   CPU: 0 PID: 58 Comm: kworker/0:1 Tainted: G  D 4.13.0 #81
>   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1.fc26 04/01/2014
>   Workqueue: crypto