Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-02-14 Thread Herbert Xu
On Mon, Feb 13, 2017 at 09:20:48AM -0800, Tim Chen wrote:
> 
> Megha is now able to create a test setup that produces a
> problem similar to the one reported by Dmitry.  This patch
> did not completely fix it.  So maybe you can hold off on
> merging this patch to the mainline until we can develop a
> more complete fix.

Tim, please send them as follow-up patches.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-02-13 Thread Tim Chen
On Sat, 2017-02-11 at 18:50 +0800, Herbert Xu wrote:
> On Wed, Feb 01, 2017 at 10:45:02AM -0800, Tim Chen wrote:
> > 
> > 
> > One theory that Megha and I have is that perhaps the flusher
> > and regular computation updates are stepping on each other.
> > Can you try this patch and see if it helps?
> Patch applied.  Thanks.

Herbert,

Megha is now able to create a test setup that produces a
problem similar to the one reported by Dmitry.  This patch
did not completely fix it.  So maybe you can hold off on
merging this patch to the mainline until we can develop a
more complete fix.

Thanks.

Tim


Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-02-11 Thread Herbert Xu
On Wed, Feb 01, 2017 at 10:45:02AM -0800, Tim Chen wrote:
>
> One theory that Megha and I have is that perhaps the flusher
> and regular computation updates are stepping on each other.
> Can you try this patch and see if it helps?

Patch applied.  Thanks.


Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-02-02 Thread Tim Chen
On Thu, 2017-02-02 at 11:58 +0100, Dmitry Vyukov wrote:
> On Wed, Feb 1, 2017 at 7:45 PM, Tim Chen  wrote:
> > 
> > On Tue, Jan 31, 2017 at 02:16:31PM +0100, Dmitry Vyukov wrote:
> > > 
> > > Hello,
> > > 
> > > I am getting the following reports with low frequency while running
> > > syzkaller fuzzer. Unfortunately they are not reproducible and happen
> > > in a background thread, so it is difficult to extract any context on
> > > my side. I see only a few such crashes per week, so most likely it is
> > > some hard-to-trigger data race. The following reports are from mmotm
> > > tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
> > > fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
> > > happen?
> > > 
> > > BUG: unable to handle kernel NULL pointer dereference at 0060
> > > IP: [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> > > arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> > > PGD 1d2395067 [  220.874864] PUD 1d2860067
> > > Oops: 0002 [#1] SMP KASAN
> > > Dumping ftrace buffer:
> > >    (ftrace buffer empty)
> > > Modules linked in:
> > > CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
> > > Hardware name: Google Google Compute Engine/Google Compute Engine,
> > > BIOS Google 01/01/2011
> > > Workqueue: crypto mcryptd_queue_worker
> > > task: 8801d9f346c0 task.stack: 8801d9f08000
> > > RIP: 0010:[]  []
> > > sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> > > arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> > > RSP: 0018:8801d9f0eef8  EFLAGS: 00010202
> > > RAX:  RBX: 8801d7db1190 RCX: 0006
> > > RDX: 0001 RSI: 8801d9f34ee8 RDI: 8801d7db1040
> > > RBP: 8801d9f0f258 R08: 0001 R09: 0001
> > > R10: 0002 R11: 0003 R12: 8801d9f0f230
> > > R13: 8801c8bbc4e0 R14: 8801c8bbc530 R15: 8801d9f0ef70
> > > FS:  () GS:8801dc00() 
> > > knlGS:
> > > CS:  0010 DS:  ES:  CR0: 80050033
> > > CR2: 0060 CR3: 0001cc15a000 CR4: 001406f0
> > > DR0:  DR1:  DR2: 
> > > DR3:  DR6: fffe0ff0 DR7: 0400
> > > Stack:
> > >  8801d7db1040 813fa207 dc00 e8c0f238
> > >  0002 11003b3e1dea e8c0f218 8801d9f0f190
> > >  0282 e8c0f140 e8c0f220 41b58ab3
> > > Call Trace:
> > >  [] sha512_mb_update+0x2f7/0x4e0
> > > arch/x86/crypto/sha512-mb/sha512_mb.c:588
> > >  [] crypto_ahash_update include/crypto/hash.h:512 
> > > [inline]
> > >  [] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
> > >  [] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
> > >  [] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
> > >  [] process_one_work+0xbd0/0x1c10 
> > > kernel/workqueue.c:2096
> > >  [] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
> > >  [] kthread+0x323/0x3e0 kernel/kthread.c:209
> > >  [] ret_from_fork+0x2a/0x40 
> > > arch/x86/entry/entry_64.S:433
> > > Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> > > 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 
> > > 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> > > RIP  [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> > > arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> > >  RSP 
> > > CR2: 0060
> > > ---[ end trace 139fd4cda5dfe2c4 ]---
> > > 
> > Dmitry,
> > 
> > One theory that Megha and I have is that perhaps the flusher
> > and regular computation updates are stepping on each other.
> > Can you try this patch and see if it helps?
> 
> No, for this one I can't. Sorry.
> It happens with very low frequency and only one fuzzer that tests
> mmotm tree. If/when this is committed, I can keep an eye on these
> reports and notify if I still see them.
> If you have a hypothesis as to how it happens, perhaps you could write
> a test that provokes the crash and maybe add some sleeps to kernel
> code or alter timeouts to increase probability.
> 

If this patch is merged, it will most likely go into Herbert's crypto-dev
tree and not Andrew's mm tree for testing.  We will do our best on our
side to replicate the crash scenario in testing.

Would it be possible to have one of your fuzzers run the crypto-dev tree
once the patch is merged there?

Thanks.

Tim

> 
> 
> > 
> > --->8---
> > 
> > From: Tim Chen 
> > Subject: [PATCH] crypto/sha512-mb: Protect sha512 mb ctx mgr access
> > To: Herbert Xu , Dmitry Vyukov 
> > 
> > Cc: Tim Chen , David Miller 
> > , linux-crypto@vger.kernel.org, LKML 
> > , megha@linux.intel.com, 
> > fenghua...@intel.com
> > 
> > The flusher and 

Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-02-02 Thread Dmitry Vyukov
On Wed, Feb 1, 2017 at 7:45 PM, Tim Chen  wrote:
> On Tue, Jan 31, 2017 at 02:16:31PM +0100, Dmitry Vyukov wrote:
>> Hello,
>>
>> I am getting the following reports with low frequency while running
>> syzkaller fuzzer. Unfortunately they are not reproducible and happen
>> in a background thread, so it is difficult to extract any context on
>> my side. I see only a few such crashes per week, so most likely it is
>> some hard-to-trigger data race. The following reports are from mmotm
>> tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
>> fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
>> happen?
>>
>> BUG: unable to handle kernel NULL pointer dereference at 0060
>> IP: [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
>> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>> PGD 1d2395067 [  220.874864] PUD 1d2860067
>> Oops: 0002 [#1] SMP KASAN
>> Dumping ftrace buffer:
>>    (ftrace buffer empty)
>> Modules linked in:
>> CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
>> Hardware name: Google Google Compute Engine/Google Compute Engine,
>> BIOS Google 01/01/2011
>> Workqueue: crypto mcryptd_queue_worker
>> task: 8801d9f346c0 task.stack: 8801d9f08000
>> RIP: 0010:[]  []
>> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
>> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>> RSP: 0018:8801d9f0eef8  EFLAGS: 00010202
>> RAX:  RBX: 8801d7db1190 RCX: 0006
>> RDX: 0001 RSI: 8801d9f34ee8 RDI: 8801d7db1040
>> RBP: 8801d9f0f258 R08: 0001 R09: 0001
>> R10: 0002 R11: 0003 R12: 8801d9f0f230
>> R13: 8801c8bbc4e0 R14: 8801c8bbc530 R15: 8801d9f0ef70
>> FS:  () GS:8801dc00() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: 0060 CR3: 0001cc15a000 CR4: 001406f0
>> DR0:  DR1:  DR2: 
>> DR3:  DR6: fffe0ff0 DR7: 0400
>> Stack:
>>  8801d7db1040 813fa207 dc00 e8c0f238
>>  0002 11003b3e1dea e8c0f218 8801d9f0f190
>>  0282 e8c0f140 e8c0f220 41b58ab3
>> Call Trace:
>>  [] sha512_mb_update+0x2f7/0x4e0
>> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>>  [] crypto_ahash_update include/crypto/hash.h:512 [inline]
>>  [] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>>  [] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>>  [] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>>  [] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>>  [] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>>  [] kthread+0x323/0x3e0 kernel/kthread.c:209
>>  [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
>> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
>> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 
>> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
>> RIP  [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
>> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>>  RSP 
>> CR2: 0060
>> ---[ end trace 139fd4cda5dfe2c4 ]---
>>
>
> Dmitry,
>
> One theory that Megha and I have is that perhaps the flusher
> and regular computation updates are stepping on each other.
> Can you try this patch and see if it helps?


No, for this one I can't. Sorry.
It happens with very low frequency and only one fuzzer that tests
mmotm tree. If/when this is committed, I can keep an eye on these
reports and notify if I still see them.
If you have a hypothesis as to how it happens, perhaps you could write
a test that provokes the crash and maybe add some sleeps to kernel
code or alter timeouts to increase probability.
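
One way to act on this suggestion, shown here as a userspace sketch rather than kernel code: widen the window between reading a shared job slot and clearing it by injecting a sleep, so that two threads are far more likely to both see the same job. Everything named below (slot_job, lane_slot, check_then_clear, count_claimers) is hypothetical and only loosely modelled on the per-lane job slot discussed in this thread; C11 atomics are used so the demonstration itself stays well defined.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical job record; it does not mirror the kernel's job layout. */
struct slot_job { int status; };

static struct slot_job the_job;
static _Atomic(struct slot_job *) lane_slot;  /* one slot shared by both threads */

struct racer {
    long delay_ns;  /* injected delay, must be below 1000000000 */
    int saw_job;    /* set to 1 if this thread read a non-NULL slot */
};

/* Unsynchronized check-then-act: read the slot, stall, then clear it. */
static void *check_then_clear(void *p)
{
    struct racer *r = p;
    struct timespec ts = { .tv_sec = 0, .tv_nsec = r->delay_ns };
    struct slot_job *j = atomic_load(&lane_slot);    /* check */

    nanosleep(&ts, NULL);                            /* widen the race window */
    atomic_store(&lane_slot, (struct slot_job *)NULL); /* act */
    r->saw_job = (j != NULL);
    return NULL;
}

/* Returns how many of the two threads believed they owned the job (1 or 2). */
int count_claimers(long delay_ns)
{
    struct racer a = { delay_ns, 0 }, b = { delay_ns, 0 };
    pthread_t ta, tb;
    int rc;

    atomic_store(&lane_slot, &the_job);
    rc = pthread_create(&ta, NULL, check_then_clear, &a);
    assert(rc == 0);
    rc = pthread_create(&tb, NULL, check_then_clear, &b);
    assert(rc == 0);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return a.saw_job + b.saw_job;
}
```

With a delay of around 10 ms, count_claimers(10000000L) will usually return 2, meaning both threads believed they owned the job; with a delay of 0 it tends to return 1. Neither outcome is guaranteed, which is why the delay is a tunable.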



> --->8---
>
> From: Tim Chen 
> Subject: [PATCH] crypto/sha512-mb: Protect sha512 mb ctx mgr access
> To: Herbert Xu , Dmitry Vyukov 
> 
> Cc: Tim Chen , David Miller 
> , linux-crypto@vger.kernel.org, LKML 
> , megha@linux.intel.com, 
> fenghua...@intel.com
>
> The flusher and regular multi-buffer computation via mcryptd may race with
> one another.
> Add a lock and turn off interrupts to access the multi-buffer
> computation state cstate->mgr before a round of computation. This should
> prevent the flusher code from jumping in.
>
> Signed-off-by: Tim Chen 
> ---
>  arch/x86/crypto/sha512-mb/sha512_mb.c | 64 
> +++
>  1 file changed, 42 insertions(+), 22 deletions(-)
>
> diff --git a/arch/x86/crypto/sha512-mb/sha512_mb.c 
> b/arch/x86/crypto/sha512-mb/sha512_mb.c
> index d210174..f3c1c21 100644
> --- a/arch/x86/crypto/sha512-mb/sha512_mb.c
> +++ b/arch/x86/crypto/sha512-mb/sha512_mb.c
> @@ 

Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-02-01 Thread Tim Chen
On Tue, Jan 31, 2017 at 02:16:31PM +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I am getting the following reports with low frequency while running
> syzkaller fuzzer. Unfortunately they are not reproducible and happen
> in a background thread, so it is difficult to extract any context on
> my side. I see only a few such crashes per week, so most likely it is
> some hard-to-trigger data race. The following reports are from mmotm
> tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
> fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
> happen?
> 
> BUG: unable to handle kernel NULL pointer dereference at 0060
> IP: [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1d2395067 [  220.874864] PUD 1d2860067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: 8801d9f346c0 task.stack: 8801d9f08000
> RIP: 0010:[]  []
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:8801d9f0eef8  EFLAGS: 00010202
> RAX:  RBX: 8801d7db1190 RCX: 0006
> RDX: 0001 RSI: 8801d9f34ee8 RDI: 8801d7db1040
> RBP: 8801d9f0f258 R08: 0001 R09: 0001
> R10: 0002 R11: 0003 R12: 8801d9f0f230
> R13: 8801c8bbc4e0 R14: 8801c8bbc530 R15: 8801d9f0ef70
> FS:  () GS:8801dc00() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0060 CR3: 0001cc15a000 CR4: 001406f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Stack:
>  8801d7db1040 813fa207 dc00 e8c0f238
>  0002 11003b3e1dea e8c0f218 8801d9f0f190
>  0282 e8c0f140 e8c0f220 41b58ab3
> Call Trace:
>  [] sha512_mb_update+0x2f7/0x4e0
> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>  [] crypto_ahash_update include/crypto/hash.h:512 [inline]
>  [] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>  [] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>  [] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>  [] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>  [] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>  [] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 
> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> RIP  [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>  RSP 
> CR2: 0060
> ---[ end trace 139fd4cda5dfe2c4 ]---
> 

Dmitry,

One theory that Megha and I have is that perhaps the flusher
and regular computation updates are stepping on each other.
Can you try this patch and see if it helps?

Tim

--->8---

From: Tim Chen 
Subject: [PATCH] crypto/sha512-mb: Protect sha512 mb ctx mgr access
To: Herbert Xu , Dmitry Vyukov 
Cc: Tim Chen , David Miller , 
linux-crypto@vger.kernel.org, LKML , 
megha@linux.intel.com, fenghua...@intel.com

The flusher and regular multi-buffer computation via mcryptd may race with
one another.
Add a lock and turn off interrupts to access the multi-buffer
computation state cstate->mgr before a round of computation. This should
prevent the flusher code from jumping in.

Signed-off-by: Tim Chen 
---
 arch/x86/crypto/sha512-mb/sha512_mb.c | 64 +++
 1 file changed, 42 insertions(+), 22 deletions(-)

diff --git a/arch/x86/crypto/sha512-mb/sha512_mb.c 
b/arch/x86/crypto/sha512-mb/sha512_mb.c
index d210174..f3c1c21 100644
--- a/arch/x86/crypto/sha512-mb/sha512_mb.c
+++ b/arch/x86/crypto/sha512-mb/sha512_mb.c
@@ -221,7 +221,7 @@ static struct sha512_hash_ctx *sha512_ctx_mgr_resubmit
 }
 
 static struct sha512_hash_ctx
-   *sha512_ctx_mgr_get_comp_ctx(struct sha512_ctx_mgr *mgr)
+   *sha512_ctx_mgr_get_comp_ctx(struct mcryptd_alg_cstate *cstate)
 {
    /*
     * If get_comp_job returns NULL, there are no jobs complete.
@@ -233,11 +233,17 @@ static struct sha512_hash_ctx
     * Otherwise, all jobs currently being managed by the hash_ctx_mgr
     * still need processing.
     */
+   struct sha512_ctx_mgr *mgr;
    struct sha512_hash_ctx *ctx;
+  

Re: crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-01-31 Thread Tim Chen
On Tue, 2017-01-31 at 14:16 +0100, Dmitry Vyukov wrote:
> Hello,
> 
> I am getting the following reports with low frequency while running
> syzkaller fuzzer. Unfortunately they are not reproducible and happen
> in a background thread, so it is difficult to extract any context on
> my side. I see only a few such crashes per week, so most likely it is
> some hard-to-trigger data race. The following reports are from mmotm
> tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
> fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
> happen?

I wonder if there is a race between the flusher thread, which flushes out
existing jobs if we don't have incoming jobs for a while, and computation
via mcryptd.  Maybe the flusher fires at the same time as a new job
arrives.

Let Megha and me think a bit about it and come up with a patch to see
if that's the case.

Tim
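
To illustrate the kind of serialization that would rule out such a race, here is a minimal userspace sketch, not the kernel patch itself. A pthread mutex stands in for whatever kernel lock would guard the manager state, and all names (demo_job, demo_mgr, mgr_take, run_serialized_demo) are hypothetical. Two threads, playing the flusher and the regular update path, walk the same job slots; because the read-and-clear of each slot happens under the lock, every job is claimed exactly once.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_JOBS 1024

enum { DEMO_PENDING, DEMO_DONE };       /* hypothetical status values */

struct demo_job { int status; };

struct demo_mgr {
    pthread_mutex_t lock;               /* guards every access to lane[] */
    struct demo_job *lane[NUM_JOBS];
};

/* Take the job in slot i, or return NULL if another thread already took it. */
static struct demo_job *mgr_take(struct demo_mgr *m, int i)
{
    struct demo_job *j;

    pthread_mutex_lock(&m->lock);
    j = m->lane[i];
    m->lane[i] = NULL;
    if (j)                              /* never touch a job we did not get */
        j->status = DEMO_DONE;
    pthread_mutex_unlock(&m->lock);
    return j;
}

struct claimer { struct demo_mgr *m; int claimed; };

/* Thread body: try to claim every slot and count the successes. */
static void *claim_all(void *p)
{
    struct claimer *c = p;

    for (int i = 0; i < NUM_JOBS; i++)
        if (mgr_take(c->m, i))
            c->claimed++;
    return NULL;
}

/* Runs "worker" and "flusher" concurrently; returns the total jobs claimed. */
int run_serialized_demo(void)
{
    static struct demo_job jobs[NUM_JOBS];
    static struct demo_mgr m;
    struct claimer worker = { &m, 0 }, flusher = { &m, 0 };
    pthread_t tw, tf;
    int rc;

    pthread_mutex_init(&m.lock, NULL);
    for (int i = 0; i < NUM_JOBS; i++) {
        jobs[i].status = DEMO_PENDING;
        m.lane[i] = &jobs[i];
    }
    rc = pthread_create(&tw, NULL, claim_all, &worker);
    assert(rc == 0);
    rc = pthread_create(&tf, NULL, claim_all, &flusher);
    assert(rc == 0);
    pthread_join(tw, NULL);
    pthread_join(tf, NULL);
    pthread_mutex_destroy(&m.lock);

    for (int i = 0; i < NUM_JOBS; i++)
        assert(jobs[i].status == DEMO_DONE);
    return worker.claimed + flusher.claimed;
}
```

How the jobs are split between the two threads varies from run to run, but the total returned by run_serialized_demo() should always equal NUM_JOBS. In kernel code the lock would also need to account for interrupt context, which this sketch does not model.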

> 
> BUG: unable to handle kernel NULL pointer dereference at 0060
> IP: [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1d2395067 [  220.874864] PUD 1d2860067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: 8801d9f346c0 task.stack: 8801d9f08000
> RIP: 0010:[]  []
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:8801d9f0eef8  EFLAGS: 00010202
> RAX:  RBX: 8801d7db1190 RCX: 0006
> RDX: 0001 RSI: 8801d9f34ee8 RDI: 8801d7db1040
> RBP: 8801d9f0f258 R08: 0001 R09: 0001
> R10: 0002 R11: 0003 R12: 8801d9f0f230
> R13: 8801c8bbc4e0 R14: 8801c8bbc530 R15: 8801d9f0ef70
> FS:  () GS:8801dc00() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0060 CR3: 0001cc15a000 CR4: 001406f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Stack:
>  8801d7db1040 813fa207 dc00 e8c0f238
>  0002 11003b3e1dea e8c0f218 8801d9f0f190
>  0282 e8c0f140 e8c0f220 41b58ab3
> Call Trace:
>  [] sha512_mb_update+0x2f7/0x4e0
> arch/x86/crypto/sha512-mb/sha512_mb.c:588
>  [] crypto_ahash_update include/crypto/hash.h:512 [inline]
>  [] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
>  [] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
>  [] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
>  [] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
>  [] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
>  [] kthread+0x323/0x3e0 kernel/kthread.c:209
>  [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
> Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
> 48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 
> 40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
> RIP  [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
>  RSP 
> CR2: 0060
> ---[ end trace 139fd4cda5dfe2c4 ]---
> 
> BUG: unable to handle kernel NULL pointer dereference at 0060
> IP: [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> PGD 1c68ad067 [  624.973638] PUD 1d485a067
> Oops: 0002 [#1] SMP KASAN
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Modules linked in:
> CPU: 0 PID: 517 Comm: kworker/0:1 Not tainted 4.9.0 #3
> Hardware name: Google Google Compute Engine/Google Compute Engine,
> BIOS Google 01/01/2011
> Workqueue: crypto mcryptd_queue_worker
> task: 8801d9e64700 task.stack: 8801d9838000
> RIP: 0010:[]  []
> sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
> arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
> RSP: 0018:8801d983eef8  EFLAGS: 00010202
> RAX:  RBX: 8801d7d96950 RCX: 0006
> RDX: 0001 RSI: 8801d9e64f28 RDI: 8801d7d96800
> RBP: 8801d983f258 R08: 0001 R09: 0001
> R10: 0002 R11: 0003 R12: 8801d983f230
> R13: 8801b67f5720 R14: 8801b67f5770 R15: 8801d983ef70
> FS:  () GS:8801dc00() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 0060 CR3: 0001cee58000 CR4: 001406f0
> Stack:
>  8801d7d96800 813fa207 dc00 e8c0f238
>  0002 11003b307dea e8c0f218 8801d983f190
>  0282 e8c0f140 e8c0f220 41b58ab3
> Call Trace:

crypto: NULL deref in sha512_mb_mgr_get_comp_job_avx2

2017-01-31 Thread Dmitry Vyukov
Hello,

I am getting the following reports with low frequency while running
syzkaller fuzzer. Unfortunately they are not reproducible and happen
in a background thread, so it is difficult to extract any context on
my side. I see only a few such crashes per week, so most likely it is
some hard-to-trigger data race. The following reports are from mmotm
tree, commits 00e20cfc2bf04a0cbe1f5405f61c8426f43eee84 and
fff7e71eac7788904753136f09bcad7471f7799e. Any ideas as to how this can
happen?

BUG: unable to handle kernel NULL pointer dereference at 0060
IP: [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
PGD 1d2395067 [  220.874864] PUD 1d2860067
Oops: 0002 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 516 Comm: kworker/0:1 Not tainted 4.9.0 #4
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: crypto mcryptd_queue_worker
task: 8801d9f346c0 task.stack: 8801d9f08000
RIP: 0010:[]  []
sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
RSP: 0018:8801d9f0eef8  EFLAGS: 00010202
RAX:  RBX: 8801d7db1190 RCX: 0006
RDX: 0001 RSI: 8801d9f34ee8 RDI: 8801d7db1040
RBP: 8801d9f0f258 R08: 0001 R09: 0001
R10: 0002 R11: 0003 R12: 8801d9f0f230
R13: 8801c8bbc4e0 R14: 8801c8bbc530 R15: 8801d9f0ef70
FS:  () GS:8801dc00() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0060 CR3: 0001cc15a000 CR4: 001406f0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Stack:
 8801d7db1040 813fa207 dc00 e8c0f238
 0002 11003b3e1dea e8c0f218 8801d9f0f190
 0282 e8c0f140 e8c0f220 41b58ab3
Call Trace:
 [] sha512_mb_update+0x2f7/0x4e0
arch/x86/crypto/sha512-mb/sha512_mb.c:588
 [] crypto_ahash_update include/crypto/hash.h:512 [inline]
 [] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
 [] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
 [] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
 [] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
 [] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
 [] kthread+0x323/0x3e0 kernel/kthread.c:209
 [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a 00 00 00 48 83 e2 0f
48 6b da 08 48 8d 9c 1f 48 01 00 00 48 8b 03 48 c7 03 00 00 00 00 
40 60 02 00 00 00 48 8b 9f 40 01 00 00 48 c1 e3 08 48 09 d3
RIP  [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
 RSP 
CR2: 0060
---[ end trace 139fd4cda5dfe2c4 ]---

BUG: unable to handle kernel NULL pointer dereference at 0060
IP: [] sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
PGD 1c68ad067 [  624.973638] PUD 1d485a067
Oops: 0002 [#1] SMP KASAN
Dumping ftrace buffer:
   (ftrace buffer empty)
Modules linked in:
CPU: 0 PID: 517 Comm: kworker/0:1 Not tainted 4.9.0 #3
Hardware name: Google Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: crypto mcryptd_queue_worker
task: 8801d9e64700 task.stack: 8801d9838000
RIP: 0010:[]  []
sha512_mb_mgr_get_comp_job_avx2+0x6e/0xee
arch/x86/crypto/sha512-mb/sha512_mb_mgr_flush_avx2.S:251
RSP: 0018:8801d983eef8  EFLAGS: 00010202
RAX:  RBX: 8801d7d96950 RCX: 0006
RDX: 0001 RSI: 8801d9e64f28 RDI: 8801d7d96800
RBP: 8801d983f258 R08: 0001 R09: 0001
R10: 0002 R11: 0003 R12: 8801d983f230
R13: 8801b67f5720 R14: 8801b67f5770 R15: 8801d983ef70
FS:  () GS:8801dc00() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0060 CR3: 0001cee58000 CR4: 001406f0
Stack:
 8801d7d96800 813fa207 dc00 e8c0f238
 0002 11003b307dea e8c0f218 8801d983f190
 0282 e8c0f140 e8c0f220 41b58ab3
Call Trace:
 [] sha512_mb_update+0x2f7/0x4e0
arch/x86/crypto/sha512-mb/sha512_mb.c:588
 [] crypto_ahash_update include/crypto/hash.h:512 [inline]
 [] ahash_mcryptd_update crypto/mcryptd.c:627 [inline]
 [] mcryptd_hash_update+0xcd/0x1c0 crypto/mcryptd.c:373
 [] mcryptd_queue_worker+0xff/0x6a0 crypto/mcryptd.c:181
 [] process_one_work+0xbd0/0x1c10 kernel/workqueue.c:2096
 [] worker_thread+0x223/0x1990 kernel/workqueue.c:2230
 [] kthread+0x323/0x3e0 kernel/kthread.c:209
 [] ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:433
Code: 49 0f 42 d3 48 f7 c2 f0 ff ff ff 0f 85 9a