Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Mark Rutland
On Tue, Aug 08, 2017 at 05:44:22PM +0100, Mel Gorman wrote:
> On Tue, Aug 08, 2017 at 09:06:48AM -0700, Linus Torvalds wrote:
> > On Tue, Aug 8, 2017 at 8:41 AM, Mark Rutland  wrote:
> > >
> > > With my __BUG_FLAGS() issue corrected, the WARN_ON_ONCE() fires once,
> > > and everything else seems fine. I'll have a go with additional debug
> > > enabled just in case.
> > 
> > Ok, great, a - mostly - false alarm.

Indeed; sorry for the somewhat bogus report.

> > I do wonder if we should just remove even that WARN_ON_ONCE() - I
> > think it was added to be careful, and the code seems to do the right
> > thing.

> If Mark confirms that removing the warning is ok for his test case, I'll
> send a patch to Thomas with a tag for stable and it should arrive at your
> inbox eventually. If I don't hear from Mark, I'll have time to try the
> test case in the morning and go from there.

So far my test case hasn't triggered any other noticeable issue, nor
splats from any debug option I have enabled, so removing the warning
sounds good to me.

I'll throw Syzkaller at a kernel with the warning removed. Given this is
the first time I've hit this after many days worth of fuzzing recent
kernels, I suspect it won't hit anything interesting by tomorrow
morning.

Thanks,
Mark.


Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Mel Gorman
On Tue, Aug 08, 2017 at 09:06:48AM -0700, Linus Torvalds wrote:
> On Tue, Aug 8, 2017 at 8:41 AM, Mark Rutland  wrote:
> >
> > With my __BUG_FLAGS() issue corrected, the WARN_ON_ONCE() fires once,
> > and everything else seems fine. I'll have a go with additional debug
> > enabled just in case.
> 
> Ok, great, a - mostly - false alarm.
> 
> I do wonder if we should just remove even that WARN_ON_ONCE() - I
> think it was added to be careful, and the code seems to do the right
> thing.
> 

Exactly. I didn't really expect an application to behave like this that
would create a storm of warning-related bug reports and even one that
did should fail in userspace anyway in some fashion (e.g. lost wakup or
unexpected errno).  It looks like the reproduction case is replacing the
mapping so it should be safe to remove the warning because enough time
has passed that any other "interesting" case should have triggered by now.

If Mark confirms that removing the warning is ok for his test case, I'll
send a patch to Thomas with a tag for stable and it should arrive at your
inbox eventually. If I don't hear from Mark, I'll have time to try the
test case in the morning and go from there.

> The second WARN_ON_ONCE() (that is marked as "should be impossible")
> we might as well leave around. If that one triggers, it's a lot more
> interesting.
> 

Agreed.

> Mel? No hurry - the nice thing about WARN_ON_ONCE() is that it's just
> a single note so it's neither killing the machine(*) nor causing any
> real problems.
> 

That was the intent -- "this is recoverable but I am interested in
hearing if this ever occurs without truncation or unmap being involved".
Assuming no other surprises, it'll be remove relatively shortly.

-- 
Mel Gorman
SUSE Labs


Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Linus Torvalds
On Tue, Aug 8, 2017 at 8:41 AM, Mark Rutland  wrote:
>
> With my __BUG_FLAGS() issue corrected, the WARN_ON_ONCE() fires once,
> and everything else seems fine. I'll have a go with additional debug
> enabled just in case.

Ok, great, a - mostly - false alarm.

I do wonder if we should just remove even that WARN_ON_ONCE() - I
think it was added to be careful, and the code seems to do the right
thing.

The second WARN_ON_ONCE() (that is marked as "should be impossible")
we might as well leave around. If that one triggers, it's a lot more
interesting.

Mel? No hurry - the nice thing about WARN_ON_ONCE() is that it's just
a single note so it's neither killing the machine(*) nor causing any
real problems.

Linus

(*) when it works, but presumably it is now fixed on ARM64, and I
expect I'll see a pull request ;).


Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Mark Rutland
On Tue, Aug 08, 2017 at 04:32:30PM +0100, Mel Gorman wrote:
> On Tue, Aug 08, 2017 at 11:52:05AM +0100, Mark Rutland wrote:
> > As a heads-up, I hit the below splat when using Syzkaller to fuzz arm64
> > VMAP_STACK patches [1] atop of v4.13-rc3. I haven't hit anything else
> > major, and so far I haven't had any luck reproducing this, so it may be
> > an existing issue that's difficult to hit.

> > kernel BUG at kernel/futex.c:679!
> 
> This corresponds to the warning
> 
> /*
>  * Take a reference unless it is about to be freed. Previously
>  * this reference was taken by ihold under the page lock
>  * pinning the inode in place so i_lock was unnecessary. The
>  * only way for this check to fail is if the inode was
>  * truncated in parallel so warn for now if this happens.
>  *
>  * We are not calling into get_futex_key_refs() in file-backed
>  * cases, therefore a successful atomic_inc return below will
>  * guarantee that get_futex_key() will still imply smp_mb(); 
> (B).
>  */
> if (WARN_ON_ONCE(!atomic_inc_not_zero(&inode->i_count))) {
> rcu_read_unlock();
> put_page(page);
> 
> goto again;
> }
> 
> The comment is pretty self-explanatory. The only situation I could think
> of where it could happen is if a futex existed on a shared mapping that
> was truncated during the operation. Why would an application truncate a
> mapping with a key on it? As weird as it is, the situation is recoverable
> which is what the code does but the warning was included in case I was
> not imaginative enough.
> 
> Can you tell me if it's possible that syskaller when fuzz testing was
> creating a shared mapping, creating a futex backed by the mapping and
> truncating it? If so and that's what triggers the warning then I think it
> would be reasonable to remove the warning as the source of the confusion
> is userspace truncating a mapping with active keys on it.

I think that's exactly what Syzkaller is doing.

Near the end of the log, the following are run (concurrently):

mmap(&(0x7fbc6000/0x1000)=nil, (0x1000), 0x3, 0x32, 0x, 0x0)
mmap(&(0x7fbc6000/0x1000)=nil, (0x1000), 0x3, 0x32, 0x, 0x0)
mmap(&(0x7fbc6000/0x1000)=nil, (0x1000), 0x3, 0x31, 0x, 0x0)
futex(&(0x7fbc6000)=0x71, 0xb, 0x0, &(0x7fbc6000)={0x0, 0x989680}, 
&(0x7fbc6000+0x26c)=0x9ba4, 0x1)

> If you manage to create a test case, then it would be nice to test without
> that warning and see if it completes successfully or if there is other
> fallout.

I *just* sent a test case which blats the mapping with a new anonymous
mmap:

https://lkml.kernel.org/r/20170808145732.GD19207@leverpostej

With my __BUG_FLAGS() issue corrected, the WARN_ON_ONCE() fires once,
and everything else seems fine. I'll have a go with additional debug
enabled just in case.

Thanks,
Mark


Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Mel Gorman
On Tue, Aug 08, 2017 at 11:52:05AM +0100, Mark Rutland wrote:
> Hi,
> 
> As a heads-up, I hit the below splat when using Syzkaller to fuzz arm64
> VMAP_STACK patches [1] atop of v4.13-rc3. I haven't hit anything else
> major, and so far I haven't had any luck reproducing this, so it may be
> an existing issue that's difficult to hit.
> 
> Note that while reported as a BUG(), it's actually the WARN_ON_ONCE()
> introduced in commit:
> 
>   65d8fc777f6dcfee ("futex: Remove requirement for lock_page() in 
> get_futex_key()")
> 
> ... misreported as I accidentally throw away the flags in __BUG_FLAGS().
> Other than that, I believe BUG() and friends are working correctly.
> 
> The Syzkaller log is huge (1.0M), so rather than attaching it, I've
> uploaded the log, report, and kernel config to:
> 
> http://data.yaey.co.uk/bugs/20170808-futex-bug/
> 
> I'll continue trying to reproduce and minimize this.
> 
> [ cut here ]
> kernel BUG at kernel/futex.c:679!

This corresponds to the warning

/*
 * Take a reference unless it is about to be freed. Previously
 * this reference was taken by ihold under the page lock
 * pinning the inode in place so i_lock was unnecessary. The
 * only way for this check to fail is if the inode was
 * truncated in parallel so warn for now if this happens.
 *
 * We are not calling into get_futex_key_refs() in file-backed
 * cases, therefore a successful atomic_inc return below will
 * guarantee that get_futex_key() will still imply smp_mb(); 
(B).
 */
if (WARN_ON_ONCE(!atomic_inc_not_zero(&inode->i_count))) {
rcu_read_unlock();
put_page(page);

goto again;
}

The comment is pretty self-explanatory. The only situation I could think
of where it could happen is if a futex existed on a shared mapping that
was truncated during the operation. Why would an application truncate a
mapping with a key on it? As weird as it is, the situation is recoverable
which is what the code does but the warning was included in case I was
not imaginative enough.

Can you tell me if it's possible that syskaller when fuzz testing was
creating a shared mapping, creating a futex backed by the mapping and
truncating it? If so and that's what triggers the warning then I think it
would be reasonable to remove the warning as the source of the confusion
is userspace truncating a mapping with active keys on it.

If you manage to create a test case, then it would be nice to test without
that warning and see if it completes successfully or if there is other
fallout.

-- 
Mel Gorman
SUSE Labs


Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Mark Rutland
On Tue, Aug 08, 2017 at 03:57:32PM +0100, Mark Rutland wrote:
> The below test case fires for me in a few seconds on an arm64 platform,
> triggering the kernel BUG at kernel/futex.c:679. If left running for longer, I
> then get a stream of other BUGs that I believe are a result of the first 
> issue.

Likewise on v4.12 x86_64:

[2503756.486650] [ cut here ]
[2503756.491450] WARNING: CPU: 0 PID: 21380 at kernel/futex.c:679 
get_futex_key+0x2d8/0x330
[2503756.499527] Modules linked in:
[2503756.502754] CPU: 0 PID: 21380 Comm: futex-test Not tainted 4.12.0+ #14
[2503756.509441] Hardware name: LENOVO 7484A3G/LENOVO, BIOS 5CKT54AUS 09/07/2009
[2503756.516561] task: 9696e9abc800 task.stack: a72c8637c000
[2503756.522648] RIP: 0010:get_futex_key+0x2d8/0x330
[2503756.527344] RSP: 0018:a72c8637fcd8 EFLAGS: 00010246
[2503756.532734] RAX:  RBX: a72c8637fd88 RCX: 
9696c618e690
[2503756.540025] RDX: d98dc68b5f00 RSI: 9696c618e528 RDI: 

[2503756.547319] RBP: a72c8637fd10 R08: 9696c618e670 R09: 
a72c8637fc9c
[2503756.554609] R10: 0080 R11: 9696e5f65a58 R12: 
7f595cf4a000
[2503756.561900] R13:  R14: 0001 R15: 
9696e5fa4000
[2503756.569191] FS:  7f595b759700() GS:9696fec0() 
knlGS:
[2503756.577439] CS:  0010 DS:  ES:  CR0: 80050033
[2503756.583347] CR2: 7f595cf4a000 CR3: 0001a5e1d000 CR4: 
000406f0
[2503756.590646] Call Trace:
[2503756.593265]  ? iput+0x85/0x1d0
[2503756.596491]  futex_requeue+0x101/0x880
[2503756.600410]  do_futex+0xbe/0xa00
[2503756.603810]  ? pick_next_entity+0xa0/0x150
[2503756.608073]  ? pick_next_task_fair+0x3fc/0x460
[2503756.612685]  SyS_futex+0x6c/0x150
[2503756.616169]  ? schedule+0x31/0x80
[2503756.619657]  ? exit_to_usermode_loop+0x27/0x72
[2503756.624267]  entry_SYSCALL_64_fastpath+0x13/0x94
[2503756.629051] RIP: 0033:0x7f595c84e4e9
[2503756.632795] RSP: 002b:7f595b758e78 EFLAGS: 0202 ORIG_RAX: 
00ca
[2503756.640521] RAX: ffda RBX:  RCX: 
7f595c84e4e9
[2503756.647813] RDX: 0001 RSI: 000c RDI: 
7f595cf4a000
[2503756.655109] RBP: 7f595b758ec0 R08: 7f595cf4a004 R09: 
0001
[2503756.662401] R10:  R11: 0202 R12: 
7f595cb2e880
[2503756.669693] R13: 7f595b7599c0 R14:  R15: 
0003
[2503756.676986] Code: 83 ef 01 e9 ea fd ff ff 48 8d 78 ff e9 20 fe ff ff e8 7d 
6d 06 00 eb a7 b8 ea ff ff ff e9 35 fe ff ff 48 8d 78 ff e9 d9 fe ff ff <0f> ff 
48 8b 7d d0 e8 4d f3 ff ff e9 7f fd ff ff 0f ff 48 89 f7 
[2503756.695988] ---[ end trace adb5ba028949a89a ]---

Thanks,
Mark

> >8
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> 
> #define NR_FUTEX_THREADS 16
> 
> pthread_t threads[NR_FUTEX_THREADS];
> 
> void *mem;
> 
> #define MEM_PROT(PROT_READ | PROT_WRITE)
> #define MEM_SIZE65536
> 
> static int futex_wrapper(int *uaddr, int op, int val,
>  const struct timespec *timeout,
>  int *uaddr2, int val3)
> {
> syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
> }
> 
> void *poll_futex(void *unused)
> {
> for (;;) {
> futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
> }
> }
> 
> int main(int argc, char *argv[])
> {
> int i;
> 
> mem = mmap(NULL, MEM_SIZE, MEM_PROT,
>MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> 
> printf("Mapping @ %p\n", mem);
> 
> printf("Creating futex threads...\n");
> 
> for (i = 0; i < NR_FUTEX_THREADS; i++)
> pthread_create(&threads[i], NULL, poll_futex, NULL);
> 
> printf("Flipping mapping...\n");
> for (;;) {
> mmap(mem, MEM_SIZE, MEM_PROT,
>  MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
> }
> 
> return 0;
> }


Re: kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Mark Rutland
On Tue, Aug 08, 2017 at 11:52:04AM +0100, Mark Rutland wrote:
> Hi,
> 
> As a heads-up, I hit the below splat when using Syzkaller to fuzz arm64
> VMAP_STACK patches [1] atop of v4.13-rc3. I haven't hit anything else
> major, and so far I haven't had any luck reproducing this, so it may be
> an existing issue that's difficult to hit.
> 
> Note that while reported as a BUG(), it's actually the WARN_ON_ONCE()
> introduced in commit:
> 
>   65d8fc777f6dcfee ("futex: Remove requirement for lock_page() in 
> get_futex_key()")
> 
> ... misreported as I accidentally throw away the flags in __BUG_FLAGS().
> Other than that, I believe BUG() and friends are working correctly.
> 
> The Syzkaller log is huge (1.0M), so rather than attaching it, I've
> uploaded the log, report, and kernel config to:
> 
> http://data.yaey.co.uk/bugs/20170808-futex-bug/
> 
> I'll continue trying to reproduce and minimize this.

AFAICT, get_futex_key() can race with an mmap() changing the page in question,
whereupon things go wrong. So I believe we need to restore some of the page
locking in get_futex_key().

The below test case fires for me in a few seconds on an arm64 platform,
triggering the kernel BUG at kernel/futex.c:679. If left running for longer, I
then get a stream of other BUGs that I believe are a result of the first issue.

Thanks,
Mark.

>8
#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#define NR_FUTEX_THREADS 16

pthread_t threads[NR_FUTEX_THREADS];

void *mem;

#define MEM_PROT(PROT_READ | PROT_WRITE)
#define MEM_SIZE65536

static int futex_wrapper(int *uaddr, int op, int val,
 const struct timespec *timeout,
 int *uaddr2, int val3)
{
syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}

void *poll_futex(void *unused)
{
for (;;) {
futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1);
}
}

int main(int argc, char *argv[])
{
int i;

mem = mmap(NULL, MEM_SIZE, MEM_PROT,
   MAP_SHARED | MAP_ANONYMOUS, -1, 0);

printf("Mapping @ %p\n", mem);

printf("Creating futex threads...\n");

for (i = 0; i < NR_FUTEX_THREADS; i++)
pthread_create(&threads[i], NULL, poll_futex, NULL);

printf("Flipping mapping...\n");
for (;;) {
mmap(mem, MEM_SIZE, MEM_PROT,
 MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0);
}

return 0;
}


kernel BUG at kernel/futex.c:679 on v4.13-rc3-ish on arm64

2017-08-08 Thread Mark Rutland
Hi,

As a heads-up, I hit the below splat when using Syzkaller to fuzz arm64
VMAP_STACK patches [1] atop of v4.13-rc3. I haven't hit anything else
major, and so far I haven't had any luck reproducing this, so it may be
an existing issue that's difficult to hit.

Note that while reported as a BUG(), it's actually the WARN_ON_ONCE()
introduced in commit:

  65d8fc777f6dcfee ("futex: Remove requirement for lock_page() in 
get_futex_key()")

... misreported as I accidentally throw away the flags in __BUG_FLAGS().
Other than that, I believe BUG() and friends are working correctly.

The Syzkaller log is huge (1.0M), so rather than attaching it, I've
uploaded the log, report, and kernel config to:

http://data.yaey.co.uk/bugs/20170808-futex-bug/

I'll continue trying to reproduce and minimize this.

Thanks,
Mark.

[1] git://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git 
arm64/vmap-stack

[ cut here ]
kernel BUG at kernel/futex.c:679!
Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 
#3
Hardware name: linux,dummy-virt (DT)
task: 80001e271780 task.stack: 10908000
PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679
pc : [] lr : [] pstate: 8145
sp : 1090bab0
x29: 1090bab0 x28:  
x27:  x26:  
x25: 0998f000 x24: 1090bbc0 
x23: 80001e2a56d8 x22: 09a4f000 
x21: 20bc6000 x20: 80001e2a56b8 
x19: 7e5dbf00 x18: 80001e272018 
x17: 00826000 x16: 0821f790 
x15: 0a608000 x14: 0a608fc0 
x13: 0038 x12: 09d3c000 
x11: 7f7f7f7f7f7f7f7f x10: 0002 
x9 : 0998fb88 x8 : 80001e272018 
x7 : 0821a6b4 x6 :  
x5 : 0001 x4 :  
x3 :  x2 : 0001 
x1 : 00040d00 x0 :  
Process syz-executor1 (pid: 3695, stack limit = 0x10908000)
Stack: (0x1090bab0 to 0x1090c000)
baa0:   1090bb50 0821d668
bac0: 80001e271780 0998f000 0001 1090bbc0
bae0: 20bc626c 1090bc20 20bc6000 
bb00: 0001  1090bb30 09a74420
bb20: 09a73610 80001deec800 1090bb50 80001e2a54d0
bb40: 7e5dbf00 00040d00 1090bd00 0821e81c
bb60: 20bc6000 20bc626c 0998f000 000b
bb80: 000b   1090bea0
bba0: 0001 0001 042e 00040d00
bbc0:   026c 1090bbd8
bbe0:   009959d0 00989680
bc00: 081f73e0 80001ffd34c0  80001e271780
bc20: 1090bc20   1090bc38
bc40:    
bc60:    
bc80:    
bca0:    
bcc0:    
bce0:    00040d00
bd00: 1090be40 0821f850 000b 0998f000
bd20: 20bc6000 000b  20bc6000
bd40:  20bc626c 0001 1090bea0
bd60: 09d3c000 0038 0a608fc0 0a608000
bd80: 0821f790 00826000 80001e272018 000b
bda0: 0998f000 20bc6000 000b 80001e271780
bdc0: 20bc6000  20bc626c 0001
bde0: 80001e271780 1090be40 0821f8d8 1090be40
be00: 1090be40 0821f830 000b 0998f000
be20: 20bc6000 000b 0043 00040d00
be40: 1090bff0 080837b0  800016662000
be60:  0042cff4 a000 0015
be80: 0124 0062 093d1000 80001e271780
bea0: 00989680  00989680 00040d00
bec0: 20bc6000 000b  20bc6000
bee0: 20bc626c 0001  
bf00: 0062 7f7f ff227262752e7564 7f7f7f7f7f7f7f7f
bf20: 0101010101010101 0038 003f 0001
bf40:  00826000  004c0008
bf60: