On Tue, Aug 08, 2017 at 11:52:05AM +0100, Mark Rutland wrote:
> Hi,
> As a heads-up, I hit the below splat when using Syzkaller to fuzz arm64
> VMAP_STACK patches [1] atop of v4.13-rc3. I haven't hit anything else
> major, and so far I haven't had any luck reproducing this, so it may be
> an existing issue that's difficult to hit.
> Note that while reported as a BUG(), it's actually the WARN_ON_ONCE()
> introduced in commit:
>   65d8fc777f6dcfee ("futex: Remove requirement for lock_page() in get_futex_key()")
> ... misreported as I accidentally throw away the flags in __BUG_FLAGS().
> Other than that, I believe BUG() and friends are working correctly.
> The Syzkaller log is huge (1.0M), so rather than attaching it, I've
> uploaded the log, report, and kernel config to:
> http://data.yaey.co.uk/bugs/20170808-futex-bug/
> I'll continue trying to reproduce and minimize this.
> ------------[ cut here ]------------
> kernel BUG at kernel/futex.c:679!

This corresponds to the following warning:

                /*
                 * Take a reference unless it is about to be freed. Previously
                 * this reference was taken by ihold under the page lock
                 * pinning the inode in place so i_lock was unnecessary. The
                 * only way for this check to fail is if the inode was
                 * truncated in parallel so warn for now if this happens.
                 *
                 * We are not calling into get_futex_key_refs() in file-backed
                 * cases, therefore a successful atomic_inc return below will
                 * guarantee that get_futex_key() will still imply smp_mb();
                 */
                if (WARN_ON_ONCE(!atomic_inc_not_zero(&inode->i_count))) {

                        goto again;

The comment is pretty self-explanatory. The only situation I could think
of where it could happen is if a futex existed on a shared mapping that
was truncated during the operation. Why would an application truncate a
mapping with a key on it? As weird as it is, the situation is recoverable,
which is what the code does, but the warning was included in case I was
not imaginative enough.
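
For readers unfamiliar with the "take a reference unless it is about to
be freed" idiom in the comment above, a rough userspace analogue using C11
atomics might look like the sketch below. This is not the kernel's
atomic_inc_not_zero() implementation; the name ref_get_unless_zero() is
hypothetical, and the sketch only illustrates the semantics (increment
succeeds only while the count is non-zero, otherwise the caller backs off
and retries its lookup).

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical userspace analogue of atomic_inc_not_zero(): bump the
 * count and return true, unless the count has already dropped to zero,
 * in which case the object is being torn down and must not be revived.
 */
static bool ref_get_unless_zero(atomic_int *count)
{
	int old = atomic_load(count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;
}
```

A caller that gets false back would drop whatever it looked up and start
over, which is the role "goto again" plays in the excerpt.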

Can you tell me if it's possible that Syzkaller, while fuzzing, was
creating a shared mapping, creating a futex backed by that mapping, and
then truncating it? If so, and that is what triggers the warning, then I
think it would be reasonable to remove the warning, as the source of the
confusion is userspace truncating a mapping with active keys on it.
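
To make that scenario concrete, an attempt at exercising it from userspace
might look roughly like the sketch below. It is only an illustration: there
is no evidence here that it triggers the warning, the helper names
futex_wake_shared() and truncate_vs_futex() and the temporary file path are
made up, and a fork()ed child is an assumed stand-in for whatever the
fuzzer was doing. The parent issues shared (non-private) FUTEX_WAKE calls
on a file-backed MAP_SHARED page while the child repeatedly truncates and
re-extends the backing file.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: shared FUTEX_WAKE, so the key is inode-based. */
static long futex_wake_shared(uint32_t *addr)
{
	return syscall(SYS_futex, addr, FUTEX_WAKE, 1, NULL, NULL, 0);
}

/*
 * Hypothetical reproducer attempt. Returns how many futex calls failed
 * with EFAULT (page truncated away), or -1 if setup failed.
 */
static long truncate_vs_futex(long iterations)
{
	char path[] = "/tmp/futex-trunc-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	long faults = 0;
	uint32_t *addr;
	pid_t child;
	int fd;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, page) < 0) {
		close(fd);
		return -1;
	}

	addr = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (addr == MAP_FAILED) {
		close(fd);
		return -1;
	}

	child = fork();
	if (child < 0) {
		munmap(addr, page);
		close(fd);
		return -1;
	}
	if (child == 0) {
		/* Child: keep truncating and re-extending the file. */
		for (;;) {
			if (ftruncate(fd, 0) < 0 || ftruncate(fd, page) < 0)
				_exit(1);
		}
	}

	/* Parent: never dereference addr, that could raise SIGBUS. */
	for (long i = 0; i < iterations; i++) {
		if (futex_wake_shared(addr) < 0 && errno == EFAULT)
			faults++;
	}

	kill(child, SIGKILL);
	waitpid(child, NULL, 0);
	munmap(addr, page);
	close(fd);
	return faults;
}
```

Calling truncate_vs_futex() from main() with a large iteration count and
watching dmesg would be one way to check whether the warning fires; a
non-zero return only shows that some calls lost the race to truncation.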

If you manage to create a test case, then it would be nice to test without
that warning and see if it completes successfully or if there is other

