* Andrea Arcangeli (aarca...@redhat.com) wrote:
> On Wed, Apr 13, 2016 at 01:50:53PM +0100, Dr. David Alan Gilbert wrote:
> > * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote:
> > 
> > > +            if ( ((b + 1) % 255) == last_byte && !hit_edge) {
> > 
> > Ahem, that should be 256.
> > 
> > I'm going to bisect the kernel and see where we get to.
> > Andrea's userfaultfd self-test passes on 2.5, so it's something more
> > subtle.
> > 
> 
> David already tracked down 1df59b8497f47495e873c23abd6d3d290c730505
> good and 984065055e6e39f8dd812529e11922374bd39352 bad.
> 
> git diff 
> 1df59b8497f47495e873c23abd6d3d290c730505..984065055e6e39f8dd812529e11922374bd39352
>  fs/userfaultfd.c mm/userfaultfd.c
> 
> Nothing that could break it in the diff of the relevant two files.
> 
> The only other userfault related change in this commit range that
> comes to mind is in fixup_user_fault, but if that was buggy you don't
> userfault into futexes with postcopy so you couldn't notice, so the
> only other user of that is s390.
> 
> The next suspect is the massive THP refcounting change that went
> upstream recently:

...

> As further debug hint, can you try to disable THP and see if that
> makes the problem go away?

Yes, it looks like it is THP.
My bisect is currently at 17ec4cd985780a7e30aa45bb8f272237c12502a4,
and with that kernel it fails from a fresh boot; if I disable THP it works,
and if I set THP back to madvise it fails again.

I also noticed that at my previous bisect point the test failed before I'd
done the next kernel build but passed after I'd done the build (and before
I rebooted!), so I guess after the build it couldn't find any THPs to allocate.

Dave

> 
> Thanks,
> Andrea
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
