* Andrea Arcangeli (aarca...@redhat.com) wrote:
> On Wed, Apr 13, 2016 at 01:50:53PM +0100, Dr. David Alan Gilbert wrote:
> > * Dr. David Alan Gilbert (dgilb...@redhat.com) wrote:
> >
> > > + if ( ((b + 1) % 255) == last_byte && !hit_edge) {
> >
> > Ahem, that should be 256.
> >
> > I'm going to bisect the kernel and see where we get to.
> > Andrea's userfaultfd self-test passes on 2.5, so it's something more
> > subtle.
>
> David already tracked down 1df59b8497f47495e873c23abd6d3d290c730505
> good and 984065055e6e39f8dd812529e11922374bd39352 bad.
>
> git diff
> 1df59b8497f47495e873c23abd6d3d290c730505..984065055e6e39f8dd812529e11922374bd39352
> fs/userfaultfd.c mm/userfaultfd.c
>
> Nothing that could break it in the diff of the relevant two files.
>
> The only other userfault related change in this commit range that
> comes to mind is in fixup_user_fault, but if that was buggy you don't
> userfault into futexes with postcopy so you couldn't notice, so the
> only other user of that is s390.
>
> The next suspect is the massive THP refcounting change that went
> upstream recently:
...
> As further debug hint, can you try to disable THP and see if that
> makes the problem go away?

Yeh, looks like it is THP.

My bisect is currently at 17ec4cd985780a7e30aa45bb8f272237c12502a4
and with that from a fresh boot it fails, if I disable THP it works
and if I reenable THP back to madvise it fails.

I spotted that at my previous bisect point it failed before I'd done
the next kernel build but worked after I'd done the build (but before
I rebooted!) - so I guess after the build it couldn't find any THPs
to do.

Dave

>
> Thanks,
> Andrea
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK