Re: [git pull] vfs pile 1 (splice)

2016-10-12 Thread Christoph Lameter
On Mon, 10 Oct 2016, Linus Torvalds wrote:
> But the fact that it reacts _so_ badly to double-freeing issues when
> the freelist has become corrupted due to an object being free'd and
> then modified is clearly very fragile and not great.
Yup that is why the debug options move the freepointer

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Linus Torvalds
On Mon, Oct 10, 2016 at 10:39 PM, Linus Torvalds wrote:
>
> I guess I will have to double-check that the slub corruption is gone
> still with that fixed.
So I'm not getting any warnings now from SLUB debugging. So the original bug seems to not have re-surfaced, and the registration bug is gone,

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Linus Torvalds
On Sun, Oct 9, 2016 at 8:41 PM, Linus Torvalds wrote:
> This COMPLETELY UNTESTED patch tries to fix the nf_hook_entry code to do this.
>
> I repeat: it's ENTIRELY UNTESTED.
Gaah. That patch was subtle garbage. The "add to list" thing did this:
    rcu_assign_pointer(entry->next, p);

Re: [git pull] vfs pile 1 (splice)

2016-10-10 Thread Linus Torvalds
On Mon, Oct 10, 2016 at 7:03 AM, Christoph Lameter wrote:
>
> Hmm.. Then get_freepointer_safe may not be ok. Should not trigger any
> faults.
So the reason seems to be that SLUB doesn't actually react well to double-freeing bugs. I'm not sure how to fix that. I think the optimistic load that

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Aaron Conole
Linus Torvalds writes:
> On Mon, Oct 10, 2016 at 9:28 AM, Linus Torvalds
> wrote:
>>
>> So as I already answered to Dave, I'm not actually sure that this was
>> the buggy code, or that my patch would make any difference at all.
>
> My patch does seem to fix things, and in fact the warning about

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Linus Torvalds
On Mon, Oct 10, 2016 at 9:28 AM, Linus Torvalds wrote:
>
> So as I already answered to Dave, I'm not actually sure that this was
> the buggy code, or that my patch would make any difference at all.
My patch does seem to fix things, and in fact the warning about "hook not found" now triggers. So

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Linus Torvalds
On Mon, Oct 10, 2016 at 6:49 AM, Aaron Conole wrote:
>
> Okay, I'm looking it over. Sorry for the mess.
So as I already answered to Dave, I'm not actually sure that this was the buggy code, or that my patch would make any difference at all. I never got a good reproducer for the bug: I spent

Re: [git pull] vfs pile 1 (splice)

2016-10-10 Thread Christoph Lameter
On Sun, 9 Oct 2016, Linus Torvalds wrote:
> Hmm. When I enabled SLUB debugging, I also enabled DEBUG_PAGEALLOC,
> because "why not". But it turns out that that may have been a mistake,
> because it changes the very path that failed to no longer do that
> failing access (or rather, it does it as a

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-10 Thread Aaron Conole
Linus Torvalds writes:
> On Sun, Oct 9, 2016 at 7:49 PM, Linus Torvalds
> wrote:
>>
>> There is one *correct* way to remove an entry from a singly linked
>> list, and it looks like this:
>>
>> struct entry **pp, *p;
>>
>> pp =
>> while ((p = *pp) != NULL) {
>> if

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-09 Thread Linus Torvalds
On Sun, Oct 9, 2016 at 7:49 PM, Linus Torvalds wrote:
>
> There is one *correct* way to remove an entry from a singly linked
> list, and it looks like this:
>
>     struct entry **pp, *p;
>
>     pp =
>     while ((p = *pp) != NULL) {
>         if (right_entry(p)) {
>             *pp = p->next;
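
The archive cuts the quoted idiom off mid-loop. What follows is a minimal, self-contained sketch of the same pointer-to-pointer removal in plain userspace C, not the netfilter code under discussion. The `key` field and the names `list_remove` and `list_remove_demo` are illustrative assumptions, and the match test stands in for the `right_entry(p)` check in the quote.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative list node; the `key` field is an assumption for this sketch. */
struct entry {
    int key;
    struct entry *next;
};

/*
 * Unlink the first entry whose key matches and return it, or NULL.
 * `pp` always points at the link that leads to `p` (either the list
 * head or some entry's `next` field), so removing the head needs no
 * special case.
 */
static struct entry *list_remove(struct entry **head, int key)
{
    struct entry **pp = head, *p;

    while ((p = *pp) != NULL) {
        if (p->key == key) {
            *pp = p->next;   /* splice p out of the chain */
            p->next = NULL;
            return p;
        }
        pp = &p->next;       /* advance to the next link */
    }
    return NULL;
}

/* Small self-check covering middle, head, and missing entries. */
static void list_remove_demo(void)
{
    struct entry c = {3, NULL}, b = {2, &c}, a = {1, &b};
    struct entry *head = &a;

    assert(list_remove(&head, 2) == &b);    /* middle entry */
    assert(head == &a && a.next == &c);
    assert(list_remove(&head, 1) == &a);    /* head entry */
    assert(head == &c);
    assert(list_remove(&head, 42) == NULL); /* not present */
    assert(head == &c);
}
```

Because the loop walks links rather than nodes, there is no "previous node" variable to keep in sync. The sketch deliberately leaves out the RCU publication and locking a kernel list would need.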

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-09 Thread Linus Torvalds
On Sun, Oct 9, 2016 at 6:35 PM, Aaron Conole wrote:
>
> I was just about to build and test something similar:
So I haven't actually tested that one, but looking at the code, it really looks very bogus. In fact, that code just looks like crap. It does *not* do a proper "remove singly linked list

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-09 Thread Aaron Conole
Florian Westphal writes:
> Linus Torvalds wrote:
>> On Sun, Oct 9, 2016 at 12:11 PM, Linus Torvalds
>> wrote:
>> >
>> > Anyway, I don't think I can bisect it, but I'll try to narrow it down
>> > a *bit* at least.
>> >
>> > Not doing any more pulls on this unstable base, I've been puttering
>>

Re: slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-09 Thread Florian Westphal
Linus Torvalds wrote:
> On Sun, Oct 9, 2016 at 12:11 PM, Linus Torvalds
> wrote:
> >
> > Anyway, I don't think I can bisect it, but I'll try to narrow it down
> > a *bit* at least.
> >
> > Not doing any more pulls on this unstable base, I've been puttering
> > around in trying to clean up some

slab corruption with current -git (was Re: [git pull] vfs pile 1 (splice))

2016-10-09 Thread Linus Torvalds
On Sun, Oct 9, 2016 at 12:11 PM, Linus Torvalds wrote:
>
> Anyway, I don't think I can bisect it, but I'll try to narrow it down
> a *bit* at least.
>
> Not doing any more pulls on this unstable base, I've been puttering
> around in trying to clean up some stupid printk logging issues
> instead.

Re: [git pull] vfs pile 1 (splice)

2016-10-09 Thread Linus Torvalds
On Sun, Oct 9, 2016 at 11:40 AM, Linus Torvalds wrote:
>
> I'll continue with *just* SLUB debugging on, but I thought it was
> interesting how enabling more memory access debugging actually ends up
> changing some really subtle code.
Indeed, now with DEBUG_PAGEALLOC disabled, I got a crash

Re: [git pull] vfs pile 1 (splice)

2016-10-09 Thread Linus Torvalds
On Sat, Oct 8, 2016 at 11:05 PM, Linus Torvalds wrote:
>
> Hmm. I've now gotten two oopses today, all at __kmalloc+0xc3/0x1f0,
> which seems to be the
>
>     *(void **)(object + s->offset);
>
> in get_freepointer().
Actually, it's in "get_freepointer_safe()", it's just that without
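
The quoted expression reads a next-free pointer stored inside a free object at a per-cache offset. The sketch below is a toy userspace free list built around that one expression, not SLUB's actual implementation. Every `toy_*` name is a hypothetical introduced here, and the sketch has none of the real allocator's per-CPU state, cmpxchg fast path, or debug checks. It shows why a double free matters when the link lives inside the object: the object ends up pointing at itself, so later allocations can return the same memory twice.

```c
#include <assert.h>
#include <stddef.h>

/* Toy cache descriptor; all names here are assumptions for illustration. */
struct toy_cache {
    size_t offset;   /* where the next-free pointer sits inside a free object */
    void *freelist;  /* first free object, or NULL */
};

/* Same shape as the expression quoted in the thread. */
static void *toy_get_freepointer(const struct toy_cache *s, void *object)
{
    return *(void **)((char *)object + s->offset);
}

static void toy_set_freepointer(const struct toy_cache *s, void *object, void *fp)
{
    *(void **)((char *)object + s->offset) = fp;
}

/* Pop the first free object; the list head moves to its stored link. */
static void *toy_alloc(struct toy_cache *s)
{
    void *object = s->freelist;

    if (object)
        s->freelist = toy_get_freepointer(s, object);
    return object;
}

/* Push an object; nothing here detects that it may already be free. */
static void toy_free(struct toy_cache *s, void *object)
{
    toy_set_freepointer(s, object, s->freelist);
    s->freelist = object;
}

/* Double free of `b` turns the list into a one-element cycle. */
static void toy_double_free_demo(void)
{
    void *a[4], *b[4];  /* pointer-aligned backing storage for two objects */
    struct toy_cache s = { .offset = 0, .freelist = NULL };

    toy_free(&s, a);
    toy_free(&s, b);    /* freelist: b -> a */
    toy_free(&s, b);    /* bug: b's link now points at b itself */

    assert(toy_alloc(&s) == b);
    assert(toy_alloc(&s) == b);  /* the same object is handed out twice */
}
```

In this toy version the corruption is silent. Anything the first owner of `b` writes at `offset` is later interpreted as a link, which is one plausible way a dereference like the quoted one can fault.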

Re: [git pull] vfs pile 1 (splice)

2016-10-09 Thread Linus Torvalds
On Fri, Oct 7, 2016 at 3:20 PM, Al Viro wrote:
> splice stuff.
Hmm. I've now gotten two oopses today, all at __kmalloc+0xc3/0x1f0, which seems to be the
    *(void **)(object + s->offset);
in get_freepointer(). Because it started happening today, I'm inclined to blame mainly stuff I

[git pull] vfs pile 1 (splice)

2016-10-07 Thread Al Viro
splice stuff. There are conflicts in lustre; proposed resolution is in #merge-candidate (same as it is in linux-next). There's a bunch of branches this cycle, both mine and from other folks and I'd rather send pull requests separately. This one is the conversion of ->splice_read() to
