Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamadawrote: > Sorry for chiming in late. > > I noticed this thread today, > honestly, the commit made me upset. > > > Can I suggest another way to make it less fragile? > __attribute((...)) can be placed after 'struct'. > > > So, we can write: > > > struct __randomize_layout path { > struct vfsmount *mnt; > struct dentry *dentry; > }; > > > instead of > > > struct path { > struct vfsmount *mnt; > struct dentry *dentry; > } __randomize_layout; Ugh. I had tried this after the struct _name_, not after "struct" itself. This does fix it, though it remains fragile, as you mention. > If we force the former notation, > the undefined __randomize_layout results in a build error > instead of silent broken code generation. > > > It is true somebody can still place > __randomize_layout after the closing brace, > but can we check this by coccicheck or checkpatch.pl? > (we can describe it in coding style documentation, of course) > > > IMHO, we should not (ab)use include/linux/kconfig.h > to bring in misc things. I'm happy to send a patch that reverts the other changes and relocates all the markings... Linus, how would you like this to go? -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada wrote: > Sorry for chiming in late. > > I noticed this thread today, > honestly, the commit made me upset. > > > Can I suggest another way to make it less fragile? > __attribute((...)) can be placed after 'struct'. > > > So, we can write: > > > struct __randomize_layout path { > struct vfsmount *mnt; > struct dentry *dentry; > }; > > > instead of > > > struct path { > struct vfsmount *mnt; > struct dentry *dentry; > } __randomize_layout; Ugh. I had tried this after the struct _name_, not after "struct" itself. This does fix it, though it remains fragile, as you mention. > If we force the former notation, > the undefined __randomize_layout results in a build error > instead of silent broken code generation. > > > It is true somebody can still place > __randomize_layout after the closing brace, > but can we check this by coccicheck or checkpatch.pl? > (we can describe it in coding style documentation, of course) > > > IMHO, we should not (ab)use include/linux/kconfig.h > to bring in misc things. I'm happy to send a patch that reverts the other changes and relocates all the markings... Linus, how would you like this to go? -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamadawrote: > > Can I suggest another way to make it less fragile? > __attribute((...)) can be placed after 'struct'. That avoids the actual bug, but it wouldn't have helped _find_ the problem in the first place. If somebody ever does the same thing, they'd hit the same issue. And it's not just __randomize_struct, it's any of our other type markers. We can say "don't do that", but if there is no automated checking, it's still ripe to cause problems just because somebody didn't notice. So I'd rather have something that causes a build failure when something goes wrong, rather than silently accepting syntax that wasn't intended. Linus
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada wrote: > > Can I suggest another way to make it less fragile? > __attribute((...)) can be placed after 'struct'. That avoids the actual bug, but it wouldn't have helped _find_ the problem in the first place. If somebody ever does the same thing, they'd hit the same issue. And it's not just __randomize_struct, it's any of our other type markers. We can say "don't do that", but if there is no automated checking, it's still ripe to cause problems just because somebody didn't notice. So I'd rather have something that causes a build failure when something goes wrong, rather than silently accepting syntax that wasn't intended. Linus
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
Hi Linus, 2018-02-22 7:47 GMT+09:00 Linus Torvalds: > On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero > wrote: >> >> One can see that offsets used to access various members of struct path are >> different, and also that the original file from step 3 contains an object >> named "__randomize_layout". > > Whee. > > Thanks for root-causing this issue, and this syntax of ours is clearly > *much* too fragile. > > We actually have similar issues with some of our other attributes, > where out nice "helpful" attribute shorthand can end up being just > silently interpreted as a variable name if they aren't defined in > time. > > For most of our other attributes, it just doesn't matter all that much > if some user doesn't happen to see the attribute. For > __randomize_layout, it's obviously very fatal, and silently just > generates crazy code. > > I'm not entirely sure what the right solution is, because it's > obviously much too easy to miss some #include by mistake. It's easy to > say "you should always include the proper header", but if a failure to > do so doesn't end up with any warnings or errors, but just silent bad > code generation, it's much too fragile. > > I wonder if we could change the syntax of that "__randomize_layout" > thing. Some of our related helper macros (ie > randomized_struct_fields_start/end) don't have the same problem, > because if you don't have the define for them, the compiler will > complain about bad syntax. > > And other attribute specifiers we encourage people to put in other > parts of the type, like __user etc, so they don't have that same > parsing issue. > > I guess one _extreme_ fix for this would be to put > > extern struct nostruct __randomize_layout; > > in our include/linux/kconfig.h, which I think we end up always > including first thanks to having it on the command line. > > Because if you do that, you actually get an error: > > CC [M] fs/nfsd/nfs4xdr.o > In file included from ./include/linux/fs_struct.h:5:0, >from fs/nfsd/nfs4xdr.c:36: > ./include/linux/path.h:11:3: error: conflicting types for > ‘__randomize_layout’ >} __randomize_layout; > ^~ > In file included from :0:0: > ././include/linux/kconfig.h:8:28: note: previous declaration of > ‘__randomize_layout’ was here >extern struct nostruct __randomize_layout; > ^~ > make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1 > > and we would have figured this out immediately. > > Broken example patch appended, in case somebody wants to play with > something like this or comes up with a better model entirely.. > >Linus > Sorry for chiming in late. I noticed this thread today, honestly, the commit made me upset. Can I suggest another way to make it less fragile? __attribute((...)) can be placed after 'struct'. So, we can write: struct __randomize_layout path { struct vfsmount *mnt; struct dentry *dentry; }; instead of struct path { struct vfsmount *mnt; struct dentry *dentry; } __randomize_layout; If we force the former notation, the undefined __randomize_layout results in a build error instead of silent broken code generation. It is true somebody can still place __randomize_layout after the closing brace, but can we check this by coccicheck or checkpatch.pl? (we can describe it in coding style documentation, of course) IMHO, we should not (ab)use include/linux/kconfig.h to bring in misc things. -- Best Regards Masahiro Yamada
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
Hi Linus, 2018-02-22 7:47 GMT+09:00 Linus Torvalds : > On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero > wrote: >> >> One can see that offsets used to access various members of struct path are >> different, and also that the original file from step 3 contains an object >> named "__randomize_layout". > > Whee. > > Thanks for root-causing this issue, and this syntax of ours is clearly > *much* too fragile. > > We actually have similar issues with some of our other attributes, > where out nice "helpful" attribute shorthand can end up being just > silently interpreted as a variable name if they aren't defined in > time. > > For most of our other attributes, it just doesn't matter all that much > if some user doesn't happen to see the attribute. For > __randomize_layout, it's obviously very fatal, and silently just > generates crazy code. > > I'm not entirely sure what the right solution is, because it's > obviously much too easy to miss some #include by mistake. It's easy to > say "you should always include the proper header", but if a failure to > do so doesn't end up with any warnings or errors, but just silent bad > code generation, it's much too fragile. > > I wonder if we could change the syntax of that "__randomize_layout" > thing. Some of our related helper macros (ie > randomized_struct_fields_start/end) don't have the same problem, > because if you don't have the define for them, the compiler will > complain about bad syntax. > > And other attribute specifiers we encourage people to put in other > parts of the type, like __user etc, so they don't have that same > parsing issue. > > I guess one _extreme_ fix for this would be to put > > extern struct nostruct __randomize_layout; > > in our include/linux/kconfig.h, which I think we end up always > including first thanks to having it on the command line. > > Because if you do that, you actually get an error: > > CC [M] fs/nfsd/nfs4xdr.o > In file included from ./include/linux/fs_struct.h:5:0, >from fs/nfsd/nfs4xdr.c:36: > ./include/linux/path.h:11:3: error: conflicting types for > ‘__randomize_layout’ >} __randomize_layout; > ^~ > In file included from :0:0: > ././include/linux/kconfig.h:8:28: note: previous declaration of > ‘__randomize_layout’ was here >extern struct nostruct __randomize_layout; > ^~ > make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1 > > and we would have figured this out immediately. > > Broken example patch appended, in case somebody wants to play with > something like this or comes up with a better model entirely.. > >Linus > Sorry for chiming in late. I noticed this thread today, honestly, the commit made me upset. Can I suggest another way to make it less fragile? __attribute((...)) can be placed after 'struct'. So, we can write: struct __randomize_layout path { struct vfsmount *mnt; struct dentry *dentry; }; instead of struct path { struct vfsmount *mnt; struct dentry *dentry; } __randomize_layout; If we force the former notation, the undefined __randomize_layout results in a build error instead of silent broken code generation. It is true somebody can still place __randomize_layout after the closing brace, but can we check this by coccicheck or checkpatch.pl? (we can describe it in coding style documentation, of course) IMHO, we should not (ab)use include/linux/kconfig.h to bring in misc things. -- Best Regards Masahiro Yamada
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 4:23 PM, Kees Cookwrote: > On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds > wrote: >> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook wrote: >>> >>> Do you want me to send the patch for this, or do you already have it >>> prepared? >> >> I'd rather get something explicitly tested. I tried my earlier patch >> with "make allmodconfig" (and a fix to nfsd to make it compile), but >> now I'm back to testing hjl's gas updates so it would be better to get >> a tested commit with a good commit message. >> >>> The body-fields I had prepared for the nfs were: >>> >>> Reported-by: Patrick McLean >>> Reported-by: Maciej S. Szmigiero >> >> Oh, I think Maciej needs to get more than a "Reported-by:". This was a >> really subtle thing that we didn't figure out in the original thread, >> so give him a gold star in the form of "Root-caused-by:" or something. > > Oops, I just sent this out. I will adjust a re-send. I couldn't find a > documented field name for this... With the "root-cause" hint, I see we have used: 2Root-cause-analysis-by: 2Root-caused-by: 1Root-cause-found-by: I'll go with your "Root-caused-by" to tip the scale. :) -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 4:23 PM, Kees Cook wrote: > On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds > wrote: >> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook wrote: >>> >>> Do you want me to send the patch for this, or do you already have it >>> prepared? >> >> I'd rather get something explicitly tested. I tried my earlier patch >> with "make allmodconfig" (and a fix to nfsd to make it compile), but >> now I'm back to testing hjl's gas updates so it would be better to get >> a tested commit with a good commit message. >> >>> The body-fields I had prepared for the nfs were: >>> >>> Reported-by: Patrick McLean >>> Reported-by: Maciej S. Szmigiero >> >> Oh, I think Maciej needs to get more than a "Reported-by:". This was a >> really subtle thing that we didn't figure out in the original thread, >> so give him a gold star in the form of "Root-caused-by:" or something. > > Oops, I just sent this out. I will adjust a re-send. I couldn't find a > documented field name for this... With the "root-cause" hint, I see we have used: 2Root-cause-analysis-by: 2Root-caused-by: 1Root-cause-found-by: I'll go with your "Root-caused-by" to tip the scale. :) -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvaldswrote: > On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook wrote: >> >> Do you want me to send the patch for this, or do you already have it >> prepared? > > I'd rather get something explicitly tested. I tried my earlier patch > with "make allmodconfig" (and a fix to nfsd to make it compile), but > now I'm back to testing hjl's gas updates so it would be better to get > a tested commit with a good commit message. > >> The body-fields I had prepared for the nfs were: >> >> Reported-by: Patrick McLean >> Reported-by: Maciej S. Szmigiero > > Oh, I think Maciej needs to get more than a "Reported-by:". This was a > really subtle thing that we didn't figure out in the original thread, > so give him a gold star in the form of "Root-caused-by:" or something. Oops, I just sent this out. I will adjust a re-send. I couldn't find a documented field name for this... > *Fixing* this ends up being a one-liner or so. Finding the cause was > the painful part. Yes indeed! -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds wrote: > On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook wrote: >> >> Do you want me to send the patch for this, or do you already have it >> prepared? > > I'd rather get something explicitly tested. I tried my earlier patch > with "make allmodconfig" (and a fix to nfsd to make it compile), but > now I'm back to testing hjl's gas updates so it would be better to get > a tested commit with a good commit message. > >> The body-fields I had prepared for the nfs were: >> >> Reported-by: Patrick McLean >> Reported-by: Maciej S. Szmigiero > > Oh, I think Maciej needs to get more than a "Reported-by:". This was a > really subtle thing that we didn't figure out in the original thread, > so give him a gold star in the form of "Root-caused-by:" or something. Oops, I just sent this out. I will adjust a re-send. I couldn't find a documented field name for this... > *Fixing* this ends up being a one-liner or so. Finding the cause was > the painful part. Yes indeed! -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 4:12 PM, Kees Cookwrote: > > Do you want me to send the patch for this, or do you already have it > prepared? I'd rather get something explicitly tested. I tried my earlier patch with "make allmodconfig" (and a fix to nfsd to make it compile), but now I'm back to testing hjl's gas updates so it would be better to get a tested commit with a good commit message. > The body-fields I had prepared for the nfs were: > > Reported-by: Patrick McLean > Reported-by: Maciej S. Szmigiero Oh, I think Maciej needs to get more than a "Reported-by:". This was a really subtle thing that we didn't figure out in the original thread, so give him a gold star in the form of "Root-caused-by:" or something. *Fixing* this ends up being a one-liner or so. Finding the cause was the painful part. Linus
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook wrote: > > Do you want me to send the patch for this, or do you already have it > prepared? I'd rather get something explicitly tested. I tried my earlier patch with "make allmodconfig" (and a fix to nfsd to make it compile), but now I'm back to testing hjl's gas updates so it would be better to get a tested commit with a good commit message. > The body-fields I had prepared for the nfs were: > > Reported-by: Patrick McLean > Reported-by: Maciej S. Szmigiero Oh, I think Maciej needs to get more than a "Reported-by:". This was a really subtle thing that we didn't figure out in the original thread, so give him a gold star in the form of "Root-caused-by:" or something. *Fixing* this ends up being a one-liner or so. Finding the cause was the painful part. Linus
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 3:24 PM, Linus Torvaldswrote: > On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook wrote: >> >> I'll play with Linus's suggestion and see what we get. > > It may be just as well to just include from > and be done with it. Hah, yeah, that would certainly solve it too. :) > I do hate including unnecessary stuff because it makes builds slower, > but kernel header files probably don't get much more core than > . It also has the benefit of not letting it "go wrong" in the first place. (And the separate fix for nfs isn't needed...) Do you want me to send the patch for this, or do you already have it prepared? The body-fields I had prepared for the nfs were: Reported-by: Patrick McLean Reported-by: Maciej S. Szmigiero Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization") -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 3:24 PM, Linus Torvalds wrote: > On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook wrote: >> >> I'll play with Linus's suggestion and see what we get. > > It may be just as well to just include from > and be done with it. Hah, yeah, that would certainly solve it too. :) > I do hate including unnecessary stuff because it makes builds slower, > but kernel header files probably don't get much more core than > . It also has the benefit of not letting it "go wrong" in the first place. (And the separate fix for nfs isn't needed...) Do you want me to send the patch for this, or do you already have it prepared? The body-fields I had prepared for the nfs were: Reported-by: Patrick McLean Reported-by: Maciej S. Szmigiero Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization") -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:47 PM, Linus Torvaldswrote: > And other attribute specifiers we encourage people to put in other > parts of the type, like __user etc, so they don't have that same > parsing issue. Looking at other attributes we use on structs, we may have similar risks for these: __packed cacheline_aligned cacheline_aligned_in_smp cacheline_internodealigned_in_smp But they just haven't been used in places that we could trip over it as badly, AFAICT. > I guess one _extreme_ fix for this would be to put > > extern struct nostruct __randomize_layout; > > in our include/linux/kconfig.h, which I think we end up always > including first thanks to having it on the command line. We could do that for all the above, but I wonder if the real problem is our convention of using "regular" names for these kinds of attributes instead of parameterized names. If we always used something like: #define __struct(x) __attribute__(x) We'd avoid it, but we'd uglify our struct attributes: struct thing { ... } __struct(randomize_layout); though trying this now creates other problems. Hmmm. (Regardless, let me send the nfs fix separately...) -Kees > > Because if you do that, you actually get an error: > > CC [M] fs/nfsd/nfs4xdr.o > In file included from ./include/linux/fs_struct.h:5:0, >from fs/nfsd/nfs4xdr.c:36: > ./include/linux/path.h:11:3: error: conflicting types for > ‘__randomize_layout’ >} __randomize_layout; > ^~ > In file included from :0:0: > ././include/linux/kconfig.h:8:28: note: previous declaration of > ‘__randomize_layout’ was here >extern struct nostruct __randomize_layout; > ^~ > make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1 > > and we would have figured this out immediately. > > Broken example patch appended, in case somebody wants to play with > something like this or comes up with a better model entirely.. > >Linus > > --- > > diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h > index fec5076eda91..537dacb83380 100644 > --- a/include/linux/kconfig.h > +++ b/include/linux/kconfig.h > @@ -4,6 +4,10 @@ > > #include > > +#ifndef __ASSEMBLY__ > + extern struct nostruct __randomize_layout; > +#endif > + > #define __ARG_PLACEHOLDER_1 0, > #define __take_second_arg(__ignored, val, ...) val -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:47 PM, Linus Torvalds wrote: > And other attribute specifiers we encourage people to put in other > parts of the type, like __user etc, so they don't have that same > parsing issue. Looking at other attributes we use on structs, we may have similar risks for these: __packed cacheline_aligned cacheline_aligned_in_smp cacheline_internodealigned_in_smp But they just haven't been used in places that we could trip over it as badly, AFAICT. > I guess one _extreme_ fix for this would be to put > > extern struct nostruct __randomize_layout; > > in our include/linux/kconfig.h, which I think we end up always > including first thanks to having it on the command line. We could do that for all the above, but I wonder if the real problem is our convention of using "regular" names for these kinds of attributes instead of parameterized names. If we always used something like: #define __struct(x) __attribute__(x) We'd avoid it, but we'd uglify our struct attributes: struct thing { ... } __struct(randomize_layout); though trying this now creates other problems. Hmmm. (Regardless, let me send the nfs fix separately...) -Kees > > Because if you do that, you actually get an error: > > CC [M] fs/nfsd/nfs4xdr.o > In file included from ./include/linux/fs_struct.h:5:0, >from fs/nfsd/nfs4xdr.c:36: > ./include/linux/path.h:11:3: error: conflicting types for > ‘__randomize_layout’ >} __randomize_layout; > ^~ > In file included from :0:0: > ././include/linux/kconfig.h:8:28: note: previous declaration of > ‘__randomize_layout’ was here >extern struct nostruct __randomize_layout; > ^~ > make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1 > > and we would have figured this out immediately. > > Broken example patch appended, in case somebody wants to play with > something like this or comes up with a better model entirely.. > >Linus > > --- > > diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h > index fec5076eda91..537dacb83380 100644 > --- a/include/linux/kconfig.h > +++ b/include/linux/kconfig.h > @@ -4,6 +4,10 @@ > > #include > > +#ifndef __ASSEMBLY__ > + extern struct nostruct __randomize_layout; > +#endif > + > #define __ARG_PLACEHOLDER_1 0, > #define __take_second_arg(__ignored, val, ...) val -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:52 PM, Kees Cookwrote: > > I'll play with Linus's suggestion and see what we get. It may be just as well to just include from and be done with it. If you look at that hacky script I documented in commit 23c35f48f5fb ("pinctrl: remove include file from ") and run it in a fully built kernel tree, you'll see that that header is included from pretty much every single file anyway. At least for me, for an allmodconfig build, the top headers are 23322 arch/x86/include/uapi/asm/types.h 23322 include/asm-generic/int-ll64.h 23322 include/linux/types.h 23322 include/uapi/asm-generic/int-ll64.h 23322 include/uapi/asm-generic/types.h 23322 include/uapi/linux/types.h 23323 arch/x86/include/uapi/asm/bitsperlong.h 23323 include/asm-generic/bitsperlong.h 23323 include/uapi/asm-generic/bitsperlong.h 23326 include/linux/stringify.h 23390 include/linux/compiler_types.h and considering that I have 25949 object files in that tree, it really means that just about every compile ended up including that file anyway (yeah, the "orc_types.h" header ends up being mentioned twice for most files, so it looks even more hot, but that's not real data). I do hate including unnecessary stuff because it makes builds slower, but kernel header files probably don't get much more core than . Linus
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook wrote: > > I'll play with Linus's suggestion and see what we get. It may be just as well to just include from and be done with it. If you look at that hacky script I documented in commit 23c35f48f5fb ("pinctrl: remove include file from ") and run it in a fully built kernel tree, you'll see that that header is included from pretty much every single file anyway. At least for me, for an allmodconfig build, the top headers are 23322 arch/x86/include/uapi/asm/types.h 23322 include/asm-generic/int-ll64.h 23322 include/linux/types.h 23322 include/uapi/asm-generic/int-ll64.h 23322 include/uapi/asm-generic/types.h 23322 include/uapi/linux/types.h 23323 arch/x86/include/uapi/asm/bitsperlong.h 23323 include/asm-generic/bitsperlong.h 23323 include/uapi/asm-generic/bitsperlong.h 23326 include/linux/stringify.h 23390 include/linux/compiler_types.h and considering that I have 25949 object files in that tree, it really means that just about every compile ended up including that file anyway (yeah, the "orc_types.h" header ends up being mentioned twice for most files, so it looks even more hot, but that's not real data). I do hate including unnecessary stuff because it makes builds slower, but kernel header files probably don't get much more core than . Linus
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigierowrote: > One can see that offsets used to access various members of struct path are > different, and also that the original file from step 3 contains an object > named "__randomize_layout". > > This is caused by a fact that the current version of nfs4xdr.c includes > linux/fs_struct.h as the very first included header which then includes > linux/path.h as the very first included header, which then defines > struct path, but without including any files on its own. > > This results in __randomize_layout tag at the end of struct path > definition being treated as a variable name (since linux/compiler-gcc.h > that defines it as a type attribute has not been included yet). Oh, well done! That would explain the code offset I was seeing when the plugin on, but no-op, since the variable would still exist. I'll play with Linus's suggestion and see what we get. Thanks! -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero wrote: > One can see that offsets used to access various members of struct path are > different, and also that the original file from step 3 contains an object > named "__randomize_layout". > > This is caused by a fact that the current version of nfs4xdr.c includes > linux/fs_struct.h as the very first included header which then includes > linux/path.h as the very first included header, which then defines > struct path, but without including any files on its own. > > This results in __randomize_layout tag at the end of struct path > definition being treated as a variable name (since linux/compiler-gcc.h > that defines it as a type attribute has not been included yet). Oh, well done! That would explain the code offset I was seeing when the plugin on, but no-op, since the variable would still exist. I'll play with Linus's suggestion and see what we get. Thanks! -Kees -- Kees Cook Pixel Security
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigierowrote: > > One can see that offsets used to access various members of struct path are > different, and also that the original file from step 3 contains an object > named "__randomize_layout". Whee. Thanks for root-causing this issue, and this syntax of ours is clearly *much* too fragile. We actually have similar issues with some of our other attributes, where out nice "helpful" attribute shorthand can end up being just silently interpreted as a variable name if they aren't defined in time. For most of our other attributes, it just doesn't matter all that much if some user doesn't happen to see the attribute. For __randomize_layout, it's obviously very fatal, and silently just generates crazy code. I'm not entirely sure what the right solution is, because it's obviously much too easy to miss some #include by mistake. It's easy to say "you should always include the proper header", but if a failure to do so doesn't end up with any warnings or errors, but just silent bad code generation, it's much too fragile. I wonder if we could change the syntax of that "__randomize_layout" thing. Some of our related helper macros (ie randomized_struct_fields_start/end) don't have the same problem, because if you don't have the define for them, the compiler will complain about bad syntax. And other attribute specifiers we encourage people to put in other parts of the type, like __user etc, so they don't have that same parsing issue. I guess one _extreme_ fix for this would be to put extern struct nostruct __randomize_layout; in our include/linux/kconfig.h, which I think we end up always including first thanks to having it on the command line. Because if you do that, you actually get an error: CC [M] fs/nfsd/nfs4xdr.o In file included from ./include/linux/fs_struct.h:5:0, from fs/nfsd/nfs4xdr.c:36: ./include/linux/path.h:11:3: error: conflicting types for ‘__randomize_layout’ } __randomize_layout; ^~ In file included from :0:0: ././include/linux/kconfig.h:8:28: note: previous declaration of ‘__randomize_layout’ was here extern struct nostruct __randomize_layout; ^~ make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1 and we would have figured this out immediately. Broken example patch appended, in case somebody wants to play with something like this or comes up with a better model entirely.. Linus --- diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h index fec5076eda91..537dacb83380 100644 --- a/include/linux/kconfig.h +++ b/include/linux/kconfig.h @@ -4,6 +4,10 @@ #include +#ifndef __ASSEMBLY__ + extern struct nostruct __randomize_layout; +#endif + #define __ARG_PLACEHOLDER_1 0, #define __take_second_arg(__ignored, val, ...) val
Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero wrote: > > One can see that offsets used to access various members of struct path are > different, and also that the original file from step 3 contains an object > named "__randomize_layout". Whee. Thanks for root-causing this issue, and this syntax of ours is clearly *much* too fragile. We actually have similar issues with some of our other attributes, where out nice "helpful" attribute shorthand can end up being just silently interpreted as a variable name if they aren't defined in time. For most of our other attributes, it just doesn't matter all that much if some user doesn't happen to see the attribute. For __randomize_layout, it's obviously very fatal, and silently just generates crazy code. I'm not entirely sure what the right solution is, because it's obviously much too easy to miss some #include by mistake. It's easy to say "you should always include the proper header", but if a failure to do so doesn't end up with any warnings or errors, but just silent bad code generation, it's much too fragile. I wonder if we could change the syntax of that "__randomize_layout" thing. Some of our related helper macros (ie randomized_struct_fields_start/end) don't have the same problem, because if you don't have the define for them, the compiler will complain about bad syntax. And other attribute specifiers we encourage people to put in other parts of the type, like __user etc, so they don't have that same parsing issue. I guess one _extreme_ fix for this would be to put extern struct nostruct __randomize_layout; in our include/linux/kconfig.h, which I think we end up always including first thanks to having it on the command line. Because if you do that, you actually get an error: CC [M] fs/nfsd/nfs4xdr.o In file included from ./include/linux/fs_struct.h:5:0, from fs/nfsd/nfs4xdr.c:36: ./include/linux/path.h:11:3: error: conflicting types for ‘__randomize_layout’ } __randomize_layout; ^~ In file included from :0:0: ././include/linux/kconfig.h:8:28: note: previous declaration of ‘__randomize_layout’ was here extern struct nostruct __randomize_layout; ^~ make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1 and we would have figured this out immediately. Broken example patch appended, in case somebody wants to play with something like this or comes up with a better model entirely.. Linus --- diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h index fec5076eda91..537dacb83380 100644 --- a/include/linux/kconfig.h +++ b/include/linux/kconfig.h @@ -4,6 +4,10 @@ #include +#ifndef __ASSEMBLY__ + extern struct nostruct __randomize_layout; +#endif + #define __ARG_PLACEHOLDER_1 0, #define __take_second_arg(__ignored, val, ...) val
RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On 18.11.2017 06:14, Kees Cook wrote: > On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLeanwrote: >> On 2017-11-17 04:55 PM, Linus Torvalds wrote: >>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean wrote: I am still getting the crash at d9e12200852d, I figured I would double-check the "good" and "bad" kernels before starting a full bisect. >>> >>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid? >> >> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid. > > That's strange. With d9e12200852d the shuffle_seed variables won't > ever actually get used. (i.e. I wouldn't expect the seed to change any > behavior.) > > Can you confirm with something like this: > > > for (i = 0; i < 4; i++) { > seed[i] = shuffle_seed[i]; > > > You should see no reports of "Shuffling struct ..." > > And if it reports nothing, and you're on d9e12200852d, can you confirm > that switching to a "good" seed fixes it? (If it _does_, then I > suspect a build artifact being left behind or something odd like > that.) > >>> Kees removed even the baseline "randomize pure function pointer >>> structures", so at that commit, nothing should be randomized. >>> >>> But maybe the plugin code itself ends up confusing gcc somehow? >>> >>> Even when it doesn't actually do that "relayout_struct()" on the >>> structure, it always does those TYPE_ATTRIBUTES() games. > > FWIW, myself doing a build at d9e12200852d with and without > GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output > where I did spot-checks. > I have also hit a GPF in nfsd4_encode_fattr() with RANDSTRUCT plugin enabled. This function is located in a fs/nfsd/nfs4xdr.c file. The fault happened at "xdr_encode_hyper(p, exp->ex_path.mnt->mnt_sb->s_maxbytes)" line, namely when accessing s_maxbytes. exp->ex_path is of type struct path that has been annotated with __randomize_layout. It seems to me that this annotation isn't really taken into consideration when compiling nfs4xdr.c. This most likely results in dereferencing a value of exp->ex_path.dentry instead of exp->ex_path.mnt. Then some member of struct dentry is dereferenced as struct super_block to access its s_maxbytes member which results in an oops if it happens to be an invalid pointer (which it was in my case). How to reproduce the problem statically (tested on current Linus's tree and on 4.15.4, with gcc 7.3.0): 1) Enable RANDSTRUCT plugin, 2) Use a RANDSTRUCT seed that results in shuffling of struct path, Example: "55e5fea7ff662b333a190209ab31f35b6f0f2470f7d0e3c64430936169571106". 3) make fs/nfsd/nfs4xdr.s and save the result, 4) Insert "#include " at the top of fs/nfsd/nfs4xdr.c as the very first include directive. 5) make fs/nfsd/nfs4xdr.s and compare the result with the one from step 3. One can see that offsets used to access various members of struct path are different, and also that the original file from step 3 contains an object named "__randomize_layout". This is caused by a fact that the current version of nfs4xdr.c includes linux/fs_struct.h as the very first included header which then includes linux/path.h as the very first included header, which then defines struct path, but without including any files on its own. This results in __randomize_layout tag at the end of struct path definition being treated as a variable name (since linux/compiler-gcc.h that defines it as a type attribute has not been included yet). It looks like to me that every header file that defines a randomized struct also has to include linux/compiler_types.h or some other file that ultimately results in that file inclusion in order to make the RANDSTRUCT plugin work correctly. Maciej
RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)
On 18.11.2017 06:14, Kees Cook wrote: > On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean wrote: >> On 2017-11-17 04:55 PM, Linus Torvalds wrote: >>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean wrote: I am still getting the crash at d9e12200852d, I figured I would double-check the "good" and "bad" kernels before starting a full bisect. >>> >>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid? >> >> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid. > > That's strange. With d9e12200852d the shuffle_seed variables won't > ever actually get used. (i.e. I wouldn't expect the seed to change any > behavior.) > > Can you confirm with something like this: > > > for (i = 0; i < 4; i++) { > seed[i] = shuffle_seed[i]; > > > You should see no reports of "Shuffling struct ..." > > And if it reports nothing, and you're on d9e12200852d, can you confirm > that switching to a "good" seed fixes it? (If it _does_, then I > suspect a build artifact being left behind or something odd like > that.) > >>> Kees removed even the baseline "randomize pure function pointer >>> structures", so at that commit, nothing should be randomized. >>> >>> But maybe the plugin code itself ends up confusing gcc somehow? >>> >>> Even when it doesn't actually do that "relayout_struct()" on the >>> structure, it always does those TYPE_ATTRIBUTES() games. > > FWIW, myself doing a build at d9e12200852d with and without > GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output > where I did spot-checks. > I have also hit a GPF in nfsd4_encode_fattr() with RANDSTRUCT plugin enabled. This function is located in a fs/nfsd/nfs4xdr.c file. The fault happened at "xdr_encode_hyper(p, exp->ex_path.mnt->mnt_sb->s_maxbytes)" line, namely when accessing s_maxbytes. exp->ex_path is of type struct path that has been annotated with __randomize_layout. It seems to me that this annotation isn't really taken into consideration when compiling nfs4xdr.c. This most likely results in dereferencing a value of exp->ex_path.dentry instead of exp->ex_path.mnt. Then some member of struct dentry is dereferenced as struct super_block to access its s_maxbytes member which results in an oops if it happens to be an invalid pointer (which it was in my case). How to reproduce the problem statically (tested on current Linus's tree and on 4.15.4, with gcc 7.3.0): 1) Enable RANDSTRUCT plugin, 2) Use a RANDSTRUCT seed that results in shuffling of struct path, Example: "55e5fea7ff662b333a190209ab31f35b6f0f2470f7d0e3c64430936169571106". 3) make fs/nfsd/nfs4xdr.s and save the result, 4) Insert "#include " at the top of fs/nfsd/nfs4xdr.c as the very first include directive. 5) make fs/nfsd/nfs4xdr.s and compare the result with the one from step 3. One can see that offsets used to access various members of struct path are different, and also that the original file from step 3 contains an object named "__randomize_layout". This is caused by a fact that the current version of nfs4xdr.c includes linux/fs_struct.h as the very first included header which then includes linux/path.h as the very first included header, which then defines struct path, but without including any files on its own. This results in __randomize_layout tag at the end of struct path definition being treated as a variable name (since linux/compiler-gcc.h that defines it as a type attribute has not been included yet). It looks like to me that every header file that defines a randomized struct also has to include linux/compiler_types.h or some other file that ultimately results in that file inclusion in order to make the RANDSTRUCT plugin work correctly. Maciej
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 9:29 PM, Linus Torvaldswrote: > On Fri, Nov 17, 2017 at 9:14 PM, Kees Cook wrote: >> >> FWIW, myself doing a build at d9e12200852d with and without >> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output >> where I did spot-checks. > > That would probably be a good thing to check anyway - check the > difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit. > > Just do > >objdump --disassemble vmlinux > file > > and compare the two files for where the differences start occurring. Yeah, I was just doing that now. Looks like there _is_ something getting changed just from having the plugin enabled, but it appears localized. For me, the first non-offset change happens in lookup_user_key and persists for a while. -813893a7: 0f 85 55 03 00 00 jne 81389702 -813893ad: f0 41 ff 06 lock incl (%r14) -813893b1: 83 fb 07cmp$0x7,%ebx -813893b4: 4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp) ... +813893a7: 0f 85 35 03 00 00 jne 813896e2 +813893ad: 4d 89 f0mov%r14,%r8 +813893b0: f0 41 ff 06 lock incl (%r14) +813893b4: 83 fb 07cmp$0x7,%ebx +813893b7: 4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp) And removing the TYPE_ATTRIBUTES() poking makes the register storage differences go away, but there's still a 0x40 byte offset delta. I'll continue looking at this tomorrow. -Kees -- Kees Cook Pixel Security
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 9:29 PM, Linus Torvalds wrote: > On Fri, Nov 17, 2017 at 9:14 PM, Kees Cook wrote: >> >> FWIW, myself doing a build at d9e12200852d with and without >> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output >> where I did spot-checks. > > That would probably be a good thing to check anyway - check the > difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit. > > Just do > >objdump --disassemble vmlinux > file > > and compare the two files for where the differences start occurring. Yeah, I was just doing that now. Looks like there _is_ something getting changed just from having the plugin enabled, but it appears localized. For me, the first non-offset change happens in lookup_user_key and persists for a while. -813893a7: 0f 85 55 03 00 00 jne 81389702 -813893ad: f0 41 ff 06 lock incl (%r14) -813893b1: 83 fb 07cmp$0x7,%ebx -813893b4: 4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp) ... +813893a7: 0f 85 35 03 00 00 jne 813896e2 +813893ad: 4d 89 f0mov%r14,%r8 +813893b0: f0 41 ff 06 lock incl (%r14) +813893b4: 83 fb 07cmp$0x7,%ebx +813893b7: 4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp) And removing the TYPE_ATTRIBUTES() poking makes the register storage differences go away, but there's still a 0x40 byte offset delta. I'll continue looking at this tomorrow. -Kees -- Kees Cook Pixel Security
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 9:14 PM, Kees Cookwrote: > > FWIW, myself doing a build at d9e12200852d with and without > GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output > where I did spot-checks. That would probably be a good thing to check anyway - check the difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit. Just do objdump --disassemble vmlinux > file and compare the two files for where the differences start occurring. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 9:14 PM, Kees Cook wrote: > > FWIW, myself doing a build at d9e12200852d with and without > GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output > where I did spot-checks. That would probably be a good thing to check anyway - check the difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit. Just do objdump --disassemble vmlinux > file and compare the two files for where the differences start occurring. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLeanwrote: > On 2017-11-17 04:55 PM, Linus Torvalds wrote: >> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean wrote: >>> >>> I am still getting the crash at d9e12200852d, I figured I would >>> double-check the "good" and "bad" kernels before starting a full bisect. >> >> .. but without GCC_PLUGIN_RANDSTRUCT it's solid? > > Yes, without GCC_PLUGIN_RANDSTRUCT it's solid. That's strange. With d9e12200852d the shuffle_seed variables won't ever actually get used. (i.e. I wouldn't expect the seed to change any behavior.) Can you confirm with something like this: diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c index cdaac8c66734..aac570a57d7d 100644 --- a/scripts/gcc-plugins/randomize_layout_plugin.c +++ b/scripts/gcc-plugins/randomize_layout_plugin.c @@ -267,12 +267,10 @@ static void shuffle(const_tree type, tree *newtree, unsigned long length) structname = ORIG_TYPE_NAME(type); -#ifdef __DEBUG_PLUGIN fprintf(stderr, "Shuffling struct %s %p\n", (const char *)structname, type); #ifdef __DEBUG_VERBOSE debug_tree((tree)type); #endif -#endif for (i = 0; i < 4; i++) { seed[i] = shuffle_seed[i]; You should see no reports of "Shuffling struct ..." And if it reports nothing, and you're on d9e12200852d, can you confirm that switching to a "good" seed fixes it? (If it _does_, then I suspect a build artifact being left behind or something odd like that.) >> Kees removed even the baseline "randomize pure function pointer >> structures", so at that commit, nothing should be randomized. >> >> But maybe the plugin code itself ends up confusing gcc somehow? >> >> Even when it doesn't actually do that "relayout_struct()" on the >> structure, it always does those TYPE_ATTRIBUTES() games. FWIW, myself doing a build at d9e12200852d with and without GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output where I did spot-checks. Also, do you have any other plugins enabled? (Can you send your .config?) -Kees -- Kees Cook Pixel Security
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean wrote: > On 2017-11-17 04:55 PM, Linus Torvalds wrote: >> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean wrote: >>> >>> I am still getting the crash at d9e12200852d, I figured I would >>> double-check the "good" and "bad" kernels before starting a full bisect. >> >> .. but without GCC_PLUGIN_RANDSTRUCT it's solid? > > Yes, without GCC_PLUGIN_RANDSTRUCT it's solid. That's strange. With d9e12200852d the shuffle_seed variables won't ever actually get used. (i.e. I wouldn't expect the seed to change any behavior.) Can you confirm with something like this: diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c b/scripts/gcc-plugins/randomize_layout_plugin.c index cdaac8c66734..aac570a57d7d 100644 --- a/scripts/gcc-plugins/randomize_layout_plugin.c +++ b/scripts/gcc-plugins/randomize_layout_plugin.c @@ -267,12 +267,10 @@ static void shuffle(const_tree type, tree *newtree, unsigned long length) structname = ORIG_TYPE_NAME(type); -#ifdef __DEBUG_PLUGIN fprintf(stderr, "Shuffling struct %s %p\n", (const char *)structname, type); #ifdef __DEBUG_VERBOSE debug_tree((tree)type); #endif -#endif for (i = 0; i < 4; i++) { seed[i] = shuffle_seed[i]; You should see no reports of "Shuffling struct ..." And if it reports nothing, and you're on d9e12200852d, can you confirm that switching to a "good" seed fixes it? (If it _does_, then I suspect a build artifact being left behind or something odd like that.) >> Kees removed even the baseline "randomize pure function pointer >> structures", so at that commit, nothing should be randomized. >> >> But maybe the plugin code itself ends up confusing gcc somehow? >> >> Even when it doesn't actually do that "relayout_struct()" on the >> structure, it always does those TYPE_ATTRIBUTES() games. FWIW, myself doing a build at d9e12200852d with and without GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output where I did spot-checks. Also, do you have any other plugins enabled? (Can you send your .config?) -Kees -- Kees Cook Pixel Security
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-17 04:55 PM, Linus Torvalds wrote: > On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLeanwrote: >> >> I am still getting the crash at d9e12200852d, I figured I would >> double-check the "good" and "bad" kernels before starting a full bisect. > > .. but without GCC_PLUGIN_RANDSTRUCT it's solid? Yes, without GCC_PLUGIN_RANDSTRUCT it's solid. > Kees removed even the baseline "randomize pure function pointer > structures", so at that commit, nothing should be randomized. > > But maybe the plugin code itself ends up confusing gcc somehow? > > Even when it doesn't actually do that "relayout_struct()" on the > structure, it always does those TYPE_ATTRIBUTES() games.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-17 04:55 PM, Linus Torvalds wrote: > On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean wrote: >> >> I am still getting the crash at d9e12200852d, I figured I would >> double-check the "good" and "bad" kernels before starting a full bisect. > > .. but without GCC_PLUGIN_RANDSTRUCT it's solid? Yes, without GCC_PLUGIN_RANDSTRUCT it's solid. > Kees removed even the baseline "randomize pure function pointer > structures", so at that commit, nothing should be randomized. > > But maybe the plugin code itself ends up confusing gcc somehow? > > Even when it doesn't actually do that "relayout_struct()" on the > structure, it always does those TYPE_ATTRIBUTES() games.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLeanwrote: > > I am still getting the crash at d9e12200852d, I figured I would > double-check the "good" and "bad" kernels before starting a full bisect. .. but without GCC_PLUGIN_RANDSTRUCT it's solid? Kees removed even the baseline "randomize pure function pointer structures", so at that commit, nothing should be randomized. But maybe the plugin code itself ends up confusing gcc somehow? Even when it doesn't actually do that "relayout_struct()" on the structure, it always does those TYPE_ATTRIBUTES() games. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean wrote: > > I am still getting the crash at d9e12200852d, I figured I would > double-check the "good" and "bad" kernels before starting a full bisect. .. but without GCC_PLUGIN_RANDSTRUCT it's solid? Kees removed even the baseline "randomize pure function pointer structures", so at that commit, nothing should be randomized. But maybe the plugin code itself ends up confusing gcc somehow? Even when it doesn't actually do that "relayout_struct()" on the structure, it always does those TYPE_ATTRIBUTES() games. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-17 01:26 PM, Kees Cook wrote: > On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLeanwrote: >> On 2017-11-16 04:54 PM, Kees Cook wrote: >>> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean wrote: On 2017-11-11 09:31 AM, Linus Torvalds wrote: > Boris Lukashev points out that Patrick should probably check a newer > version of gcc. > > I looked around, and in one of the emails, Patrick said: > > "No changes, both the working and broken kernels were built with >distro-provided gcc 5.4.0 and binutils 2.28.1" > > and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but > it's a bug-fix release to a pretty old branch that is not exactly new. > > It would probably be good to check if the problems persist with gcc > 6.x or 7.x.. I have no idea which gcc version the randstruct people > tend to use themselves. I just tested it with gcc 7.2, and was able to reproduce the NULL pointer dereference, the backtrace looks slightly different this time. I will also test with binutils 2.29, though I doubt that will make any difference. > [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at > 0560 > [ 56.166563] IP: vfs_statfs+0x7c/0xc0 > [ 56.167249] PGD 0 P4D 0 > [ 56.167860] Oops: [#1] SMP > [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 > xt_multiport xt_addrtype iptable_mangle iptable> > [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O > 4.14.0-git-kratos-1 #1 > [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 > [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 > [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 > [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 > [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: > 0020 > [ 56.186085] RDX: 1801 RSI: 1801 RDI: > > [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: > 00ff > [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: > > [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: > c90002c1bbf0 > [ 56.190444] FS: () GS:88041fc0() > knlGS: > [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 > [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: > 001606f0 > [ 56.193898] Call Trace: > [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 > [ 56.195267] ? generic_permission+0x12c/0x1a0 > [ 56.196025] nfsd4_encode_getattr+0x25/0x30 > [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 > [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 > [ 56.198268] nfsd_dispatch+0xe8/0x220 > [ 56.198968] svc_process_common+0x475/0x640 > [ 56.199696] ? nfsd_destroy+0x60/0x60 > [ 56.200404] svc_process+0xf2/0x1a0 > [ 56.201079] nfsd+0xe3/0x150 > [ 56.201706] kthread+0x117/0x130 > [ 56.202354] ? kthread_create_on_node+0x40/0x40 > [ 56.203100] ret_from_fork+0x25/0x30 > [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 > 81 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> > [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 > [ 56.207110] CR2: 0560 > [ 56.207763] ---[ end trace d452986a80f64aaa ]--- > On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: >> >> I'll take a closer look at this and see if I can provide something to >> narrow it down. >>> >>> How reliable is this crash? The best idea I have to isolate it would >>> be to bisect the additions of the __randomize_layout markings on >>> various structures. I would start with the ones Al is most upset to >>> see randomized. ;) >> >> It's pretty reliable, once I get a bad seed I can reproduce the crash >> pretty quickly. >> >>> For the first step, I'd try a revert of >>> 9225331b310821760f39ba55b00b8973602adbb5, which enables a large >>> portion of struct randomization. If that doesn't change things, I can >>> provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9 >>> and then re-applies __randomize_layout one structure per patch, and >>> you could bisect that? >> >> Sure, I can bisect that. > > Okay, that should at least let us know if this is a specific struct > that is not expecting to get randomized, or if there is some deeper > flaw. Here's the tree, based on 4.14: > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/randstruct/bisection > > With commit d9e12200852d, all randomization selections are reverted. I > would expect this to be a "good" kernel for the bisect. I am still
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-17 01:26 PM, Kees Cook wrote: > On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLean wrote: >> On 2017-11-16 04:54 PM, Kees Cook wrote: >>> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean wrote: On 2017-11-11 09:31 AM, Linus Torvalds wrote: > Boris Lukashev points out that Patrick should probably check a newer > version of gcc. > > I looked around, and in one of the emails, Patrick said: > > "No changes, both the working and broken kernels were built with >distro-provided gcc 5.4.0 and binutils 2.28.1" > > and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but > it's a bug-fix release to a pretty old branch that is not exactly new. > > It would probably be good to check if the problems persist with gcc > 6.x or 7.x.. I have no idea which gcc version the randstruct people > tend to use themselves. I just tested it with gcc 7.2, and was able to reproduce the NULL pointer dereference, the backtrace looks slightly different this time. I will also test with binutils 2.29, though I doubt that will make any difference. > [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at > 0560 > [ 56.166563] IP: vfs_statfs+0x7c/0xc0 > [ 56.167249] PGD 0 P4D 0 > [ 56.167860] Oops: [#1] SMP > [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 > xt_multiport xt_addrtype iptable_mangle iptable> > [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O > 4.14.0-git-kratos-1 #1 > [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 > [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 > [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 > [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 > [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: > 0020 > [ 56.186085] RDX: 1801 RSI: 1801 RDI: > > [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: > 00ff > [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: > > [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: > c90002c1bbf0 > [ 56.190444] FS: () GS:88041fc0() > knlGS: > [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 > [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: > 001606f0 > [ 56.193898] Call Trace: > [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 > [ 56.195267] ? generic_permission+0x12c/0x1a0 > [ 56.196025] nfsd4_encode_getattr+0x25/0x30 > [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 > [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 > [ 56.198268] nfsd_dispatch+0xe8/0x220 > [ 56.198968] svc_process_common+0x475/0x640 > [ 56.199696] ? nfsd_destroy+0x60/0x60 > [ 56.200404] svc_process+0xf2/0x1a0 > [ 56.201079] nfsd+0xe3/0x150 > [ 56.201706] kthread+0x117/0x130 > [ 56.202354] ? kthread_create_on_node+0x40/0x40 > [ 56.203100] ret_from_fork+0x25/0x30 > [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 > 81 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> > [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 > [ 56.207110] CR2: 0560 > [ 56.207763] ---[ end trace d452986a80f64aaa ]--- > On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: >> >> I'll take a closer look at this and see if I can provide something to >> narrow it down. >>> >>> How reliable is this crash? The best idea I have to isolate it would >>> be to bisect the additions of the __randomize_layout markings on >>> various structures. I would start with the ones Al is most upset to >>> see randomized. ;) >> >> It's pretty reliable, once I get a bad seed I can reproduce the crash >> pretty quickly. >> >>> For the first step, I'd try a revert of >>> 9225331b310821760f39ba55b00b8973602adbb5, which enables a large >>> portion of struct randomization. If that doesn't change things, I can >>> provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9 >>> and then re-applies __randomize_layout one structure per patch, and >>> you could bisect that? >> >> Sure, I can bisect that. > > Okay, that should at least let us know if this is a specific struct > that is not expecting to get randomized, or if there is some deeper > flaw. Here's the tree, based on 4.14: > https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/randstruct/bisection > > With commit d9e12200852d, all randomization selections are reverted. I > would expect this to be a "good" kernel for the bisect. I am still getting the crash at d9e12200852d, I figured I would double-check the
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLeanwrote: > On 2017-11-16 04:54 PM, Kees Cook wrote: >> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean wrote: >>> On 2017-11-11 09:31 AM, Linus Torvalds wrote: Boris Lukashev points out that Patrick should probably check a newer version of gcc. I looked around, and in one of the emails, Patrick said: "No changes, both the working and broken kernels were built with distro-provided gcc 5.4.0 and binutils 2.28.1" and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but it's a bug-fix release to a pretty old branch that is not exactly new. It would probably be good to check if the problems persist with gcc 6.x or 7.x.. I have no idea which gcc version the randstruct people tend to use themselves. >>> >>> I just tested it with gcc 7.2, and was able to reproduce the NULL >>> pointer dereference, the backtrace looks slightly different this time. >>> >>> I will also test with binutils 2.29, though I doubt that will make any >>> difference. >>> [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at 0560 [ 56.166563] IP: vfs_statfs+0x7c/0xc0 [ 56.167249] PGD 0 P4D 0 [ 56.167860] Oops: [#1] SMP [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_multiport xt_addrtype iptable_mangle iptable> [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O 4.14.0-git-kratos-1 #1 [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: 0020 [ 56.186085] RDX: 1801 RSI: 1801 RDI: [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 00ff [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: c90002c1bbf0 [ 56.190444] FS: () GS:88041fc0() knlGS: [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: 001606f0 [ 56.193898] Call Trace: [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 [ 56.195267] ? generic_permission+0x12c/0x1a0 [ 56.196025] nfsd4_encode_getattr+0x25/0x30 [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 [ 56.198268] nfsd_dispatch+0xe8/0x220 [ 56.198968] svc_process_common+0x475/0x640 [ 56.199696] ? nfsd_destroy+0x60/0x60 [ 56.200404] svc_process+0xf2/0x1a0 [ 56.201079] nfsd+0xe3/0x150 [ 56.201706] kthread+0x117/0x130 [ 56.202354] ? kthread_create_on_node+0x40/0x40 [ 56.203100] ret_from_fork+0x25/0x30 [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 [ 56.207110] CR2: 0560 [ 56.207763] ---[ end trace d452986a80f64aaa ]--- >>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: > > I'll take a closer look at this and see if I can provide something to > narrow it down. >> >> How reliable is this crash? The best idea I have to isolate it would >> be to bisect the additions of the __randomize_layout markings on >> various structures. I would start with the ones Al is most upset to >> see randomized. ;) > > It's pretty reliable, once I get a bad seed I can reproduce the crash > pretty quickly. > >> >> All that said, I'd like to better understand the BIOS side of this a >> little better. In the first email in this thread, you showed two BUGs >> separated by a little time, which implies to me that the NULL deref >> and the BIOS no longer POSTing are separate (though seemingly related) >> issues. Have you had machines survive the BUG without blowing up the >> BIOS? > > We had 3 machines die due to the BIOS issue (all of them pretty quickly > with the bad-seed kernel). All the dead machines had the same > motherboard model. I have not managed to reproduce the issue again on > the machine I restored via the IPMI interface, I suspect that it may be > a bug in the BIOS that was fixed in a more recent version. > >> >> I'm still trying to wrap my head around how the BIOS could be blowing >> up. I assume there's some magic memory address that is getting poked >> as a result of some struct randomization bug, so tracking that down >>
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLean wrote: > On 2017-11-16 04:54 PM, Kees Cook wrote: >> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean wrote: >>> On 2017-11-11 09:31 AM, Linus Torvalds wrote: Boris Lukashev points out that Patrick should probably check a newer version of gcc. I looked around, and in one of the emails, Patrick said: "No changes, both the working and broken kernels were built with distro-provided gcc 5.4.0 and binutils 2.28.1" and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but it's a bug-fix release to a pretty old branch that is not exactly new. It would probably be good to check if the problems persist with gcc 6.x or 7.x.. I have no idea which gcc version the randstruct people tend to use themselves. >>> >>> I just tested it with gcc 7.2, and was able to reproduce the NULL >>> pointer dereference, the backtrace looks slightly different this time. >>> >>> I will also test with binutils 2.29, though I doubt that will make any >>> difference. >>> [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at 0560 [ 56.166563] IP: vfs_statfs+0x7c/0xc0 [ 56.167249] PGD 0 P4D 0 [ 56.167860] Oops: [#1] SMP [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_multiport xt_addrtype iptable_mangle iptable> [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O 4.14.0-git-kratos-1 #1 [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: 0020 [ 56.186085] RDX: 1801 RSI: 1801 RDI: [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 00ff [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: c90002c1bbf0 [ 56.190444] FS: () GS:88041fc0() knlGS: [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: 001606f0 [ 56.193898] Call Trace: [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 [ 56.195267] ? generic_permission+0x12c/0x1a0 [ 56.196025] nfsd4_encode_getattr+0x25/0x30 [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 [ 56.198268] nfsd_dispatch+0xe8/0x220 [ 56.198968] svc_process_common+0x475/0x640 [ 56.199696] ? nfsd_destroy+0x60/0x60 [ 56.200404] svc_process+0xf2/0x1a0 [ 56.201079] nfsd+0xe3/0x150 [ 56.201706] kthread+0x117/0x130 [ 56.202354] ? kthread_create_on_node+0x40/0x40 [ 56.203100] ret_from_fork+0x25/0x30 [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 [ 56.207110] CR2: 0560 [ 56.207763] ---[ end trace d452986a80f64aaa ]--- >>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: > > I'll take a closer look at this and see if I can provide something to > narrow it down. >> >> How reliable is this crash? The best idea I have to isolate it would >> be to bisect the additions of the __randomize_layout markings on >> various structures. I would start with the ones Al is most upset to >> see randomized. ;) > > It's pretty reliable, once I get a bad seed I can reproduce the crash > pretty quickly. > >> >> All that said, I'd like to better understand the BIOS side of this a >> little better. In the first email in this thread, you showed two BUGs >> separated by a little time, which implies to me that the NULL deref >> and the BIOS no longer POSTing are separate (though seemingly related) >> issues. Have you had machines survive the BUG without blowing up the >> BIOS? > > We had 3 machines die due to the BIOS issue (all of them pretty quickly > with the bad-seed kernel). All the dead machines had the same > motherboard model. I have not managed to reproduce the issue again on > the machine I restored via the IPMI interface, I suspect that it may be > a bug in the BIOS that was fixed in a more recent version. > >> >> I'm still trying to wrap my head around how the BIOS could be blowing >> up. I assume there's some magic memory address that is getting poked >> as a result of some struct randomization bug, so tracking that down >> should be possible assuming you can stand reflashing your BIOS across
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-16 04:54 PM, Kees Cook wrote: > On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLeanwrote: >> On 2017-11-11 09:31 AM, Linus Torvalds wrote: >>> Boris Lukashev points out that Patrick should probably check a newer >>> version of gcc. >>> >>> I looked around, and in one of the emails, Patrick said: >>> >>> "No changes, both the working and broken kernels were built with >>>distro-provided gcc 5.4.0 and binutils 2.28.1" >>> >>> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but >>> it's a bug-fix release to a pretty old branch that is not exactly new. >>> >>> It would probably be good to check if the problems persist with gcc >>> 6.x or 7.x.. I have no idea which gcc version the randstruct people >>> tend to use themselves. >> >> I just tested it with gcc 7.2, and was able to reproduce the NULL >> pointer dereference, the backtrace looks slightly different this time. >> >> I will also test with binutils 2.29, though I doubt that will make any >> difference. >> >>> [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at >>> 0560 >>> [ 56.166563] IP: vfs_statfs+0x7c/0xc0 >>> [ 56.167249] PGD 0 P4D 0 >>> [ 56.167860] Oops: [#1] SMP >>> [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 >>> xt_multiport xt_addrtype iptable_mangle iptable> >>> [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O >>> 4.14.0-git-kratos-1 #1 >>> [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 >>> [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 >>> [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 >>> [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 >>> [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: >>> 0020 >>> [ 56.186085] RDX: 1801 RSI: 1801 RDI: >>> >>> [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: >>> 00ff >>> [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: >>> >>> [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: >>> c90002c1bbf0 >>> [ 56.190444] FS: () GS:88041fc0() >>> knlGS: >>> [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 >>> [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: >>> 001606f0 >>> [ 56.193898] Call Trace: >>> [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 >>> [ 56.195267] ? generic_permission+0x12c/0x1a0 >>> [ 56.196025] nfsd4_encode_getattr+0x25/0x30 >>> [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 >>> [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 >>> [ 56.198268] nfsd_dispatch+0xe8/0x220 >>> [ 56.198968] svc_process_common+0x475/0x640 >>> [ 56.199696] ? nfsd_destroy+0x60/0x60 >>> [ 56.200404] svc_process+0xf2/0x1a0 >>> [ 56.201079] nfsd+0xe3/0x150 >>> [ 56.201706] kthread+0x117/0x130 >>> [ 56.202354] ? kthread_create_on_node+0x40/0x40 >>> [ 56.203100] ret_from_fork+0x25/0x30 >>> [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 >>> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> >>> [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 >>> [ 56.207110] CR2: 0560 >>> [ 56.207763] ---[ end trace d452986a80f64aaa ]--- >> >>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: I'll take a closer look at this and see if I can provide something to narrow it down. > > How reliable is this crash? The best idea I have to isolate it would > be to bisect the additions of the __randomize_layout markings on > various structures. I would start with the ones Al is most upset to > see randomized. ;) It's pretty reliable, once I get a bad seed I can reproduce the crash pretty quickly. > > All that said, I'd like to better understand the BIOS side of this a > little better. In the first email in this thread, you showed two BUGs > separated by a little time, which implies to me that the NULL deref > and the BIOS no longer POSTing are separate (though seemingly related) > issues. Have you had machines survive the BUG without blowing up the > BIOS? We had 3 machines die due to the BIOS issue (all of them pretty quickly with the bad-seed kernel). All the dead machines had the same motherboard model. I have not managed to reproduce the issue again on the machine I restored via the IPMI interface, I suspect that it may be a bug in the BIOS that was fixed in a more recent version. > > I'm still trying to wrap my head around how the BIOS could be blowing > up. I assume there's some magic memory address that is getting poked > as a result of some struct randomization bug, so tracking that down > should be possible assuming you can stand reflashing your BIOS across > the bisects. That is our theory, some magic memory address that caused an overwrite of the flash where the BIOS code is
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-16 04:54 PM, Kees Cook wrote: > On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean wrote: >> On 2017-11-11 09:31 AM, Linus Torvalds wrote: >>> Boris Lukashev points out that Patrick should probably check a newer >>> version of gcc. >>> >>> I looked around, and in one of the emails, Patrick said: >>> >>> "No changes, both the working and broken kernels were built with >>>distro-provided gcc 5.4.0 and binutils 2.28.1" >>> >>> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but >>> it's a bug-fix release to a pretty old branch that is not exactly new. >>> >>> It would probably be good to check if the problems persist with gcc >>> 6.x or 7.x.. I have no idea which gcc version the randstruct people >>> tend to use themselves. >> >> I just tested it with gcc 7.2, and was able to reproduce the NULL >> pointer dereference, the backtrace looks slightly different this time. >> >> I will also test with binutils 2.29, though I doubt that will make any >> difference. >> >>> [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at >>> 0560 >>> [ 56.166563] IP: vfs_statfs+0x7c/0xc0 >>> [ 56.167249] PGD 0 P4D 0 >>> [ 56.167860] Oops: [#1] SMP >>> [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 >>> xt_multiport xt_addrtype iptable_mangle iptable> >>> [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O >>> 4.14.0-git-kratos-1 #1 >>> [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 >>> [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 >>> [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 >>> [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 >>> [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: >>> 0020 >>> [ 56.186085] RDX: 1801 RSI: 1801 RDI: >>> >>> [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: >>> 00ff >>> [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: >>> >>> [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: >>> c90002c1bbf0 >>> [ 56.190444] FS: () GS:88041fc0() >>> knlGS: >>> [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 >>> [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: >>> 001606f0 >>> [ 56.193898] Call Trace: >>> [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 >>> [ 56.195267] ? generic_permission+0x12c/0x1a0 >>> [ 56.196025] nfsd4_encode_getattr+0x25/0x30 >>> [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 >>> [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 >>> [ 56.198268] nfsd_dispatch+0xe8/0x220 >>> [ 56.198968] svc_process_common+0x475/0x640 >>> [ 56.199696] ? nfsd_destroy+0x60/0x60 >>> [ 56.200404] svc_process+0xf2/0x1a0 >>> [ 56.201079] nfsd+0xe3/0x150 >>> [ 56.201706] kthread+0x117/0x130 >>> [ 56.202354] ? kthread_create_on_node+0x40/0x40 >>> [ 56.203100] ret_from_fork+0x25/0x30 >>> [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 >>> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> >>> [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 >>> [ 56.207110] CR2: 0560 >>> [ 56.207763] ---[ end trace d452986a80f64aaa ]--- >> >>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: I'll take a closer look at this and see if I can provide something to narrow it down. > > How reliable is this crash? The best idea I have to isolate it would > be to bisect the additions of the __randomize_layout markings on > various structures. I would start with the ones Al is most upset to > see randomized. ;) It's pretty reliable, once I get a bad seed I can reproduce the crash pretty quickly. > > All that said, I'd like to better understand the BIOS side of this a > little better. In the first email in this thread, you showed two BUGs > separated by a little time, which implies to me that the NULL deref > and the BIOS no longer POSTing are separate (though seemingly related) > issues. Have you had machines survive the BUG without blowing up the > BIOS? We had 3 machines die due to the BIOS issue (all of them pretty quickly with the bad-seed kernel). All the dead machines had the same motherboard model. I have not managed to reproduce the issue again on the machine I restored via the IPMI interface, I suspect that it may be a bug in the BIOS that was fixed in a more recent version. > > I'm still trying to wrap my head around how the BIOS could be blowing > up. I assume there's some magic memory address that is getting poked > as a result of some struct randomization bug, so tracking that down > should be possible assuming you can stand reflashing your BIOS across > the bisects. That is our theory, some magic memory address that caused an overwrite of the flash where the BIOS code is stored. We are working under the assumption
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLeanwrote: > On 2017-11-11 09:31 AM, Linus Torvalds wrote: >> Boris Lukashev points out that Patrick should probably check a newer >> version of gcc. >> >> I looked around, and in one of the emails, Patrick said: >> >> "No changes, both the working and broken kernels were built with >>distro-provided gcc 5.4.0 and binutils 2.28.1" >> >> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but >> it's a bug-fix release to a pretty old branch that is not exactly new. >> >> It would probably be good to check if the problems persist with gcc >> 6.x or 7.x.. I have no idea which gcc version the randstruct people >> tend to use themselves. > > I just tested it with gcc 7.2, and was able to reproduce the NULL > pointer dereference, the backtrace looks slightly different this time. > > I will also test with binutils 2.29, though I doubt that will make any > difference. > >> [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at >> 0560 >> [ 56.166563] IP: vfs_statfs+0x7c/0xc0 >> [ 56.167249] PGD 0 P4D 0 >> [ 56.167860] Oops: [#1] SMP >> [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 >> xt_multiport xt_addrtype iptable_mangle iptable> >> [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O >> 4.14.0-git-kratos-1 #1 >> [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 >> [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 >> [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 >> [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 >> [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: >> 0020 >> [ 56.186085] RDX: 1801 RSI: 1801 RDI: >> >> [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: >> 00ff >> [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: >> >> [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: >> c90002c1bbf0 >> [ 56.190444] FS: () GS:88041fc0() >> knlGS: >> [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 >> [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: >> 001606f0 >> [ 56.193898] Call Trace: >> [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 >> [ 56.195267] ? generic_permission+0x12c/0x1a0 >> [ 56.196025] nfsd4_encode_getattr+0x25/0x30 >> [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 >> [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 >> [ 56.198268] nfsd_dispatch+0xe8/0x220 >> [ 56.198968] svc_process_common+0x475/0x640 >> [ 56.199696] ? nfsd_destroy+0x60/0x60 >> [ 56.200404] svc_process+0xf2/0x1a0 >> [ 56.201079] nfsd+0xe3/0x150 >> [ 56.201706] kthread+0x117/0x130 >> [ 56.202354] ? kthread_create_on_node+0x40/0x40 >> [ 56.203100] ret_from_fork+0x25/0x30 >> [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 >> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> >> [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 >> [ 56.207110] CR2: 0560 >> [ 56.207763] ---[ end trace d452986a80f64aaa ]--- > >> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: >>> >>> I'll take a closer look at this and see if I can provide something to >>> narrow it down. How reliable is this crash? The best idea I have to isolate it would be to bisect the additions of the __randomize_layout markings on various structures. I would start with the ones Al is most upset to see randomized. ;) All that said, I'd like to better understand the BIOS side of this a little better. In the first email in this thread, you showed two BUGs separated by a little time, which implies to me that the NULL deref and the BIOS no longer POSTing are separate (though seemingly related) issues. Have you had machines survive the BUG without blowing up the BIOS? I'm still trying to wrap my head around how the BIOS could be blowing up. I assume there's some magic memory address that is getting poked as a result of some struct randomization bug, so tracking that down should be possible assuming you can stand reflashing your BIOS across the bisects. For the first step, I'd try a revert of 9225331b310821760f39ba55b00b8973602adbb5, which enables a large portion of struct randomization. If that doesn't change things, I can provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9 and then re-applies __randomize_layout one structure per patch, and you could bisect that? -Kees -- Kees Cook Pixel Security
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean wrote: > On 2017-11-11 09:31 AM, Linus Torvalds wrote: >> Boris Lukashev points out that Patrick should probably check a newer >> version of gcc. >> >> I looked around, and in one of the emails, Patrick said: >> >> "No changes, both the working and broken kernels were built with >>distro-provided gcc 5.4.0 and binutils 2.28.1" >> >> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but >> it's a bug-fix release to a pretty old branch that is not exactly new. >> >> It would probably be good to check if the problems persist with gcc >> 6.x or 7.x.. I have no idea which gcc version the randstruct people >> tend to use themselves. > > I just tested it with gcc 7.2, and was able to reproduce the NULL > pointer dereference, the backtrace looks slightly different this time. > > I will also test with binutils 2.29, though I doubt that will make any > difference. > >> [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at >> 0560 >> [ 56.166563] IP: vfs_statfs+0x7c/0xc0 >> [ 56.167249] PGD 0 P4D 0 >> [ 56.167860] Oops: [#1] SMP >> [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 >> xt_multiport xt_addrtype iptable_mangle iptable> >> [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O >> 4.14.0-git-kratos-1 #1 >> [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 >> [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 >> [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 >> [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 >> [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: >> 0020 >> [ 56.186085] RDX: 1801 RSI: 1801 RDI: >> >> [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: >> 00ff >> [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: >> >> [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: >> c90002c1bbf0 >> [ 56.190444] FS: () GS:88041fc0() >> knlGS: >> [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 >> [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: >> 001606f0 >> [ 56.193898] Call Trace: >> [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 >> [ 56.195267] ? generic_permission+0x12c/0x1a0 >> [ 56.196025] nfsd4_encode_getattr+0x25/0x30 >> [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 >> [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 >> [ 56.198268] nfsd_dispatch+0xe8/0x220 >> [ 56.198968] svc_process_common+0x475/0x640 >> [ 56.199696] ? nfsd_destroy+0x60/0x60 >> [ 56.200404] svc_process+0xf2/0x1a0 >> [ 56.201079] nfsd+0xe3/0x150 >> [ 56.201706] kthread+0x117/0x130 >> [ 56.202354] ? kthread_create_on_node+0x40/0x40 >> [ 56.203100] ret_from_fork+0x25/0x30 >> [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 >> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> >> [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 >> [ 56.207110] CR2: 0560 >> [ 56.207763] ---[ end trace d452986a80f64aaa ]--- > >> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: >>> >>> I'll take a closer look at this and see if I can provide something to >>> narrow it down. How reliable is this crash? The best idea I have to isolate it would be to bisect the additions of the __randomize_layout markings on various structures. I would start with the ones Al is most upset to see randomized. ;) All that said, I'd like to better understand the BIOS side of this a little better. In the first email in this thread, you showed two BUGs separated by a little time, which implies to me that the NULL deref and the BIOS no longer POSTing are separate (though seemingly related) issues. Have you had machines survive the BUG without blowing up the BIOS? I'm still trying to wrap my head around how the BIOS could be blowing up. I assume there's some magic memory address that is getting poked as a result of some struct randomization bug, so tracking that down should be possible assuming you can stand reflashing your BIOS across the bisects. For the first step, I'd try a revert of 9225331b310821760f39ba55b00b8973602adbb5, which enables a large portion of struct randomization. If that doesn't change things, I can provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9 and then re-applies __randomize_layout one structure per patch, and you could bisect that? -Kees -- Kees Cook Pixel Security
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Tue, Nov 14, 2017 at 3:53 PM, Rasmus Villemoeswrote: > > Odd. 7.2 and 7.1 (both of which I've just compiled from source, no > special configure flags or anything) generate exactly the same (good) > code for fs/namespace.o after patching. I also tried with > CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces > reasonable code. Oh well. It might be configuration-dependent. I usually end up having a couple of different configurations depending on what I'm doing, one being a fairly minimal one, and one being the "allmodconfig" I use mainly for build testing. Linus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Tue, Nov 14, 2017 at 3:53 PM, Rasmus Villemoes wrote: > > Odd. 7.2 and 7.1 (both of which I've just compiled from source, no > special configure flags or anything) generate exactly the same (good) > code for fs/namespace.o after patching. I also tried with > CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces > reasonable code. Oh well. It might be configuration-dependent. I usually end up having a couple of different configurations depending on what I'm doing, one being a fairly minimal one, and one being the "allmodconfig" I use mainly for build testing. Linus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On 14 November 2017 at 23:43, Linus Torvaldswrote: > On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes > wrote: >> >> Can you be more specific? That's not what I see with gcc 7.1. > > I have gcc-7.2.1, and it made a horrible mess of the do_mount() code. Odd. 7.2 and 7.1 (both of which I've just compiled from source, no special configure flags or anything) generate exactly the same (good) code for fs/namespace.o after patching. I also tried with CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces reasonable code. Oh well. Rasmus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On 14 November 2017 at 23:43, Linus Torvalds wrote: > On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes > wrote: >> >> Can you be more specific? That's not what I see with gcc 7.1. > > I have gcc-7.2.1, and it made a horrible mess of the do_mount() code. Odd. 7.2 and 7.1 (both of which I've just compiled from source, no special configure flags or anything) generate exactly the same (good) code for fs/namespace.o after patching. I also tried with CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces reasonable code. Oh well. Rasmus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoeswrote: > > Can you be more specific? That's not what I see with gcc 7.1. I have gcc-7.2.1, and it made a horrible mess of the do_mount() code. Look for the comment "/* Separate the per-mountpoint flags */" and do the obvious conversion of the simple single-bit stuff. My gcc actually turned those into jumps after _after_ I converted it into the ternary operation, and then the ternary conversion actually did much worse, because it actually had two sides (one with a zero value, and one with the bit value to be set). I didn't have time to look into _why_ that code generated branch-overs, when the otherwise similar reverse case in fs/statfs.c did not. Linus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes wrote: > > Can you be more specific? That's not what I see with gcc 7.1. I have gcc-7.2.1, and it made a horrible mess of the do_mount() code. Look for the comment "/* Separate the per-mountpoint flags */" and do the obvious conversion of the simple single-bit stuff. My gcc actually turned those into jumps after _after_ I converted it into the ternary operation, and then the ternary conversion actually did much worse, because it actually had two sides (one with a zero value, and one with the bit value to be set). I didn't have time to look into _why_ that code generated branch-overs, when the otherwise similar reverse case in fs/statfs.c did not. Linus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On 14 November 2017 at 00:54, Linus Torvaldswrote: > On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds > wrote: >> >> So let's just rewrite that mnt_flags conversion that way, justr to get >> gcc to generate the obvious code. > > Oh wow. I tried to do the same thing in fs/namespace.c where it does > the reverse bit translation, and gcc makes a _horrible_ mess of it and > actually makes the code much worse because for some reason the pattern > doesn't trigger. [trimming cc list] Can you be more specific? That's not what I see with gcc 7.1. I've found two blocks where I replaced the if's with the ternary expressions, one in clone_mnt, one in do_mount. In clone_mnt, gcc-7.1 seems to do (first instruction is the unconditional |= MNT_LOCK_ATIME) 10d7: 0d 00 00 04 00 or $0x4,%eax 10dc: 89 43 30mov%eax,0x30(%rbx) 10df: a8 02 test $0x2,%al 10e1: 74 08 je 10eb 10e3: 0d 00 00 20 00 or $0x20,%eax 10e8: 89 43 30mov%eax,0x30(%rbx) 10eb: a8 01 test $0x1,%al 10ed: 74 08 je 10f7 10ef: 0d 00 00 10 00 or $0x10,%eax 10f4: 89 43 30mov%eax,0x30(%rbx) and after patching 10cc: 0d 00 00 04 00 or $0x4,%eax 10d1: 89 c2 mov%eax,%edx 10d3: c1 e2 10shl$0x10,%edx 10d6: 81 e2 00 00 40 00 and$0x40,%edx 10dc: 09 d0 or %edx,%eax 10de: 89 c2 mov%eax,%edx 10e0: c1 e2 14shl$0x14,%edx 10e3: 81 e2 00 00 20 00 and$0x20,%edx 10e9: 09 c2 or %eax,%edx (with a final store of %eax to 0x30(%rbx)). Either way it's four instructions per flag, but I assume the one without the branches is preferable. Similarly, in do_mount, before we have 3429: 44 89 f8mov%r15d,%eax 342c: 83 c8 01or $0x1,%eax 342f: 40 f6 c5 02 test $0x2,%bpl 3433: 44 0f 45 f8 cmovne %eax,%r15d 3437: 44 89 f8mov%r15d,%eax 343a: 83 c8 02or $0x2,%eax 343d: 40 f6 c5 04 test $0x4,%bpl 3441: 44 0f 45 f8 cmovne %eax,%r15d but after patching it does something like 3425: 4d 89 femov%r15,%r14 3428: 48 c1 ea 07 shr$0x7,%rdx 342c: 49 d1 eeshr%r14 342f: 89 d0 mov%edx,%eax 3431: c1 e1 05shl$0x5,%ecx 3434: 83 e0 08and$0x8,%eax 3437: 41 83 e6 07 and$0x7,%r14d 343b: 41 09 c6or %eax,%r14d 343e: 89 d0 mov%edx,%eax which actually makes use of MS_{NOSUID,NODEV,NOEXEC} all being 1 bit off from their MNT_ counterparts (witness the shr %r14 and the and with 0x7), so there we also cut 37 bytes according to bloat-o-meter. Now, it does seem that older (and not that old in absolute terms) compilers may generate worse code after the transformation - the 'replace with shift-mask-or' pattern doesn't exist until 7.1. E.g. with 5.4 we have $ scripts/bloat-o-meter fs/namespace.o.{0,1}-5.4 add/remove: 0/0 grow/shrink: 2/0 up/down: 50/0 (50) function old new delta do_mount31533195 +42 clone_mnt768 776 +8 5.4 compiles the "if() ..." construction roughly as 7.1, i.e. with cmovs, but the ternary expressions are transformed into this abomination 336f: 48 89 damov%rbx,%rdx 3372: 83 e2 04and$0x4,%edx 3375: 48 83 fa 01 cmp$0x1,%rdx 3379: 19 d2 sbb%edx,%edx 337b: f7 d2 not%edx 337d: 83 e2 02and$0x2,%edx 3380: 09 d0 or %edx,%eax 3382: 48 89 damov%rbx,%rdx 3385: 83 e2 08and$0x8,%edx 3388: 48 83 fa 01 cmp$0x1,%rdx 338c: 19 d2 sbb%edx,%edx 338e: f7 d2 not%edx 3390: 83 e2 04and$0x4,%edx 3393: 09 d0 or %edx,%eax Was it something like this you saw? Rasmus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On 14 November 2017 at 00:54, Linus Torvalds wrote: > On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds > wrote: >> >> So let's just rewrite that mnt_flags conversion that way, justr to get >> gcc to generate the obvious code. > > Oh wow. I tried to do the same thing in fs/namespace.c where it does > the reverse bit translation, and gcc makes a _horrible_ mess of it and > actually makes the code much worse because for some reason the pattern > doesn't trigger. [trimming cc list] Can you be more specific? That's not what I see with gcc 7.1. I've found two blocks where I replaced the if's with the ternary expressions, one in clone_mnt, one in do_mount. In clone_mnt, gcc-7.1 seems to do (first instruction is the unconditional |= MNT_LOCK_ATIME) 10d7: 0d 00 00 04 00 or $0x4,%eax 10dc: 89 43 30mov%eax,0x30(%rbx) 10df: a8 02 test $0x2,%al 10e1: 74 08 je 10eb 10e3: 0d 00 00 20 00 or $0x20,%eax 10e8: 89 43 30mov%eax,0x30(%rbx) 10eb: a8 01 test $0x1,%al 10ed: 74 08 je 10f7 10ef: 0d 00 00 10 00 or $0x10,%eax 10f4: 89 43 30mov%eax,0x30(%rbx) and after patching 10cc: 0d 00 00 04 00 or $0x4,%eax 10d1: 89 c2 mov%eax,%edx 10d3: c1 e2 10shl$0x10,%edx 10d6: 81 e2 00 00 40 00 and$0x40,%edx 10dc: 09 d0 or %edx,%eax 10de: 89 c2 mov%eax,%edx 10e0: c1 e2 14shl$0x14,%edx 10e3: 81 e2 00 00 20 00 and$0x20,%edx 10e9: 09 c2 or %eax,%edx (with a final store of %eax to 0x30(%rbx)). Either way it's four instructions per flag, but I assume the one without the branches is preferable. Similarly, in do_mount, before we have 3429: 44 89 f8mov%r15d,%eax 342c: 83 c8 01or $0x1,%eax 342f: 40 f6 c5 02 test $0x2,%bpl 3433: 44 0f 45 f8 cmovne %eax,%r15d 3437: 44 89 f8mov%r15d,%eax 343a: 83 c8 02or $0x2,%eax 343d: 40 f6 c5 04 test $0x4,%bpl 3441: 44 0f 45 f8 cmovne %eax,%r15d but after patching it does something like 3425: 4d 89 femov%r15,%r14 3428: 48 c1 ea 07 shr$0x7,%rdx 342c: 49 d1 eeshr%r14 342f: 89 d0 mov%edx,%eax 3431: c1 e1 05shl$0x5,%ecx 3434: 83 e0 08and$0x8,%eax 3437: 41 83 e6 07 and$0x7,%r14d 343b: 41 09 c6or %eax,%r14d 343e: 89 d0 mov%edx,%eax which actually makes use of MS_{NOSUID,NODEV,NOEXEC} all being 1 bit off from their MNT_ counterparts (witness the shr %r14 and the and with 0x7), so there we also cut 37 bytes according to bloat-o-meter. Now, it does seem that older (and not that old in absolute terms) compilers may generate worse code after the transformation - the 'replace with shift-mask-or' pattern doesn't exist until 7.1. E.g. with 5.4 we have $ scripts/bloat-o-meter fs/namespace.o.{0,1}-5.4 add/remove: 0/0 grow/shrink: 2/0 up/down: 50/0 (50) function old new delta do_mount31533195 +42 clone_mnt768 776 +8 5.4 compiles the "if() ..." construction roughly as 7.1, i.e. with cmovs, but the ternary expressions are transformed into this abomination 336f: 48 89 damov%rbx,%rdx 3372: 83 e2 04and$0x4,%edx 3375: 48 83 fa 01 cmp$0x1,%rdx 3379: 19 d2 sbb%edx,%edx 337b: f7 d2 not%edx 337d: 83 e2 02and$0x2,%edx 3380: 09 d0 or %edx,%eax 3382: 48 89 damov%rbx,%rdx 3385: 83 e2 08and$0x8,%edx 3388: 48 83 fa 01 cmp$0x1,%rdx 338c: 19 d2 sbb%edx,%edx 338e: f7 d2 not%edx 3390: 83 e2 04and$0x4,%edx 3393: 09 d0 or %edx,%eax Was it something like this you saw? Rasmus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvaldswrote: > > So let's just rewrite that mnt_flags conversion that way, justr to get > gcc to generate the obvious code. Oh wow. I tried to do the same thing in fs/namespace.c where it does the reverse bit translation, and gcc makes a _horrible_ mess of it and actually makes the code much worse because for some reason the pattern doesn't trigger. So this gcc optimization is apparently pretty damn fragile in general. It triggers for the trivial cases, but then other code around it can confuse it badly. So I don't think I'll touch this, it seems to not be really reliably something that makes gcc generate what should be the obvious code.. Linus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds wrote: > > So let's just rewrite that mnt_flags conversion that way, justr to get > gcc to generate the obvious code. Oh wow. I tried to do the same thing in fs/namespace.c where it does the reverse bit translation, and gcc makes a _horrible_ mess of it and actually makes the code much worse because for some reason the pattern doesn't trigger. So this gcc optimization is apparently pretty damn fragile in general. It triggers for the trivial cases, but then other code around it can confuse it badly. So I don't think I'll touch this, it seems to not be really reliably something that makes gcc generate what should be the obvious code.. Linus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Mon, Nov 13, 2017 at 2:59 PM, Rasmus Villemoeswrote: >> Sadly, gcc makes a mess of it and actually generates code that looks >> like the original C.[...] > > Actually, new enough gcc (7.1, I think) does contain a pattern that does > this, but unfortunately only if one spells it > > y |= (x & BIT) ? OTHER_BIT : 0; Ahh, I should have recognized that, I think that's what we ended up doing with the VM_READ -> PROT_READ translation in a few places, exactly because gcc would then recognize it and do the much better code generation. > which is half-way to doing it by hand, I suppose. Yeah, but it is at least acceptable, and the code is still legible C. The alternatives of doing it _entirely_ by hand tend to be much worse (ie you end up using a macro from hell that checks which of the two bits are bigger and shifting in the right direction by using multiplication or division). So let's just rewrite that mnt_flags conversion that way, justr to get gcc to generate the obvious code. It's a bit sad how gcc didn't pick up on the original code, especially since it had already done the much more complicated translation of doing the if-conversion. Thanks for pointing out the gcc pattern. Linus
Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Mon, Nov 13, 2017 at 2:59 PM, Rasmus Villemoes wrote: >> Sadly, gcc makes a mess of it and actually generates code that looks >> like the original C.[...] > > Actually, new enough gcc (7.1, I think) does contain a pattern that does > this, but unfortunately only if one spells it > > y |= (x & BIT) ? OTHER_BIT : 0; Ahh, I should have recognized that, I think that's what we ended up doing with the VM_READ -> PROT_READ translation in a few places, exactly because gcc would then recognize it and do the much better code generation. > which is half-way to doing it by hand, I suppose. Yeah, but it is at least acceptable, and the code is still legible C. The alternatives of doing it _entirely_ by hand tend to be much worse (ie you end up using a macro from hell that checks which of the two bits are bigger and shifting in the right direction by using multiplication or division). So let's just rewrite that mnt_flags conversion that way, justr to get gcc to generate the obvious code. It's a bit sad how gcc didn't pick up on the original code, especially since it had already done the much more complicated translation of doing the if-conversion. Thanks for pointing out the gcc pattern. Linus
bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Thu, Nov 09 2017, Linus Torvaldswrote: > The code disassembles to > >0: 83 c9 08 or $0x8,%ecx >3: 40 f6 c6 04 test $0x4,%sil >7: 0f 45 d1 cmovne %ecx,%edx >a: 89 d1mov%edx,%ecx >c: 80 cd 04 or $0x4,%ch >f: 40 f6 c6 08 test $0x8,%sil > 13: 0f 45 d1 cmovne %ecx,%edx > 16: 89 d1mov%edx,%ecx > 18: 80 cd 08 or $0x8,%ch > 1b: 40 f6 c6 10 test $0x10,%sil > 1f: 0f 45 d1 cmovne %ecx,%edx > 22: 89 d1mov%edx,%ecx > 24: 80 cd 10 or $0x10,%ch > 27: 83 e6 20 and$0x20,%esi > 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > 31: 0f 45 d1 cmovne %ecx,%edx > 34: 83 ca 20 or $0x20,%edx > 37: 89 f1mov%esi,%ecx > 39: 83 e1 10 and$0x10,%ecx > 3c: 89 cfmov%ecx,%edi > > and all those odd cmovne and bit-ops are just the bit selection code > in flags_by_mnt(), which is inlined through calculate_f_flags (which > is _also_ inlined) into vfs_statfs(). > > Sadly, gcc makes a mess of it and actually generates code that looks > like the original C. I would have hoped that gcc could have turned > >if (x & BIT) > y |= OTHER_BIT; > > into > > y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT; > > but that doesn't happen. Actually, new enough gcc (7.1, I think) does contain a pattern that does this, but unfortunately only if one spells it y |= (x & BIT) ? OTHER_BIT : 0; which is half-way to doing it by hand, I suppose. Doing the - if (mnt_flags & MNT_READONLY) - flags |= ST_RDONLY; + flags |= (mnt_flags & MNT_READONLY) ? ST_RDONLY : 0; and pasting into godbolt.org, one can apparently get gcc to compile it to flags_by_mnt(int): leal (%rdi,%rdi), %edx movl %edi, %eax sarl $6, %eax movl %edx, %ecx andl $1, %eax andl $12, %edx andl $2, %ecx orl %ecx, %eax orl %eax, %edx movl %edi, %eax sall $7, %eax andl $7168, %eax orl %edx, %eax ret Rasmus
bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]
On Thu, Nov 09 2017, Linus Torvalds wrote: > The code disassembles to > >0: 83 c9 08 or $0x8,%ecx >3: 40 f6 c6 04 test $0x4,%sil >7: 0f 45 d1 cmovne %ecx,%edx >a: 89 d1mov%edx,%ecx >c: 80 cd 04 or $0x4,%ch >f: 40 f6 c6 08 test $0x8,%sil > 13: 0f 45 d1 cmovne %ecx,%edx > 16: 89 d1mov%edx,%ecx > 18: 80 cd 08 or $0x8,%ch > 1b: 40 f6 c6 10 test $0x10,%sil > 1f: 0f 45 d1 cmovne %ecx,%edx > 22: 89 d1mov%edx,%ecx > 24: 80 cd 10 or $0x10,%ch > 27: 83 e6 20 and$0x20,%esi > 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > 31: 0f 45 d1 cmovne %ecx,%edx > 34: 83 ca 20 or $0x20,%edx > 37: 89 f1mov%esi,%ecx > 39: 83 e1 10 and$0x10,%ecx > 3c: 89 cfmov%ecx,%edi > > and all those odd cmovne and bit-ops are just the bit selection code > in flags_by_mnt(), which is inlined through calculate_f_flags (which > is _also_ inlined) into vfs_statfs(). > > Sadly, gcc makes a mess of it and actually generates code that looks > like the original C. I would have hoped that gcc could have turned > >if (x & BIT) > y |= OTHER_BIT; > > into > > y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT; > > but that doesn't happen. Actually, new enough gcc (7.1, I think) does contain a pattern that does this, but unfortunately only if one spells it y |= (x & BIT) ? OTHER_BIT : 0; which is half-way to doing it by hand, I suppose. Doing the - if (mnt_flags & MNT_READONLY) - flags |= ST_RDONLY; + flags |= (mnt_flags & MNT_READONLY) ? ST_RDONLY : 0; and pasting into godbolt.org, one can apparently get gcc to compile it to flags_by_mnt(int): leal (%rdi,%rdi), %edx movl %edi, %eax sarl $6, %eax movl %edx, %ecx andl $1, %eax andl $12, %edx andl $2, %ecx orl %ecx, %eax orl %eax, %edx movl %edi, %eax sall $7, %eax andl $7168, %eax orl %edx, %eax ret Rasmus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-11 09:31 AM, Linus Torvalds wrote: > Boris Lukashev points out that Patrick should probably check a newer > version of gcc. > > I looked around, and in one of the emails, Patrick said: > > "No changes, both the working and broken kernels were built with >distro-provided gcc 5.4.0 and binutils 2.28.1" > > and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but > it's a bug-fix release to a pretty old branch that is not exactly new. > > It would probably be good to check if the problems persist with gcc > 6.x or 7.x.. I have no idea which gcc version the randstruct people > tend to use themselves. I just tested it with gcc 7.2, and was able to reproduce the NULL pointer dereference, the backtrace looks slightly different this time. I will also test with binutils 2.29, though I doubt that will make any difference. > [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at > 0560 > [ 56.166563] IP: vfs_statfs+0x7c/0xc0 > [ 56.167249] PGD 0 P4D 0 > [ 56.167860] Oops: [#1] SMP > [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 > xt_multiport xt_addrtype iptable_mangle iptable> > [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O > 4.14.0-git-kratos-1 #1 > [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 > [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 > [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 > [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 > [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: > 0020 > [ 56.186085] RDX: 1801 RSI: 1801 RDI: > > [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: > 00ff > [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: > > [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: > c90002c1bbf0 > [ 56.190444] FS: () GS:88041fc0() > knlGS: > [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 > [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: > 001606f0 > [ 56.193898] Call Trace: > [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 > [ 56.195267] ? generic_permission+0x12c/0x1a0 > [ 56.196025] nfsd4_encode_getattr+0x25/0x30 > [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 > [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 > [ 56.198268] nfsd_dispatch+0xe8/0x220 > [ 56.198968] svc_process_common+0x475/0x640 > [ 56.199696] ? nfsd_destroy+0x60/0x60 > [ 56.200404] svc_process+0xf2/0x1a0 > [ 56.201079] nfsd+0xe3/0x150 > [ 56.201706] kthread+0x117/0x130 > [ 56.202354] ? kthread_create_on_node+0x40/0x40 > [ 56.203100] ret_from_fork+0x25/0x30 > [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 ce > 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> > [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 > [ 56.207110] CR2: 0560 > [ 56.207763] ---[ end trace d452986a80f64aaa ]--- > On Sat, Nov 11, 2017 at 8:13 AM, Kees Cookwrote: >> >> I'll take a closer look at this and see if I can provide something to >> narrow it down.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-11 09:31 AM, Linus Torvalds wrote: > Boris Lukashev points out that Patrick should probably check a newer > version of gcc. > > I looked around, and in one of the emails, Patrick said: > > "No changes, both the working and broken kernels were built with >distro-provided gcc 5.4.0 and binutils 2.28.1" > > and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but > it's a bug-fix release to a pretty old branch that is not exactly new. > > It would probably be good to check if the problems persist with gcc > 6.x or 7.x.. I have no idea which gcc version the randstruct people > tend to use themselves. I just tested it with gcc 7.2, and was able to reproduce the NULL pointer dereference, the backtrace looks slightly different this time. I will also test with binutils 2.29, though I doubt that will make any difference. > [ 56.165181] BUG: unable to handle kernel NULL pointer dereference at > 0560 > [ 56.166563] IP: vfs_statfs+0x7c/0xc0 > [ 56.167249] PGD 0 P4D 0 > [ 56.167860] Oops: [#1] SMP > [ 56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 > xt_multiport xt_addrtype iptable_mangle iptable> > [ 56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G O > 4.14.0-git-kratos-1 #1 > [ 56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 > [ 56.182729] task: 88040c412a00 task.stack: c90002c18000 > [ 56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0 > [ 56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202 > [ 56.185143] RAX: RBX: c90002c1bbf0 RCX: > 0020 > [ 56.186085] RDX: 1801 RSI: 1801 RDI: > > [ 56.187066] RBP: c90002c1bbc0 R08: ff00 R09: > 00ff > [ 56.188268] R10: 0038be3a R11: 880408b18258 R12: > > [ 56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: > c90002c1bbf0 > [ 56.190444] FS: () GS:88041fc0() > knlGS: > [ 56.191876] CS: 0010 DS: ES: CR0: 80050033 > [ 56.192843] CR2: 0560 CR3: 01e0a002 CR4: > 001606f0 > [ 56.193898] Call Trace: > [ 56.194510] nfsd4_encode_fattr+0x201/0x1f90 > [ 56.195267] ? generic_permission+0x12c/0x1a0 > [ 56.196025] nfsd4_encode_getattr+0x25/0x30 > [ 56.196753] nfsd4_encode_operation+0x98/0x1b0 > [ 56.197526] nfsd4_proc_compound+0x2a0/0x5e0 > [ 56.198268] nfsd_dispatch+0xe8/0x220 > [ 56.198968] svc_process_common+0x475/0x640 > [ 56.199696] ? nfsd_destroy+0x60/0x60 > [ 56.200404] svc_process+0xf2/0x1a0 > [ 56.201079] nfsd+0xe3/0x150 > [ 56.201706] kthread+0x117/0x130 > [ 56.202354] ? kthread_create_on_node+0x40/0x40 > [ 56.203100] ret_from_fork+0x25/0x30 > [ 56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 ce > 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce> > [ 56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28 > [ 56.207110] CR2: 0560 > [ 56.207763] ---[ end trace d452986a80f64aaa ]--- > On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: >> >> I'll take a closer look at this and see if I can provide something to >> narrow it down.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
Boris Lukashev points out that Patrick should probably check a newer version of gcc. I looked around, and in one of the emails, Patrick said: "No changes, both the working and broken kernels were built with distro-provided gcc 5.4.0 and binutils 2.28.1" and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but it's a bug-fix release to a pretty old branch that is not exactly new. It would probably be good to check if the problems persist with gcc 6.x or 7.x.. I have no idea which gcc version the randstruct people tend to use themselves. Linus On Sat, Nov 11, 2017 at 8:13 AM, Kees Cookwrote: > > I'll take a closer look at this and see if I can provide something to > narrow it down.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
Boris Lukashev points out that Patrick should probably check a newer version of gcc. I looked around, and in one of the emails, Patrick said: "No changes, both the working and broken kernels were built with distro-provided gcc 5.4.0 and binutils 2.28.1" and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but it's a bug-fix release to a pretty old branch that is not exactly new. It would probably be good to check if the problems persist with gcc 6.x or 7.x.. I have no idea which gcc version the randstruct people tend to use themselves. Linus On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook wrote: > > I'll take a closer look at this and see if I can provide something to > narrow it down.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 6:36 PM, Linus Torvaldswrote: > [ Bringing in the gcc plugin people and the kernel hardening list, > since it now is no longer even remotely looking like a nfsd, vfs or > filesystem issue any more ] > > Kees, Emese, > the whole thread is on lkml, but there's clearly something horribly > wrong with RANDSTRUCT, and it's not new even though it looked that way > for a while. It wouldn't be the first issue we've seen; it's (obviously) a pretty aggressive change to the resulting build. > Patrick seems to trigger it with nfsd, so it might be specific to that. > > Alternatively, it might just be that very few people run > RANDSTRUCT-built kernels, or just have been lucky with the seeding. Given its potential cache-line abuse, I'm not surprised that its usage is more limited than other features. > Sorry for top-posting, but there's not really anything in the email > itself to reply to, other than saying thanks to Patrick for narrowing > it down like this. Agreed; thanks Patrick! :) Given that the issue is non-deterministic, I wonder if the bug is related to some kind of missing RCU or barrier that goes unnoticed in normal struct layouts. > It would have been very interesting if it had actually bisected to > something, but it seems that the real issue is just the choice of > seeding for RANDSTRUCT. That's where we've seen bugs in the past: some pathological ordering of a struct uncovers a corner case. In the past it's been much more deterministic: doesn't build, or immediately crashes on boot, etc. I'll take a closer look at this and see if I can provide something to narrow it down. -Kees > > Linus > > On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean wrote: >> On 2017-11-10 03:26 PM, Patrick McLean wrote: >>> On 2017-11-10 10:42 AM, Linus Torvalds wrote: I really don't see anything that looks even half-way suspicious in that 4.13.8..11 range. But as mentioned, compiler interactions can be _really_ subtle. And hey, it can be a real kernel bug too, that just happens to be exposed by RANDSTRUCT, so a bisect really would be very nice. >>> >>> I am working on bisecting the issue now, but I think I have some more >>> evidence pointing to a compiler issue related to RANDSTRUCT. There are >>> actually 3 issues that we have seen. Sometimes we get the null pointer >>> deref in the initial message, sometimes we get the GPF, and sometimes we >>> see an issue where the NFS clients see all files as root-owned >>> directories. Any given kernel will always see the same issue, but after >>> a "make mrproper" and recompile (with the same .config), the issue will >>> often change. I suspect that all 3 of these problems are actually the >>> same issue manifesting itself in different ways depending on what seed >>> the RANDSTRUCT gcc plugin is using. >> >> Further update on this, using the same seed for RANDSTRUCT, I have >> reproduced this issue on v4.13.0, so it does not seem to be recently >> introduced. The older kernel apparently only worked for us because we >> were lucky. Generally we always compile new kernels from a fresh tree, >> so they are never using the same seed. >> >> In case someone wants to play with this, here are some interesting seeds >> (in include/generated/randomize_layout_hash.h): >> >> Produce a NULL pointer dereference (though I am not sure what the client >> does to produce this). >> 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc >> >> All files for nfsd4 clients appear as directories owned as root, no >> matter the real owner (this happens for all clients we have tested): >> 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e >> >> This is the seed that was breaking motherboards (make sure you have a >> way to flash the BIOS with this one): >> 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd >> >> Finally, here is a seed that produces a kernel that does not exhibit any >> problems we are aware of: >> e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b >> Because in the end, compiler bugs are very rare. They are particularly annoying when they do happen, though, so they loom big in the mind of people who have had to chase them down. -- Kees Cook Pixel Security
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 6:36 PM, Linus Torvalds wrote: > [ Bringing in the gcc plugin people and the kernel hardening list, > since it now is no longer even remotely looking like a nfsd, vfs or > filesystem issue any more ] > > Kees, Emese, > the whole thread is on lkml, but there's clearly something horribly > wrong with RANDSTRUCT, and it's not new even though it looked that way > for a while. It wouldn't be the first issue we've seen; it's (obviously) a pretty aggressive change to the resulting build. > Patrick seems to trigger it with nfsd, so it might be specific to that. > > Alternatively, it might just be that very few people run > RANDSTRUCT-built kernels, or just have been lucky with the seeding. Given its potential cache-line abuse, I'm not surprised that its usage is more limited than other features. > Sorry for top-posting, but there's not really anything in the email > itself to reply to, other than saying thanks to Patrick for narrowing > it down like this. Agreed; thanks Patrick! :) Given that the issue is non-deterministic, I wonder if the bug is related to some kind of missing RCU or barrier that goes unnoticed in normal struct layouts. > It would have been very interesting if it had actually bisected to > something, but it seems that the real issue is just the choice of > seeding for RANDSTRUCT. That's where we've seen bugs in the past: some pathological ordering of a struct uncovers a corner case. In the past it's been much more deterministic: doesn't build, or immediately crashes on boot, etc. I'll take a closer look at this and see if I can provide something to narrow it down. -Kees > > Linus > > On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean wrote: >> On 2017-11-10 03:26 PM, Patrick McLean wrote: >>> On 2017-11-10 10:42 AM, Linus Torvalds wrote: I really don't see anything that looks even half-way suspicious in that 4.13.8..11 range. But as mentioned, compiler interactions can be _really_ subtle. And hey, it can be a real kernel bug too, that just happens to be exposed by RANDSTRUCT, so a bisect really would be very nice. >>> >>> I am working on bisecting the issue now, but I think I have some more >>> evidence pointing to a compiler issue related to RANDSTRUCT. There are >>> actually 3 issues that we have seen. Sometimes we get the null pointer >>> deref in the initial message, sometimes we get the GPF, and sometimes we >>> see an issue where the NFS clients see all files as root-owned >>> directories. Any given kernel will always see the same issue, but after >>> a "make mrproper" and recompile (with the same .config), the issue will >>> often change. I suspect that all 3 of these problems are actually the >>> same issue manifesting itself in different ways depending on what seed >>> the RANDSTRUCT gcc plugin is using. >> >> Further update on this, using the same seed for RANDSTRUCT, I have >> reproduced this issue on v4.13.0, so it does not seem to be recently >> introduced. The older kernel apparently only worked for us because we >> were lucky. Generally we always compile new kernels from a fresh tree, >> so they are never using the same seed. >> >> In case someone wants to play with this, here are some interesting seeds >> (in include/generated/randomize_layout_hash.h): >> >> Produce a NULL pointer dereference (though I am not sure what the client >> does to produce this). >> 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc >> >> All files for nfsd4 clients appear as directories owned as root, no >> matter the real owner (this happens for all clients we have tested): >> 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e >> >> This is the seed that was breaking motherboards (make sure you have a >> way to flash the BIOS with this one): >> 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd >> >> Finally, here is a seed that produces a kernel that does not exhibit any >> problems we are aware of: >> e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b >> Because in the end, compiler bugs are very rare. They are particularly annoying when they do happen, though, so they loom big in the mind of people who have had to chase them down. -- Kees Cook Pixel Security
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Wed, 8 Nov 2017 16:43:17 -0800 Patrick McLeanwrote: > As of 4.13.11 (and also with 4.14-rc) we have an issue where when > serving nfs4 sometimes we get the following BUG. When this bug happens, > it usually also causes the motherboard to no longer POST until we > externally re-flash the BIOS (using the BMC web interface). If a > motherboard does not have an external way to flash the BIOS, this would > brick the hardware. If that is a production x86 system then you need to raise a large red flag with the vendor because it should not even be possible to splat the BIOS firmware on a modern PC by running even malicious OS code. Not only that but if it has a flaw, and you bisect down to create a reproducer then it's not going to take the bad guys very long to turn it into an interesting toy to run if they ever exploit a box with that board. Alan
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Wed, 8 Nov 2017 16:43:17 -0800 Patrick McLean wrote: > As of 4.13.11 (and also with 4.14-rc) we have an issue where when > serving nfs4 sometimes we get the following BUG. When this bug happens, > it usually also causes the motherboard to no longer POST until we > externally re-flash the BIOS (using the BMC web interface). If a > motherboard does not have an external way to flash the BIOS, this would > brick the hardware. If that is a production x86 system then you need to raise a large red flag with the vendor because it should not even be possible to splat the BIOS firmware on a modern PC by running even malicious OS code. Not only that but if it has a flaw, and you bisect down to create a reproducer then it's not going to take the bad guys very long to turn it into an interesting toy to run if they ever exploit a box with that board. Alan
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
[ Bringing in the gcc plugin people and the kernel hardening list, since it now is no longer even remotely looking like a nfsd, vfs or filesystem issue any more ] Kees, Emese, the whole thread is on lkml, but there's clearly something horribly wrong with RANDSTRUCT, and it's not new even though it looked that way for a while. Patrick seems to trigger it with nfsd, so it might be specific to that. Alternatively, it might just be that very few people run RANDSTRUCT-built kernels, or just have been lucky with the seeding. Sorry for top-posting, but there's not really anything in the email itself to reply to, other than saying thanks to Patrick for narrowing it down like this. It would have been very interesting if it had actually bisected to something, but it seems that the real issue is just the choice of seeding for RANDSTRUCT. Linus On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLeanwrote: > On 2017-11-10 03:26 PM, Patrick McLean wrote: >> On 2017-11-10 10:42 AM, Linus Torvalds wrote: >>> >>> I really don't see anything that looks even half-way suspicious in >>> that 4.13.8..11 range. But as mentioned, compiler interactions can be >>> _really_ subtle. >>> >>> And hey, it can be a real kernel bug too, that just happens to be >>> exposed by RANDSTRUCT, so a bisect really would be very nice. >> >> I am working on bisecting the issue now, but I think I have some more >> evidence pointing to a compiler issue related to RANDSTRUCT. There are >> actually 3 issues that we have seen. Sometimes we get the null pointer >> deref in the initial message, sometimes we get the GPF, and sometimes we >> see an issue where the NFS clients see all files as root-owned >> directories. Any given kernel will always see the same issue, but after >> a "make mrproper" and recompile (with the same .config), the issue will >> often change. I suspect that all 3 of these problems are actually the >> same issue manifesting itself in different ways depending on what seed >> the RANDSTRUCT gcc plugin is using. > > Further update on this, using the same seed for RANDSTRUCT, I have > reproduced this issue on v4.13.0, so it does not seem to be recently > introduced. The older kernel apparently only worked for us because we > were lucky. Generally we always compile new kernels from a fresh tree, > so they are never using the same seed. > > In case someone wants to play with this, here are some interesting seeds > (in include/generated/randomize_layout_hash.h): > > Produce a NULL pointer dereference (though I am not sure what the client > does to produce this). > 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc > > All files for nfsd4 clients appear as directories owned as root, no > matter the real owner (this happens for all clients we have tested): > 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e > > This is the seed that was breaking motherboards (make sure you have a > way to flash the BIOS with this one): > 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd > > Finally, here is a seed that produces a kernel that does not exhibit any > problems we are aware of: > e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b > >>> >>> Because in the end, compiler bugs are very rare. They are particularly >>> annoying when they do happen, though, so they loom big in the mind of >>> people who have had to chase them down. >>>
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
[ Bringing in the gcc plugin people and the kernel hardening list, since it now is no longer even remotely looking like a nfsd, vfs or filesystem issue any more ] Kees, Emese, the whole thread is on lkml, but there's clearly something horribly wrong with RANDSTRUCT, and it's not new even though it looked that way for a while. Patrick seems to trigger it with nfsd, so it might be specific to that. Alternatively, it might just be that very few people run RANDSTRUCT-built kernels, or just have been lucky with the seeding. Sorry for top-posting, but there's not really anything in the email itself to reply to, other than saying thanks to Patrick for narrowing it down like this. It would have been very interesting if it had actually bisected to something, but it seems that the real issue is just the choice of seeding for RANDSTRUCT. Linus On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean wrote: > On 2017-11-10 03:26 PM, Patrick McLean wrote: >> On 2017-11-10 10:42 AM, Linus Torvalds wrote: >>> >>> I really don't see anything that looks even half-way suspicious in >>> that 4.13.8..11 range. But as mentioned, compiler interactions can be >>> _really_ subtle. >>> >>> And hey, it can be a real kernel bug too, that just happens to be >>> exposed by RANDSTRUCT, so a bisect really would be very nice. >> >> I am working on bisecting the issue now, but I think I have some more >> evidence pointing to a compiler issue related to RANDSTRUCT. There are >> actually 3 issues that we have seen. Sometimes we get the null pointer >> deref in the initial message, sometimes we get the GPF, and sometimes we >> see an issue where the NFS clients see all files as root-owned >> directories. Any given kernel will always see the same issue, but after >> a "make mrproper" and recompile (with the same .config), the issue will >> often change. I suspect that all 3 of these problems are actually the >> same issue manifesting itself in different ways depending on what seed >> the RANDSTRUCT gcc plugin is using. > > Further update on this, using the same seed for RANDSTRUCT, I have > reproduced this issue on v4.13.0, so it does not seem to be recently > introduced. The older kernel apparently only worked for us because we > were lucky. Generally we always compile new kernels from a fresh tree, > so they are never using the same seed. > > In case someone wants to play with this, here are some interesting seeds > (in include/generated/randomize_layout_hash.h): > > Produce a NULL pointer dereference (though I am not sure what the client > does to produce this). > 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc > > All files for nfsd4 clients appear as directories owned as root, no > matter the real owner (this happens for all clients we have tested): > 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e > > This is the seed that was breaking motherboards (make sure you have a > way to flash the BIOS with this one): > 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd > > Finally, here is a seed that produces a kernel that does not exhibit any > problems we are aware of: > e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b > >>> >>> Because in the end, compiler bugs are very rare. They are particularly >>> annoying when they do happen, though, so they loom big in the mind of >>> people who have had to chase them down. >>>
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 08:13:06PM -0500, J. Bruce Fields wrote: > On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote: > > > > > > On 2017-11-10 10:42 AM, Linus Torvalds wrote: > > > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean> > > wrote: > > >> > > >> Something must have changed since 4.13.8 to trigger this though. > > > > > > Arnd pointed to some commits that might be relevant for the cp210x > > > module, but those are all already in 4.13.8, so if 4.13.8 really is > > > rock solid for you, I don't think that's it. > > > > > > I really don't see anything that looks even half-way suspicious in > > > that 4.13.8..11 range. But as mentioned, compiler interactions can be > > > _really_ subtle. > > > > > > And hey, it can be a real kernel bug too, that just happens to be > > > exposed by RANDSTRUCT, so a bisect really would be very nice. > > > > I am working on bisecting the issue now, but I think I have some more > > evidence pointing to a compiler issue related to RANDSTRUCT. There are > > actually 3 issues that we have seen. Sometimes we get the null pointer > > deref in the initial message, sometimes we get the GPF, and sometimes we > > see an issue where the NFS clients see all files as root-owned > > directories. > > That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in > the stat structure that nfsd passed to vfs_getattr(). > > No idea what sort of information is useful when tracking down this kind > of bug, but you could also run wireshark and take a look at the server's > GETATTR replies to see if there's some other corruption. FWIW, having looked at some of the __bugger_layout users... Compiler bugs aside, * use in struct {dentry,inode,mount,block_device} has to go - cache use patterns at hash lookups are _not_ something to play with like that. * struct file_lock and struct super_block - ditto, only it's not hash lookups that hurt here. struct vm_area_struct, while we are at it. * struct group_info - Cthulhu's pus-leaking warts, what's the point randomizing _that_? No, really - here's the damn thing in all its glory: struct group_info { atomic_tusage; int ngroups; kgid_t gid[0]; } __randomize_layout; I really hope that plugin does *not* try to move the ->gid[] anywhere... Which leaves us a choice between putting ->usage first or second. Sure, every bit helps, but... even for security theatre that looks a bit too pathetic. * struct vfsmount. Wow. All of log2(3!) bits. Congratulations. At least that's better than struct path. Oh, wait - they'd done struct path as well... What the hell had they been doing? Muscarine old-fashioned way? Looks like a mix of pointless and truly dangerous. And then there are compiler bugs and the charming effect on reproducibility...
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 08:13:06PM -0500, J. Bruce Fields wrote: > On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote: > > > > > > On 2017-11-10 10:42 AM, Linus Torvalds wrote: > > > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean > > > wrote: > > >> > > >> Something must have changed since 4.13.8 to trigger this though. > > > > > > Arnd pointed to some commits that might be relevant for the cp210x > > > module, but those are all already in 4.13.8, so if 4.13.8 really is > > > rock solid for you, I don't think that's it. > > > > > > I really don't see anything that looks even half-way suspicious in > > > that 4.13.8..11 range. But as mentioned, compiler interactions can be > > > _really_ subtle. > > > > > > And hey, it can be a real kernel bug too, that just happens to be > > > exposed by RANDSTRUCT, so a bisect really would be very nice. > > > > I am working on bisecting the issue now, but I think I have some more > > evidence pointing to a compiler issue related to RANDSTRUCT. There are > > actually 3 issues that we have seen. Sometimes we get the null pointer > > deref in the initial message, sometimes we get the GPF, and sometimes we > > see an issue where the NFS clients see all files as root-owned > > directories. > > That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in > the stat structure that nfsd passed to vfs_getattr(). > > No idea what sort of information is useful when tracking down this kind > of bug, but you could also run wireshark and take a look at the server's > GETATTR replies to see if there's some other corruption. FWIW, having looked at some of the __bugger_layout users... Compiler bugs aside, * use in struct {dentry,inode,mount,block_device} has to go - cache use patterns at hash lookups are _not_ something to play with like that. * struct file_lock and struct super_block - ditto, only it's not hash lookups that hurt here. struct vm_area_struct, while we are at it. * struct group_info - Cthulhu's pus-leaking warts, what's the point randomizing _that_? No, really - here's the damn thing in all its glory: struct group_info { atomic_tusage; int ngroups; kgid_t gid[0]; } __randomize_layout; I really hope that plugin does *not* try to move the ->gid[] anywhere... Which leaves us a choice between putting ->usage first or second. Sure, every bit helps, but... even for security theatre that looks a bit too pathetic. * struct vfsmount. Wow. All of log2(3!) bits. Congratulations. At least that's better than struct path. Oh, wait - they'd done struct path as well... What the hell had they been doing? Muscarine old-fashioned way? Looks like a mix of pointless and truly dangerous. And then there are compiler bugs and the charming effect on reproducibility...
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote: > > > On 2017-11-10 10:42 AM, Linus Torvalds wrote: > > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLeanwrote: > >> > >> Something must have changed since 4.13.8 to trigger this though. > > > > Arnd pointed to some commits that might be relevant for the cp210x > > module, but those are all already in 4.13.8, so if 4.13.8 really is > > rock solid for you, I don't think that's it. > > > > I really don't see anything that looks even half-way suspicious in > > that 4.13.8..11 range. But as mentioned, compiler interactions can be > > _really_ subtle. > > > > And hey, it can be a real kernel bug too, that just happens to be > > exposed by RANDSTRUCT, so a bisect really would be very nice. > > I am working on bisecting the issue now, but I think I have some more > evidence pointing to a compiler issue related to RANDSTRUCT. There are > actually 3 issues that we have seen. Sometimes we get the null pointer > deref in the initial message, sometimes we get the GPF, and sometimes we > see an issue where the NFS clients see all files as root-owned > directories. That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in the stat structure that nfsd passed to vfs_getattr(). No idea what sort of information is useful when tracking down this kind of bug, but you could also run wireshark and take a look at the server's GETATTR replies to see if there's some other corruption. --b. > Any given kernel will always see the same issue, but after > a "make mrproper" and recompile (with the same .config), the issue will > often change. I suspect that all 3 of these problems are actually the > same issue manifesting itself in different ways depending on what seed > the RANDSTRUCT gcc plugin is using. > > > > > Because in the end, compiler bugs are very rare. They are particularly > > annoying when they do happen, though, so they loom big in the mind of > > people who have had to chase them down. > >
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote: > > > On 2017-11-10 10:42 AM, Linus Torvalds wrote: > > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean wrote: > >> > >> Something must have changed since 4.13.8 to trigger this though. > > > > Arnd pointed to some commits that might be relevant for the cp210x > > module, but those are all already in 4.13.8, so if 4.13.8 really is > > rock solid for you, I don't think that's it. > > > > I really don't see anything that looks even half-way suspicious in > > that 4.13.8..11 range. But as mentioned, compiler interactions can be > > _really_ subtle. > > > > And hey, it can be a real kernel bug too, that just happens to be > > exposed by RANDSTRUCT, so a bisect really would be very nice. > > I am working on bisecting the issue now, but I think I have some more > evidence pointing to a compiler issue related to RANDSTRUCT. There are > actually 3 issues that we have seen. Sometimes we get the null pointer > deref in the initial message, sometimes we get the GPF, and sometimes we > see an issue where the NFS clients see all files as root-owned > directories. That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in the stat structure that nfsd passed to vfs_getattr(). No idea what sort of information is useful when tracking down this kind of bug, but you could also run wireshark and take a look at the server's GETATTR replies to see if there's some other corruption. --b. > Any given kernel will always see the same issue, but after > a "make mrproper" and recompile (with the same .config), the issue will > often change. I suspect that all 3 of these problems are actually the > same issue manifesting itself in different ways depending on what seed > the RANDSTRUCT gcc plugin is using. > > > > > Because in the end, compiler bugs are very rare. They are particularly > > annoying when they do happen, though, so they loom big in the mind of > > people who have had to chase them down. > >
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-10 03:26 PM, Patrick McLean wrote: > > > On 2017-11-10 10:42 AM, Linus Torvalds wrote: >> On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLeanwrote: >>> >>> Something must have changed since 4.13.8 to trigger this though. >> >> Arnd pointed to some commits that might be relevant for the cp210x >> module, but those are all already in 4.13.8, so if 4.13.8 really is >> rock solid for you, I don't think that's it. >> >> I really don't see anything that looks even half-way suspicious in >> that 4.13.8..11 range. But as mentioned, compiler interactions can be >> _really_ subtle. >> >> And hey, it can be a real kernel bug too, that just happens to be >> exposed by RANDSTRUCT, so a bisect really would be very nice. > > I am working on bisecting the issue now, but I think I have some more > evidence pointing to a compiler issue related to RANDSTRUCT. There are > actually 3 issues that we have seen. Sometimes we get the null pointer > deref in the initial message, sometimes we get the GPF, and sometimes we > see an issue where the NFS clients see all files as root-owned > directories. Any given kernel will always see the same issue, but after > a "make mrproper" and recompile (with the same .config), the issue will > often change. I suspect that all 3 of these problems are actually the > same issue manifesting itself in different ways depending on what seed > the RANDSTRUCT gcc plugin is using. > Further update on this, using the same seed for RANDSTRUCT, I have reproduced this issue on v4.13.0, so it does not seem to be recently introduced. The older kernel apparently only worked for us because we were lucky. Generally we always compile new kernels from a fresh tree, so they are never using the same seed. In case someone wants to play with this, here are some interesting seeds (in include/generated/randomize_layout_hash.h): Produce a NULL pointer dereference (though I am not sure what the client does to produce this). 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc All files for nfsd4 clients appear as directories owned as root, no matter the real owner (this happens for all clients we have tested): 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e This is the seed that was breaking motherboards (make sure you have a way to flash the BIOS with this one): 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd Finally, here is a seed that produces a kernel that does not exhibit any problems we are aware of: e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b >> >> Because in the end, compiler bugs are very rare. They are particularly >> annoying when they do happen, though, so they loom big in the mind of >> people who have had to chase them down. >>
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-10 03:26 PM, Patrick McLean wrote: > > > On 2017-11-10 10:42 AM, Linus Torvalds wrote: >> On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean wrote: >>> >>> Something must have changed since 4.13.8 to trigger this though. >> >> Arnd pointed to some commits that might be relevant for the cp210x >> module, but those are all already in 4.13.8, so if 4.13.8 really is >> rock solid for you, I don't think that's it. >> >> I really don't see anything that looks even half-way suspicious in >> that 4.13.8..11 range. But as mentioned, compiler interactions can be >> _really_ subtle. >> >> And hey, it can be a real kernel bug too, that just happens to be >> exposed by RANDSTRUCT, so a bisect really would be very nice. > > I am working on bisecting the issue now, but I think I have some more > evidence pointing to a compiler issue related to RANDSTRUCT. There are > actually 3 issues that we have seen. Sometimes we get the null pointer > deref in the initial message, sometimes we get the GPF, and sometimes we > see an issue where the NFS clients see all files as root-owned > directories. Any given kernel will always see the same issue, but after > a "make mrproper" and recompile (with the same .config), the issue will > often change. I suspect that all 3 of these problems are actually the > same issue manifesting itself in different ways depending on what seed > the RANDSTRUCT gcc plugin is using. > Further update on this, using the same seed for RANDSTRUCT, I have reproduced this issue on v4.13.0, so it does not seem to be recently introduced. The older kernel apparently only worked for us because we were lucky. Generally we always compile new kernels from a fresh tree, so they are never using the same seed. In case someone wants to play with this, here are some interesting seeds (in include/generated/randomize_layout_hash.h): Produce a NULL pointer dereference (though I am not sure what the client does to produce this). 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc All files for nfsd4 clients appear as directories owned as root, no matter the real owner (this happens for all clients we have tested): 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e This is the seed that was breaking motherboards (make sure you have a way to flash the BIOS with this one): 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd Finally, here is a seed that produces a kernel that does not exhibit any problems we are aware of: e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b >> >> Because in the end, compiler bugs are very rare. They are particularly >> annoying when they do happen, though, so they loom big in the mind of >> people who have had to chase them down. >>
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-10 10:42 AM, Linus Torvalds wrote: > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLeanwrote: >> >> Something must have changed since 4.13.8 to trigger this though. > > Arnd pointed to some commits that might be relevant for the cp210x > module, but those are all already in 4.13.8, so if 4.13.8 really is > rock solid for you, I don't think that's it. > > I really don't see anything that looks even half-way suspicious in > that 4.13.8..11 range. But as mentioned, compiler interactions can be > _really_ subtle. > > And hey, it can be a real kernel bug too, that just happens to be > exposed by RANDSTRUCT, so a bisect really would be very nice. I am working on bisecting the issue now, but I think I have some more evidence pointing to a compiler issue related to RANDSTRUCT. There are actually 3 issues that we have seen. Sometimes we get the null pointer deref in the initial message, sometimes we get the GPF, and sometimes we see an issue where the NFS clients see all files as root-owned directories. Any given kernel will always see the same issue, but after a "make mrproper" and recompile (with the same .config), the issue will often change. I suspect that all 3 of these problems are actually the same issue manifesting itself in different ways depending on what seed the RANDSTRUCT gcc plugin is using. > > Because in the end, compiler bugs are very rare. They are particularly > annoying when they do happen, though, so they loom big in the mind of > people who have had to chase them down. >
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-10 10:42 AM, Linus Torvalds wrote: > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean wrote: >> >> Something must have changed since 4.13.8 to trigger this though. > > Arnd pointed to some commits that might be relevant for the cp210x > module, but those are all already in 4.13.8, so if 4.13.8 really is > rock solid for you, I don't think that's it. > > I really don't see anything that looks even half-way suspicious in > that 4.13.8..11 range. But as mentioned, compiler interactions can be > _really_ subtle. > > And hey, it can be a real kernel bug too, that just happens to be > exposed by RANDSTRUCT, so a bisect really would be very nice. I am working on bisecting the issue now, but I think I have some more evidence pointing to a compiler issue related to RANDSTRUCT. There are actually 3 issues that we have seen. Sometimes we get the null pointer deref in the initial message, sometimes we get the GPF, and sometimes we see an issue where the NFS clients see all files as root-owned directories. Any given kernel will always see the same issue, but after a "make mrproper" and recompile (with the same .config), the issue will often change. I suspect that all 3 of these problems are actually the same issue manifesting itself in different ways depending on what seed the RANDSTRUCT gcc plugin is using. > > Because in the end, compiler bugs are very rare. They are particularly > annoying when they do happen, though, so they loom big in the mind of > people who have had to chase them down. >
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLeanwrote: > > Something must have changed since 4.13.8 to trigger this though. Well, yes and no. Obviously something changed, but it doesn't necessarily have to be anything particular. Almost every time we've seen compiler bugs, it's been an innocuous change that just happened to trigger a latent issue. Pretty much by definition compiler bugs tend to be about rare situations, so it's some odd special case that triggers. Since it's apparently fairly repeatable for you, a bisection between 4.13.8 and 4.13.11 would be very interesting, and shouldn't take all that long. There's only 142 commits in that range, so even just a partial bisection of say four of five rounds should narrow it down to just a couple of commits. And even a full bisection should only take something like 8 build/test cycles. Arnd pointed to some commits that might be relevant for the cp210x module, but those are all already in 4.13.8, so if 4.13.8 really is rock solid for you, I don't think that's it. I really don't see anything that looks even half-way suspicious in that 4.13.8..11 range. But as mentioned, compiler interactions can be _really_ subtle. And hey, it can be a real kernel bug too, that just happens to be exposed by RANDSTRUCT, so a bisect really would be very nice. Because in the end, compiler bugs are very rare. They are particularly annoying when they do happen, though, so they loom big in the mind of people who have had to chase them down. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean wrote: > > Something must have changed since 4.13.8 to trigger this though. Well, yes and no. Obviously something changed, but it doesn't necessarily have to be anything particular. Almost every time we've seen compiler bugs, it's been an innocuous change that just happened to trigger a latent issue. Pretty much by definition compiler bugs tend to be about rare situations, so it's some odd special case that triggers. Since it's apparently fairly repeatable for you, a bisection between 4.13.8 and 4.13.11 would be very interesting, and shouldn't take all that long. There's only 142 commits in that range, so even just a partial bisection of say four of five rounds should narrow it down to just a couple of commits. And even a full bisection should only take something like 8 build/test cycles. Arnd pointed to some commits that might be relevant for the cp210x module, but those are all already in 4.13.8, so if 4.13.8 really is rock solid for you, I don't think that's it. I really don't see anything that looks even half-way suspicious in that 4.13.8..11 range. But as mentioned, compiler interactions can be _really_ subtle. And hey, it can be a real kernel bug too, that just happens to be exposed by RANDSTRUCT, so a bisect really would be very nice. Because in the end, compiler bugs are very rare. They are particularly annoying when they do happen, though, so they loom big in the mind of people who have had to chase them down. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 2:58 AM, Patrick McLeanwrote: > On 2017-11-09 12:04 PM, Linus Torvalds wrote: >> On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean wrote: > > We will check our fork against the in-kernel cp201x driver to make sure > we didn't miss anything, but it seems odd we would be hitting the issue > so consistently in the NFS code path, rather than somewhere in USB, > serial, or GPIO paths. > >> So since you seem to be able to reproduce this _reasonably_ easily, >> it's definitely worth checking that it still reproduces even without >> the gcc plugins. > > I haven't been able to reproduce it with RANDSTRUCT disabled (and > structleak enabled). I will keep trying for a little while more, but > evidence seems to be pointing to that. > > Something must have changed since 4.13.8 to trigger this though. This > did not crop up at all until we tried 4.13.11, where it we saw it pretty > quickly. We have a pretty large number of machines running 4.13.6 with > RANDSTRUCT enabled and running a the same workload with many more > clients, and have not seen this bug at all. I couldn't find anything overly suspicious between 4.13.8 and 4.13.11, see the full list of commits since 3.14.6 at https://pastebin.com/AcxBZR7H The ones I couldn't immediately rule out (but no smoking gun either) would be: 9970679f497a x86/cpu/AMD: Apply the Erratum 688 fix when the BIOS doesn't ca6711747c5a assoc_array: Fix a buggy node-splitting case 2fbb8bf749b5 xfs: move two more RT specific functions into CONFIG_XFS_RT 1e1427356d8d xfs: trim writepage mapping to within eof 9df9b634f637 xfs: cancel dirty pages on invalidation cd3f0bee1b94 xfs: handle error if xfs_btree_get_bufs fails 58cfca25f540 xfs: reinit btree pointer on attr tree inactivation walk 659a9989b68b xfs: don't change inode mode if ACL update fails 88ccd3b6884a xfs: move more RT specific code under CONFIG_XFS_RT 5733ebee586c xfs: Don't log uninitialised fields in inode structures 199a7448c097 xfs: handle racy AIO in xfs_reflink_end_cow ee5d69c908a1 xfs: always swap the cow forks when swapping extents 2888145444f1 xfs: Capture state of the right inode in xfs_iflush_done d0fa252b207f xfs: perag initialization should only touch m_ag_max_usable for AG 0 8da6f7fbe43c xfs: update i_size after unwritten conversion in dio completion a9eac76e958b xfs: report zeroed or not correctly in xfs_zero_range() 67d51bdcc9f4 fs/xfs: Use %pS printk format for direct addresses 2bf3122f2130 xfs: evict CoW fork extents when performing finsert/fcollapse a58a0826656d xfs: don't unconditionally clear the reflink flag on zero-block files c61e905e0ee2 iomap_dio_rw: Allocate AIO completion queue before submitting dio 7610595830bb pkcs7: Prevent NULL pointer dereference, since sinfo is not always set. 24a33a0c96f3 KEYS: don't let add_key() update an uninstantiated key ad4aa448c9b2 FS-Cache: fix dereference of NULL user_key_payload f45b8fe12221 KEYS: Fix race between updating and finding a negative key e56be12012c2 ecryptfs: fix dereference of NULL user_key_payload 363ce0b01fe0 fscrypt: fix dereference of NULL user_key_payload cc757d55c903 lib/digsig: fix dereference of NULL user_key_payload f5e97214207f x86/microcode/intel: Disable late loading on model 79 7b5e405b7878 Revert "tools/power turbostat: stop migrating, unless '-m'" 8b1e10789c84 KEYS: encrypted: fix dereference of NULL user_key_payload a258a35a9930 mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lock e47a56cbf519 usb: xhci: Handle error condition in xhci_stop_device() d53911e63388 usb: xhci: Reset halted endpoint if trb is noop d1120fe38b3f xhci: Cleanup current_cmd in xhci_cleanup_command_queue() 301d332138d2 xhci: Identify USB 3.1 capable hosts by their port protocol capability 015e94ead900 usb: hub: Allow reset retry for USB2 devices on connect bounce 1916547b28bd usb: quirks: add quirk for WORLDE MINI MIDI keyboard e3a038930502 usb: cdc_acm: Add quirk for Elatec TWN3 c2110c8dea7a USB: serial: metro-usb: add MS7820 device id 775462fd5c53 USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor() a9fdf6354267 USB: devio: Revert "USB: devio: Don't corrupt user memory" However, you mentioned cp210x, and I noticed related changes in 4.13.8: e21045a22395 USB: serial: console: fix use-after-free after failed setup 6c7cb458405e USB: serial: console: fix use-after-free on disconnect 4b3e3c7282d6 USB: serial: qcserial: add Dell DW5818, DW5819 c796da1d110f USB: serial: option: add support for TP-Link LTE module e7e0b4b39663 USB: serial: cp210x: add support for ELV TFD500 1ae2c690f967 USB: serial: cp210x: fix partnum regression 78a02c93648e USB: serial: ftdi_sio: add id for Cypress WICED dev board You could try reverting those seven, this could point to your forked driver if it makes a difference. Arnd
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Fri, Nov 10, 2017 at 2:58 AM, Patrick McLean wrote: > On 2017-11-09 12:04 PM, Linus Torvalds wrote: >> On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean wrote: > > We will check our fork against the in-kernel cp201x driver to make sure > we didn't miss anything, but it seems odd we would be hitting the issue > so consistently in the NFS code path, rather than somewhere in USB, > serial, or GPIO paths. > >> So since you seem to be able to reproduce this _reasonably_ easily, >> it's definitely worth checking that it still reproduces even without >> the gcc plugins. > > I haven't been able to reproduce it with RANDSTRUCT disabled (and > structleak enabled). I will keep trying for a little while more, but > evidence seems to be pointing to that. > > Something must have changed since 4.13.8 to trigger this though. This > did not crop up at all until we tried 4.13.11, where it we saw it pretty > quickly. We have a pretty large number of machines running 4.13.6 with > RANDSTRUCT enabled and running a the same workload with many more > clients, and have not seen this bug at all. I couldn't find anything overly suspicious between 4.13.8 and 4.13.11, see the full list of commits since 3.14.6 at https://pastebin.com/AcxBZR7H The ones I couldn't immediately rule out (but no smoking gun either) would be: 9970679f497a x86/cpu/AMD: Apply the Erratum 688 fix when the BIOS doesn't ca6711747c5a assoc_array: Fix a buggy node-splitting case 2fbb8bf749b5 xfs: move two more RT specific functions into CONFIG_XFS_RT 1e1427356d8d xfs: trim writepage mapping to within eof 9df9b634f637 xfs: cancel dirty pages on invalidation cd3f0bee1b94 xfs: handle error if xfs_btree_get_bufs fails 58cfca25f540 xfs: reinit btree pointer on attr tree inactivation walk 659a9989b68b xfs: don't change inode mode if ACL update fails 88ccd3b6884a xfs: move more RT specific code under CONFIG_XFS_RT 5733ebee586c xfs: Don't log uninitialised fields in inode structures 199a7448c097 xfs: handle racy AIO in xfs_reflink_end_cow ee5d69c908a1 xfs: always swap the cow forks when swapping extents 2888145444f1 xfs: Capture state of the right inode in xfs_iflush_done d0fa252b207f xfs: perag initialization should only touch m_ag_max_usable for AG 0 8da6f7fbe43c xfs: update i_size after unwritten conversion in dio completion a9eac76e958b xfs: report zeroed or not correctly in xfs_zero_range() 67d51bdcc9f4 fs/xfs: Use %pS printk format for direct addresses 2bf3122f2130 xfs: evict CoW fork extents when performing finsert/fcollapse a58a0826656d xfs: don't unconditionally clear the reflink flag on zero-block files c61e905e0ee2 iomap_dio_rw: Allocate AIO completion queue before submitting dio 7610595830bb pkcs7: Prevent NULL pointer dereference, since sinfo is not always set. 24a33a0c96f3 KEYS: don't let add_key() update an uninstantiated key ad4aa448c9b2 FS-Cache: fix dereference of NULL user_key_payload f45b8fe12221 KEYS: Fix race between updating and finding a negative key e56be12012c2 ecryptfs: fix dereference of NULL user_key_payload 363ce0b01fe0 fscrypt: fix dereference of NULL user_key_payload cc757d55c903 lib/digsig: fix dereference of NULL user_key_payload f5e97214207f x86/microcode/intel: Disable late loading on model 79 7b5e405b7878 Revert "tools/power turbostat: stop migrating, unless '-m'" 8b1e10789c84 KEYS: encrypted: fix dereference of NULL user_key_payload a258a35a9930 mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE outside of lock e47a56cbf519 usb: xhci: Handle error condition in xhci_stop_device() d53911e63388 usb: xhci: Reset halted endpoint if trb is noop d1120fe38b3f xhci: Cleanup current_cmd in xhci_cleanup_command_queue() 301d332138d2 xhci: Identify USB 3.1 capable hosts by their port protocol capability 015e94ead900 usb: hub: Allow reset retry for USB2 devices on connect bounce 1916547b28bd usb: quirks: add quirk for WORLDE MINI MIDI keyboard e3a038930502 usb: cdc_acm: Add quirk for Elatec TWN3 c2110c8dea7a USB: serial: metro-usb: add MS7820 device id 775462fd5c53 USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor() a9fdf6354267 USB: devio: Revert "USB: devio: Don't corrupt user memory" However, you mentioned cp210x, and I noticed related changes in 4.13.8: e21045a22395 USB: serial: console: fix use-after-free after failed setup 6c7cb458405e USB: serial: console: fix use-after-free on disconnect 4b3e3c7282d6 USB: serial: qcserial: add Dell DW5818, DW5819 c796da1d110f USB: serial: option: add support for TP-Link LTE module e7e0b4b39663 USB: serial: cp210x: add support for ELV TFD500 1ae2c690f967 USB: serial: cp210x: fix partnum regression 78a02c93648e USB: serial: ftdi_sio: add id for Cypress WICED dev board You could try reverting those seven, this could point to your forked driver if it makes a difference. Arnd
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 12:04 PM, Linus Torvalds wrote: > On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLeanwrote: >> >> We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and >> CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as >> CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before. > > It might be worth just verifying without RANDSTRUCT in particular. > > And most obviously: if there is some module or part of the kernel that > got compiled with a different seed for the randstruct hashing, that > will break in nasty nasty ways. Your out-of-kernel module is the > obvious suspect for something like that, but honestly, it could be > some missing build dependency, or simply a missing special case in the > plugin itself a missing __no_randomize_layout or any number of things. > We will check our fork against the in-kernel cp201x driver to make sure we didn't miss anything, but it seems odd we would be hitting the issue so consistently in the NFS code path, rather than somewhere in USB, serial, or GPIO paths. > So since you seem to be able to reproduce this _reasonably_ easily, > it's definitely worth checking that it still reproduces even without > the gcc plugins. I haven't been able to reproduce it with RANDSTRUCT disabled (and structleak enabled). I will keep trying for a little while more, but evidence seems to be pointing to that. Something must have changed since 4.13.8 to trigger this though. This did not crop up at all until we tried 4.13.11, where it we saw it pretty quickly. We have a pretty large number of machines running 4.13.6 with RANDSTRUCT enabled and running a the same workload with many more clients, and have not seen this bug at all.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 12:04 PM, Linus Torvalds wrote: > On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean wrote: >> >> We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and >> CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as >> CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before. > > It might be worth just verifying without RANDSTRUCT in particular. > > And most obviously: if there is some module or part of the kernel that > got compiled with a different seed for the randstruct hashing, that > will break in nasty nasty ways. Your out-of-kernel module is the > obvious suspect for something like that, but honestly, it could be > some missing build dependency, or simply a missing special case in the > plugin itself a missing __no_randomize_layout or any number of things. > We will check our fork against the in-kernel cp201x driver to make sure we didn't miss anything, but it seems odd we would be hitting the issue so consistently in the NFS code path, rather than somewhere in USB, serial, or GPIO paths. > So since you seem to be able to reproduce this _reasonably_ easily, > it's definitely worth checking that it still reproduces even without > the gcc plugins. I haven't been able to reproduce it with RANDSTRUCT disabled (and structleak enabled). I will keep trying for a little while more, but evidence seems to be pointing to that. Something must have changed since 4.13.8 to trigger this though. This did not crop up at all until we tried 4.13.11, where it we saw it pretty quickly. We have a pretty large number of machines running 4.13.6 with RANDSTRUCT enabled and running a the same workload with many more clients, and have not seen this bug at all.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 11:51 AM, Patrick McLean wrote: > On 2017-11-09 11:37 AM, Al Viro wrote: >> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: >> Here is the BUG we are getting: > [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at > 0230 > [ 58.963918] IP: vfs_statfs+0x73/0xb0 >>> >>> The code disassembles to >> >>> 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction >> >>> that matters (and that traps) but I'm almost certain that it's the >>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() >>> when it then does >>> >>> flags_by_sb(mnt->mnt_sb->s_flags); >>> >>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is >>> NULL, because we wouldn't have gotten this far if it was. >>> >> >> All instances of struct dentry are created by __d_alloc()[*], which assigns >> ->d_sb (never to be modified afterwards) *and* dereferences the pointer >> it has stored in ->d_sb before the created struct dentry becomes visible >> to anyone else. No struct dentry should ever be observed with NULL ->d_sb; >> the only way to get that is memory corruption or looking at freed instance >> after its memory has been reused for something else and zeroed. >> >> In other words, we should never observe a struct mount with NULL >> ->mnt.mnt_sb - >> not without memory corruption or looking at freed instance. >> >> The pointer in that case should've come from exp->ex_path.mnt, exp being >> the argument of nfsd4_encode_fattr(). Sure, it might have been a dangling >> reference. However, it looks a lot more like a memory corruptor *OR* >> miscompiled kernel. >> >> What kind of load do the reproducer boxen have and how fast does that >> bug trigger? Would it be possible to slap something like >> if (unlikely(!exp->exp_path.mnt->mnt_sb)) { >> struct mount *m = real_mount(exp->exp_path.mnt); >> printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt); >> printk(KERN_ERR "name: [%s]\n", m->mnt_devname); >> printk(KERN_ERR "ns: [%p]\n", m->mnt_ns); >> printk(KERN_ERR "parent: [%p]\n", m->mnt_parent); >> WARN_ON(1); >> err = -EINVAL; >> goto out_nfserr; >> } >> in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added >> in fs/nfsd/nfs4xdr.c) and see what will it catch? >> >> Both with and without randomized structs, if possible - I might be barking >> at the wrong tree, but IMO the very first step in localizing that crap is >> to find out whether it's toolchain-related or not. > That condition did not seem to trigger, and I am getting a slightly different crash message (GPF rather than null pointer dereference). Here is the dump from the latest crash (with CONFIG_GCC_PLUGIN_STRUCTLEAK, CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and CONFIG_GCC_PLUGIN_RANDSTRUCT all enabled). > [ 36.834232] general protection fault: [#1] SMP > [ 36.835168] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 > xt_multiport xt_addrtype iptable_mangle iptable_raw iptable_nat nf_nat_ipv4 > nf_nat gkuart(O) usbserial x86_pkg_temp_thermal ie31200_edac tpm_tis > ipmi_ssif tpm_tis_core ext4 mbcache jbd2 e1000e crc32c_intel > [ 36.839120] CPU: 1 PID: 3969 Comm: nfsd Tainted: G O > 4.14.0-rc8-git-kratos-1-00053-gd93d4ce103fd-dirty #1 > [ 36.840883] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 > [ 36.841892] task: 88040a0b1c80 task.stack: c900027bc000 > [ 36.842887] RIP: 0010:vfs_statfs+0x73/0xb0 > [ 36.843728] RSP: 0018:c900027bfb30 EFLAGS: 00010202 > [ 36.844687] RAX: RBX: c900027bfbf8 RCX: > 180d > [ 36.845891] RDX: 080d RSI: 0020 RDI: > e2006d6574737973 > [ 36.847075] RBP: c900027bfbc8 R08: R09: > 00ff > [ 36.848175] R10: 0038be3a R11: 88040b687578 R12: > > [ 36.849260] R13: 88040d7dc400 R14: 88040d38b000 R15: > c900027bfbf8 > [ 36.850347] FS: () GS:88041fc4() > knlGS: > [ 36.851891] CS: 0010 DS: ES: CR0: 80050033 > [ 36.852873] CR2: 7f049228edc0 CR3: 01e0a004 CR4: > 001606e0 > [ 36.853942] Call Trace: > [ 36.854667] nfsd4_encode_fattr+0x34e/0x23b0 > [ 36.855578] ? ext4_get_acl+0x1b2/0x260 [ext4] > [ 36.856485] ? get_acl+0x7a/0xf0 > [ 36.857266] ? generic_permission+0x125/0x1a0 > [ 36.858150] nfsd4_encode_getattr+0x25/0x30 > [ 36.859002] nfsd4_encode_operation+0x98/0x1a0 > [ 36.859889] nfsd4_proc_compound+0x3eb/0x5c0 > [ 36.860736] nfsd_dispatch+0xa8/0x230 > [ 36.861538] svc_process_common+0x347/0x640 > [ 36.862383] svc_process+0x100/0x1b0 > [ 36.863204] nfsd+0xe0/0x150 > [ 36.863984] kthread+0xfc/0x130 > [ 36.864781] ? nfsd_destroy+0x50/0x50 > [ 36.865624] ?
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 11:51 AM, Patrick McLean wrote: > On 2017-11-09 11:37 AM, Al Viro wrote: >> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: >> Here is the BUG we are getting: > [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at > 0230 > [ 58.963918] IP: vfs_statfs+0x73/0xb0 >>> >>> The code disassembles to >> >>> 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction >> >>> that matters (and that traps) but I'm almost certain that it's the >>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() >>> when it then does >>> >>> flags_by_sb(mnt->mnt_sb->s_flags); >>> >>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is >>> NULL, because we wouldn't have gotten this far if it was. >>> >> >> All instances of struct dentry are created by __d_alloc()[*], which assigns >> ->d_sb (never to be modified afterwards) *and* dereferences the pointer >> it has stored in ->d_sb before the created struct dentry becomes visible >> to anyone else. No struct dentry should ever be observed with NULL ->d_sb; >> the only way to get that is memory corruption or looking at freed instance >> after its memory has been reused for something else and zeroed. >> >> In other words, we should never observe a struct mount with NULL >> ->mnt.mnt_sb - >> not without memory corruption or looking at freed instance. >> >> The pointer in that case should've come from exp->ex_path.mnt, exp being >> the argument of nfsd4_encode_fattr(). Sure, it might have been a dangling >> reference. However, it looks a lot more like a memory corruptor *OR* >> miscompiled kernel. >> >> What kind of load do the reproducer boxen have and how fast does that >> bug trigger? Would it be possible to slap something like >> if (unlikely(!exp->exp_path.mnt->mnt_sb)) { >> struct mount *m = real_mount(exp->exp_path.mnt); >> printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt); >> printk(KERN_ERR "name: [%s]\n", m->mnt_devname); >> printk(KERN_ERR "ns: [%p]\n", m->mnt_ns); >> printk(KERN_ERR "parent: [%p]\n", m->mnt_parent); >> WARN_ON(1); >> err = -EINVAL; >> goto out_nfserr; >> } >> in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added >> in fs/nfsd/nfs4xdr.c) and see what will it catch? >> >> Both with and without randomized structs, if possible - I might be barking >> at the wrong tree, but IMO the very first step in localizing that crap is >> to find out whether it's toolchain-related or not. > That condition did not seem to trigger, and I am getting a slightly different crash message (GPF rather than null pointer dereference). Here is the dump from the latest crash (with CONFIG_GCC_PLUGIN_STRUCTLEAK, CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and CONFIG_GCC_PLUGIN_RANDSTRUCT all enabled). > [ 36.834232] general protection fault: [#1] SMP > [ 36.835168] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 > xt_multiport xt_addrtype iptable_mangle iptable_raw iptable_nat nf_nat_ipv4 > nf_nat gkuart(O) usbserial x86_pkg_temp_thermal ie31200_edac tpm_tis > ipmi_ssif tpm_tis_core ext4 mbcache jbd2 e1000e crc32c_intel > [ 36.839120] CPU: 1 PID: 3969 Comm: nfsd Tainted: G O > 4.14.0-rc8-git-kratos-1-00053-gd93d4ce103fd-dirty #1 > [ 36.840883] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013 > [ 36.841892] task: 88040a0b1c80 task.stack: c900027bc000 > [ 36.842887] RIP: 0010:vfs_statfs+0x73/0xb0 > [ 36.843728] RSP: 0018:c900027bfb30 EFLAGS: 00010202 > [ 36.844687] RAX: RBX: c900027bfbf8 RCX: > 180d > [ 36.845891] RDX: 080d RSI: 0020 RDI: > e2006d6574737973 > [ 36.847075] RBP: c900027bfbc8 R08: R09: > 00ff > [ 36.848175] R10: 0038be3a R11: 88040b687578 R12: > > [ 36.849260] R13: 88040d7dc400 R14: 88040d38b000 R15: > c900027bfbf8 > [ 36.850347] FS: () GS:88041fc4() > knlGS: > [ 36.851891] CS: 0010 DS: ES: CR0: 80050033 > [ 36.852873] CR2: 7f049228edc0 CR3: 01e0a004 CR4: > 001606e0 > [ 36.853942] Call Trace: > [ 36.854667] nfsd4_encode_fattr+0x34e/0x23b0 > [ 36.855578] ? ext4_get_acl+0x1b2/0x260 [ext4] > [ 36.856485] ? get_acl+0x7a/0xf0 > [ 36.857266] ? generic_permission+0x125/0x1a0 > [ 36.858150] nfsd4_encode_getattr+0x25/0x30 > [ 36.859002] nfsd4_encode_operation+0x98/0x1a0 > [ 36.859889] nfsd4_proc_compound+0x3eb/0x5c0 > [ 36.860736] nfsd_dispatch+0xa8/0x230 > [ 36.861538] svc_process_common+0x347/0x640 > [ 36.862383] svc_process+0x100/0x1b0 > [ 36.863204] nfsd+0xe0/0x150 > [ 36.863984] kthread+0xfc/0x130 > [ 36.864781] ? nfsd_destroy+0x50/0x50 > [ 36.865624] ?
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 12:47 PM, J. Bruce Fields wrote: > On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: >> Anyway, that cmovne noise makes it a bit hard to see the actual part >> that matters (and that traps) but I'm almost certain that it's the >> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() >> when it then does >> >> flags_by_sb(mnt->mnt_sb->s_flags); >> >> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is >> NULL, because we wouldn't have gotten this far if it was. >> >> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a >> proper path. And the vfs_statfs() code itself hasn't changed in a >> while. >> >> Which does seem to implicate nfsd as having passed in a bad path to >> vfs_statfs(). But I'm not seeing any changes in nfsd either. >> >> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 >> range. There is a bunch of xfs changes, though. What's the underlying >> filesystem that you are exporting? >> >> But bringing in Al Viro and Bruce Fields explicitly in case they see >> something. And Darrick, just in case it might be xfs. > > Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops... > > It doesn't remind me of any known issue. > > So either I'm overlooking something or the bug's elsewhere. > > It sounds like you're varying *only* the server version, so there's not > much chance that this could be triggered by changes in client behavior? > We are definitely only varying the kernel on the server, nothing on the client side is changing. The client in this case is essentially an embedded device that we do not have a whole lot of control over.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 12:47 PM, J. Bruce Fields wrote: > On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: >> Anyway, that cmovne noise makes it a bit hard to see the actual part >> that matters (and that traps) but I'm almost certain that it's the >> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() >> when it then does >> >> flags_by_sb(mnt->mnt_sb->s_flags); >> >> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is >> NULL, because we wouldn't have gotten this far if it was. >> >> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a >> proper path. And the vfs_statfs() code itself hasn't changed in a >> while. >> >> Which does seem to implicate nfsd as having passed in a bad path to >> vfs_statfs(). But I'm not seeing any changes in nfsd either. >> >> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 >> range. There is a bunch of xfs changes, though. What's the underlying >> filesystem that you are exporting? >> >> But bringing in Al Viro and Bruce Fields explicitly in case they see >> something. And Darrick, just in case it might be xfs. > > Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops... > > It doesn't remind me of any known issue. > > So either I'm overlooking something or the bug's elsewhere. > > It sounds like you're varying *only* the server version, so there's not > much chance that this could be triggered by changes in client behavior? > We are definitely only varying the kernel on the server, nothing on the client side is changing. The client in this case is essentially an embedded device that we do not have a whole lot of control over.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 09, 2017 at 12:04:19PM -0800, Linus Torvalds wrote: > That case has probably not gotten a huge amount of testing. As Al > points out, it can cause absolutely horrendous cache access pattern > changes, but it might also be triggering some corruption in case > there's a problem with the plugin, or with some piece of kernel code > that gets confused by it. I suspect that it might be an effect of randomize shite done both to struct mount *AND* to struct vfsmount embedded into it. With pointers to embedded struct vfsmount kept around a lot, and container_of() used to get from them to corresponding struct mount. That smells like a combination of idiocy that might have never occured to the authors of said gcc plugin. On the other hand, triggered gcc bugs certainly do add randomness, so good luck explaining to the security community that it's not a good idea...
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 09, 2017 at 12:04:19PM -0800, Linus Torvalds wrote: > That case has probably not gotten a huge amount of testing. As Al > points out, it can cause absolutely horrendous cache access pattern > changes, but it might also be triggering some corruption in case > there's a problem with the plugin, or with some piece of kernel code > that gets confused by it. I suspect that it might be an effect of randomize shite done both to struct mount *AND* to struct vfsmount embedded into it. With pointers to embedded struct vfsmount kept around a lot, and container_of() used to get from them to corresponding struct mount. That smells like a combination of idiocy that might have never occured to the authors of said gcc plugin. On the other hand, triggered gcc bugs certainly do add randomness, so good luck explaining to the security community that it's not a good idea...
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: > Anyway, that cmovne noise makes it a bit hard to see the actual part > that matters (and that traps) but I'm almost certain that it's the > "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() > when it then does > > flags_by_sb(mnt->mnt_sb->s_flags); > > and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is > NULL, because we wouldn't have gotten this far if it was. > > Now, afaik, mnt->mnt_sb should never be NULL in the first place for a > proper path. And the vfs_statfs() code itself hasn't changed in a > while. > > Which does seem to implicate nfsd as having passed in a bad path to > vfs_statfs(). But I'm not seeing any changes in nfsd either. > > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 > range. There is a bunch of xfs changes, though. What's the underlying > filesystem that you are exporting? > > But bringing in Al Viro and Bruce Fields explicitly in case they see > something. And Darrick, just in case it might be xfs. Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops... It doesn't remind me of any known issue. And I don't see how we can call vfs_statfs() with a bad path: nfsd4_encode_getattr would have to have been called with nfserr 0 and ga_fhp->fh_export bad. Looking at nfsd4_proc_compound, I can't see how we could get there in the op->status == 0 case without the fh_verify() in nfsd4_getattr having succeeded and assigned the result to ga_fhp. So either I'm overlooking something or the bug's elsewhere. It sounds like you're varying *only* the server version, so there's not much chance that this could be triggered by changes in client behavior? --b.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: > Anyway, that cmovne noise makes it a bit hard to see the actual part > that matters (and that traps) but I'm almost certain that it's the > "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() > when it then does > > flags_by_sb(mnt->mnt_sb->s_flags); > > and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is > NULL, because we wouldn't have gotten this far if it was. > > Now, afaik, mnt->mnt_sb should never be NULL in the first place for a > proper path. And the vfs_statfs() code itself hasn't changed in a > while. > > Which does seem to implicate nfsd as having passed in a bad path to > vfs_statfs(). But I'm not seeing any changes in nfsd either. > > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 > range. There is a bunch of xfs changes, though. What's the underlying > filesystem that you are exporting? > > But bringing in Al Viro and Bruce Fields explicitly in case they see > something. And Darrick, just in case it might be xfs. Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops... It doesn't remind me of any known issue. And I don't see how we can call vfs_statfs() with a bad path: nfsd4_encode_getattr would have to have been called with nfserr 0 and ga_fhp->fh_export bad. Looking at nfsd4_proc_compound, I can't see how we could get there in the op->status == 0 case without the fh_verify() in nfsd4_getattr having succeeded and assigned the result to ga_fhp. So either I'm overlooking something or the bug's elsewhere. It sounds like you're varying *only* the server version, so there's not much chance that this could be triggered by changes in client behavior? --b.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLeanwrote: > > We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and > CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as > CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before. It might be worth just verifying without RANDSTRUCT in particular. That case has probably not gotten a huge amount of testing. As Al points out, it can cause absolutely horrendous cache access pattern changes, but it might also be triggering some corruption in case there's a problem with the plugin, or with some piece of kernel code that gets confused by it. And most obviously: if there is some module or part of the kernel that got compiled with a different seed for the randstruct hashing, that will break in nasty nasty ways. Your out-of-kernel module is the obvious suspect for something like that, but honestly, it could be some missing build dependency, or simply a missing special case in the plugin itself a missing __no_randomize_layout or any number of things. We've hit gcc bugs many times before - and the plugins are just new opportunities to hit cases that have gotten a lot less testing than the "normal" code flow has. The structleak plugin is much less likely to be a problem (simply because it's a much simpler plugin), but hey, something being NULL when it shouldn't possibly be might be a stray "leak initialization". So since you seem to be able to reproduce this _reasonably_ easily, it's definitely worth checking that it still reproduces even without the gcc plugins. Just to narrow it down a bit. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean wrote: > > We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and > CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as > CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before. It might be worth just verifying without RANDSTRUCT in particular. That case has probably not gotten a huge amount of testing. As Al points out, it can cause absolutely horrendous cache access pattern changes, but it might also be triggering some corruption in case there's a problem with the plugin, or with some piece of kernel code that gets confused by it. And most obviously: if there is some module or part of the kernel that got compiled with a different seed for the randstruct hashing, that will break in nasty nasty ways. Your out-of-kernel module is the obvious suspect for something like that, but honestly, it could be some missing build dependency, or simply a missing special case in the plugin itself a missing __no_randomize_layout or any number of things. We've hit gcc bugs many times before - and the plugins are just new opportunities to hit cases that have gotten a lot less testing than the "normal" code flow has. The structleak plugin is much less likely to be a problem (simply because it's a much simpler plugin), but hey, something being NULL when it shouldn't possibly be might be a stray "leak initialization". So since you seem to be able to reproduce this _reasonably_ easily, it's definitely worth checking that it still reproduces even without the gcc plugins. Just to narrow it down a bit. Linus
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 11:37 AM, Al Viro wrote: > On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: > >>> Here is the BUG we are getting: [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at 0230 [ 58.963918] IP: vfs_statfs+0x73/0xb0 >> >> The code disassembles to > >> 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > >> that matters (and that traps) but I'm almost certain that it's the >> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() >> when it then does >> >> flags_by_sb(mnt->mnt_sb->s_flags); >> >> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is >> NULL, because we wouldn't have gotten this far if it was. >> > > All instances of struct dentry are created by __d_alloc()[*], which assigns > ->d_sb (never to be modified afterwards) *and* dereferences the pointer > it has stored in ->d_sb before the created struct dentry becomes visible > to anyone else. No struct dentry should ever be observed with NULL ->d_sb; > the only way to get that is memory corruption or looking at freed instance > after its memory has been reused for something else and zeroed. > > In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb > - > not without memory corruption or looking at freed instance. > > The pointer in that case should've come from exp->ex_path.mnt, exp being > the argument of nfsd4_encode_fattr(). Sure, it might have been a dangling > reference. However, it looks a lot more like a memory corruptor *OR* > miscompiled kernel. > > What kind of load do the reproducer boxen have and how fast does that > bug trigger? Would it be possible to slap something like > if (unlikely(!exp->exp_path.mnt->mnt_sb)) { > struct mount *m = real_mount(exp->exp_path.mnt); > printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt); > printk(KERN_ERR "name: [%s]\n", m->mnt_devname); > printk(KERN_ERR "ns: [%p]\n", m->mnt_ns); > printk(KERN_ERR "parent: [%p]\n", m->mnt_parent); > WARN_ON(1); > err = -EINVAL; > goto out_nfserr; > } > in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added > in fs/nfsd/nfs4xdr.c) and see what will it catch? > > Both with and without randomized structs, if possible - I might be barking > at the wrong tree, but IMO the very first step in localizing that crap is > to find out whether it's toolchain-related or not. The reproducer boxen are not under particularly heavy load, they are serving NFS to 1 or 2 clients (which are essentially embedded devices). When the bug triggers, it usually triggers pretty fast and reliably, but it seems to only trigger on some subset of bootups. Once it fails to trigger, we seem to have to reboot to get it to trigger. I should be able to have some results with that added in a few hours. It's weirdly unreliable to reproduce this. We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 11:37 AM, Al Viro wrote: > On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: > >>> Here is the BUG we are getting: [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at 0230 [ 58.963918] IP: vfs_statfs+0x73/0xb0 >> >> The code disassembles to > >> 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > >> that matters (and that traps) but I'm almost certain that it's the >> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() >> when it then does >> >> flags_by_sb(mnt->mnt_sb->s_flags); >> >> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is >> NULL, because we wouldn't have gotten this far if it was. >> > > All instances of struct dentry are created by __d_alloc()[*], which assigns > ->d_sb (never to be modified afterwards) *and* dereferences the pointer > it has stored in ->d_sb before the created struct dentry becomes visible > to anyone else. No struct dentry should ever be observed with NULL ->d_sb; > the only way to get that is memory corruption or looking at freed instance > after its memory has been reused for something else and zeroed. > > In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb > - > not without memory corruption or looking at freed instance. > > The pointer in that case should've come from exp->ex_path.mnt, exp being > the argument of nfsd4_encode_fattr(). Sure, it might have been a dangling > reference. However, it looks a lot more like a memory corruptor *OR* > miscompiled kernel. > > What kind of load do the reproducer boxen have and how fast does that > bug trigger? Would it be possible to slap something like > if (unlikely(!exp->exp_path.mnt->mnt_sb)) { > struct mount *m = real_mount(exp->exp_path.mnt); > printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt); > printk(KERN_ERR "name: [%s]\n", m->mnt_devname); > printk(KERN_ERR "ns: [%p]\n", m->mnt_ns); > printk(KERN_ERR "parent: [%p]\n", m->mnt_parent); > WARN_ON(1); > err = -EINVAL; > goto out_nfserr; > } > in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added > in fs/nfsd/nfs4xdr.c) and see what will it catch? > > Both with and without randomized structs, if possible - I might be barking > at the wrong tree, but IMO the very first step in localizing that crap is > to find out whether it's toolchain-related or not. The reproducer boxen are not under particularly heavy load, they are serving NFS to 1 or 2 clients (which are essentially embedded devices). When the bug triggers, it usually triggers pretty fast and reliably, but it seems to only trigger on some subset of bootups. Once it fails to trigger, we seem to have to reboot to get it to trigger. I should be able to have some results with that added in a few hours. It's weirdly unreliable to reproduce this. We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 11:38 AM, Al Viro wrote: > On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote: > >>> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 >>> range. There is a bunch of xfs changes, though. What's the underlying >>> filesystem that you are exporting? >> >> It's an ext4 filesystem. > > Had there been toolchain changes around the same period? > No changes, both the working and broken kernels were built with distro-provided gcc 5.4.0 and binutils 2.28.1.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-09 11:38 AM, Al Viro wrote: > On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote: > >>> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 >>> range. There is a bunch of xfs changes, though. What's the underlying >>> filesystem that you are exporting? >> >> It's an ext4 filesystem. > > Had there been toolchain changes around the same period? > No changes, both the working and broken kernels were built with distro-provided gcc 5.4.0 and binutils 2.28.1.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote: > > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 > > range. There is a bunch of xfs changes, though. What's the underlying > > filesystem that you are exporting? > > It's an ext4 filesystem. Had there been toolchain changes around the same period?
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote: > > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 > > range. There is a bunch of xfs changes, though. What's the underlying > > filesystem that you are exporting? > > It's an ext4 filesystem. Had there been toolchain changes around the same period?
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: > > Here is the BUG we are getting: > >> [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at > >> 0230 > >> [ 58.963918] IP: vfs_statfs+0x73/0xb0 > > The code disassembles to > 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > that matters (and that traps) but I'm almost certain that it's the > "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() > when it then does > > flags_by_sb(mnt->mnt_sb->s_flags); > > and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is > NULL, because we wouldn't have gotten this far if it was. > > Now, afaik, mnt->mnt_sb should never be NULL in the first place for a > proper path. And the vfs_statfs() code itself hasn't changed in a > while. > > Which does seem to implicate nfsd as having passed in a bad path to > vfs_statfs(). But I'm not seeing any changes in nfsd either. It definitely is NULL mnt->mnt_sb and that should never happen. All struct mount instances are allocated by alloc_vfsmnt(). Its callers are * vfs_kern_mount(). Assigns ->mnt_sb to root->d_sb before anyone else sees the address of that object. * clone_mnt(). Assigns ->mnt_sb to that of preexisting instance before anyone else sees the address of that object. No other callers exist and no other places ever modify the value of that field. All instances of struct dentry are created by __d_alloc()[*], which assigns ->d_sb (never to be modified afterwards) *and* dereferences the pointer it has stored in ->d_sb before the created struct dentry becomes visible to anyone else. No struct dentry should ever be observed with NULL ->d_sb; the only way to get that is memory corruption or looking at freed instance after its memory has been reused for something else and zeroed. In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb - not without memory corruption or looking at freed instance. The pointer in that case should've come from exp->ex_path.mnt, exp being the argument of nfsd4_encode_fattr(). Sure, it might have been a dangling reference. However, it looks a lot more like a memory corruptor *OR* miscompiled kernel. What kind of load do the reproducer boxen have and how fast does that bug trigger? Would it be possible to slap something like if (unlikely(!exp->exp_path.mnt->mnt_sb)) { struct mount *m = real_mount(exp->exp_path.mnt); printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt); printk(KERN_ERR "name: [%s]\n", m->mnt_devname); printk(KERN_ERR "ns: [%p]\n", m->mnt_ns); printk(KERN_ERR "parent: [%p]\n", m->mnt_parent); WARN_ON(1); err = -EINVAL; goto out_nfserr; } in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added in fs/nfsd/nfs4xdr.c) and see what will it catch? Both with and without randomized structs, if possible - I might be barking at the wrong tree, but IMO the very first step in localizing that crap is to find out whether it's toolchain-related or not. [*] strictly speaking, there is one exception - lib/test_printf.c has four static struct dentry instances. No chance of those being returned by any ->mount() instance, though.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote: > > Here is the BUG we are getting: > >> [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at > >> 0230 > >> [ 58.963918] IP: vfs_statfs+0x73/0xb0 > > The code disassembles to > 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > that matters (and that traps) but I'm almost certain that it's the > "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() > when it then does > > flags_by_sb(mnt->mnt_sb->s_flags); > > and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is > NULL, because we wouldn't have gotten this far if it was. > > Now, afaik, mnt->mnt_sb should never be NULL in the first place for a > proper path. And the vfs_statfs() code itself hasn't changed in a > while. > > Which does seem to implicate nfsd as having passed in a bad path to > vfs_statfs(). But I'm not seeing any changes in nfsd either. It definitely is NULL mnt->mnt_sb and that should never happen. All struct mount instances are allocated by alloc_vfsmnt(). Its callers are * vfs_kern_mount(). Assigns ->mnt_sb to root->d_sb before anyone else sees the address of that object. * clone_mnt(). Assigns ->mnt_sb to that of preexisting instance before anyone else sees the address of that object. No other callers exist and no other places ever modify the value of that field. All instances of struct dentry are created by __d_alloc()[*], which assigns ->d_sb (never to be modified afterwards) *and* dereferences the pointer it has stored in ->d_sb before the created struct dentry becomes visible to anyone else. No struct dentry should ever be observed with NULL ->d_sb; the only way to get that is memory corruption or looking at freed instance after its memory has been reused for something else and zeroed. In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb - not without memory corruption or looking at freed instance. The pointer in that case should've come from exp->ex_path.mnt, exp being the argument of nfsd4_encode_fattr(). Sure, it might have been a dangling reference. However, it looks a lot more like a memory corruptor *OR* miscompiled kernel. What kind of load do the reproducer boxen have and how fast does that bug trigger? Would it be possible to slap something like if (unlikely(!exp->exp_path.mnt->mnt_sb)) { struct mount *m = real_mount(exp->exp_path.mnt); printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt); printk(KERN_ERR "name: [%s]\n", m->mnt_devname); printk(KERN_ERR "ns: [%p]\n", m->mnt_ns); printk(KERN_ERR "parent: [%p]\n", m->mnt_parent); WARN_ON(1); err = -EINVAL; goto out_nfserr; } in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added in fs/nfsd/nfs4xdr.c) and see what will it catch? Both with and without randomized structs, if possible - I might be barking at the wrong tree, but IMO the very first step in localizing that crap is to find out whether it's toolchain-related or not. [*] strictly speaking, there is one exception - lib/test_printf.c has four static struct dentry instances. No chance of those being returned by any ->mount() instance, though.
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-08 06:40 PM, Linus Torvalds wrote: > On Wed, Nov 8, 2017 at 4:43 PM, Patrick McLeanwrote: >> As of 4.13.11 (and also with 4.14-rc) we have an issue where when >> serving nfs4 sometimes we get the following BUG. When this bug happens, >> it usually also causes the motherboard to no longer POST until we >> externally re-flash the BIOS (using the BMC web interface). If a >> motherboard does not have an external way to flash the BIOS, this would >> brick the hardware. > > That sounds like your BIOS is just broken. All the dead boards were from the same vendor. We are going to try some boards from another vendor today. > > The kernel oops is probably just a trigger for that - possibly because > you reboot with a particular state that breaks the BIOS. > > Also, are you sure you really need to reflash the BIOS? It's actually > fairly hard to overwrite the BIOS itself, but crashing with bad > hardware state (where "bad" can just mean "unexpected by the BIOS") > can cause the BIOS to not properly re-initialize things, and hang at > boot. > > So not booting cleanly from a warm reset is a reasonably common BIOS failure. > > And yes, reflashing tends to force a full initialization and thus > "fixes" things, but it may be a big hammer when a cold boot or just a > "reset BIOS to safe defaults" might be sufficient. > > In pretty much all cases this is a sign of a nasty BIOS problem, > though, and you may want to look into a firmware update from the > vendor for that. We tried a cold power off (physically unplugging the machine from power) and a CMOS reset, and neither helped. The only thing that actually restored one of the dead boards was a reflash. I did the reflash with the latest code when I reflashed it. > > But on to the kernel side: > >> Here is the BUG we are getting: >>> [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at >>> 0230 >>> [ 58.963918] IP: vfs_statfs+0x73/0xb0 > > The code disassembles to > >0: 83 c9 08 or $0x8,%ecx >3: 40 f6 c6 04 test $0x4,%sil >7: 0f 45 d1 cmovne %ecx,%edx >a: 89 d1mov%edx,%ecx >c: 80 cd 04 or $0x4,%ch >f: 40 f6 c6 08 test $0x8,%sil > 13: 0f 45 d1 cmovne %ecx,%edx > 16: 89 d1mov%edx,%ecx > 18: 80 cd 08 or $0x8,%ch > 1b: 40 f6 c6 10 test $0x10,%sil > 1f: 0f 45 d1 cmovne %ecx,%edx > 22: 89 d1mov%edx,%ecx > 24: 80 cd 10 or $0x10,%ch > 27: 83 e6 20 and$0x20,%esi > 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > 31: 0f 45 d1 cmovne %ecx,%edx > 34: 83 ca 20 or $0x20,%edx > 37: 89 f1mov%esi,%ecx > 39: 83 e1 10 and$0x10,%ecx > 3c: 89 cfmov%ecx,%edi > > and all those odd cmovne and bit-ops are just the bit selection code > in flags_by_mnt(), which is inlined through calculate_f_flags (which > is _also_ inlined) into vfs_statfs(). > > Sadly, gcc makes a mess of it and actually generates code that looks > like the original C. I would have hoped that gcc could have turned > >if (x & BIT) > y |= OTHER_BIT; > > into > > y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT; > > but that doesn't happen. We actually do it by hand in some other more > critical places, but it's painful to do by hand (because the shift > direction/amount is not trivial to do in C). > > Anyway, that cmovne noise makes it a bit hard to see the actual part > that matters (and that traps) but I'm almost certain that it's the > "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() > when it then does > > flags_by_sb(mnt->mnt_sb->s_flags); > > and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is > NULL, because we wouldn't have gotten this far if it was. > > Now, afaik, mnt->mnt_sb should never be NULL in the first place for a > proper path. And the vfs_statfs() code itself hasn't changed in a > while. > > Which does seem to implicate nfsd as having passed in a bad path to > vfs_statfs(). But I'm not seeing any changes in nfsd either. > > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 > range. There is a bunch of xfs changes, though. What's the underlying > filesystem that you are exporting? It's an ext4 filesystem. > > But bringing in Al Viro and Bruce Fields explicitly in case they see > something. And Darrick, just in case it might be xfs. > Thanks
Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11
On 2017-11-08 06:40 PM, Linus Torvalds wrote: > On Wed, Nov 8, 2017 at 4:43 PM, Patrick McLean wrote: >> As of 4.13.11 (and also with 4.14-rc) we have an issue where when >> serving nfs4 sometimes we get the following BUG. When this bug happens, >> it usually also causes the motherboard to no longer POST until we >> externally re-flash the BIOS (using the BMC web interface). If a >> motherboard does not have an external way to flash the BIOS, this would >> brick the hardware. > > That sounds like your BIOS is just broken. All the dead boards were from the same vendor. We are going to try some boards from another vendor today. > > The kernel oops is probably just a trigger for that - possibly because > you reboot with a particular state that breaks the BIOS. > > Also, are you sure you really need to reflash the BIOS? It's actually > fairly hard to overwrite the BIOS itself, but crashing with bad > hardware state (where "bad" can just mean "unexpected by the BIOS") > can cause the BIOS to not properly re-initialize things, and hang at > boot. > > So not booting cleanly from a warm reset is a reasonably common BIOS failure. > > And yes, reflashing tends to force a full initialization and thus > "fixes" things, but it may be a big hammer when a cold boot or just a > "reset BIOS to safe defaults" might be sufficient. > > In pretty much all cases this is a sign of a nasty BIOS problem, > though, and you may want to look into a firmware update from the > vendor for that. We tried a cold power off (physically unplugging the machine from power) and a CMOS reset, and neither helped. The only thing that actually restored one of the dead boards was a reflash. I did the reflash with the latest code when I reflashed it. > > But on to the kernel side: > >> Here is the BUG we are getting: >>> [ 58.962528] BUG: unable to handle kernel NULL pointer dereference at >>> 0230 >>> [ 58.963918] IP: vfs_statfs+0x73/0xb0 > > The code disassembles to > >0: 83 c9 08 or $0x8,%ecx >3: 40 f6 c6 04 test $0x4,%sil >7: 0f 45 d1 cmovne %ecx,%edx >a: 89 d1mov%edx,%ecx >c: 80 cd 04 or $0x4,%ch >f: 40 f6 c6 08 test $0x8,%sil > 13: 0f 45 d1 cmovne %ecx,%edx > 16: 89 d1mov%edx,%ecx > 18: 80 cd 08 or $0x8,%ch > 1b: 40 f6 c6 10 test $0x10,%sil > 1f: 0f 45 d1 cmovne %ecx,%edx > 22: 89 d1mov%edx,%ecx > 24: 80 cd 10 or $0x10,%ch > 27: 83 e6 20 and$0x20,%esi > 2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction > 31: 0f 45 d1 cmovne %ecx,%edx > 34: 83 ca 20 or $0x20,%edx > 37: 89 f1mov%esi,%ecx > 39: 83 e1 10 and$0x10,%ecx > 3c: 89 cfmov%ecx,%edi > > and all those odd cmovne and bit-ops are just the bit selection code > in flags_by_mnt(), which is inlined through calculate_f_flags (which > is _also_ inlined) into vfs_statfs(). > > Sadly, gcc makes a mess of it and actually generates code that looks > like the original C. I would have hoped that gcc could have turned > >if (x & BIT) > y |= OTHER_BIT; > > into > > y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT; > > but that doesn't happen. We actually do it by hand in some other more > critical places, but it's painful to do by hand (because the shift > direction/amount is not trivial to do in C). > > Anyway, that cmovne noise makes it a bit hard to see the actual part > that matters (and that traps) but I'm almost certain that it's the > "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags() > when it then does > > flags_by_sb(mnt->mnt_sb->s_flags); > > and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is > NULL, because we wouldn't have gotten this far if it was. > > Now, afaik, mnt->mnt_sb should never be NULL in the first place for a > proper path. And the vfs_statfs() code itself hasn't changed in a > while. > > Which does seem to implicate nfsd as having passed in a bad path to > vfs_statfs(). But I'm not seeing any changes in nfsd either. > > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11 > range. There is a bunch of xfs changes, though. What's the underlying > filesystem that you are exporting? It's an ext4 filesystem. > > But bringing in Al Viro and Bruce Fields explicitly in case they see > something. And Darrick, just in case it might be xfs. > Thanks