subject:"\[nfsd4\] potentially hardware breaking regression in 4.14\-rc and 4.13.11"

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-03-05 Thread Kees Cook

On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada
 wrote:
> Sorry for chiming in late.
>
> I noticed this thread today,
> honestly, the commit made me upset.
>
>
> Can I suggest another way to make it less fragile?
> __attribute((...)) can be placed after 'struct'.
>
>
> So, we can write:
>
>
> struct __randomize_layout path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> };
>
>
>   instead of
>
>
> struct path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> } __randomize_layout;

Ugh. I had tried this after the struct _name_, not after "struct"
itself. This does fix it, though it remains fragile, as you mention.

> If we force the former notation,
> the undefined __randomize_layout results in a build error
> instead of silent broken code generation.
>
>
> It is true somebody can still place
> __randomize_layout after the closing brace,
> but can we check this by coccicheck or checkpatch.pl?
> (we can describe it in coding style documentation, of course)
>
>
> IMHO, we should not (ab)use include/linux/kconfig.h
> to bring in misc things.

I'm happy to send a patch that reverts the other changes and relocates
all the markings...

Linus, how would you like this to go?

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-03-05 Thread Kees Cook

On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada
 wrote:
> Sorry for chiming in late.
>
> I noticed this thread today,
> honestly, the commit made me upset.
>
>
> Can I suggest another way to make it less fragile?
> __attribute((...)) can be placed after 'struct'.
>
>
> So, we can write:
>
>
> struct __randomize_layout path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> };
>
>
>   instead of
>
>
> struct path {
> struct vfsmount *mnt;
> struct dentry *dentry;
> } __randomize_layout;

Ugh. I had tried this after the struct _name_, not after "struct"
itself. This does fix it, though it remains fragile, as you mention.

> If we force the former notation,
> the undefined __randomize_layout results in a build error
> instead of silent broken code generation.
>
>
> It is true somebody can still place
> __randomize_layout after the closing brace,
> but can we check this by coccicheck or checkpatch.pl?
> (we can describe it in coding style documentation, of course)
>
>
> IMHO, we should not (ab)use include/linux/kconfig.h
> to bring in misc things.

I'm happy to send a patch that reverts the other changes and relocates
all the markings...

Linus, how would you like this to go?

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-03-05 Thread Linus Torvalds

On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada
 wrote:
>
> Can I suggest another way to make it less fragile?
> __attribute((...)) can be placed after 'struct'.

That avoids the actual bug, but it wouldn't have helped _find_ the
problem in the first place.

If somebody ever does the same thing, they'd hit the same issue. And
it's not just __randomize_struct, it's any of our other type markers.

We can say "don't do that", but if there is no automated checking,
it's still ripe to cause problems just because somebody didn't notice.

So I'd rather have something that causes a build failure when
something goes wrong, rather than silently accepting syntax that
wasn't intended.

   Linus

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-03-05 Thread Linus Torvalds

On Mon, Mar 5, 2018 at 1:27 AM, Masahiro Yamada
 wrote:
>
> Can I suggest another way to make it less fragile?
> __attribute((...)) can be placed after 'struct'.

That avoids the actual bug, but it wouldn't have helped _find_ the
problem in the first place.

If somebody ever does the same thing, they'd hit the same issue. And
it's not just __randomize_struct, it's any of our other type markers.

We can say "don't do that", but if there is no automated checking,
it's still ripe to cause problems just because somebody didn't notice.

So I'd rather have something that causes a build failure when
something goes wrong, rather than silently accepting syntax that
wasn't intended.

   Linus

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-03-05 Thread Masahiro Yamada

Hi Linus,

2018-02-22 7:47 GMT+09:00 Linus Torvalds :
> On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
>  wrote:
>>
>> One can see that offsets used to access various members of struct path are
>> different, and also that the original file from step 3 contains an object
>> named "__randomize_layout".
>
> Whee.
>
> Thanks for root-causing this issue, and this syntax of ours is clearly
> *much* too fragile.
>
> We actually have similar issues with some of our other attributes,
> where out nice "helpful" attribute shorthand can end up being just
> silently interpreted as a variable name if they aren't defined in
> time.
>
> For most of our other attributes, it just doesn't matter all that much
> if some user doesn't happen to see the attribute. For
> __randomize_layout, it's obviously very fatal, and silently just
> generates crazy code.
>
> I'm not entirely sure what the right solution is, because it's
> obviously much too easy to miss some #include by mistake. It's easy to
> say "you should always include the proper header", but if a failure to
> do so doesn't end up with any warnings or errors, but just silent bad
> code generation, it's much too fragile.
>
> I wonder if we could change the syntax of that "__randomize_layout"
> thing. Some of our related helper macros (ie
> randomized_struct_fields_start/end) don't have the same problem,
> because if you don't have the define for them, the compiler will
> complain about bad syntax.
>
> And other attribute specifiers we encourage people to put in other
> parts of the type, like __user etc, so they don't have that same
> parsing issue.
>
> I guess one _extreme_ fix for this would be to put
>
> extern struct nostruct __randomize_layout;
>
> in our include/linux/kconfig.h, which I think we end up always
> including first thanks to having it on the command line.
>
> Because if you do that, you actually get an error:
>
> CC [M]  fs/nfsd/nfs4xdr.o
>   In file included from ./include/linux/fs_struct.h:5:0,
>from fs/nfsd/nfs4xdr.c:36:
>   ./include/linux/path.h:11:3: error: conflicting types for 
> ‘__randomize_layout’
>} __randomize_layout;
>  ^~
>   In file included from :0:0:
>   ././include/linux/kconfig.h:8:28: note: previous declaration of
> ‘__randomize_layout’ was here
>extern struct nostruct __randomize_layout;
>   ^~
>   make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1
>
> and we would have figured this out immediately.
>
> Broken example patch appended, in case somebody wants to play with
> something like this or comes up with a better model entirely..
>
>Linus
>


Sorry for chiming in late.

I noticed this thread today,
honestly, the commit made me upset.


Can I suggest another way to make it less fragile?
__attribute((...)) can be placed after 'struct'.


So, we can write:


struct __randomize_layout path {
struct vfsmount *mnt;
struct dentry *dentry;
};


  instead of


struct path {
struct vfsmount *mnt;
struct dentry *dentry;
} __randomize_layout;



If we force the former notation,
the undefined __randomize_layout results in a build error
instead of silent broken code generation.


It is true somebody can still place
__randomize_layout after the closing brace,
but can we check this by coccicheck or checkpatch.pl?
(we can describe it in coding style documentation, of course)


IMHO, we should not (ab)use include/linux/kconfig.h
to bring in misc things.


-- 
Best Regards
Masahiro Yamada

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-03-05 Thread Masahiro Yamada

Hi Linus,

2018-02-22 7:47 GMT+09:00 Linus Torvalds :
> On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
>  wrote:
>>
>> One can see that offsets used to access various members of struct path are
>> different, and also that the original file from step 3 contains an object
>> named "__randomize_layout".
>
> Whee.
>
> Thanks for root-causing this issue, and this syntax of ours is clearly
> *much* too fragile.
>
> We actually have similar issues with some of our other attributes,
> where out nice "helpful" attribute shorthand can end up being just
> silently interpreted as a variable name if they aren't defined in
> time.
>
> For most of our other attributes, it just doesn't matter all that much
> if some user doesn't happen to see the attribute. For
> __randomize_layout, it's obviously very fatal, and silently just
> generates crazy code.
>
> I'm not entirely sure what the right solution is, because it's
> obviously much too easy to miss some #include by mistake. It's easy to
> say "you should always include the proper header", but if a failure to
> do so doesn't end up with any warnings or errors, but just silent bad
> code generation, it's much too fragile.
>
> I wonder if we could change the syntax of that "__randomize_layout"
> thing. Some of our related helper macros (ie
> randomized_struct_fields_start/end) don't have the same problem,
> because if you don't have the define for them, the compiler will
> complain about bad syntax.
>
> And other attribute specifiers we encourage people to put in other
> parts of the type, like __user etc, so they don't have that same
> parsing issue.
>
> I guess one _extreme_ fix for this would be to put
>
> extern struct nostruct __randomize_layout;
>
> in our include/linux/kconfig.h, which I think we end up always
> including first thanks to having it on the command line.
>
> Because if you do that, you actually get an error:
>
> CC [M]  fs/nfsd/nfs4xdr.o
>   In file included from ./include/linux/fs_struct.h:5:0,
>from fs/nfsd/nfs4xdr.c:36:
>   ./include/linux/path.h:11:3: error: conflicting types for 
> ‘__randomize_layout’
>} __randomize_layout;
>  ^~
>   In file included from :0:0:
>   ././include/linux/kconfig.h:8:28: note: previous declaration of
> ‘__randomize_layout’ was here
>extern struct nostruct __randomize_layout;
>   ^~
>   make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1
>
> and we would have figured this out immediately.
>
> Broken example patch appended, in case somebody wants to play with
> something like this or comes up with a better model entirely..
>
>Linus
>


Sorry for chiming in late.

I noticed this thread today,
honestly, the commit made me upset.


Can I suggest another way to make it less fragile?
__attribute((...)) can be placed after 'struct'.


So, we can write:


struct __randomize_layout path {
struct vfsmount *mnt;
struct dentry *dentry;
};


  instead of


struct path {
struct vfsmount *mnt;
struct dentry *dentry;
} __randomize_layout;



If we force the former notation,
the undefined __randomize_layout results in a build error
instead of silent broken code generation.


It is true somebody can still place
__randomize_layout after the closing brace,
but can we check this by coccicheck or checkpatch.pl?
(we can describe it in coding style documentation, of course)


IMHO, we should not (ab)use include/linux/kconfig.h
to bring in misc things.


-- 
Best Regards
Masahiro Yamada

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 4:23 PM, Kees Cook  wrote:
> On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds
>  wrote:
>> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook  wrote:
>>>
>>> Do you want me to send the patch for this, or do you already have it
>>> prepared?
>>
>> I'd rather get something explicitly tested. I tried my earlier patch
>> with "make allmodconfig" (and a fix to nfsd to make it compile), but
>> now I'm back to testing hjl's gas updates so it would be better to get
>> a tested commit with a good commit message.
>>
>>> The body-fields I had prepared for the nfs were:
>>>
>>> Reported-by: Patrick McLean 
>>> Reported-by: Maciej S. Szmigiero 
>>
>> Oh, I think Maciej needs to get more than a "Reported-by:". This was a
>> really subtle thing that we didn't figure out in the original thread,
>> so give him a gold star in the form of "Root-caused-by:" or something.
>
> Oops, I just sent this out. I will adjust a re-send. I couldn't find a
> documented field name for this...

With the "root-cause" hint, I see we have used:

2Root-cause-analysis-by:
2Root-caused-by:
1Root-cause-found-by:

I'll go with your "Root-caused-by" to tip the scale. :)

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 4:23 PM, Kees Cook  wrote:
> On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds
>  wrote:
>> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook  wrote:
>>>
>>> Do you want me to send the patch for this, or do you already have it
>>> prepared?
>>
>> I'd rather get something explicitly tested. I tried my earlier patch
>> with "make allmodconfig" (and a fix to nfsd to make it compile), but
>> now I'm back to testing hjl's gas updates so it would be better to get
>> a tested commit with a good commit message.
>>
>>> The body-fields I had prepared for the nfs were:
>>>
>>> Reported-by: Patrick McLean 
>>> Reported-by: Maciej S. Szmigiero 
>>
>> Oh, I think Maciej needs to get more than a "Reported-by:". This was a
>> really subtle thing that we didn't figure out in the original thread,
>> so give him a gold star in the form of "Root-caused-by:" or something.
>
> Oops, I just sent this out. I will adjust a re-send. I couldn't find a
> documented field name for this...

With the "root-cause" hint, I see we have used:

2Root-cause-analysis-by:
2Root-caused-by:
1Root-cause-found-by:

I'll go with your "Root-caused-by" to tip the scale. :)

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds
 wrote:
> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook  wrote:
>>
>> Do you want me to send the patch for this, or do you already have it
>> prepared?
>
> I'd rather get something explicitly tested. I tried my earlier patch
> with "make allmodconfig" (and a fix to nfsd to make it compile), but
> now I'm back to testing hjl's gas updates so it would be better to get
> a tested commit with a good commit message.
>
>> The body-fields I had prepared for the nfs were:
>>
>> Reported-by: Patrick McLean 
>> Reported-by: Maciej S. Szmigiero 
>
> Oh, I think Maciej needs to get more than a "Reported-by:". This was a
> really subtle thing that we didn't figure out in the original thread,
> so give him a gold star in the form of "Root-caused-by:" or something.

Oops, I just sent this out. I will adjust a re-send. I couldn't find a
documented field name for this...

> *Fixing* this ends up being a one-liner or so. Finding the cause was
> the painful part.

Yes indeed!

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 4:22 PM, Linus Torvalds
 wrote:
> On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook  wrote:
>>
>> Do you want me to send the patch for this, or do you already have it
>> prepared?
>
> I'd rather get something explicitly tested. I tried my earlier patch
> with "make allmodconfig" (and a fix to nfsd to make it compile), but
> now I'm back to testing hjl's gas updates so it would be better to get
> a tested commit with a good commit message.
>
>> The body-fields I had prepared for the nfs were:
>>
>> Reported-by: Patrick McLean 
>> Reported-by: Maciej S. Szmigiero 
>
> Oh, I think Maciej needs to get more than a "Reported-by:". This was a
> really subtle thing that we didn't figure out in the original thread,
> so give him a gold star in the form of "Root-caused-by:" or something.

Oops, I just sent this out. I will adjust a re-send. I couldn't find a
documented field name for this...

> *Fixing* this ends up being a one-liner or so. Finding the cause was
> the painful part.

Yes indeed!

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook  wrote:
>
> Do you want me to send the patch for this, or do you already have it
> prepared?

I'd rather get something explicitly tested. I tried my earlier patch
with "make allmodconfig" (and a fix to nfsd to make it compile), but
now I'm back to testing hjl's gas updates so it would be better to get
a tested commit with a good commit message.

> The body-fields I had prepared for the nfs were:
>
> Reported-by: Patrick McLean 
> Reported-by: Maciej S. Szmigiero 

Oh, I think Maciej needs to get more than a "Reported-by:". This was a
really subtle thing that we didn't figure out in the original thread,
so give him a gold star in the form of "Root-caused-by:" or something.

*Fixing* this ends up being a one-liner or so. Finding the cause was
the painful part.

   Linus

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 4:12 PM, Kees Cook  wrote:
>
> Do you want me to send the patch for this, or do you already have it
> prepared?

I'd rather get something explicitly tested. I tried my earlier patch
with "make allmodconfig" (and a fix to nfsd to make it compile), but
now I'm back to testing hjl's gas updates so it would be better to get
a tested commit with a good commit message.

> The body-fields I had prepared for the nfs were:
>
> Reported-by: Patrick McLean 
> Reported-by: Maciej S. Szmigiero 

Oh, I think Maciej needs to get more than a "Reported-by:". This was a
really subtle thing that we didn't figure out in the original thread,
so give him a gold star in the form of "Root-caused-by:" or something.

*Fixing* this ends up being a one-liner or so. Finding the cause was
the painful part.

   Linus

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 3:24 PM, Linus Torvalds
 wrote:
> On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook  wrote:
>>
>> I'll play with Linus's suggestion and see what we get.
>
> It may be just as well to just include  from
>  and be done with it.

Hah, yeah, that would certainly solve it too. :)

> I do hate including unnecessary stuff because it makes builds slower,
> but kernel header files probably don't get much more core than
> .

It also has the benefit of not letting it "go wrong" in the first
place. (And the separate fix for nfs isn't needed...)

Do you want me to send the patch for this, or do you already have it
prepared? The body-fields I had prepared for the nfs were:

Reported-by: Patrick McLean 
Reported-by: Maciej S. Szmigiero 
Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization")

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 3:24 PM, Linus Torvalds
 wrote:
> On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook  wrote:
>>
>> I'll play with Linus's suggestion and see what we get.
>
> It may be just as well to just include  from
>  and be done with it.

Hah, yeah, that would certainly solve it too. :)

> I do hate including unnecessary stuff because it makes builds slower,
> but kernel header files probably don't get much more core than
> .

It also has the benefit of not letting it "go wrong" in the first
place. (And the separate fix for nfs isn't needed...)

Do you want me to send the patch for this, or do you already have it
prepared? The body-fields I had prepared for the nfs were:

Reported-by: Patrick McLean 
Reported-by: Maciej S. Szmigiero 
Fixes: 3859a271a003 ("randstruct: Mark various structs for randomization")

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 2:47 PM, Linus Torvalds
 wrote:
> And other attribute specifiers we encourage people to put in other
> parts of the type, like __user etc, so they don't have that same
> parsing issue.

Looking at other attributes we use on structs, we may have similar
risks for these:

__packed
cacheline_aligned
cacheline_aligned_in_smp
cacheline_internodealigned_in_smp

But they just haven't been used in places that we could trip over it
as badly, AFAICT.

> I guess one _extreme_ fix for this would be to put
>
> extern struct nostruct __randomize_layout;
>
> in our include/linux/kconfig.h, which I think we end up always
> including first thanks to having it on the command line.

We could do that for all the above, but I wonder if the real problem
is our convention of using "regular" names for these kinds of
attributes instead of parameterized names. If we always used something
like:

#define __struct(x)   __attribute__(x)

We'd avoid it, but we'd uglify our struct attributes:

struct thing { ... } __struct(randomize_layout);

though trying this now creates other problems. Hmmm.

(Regardless, let me send the nfs fix separately...)

-Kees

>
> Because if you do that, you actually get an error:
>
> CC [M]  fs/nfsd/nfs4xdr.o
>   In file included from ./include/linux/fs_struct.h:5:0,
>from fs/nfsd/nfs4xdr.c:36:
>   ./include/linux/path.h:11:3: error: conflicting types for 
> ‘__randomize_layout’
>} __randomize_layout;
>  ^~
>   In file included from :0:0:
>   ././include/linux/kconfig.h:8:28: note: previous declaration of
> ‘__randomize_layout’ was here
>extern struct nostruct __randomize_layout;
>   ^~
>   make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1
>
> and we would have figured this out immediately.
>
> Broken example patch appended, in case somebody wants to play with
> something like this or comes up with a better model entirely..
>
>Linus
>
> ---
>
> diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h
> index fec5076eda91..537dacb83380 100644
> --- a/include/linux/kconfig.h
> +++ b/include/linux/kconfig.h
> @@ -4,6 +4,10 @@
>
>  #include 
>
> +#ifndef __ASSEMBLY__
> + extern struct nostruct __randomize_layout;
> +#endif
> +
>  #define __ARG_PLACEHOLDER_1 0,
>  #define __take_second_arg(__ignored, val, ...) val



-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 2:47 PM, Linus Torvalds
 wrote:
> And other attribute specifiers we encourage people to put in other
> parts of the type, like __user etc, so they don't have that same
> parsing issue.

Looking at other attributes we use on structs, we may have similar
risks for these:

__packed
cacheline_aligned
cacheline_aligned_in_smp
cacheline_internodealigned_in_smp

But they just haven't been used in places that we could trip over it
as badly, AFAICT.

> I guess one _extreme_ fix for this would be to put
>
> extern struct nostruct __randomize_layout;
>
> in our include/linux/kconfig.h, which I think we end up always
> including first thanks to having it on the command line.

We could do that for all the above, but I wonder if the real problem
is our convention of using "regular" names for these kinds of
attributes instead of parameterized names. If we always used something
like:

#define __struct(x)   __attribute__(x)

We'd avoid it, but we'd uglify our struct attributes:

struct thing { ... } __struct(randomize_layout);

though trying this now creates other problems. Hmmm.

(Regardless, let me send the nfs fix separately...)

-Kees

>
> Because if you do that, you actually get an error:
>
> CC [M]  fs/nfsd/nfs4xdr.o
>   In file included from ./include/linux/fs_struct.h:5:0,
>from fs/nfsd/nfs4xdr.c:36:
>   ./include/linux/path.h:11:3: error: conflicting types for 
> ‘__randomize_layout’
>} __randomize_layout;
>  ^~
>   In file included from :0:0:
>   ././include/linux/kconfig.h:8:28: note: previous declaration of
> ‘__randomize_layout’ was here
>extern struct nostruct __randomize_layout;
>   ^~
>   make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1
>
> and we would have figured this out immediately.
>
> Broken example patch appended, in case somebody wants to play with
> something like this or comes up with a better model entirely..
>
>Linus
>
> ---
>
> diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h
> index fec5076eda91..537dacb83380 100644
> --- a/include/linux/kconfig.h
> +++ b/include/linux/kconfig.h
> @@ -4,6 +4,10 @@
>
>  #include 
>
> +#ifndef __ASSEMBLY__
> + extern struct nostruct __randomize_layout;
> +#endif
> +
>  #define __ARG_PLACEHOLDER_1 0,
>  #define __take_second_arg(__ignored, val, ...) val



-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook  wrote:
>
> I'll play with Linus's suggestion and see what we get.

It may be just as well to just include  from
 and be done with it.

If you look at that hacky script I documented in commit 23c35f48f5fb
("pinctrl: remove include file from ") and run it in a
fully built kernel tree, you'll see that that header is included from
pretty much every single file anyway. At least for me, for an
allmodconfig build, the top headers are

  23322 arch/x86/include/uapi/asm/types.h
  23322 include/asm-generic/int-ll64.h
  23322 include/linux/types.h
  23322 include/uapi/asm-generic/int-ll64.h
  23322 include/uapi/asm-generic/types.h
  23322 include/uapi/linux/types.h
  23323 arch/x86/include/uapi/asm/bitsperlong.h
  23323 include/asm-generic/bitsperlong.h
  23323 include/uapi/asm-generic/bitsperlong.h
  23326 include/linux/stringify.h
  23390 include/linux/compiler_types.h

and considering that I have 25949 object files in that tree, it really
means that just about every compile ended up including that
 file anyway (yeah, the "orc_types.h" header
ends up being mentioned twice for most files, so it looks even more
hot, but that's not real data).

I do hate including unnecessary stuff because it makes builds slower,
but kernel header files probably don't get much more core than
.

  Linus

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 2:52 PM, Kees Cook  wrote:
>
> I'll play with Linus's suggestion and see what we get.

It may be just as well to just include  from
 and be done with it.

If you look at that hacky script I documented in commit 23c35f48f5fb
("pinctrl: remove include file from ") and run it in a
fully built kernel tree, you'll see that that header is included from
pretty much every single file anyway. At least for me, for an
allmodconfig build, the top headers are

  23322 arch/x86/include/uapi/asm/types.h
  23322 include/asm-generic/int-ll64.h
  23322 include/linux/types.h
  23322 include/uapi/asm-generic/int-ll64.h
  23322 include/uapi/asm-generic/types.h
  23322 include/uapi/linux/types.h
  23323 arch/x86/include/uapi/asm/bitsperlong.h
  23323 include/asm-generic/bitsperlong.h
  23323 include/uapi/asm-generic/bitsperlong.h
  23326 include/linux/stringify.h
  23390 include/linux/compiler_types.h

and considering that I have 25949 object files in that tree, it really
means that just about every compile ended up including that
 file anyway (yeah, the "orc_types.h" header
ends up being mentioned twice for most files, so it looks even more
hot, but that's not real data).

I do hate including unnecessary stuff because it makes builds slower,
but kernel header files probably don't get much more core than
.

  Linus

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
 wrote:
> One can see that offsets used to access various members of struct path are
> different, and also that the original file from step 3 contains an object
> named "__randomize_layout".
>
> This is caused by a fact that the current version of nfs4xdr.c includes
> linux/fs_struct.h as the very first included header which then includes
> linux/path.h as the very first included header, which then defines
> struct path, but without including any files on its own.
>
> This results in __randomize_layout tag at the end of struct path
> definition being treated as a variable name (since linux/compiler-gcc.h
> that defines it as a type attribute has not been included yet).

Oh, well done! That would explain the code offset I was seeing when
the plugin on, but no-op, since the variable would still exist.

I'll play with Linus's suggestion and see what we get.

Thanks!

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Kees Cook

On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
 wrote:
> One can see that offsets used to access various members of struct path are
> different, and also that the original file from step 3 contains an object
> named "__randomize_layout".
>
> This is caused by a fact that the current version of nfs4xdr.c includes
> linux/fs_struct.h as the very first included header which then includes
> linux/path.h as the very first included header, which then defines
> struct path, but without including any files on its own.
>
> This results in __randomize_layout tag at the end of struct path
> definition being treated as a variable name (since linux/compiler-gcc.h
> that defines it as a type attribute has not been included yet).

Oh, well done! That would explain the code offset I was seeing when
the plugin on, but no-op, since the variable would still exist.

I'll play with Linus's suggestion and see what we get.

Thanks!

-Kees

-- 
Kees Cook
Pixel Security

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
 wrote:
>
> One can see that offsets used to access various members of struct path are
> different, and also that the original file from step 3 contains an object
> named "__randomize_layout".

Whee.

Thanks for root-causing this issue, and this syntax of ours is clearly
*much* too fragile.

We actually have similar issues with some of our other attributes,
where out nice "helpful" attribute shorthand can end up being just
silently interpreted as a variable name if they aren't defined in
time.

For most of our other attributes, it just doesn't matter all that much
if some user doesn't happen to see the attribute. For
__randomize_layout, it's obviously very fatal, and silently just
generates crazy code.

I'm not entirely sure what the right solution is, because it's
obviously much too easy to miss some #include by mistake. It's easy to
say "you should always include the proper header", but if a failure to
do so doesn't end up with any warnings or errors, but just silent bad
code generation, it's much too fragile.

I wonder if we could change the syntax of that "__randomize_layout"
thing. Some of our related helper macros (ie
randomized_struct_fields_start/end) don't have the same problem,
because if you don't have the define for them, the compiler will
complain about bad syntax.

And other attribute specifiers we encourage people to put in other
parts of the type, like __user etc, so they don't have that same
parsing issue.

I guess one _extreme_ fix for this would be to put

extern struct nostruct __randomize_layout;

in our include/linux/kconfig.h, which I think we end up always
including first thanks to having it on the command line.

Because if you do that, you actually get an error:

CC [M]  fs/nfsd/nfs4xdr.o
  In file included from ./include/linux/fs_struct.h:5:0,
   from fs/nfsd/nfs4xdr.c:36:
  ./include/linux/path.h:11:3: error: conflicting types for ‘__randomize_layout’
   } __randomize_layout;
 ^~
  In file included from :0:0:
  ././include/linux/kconfig.h:8:28: note: previous declaration of
‘__randomize_layout’ was here
   extern struct nostruct __randomize_layout;
  ^~
  make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1

and we would have figured this out immediately.

Broken example patch appended, in case somebody wants to play with
something like this or comes up with a better model entirely..

   Linus

---

diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h
index fec5076eda91..537dacb83380 100644
--- a/include/linux/kconfig.h
+++ b/include/linux/kconfig.h
@@ -4,6 +4,10 @@

 #include 

+#ifndef __ASSEMBLY__
+ extern struct nostruct __randomize_layout;
+#endif
+
 #define __ARG_PLACEHOLDER_1 0,
 #define __take_second_arg(__ignored, val, ...) val

Re: RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 2:19 PM, Maciej S. Szmigiero
 wrote:
>
> One can see that offsets used to access various members of struct path are
> different, and also that the original file from step 3 contains an object
> named "__randomize_layout".

Whee.

Thanks for root-causing this issue, and this syntax of ours is clearly
*much* too fragile.

We actually have similar issues with some of our other attributes,
where out nice "helpful" attribute shorthand can end up being just
silently interpreted as a variable name if they aren't defined in
time.

For most of our other attributes, it just doesn't matter all that much
if some user doesn't happen to see the attribute. For
__randomize_layout, it's obviously very fatal, and silently just
generates crazy code.

I'm not entirely sure what the right solution is, because it's
obviously much too easy to miss some #include by mistake. It's easy to
say "you should always include the proper header", but if a failure to
do so doesn't end up with any warnings or errors, but just silent bad
code generation, it's much too fragile.

I wonder if we could change the syntax of that "__randomize_layout"
thing. Some of our related helper macros (ie
randomized_struct_fields_start/end) don't have the same problem,
because if you don't have the define for them, the compiler will
complain about bad syntax.

And other attribute specifiers we encourage people to put in other
parts of the type, like __user etc, so they don't have that same
parsing issue.

I guess one _extreme_ fix for this would be to put

extern struct nostruct __randomize_layout;

in our include/linux/kconfig.h, which I think we end up always
including first thanks to having it on the command line.

Because if you do that, you actually get an error:

CC [M]  fs/nfsd/nfs4xdr.o
  In file included from ./include/linux/fs_struct.h:5:0,
   from fs/nfsd/nfs4xdr.c:36:
  ./include/linux/path.h:11:3: error: conflicting types for ‘__randomize_layout’
   } __randomize_layout;
 ^~
  In file included from :0:0:
  ././include/linux/kconfig.h:8:28: note: previous declaration of
‘__randomize_layout’ was here
   extern struct nostruct __randomize_layout;
  ^~
  make[1]: *** [scripts/Makefile.build:317: fs/nfsd/nfs4xdr.o] Error 1

and we would have figured this out immediately.

Broken example patch appended, in case somebody wants to play with
something like this or comes up with a better model entirely..

   Linus

---

diff --git a/include/linux/kconfig.h b/include/linux/kconfig.h
index fec5076eda91..537dacb83380 100644
--- a/include/linux/kconfig.h
+++ b/include/linux/kconfig.h
@@ -4,6 +4,10 @@

 #include 

+#ifndef __ASSEMBLY__
+ extern struct nostruct __randomize_layout;
+#endif
+
 #define __ARG_PLACEHOLDER_1 0,
 #define __take_second_arg(__ignored, val, ...) val

RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Maciej S. Szmigiero

On 18.11.2017 06:14, Kees Cook wrote:
> On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean  wrote:
>> On 2017-11-17 04:55 PM, Linus Torvalds wrote:
>>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:

 I am still getting the crash at d9e12200852d, I figured I would
 double-check the "good" and "bad" kernels before starting a full bisect.
>>>
>>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?
>>
>> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.
> 
> That's strange. With d9e12200852d the shuffle_seed variables won't
> ever actually get used. (i.e. I wouldn't expect the seed to change any
> behavior.)
> 
> Can you confirm with something like this:
> 
> 
> for (i = 0; i < 4; i++) {
> seed[i] = shuffle_seed[i];
> 
> 
> You should see no reports of "Shuffling struct ..."
> 
> And if it reports nothing, and you're on d9e12200852d, can you confirm
> that switching to a "good" seed fixes it? (If it _does_, then I
> suspect a build artifact being left behind or something odd like
> that.)
> 
>>> Kees removed even the baseline "randomize pure function pointer
>>> structures", so at that commit, nothing should be randomized.
>>>
>>> But maybe the plugin code itself ends up confusing gcc somehow?
>>>
>>> Even when it doesn't actually do that "relayout_struct()" on the
>>> structure, it always does those TYPE_ATTRIBUTES() games.
> 
> FWIW, myself doing a build at d9e12200852d with and without
> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
> where I did spot-checks.
> 

I have also hit a GPF in nfsd4_encode_fattr() with RANDSTRUCT plugin
enabled.
This function is located in a fs/nfsd/nfs4xdr.c file.

The fault happened at "xdr_encode_hyper(p, 
exp->ex_path.mnt->mnt_sb->s_maxbytes)"
line, namely when accessing s_maxbytes.

exp->ex_path is of type struct path that has been annotated with
__randomize_layout.
It seems to me that this annotation isn't really taken into consideration
when compiling nfs4xdr.c.
This most likely results in dereferencing a value of exp->ex_path.dentry
instead of exp->ex_path.mnt. Then some member of struct dentry is
dereferenced as struct super_block to access its s_maxbytes member which
results in an oops if it happens to be an invalid pointer (which it was
in my case).

How to reproduce the problem statically (tested on current Linus's tree
and on 4.15.4, with gcc 7.3.0):
1) Enable RANDSTRUCT plugin,

2) Use a RANDSTRUCT seed that results in shuffling of struct path,
Example: "55e5fea7ff662b333a190209ab31f35b6f0f2470f7d0e3c64430936169571106".

3) make fs/nfsd/nfs4xdr.s and save the result,

4) Insert "#include " at the top of
fs/nfsd/nfs4xdr.c as the very first include directive.

5) make fs/nfsd/nfs4xdr.s and compare the result with the one from step 3.

One can see that offsets used to access various members of struct path are
different, and also that the original file from step 3 contains an object
named "__randomize_layout".

This is caused by a fact that the current version of nfs4xdr.c includes
linux/fs_struct.h as the very first included header which then includes
linux/path.h as the very first included header, which then defines
struct path, but without including any files on its own.

This results in __randomize_layout tag at the end of struct path
definition being treated as a variable name (since linux/compiler-gcc.h
that defines it as a type attribute has not been included yet).

It looks like to me that every header file that defines a randomized
struct also has to include linux/compiler_types.h or some other file
that ultimately results in that file inclusion in order to make
the RANDSTRUCT plugin work correctly.

Maciej

RANDSTRUCT structs need linux/compiler_types.h (Was: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11)

2018-02-21 Thread Maciej S. Szmigiero

On 18.11.2017 06:14, Kees Cook wrote:
> On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean  wrote:
>> On 2017-11-17 04:55 PM, Linus Torvalds wrote:
>>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:

 I am still getting the crash at d9e12200852d, I figured I would
 double-check the "good" and "bad" kernels before starting a full bisect.
>>>
>>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?
>>
>> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.
> 
> That's strange. With d9e12200852d the shuffle_seed variables won't
> ever actually get used. (i.e. I wouldn't expect the seed to change any
> behavior.)
> 
> Can you confirm with something like this:
> 
> 
> for (i = 0; i < 4; i++) {
> seed[i] = shuffle_seed[i];
> 
> 
> You should see no reports of "Shuffling struct ..."
> 
> And if it reports nothing, and you're on d9e12200852d, can you confirm
> that switching to a "good" seed fixes it? (If it _does_, then I
> suspect a build artifact being left behind or something odd like
> that.)
> 
>>> Kees removed even the baseline "randomize pure function pointer
>>> structures", so at that commit, nothing should be randomized.
>>>
>>> But maybe the plugin code itself ends up confusing gcc somehow?
>>>
>>> Even when it doesn't actually do that "relayout_struct()" on the
>>> structure, it always does those TYPE_ATTRIBUTES() games.
> 
> FWIW, myself doing a build at d9e12200852d with and without
> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
> where I did spot-checks.
> 

I have also hit a GPF in nfsd4_encode_fattr() with RANDSTRUCT plugin
enabled.
This function is located in a fs/nfsd/nfs4xdr.c file.

The fault happened at "xdr_encode_hyper(p, 
exp->ex_path.mnt->mnt_sb->s_maxbytes)"
line, namely when accessing s_maxbytes.

exp->ex_path is of type struct path that has been annotated with
__randomize_layout.
It seems to me that this annotation isn't really taken into consideration
when compiling nfs4xdr.c.
This most likely results in dereferencing a value of exp->ex_path.dentry
instead of exp->ex_path.mnt. Then some member of struct dentry is
dereferenced as struct super_block to access its s_maxbytes member which
results in an oops if it happens to be an invalid pointer (which it was
in my case).

How to reproduce the problem statically (tested on current Linus's tree
and on 4.15.4, with gcc 7.3.0):
1) Enable RANDSTRUCT plugin,

2) Use a RANDSTRUCT seed that results in shuffling of struct path,
Example: "55e5fea7ff662b333a190209ab31f35b6f0f2470f7d0e3c64430936169571106".

3) make fs/nfsd/nfs4xdr.s and save the result,

4) Insert "#include " at the top of
fs/nfsd/nfs4xdr.c as the very first include directive.

5) make fs/nfsd/nfs4xdr.s and compare the result with the one from step 3.

One can see that offsets used to access various members of struct path are
different, and also that the original file from step 3 contains an object
named "__randomize_layout".

This is caused by a fact that the current version of nfs4xdr.c includes
linux/fs_struct.h as the very first included header which then includes
linux/path.h as the very first included header, which then defines
struct path, but without including any files on its own.

This results in __randomize_layout tag at the end of struct path
definition being treated as a variable name (since linux/compiler-gcc.h
that defines it as a type attribute has not been included yet).

It looks like to me that every header file that defines a randomized
struct also has to include linux/compiler_types.h or some other file
that ultimately results in that file inclusion in order to make
the RANDSTRUCT plugin work correctly.

Maciej

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-18 Thread Kees Cook

On Fri, Nov 17, 2017 at 9:29 PM, Linus Torvalds
 wrote:
> On Fri, Nov 17, 2017 at 9:14 PM, Kees Cook  wrote:
>>
>> FWIW, myself doing a build at d9e12200852d with and without
>> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
>> where I did spot-checks.
>
> That would probably be a good thing to check anyway - check the
> difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit.
>
> Just do
>
>objdump --disassemble vmlinux > file
>
> and compare the two files for where the differences start occurring.

Yeah, I was just doing that now. Looks like there _is_ something
getting changed just from having the plugin enabled, but it appears
localized. For me, the first non-offset change happens in
lookup_user_key and persists for a while.

-813893a7:  0f 85 55 03 00 00   jne
81389702 
-813893ad:  f0 41 ff 06 lock incl (%r14)
-813893b1:  83 fb 07cmp$0x7,%ebx
-813893b4:  4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp)
...
+813893a7:  0f 85 35 03 00 00   jne
813896e2 
+813893ad:  4d 89 f0mov%r14,%r8
+813893b0:  f0 41 ff 06 lock incl (%r14)
+813893b4:  83 fb 07cmp$0x7,%ebx
+813893b7:  4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp)

And removing the TYPE_ATTRIBUTES() poking makes the register storage
differences go away, but there's still a 0x40 byte offset delta.

I'll continue looking at this tomorrow.

-Kees

-- 
Kees Cook
Pixel Security

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-18 Thread Kees Cook

On Fri, Nov 17, 2017 at 9:29 PM, Linus Torvalds
 wrote:
> On Fri, Nov 17, 2017 at 9:14 PM, Kees Cook  wrote:
>>
>> FWIW, myself doing a build at d9e12200852d with and without
>> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
>> where I did spot-checks.
>
> That would probably be a good thing to check anyway - check the
> difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit.
>
> Just do
>
>objdump --disassemble vmlinux > file
>
> and compare the two files for where the differences start occurring.

Yeah, I was just doing that now. Looks like there _is_ something
getting changed just from having the plugin enabled, but it appears
localized. For me, the first non-offset change happens in
lookup_user_key and persists for a while.

-813893a7:  0f 85 55 03 00 00   jne
81389702 
-813893ad:  f0 41 ff 06 lock incl (%r14)
-813893b1:  83 fb 07cmp$0x7,%ebx
-813893b4:  4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp)
...
+813893a7:  0f 85 35 03 00 00   jne
813896e2 
+813893ad:  4d 89 f0mov%r14,%r8
+813893b0:  f0 41 ff 06 lock incl (%r14)
+813893b4:  83 fb 07cmp$0x7,%ebx
+813893b7:  4c 89 b5 70 ff ff ffmov%r14,-0x90(%rbp)

And removing the TYPE_ATTRIBUTES() poking makes the register storage
differences go away, but there's still a 0x40 byte offset delta.

I'll continue looking at this tomorrow.

-Kees

-- 
Kees Cook
Pixel Security

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Linus Torvalds

On Fri, Nov 17, 2017 at 9:14 PM, Kees Cook  wrote:
>
> FWIW, myself doing a build at d9e12200852d with and without
> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
> where I did spot-checks.

That would probably be a good thing to check anyway - check the
difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit.

Just do

   objdump --disassemble vmlinux > file

and compare the two files for where the differences start occurring.

 Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Linus Torvalds

On Fri, Nov 17, 2017 at 9:14 PM, Kees Cook  wrote:
>
> FWIW, myself doing a build at d9e12200852d with and without
> GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
> where I did spot-checks.

That would probably be a good thing to check anyway - check the
difference between GCC_PLUGIN_RANDSTRUCT on and off at that commit.

Just do

   objdump --disassemble vmlinux > file

and compare the two files for where the differences start occurring.

 Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Kees Cook

On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean  wrote:
> On 2017-11-17 04:55 PM, Linus Torvalds wrote:
>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:
>>>
>>> I am still getting the crash at d9e12200852d, I figured I would
>>> double-check the "good" and "bad" kernels before starting a full bisect.
>>
>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?
>
> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.

That's strange. With d9e12200852d the shuffle_seed variables won't
ever actually get used. (i.e. I wouldn't expect the seed to change any
behavior.)

Can you confirm with something like this:


diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c
b/scripts/gcc-plugins/randomize_layout_plugin.c
index cdaac8c66734..aac570a57d7d 100644
--- a/scripts/gcc-plugins/randomize_layout_plugin.c
+++ b/scripts/gcc-plugins/randomize_layout_plugin.c
@@ -267,12 +267,10 @@ static void shuffle(const_tree type, tree
*newtree, unsigned long length)

structname = ORIG_TYPE_NAME(type);

-#ifdef __DEBUG_PLUGIN
fprintf(stderr, "Shuffling struct %s %p\n", (const char
*)structname, type);
 #ifdef __DEBUG_VERBOSE
debug_tree((tree)type);
 #endif
-#endif

for (i = 0; i < 4; i++) {
seed[i] = shuffle_seed[i];


You should see no reports of "Shuffling struct ..."

And if it reports nothing, and you're on d9e12200852d, can you confirm
that switching to a "good" seed fixes it? (If it _does_, then I
suspect a build artifact being left behind or something odd like
that.)

>> Kees removed even the baseline "randomize pure function pointer
>> structures", so at that commit, nothing should be randomized.
>>
>> But maybe the plugin code itself ends up confusing gcc somehow?
>>
>> Even when it doesn't actually do that "relayout_struct()" on the
>> structure, it always does those TYPE_ATTRIBUTES() games.

FWIW, myself doing a build at d9e12200852d with and without
GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
where I did spot-checks.

Also, do you have any other plugins enabled? (Can you send your .config?)

-Kees

-- 
Kees Cook
Pixel Security

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Kees Cook

On Fri, Nov 17, 2017 at 5:54 PM, Patrick McLean  wrote:
> On 2017-11-17 04:55 PM, Linus Torvalds wrote:
>> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:
>>>
>>> I am still getting the crash at d9e12200852d, I figured I would
>>> double-check the "good" and "bad" kernels before starting a full bisect.
>>
>> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?
>
> Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.

That's strange. With d9e12200852d the shuffle_seed variables won't
ever actually get used. (i.e. I wouldn't expect the seed to change any
behavior.)

Can you confirm with something like this:


diff --git a/scripts/gcc-plugins/randomize_layout_plugin.c
b/scripts/gcc-plugins/randomize_layout_plugin.c
index cdaac8c66734..aac570a57d7d 100644
--- a/scripts/gcc-plugins/randomize_layout_plugin.c
+++ b/scripts/gcc-plugins/randomize_layout_plugin.c
@@ -267,12 +267,10 @@ static void shuffle(const_tree type, tree
*newtree, unsigned long length)

structname = ORIG_TYPE_NAME(type);

-#ifdef __DEBUG_PLUGIN
fprintf(stderr, "Shuffling struct %s %p\n", (const char
*)structname, type);
 #ifdef __DEBUG_VERBOSE
debug_tree((tree)type);
 #endif
-#endif

for (i = 0; i < 4; i++) {
seed[i] = shuffle_seed[i];


You should see no reports of "Shuffling struct ..."

And if it reports nothing, and you're on d9e12200852d, can you confirm
that switching to a "good" seed fixes it? (If it _does_, then I
suspect a build artifact being left behind or something odd like
that.)

>> Kees removed even the baseline "randomize pure function pointer
>> structures", so at that commit, nothing should be randomized.
>>
>> But maybe the plugin code itself ends up confusing gcc somehow?
>>
>> Even when it doesn't actually do that "relayout_struct()" on the
>> structure, it always does those TYPE_ATTRIBUTES() games.

FWIW, myself doing a build at d9e12200852d with and without
GCC_PLUGIN_RANDSTRUCT _appears_ to produce identical objdump output
where I did spot-checks.

Also, do you have any other plugins enabled? (Can you send your .config?)

-Kees

-- 
Kees Cook
Pixel Security

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Patrick McLean



On 2017-11-17 04:55 PM, Linus Torvalds wrote:
> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:
>>
>> I am still getting the crash at d9e12200852d, I figured I would
>> double-check the "good" and "bad" kernels before starting a full bisect.
> 
> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?

Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.

> Kees removed even the baseline "randomize pure function pointer
> structures", so at that commit, nothing should be randomized.
> 
> But maybe the plugin code itself ends up confusing gcc somehow?
> 
> Even when it doesn't actually do that "relayout_struct()" on the
> structure, it always does those TYPE_ATTRIBUTES() games.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Patrick McLean



On 2017-11-17 04:55 PM, Linus Torvalds wrote:
> On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:
>>
>> I am still getting the crash at d9e12200852d, I figured I would
>> double-check the "good" and "bad" kernels before starting a full bisect.
> 
> .. but without GCC_PLUGIN_RANDSTRUCT it's solid?

Yes, without GCC_PLUGIN_RANDSTRUCT it's solid.

> Kees removed even the baseline "randomize pure function pointer
> structures", so at that commit, nothing should be randomized.
> 
> But maybe the plugin code itself ends up confusing gcc somehow?
> 
> Even when it doesn't actually do that "relayout_struct()" on the
> structure, it always does those TYPE_ATTRIBUTES() games.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Linus Torvalds

On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:
>
> I am still getting the crash at d9e12200852d, I figured I would
> double-check the "good" and "bad" kernels before starting a full bisect.

.. but without GCC_PLUGIN_RANDSTRUCT it's solid?

Kees removed even the baseline "randomize pure function pointer
structures", so at that commit, nothing should be randomized.

But maybe the plugin code itself ends up confusing gcc somehow?

Even when it doesn't actually do that "relayout_struct()" on the
structure, it always does those TYPE_ATTRIBUTES() games.

   Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Linus Torvalds

On Fri, Nov 17, 2017 at 4:27 PM, Patrick McLean  wrote:
>
> I am still getting the crash at d9e12200852d, I figured I would
> double-check the "good" and "bad" kernels before starting a full bisect.

.. but without GCC_PLUGIN_RANDSTRUCT it's solid?

Kees removed even the baseline "randomize pure function pointer
structures", so at that commit, nothing should be randomized.

But maybe the plugin code itself ends up confusing gcc somehow?

Even when it doesn't actually do that "relayout_struct()" on the
structure, it always does those TYPE_ATTRIBUTES() games.

   Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Patrick McLean

On 2017-11-17 01:26 PM, Kees Cook wrote:
> On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLean  wrote:
>> On 2017-11-16 04:54 PM, Kees Cook wrote:
>>> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
 On 2017-11-11 09:31 AM, Linus Torvalds wrote:
> Boris Lukashev points out that Patrick should probably check a newer
> version of gcc.
>
> I looked around, and in one of the emails, Patrick said:
>
>   "No changes, both the working and broken kernels were built with
>distro-provided gcc 5.4.0 and binutils 2.28.1"
>
> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
> it's a bug-fix release to a pretty old branch that is not exactly new.
>
> It would probably be good to check if the problems persist with gcc
> 6.x or 7.x.. I have no idea which gcc version the randstruct people
> tend to use themselves.

 I just tested it with gcc 7.2, and was able to reproduce the NULL
 pointer dereference, the backtrace looks slightly different this time.

 I will also test with binutils 2.29, though I doubt that will make any
 difference.

> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
> 0560
> [   56.166563] IP: vfs_statfs+0x7c/0xc0
> [   56.167249] PGD 0 P4D 0
> [   56.167860] Oops:  [#1] SMP
> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
> xt_multiport xt_addrtype iptable_mangle iptable>
> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
> 4.14.0-git-kratos-1 #1
> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
> 0020
> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
> 
> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
> 00ff
> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
> 
> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
> c90002c1bbf0
> [   56.190444] FS:  () GS:88041fc0() 
> knlGS:
> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
> 001606f0
> [   56.193898] Call Trace:
> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
> [   56.195267]  ? generic_permission+0x12c/0x1a0
> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
> [   56.198268]  nfsd_dispatch+0xe8/0x220
> [   56.198968]  svc_process_common+0x475/0x640
> [   56.199696]  ? nfsd_destroy+0x60/0x60
> [   56.200404]  svc_process+0xf2/0x1a0
> [   56.201079]  nfsd+0xe3/0x150
> [   56.201706]  kthread+0x117/0x130
> [   56.202354]  ? kthread_create_on_node+0x40/0x40
> [   56.203100]  ret_from_fork+0x25/0x30
> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 
> 81 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
> [   56.207110] CR2: 0560
> [   56.207763] ---[ end trace d452986a80f64aaa ]---

> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>>
>> I'll take a closer look at this and see if I can provide something to
>> narrow it down.
>>>
>>> How reliable is this crash? The best idea I have to isolate it would
>>> be to bisect the additions of the __randomize_layout markings on
>>> various structures. I would start with the ones Al is most upset to
>>> see randomized. ;)
>>
>> It's pretty reliable, once I get a bad seed I can reproduce the crash
>> pretty quickly.
>>
>>> For the first step, I'd try a revert of
>>> 9225331b310821760f39ba55b00b8973602adbb5, which enables a large
>>> portion of struct randomization. If that doesn't change things, I can
>>> provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9
>>> and then re-applies __randomize_layout one structure per patch, and
>>> you could bisect that?
>>
>> Sure, I can bisect that.
> 
> Okay, that should at least let us know if this is a specific struct
> that is not expecting to get randomized, or if there is some deeper
> flaw. Here's the tree, based on 4.14:
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/randstruct/bisection
> 
> With commit d9e12200852d, all randomization selections are reverted. I
> would expect this to be a "good" kernel for the bisect.

I am still

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Patrick McLean

On 2017-11-17 01:26 PM, Kees Cook wrote:
> On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLean  wrote:
>> On 2017-11-16 04:54 PM, Kees Cook wrote:
>>> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
 On 2017-11-11 09:31 AM, Linus Torvalds wrote:
> Boris Lukashev points out that Patrick should probably check a newer
> version of gcc.
>
> I looked around, and in one of the emails, Patrick said:
>
>   "No changes, both the working and broken kernels were built with
>distro-provided gcc 5.4.0 and binutils 2.28.1"
>
> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
> it's a bug-fix release to a pretty old branch that is not exactly new.
>
> It would probably be good to check if the problems persist with gcc
> 6.x or 7.x.. I have no idea which gcc version the randstruct people
> tend to use themselves.

 I just tested it with gcc 7.2, and was able to reproduce the NULL
 pointer dereference, the backtrace looks slightly different this time.

 I will also test with binutils 2.29, though I doubt that will make any
 difference.

> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
> 0560
> [   56.166563] IP: vfs_statfs+0x7c/0xc0
> [   56.167249] PGD 0 P4D 0
> [   56.167860] Oops:  [#1] SMP
> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
> xt_multiport xt_addrtype iptable_mangle iptable>
> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
> 4.14.0-git-kratos-1 #1
> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
> 0020
> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
> 
> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
> 00ff
> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
> 
> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
> c90002c1bbf0
> [   56.190444] FS:  () GS:88041fc0() 
> knlGS:
> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
> 001606f0
> [   56.193898] Call Trace:
> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
> [   56.195267]  ? generic_permission+0x12c/0x1a0
> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
> [   56.198268]  nfsd_dispatch+0xe8/0x220
> [   56.198968]  svc_process_common+0x475/0x640
> [   56.199696]  ? nfsd_destroy+0x60/0x60
> [   56.200404]  svc_process+0xf2/0x1a0
> [   56.201079]  nfsd+0xe3/0x150
> [   56.201706]  kthread+0x117/0x130
> [   56.202354]  ? kthread_create_on_node+0x40/0x40
> [   56.203100]  ret_from_fork+0x25/0x30
> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 
> 81 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
> [   56.207110] CR2: 0560
> [   56.207763] ---[ end trace d452986a80f64aaa ]---

> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>>
>> I'll take a closer look at this and see if I can provide something to
>> narrow it down.
>>>
>>> How reliable is this crash? The best idea I have to isolate it would
>>> be to bisect the additions of the __randomize_layout markings on
>>> various structures. I would start with the ones Al is most upset to
>>> see randomized. ;)
>>
>> It's pretty reliable, once I get a bad seed I can reproduce the crash
>> pretty quickly.
>>
>>> For the first step, I'd try a revert of
>>> 9225331b310821760f39ba55b00b8973602adbb5, which enables a large
>>> portion of struct randomization. If that doesn't change things, I can
>>> provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9
>>> and then re-applies __randomize_layout one structure per patch, and
>>> you could bisect that?
>>
>> Sure, I can bisect that.
> 
> Okay, that should at least let us know if this is a specific struct
> that is not expecting to get randomized, or if there is some deeper
> flaw. Here's the tree, based on 4.14:
> https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/log/?h=kspp/randstruct/bisection
> 
> With commit d9e12200852d, all randomization selections are reverted. I
> would expect this to be a "good" kernel for the bisect.

I am still getting the crash at d9e12200852d, I figured I would
double-check the

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Kees Cook

On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLean  wrote:
> On 2017-11-16 04:54 PM, Kees Cook wrote:
>> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
>>> On 2017-11-11 09:31 AM, Linus Torvalds wrote:
 Boris Lukashev points out that Patrick should probably check a newer
 version of gcc.

 I looked around, and in one of the emails, Patrick said:

   "No changes, both the working and broken kernels were built with
distro-provided gcc 5.4.0 and binutils 2.28.1"

 and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
 it's a bug-fix release to a pretty old branch that is not exactly new.

 It would probably be good to check if the problems persist with gcc
 6.x or 7.x.. I have no idea which gcc version the randstruct people
 tend to use themselves.
>>>
>>> I just tested it with gcc 7.2, and was able to reproduce the NULL
>>> pointer dereference, the backtrace looks slightly different this time.
>>>
>>> I will also test with binutils 2.29, though I doubt that will make any
>>> difference.
>>>
 [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
 0560
 [   56.166563] IP: vfs_statfs+0x7c/0xc0
 [   56.167249] PGD 0 P4D 0
 [   56.167860] Oops:  [#1] SMP
 [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
 xt_multiport xt_addrtype iptable_mangle iptable>
 [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
 4.14.0-git-kratos-1 #1
 [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
 [   56.182729] task: 88040c412a00 task.stack: c90002c18000
 [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
 [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
 [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
 0020
 [   56.186085] RDX: 1801 RSI: 1801 RDI: 
 
 [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
 00ff
 [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
 
 [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
 c90002c1bbf0
 [   56.190444] FS:  () GS:88041fc0() 
 knlGS:
 [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
 [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
 001606f0
 [   56.193898] Call Trace:
 [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
 [   56.195267]  ? generic_permission+0x12c/0x1a0
 [   56.196025]  nfsd4_encode_getattr+0x25/0x30
 [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
 [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
 [   56.198268]  nfsd_dispatch+0xe8/0x220
 [   56.198968]  svc_process_common+0x475/0x640
 [   56.199696]  ? nfsd_destroy+0x60/0x60
 [   56.200404]  svc_process+0xf2/0x1a0
 [   56.201079]  nfsd+0xe3/0x150
 [   56.201706]  kthread+0x117/0x130
 [   56.202354]  ? kthread_create_on_node+0x40/0x40
 [   56.203100]  ret_from_fork+0x25/0x30
 [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 
 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
 [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
 [   56.207110] CR2: 0560
 [   56.207763] ---[ end trace d452986a80f64aaa ]---
>>>
 On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>
> I'll take a closer look at this and see if I can provide something to
> narrow it down.
>>
>> How reliable is this crash? The best idea I have to isolate it would
>> be to bisect the additions of the __randomize_layout markings on
>> various structures. I would start with the ones Al is most upset to
>> see randomized. ;)
>
> It's pretty reliable, once I get a bad seed I can reproduce the crash
> pretty quickly.
>
>>
>> All that said, I'd like to better understand the BIOS side of this a
>> little better. In the first email in this thread, you showed two BUGs
>> separated by a little time, which implies to me that the NULL deref
>> and the BIOS no longer POSTing are separate (though seemingly related)
>> issues. Have you had machines survive the BUG without blowing up the
>> BIOS?
>
> We had 3 machines die due to the BIOS issue (all of them pretty quickly
> with the bad-seed kernel). All the dead machines had the same
> motherboard model. I have not managed to reproduce the issue again on
> the machine I restored via the IPMI interface, I suspect that it may be
> a bug in the BIOS that was fixed in a more recent version.
>
>>
>> I'm still trying to wrap my head around how the BIOS could be blowing
>> up. I assume there's some magic memory address that is getting poked
>> as a result of some struct randomization bug, so tracking that down
>>

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Kees Cook

On Fri, Nov 17, 2017 at 11:03 AM, Patrick McLean  wrote:
> On 2017-11-16 04:54 PM, Kees Cook wrote:
>> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
>>> On 2017-11-11 09:31 AM, Linus Torvalds wrote:
 Boris Lukashev points out that Patrick should probably check a newer
 version of gcc.

 I looked around, and in one of the emails, Patrick said:

   "No changes, both the working and broken kernels were built with
distro-provided gcc 5.4.0 and binutils 2.28.1"

 and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
 it's a bug-fix release to a pretty old branch that is not exactly new.

 It would probably be good to check if the problems persist with gcc
 6.x or 7.x.. I have no idea which gcc version the randstruct people
 tend to use themselves.
>>>
>>> I just tested it with gcc 7.2, and was able to reproduce the NULL
>>> pointer dereference, the backtrace looks slightly different this time.
>>>
>>> I will also test with binutils 2.29, though I doubt that will make any
>>> difference.
>>>
 [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
 0560
 [   56.166563] IP: vfs_statfs+0x7c/0xc0
 [   56.167249] PGD 0 P4D 0
 [   56.167860] Oops:  [#1] SMP
 [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
 xt_multiport xt_addrtype iptable_mangle iptable>
 [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
 4.14.0-git-kratos-1 #1
 [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
 [   56.182729] task: 88040c412a00 task.stack: c90002c18000
 [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
 [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
 [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
 0020
 [   56.186085] RDX: 1801 RSI: 1801 RDI: 
 
 [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
 00ff
 [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
 
 [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
 c90002c1bbf0
 [   56.190444] FS:  () GS:88041fc0() 
 knlGS:
 [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
 [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
 001606f0
 [   56.193898] Call Trace:
 [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
 [   56.195267]  ? generic_permission+0x12c/0x1a0
 [   56.196025]  nfsd4_encode_getattr+0x25/0x30
 [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
 [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
 [   56.198268]  nfsd_dispatch+0xe8/0x220
 [   56.198968]  svc_process_common+0x475/0x640
 [   56.199696]  ? nfsd_destroy+0x60/0x60
 [   56.200404]  svc_process+0xf2/0x1a0
 [   56.201079]  nfsd+0xe3/0x150
 [   56.201706]  kthread+0x117/0x130
 [   56.202354]  ? kthread_create_on_node+0x40/0x40
 [   56.203100]  ret_from_fork+0x25/0x30
 [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 
 ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
 [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
 [   56.207110] CR2: 0560
 [   56.207763] ---[ end trace d452986a80f64aaa ]---
>>>
 On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>
> I'll take a closer look at this and see if I can provide something to
> narrow it down.
>>
>> How reliable is this crash? The best idea I have to isolate it would
>> be to bisect the additions of the __randomize_layout markings on
>> various structures. I would start with the ones Al is most upset to
>> see randomized. ;)
>
> It's pretty reliable, once I get a bad seed I can reproduce the crash
> pretty quickly.
>
>>
>> All that said, I'd like to better understand the BIOS side of this a
>> little better. In the first email in this thread, you showed two BUGs
>> separated by a little time, which implies to me that the NULL deref
>> and the BIOS no longer POSTing are separate (though seemingly related)
>> issues. Have you had machines survive the BUG without blowing up the
>> BIOS?
>
> We had 3 machines die due to the BIOS issue (all of them pretty quickly
> with the bad-seed kernel). All the dead machines had the same
> motherboard model. I have not managed to reproduce the issue again on
> the machine I restored via the IPMI interface, I suspect that it may be
> a bug in the BIOS that was fixed in a more recent version.
>
>>
>> I'm still trying to wrap my head around how the BIOS could be blowing
>> up. I assume there's some magic memory address that is getting poked
>> as a result of some struct randomization bug, so tracking that down
>> should be possible assuming you can stand reflashing your BIOS across

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Patrick McLean

On 2017-11-16 04:54 PM, Kees Cook wrote:
> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
>> On 2017-11-11 09:31 AM, Linus Torvalds wrote:
>>> Boris Lukashev points out that Patrick should probably check a newer
>>> version of gcc.
>>>
>>> I looked around, and in one of the emails, Patrick said:
>>>
>>>   "No changes, both the working and broken kernels were built with
>>>distro-provided gcc 5.4.0 and binutils 2.28.1"
>>>
>>> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
>>> it's a bug-fix release to a pretty old branch that is not exactly new.
>>>
>>> It would probably be good to check if the problems persist with gcc
>>> 6.x or 7.x.. I have no idea which gcc version the randstruct people
>>> tend to use themselves.
>>
>> I just tested it with gcc 7.2, and was able to reproduce the NULL
>> pointer dereference, the backtrace looks slightly different this time.
>>
>> I will also test with binutils 2.29, though I doubt that will make any
>> difference.
>>
>>> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
>>> 0560
>>> [   56.166563] IP: vfs_statfs+0x7c/0xc0
>>> [   56.167249] PGD 0 P4D 0
>>> [   56.167860] Oops:  [#1] SMP
>>> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
>>> xt_multiport xt_addrtype iptable_mangle iptable>
>>> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
>>> 4.14.0-git-kratos-1 #1
>>> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
>>> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
>>> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
>>> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
>>> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
>>> 0020
>>> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
>>> 
>>> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
>>> 00ff
>>> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
>>> 
>>> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
>>> c90002c1bbf0
>>> [   56.190444] FS:  () GS:88041fc0() 
>>> knlGS:
>>> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
>>> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
>>> 001606f0
>>> [   56.193898] Call Trace:
>>> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
>>> [   56.195267]  ? generic_permission+0x12c/0x1a0
>>> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
>>> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
>>> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
>>> [   56.198268]  nfsd_dispatch+0xe8/0x220
>>> [   56.198968]  svc_process_common+0x475/0x640
>>> [   56.199696]  ? nfsd_destroy+0x60/0x60
>>> [   56.200404]  svc_process+0xf2/0x1a0
>>> [   56.201079]  nfsd+0xe3/0x150
>>> [   56.201706]  kthread+0x117/0x130
>>> [   56.202354]  ? kthread_create_on_node+0x40/0x40
>>> [   56.203100]  ret_from_fork+0x25/0x30
>>> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 
>>> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
>>> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
>>> [   56.207110] CR2: 0560
>>> [   56.207763] ---[ end trace d452986a80f64aaa ]---
>>
>>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:

 I'll take a closer look at this and see if I can provide something to
 narrow it down.
> 
> How reliable is this crash? The best idea I have to isolate it would
> be to bisect the additions of the __randomize_layout markings on
> various structures. I would start with the ones Al is most upset to
> see randomized. ;)

It's pretty reliable, once I get a bad seed I can reproduce the crash
pretty quickly.

> 
> All that said, I'd like to better understand the BIOS side of this a
> little better. In the first email in this thread, you showed two BUGs
> separated by a little time, which implies to me that the NULL deref
> and the BIOS no longer POSTing are separate (though seemingly related)
> issues. Have you had machines survive the BUG without blowing up the
> BIOS?

We had 3 machines die due to the BIOS issue (all of them pretty quickly
with the bad-seed kernel). All the dead machines had the same
motherboard model. I have not managed to reproduce the issue again on
the machine I restored via the IPMI interface, I suspect that it may be
a bug in the BIOS that was fixed in a more recent version.

> 
> I'm still trying to wrap my head around how the BIOS could be blowing
> up. I assume there's some magic memory address that is getting poked
> as a result of some struct randomization bug, so tracking that down
> should be possible assuming you can stand reflashing your BIOS across
> the bisects.

That is our theory, some magic memory address that caused an overwrite
of the flash where the BIOS code is

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-17 Thread Patrick McLean

On 2017-11-16 04:54 PM, Kees Cook wrote:
> On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
>> On 2017-11-11 09:31 AM, Linus Torvalds wrote:
>>> Boris Lukashev points out that Patrick should probably check a newer
>>> version of gcc.
>>>
>>> I looked around, and in one of the emails, Patrick said:
>>>
>>>   "No changes, both the working and broken kernels were built with
>>>distro-provided gcc 5.4.0 and binutils 2.28.1"
>>>
>>> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
>>> it's a bug-fix release to a pretty old branch that is not exactly new.
>>>
>>> It would probably be good to check if the problems persist with gcc
>>> 6.x or 7.x.. I have no idea which gcc version the randstruct people
>>> tend to use themselves.
>>
>> I just tested it with gcc 7.2, and was able to reproduce the NULL
>> pointer dereference, the backtrace looks slightly different this time.
>>
>> I will also test with binutils 2.29, though I doubt that will make any
>> difference.
>>
>>> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
>>> 0560
>>> [   56.166563] IP: vfs_statfs+0x7c/0xc0
>>> [   56.167249] PGD 0 P4D 0
>>> [   56.167860] Oops:  [#1] SMP
>>> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
>>> xt_multiport xt_addrtype iptable_mangle iptable>
>>> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
>>> 4.14.0-git-kratos-1 #1
>>> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
>>> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
>>> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
>>> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
>>> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
>>> 0020
>>> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
>>> 
>>> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
>>> 00ff
>>> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
>>> 
>>> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
>>> c90002c1bbf0
>>> [   56.190444] FS:  () GS:88041fc0() 
>>> knlGS:
>>> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
>>> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
>>> 001606f0
>>> [   56.193898] Call Trace:
>>> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
>>> [   56.195267]  ? generic_permission+0x12c/0x1a0
>>> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
>>> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
>>> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
>>> [   56.198268]  nfsd_dispatch+0xe8/0x220
>>> [   56.198968]  svc_process_common+0x475/0x640
>>> [   56.199696]  ? nfsd_destroy+0x60/0x60
>>> [   56.200404]  svc_process+0xf2/0x1a0
>>> [   56.201079]  nfsd+0xe3/0x150
>>> [   56.201706]  kthread+0x117/0x130
>>> [   56.202354]  ? kthread_create_on_node+0x40/0x40
>>> [   56.203100]  ret_from_fork+0x25/0x30
>>> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 
>>> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
>>> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
>>> [   56.207110] CR2: 0560
>>> [   56.207763] ---[ end trace d452986a80f64aaa ]---
>>
>>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:

 I'll take a closer look at this and see if I can provide something to
 narrow it down.
> 
> How reliable is this crash? The best idea I have to isolate it would
> be to bisect the additions of the __randomize_layout markings on
> various structures. I would start with the ones Al is most upset to
> see randomized. ;)

It's pretty reliable, once I get a bad seed I can reproduce the crash
pretty quickly.

> 
> All that said, I'd like to better understand the BIOS side of this a
> little better. In the first email in this thread, you showed two BUGs
> separated by a little time, which implies to me that the NULL deref
> and the BIOS no longer POSTing are separate (though seemingly related)
> issues. Have you had machines survive the BUG without blowing up the
> BIOS?

We had 3 machines die due to the BIOS issue (all of them pretty quickly
with the bad-seed kernel). All the dead machines had the same
motherboard model. I have not managed to reproduce the issue again on
the machine I restored via the IPMI interface, I suspect that it may be
a bug in the BIOS that was fixed in a more recent version.

> 
> I'm still trying to wrap my head around how the BIOS could be blowing
> up. I assume there's some magic memory address that is getting poked
> as a result of some struct randomization bug, so tracking that down
> should be possible assuming you can stand reflashing your BIOS across
> the bisects.

That is our theory, some magic memory address that caused an overwrite
of the flash where the BIOS code is stored. We are working under the
assumption

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-16 Thread Kees Cook

On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
> On 2017-11-11 09:31 AM, Linus Torvalds wrote:
>> Boris Lukashev points out that Patrick should probably check a newer
>> version of gcc.
>>
>> I looked around, and in one of the emails, Patrick said:
>>
>>   "No changes, both the working and broken kernels were built with
>>distro-provided gcc 5.4.0 and binutils 2.28.1"
>>
>> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
>> it's a bug-fix release to a pretty old branch that is not exactly new.
>>
>> It would probably be good to check if the problems persist with gcc
>> 6.x or 7.x.. I have no idea which gcc version the randstruct people
>> tend to use themselves.
>
> I just tested it with gcc 7.2, and was able to reproduce the NULL
> pointer dereference, the backtrace looks slightly different this time.
>
> I will also test with binutils 2.29, though I doubt that will make any
> difference.
>
>> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
>> 0560
>> [   56.166563] IP: vfs_statfs+0x7c/0xc0
>> [   56.167249] PGD 0 P4D 0
>> [   56.167860] Oops:  [#1] SMP
>> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
>> xt_multiport xt_addrtype iptable_mangle iptable>
>> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
>> 4.14.0-git-kratos-1 #1
>> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
>> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
>> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
>> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
>> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
>> 0020
>> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
>> 
>> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
>> 00ff
>> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
>> 
>> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
>> c90002c1bbf0
>> [   56.190444] FS:  () GS:88041fc0() 
>> knlGS:
>> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
>> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
>> 001606f0
>> [   56.193898] Call Trace:
>> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
>> [   56.195267]  ? generic_permission+0x12c/0x1a0
>> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
>> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
>> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
>> [   56.198268]  nfsd_dispatch+0xe8/0x220
>> [   56.198968]  svc_process_common+0x475/0x640
>> [   56.199696]  ? nfsd_destroy+0x60/0x60
>> [   56.200404]  svc_process+0xf2/0x1a0
>> [   56.201079]  nfsd+0xe3/0x150
>> [   56.201706]  kthread+0x117/0x130
>> [   56.202354]  ? kthread_create_on_node+0x40/0x40
>> [   56.203100]  ret_from_fork+0x25/0x30
>> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 
>> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
>> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
>> [   56.207110] CR2: 0560
>> [   56.207763] ---[ end trace d452986a80f64aaa ]---
>
>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>>>
>>> I'll take a closer look at this and see if I can provide something to
>>> narrow it down.

How reliable is this crash? The best idea I have to isolate it would
be to bisect the additions of the __randomize_layout markings on
various structures. I would start with the ones Al is most upset to
see randomized. ;)

All that said, I'd like to better understand the BIOS side of this a
little better. In the first email in this thread, you showed two BUGs
separated by a little time, which implies to me that the NULL deref
and the BIOS no longer POSTing are separate (though seemingly related)
issues. Have you had machines survive the BUG without blowing up the
BIOS?

I'm still trying to wrap my head around how the BIOS could be blowing
up. I assume there's some magic memory address that is getting poked
as a result of some struct randomization bug, so tracking that down
should be possible assuming you can stand reflashing your BIOS across
the bisects.

For the first step, I'd try a revert of
9225331b310821760f39ba55b00b8973602adbb5, which enables a large
portion of struct randomization. If that doesn't change things, I can
provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9
and then re-applies __randomize_layout one structure per patch, and
you could bisect that?

-Kees

-- 
Kees Cook
Pixel Security

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-16 Thread Kees Cook

On Mon, Nov 13, 2017 at 2:48 PM, Patrick McLean  wrote:
> On 2017-11-11 09:31 AM, Linus Torvalds wrote:
>> Boris Lukashev points out that Patrick should probably check a newer
>> version of gcc.
>>
>> I looked around, and in one of the emails, Patrick said:
>>
>>   "No changes, both the working and broken kernels were built with
>>distro-provided gcc 5.4.0 and binutils 2.28.1"
>>
>> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
>> it's a bug-fix release to a pretty old branch that is not exactly new.
>>
>> It would probably be good to check if the problems persist with gcc
>> 6.x or 7.x.. I have no idea which gcc version the randstruct people
>> tend to use themselves.
>
> I just tested it with gcc 7.2, and was able to reproduce the NULL
> pointer dereference, the backtrace looks slightly different this time.
>
> I will also test with binutils 2.29, though I doubt that will make any
> difference.
>
>> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
>> 0560
>> [   56.166563] IP: vfs_statfs+0x7c/0xc0
>> [   56.167249] PGD 0 P4D 0
>> [   56.167860] Oops:  [#1] SMP
>> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
>> xt_multiport xt_addrtype iptable_mangle iptable>
>> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
>> 4.14.0-git-kratos-1 #1
>> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
>> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
>> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
>> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
>> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
>> 0020
>> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
>> 
>> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
>> 00ff
>> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
>> 
>> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
>> c90002c1bbf0
>> [   56.190444] FS:  () GS:88041fc0() 
>> knlGS:
>> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
>> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
>> 001606f0
>> [   56.193898] Call Trace:
>> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
>> [   56.195267]  ? generic_permission+0x12c/0x1a0
>> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
>> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
>> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
>> [   56.198268]  nfsd_dispatch+0xe8/0x220
>> [   56.198968]  svc_process_common+0x475/0x640
>> [   56.199696]  ? nfsd_destroy+0x60/0x60
>> [   56.200404]  svc_process+0xf2/0x1a0
>> [   56.201079]  nfsd+0xe3/0x150
>> [   56.201706]  kthread+0x117/0x130
>> [   56.202354]  ? kthread_create_on_node+0x40/0x40
>> [   56.203100]  ret_from_fork+0x25/0x30
>> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 
>> ce 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
>> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
>> [   56.207110] CR2: 0560
>> [   56.207763] ---[ end trace d452986a80f64aaa ]---
>
>> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>>>
>>> I'll take a closer look at this and see if I can provide something to
>>> narrow it down.

How reliable is this crash? The best idea I have to isolate it would
be to bisect the additions of the __randomize_layout markings on
various structures. I would start with the ones Al is most upset to
see randomized. ;)

All that said, I'd like to better understand the BIOS side of this a
little better. In the first email in this thread, you showed two BUGs
separated by a little time, which implies to me that the NULL deref
and the BIOS no longer POSTing are separate (though seemingly related)
issues. Have you had machines survive the BUG without blowing up the
BIOS?

I'm still trying to wrap my head around how the BIOS could be blowing
up. I assume there's some magic memory address that is getting poked
as a result of some struct randomization bug, so tracking that down
should be possible assuming you can stand reflashing your BIOS across
the bisects.

For the first step, I'd try a revert of
9225331b310821760f39ba55b00b8973602adbb5, which enables a large
portion of struct randomization. If that doesn't change things, I can
provide a series that reverts 3859a271a003aba01e45b85c9d8b355eb7bf25f9
and then re-applies __randomize_layout one structure per patch, and
you could bisect that?

-Kees

-- 
Kees Cook
Pixel Security

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Linus Torvalds

On Tue, Nov 14, 2017 at 3:53 PM, Rasmus Villemoes
 wrote:
>
> Odd. 7.2 and 7.1 (both of which I've just compiled from source, no
> special configure flags or anything) generate exactly the same (good)
> code for fs/namespace.o after patching. I also tried with
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces
> reasonable code. Oh well.

It might be configuration-dependent. I usually end up having a couple
of different configurations depending on what I'm doing, one being a
fairly minimal one, and one being the "allmodconfig" I use mainly for
build testing.

   Linus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Linus Torvalds

On Tue, Nov 14, 2017 at 3:53 PM, Rasmus Villemoes
 wrote:
>
> Odd. 7.2 and 7.1 (both of which I've just compiled from source, no
> special configure flags or anything) generate exactly the same (good)
> code for fs/namespace.o after patching. I also tried with
> CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces
> reasonable code. Oh well.

It might be configuration-dependent. I usually end up having a couple
of different configurations depending on what I'm doing, one being a
fairly minimal one, and one being the "allmodconfig" I use mainly for
build testing.

   Linus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Rasmus Villemoes

On 14 November 2017 at 23:43, Linus Torvalds
 wrote:
> On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes
>  wrote:
>>
>> Can you be more specific? That's not what I see with gcc 7.1.
>
> I have gcc-7.2.1, and it made a horrible mess of the do_mount() code.

Odd. 7.2 and 7.1 (both of which I've just compiled from source, no
special configure flags or anything) generate exactly the same (good)
code for fs/namespace.o after patching. I also tried with
CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces
reasonable code. Oh well.

Rasmus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Rasmus Villemoes

On 14 November 2017 at 23:43, Linus Torvalds
 wrote:
> On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes
>  wrote:
>>
>> Can you be more specific? That's not what I see with gcc 7.1.
>
> I have gcc-7.2.1, and it made a horrible mess of the do_mount() code.

Odd. 7.2 and 7.1 (both of which I've just compiled from source, no
special configure flags or anything) generate exactly the same (good)
code for fs/namespace.o after patching. I also tried with
CONFIG_CC_OPTIMIZE_FOR_SIZE=y just in case, but that still produces
reasonable code. Oh well.

Rasmus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Linus Torvalds

On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes
 wrote:
>
> Can you be more specific? That's not what I see with gcc 7.1.

I have gcc-7.2.1, and it made a horrible mess of the do_mount() code.

Look for the comment "/* Separate the per-mountpoint flags */" and do
the obvious conversion of the simple single-bit stuff.

My gcc actually turned those into jumps after _after_ I converted it
into the ternary operation, and then the ternary conversion actually
did much worse, because it actually had two sides (one with a zero
value, and one with the bit value to be set).

I didn't have time to look into _why_ that code generated
branch-overs, when the otherwise similar reverse case in fs/statfs.c
did not.

   Linus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Linus Torvalds

On Tue, Nov 14, 2017 at 2:24 PM, Rasmus Villemoes
 wrote:
>
> Can you be more specific? That's not what I see with gcc 7.1.

I have gcc-7.2.1, and it made a horrible mess of the do_mount() code.

Look for the comment "/* Separate the per-mountpoint flags */" and do
the obvious conversion of the simple single-bit stuff.

My gcc actually turned those into jumps after _after_ I converted it
into the ternary operation, and then the ternary conversion actually
did much worse, because it actually had two sides (one with a zero
value, and one with the bit value to be set).

I didn't have time to look into _why_ that code generated
branch-overs, when the otherwise similar reverse case in fs/statfs.c
did not.

   Linus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Rasmus Villemoes

On 14 November 2017 at 00:54, Linus Torvalds
 wrote:
> On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds
>  wrote:
>>
>> So let's just rewrite that mnt_flags conversion that way, justr to get
>> gcc to generate the obvious code.
>
> Oh wow. I tried to do the same thing in fs/namespace.c where it does
> the reverse bit translation, and gcc makes a _horrible_ mess of it and
> actually makes the code much worse because for some reason the pattern
> doesn't trigger.

[trimming cc list]

Can you be more specific? That's not what I see with gcc 7.1. I've
found two blocks where I replaced the if's with the ternary
expressions, one in clone_mnt, one in do_mount. In clone_mnt, gcc-7.1
seems to do (first instruction is the unconditional |= MNT_LOCK_ATIME)

10d7:   0d 00 00 04 00  or $0x4,%eax
10dc:   89 43 30mov%eax,0x30(%rbx)
10df:   a8 02   test   $0x2,%al
10e1:   74 08   je 10eb 
10e3:   0d 00 00 20 00  or $0x20,%eax
10e8:   89 43 30mov%eax,0x30(%rbx)
10eb:   a8 01   test   $0x1,%al
10ed:   74 08   je 10f7 
10ef:   0d 00 00 10 00  or $0x10,%eax
10f4:   89 43 30mov%eax,0x30(%rbx)

and after patching

10cc:   0d 00 00 04 00  or $0x4,%eax
10d1:   89 c2   mov%eax,%edx
10d3:   c1 e2 10shl$0x10,%edx
10d6:   81 e2 00 00 40 00   and$0x40,%edx
10dc:   09 d0   or %edx,%eax
10de:   89 c2   mov%eax,%edx
10e0:   c1 e2 14shl$0x14,%edx
10e3:   81 e2 00 00 20 00   and$0x20,%edx
10e9:   09 c2   or %eax,%edx

(with a final store of %eax to 0x30(%rbx)). Either way it's four
instructions per flag, but I assume the one without the branches is
preferable.

Similarly, in do_mount, before we have

3429:   44 89 f8mov%r15d,%eax
342c:   83 c8 01or $0x1,%eax
342f:   40 f6 c5 02 test   $0x2,%bpl
3433:   44 0f 45 f8 cmovne %eax,%r15d
3437:   44 89 f8mov%r15d,%eax
343a:   83 c8 02or $0x2,%eax
343d:   40 f6 c5 04 test   $0x4,%bpl
3441:   44 0f 45 f8 cmovne %eax,%r15d

but after patching it does something like

3425:   4d 89 femov%r15,%r14
3428:   48 c1 ea 07 shr$0x7,%rdx
342c:   49 d1 eeshr%r14
342f:   89 d0   mov%edx,%eax
3431:   c1 e1 05shl$0x5,%ecx
3434:   83 e0 08and$0x8,%eax
3437:   41 83 e6 07 and$0x7,%r14d
343b:   41 09 c6or %eax,%r14d
343e:   89 d0   mov%edx,%eax

which actually makes use of MS_{NOSUID,NODEV,NOEXEC} all being 1 bit
off from their MNT_ counterparts (witness the shr %r14 and the and
with 0x7), so there we also cut 37 bytes according to bloat-o-meter.

Now, it does seem that older (and not that old in absolute terms)
compilers may generate worse code after the transformation - the
'replace with shift-mask-or' pattern doesn't exist until 7.1. E.g.
with 5.4 we have

$ scripts/bloat-o-meter fs/namespace.o.{0,1}-5.4
add/remove: 0/0 grow/shrink: 2/0 up/down: 50/0 (50)
function old new   delta
do_mount31533195 +42
clone_mnt768 776  +8

5.4 compiles the "if() ..." construction roughly as 7.1, i.e. with
cmovs, but the ternary expressions are transformed into this
abomination

336f:   48 89 damov%rbx,%rdx
3372:   83 e2 04and$0x4,%edx
3375:   48 83 fa 01 cmp$0x1,%rdx
3379:   19 d2   sbb%edx,%edx
337b:   f7 d2   not%edx
337d:   83 e2 02and$0x2,%edx
3380:   09 d0   or %edx,%eax
3382:   48 89 damov%rbx,%rdx
3385:   83 e2 08and$0x8,%edx
3388:   48 83 fa 01 cmp$0x1,%rdx
338c:   19 d2   sbb%edx,%edx
338e:   f7 d2   not%edx
3390:   83 e2 04and$0x4,%edx
3393:   09 d0   or %edx,%eax

Was it something like this you saw?

Rasmus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-14 Thread Rasmus Villemoes

On 14 November 2017 at 00:54, Linus Torvalds
 wrote:
> On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds
>  wrote:
>>
>> So let's just rewrite that mnt_flags conversion that way, justr to get
>> gcc to generate the obvious code.
>
> Oh wow. I tried to do the same thing in fs/namespace.c where it does
> the reverse bit translation, and gcc makes a _horrible_ mess of it and
> actually makes the code much worse because for some reason the pattern
> doesn't trigger.

[trimming cc list]

Can you be more specific? That's not what I see with gcc 7.1. I've
found two blocks where I replaced the if's with the ternary
expressions, one in clone_mnt, one in do_mount. In clone_mnt, gcc-7.1
seems to do (first instruction is the unconditional |= MNT_LOCK_ATIME)

10d7:   0d 00 00 04 00  or $0x4,%eax
10dc:   89 43 30mov%eax,0x30(%rbx)
10df:   a8 02   test   $0x2,%al
10e1:   74 08   je 10eb 
10e3:   0d 00 00 20 00  or $0x20,%eax
10e8:   89 43 30mov%eax,0x30(%rbx)
10eb:   a8 01   test   $0x1,%al
10ed:   74 08   je 10f7 
10ef:   0d 00 00 10 00  or $0x10,%eax
10f4:   89 43 30mov%eax,0x30(%rbx)

and after patching

10cc:   0d 00 00 04 00  or $0x4,%eax
10d1:   89 c2   mov%eax,%edx
10d3:   c1 e2 10shl$0x10,%edx
10d6:   81 e2 00 00 40 00   and$0x40,%edx
10dc:   09 d0   or %edx,%eax
10de:   89 c2   mov%eax,%edx
10e0:   c1 e2 14shl$0x14,%edx
10e3:   81 e2 00 00 20 00   and$0x20,%edx
10e9:   09 c2   or %eax,%edx

(with a final store of %eax to 0x30(%rbx)). Either way it's four
instructions per flag, but I assume the one without the branches is
preferable.

Similarly, in do_mount, before we have

3429:   44 89 f8mov%r15d,%eax
342c:   83 c8 01or $0x1,%eax
342f:   40 f6 c5 02 test   $0x2,%bpl
3433:   44 0f 45 f8 cmovne %eax,%r15d
3437:   44 89 f8mov%r15d,%eax
343a:   83 c8 02or $0x2,%eax
343d:   40 f6 c5 04 test   $0x4,%bpl
3441:   44 0f 45 f8 cmovne %eax,%r15d

but after patching it does something like

3425:   4d 89 femov%r15,%r14
3428:   48 c1 ea 07 shr$0x7,%rdx
342c:   49 d1 eeshr%r14
342f:   89 d0   mov%edx,%eax
3431:   c1 e1 05shl$0x5,%ecx
3434:   83 e0 08and$0x8,%eax
3437:   41 83 e6 07 and$0x7,%r14d
343b:   41 09 c6or %eax,%r14d
343e:   89 d0   mov%edx,%eax

which actually makes use of MS_{NOSUID,NODEV,NOEXEC} all being 1 bit
off from their MNT_ counterparts (witness the shr %r14 and the and
with 0x7), so there we also cut 37 bytes according to bloat-o-meter.

Now, it does seem that older (and not that old in absolute terms)
compilers may generate worse code after the transformation - the
'replace with shift-mask-or' pattern doesn't exist until 7.1. E.g.
with 5.4 we have

$ scripts/bloat-o-meter fs/namespace.o.{0,1}-5.4
add/remove: 0/0 grow/shrink: 2/0 up/down: 50/0 (50)
function old new   delta
do_mount31533195 +42
clone_mnt768 776  +8

5.4 compiles the "if() ..." construction roughly as 7.1, i.e. with
cmovs, but the ternary expressions are transformed into this
abomination

336f:   48 89 damov%rbx,%rdx
3372:   83 e2 04and$0x4,%edx
3375:   48 83 fa 01 cmp$0x1,%rdx
3379:   19 d2   sbb%edx,%edx
337b:   f7 d2   not%edx
337d:   83 e2 02and$0x2,%edx
3380:   09 d0   or %edx,%eax
3382:   48 89 damov%rbx,%rdx
3385:   83 e2 08and$0x8,%edx
3388:   48 83 fa 01 cmp$0x1,%rdx
338c:   19 d2   sbb%edx,%edx
338e:   f7 d2   not%edx
3390:   83 e2 04and$0x4,%edx
3393:   09 d0   or %edx,%eax

Was it something like this you saw?

Rasmus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-13 Thread Linus Torvalds

On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds
 wrote:
>
> So let's just rewrite that mnt_flags conversion that way, justr to get
> gcc to generate the obvious code.

Oh wow. I tried to do the same thing in fs/namespace.c where it does
the reverse bit translation, and gcc makes a _horrible_ mess of it and
actually makes the code much worse because for some reason the pattern
doesn't trigger.

So this gcc optimization is apparently pretty damn fragile in general.
It triggers for the trivial cases, but then other code around it can
confuse it badly.

So I don't think I'll touch this, it seems to not be really reliably
something that makes gcc generate what should be the obvious code..

   Linus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-13 Thread Linus Torvalds

On Mon, Nov 13, 2017 at 3:30 PM, Linus Torvalds
 wrote:
>
> So let's just rewrite that mnt_flags conversion that way, justr to get
> gcc to generate the obvious code.

Oh wow. I tried to do the same thing in fs/namespace.c where it does
the reverse bit translation, and gcc makes a _horrible_ mess of it and
actually makes the code much worse because for some reason the pattern
doesn't trigger.

So this gcc optimization is apparently pretty damn fragile in general.
It triggers for the trivial cases, but then other code around it can
confuse it badly.

So I don't think I'll touch this, it seems to not be really reliably
something that makes gcc generate what should be the obvious code..

   Linus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-13 Thread Linus Torvalds

On Mon, Nov 13, 2017 at 2:59 PM, Rasmus Villemoes
 wrote:
>> Sadly, gcc makes a mess of it and actually generates code that looks
>> like the original C.[...]
>
> Actually, new enough gcc (7.1, I think) does contain a pattern that does
> this, but unfortunately only if one spells it
>
>   y |= (x & BIT) ? OTHER_BIT : 0;

Ahh, I should have recognized that, I think that's what we ended up
doing with the VM_READ -> PROT_READ translation in a few places,
exactly because gcc would then recognize it and do the much better
code generation.

> which is half-way to doing it by hand, I suppose.

Yeah, but it is at least acceptable, and the code is still legible C.
The alternatives of doing it _entirely_ by hand tend to be much worse
(ie you end up using a macro from hell that checks which of the two
bits are bigger and shifting in the right direction by using
multiplication or division).

So let's just rewrite that mnt_flags conversion that way, justr to get
gcc to generate the obvious code.

It's a bit sad how gcc didn't pick up on the original code, especially
since it had already done the much more complicated translation of
doing the if-conversion.

Thanks for pointing out the gcc pattern.

  Linus

Re: bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-13 Thread Linus Torvalds

On Mon, Nov 13, 2017 at 2:59 PM, Rasmus Villemoes
 wrote:
>> Sadly, gcc makes a mess of it and actually generates code that looks
>> like the original C.[...]
>
> Actually, new enough gcc (7.1, I think) does contain a pattern that does
> this, but unfortunately only if one spells it
>
>   y |= (x & BIT) ? OTHER_BIT : 0;

Ahh, I should have recognized that, I think that's what we ended up
doing with the VM_READ -> PROT_READ translation in a few places,
exactly because gcc would then recognize it and do the much better
code generation.

> which is half-way to doing it by hand, I suppose.

Yeah, but it is at least acceptable, and the code is still legible C.
The alternatives of doing it _entirely_ by hand tend to be much worse
(ie you end up using a macro from hell that checks which of the two
bits are bigger and shifting in the right direction by using
multiplication or division).

So let's just rewrite that mnt_flags conversion that way, justr to get
gcc to generate the obvious code.

It's a bit sad how gcc didn't pick up on the original code, especially
since it had already done the much more complicated translation of
doing the if-conversion.

Thanks for pointing out the gcc pattern.

  Linus

bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-13 Thread Rasmus Villemoes

On Thu, Nov 09 2017, Linus Torvalds  wrote:

> The code disassembles to
>
>0: 83 c9 08  or $0x8,%ecx
>3: 40 f6 c6 04  test   $0x4,%sil
>7: 0f 45 d1  cmovne %ecx,%edx
>a: 89 d1mov%edx,%ecx
>c: 80 cd 04  or $0x4,%ch
>f: 40 f6 c6 08  test   $0x8,%sil
>   13: 0f 45 d1  cmovne %ecx,%edx
>   16: 89 d1mov%edx,%ecx
>   18: 80 cd 08  or $0x8,%ch
>   1b: 40 f6 c6 10  test   $0x10,%sil
>   1f: 0f 45 d1  cmovne %ecx,%edx
>   22: 89 d1mov%edx,%ecx
>   24: 80 cd 10  or $0x10,%ch
>   27: 83 e6 20  and$0x20,%esi
>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
>   31: 0f 45 d1  cmovne %ecx,%edx
>   34: 83 ca 20  or $0x20,%edx
>   37: 89 f1mov%esi,%ecx
>   39: 83 e1 10  and$0x10,%ecx
>   3c: 89 cfmov%ecx,%edi
>
> and all those odd cmovne and bit-ops are just the bit selection code
> in flags_by_mnt(), which is inlined through calculate_f_flags (which
> is _also_ inlined) into vfs_statfs().
>
> Sadly, gcc makes a mess of it and actually generates code that looks
> like the original C. I would have hoped that gcc could have turned
>
>if (x & BIT)
> y |= OTHER_BIT;
>
> into
>
> y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT;
>
> but that doesn't happen.

Actually, new enough gcc (7.1, I think) does contain a pattern that does
this, but unfortunately only if one spells it

  y |= (x & BIT) ? OTHER_BIT : 0;

which is half-way to doing it by hand, I suppose. Doing the

-   if (mnt_flags & MNT_READONLY)
-   flags |= ST_RDONLY;
+   flags |= (mnt_flags & MNT_READONLY) ? ST_RDONLY : 0;

and pasting into godbolt.org, one can apparently get gcc to compile it
to

flags_by_mnt(int):
  leal (%rdi,%rdi), %edx
  movl %edi, %eax
  sarl $6, %eax
  movl %edx, %ecx
  andl $1, %eax
  andl $12, %edx
  andl $2, %ecx
  orl %ecx, %eax
  orl %eax, %edx
  movl %edi, %eax
  sall $7, %eax
  andl $7168, %eax
  orl %edx, %eax
  ret

Rasmus

bit tweaks [was: Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11]

2017-11-13 Thread Rasmus Villemoes

On Thu, Nov 09 2017, Linus Torvalds  wrote:

> The code disassembles to
>
>0: 83 c9 08  or $0x8,%ecx
>3: 40 f6 c6 04  test   $0x4,%sil
>7: 0f 45 d1  cmovne %ecx,%edx
>a: 89 d1mov%edx,%ecx
>c: 80 cd 04  or $0x4,%ch
>f: 40 f6 c6 08  test   $0x8,%sil
>   13: 0f 45 d1  cmovne %ecx,%edx
>   16: 89 d1mov%edx,%ecx
>   18: 80 cd 08  or $0x8,%ch
>   1b: 40 f6 c6 10  test   $0x10,%sil
>   1f: 0f 45 d1  cmovne %ecx,%edx
>   22: 89 d1mov%edx,%ecx
>   24: 80 cd 10  or $0x10,%ch
>   27: 83 e6 20  and$0x20,%esi
>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
>   31: 0f 45 d1  cmovne %ecx,%edx
>   34: 83 ca 20  or $0x20,%edx
>   37: 89 f1mov%esi,%ecx
>   39: 83 e1 10  and$0x10,%ecx
>   3c: 89 cfmov%ecx,%edi
>
> and all those odd cmovne and bit-ops are just the bit selection code
> in flags_by_mnt(), which is inlined through calculate_f_flags (which
> is _also_ inlined) into vfs_statfs().
>
> Sadly, gcc makes a mess of it and actually generates code that looks
> like the original C. I would have hoped that gcc could have turned
>
>if (x & BIT)
> y |= OTHER_BIT;
>
> into
>
> y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT;
>
> but that doesn't happen.

Actually, new enough gcc (7.1, I think) does contain a pattern that does
this, but unfortunately only if one spells it

  y |= (x & BIT) ? OTHER_BIT : 0;

which is half-way to doing it by hand, I suppose. Doing the

-   if (mnt_flags & MNT_READONLY)
-   flags |= ST_RDONLY;
+   flags |= (mnt_flags & MNT_READONLY) ? ST_RDONLY : 0;

and pasting into godbolt.org, one can apparently get gcc to compile it
to

flags_by_mnt(int):
  leal (%rdi,%rdi), %edx
  movl %edi, %eax
  sarl $6, %eax
  movl %edx, %ecx
  andl $1, %eax
  andl $12, %edx
  andl $2, %ecx
  orl %ecx, %eax
  orl %eax, %edx
  movl %edi, %eax
  sall $7, %eax
  andl $7168, %eax
  orl %edx, %eax
  ret

Rasmus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-13 Thread Patrick McLean

On 2017-11-11 09:31 AM, Linus Torvalds wrote:
> Boris Lukashev points out that Patrick should probably check a newer
> version of gcc.
> 
> I looked around, and in one of the emails, Patrick said:
> 
>   "No changes, both the working and broken kernels were built with
>distro-provided gcc 5.4.0 and binutils 2.28.1"
> 
> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
> it's a bug-fix release to a pretty old branch that is not exactly new.
> 
> It would probably be good to check if the problems persist with gcc
> 6.x or 7.x.. I have no idea which gcc version the randstruct people
> tend to use themselves.

I just tested it with gcc 7.2, and was able to reproduce the NULL
pointer dereference, the backtrace looks slightly different this time.

I will also test with binutils 2.29, though I doubt that will make any
difference.

> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
> 0560
> [   56.166563] IP: vfs_statfs+0x7c/0xc0
> [   56.167249] PGD 0 P4D 0
> [   56.167860] Oops:  [#1] SMP
> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
> xt_multiport xt_addrtype iptable_mangle iptable>
> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
> 4.14.0-git-kratos-1 #1
> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
> 0020
> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
> 
> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
> 00ff
> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
> 
> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
> c90002c1bbf0
> [   56.190444] FS:  () GS:88041fc0() 
> knlGS:
> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
> 001606f0
> [   56.193898] Call Trace:
> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
> [   56.195267]  ? generic_permission+0x12c/0x1a0
> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
> [   56.198268]  nfsd_dispatch+0xe8/0x220
> [   56.198968]  svc_process_common+0x475/0x640
> [   56.199696]  ? nfsd_destroy+0x60/0x60
> [   56.200404]  svc_process+0xf2/0x1a0
> [   56.201079]  nfsd+0xe3/0x150
> [   56.201706]  kthread+0x117/0x130
> [   56.202354]  ? kthread_create_on_node+0x40/0x40
> [   56.203100]  ret_from_fork+0x25/0x30
> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 ce 
> 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
> [   56.207110] CR2: 0560
> [   56.207763] ---[ end trace d452986a80f64aaa ]---

> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>>
>> I'll take a closer look at this and see if I can provide something to
>> narrow it down.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-13 Thread Patrick McLean

On 2017-11-11 09:31 AM, Linus Torvalds wrote:
> Boris Lukashev points out that Patrick should probably check a newer
> version of gcc.
> 
> I looked around, and in one of the emails, Patrick said:
> 
>   "No changes, both the working and broken kernels were built with
>distro-provided gcc 5.4.0 and binutils 2.28.1"
> 
> and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
> it's a bug-fix release to a pretty old branch that is not exactly new.
> 
> It would probably be good to check if the problems persist with gcc
> 6.x or 7.x.. I have no idea which gcc version the randstruct people
> tend to use themselves.

I just tested it with gcc 7.2, and was able to reproduce the NULL
pointer dereference, the backtrace looks slightly different this time.

I will also test with binutils 2.29, though I doubt that will make any
difference.

> [   56.165181] BUG: unable to handle kernel NULL pointer dereference at 
> 0560
> [   56.166563] IP: vfs_statfs+0x7c/0xc0
> [   56.167249] PGD 0 P4D 0
> [   56.167860] Oops:  [#1] SMP
> [   56.176478] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
> xt_multiport xt_addrtype iptable_mangle iptable>
> [   56.180227] CPU: 0 PID: 3985 Comm: nfsd Tainted: G   O
> 4.14.0-git-kratos-1 #1
> [   56.181728] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
> [   56.182729] task: 88040c412a00 task.stack: c90002c18000
> [   56.183629] RIP: 0010:vfs_statfs+0x7c/0xc0
> [   56.184341] RSP: 0018:c90002c1bb28 EFLAGS: 00010202
> [   56.185143] RAX:  RBX: c90002c1bbf0 RCX: 
> 0020
> [   56.186085] RDX: 1801 RSI: 1801 RDI: 
> 
> [   56.187066] RBP: c90002c1bbc0 R08: ff00 R09: 
> 00ff
> [   56.188268] R10: 0038be3a R11: 880408b18258 R12: 
> 
> [   56.189336] R13: 88040c23ad00 R14: 88040b874000 R15: 
> c90002c1bbf0
> [   56.190444] FS:  () GS:88041fc0() 
> knlGS:
> [   56.191876] CS:  0010 DS:  ES:  CR0: 80050033
> [   56.192843] CR2: 0560 CR3: 01e0a002 CR4: 
> 001606f0
> [   56.193898] Call Trace:
> [   56.194510]  nfsd4_encode_fattr+0x201/0x1f90
> [   56.195267]  ? generic_permission+0x12c/0x1a0
> [   56.196025]  nfsd4_encode_getattr+0x25/0x30
> [   56.196753]  nfsd4_encode_operation+0x98/0x1b0
> [   56.197526]  nfsd4_proc_compound+0x2a0/0x5e0
> [   56.198268]  nfsd_dispatch+0xe8/0x220
> [   56.198968]  svc_process_common+0x475/0x640
> [   56.199696]  ? nfsd_destroy+0x60/0x60
> [   56.200404]  svc_process+0xf2/0x1a0
> [   56.201079]  nfsd+0xe3/0x150
> [   56.201706]  kthread+0x117/0x130
> [   56.202354]  ? kthread_create_on_node+0x40/0x40
> [   56.203100]  ret_from_fork+0x25/0x30
> [   56.203774] Code: d6 89 d6 81 ce 00 04 00 00 f6 c1 08 0f 45 d6 89 d6 81 ce 
> 00 08 00 00 f6 c1 10 0f 45 d6 89 d6 81 ce>
> [   56.206289] RIP: vfs_statfs+0x7c/0xc0 RSP: c90002c1bb28
> [   56.207110] CR2: 0560
> [   56.207763] ---[ end trace d452986a80f64aaa ]---

> On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>>
>> I'll take a closer look at this and see if I can provide something to
>> narrow it down.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-11 Thread Linus Torvalds

Boris Lukashev points out that Patrick should probably check a newer
version of gcc.

I looked around, and in one of the emails, Patrick said:

  "No changes, both the working and broken kernels were built with
   distro-provided gcc 5.4.0 and binutils 2.28.1"

and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
it's a bug-fix release to a pretty old branch that is not exactly new.

It would probably be good to check if the problems persist with gcc
6.x or 7.x.. I have no idea which gcc version the randstruct people
tend to use themselves.

   Linus

On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>
> I'll take a closer look at this and see if I can provide something to
> narrow it down.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-11 Thread Linus Torvalds

Boris Lukashev points out that Patrick should probably check a newer
version of gcc.

I looked around, and in one of the emails, Patrick said:

  "No changes, both the working and broken kernels were built with
   distro-provided gcc 5.4.0 and binutils 2.28.1"

and gcc-5.4.0 is certainly not very recent. It's not _ancient_, but
it's a bug-fix release to a pretty old branch that is not exactly new.

It would probably be good to check if the problems persist with gcc
6.x or 7.x.. I have no idea which gcc version the randstruct people
tend to use themselves.

   Linus

On Sat, Nov 11, 2017 at 8:13 AM, Kees Cook  wrote:
>
> I'll take a closer look at this and see if I can provide something to
> narrow it down.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-11 Thread Kees Cook

On Fri, Nov 10, 2017 at 6:36 PM, Linus Torvalds
 wrote:
> [ Bringing in the gcc plugin people and the kernel hardening list,
> since it now is no longer even remotely looking like a nfsd, vfs or
> filesystem issue any more ]
>
> Kees, Emese,
>  the whole thread is on lkml, but there's clearly something horribly
> wrong with RANDSTRUCT, and it's not new even though it looked that way
> for a while.

It wouldn't be the first issue we've seen; it's (obviously) a pretty
aggressive change to the resulting build.

> Patrick seems to trigger it with nfsd, so it might be specific to that.
>
> Alternatively, it might just be that very few people run
> RANDSTRUCT-built kernels, or just have been lucky with the seeding.

Given its potential cache-line abuse, I'm not surprised that its usage
is more limited than other features.

> Sorry for top-posting, but there's not really anything in the email
> itself to reply to, other than saying thanks to Patrick for narrowing
> it down like this.

Agreed; thanks Patrick! :) Given that the issue is non-deterministic,
I wonder if the bug is related to some kind of missing RCU or barrier
that goes unnoticed in normal struct layouts.

> It would have been very interesting if it had actually bisected to
> something, but it seems that the real issue is just the choice of
> seeding for RANDSTRUCT.

That's where we've seen bugs in the past: some pathological ordering
of a struct uncovers a corner case. In the past it's been much more
deterministic: doesn't build, or immediately crashes on boot, etc.

I'll take a closer look at this and see if I can provide something to
narrow it down.

-Kees

>
>  Linus
>
> On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean  wrote:
>> On 2017-11-10 03:26 PM, Patrick McLean wrote:
>>> On 2017-11-10 10:42 AM, Linus Torvalds wrote:

 I really don't see anything that looks even half-way suspicious in
 that 4.13.8..11 range. But as mentioned, compiler interactions can be
 _really_ subtle.

 And hey, it can be a real kernel bug too, that just happens to be
 exposed by RANDSTRUCT, so a bisect really would be very nice.
>>>
>>> I am working on bisecting the issue now, but I think I have some more
>>> evidence pointing to a compiler issue related to RANDSTRUCT. There are
>>> actually 3 issues that we have seen. Sometimes we get the null pointer
>>> deref in the initial message, sometimes we get the GPF, and sometimes we
>>> see an issue where the NFS clients see all files as root-owned
>>> directories. Any given kernel will always see the same issue, but after
>>> a "make mrproper" and recompile (with the same .config), the issue will
>>> often change. I suspect that all 3 of these problems are actually the
>>> same issue manifesting itself in different ways depending on what seed
>>> the RANDSTRUCT gcc plugin is using.
>>
>> Further update on this, using the same seed for RANDSTRUCT, I have
>> reproduced this issue on v4.13.0, so it does not seem to be recently
>> introduced. The older kernel apparently only worked for us because we
>> were lucky. Generally we always compile new kernels from a fresh tree,
>> so they are never using the same seed.
>>
>> In case someone wants to play with this, here are some interesting seeds
>> (in include/generated/randomize_layout_hash.h):
>>
>> Produce a NULL pointer dereference (though I am not sure what the client
>> does to produce this).
>> 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc
>>
>> All files for nfsd4 clients appear as directories owned as root, no
>> matter the real owner (this happens for all clients we have tested):
>> 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e
>>
>> This is the seed that was breaking motherboards (make sure you have a
>> way to flash the BIOS with this one):
>> 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd
>>
>> Finally, here is a seed that produces a kernel that does not exhibit any
>> problems we are aware of:
>> e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b
>>

 Because in the end, compiler bugs are very rare. They are particularly
 annoying when they do happen, though, so they loom big in the mind of
 people who have had to chase them down.




-- 
Kees Cook
Pixel Security

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-11 Thread Kees Cook

On Fri, Nov 10, 2017 at 6:36 PM, Linus Torvalds
 wrote:
> [ Bringing in the gcc plugin people and the kernel hardening list,
> since it now is no longer even remotely looking like a nfsd, vfs or
> filesystem issue any more ]
>
> Kees, Emese,
>  the whole thread is on lkml, but there's clearly something horribly
> wrong with RANDSTRUCT, and it's not new even though it looked that way
> for a while.

It wouldn't be the first issue we've seen; it's (obviously) a pretty
aggressive change to the resulting build.

> Patrick seems to trigger it with nfsd, so it might be specific to that.
>
> Alternatively, it might just be that very few people run
> RANDSTRUCT-built kernels, or just have been lucky with the seeding.

Given its potential cache-line abuse, I'm not surprised that its usage
is more limited than other features.

> Sorry for top-posting, but there's not really anything in the email
> itself to reply to, other than saying thanks to Patrick for narrowing
> it down like this.

Agreed; thanks Patrick! :) Given that the issue is non-deterministic,
I wonder if the bug is related to some kind of missing RCU or barrier
that goes unnoticed in normal struct layouts.

> It would have been very interesting if it had actually bisected to
> something, but it seems that the real issue is just the choice of
> seeding for RANDSTRUCT.

That's where we've seen bugs in the past: some pathological ordering
of a struct uncovers a corner case. In the past it's been much more
deterministic: doesn't build, or immediately crashes on boot, etc.

I'll take a closer look at this and see if I can provide something to
narrow it down.

-Kees

>
>  Linus
>
> On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean  wrote:
>> On 2017-11-10 03:26 PM, Patrick McLean wrote:
>>> On 2017-11-10 10:42 AM, Linus Torvalds wrote:

 I really don't see anything that looks even half-way suspicious in
 that 4.13.8..11 range. But as mentioned, compiler interactions can be
 _really_ subtle.

 And hey, it can be a real kernel bug too, that just happens to be
 exposed by RANDSTRUCT, so a bisect really would be very nice.
>>>
>>> I am working on bisecting the issue now, but I think I have some more
>>> evidence pointing to a compiler issue related to RANDSTRUCT. There are
>>> actually 3 issues that we have seen. Sometimes we get the null pointer
>>> deref in the initial message, sometimes we get the GPF, and sometimes we
>>> see an issue where the NFS clients see all files as root-owned
>>> directories. Any given kernel will always see the same issue, but after
>>> a "make mrproper" and recompile (with the same .config), the issue will
>>> often change. I suspect that all 3 of these problems are actually the
>>> same issue manifesting itself in different ways depending on what seed
>>> the RANDSTRUCT gcc plugin is using.
>>
>> Further update on this, using the same seed for RANDSTRUCT, I have
>> reproduced this issue on v4.13.0, so it does not seem to be recently
>> introduced. The older kernel apparently only worked for us because we
>> were lucky. Generally we always compile new kernels from a fresh tree,
>> so they are never using the same seed.
>>
>> In case someone wants to play with this, here are some interesting seeds
>> (in include/generated/randomize_layout_hash.h):
>>
>> Produce a NULL pointer dereference (though I am not sure what the client
>> does to produce this).
>> 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc
>>
>> All files for nfsd4 clients appear as directories owned as root, no
>> matter the real owner (this happens for all clients we have tested):
>> 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e
>>
>> This is the seed that was breaking motherboards (make sure you have a
>> way to flash the BIOS with this one):
>> 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd
>>
>> Finally, here is a seed that produces a kernel that does not exhibit any
>> problems we are aware of:
>> e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b
>>

 Because in the end, compiler bugs are very rare. They are particularly
 annoying when they do happen, though, so they loom big in the mind of
 people who have had to chase them down.




-- 
Kees Cook
Pixel Security

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Alan Cox

On Wed, 8 Nov 2017 16:43:17 -0800
Patrick McLean  wrote:

> As of 4.13.11 (and also with 4.14-rc) we have an issue where when
> serving nfs4 sometimes we get the following BUG. When this bug happens,
> it usually also causes the motherboard to no longer POST until we
> externally re-flash the BIOS (using the BMC web interface). If a
> motherboard does not have an external way to flash the BIOS, this would
> brick the hardware.

If that is a production x86 system then you need to raise a large red
flag with the vendor because it should not even be possible to splat the
BIOS firmware on a modern PC by running even malicious OS code.

Not only that but if it has a flaw, and you bisect down to create a
reproducer then it's not going to take the bad guys very long to turn it
into an interesting toy to run if they ever exploit a box with that board.

Alan

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Alan Cox

On Wed, 8 Nov 2017 16:43:17 -0800
Patrick McLean  wrote:

> As of 4.13.11 (and also with 4.14-rc) we have an issue where when
> serving nfs4 sometimes we get the following BUG. When this bug happens,
> it usually also causes the motherboard to no longer POST until we
> externally re-flash the BIOS (using the BMC web interface). If a
> motherboard does not have an external way to flash the BIOS, this would
> brick the hardware.

If that is a production x86 system then you need to raise a large red
flag with the vendor because it should not even be possible to splat the
BIOS firmware on a modern PC by running even malicious OS code.

Not only that but if it has a flaw, and you bisect down to create a
reproducer then it's not going to take the bad guys very long to turn it
into an interesting toy to run if they ever exploit a box with that board.

Alan

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Linus Torvalds

[ Bringing in the gcc plugin people and the kernel hardening list,
since it now is no longer even remotely looking like a nfsd, vfs or
filesystem issue any more ]

Kees, Emese,
 the whole thread is on lkml, but there's clearly something horribly
wrong with RANDSTRUCT, and it's not new even though it looked that way
for a while.

Patrick seems to trigger it with nfsd, so it might be specific to that.

Alternatively, it might just be that very few people run
RANDSTRUCT-built kernels, or just have been lucky with the seeding.

Sorry for top-posting, but there's not really anything in the email
itself to reply to, other than saying thanks to Patrick for narrowing
it down like this.

It would have been very interesting if it had actually bisected to
something, but it seems that the real issue is just the choice of
seeding for RANDSTRUCT.

 Linus

On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean  wrote:
> On 2017-11-10 03:26 PM, Patrick McLean wrote:
>> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
>>>
>>> I really don't see anything that looks even half-way suspicious in
>>> that 4.13.8..11 range. But as mentioned, compiler interactions can be
>>> _really_ subtle.
>>>
>>> And hey, it can be a real kernel bug too, that just happens to be
>>> exposed by RANDSTRUCT, so a bisect really would be very nice.
>>
>> I am working on bisecting the issue now, but I think I have some more
>> evidence pointing to a compiler issue related to RANDSTRUCT. There are
>> actually 3 issues that we have seen. Sometimes we get the null pointer
>> deref in the initial message, sometimes we get the GPF, and sometimes we
>> see an issue where the NFS clients see all files as root-owned
>> directories. Any given kernel will always see the same issue, but after
>> a "make mrproper" and recompile (with the same .config), the issue will
>> often change. I suspect that all 3 of these problems are actually the
>> same issue manifesting itself in different ways depending on what seed
>> the RANDSTRUCT gcc plugin is using.
>
> Further update on this, using the same seed for RANDSTRUCT, I have
> reproduced this issue on v4.13.0, so it does not seem to be recently
> introduced. The older kernel apparently only worked for us because we
> were lucky. Generally we always compile new kernels from a fresh tree,
> so they are never using the same seed.
>
> In case someone wants to play with this, here are some interesting seeds
> (in include/generated/randomize_layout_hash.h):
>
> Produce a NULL pointer dereference (though I am not sure what the client
> does to produce this).
> 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc
>
> All files for nfsd4 clients appear as directories owned as root, no
> matter the real owner (this happens for all clients we have tested):
> 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e
>
> This is the seed that was breaking motherboards (make sure you have a
> way to flash the BIOS with this one):
> 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd
>
> Finally, here is a seed that produces a kernel that does not exhibit any
> problems we are aware of:
> e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b
>
>>>
>>> Because in the end, compiler bugs are very rare. They are particularly
>>> annoying when they do happen, though, so they loom big in the mind of
>>> people who have had to chase them down.
>>>

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Linus Torvalds

[ Bringing in the gcc plugin people and the kernel hardening list,
since it now is no longer even remotely looking like a nfsd, vfs or
filesystem issue any more ]

Kees, Emese,
 the whole thread is on lkml, but there's clearly something horribly
wrong with RANDSTRUCT, and it's not new even though it looked that way
for a while.

Patrick seems to trigger it with nfsd, so it might be specific to that.

Alternatively, it might just be that very few people run
RANDSTRUCT-built kernels, or just have been lucky with the seeding.

Sorry for top-posting, but there's not really anything in the email
itself to reply to, other than saying thanks to Patrick for narrowing
it down like this.

It would have been very interesting if it had actually bisected to
something, but it seems that the real issue is just the choice of
seeding for RANDSTRUCT.

 Linus

On Fri, Nov 10, 2017 at 4:27 PM, Patrick McLean  wrote:
> On 2017-11-10 03:26 PM, Patrick McLean wrote:
>> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
>>>
>>> I really don't see anything that looks even half-way suspicious in
>>> that 4.13.8..11 range. But as mentioned, compiler interactions can be
>>> _really_ subtle.
>>>
>>> And hey, it can be a real kernel bug too, that just happens to be
>>> exposed by RANDSTRUCT, so a bisect really would be very nice.
>>
>> I am working on bisecting the issue now, but I think I have some more
>> evidence pointing to a compiler issue related to RANDSTRUCT. There are
>> actually 3 issues that we have seen. Sometimes we get the null pointer
>> deref in the initial message, sometimes we get the GPF, and sometimes we
>> see an issue where the NFS clients see all files as root-owned
>> directories. Any given kernel will always see the same issue, but after
>> a "make mrproper" and recompile (with the same .config), the issue will
>> often change. I suspect that all 3 of these problems are actually the
>> same issue manifesting itself in different ways depending on what seed
>> the RANDSTRUCT gcc plugin is using.
>
> Further update on this, using the same seed for RANDSTRUCT, I have
> reproduced this issue on v4.13.0, so it does not seem to be recently
> introduced. The older kernel apparently only worked for us because we
> were lucky. Generally we always compile new kernels from a fresh tree,
> so they are never using the same seed.
>
> In case someone wants to play with this, here are some interesting seeds
> (in include/generated/randomize_layout_hash.h):
>
> Produce a NULL pointer dereference (though I am not sure what the client
> does to produce this).
> 5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc
>
> All files for nfsd4 clients appear as directories owned as root, no
> matter the real owner (this happens for all clients we have tested):
> 3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e
>
> This is the seed that was breaking motherboards (make sure you have a
> way to flash the BIOS with this one):
> 3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd
>
> Finally, here is a seed that produces a kernel that does not exhibit any
> problems we are aware of:
> e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b
>
>>>
>>> Because in the end, compiler bugs are very rare. They are particularly
>>> annoying when they do happen, though, so they loom big in the mind of
>>> people who have had to chase them down.
>>>

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Al Viro

On Fri, Nov 10, 2017 at 08:13:06PM -0500, J. Bruce Fields wrote:
> On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote:
> > 
> > 
> > On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> > > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  
> > > wrote:
> > >>
> > >> Something must have changed since 4.13.8 to trigger this though.
> > > 
> > > Arnd pointed to some commits that might be relevant for the cp210x
> > > module, but those are all already in 4.13.8, so if 4.13.8 really is
> > > rock solid for you, I don't think that's it.
> > > 
> > > I really don't see anything that looks even half-way suspicious in
> > > that 4.13.8..11 range. But as mentioned, compiler interactions can be
> > > _really_ subtle.
> > > 
> > > And hey, it can be a real kernel bug too, that just happens to be
> > > exposed by RANDSTRUCT, so a bisect really would be very nice.
> > 
> > I am working on bisecting the issue now, but I think I have some more
> > evidence pointing to a compiler issue related to RANDSTRUCT. There are
> > actually 3 issues that we have seen. Sometimes we get the null pointer
> > deref in the initial message, sometimes we get the GPF, and sometimes we
> > see an issue where the NFS clients see all files as root-owned
> > directories.
> 
> That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in
> the stat structure that nfsd passed to vfs_getattr().
> 
> No idea what sort of information is useful when tracking down this kind
> of bug, but you could also run wireshark and take a look at the server's
> GETATTR replies to see if there's some other corruption.

FWIW, having looked at some of the __bugger_layout users...  Compiler bugs
aside,
* use in struct {dentry,inode,mount,block_device} has to go - cache
use patterns at hash lookups are _not_ something to play with like that.
* struct file_lock and struct super_block - ditto, only it's not
hash lookups that hurt here.  struct vm_area_struct, while we are at it.
* struct group_info - Cthulhu's pus-leaking warts, what's the point
randomizing _that_?  No, really - here's the damn thing in all its glory:
struct group_info {
atomic_tusage;
int ngroups;
kgid_t  gid[0];
} __randomize_layout;
I really hope that plugin does *not* try to move the ->gid[] anywhere...
Which leaves us a choice between putting ->usage first or second.  Sure,
every bit helps, but... even for security theatre that looks a bit too
pathetic.
* struct vfsmount.  Wow.  All of log2(3!) bits.  Congratulations.
At least that's better than struct path.  Oh, wait - they'd done struct path
as well...

What the hell had they been doing?  Muscarine old-fashioned way?  Looks like
a mix of pointless and truly dangerous.  And then there are compiler bugs and
the charming effect on reproducibility...

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Al Viro

On Fri, Nov 10, 2017 at 08:13:06PM -0500, J. Bruce Fields wrote:
> On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote:
> > 
> > 
> > On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> > > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  
> > > wrote:
> > >>
> > >> Something must have changed since 4.13.8 to trigger this though.
> > > 
> > > Arnd pointed to some commits that might be relevant for the cp210x
> > > module, but those are all already in 4.13.8, so if 4.13.8 really is
> > > rock solid for you, I don't think that's it.
> > > 
> > > I really don't see anything that looks even half-way suspicious in
> > > that 4.13.8..11 range. But as mentioned, compiler interactions can be
> > > _really_ subtle.
> > > 
> > > And hey, it can be a real kernel bug too, that just happens to be
> > > exposed by RANDSTRUCT, so a bisect really would be very nice.
> > 
> > I am working on bisecting the issue now, but I think I have some more
> > evidence pointing to a compiler issue related to RANDSTRUCT. There are
> > actually 3 issues that we have seen. Sometimes we get the null pointer
> > deref in the initial message, sometimes we get the GPF, and sometimes we
> > see an issue where the NFS clients see all files as root-owned
> > directories.
> 
> That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in
> the stat structure that nfsd passed to vfs_getattr().
> 
> No idea what sort of information is useful when tracking down this kind
> of bug, but you could also run wireshark and take a look at the server's
> GETATTR replies to see if there's some other corruption.

FWIW, having looked at some of the __bugger_layout users...  Compiler bugs
aside,
* use in struct {dentry,inode,mount,block_device} has to go - cache
use patterns at hash lookups are _not_ something to play with like that.
* struct file_lock and struct super_block - ditto, only it's not
hash lookups that hurt here.  struct vm_area_struct, while we are at it.
* struct group_info - Cthulhu's pus-leaking warts, what's the point
randomizing _that_?  No, really - here's the damn thing in all its glory:
struct group_info {
atomic_tusage;
int ngroups;
kgid_t  gid[0];
} __randomize_layout;
I really hope that plugin does *not* try to move the ->gid[] anywhere...
Which leaves us a choice between putting ->usage first or second.  Sure,
every bit helps, but... even for security theatre that looks a bit too
pathetic.
* struct vfsmount.  Wow.  All of log2(3!) bits.  Congratulations.
At least that's better than struct path.  Oh, wait - they'd done struct path
as well...

What the hell had they been doing?  Muscarine old-fashioned way?  Looks like
a mix of pointless and truly dangerous.  And then there are compiler bugs and
the charming effect on reproducibility...

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread J. Bruce Fields

On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote:
> 
> 
> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
> >>
> >> Something must have changed since 4.13.8 to trigger this though.
> > 
> > Arnd pointed to some commits that might be relevant for the cp210x
> > module, but those are all already in 4.13.8, so if 4.13.8 really is
> > rock solid for you, I don't think that's it.
> > 
> > I really don't see anything that looks even half-way suspicious in
> > that 4.13.8..11 range. But as mentioned, compiler interactions can be
> > _really_ subtle.
> > 
> > And hey, it can be a real kernel bug too, that just happens to be
> > exposed by RANDSTRUCT, so a bisect really would be very nice.
> 
> I am working on bisecting the issue now, but I think I have some more
> evidence pointing to a compiler issue related to RANDSTRUCT. There are
> actually 3 issues that we have seen. Sometimes we get the null pointer
> deref in the initial message, sometimes we get the GPF, and sometimes we
> see an issue where the NFS clients see all files as root-owned
> directories.

That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in
the stat structure that nfsd passed to vfs_getattr().

No idea what sort of information is useful when tracking down this kind
of bug, but you could also run wireshark and take a look at the server's
GETATTR replies to see if there's some other corruption.

--b.

> Any given kernel will always see the same issue, but after
> a "make mrproper" and recompile (with the same .config), the issue will
> often change. I suspect that all 3 of these problems are actually the
> same issue manifesting itself in different ways depending on what seed
> the RANDSTRUCT gcc plugin is using.
> 
> > 
> > Because in the end, compiler bugs are very rare. They are particularly
> > annoying when they do happen, though, so they loom big in the mind of
> > people who have had to chase them down.
> >

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread J. Bruce Fields

On Fri, Nov 10, 2017 at 03:26:27PM -0800, Patrick McLean wrote:
> 
> 
> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> > On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
> >>
> >> Something must have changed since 4.13.8 to trigger this though.
> > 
> > Arnd pointed to some commits that might be relevant for the cp210x
> > module, but those are all already in 4.13.8, so if 4.13.8 really is
> > rock solid for you, I don't think that's it.
> > 
> > I really don't see anything that looks even half-way suspicious in
> > that 4.13.8..11 range. But as mentioned, compiler interactions can be
> > _really_ subtle.
> > 
> > And hey, it can be a real kernel bug too, that just happens to be
> > exposed by RANDSTRUCT, so a bisect really would be very nice.
> 
> I am working on bisecting the issue now, but I think I have some more
> evidence pointing to a compiler issue related to RANDSTRUCT. There are
> actually 3 issues that we have seen. Sometimes we get the null pointer
> deref in the initial message, sometimes we get the GPF, and sometimes we
> see an issue where the NFS clients see all files as root-owned
> directories.

That suggests that stat.uid is 0 and stat.mode & S_IFMT is 004 in
the stat structure that nfsd passed to vfs_getattr().

No idea what sort of information is useful when tracking down this kind
of bug, but you could also run wireshark and take a look at the server's
GETATTR replies to see if there's some other corruption.

--b.

> Any given kernel will always see the same issue, but after
> a "make mrproper" and recompile (with the same .config), the issue will
> often change. I suspect that all 3 of these problems are actually the
> same issue manifesting itself in different ways depending on what seed
> the RANDSTRUCT gcc plugin is using.
> 
> > 
> > Because in the end, compiler bugs are very rare. They are particularly
> > annoying when they do happen, though, so they loom big in the mind of
> > people who have had to chase them down.
> >

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Patrick McLean

On 2017-11-10 03:26 PM, Patrick McLean wrote:
> 
> 
> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
>> On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
>>>
>>> Something must have changed since 4.13.8 to trigger this though.
>>
>> Arnd pointed to some commits that might be relevant for the cp210x
>> module, but those are all already in 4.13.8, so if 4.13.8 really is
>> rock solid for you, I don't think that's it.
>>
>> I really don't see anything that looks even half-way suspicious in
>> that 4.13.8..11 range. But as mentioned, compiler interactions can be
>> _really_ subtle.
>>
>> And hey, it can be a real kernel bug too, that just happens to be
>> exposed by RANDSTRUCT, so a bisect really would be very nice.
> 
> I am working on bisecting the issue now, but I think I have some more
> evidence pointing to a compiler issue related to RANDSTRUCT. There are
> actually 3 issues that we have seen. Sometimes we get the null pointer
> deref in the initial message, sometimes we get the GPF, and sometimes we
> see an issue where the NFS clients see all files as root-owned
> directories. Any given kernel will always see the same issue, but after
> a "make mrproper" and recompile (with the same .config), the issue will
> often change. I suspect that all 3 of these problems are actually the
> same issue manifesting itself in different ways depending on what seed
> the RANDSTRUCT gcc plugin is using.
> 

Further update on this, using the same seed for RANDSTRUCT, I have
reproduced this issue on v4.13.0, so it does not seem to be recently
introduced. The older kernel apparently only worked for us because we
were lucky. Generally we always compile new kernels from a fresh tree,
so they are never using the same seed.

In case someone wants to play with this, here are some interesting seeds
(in include/generated/randomize_layout_hash.h):

Produce a NULL pointer dereference (though I am not sure what the client
does to produce this).
5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc

All files for nfsd4 clients appear as directories owned as root, no
matter the real owner (this happens for all clients we have tested):
3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e

This is the seed that was breaking motherboards (make sure you have a
way to flash the BIOS with this one):
3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd

Finally, here is a seed that produces a kernel that does not exhibit any
problems we are aware of:
e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b

>>
>> Because in the end, compiler bugs are very rare. They are particularly
>> annoying when they do happen, though, so they loom big in the mind of
>> people who have had to chase them down.
>>

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Patrick McLean

On 2017-11-10 03:26 PM, Patrick McLean wrote:
> 
> 
> On 2017-11-10 10:42 AM, Linus Torvalds wrote:
>> On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
>>>
>>> Something must have changed since 4.13.8 to trigger this though.
>>
>> Arnd pointed to some commits that might be relevant for the cp210x
>> module, but those are all already in 4.13.8, so if 4.13.8 really is
>> rock solid for you, I don't think that's it.
>>
>> I really don't see anything that looks even half-way suspicious in
>> that 4.13.8..11 range. But as mentioned, compiler interactions can be
>> _really_ subtle.
>>
>> And hey, it can be a real kernel bug too, that just happens to be
>> exposed by RANDSTRUCT, so a bisect really would be very nice.
> 
> I am working on bisecting the issue now, but I think I have some more
> evidence pointing to a compiler issue related to RANDSTRUCT. There are
> actually 3 issues that we have seen. Sometimes we get the null pointer
> deref in the initial message, sometimes we get the GPF, and sometimes we
> see an issue where the NFS clients see all files as root-owned
> directories. Any given kernel will always see the same issue, but after
> a "make mrproper" and recompile (with the same .config), the issue will
> often change. I suspect that all 3 of these problems are actually the
> same issue manifesting itself in different ways depending on what seed
> the RANDSTRUCT gcc plugin is using.
> 

Further update on this, using the same seed for RANDSTRUCT, I have
reproduced this issue on v4.13.0, so it does not seem to be recently
introduced. The older kernel apparently only worked for us because we
were lucky. Generally we always compile new kernels from a fresh tree,
so they are never using the same seed.

In case someone wants to play with this, here are some interesting seeds
(in include/generated/randomize_layout_hash.h):

Produce a NULL pointer dereference (though I am not sure what the client
does to produce this).
5970d6494d0f4236ec57147a46e700f4f501536236d96f6f68ea223e06a258bc

All files for nfsd4 clients appear as directories owned as root, no
matter the real owner (this happens for all clients we have tested):
3f158cd1014800ce5eb6c1f532ac64f2357fdb9a684096557d2fbb1d281f325e

This is the seed that was breaking motherboards (make sure you have a
way to flash the BIOS with this one):
3e32f2d1b4a3dde9f2fd95151386cd1d5bd6167597a0b868f6273aabfc5712dd

Finally, here is a seed that produces a kernel that does not exhibit any
problems we are aware of:
e8698c12137fcd1dcbff6d1ed97e5d766128447a27ce9f9d61e0cb8c05ad4d3b

>>
>> Because in the end, compiler bugs are very rare. They are particularly
>> annoying when they do happen, though, so they loom big in the mind of
>> people who have had to chase them down.
>>

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Patrick McLean

On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
>>
>> Something must have changed since 4.13.8 to trigger this though.
> 
> Arnd pointed to some commits that might be relevant for the cp210x
> module, but those are all already in 4.13.8, so if 4.13.8 really is
> rock solid for you, I don't think that's it.
> 
> I really don't see anything that looks even half-way suspicious in
> that 4.13.8..11 range. But as mentioned, compiler interactions can be
> _really_ subtle.
> 
> And hey, it can be a real kernel bug too, that just happens to be
> exposed by RANDSTRUCT, so a bisect really would be very nice.

I am working on bisecting the issue now, but I think I have some more
evidence pointing to a compiler issue related to RANDSTRUCT. There are
actually 3 issues that we have seen. Sometimes we get the null pointer
deref in the initial message, sometimes we get the GPF, and sometimes we
see an issue where the NFS clients see all files as root-owned
directories. Any given kernel will always see the same issue, but after
a "make mrproper" and recompile (with the same .config), the issue will
often change. I suspect that all 3 of these problems are actually the
same issue manifesting itself in different ways depending on what seed
the RANDSTRUCT gcc plugin is using.

> 
> Because in the end, compiler bugs are very rare. They are particularly
> annoying when they do happen, though, so they loom big in the mind of
> people who have had to chase them down.
>

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Patrick McLean

On 2017-11-10 10:42 AM, Linus Torvalds wrote:
> On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
>>
>> Something must have changed since 4.13.8 to trigger this though.
> 
> Arnd pointed to some commits that might be relevant for the cp210x
> module, but those are all already in 4.13.8, so if 4.13.8 really is
> rock solid for you, I don't think that's it.
> 
> I really don't see anything that looks even half-way suspicious in
> that 4.13.8..11 range. But as mentioned, compiler interactions can be
> _really_ subtle.
> 
> And hey, it can be a real kernel bug too, that just happens to be
> exposed by RANDSTRUCT, so a bisect really would be very nice.

I am working on bisecting the issue now, but I think I have some more
evidence pointing to a compiler issue related to RANDSTRUCT. There are
actually 3 issues that we have seen. Sometimes we get the null pointer
deref in the initial message, sometimes we get the GPF, and sometimes we
see an issue where the NFS clients see all files as root-owned
directories. Any given kernel will always see the same issue, but after
a "make mrproper" and recompile (with the same .config), the issue will
often change. I suspect that all 3 of these problems are actually the
same issue manifesting itself in different ways depending on what seed
the RANDSTRUCT gcc plugin is using.

> 
> Because in the end, compiler bugs are very rare. They are particularly
> annoying when they do happen, though, so they loom big in the mind of
> people who have had to chase them down.
>

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Linus Torvalds

On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
>
> Something must have changed since 4.13.8 to trigger this though.

Well, yes and no.

Obviously something changed, but it doesn't necessarily have to be
anything particular.

Almost every time we've seen compiler bugs, it's been an innocuous
change that just happened to trigger a latent issue. Pretty much by
definition compiler bugs tend to be about rare situations, so it's
some odd special case that triggers.

Since it's apparently fairly repeatable for you, a bisection between
4.13.8 and 4.13.11 would be very interesting, and shouldn't take all
that long. There's only 142 commits in that range, so even just a
partial bisection of say four of five rounds should narrow it down to
just a couple of commits. And even a full bisection should only take
something like 8 build/test cycles.

Arnd pointed to some commits that might be relevant for the cp210x
module, but those are all already in 4.13.8, so if 4.13.8 really is
rock solid for you, I don't think that's it.

I really don't see anything that looks even half-way suspicious in
that 4.13.8..11 range. But as mentioned, compiler interactions can be
_really_ subtle.

And hey, it can be a real kernel bug too, that just happens to be
exposed by RANDSTRUCT, so a bisect really would be very nice.

Because in the end, compiler bugs are very rare. They are particularly
annoying when they do happen, though, so they loom big in the mind of
people who have had to chase them down.

 Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Linus Torvalds

On Thu, Nov 9, 2017 at 5:58 PM, Patrick McLean  wrote:
>
> Something must have changed since 4.13.8 to trigger this though.

Well, yes and no.

Obviously something changed, but it doesn't necessarily have to be
anything particular.

Almost every time we've seen compiler bugs, it's been an innocuous
change that just happened to trigger a latent issue. Pretty much by
definition compiler bugs tend to be about rare situations, so it's
some odd special case that triggers.

Since it's apparently fairly repeatable for you, a bisection between
4.13.8 and 4.13.11 would be very interesting, and shouldn't take all
that long. There's only 142 commits in that range, so even just a
partial bisection of say four of five rounds should narrow it down to
just a couple of commits. And even a full bisection should only take
something like 8 build/test cycles.

Arnd pointed to some commits that might be relevant for the cp210x
module, but those are all already in 4.13.8, so if 4.13.8 really is
rock solid for you, I don't think that's it.

I really don't see anything that looks even half-way suspicious in
that 4.13.8..11 range. But as mentioned, compiler interactions can be
_really_ subtle.

And hey, it can be a real kernel bug too, that just happens to be
exposed by RANDSTRUCT, so a bisect really would be very nice.

Because in the end, compiler bugs are very rare. They are particularly
annoying when they do happen, though, so they loom big in the mind of
people who have had to chase them down.

 Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Arnd Bergmann

On Fri, Nov 10, 2017 at 2:58 AM, Patrick McLean  wrote:
> On 2017-11-09 12:04 PM, Linus Torvalds wrote:
>> On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean  wrote:

>
> We will check our fork against the in-kernel cp201x driver to make sure
> we didn't miss anything, but it seems odd we would be hitting the issue
> so consistently in the NFS code path, rather than somewhere in USB,
> serial, or GPIO paths.
>
>> So since you seem to be able to reproduce this _reasonably_ easily,
>> it's definitely worth checking that it still reproduces even without
>> the gcc plugins.
>
> I haven't been able to reproduce it with RANDSTRUCT disabled (and
> structleak enabled). I will keep trying for a little while more, but
> evidence seems to be pointing to that.
>
> Something must have changed since 4.13.8 to trigger this though. This
> did not crop up at all until we tried 4.13.11, where it we saw it pretty
> quickly. We have a pretty large number of machines running 4.13.6 with
> RANDSTRUCT enabled and running a the same workload with many more
> clients, and have not seen this bug at all.

I couldn't find anything overly suspicious between 4.13.8 and 4.13.11,
see the full list of commits since 3.14.6 at https://pastebin.com/AcxBZR7H

The ones I couldn't immediately rule out (but no smoking gun either) would be:

9970679f497a x86/cpu/AMD: Apply the Erratum 688 fix when the BIOS doesn't
ca6711747c5a assoc_array: Fix a buggy node-splitting case
2fbb8bf749b5 xfs: move two more RT specific functions into CONFIG_XFS_RT
1e1427356d8d xfs: trim writepage mapping to within eof
9df9b634f637 xfs: cancel dirty pages on invalidation
cd3f0bee1b94 xfs: handle error if xfs_btree_get_bufs fails
58cfca25f540 xfs: reinit btree pointer on attr tree inactivation walk
659a9989b68b xfs: don't change inode mode if ACL update fails
88ccd3b6884a xfs: move more RT specific code under CONFIG_XFS_RT
5733ebee586c xfs: Don't log uninitialised fields in inode structures
199a7448c097 xfs: handle racy AIO in xfs_reflink_end_cow
ee5d69c908a1 xfs: always swap the cow forks when swapping extents
2888145444f1 xfs: Capture state of the right inode in xfs_iflush_done
d0fa252b207f xfs: perag initialization should only touch
m_ag_max_usable for AG 0
8da6f7fbe43c xfs: update i_size after unwritten conversion in dio completion
a9eac76e958b xfs: report zeroed or not correctly in xfs_zero_range()
67d51bdcc9f4 fs/xfs: Use %pS printk format for direct addresses
2bf3122f2130 xfs: evict CoW fork extents when performing finsert/fcollapse
a58a0826656d xfs: don't unconditionally clear the reflink flag on
zero-block files
c61e905e0ee2 iomap_dio_rw: Allocate AIO completion queue before submitting dio
7610595830bb pkcs7: Prevent NULL pointer dereference, since sinfo is
not always set.
24a33a0c96f3 KEYS: don't let add_key() update an uninstantiated key
ad4aa448c9b2 FS-Cache: fix dereference of NULL user_key_payload
f45b8fe12221 KEYS: Fix race between updating and finding a negative key
e56be12012c2 ecryptfs: fix dereference of NULL user_key_payload
363ce0b01fe0 fscrypt: fix dereference of NULL user_key_payload
cc757d55c903 lib/digsig: fix dereference of NULL user_key_payload
f5e97214207f x86/microcode/intel: Disable late loading on model 79
7b5e405b7878 Revert "tools/power turbostat: stop migrating, unless '-m'"
8b1e10789c84 KEYS: encrypted: fix dereference of NULL user_key_payload
a258a35a9930 mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE
outside of lock
e47a56cbf519 usb: xhci: Handle error condition in xhci_stop_device()
d53911e63388 usb: xhci: Reset halted endpoint if trb is noop
d1120fe38b3f xhci: Cleanup current_cmd in xhci_cleanup_command_queue()
301d332138d2 xhci: Identify USB 3.1 capable hosts by their port
protocol capability
015e94ead900 usb: hub: Allow reset retry for USB2 devices on connect bounce
1916547b28bd usb: quirks: add quirk for WORLDE MINI MIDI keyboard
e3a038930502 usb: cdc_acm: Add quirk for Elatec TWN3
c2110c8dea7a USB: serial: metro-usb: add MS7820 device id
775462fd5c53 USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor()
a9fdf6354267 USB: devio: Revert "USB: devio: Don't corrupt user memory"

However, you mentioned cp210x, and I noticed related changes in 4.13.8:

e21045a22395 USB: serial: console: fix use-after-free after failed setup
6c7cb458405e USB: serial: console: fix use-after-free on disconnect
4b3e3c7282d6 USB: serial: qcserial: add Dell DW5818, DW5819
c796da1d110f USB: serial: option: add support for TP-Link LTE module
e7e0b4b39663 USB: serial: cp210x: add support for ELV TFD500
1ae2c690f967 USB: serial: cp210x: fix partnum regression
78a02c93648e USB: serial: ftdi_sio: add id for Cypress WICED dev board

You could try reverting those seven, this could point to your forked driver
if it makes a difference.

   Arnd

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-10 Thread Arnd Bergmann

On Fri, Nov 10, 2017 at 2:58 AM, Patrick McLean  wrote:
> On 2017-11-09 12:04 PM, Linus Torvalds wrote:
>> On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean  wrote:

>
> We will check our fork against the in-kernel cp201x driver to make sure
> we didn't miss anything, but it seems odd we would be hitting the issue
> so consistently in the NFS code path, rather than somewhere in USB,
> serial, or GPIO paths.
>
>> So since you seem to be able to reproduce this _reasonably_ easily,
>> it's definitely worth checking that it still reproduces even without
>> the gcc plugins.
>
> I haven't been able to reproduce it with RANDSTRUCT disabled (and
> structleak enabled). I will keep trying for a little while more, but
> evidence seems to be pointing to that.
>
> Something must have changed since 4.13.8 to trigger this though. This
> did not crop up at all until we tried 4.13.11, where it we saw it pretty
> quickly. We have a pretty large number of machines running 4.13.6 with
> RANDSTRUCT enabled and running a the same workload with many more
> clients, and have not seen this bug at all.

I couldn't find anything overly suspicious between 4.13.8 and 4.13.11,
see the full list of commits since 3.14.6 at https://pastebin.com/AcxBZR7H

The ones I couldn't immediately rule out (but no smoking gun either) would be:

9970679f497a x86/cpu/AMD: Apply the Erratum 688 fix when the BIOS doesn't
ca6711747c5a assoc_array: Fix a buggy node-splitting case
2fbb8bf749b5 xfs: move two more RT specific functions into CONFIG_XFS_RT
1e1427356d8d xfs: trim writepage mapping to within eof
9df9b634f637 xfs: cancel dirty pages on invalidation
cd3f0bee1b94 xfs: handle error if xfs_btree_get_bufs fails
58cfca25f540 xfs: reinit btree pointer on attr tree inactivation walk
659a9989b68b xfs: don't change inode mode if ACL update fails
88ccd3b6884a xfs: move more RT specific code under CONFIG_XFS_RT
5733ebee586c xfs: Don't log uninitialised fields in inode structures
199a7448c097 xfs: handle racy AIO in xfs_reflink_end_cow
ee5d69c908a1 xfs: always swap the cow forks when swapping extents
2888145444f1 xfs: Capture state of the right inode in xfs_iflush_done
d0fa252b207f xfs: perag initialization should only touch
m_ag_max_usable for AG 0
8da6f7fbe43c xfs: update i_size after unwritten conversion in dio completion
a9eac76e958b xfs: report zeroed or not correctly in xfs_zero_range()
67d51bdcc9f4 fs/xfs: Use %pS printk format for direct addresses
2bf3122f2130 xfs: evict CoW fork extents when performing finsert/fcollapse
a58a0826656d xfs: don't unconditionally clear the reflink flag on
zero-block files
c61e905e0ee2 iomap_dio_rw: Allocate AIO completion queue before submitting dio
7610595830bb pkcs7: Prevent NULL pointer dereference, since sinfo is
not always set.
24a33a0c96f3 KEYS: don't let add_key() update an uninstantiated key
ad4aa448c9b2 FS-Cache: fix dereference of NULL user_key_payload
f45b8fe12221 KEYS: Fix race between updating and finding a negative key
e56be12012c2 ecryptfs: fix dereference of NULL user_key_payload
363ce0b01fe0 fscrypt: fix dereference of NULL user_key_payload
cc757d55c903 lib/digsig: fix dereference of NULL user_key_payload
f5e97214207f x86/microcode/intel: Disable late loading on model 79
7b5e405b7878 Revert "tools/power turbostat: stop migrating, unless '-m'"
8b1e10789c84 KEYS: encrypted: fix dereference of NULL user_key_payload
a258a35a9930 mm: page_vma_mapped: ensure pmd is loaded with READ_ONCE
outside of lock
e47a56cbf519 usb: xhci: Handle error condition in xhci_stop_device()
d53911e63388 usb: xhci: Reset halted endpoint if trb is noop
d1120fe38b3f xhci: Cleanup current_cmd in xhci_cleanup_command_queue()
301d332138d2 xhci: Identify USB 3.1 capable hosts by their port
protocol capability
015e94ead900 usb: hub: Allow reset retry for USB2 devices on connect bounce
1916547b28bd usb: quirks: add quirk for WORLDE MINI MIDI keyboard
e3a038930502 usb: cdc_acm: Add quirk for Elatec TWN3
c2110c8dea7a USB: serial: metro-usb: add MS7820 device id
775462fd5c53 USB: core: fix out-of-bounds access bug in usb_get_bos_descriptor()
a9fdf6354267 USB: devio: Revert "USB: devio: Don't corrupt user memory"

However, you mentioned cp210x, and I noticed related changes in 4.13.8:

e21045a22395 USB: serial: console: fix use-after-free after failed setup
6c7cb458405e USB: serial: console: fix use-after-free on disconnect
4b3e3c7282d6 USB: serial: qcserial: add Dell DW5818, DW5819
c796da1d110f USB: serial: option: add support for TP-Link LTE module
e7e0b4b39663 USB: serial: cp210x: add support for ELV TFD500
1ae2c690f967 USB: serial: cp210x: fix partnum regression
78a02c93648e USB: serial: ftdi_sio: add id for Cypress WICED dev board

You could try reverting those seven, this could point to your forked driver
if it makes a difference.

   Arnd

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean

On 2017-11-09 12:04 PM, Linus Torvalds wrote:
> On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean  wrote:
>>
>> We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and
>> CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as
>> CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.
> 
> It might be worth just verifying without RANDSTRUCT in particular.
> 
> And most obviously: if there is some module or part of the kernel that
> got compiled with a different seed for the randstruct hashing, that
> will break in nasty nasty ways. Your out-of-kernel module is the
> obvious suspect for something like that, but honestly, it could be
> some missing build dependency, or simply a missing special case in the
> plugin itself a missing __no_randomize_layout or any number of things.
> 

We will check our fork against the in-kernel cp201x driver to make sure
we didn't miss anything, but it seems odd we would be hitting the issue
so consistently in the NFS code path, rather than somewhere in USB,
serial, or GPIO paths.

> So since you seem to be able to reproduce this _reasonably_ easily,
> it's definitely worth checking that it still reproduces even without
> the gcc plugins.

I haven't been able to reproduce it with RANDSTRUCT disabled (and
structleak enabled). I will keep trying for a little while more, but
evidence seems to be pointing to that.

Something must have changed since 4.13.8 to trigger this though. This
did not crop up at all until we tried 4.13.11, where it we saw it pretty
quickly. We have a pretty large number of machines running 4.13.6 with
RANDSTRUCT enabled and running a the same workload with many more
clients, and have not seen this bug at all.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean

On 2017-11-09 12:04 PM, Linus Torvalds wrote:
> On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean  wrote:
>>
>> We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and
>> CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as
>> CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.
> 
> It might be worth just verifying without RANDSTRUCT in particular.
> 
> And most obviously: if there is some module or part of the kernel that
> got compiled with a different seed for the randstruct hashing, that
> will break in nasty nasty ways. Your out-of-kernel module is the
> obvious suspect for something like that, but honestly, it could be
> some missing build dependency, or simply a missing special case in the
> plugin itself a missing __no_randomize_layout or any number of things.
> 

We will check our fork against the in-kernel cp201x driver to make sure
we didn't miss anything, but it seems odd we would be hitting the issue
so consistently in the NFS code path, rather than somewhere in USB,
serial, or GPIO paths.

> So since you seem to be able to reproduce this _reasonably_ easily,
> it's definitely worth checking that it still reproduces even without
> the gcc plugins.

I haven't been able to reproduce it with RANDSTRUCT disabled (and
structleak enabled). I will keep trying for a little while more, but
evidence seems to be pointing to that.

Something must have changed since 4.13.8 to trigger this though. This
did not crop up at all until we tried 4.13.11, where it we saw it pretty
quickly. We have a pretty large number of machines running 4.13.6 with
RANDSTRUCT enabled and running a the same workload with many more
clients, and have not seen this bug at all.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-09 11:51 AM, Patrick McLean wrote:
> On 2017-11-09 11:37 AM, Al Viro wrote:
>> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
>>
 Here is the BUG we are getting:
> [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
> 0230
> [   58.963918] IP: vfs_statfs+0x73/0xb0
>>>
>>> The code disassembles to
>>
>>>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
>>
>>> that matters (and that traps) but I'm almost certain that it's the
>>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
>>> when it then does
>>>
>>>  flags_by_sb(mnt->mnt_sb->s_flags);
>>>
>>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
>>> NULL, because we wouldn't have gotten this far if it was.
>>>
>>
>> All instances of struct dentry are created by __d_alloc()[*], which assigns
>> ->d_sb (never to be modified afterwards) *and* dereferences the pointer
>> it has stored in ->d_sb before the created struct dentry becomes visible
>> to anyone else.  No struct dentry should ever be observed with NULL ->d_sb;
>> the only way to get that is memory corruption or looking at freed instance
>> after its memory has been reused for something else and zeroed.
>>
>> In other words, we should never observe a struct mount with NULL 
>> ->mnt.mnt_sb -
>> not without memory corruption or looking at freed instance.
>>
>> The pointer in that case should've come from exp->ex_path.mnt, exp being
>> the argument of nfsd4_encode_fattr().  Sure, it might have been a dangling
>> reference.  However, it looks a lot more like a memory corruptor *OR*
>> miscompiled kernel.
>>
>> What kind of load do the reproducer boxen have and how fast does that
>> bug trigger?  Would it be possible to slap something like
>>  if (unlikely(!exp->exp_path.mnt->mnt_sb)) {
>>  struct mount *m = real_mount(exp->exp_path.mnt);
>>  printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt);
>>  printk(KERN_ERR "name: [%s]\n", m->mnt_devname);
>>  printk(KERN_ERR "ns: [%p]\n", m->mnt_ns);
>>  printk(KERN_ERR "parent: [%p]\n", m->mnt_parent);
>>  WARN_ON(1);
>>  err = -EINVAL;
>>  goto out_nfserr;
>>  }
>> in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added
>> in fs/nfsd/nfs4xdr.c) and see what will it catch?
>>
>> Both with and without randomized structs, if possible - I might be barking
>> at the wrong tree, but IMO the very first step in localizing that crap is
>> to find out whether it's toolchain-related or not.
> 

That condition did not seem to trigger, and I am getting a slightly
different crash message (GPF rather than null pointer dereference). Here
is the dump from the latest crash (with CONFIG_GCC_PLUGIN_STRUCTLEAK,
CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and CONFIG_GCC_PLUGIN_RANDSTRUCT
all enabled).

> [   36.834232] general protection fault:  [#1] SMP
> [   36.835168] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
> xt_multiport xt_addrtype iptable_mangle iptable_raw iptable_nat nf_nat_ipv4 
> nf_nat gkuart(O) usbserial x86_pkg_temp_thermal ie31200_edac tpm_tis 
> ipmi_ssif tpm_tis_core ext4 mbcache jbd2 e1000e crc32c_intel
> [   36.839120] CPU: 1 PID: 3969 Comm: nfsd Tainted: G   O
> 4.14.0-rc8-git-kratos-1-00053-gd93d4ce103fd-dirty #1
> [   36.840883] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
> [   36.841892] task: 88040a0b1c80 task.stack: c900027bc000
> [   36.842887] RIP: 0010:vfs_statfs+0x73/0xb0
> [   36.843728] RSP: 0018:c900027bfb30 EFLAGS: 00010202
> [   36.844687] RAX:  RBX: c900027bfbf8 RCX: 
> 180d
> [   36.845891] RDX: 080d RSI: 0020 RDI: 
> e2006d6574737973
> [   36.847075] RBP: c900027bfbc8 R08:  R09: 
> 00ff
> [   36.848175] R10: 0038be3a R11: 88040b687578 R12: 
> 
> [   36.849260] R13: 88040d7dc400 R14: 88040d38b000 R15: 
> c900027bfbf8
> [   36.850347] FS:  () GS:88041fc4() 
> knlGS:
> [   36.851891] CS:  0010 DS:  ES:  CR0: 80050033
> [   36.852873] CR2: 7f049228edc0 CR3: 01e0a004 CR4: 
> 001606e0
> [   36.853942] Call Trace:
> [   36.854667]  nfsd4_encode_fattr+0x34e/0x23b0
> [   36.855578]  ? ext4_get_acl+0x1b2/0x260 [ext4]
> [   36.856485]  ? get_acl+0x7a/0xf0
> [   36.857266]  ? generic_permission+0x125/0x1a0
> [   36.858150]  nfsd4_encode_getattr+0x25/0x30
> [   36.859002]  nfsd4_encode_operation+0x98/0x1a0
> [   36.859889]  nfsd4_proc_compound+0x3eb/0x5c0
> [   36.860736]  nfsd_dispatch+0xa8/0x230
> [   36.861538]  svc_process_common+0x347/0x640
> [   36.862383]  svc_process+0x100/0x1b0
> [   36.863204]  nfsd+0xe0/0x150
> [   36.863984]  kthread+0xfc/0x130
> [   36.864781]  ? nfsd_destroy+0x50/0x50
> [   36.865624]  ?

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-09 11:51 AM, Patrick McLean wrote:
> On 2017-11-09 11:37 AM, Al Viro wrote:
>> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
>>
 Here is the BUG we are getting:
> [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
> 0230
> [   58.963918] IP: vfs_statfs+0x73/0xb0
>>>
>>> The code disassembles to
>>
>>>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
>>
>>> that matters (and that traps) but I'm almost certain that it's the
>>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
>>> when it then does
>>>
>>>  flags_by_sb(mnt->mnt_sb->s_flags);
>>>
>>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
>>> NULL, because we wouldn't have gotten this far if it was.
>>>
>>
>> All instances of struct dentry are created by __d_alloc()[*], which assigns
>> ->d_sb (never to be modified afterwards) *and* dereferences the pointer
>> it has stored in ->d_sb before the created struct dentry becomes visible
>> to anyone else.  No struct dentry should ever be observed with NULL ->d_sb;
>> the only way to get that is memory corruption or looking at freed instance
>> after its memory has been reused for something else and zeroed.
>>
>> In other words, we should never observe a struct mount with NULL 
>> ->mnt.mnt_sb -
>> not without memory corruption or looking at freed instance.
>>
>> The pointer in that case should've come from exp->ex_path.mnt, exp being
>> the argument of nfsd4_encode_fattr().  Sure, it might have been a dangling
>> reference.  However, it looks a lot more like a memory corruptor *OR*
>> miscompiled kernel.
>>
>> What kind of load do the reproducer boxen have and how fast does that
>> bug trigger?  Would it be possible to slap something like
>>  if (unlikely(!exp->exp_path.mnt->mnt_sb)) {
>>  struct mount *m = real_mount(exp->exp_path.mnt);
>>  printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt);
>>  printk(KERN_ERR "name: [%s]\n", m->mnt_devname);
>>  printk(KERN_ERR "ns: [%p]\n", m->mnt_ns);
>>  printk(KERN_ERR "parent: [%p]\n", m->mnt_parent);
>>  WARN_ON(1);
>>  err = -EINVAL;
>>  goto out_nfserr;
>>  }
>> in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added
>> in fs/nfsd/nfs4xdr.c) and see what will it catch?
>>
>> Both with and without randomized structs, if possible - I might be barking
>> at the wrong tree, but IMO the very first step in localizing that crap is
>> to find out whether it's toolchain-related or not.
> 

That condition did not seem to trigger, and I am getting a slightly
different crash message (GPF rather than null pointer dereference). Here
is the dump from the latest crash (with CONFIG_GCC_PLUGIN_STRUCTLEAK,
CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL and CONFIG_GCC_PLUGIN_RANDSTRUCT
all enabled).

> [   36.834232] general protection fault:  [#1] SMP
> [   36.835168] Modules linked in: ipt_MASQUERADE nf_nat_masquerade_ipv4 
> xt_multiport xt_addrtype iptable_mangle iptable_raw iptable_nat nf_nat_ipv4 
> nf_nat gkuart(O) usbserial x86_pkg_temp_thermal ie31200_edac tpm_tis 
> ipmi_ssif tpm_tis_core ext4 mbcache jbd2 e1000e crc32c_intel
> [   36.839120] CPU: 1 PID: 3969 Comm: nfsd Tainted: G   O
> 4.14.0-rc8-git-kratos-1-00053-gd93d4ce103fd-dirty #1
> [   36.840883] Hardware name: TYAN S5510/S5510, BIOS V2.02 03/12/2013
> [   36.841892] task: 88040a0b1c80 task.stack: c900027bc000
> [   36.842887] RIP: 0010:vfs_statfs+0x73/0xb0
> [   36.843728] RSP: 0018:c900027bfb30 EFLAGS: 00010202
> [   36.844687] RAX:  RBX: c900027bfbf8 RCX: 
> 180d
> [   36.845891] RDX: 080d RSI: 0020 RDI: 
> e2006d6574737973
> [   36.847075] RBP: c900027bfbc8 R08:  R09: 
> 00ff
> [   36.848175] R10: 0038be3a R11: 88040b687578 R12: 
> 
> [   36.849260] R13: 88040d7dc400 R14: 88040d38b000 R15: 
> c900027bfbf8
> [   36.850347] FS:  () GS:88041fc4() 
> knlGS:
> [   36.851891] CS:  0010 DS:  ES:  CR0: 80050033
> [   36.852873] CR2: 7f049228edc0 CR3: 01e0a004 CR4: 
> 001606e0
> [   36.853942] Call Trace:
> [   36.854667]  nfsd4_encode_fattr+0x34e/0x23b0
> [   36.855578]  ? ext4_get_acl+0x1b2/0x260 [ext4]
> [   36.856485]  ? get_acl+0x7a/0xf0
> [   36.857266]  ? generic_permission+0x125/0x1a0
> [   36.858150]  nfsd4_encode_getattr+0x25/0x30
> [   36.859002]  nfsd4_encode_operation+0x98/0x1a0
> [   36.859889]  nfsd4_proc_compound+0x3eb/0x5c0
> [   36.860736]  nfsd_dispatch+0xa8/0x230
> [   36.861538]  svc_process_common+0x347/0x640
> [   36.862383]  svc_process+0x100/0x1b0
> [   36.863204]  nfsd+0xe0/0x150
> [   36.863984]  kthread+0xfc/0x130
> [   36.864781]  ? nfsd_destroy+0x50/0x50
> [   36.865624]  ?

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-09 12:47 PM, J. Bruce Fields wrote:
> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
>> Anyway, that cmovne noise makes it a bit hard to see the actual part
>> that matters (and that traps) but I'm almost certain that it's the
>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
>> when it then does
>>
>>  flags_by_sb(mnt->mnt_sb->s_flags);
>>
>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
>> NULL, because we wouldn't have gotten this far if it was.
>>
>> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
>> proper path. And the vfs_statfs() code itself hasn't changed in a
>> while.
>>
>> Which does seem to implicate nfsd as having passed in a bad path to
>> vfs_statfs(). But I'm not seeing any changes in nfsd either.
>>
>> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
>> range. There is a bunch of xfs changes, though. What's the underlying
>> filesystem that you are exporting?
>>
>> But bringing in Al Viro and Bruce Fields explicitly in case they see
>> something. And Darrick, just in case it might be xfs.
> 
> Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops...
> 
> It doesn't remind me of any known issue.
> 
> So either I'm overlooking something or the bug's elsewhere.
> 
> It sounds like you're varying *only* the server version, so there's not
> much chance that this could be triggered by changes in client behavior?
> 

We are definitely only varying the kernel on the server, nothing on the
client side is changing. The client in this case is essentially an
embedded device that we do not have a whole lot of control over.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-09 12:47 PM, J. Bruce Fields wrote:
> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
>> Anyway, that cmovne noise makes it a bit hard to see the actual part
>> that matters (and that traps) but I'm almost certain that it's the
>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
>> when it then does
>>
>>  flags_by_sb(mnt->mnt_sb->s_flags);
>>
>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
>> NULL, because we wouldn't have gotten this far if it was.
>>
>> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
>> proper path. And the vfs_statfs() code itself hasn't changed in a
>> while.
>>
>> Which does seem to implicate nfsd as having passed in a bad path to
>> vfs_statfs(). But I'm not seeing any changes in nfsd either.
>>
>> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
>> range. There is a bunch of xfs changes, though. What's the underlying
>> filesystem that you are exporting?
>>
>> But bringing in Al Viro and Bruce Fields explicitly in case they see
>> something. And Darrick, just in case it might be xfs.
> 
> Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops...
> 
> It doesn't remind me of any known issue.
> 
> So either I'm overlooking something or the bug's elsewhere.
> 
> It sounds like you're varying *only* the server version, so there's not
> much chance that this could be triggered by changes in client behavior?
> 

We are definitely only varying the kernel on the server, nothing on the
client side is changing. The client in this case is essentially an
embedded device that we do not have a whole lot of control over.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Al Viro

On Thu, Nov 09, 2017 at 12:04:19PM -0800, Linus Torvalds wrote:

> That case has probably not gotten a huge amount of testing.  As Al
> points out, it can cause absolutely horrendous cache access pattern
> changes, but it might also be triggering some corruption in case
> there's a problem with the plugin, or with some piece of kernel code
> that gets confused by it.

I suspect that it might be an effect of randomize shite done both to
struct mount *AND* to struct vfsmount embedded into it.  With
pointers to embedded struct vfsmount kept around a lot, and container_of()
used to get from them to corresponding struct mount.

That smells like a combination of idiocy that might have never occured
to the authors of said gcc plugin.

On the other hand, triggered gcc bugs certainly do add randomness, so good
luck explaining to the security community that it's not a good idea...

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Al Viro

On Thu, Nov 09, 2017 at 12:04:19PM -0800, Linus Torvalds wrote:

> That case has probably not gotten a huge amount of testing.  As Al
> points out, it can cause absolutely horrendous cache access pattern
> changes, but it might also be triggering some corruption in case
> there's a problem with the plugin, or with some piece of kernel code
> that gets confused by it.

I suspect that it might be an effect of randomize shite done both to
struct mount *AND* to struct vfsmount embedded into it.  With
pointers to embedded struct vfsmount kept around a lot, and container_of()
used to get from them to corresponding struct mount.

That smells like a combination of idiocy that might have never occured
to the authors of said gcc plugin.

On the other hand, triggered gcc bugs certainly do add randomness, so good
luck explaining to the security community that it's not a good idea...

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread J. Bruce Fields

On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
> Anyway, that cmovne noise makes it a bit hard to see the actual part
> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
> 
>  flags_by_sb(mnt->mnt_sb->s_flags);
> 
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
> NULL, because we wouldn't have gotten this far if it was.
> 
> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
> proper path. And the vfs_statfs() code itself hasn't changed in a
> while.
> 
> Which does seem to implicate nfsd as having passed in a bad path to
> vfs_statfs(). But I'm not seeing any changes in nfsd either.
> 
> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
> range. There is a bunch of xfs changes, though. What's the underlying
> filesystem that you are exporting?
> 
> But bringing in Al Viro and Bruce Fields explicitly in case they see
> something. And Darrick, just in case it might be xfs.

Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops...

It doesn't remind me of any known issue.

And I don't see how we can call vfs_statfs() with a bad path:
nfsd4_encode_getattr would have to have been called with nfserr 0 and
ga_fhp->fh_export bad.

Looking at nfsd4_proc_compound, I can't see how we could get there in
the op->status == 0 case without the fh_verify() in nfsd4_getattr having
succeeded and assigned the result to ga_fhp.

So either I'm overlooking something or the bug's elsewhere.

It sounds like you're varying *only* the server version, so there's not
much chance that this could be triggered by changes in client behavior?

--b.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread J. Bruce Fields

On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
> Anyway, that cmovne noise makes it a bit hard to see the actual part
> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
> 
>  flags_by_sb(mnt->mnt_sb->s_flags);
> 
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
> NULL, because we wouldn't have gotten this far if it was.
> 
> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
> proper path. And the vfs_statfs() code itself hasn't changed in a
> while.
> 
> Which does seem to implicate nfsd as having passed in a bad path to
> vfs_statfs(). But I'm not seeing any changes in nfsd either.
> 
> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
> range. There is a bunch of xfs changes, though. What's the underlying
> filesystem that you are exporting?
> 
> But bringing in Al Viro and Bruce Fields explicitly in case they see
> something. And Darrick, just in case it might be xfs.

Looking at https://lkml.org/lkml/2017/11/8/1086 for the actual oops...

It doesn't remind me of any known issue.

And I don't see how we can call vfs_statfs() with a bad path:
nfsd4_encode_getattr would have to have been called with nfserr 0 and
ga_fhp->fh_export bad.

Looking at nfsd4_proc_compound, I can't see how we could get there in
the op->status == 0 case without the fh_verify() in nfsd4_getattr having
succeeded and assigned the result to ga_fhp.

So either I'm overlooking something or the bug's elsewhere.

It sounds like you're varying *only* the server version, so there's not
much chance that this could be triggered by changes in client behavior?

--b.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Linus Torvalds

On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean  wrote:
>
> We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and
> CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as
> CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.

It might be worth just verifying without RANDSTRUCT in particular.

That case has probably not gotten a huge amount of testing.  As Al
points out, it can cause absolutely horrendous cache access pattern
changes, but it might also be triggering some corruption in case
there's a problem with the plugin, or with some piece of kernel code
that gets confused by it.

And most obviously: if there is some module or part of the kernel that
got compiled with a different seed for the randstruct hashing, that
will break in nasty nasty ways. Your out-of-kernel module is the
obvious suspect for something like that, but honestly, it could be
some missing build dependency, or simply a missing special case in the
plugin itself a missing __no_randomize_layout or any number of things.

We've hit gcc bugs many times before - and the plugins are just new
opportunities to hit cases that have gotten a lot less testing than
the "normal" code flow has.

The structleak plugin is much less likely to be a problem (simply
because it's a much simpler plugin), but hey, something being NULL
when it shouldn't possibly be might be a stray "leak initialization".

So since you seem to be able to reproduce this _reasonably_ easily,
it's definitely worth checking that it still reproduces even without
the gcc plugins.

Just to narrow it down a bit.

  Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Linus Torvalds

On Thu, Nov 9, 2017 at 11:51 AM, Patrick McLean  wrote:
>
> We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and
> CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as
> CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.

It might be worth just verifying without RANDSTRUCT in particular.

That case has probably not gotten a huge amount of testing.  As Al
points out, it can cause absolutely horrendous cache access pattern
changes, but it might also be triggering some corruption in case
there's a problem with the plugin, or with some piece of kernel code
that gets confused by it.

And most obviously: if there is some module or part of the kernel that
got compiled with a different seed for the randstruct hashing, that
will break in nasty nasty ways. Your out-of-kernel module is the
obvious suspect for something like that, but honestly, it could be
some missing build dependency, or simply a missing special case in the
plugin itself a missing __no_randomize_layout or any number of things.

We've hit gcc bugs many times before - and the plugins are just new
opportunities to hit cases that have gotten a lot less testing than
the "normal" code flow has.

The structleak plugin is much less likely to be a problem (simply
because it's a much simpler plugin), but hey, something being NULL
when it shouldn't possibly be might be a stray "leak initialization".

So since you seem to be able to reproduce this _reasonably_ easily,
it's definitely worth checking that it still reproduces even without
the gcc plugins.

Just to narrow it down a bit.

  Linus

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean

On 2017-11-09 11:37 AM, Al Viro wrote:
> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
> 
>>> Here is the BUG we are getting:
 [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
 0230
 [   58.963918] IP: vfs_statfs+0x73/0xb0
>>
>> The code disassembles to
> 
>>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
> 
>> that matters (and that traps) but I'm almost certain that it's the
>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
>> when it then does
>>
>>  flags_by_sb(mnt->mnt_sb->s_flags);
>>
>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
>> NULL, because we wouldn't have gotten this far if it was.
>>
> 
> All instances of struct dentry are created by __d_alloc()[*], which assigns
> ->d_sb (never to be modified afterwards) *and* dereferences the pointer
> it has stored in ->d_sb before the created struct dentry becomes visible
> to anyone else.  No struct dentry should ever be observed with NULL ->d_sb;
> the only way to get that is memory corruption or looking at freed instance
> after its memory has been reused for something else and zeroed.
> 
> In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb 
> -
> not without memory corruption or looking at freed instance.
> 
> The pointer in that case should've come from exp->ex_path.mnt, exp being
> the argument of nfsd4_encode_fattr().  Sure, it might have been a dangling
> reference.  However, it looks a lot more like a memory corruptor *OR*
> miscompiled kernel.
> 
> What kind of load do the reproducer boxen have and how fast does that
> bug trigger?  Would it be possible to slap something like
>   if (unlikely(!exp->exp_path.mnt->mnt_sb)) {
>   struct mount *m = real_mount(exp->exp_path.mnt);
>   printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt);
>   printk(KERN_ERR "name: [%s]\n", m->mnt_devname);
>   printk(KERN_ERR "ns: [%p]\n", m->mnt_ns);
>   printk(KERN_ERR "parent: [%p]\n", m->mnt_parent);
>   WARN_ON(1);
>   err = -EINVAL;
>   goto out_nfserr;
>   }
> in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added
> in fs/nfsd/nfs4xdr.c) and see what will it catch?
> 
> Both with and without randomized structs, if possible - I might be barking
> at the wrong tree, but IMO the very first step in localizing that crap is
> to find out whether it's toolchain-related or not.

The reproducer boxen are not under particularly heavy load, they are
serving NFS to 1 or 2 clients (which are essentially embedded devices).
When the bug triggers, it usually triggers pretty fast and reliably, but
it seems to only trigger on some subset of bootups. Once it fails to
trigger, we seem to have to reboot to get it to trigger.

I should be able to have some results with that added in a few hours.
It's weirdly unreliable to reproduce this.

We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and
CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as
CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean

On 2017-11-09 11:37 AM, Al Viro wrote:
> On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:
> 
>>> Here is the BUG we are getting:
 [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
 0230
 [   58.963918] IP: vfs_statfs+0x73/0xb0
>>
>> The code disassembles to
> 
>>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
> 
>> that matters (and that traps) but I'm almost certain that it's the
>> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
>> when it then does
>>
>>  flags_by_sb(mnt->mnt_sb->s_flags);
>>
>> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
>> NULL, because we wouldn't have gotten this far if it was.
>>
> 
> All instances of struct dentry are created by __d_alloc()[*], which assigns
> ->d_sb (never to be modified afterwards) *and* dereferences the pointer
> it has stored in ->d_sb before the created struct dentry becomes visible
> to anyone else.  No struct dentry should ever be observed with NULL ->d_sb;
> the only way to get that is memory corruption or looking at freed instance
> after its memory has been reused for something else and zeroed.
> 
> In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb 
> -
> not without memory corruption or looking at freed instance.
> 
> The pointer in that case should've come from exp->ex_path.mnt, exp being
> the argument of nfsd4_encode_fattr().  Sure, it might have been a dangling
> reference.  However, it looks a lot more like a memory corruptor *OR*
> miscompiled kernel.
> 
> What kind of load do the reproducer boxen have and how fast does that
> bug trigger?  Would it be possible to slap something like
>   if (unlikely(!exp->exp_path.mnt->mnt_sb)) {
>   struct mount *m = real_mount(exp->exp_path.mnt);
>   printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt);
>   printk(KERN_ERR "name: [%s]\n", m->mnt_devname);
>   printk(KERN_ERR "ns: [%p]\n", m->mnt_ns);
>   printk(KERN_ERR "parent: [%p]\n", m->mnt_parent);
>   WARN_ON(1);
>   err = -EINVAL;
>   goto out_nfserr;
>   }
> in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added
> in fs/nfsd/nfs4xdr.c) and see what will it catch?
> 
> Both with and without randomized structs, if possible - I might be barking
> at the wrong tree, but IMO the very first step in localizing that crap is
> to find out whether it's toolchain-related or not.

The reproducer boxen are not under particularly heavy load, they are
serving NFS to 1 or 2 clients (which are essentially embedded devices).
When the bug triggers, it usually triggers pretty fast and reliably, but
it seems to only trigger on some subset of bootups. Once it fails to
trigger, we seem to have to reboot to get it to trigger.

I should be able to have some results with that added in a few hours.
It's weirdly unreliable to reproduce this.

We do have CONFIG_GCC_PLUGIN_STRUCTLEAK and
CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL enabled on these boxes as well as
CONFIG_GCC_PLUGIN_RANDSTRUCT as you pointed out before.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-09 11:38 AM, Al Viro wrote:
> On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote:
> 
>>> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
>>> range. There is a bunch of xfs changes, though. What's the underlying
>>> filesystem that you are exporting?
>>
>> It's an ext4 filesystem.
> 
> Had there been toolchain changes around the same period?
> 
No changes, both the working and broken kernels were built with
distro-provided gcc 5.4.0 and binutils 2.28.1.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-09 11:38 AM, Al Viro wrote:
> On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote:
> 
>>> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
>>> range. There is a bunch of xfs changes, though. What's the underlying
>>> filesystem that you are exporting?
>>
>> It's an ext4 filesystem.
> 
> Had there been toolchain changes around the same period?
> 
No changes, both the working and broken kernels were built with
distro-provided gcc 5.4.0 and binutils 2.28.1.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Al Viro

On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote:

> > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
> > range. There is a bunch of xfs changes, though. What's the underlying
> > filesystem that you are exporting?
> 
> It's an ext4 filesystem.

Had there been toolchain changes around the same period?

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Al Viro

On Thu, Nov 09, 2017 at 11:34:19AM -0800, Patrick McLean wrote:

> > In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
> > range. There is a bunch of xfs changes, though. What's the underlying
> > filesystem that you are exporting?
> 
> It's an ext4 filesystem.

Had there been toolchain changes around the same period?

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Al Viro

On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:

> > Here is the BUG we are getting:
> >> [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
> >> 0230
> >> [   58.963918] IP: vfs_statfs+0x73/0xb0
> 
> The code disassembles to

>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction

> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
> 
>  flags_by_sb(mnt->mnt_sb->s_flags);
> 
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
> NULL, because we wouldn't have gotten this far if it was.
> 
> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
> proper path. And the vfs_statfs() code itself hasn't changed in a
> while.
> 
> Which does seem to implicate nfsd as having passed in a bad path to
> vfs_statfs(). But I'm not seeing any changes in nfsd either.

It definitely is NULL mnt->mnt_sb and that should never happen.  All
struct mount instances are allocated by alloc_vfsmnt().  Its callers
are
* vfs_kern_mount().  Assigns ->mnt_sb to root->d_sb before
anyone else sees the address of that object.
* clone_mnt().  Assigns ->mnt_sb to that of preexisting instance
before anyone else sees the address of that object.

No other callers exist and no other places ever modify the value of that
field.

All instances of struct dentry are created by __d_alloc()[*], which assigns
->d_sb (never to be modified afterwards) *and* dereferences the pointer
it has stored in ->d_sb before the created struct dentry becomes visible
to anyone else.  No struct dentry should ever be observed with NULL ->d_sb;
the only way to get that is memory corruption or looking at freed instance
after its memory has been reused for something else and zeroed.

In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb -
not without memory corruption or looking at freed instance.

The pointer in that case should've come from exp->ex_path.mnt, exp being
the argument of nfsd4_encode_fattr().  Sure, it might have been a dangling
reference.  However, it looks a lot more like a memory corruptor *OR*
miscompiled kernel.

What kind of load do the reproducer boxen have and how fast does that
bug trigger?  Would it be possible to slap something like
if (unlikely(!exp->exp_path.mnt->mnt_sb)) {
struct mount *m = real_mount(exp->exp_path.mnt);
printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt);
printk(KERN_ERR "name: [%s]\n", m->mnt_devname);
printk(KERN_ERR "ns: [%p]\n", m->mnt_ns);
printk(KERN_ERR "parent: [%p]\n", m->mnt_parent);
WARN_ON(1);
err = -EINVAL;
goto out_nfserr;
}
in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added
in fs/nfsd/nfs4xdr.c) and see what will it catch?

Both with and without randomized structs, if possible - I might be barking
at the wrong tree, but IMO the very first step in localizing that crap is
to find out whether it's toolchain-related or not.

[*] strictly speaking, there is one exception - lib/test_printf.c has
four static struct dentry instances.  No chance of those being returned
by any ->mount() instance, though.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Al Viro

On Wed, Nov 08, 2017 at 06:40:22PM -0800, Linus Torvalds wrote:

> > Here is the BUG we are getting:
> >> [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
> >> 0230
> >> [   58.963918] IP: vfs_statfs+0x73/0xb0
> 
> The code disassembles to

>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction

> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
> 
>  flags_by_sb(mnt->mnt_sb->s_flags);
> 
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
> NULL, because we wouldn't have gotten this far if it was.
> 
> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
> proper path. And the vfs_statfs() code itself hasn't changed in a
> while.
> 
> Which does seem to implicate nfsd as having passed in a bad path to
> vfs_statfs(). But I'm not seeing any changes in nfsd either.

It definitely is NULL mnt->mnt_sb and that should never happen.  All
struct mount instances are allocated by alloc_vfsmnt().  Its callers
are
* vfs_kern_mount().  Assigns ->mnt_sb to root->d_sb before
anyone else sees the address of that object.
* clone_mnt().  Assigns ->mnt_sb to that of preexisting instance
before anyone else sees the address of that object.

No other callers exist and no other places ever modify the value of that
field.

All instances of struct dentry are created by __d_alloc()[*], which assigns
->d_sb (never to be modified afterwards) *and* dereferences the pointer
it has stored in ->d_sb before the created struct dentry becomes visible
to anyone else.  No struct dentry should ever be observed with NULL ->d_sb;
the only way to get that is memory corruption or looking at freed instance
after its memory has been reused for something else and zeroed.

In other words, we should never observe a struct mount with NULL ->mnt.mnt_sb -
not without memory corruption or looking at freed instance.

The pointer in that case should've come from exp->ex_path.mnt, exp being
the argument of nfsd4_encode_fattr().  Sure, it might have been a dangling
reference.  However, it looks a lot more like a memory corruptor *OR*
miscompiled kernel.

What kind of load do the reproducer boxen have and how fast does that
bug trigger?  Would it be possible to slap something like
if (unlikely(!exp->exp_path.mnt->mnt_sb)) {
struct mount *m = real_mount(exp->exp_path.mnt);
printk(KERN_ERR "mnt: %p\n", exp->exp_path.mnt);
printk(KERN_ERR "name: [%s]\n", m->mnt_devname);
printk(KERN_ERR "ns: [%p]\n", m->mnt_ns);
printk(KERN_ERR "parent: [%p]\n", m->mnt_parent);
WARN_ON(1);
err = -EINVAL;
goto out_nfserr;
}
in the beginning of nfsd4_encode_fattr() (with include of ../mount.h added
in fs/nfsd/nfs4xdr.c) and see what will it catch?

Both with and without randomized structs, if possible - I might be barking
at the wrong tree, but IMO the very first step in localizing that crap is
to find out whether it's toolchain-related or not.

[*] strictly speaking, there is one exception - lib/test_printf.c has
four static struct dentry instances.  No chance of those being returned
by any ->mount() instance, though.

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-08 06:40 PM, Linus Torvalds wrote:
> On Wed, Nov 8, 2017 at 4:43 PM, Patrick McLean  wrote:
>> As of 4.13.11 (and also with 4.14-rc) we have an issue where when
>> serving nfs4 sometimes we get the following BUG. When this bug happens,
>> it usually also causes the motherboard to no longer POST until we
>> externally re-flash the BIOS (using the BMC web interface). If a
>> motherboard does not have an external way to flash the BIOS, this would
>> brick the hardware.
> 
> That sounds like your BIOS is just broken.

All the dead boards were from the same vendor. We are going to try some
boards from another vendor today.

> 
> The kernel oops is probably just a trigger for that - possibly because
> you reboot with a particular state that breaks the BIOS.
> 
> Also, are you sure you really need to reflash the BIOS? It's actually
> fairly hard to overwrite the BIOS itself, but crashing with bad
> hardware state (where "bad" can just mean "unexpected by the BIOS")
> can cause the BIOS to not properly re-initialize things, and hang at
> boot.
> 
> So not booting cleanly from a warm reset is a reasonably common BIOS failure.
> 
> And yes, reflashing tends to force a full initialization and thus
> "fixes" things, but it may be a big hammer when a cold boot or just a
> "reset BIOS to safe defaults" might be sufficient.
> 
> In pretty much all cases this is a sign of a nasty BIOS problem,
> though, and you may want to look into a firmware update from the
> vendor for that.

We tried a cold power off (physically unplugging the machine from power)
and a CMOS reset, and neither helped. The only thing that actually
restored one of the dead boards was a reflash. I did the reflash with
the latest code when I reflashed it.

> 
> But on to the kernel side:
> 
>> Here is the BUG we are getting:
>>> [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
>>> 0230
>>> [   58.963918] IP: vfs_statfs+0x73/0xb0
> 
> The code disassembles to
> 
>0: 83 c9 08  or $0x8,%ecx
>3: 40 f6 c6 04  test   $0x4,%sil
>7: 0f 45 d1  cmovne %ecx,%edx
>a: 89 d1mov%edx,%ecx
>c: 80 cd 04  or $0x4,%ch
>f: 40 f6 c6 08  test   $0x8,%sil
>   13: 0f 45 d1  cmovne %ecx,%edx
>   16: 89 d1mov%edx,%ecx
>   18: 80 cd 08  or $0x8,%ch
>   1b: 40 f6 c6 10  test   $0x10,%sil
>   1f: 0f 45 d1  cmovne %ecx,%edx
>   22: 89 d1mov%edx,%ecx
>   24: 80 cd 10  or $0x10,%ch
>   27: 83 e6 20  and$0x20,%esi
>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
>   31: 0f 45 d1  cmovne %ecx,%edx
>   34: 83 ca 20  or $0x20,%edx
>   37: 89 f1mov%esi,%ecx
>   39: 83 e1 10  and$0x10,%ecx
>   3c: 89 cfmov%ecx,%edi
> 
> and all those odd cmovne and bit-ops are just the bit selection code
> in flags_by_mnt(), which is inlined through calculate_f_flags (which
> is _also_ inlined) into vfs_statfs().
> 
> Sadly, gcc makes a mess of it and actually generates code that looks
> like the original C. I would have hoped that gcc could have turned
> 
>if (x & BIT)
> y |= OTHER_BIT;
> 
> into
> 
> y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT;
> 
> but that doesn't happen. We actually do it by hand in some other more
> critical places, but it's painful to do by hand (because the shift
> direction/amount is not trivial to do in C).
> 
> Anyway, that cmovne noise makes it a bit hard to see the actual part
> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
> 
>  flags_by_sb(mnt->mnt_sb->s_flags);
> 
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
> NULL, because we wouldn't have gotten this far if it was.
> 
> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
> proper path. And the vfs_statfs() code itself hasn't changed in a
> while.
> 
> Which does seem to implicate nfsd as having passed in a bad path to
> vfs_statfs(). But I'm not seeing any changes in nfsd either.
> 
> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
> range. There is a bunch of xfs changes, though. What's the underlying
> filesystem that you are exporting?

It's an ext4 filesystem.

> 
> But bringing in Al Viro and Bruce Fields explicitly in case they see
> something. And Darrick, just in case it might be xfs.
> 

Thanks

Re: [nfsd4] potentially hardware breaking regression in 4.14-rc and 4.13.11

2017-11-09 Thread Patrick McLean



On 2017-11-08 06:40 PM, Linus Torvalds wrote:
> On Wed, Nov 8, 2017 at 4:43 PM, Patrick McLean  wrote:
>> As of 4.13.11 (and also with 4.14-rc) we have an issue where when
>> serving nfs4 sometimes we get the following BUG. When this bug happens,
>> it usually also causes the motherboard to no longer POST until we
>> externally re-flash the BIOS (using the BMC web interface). If a
>> motherboard does not have an external way to flash the BIOS, this would
>> brick the hardware.
> 
> That sounds like your BIOS is just broken.

All the dead boards were from the same vendor. We are going to try some
boards from another vendor today.

> 
> The kernel oops is probably just a trigger for that - possibly because
> you reboot with a particular state that breaks the BIOS.
> 
> Also, are you sure you really need to reflash the BIOS? It's actually
> fairly hard to overwrite the BIOS itself, but crashing with bad
> hardware state (where "bad" can just mean "unexpected by the BIOS")
> can cause the BIOS to not properly re-initialize things, and hang at
> boot.
> 
> So not booting cleanly from a warm reset is a reasonably common BIOS failure.
> 
> And yes, reflashing tends to force a full initialization and thus
> "fixes" things, but it may be a big hammer when a cold boot or just a
> "reset BIOS to safe defaults" might be sufficient.
> 
> In pretty much all cases this is a sign of a nasty BIOS problem,
> though, and you may want to look into a firmware update from the
> vendor for that.

We tried a cold power off (physically unplugging the machine from power)
and a CMOS reset, and neither helped. The only thing that actually
restored one of the dead boards was a reflash. I did the reflash with
the latest code when I reflashed it.

> 
> But on to the kernel side:
> 
>> Here is the BUG we are getting:
>>> [   58.962528] BUG: unable to handle kernel NULL pointer dereference at 
>>> 0230
>>> [   58.963918] IP: vfs_statfs+0x73/0xb0
> 
> The code disassembles to
> 
>0: 83 c9 08  or $0x8,%ecx
>3: 40 f6 c6 04  test   $0x4,%sil
>7: 0f 45 d1  cmovne %ecx,%edx
>a: 89 d1mov%edx,%ecx
>c: 80 cd 04  or $0x4,%ch
>f: 40 f6 c6 08  test   $0x8,%sil
>   13: 0f 45 d1  cmovne %ecx,%edx
>   16: 89 d1mov%edx,%ecx
>   18: 80 cd 08  or $0x8,%ch
>   1b: 40 f6 c6 10  test   $0x10,%sil
>   1f: 0f 45 d1  cmovne %ecx,%edx
>   22: 89 d1mov%edx,%ecx
>   24: 80 cd 10  or $0x10,%ch
>   27: 83 e6 20  and$0x20,%esi
>   2a:* 48 8b b7 30 02 00 00 mov0x230(%rdi),%rsi <-- trapping instruction
>   31: 0f 45 d1  cmovne %ecx,%edx
>   34: 83 ca 20  or $0x20,%edx
>   37: 89 f1mov%esi,%ecx
>   39: 83 e1 10  and$0x10,%ecx
>   3c: 89 cfmov%ecx,%edi
> 
> and all those odd cmovne and bit-ops are just the bit selection code
> in flags_by_mnt(), which is inlined through calculate_f_flags (which
> is _also_ inlined) into vfs_statfs().
> 
> Sadly, gcc makes a mess of it and actually generates code that looks
> like the original C. I would have hoped that gcc could have turned
> 
>if (x & BIT)
> y |= OTHER_BIT;
> 
> into
> 
> y |= (x & BIT) shifted-by-the-bit-difference-between BIT/OTHER_BIT;
> 
> but that doesn't happen. We actually do it by hand in some other more
> critical places, but it's painful to do by hand (because the shift
> direction/amount is not trivial to do in C).
> 
> Anyway, that cmovne noise makes it a bit hard to see the actual part
> that matters (and that traps) but I'm almost certain that it's the
> "mnt->mnt_sb->s_flags" loading that is part of calculate_f_flags()
> when it then does
> 
>  flags_by_sb(mnt->mnt_sb->s_flags);
> 
> and I think mnt->mnt_sb is NULL. We know it's not 'mnt' itself that is
> NULL, because we wouldn't have gotten this far if it was.
> 
> Now, afaik, mnt->mnt_sb should never be NULL in the first place for a
> proper path. And the vfs_statfs() code itself hasn't changed in a
> while.
> 
> Which does seem to implicate nfsd as having passed in a bad path to
> vfs_statfs(). But I'm not seeing any changes in nfsd either.
> 
> In particular, there are *no* nfsd changes in that 4.13.8..4.13.11
> range. There is a bunch of xfs changes, though. What's the underlying
> filesystem that you are exporting?

It's an ext4 filesystem.

> 
> But bringing in Al Viro and Bruce Fields explicitly in case they see
> something. And Darrick, just in case it might be xfs.
> 

Thanks

1 2 >

1 - 100 of 106 matches

Mail list logo