Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, Apr 9, 2018 at 11:00 AM, Kees Cook wrote: > > Can we update the comment near the top to explain why we need > __UNIQUE_ID() since we've now rediscovered why it was originally > there? Too late, since it's already committed in my tree and going through the build test paces. But I really hope somebody revisits __UNIQUE_ID, and maybe they can then add a comment to the users too. It looks like gcc needs at least version 4.3 to have a working __COUNTER__ implementation. What a coincidence - we don't really support anything older now anyway. And similarly for clang - the only versions that didn't have it are not able to build the kernel anyway. So we should remove any version conditionals, and we should remove the known-broken generic fallback that uses __LINE__ instead of __COUNTER__. And once you do that, the "prefix" is actually pointlessm and we could drop it to make __UNIQUE_ID() easier to use. The *REAL* problem, however, is that __UNIQUE_ID is such a pain to use in general. You basically have two choices: - use it as a single-declaration throwaway variable name that is never actually used - use it only in a wrapper macro that then passes it off as an argument to another macro that uses it several times That "throwaway" use sounds pointless, but it's actually the *common* use we have for this thing outside of this min/max case. It's used to shut up the compiler about other unused variables, where people do things like static __unused__ __UNIQUE_ID(xyz) = .. mark other variable as used here ..; and I think the reason it's common is that it's the only *easy* use of that unique-name-generator macro. Having to pass it off to a second macro to use as a variable name makes it cumbersome to use for most normal macro cases. It would be nice to have the equivalent of __COUNTER__ that just expanded not to a continually increasing number, but expanded to a number that expands for each macro expansion. Then you could do things like int UNIQUE(a) = (a); to generate a variable name that was unique within a particular macro expansion, and then actually _use_ UNIQUE(a) in the same macro expansion to refer to it without having to pass it off as a name to another macro that does the declaration and use. You can't use __UNIQUE_ID() for that, because every new __UNIQUE_ID(a) will give _another_ unique ID. (And you *could* use the __LINE__ based one for that, but then you'd get the whole "nasty shadowing" problem if we have nested macros again). Ugh. We have no wonderful model for unique names. Which is why we obviously just use the "add underscores until it's unique" and the get it wrong with nesting. Linus
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, Apr 9, 2018 at 10:14 AM, Linus Torvalds wrote: > On Mon, Apr 9, 2018 at 10:03 AM, Linus Torvalds > wrote: >> >> Our old "min()" had the internal variables called "min1" and "min2", >> which is crazy too. > > Actually, no, it used the really cumbersome "__UNIQUE_ID" and then > passed that odd as the name 'min1/2', > > Ugh, I find that really nasty to read, but it was obviously done > because we hit this before. Ooof. Nice find. > And our __UNIQUE_ID() macro is garbage anyway, since it falls back on > the line number, which doesn't really work for macros anyway. But we > have proper macros for both clang and gcc, so maybe we should ignore > the broken fallback. > > A patch like the attached, perhaps? Can we update the comment near the top to explain why we need __UNIQUE_ID() since we've now rediscovered why it was originally there? -Kees -- Kees Cook Pixel Security
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, Apr 9, 2018 at 10:14 AM, Linus Torvalds wrote: > > Ugh, I find that really nasty to read, but it was obviously done > because we hit this before. Side note, we have a *lof* of those "__x" and "__y" names in the helper macros. Clearly min/max was the only one we really had ever hit (because it used to have that UNIQUE_ID thing), and the patch I sent hopefully fixes the only real problem, but it would be good to perhaps look at this in general. And no, -Wshadow still isn't the right answer, because while it warns about this, it also warns about a lot of perfectly normal cases where we have shadowing of names but it doesn't really matter. Maybe "make __UNIQUE_ID easier to use and encourage that model" is the right answer. Right now "__UNIQUE_ID" is actually really nasty in several ways: (1) the already mentioned "the fallback is broken for same-line use" This doesn't really matter because both gcc and clang have _true_ unique macros, but we should probably remove the fallback as "know broken and not really guaranteed to give a unique ID" (2) The argument you give to __UNIQUE_ID() is pointless. The only reason it exists is because the broken fallback case is so broken and by definition __LINE__ will be the same not only if two different macros are used on the same line, but for trivial and common case of *one* macro using it. That second problem is a problem only because it encourages crazy naming. For example, in the patch I sent I used "__UNIQUE_ID(__x)". If we actually just wanted a prefix, it would be more logical to just use "__UNIQUE_ID(x)" instead, but then macro arghument expansion means that you don't really get "x" as the prefix, but whatever the _arghument_ x was. So then I have to use "__x" or something just to avoid argument expansion. Maybe I should just have used a plain number (which cannot have the argument expansion issue), but that doesn't work because the prefix is directly attached to the unique number, so using __UNIQUE_ID(1) and __UNIQUE_ID(2) is actually unsafe _too_, because the numbers can end up not being unique at all. I really do dislike __UNIQUE_ID() for all these reasons. It has various really subtle problems that make it unnecessarily hard to use and/or have odd nasty issues. I think there's a reason why __UNIQUE_ID() really isn't used much. Linus
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, 9 Apr 2018, Linus Torvalds wrote: > On Mon, Apr 9, 2018 at 10:03 AM, Linus Torvalds > wrote: > > > > Our old "min()" had the internal variables called "min1" and "min2", > > which is crazy too. > > Actually, no, it used the really cumbersome "__UNIQUE_ID" and then > passed that odd as the name 'min1/2', > > Ugh, I find that really nasty to read, but it was obviously done > because we hit this before. > > And our __UNIQUE_ID() macro is garbage anyway, since it falls back on > the line number, which doesn't really work for macros anyway. But we > have proper macros for both clang and gcc, so maybe we should ignore > the broken fallback. > > A patch like the attached, perhaps? I applied this on top of 38c23685b273cfb4ccf31a199feccce3bdcb5d83 and everything works as expected. Thanks!
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, Apr 9, 2018 at 10:11 AM, Sebastian Ott wrote: > > Arghso obvious now :-) > > When I change these it works. Mind testing the slightly more complex patch I just emailed out, and I can commit it? Linus
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, Apr 9, 2018 at 10:03 AM, Linus Torvalds wrote: > > Our old "min()" had the internal variables called "min1" and "min2", > which is crazy too. Actually, no, it used the really cumbersome "__UNIQUE_ID" and then passed that odd as the name 'min1/2', Ugh, I find that really nasty to read, but it was obviously done because we hit this before. And our __UNIQUE_ID() macro is garbage anyway, since it falls back on the line number, which doesn't really work for macros anyway. But we have proper macros for both clang and gcc, so maybe we should ignore the broken fallback. A patch like the attached, perhaps? Linus include/linux/kernel.h | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/include/linux/kernel.h b/include/linux/kernel.h index 4ae1dfd9bf05..52b70894eaa5 100644 --- a/include/linux/kernel.h +++ b/include/linux/kernel.h @@ -822,14 +822,15 @@ static inline void ftrace_dump(enum ftrace_dump_mode oops_dump_mode) { } #define __cmp(x, y, op) ((x) op (y) ? (x) : (y)) -#define __cmp_once(x, y, op) ({ \ - typeof(x) __x = (x); \ - typeof(y) __y = (y); \ - __cmp(__x, __y, op); }) - -#define __careful_cmp(x, y, op)\ - __builtin_choose_expr(__safe_cmp(x, y), \ - __cmp(x, y, op), __cmp_once(x, y, op)) +#define __cmp_once(x, y, unique_x, unique_y, op) ({ \ + typeof(x) unique_x = (x); \ + typeof(y) unique_y = (y); \ + __cmp(unique_x, unique_y, op); }) + +#define __careful_cmp(x, y, op) \ + __builtin_choose_expr(__safe_cmp(x, y), \ + __cmp(x, y, op), \ + __cmp_once(x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y), op)) /** * min - return minimum of two values of the same or compatible types
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, 9 Apr 2018, Linus Torvalds wrote: > On Mon, Apr 9, 2018 at 9:18 AM, Sebastian Ott wrote: > > > > Both of the following return 0 on my machine: > > + pr_warn("%u\n", min_not_zero(100, 1000)); > > + pr_warn("%u\n", min_not_zero(1000, 100)); > > Oooh. > > [ Raises hand, and says "I know, I know, pick me, pick me" ] > > min_not_zero() hasinternal variables "__x" and "__y". > > And "__cmp_once()" has internal variables "__x" and "__y". Arghso obvious now :-) When I change these it works. Sebastian
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, Apr 9, 2018 at 9:18 AM, Sebastian Ott wrote: > > Both of the following return 0 on my machine: > + pr_warn("%u\n", min_not_zero(100, 1000)); > + pr_warn("%u\n", min_not_zero(1000, 100)); Oooh. [ Raises hand, and says "I know, I know, pick me, pick me" ] min_not_zero() hasinternal variables "__x" and "__y". And "__cmp_once()" has internal variables "__x" and "__y". When min_not_zero does min(__x, __y)); that expands to crazy stuff. Out old "min()" had the internal variables called "min1" and "min2", which is crazy too. Linus
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Mon, 9 Apr 2018, Sebastian Ott wrote: > On Fri, 6 Apr 2018, Kees Cook wrote: > > On Fri, Apr 6, 2018 at 2:47 AM, Sebastian Ott > > wrote: > > > Today's kernel oopsed on s390. Bisect points to: > > > 3c8ba0d61d04 ("kernel.h: Retain constant expression output for > > > max()/min()") > > > > > > [1.898277] dasd-eckd 0.0.3304: DASD with 4 KB/block, 21636720 KB > > > total size, 48 KB/track, compatible disk layout > > > [1.898308] [ cut here ] > > > [1.898310] kernel BUG at block/bio.c:1798! > > > > Well that's extremely bad. :( > > What happened is that the bio build by the partition detection code was > attempted to be split by the block layer because the block queue had a > max_sector setting of 0. blk_queue_max_hw_sectors uses min_not_zero. > > Both of the following return 0 on my machine: > + pr_warn("%u\n", min_not_zero(100, 1000)); > + pr_warn("%u\n", min_not_zero(1000, 100)); > > So, we now know what failed...the question is why? I copied these macros to a userspace program to easily test it on other machines/compilers. maybe I did something wrong but min_not_zero did not work - even on fedora on an x86 laptop. Sebastian#include #define __typecheck(x, y) \ (!!(sizeof((typeof(x) *)1 == (typeof(y) *)1))) #define __is_constexpr(x) \ (sizeof(int) == sizeof(*(8 ? ((void *)((long)(x) * 0l)) : (int *)8))) #define __no_side_effects(x, y) \ (__is_constexpr(x) && __is_constexpr(y)) #define __safe_cmp(x, y)\ (__typecheck(x, y) && __no_side_effects(x, y)) #define __cmp(x, y, op)((x) op (y) ? (x) : (y)) #define __cmp_once(x, y, op) ({ \ typeof(x) __x = (x);\ typeof(y) __y = (y);\ __cmp(__x, __y, op); }) #define __careful_cmp(x, y, op) \ __builtin_choose_expr(__safe_cmp(x, y), \ __cmp(x, y, op), __cmp_once(x, y, op)) #define min(x, y) __careful_cmp(x, y, <) #define max(x, y) __careful_cmp(x, y, >) #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <) #define max_t(type, x, y) __careful_cmp((type)(x), (type)(y), >) #define min_not_zero(x, y) ({ \ typeof(x) __x = (x);\ typeof(y) __y = (y);\ __x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); }) int main() { printf("test %u\n", min_not_zero(100, 1000)); return 0; }
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Fri, 6 Apr 2018, Kees Cook wrote: > On Fri, Apr 6, 2018 at 2:47 AM, Sebastian Ott > wrote: > > Today's kernel oopsed on s390. Bisect points to: > > 3c8ba0d61d04 ("kernel.h: Retain constant expression output for max()/min()") > > > > [1.898277] dasd-eckd 0.0.3304: DASD with 4 KB/block, 21636720 KB total > > size, 48 KB/track, compatible disk layout > > [1.898308] [ cut here ] > > [1.898310] kernel BUG at block/bio.c:1798! > > Well that's extremely bad. :( What happened is that the bio build by the partition detection code was attempted to be split by the block layer because the block queue had a max_sector setting of 0. blk_queue_max_hw_sectors uses min_not_zero. Both of the following return 0 on my machine: + pr_warn("%u\n", min_not_zero(100, 1000)); + pr_warn("%u\n", min_not_zero(1000, 100)); So, we now know what failed...the question is why? Sebastian
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Fri, 6 Apr 2018, Kees Cook wrote: > On Fri, Apr 6, 2018 at 9:47 AM, Kees Cook wrote: > > On Fri, Apr 6, 2018 at 2:47 AM, Sebastian Ott > > wrote: > >> Today's kernel oopsed on s390. Bisect points to: > >> 3c8ba0d61d04 ("kernel.h: Retain constant expression output for > >> max()/min()") > >> > >> [1.898277] dasd-eckd 0.0.3304: DASD with 4 KB/block, 21636720 KB total > >> size, 48 KB/track, compatible disk layout > >> [1.898308] [ cut here ] > >> [1.898310] kernel BUG at block/bio.c:1798! > > > > Well that's extremely bad. :( > > > >> Bisect log and config attached. I'll look at min/max users in the affected > >> areas later today. > > > > Seems like a comparison of objdump output with/without the patch may > > be needed. And why is this s390 only? Ugh. > > I did a objdump diff with your .config and it's rather large -- mostly > seems to be register swaps, so it's not easy to browse. Yes, I'm looking at that too. > BTW, what version of gcc did you use? I built using: gcc version 7.2.0 (GCC)
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Fri, Apr 6, 2018 at 9:47 AM, Kees Cook wrote: > On Fri, Apr 6, 2018 at 2:47 AM, Sebastian Ott > wrote: >> Today's kernel oopsed on s390. Bisect points to: >> 3c8ba0d61d04 ("kernel.h: Retain constant expression output for max()/min()") >> >> [1.898277] dasd-eckd 0.0.3304: DASD with 4 KB/block, 21636720 KB total >> size, 48 KB/track, compatible disk layout >> [1.898308] [ cut here ] >> [1.898310] kernel BUG at block/bio.c:1798! > > Well that's extremely bad. :( > >> Bisect log and config attached. I'll look at min/max users in the affected >> areas later today. > > Seems like a comparison of objdump output with/without the patch may > be needed. And why is this s390 only? Ugh. I did a objdump diff with your .config and it's rather large -- mostly seems to be register swaps, so it's not easy to browse. BTW, what version of gcc did you use? I built using: s390x-linux-gnu-gcc (Ubuntu 7.3.0-13ubuntu1) 7.3.0 -Kees -- Kees Cook Pixel Security
Re: [bisected] 3c8ba0d61d04ced9f8d9ff93977995a9e4e96e91 oopses on s390
On Fri, Apr 6, 2018 at 2:47 AM, Sebastian Ott wrote: > Today's kernel oopsed on s390. Bisect points to: > 3c8ba0d61d04 ("kernel.h: Retain constant expression output for max()/min()") > > [1.898277] dasd-eckd 0.0.3304: DASD with 4 KB/block, 21636720 KB total > size, 48 KB/track, compatible disk layout > [1.898308] [ cut here ] > [1.898310] kernel BUG at block/bio.c:1798! Well that's extremely bad. :( > Bisect log and config attached. I'll look at min/max users in the affected > areas later today. Seems like a comparison of objdump output with/without the patch may be needed. And why is this s390 only? Ugh. -Kees -- Kees Cook Pixel Security