Re: [PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags
It "works" for the cases that I currently care about but I have to say that I am uneasy about adding -Werror to the cc-option test in this way. Suppose that one of the *other* flags that is implicitly passed to the compiler by cc-option - eg something that was explicitly specified in $(KBUILD_CFLAGS) - triggers a warning. In that case all calls to cc-option will silently fail because of the -Werror and valid options will not be detected correctly. If everyone is OK with that because "it shouldn't normally ever happen" then that is fine, but if does result in a subtle change from existing behavior (and a trap that I almost immediately fell into after applying a similar patch). On Wed, Apr 5, 2017 at 12:01 PM, Matthias Kaehlcke <m...@chromium.org> wrote: > Hi Masahiro, > > El Thu, Apr 06, 2017 at 03:08:26AM +0900 Masahiro Yamada ha dit: > >> 2017-03-17 9:15 GMT+09:00 Michael Davidson <m...@google.com>: >> > Unfortunately, while clang generates a warning about these flags >> > being unsupported it still exits with a status of 0 so we have >> > to explicitly disable them instead of just using a cc-option check. >> > >> > Signed-off-by: Michael Davidson <m...@google.com> >> >> >> Instead, does the following work for you? >> https://patchwork.kernel.org/patch/9657285/ > > Thanks for the pointer, I was about to give this change (or rather its > ancestor) a rework myself :) > >> You need to use >> $(call cc-option, ...) >> for -falign-jumps=1 and -falign-loops=1 > > I can confirm that this works. > > Thanks > > Matthias
Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array
On Fri, Mar 17, 2017 at 5:44 AM, Peter Zijlstrawrote: > > Be that as it may; what you construct above is disgusting. Surely the > code can be refactored to not look like dog vomit? > > Also; its not immediately obvious conf->copies is 'small' and this > doesn't blow up the stack; I feel that deserves a comment somewhere. > I agree that the code is horrible. It is, in fact, exactly the same solution that was used to remove variable length arrays in structs from several of the crypto drivers a few years ago - see the definition of SHASH_DESC_ON_STACK() in "crypto/hash.h" - I did not, however, hide the horrors in a macro preferring to leave the implementation visible as a warning to whoever might touch the code next. I believe that the actual stack usage is exactly the same as it was previously. I can certainly wrap this up in a macro and add comments with appropriately dire warnings in it if you feel that is both necessary and sufficient.
[PATCH 3/7] x86, LLVM: suppress clang warnings about unaligned accesses
Suppress clang warnings about potential unaliged accesses to members in packed structs. This gets rid of almost 10,000 warnings about accesses to the ring 0 stack pointer in the TSS. Signed-off-by: Michael Davidson <m...@google.com> --- arch/x86/Makefile | 5 + 1 file changed, 5 insertions(+) diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 894a8d18bf97..7f21703c475d 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -128,6 +128,11 @@ endif KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args) endif +ifeq ($(cc-name),clang) +# Suppress clang warnings about potential unaligned accesses. +KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member) +endif + ifdef CONFIG_X86_X32 x32_ld_ok := $(call try-run,\ /bin/echo -e '1: .quad 1b' | \ -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH 5/7] x86, boot, LLVM: Use regparm=0 for memcpy and memset
Use the standard regparm=0 calling convention for memcpy and memset when building with clang. This is a work around for a long standing clang bug (see https://llvm.org/bugs/show_bug.cgi?id=3997) where clang always uses the standard regparm=0 calling convention for any implcit calls to memcpy and memset that it generates (eg for structure assignments and initialization) even if an alternate calling convention such as regparm=3 has been specified. Signed-off-by: Michael Davidson <m...@google.com> --- arch/x86/boot/copy.S | 15 +-- arch/x86/boot/string.h | 13 + 2 files changed, 26 insertions(+), 2 deletions(-) diff --git a/arch/x86/boot/copy.S b/arch/x86/boot/copy.S index 1eb7d298b47d..57142d1ad0d2 100644 --- a/arch/x86/boot/copy.S +++ b/arch/x86/boot/copy.S @@ -18,6 +18,12 @@ .text GLOBAL(memcpy) +#ifdef __clang__ /* Use normal ABI calling conventions */ + movw4(%esp), %ax + movw8(%esp), %dx + movw12(%esp), %cx +#endif +_memcpy: pushw %si pushw %di movw%ax, %di @@ -34,6 +40,11 @@ GLOBAL(memcpy) ENDPROC(memcpy) GLOBAL(memset) +#ifdef __clang__ /* Use normal ABI calling conventions */ + movw4(%esp), %ax + movw8(%esp), %dx + movw12(%esp), %cx +#endif pushw %di movw%ax, %di movzbl %dl, %eax @@ -52,7 +63,7 @@ GLOBAL(copy_from_fs) pushw %ds pushw %fs popw%ds - calll memcpy + calll _memcpy popw%ds retl ENDPROC(copy_from_fs) @@ -61,7 +72,7 @@ GLOBAL(copy_to_fs) pushw %es pushw %fs popw%es - calll memcpy + calll _memcpy popw%es retl ENDPROC(copy_to_fs) diff --git a/arch/x86/boot/string.h b/arch/x86/boot/string.h index 113588ddb43f..e735cccb3fc8 100644 --- a/arch/x86/boot/string.h +++ b/arch/x86/boot/string.h @@ -6,8 +6,21 @@ #undef memset #undef memcmp +/* + * Use normal ABI calling conventions - i.e. regparm(0) - + * for memcpy() and memset() if we are building the real + * mode setup code with clang since clang may make implicit + * calls to these functions that assume regparm(0). + */ +#if defined(_SETUP) && defined(__clang__) +void __attribute__((regparm(0))) *memcpy(void *dst, const void *src, +size_t len); +void __attribute__((regparm(0))) *memset(void *dst, int c, size_t len); +#else void *memcpy(void *dst, const void *src, size_t len); void *memset(void *dst, int c, size_t len); +#endif + int memcmp(const void *s1, const void *s2, size_t len); /* -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH 0/7] LLVM: make x86_64 kernel build with clang.
This patch set is sufficient to get the x86_64 kernel to build and boot correctly with clang-3.8 or greater. The resulting build still has about 300 warnings, very few of which appear to be significant. Most of them should be fixable with some minor code refactoring although a few of them, such as the complaints about implict conversions between enumerated types may be candidates for just being disabled. Michael Davidson (7): Makefile, LLVM: add -no-integrated-as to KBUILD_[AC]FLAGS Makefile, x86, LLVM: disable unsupported optimization flags x86, LLVM: suppress clang warnings about unaligned accesses x86, boot, LLVM: #undef memcpy etc in string.c x86, boot, LLVM: Use regparm=0 for memcpy and memset md/raid10, LLVM: get rid of variable length array crypto, x86, LLVM: aesni - fix token pasting Makefile| 4 arch/x86/Makefile | 7 +++ arch/x86/boot/copy.S| 15 +-- arch/x86/boot/string.c | 9 + arch/x86/boot/string.h | 13 + arch/x86/crypto/aes_ctrby8_avx-x86_64.S | 7 ++- drivers/md/raid10.c | 9 - 7 files changed, 52 insertions(+), 12 deletions(-) -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags
Unfortunately, while clang generates a warning about these flags being unsupported it still exits with a status of 0 so we have to explicitly disable them instead of just using a cc-option check. Signed-off-by: Michael Davidson <m...@google.com> --- Makefile | 2 ++ arch/x86/Makefile | 2 ++ 2 files changed, 4 insertions(+) diff --git a/Makefile b/Makefile index b21fd0ca2946..5e97e5fc1eea 100644 --- a/Makefile +++ b/Makefile @@ -629,7 +629,9 @@ ARCH_AFLAGS := ARCH_CFLAGS := include arch/$(SRCARCH)/Makefile +ifneq ($(cc-name),clang) KBUILD_CFLAGS += $(call cc-option,-fno-delete-null-pointer-checks,) +endif KBUILD_CFLAGS += $(call cc-disable-warning,frame-address,) ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION diff --git a/arch/x86/Makefile b/arch/x86/Makefile index 2d449337a360..894a8d18bf97 100644 --- a/arch/x86/Makefile +++ b/arch/x86/Makefile @@ -87,11 +87,13 @@ else KBUILD_AFLAGS += -m64 KBUILD_CFLAGS += -m64 +ifneq ($(cc-name),clang) # Align jump targets to 1 byte, not the default 16 bytes: KBUILD_CFLAGS += -falign-jumps=1 # Pack loops tightly as well: KBUILD_CFLAGS += -falign-loops=1 +endif # Don't autogenerate traditional x87 instructions KBUILD_CFLAGS += $(call cc-option,-mno-80387) -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH 4/7] x86, boot, LLVM: #undef memcpy etc in string.c
undef memcpy and friends in boot/string.c so that the functions defined here will have the correct names, otherwise we end up up trying to redefine __builtin_memcpy etc. Surprisingly, gcc allows this (and, helpfully, discards the __builtin_ prefix from the function name when compiling it), but clang does not. Adding these #undef's appears to preserve what I assume was the original intent of the code. Signed-off-by: Michael Davidson <m...@google.com> --- arch/x86/boot/string.c | 9 + 1 file changed, 9 insertions(+) diff --git a/arch/x86/boot/string.c b/arch/x86/boot/string.c index 5457b02fc050..b40266850869 100644 --- a/arch/x86/boot/string.c +++ b/arch/x86/boot/string.c @@ -16,6 +16,15 @@ #include "ctype.h" #include "string.h" +/* + * Undef these macros so that the functions that we provide + * here will have the correct names regardless of how string.h + * may have chosen to #define them. + */ +#undef memcpy +#undef memset +#undef memcmp + int memcmp(const void *s1, const void *s2, size_t len) { bool diff; -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH 6/7] md/raid10, LLVM: get rid of variable length array
Replace a variable length array in a struct by allocating the memory for the entire struct in a char array on the stack. Signed-off-by: Michael Davidson <m...@google.com> --- drivers/md/raid10.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 063c43d83b72..158ebdff782c 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -4654,11 +4654,10 @@ static int handle_reshape_read_error(struct mddev *mddev, /* Use sync reads to get the blocks from somewhere else */ int sectors = r10_bio->sectors; struct r10conf *conf = mddev->private; - struct { - struct r10bio r10_bio; - struct r10dev devs[conf->copies]; - } on_stack; - struct r10bio *r10b = _stack.r10_bio; + char on_stack_r10_bio[sizeof(struct r10bio) + + conf->copies * sizeof(struct r10dev)] + __aligned(__alignof__(struct r10bio)); + struct r10bio *r10b = (struct r10bio *)on_stack_r10_bio; int slot = 0; int idx = 0; struct bio_vec *bvec = r10_bio->master_bio->bi_io_vec; -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH 7/7] crypto, x86, LLVM: aesni - fix token pasting
aes_ctrby8_avx-x86_64.S uses the C preprocessor for token pasting of character sequences that are not valid preprocessor tokens. While this is allowed when preprocessing assembler files it exposes an incompatibilty between the clang and gcc preprocessors where clang does not strip leading white space from macro parameters, leading to the CONCAT(%xmm, i) macro expansion on line 96 resulting in a token with a space character embedded in it. While this could be fixed by deleting the offending space character, the assembler is perfectly capable of handling the token pasting correctly for itself so it seems preferable to let it do so and to get rid or the CONCAT(), DDQ() and XMM() preprocessor macros. Signed-off-by: Michael Davidson <m...@google.com> --- arch/x86/crypto/aes_ctrby8_avx-x86_64.S | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S index a916c4a61165..5f6a5af9c489 100644 --- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S +++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S @@ -65,7 +65,6 @@ #include #include -#define CONCAT(a,b)a##b #define VMOVDQ vmovdqu #define xdata0 %xmm0 @@ -92,8 +91,6 @@ #define num_bytes %r8 #define tmp%r10 -#defineDDQ(i) CONCAT(ddq_add_,i) -#defineXMM(i) CONCAT(%xmm, i) #defineDDQ_DATA0 #defineXDATA 1 #define KEY_1281 @@ -131,12 +128,12 @@ ddq_add_8: /* generate a unique variable for ddq_add_x */ .macro setddq n - var_ddq_add = DDQ(\n) + var_ddq_add = ddq_add_\n .endm /* generate a unique variable for xmm register */ .macro setxdata n - var_xdata = XMM(\n) + var_xdata = %xmm\n .endm /* club the numeric 'id' to the symbol 'name' */ -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH 1/7] Makefile, LLVM: add -no-integrated-as to KBUILD_[AC]FLAGS
Add -no-integrated-as to KBUILD_AFLAGS and KBUILD_CFLAGS for clang. Signed-off-by: Michael Davidson <m...@google.com> --- Makefile | 2 ++ 1 file changed, 2 insertions(+) diff --git a/Makefile b/Makefile index b841fb36beb2..b21fd0ca2946 100644 --- a/Makefile +++ b/Makefile @@ -704,6 +704,8 @@ KBUILD_CFLAGS += $(call cc-disable-warning, tautological-compare) # See modpost pattern 2 KBUILD_CFLAGS += $(call cc-option, -mno-global-merge,) KBUILD_CFLAGS += $(call cc-option, -fcatch-undefined-behavior) +KBUILD_CFLAGS += $(call cc-option, -no-integrated-as) +KBUILD_AFLAGS += $(call cc-option, -no-integrated-as) else # These warnings generated too much noise in a regular build. -- 2.12.0.367.g23dc2f6d3c-goog
[PATCH] crypto, x86: aesni - fix token pasting for clang
aes_ctrby8_avx-x86_64.S uses the C preprocessor for token pasting of character sequences that are not valid preprocessor tokens. While this is allowed when preprocessing assembler files it exposes an incompatibilty between the clang and gcc preprocessors where clang does not strip leading white space from macro parameters, leading to the CONCAT(%xmm, i) macro expansion on line 96 resulting in a token with a space character embedded in it. While this could be resolved by deleting the offending space character, the assembler is perfectly capable of doing the token pasting correctly for itself so we can just get rid of the preprocessor macros. Signed-off-by: Michael Davidson <m...@google.com> --- arch/x86/crypto/aes_ctrby8_avx-x86_64.S | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S index a916c4a61165..5f6a5af9c489 100644 --- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S +++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S @@ -65,7 +65,6 @@ #include #include -#define CONCAT(a,b)a##b #define VMOVDQ vmovdqu #define xdata0 %xmm0 @@ -92,8 +91,6 @@ #define num_bytes %r8 #define tmp%r10 -#defineDDQ(i) CONCAT(ddq_add_,i) -#defineXMM(i) CONCAT(%xmm, i) #defineDDQ_DATA0 #defineXDATA 1 #define KEY_1281 @@ -131,12 +128,12 @@ ddq_add_8: /* generate a unique variable for ddq_add_x */ .macro setddq n - var_ddq_add = DDQ(\n) + var_ddq_add = ddq_add_\n .endm /* generate a unique variable for xmm register */ .macro setxdata n - var_xdata = XMM(\n) + var_xdata = %xmm\n .endm /* club the numeric 'id' to the symbol 'name' */ -- 2.12.0.367.g23dc2f6d3c-goog