Re: [PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags

2017-04-05 Thread Michael Davidson
It "works" for the cases that I currently care about but I have to say
that I am uneasy about adding -Werror to the cc-option test in this
way.

Suppose that one of the *other* flags that is implicitly passed to the
compiler by cc-option - eg something that was explicitly specified in
$(KBUILD_CFLAGS) - triggers a warning. In that case all calls to
cc-option will silently fail because of the -Werror and valid options
will not be detected correctly.

If everyone is OK with that because "it shouldn't normally ever
happen" then that is fine, but if does result in a subtle change from
existing behavior (and a trap that I almost immediately fell into
after applying a similar patch).

On Wed, Apr 5, 2017 at 12:01 PM, Matthias Kaehlcke <m...@chromium.org> wrote:
> Hi Masahiro,
>
> El Thu, Apr 06, 2017 at 03:08:26AM +0900 Masahiro Yamada ha dit:
>
>> 2017-03-17 9:15 GMT+09:00 Michael Davidson <m...@google.com>:
>> > Unfortunately, while clang generates a warning about these flags
>> > being unsupported it still exits with a status of 0 so we have
>> > to explicitly disable them instead of just using a cc-option check.
>> >
>> > Signed-off-by: Michael Davidson <m...@google.com>
>>
>>
>> Instead, does the following work for you?
>> https://patchwork.kernel.org/patch/9657285/
>
> Thanks for the pointer, I was about to give this change (or rather its
> ancestor) a rework myself :)
>
>> You need to use
>> $(call cc-option, ...)
>> for -falign-jumps=1 and -falign-loops=1
>
> I can confirm that this works.
>
> Thanks
>
> Matthias


Re: [PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-17 Thread Michael Davidson
On Fri, Mar 17, 2017 at 5:44 AM, Peter Zijlstra  wrote:
>
> Be that as it may; what you construct above is disgusting. Surely the
> code can be refactored to not look like dog vomit?
>
> Also; its not immediately obvious conf->copies is 'small' and this
> doesn't blow up the stack; I feel that deserves a comment somewhere.
>

I agree that the code is horrible.

It is, in fact, exactly the same solution that was used to remove
variable length arrays in structs from several of the crypto drivers a
few years ago - see the definition of SHASH_DESC_ON_STACK() in
"crypto/hash.h" - I did not, however, hide the horrors in a macro
preferring to leave the implementation visible as a warning to whoever
might touch the code next.

I believe that the actual stack usage is exactly the same as it was previously.

I can certainly wrap this  up in a macro and add comments with
appropriately dire warnings in it if you feel that is both necessary
and sufficient.


[PATCH 3/7] x86, LLVM: suppress clang warnings about unaligned accesses

2017-03-16 Thread Michael Davidson
Suppress clang warnings about potential unaliged accesses
to members in packed structs. This gets rid of almost 10,000
warnings about accesses to the ring 0 stack pointer in the TSS.

Signed-off-by: Michael Davidson <m...@google.com>
---
 arch/x86/Makefile | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 894a8d18bf97..7f21703c475d 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -128,6 +128,11 @@ endif
 KBUILD_CFLAGS += $(call cc-option,-maccumulate-outgoing-args)
 endif
 
+ifeq ($(cc-name),clang)
+# Suppress clang warnings about potential unaligned accesses.
+KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
+endif
+
 ifdef CONFIG_X86_X32
x32_ld_ok := $(call try-run,\
/bin/echo -e '1: .quad 1b' | \
-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH 5/7] x86, boot, LLVM: Use regparm=0 for memcpy and memset

2017-03-16 Thread Michael Davidson
Use the standard regparm=0 calling convention for memcpy and
memset when building with clang.

This is a work around for a long standing clang bug
(see https://llvm.org/bugs/show_bug.cgi?id=3997) where
clang always uses the standard regparm=0 calling convention
for any implcit calls to memcpy and memset that it generates
(eg for structure assignments and initialization) even if an
alternate calling convention such as regparm=3 has been specified.

Signed-off-by: Michael Davidson <m...@google.com>
---
 arch/x86/boot/copy.S   | 15 +--
 arch/x86/boot/string.h | 13 +
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/x86/boot/copy.S b/arch/x86/boot/copy.S
index 1eb7d298b47d..57142d1ad0d2 100644
--- a/arch/x86/boot/copy.S
+++ b/arch/x86/boot/copy.S
@@ -18,6 +18,12 @@
.text
 
 GLOBAL(memcpy)
+#ifdef __clang__   /* Use normal ABI calling conventions */
+   movw4(%esp), %ax
+   movw8(%esp), %dx
+   movw12(%esp), %cx
+#endif
+_memcpy:
pushw   %si
pushw   %di
movw%ax, %di
@@ -34,6 +40,11 @@ GLOBAL(memcpy)
 ENDPROC(memcpy)
 
 GLOBAL(memset)
+#ifdef __clang__   /* Use normal ABI calling conventions */
+   movw4(%esp), %ax
+   movw8(%esp), %dx
+   movw12(%esp), %cx
+#endif
pushw   %di
movw%ax, %di
movzbl  %dl, %eax
@@ -52,7 +63,7 @@ GLOBAL(copy_from_fs)
pushw   %ds
pushw   %fs
popw%ds
-   calll   memcpy
+   calll   _memcpy
popw%ds
retl
 ENDPROC(copy_from_fs)
@@ -61,7 +72,7 @@ GLOBAL(copy_to_fs)
pushw   %es
pushw   %fs
popw%es
-   calll   memcpy
+   calll   _memcpy
popw%es
retl
 ENDPROC(copy_to_fs)
diff --git a/arch/x86/boot/string.h b/arch/x86/boot/string.h
index 113588ddb43f..e735cccb3fc8 100644
--- a/arch/x86/boot/string.h
+++ b/arch/x86/boot/string.h
@@ -6,8 +6,21 @@
 #undef memset
 #undef memcmp
 
+/*
+ * Use normal ABI calling conventions - i.e. regparm(0) -
+ * for memcpy() and memset() if we are building the real
+ * mode setup code with clang since clang may make implicit
+ * calls to these functions that assume regparm(0).
+ */
+#if defined(_SETUP) && defined(__clang__)
+void __attribute__((regparm(0))) *memcpy(void *dst, const void *src,
+size_t len);
+void __attribute__((regparm(0))) *memset(void *dst, int c, size_t len);
+#else
 void *memcpy(void *dst, const void *src, size_t len);
 void *memset(void *dst, int c, size_t len);
+#endif
+
 int memcmp(const void *s1, const void *s2, size_t len);
 
 /*
-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH 0/7] LLVM: make x86_64 kernel build with clang.

2017-03-16 Thread Michael Davidson
This patch set is sufficient to get the x86_64 kernel to build
and boot correctly with clang-3.8 or greater.

The resulting build still has about 300 warnings, very few of
which appear to be significant. Most of them should be fixable
with some minor code refactoring although a few of them, such
as the complaints about implict conversions between enumerated
types may be candidates for just being disabled.

Michael Davidson (7):
  Makefile, LLVM: add -no-integrated-as to KBUILD_[AC]FLAGS
  Makefile, x86, LLVM: disable unsupported optimization flags
  x86, LLVM: suppress clang warnings about unaligned accesses
  x86, boot, LLVM: #undef memcpy etc in string.c
  x86, boot, LLVM: Use regparm=0 for memcpy and memset
  md/raid10, LLVM: get rid of variable length array
  crypto, x86, LLVM: aesni - fix token pasting

 Makefile|  4 
 arch/x86/Makefile   |  7 +++
 arch/x86/boot/copy.S| 15 +--
 arch/x86/boot/string.c  |  9 +
 arch/x86/boot/string.h  | 13 +
 arch/x86/crypto/aes_ctrby8_avx-x86_64.S |  7 ++-
 drivers/md/raid10.c |  9 -
 7 files changed, 52 insertions(+), 12 deletions(-)

-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH 2/7] Makefile, x86, LLVM: disable unsupported optimization flags

2017-03-16 Thread Michael Davidson
Unfortunately, while clang generates a warning about these flags
being unsupported it still exits with a status of 0 so we have
to explicitly disable them instead of just using a cc-option check.

Signed-off-by: Michael Davidson <m...@google.com>
---
 Makefile  | 2 ++
 arch/x86/Makefile | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/Makefile b/Makefile
index b21fd0ca2946..5e97e5fc1eea 100644
--- a/Makefile
+++ b/Makefile
@@ -629,7 +629,9 @@ ARCH_AFLAGS :=
 ARCH_CFLAGS :=
 include arch/$(SRCARCH)/Makefile
 
+ifneq ($(cc-name),clang)
 KBUILD_CFLAGS  += $(call cc-option,-fno-delete-null-pointer-checks,)
+endif
 KBUILD_CFLAGS  += $(call cc-disable-warning,frame-address,)
 
 ifdef CONFIG_LD_DEAD_CODE_DATA_ELIMINATION
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 2d449337a360..894a8d18bf97 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -87,11 +87,13 @@ else
 KBUILD_AFLAGS += -m64
 KBUILD_CFLAGS += -m64
 
+ifneq ($(cc-name),clang)
 # Align jump targets to 1 byte, not the default 16 bytes:
 KBUILD_CFLAGS += -falign-jumps=1
 
 # Pack loops tightly as well:
 KBUILD_CFLAGS += -falign-loops=1
+endif
 
 # Don't autogenerate traditional x87 instructions
 KBUILD_CFLAGS += $(call cc-option,-mno-80387)
-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH 4/7] x86, boot, LLVM: #undef memcpy etc in string.c

2017-03-16 Thread Michael Davidson
undef memcpy and friends in boot/string.c so that the functions
defined here will have the correct names, otherwise we end up
up trying to redefine __builtin_memcpy etc.
Surprisingly, gcc allows this (and, helpfully, discards the
__builtin_ prefix from the function name when compiling it),
but clang does not.

Adding these #undef's appears to preserve what I assume was
the original intent of the code.

Signed-off-by: Michael Davidson <m...@google.com>
---
 arch/x86/boot/string.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/x86/boot/string.c b/arch/x86/boot/string.c
index 5457b02fc050..b40266850869 100644
--- a/arch/x86/boot/string.c
+++ b/arch/x86/boot/string.c
@@ -16,6 +16,15 @@
 #include "ctype.h"
 #include "string.h"
 
+/*
+ * Undef these macros so that the functions that we provide
+ * here will have the correct names regardless of how string.h
+ * may have chosen to #define them.
+ */
+#undef memcpy
+#undef memset
+#undef memcmp
+
 int memcmp(const void *s1, const void *s2, size_t len)
 {
bool diff;
-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH 6/7] md/raid10, LLVM: get rid of variable length array

2017-03-16 Thread Michael Davidson
Replace a variable length array in a struct by allocating
the memory for the entire struct in a char array on the stack.

Signed-off-by: Michael Davidson <m...@google.com>
---
 drivers/md/raid10.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 063c43d83b72..158ebdff782c 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -4654,11 +4654,10 @@ static int handle_reshape_read_error(struct mddev 
*mddev,
/* Use sync reads to get the blocks from somewhere else */
int sectors = r10_bio->sectors;
struct r10conf *conf = mddev->private;
-   struct {
-   struct r10bio r10_bio;
-   struct r10dev devs[conf->copies];
-   } on_stack;
-   struct r10bio *r10b = _stack.r10_bio;
+   char on_stack_r10_bio[sizeof(struct r10bio) +
+ conf->copies * sizeof(struct r10dev)]
+ __aligned(__alignof__(struct r10bio));
+   struct r10bio *r10b = (struct r10bio *)on_stack_r10_bio;
int slot = 0;
int idx = 0;
struct bio_vec *bvec = r10_bio->master_bio->bi_io_vec;
-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH 7/7] crypto, x86, LLVM: aesni - fix token pasting

2017-03-16 Thread Michael Davidson
aes_ctrby8_avx-x86_64.S uses the C preprocessor for token pasting
of character sequences that are not valid preprocessor tokens.
While this is allowed when preprocessing assembler files it exposes
an incompatibilty between the clang and gcc preprocessors where
clang does not strip leading white space from macro parameters,
leading to the CONCAT(%xmm, i) macro expansion on line 96 resulting
in a token with a space character embedded in it.

While this could be fixed by deleting the offending space character,
the assembler is perfectly capable of handling the token pasting
correctly for itself so it seems preferable to let it do so and to
get rid or the CONCAT(), DDQ() and XMM() preprocessor macros.

Signed-off-by: Michael Davidson <m...@google.com>
---
 arch/x86/crypto/aes_ctrby8_avx-x86_64.S | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S 
b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index a916c4a61165..5f6a5af9c489 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -65,7 +65,6 @@
 #include 
 #include 
 
-#define CONCAT(a,b)a##b
 #define VMOVDQ vmovdqu
 
 #define xdata0 %xmm0
@@ -92,8 +91,6 @@
 #define num_bytes  %r8
 
 #define tmp%r10
-#defineDDQ(i)  CONCAT(ddq_add_,i)
-#defineXMM(i)  CONCAT(%xmm, i)
 #defineDDQ_DATA0
 #defineXDATA   1
 #define KEY_1281
@@ -131,12 +128,12 @@ ddq_add_8:
 /* generate a unique variable for ddq_add_x */
 
 .macro setddq n
-   var_ddq_add = DDQ(\n)
+   var_ddq_add = ddq_add_\n
 .endm
 
 /* generate a unique variable for xmm register */
 .macro setxdata n
-   var_xdata = XMM(\n)
+   var_xdata = %xmm\n
 .endm
 
 /* club the numeric 'id' to the symbol 'name' */
-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH 1/7] Makefile, LLVM: add -no-integrated-as to KBUILD_[AC]FLAGS

2017-03-16 Thread Michael Davidson
Add -no-integrated-as to KBUILD_AFLAGS and KBUILD_CFLAGS
for clang.

Signed-off-by: Michael Davidson <m...@google.com>
---
 Makefile | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Makefile b/Makefile
index b841fb36beb2..b21fd0ca2946 100644
--- a/Makefile
+++ b/Makefile
@@ -704,6 +704,8 @@ KBUILD_CFLAGS += $(call cc-disable-warning, 
tautological-compare)
 # See modpost pattern 2
 KBUILD_CFLAGS += $(call cc-option, -mno-global-merge,)
 KBUILD_CFLAGS += $(call cc-option, -fcatch-undefined-behavior)
+KBUILD_CFLAGS += $(call cc-option, -no-integrated-as)
+KBUILD_AFLAGS += $(call cc-option, -no-integrated-as)
 else
 
 # These warnings generated too much noise in a regular build.
-- 
2.12.0.367.g23dc2f6d3c-goog



[PATCH] crypto, x86: aesni - fix token pasting for clang

2017-03-15 Thread Michael Davidson
aes_ctrby8_avx-x86_64.S uses the C preprocessor for token pasting
of character sequences that are not valid preprocessor tokens.
While this is allowed when preprocessing assembler files it exposes
an incompatibilty between the clang and gcc preprocessors where
clang does not strip leading white space from macro parameters,
leading to the CONCAT(%xmm, i) macro expansion on line 96 resulting
in a token with a space character embedded in it.

While this could be resolved by deleting the offending space character,
the assembler is perfectly capable of doing the token pasting correctly
for itself so we can just get rid of the preprocessor macros.

Signed-off-by: Michael Davidson <m...@google.com>
---
 arch/x86/crypto/aes_ctrby8_avx-x86_64.S | 7 ++-
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S 
b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
index a916c4a61165..5f6a5af9c489 100644
--- a/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
+++ b/arch/x86/crypto/aes_ctrby8_avx-x86_64.S
@@ -65,7 +65,6 @@
 #include 
 #include 
 
-#define CONCAT(a,b)a##b
 #define VMOVDQ vmovdqu
 
 #define xdata0 %xmm0
@@ -92,8 +91,6 @@
 #define num_bytes  %r8
 
 #define tmp%r10
-#defineDDQ(i)  CONCAT(ddq_add_,i)
-#defineXMM(i)  CONCAT(%xmm, i)
 #defineDDQ_DATA0
 #defineXDATA   1
 #define KEY_1281
@@ -131,12 +128,12 @@ ddq_add_8:
 /* generate a unique variable for ddq_add_x */
 
 .macro setddq n
-   var_ddq_add = DDQ(\n)
+   var_ddq_add = ddq_add_\n
 .endm
 
 /* generate a unique variable for xmm register */
 .macro setxdata n
-   var_xdata = XMM(\n)
+   var_xdata = %xmm\n
 .endm
 
 /* club the numeric 'id' to the symbol 'name' */
-- 
2.12.0.367.g23dc2f6d3c-goog