[PATCH] cgroup: reorder flexible array members of struct cgroup_root

2017-10-16 Thread Nick Desaulniers
When compiling arch/x86/boot/compressed/eboot.c with HOSTCC=clang, the
following warning is observed:

./include/linux/cgroup-defs.h:391:16: warning: field 'cgrp' with
variable sized type 'struct cgroup' not at the end of a struct or class
is a GNU extension [-Wgnu-variable-sized-type-not-at-end]
struct cgroup cgrp;
  ^
Flexible array members are a C99 feature, but must be the last member of
a struct. Structs with flexible members composed in other structs must
also be the final members, unless using GNU C extensions.

struct cgroup_root's member cgrp is a struct cgroup, struct cgroup's
member ancestor_ids is a flexible member.

Signed-off-by: Nick Desaulniers 
---
Alternatively, we could:
* Not use flexible array members. The flexible array members and allocation
  strategy, added in commit b11cfb5807e30 mentions the hot path, so this is
  likely not an option?
* Disable this warning for HOSTCC==clang.  I'd rather not, since there's
  nothing really requiring the use of this particular GNU C extension here,
  or specific location of the member cgrp within struct cgroup_root AFAICT.

 include/linux/cgroup-defs.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index ade4a78a54c2..2ef256932bf3 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -387,9 +387,6 @@ struct cgroup_root {
/* Unique id for this hierarchy. */
int hierarchy_id;
 
-   /* The root cgroup.  Root is destroyed on its release. */
-   struct cgroup cgrp;
-
/* for cgrp->ancestor_ids[0] */
int cgrp_ancestor_id_storage;
 
@@ -410,6 +407,9 @@ struct cgroup_root {
 
/* The name for this hierarchy - may be empty */
char name[MAX_CGROUP_ROOT_NAMELEN];
+
+   /* The root cgroup.  Root is destroyed on its release. */
+   struct cgroup cgrp;
 };
 
 /*
-- 
2.11.0



Re: [PATCH] cgroup: reorder flexible array members of struct cgroup_root

2017-10-16 Thread Nick Desaulniers
On Mon, Oct 16, 2017 at 11:33:21PM -0700, Nick Desaulniers wrote:
> When compiling arch/x86/boot/compressed/eboot.c with HOSTCC=clang, the

Actually, not sure this is because HOSTCC was specifically clang, I
think that could be reworded to `When compiling ... with Clang, the
...`.


Re: [PATCH] cgroup: reorder flexible array members of struct cgroup_root

2017-10-16 Thread Nick Desaulniers
On Mon, Oct 16, 2017 at 11:40 PM, Nick Desaulniers
 wrote:
> On Mon, Oct 16, 2017 at 11:33:21PM -0700, Nick Desaulniers wrote:
>> When compiling arch/x86/boot/compressed/eboot.c with HOSTCC=clang, the
>
> Actually, not sure this is because HOSTCC was specifically clang, I
> think that could be reworded to `When compiling ... with Clang, the
> ...`.

arch/x86/boot/compressed/Makefile:28 indeed overwrites KBUILD_CFLAGS,
so wording should be changed to as-suggested in my previous post,
since that means it's CC=clang, not HOSTCC=clang as the source of the
warning.


Re: [PATCH] KVM: x86: dynamically allocate large struct in em_fxrstor

2017-05-25 Thread Nick Desaulniers
On Thu, May 25, 2017 at 04:07:08PM +0200, Paolo Bonzini wrote:
> I think we should do the fixup backwards.
> 
> That is:
> 
> - first do get_fpu
> 
> - if the fixup is necessary, i.e. ctxt->mode < X86EMUL_MODE_PROT64, do
> fxsave into &fxstate.
> 
> - then do segmented_read_std with the correct size, which is
>   - offsetof(struct fxregs_state, xmm_space[16]), i.e. 416
> if ctxt->mode == X86EMUL_MODE_PROT64
>   - offsetof(struct fxregs_state, xmm_space[8]), i.e. 288
> if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=1
>   - offsetof(struct fxregs_state, xmm_space[0]), i.e. 160
> if ctxt->mode < X86EMUL_MODE_PROT64 and CR4.OSFXSR=0

but we still want to do a segmented_read_std with size 512 otherwise,
correct?

> - then check fx_state.mxcsr
> 
> - then do fxrstor

This sounds like we conditionally do the fxsave, but then always do the
fxrstor.  Is that ok? I guess the original code kind of does that as
well.

> - finally do put_fpu

Sounds straight forward.  I can see how fxsave and CR4.OSFXSR are
accessed in fxstor_fixup.  Is it ok to skip those memcpy's that would
otherwise occur when calling fxrstor_fixup() (which after these changes,
we would not be)?


Re: [PATCH] mm/zsmalloc: fix -Wunneeded-internal-declaration warning

2017-05-25 Thread Nick Desaulniers
On Wed, May 24, 2017 at 05:16:18PM +0900, Sergey Senozhatsky wrote:
> On (05/23/17 22:38), Nick Desaulniers wrote:
> > Fixes the following warning, found with Clang:
> well, no objections from my side. MM seems to be getting more and
> more `__maybe_unused' annotations because of clang.

Indeed, but does find bugs when this warning pops up unexpected (unlike
in this particular instance). See:

https://patchwork.kernel.org/patch/9738897/
https://www.spinics.net/lists/intel-gfx/msg128737.html

TL;DR
>>> You have actually uncovered a bug here where the call is not
>>> supposed to be optional in the first place.


Re: [Patch v2] mm/vmscan: fix unsequenced modification and access warning

2017-05-25 Thread Nick Desaulniers
On Tue, May 16, 2017 at 10:27:46AM +0200, Michal Hocko wrote:
> I guess it is worth reporting this to clang bugzilla. Could you take
> care of that Nick?

>From https://bugs.llvm.org//show_bug.cgi?id=33065#c5
it seems that this is indeed a sequence bug in the previous version of
this code and not a compiler bug.  You can read that response for the
properly-cited wording but my TL;DR/understanding is for the given code:

struct foo bar = {
  .a = (c = 0),
  .b = c,
};

That the compiler is allowed to reorder the initializations of bar.a and
bar.b, so what the value of c here might not be what you expect.


[PATCH] Input: mousedev - fix implicit conversion warning

2017-05-25 Thread Nick Desaulniers
Clang warns:

drivers/input/mousedev.c:653:63: error: implicit conversion from 'int'
to 'signed char' changes value from 200 to -56
[-Wconstant-conversion]
  client->ps2[1] = 0x60; client->ps2[2] = 3; client->ps2[3] = 200;
~ ^~~

As far as I can tell, from

http://www.computer-engineering.org/ps2mouse/

Under "Command Set" > "0xE9 (Status Request)"

the value 200 is a valid sample rate. An explicit cast silences this
warning.

Signed-off-by: Nick Desaulniers 
---
 drivers/input/mousedev.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index 0e0ff84088fd..816e2431bba8 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -650,7 +650,9 @@ static void mousedev_generate_response(struct 
mousedev_client *client,
break;
 
case 0xe9: /* Get info */
-   client->ps2[1] = 0x60; client->ps2[2] = 3; client->ps2[3] = 200;
+   client->ps2[1] = 0x60;
+   client->ps2[2] = 3;
+   client->ps2[3] = (char) 200;
client->bufsiz = 4;
break;
 
-- 
2.11.0



[PATCH v2] Input: mousedev - fix implicit conversion warning

2017-05-26 Thread Nick Desaulniers
Clang warns:

drivers/input/mousedev.c:653:63: error: implicit conversion from 'int'
to 'signed char' changes value from 200 to -56
[-Wconstant-conversion]
  client->ps2[1] = 0x60; client->ps2[2] = 3; client->ps2[3] = 200;
~ ^~~

As far as I can tell, from

http://www.computer-engineering.org/ps2mouse/

Under "Command Set" > "0xE9 (Status Request)"

the value 200 is a valid sample rate. Using unsigned char, rather than
signed char, for client->ps2 silences this warning.

Signed-off-by: Nick Desaulniers 
---
* Changed from using an explicit cast to changing the signedness
  of the declaration.

 drivers/input/mousedev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index 0e0ff84088fd..c83688eb2ef4 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -103,7 +103,7 @@ struct mousedev_client {
spinlock_t packet_lock;
int pos_x, pos_y;
 
-   signed char ps2[6];
+   unsigned char ps2[6];
unsigned char ready, buffer, bufsiz;
unsigned char imexseq, impsseq;
enum mousedev_emul mode;
-- 
2.11.0



[PATCH v3] Input: mousedev - fix implicit conversion warning

2017-05-26 Thread Nick Desaulniers
Clang warns:

drivers/input/mousedev.c:653:63: error: implicit conversion from 'int'
to 'signed char' changes value from 200 to -56
[-Wconstant-conversion]
  client->ps2[1] = 0x60; client->ps2[2] = 3; client->ps2[3] = 200;
~ ^~~

As far as I can tell, from

http://www.computer-engineering.org/ps2mouse/

Under "Command Set" > "0xE9 (Status Request)"

the value 200 is a valid sample rate. Using unsigned char, rather than
signed char, for client->ps2 silences this warning.

Signed-off-by: Nick Desaulniers 
---
* Additionally change function signature for the lone function dealing
  with ps2 data.

 drivers/input/mousedev.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index 0e0ff84088fd..0e31a109b1b4 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -103,7 +103,7 @@ struct mousedev_client {
spinlock_t packet_lock;
int pos_x, pos_y;
 
-   signed char ps2[6];
+   unsigned char ps2[6];
unsigned char ready, buffer, bufsiz;
unsigned char imexseq, impsseq;
enum mousedev_emul mode;
@@ -577,7 +577,7 @@ static inline int mousedev_limit_delta(int delta, int limit)
 }
 
 static void mousedev_packet(struct mousedev_client *client,
-   signed char *ps2_data)
+   unsigned char *ps2_data)
 {
struct mousedev_motion *p = &client->packets[client->tail];
 
-- 
2.11.0



Re: [PATCH v3] Input: mousedev - fix implicit conversion warning

2017-05-29 Thread Nick Desaulniers
On Fri, May 26, 2017 at 10:07:46AM -0700, Dmitry Torokhov wrote:
> If you look at the code of this fucntion we really use ps2_data as
> signed in calculations, and this change would break that. While making
> ps2_data u8 might be beneficial we'd need to rework mousedev_packet() to
> use signed temporaries for dx, dy and dz before stufifng them into
> ps2_data.

Is your recommendation then to stick with the simple cast as in V1 of
this patch, or stick with u8 and rework mousedev_packet()?


[PATCH v2] KVM: x86: avoid large stack allocations in em_fxrstor

2017-05-29 Thread Nick Desaulniers
em_fxstor previously called fxstor_fixup.  Both created instances of
struct fxregs_state on the stack, which triggered the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
in function
  'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
   ^
with CONFIG_FRAME_WARN set to 1024.

This patch does the fixup in em_fxstor now, avoiding one additional
struct fxregs_state, and now fxstor_fixup can be removed as it has no
other call sites.

Signed-off-by: Nick Desaulniers 
---
New in V2:
* reworked patch to do what was recommended by maintainers in:
  https://lkml.org/lkml/2017/5/25/391

 arch/x86/kvm/emulate.c | 55 +-
 1 file changed, 19 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..0087d3458604 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3985,57 +3985,40 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 }
 
-static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
-   struct fxregs_state *new)
-{
-   int rc = X86EMUL_CONTINUE;
-   struct fxregs_state old;
-
-   rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
-   if (rc != X86EMUL_CONTINUE)
-   return rc;
-
-   /*
-* 64 bit host will restore XMM 8-15, which is not correct on non-64
-* bit guests.  Load the current values in order to preserve 64 bit
-* XMMs after fxrstor.
-*/
-#ifdef CONFIG_X86_64
-   /* XXX: accessing XMM 8-15 very awkwardly */
-   memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
-#endif
-
-   /*
-* Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
-* does save and restore MXCSR.
-*/
-   if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
-   memcpy(new->xmm_space, old.xmm_space, 8 * 16);
-
-   return rc;
-}
-
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
struct fxregs_state fx_state;
int rc;
+   unsigned int size;
 
rc = check_fxsr(ctxt);
if (rc != X86EMUL_CONTINUE)
return rc;
 
-   rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
+   ctxt->ops->get_fpu(ctxt);
+
+   if (ctxt->mode < X86EMUL_MODE_PROT64) {
+   rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
+   if (rc != X86EMUL_CONTINUE)
+   return rc;
+   /*
+* Hardware doesn't save and restore XMM 0-7 without
+* CR4.OSFXSR, but does save and restore MXCSR.
+*/
+   if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
+   size = offsetof(struct fxregs_state, xmm_space[8]);
+   else
+   size = offsetof(struct fxregs_state, xmm_space[0]);
+   } else if (ctxt->mode == X86EMUL_MODE_PROT64)
+   size = offsetof(struct fxregs_state, xmm_space[16]);
+
+   rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
if (rc != X86EMUL_CONTINUE)
return rc;
 
if (fx_state.mxcsr >> 16)
return emulate_gp(ctxt, 0);
 
-   ctxt->ops->get_fpu(ctxt);
-
-   if (ctxt->mode < X86EMUL_MODE_PROT64)
-   rc = fxrstor_fixup(ctxt, &fx_state);
-
if (rc == X86EMUL_CONTINUE)
rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
 
-- 
2.11.0



Re: [PATCH] drm/i915/gvt: remove redundant -Wall

2017-05-29 Thread Nick Desaulniers
ping for review


Re: [PATCH v2] KVM: x86: avoid large stack allocations in em_fxrstor

2017-05-29 Thread Nick Desaulniers
On Tue, May 30, 2017 at 04:14:50AM +0800, kbuild test robot wrote:
>arch/x86/kvm/emulate.c: In function 'em_fxrstor':
> >> arch/x86/kvm/emulate.c:4015:5: warning: 'size' may be used uninitialized 
> >> in this function [-Wmaybe-uninitialized]
>  rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
>  ~~~^

I definitely thought about this before submitting.  size is initialized
in the case ctxt->mode < X86EMUL_MODE_PROT64, and
ctxt->mode == X86EMUL_MODE_PROT64, but uninitialized in the case
ctxt->mode > X86EMUL_MODE_PROT64.

Paulo mentions this is impossible in:

https://lkml.org/lkml/2017/5/26/62

>>> ctxt->mode is never > X86EMUL_MODE_PROT64
>>> (see the definition of enum x86emul_mode in
>>> arch/x86/include/asm/kvm_emulate.h.)

>From arch/x86/include/asm/kvm_emulate.h:

272 enum x86emul_mode {
273   X86EMUL_MODE_REAL,  /* Real mode. */
274   X86EMUL_MODE_VM86,  /* Virtual 8086 mode. */
275   X86EMUL_MODE_PROT16,  /* 16-bit protected mode. */
276   X86EMUL_MODE_PROT32,  /* 32-bit protected mode. */
277   X86EMUL_MODE_PROT64,  /* 64-bit (long) mode.*/
278 };

I would still rather err on the side of safety, and initialize the
variable. So I will submit a v3 initializing size to 0, and check that
size was set to a reasonable value at some point before invoke
segmented_read_std.


[PATCH v3] KVM: x86: avoid large stack allocations in em_fxrstor

2017-05-29 Thread Nick Desaulniers
em_fxstor previously called fxstor_fixup.  Both created instances of
struct fxregs_state on the stack, which triggered the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
in function
  'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
   ^
with CONFIG_FRAME_WARN set to 1024.

This patch does the fixup in em_fxstor now, avoiding one additional
struct fxregs_state, and now fxstor_fixup can be removed as it has no
other call sites.

Signed-off-by: Nick Desaulniers 
---
New in V3:
* initialized size to 0 to avoid maybe-uninitialized warning.  Check
  that it gets set to something other than 0 before passing size to
  segmented_read_std().

New in V2:
* reworked patch to do what was recommended by maintainers in:
  https://lkml.org/lkml/2017/5/25/391

 arch/x86/kvm/emulate.c | 58 +++---
 1 file changed, 22 insertions(+), 36 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..8c74a3764405 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3985,57 +3985,43 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 }
 
-static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
-   struct fxregs_state *new)
-{
-   int rc = X86EMUL_CONTINUE;
-   struct fxregs_state old;
-
-   rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
-   if (rc != X86EMUL_CONTINUE)
-   return rc;
-
-   /*
-* 64 bit host will restore XMM 8-15, which is not correct on non-64
-* bit guests.  Load the current values in order to preserve 64 bit
-* XMMs after fxrstor.
-*/
-#ifdef CONFIG_X86_64
-   /* XXX: accessing XMM 8-15 very awkwardly */
-   memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
-#endif
-
-   /*
-* Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
-* does save and restore MXCSR.
-*/
-   if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
-   memcpy(new->xmm_space, old.xmm_space, 8 * 16);
-
-   return rc;
-}
-
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
struct fxregs_state fx_state;
int rc;
+   unsigned int size = 0;
 
rc = check_fxsr(ctxt);
if (rc != X86EMUL_CONTINUE)
return rc;
 
-   rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
+   ctxt->ops->get_fpu(ctxt);
+
+   if (ctxt->mode < X86EMUL_MODE_PROT64) {
+   rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
+   if (rc != X86EMUL_CONTINUE)
+   return rc;
+   /*
+* Hardware doesn't save and restore XMM 0-7 without
+* CR4.OSFXSR, but does save and restore MXCSR.
+*/
+   if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
+   size = offsetof(struct fxregs_state, xmm_space[8]);
+   else
+   size = offsetof(struct fxregs_state, xmm_space[0]);
+   } else if (ctxt->mode == X86EMUL_MODE_PROT64)
+   size = offsetof(struct fxregs_state, xmm_space[16]);
+
+   if (size == 0)
+   return X86EMUL_UNHANDLEABLE;
+
+   rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
if (rc != X86EMUL_CONTINUE)
return rc;
 
if (fx_state.mxcsr >> 16)
return emulate_gp(ctxt, 0);
 
-   ctxt->ops->get_fpu(ctxt);
-
-   if (ctxt->mode < X86EMUL_MODE_PROT64)
-   rc = fxrstor_fixup(ctxt, &fx_state);
-
if (rc == X86EMUL_CONTINUE)
rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
 
-- 
2.11.0



Re: [PATCH v3] KVM: x86: avoid large stack allocations in em_fxrstor

2017-05-29 Thread Nick Desaulniers
On Mon, May 29, 2017 at 01:39:08PM -0700, Nick Desaulniers wrote:
> + if (ctxt->mode < X86EMUL_MODE_PROT64) {
> + rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
> + if (rc != X86EMUL_CONTINUE)
> + return rc;
> + /*
> +  * Hardware doesn't save and restore XMM 0-7 without
> +  * CR4.OSFXSR, but does save and restore MXCSR.
> +  */
> + if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
> + size = offsetof(struct fxregs_state, xmm_space[8]);
> + else
> + size = offsetof(struct fxregs_state, xmm_space[0]);
> + } else if (ctxt->mode == X86EMUL_MODE_PROT64)
> + size = offsetof(struct fxregs_state, xmm_space[16]);
> +
> + if (size == 0)
> + return X86EMUL_UNHANDLEABLE;
> +
> + rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
>   if (rc != X86EMUL_CONTINUE)
>   return rc;

Thinking more about this, I think it may be more elegant to move the
segmented_read_std into the then/else branches above, remove
initialization of size, and remove the size == 0 check.  Thoughts?


[PATCH v4] KVM: x86: avoid large stack allocations in em_fxrstor

2017-05-29 Thread Nick Desaulniers
em_fxstor previously called fxstor_fixup.  Both created instances of
struct fxregs_state on the stack, which triggered the warning:

arch/x86/kvm/emulate.c:4018:12: warning: stack frame size of 1080 bytes
in function
  'em_fxrstor' [-Wframe-larger-than=]
static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
   ^
with CONFIG_FRAME_WARN set to 1024.

This patch does the fixup in em_fxstor now, avoiding one additional
struct fxregs_state, and now fxstor_fixup can be removed as it has no
other call sites.

Signed-off-by: Nick Desaulniers 
---
New in V4:
* undo changes for V3, prefer to only call segmented_read_std() when
  size has been initialized.

New in V3:
* initialized size to 0 to avoid maybe-uninitialized warning.  Check
  that it gets set to something other than 0 before passing size to
  segmented_read_std().

New in V2:
* reworked patch to do what was recommended by maintainers in:
  https://lkml.org/lkml/2017/5/25/391

 arch/x86/kvm/emulate.c | 64 --
 1 file changed, 26 insertions(+), 38 deletions(-)

diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index 0816ab2e8adc..0367f7f81792 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -3985,57 +3985,45 @@ static int em_fxsave(struct x86_emulate_ctxt *ctxt)
return segmented_write_std(ctxt, ctxt->memop.addr.mem, &fx_state, size);
 }
 
-static int fxrstor_fixup(struct x86_emulate_ctxt *ctxt,
-   struct fxregs_state *new)
-{
-   int rc = X86EMUL_CONTINUE;
-   struct fxregs_state old;
-
-   rc = asm_safe("fxsave %[fx]", , [fx] "+m"(old));
-   if (rc != X86EMUL_CONTINUE)
-   return rc;
-
-   /*
-* 64 bit host will restore XMM 8-15, which is not correct on non-64
-* bit guests.  Load the current values in order to preserve 64 bit
-* XMMs after fxrstor.
-*/
-#ifdef CONFIG_X86_64
-   /* XXX: accessing XMM 8-15 very awkwardly */
-   memcpy(&new->xmm_space[8 * 16/4], &old.xmm_space[8 * 16/4], 8 * 16);
-#endif
-
-   /*
-* Hardware doesn't save and restore XMM 0-7 without CR4.OSFXSR, but
-* does save and restore MXCSR.
-*/
-   if (!(ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR))
-   memcpy(new->xmm_space, old.xmm_space, 8 * 16);
-
-   return rc;
-}
-
 static int em_fxrstor(struct x86_emulate_ctxt *ctxt)
 {
struct fxregs_state fx_state;
int rc;
+   unsigned int size;
 
rc = check_fxsr(ctxt);
if (rc != X86EMUL_CONTINUE)
return rc;
 
-   rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state, 512);
-   if (rc != X86EMUL_CONTINUE)
-   return rc;
+   ctxt->ops->get_fpu(ctxt);
+
+   if (ctxt->mode < X86EMUL_MODE_PROT64) {
+   rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state));
+   if (rc != X86EMUL_CONTINUE)
+   return rc;
+   /*
+* Hardware doesn't save and restore XMM 0-7 without
+* CR4.OSFXSR, but does save and restore MXCSR.
+*/
+   if (ctxt->ops->get_cr(ctxt, 4) & X86_CR4_OSFXSR)
+   size = offsetof(struct fxregs_state, xmm_space[8]);
+   else
+   size = offsetof(struct fxregs_state, xmm_space[0]);
+   rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
+   size);
+   if (rc != X86EMUL_CONTINUE)
+   return rc;
+   } else if (ctxt->mode == X86EMUL_MODE_PROT64) {
+   size = offsetof(struct fxregs_state, xmm_space[16]);
+   rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, &fx_state,
+   size);
+   if (rc != X86EMUL_CONTINUE)
+   return rc;
+   }
 
if (fx_state.mxcsr >> 16)
return emulate_gp(ctxt, 0);
 
-   ctxt->ops->get_fpu(ctxt);
-
-   if (ctxt->mode < X86EMUL_MODE_PROT64)
-   rc = fxrstor_fixup(ctxt, &fx_state);
-
if (rc == X86EMUL_CONTINUE)
rc = asm_safe("fxrstor %[fx]", : [fx] "m"(fx_state));
 
-- 
2.11.0



[PATCH v4] Input: mousedev - fix implicit conversion warning

2017-05-29 Thread Nick Desaulniers
Clang warns:

drivers/input/mousedev.c:653:63: error: implicit conversion from 'int'
to 'signed char' changes value from 200 to -56
[-Wconstant-conversion]
  client->ps2[1] = 0x60; client->ps2[2] = 3; client->ps2[3] = 200;
~ ^~~

As far as I can tell, from

http://www.computer-engineering.org/ps2mouse/

Under "Command Set" > "0xE9 (Status Request)"

the value 200 is a valid sample rate. Using unsigned char [], rather than
signed char [], for client->ps2 silences this warning.

Also moves some reused logic into a helper function.

Signed-off-by: Nick Desaulniers 
---
What's new in v4:
* replace mousedev_limit_delta() with update_clamped() that also updates
  the ps2_data and delta values. The use of the temporary val should
  avoid integral conversion and promotion issues.

 drivers/input/mousedev.c | 23 +++
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/drivers/input/mousedev.c b/drivers/input/mousedev.c
index 0e0ff84088fd..e645b8c6f2cb 100644
--- a/drivers/input/mousedev.c
+++ b/drivers/input/mousedev.c
@@ -103,7 +103,7 @@ struct mousedev_client {
spinlock_t packet_lock;
int pos_x, pos_y;
 
-   signed char ps2[6];
+   unsigned char ps2[6];
unsigned char ready, buffer, bufsiz;
unsigned char imexseq, impsseq;
enum mousedev_emul mode;
@@ -571,27 +571,27 @@ static int mousedev_open(struct inode *inode, struct file 
*file)
return error;
 }
 
-static inline int mousedev_limit_delta(int delta, int limit)
+static inline void
+update_clamped(unsigned char *ps2_data, int *delta, int limit)
 {
-   return delta > limit ? limit : (delta < -limit ? -limit : delta);
+   int val = *delta > limit ? limit : (*delta < -limit ? -limit : *delta);
+   *ps2_data = (unsigned char) val;
+   *delta -= val;
 }
 
 static void mousedev_packet(struct mousedev_client *client,
-   signed char *ps2_data)
+   unsigned char *ps2_data)
 {
struct mousedev_motion *p = &client->packets[client->tail];
 
ps2_data[0] = 0x08 |
((p->dx < 0) << 4) | ((p->dy < 0) << 5) | (p->buttons & 0x07);
-   ps2_data[1] = mousedev_limit_delta(p->dx, 127);
-   ps2_data[2] = mousedev_limit_delta(p->dy, 127);
-   p->dx -= ps2_data[1];
-   p->dy -= ps2_data[2];
+   update_clamped(&ps2_data[1], &p->dx, 127);
+   update_clamped(&ps2_data[2], &p->dy, 127);
 
switch (client->mode) {
case MOUSEDEV_EMUL_EXPS:
-   ps2_data[3] = mousedev_limit_delta(p->dz, 7);
-   p->dz -= ps2_data[3];
+   update_clamped(&ps2_data[3], &p->dz, 7);
ps2_data[3] = (ps2_data[3] & 0x0f) | ((p->buttons & 0x18) << 1);
client->bufsiz = 4;
break;
@@ -599,8 +599,7 @@ static void mousedev_packet(struct mousedev_client *client,
case MOUSEDEV_EMUL_IMPS:
ps2_data[0] |=
((p->buttons & 0x10) >> 3) | ((p->buttons & 0x08) >> 1);
-   ps2_data[3] = mousedev_limit_delta(p->dz, 127);
-   p->dz -= ps2_data[3];
+   update_clamped(&ps2_data[3], &p->dz, 127);
client->bufsiz = 4;
break;
 
-- 
2.11.0



Re: [PATCH v2] sched/sysctl: Fix attributes of some extern declarations

2017-10-31 Thread Nick Desaulniers
> El Tue, Oct 30, 2017 at 10:57:58AM +0100 Ingo Molnar ha dit:
>> So I hate this change, because it pointlessly duplicates an attribute that 
>> should
>> only matter at the definition site.
>
> It's certainly not ideal, and then again essentially the same is done
> in kernel/sched/sched.h, just that here the specific attribute is
> hidden behind const_debug.
>
>> The Clang warning:
>>
>> >   kernel/sched/sched.h:1618:33: warning: section attribute is specified on
>> > redeclared variable [-Wsection]
>>
>> suggests that the -Wsection warning can be turned off. The Clang build should
>> probably do that.

Naive question: can these definitions be hoisted to include/linux/sched.h?


[PATCH] kbuild: Set KBUILD_CFLAGS before incl. arch Makefile

2017-11-02 Thread Nick Desaulniers
From: Chris Fries 

Set the clang KBUILD_CFLAGS up before including arch/ Makefiles,
so that ld-options (etc.) can work correctly.

This fixes errors with clang such as ld-options trying to CC
against your host architecture, but LD trying to link against
your target architecture.

We didn't notice this problem on Android, because we took the original
LLVMLinux patch into our 4.4 kernels, which did not have this issue. We
ran into this taking the proper upstream patch on newer kernel versions.
The original LLVMLinux patch can be seen at:

http://git.linuxfoundation.org/?p=llvmlinux/kernel.git;a=blobdiff;f=Makefile;h=389006c4ef494cda3a1ee52bf355618673ab4f31;hp=e41a3356abee83f08288362950bfceebd25ec3c2;hb=ef9126da11b18ff34eb1f01561f53c378860336c;hpb=f800c25b7a762d445ba1439a2428c8362157eba6

It seems that when the patch was re-upstreamed, a V2 was requested that
moved the definition of Clang's target triple to be later in the top
level Makefile than the inclusion of the arch specific Makefile,
breaking macros like ld-option when cross compiling. V2 was requested
at:

https://lkml.org/lkml/2017/4/21/116

Signed-off-by: Chris Fries 
Signed-off-by: Nick Desaulniers 
---
 Makefile | 64 
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/Makefile b/Makefile
index 5f91a28a3cea..72ea86157114 100644
--- a/Makefile
+++ b/Makefile
@@ -512,6 +512,38 @@ ifneq ($(filter install,$(MAKECMDGOALS)),)
 endif
 endif
 
+ifeq ($(cc-name),clang)
+ifneq ($(CROSS_COMPILE),)
+CLANG_TARGET   := --target=$(notdir $(CROSS_COMPILE:%-=%))
+GCC_TOOLCHAIN  := $(realpath $(dir $(shell which $(LD)))/..)
+endif
+ifneq ($(GCC_TOOLCHAIN),)
+CLANG_GCC_TC   := --gcc-toolchain=$(GCC_TOOLCHAIN)
+endif
+KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
+KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
+KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
+KBUILD_CFLAGS += $(call cc-disable-warning, unused-variable)
+KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier)
+KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
+KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
+# Quiet clang warning: comparison of unsigned expression < 0 is always false
+KBUILD_CFLAGS += $(call cc-disable-warning, tautological-compare)
+# CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the
+# source of a reference will be _MergedGlobals and not on of the whitelisted 
names.
+# See modpost pattern 2
+KBUILD_CFLAGS += $(call cc-option, -mno-global-merge,)
+KBUILD_CFLAGS += $(call cc-option, -fcatch-undefined-behavior)
+KBUILD_CFLAGS += $(call cc-option, -no-integrated-as)
+KBUILD_AFLAGS += $(call cc-option, -no-integrated-as)
+else
+
+# These warnings generated too much noise in a regular build.
+# Use make W=1 to enable them (see scripts/Makefile.extrawarn)
+KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable)
+KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
+endif
+
 ifeq ($(mixed-targets),1)
 # ===
 # We're called with mixed targets (*config and build targets).
@@ -695,38 +727,6 @@ ifdef CONFIG_CC_STACKPROTECTOR
 endif
 KBUILD_CFLAGS += $(stackp-flag)
 
-ifeq ($(cc-name),clang)
-ifneq ($(CROSS_COMPILE),)
-CLANG_TARGET   := --target=$(notdir $(CROSS_COMPILE:%-=%))
-GCC_TOOLCHAIN  := $(realpath $(dir $(shell which $(LD)))/..)
-endif
-ifneq ($(GCC_TOOLCHAIN),)
-CLANG_GCC_TC   := --gcc-toolchain=$(GCC_TOOLCHAIN)
-endif
-KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
-KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
-KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
-KBUILD_CFLAGS += $(call cc-disable-warning, unused-variable)
-KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier)
-KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
-KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
-# Quiet clang warning: comparison of unsigned expression < 0 is always false
-KBUILD_CFLAGS += $(call cc-disable-warning, tautological-compare)
-# CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the
-# source of a reference will be _MergedGlobals and not on of the whitelisted 
names.
-# See modpost pattern 2
-KBUILD_CFLAGS += $(call cc-option, -mno-global-merge,)
-KBUILD_CFLAGS += $(call cc-option, -fcatch-undefined-behavior)
-KBUILD_CFLAGS += $(call cc-option, -no-integrated-as)
-KBUILD_AFLAGS += $(call cc-option, -no-integrated-as)
-else
-
-# These warnings generated too much noise in a regular build.
-# Use make W=1 to enable them (see scripts/Makefile.extrawarn)
-KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable)
-KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
-endif
-
 ifdef CONFIG_FRAME_POINTER
 KBUILD_CFLAGS  += -fno-omit-frame-pointer -fno-optimize-sibling-calls
 else
-- 
2.15.0.403.gc27cc4dac6-goog



[PATCH] kbuild: fix linker feature test macros when cross compiling with Clang

2017-10-26 Thread Nick Desaulniers
I was not seeing my linker flags getting added when using ld-option when
cross compiling with Clang. Upon investigation, this seems to be due to
a difference in how GCC vs Clang handle cross compilation.

GCC is configured at build time to support one backend, that is implicit
when compiling.  Clang is explicit via the use of `-target ` and
ships with all supported backends by default.

GNU Make feature test macros that compile then link will always fail
when cross compiling with Clang unless Clang's triple is passed along to
the compiler. For example:

$ clang -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
unknown architecture of input file `temp.o' is incompatible with
aarch64 output
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to
00400078
$ echo $?
1

$ clang -target aarch64-linux-android- -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to 004002e4
$ echo $?
0

This causes conditional checks that invoke $(CC) without the target
triple, then $(LD) on the result, to always fail.

Signed-off-by: Nick Desaulniers 
Reviewed-by: Matthias Kaehlcke 
---
 scripts/Kbuild.include | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 9ffd3dda3889..23c4df90e8ff 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -160,12 +160,12 @@ cc-if-fullversion = $(shell [ $(cc-fullversion) $(1) $(2) 
] && echo $(3) || echo
 # cc-ldoption
 # Usage: ldflags += $(call cc-ldoption, -Wl$(comma)--hash-style=both)
 cc-ldoption = $(call try-run,\
-   $(CC) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
+   $(CC) $(CLANG_TARGET) $(1) -nostdlib -x c /dev/null -o 
"$$TMP",$(1),$(2))
 
 # ld-option
 # Usage: LDFLAGS += $(call ld-option, -X)
 ld-option = $(call try-run,\
-   $(CC) -x c /dev/null -c -o "$$TMPO" ; $(LD) $(1) "$$TMPO" -o 
"$$TMP",$(1),$(2))
+   $(CC) $(CLANG_TARGET) -x c /dev/null -c -o "$$TMPO" ; $(LD) $(1) 
"$$TMPO" -o "$$TMP",$(1),$(2))
 
 # ar-option
 # Usage: KBUILD_ARFLAGS := $(call ar-option,D)
-- 
2.15.0.rc2.357.g7e34df9404-goog



[PATCH] arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

2017-10-26 Thread Nick Desaulniers
Upon upgrading to binutils 2.27, we found that our lz4 compressed kernel
images were significantly larger, resulting is 10ms boot time regressions.

As noted by Rahul:
"aarch64 binaries uses RELA relocations, where each relocation entry
includes an addend value. This is similar to x86_64.  On x86_64, the
addend values are also stored at the relocation offset for relative
relocations. This is an optimization: in the case where code does not
need to be relocated, the loader can simply skip processing relative
relocations.  In binutils-2.25, both bfd and gold linkers did this for
x86_64, but only the gold linker did this for aarch64.  The kernel build
here is using the bfd linker, which stored zeroes at the relocation
offsets for relative relocations.  Since a set of zeroes compresses
better than a set of non-zero addend values, this behavior was resulting
in much better lz4 compression.

The bfd linker in binutils-2.27 is now storing the actual addend values
at the relocation offsets. The behavior is now consistent with what it
does for x86_64 and what gold linker does for both architectures.  The
change happened in this upstream commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f56df9d0d5ad89806c24e71f296576d82344613
Since a bunch of zeroes got replaced by non-zero addend values, we see
the side effect of lz4 compressed image being a bit bigger.

To get the old behavior from the bfd linker, "--no-apply-dynamic-relocs"
flag can be used:
$ LDFLAGS="--no-apply-dynamic-relocs" ./build/build.sh
With this flag, the compressed image size is back to what it was with
binutils-2.25.

If the kernel is using ASLR, there aren't additional runtime costs to
--no-apply-dynamic-relocs, as the relocations will need to be applied
again anyway after the kernel is relocated to a random address.

If the kernel is not using ASLR, then presumably the current default
behavior of the linker is better. Since the static linker performed the
dynamic relocs, and the kernel is not moved to a different address at
load time, it can skip applying the relocations all over again."

Some measurements:

$ ld -v
GNU ld (binutils-2.25-f3d35cf6) 2.25.51.20141117
^
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300652760 Oct 26 11:57 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 16932627 Oct 26 11:57 Image.lz4-dtb

$ ld -v
GNU ld (binutils-2.27-53dd00a1) 2.27.0.20170315
^
pre patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 11:43 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 18159474 Oct 26 11:43 Image.lz4-dtb

post patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 12:06 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 16932466 Oct 26 12:06 Image.lz4-dtb

10ms boot time savings isn't anything to get excited about, but users of
arm64+lz4+bfd-2.27 should not have to pay a penalty for no runtime
improvement.

Reported-by: Gopinath Elanchezhian 
Reported-by: Sindhuri Pentyala 
Reported-by: Wei Wang 
Suggested-by: Rahul Chaudhry 
Suggested-by: Siqi Lin 
Suggested-by: Stephen Hines 
Signed-off-by: Nick Desaulniers 
---
* Question to reviewers: should I be using $(and ..., ...) here rather
  than double equals block? grep turns up no hits in the kernel for
  an example.

 arch/arm64/Makefile | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 939b310913cf..eed3b8bdc089 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -18,6 +18,12 @@ ifneq ($(CONFIG_RELOCATABLE),)
 LDFLAGS_vmlinux+= -pie -shared -Bsymbolic
 endif
 
+ifeq ($(CONFIG_KERNEL_LZ4), y)
+ifeq ($(CONFIG_RANDOMIZE_BASE), y)
+LDFLAGS_vmlinux+= $(call ld-option, --no-apply-dynamic-relocs)
+endif
+endif
+
 ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)
   ifeq ($(call ld-option, --fix-cortex-a53-843419),)
 $(warning ld does not support --fix-cortex-a53-843419; kernel may be 
susceptible to erratum)
-- 
2.15.0.rc2.357.g7e34df9404-goog



Re: [PATCH] arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

2017-10-26 Thread Nick Desaulniers
On Thu, Oct 26, 2017 at 1:41 PM, Ard Biesheuvel
 wrote:
> Since we will need to support bfd ld < 2.27 for a while to come, and
> given that we cannot test in the code whether the relocation targets
> are seeded with the correct values, I propose we simply drop the outer
> ifeq here, and  stick with the old behavior unconditionally. Once
> we're ready to drop support for <2.27 binutils, we can revisit this if
> desired.

Ard, thanks for the quick review!

--no-apply-dynamic-relocs was added in binutils 2.27, so ld-option
should support 2.27 and prior:

$ aarch64-linux-android/bin/ld -v
GNU ld (binutils-2.27-53dd00a1) 2.27.0.20170315
$ aarch64-linux-android/bin/ld -E --no-apply-dynamic-relocs temp.o
$ echo $?
0

$ aarch64-linux-android/bin/ld -v
GNU ld (binutils-2.25-f3d35cf6) 2.25.51.20141117
$ aarch64-linux-android/bin/ld -E --no-apply-dynamic-relocs temp.o
./prebuilts/gcc/linux-x86/aarch64/aarch64-linux-android-4.9/aarch64-linux-android/bin/ld:
unrecognized option '--no-apply-dynamic-relocs'
$ echo $?
1

ld-option will catch that.

> are seeded with the correct values, I propose we simply drop the outer

I haven't verified this with other compression schemes, but my
teammate Wei just reported this benefits gzip as well.  What do you
think?

> Also, you should be using CONFIG_RELOCATABLE not CONFIG_RANDOMIZE_BASE,.

Sure thing, will send v2 once you clarify the outer conditional and if
ld-option helps us take this patch now.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

2017-10-26 Thread Nick Desaulniers
+ folks in Suggested-by/Reported by lines, since git send-email seems
to only pull in folks on Signed-off-by line :(

https://lkml.org/lkml/2017/10/26/590


Re: [PATCH] arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

2017-10-26 Thread Nick Desaulniers
On Thu, Oct 26, 2017 at 2:17 PM, Siqi Lin  wrote:
> I'm OK with sticking with the <2.27 binutils behavior. The gzip data is:

That's what this patch does; goes back to the <2.27 behavior for 2.27+.

> binutils 2.25:
> Image 41467904
> Image.gz 13395151
> binutils 2.27:
> Image 41467392
> Image.gz 14114953
>
> gzipped kernel increased by 0.69 MiB.

That's without this patch applied?  With it applied, what are the
stats (for gzip)?

> The one special case I see is !CONFIG_RELOCATABLE and compression is
> used, where there's a tradeoff between compressed image size and the
> benefit of dynamic relocs.

if !CONFIG_RELOCATABLE, then this patch (well v2 which will use
CONFIG_RELOCATABLE rather than CONFIG_RANDOMIZE_BASE) doesn't do
anything.


Re: [PATCH] arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

2017-10-26 Thread Nick Desaulniers
> On Thu, Oct 26, 2017 at 1:41 PM, Ard Biesheuvel
 wrote:>are seeded with the correct values,
I propose we simply drop the outer

> On Thu, Oct 26, 2017 at 2:17 PM, Siqi Lin  wrote:
>> I'm OK with sticking with the <2.27 binutils behavior. The gzip data is:

> ... Nick:
> That's what this patch does; goes back to the <2.27 behavior for 2.27+.

Sorry, rereading this thread, it sounds like neither of you were
disagreeing with me?  Will post v2.


[PATCH v2] arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

2017-10-26 Thread Nick Desaulniers
Upon upgrading to binutils 2.27, we found that our lz4 compressed kernel
images were significantly larger, resulting is 10ms boot time regressions.

As noted by Rahul:
"aarch64 binaries uses RELA relocations, where each relocation entry
includes an addend value. This is similar to x86_64.  On x86_64, the
addend values are also stored at the relocation offset for relative
relocations. This is an optimization: in the case where code does not
need to be relocated, the loader can simply skip processing relative
relocations.  In binutils-2.25, both bfd and gold linkers did this for
x86_64, but only the gold linker did this for aarch64.  The kernel build
here is using the bfd linker, which stored zeroes at the relocation
offsets for relative relocations.  Since a set of zeroes compresses
better than a set of non-zero addend values, this behavior was resulting
in much better lz4 compression.

The bfd linker in binutils-2.27 is now storing the actual addend values
at the relocation offsets. The behavior is now consistent with what it
does for x86_64 and what gold linker does for both architectures.  The
change happened in this upstream commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f56df9d0d5ad89806c24e71f296576d82344613
Since a bunch of zeroes got replaced by non-zero addend values, we see
the side effect of lz4 compressed image being a bit bigger.

To get the old behavior from the bfd linker, "--no-apply-dynamic-relocs"
flag can be used:
$ LDFLAGS="--no-apply-dynamic-relocs" ./build/build.sh
With this flag, the compressed image size is back to what it was with
binutils-2.25.

If the kernel is using ASLR, there aren't additional runtime costs to
--no-apply-dynamic-relocs, as the relocations will need to be applied
again anyway after the kernel is relocated to a random address.

If the kernel is not using ASLR, then presumably the current default
behavior of the linker is better. Since the static linker performed the
dynamic relocs, and the kernel is not moved to a different address at
load time, it can skip applying the relocations all over again."

Some measurements:

$ ld -v
GNU ld (binutils-2.25-f3d35cf6) 2.25.51.20141117
^
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300652760 Oct 26 11:57 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 16932627 Oct 26 11:57 Image.lz4-dtb

$ ld -v
GNU ld (binutils-2.27-53dd00a1) 2.27.0.20170315
^
pre patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 11:43 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 18159474 Oct 26 11:43 Image.lz4-dtb

post patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 12:06 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 16932466 Oct 26 12:06 Image.lz4-dtb

We've also verified gzip improves by 0.69 MB.

Any compression scheme should be able to get better results from the
longer runs of zeros, not just GZIP and LZ4.

10ms boot time savings isn't anything to get excited about, but users of
arm64+compression+bfd-2.27 should not have to pay a boot time penalty for no
runtime improvement.

Reported-by: Gopinath Elanchezhian 
Reported-by: Sindhuri Pentyala 
Reported-by: Wei Wang 
Suggested-by: Ard Biesheuvel 
Suggested-by: Rahul Chaudhry 
Suggested-by: Siqi Lin 
Suggested-by: Stephen Hines 
Signed-off-by: Nick Desaulniers 
---
Changes since v1:
* dropped LZ4 only outer conditional, per Ard.
* changed inner conditional for all RELOCATABLE, not just
  RANDOMIZE_BASE, per Ard.
* updated commit message with findings for gzip.
* added Ard to suggested by line in commit message.

 arch/arm64/Makefile | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 939b310913cf..9f47d4276a21 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -18,6 +18,10 @@ ifneq ($(CONFIG_RELOCATABLE),)
 LDFLAGS_vmlinux+= -pie -shared -Bsymbolic
 endif
 
+ifeq ($(CONFIG_RELOCATABLE), y)
+LDFLAGS_vmlinux+= $(call ld-option, --no-apply-dynamic-relocs)
+endif
+
 ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)
   ifeq ($(call ld-option, --fix-cortex-a53-843419),)
 $(warning ld does not support --fix-cortex-a53-843419; kernel may be 
susceptible to erratum)
-- 
2.15.0.rc2.357.g7e34df9404-goog



Re: [PATCH 07/15] arm64: use -mno-implicit-float instead of -mgeneral-regs-only

2017-11-03 Thread Nick Desaulniers
I think this bug was fixed upstream in LLVM.  Do we still want to take
this workaround?

On Fri, Nov 3, 2017 at 10:11 AM, Sami Tolvanen  wrote:
> From: Greg Hackmann 
>
> LLVM bug 30792 causes clang's AArch64 backend to crash compiling
> arch/arm64/crypto/aes-ce-cipher.c.  Replacing -mgeneral-regs-only with
> -mno-implicit-float is the suggested workaround.
>
> Signed-off-by: Greg Hackmann 
> Cc: Matthias Kaehlcke 
> Signed-off-by: Sami Tolvanen 
> ---
>  arch/arm64/Makefile | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
> index 939b310913cf..eb6f3c9ec6cb 100644
> --- a/arch/arm64/Makefile
> +++ b/arch/arm64/Makefile
> @@ -45,7 +45,13 @@ $(warning Detected assembler with broken .inst; 
> disassembly will be unreliable)
>endif
>  endif
>
> -KBUILD_CFLAGS  += -mgeneral-regs-only $(lseinstr) $(brokengasinst)
> +ifeq ($(cc-name),clang)
> +# This is a workaround for https://bugs.llvm.org/show_bug.cgi?id=30792.
> +KBUILD_CFLAGS  += -mno-implicit-float
> +else
> +KBUILD_CFLAGS  += -mgeneral-regs-only
> +endif
> +KBUILD_CFLAGS  += $(lseinstr) $(brokengasinst)
>  KBUILD_CFLAGS  += -fno-asynchronous-unwind-tables
>  KBUILD_CFLAGS  += $(call cc-option, -mpc-relative-literal-loads)
>  KBUILD_AFLAGS  += $(lseinstr) $(brokengasinst)
> --
> 2.15.0.403.gc27cc4dac6-goog
>



-- 
Thanks,
~Nick Desaulniers


Re: [PATCH 13/15] arm64: fix mrs_s/msr_s macros for clang LTO

2017-11-03 Thread Nick Desaulniers
These mrs_s and msr_s macros in particular were preventing us from
linking arm64 with Clang's integrated assembler, regardless of LTO.
Those macros ran into: https://bugs.llvm.org/show_bug.cgi?id=19749.
So while I appreciate how clever they are, they prevent us from
assembling with Clang so I would like to see them go.

On Fri, Nov 3, 2017 at 10:12 AM, Sami Tolvanen  wrote:
> Clang's integrated assembler does not allow assembly macros defined
> in one inline asm block using the .macro directive to be used across
> separate asm blocks. LLVM developers consider this a feature and not a
> bug, recommending code refactoring:
>
>   https://bugs.llvm.org/show_bug.cgi?id=19749
>
> As binutils doesn't allow macros to be redefined, this change adds C
> preprocessor macros that define the assembly macros globally for binutils
> and locally for clang's integrated assembler.
>
> Suggested-by: Greg Hackmann 
> Suggested-by: Nick Desaulniers 
> Signed-off-by: Sami Tolvanen 
> ---
>  arch/arm64/include/asm/kvm_hyp.h |  2 ++
>  arch/arm64/include/asm/sysreg.h  | 71 
> ++--
>  2 files changed, 56 insertions(+), 17 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_hyp.h 
> b/arch/arm64/include/asm/kvm_hyp.h
> index 4572a9b560fa..6840704ea894 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -29,6 +29,7 @@
> ({  \
> u64 reg;\
> asm volatile(ALTERNATIVE("mrs %0, " __stringify(r##nvh),\
> +DEFINE_MRS_S   \
>  "mrs_s %0, " __stringify(r##vh),\
>  ARM64_HAS_VIRT_HOST_EXTN)  \
>  : "=r" (reg)); \
> @@ -39,6 +40,7 @@
> do {\
> u64 __val = (u64)(v);   \
> asm volatile(ALTERNATIVE("msr " __stringify(r##nvh) ", %x0",\
> +DEFINE_MSR_S   \
>  "msr_s " __stringify(r##vh) ", %x0",\
>  ARM64_HAS_VIRT_HOST_EXTN)  \
>  : : "rZ" (__val)); \
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index f707fed5886f..1588ac3f3690 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -463,21 +463,58 @@
>
>  #include 
>
> -asm(
> -"  .irp
> num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n"
> -"  .equ.L__reg_num_x\\num, \\num\n"
> -"  .endr\n"
> +#define ___MRS_MSR_S_REGNUM\
> +"  .irp
> num,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30\n"
>  \
> +"  .equ.L__reg_num_x\\num, \\num\n"\
> +"  .endr\n"\
>  "  .equ.L__reg_num_xzr, 31\n"
> -"\n"
> -"  .macro  mrs_s, rt, sreg\n"
> -   __emit_inst(0xd520|(\\sreg)|(.L__reg_num_\\rt))
> +
> +#define ___MRS_S   \
> +"  .macro  mrs_s, rt, sreg\n"  \
> +   __emit_inst(0xd520|(\\sreg)|(.L__reg_num_\\rt)) \
>  "  .endm\n"
> -"\n"
> -"  .macro  msr_s, sreg, rt\n"
> -   __emit_inst(0xd500|(\\sreg)|(.L__reg_num_\\rt))
> +
> +#define ___MSR_S   \
> +"  .macro  msr_s, sreg, rt\n"  \
> +   __emit_inst(0xd500|(\\sreg)|(.L__reg_num_\\rt)) \
>  "  .endm\n"
> +
> +/*
> + * llvm-as doesn't allow macros defined in an asm block to be used in other
> + * asm blocks, which means we cannot define them globally.
> + */
> +#if !defined(CONFIG_CLANG_LTO)
> +asm(
> +   ___MRS_MSR_S_REGNUM
> +   ___MRS_S
> +   ___MSR_S
>  );
>
> +#undef  ___MRS_MSR_S_REGNUM
> +#define ___MRS_MSR_S_REGNUM
> +#undef  ___MRS_S
> +#define ___MRS_S
> +#undef  ___MSR_S
> +#define ___MSR_S
> +#endif
> +
> +#define DEFINE_MRS_S   \
> +   ___MRS_MSR_S_RE

Re: [PATCH 00/15] Add support for clang LTO

2017-11-03 Thread Nick Desaulniers
On Fri, Nov 3, 2017 at 10:51 AM, Mark Rutland  wrote:
> I had to create an aarch64-linux-gnu-clang wrapper, too. I'm not sure if
> there's build system help to avoid needing that?

Gah!  So a BIG difference with Clang vs GCC for cross compiling is
that Clang by default ships with all backends enabled, and uses a
`-target ` CFLAG to determine the arch to cross compile for,
while GCC is configured at compile time to support one back end, IIUC.
Clang _can_ be built with all other back ends disabled but one, but I
have a problem with this approach:  distributions of Clang are rarely
configured this way (all backends enabled by default), and it seems we
just found this past week that Clang if configured with just one
backend, it will silently ignore -target flags for other backends and
generate code for the configured backend (this led to:
https://lkml.org/lkml/2017/11/2/892).  I consider this a bug in Clang,
so I just filed: https://bugs.llvm.org/show_bug.cgi?id=35196

>   CC  arch/arm64/crypto/aes-ce-cipher.o
> fatal error: error in backend: Do not know how to split the result of this 
> operator!

Yep, we've seen this.  It was the FINAL bug in Clang for compiling the
kernel for arm64.  It was recently fixed upstream in llvm by
g...@google.com, but so recent that you'll need to either compiler
Clang from source from ToT or work around it like we have in Android
with:
https://android.googlesource.com/kernel/msm/+/9093342a0186dad05095b70f1806938310ace6e7

aka patch 7 in this series.


Re: [PATCH 00/15] Add support for clang LTO

2017-11-03 Thread Nick Desaulniers
On Fri, Nov 3, 2017 at 11:09 AM, Mark Rutland  wrote:
ently compile
> What's the minimum set of patches necessary to work with clang (ignoring
> LTO)?

If you have a build of clang-5, then just patch 7 in this series to
work around the last compiler bug.  If you build clang from source
from master, ToT, for arm64, then none. :)


Re: [PATCH 13/15] arm64: fix mrs_s/msr_s macros for clang LTO

2017-11-03 Thread Nick Desaulniers
On Fri, Nov 3, 2017 at 11:06 AM, Mark Rutland  wrote:
> On Fri, Nov 03, 2017 at 10:53:08AM -0700, Nick Desaulniers wrote:
>> These mrs_s and msr_s macros in particular were preventing us from
>> linking arm64 with Clang's integrated assembler, regardless of LTO.
>> Those macros ran into: https://bugs.llvm.org/show_bug.cgi?id=19749.
>> So while I appreciate how clever they are, they prevent us from
>> assembling with Clang so I would like to see them go.
>
> They're necessary to work with some currently-supported toolchains
> (which don't support the s*_*_c*_c*_* syntax), so they're not going to
> go completely.
>
> If you could suggest something that clang might prefer, which doesn't
> adversely affect GCC, I'm all ears.

I wasn't clear in my point; I recognize what they do and agree that
they are needed.

More specifically, my problem is the use of .macro assembly
directives, since Clang "considers it a feature" not to let you invoke
a .macro defined in one inline asm block from another asm inline
block.

So I would like to see *the use of .macro in separate inline asm
blocks within mrs_s and msr_s c-preprocessor macros go.*


Re: [PATCH 07/15] arm64: use -mno-implicit-float instead of -mgeneral-regs-only

2017-11-03 Thread Nick Desaulniers
On Fri, Nov 3, 2017 at 11:02 AM, Mark Rutland  wrote:
> Ah, so I guess this is what I was hitting when testing with clang 5.0.0.

Exactly.

> I was under the impression that this series was jsut enablnig LTO
> support, not clang support generally.

Clang is supported at least for the arm64*/x86-64 configs that we've
gotten around to testing.

* minus this one last compiler bug (but maybe "last" is optimistic).

> It would be much nicer if we could depend on a fixed clang to start
> with...

I agree, which is why I'm on the fence with this patch.  With a newer
version of Clang, it's not needed.

There are some kbuild macros for certain versions of GCC, maybe
something like that can be used to conditionally swap these flags if
using an older version of Clang?

>> +# This is a workaround for https://bugs.llvm.org/show_bug.cgi?id=30792.
>> +KBUILD_CFLAGS+= -mno-implicit-float
>> +else
>> +KBUILD_CFLAGS+= -mgeneral-regs-only
>> +endif


Re: [PATCH 01/15] kbuild: add ld-name macro and support for GNU gold

2017-11-03 Thread Nick Desaulniers
+ Kbuild mailing list and maintainers

The use of these ternary like operations will need to be expanded if
additional compilers come along (less likely), or additional linkers
(more likely, we are looking into lld right now) but we can cross that
bridge when we get there.

Reviewed-by: Nick Desaulniers 

On Fri, Nov 3, 2017 at 10:11 AM, Sami Tolvanen  wrote:
> GNU gold may require different flags than GNU ld. Add a macro for
> detecting the linker and conditionally add gold specific flags from
> LDFLAGS_GOLD.
>
> Signed-off-by: Sami Tolvanen 
> ---
>  Makefile   | 5 +
>  scripts/Kbuild.include | 4 
>  2 files changed, 9 insertions(+)
>
> diff --git a/Makefile b/Makefile
> index 3a8868ee967e..59980d5a03d0 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -826,6 +826,11 @@ include scripts/Makefile.kasan
>  include scripts/Makefile.extrawarn
>  include scripts/Makefile.ubsan
>
> +# Add any flags specific to ld.gold
> +ifeq ($(ld-name),gold)
> +LDFLAGS+= $(LDFLAGS_GOLD)
> +endif
> +
>  # Add any arch overrides and user supplied CPPFLAGS, AFLAGS and CFLAGS as the
>  # last assignments
>  KBUILD_CPPFLAGS += $(ARCH_CPPFLAGS) $(KCPPFLAGS)
> diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
> index 9ffd3dda3889..584d6cecd7c0 100644
> --- a/scripts/Kbuild.include
> +++ b/scripts/Kbuild.include
> @@ -172,6 +172,10 @@ ld-option = $(call try-run,\
>  # Important: no spaces around options
>  ar-option = $(call try-run, $(AR) rc$(1) "$$TMP",$(1),$(2))
>
> +# ld-name
> +# Expands to either bfd or gold
> +ld-name = $(shell $(LD) -v 2>&1 | grep -q "GNU gold" && echo gold || echo 
> bfd)
> +
>  # ld-version
>  # Note this is mainly for HJ Lu's 3 number binutil versions
>  ld-version = $(shell $(LD) --version | $(srctree)/scripts/ld-version.sh)
> --
> 2.15.0.403.gc27cc4dac6-goog
>



-- 
Thanks,
~Nick Desaulniers


Re: [PATCH 00/15] Add support for clang LTO

2017-11-03 Thread Nick Desaulniers
On Fri, Nov 3, 2017 at 11:29 AM, Mark Rutland  wrote:
> On Fri, Nov 03, 2017 at 11:07:04AM -0700, Nick Desaulniers wrote:
>> On Fri, Nov 3, 2017 at 10:51 AM, Mark Rutland  wrote:
>> > I had to create an aarch64-linux-gnu-clang wrapper, too. I'm not sure if
>> > there's build system help to avoid needing that?
>>
>> Gah!  So a BIG difference with Clang vs GCC for cross compiling is
>> that Clang by default ships with all backends enabled, and uses a
>> `-target ` CFLAG to determine the arch to cross compile for,
>> while GCC is configured at compile time to support one back end, IIUC.
>
> Yup.
>
> I initally tried passing CC=clang\ --target=aarch64-linux-gnu
>
> ... but since that conflicts with CROSS_COMPILE, you then have to
> override AS, LD, OBJCOPY, etc, separately too.

make CC=clang CROSS_COMPILE=aarch64-linux-gnu-

I think you may need a trailing hyphen on your target as it gets
prefixed onto as and ld:

Makefile:

 348 # Make variables (CC, etc...)
 349 AS= $(CROSS_COMPILE)as
 350 LD= $(CROSS_COMPILE)ld


Re: [PATCH 09/15] arm64: keep .altinstructions and .altinstr_replacement

2017-11-03 Thread Nick Desaulniers
This patch can likely be taken regardless of the rest of the series.
It would be good to get additional review from the person who added
CONFIG_LD_DEAD_CODE_DATA_ELIMINATION maybe?

On Fri, Nov 3, 2017 at 10:11 AM, Sami Tolvanen  wrote:
> Make sure the linker doesn't remove .altinstructions or
> .altinstr_replacement when CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is
> enabled.
>
> Signed-off-by: Sami Tolvanen 
> ---
>  arch/arm64/kernel/vmlinux.lds.S | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
> index 7da3e5c366a0..15479995869c 100644
> --- a/arch/arm64/kernel/vmlinux.lds.S
> +++ b/arch/arm64/kernel/vmlinux.lds.S
> @@ -138,11 +138,11 @@ SECTIONS
> . = ALIGN(4);
> .altinstructions : {
> __alt_instructions = .;
> -   *(.altinstructions)
> +   KEEP(*(.altinstructions))
> __alt_instructions_end = .;
> }
> .altinstr_replacement : {
> -   *(.altinstr_replacement)
> +   KEEP(*(.altinstr_replacement))
> }
>
>     . = ALIGN(PAGE_SIZE);
> --
> 2.15.0.403.gc27cc4dac6-goog
>



-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v2] kbuild: fix linker feature test macros when cross compiling with Clang

2017-11-06 Thread Nick Desaulniers
Comparing make V=1 with the suggested config before my patch, after my
patch, and after Masahiro's suggestion to add $(LDFLAGS):

before:
...
ld -m elf_i386  -pie-T arch/x86/boot/compressed/vmlinux.lds ...
...

after my:
...
ld -m elf_i386 -T arch/x86/boot/compressed/vmlinux.lds ...
...

after my+Masahiro:
...
ld -m elf_i386 -pie-T arch/x86/boot/compressed/vmlinux.lds ...
...

:)

Just to dig into this a little more (though I suspect we've found the issue):

Next is to debug what we're passing to try-run and see what errors
it's masking.  Adding to arch/x86/boot/compressed/Makefile:
--- a/arch/x86/boot/compressed/Makefile
+++ b/arch/x86/boot/compressed/Makefile
@@ -44,6 +44,10 @@ LDFLAGS := -m elf_$(UTS_MACHINE)
 # Compressed kernel should be built as PIE since it may be loaded at any
 # address by the bootloader.
 ifeq ($(CONFIG_X86_32),y)
+$(info "XXX")
+$(info "KBUILD_CPPFLAGS: ${KBUILD_CPPFLAGS}")
+$(info "CC_OPTION_CFLAGS: ${CC_OPTION_CFLAGS}")
+$(info "LDFLAGS: ${LDFLAGS}")
 LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
 else
 # To build 64-bit compressed kernel as PIE, we disable relocation

then grepping for XXX in a build output:
"XXX"
"KBUILD_CPPFLAGS: -D__KERNEL__  "
"CC_OPTION_CFLAGS: -m32 -D__KERNEL__ -O2 -fno-strict-aliasing -fPIE
-DDISABLE_BRANCH_PROFILING -march=i386 -mno-mmx -mno-sse
-ffreestanding -fno-stack-protector"
"LD_FLAGS: "

then trying this on the command line:
➜  kernel-all git:(kbuild) ✗ cc -pie -D__KERNEL__ -m32 -D__KERNEL__
-O2 -fno-strict-aliasing -fPIE -DDISABLE_BRANCH_PROFILING -march=i386
-mno-mmx -mno-sse -ffreestanding -fno-stack-protector -x c /dev/null
-c -o temp.o
➜  kernel-all git:(kbuild) ✗ ld -pie temp.o -o temp
ld: i386 architecture of input file `temp.o' is incompatible with
i386:x86-64 output
ld: warning: cannot find entry symbol _start; defaulting to 0078
➜  kernel-all git:(kbuild) ✗ echo $?
1
➜  kernel-all git:(kbuild) ✗ file temp.o
temp.o: ELF 32-bit LSB  relocatable, Intel 80386, version 1 (SYSV), not stripped

So it looks like without LDFLAGS, we don't tell the linker to link as
32b, causing ld-option to fail to add -pie to LDFLAGS.

➜  kernel-all git:(kbuild) ✗ ld -pie -m elf_i386 temp.o -o temp
ld: warning: cannot find entry symbol _start; defaulting to 0185
➜  kernel-all git:(kbuild) ✗ echo $?
0

sure enough, just before the call to ld-option from
arch/x86/boot/compressed/Makefile:

 43 LDFLAGS := -m elf_$(UTS_MACHINE)

so looks like Masahiro's suggestion is correct.  Will send a v3.


[PATCH v3] kbuild: fix linker feature test macros when cross compiling with Clang

2017-11-06 Thread Nick Desaulniers
I was not seeing my linker flags getting added when using ld-option when
cross compiling with Clang. Upon investigation, this seems to be due to
a difference in how GCC vs Clang handle cross compilation.

GCC is configured at build time to support one backend, that is implicit
when compiling.  Clang is explicit via the use of `-target ` and
ships with all supported backends by default.

GNU Make feature test macros that compile then link will always fail
when cross compiling with Clang unless Clang's triple is passed along to
the compiler. For example:

$ clang -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
unknown architecture of input file `temp.o' is incompatible with
aarch64 output
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to
00400078
$ echo $?
1

$ clang -target aarch64-linux-android- -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to 004002e4
$ echo $?
0

This causes conditional checks that invoke $(CC) without the target
triple, then $(LD) on the result, to always fail.

Suggested-by: Masahiro Yamada 
Signed-off-by: Nick Desaulniers 
Reviewed-by: Matthias Kaehlcke 
---
Changes since v2:
* Add LDFLAGS to ld-option, as per Masahiro, and spotted by 0-day bot:
  https://lists.01.org/pipermail/lkp/2017-October/007427.html

 scripts/Kbuild.include | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 064f477dfdca..be1c9d65eaf4 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -228,12 +228,13 @@ cc-if-fullversion = $(shell [ $(cc-fullversion) $(1) $(2) 
] && echo $(3) || echo
 # cc-ldoption
 # Usage: ldflags += $(call cc-ldoption, -Wl$(comma)--hash-style=both)
 cc-ldoption = $(call try-run-cached,\
-   $(CC) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
+   $(CC) $(1) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -nostdlib -x c 
/dev/null -o "$$TMP",$(1),$(2))
 
 # ld-option
 # Usage: LDFLAGS += $(call ld-option, -X)
 ld-option = $(call try-run-cached,\
-   $(CC) -x c /dev/null -c -o "$$TMPO" ; $(LD) $(1) "$$TMPO" -o 
"$$TMP",$(1),$(2))
+   $(CC) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -x c /dev/null -c -o 
"$$TMPO"; \
+   $(LD) $(LDFLAGS) $(1) "$$TMPO" -o "$$TMP",$(1),$(2))
 
 # ar-option
 # Usage: KBUILD_ARFLAGS := $(call ar-option,D)
-- 
2.15.0.403.gc27cc4dac6-goog



[PATCH v3] arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

2017-10-27 Thread Nick Desaulniers
Upon upgrading to binutils 2.27, we found that our lz4 and gzip
compressed kernel images were significantly larger, resulting is 10ms
boot time regressions.

As noted by Rahul:
"aarch64 binaries uses RELA relocations, where each relocation entry
includes an addend value. This is similar to x86_64.  On x86_64, the
addend values are also stored at the relocation offset for relative
relocations. This is an optimization: in the case where code does not
need to be relocated, the loader can simply skip processing relative
relocations.  In binutils-2.25, both bfd and gold linkers did this for
x86_64, but only the gold linker did this for aarch64.  The kernel build
here is using the bfd linker, which stored zeroes at the relocation
offsets for relative relocations.  Since a set of zeroes compresses
better than a set of non-zero addend values, this behavior was resulting
in much better lz4 compression.

The bfd linker in binutils-2.27 is now storing the actual addend values
at the relocation offsets. The behavior is now consistent with what it
does for x86_64 and what gold linker does for both architectures.  The
change happened in this upstream commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f56df9d0d5ad89806c24e71f296576d82344613
Since a bunch of zeroes got replaced by non-zero addend values, we see
the side effect of lz4 compressed image being a bit bigger.

To get the old behavior from the bfd linker, "--no-apply-dynamic-relocs"
flag can be used:
$ LDFLAGS="--no-apply-dynamic-relocs" make
With this flag, the compressed image size is back to what it was with
binutils-2.25.

If the kernel is using ASLR, there aren't additional runtime costs to
--no-apply-dynamic-relocs, as the relocations will need to be applied
again anyway after the kernel is relocated to a random address.

If the kernel is not using ASLR, then presumably the current default
behavior of the linker is better. Since the static linker performed the
dynamic relocs, and the kernel is not moved to a different address at
load time, it can skip applying the relocations all over again."

Some measurements:

$ ld -v
GNU ld (binutils-2.25-f3d35cf6) 2.25.51.20141117
^
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300652760 Oct 26 11:57 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 16932627 Oct 26 11:57 Image.lz4-dtb

$ ld -v
GNU ld (binutils-2.27-53dd00a1) 2.27.0.20170315
^
pre patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 11:43 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 18159474 Oct 26 11:43 Image.lz4-dtb

post patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 12:06 vmlinux
$ ls -l Image.lz4-dtb
-rw-r- 1 ndesaulniers eng 16932466 Oct 26 12:06 Image.lz4-dtb

By Siqi's measurement w/ gzip:
binutils 2.27 with this patch (with --no-apply-dynamic-relocs):
Image 41535488
Image.gz 13404067

binutils 2.27 without this patch (without --no-apply-dynamic-relocs):
Image 41535488
Image.gz 14125516

Any compression scheme should be able to get better results from the
longer runs of zeros, not just GZIP and LZ4.

10ms boot time savings isn't anything to get excited about, but users of
arm64+compression+bfd-2.27 should not have to pay a penalty for no
runtime improvement.

Reported-by: Gopinath Elanchezhian 
Reported-by: Sindhuri Pentyala 
Reported-by: Wei Wang 
Suggested-by: Ard Biesheuvel 
Suggested-by: Rahul Chaudhry 
Suggested-by: Siqi Lin 
Suggested-by: Stephen Hines 
Signed-off-by: Nick Desaulniers 
Reviewed-by: Ard Biesheuvel 
---
Changes since v2:
* Combine this this LDFLAGS_vmlinux append to previous block, per Ard.
* Add Siqi's measurements to commit message.
* Change the conditional to check if set rather than if not empty, since
  this is similar to the rest of the Makefile, and more readable IMO,
  though that's subjective. That part can be reverted if we don't want to
  steal blame.

 arch/arm64/Makefile | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
index 939b310913cf..669378099dbe 100644
--- a/arch/arm64/Makefile
+++ b/arch/arm64/Makefile
@@ -14,8 +14,9 @@ LDFLAGS_vmlinux   :=-p --no-undefined -X
 CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
 GZFLAGS:=-9
 
-ifneq ($(CONFIG_RELOCATABLE),)
-LDFLAGS_vmlinux+= -pie -shared -Bsymbolic
+ifeq ($(CONFIG_RELOCATABLE), y)
+LDFLAGS_vmlinux+= -pie -shared -Bsymbolic \
+   $(call ld-option, --no-apply-dynamic-relocs)
 endif
 
 ifeq ($(CONFIG_ARM64_ERRATUM_843419),y)
-- 
2.15.0.rc2.357.g7e34df9404-goog



Re: [PATCH] kbuild: fix linker feature test macros when cross compiling with Clang

2017-10-27 Thread Nick Desaulniers
+ linux-kbu...@vger.kernel.org

On Fri, Oct 27, 2017 at 4:20 AM, Masahiro Yamada
 wrote:
> I do not like to add $(CLANG_TARGET) to a place for common helpers.
> Instead of $(CLANG_TARGET), please add
> $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS)
> to cc-ldoption and ld-option.

Thanks for the review.  I agree that sounds like a better option.

> I have two requests next time:
>   - please include  linux-kbu...@vger.kernel.org in your To list

Sure thing. scripts/get_maintainer.pl does not recommend that list for
this file; is there a way to explicitly add that list to the
recommendation for that source file?

>   - please base your patch on linux-kbuild/kbuild branch

Will do. Do I need to note it's based off that branch? Otherwise wont
0-day bot complain that my patch doesn't apply/build on torvalds/linux
?


Re: [PATCH] kbuild: fix linker feature test macros when cross compiling with Clang

2017-10-27 Thread Nick Desaulniers
On Fri, Oct 27, 2017 at 11:28 AM, Nick Desaulniers
 wrote:
> On Fri, Oct 27, 2017 at 4:20 AM, Masahiro Yamada
>  wrote:
>>   - please base your patch on linux-kbuild/kbuild branch
>
> Will do. Do I need to note it's based off that branch? Otherwise wont
> 0-day bot complain that my patch doesn't apply/build on torvalds/linux
> ?

Talked to some teammates about this, sounds like it's not a problem,
so I'll just send v2 with a note to the reviewer.


[PATCH v2] kbuild: fix linker feature test macros when cross compiling with Clang

2017-10-27 Thread Nick Desaulniers
I was not seeing my linker flags getting added when using ld-option when
cross compiling with Clang. Upon investigation, this seems to be due to
a difference in how GCC vs Clang handle cross compilation.

GCC is configured at build time to support one backend, that is implicit
when compiling.  Clang is explicit via the use of `-target ` and
ships with all supported backends by default.

GNU Make feature test macros that compile then link will always fail
when cross compiling with Clang unless Clang's triple is passed along to
the compiler. For example:

$ clang -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
unknown architecture of input file `temp.o' is incompatible with
aarch64 output
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to
00400078
$ echo $?
1

$ clang -target aarch64-linux-android- -x c /dev/null -c -o temp.o
$ aarch64-linux-android/bin/ld -E temp.o
aarch64-linux-android/bin/ld:
warning: cannot find entry symbol _start; defaulting to 004002e4
$ echo $?
0

This causes conditional checks that invoke $(CC) without the target
triple, then $(LD) on the result, to always fail.

Suggested-by: Masahiro Yamada 
Signed-off-by: Nick Desaulniers 
Reviewed-by: Matthias Kaehlcke 
---
Changes since v1:
* base patch off of
  git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git
  kbuild branch, per Masahiro.
* Use $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) rather than $(CLANG_TRIPLE), per
  Masahiro.

 scripts/Kbuild.include | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
index 064f477dfdca..0f09e4508554 100644
--- a/scripts/Kbuild.include
+++ b/scripts/Kbuild.include
@@ -228,12 +228,13 @@ cc-if-fullversion = $(shell [ $(cc-fullversion) $(1) $(2) 
] && echo $(3) || echo
 # cc-ldoption
 # Usage: ldflags += $(call cc-ldoption, -Wl$(comma)--hash-style=both)
 cc-ldoption = $(call try-run-cached,\
-   $(CC) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
+   $(CC) $(1) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -nostdlib -x c 
/dev/null -o "$$TMP",$(1),$(2))
 
 # ld-option
 # Usage: LDFLAGS += $(call ld-option, -X)
 ld-option = $(call try-run-cached,\
-   $(CC) -x c /dev/null -c -o "$$TMPO" ; $(LD) $(1) "$$TMPO" -o 
"$$TMP",$(1),$(2))
+   $(CC) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -x c /dev/null -c -o 
"$$TMPO"; \
+   $(LD) $(1) "$$TMPO" -o "$$TMP",$(1),$(2))
 
 # ar-option
 # Usage: KBUILD_ARFLAGS := $(call ar-option,D)
-- 
2.15.0.rc2.357.g7e34df9404-goog



[RFC] Testing builds with Clang

2017-10-28 Thread Nick Desaulniers
Hello,
Recently, a fair amount of effort has gone into compiling the kernel
with Clang/LLVM.  We were curious what might be some of the
requirements to getting 0-day coverage for Clang built kernels?

https://lwn.net/Articles/734071/

As in, does Clang need to support allmodconfig, randconfig, all
ARCH's, etc before 0-day can be turned on?  Such requirements can help
us set goals.

Also, is https://github.com/intel/lkp-tests the codebase for the 0-day bot?

Thanks,
~Nick Desaulniers


Re: [PATCH v2] kbuild: fix linker feature test macros when cross compiling with Clang

2017-10-30 Thread Nick Desaulniers
That's safe to do for now.  Here's the 0-day failure thread:
https://lists.01.org/pipermail/lkp/2017-October/007427.html  If I can
sort out the issue, I'll submit a v3.

On Mon, Oct 30, 2017 at 8:46 AM, Masahiro Yamada
 wrote:
> 2017-10-30 15:50 GMT+09:00 Masahiro Yamada :
>> 2017-10-29 0:00 GMT+09:00 Masahiro Yamada :
>>> 2017-10-28 5:13 GMT+09:00 Nick Desaulniers :
>>>> I was not seeing my linker flags getting added when using ld-option when
>>>> cross compiling with Clang. Upon investigation, this seems to be due to
>>>> a difference in how GCC vs Clang handle cross compilation.
>>>>
>>>> GCC is configured at build time to support one backend, that is implicit
>>>> when compiling.  Clang is explicit via the use of `-target ` and
>>>> ships with all supported backends by default.
>>>>
>>>> GNU Make feature test macros that compile then link will always fail
>>>> when cross compiling with Clang unless Clang's triple is passed along to
>>>> the compiler. For example:
>>>>
>>>> $ clang -x c /dev/null -c -o temp.o
>>>> $ aarch64-linux-android/bin/ld -E temp.o
>>>> aarch64-linux-android/bin/ld:
>>>> unknown architecture of input file `temp.o' is incompatible with
>>>> aarch64 output
>>>> aarch64-linux-android/bin/ld:
>>>> warning: cannot find entry symbol _start; defaulting to
>>>> 00400078
>>>> $ echo $?
>>>> 1
>>>>
>>>> $ clang -target aarch64-linux-android- -x c /dev/null -c -o temp.o
>>>> $ aarch64-linux-android/bin/ld -E temp.o
>>>> aarch64-linux-android/bin/ld:
>>>> warning: cannot find entry symbol _start; defaulting to 004002e4
>>>> $ echo $?
>>>> 0
>>>>
>>>> This causes conditional checks that invoke $(CC) without the target
>>>> triple, then $(LD) on the result, to always fail.
>>>>
>>>> Suggested-by: Masahiro Yamada 
>>>> Signed-off-by: Nick Desaulniers 
>>>> Reviewed-by: Matthias Kaehlcke 
>>>> ---
>>>> Changes since v1:
>>>> * base patch off of
>>>>   git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild.git
>>>>   kbuild branch, per Masahiro.
>>>> * Use $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) rather than $(CLANG_TRIPLE), 
>>>> per
>>>>   Masahiro.
>>>>
>>>>  scripts/Kbuild.include | 5 +++--
>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/scripts/Kbuild.include b/scripts/Kbuild.include
>>>> index 064f477dfdca..0f09e4508554 100644
>>>> --- a/scripts/Kbuild.include
>>>> +++ b/scripts/Kbuild.include
>>>> @@ -228,12 +228,13 @@ cc-if-fullversion = $(shell [ $(cc-fullversion) $(1) 
>>>> $(2) ] && echo $(3) || echo
>>>>  # cc-ldoption
>>>>  # Usage: ldflags += $(call cc-ldoption, -Wl$(comma)--hash-style=both)
>>>>  cc-ldoption = $(call try-run-cached,\
>>>> -   $(CC) $(1) -nostdlib -x c /dev/null -o "$$TMP",$(1),$(2))
>>>> +   $(CC) $(1) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -nostdlib -x c 
>>>> /dev/null -o "$$TMP",$(1),$(2))
>>>>
>>>>  # ld-option
>>>>  # Usage: LDFLAGS += $(call ld-option, -X)
>>>>  ld-option = $(call try-run-cached,\
>>>> -   $(CC) -x c /dev/null -c -o "$$TMPO" ; $(LD) $(1) "$$TMPO" -o 
>>>> "$$TMP",$(1),$(2))
>>>> +   $(CC) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -x c /dev/null -c -o 
>>>> "$$TMPO"; \
>>>> +   $(LD) $(1) "$$TMPO" -o "$$TMP",$(1),$(2))
>>>>
>>>>  # ar-option
>>>>  # Usage: KBUILD_ARFLAGS := $(call ar-option,D)
>>>> --
>>>> 2.15.0.rc2.357.g7e34df9404-goog
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-kbuild" in
>>>> the body of a message to majord...@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>>> Applied to linux-kbuild/kbuild.  Thanks!
>>
>>
>>
>> I do not know the cause of the problem reported by the 0-day bot, but
>> if the problem happens in the following line,
>>
>> ifeq ($(CONFIG_X86_32),y)
>> LDFLAGS += $(call ld-option, -pie) $(call ld-option, --no-dynamic-linker)
>> else
>>
>>
>>
>> Does the following solve the issue?(adding $(LDFLAGS))
>>
>> ld-option = $(call try-run,\
>> $(CC) $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -x c /dev/null -c
>> -o "$$TMPO"; \
>> $(LD) $(LDFLAGS) $(1) "$$TMPO" -o "$$TMP",$(1),$(2))
>>
>>
>
>
> I dropped this patch for now.
>
> --
> Best Regards
> Masahiro Yamada



-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions

2018-08-02 Thread Nick Desaulniers
On Wed, Aug 1, 2018 at 5:49 PM John David Anglin  wrote:
>
> On 2018-08-01 6:18 PM, Nick Desaulniers wrote:
> >> What about the uses in the fs support, etc?
> > Sorry, I don't see it?
> I mean _THIS_IP_.

I don't understand, I'm referring to current_text_addr().  Maybe this
explains more what I'm trying to do:
https://lkml.org/lkml/2018/8/1/1689

If I understand your point correctly, is it that you're saying that
_THIS_IP_ should be implemented in terms of inline assembly (as in
what current_text_addr() is currently)?  If that's what you mean and
I'm understanding correctly, my point is that we should be preferring
the generic C implementation as that's what's being used in most
places currently, so if it was broken you'd likely already know about
it.  Unless unwinding is truly broken by the additional label, I don't
think we need an inline assembly implementation of current_text_addr()
for parisc (or any arch for that matter).  If we do, then it can be
localized to the parisc unwinding code, that way it can be
consolidated everywhere else for every other arch.

-- 
Thanks,
~Nick Desaulniers


Re: native_save_fl() causes a warning

2018-08-03 Thread Nick Desaulniers
On Fri, Aug 3, 2018 at 6:10 AM Jean Delvare  wrote:
>
> Hi Nick,
>
> It seems that this linux kernel commit of yours:
>
> commit d0a8d9378d16eb3c69bd8e6d23779fbdbee3a8c7
> Author: Nick Desaulniers
> Date:   Thu Jun 21 09:23:24 2018 -0700
>
> x86/paravirt: Make native_save_fl() extern inline
>
> introduced a new warning (with W=1):
>
> ./arch/x86/include/asm/irqflags.h:16:29: warning: no previous prototype for 
> ‘native_save_fl’ [-Wmissing-prototypes]
>  extern inline unsigned long native_save_fl(void)
>  ^
>
> Please fix it.

Hi Jean, thanks for the report.  David Laight also reported this
warning; he tested a patch I sent him overnight.

Let me guess, you're using a version of GCC < 4.9?  It seems that GCC
< 4.9 will produce that warning for extern inline functions without
previous declarations.

I'll add your Reported-By tag to the patch that I will send out in a
few minutes.

> Secondly, I am quite curious why you changed only native_save_fl() from
> static inline to extern inline, when native_restore_fl(),
> native_irq_disable() and native_irq_enable() are equally referenced by
> address in arch/x86/kernel/paravirt.c and thus should suffer from the
> same problem. Can you explain?

This is a good point.  With native_save_fl, we were not able to boot
the kernel at all.  Maybe this was called from the boot sequence
(maybe Juergen knows more)?  It seems that the other functions aren't
preventing us from booting, but maybe exercising other paths in
paravirt would expose such an issue?  Or maybe paravirt doesn't have
the same calling convention requirements for those functions?

Is there a standard testing procedure for paravirt? I'd be happy to
try it to see if we can expose more things that should have the same
cleanup.
-- 
Thanks,
~Nick Desaulniers


[PATCH] x86/irqflags: provide a declaration for native_save_fl

2018-08-03 Thread Nick Desaulniers
Fixes commit d0a8d9378d16 ("x86/paravirt: Make native_save_fl() extern
inline").

It was reported that the above commit was causing users of gcc < 4.9 to
observe -Werror=missing-prototypes errors.

Indeed, it seems that:
extern inline unsigned long native_save_fl(void) { return 0; }

compiled with -Werror=missing-prototypes produces this warning in gcc <
4.9, but not gcc >= 4.9.

Cc: sta...@vger.kernel.org # 4.17, 4.14, 4.9, 4.4
Reported-by: David Laight 
Reported-by: Jean Delvare 
Signed-off-by: Nick Desaulniers 
---
 arch/x86/include/asm/irqflags.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index c4fc17220df9..c14f2a74b2be 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -13,6 +13,8 @@
  * Interrupt control:
  */
 
+/* Declaration required for gcc < 4.9 to prevent -Werror=missing-prototypes */
+extern inline unsigned long native_save_fl(void);
 extern inline unsigned long native_save_fl(void)
 {
unsigned long flags;
-- 
2.18.0.597.ga71716f1ad-goog



Re: [PATCH] parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions

2018-08-03 Thread Nick Desaulniers
On Fri, Aug 3, 2018 at 10:57 AM John David Anglin  wrote:
>
> On 2018-08-02 4:31 PM, Nick Desaulniers wrote:
> > If I understand your point correctly, is it that you're saying that
> > _THIS_IP_  should be implemented in terms of inline assembly (as in
> > what current_text_addr() is currently)?  If that's what you mean and
> > I'm understanding correctly, my point is that we should be preferring
> > the generic C implementation as that's what's being used in most
> > places currently, so if it was broken you'd likely already know about
> > it.  Unless unwinding is truly broken by the additional label, I don't
> > think we need an inline assembly implementation of current_text_addr()
> > for parisc (or any arch for that matter).  If we do, then it can be
> > localized to the parisc unwinding code, that way it can be
> > consolidated everywhere else for every other arch.
> The label breaks the unwind data, not the unwind code.  So, localizing
> the use of
> current_text_addr() to the parisc unwind code doesn't help.

But the kernel uses the generic _THIS_IP_ *everywhere*, not parisc's
custom current_text_addr().  So if this did actually break unwinding,
you should have noticed by now.  There's only one call site that uses
the custom current_text_addr(), which is what my patch is addressing.

Localizing the custom implementation would literally produce the same
binary as is produced today, but allow us to start removing all the
other custom implementations of current_text_addr() from
arch/*/include/asm/processor.h, which is why I proposed that as an
alternative to this patch.

> Personally, I prefer the implementation of current_text_addr() because:
>
> 1) The generated code is smaller, and

One instruction vs two, for a single call site that I bet is cold and
not inlined in multiple places.

> 2) it doesn't introduce any unnecessary labels into the text.
>
> As noted, these labels can cause issues with unwinding.

Can you confirm that applying this patch actually breaks unwinding?

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions

2018-08-03 Thread Nick Desaulniers
On Fri, Aug 3, 2018 at 12:09 PM John David Anglin  wrote:
>
> On 2018-08-03 2:11 PM, Nick Desaulniers wrote:
> > But the kernel uses the generic_THIS_IP_  *everywhere*, not parisc's
> > custom current_text_addr().  So if this did actually break unwinding,
> > you should have noticed by now.
> The unwind problem was noticed.

So parisc is currently broken (doesn't unwind) due to the pervasive
use of _THIS_IP_ (generic C) throughout the kernel?

If no, that implies this patch (generic C) causes no unwinding problems.
If yes, that implies that the diff I posted later in this thread
(inline assembly) is preferable, and that parisc has bigger problems
(and probably needs to do rewrite the unwinding code to handle these
extra labels everywhere).

> Patches were recently applied to gcc and binutils to try and fix it.
> The gcc patch moved
> branch tables to rodata so that the label at the head of the table
> wasn't in text.
>
> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01804.html
> https://sourceware.org/ml/binutils/2018-07/msg00474.html
>
> When I saw your suggested change, I realized there was another source of
> text labels
> that need linker relocations.

Thank you for the links.

On Fri, Aug 3, 2018 at 10:57 AM John David Anglin  wrote:
> The label breaks the unwind data, not the unwind code.  So, localizing
> the use of
> current_text_addr() to the parisc unwind code doesn't help.

Have you confirmed that applying my patch breaks *the ability to
unwind correctly*? It looks like return_address() is used in
ftrace_return_address(), so I assume you can boot a kernel with my
patch applied, and CONFIG_FTRACE=y, then run:

$ sudo trace-cmd record -p function date
$ trace-cmd report | grep date- | less

and see if the stacks aren't unwound or look messed up.
-- 
Thanks,
~Nick Desaulniers


Re: [tip:x86/urgent] x86/irqflags: Provide a declaration for native_save_fl

2018-08-06 Thread Nick Desaulniers
On Sun, Aug 5, 2018 at 1:33 PM tip-bot for Nick Desaulniers
 wrote:
>
> Commit-ID:  208cbb32558907f68b3b2a081ca2337ac3744794
> Gitweb: 
> https://git.kernel.org/tip/208cbb32558907f68b3b2a081ca2337ac3744794
> Author: Nick Desaulniers 
> AuthorDate: Fri, 3 Aug 2018 10:05:50 -0700
> Committer:  Thomas Gleixner 
> CommitDate: Sun, 5 Aug 2018 22:30:37 +0200
>
> x86/irqflags: Provide a declaration for native_save_fl
>
> It was reported that the commit d0a8d9378d16 is causing users of gcc < 4.9
> to observe -Werror=missing-prototypes errors.
>
> Indeed, it seems that:
> extern inline unsigned long native_save_fl(void) { return 0; }
>
> compiled with -Werror=missing-prototypes produces this warning in gcc <
> 4.9, but not gcc >= 4.9.
>
> Fixes: d0a8d9378d16 ("x86/paravirt: Make native_save_fl() extern inline").
> Reported-by: David Laight 
> Reported-by: Jean Delvare 
> Signed-off-by: Nick Desaulniers 
> Signed-off-by: Thomas Gleixner 
> Cc: h...@zytor.com
> Cc: jgr...@suse.com
> Cc: kstew...@linuxfoundation.org
> Cc: gre...@linuxfoundation.org
> Cc: boris.ostrov...@oracle.com
> Cc: astrac...@google.com
> Cc: m...@chromium.org
> Cc: a...@arndb.de
> Cc: tstel...@redhat.com
> Cc: sedat.di...@gmail.com
> Cc: david.lai...@aculab.com
> Cc: sta...@vger.kernel.org

Not sure if this was going to be cleaned up in an automated way, but
looks like this commit message drops the comment to stable as to how
far back it should go:

# 4.17, 4.14, 4.9, 4.4

also, there were tested by's reported:

Tested-by: David Laight 
Tested-by: Sedat Dilek 

--
Thanks,
~Nick Desaulniers


Re: [tip:x86/urgent] x86/irqflags: Provide a declaration for native_save_fl

2018-08-07 Thread Nick Desaulniers
On Tue, Aug 7, 2018 at 12:29 AM Thomas Gleixner  wrote:
>
> On Mon, 6 Aug 2018, Nick Desaulniers wrote:
> > On Sun, Aug 5, 2018 at 1:33 PM tip-bot for Nick Desaulniers
> >  wrote:
> > > Cc: sta...@vger.kernel.org
> >
> > Not sure if this was going to be cleaned up in an automated way, but
> > looks like this commit message drops the comment to stable as to how
> > far back it should go:
> >
> > # 4.17, 4.14, 4.9, 4.4
>
> The Fixes tag is enough. If the upstream commit was backported, then the
> tools of the stable folks will find it and know exactly how far it needs to
> go back.

Oh, cool.

> > also, there were tested by's reported:
> >
> > Tested-by: David Laight 
> > Tested-by: Sedat Dilek 
>
> Which came in after I applied it

I've seen other maintainers revise commit messages before sending pull
requests along, but I guess that's problematic as anyone else who has
pulled before the revision would then have a merge conflict.
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions

2018-08-07 Thread Nick Desaulniers
On Fri, Aug 3, 2018 at 3:34 PM Helge Deller  wrote:
>
> On 03.08.2018 22:33, Nick Desaulniers wrote:
> > On Fri, Aug 3, 2018 at 12:09 PM John David Anglin  
> > wrote:
> >>
> >> On 2018-08-03 2:11 PM, Nick Desaulniers wrote:
> >>> But the kernel uses the generic_THIS_IP_  *everywhere*, not parisc's
> >>> custom current_text_addr().  So if this did actually break unwinding,
> >>> you should have noticed by now.
> >> The unwind problem was noticed.
> >
> > So parisc is currently broken (doesn't unwind) due to the pervasive
> > use of _THIS_IP_ (generic C) throughout the kernel?
>
> I tested it on the 32bit kernel.
> The answer is: No. Unwinding works (with and without your patch).
>
> > If no, that implies this patch (generic C) causes no unwinding problems.
>
> correct.
>
> > If yes, that implies that the diff I posted later in this thread
> > (inline assembly) is preferable, and that parisc has bigger problems
> > (and probably needs to do rewrite the unwinding code to handle these
> > extra labels everywhere).
> >
> >> Patches were recently applied to gcc and binutils to try and fix it.
> >> The gcc patch moved
> >> branch tables to rodata so that the label at the head of the table
> >> wasn't in text.
> >>
> >> https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01804.html
> >> https://sourceware.org/ml/binutils/2018-07/msg00474.html
> >>
> >> When I saw your suggested change, I realized there was another source of
> >> text labels
> >> that need linker relocations.
> >
> > Thank you for the links.
> >
> > On Fri, Aug 3, 2018 at 10:57 AM John David Anglin  
> > wrote:
> >> The label breaks the unwind data, not the unwind code.  So, localizing
> >> the use of
> >> current_text_addr() to the parisc unwind code doesn't help.
> >
> > Have you confirmed that applying my patch breaks *the ability to
> > unwind correctly*?
>
> I tested your patch (on 32bit).
> Your patch does not break anything.
>
> > It looks like return_address() is used in
> > ftrace_return_address(), so I assume you can boot a kernel with my
> > patch applied, and CONFIG_FTRACE=y, then run:
> >
> > $ sudo trace-cmd record -p function date
> > $ trace-cmd report | grep date- | less
> >
> > and see if the stacks aren't unwound or look messed up.
>
> I faced issues with trace-cmd, but calling ftracing functions manually worked.
>
> So, your patch is basically OK and doesn't break anything.
> But I agree with Dave that Andrew, that THIS_IP is ugly.

I don't disagree, and other maintainers have remarked on _THIS_IP_
being ugly, but renaming it en masse is a tree wide change, which I'm
trying to avoid at the moment.

It sounds like we have a working patch? Are there 64b parisc machines
to test on, or can this get merged?
--
Thanks,
~Nick Desaulniers


Re: [PATCH] parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions

2018-08-07 Thread Nick Desaulniers
On Tue, Aug 7, 2018 at 1:30 PM Helge Deller  wrote:
>
> On 07.08.2018 20:11, Nick Desaulniers wrote:
> > On Fri, Aug 3, 2018 at 3:34 PM Helge Deller  wrote:
> >> So, your patch is basically OK and doesn't break anything.
> >> But I agree with Dave and Andrew, that THIS_IP is ugly.
> >
> > I don't disagree, and other maintainers have remarked on _THIS_IP_
> > being ugly, but renaming it en masse is a tree wide change, which I'm
> > trying to avoid at the moment.
>
> Understandable.
>
> > It sounds like we have a working patch? Are there 64b parisc machines
> > to test on, or can this get merged?
>
> Go ahead and merge it.

Thank you, but I was under the impression this would go up through the
parisc tree?
https://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux.git/
right? Or is there a different tree/maintainer I should ask?

> In addition, somehow I'd prefer if there would be a way to add to 
> include/linux/kernel.h:
> +#if !defined(_THIS_IP_)
> #define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
> +#endif
>
> That way it would somehow be possible to replace that label calulation by the
> preferable inline assembly of current_text_address()...

I'm asserting that there is no need within the entire kernel at the
moment to have inline assembly to get the instruction pointer.  If
there are no call sites of the inline assembly version, there is no
need to define per arch inline assembly versions when C (with GNU
extensions) will suffice.

And if in the future there are, either those call sites can have a
local implementation (as in the second diff I sent in this thread), or
some other change to _THIS_IP_ (as you propose) can be made.  But
until then...

YAGNI: "You Ain't Gonna Need It"

Once this patch and the other 3 outstanding ones are merged, we'll be
sending patches to delete all arch specific assembly implementations
as they will be dead code (no callers, kernel-wide).
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] include/linux/compiler*.h: Use feature checking instead of version checks for attributes

2018-08-27 Thread Nick Desaulniers
On Sun, Aug 26, 2018 at 10:58 AM Miguel Ojeda
 wrote:
>
> Instead of using version checks per-compiler to define (or not) each 
> attribute,
> use __has_attribute to test for them, following the cleanup started with
> commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually 
> exclusive").
>
> All the attributes that are fairly common/standard (i.e. those that do not
> require extra logic to define them) have been moved to a new file
> include/linux/compiler_attributes.h. The attributes have been sorted
> and divided between "required" and "optional".

Nice! Thanks Miguel.  Regarding sorting, I'm happy with that.  In
fact, some of the comments can be removed IMO, as the attributes have
common definitions in the docs (maybe an added link to the gcc and
clang attribute docs at the top of the file rather than per attribute
comments).

>
> Further, attributes that are already supported in gcc >= 4.6 and recent clang
> were simply made to be required (instead of testing for them):
>   * always_inline
>   * const (pure was already "required", by the way)
>   * gnu_inline

There's an important test for gnu_inline that isn't checking that it's
supported, but rather what the implicit behavior is depending on which
C standard is being used.  It's important not to remove that.

>
> Finally, some other bits were cleaned up in the process:
>   * __optimize: removed (unused in the whole kernel tree)

A+ for removing dead code.  I also don't see it used anywhere.

>   * __must_be_array: removed from -gcc and -clang (identical), moved to _types

Yep, uses a builtin (we should add guards for that, later, in a
similar style change that guards the use of builtins). Looks good.

> (it depends on the unconditionally used  __builtin_types_compatible_p
>   * Removes unneeded underscores on the attributes' names

That doesn't sound right, but lets see what you mean by that.

>
> There are some things that can be further cleaned up afterwards:
>   * __attribute_const__: rename to __const

This doesn't look correct to me; the kernel is full of call sites for
__attribute_const__. You can't rename the definition without renaming
all of the call sites (and that would be too big a change for this
patch, IMO).  Skip the rename, and it also looks like you just removed
it outright (Oops).

>   * __noretpoline: avoid checking for defined(__notrepoline)
>   * __compiletime_warning/error: they are in two different places,
> -gcc and compiler.h.
>   * sparse' attributes could potentially go into the end of attributes.h
> too (as another separate section).
>
> Compile-tested an x86 allmodconfig for a while with gcc 8.2.0 and 4.6.4.

It's important to test changes to compiler-clang.h with clang. ;)

>
> Cc: Eli Friedman 
> Cc: Christopher Li 
> Cc: Kees Cook 
> Cc: Ingo Molnar 
> Cc: Geert Uytterhoeven 
> Cc: Arnd Bergmann 
> Cc: Greg Kroah-Hartman 
> Cc: Masahiro Yamada 
> Cc: Joe Perches 
> Cc: Dominique Martinet 
> Cc: Nick Desaulniers 
> Cc: Linus Torvalds 
> Signed-off-by: Miguel Ojeda 
> ---
> *Seems* to work, but note that I did not finish the entire allmodconfig :)
>
> A few things could be splitted into their own patch, but I kept it
> as one for simplicity for a first review.
>
> These files are becoming no-headaches-readable again, yay.

A+

>
>  include/linux/compiler-clang.h  |   5 --
>  include/linux/compiler-gcc.h|  60 
>  include/linux/compiler-intel.h  |   6 --
>  include/linux/compiler.h|   4 --
>  include/linux/compiler_attributes.h | 105 
>  include/linux/compiler_types.h  |  91 
>  6 files changed, 118 insertions(+), 153 deletions(-)
>  create mode 100644 include/linux/compiler_attributes.h
>
> diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> index b1ce500fe8b3..3e7dafb3ea80 100644
> --- a/include/linux/compiler-clang.h
> +++ b/include/linux/compiler-clang.h
> @@ -21,8 +21,6 @@
>  #define __SANITIZE_ADDRESS__
>  #endif
>
> -#define __no_sanitize_address __attribute__((no_sanitize("address")))
> -
>  /*
>   * Not all versions of clang implement the the type-generic versions
>   * of the builtin overflow checkers. Fortunately, clang implements
> @@ -41,6 +39,3 @@
>   * compilers, like ICC.
>   */
>  #define barrier() __asm__ __volatile__("" : : : "memory")
> -#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
> -#define __assume_aligned(a, ...)   \
> -   __attribute__((__assume_aligned__(a, ## __VA_ARGS__)))
> diff --git a/include/linux/co

Re: [PATCH] include/linux/compiler*.h: Use feature checking instead of version checks for attributes

2018-08-27 Thread Nick Desaulniers
On Mon, Aug 27, 2018 at 10:43 AM Nick Desaulniers
 wrote:
>
> On Sun, Aug 26, 2018 at 10:58 AM Miguel Ojeda
>  wrote:
> >
> > Instead of using version checks per-compiler to define (or not) each 
> > attribute,
> > use __has_attribute to test for them, following the cleanup started with
> > commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h mutually 
> > exclusive").
> >
> > All the attributes that are fairly common/standard (i.e. those that do not
> > require extra logic to define them) have been moved to a new file
> > include/linux/compiler_attributes.h. The attributes have been sorted
> > and divided between "required" and "optional".
>
> Nice! Thanks Miguel.  Regarding sorting, I'm happy with that.  In
> fact, some of the comments can be removed IMO, as the attributes have
> common definitions in the docs (maybe an added link to the gcc and
> clang attribute docs at the top of the file rather than per attribute
> comments).
>
> >
> > Further, attributes that are already supported in gcc >= 4.6 and recent 
> > clang
> > were simply made to be required (instead of testing for them):
> >   * always_inline
> >   * const (pure was already "required", by the way)
> >   * gnu_inline
>
> There's an important test for gnu_inline that isn't checking that it's
> supported, but rather what the implicit behavior is depending on which
> C standard is being used.  It's important not to remove that.
>
> >
> > Finally, some other bits were cleaned up in the process:
> >   * __optimize: removed (unused in the whole kernel tree)
>
> A+ for removing dead code.  I also don't see it used anywhere.
>
> >   * __must_be_array: removed from -gcc and -clang (identical), moved to 
> > _types
>
> Yep, uses a builtin (we should add guards for that, later, in a
> similar style change that guards the use of builtins). Looks good.
>
> > (it depends on the unconditionally used  __builtin_types_compatible_p
> >   * Removes unneeded underscores on the attributes' names
>
> That doesn't sound right, but lets see what you mean by that.
>
> >
> > There are some things that can be further cleaned up afterwards:
> >   * __attribute_const__: rename to __const
>
> This doesn't look correct to me; the kernel is full of call sites for
> __attribute_const__. You can't rename the definition without renaming
> all of the call sites (and that would be too big a change for this
> patch, IMO).  Skip the rename, and it also looks like you just removed
> it outright (Oops).
>
> >   * __noretpoline: avoid checking for defined(__notrepoline)
> >   * __compiletime_warning/error: they are in two different places,
> > -gcc and compiler.h.
> >   * sparse' attributes could potentially go into the end of attributes.h
> > too (as another separate section).
> >
> > Compile-tested an x86 allmodconfig for a while with gcc 8.2.0 and 4.6.4.
>
> It's important to test changes to compiler-clang.h with clang. ;)
>
> >
> > Cc: Eli Friedman 
> > Cc: Christopher Li 
> > Cc: Kees Cook 
> > Cc: Ingo Molnar 
> > Cc: Geert Uytterhoeven 
> > Cc: Arnd Bergmann 
> > Cc: Greg Kroah-Hartman 
> > Cc: Masahiro Yamada 
> > Cc: Joe Perches 
> > Cc: Dominique Martinet 
> > Cc: Nick Desaulniers 
> > Cc: Linus Torvalds 
> > Signed-off-by: Miguel Ojeda 
> > ---
> > *Seems* to work, but note that I did not finish the entire allmodconfig :)
> >
> > A few things could be splitted into their own patch, but I kept it
> > as one for simplicity for a first review.
> >
> > These files are becoming no-headaches-readable again, yay.
>
> A+
>
> >
> >  include/linux/compiler-clang.h  |   5 --
> >  include/linux/compiler-gcc.h|  60 
> >  include/linux/compiler-intel.h  |   6 --
> >  include/linux/compiler.h|   4 --
> >  include/linux/compiler_attributes.h | 105 
> >  include/linux/compiler_types.h  |  91 
> >  6 files changed, 118 insertions(+), 153 deletions(-)
> >  create mode 100644 include/linux/compiler_attributes.h
> >
> > diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> > index b1ce500fe8b3..3e7dafb3ea80 100644
> > --- a/include/linux/compiler-clang.h
> > +++ b/include/linux/compiler-clang.h
> > @@ -21,8 +21,6 @@
> >  #define __SANITIZE_ADDRESS__
> >  #endif
> >
> > -#define __no_s

Re: [PATCH v2] compiler.h: give up __compiletime_assert_fallback()

2018-08-27 Thread Nick Desaulniers
On Mon, Aug 27, 2018 at 1:05 PM Daniel Santos  wrote:
>
> Hello Masahiro,
>
>
> On 08/25/2018 01:16 PM, Masahiro Yamada wrote:
> > __compiletime_assert_fallback() is supposed to stop building earlier
> > by using the negative-array-size method in case the compiler does not
> > support "error" attribute, but has never worked like that.
> >
> > You can simply try:
> >
> > BUILD_BUG_ON(1);
> >
> > GCC immediately terminates the build, but Clang does not report
> > anything because Clang does not support the "error" attribute now.
> > It will later fail at link time, but __compiletime_assert_fallback()
> > is not working at least.
> >
> > The root cause is commit 1d6a0d19c855 ("bug.h: prevent double evaluation
> > of `condition' in BUILD_BUG_ON").
>
> I didn't really think this particular patch was necessary, but it was
> requested that I eliminate double evaluation and I didn't feel like
> arguing it at the time. :)  In my philosophy however, one should *never*
> use an expression with side effects in any type of assert.
>
> > Prior to that commit, BUILD_BUG_ON()
> > was checked by the negative-array-size method *and* the link-time trick.
> > Since that commit, the negative-array-size is not effective because
> > '__cond' is no longer constant.
>
> Now we're back to the question of "what do you mean by 'constant'"?  If
> you mean a C constant expression (as defined in the C standard) than
> almost none of this code fits that criteria.  For these compile-time
> assertions to work, we are concerned with the data flow analysis and
> constant propagation performed by the compiler during optimization.  You
> will notice in include/linux/compiler.h that __compiletime_assert is a
> no-op when __OPTIMIZE__ is not defined.

Depending on optimizations for static assertions sounds problematic.

>
> > As the comment in 
> > says, GCC (and Clang as well) only emits the error for obvious cases.
> >
> > When '__cond' is a variable,
> >
> > ((void)sizeof(char[1 - 2 * __cond]))
> >
> > ... is not obvious for the compiler to know the array size is negative.
> >
> > Reverting that commit would break BUILD_BUG() because negative-size-array
> > is evaluated before the code is optimized out.
> >
> > Let's give up __compiletime_assert_fallback().  This commit does not
> > change the current behavior since it just rips off the useless code.
>
> Clang is not the only target audience of
> __compiletime_assert_fallback().  Instead of ripping out something that
> may benefit builds with gcc 4.2 and earlier, why not override its

Note that with commit cafa0010cd51 ("Raise the minimum required gcc
version to 4.6") that gcc < 4.6 is irrelevant.

> definition in compiler-clang.h with something that will break the build
> for Clang?  It would need an #ifndef __compiletime_error_fallback here
> though.
>
> > Signed-off-by: Masahiro Yamada 
> > Reviewed-by: Kees Cook 
> > Reviewed-by: Nick Desaulniers 
> > ---
> >
> > Changes in v2:
> >   - Rebase
> >
> >  include/linux/compiler.h | 17 +
> >  1 file changed, 1 insertion(+), 16 deletions(-)
> >
> > diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> > index 681d866..87c776c 100644
> > --- a/include/linux/compiler.h
> > +++ b/include/linux/compiler.h
> > @@ -314,29 +314,14 @@ static inline void *offset_to_ptr(const int *off)
> >  #endif
> >  #ifndef __compiletime_error
> >  # define __compiletime_error(message)
> > -/*
> > - * Sparse complains of variable sized arrays due to the temporary variable 
> > in
> > - * __compiletime_assert. Unfortunately we can't just expand it out to make
> > - * sparse see a constant array size without breaking compiletime_assert on 
> > old
> > - * versions of GCC (e.g. 4.2.4), so hide the array from sparse altogether.
> > - */
> > -# ifndef __CHECKER__
> > -#  define __compiletime_error_fallback(condition) \
> > - do { ((void)sizeof(char[1 - 2 * condition])); } while (0)
> > -# endif
> > -#endif
> > -#ifndef __compiletime_error_fallback
> > -# define __compiletime_error_fallback(condition) do { } while (0)
> >  #endif
> >
> >  #ifdef __OPTIMIZE__
> >  # define __compiletime_assert(condition, msg, prefix, suffix)  
> >   \
> >   do {\
> > - int __cond = !(condition);  \
> >

Re: [PATCH v2] compiler.h: give up __compiletime_assert_fallback()

2018-08-27 Thread Nick Desaulniers
On Mon, Aug 27, 2018 at 1:42 PM Daniel Santos  wrote:
>
> Hello Nick,
>
> On 08/27/2018 03:09 PM, Nick Desaulniers wrote:
> >> Now we're back to the question of "what do you mean by 'constant'"?  If
> >> you mean a C constant expression (as defined in the C standard) than
> >> almost none of this code fits that criteria.  For these compile-time
> >> assertions to work, we are concerned with the data flow analysis and
> >> constant propagation performed by the compiler during optimization.  You
> >> will notice in include/linux/compiler.h that __compiletime_assert is a
> >> no-op when __OPTIMIZE__ is not defined.
> > Depending on optimizations for static assertions sounds problematic.
>
> (with my best Palpatine voice) It is unavoidable.

LOL

>
> Actually it's theoretically possible, but the compiler would have to do
> something akin to copying it's control flow graph et. al, run -O2-ish
> optimizations, perform the static assertions and then throw away the
> optimized control flow graph and emit code based upon the original.
>
>
> >>> As the comment in 
> >>> says, GCC (and Clang as well) only emits the error for obvious cases.
> >>>
> >>> When '__cond' is a variable,
> >>>
> >>> ((void)sizeof(char[1 - 2 * __cond]))
> >>>
> >>> ... is not obvious for the compiler to know the array size is negative.
> >>>
> >>> Reverting that commit would break BUILD_BUG() because negative-size-array
> >>> is evaluated before the code is optimized out.
> >>>
> >>> Let's give up __compiletime_assert_fallback().  This commit does not
> >>> change the current behavior since it just rips off the useless code.
> >> Clang is not the only target audience of
> >> __compiletime_assert_fallback().  Instead of ripping out something that
> >> may benefit builds with gcc 4.2 and earlier, why not override its
> > Note that with commit cafa0010cd51 ("Raise the minimum required gcc
> > version to 4.6") that gcc < 4.6 is irrelevant.
>
> Ah, I guess I'm not keeping up, that's wonderful news!  Considering that
> I guess I would be OK with its removal, but I still think it would be
> better if a similar mechanism to break the Clang build could be found.
>
> >> definition in compiler-clang.h with something that will break the build
> >> for Clang?  It would need an #ifndef __compiletime_error_fallback here
> >> though.
> >>
> >>> Signed-off-by: Masahiro Yamada 
> >>> Reviewed-by: Kees Cook 
> >>> Reviewed-by: Nick Desaulniers 
> >>> ---
> >>>
> >>> Changes in v2:
> >>>   - Rebase
> >>>
> >>>  include/linux/compiler.h | 17 +
> >>>  1 file changed, 1 insertion(+), 16 deletions(-)
> >>>
> >>> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> >>> index 681d866..87c776c 100644
> >>> --- a/include/linux/compiler.h
> >>> +++ b/include/linux/compiler.h
> >>> @@ -314,29 +314,14 @@ static inline void *offset_to_ptr(const int *off)
> >>>  #endif
> >>>  #ifndef __compiletime_error
> >>>  # define __compiletime_error(message)
> >>> -/*
> >>> - * Sparse complains of variable sized arrays due to the temporary 
> >>> variable in
> >>> - * __compiletime_assert. Unfortunately we can't just expand it out to 
> >>> make
> >>> - * sparse see a constant array size without breaking compiletime_assert 
> >>> on old
> >>> - * versions of GCC (e.g. 4.2.4), so hide the array from sparse 
> >>> altogether.
> >>> - */
> >>> -# ifndef __CHECKER__
> >>> -#  define __compiletime_error_fallback(condition) \
> >>> - do { ((void)sizeof(char[1 - 2 * condition])); } while (0)
> >>> -# endif
> >>> -#endif
> >>> -#ifndef __compiletime_error_fallback
> >>> -# define __compiletime_error_fallback(condition) do { } while (0)
> >>>  #endif
> >>>
> >>>  #ifdef __OPTIMIZE__
> >>>  # define __compiletime_assert(condition, msg, prefix, suffix)
> >>> \
> >>>   do {\
> >>> - int __cond = !(condition);  \
> >>>   extern void prefix ## suffix(void) 
> >>> _

[PATCH] x86/irqflags: mark native_restore_fl extern inline

2018-08-27 Thread Nick Desaulniers
Fixes commit 208cbb325589 ("x86/irqflags: Provide a declaration for
native_save_fl")

This should have been marked extern inline in order to pick up the out
of line definition in arch/x86/kernel/irqflags.S.

Cc: sta...@vger.kernel.org # 4.18, 4.14, 4.9, 4.4
Reported-by: Ben Hutchings 
Signed-off-by: Nick Desaulniers 
---
 arch/x86/include/asm/irqflags.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index c14f2a74b2be..15450a675031 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -33,7 +33,8 @@ extern inline unsigned long native_save_fl(void)
return flags;
 }
 
-static inline void native_restore_fl(unsigned long flags)
+extern inline void native_restore_fl(unsigned long flags);
+extern inline void native_restore_fl(unsigned long flags)
 {
asm volatile("push %0 ; popf"
 : /* no output */
-- 
2.19.0.rc0.228.g281dcd1b4d0-goog



Re: [PATCH] include/linux/compiler*.h: Use feature checking instead of version checks for attributes

2018-08-28 Thread Nick Desaulniers
On Tue, Aug 28, 2018 at 8:04 AM Miguel Ojeda
 wrote:
>
> Hi Nick,
>
> On Mon, Aug 27, 2018 at 7:43 PM, Nick Desaulniers
>  wrote:
> > On Sun, Aug 26, 2018 at 10:58 AM Miguel Ojeda
> >  wrote:
> >>
> >> Instead of using version checks per-compiler to define (or not) each 
> >> attribute,
> >> use __has_attribute to test for them, following the cleanup started with
> >> commit 815f0ddb346c ("include/linux/compiler*.h: make compiler-*.h 
> >> mutually exclusive").
> >>
> >> All the attributes that are fairly common/standard (i.e. those that do not
> >> require extra logic to define them) have been moved to a new file
> >> include/linux/compiler_attributes.h. The attributes have been sorted
> >> and divided between "required" and "optional".
> >
> > Nice! Thanks Miguel.  Regarding sorting, I'm happy with that.  In
> > fact, some of the comments can be removed IMO, as the attributes have
> > common definitions in the docs (maybe an added link to the gcc and
> > clang attribute docs at the top of the file rather than per attribute
> > comments).
>
> Thanks for the review!
>
> I thought about that, although there isn't a single page with them in
> GCC (we could group them by type though: function ones, variable
> ones... and then link to those).
> On the other hand, maybe writing a
> Doc/ file is better and allows us to write as much as one would like
> about each of them (and a link to each page compiler's page about it,
> etc.). I think in the end the Doc/ file might be the best, in order
> not to crowd the header.

A comment is closer to the source, but I guess that's bytes for each
inclusion for every file.  I don't feel passionate about this point
one way or the other.

>
> >
> >>
> >> Further, attributes that are already supported in gcc >= 4.6 and recent 
> >> clang
> >> were simply made to be required (instead of testing for them):
> >>   * always_inline
> >>   * const (pure was already "required", by the way)
> >>   * gnu_inline
> >
> > There's an important test for gnu_inline that isn't checking that it's
> > supported, but rather what the implicit behavior is depending on which
> > C standard is being used.  It's important not to remove that.
>
> Hm... I actually thought it was not available at some point before 4.6
> and removed the #ifdef. The comment even says it is featuring
> detecting it so that the old GCC inlining is used; but it shouldn't
> matter if you always use it, no?

Good point.  Rather than defining it only if GNU inline is not the
current behavior is a bit more verbose than just always defining it.
This seems to confirm that that should work:
https://godbolt.org/z/igwh32.

>
> I just went looking for more info in d03db2bc2 ("compiler-gcc.h: Add
> __attribute__((gnu_inline)) to all inline declarations") and if I
> understood the commit message, the problem is compiling with implicit
> new standard in newer compilers which trigger the C90 behavior, while
> we need the old one --- but if we use gnu_inline, we are getting it
> regardless.
>
> I am sure I am missing something, but I think a clarification is
> needed (and in the code comment as well) -- a bit off-topic, anyway.
>
> [Also, I wouldn't define an attribute or not depending on some other
> condition. I would, instead, define something some other symbol with
> that logic (i.e. instead of using "__gnu_inline", because that is
> lying -- it is not using the attribute even if the compiler supports
> it).]
>
> >
> >>
> >> Finally, some other bits were cleaned up in the process:
> >>   * __optimize: removed (unused in the whole kernel tree)
> >
> > A+ for removing dead code.  I also don't see it used anywhere.
> >
> >>   * __must_be_array: removed from -gcc and -clang (identical), moved to 
> >> _types
> >
> > Yep, uses a builtin (we should add guards for that, later, in a
> > similar style change that guards the use of builtins). Looks good.
> >
> >> (it depends on the unconditionally used  __builtin_types_compatible_p
> >>   * Removes unneeded underscores on the attributes' names
> >
> > That doesn't sound right, but lets see what you mean by that.
>
> Some attributes used the __name__ syntax (i.e. inside the double
> parenthesis), others didn't. I simplified by removing all the extra
> underscores.

A+

>
> >
> >>
> >> There are some things that can be further cleaned up afterwar

Re: [PATCH] include/linux/compiler*.h: Use feature checking instead of version checks for attributes

2018-08-28 Thread Nick Desaulniers
On Tue, Aug 28, 2018 at 8:10 AM Miguel Ojeda
 wrote:
>
> Hi Nick,
>
> On Mon, Aug 27, 2018 at 7:48 PM, Nick Desaulniers
>  wrote:
> > On Mon, Aug 27, 2018 at 10:43 AM Nick Desaulniers
> >> > +
> >> > +/*
> >> > + * Optional attributes: your compiler may or may not support them.
> >> > + *
> >> > + * To check for them, we use __has_attribute, which is supported on gcc 
> >> > >= 5,
> >> > + * clang >= 2.9 and icc >= 17. In the meantime, to support 4.6 <= gcc < 
> >> > 5,
> >> > + * we implement it by hand.
> >> > + */
> >> > +#ifndef __has_attribute
> >> > +#define __has_attribute(x) __GCC46_has_attribute_##x
> >> > +#define __GCC46_has_attribute_assume_aligned 0
> >> > +#define __GCC46_has_attribute_designated_init 0
> >> > +#define __GCC46_has_attribute_externally_visible 1
> >> > +#define __GCC46_has_attribute_noclone 1
> >> > +#define __GCC46_has_attribute_optimize 1
> >> > +#define __GCC46_has_attribute_no_sanitize_address 0
> >> > +#endif
> >
> > And a follow up; I'm trying to understand what will happen in the case
> > of say gcc 4.9 here.  Were any of these supported between gcc 4.6 and
> > 5.0?  If so, then this code will not use them.  It's simpler than
> > explicit version checks, but it won't use features that are supported.
> >
>
> I addressed that in the email I sent afterwards:
>
> """
> Note that:
>   - assume_aligned came with gcc 4.9
>   - no_sanitize_address came with gcc 4.8
>
> So if we feel it is important to have them there (before gcc 5), we
> would need here a quick version check here.
> """
>
> The idea is that, in the future, whenever gcc 5 or later is the
> minimum version, we just get rid of the #ifdef block without touching
> the rest of the code :-)

So if __has_attribute came with gcc 5, then that means that this patch
will break assume_aligned for gcc-4.9 users and no_sanitize_address
for gcc-4.8 and gcc-4.9 users?  The slab allocator uses
assume_aligned, and no_sanitize_address for CONFIG_KASAN.  Should this
patch ever come back through stable, Android and ChromeOS
gcc-4.9/KASAN builds will break.

I don't think we should leave that for a follow up; I would like to
see it as part of this patch.  It's ok to have explicit version checks
for those 2 attributes since it's not possible to feature detect them
for the versions of gcc that we support in this code base.  I think
you should add them in a v2 of this patch.  Then we can point to this
commit as the *shining example* of how to do proper feature detection,
falling back to version checks only as a last resort.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] x86/irqflags: mark native_restore_fl extern inline

2018-08-28 Thread Nick Desaulniers
On Tue, Aug 28, 2018 at 5:43 AM Greg Kroah-Hartman
 wrote:
>
> On Tue, Aug 28, 2018 at 08:02:37AM +0200, Juergen Gross wrote:
> > On 28/08/18 07:13, Greg Kroah-Hartman wrote:
> > > On Mon, Aug 27, 2018 at 02:40:09PM -0700, Nick Desaulniers wrote:
> > >> Fixes commit 208cbb325589 ("x86/irqflags: Provide a declaration for
> > >> native_save_fl")
> > >>
> > >> This should have been marked extern inline in order to pick up the out
> > >> of line definition in arch/x86/kernel/irqflags.S.
> > >>
> > >> Cc: sta...@vger.kernel.org # 4.18, 4.14, 4.9, 4.4
> > >> Reported-by: Ben Hutchings 
> > >> Signed-off-by: Nick Desaulniers 
> > >> ---
> > >>  arch/x86/include/asm/irqflags.h | 3 ++-
> > >>  1 file changed, 2 insertions(+), 1 deletion(-)
> > >>
> > >> diff --git a/arch/x86/include/asm/irqflags.h 
> > >> b/arch/x86/include/asm/irqflags.h
> > >> index c14f2a74b2be..15450a675031 100644
> > >> --- a/arch/x86/include/asm/irqflags.h
> > >> +++ b/arch/x86/include/asm/irqflags.h
> > >> @@ -33,7 +33,8 @@ extern inline unsigned long native_save_fl(void)
> > >>return flags;
> > >>  }
> > >>
> > >> -static inline void native_restore_fl(unsigned long flags)
> > >> +extern inline void native_restore_fl(unsigned long flags);
> > >> +extern inline void native_restore_fl(unsigned long flags)
> > >
> > > This looks odd to me, but my coffee hasn't kicked in yet this morning.
> > > Why do you need both lines here?  Shouldn't the actual function be
> > > sufficient?  If not, a comment explaining this would be nice.

That's a valid concern.  I did hesitate while writing the commit
message whether to describe that part or not.  The reason to include
the seemingly-additional-declaration was to prevent another case of
commit 208cbb325589 ("x86/irqflags: Provide a declaration for
native_save_fl").  It doesn't hurt to provide more info in the commit
message, so next time I'll prefer to explain more in my commit
messages than less.

Thanks Juergen and Greg for reviewing the patch.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v2] compiler.h: give up __compiletime_assert_fallback()

2018-08-28 Thread Nick Desaulniers
On Mon, Aug 27, 2018 at 1:42 PM Daniel Santos  wrote:
>
> Hello Nick,
>
> On 08/27/2018 03:09 PM, Nick Desaulniers wrote:
> >>> Let's give up __compiletime_assert_fallback().  This commit does not
> >>> change the current behavior since it just rips off the useless code.
> >> Clang is not the only target audience of
> >> __compiletime_assert_fallback().  Instead of ripping out something that
> >> may benefit builds with gcc 4.2 and earlier, why not override its
> > Note that with commit cafa0010cd51 ("Raise the minimum required gcc
> > version to 4.6") that gcc < 4.6 is irrelevant.
>
> Ah, I guess I'm not keeping up, that's wonderful news!  Considering that
> I guess I would be OK with its removal, but I still think it would be
> better if a similar mechanism to break the Clang build could be found.

I'm consulting with our best language lawyers to see what combinations
of _Static_assert and __builtin_constant_p would do the trick.


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-21 Thread Nick Desaulniers
ot understand "#pragma GCC diagnostic push" 
> */
> +#if GCC_VERSION >= 40600
>  #define __diag_str1(s) #s
>  #define __diag_str(s)  __diag_str1(s)
>  #define __diag(s)  _Pragma(__diag_str(GCC diagnostic s))
> +#endif
>
>  #if GCC_VERSION >= 8
>  #define __diag_GCC_8(s)__diag(s)
> --
> 2.7.4
>

I think the current state of always including compiler-gcc.h, then
possibly including compiler-clang.h or compiler-intel.h to overwrite
some things is kind of a spaghetti mess, because now we wind up with
these circular dependencies on GCC_VERSION.  And that Clang just
happened to declare __GNUC_MAJOR__ and friends in a way that didn't
blow up until today was a bit of luck; that was a time bomb waiting to
go off.  I think it's time to shave this yak.  Adding these
GCC_VERSION checks back doesn't simplify the state of things.

I think a more proper fix might be something along the lines of (in
compiler_types.h):

#include  /* potential new header for common
definitions (or just put them above) */
#if /* more proper check for gcc and JUST gcc */
#include 
#elif /* compiler check for icc */
#include 
#elif /* compiler check for clang */
#include 
#endif

#ifndef /* something from the above that's not mission critical */
#warning "compiler missing foo"
#undef foo
#endif

#ifndef /* something from above that IS mission critical */
#error "compiler missing bar"
#endif

I appreciate the patch, but I think there's another way we can clean this up.
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-21 Thread Nick Desaulniers
On Tue, Aug 21, 2018 at 5:38 AM Dominique Martinet
 wrote:
>
> Nick Desaulniers Aug. 21, 2018, 8:09 a.m. UTC:
> > Thanks for noticing, and sending this patch.  I'm happy to see others
> > testing with Clang.  I noticed this too near the end of the day
> > https://github.com/ClangBuiltLinux/linux/issues/27.
>
> FWIW libbcc so many BPF users also use clang, so this has more impact
> than just testing to build linux with clang (not that this would be any
> reason to delay fixing either way)
>
> I would tend to agree havin a compiler-common + make clang/intel not
> include compiler-gcc would probably be best in the long run but we might
> want a quick fix for 4.19 meanwhile..

That's fair. SOP here is quick (full) revert, then come up with a
better fix.  And I do prefer Masahiro's partial revert to a full
revert of Joe's patch.  That will give us more time to develop the
proper fix rather than rush.  I'll try to see how we can more properly
split the compiler specific headers.

Tested with gcc-7 and clang-8.

Tested-by: Nick Desaulniers 
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-21 Thread Nick Desaulniers
On Tue, Aug 21, 2018 at 3:39 AM Joe Perches  wrote:
>
> On Tue, 2018-08-21 at 15:48 +0900, Masahiro Yamada wrote:
> > Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
> > missed the fact that  is included by Clang
> > as well as by GCC.
> >
> > Clang actually defines __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__
> > and it looks like GCC 4.2.1.
> >
> >   $ scripts/gcc-version.sh -p clang
> >   040201
>
> Perhaps this would work, but I can't test it as
> my clang version doesn't otherwise build a defconfig
> and errors out with
>
> $ make CC=clang
> arch/x86/Makefile:179: *** Compiler lacks asm-goto support..  Stop.

Sorry, we're working on implementing this in clang and llvm for x86.
I recently reviewed the design doc, and am trying to see how I can
actively help push this along.  It's not a small/quick change to llvm,
but we're working on it.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-21 Thread Nick Desaulniers
On Tue, Aug 21, 2018 at 9:33 AM Joe Perches  wrote:
>
> On Tue, 2018-08-21 at 15:48 +0900, Masahiro Yamada wrote:
> > Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
> > missed the fact that  is included by Clang
> > as well as by GCC.
> >
> > Clang actually defines __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__
> > and it looks like GCC 4.2.1.
> >
> >   $ scripts/gcc-version.sh -p clang
> >   040201
> >
> > If you try to build the kernel with Clang, you will get the
> > "Sorry, your compiler is too old - please upgrade it."
> > followed by a bunch of "unknown attribute" warnings.
> >
> > Add !defined(__clang__) to the minimum version check.
> >
> > Also, revive the version test blocks for versions >= 4.2.1
> > in order to disable features not supported by Clang.
>
> What is the minimum clang version required to compile the kernel?

Depends on the architecture and which kernel version/LTS branch you're
using.  I'm trying to backport fixes to LTS branches, but sometimes a
compiler upgrade is required.  I know that's not great, but I'm
actively trying to fix it.

> What features are not supported by the minimum clang version?
>
> On my system, using clang
>
> $ clang -v
> clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
>
> and
>
> $ git checkout v4.16 ; make clean ; make CC=clang defconfig ; make CC=clang
> HEAD is now at 0adb32858b0b... Linux 4.16
>
> is successful
>
> but
>
> $ git checkout v4.17 ; make clean ; make CC=clang defconfig ; make CC=clang
> HEAD is now at 29dcea88779c... Linux 4.17
> arch/x86/Makefile:184: *** Compiler lacks asm-goto support..  Stop.
> arch/x86/Makefile:184: *** Compiler lacks asm-goto support..  Stop.
>

See commit e501ce9 ("x86: Force asm-goto").

$ git describe --contains e501ce9 | sed 's/~.*//'
v4.17-rc1

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-21 Thread Nick Desaulniers
On Tue, Aug 21, 2018 at 9:45 AM Joe Perches  wrote:
>
> On Tue, 2018-08-21 at 09:32 -0700, Nick Desaulniers wrote:
> > On Tue, Aug 21, 2018 at 5:38 AM Dominique Martinet
> >  wrote:
> > >
> > > Nick Desaulniers Aug. 21, 2018, 8:09 a.m. UTC:
> > > > Thanks for noticing, and sending this patch.  I'm happy to see others
> > > > testing with Clang.  I noticed this too near the end of the day
> > > > https://github.com/ClangBuiltLinux/linux/issues/27.
> > >
> > > FWIW libbcc so many BPF users also use clang, so this has more impact
> > > than just testing to build linux with clang (not that this would be any
> > > reason to delay fixing either way)
> > >
> > > I would tend to agree havin a compiler-common + make clang/intel not
> > > include compiler-gcc would probably be best in the long run but we might
> > > want a quick fix for 4.19 meanwhile..
> >
> > That's fair. SOP here is quick (full) revert, then come up with a
> > better fix.  And I do prefer Masahiro's partial revert to a full
> > revert of Joe's patch.  That will give us more time to develop the
> > proper fix rather than rush.  I'll try to see how we can more properly
> > split the compiler specific headers.
> >
> > Tested with gcc-7 and clang-8.
>
> clang-8? Isn't the latest officlal clang 6.0.1 ?

Yes, but I have a local llvm tree that I work out of, that's in my
$PATH, so my version of clang is never too far behind Top of Tree.
For android, we're using clang-5, but currently staging an upgrade to
clang 6.0.1.

> So if something other than 6.0.x is required,
> then some additional check should probably be
> added to compiler-clang.h as well.
>

Sure, but that doesn't need to go in Mashiro's patch today.  That can
wait for a proper separation between compiler headers where we can
then implement improved version checks.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-22 Thread Nick Desaulniers
On Tue, Aug 21, 2018 at 9:32 PM Dominique Martinet
 wrote:
>
> Joe Perches wrote on Tue, Aug 21, 2018:
> > On Wed, 2018-08-22 at 06:16 +0200, Dominique Martinet wrote:
> > > I think that could work, but at the point making a separate
> > > compiler-common.h and not including compiler-gcc.h for clang sounds
> > > better to me... More importantly here, either solution sound complex
> > > enough to require more than a few days and proper testing for all archs
> > > etc when compared to the partial revert we have here.
> >
> > The immediate need for a partial revert seems unnecessary as
> > clang hasn't really worked for a couple releases now.
>
> Sorry for repeating myself, clang is used by bcc for compiling BPF
> programs (e.g. bpf_module_create_c_from_string() or any similar function
> will use the clang libs to compile the bpf program with linux headers),
> and this has always been working because it's not using our makefiles.
>
> This broke today in master and I only joined this thread after looking
> at why the build started failing today and noticing this patch, it
> definitely hasn't been broken for two releases, or I wouldn't be here
> with this timing :)
>
>
> > The separate compiler file changes are much more sensible,
> > even if it takes a few days.
>
> A few days are fine, but I think some form of fix in 4.19-rc1 would be
> good.
>
> I'll stop taking your time now, but I think you are/were underestimating
> how many people use clang with the linux kernel headers indirectly.
> BPF is a well-used tool :)

Hi Dominique,
I'm currently testing a fix in
https://github.com/ClangBuiltLinux/linux/commit/1f89ae7622c26b8131f42f3a362d6ef41b88a595,
can you please share with me your steps to test/verify that the patch
fixes the issue for eBPF?  I'll go talk to a co-worker who know more
about eBPF, but I've not yet done anything with it.

Also, does anyone know who I should talk to about ICC testing?
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-22 Thread Nick Desaulniers
On Wed, Aug 22, 2018 at 11:31 AM Nick Desaulniers
 wrote:
> Hi Dominique,
> I'm currently testing a fix in
> https://github.com/ClangBuiltLinux/linux/commit/1f89ae7622c26b8131f42f3a362d6ef41b88a595,

Sorry, maybe https://github.com/ClangBuiltLinux/linux/commits/compiler_detection
is a better link, as the sha changes whenever I modify the patch and
force push.
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] compiler-gcc: get back Clang build

2018-08-22 Thread Nick Desaulniers
On Wed, Aug 22, 2018 at 1:50 PM Joe Perches  wrote:
>
> On Wed, 2018-08-22 at 11:31 -0700, Nick Desaulniers wrote:
> > On Tue, Aug 21, 2018 at 9:32 PM Dominique Martinet
> >  wrote:
> > >
> > > Joe Perches wrote on Tue, Aug 21, 2018:
> > > > On Wed, 2018-08-22 at 06:16 +0200, Dominique Martinet wrote:
> > > > > I think that could work, but at the point making a separate
> > > > > compiler-common.h and not including compiler-gcc.h for clang sounds
> > > > > better to me... More importantly here, either solution sound complex
> > > > > enough to require more than a few days and proper testing for all 
> > > > > archs
> > > > > etc when compared to the partial revert we have here.
> > > >
> > > > The immediate need for a partial revert seems unnecessary as
> > > > clang hasn't really worked for a couple releases now.
> > >
> > > Sorry for repeating myself, clang is used by bcc for compiling BPF
> > > programs (e.g. bpf_module_create_c_from_string() or any similar function
> > > will use the clang libs to compile the bpf program with linux headers),
> > > and this has always been working because it's not using our makefiles.
> > >
> > > This broke today in master and I only joined this thread after looking
> > > at why the build started failing today and noticing this patch, it
> > > definitely hasn't been broken for two releases, or I wouldn't be here
> > > with this timing :)
> > >
> > >
> > > > The separate compiler file changes are much more sensible,
> > > > even if it takes a few days.
> > >
> > > A few days are fine, but I think some form of fix in 4.19-rc1 would be
> > > good.
> > >
> > > I'll stop taking your time now, but I think you are/were underestimating
> > > how many people use clang with the linux kernel headers indirectly.
> > > BPF is a well-used tool :)
> >
> > Hi Dominique,
> > I'm currently testing a fix in
> > https://github.com/ClangBuiltLinux/linux/commit/1f89ae7622c26b8131f42f3a362d6ef41b88a595,
> > can you please share with me your steps to test/verify that the patch
> > fixes the issue for eBPF?  I'll go talk to a co-worker who know more
> > about eBPF, but I've not yet done anything with it.
>
> A mild suggestion about the patch would be to break it up into
> 2 patches to improve how people read and review them.
>
> 1 include/linux/compiler-*
> 2 everything else
>
> Yes, some kernel configs might not build properly between 1 and 2
> but that likely doesn't matter as those configs probably don't
> build before 1 either.

If we ordered the patches so that the "everything else" went in first,
it would not be a problem.  The first patch would just be the checks
that GCC_VERSION is defined.

In general, I'm happy to split patches, but in this suggested case, it
only shaves off 26 lines from the main body of work.  I don't think 26
lines is enough to justify splitting for readability.  But let me know
if you feel strongly about that point and I'll be happy to split. (or
possibly by "everything else" you meant something else?)

> Perhaps the test in arch/arm/kernel/asm-offsets.c for incompatible
> gcc compiler versions from 4.8.0 to 4.8.2 should be moved to
> compiler-gcc.h.

This a good suggestion, and one I've thought about, although in the
context of builtins.  *Detour ahead*.  Builtins are not portable by
default, and their use really should be feature detected (impossible
currently on gcc due to
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66970) or version checks
and protected by macros like the symbols in my current patch.  I think
I might soon hoist them to a shared header that safely detects their
support or provides a fallback when possible.  One issue is that some
builtins are arch specific, so do you want to feature detect arch
specific ones for builds of a different arch?  And that's the problem
I have with hoisting arch specific compiler checks into a shared
header; builds of other archs should pay no build penalty for
unrelated archs.  All that to say that I think it's best to keep arch
specific checks in arch/ specific subdirs, but I welcome more thoughts
on this.

Ok, I've boot tested this patch for x86_64 and arm64 in qemu, with
CC=gcc-7.2, CC=clang-8, and CC=clang-8 HOSTCC=clang-8 and am feeling
quite confident to send it for review.

>> Also, does anyone know who I should talk to about ICC testing?
> No clue

+ Daniel and HPA (the last commits to include/linux/compiler*
mentioning `icc` in 2015 and 2013 respectively)
--
Thanks,
~Nick Desaulniers


[PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-22 Thread Nick Desaulniers
Commit cafa0010cd51 ("Raise the minimum required gcc version to 4.6")
recently exposed a brittle part of the build for supporting non-gcc
compilers.

Both Clang and ICC define __GNUC__, __GNUC_MINOR__, and
__GNUC_PATCHLEVEL__ for quick compatibility with code bases that haven't
added compiler specific checks for __clang__ or __INTEL_COMPILER.

This is brittle, as they happened to get compatibility by posing as a
certain version of GCC.  This broke when upgrading the minimal version
of GCC required to build the kernel, to a version above what ICC and
Clang claim to be.

Rather than always including compiler-gcc.h then undefining or
redefining macros in compiler-intel.h or compiler-clang.h, let's
separate out the compiler specific macro definitions into mutually
exclusive headers, do more proper compiler detection, and keep shared
definitions in compiler_types.h.

Reported-by: Masahiro Yamada 
Suggested-by: Eli Friedman 
Suggested-by: Joe Perches 
Signed-off-by: Nick Desaulniers 
---
 arch/arm/kernel/asm-offsets.c |  13 +-
 drivers/gpu/drm/i915/i915_utils.h |   2 +-
 drivers/watchdog/kempld_wdt.c |   5 -
 include/linux/compiler-clang.h|  20 ++-
 include/linux/compiler-gcc.h  |  88 ---
 include/linux/compiler-intel.h|  13 +-
 include/linux/compiler_types.h| 238 ++
 mm/ksm.c  |   4 +-
 mm/migrate.c  |   3 +-
 9 files changed, 133 insertions(+), 253 deletions(-)

diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 974d8d7d1bcd..3968d6c22455 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -38,25 +38,14 @@
 #error Sorry, your compiler targets APCS-26 but this kernel requires APCS-32
 #endif
 /*
- * GCC 3.0, 3.1: general bad code generation.
- * GCC 3.2.0: incorrect function argument offset calculation.
- * GCC 3.2.x: miscompiles NEW_AUX_ENT in fs/binfmt_elf.c
- *(http://gcc.gnu.org/PR8896) and incorrect structure
- *   initialisation in fs/jffs2/erase.c
  * GCC 4.8.0-4.8.2: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58854
  *   miscompiles find_get_entry(), and can result in EXT3 and EXT4
  *   filesystem corruption (possibly other FS too).
  */
-#ifdef __GNUC__
-#if (__GNUC__ == 3 && __GNUC_MINOR__ < 3)
-#error Your compiler is too buggy; it is known to miscompile kernels.
-#errorKnown good compilers: 3.3, 4.x
-#endif
-#if GCC_VERSION >= 40800 && GCC_VERSION < 40803
+#if defined(GCC_VERSION) && GCC_VERSION >= 40800 && GCC_VERSION < 40803
 #error Your compiler is too buggy; it is known to miscompile kernels
 #error and result in filesystem corruption and oopses.
 #endif
-#endif
 
 int main(void)
 {
diff --git a/drivers/gpu/drm/i915/i915_utils.h 
b/drivers/gpu/drm/i915/i915_utils.h
index 00165ad55fb3..395dd2511568 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -43,7 +43,7 @@
 #define MISSING_CASE(x) WARN(1, "Missing case (%s == %ld)\n", \
 __stringify(x), (long)(x))
 
-#if GCC_VERSION >= 7
+#if defined(GCC_VERSION) && GCC_VERSION >= 7
 #define add_overflows(A, B) \
__builtin_add_overflow_p((A), (B), (typeof((A) + (B)))0)
 #else
diff --git a/drivers/watchdog/kempld_wdt.c b/drivers/watchdog/kempld_wdt.c
index 2f3b049ea301..e268add43010 100644
--- a/drivers/watchdog/kempld_wdt.c
+++ b/drivers/watchdog/kempld_wdt.c
@@ -146,12 +146,7 @@ static int kempld_wdt_set_stage_timeout(struct 
kempld_wdt_data *wdt_data,
u32 remainder;
u8 stage_cfg;
 
-#if GCC_VERSION < 40400
-   /* work around a bug compiling do_div() */
-   prescaler = READ_ONCE(kempld_prescaler[PRESCALER_21]);
-#else
prescaler = kempld_prescaler[PRESCALER_21];
-#endif
 
if (!stage)
return -EINVAL;
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 7087446c24c8..5f0754692ce6 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -6,11 +6,7 @@
 /* Some compiler specific definitions are overwritten here
  * for Clang compiler
  */
-
-#ifdef uninitialized_var
-#undef uninitialized_var
 #define uninitialized_var(x) x = *(&(x))
-#endif
 
 /* same as gcc, this was present in clang-2.6 so we can assume it works
  * with any version that can compile the kernel
@@ -25,14 +21,8 @@
 #define __SANITIZE_ADDRESS__
 #endif
 
-#undef __no_sanitize_address
 #define __no_sanitize_address __attribute__((no_sanitize("address")))
 
-/* Clang doesn't have a way to turn it off per-function, yet. */
-#ifdef __noretpoline
-#undef __noretpoline
-#endif
-
 /*
  * Not all versions of clang implement the the type-generic versions
  * of the builtin overflow checkers. Fortunately, clang implements
@@ -40,9 +30,17 @@
  * checks. Unfortunately, 

Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-23 Thread Nick Desaulniers
One reply for a bunch of the various threads, to keep the number of emails down:

On Wed, Aug 22, 2018 at 5:20 PM Joe Perches  wrote:
> On Wed, 2018-08-22 at 16:37 -0700, Nick Desaulniers wrote:
> > +/* Compiler specific macros. */
> >  #ifdef __clang__
> >  #include 
>
> probably better as
>
> #if defined(__clang)
>
> to match the style of the #elif defined()s below it

Hi Joe,
Thanks for the feedback. I always appreciate it.  If you have some
cleanups, want to send them to me, and I'll bundle them up for a PR?
I'm ok with that change.

> > +#ifdef __GNUC_STDC_INLINE__
> > +# define __gnu_inline__attribute__((gnu_inline))
> > +#else
> > +# define __gnu_inline
> > +#endif
>
> Perhaps __gnu_inline should be in compiler-gcc and this
> should use
>
> #ifndef __gnu_inline
> #define __gnu_inline
> #endif

Not this case; it's how we get gnu89 semantics for `extern inline` is
not compiler specific (therefor should not go in a compiler specific
header).

> > +#if !defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) || \
> > + !defined(CONFIG_OPTIMIZE_INLINING)
> > +#define inline \
> > + inline __attribute__((always_inline, unused)) notrace __gnu_inline
> > +#else
> > +#define inline inline__attribute__((unused)) notrace __gnu_inline
> > +#endif
>
> This bit might be better adding another __ attribute like:
>
> #if defined(CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING) &&
> defined(CONFIG_OPTIMIZE_INLINING)
> #define __optimized_inline __unused
> #else
> #define __optimized_inline __unused __attribute__((always_inline))
> #endif
>
> #define inline inline __optimized_inline notrace __gnu_inline

Sure, as above I'm happy to take clean ups, but that might result in
treewide changes (maybe less benefit but could separate `inline` from
`__optimized_inline`).  Maybe bikeshed territory, but `force_inline`
or some notion that we're trying to overrule the compiler here might
make intent clearer.

> > -#if (GCC_VERSION >= 40700 && GCC_VERSION < 40900) && defined(CONFIG_ARM)
> > +#if defined(CONFIG_ARM) && \
> > + defined(GCC_VERSION) && GCC_VERSION < 40900 && GCC_VERSION >= 40700
>
> I find the reversed version tests a bit odd to read

Sorry, I probably didn't need to reorder that. My thought process was
in terms of short circuiting, what order was most likely for us to
bail out of the condition first?  If it hurts readability, I'm happy
to take clean ups.

On Wed, Aug 22, 2018 at 5:25 PM Dominique Martinet
 wrote:
> Overall looks good to me, just pointing at the same error I wrote in my
> other mail here -- I saw that by the time I was done writing this this
> patch got taken but that alone will probably warrant a follow-up :/
> > +#define barrier() (__asm__ __volatile__("" : : : "memory"))
>
> barrier here has gained external () compared to the definition and
> compiler-gcc.h:
> #define barrier() __asm__ __volatile__("": : :"memory")

Dominique,
Yes, sorry about that (looks like Linus noticed that, too).  Was a
stupid last minute mistake on my part, trying to appease
checkpatch.pl.  One of these days, I'll get frustrated enough to
rewrite checkpatch.pl as a set of clang tidy checks (so that it
actually parses the code), but I'll have to learn how to read perl
first to start translating.

> I've also added a few style nitpicks/questions but feel free to ignore
> these.

No, please, I really appreciate you taking the time to actually test
this and provide feedback.  That kind of support is critical in open
source.

> On a side note, I noticed tools/include/linux/compiler.h includes
> linux/compiler-gcc.h but maybe it should include linux/compiler_types.h?
> (I'm not sure at who uses that header, so it really is an open question
> here)

Without looking into the code, this sounds like compiler.h is wrong.
Looking at the source, there's references to ancient Android NDK's
(what?! Let me show this to the NDK maintainers).  This whole thing
needs to be cleaned up, too, IMO.  Oh, there's two compiler-gcc.h in
the tree:

- tools/include/linux/compiler-gcc.h (that's what's being included in
the case you point out)
- include/linux/compiler-gcc.h

Maybe they can be combined?

> > -#define __nostackprotector   __optimize("no-stack-protector")
>
> I do not see this being added back, it's probably fine though as it
> doesn't look used?
> (I didn't check all removed lines, maybe about half)

For each case, I triple checked that the macro was actually being used
in the code.  __nostackprotector was not, so I dropped it.

Sounds like I may have messed up `__naked` 

Re: [PATCH] include/linux/compiler*.h: make compiler-*.h mutually exclusive

2018-08-23 Thread Nick Desaulniers
On Thu, Aug 23, 2018 at 2:19 PM Joe Perches  wrote:
>
> On Thu, 2018-08-23 at 14:03 -0700, Nick Desaulniers wrote:
> > One reply for a bunch of the various threads, to keep the number of emails 
> > down:
> >
> > On Wed, Aug 22, 2018 at 5:20 PM Joe Perches  wrote:
> > > On Wed, 2018-08-22 at 16:37 -0700, Nick Desaulniers wrote:
> > > > +/* Compiler specific macros. */
> > > >  #ifdef __clang__
> > > >  #include 
> > >
> > > probably better as
> > >
> > > #if defined(__clang)
> > >
> > > to match the style of the #elif defined()s below it
> >
> > Hi Joe,
> > Thanks for the feedback. I always appreciate it.  If you have some
> > cleanups, want to send them to me, and I'll bundle them up for a PR?
> > I'm ok with that change.
> >
> > > > +#ifdef __GNUC_STDC_INLINE__
> > > > +# define __gnu_inline__attribute__((gnu_inline))
> > > > +#else
> > > > +# define __gnu_inline
> > > > +#endif
> > >
> > > Perhaps __gnu_inline should be in compiler-gcc and this
> > > should use
> > >
> > > #ifndef __gnu_inline
> > > #define __gnu_inline
> > > #endif
> >
> > Not this case; it's how we get gnu89 semantics for `extern inline` is
> > not compiler specific (therefor should not go in a compiler specific
> > header).
>
> It's not possible to know that compilers support what
> __attribute__(()) and at what version that support
> exists unless it is specified somewhere.

__has_attribute:
https://clang.llvm.org/docs/LanguageExtensions.html#has-attribute

The release notes of GCC-5 mention __has_attribute.
https://gcc.gnu.org/gcc-5/changes.html

The point of feature detection is that it _doesn't matter_ what
version that support exists.  Either it does and you can use it, or it
doesn't and you can decide whether to stop compiling or there's a
valid work around.

Feature detection should be preferred to explicit version checks
except in 2 specific cases:
1. It's not possible to properly perform feature detection.  Language
features should not be added unless it's possible to safely check for
them.
2. A very specific version of a very specific compiler is broken and
needs to be explicitly guarded against.

> As far as I can tell,  gnu_inline is not recognized by clang.
>
> https://clang.llvm.org/docs/AttributeReference.html

If that was the case, I would not have added it in commit d03db2bc26f0
("compiler-gcc.h: Add __attribute__((gnu_inline)) to all inline
declarations").
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d03db2bc26f0e4a6849ad649a09c9c73fccdc656

Docs can sometimes fall behind, the lone source of truth is the source code.
https://github.com/llvm-mirror/clang/search?q=gnu_inline&unscoped_q=gnu_inline

Godbolt is also incredibly helpful for testing various compiler versions:
https://godbolt.org/z/uMJ-mo


https://reviews.llvm.org/D51190

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] tracing: do not leak kernel addresses

2018-07-26 Thread Nick Desaulniers
On Thu, Jul 26, 2018 at 8:22 AM Steven Rostedt  wrote:
>
> On Thu, 26 Jul 2018 08:14:08 -0700
> Mark Salyzyn  wrote:
>
> > Thank you Steve, much appreciated feedback, I have asked the security
> > developers to keep this in mind and come up with a correct fix.
> >
> > The correct fix that meets your guidelines would _not_ be suitable for
> > stable due to the invasiveness it sounds, only for the latest will such
> > a rework make sense. As such, the fix proposed in this patch is the only
> > one that meets the bar for stable patch simplicity, and merely(!) needs
> > to state that if the fix is taken, perf and trace are broken.
> >
> > Posting this patch publicly on the lists, that may never be applied, may
> > be the limit of our responsibility for a fix to stable kernel releases,
> > to be optionally applied by vendors concerned with this CVE criteria?
> >
>
> The patch breaks the code it touches. It makes it useless.

Doesn't that depend on kptr_restrict, or would it be broken if
kptr_restrict was set to 0?

> If you want
> something for stable, add a command line parameter that just disables
> the creation of that file. Otherwise you will break usespace and that
> will be a definitely NAK from Linus, and for stable itself. This is a
> very minor security issue, and does not justify breaking userspace
> applications. I would be very upset if a new stable release broke both
> perf and trace-cmd's ability to read certain trace events.

I don't disagree.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] tracing: do not leak kernel addresses

2018-07-26 Thread Nick Desaulniers
On Thu, Jul 26, 2018 at 8:32 AM Greg KH  wrote:
>
> On Thu, Jul 26, 2018 at 08:14:08AM -0700, Mark Salyzyn wrote:
> > On 07/25/2018 06:07 PM, Steven Rostedt wrote:
> > > On Wed, 25 Jul 2018 13:22:36 -0700
> > > Mark Salyzyn  wrote:
> > >
> > > > From: Nick Desaulniers 
> > > >
> > > > Switch from 0x%lx to 0x%pK to print the kernel addresses.
> > > >
> > > > Fixes: CVE-2017-0630
> > > Wait This breaks perf and trace-cmd! They require this to be able
> > > to print various strings in trace events. This file is root read only,
> > > as the CVE says.
> > >
> > > NAK for this fix. Come up with something that doesn't break perf and
> > > trace-cmd. That will not be trivial, as the format is stored in the
> > > ring buffer with an address, then referenced directly. It also handles
> > > trace_printk() functions that simply point to the string format itself.
> > >
> > > A fix would require having a pointer be the same that is referenced
> > > inside the kernel as well as in this file. Maybe make the format string
> > > placed in a location that doesn't leak where the rest of the kernel
> > > exists?
> > >
> > > -- Steve
> > Thank you Steve, much appreciated feedback, I have asked the security
> > developers to keep this in mind and come up with a correct fix.
> >
> > The correct fix that meets your guidelines would _not_ be suitable for
> > stable due to the invasiveness it sounds, only for the latest will such a
> > rework make sense. As such, the fix proposed in this patch is the only one
> > that meets the bar for stable patch simplicity, and merely(!) needs to state
> > that if the fix is taken, perf and trace are broken.
>
> Why would I take something for the stable trees that does not match what
> is upstream?  It feels to me that this CVE is just invalid.  Yes, root
> can read the kernel address, does that mean it is a problem?  Only if
> you allow unprotected users to run with root privileges :)
>
> What exactly is the problem here in the current kernel that you are
> trying to solve?

See the section "Kernel addresses" in
Documentation/security/self-protection.  IIRC, the issue is that a
process may have CAP_SYSLOG but not necessarily CAP_SYS_ADMIN (so it
can read dmesg, but not necessarily issue a sysctl to change
kptr_restrict), get compromised and used to leak kernel addresses,
which can then be used to defeat KASLR.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] tracing: do not leak kernel addresses

2018-07-26 Thread Nick Desaulniers
On Thu, Jul 26, 2018 at 9:37 AM Steven Rostedt  wrote:
>
> On Thu, 26 Jul 2018 09:32:07 -0700
> Nick Desaulniers  wrote:
>
> > On Thu, Jul 26, 2018 at 8:22 AM Steven Rostedt  wrote:
> > >
> > > On Thu, 26 Jul 2018 08:14:08 -0700
> > > Mark Salyzyn  wrote:
> > >
> > > > Thank you Steve, much appreciated feedback, I have asked the security
> > > > developers to keep this in mind and come up with a correct fix.
> > > >
> > > > The correct fix that meets your guidelines would _not_ be suitable for
> > > > stable due to the invasiveness it sounds, only for the latest will such
> > > > a rework make sense. As such, the fix proposed in this patch is the only
> > > > one that meets the bar for stable patch simplicity, and merely(!) needs
> > > > to state that if the fix is taken, perf and trace are broken.
> > > >
> > > > Posting this patch publicly on the lists, that may never be applied, may
> > > > be the limit of our responsibility for a fix to stable kernel releases,
> > > > to be optionally applied by vendors concerned with this CVE criteria?
> > > >
> > >
> > > The patch breaks the code it touches. It makes it useless.
> >
> > Doesn't that depend on kptr_restrict, or would it be broken if
> > kptr_restrict was set to 0?
>
> Is that what governs the output of kallsyms?

>From my workstation:

$ cat /proc/kallsyms

prints a bunch of zero'd out addresses, while

$ sudo !!

prints out actual addresses.  Looking at kernel/kallsyms.c, it seems
that there's no use of %pK, but kallsyms_show_value() switches on
kptr_restrict (and additional values):

/*
 * We show kallsyms information even to normal users if we've enabled
 * kernel profiling and are explicitly not paranoid (so kptr_restrict
 * is clear, and sysctl_perf_event_paranoid isn't set).
 *
 * Otherwise, require CAP_SYSLOG (assuming kptr_restrict isn't set to
 * block even that).
 */
int kallsyms_show_value(void)
{
switch (kptr_restrict) {
case 0:
if (kallsyms_for_perf())
    return 1;
/* fallthrough */
case 1:
if (has_capability_noaudit(current, CAP_SYSLOG))
return 1;
/* fallthrough */
default:
return 0;
}
}


-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] tracing: do not leak kernel addresses

2018-07-26 Thread Nick Desaulniers
On Thu, Jul 26, 2018 at 9:59 AM Nick Desaulniers
 wrote:
>
> On Thu, Jul 26, 2018 at 9:37 AM Steven Rostedt  wrote:
> >
> > On Thu, 26 Jul 2018 09:32:07 -0700
> > Nick Desaulniers  wrote:
> >
> > > On Thu, Jul 26, 2018 at 8:22 AM Steven Rostedt  
> > > wrote:
> > > >
> > > > On Thu, 26 Jul 2018 08:14:08 -0700
> > > > Mark Salyzyn  wrote:
> > > >
> > > > > Thank you Steve, much appreciated feedback, I have asked the security
> > > > > developers to keep this in mind and come up with a correct fix.
> > > > >
> > > > > The correct fix that meets your guidelines would _not_ be suitable for
> > > > > stable due to the invasiveness it sounds, only for the latest will 
> > > > > such
> > > > > a rework make sense. As such, the fix proposed in this patch is the 
> > > > > only
> > > > > one that meets the bar for stable patch simplicity, and merely(!) 
> > > > > needs
> > > > > to state that if the fix is taken, perf and trace are broken.
> > > > >
> > > > > Posting this patch publicly on the lists, that may never be applied, 
> > > > > may
> > > > > be the limit of our responsibility for a fix to stable kernel 
> > > > > releases,
> > > > > to be optionally applied by vendors concerned with this CVE criteria?
> > > > >
> > > >
> > > > The patch breaks the code it touches. It makes it useless.
> > >
> > > Doesn't that depend on kptr_restrict, or would it be broken if
> > > kptr_restrict was set to 0?
> >
> > Is that what governs the output of kallsyms?
>
> From my workstation:
>
> $ cat /proc/kallsyms
>
> prints a bunch of zero'd out addresses, while
>
> $ sudo !!
>
> prints out actual addresses.  Looking at kernel/kallsyms.c, it seems
> that there's no use of %pK, but kallsyms_show_value() switches on
> kptr_restrict (and additional values):
>
> /*
>  * We show kallsyms information even to normal users if we've enabled
>  * kernel profiling and are explicitly not paranoid (so kptr_restrict
>  * is clear, and sysctl_perf_event_paranoid isn't set).
>  *
>  * Otherwise, require CAP_SYSLOG (assuming kptr_restrict isn't set to
>  * block even that).
>  */
> int kallsyms_show_value(void)
> {
> switch (kptr_restrict) {
> case 0:
> if (kallsyms_for_perf())
> return 1;
> /* fallthrough */
> case 1:
> if (has_capability_noaudit(current, CAP_SYSLOG))
> return 1;
> /* fallthrough */
> default:
> return 0;
> }
> }

What are folks thoughts on this:
1. create function show_symbols_for_perf() that replaces
kallsyms_show_value(), maybe in linux/ftrace.c (since linux/ftrace.h
is included in kernel/trace/trace_printk.c and kernel/kallsyms.c).
2. use new show_symbols_for_perf() in kernel/kallsyms.c and
kernel/trace/trace_printk.c

Where the implementation of show_symbols_for_perf() is
kallsyms_show_value() implementation-wise (just renamed since it's no
longer kallsyms specific).  Does that make sense, or should I just
send a patch?  Does it make sense to check whether
kernel/trace/trace_printk.c#t_show() should print an address based on
the same checks done in kallsyms_show_value()?

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] tracing: do not leak kernel addresses

2018-07-27 Thread Nick Desaulniers
On Fri, Jul 27, 2018 at 6:47 AM Steven Rostedt  wrote:
>
> On Fri, 27 Jul 2018 15:40:32 +0200
> Jann Horn  wrote:
>
> > > > But the code doesn't go to dmesg. It's only available
> > > > via /sys/kernel/debug/tracing/printk_formats which is only available
> > > > via root. Nobody else has access to that directory.

Oh, sorry, you're right. We're not printing an address to dmesg, but
to a sysfs node.  If you must have CAP_SYS_ADMIN to read this dir,
then printk's %pK wont save you, because then you can modify
kptr_restrict with sysctl.

> > > I think the point was that when we take capabilities into account the root
> > > privileges aren't unequivocal anymore. The 'root' owned process with only
> > > 'CAP_SYSLOG' shouldn't have access to 
> > > /sys/kernel/debug/tracing/printk_formats
> >
> > Then they shouldn't have access to debugfs at all, right?
>
> That's what I'm thinking.

I found the internal bug report (reported Jan '17, you'll have to
forgive me if my memory of the issue is hazy, or if the fix used at
the time wasn't perfect), which was reported against the Nexus 6.
>From the report, it was possible to `cat
/sys/kernel/debug/tracing/printk_formats` without being root, which I
can't do on my workstations much more modern kernel (Nexus 6 was
3.10).  So I guess the question is what governs access to files below
/sys/kernel/debug, and why was it missing from those kernels?  I
assume some check was added, but either not backported to 3.10 stable
(or more likely not pulled in to Nexus 6's kernel through stable;
Android is now in a much better place for that kind of issue).

--
Thanks,
~Nick Desaulniers


[PATCH] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-30 Thread Nick Desaulniers
Starting with Clang-7.0, _THIS_IP_ generates -Wreturn-stack-address
warnings for almost every translation unit. In general, I'd prefer to
leave this on (returning the address of a stack allocated variable is in
general a bad idea) and disable it only at whitelisted call sites.

We can't do something like:
 #pragma clang diagnostic push
 #pragma clang diagnostic ignored "-Wreturn-stack-address"
 
 #pragma clang diagnostic pop

in a GNU Statement Expression or macro, hence we use _Pragma, which is
its raison d'être: https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html

Cc: sta...@vger.kernel.org # 4.17, 4.14, 4.9, 4.4
Signed-off-by: Nick Desaulniers 
---
 include/linux/kernel.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 941dc0a5a877..5906f5727f90 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -168,7 +168,15 @@
 
 
 #define _RET_IP_   (unsigned long)__builtin_return_address(0)
-#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
+#define _THIS_IP_  (   \
+{  \
+   _Pragma("clang diagnostic push")\
+   _Pragma("clang diagnostic ignored \"-Wreturn-stack-address\"")  \
+   __label__ __here;   \
+__here: (unsigned long)&&__here;   \
+   _Pragma("clang diagnostic pop") \
+}  \
+)
 
 #ifdef CONFIG_LBDAF
 # include 
-- 
2.18.0.233.g985f88cf7e-goog



Re: [PATCH] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-30 Thread Nick Desaulniers
On Mon, Jul 30, 2018 at 11:39 AM Nathan Chancellor
 wrote:
>
> On Mon, Jul 30, 2018 at 10:06:20AM -0700, Nick Desaulniers wrote:
> > Starting with Clang-7.0, _THIS_IP_ generates -Wreturn-stack-address
> > warnings for almost every translation unit. In general, I'd prefer to
> > leave this on (returning the address of a stack allocated variable is in
> > general a bad idea) and disable it only at whitelisted call sites.
> >
> > We can't do something like:
> >  #pragma clang diagnostic push
> >  #pragma clang diagnostic ignored "-Wreturn-stack-address"
> >  
> >  #pragma clang diagnostic pop
> >
> > in a GNU Statement Expression or macro, hence we use _Pragma, which is
> > its raison d'être: https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html
> >
> > Cc: sta...@vger.kernel.org # 4.17, 4.14, 4.9, 4.4
> > Signed-off-by: Nick Desaulniers 
> > ---
> >  include/linux/kernel.h | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > index 941dc0a5a877..5906f5727f90 100644
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -168,7 +168,15 @@
> >
> >
> >  #define _RET_IP_ (unsigned long)__builtin_return_address(0)
> > -#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
> > +#define _THIS_IP_  ( \
> > +{\
> > + _Pragma("clang diagnostic push")\
> > + _Pragma("clang diagnostic ignored \"-Wreturn-stack-address\"")  \
> > + __label__ __here;   \
> > +__here: (unsigned long)&&__here; \
> > + _Pragma("clang diagnostic pop") \
> > +}\
> > +)
> >
> >  #ifdef CONFIG_LBDAF
> >  # include 
> > --
> > 2.18.0.233.g985f88cf7e-goog
> >
>
> This generates a ton of warnings with GCC:
>
> In file included from ./include/linux/spinlock.h:58,
>  from ./include/linux/mmzone.h:8,
>  from ./include/linux/gfp.h:6,
>  from ./include/linux/slab.h:15,
>  from ./include/linux/crypto.h:24,
>  from arch/x86/kernel/asm-offsets.c:9:
> ./include/linux/bottom_half.h: In function ‘local_bh_disable’:
> ./include/linux/bottom_half.h:19: error: ignoring #pragma clang diagnostic 
> [-Werror=unknown-pragmas]
>   __local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
>
> ./include/linux/bottom_half.h:19: error: ignoring #pragma clang diagnostic 
> [-Werror=unknown-pragmas]
> ./include/linux/bottom_half.h:19: error: ignoring #pragma clang diagnostic 
> [-Werror=unknown-pragmas]
> ./include/linux/bottom_half.h: In function ‘local_bh_enable’:
> ./include/linux/bottom_half.h:32: error: ignoring #pragma clang diagnostic 
> [-Werror=unknown-pragmas]
>   __local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
>
> ./include/linux/bottom_half.h:32: error: ignoring #pragma clang diagnostic 
> [-Werror=unknown-pragmas]
> ./include/linux/bottom_half.h:32: error: ignoring #pragma clang diagnostic 
> [-Werror=unknown-pragmas]
> cc1: all warnings being treated as errors
> make[1]: *** [Kbuild:56: arch/x86/kernel/asm-offsets.s] Error 1
> make: *** [Makefile:1081: prepare0] Error 2

Ah, good catch.  I thought I had tested locally with gcc-8, but it
seems that it was likely only godbolt.  I do see the errors locally
when building with gcc-8.

>
> A proper solution is probably going to involve what was done for the
> -Wattribute-alias warnings from GCC 8 in commits 8793bb7f4a9d ("kbuild:
> add macro for controlling warnings to linux/compiler.h") and
> bee20031772a ("disable -Wattribute-alias warning for SYSCALL_DEFINEx()")
>
> I'll take a look at it in a bit unless someone beats me to it.

I'll fix this up, triple check with gcc-8, and attribute you in the
Suggested-by tag.  Thank you for verifying.
-- 
Thanks,
~Nick Desaulniers


[4.4 STABLE BACKPORT] x86: paravirt: make native_save_fl extern inline

2018-07-20 Thread Nick Desaulniers
native_save_fl() is marked static inline, but by using it as
a function pointer in arch/x86/kernel/paravirt.c, it MUST be outlined.

paravirt's use of native_save_fl() also requires that no GPRs other than
%rax are clobbered.

Compilers have different heuristics which they use to emit stack guard
code, the emittance of which can break paravirt's callee saved assumption
by clobbering %rcx.

Marking a function definition extern inline means that if this version
cannot be inlined, then the out-of-line version will be preferred. By
having the out-of-line version be implemented in assembly, it cannot be
instrumented with a stack protector, which might violate custom calling
conventions that code like paravirt rely on.

The semantics of extern inline has changed since gnu89. This means that
folks using GCC versions >= 5.1 may see symbol redefinition errors at
link time for subdirs that override KBUILD_CFLAGS (making the C standard
used implicit) regardless of this patch. This has been cleaned up
earlier in the patch set, but is left as a note in the commit message
for future travelers.

Reports:
https://lkml.org/lkml/2018/5/7/534
https://github.com/ClangBuiltLinux/linux/issues/16

Discussion:
https://bugs.llvm.org/show_bug.cgi?id=37512
https://lkml.org/lkml/2018/5/24/1371

Thanks to the many folks that participated in the discussion.

Acked-by: Juergen Gross 
Debugged-by: Alistair Strachan 
Debugged-by: Matthias Kaehlcke 
Reported-by: Sedat Dilek 
Signed-off-by: Nick Desaulniers 
Suggested-by: Arnd Bergmann 
Suggested-by: H. Peter Anvin 
Suggested-by: Tom Stellar 
Tested-by: Sedat Dilek 
---
Backport for 4.4. 4.4 is missing commit 784d5699eddc "x86: move exports to
actual definitions" which doesn't apply cleanly, and not really worth
backporting IMO. It's simpler to change this patch from upstream:
+ #include 
rather than
+ #include 

 arch/x86/include/asm/irqflags.h |  2 +-
 arch/x86/kernel/Makefile|  1 +
 arch/x86/kernel/irqflags.S  | 26 ++
 3 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/kernel/irqflags.S

diff --git a/arch/x86/include/asm/irqflags.h b/arch/x86/include/asm/irqflags.h
index b77f5edb03b0..0056bc945cd1 100644
--- a/arch/x86/include/asm/irqflags.h
+++ b/arch/x86/include/asm/irqflags.h
@@ -8,7 +8,7 @@
  * Interrupt control:
  */
 
-static inline unsigned long native_save_fl(void)
+extern inline unsigned long native_save_fl(void)
 {
unsigned long flags;
 
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b1b78ffe01d0..7947cee61f61 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -41,6 +41,7 @@ obj-y += alternative.o i8253.o pci-nommu.o 
hw_breakpoint.o
 obj-y  += tsc.o tsc_msr.o io_delay.o rtc.o
 obj-y  += pci-iommu_table.o
 obj-y  += resource.o
+obj-y  += irqflags.o
 
 obj-y  += process.o
 obj-y  += fpu/
diff --git a/arch/x86/kernel/irqflags.S b/arch/x86/kernel/irqflags.S
new file mode 100644
index ..3817eb748eb4
--- /dev/null
+++ b/arch/x86/kernel/irqflags.S
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include 
+#include 
+#include 
+
+/*
+ * unsigned long native_save_fl(void)
+ */
+ENTRY(native_save_fl)
+   pushf
+   pop %_ASM_AX
+   ret
+ENDPROC(native_save_fl)
+EXPORT_SYMBOL(native_save_fl)
+
+/*
+ * void native_restore_fl(unsigned long flags)
+ * %eax/%rdi: flags
+ */
+ENTRY(native_restore_fl)
+   push %_ASM_ARG1
+   popf
+   ret
+ENDPROC(native_restore_fl)
+EXPORT_SYMBOL(native_restore_fl)
-- 
2.18.0.233.g985f88cf7e-goog



Re: [4.4 STABLE BACKPORT] x86: paravirt: make native_save_fl extern inline

2018-07-20 Thread Nick Desaulniers
On Fri, Jul 20, 2018 at 5:04 PM kbuild test robot  wrote:
>
> Hi Nick,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on tip/x86/core]
> [cannot apply to v4.18-rc5 next-20180720]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Nick-Desaulniers/x86-paravirt-make-native_save_fl-extern-inline/20180721-073112
> config: i386-tinyconfig (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386
>
> All errors (new ones prefixed by >>):
>
>arch/x86/kernel/irqflags.S: Assembler messages:
> >> arch/x86/kernel/irqflags.S:22: Error: bad register name `%_ASM_ARG1'
>
> vim +22 arch/x86/kernel/irqflags.S
>
> 16
> 17  /*
> 18   * void native_restore_fl(unsigned long flags)
> 19   * %eax/%rdi: flags
> 20   */
> 21  ENTRY(native_restore_fl)
>   > 22  push %_ASM_ARG1
>
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all       Intel Corporation

oops, probably should not have cc'ed vger.kernel.org for a patch meant
for stable.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-30 Thread Nick Desaulniers
On Mon, Jul 30, 2018 at 1:01 PM Nathan Chancellor
 wrote:
>
> On Mon, Jul 30, 2018 at 12:48:06PM -0700, Nick Desaulniers wrote:
> > On Mon, Jul 30, 2018 at 11:39 AM Nathan Chancellor
> >  wrote:
> > >
> > > On Mon, Jul 30, 2018 at 10:06:20AM -0700, Nick Desaulniers wrote:
> > > > Starting with Clang-7.0, _THIS_IP_ generates -Wreturn-stack-address
> > > > warnings for almost every translation unit. In general, I'd prefer to
> > > > leave this on (returning the address of a stack allocated variable is in
> > > > general a bad idea) and disable it only at whitelisted call sites.
> > > >
> > > > We can't do something like:
> > > >  #pragma clang diagnostic push
> > > >  #pragma clang diagnostic ignored "-Wreturn-stack-address"
> > > >  
> > > >  #pragma clang diagnostic pop
> > > >
> > > > in a GNU Statement Expression or macro, hence we use _Pragma, which is
> > > > its raison d'être: https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html
> > > >
> > > > Cc: sta...@vger.kernel.org # 4.17, 4.14, 4.9, 4.4
> > > > Signed-off-by: Nick Desaulniers 
> > > > ---
> > > >  include/linux/kernel.h | 10 +-
> > > >  1 file changed, 9 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> > > > index 941dc0a5a877..5906f5727f90 100644
> > > > --- a/include/linux/kernel.h
> > > > +++ b/include/linux/kernel.h
> > > > @@ -168,7 +168,15 @@
> > > >
> > > >
> > > >  #define _RET_IP_ (unsigned long)__builtin_return_address(0)
> > > > -#define _THIS_IP_  ({ __label__ __here; __here: (unsigned 
> > > > long)&&__here; })
> > > > +#define _THIS_IP_  ( \
> > > > +{\
> > > > + _Pragma("clang diagnostic push")\
> > > > + _Pragma("clang diagnostic ignored \"-Wreturn-stack-address\"")  \
> > > > + __label__ __here;   \
> > > > +__here: (unsigned long)&&__here; \
> > > > + _Pragma("clang diagnostic pop") \
> > > > +}\
> > > > +)
> > > >
> > > >  #ifdef CONFIG_LBDAF
> > > >  # include 
> > > > --
> > > > 2.18.0.233.g985f88cf7e-goog
> > > >
> > >
> > > This generates a ton of warnings with GCC:
> > >
> > > In file included from ./include/linux/spinlock.h:58,
> > >  from ./include/linux/mmzone.h:8,
> > >  from ./include/linux/gfp.h:6,
> > >  from ./include/linux/slab.h:15,
> > >  from ./include/linux/crypto.h:24,
> > >  from arch/x86/kernel/asm-offsets.c:9:
> > > ./include/linux/bottom_half.h: In function ‘local_bh_disable’:
> > > ./include/linux/bottom_half.h:19: error: ignoring #pragma clang 
> > > diagnostic [-Werror=unknown-pragmas]
> > >   __local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
> > >
> > > ./include/linux/bottom_half.h:19: error: ignoring #pragma clang 
> > > diagnostic [-Werror=unknown-pragmas]
> > > ./include/linux/bottom_half.h:19: error: ignoring #pragma clang 
> > > diagnostic [-Werror=unknown-pragmas]
> > > ./include/linux/bottom_half.h: In function ‘local_bh_enable’:
> > > ./include/linux/bottom_half.h:32: error: ignoring #pragma clang 
> > > diagnostic [-Werror=unknown-pragmas]
> > >   __local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
> > >
> > > ./include/linux/bottom_half.h:32: error: ignoring #pragma clang 
> > > diagnostic [-Werror=unknown-pragmas]
> > > ./include/linux/bottom_half.h:32: error: ignoring #pragma clang 
> > > diagnostic [-Werror=unknown-pragmas]
> > > cc1: all warnings being treated as errors
> > > make[1]: *** [Kbuild:56: arch/x86/kernel/asm-offsets.s] Error 1
> > > make: *** [Makefile:1081: prepare0] Error 2
> >
> > Ah, good catch.  I thought I had tested locally with gcc-8, but it
> > seems that it was likely only godbolt.  I do see the errors locally
> > when building 

[PATCH v2 1/2] compiler-clang.h: Add CLANG_VERSION and __diag macros

2018-07-30 Thread Nick Desaulniers
These are needed for doing proper version checks, though feature
detection via __has_attribute,  __has_builtin, and __has_feature should
be preferred, see:
https://clang.llvm.org/docs/LanguageExtensions.html#feature-checking-macros

Also adds __diag support, for generating compiler version specific
_Pragma()'s.

__diag support based on commit 8793bb7f4a9d ("kbuild: add macro for
controlling warnings to linux/compiler.h")

Cc: sta...@vger.kernel.org # 4.17, 4.14, 4.9, 4.4
Suggested-by: Nathan Chancellor 
Signed-off-by: Nick Desaulniers 
---
 include/linux/compiler-clang.h | 19 +++
 include/linux/compiler_types.h |  4 
 2 files changed, 23 insertions(+)

diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index 7087446c24c8..9442e07a361e 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -7,6 +7,10 @@
  * for Clang compiler
  */
 
+#define CLANG_VERSION (__clang_major__ * 1 \
+  + __clang_minor__ * 100  \
+  + __clang_patchlevel__)
+
 #ifdef uninitialized_var
 #undef uninitialized_var
 #define uninitialized_var(x) x = *(&(x))
@@ -46,3 +50,18 @@
 __has_builtin(__builtin_sub_overflow)
 #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
 #endif
+
+#define __diag_str1(s) #s
+#define __diag_str(s) __diag_str1(s)
+#define __diag(s) _Pragma(__diag_str(clang diagnostic s))
+#define __diag_CLANG_ignore ignored
+#define __diag_CLANG_warn warning
+#define __diag_CLANG_error error
+#define __diag_CLANG(version, severity, s) \
+   __diag_CLANG_ ## version(__diag_CLANG_ ## severity s)
+
+#if CLANG_VERSION >= 7
+#define __diag_CLANG_7(s) __diag(s)
+#else
+#define __diag_CLANG_7(s)
+#endif
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index a8ba6b04152c..a04e6bd63476 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -279,6 +279,10 @@ struct ftrace_likely_data {
 #define __diag_GCC(version, severity, string)
 #endif
 
+#ifndef __diag_CLANG
+#define __diag_CLANG(version, severity, string)
+#endif
+
 #define __diag_push()  __diag(push)
 #define __diag_pop()   __diag(pop)
 
-- 
2.18.0.345.g5c9ce644c3-goog



[PATCH v2 0/2] CLANG_VERSION and __diag macros

2018-07-30 Thread Nick Desaulniers
clang-7 has a new warning (-Wreturn-stack-address) for warning when a
function returns the address of a local variable.  This is in general a
good warning, but the kernel has a few places where GNU statement
expressions return the address of a label in order to get the current
instruction pointer (see _THIS_IP_ and current_text_addr).

In order to disable a warning at a single call site, the kernel already
has __diag macros for inserting compiler and compiler-version specific
_Pragma's.

This series adds CLANG_VERSION macros necessary for proper __diag
support, and whitelists the case in _THIS_IP_. current_text_addr will be
consolidated in a follow up series.

Nick Desaulniers (2):
  compiler-clang.h: Add CLANG_VERSION and __diag macros
  kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

 include/linux/compiler-clang.h | 19 +++
 include/linux/compiler_types.h |  4 
 include/linux/kernel.h | 10 +-
 3 files changed, 32 insertions(+), 1 deletion(-)

-- 
2.18.0.345.g5c9ce644c3-goog



[PATCH v2 2/2] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-30 Thread Nick Desaulniers
Starting with Clang-7.0, _THIS_IP_ generates -Wreturn-stack-address
warnings for almost every translation unit. In general, I'd prefer to
leave this on (returning the address of a stack allocated variable is in
general a bad idea) and disable it only at whitelisted call sites.

We can't do something like:
 #pragma clang diagnostic push
 #pragma clang diagnostic ignored "-Wreturn-stack-address"
 
 #pragma clang diagnostic pop

in a GNU Statement Expression or macro, hence we use _Pragma, which is
its raison d'être: https://gcc.gnu.org/onlinedocs/cpp/Pragmas.html

We also can't use compiler specific pragma's without triggering
-Werror=unknown-pragmas in other compilers, so use __diag.

Cc: sta...@vger.kernel.org # 4.17, 4.14, 4.9, 4.4
Suggested-by: Nathan Chancellor 
Signed-off-by: Nick Desaulniers 
---
 include/linux/kernel.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 941dc0a5a877..909bb771c0ca 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -168,7 +168,15 @@
 
 
 #define _RET_IP_   (unsigned long)__builtin_return_address(0)
-#define _THIS_IP_  ({ __label__ __here; __here: (unsigned long)&&__here; })
+#define _THIS_IP_  (   \
+{  \
+   __label__ __here;   \
+   __diag_push()   \
+   __diag_ignore(CLANG, 7, "-Wreturn-stack-address", "")   \
+__here: (unsigned long)&&__here;   \
+   __diag_pop()\
+}  \
+)
 
 #ifdef CONFIG_LBDAF
 # include 
-- 
2.18.0.345.g5c9ce644c3-goog



Re: [PATCH] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-31 Thread Nick Desaulniers
I think this is the v1 of this patch, also reported by Nathan
Chancellor? Hard to tell as the github link is dead.  Does 0-day
delete its branches if there's a v2?
On Mon, Jul 30, 2018 at 11:50 PM kbuild test robot  wrote:
>
> Hi Nick,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on linus/master]
> [also build test WARNING on v4.18-rc7 next-20180727]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Nick-Desaulniers/kernel-h-Disable-Wreturn-stack-address-for-_THIS_IP_/20180731-135818
> config: i386-tinyconfig (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=i386
>
> All warnings (new ones prefixed by >>):
>
>In file included from include/linux/spinlock.h:58:0,
> from include/linux/mmzone.h:8,
> from include/linux/gfp.h:6,
> from include/linux/slab.h:15,
> from include/linux/crypto.h:24,
> from arch/x86/kernel/asm-offsets.c:9:
>include/linux/bottom_half.h: In function 'local_bh_disable':
> >> include/linux/bottom_half.h:19:0: warning: ignoring #pragma clang 
> >> diagnostic [-Wunknown-pragmas]
>  __local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
>
> >> include/linux/bottom_half.h:19:0: warning: ignoring #pragma clang 
> >> diagnostic [-Wunknown-pragmas]
> >> include/linux/bottom_half.h:19:0: warning: ignoring #pragma clang 
> >> diagnostic [-Wunknown-pragmas]
>include/linux/bottom_half.h: In function 'local_bh_enable':
>include/linux/bottom_half.h:32:0: warning: ignoring #pragma clang 
> diagnostic [-Wunknown-pragmas]
>  __local_bh_enable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
>
>include/linux/bottom_half.h:32:0: warning: ignoring #pragma clang 
> diagnostic [-Wunknown-pragmas]
>include/linux/bottom_half.h:32:0: warning: ignoring #pragma clang 
> diagnostic [-Wunknown-pragmas]
>In file included from include/linux/mmzone.h:8:0,
> from include/linux/gfp.h:6,
> from include/linux/slab.h:15,
> from include/linux/crypto.h:24,
> from arch/x86/kernel/asm-offsets.c:9:
>include/linux/spinlock.h: In function 'spin_lock_bh':
> >> include/linux/spinlock.h:315:0: warning: ignoring #pragma clang diagnostic 
> >> [-Wunknown-pragmas]
>  raw_spin_lock_bh(&lock->rlock);
>
> >> include/linux/spinlock.h:315:0: warning: ignoring #pragma clang diagnostic 
> >> [-Wunknown-pragmas]
> >> include/linux/spinlock.h:315:0: warning: ignoring #pragma clang diagnostic 
> >> [-Wunknown-pragmas]
>include/linux/spinlock.h: In function 'spin_unlock_bh':
>include/linux/spinlock.h:355:0: warning: ignoring #pragma clang diagnostic 
> [-Wunknown-pragmas]
>  raw_spin_unlock_bh(&lock->rlock);
>
>include/linux/spinlock.h:355:0: warning: ignoring #pragma clang diagnostic 
> [-Wunknown-pragmas]
>include/linux/spinlock.h:355:0: warning: ignoring #pragma clang diagnostic 
> [-Wunknown-pragmas]
>include/linux/spinlock.h: In function 'spin_trylock_bh':
>include/linux/spinlock.h:370:0: warning: ignoring #pragma clang diagnostic 
> [-Wunknown-pragmas]
>  return raw_spin_trylock_bh(&lock->rlock);
>
>include/linux/spinlock.h:370:0: warning: ignoring #pragma clang diagnostic 
> [-Wunknown-pragmas]
>include/linux/spinlock.h:370:0: warning: ignoring #pragma clang diagnostic 
> [-Wunknown-pragmas]
> --
>In file included from include/linux/rcupdate.h:41:0,
> from include/linux/rculist.h:11,
> from include/linux/pid.h:5,
> from include/linux/sched.h:14,
> from kernel//sched/sched.h:5,
> from kernel//sched/core.c:8:
>include/linux/bottom_half.h: In function 'local_bh_disable':
> >> include/linux/bottom_half.h:19:0: warning: ignoring #pragma clang 
> >> diagnostic [-Wunknown-pragmas]
>  __local_bh_disable_ip(_THIS_IP_, SOFTIRQ_DISABLE_OFFSET);
>
> >> include/linux/bottom_half.h:19:0: warning: ignoring #pragma clang 
> >> diagnostic [-Wunknown-pragmas]
> >> include/linux/bottom_half.h:19:0: warning: ignoring #pragma clang 
> >> diagnostic [-Wunknown-pragmas]
>include/linux/bottom_half.h: In function 'local_bh_enable':
>include/linux/bottom_half.

Re: [PATCH v2 2/2] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-31 Thread Nick Desaulniers
Does anyone understand this error?  The code looks just fine to me,
and the source file doesn't conflict with any of the macros I've added
(certainly not in any way that could cause an indentation error as
reported here).
On Tue, Jul 31, 2018 at 3:27 AM kbuild test robot  wrote:
>
> Hi Nick,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on linus/master]
> [also build test WARNING on v4.18-rc7 next-20180727]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Nick-Desaulniers/compiler-clang-h-Add-CLANG_VERSION-and-__diag-macros/20180731-161932
> config: x86_64-fedora-25 (attached as .config)
> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64
>
> All warnings (new ones prefixed by >>):
>
>drivers/net//wireless/intel/iwlwifi/iwl-trans.c: In function 
> 'iwl_trans_send_cmd':
> >> drivers/net//wireless/intel/iwlwifi/iwl-trans.c:137:2: warning: this 'if' 
> >> clause does not guard... [-Wmisleading-indentation]
>  if (!(cmd->flags & CMD_ASYNC))
>  ^~
>drivers/net//wireless/intel/iwlwifi/iwl-trans.c:138:1: note: ...this 
> statement, but the latter is misleadingly indented as if it were guarded by 
> the 'if'
>   lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
> ^ ~
>
> vim +/if +137 drivers/net//wireless/intel/iwlwifi/iwl-trans.c
>
> 92fe8343 Emmanuel Grumbach 2015-12-01  116
> 92fe8343 Emmanuel Grumbach 2015-12-01  117  int iwl_trans_send_cmd(struct 
> iwl_trans *trans, struct iwl_host_cmd *cmd)
> 92fe8343 Emmanuel Grumbach 2015-12-01  118  {
> 92fe8343 Emmanuel Grumbach 2015-12-01  119  int ret;
> 92fe8343 Emmanuel Grumbach 2015-12-01  120
> 92fe8343 Emmanuel Grumbach 2015-12-01  121  if (unlikely(!(cmd->flags & 
> CMD_SEND_IN_RFKILL) &&
> 326477e4 Johannes Berg 2017-04-25  122   
> test_bit(STATUS_RFKILL_OPMODE, &trans->status)))
> 92fe8343 Emmanuel Grumbach 2015-12-01  123  return -ERFKILL;
> 92fe8343 Emmanuel Grumbach 2015-12-01  124
> 92fe8343 Emmanuel Grumbach 2015-12-01  125  if 
> (unlikely(test_bit(STATUS_FW_ERROR, &trans->status)))
> 92fe8343 Emmanuel Grumbach 2015-12-01  126  return -EIO;
> 92fe8343 Emmanuel Grumbach 2015-12-01  127
> 92fe8343 Emmanuel Grumbach 2015-12-01  128  if (unlikely(trans->state != 
> IWL_TRANS_FW_ALIVE)) {
> 92fe8343 Emmanuel Grumbach 2015-12-01  129  IWL_ERR(trans, "%s 
> bad state = %d\n", __func__, trans->state);
> 92fe8343 Emmanuel Grumbach 2015-12-01  130  return -EIO;
> 92fe8343 Emmanuel Grumbach 2015-12-01  131  }
> 92fe8343 Emmanuel Grumbach 2015-12-01  132
> 92fe8343 Emmanuel Grumbach 2015-12-01  133  if (WARN_ON((cmd->flags & 
> CMD_WANT_ASYNC_CALLBACK) &&
> 92fe8343 Emmanuel Grumbach 2015-12-01  134  !(cmd->flags & 
> CMD_ASYNC)))
> 92fe8343 Emmanuel Grumbach 2015-12-01  135  return -EINVAL;
> 92fe8343 Emmanuel Grumbach 2015-12-01  136
> 92fe8343 Emmanuel Grumbach 2015-12-01 @137  if (!(cmd->flags & CMD_ASYNC))
> 92fe8343 Emmanuel Grumbach 2015-12-01  138  
> lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
> 92fe8343 Emmanuel Grumbach 2015-12-01  139
> 5b88792c Sara Sharon   2016-08-15  140  if (trans->wide_cmd_header && 
> !iwl_cmd_groupid(cmd->id))
> 5b88792c Sara Sharon   2016-08-15  141  cmd->id = 
> DEF_ID(cmd->id);
> 5b88792c Sara Sharon   2016-08-15  142
> 92fe8343 Emmanuel Grumbach 2015-12-01  143  ret = 
> trans->ops->send_cmd(trans, cmd);
> 92fe8343 Emmanuel Grumbach 2015-12-01  144
> 92fe8343 Emmanuel Grumbach 2015-12-01  145  if (!(cmd->flags & CMD_ASYNC))
> 92fe8343 Emmanuel Grumbach 2015-12-01  146  
> lock_map_release(&trans->sync_cmd_lockdep_map);
> 92fe8343 Emmanuel Grumbach 2015-12-01  147
> 0ec971fd Johannes Berg 2017-04-10  148  if (WARN_ON((cmd->flags & 
> CMD_WANT_SKB) && !ret && !cmd->resp_pkt))
> 0ec971fd Johannes Berg 2017-04-10  149  return -EIO;
> 0ec971fd Johannes Berg 2017-04-10  150
> 92fe8343 Emmanuel Grumbach 2015-12-01  151  return ret;
> 92fe8343 Emmanuel Grumbach 2015-12-01  152  }
> 92fe8343 Emmanuel Grumbach 2015-12-01  153  
> IWL_EXPORT_SYMBOL(iwl_trans_send_cmd);
> 39bdb17e Sharon Dvir   2015-10-15  154
>
> :: The code at line 137 was first introduced by commit
> :: 92fe83430b899b786c837e5b716a328220d47ae5 iwlwifi: uninline 
> iwl_trans_send_cmd
>
> :: TO: Emmanuel Grumbach 
> :: CC: Emmanuel Grumbach 
>
> ---
> 0-DAY kernel test infrastructureOpen Source Technology Center
> https://lists.01.org/pipermail/kbuild-all   Intel Corporation



-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v2 2/2] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-31 Thread Nick Desaulniers
Another strange error from 0-day bot.  The source file does not make
use of my added macros, so not sure how they could generate such a
warning.
On Tue, Jul 31, 2018 at 6:54 AM kbuild test robot  wrote:
>
> Hi Nick,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on linus/master]
> [also build test ERROR on v4.18-rc7 next-20180727]
> [if your patch is applied to the wrong git tree, please drop us a note to 
> help improve the system]
>
> url:
> https://github.com/0day-ci/linux/commits/Nick-Desaulniers/compiler-clang-h-Add-CLANG_VERSION-and-__diag-macros/20180731-161932
> config: x86_64-randconfig-s1-07312048 (attached as .config)
> compiler: gcc-6 (Debian 6.4.0-9) 6.4.0 20171026
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64
>
> All errors (new ones prefixed by >>):
>
>drivers/gpu//drm/i915/i915_gem.c: In function 
> '__i915_gem_object_set_pages':
> >> drivers/gpu//drm/i915/i915_gem.c:2697:1: error: expected expression before 
> >> '#pragma'
>  GEM_BUG_ON(!HAS_PAGE_SIZES(i915, obj->mm.page_sizes.sg));
> ^~~
>
> vim +2697 drivers/gpu//drm/i915/i915_gem.c
>
> 03ac84f18 Chris Wilson 2016-10-28  2658
> 03ac84f18 Chris Wilson 2016-10-28  2659  void 
> __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
> a5c081662 Matthew Auld 2017-10-06  2660  
> struct sg_table *pages,
> 84e8978e6 Matthew Auld 2017-10-09  2661  
> unsigned int sg_page_sizes)
> 03ac84f18 Chris Wilson 2016-10-28  2662  {
> a5c081662 Matthew Auld 2017-10-06  2663 struct drm_i915_private *i915 
> = to_i915(obj->base.dev);
> a5c081662 Matthew Auld 2017-10-06  2664 unsigned long supported = 
> INTEL_INFO(i915)->page_sizes;
> a5c081662 Matthew Auld 2017-10-06  2665 int i;
> a5c081662 Matthew Auld 2017-10-06  2666
> 1233e2db1 Chris Wilson 2016-10-28  2667 
> lockdep_assert_held(&obj->mm.lock);
> 03ac84f18 Chris Wilson 2016-10-28  2668
> 03ac84f18 Chris Wilson 2016-10-28  2669 obj->mm.get_page.sg_pos = 
> pages->sgl;
> 03ac84f18 Chris Wilson 2016-10-28  2670 obj->mm.get_page.sg_idx = 0;
> 03ac84f18 Chris Wilson 2016-10-28  2671
> 03ac84f18 Chris Wilson 2016-10-28  2672 obj->mm.pages = pages;
> 2c3a3f44d Chris Wilson 2016-11-04  2673
> 2c3a3f44d Chris Wilson 2016-11-04  2674 if 
> (i915_gem_object_is_tiled(obj) &&
> f2123818f Chris Wilson 2017-10-16  2675 i915->quirks & 
> QUIRK_PIN_SWIZZLED_PAGES) {
> 2c3a3f44d Chris Wilson 2016-11-04  2676 
> GEM_BUG_ON(obj->mm.quirked);
> 2c3a3f44d Chris Wilson 2016-11-04  2677 
> __i915_gem_object_pin_pages(obj);
> 2c3a3f44d Chris Wilson 2016-11-04  2678 obj->mm.quirked = 
> true;
> 2c3a3f44d Chris Wilson 2016-11-04  2679 }
> a5c081662 Matthew Auld 2017-10-06  2680
> 84e8978e6 Matthew Auld 2017-10-09  2681 GEM_BUG_ON(!sg_page_sizes);
> 84e8978e6 Matthew Auld 2017-10-09  2682 obj->mm.page_sizes.phys = 
> sg_page_sizes;
> a5c081662 Matthew Auld 2017-10-06  2683
> a5c081662 Matthew Auld 2017-10-06  2684 /*
> 84e8978e6 Matthew Auld 2017-10-09  2685  * Calculate the supported 
> page-sizes which fit into the given
> 84e8978e6 Matthew Auld 2017-10-09  2686  * sg_page_sizes. This will 
> give us the page-sizes which we may be able
> 84e8978e6 Matthew Auld 2017-10-09  2687  * to use opportunistically 
> when later inserting into the GTT. For
> 84e8978e6 Matthew Auld 2017-10-09  2688  * example if phys=2G, then 
> in theory we should be able to use 1G, 2M,
> 84e8978e6 Matthew Auld 2017-10-09  2689  * 64K or 4K pages, although 
> in practice this will depend on a number of
> 84e8978e6 Matthew Auld 2017-10-09  2690  * other factors.
> a5c081662 Matthew Auld 2017-10-06  2691  */
> a5c081662 Matthew Auld 2017-10-06  2692 obj->mm.page_sizes.sg = 0;
> a5c081662 Matthew Auld 2017-10-06  2693 for_each_set_bit(i, 
> &supported, ilog2(I915_GTT_MAX_PAGE_SIZE) + 1) {
> a5c081662 Matthew Auld 2017-10-06  2694 if 
> (obj->mm.page_sizes.phys & ~0u << i)
> a5c081662 Matthew Auld 2017-10-06  2695 
> obj->mm.page_sizes.sg |= BIT(i);
> a5c081662 Matthew Auld 2017-10-06  2696 }
> a5c081662 Matthew Auld 2017-10-06 @2697 
> GEM_BUG_ON(!HAS_PAGE_SIZES(i915, obj->mm.page_sizes.sg));
> f2123818f Chris Wilson 2017-10-16  2698
> f2123818f Chris Wilson 2017-10-16  2699 spin_lock(&i915->mm.obj_lo

Re: [PATCH v2 2/2] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-31 Thread Nick Desaulniers
On Tue, Jul 31, 2018 at 10:02 AM Kees Cook  wrote:
>
> On Tue, Jul 31, 2018 at 9:48 AM, Nick Desaulniers
>  wrote:
> > Does anyone understand this error?  The code looks just fine to me,
> > and the source file doesn't conflict with any of the macros I've added
> > (certainly not in any way that could cause an indentation error as
> > reported here).
>
> It does _use_ the macro, but yeah, I'm stumped...
>
> > On Tue, Jul 31, 2018 at 3:27 AM kbuild test robot  wrote:
> >>drivers/net//wireless/intel/iwlwifi/iwl-trans.c: In function 
> >> 'iwl_trans_send_cmd':
> >> >> drivers/net//wireless/intel/iwlwifi/iwl-trans.c:137:2: warning: this 
> >> >> 'if' clause does not guard... [-Wmisleading-indentation]
> >>  if (!(cmd->flags & CMD_ASYNC))
> >>  ^~
> >>drivers/net//wireless/intel/iwlwifi/iwl-trans.c:138:1: note: ...this 
> >> statement, but the latter is misleadingly indented as if it were guarded 
> >> by the 'if'
> >>   lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
> >> ^ ~
> >>
> >> vim +/if +137 drivers/net//wireless/intel/iwlwifi/iwl-trans.c
> >>
> >> 92fe8343 Emmanuel Grumbach 2015-12-01 @137  if (!(cmd->flags & 
> >> CMD_ASYNC))
> >> 92fe8343 Emmanuel Grumbach 2015-12-01  138  
> >> lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
>
> #define lock_map_acquire_read(l)
> lock_acquire_shared_recursive(l, 0, 0, NULL, _THIS_IP_)
>
> #define lock_acquire_shared_recursive(l, s, t, n, i)
> lock_acquire(l, s, t, 2, 1, n, i)
>
> The config doesn't have CONFIG_LOCKDEP, so it's not:
>
> extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>  int trylock, int read, int check,
>  struct lockdep_map *nest_lock, unsigned long ip);
>
> but rather:
>
> # define lock_acquire(l, s, t, r, c, n, i)  do { } while (0)
>
> If you build with the same gcc and config, does it reproduce for you?

Indeed, gcc 6, 7, or 8 + the provided config reproduce the issue.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v2 2/2] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-31 Thread Nick Desaulniers
On Tue, Jul 31, 2018 at 10:02 AM Kees Cook  wrote:
>
> On Tue, Jul 31, 2018 at 9:48 AM, Nick Desaulniers
> > On Tue, Jul 31, 2018 at 3:27 AM kbuild test robot  wrote:
> >>drivers/net//wireless/intel/iwlwifi/iwl-trans.c: In function 
> >> 'iwl_trans_send_cmd':
> >> >> drivers/net//wireless/intel/iwlwifi/iwl-trans.c:137:2: warning: this 
> >> >> 'if' clause does not guard... [-Wmisleading-indentation]
> >>  if (!(cmd->flags & CMD_ASYNC))
> >>  ^~
> >>drivers/net//wireless/intel/iwlwifi/iwl-trans.c:138:1: note: ...this 
> >> statement, but the latter is misleadingly indented as if it were guarded 
> >> by the 'if'
> >>   lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
> >> ^ ~
> >>
> >> vim +/if +137 drivers/net//wireless/intel/iwlwifi/iwl-trans.c
> >>
> >> 92fe8343 Emmanuel Grumbach 2015-12-01 @137  if (!(cmd->flags & 
> >> CMD_ASYNC))
> >> 92fe8343 Emmanuel Grumbach 2015-12-01  138  
> >> lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
>
> #define lock_map_acquire_read(l)
> lock_acquire_shared_recursive(l, 0, 0, NULL, _THIS_IP_)
>
> #define lock_acquire_shared_recursive(l, s, t, n, i)
> lock_acquire(l, s, t, 2, 1, n, i)
>
> The config doesn't have CONFIG_LOCKDEP, so it's not:
>
> extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
>  int trylock, int read, int check,
>  struct lockdep_map *nest_lock, unsigned long ip);
>
> but rather:
>
> # define lock_acquire(l, s, t, r, c, n, i)  do { } while (0)

This is tricky, if I preprocess that translation unit with the exact
flags used during compilation, I get:

```
 if (!(cmd->flags & CMD_ASYNC))

#pragma GCC diagnostic push

#pragma GCC diagnostic pop
 do { } while (0);
```

Which is not enough to trigger -Wmisleading-indentation alone.  It is
curious that if we add braces to that if statement (as Nathan notes in
a sibling post) or removing the pop (not shippable) seems to fix the
warning.

-- 
Thanks,
~Nick Desaulniers


Re: [PATCH v2 2/2] kernel.h: Disable -Wreturn-stack-address for _THIS_IP_

2018-07-31 Thread Nick Desaulniers
On Tue, Jul 31, 2018 at 11:58 AM Nick Desaulniers
 wrote:
>
> On Tue, Jul 31, 2018 at 10:02 AM Kees Cook  wrote:
> >
> > On Tue, Jul 31, 2018 at 9:48 AM, Nick Desaulniers
> > > On Tue, Jul 31, 2018 at 3:27 AM kbuild test robot  wrote:
> > >>drivers/net//wireless/intel/iwlwifi/iwl-trans.c: In function 
> > >> 'iwl_trans_send_cmd':
> > >> >> drivers/net//wireless/intel/iwlwifi/iwl-trans.c:137:2: warning: this 
> > >> >> 'if' clause does not guard... [-Wmisleading-indentation]
> > >>  if (!(cmd->flags & CMD_ASYNC))
> > >>  ^~
> > >>drivers/net//wireless/intel/iwlwifi/iwl-trans.c:138:1: note: ...this 
> > >> statement, but the latter is misleadingly indented as if it were guarded 
> > >> by the 'if'
> > >>   lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
> > >> ^ ~
> > >>
> > >> vim +/if +137 drivers/net//wireless/intel/iwlwifi/iwl-trans.c
> > >>
> > >> 92fe8343 Emmanuel Grumbach 2015-12-01 @137  if (!(cmd->flags & 
> > >> CMD_ASYNC))
> > >> 92fe8343 Emmanuel Grumbach 2015-12-01  138  
> > >> lock_map_acquire_read(&trans->sync_cmd_lockdep_map);
> >
> > #define lock_map_acquire_read(l)
> > lock_acquire_shared_recursive(l, 0, 0, NULL, _THIS_IP_)
> >
> > #define lock_acquire_shared_recursive(l, s, t, n, i)
> > lock_acquire(l, s, t, 2, 1, n, i)
> >
> > The config doesn't have CONFIG_LOCKDEP, so it's not:
> >
> > extern void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
> >  int trylock, int read, int check,
> >  struct lockdep_map *nest_lock, unsigned long ip);
> >
> > but rather:
> >
> > # define lock_acquire(l, s, t, r, c, n, i)  do { } while (0)
>
> This is tricky, if I preprocess that translation unit with the exact
> flags used during compilation, I get:
>
> ```
>  if (!(cmd->flags & CMD_ASYNC))
>
> #pragma GCC diagnostic push
>
> #pragma GCC diagnostic pop
>  do { } while (0);
> ```
>
> Which is not enough to trigger -Wmisleading-indentation alone.  It is
> curious that if we add braces to that if statement (as Nathan notes in
> a sibling post) or removing the pop (not shippable) seems to fix the
> warning.

Something fishy is going on here: https://godbolt.org/g/b5dsqH

It seems that gcc's warning is technically correct, but it seems to be
a miscompile as puts() in my reduced test case is called
unconditionally.  I've filed:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86765

In the meanwhile, I've reworked the patch to change _THIS_IP_ to a
only contain a function call, to a new static inline function which
does what the statement expression used to.  This now triggers
-Wreturn-local-addr warnings in gcc, which is a warning added in
gcc-4.8, so I need to add another __diag_ignore, and case for gcc 4.8
to include/linux/compiler-gcc.h.

At this point, I think I might as well consolidate current_text_addr()
and _THIS_IP_.  Stay tuned for v3.

--
Thanks,
~Nick Desaulniers


Getting the instruction pointer on a per arch basis

2018-07-31 Thread Nick Desaulniers
I'm currently looking into cleaning up the code duplication between
current_text_addr() and _THIS_IP_, virtually every implementation of
current_text_addr() and _THIS_IP_ itself are basically:

#define _THIS_IP_ ({ __label__ _l; _l: &&_l; })

For a few arch's, they have inline assembly instead (for
current_text_addr()).  Examples:
* s390
* sh
* ia64
* x86 (um and 32b)
* c6x
* sparc

I have a patch that cuts down on the duplication, but I don't
understand why the few arch specific implementations are necessary.  I
could reduce the duplication further if it's ok to just use the
statement expression.

Does anyone know why this is the case?
-- 
Thanks,
~Nick Desaulniers


Re: Getting the instruction pointer on a per arch basis

2018-07-31 Thread Nick Desaulniers
+ More maintainers and lists for visibility

On Tue, Jul 31, 2018 at 3:32 PM Nick Desaulniers
 wrote:
>
> I'm currently looking into cleaning up the code duplication between
> current_text_addr() and _THIS_IP_, virtually every implementation of
> current_text_addr() and _THIS_IP_ itself are basically:
>
> #define _THIS_IP_ ({ __label__ _l; _l: &&_l; })
>
> For a few arch's, they have inline assembly instead (for
> current_text_addr()).  Examples:
> * s390
> * sh
> * ia64
> * x86 (um and 32b)
> * c6x
> * sparc
>
> I have a patch that cuts down on the duplication, but I don't
> understand why the few arch specific implementations are necessary.  I
> could reduce the duplication further if it's ok to just use the
> statement expression.
>
> Does anyone know why this is the case?
> --
> Thanks,
> ~Nick Desaulniers


Re: Getting the instruction pointer on a per arch basis

2018-08-01 Thread Nick Desaulniers
On Tue, Jul 31, 2018 at 10:41 PM Martin Schwidefsky
 wrote:
>
> On Tue, 31 Jul 2018 16:09:06 -0700
> Nick Desaulniers  wrote:
>
> > + More maintainers and lists for visibility
> >
> > On Tue, Jul 31, 2018 at 3:32 PM Nick Desaulniers
> >  wrote:
> > >
> > > I'm currently looking into cleaning up the code duplication between
> > > current_text_addr() and _THIS_IP_, virtually every implementation of
> > > current_text_addr() and _THIS_IP_ itself are basically:
> > >
> > > #define _THIS_IP_ ({ __label__ _l; _l: &&_l; })
> > >
> > > For a few arch's, they have inline assembly instead (for
> > > current_text_addr()).  Examples:
> > > * s390
> > > * sh
> > > * ia64
> > > * x86 (um and 32b)
> > > * c6x
> > > * sparc
> > >
> > > I have a patch that cuts down on the duplication, but I don't
> > > understand why the few arch specific implementations are necessary.  I
> > > could reduce the duplication further if it's ok to just use the
> > > statement expression.
> > >
> > > Does anyone know why this is the case?
>
> For s390 it is just that we did not know about the label trick when we
> introduced the define. The inline has an advantage though, the code
> generated with the label trick is using a LARL instruction which is
> 4 bytes, the inline assembly uses a BASR which is 2 bytes.
>
> If I use the label method in current_text_addr() the size of vmlinux
> increases by a small amount:
>
> add/remove: 33/13 grow/shrink: 101/48 up/down: 11941/-8887 (3054)

Thanks for the measurements.  Was this output produced by a utility?

> This is acceptable though, I would not mind if _THIS_IP_ and
> current_text_addr use a common definition using labels.

Thank you for this feedback Martin, I appreciate it.  Patches soon.

-- 
Thanks,
~Nick Desaulniers


[PATCH] parisc: prefer _THIS_IP_ and _RET_IP_ statement expressions

2018-08-01 Thread Nick Desaulniers
As part of the effort to reduce the code duplication between _THIS_IP_
and current_text_addr(), let's consolidate callers of
current_text_addr() to use _THIS_IP_.

Signed-off-by: Nick Desaulniers 
---
 arch/parisc/kernel/unwind.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/parisc/kernel/unwind.c b/arch/parisc/kernel/unwind.c
index 2ef83d78eec4..a4b430f440a9 100644
--- a/arch/parisc/kernel/unwind.c
+++ b/arch/parisc/kernel/unwind.c
@@ -439,8 +439,8 @@ unsigned long return_address(unsigned int level)
/* initialize unwind info */
asm volatile ("copy %%r30, %0" : "=r"(sp));
memset(&r, 0, sizeof(struct pt_regs));
-   r.iaoq[0] = (unsigned long) current_text_addr();
-   r.gr[2] = (unsigned long) __builtin_return_address(0);
+   r.iaoq[0] = _THIS_IP_;
+   r.gr[2] = _RET_IP_;
r.gr[30] = sp;
unwind_frame_init(&info, current, &r);
 
-- 
2.18.0.597.ga71716f1ad-goog



[PATCH] sh: dwarf: prefer _THIS_IP_ to current_text_addr

2018-08-01 Thread Nick Desaulniers
As part of the effort to reduce the code duplication between _THIS_IP_
and current_text_addr(), let's consolidate callers of
current_text_addr() to use _THIS_IP_.

Signed-off-by: Nick Desaulniers 
---
 arch/sh/kernel/dwarf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/sh/kernel/dwarf.c b/arch/sh/kernel/dwarf.c
index 1a2526676a87..bb511e2d9d68 100644
--- a/arch/sh/kernel/dwarf.c
+++ b/arch/sh/kernel/dwarf.c
@@ -599,7 +599,7 @@ struct dwarf_frame *dwarf_unwind_stack(unsigned long pc,
 * time this function makes its first function call.
 */
if (!pc || !prev)
-   pc = (unsigned long)current_text_addr();
+   pc = _THIS_IP_;
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
/*
-- 
2.18.0.597.ga71716f1ad-goog



get_maintainer.pl and change of email

2018-08-01 Thread Nick Desaulniers
It seems that get_maintainer.pl will make recommendations based on
commit history to a file, but over time, people change emails that
they commit from, then get_maintainer.pl recommends the possibly now
invalid email address.

I feel like we should have a look up table / file like MAINTAINERS
that tracks previous and new email addresses, which get_maintainer.pl
would parse, that way we have a record of movement between committer
email addresses allowing get_maintainer.pl to make more accurate
recommendations.  Thoughts?
-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] sh: dwarf: prefer _THIS_IP_ to current_text_addr

2018-08-01 Thread Nick Desaulniers
On Wed, Aug 1, 2018 at 11:29 AM Nick Desaulniers
 wrote:
>
> As part of the effort to reduce the code duplication between _THIS_IP_
> and current_text_addr(), let's consolidate callers of
> current_text_addr() to use _THIS_IP_.
>
> Signed-off-by: Nick Desaulniers 
> ---
>  arch/sh/kernel/dwarf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/sh/kernel/dwarf.c b/arch/sh/kernel/dwarf.c
> index 1a2526676a87..bb511e2d9d68 100644
> --- a/arch/sh/kernel/dwarf.c
> +++ b/arch/sh/kernel/dwarf.c
> @@ -599,7 +599,7 @@ struct dwarf_frame *dwarf_unwind_stack(unsigned long pc,
>  * time this function makes its first function call.
>  */
> if (!pc || !prev)
> -   pc = (unsigned long)current_text_addr();
> +   pc = _THIS_IP_;
>
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> /*
> --
> 2.18.0.597.ga71716f1ad-goog
>

Sorry, there's another call site in the arch/sh/ tree
(arch/sh/include/asm/kexec.h#64).  Let me fix both and send as a
single patch in v2.
-- 
Thanks,
~Nick Desaulniers


[PATCH v2] sh: prefer _THIS_IP_ to current_text_addr

2018-08-01 Thread Nick Desaulniers
As part of the effort to reduce the code duplication between _THIS_IP_
and current_text_addr(), let's consolidate callers of
current_text_addr() to use _THIS_IP_.

Signed-off-by: Nick Desaulniers 
---
 arch/sh/include/asm/kexec.h | 3 ++-
 arch/sh/kernel/dwarf.c  | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/sh/include/asm/kexec.h b/arch/sh/include/asm/kexec.h
index fd5f331a3912..927d80ba2332 100644
--- a/arch/sh/include/asm/kexec.h
+++ b/arch/sh/include/asm/kexec.h
@@ -4,6 +4,7 @@
 
 #include 
 #include 
+#include 
 
 /*
  * KEXEC_SOURCE_MEMORY_LIMIT maximum page get_free_page can return.
@@ -61,7 +62,7 @@ static inline void crash_setup_regs(struct pt_regs *newregs,
__asm__ __volatile__ ("stc gbr, %0" : "=r" (newregs->gbr));
__asm__ __volatile__ ("stc sr, %0" : "=r" (newregs->sr));
 
-   newregs->pc = (unsigned long)current_text_addr();
+   newregs->pc = _THIS_IP_;
}
 }
 #else
diff --git a/arch/sh/kernel/dwarf.c b/arch/sh/kernel/dwarf.c
index 1a2526676a87..bb511e2d9d68 100644
--- a/arch/sh/kernel/dwarf.c
+++ b/arch/sh/kernel/dwarf.c
@@ -599,7 +599,7 @@ struct dwarf_frame *dwarf_unwind_stack(unsigned long pc,
 * time this function makes its first function call.
 */
if (!pc || !prev)
-   pc = (unsigned long)current_text_addr();
+   pc = _THIS_IP_;
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
/*
-- 
2.18.0.597.ga71716f1ad-goog



Re: get_maintainer.pl and change of email

2018-08-01 Thread Nick Desaulniers
On Wed, Aug 1, 2018 at 11:53 AM Adam Borowski  wrote:
>
> On Wed, Aug 01, 2018 at 11:36:23AM -0700, Nick Desaulniers wrote:
> > It seems that get_maintainer.pl will make recommendations based on
> > commit history to a file, but over time, people change emails that
> > they commit from, then get_maintainer.pl recommends the possibly now
> > invalid email address.
> >
> > I feel like we should have a look up table / file like MAINTAINERS
> > that tracks previous and new email addresses, which get_maintainer.pl
> > would parse, that way we have a record of movement between committer
> > email addresses allowing get_maintainer.pl to make more accurate
> > recommendations.  Thoughts?
>
> .mailmap

Neat, that's a start.  It has a top level comment about being used by
git-shortlog, looks like get_maintainer.pl parses it, and it's a
default on option.  Cool.  Ok, looks like I just need to add someone
to the .mailmap file.  Thanks!

-- 
Thanks,
~Nick Desaulniers


  1   2   3   4   5   6   7   8   9   10   >