Re: [lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-21 Thread Linus Torvalds
On Sun, May 21, 2017 at 3:19 PM, Linus Torvalds wrote:
>
> Did an "allyesconfig" build on 32-bit x86, and looked at who uses the
> 8-byte get_user/put_user cases:

I've done more testing.

It turns out that quite independently of all these patches, our 32-bit
x86 code is entirely broken.

In particular, __get_user_asm_u64() has two independent bugs, one
fairly harmless, and one that can be entirely deadly.

The harmless one is that we have the ASM_STAC/ASM_CLAC markers around
the user access in there, even though they should have gotten removed.
That one can be considered a "merge error" between commit

  b2f680380ddf ("x86/mm/32: Add support for 64-bit __get_user() on
32-bit kernels")

that added the 64-bit case, and commit

  11f1a4b9755f ("x86: reorganize SMAP handling in user space accesses")

that moved the CLAC/STAC into the caller.

So it turns out that a 64-bit __get_user() case will have a double
pair of STAC/CLAC instructions, making it even slower than it should
otherwise be. But it all still *works* fine.

The much worse issue is that the asm is just buggered, and when it does

  "1: movl %2,%%eax\n" \
  "2: movl %3,%%edx\n" \

it can be that %eax is actually used for the address, so the second
move can do crazy bad things. It can (and does) generate code like
this:

 18c:   8b 00      mov    (%eax),%eax
 18e:   8b 50 04   mov    0x4(%eax),%edx

(I'm not sure that actually happens anywhere in the kernel, but it did
happen in my test-case).
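
To make the aliasing problem concrete, here is a toy model in plain C. It is
only a sketch: the "registers", the memory array and all the toy_* names are
made up for illustration, and it is not the kernel's inline asm. It mimics
the two moves above and shows that the result is wrong exactly when the
address is held in the register that the first move overwrites.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: three "registers" and a small flat memory. */
enum { EAX, EDX, ECX, NREGS };

struct toy_cpu {
	uint32_t reg[NREGS];
	uint8_t mem[64];
};

static uint32_t load32(const struct toy_cpu *c, uint32_t addr)
{
	uint32_t v;

	assert(addr <= sizeof(c->mem) - sizeof(v));	/* keep the model in bounds */
	memcpy(&v, &c->mem[addr], sizeof(v));
	return v;
}

static void store32(struct toy_cpu *c, uint32_t addr, uint32_t v)
{
	assert(addr <= sizeof(c->mem) - sizeof(v));
	memcpy(&c->mem[addr], &v, sizeof(v));
}

/* Mimics "movl (addr),%eax ; movl 4(addr),%edx" with addr held in addr_reg. */
static uint64_t toy_get_u64(struct toy_cpu *c, int addr_reg)
{
	c->reg[EAX] = load32(c, c->reg[addr_reg]);	/* may overwrite the address */
	c->reg[EDX] = load32(c, c->reg[addr_reg] + 4);
	return ((uint64_t)c->reg[EDX] << 32) | c->reg[EAX];
}

/* Fetch the 64-bit value stored at offset 8, with the address in addr_reg. */
static uint64_t toy_demo(int addr_reg)
{
	struct toy_cpu c;

	memset(&c, 0, sizeof(c));
	store32(&c, 8, 16);		/* low half; also a plausible bogus "address" */
	store32(&c, 12, 0xdeadbeef);	/* high half */
	store32(&c, 20, 0x11111111);	/* unrelated data at 16 + 4 */
	c.reg[addr_reg] = 8;
	return toy_get_u64(&c, addr_reg);
}
```

With the address in the scratch register (ECX in this model) both halves
come from the right place. With the address in EAX, the second load uses the
freshly loaded low half as its base and picks up the unrelated word instead.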

So the 64-bit output needs to be marked as being an early-clobber,
meaning that it can be written to early in the asm, before all the
inputs have been used. So we need to use "=&A", not "=A" for it.

Adding Ben LaHaise to the cc, since that "=A" bug goes back to the
original implementation of __get_user_asm_u64() (which is only a year
ago, but still).

I've committed a fix, and now the generated asm looks ok, but I don't
actually have any 32-bit x86 machines left. Hopefully somebody still
does and can test this..

Linus


Re: [lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-21 Thread Linus Torvalds
On Sun, May 21, 2017 at 2:37 PM, Linus Torvalds wrote:
>
> I'm pretty sure there's a reason we added support for it on x86-32,
> because there are structures that use __u64 and fill things one entry
> at a time.

Did an "allyesconfig" build on 32-bit x86, and looked at who uses the
8-byte get_user/put_user cases:

__get_user_8:
i915_perf_open_ioctl

__put_user_8:
snapshot_ioctl
sys_sendfile64
timerfd_read
eventfd_read
userfaultfd_ioctl
kpagecgroup_read
kpagecount_read
kpageflags_read
__ncp_ioctl
blkdev_ioctl
drm_mode_object_get_properties
drm_mode_getproperty_ioctl
efi_test_ioctl
params_to_user
__rds_rdma_map

so it's not common, but both do get used.

Would any of those be changed to the unsafe versions? Maybe not. But I
think we're better off being consistent.

We basically allow all kernel integer types to be used for
put/get_user(), and the fact that some architectures don't support
them is just a quirk of that architecture, not a sign that it
shouldn't be done.

  Linus


Re: [lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-21 Thread Linus Torvalds
On Sun, May 21, 2017 at 2:14 PM, Al Viro  wrote:
>
> Umm...  get_user() for anything larger than long is simply not supported on
> a lot of architectures[1].  Do we really want to do that for 
> unsafe_get_user()?

I'm pretty sure there's a reason we added support for it on x86-32,
because there are structures that use __u64 and fill things one entry
at a time.

It's entirely possible that that code then fails (maybe it compiles,
but doesn't work) on various other architectures. There's a lot of
drivers that are disabled on non-x86.

  Linus


Re: [lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-21 Thread Al Viro
On Sun, May 21, 2017 at 12:35:28PM -0700, Linus Torvalds wrote:
> 
> 
> On Sun, 21 May 2017, Al Viro wrote:
> > 
> > fix unsafe_put_user()
> 
> So here's my proposed patch on top of yours to fix unsafe_get_user() with 
> "long long" arguments, and to clean up the extra-long line you did.
> 
> Comments?

>  #define unsafe_get_user(x, ptr, err_label) \
>  do { \
>  	int __gu_err; \
> -	unsigned long __gu_val; \
> +	__inttype(*(ptr)) __gu_val; \

Umm...  get_user() for anything larger than long is simply not supported on
a lot of architectures[1].  Do we really want to do that for unsafe_get_user()?

[1] at the moment, blackfin, m32r, m68k/mmu, microblaze, mn10300, nios2, sh.
arm allows it for get_user() (and rmk was really unhappy about doing so), but
not for __get_user().


Re: [lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-21 Thread Linus Torvalds


On Sun, 21 May 2017, Al Viro wrote:
> 
> fix unsafe_put_user()

So here's my proposed patch on top of yours to fix unsafe_get_user() with 
"long long" arguments, and to clean up the extra-long line you did.

Comments?

  Linus

---
 arch/x86/include/asm/uaccess.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index d9668c3beb5b..661c497465ce 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -703,14 +703,15 @@ extern struct movsl_mask {
 #define unsafe_put_user(x, ptr, err_label) \
 do { \
 	int __pu_err; \
-	__put_user_size((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), __pu_err, -EFAULT); \
+	__typeof__(*(ptr)) __pu_val = (__typeof__(*(ptr)))(x); \
+	__put_user_size(__pu_val, (ptr), sizeof(*(ptr)), __pu_err, -EFAULT); \
 	if (unlikely(__pu_err)) goto err_label; \
 } while (0)
 
 #define unsafe_get_user(x, ptr, err_label) \
 do { \
 	int __gu_err; \
-	unsigned long __gu_val; \
+	__inttype(*(ptr)) __gu_val; \
 	__get_user_size(__gu_val, (ptr), sizeof(*(ptr)), __gu_err, -EFAULT); \
 	(x) = (__force __typeof__(*(ptr)))__gu_val; \
 	if (unlikely(__gu_err)) goto err_label; \


Re: [lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-21 Thread Linus Torvalds
On Sun, May 21, 2017 at 12:34 AM, Al Viro  wrote:
>
> -	__put_user_size((x), (ptr), sizeof(*(ptr)), __pu_err, -EFAULT); \
> +	__put_user_size((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), __pu_err, -EFAULT); \

Hmm. Looking more at this, the "unsafe_get_user()" case is wrong too -
for types larger than "long".

But I see you have a pull request pending, and I'll take this fix as-is.

I *think* the right thing to do is to just do

   register __inttype(*(ptr)) __val_gu;

for unsafe_get_user.

I think the error crept in because I copied the "get_user_ex()" code,
which has the same type confusion (ie it doesn't handle values larger
than long, so "long long" on x86-32 wouldn't work).

That type limitation was ok'ish simply because get_user_ex() was
x86-only and of very limited use (and clearly never saw the 64-bit
value on a 32-bit arch case).
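
The truncation described above can be reproduced in user space. The sketch
below is an illustration only: word_t is an assumed stand-in for "unsigned
long" on a 32-bit target (so the demo behaves the same on a 64-bit host),
and the demo_* names are hypothetical. demo_inttype() follows the same idea
as __inttype(): pick a temporary that is wide enough for the pointed-to
type. It relies on the GNU C extensions __typeof__ and
__builtin_choose_expr.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for the demo: word_t plays the role of "unsigned long" on a
 * 32-bit target. */
typedef uint32_t word_t;

/* Pick a temporary type wide enough for x: 64-bit if x is wider than a word. */
#define demo_inttype(x) \
	__typeof__(__builtin_choose_expr(sizeof(x) > sizeof(word_t), \
					 (uint64_t)0, (word_t)0))

/* Fixed one-word temporary: anything wider than a word gets cut off. */
#define demo_get_narrow(x, ptr)				\
do {							\
	word_t __tmp = (word_t)*(ptr);			\
	(x) = (__typeof__(*(ptr)))__tmp;		\
} while (0)

/* Temporary sized from the pointed-to type: all bits survive. */
#define demo_get_wide(x, ptr)				\
do {							\
	demo_inttype(*(ptr)) __tmp = *(ptr);		\
	(x) = (__typeof__(*(ptr)))__tmp;		\
} while (0)

static uint64_t demo_narrow(uint64_t src)
{
	uint64_t dst;

	demo_get_narrow(dst, &src);
	return dst;
}

static uint64_t demo_wide(uint64_t src)
{
	uint64_t dst;

	/* The chosen temporary must be as wide as the source here. */
	assert(sizeof(demo_inttype(src)) == sizeof(src));
	demo_get_wide(dst, &src);
	return dst;
}
```

In this model demo_narrow() keeps only the low 32 bits of a 64-bit value,
while demo_wide() returns it unchanged.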

But for unsafe_get_user() we obviously want to make it generic enough
and just be able to replace existing get_user() calls.

  Linus


Re: [lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-21 Thread Al Viro
On Fri, May 19, 2017 at 02:08:20PM +0800, kernel test robot wrote:
> 
> FYI, we noticed the following commit:
> 
> commit: 75f64d68f9816a1c244b8685f056389b24d97e98 ("waitid(): switch copyout 
> of siginfo to unsafe_put_user()")
> url: 
> https://github.com/0day-ci/linux/commits/Al-Viro/move-compat-wait4-and-waitid-next-to-native-variants/20170516-084127
> 
> 
> in testcase: boot

Cute...  That's an unsafe_put_user() bug, actually.  There are no
unsafe_put_user() callers in mainline and it's fairly early in the cycle.
Linus, do you have any problems with that one?  If not, I'll send a pull
request with it + the osf_wait4() fix...

fix unsafe_put_user()

__put_user_size() relies upon its first argument having the same type as what
the second one points to; the only other user makes sure of that and
unsafe_put_user() should do the same.

Signed-off-by: Al Viro 
---

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 68766b276d9e..d9668c3beb5b 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -703,7 +703,7 @@ extern struct movsl_mask {
 #define unsafe_put_user(x, ptr, err_label) \
 do { \
 	int __pu_err; \
-	__put_user_size((x), (ptr), sizeof(*(ptr)), __pu_err, -EFAULT); \
+	__put_user_size((__typeof__(*(ptr)))(x), (ptr), sizeof(*(ptr)), __pu_err, -EFAULT); \
 	if (unlikely(__pu_err)) goto err_label; \
 } while (0)
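
As a loose user-space analogy for why the cast matters (a sketch only, not
the kernel's __put_user_size(), which uses inline asm sized from
sizeof(*(ptr))): the hypothetical demo_put_size() below stores exactly as
many bytes as the value it is handed. If a 32-bit value is passed for a
64-bit destination without conversion, half of the destination keeps its
stale contents. All demo_* names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a store that trusts the type of "val". */
#define demo_put_size(val, ptr)					\
do {								\
	__typeof__(val) __v = (val);				\
	/* The demo only covers same-size or widening stores. */	\
	assert(sizeof(__v) <= sizeof(*(ptr)));			\
	memcpy((ptr), &__v, sizeof(__v));			\
} while (0)

/* Without the cast: the value is stored with its own (narrower) type. */
#define demo_put_nocast(x, ptr)	demo_put_size((x), (ptr))

/* With the cast: the value is first converted to the pointed-to type. */
#define demo_put_cast(x, ptr)	demo_put_size((__typeof__(*(ptr)))(x), (ptr))

static int64_t demo_store(int use_cast)
{
	int64_t slot = -1;	/* stale contents, all bits set */
	int32_t value = 5;

	if (use_cast)
		demo_put_cast(value, &slot);
	else
		demo_put_nocast(value, &slot);
	return slot;
}
```

With the cast the slot ends up holding 5. Without it only four of the eight
bytes are written, so the slot holds neither 5 nor its old value.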


[lkp-robot] [waitid()] 75f64d68f9: Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=

2017-05-19 Thread kernel test robot

FYI, we noticed the following commit:

commit: 75f64d68f9816a1c244b8685f056389b24d97e98 ("waitid(): switch copyout of 
siginfo to unsafe_put_user()")
url: 
https://github.com/0day-ci/linux/commits/Al-Viro/move-compat-wait4-and-waitid-next-to-native-variants/20170516-084127


in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -smp 2 -m 512M

caused below changes (please refer to attached dmesg/kmsg for entire 
log/backtrace):


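+------------------------------------------------------------------+------------+------------+
|                                                                  | 3b8d2673bc | 75f64d68f9 |
+------------------------------------------------------------------+------------+------------+
| boot_successes                                                   | 8          | 0          |
| boot_failures                                                    | 4          | 17         |
| invoked_oom-killer:gfp_mask=0x                                   | 4          |            |
| Mem-Info                                                         | 4          |            |
| Kernel_panic-not_syncing:Out_of_memory_and_no_killable_processes | 4          |            |
| Kernel_panic-not_syncing:Attempted_to_kill_init!exitcode=        | 0          | 17         |
+------------------------------------------------------------------+------------+------------+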



[   13.075040] Freeing unused kernel memory: 712K
[   13.077939] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[   13.077939] x86/mm: Checked W+X mappings: passed, no W+X pages found.
[   13.087208] random: init: uninitialized urandom read (12 bytes read)
[   13.087208] random: init: uninitialized urandom read (12 bytes read)
[   13.101738] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0600
[   13.101738] 
[   13.101738] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0600
[   13.101738] 
[   13.103770] CPU: 0 PID: 1 Comm: init Not tainted 4.12.0-rc1-8-g75f64d6 #1
[   13.103770] CPU: 0 PID: 1 Comm: init Not tainted 4.12.0-rc1-8-g75f64d6 #1
[   13.105333] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[   13.105333] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
[   13.107557] Call Trace:
[   13.107557] Call Trace:
[   13.108112]  dump_stack+0x27/0x31
[   13.108112]  dump_stack+0x27/0x31
[   13.108856]  panic+0x115/0x31b
[   13.108856]  panic+0x115/0x31b
[   13.109534]  do_exit+0x111d/0x1120
[   13.109534]  do_exit+0x111d/0x1120
[   13.110296]  do_group_exit+0x3d/0x110
[   13.110296]  do_group_exit+0x3d/0x110
[   13.05]  SyS_exit_group+0x24/0x30
[   13.05]  SyS_exit_group+0x24/0x30
[   13.111925]  entry_SYSCALL_64_fastpath+0x1a/0xa4
[   13.111925]  entry_SYSCALL_64_fastpath+0x1a/0xa4
[   13.112947] RIP: 0033:0x7fede11ca408
[   13.112947] RIP: 0033:0x7fede11ca408
[   13.113752] RSP: 002b:7ffe406929e8 EFLAGS: 0246 ORIG_RAX: 00e7
[   13.113752] RSP: 002b:7ffe406929e8 EFLAGS: 0246 ORIG_RAX: 00e7
[   13.115403] RAX: ffda RBX: 7fede217 RCX: 7fede11ca408
[   13.115403] RAX: ffda RBX: 7fede217 RCX: 7fede11ca408
[   13.116957] RDX: 0006 RSI: 003c RDI: 0006
[   13.116957] RDX: 0006 RSI: 003c RDI: 0006
[   13.118505] RBP: 7fede16d1000 R08: 00e7 R09: ffa0
[   13.118505] RBP: 7fede16d1000 R08: 00e7 R09: ffa0
[   13.120065] R10: 7fede14c5fa8 R11: 0246 R12: 00218220
[   13.120065] R10: 7fede14c5fa8 R11: 0246 R12: 00218220
[   13.121616] R13:  R14: 7fede2177048 R15: 7fede21714e8
[   13.121616] R13:  R14: 7fede2177048 R15: 7fede21714e8
[   13.123174] Kernel Offset: disabled


To reproduce:

git clone https://github.com/01org/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k  job-script  # job-script is attached in this email



Thanks,
Xiaolong
#
# Automatically generated file; DO NOT EDIT.
# Linux/x86_64 4.12.0-rc1 Kernel Configuration
#
CONFIG_64BIT=y
CONFIG_X86_64=y
CONFIG_X86=y
CONFIG_INSTRUCTION_DECODER=y
CONFIG_OUTPUT_FORMAT="elf64-x86-64"
CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig"
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_MMU=y
CONFIG_ARCH_MMAP_RND_BITS_MIN=28
CONFIG_ARCH_MMAP_RND_BITS_MAX=32
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MIN=8
CONFIG_ARCH_MMAP_RND_COMPAT_BITS_MAX=16
CONFIG_NEED_DMA_MAP_STATE=y
CONFIG_NEED_SG_DMA_LENGTH=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_BUG_RELATIVE_POINTERS=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_ARCH_HAS_CPU_RELAX=y
CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y
