Bug#817816: System freeze with CPU hotplug

2016-03-10 Thread Harry Junior
On Thu, 10 Mar 2016 17:34:54 + Ben Hutchings  wrote:
> On Thu, 2016-03-10 at 17:32 +0100, Harry Junior wrote:
>> Package: linux-image-4.4.0-1-amd64
>> Version: 4.4.4-1
>> Severity: critical
>> Justification: renders system unusable
>> 
>> 
>> When I run the following commands, the system freezes:
>> 
>> # echo 0 | tee /sys/devices/system/cpu/cpu*/online && echo 1 | sudo
>> tee /sys/devices/system/cpu/cpu*/online
>> 
>> The system freezes randomly when the CPUs are being onlined or
>> offlined. The system is installed on a VMware virtual machine with 4
>> processors. Here's a stacktrace of the infinite loop:
> [...]
> 
> You just asked to offline all CPUs, making the system unusable. I
> don't see any bug here.

I'm sorry to disagree, but CPU0 can't be offlined, so it always remains
online. Note that there is no cpu0/online file in the second listing:

$ ls -l /sys/devices/system/cpu/ | grep cpu | head -4
drwxr-xr-x 6 root root 0 Mar 10 18:41 cpu0
drwxr-xr-x 6 root root 0 Mar 10 18:41 cpu1
drwxr-xr-x 6 root root 0 Mar 10 18:41 cpu2
drwxr-xr-x 6 root root 0 Mar 10 18:41 cpu3

$ ls -l /sys/devices/system/cpu/cpu*/online
-rw-r--r-- 1 root root 4096 Mar 10 18:40 /sys/devices/system/cpu/cpu1/online
-rw-r--r-- 1 root root 4096 Mar 10 18:40 /sys/devices/system/cpu/cpu2/online
-rw-r--r-- 1 root root 4096 Mar 10 18:40 /sys/devices/system/cpu/cpu3/online
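
The same check can be done without reading the listings by eye. Below is a
minimal sketch; the function name list_hotplug_cpus and its optional
directory argument are my own additions for illustration (the argument only
exists so the function can be pointed at a scratch directory instead of the
real sysfs tree). It reports, for each cpuN directory, whether an online
control file is present.

```shell
#!/bin/sh
# list_hotplug_cpus [ROOT]
# Hypothetical helper (not part of any package): for each cpuN directory
# under ROOT (default: the real sysfs CPU directory), print whether an
# "online" control file exists.
list_hotplug_cpus() {
    root="${1:-/sys/devices/system/cpu}"
    # cpu[0-9]* skips sibling directories such as cpufreq and cpuidle.
    for dir in "$root"/cpu[0-9]*; do
        [ -d "$dir" ] || continue
        if [ -e "$dir/online" ]; then
            echo "${dir##*/}: hotpluggable"
        else
            echo "${dir##*/}: no online file"
        fi
    done
}
```

Called with no argument it inspects /sys/devices/system/cpu; given the
listings above, I would expect cpu0 to be the only entry reported without
an online file.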

  

Bug#817816: System freeze with CPU hotplug

2016-03-10 Thread Harry Junior
Package: linux-image-4.4.0-1-amd64
Version: 4.4.4-1
Severity: critical
Justification: renders system unusable


When I run the following commands, the system freezes:

# echo 0 | tee /sys/devices/system/cpu/cpu*/online && echo 1 | sudo tee /sys/devices/system/cpu/cpu*/online
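
Since the freeze does not happen on every run, the one-liner above can also
be written as a loop that repeats the offline/online cycle. This is only a
sketch: cycle_cpus and its arguments are names I made up, and the optional
directory argument is just there so the loop can be tried against a scratch
directory first. On the real system it has to run as root.

```shell
#!/bin/sh
# cycle_cpus COUNT [ROOT]
# Hypothetical helper: COUNT times, write 0 and then 1 to every writable
# cpuN/online file under ROOT (default: the real sysfs CPU directory).
cycle_cpus() {
    count="$1"
    root="${2:-/sys/devices/system/cpu}"
    i=0
    while [ "$i" -lt "$count" ]; do
        for value in 0 1; do
            for f in "$root"/cpu[0-9]*/online; do
                # Skips CPUs without a control file and an unmatched glob.
                [ -w "$f" ] || continue
                echo "$value" > "$f"
            done
        done
        i=$((i + 1))
    done
}
```

For example, "cycle_cpus 100" would repeat the cycle a hundred times, and
every CPU that has a control file is back online when it returns.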

The system freezes at random while the CPUs are being onlined or offlined. The
system is installed on a VMware virtual machine with 4 processors. Here's a
stack trace of the infinite loop:

---[regs]
  RAX: 0x8160AC40  RBX: 0x0003  RCX: 0x  RDX: 0x0003  o d i t s Z a P c
  RSI: 0x0286  RDI: 0x88004E6B7D98  RBP: 0x88004E6B7D98  RSP: 0x88007CFCFDC0  RIP: 0x8110AD52
  R8 : 0x88007F60F380  R9 : 0x  R10: 0x81B004C0  R11: 0x  R12: 0x88004E6B7DBC
  R13: 0x0282  R14: 0x88007F60F300  R15: 0x8110AD10
  CS: 0010  DS:   ES:   FS:   GS:   SS: 0018

---[code]
=> 0x8110ad52 :  mov    ebx,DWORD PTR [rbp+0x20]
   0x8110ad55 :  cmp    edx,ebx
   0x8110ad57 :  je     0x8110ad74 
   0x8110ad59 :  cmp    ebx,0x2
   0x8110ad5c :  je     0x8110ad9e 
   0x8110ad5e :  cmp    ebx,0x3
   0x8110ad61 :  jne    0x8110ad68 
   0x8110ad63 :  test   r14b,r14b
-
multi_cpu_stop (data=0x88004e6b7d98) at /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c:197
197 /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c: No such file or directory.
gdb$ bt
#0  multi_cpu_stop (data=0x88004e6b7d98) at /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c:197
#1  0x8110afed in cpu_stopper_thread (cpu=) at /build/linux-xT7CCq/linux-4.4.4/kernel/stop_machine.c:456
#2  0x81097d49 in smpboot_thread_fn (data=0x88004e6b7d98) at /build/linux-xT7CCq/linux-4.4.4/kernel/smpboot.c:163
#3  0x81094dfd in kthread (_create=0x88007ce82100) at /build/linux-xT7CCq/linux-4.4.4/kernel/kthread.c:209
#4  0x8158ed8f in ret_from_fork () at /build/linux-xT7CCq/linux-4.4.4/arch/x86/entry/entry_64.S:486
#5  0x in ?? ()
gdb$ x/x $rbp+0x20
0x88004e6b7db8: 0x0003

In the function multi_cpu_stop(), curstate equals MULTI_STOP_RUN (the value 3
read above) and never seems to become MULTI_STOP_EXIT, so the loop never
terminates. Let me know if you require additional information.
Thanks