Bug#854281: Crash because of NULL pointer dereference in `pick_next_task_fair`

2018-10-10 Thread Robin Thoni
Hi,

I have also hit this bug, at random, three times in the last two months,
on two servers (Linux x 3.16.0-6-amd64 #1 SMP Debian
3.16.56-1+deb8u1 (2018-05-08) x86_64 GNU/Linux) with identical hardware
(I can share the details if needed), mainly running Apache, Docker and
VirtualBox (Windows guests). They have been running since February, but
the bug only started happening recently.

Is it still happening on Debian Stretch with a 4.x kernel?

Thanks.

-- 
Robin THONI
NVIDIA



Bug#854281: Crash because of NULL pointer dereference in `pick_next_task_fair`

2017-02-05 Thread Paul Menzel
Package: linux-image-3.16.0-4-amd64
Version: 3.16.39-1
Severity: important

Dear Debian folks,


Linux 3.16.39 and earlier crash almost daily with the trace below.

```
Feb 04 10:05:58 fujitsu01 kernel: BUG: unable to handle kernel NULL pointer dereference at 0078
Feb 04 10:05:58 fujitsu01 kernel: IP: [] pick_next_task_fair+0x6b8/0x820
Feb 04 10:05:58 fujitsu01 kernel: PGD d700be067 PUD c681b8067 PMD 0
Feb 04 10:05:58 fujitsu01 kernel: Oops:  [#1] SMP
Feb 04 10:05:58 fujitsu01 kernel: Modules linked in: ipt_REJECT binfmt_misc veth xt_nat xt_tcpudp ipt_MASQUERADE xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat nf_conntrack bridge stp llc xt_multiport iptable_filter ip_tables x_tables x86_pkg_temp_thermal coretemp evdev ppdev kvm_intel kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd pcspkr serio_raw shpchp tpm_tis tpm wmi parport_pc parport fujitsu_laptop video acpi_cpufreq acpi_pad processor button autofs4 btrfs xor raid6_pq crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci e1000e psmouse xhci_hcd ptp libata pps_core nvme usbcore scsi_mod usb_common thermal fan thermal_sys i2c_hid hid i2c_core
Feb 04 10:05:58 fujitsu01 kernel: CPU: 4 PID: 28 Comm: ksoftirqd/4 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.39-1
Feb 04 10:05:58 fujitsu01 kernel: Hardware name: FUJITSU D3417-B1/D3417-B1, BIOS V5.0.0.11 R1.12.0 for D3417-B1x02/09/2016
Feb 04 10:05:58 fujitsu01 kernel: task: 880fde868c20 ti: 880fde87 task.ti: 880fde87
Feb 04 10:05:58 fujitsu01 kernel: RIP: 0010:[]  [] pick_next_task_fair+0x6b8/0x820
Feb 04 10:05:58 fujitsu01 kernel: RSP: 0018:880fde873de0  EFLAGS: 00010046
Feb 04 10:05:58 fujitsu01 kernel: RAX: 00012279 RBX: 880fdaafd1c0 RCX:
Feb 04 10:05:58 fujitsu01 kernel: RDX: 0001 RSI: 880fdb0d3f28 RDI: 880f23500c98
Feb 04 10:05:58 fujitsu01 kernel: RBP: 880fdb0d3f00 R08:  R09: b80a
Feb 04 10:05:58 fujitsu01 kernel: R10: 021f R11: 0008 R12:
Feb 04 10:05:58 fujitsu01 kernel: R13:  R14:  R15: 88102e512f40
Feb 04 10:05:58 fujitsu01 kernel: FS:  () GS:88102e50() knlGS:
Feb 04 10:05:58 fujitsu01 kernel: CS:  0010 DS:  ES:  CR0: 80050033
Feb 04 10:05:58 fujitsu01 kernel: CR2: 0078 CR3: 000c209f9000 CR4: 003407e0
Feb 04 10:05:58 fujitsu01 kernel: DR0:  DR1:  DR2:
Feb 04 10:05:58 fujitsu01 kernel: DR3:  DR6: fffe0ff0 DR7: 0400
Feb 04 10:05:58 fujitsu01 kernel: Stack:
Feb 04 10:05:58 fujitsu01 kernel:  880f23500c20 0001810a0964 880fde868c20 00012f40
Feb 04 10:05:58 fujitsu01 kernel:  88102e512fb8 8101ca75 880fde869078 880fde868c20
Feb 04 10:05:58 fujitsu01 kernel:  88102e512f40 0004
Feb 04 10:05:58 fujitsu01 kernel: Call Trace:
Feb 04 10:05:58 fujitsu01 kernel:  [] ? sched_clock+0x5/0x10
Feb 04 10:05:58 fujitsu01 kernel:  [] ? __schedule+0x106/0x6f0
Feb 04 10:05:58 fujitsu01 kernel:  [] ? smpboot_thread_fn+0xc6/0x190
Feb 04 10:05:58 fujitsu01 kernel:  [] ? SyS_setgroups+0x170/0x170
Feb 04 10:05:58 fujitsu01 kernel:  [] ? kthread+0xbd/0xe0
Feb 04 10:05:58 fujitsu01 kernel:  [] ? kthread_create_on_node+0x180/0x180
Feb 04 10:05:58 fujitsu01 kernel:  [] ? ret_from_fork+0x58/0x90
Feb 04 10:05:58 fujitsu01 kernel:  [] ? kthread_create_on_node+0x180/0x180
Feb 04 10:05:58 fujitsu01 kernel: Code: 49 8b 7c 24 78 48 39 fd 74 2f 44 8b 73 68 45 8b 6c 24 68 45 39 ee 0f 8e c7 00 00 00 48 89 ef 48 89 de e8 ac 91 ff ff 48 8b 5b 70 <49> 8b 7c 24 78 48 8b 6b 78 48 39 fd 75 d1 48 85 ed 74 cc 4c 89
Feb 04 10:05:58 fujitsu01 kernel: RIP  [] pick_next_task_fair+0x6b8/0x820
```
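
For anyone reading the trace: the byte marked `<49>` in the `Code:` line is where the faulting instruction starts. Below is a minimal Python sketch that extracts those bytes and decodes them by hand. The helper names `faulting_bytes` and `decode_mov_disp8` are made up for illustration, and the decoder handles only the one instruction form seen here (REX.W `8b /r` with an 8-bit displacement and a base-only SIB byte); it is not a general disassembler. The result, `mov 0x78(%r12), %rdi`, would be consistent with the reported fault address 0078 if R12 held zero, but the register values are elided in this copy of the log, so that part is an assumption.

```python
# Code: line from the oops, joined into one string.
CODE = (
    "49 8b 7c 24 78 48 39 fd 74 2f 44 8b 73 68 45 8b 6c 24 68 45 39 ee "
    "0f 8e c7 00 00 00 48 89 ef 48 89 de e8 ac 91 ff ff 48 8b 5b 70 "
    "<49> 8b 7c 24 78 48 8b 6b 78 48 39 fd 75 d1 48 85 ed 74 cc 4c 89"
)

REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def faulting_bytes(code_line, count=5):
    """Return `count` bytes starting at the byte the kernel marked with <..>."""
    tokens = code_line.split()
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            raw = [tok[1:-1]] + tokens[i + 1:i + count]
            return bytes(int(b, 16) for b in raw)
    raise ValueError("no <xx> marker found in Code: line")


def decode_mov_disp8(insn):
    """Decode REX.W 8B /r with mod=01 and a base-only SIB byte (hypothetical helper)."""
    rex, opcode, modrm, sib, disp = insn[:5]
    if rex & 0xF0 != 0x40 or not rex & 0x08:
        raise ValueError("expected a REX.W prefix")
    if opcode != 0x8B:
        raise ValueError("expected MOV r64, r/m64 (opcode 8b)")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 1 or rm != 4:
        raise ValueError("expected disp8 addressing with a SIB byte")
    if (sib >> 3) & 7 != 4 or rex & 0x02:
        raise ValueError("expected no index register")
    dest = REGS[reg | ((rex & 0x04) << 1)]   # REX.R extends the reg field
    base = REGS[(sib & 7) | ((rex & 0x01) << 3)]  # REX.B extends the SIB base
    if disp >= 0x80:                          # disp8 is sign-extended
        disp -= 0x100
    return f"mov {disp:#x}(%{base}), %{dest}"


if __name__ == "__main__":
    print(decode_mov_disp8(faulting_bytes(CODE)))
```

With the kernel source tree at hand, `scripts/decodecode` performs the same job for the full `Code:` line.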

Please find all Linux kernel messages attached.

There is a similar report on the Linux kernel mailing list (LKML) [1]
for Linux 3.10.0.

Please tell me if you need more information, or how I can help to get
a fix submitted.


Thanks,

Paul


[1] https://lkml.org/lkml/2016/5/31/990

Feb 03 09:57:13 barberini01 kernel: Initializing cgroup subsys cpuset
Feb 03 09:57:13 barberini01 kernel: Initializing cgroup subsys cpu
Feb 03 09:57:13 barberini01 kernel: Initializing cgroup subsys cpuacct
Feb 03 09:57:13 barberini01 kernel: Linux version 3.16.0-4-amd64 (debian-ker...@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 SMP Debian 3.16.39-1 (2016-12-30)
Feb 03 09:57:13 barberini01 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.0-4-amd64 root=UUID=37c4a852-27a8-40da-bfc5-03a047fd8faa ro quiet noisapnp cgroup_enable=memory swapaccount=1
Feb 03 09:57:13 barberini01 kernel: e820: BIOS-provided physical RAM map:
Feb 03 09:57:13 barberini01 kernel: