Bug#854281: Crash because of NULL pointer dereference in `pick_next_task_fair`
Hi,

I have also hit this bug three times, at random, over the last two months, on two servers (Linux x 3.16.0-6-amd64 #1 SMP Debian 3.16.56-1+deb8u1 (2018-05-08) x86_64 GNU/Linux) with identical hardware (I can share the details if needed), mainly running Apache, Docker and VirtualBox (Windows guests). They have been running since February, but this bug only started happening recently. Does it still happen on Debian Stretch with a 4.x kernel?

Thanks.

-- 
Robin THONI
NVIDIA
Bug#854281: Crash because of NULL pointer dereference in `pick_next_task_fair`
Package: linux-image-3.16.0-4-amd64
Version: 3.16.39-1
Severity: important

Dear Debian folks,

Linux 3.16.39 and earlier crash almost daily with the trace below.

```
Feb 04 10:05:58 fujitsu01 kernel: BUG: unable to handle kernel NULL pointer dereference at 0078
Feb 04 10:05:58 fujitsu01 kernel: IP: [] pick_next_task_fair+0x6b8/0x820
Feb 04 10:05:58 fujitsu01 kernel: PGD d700be067 PUD c681b8067 PMD 0
Feb 04 10:05:58 fujitsu01 kernel: Oops: [#1] SMP
Feb 04 10:05:58 fujitsu01 kernel: Modules linked in: ipt_REJECT binfmt_misc veth xt_nat xt_tcpudp ipt_MASQUERADE xfrm_user xfrm_algo iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype xt_conntrack nf_nat nf_conntrack bridge stp llc xt_multiport iptable_filter ip_tables x_tables x86_pkg_temp_thermal coretemp evdev ppdev kvm_intel kvm crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd pcspkr serio_raw shpchp tpm_tis tpm wmi parport_pc parport fujitsu_laptop video acpi_cpufreq acpi_pad processor button autofs4 btrfs xor raid6_pq crct10dif_pclmul crct10dif_common crc32c_intel ahci libahci e1000e psmouse xhci_hcd ptp libata pps_core nvme usbcore scsi_mod usb_common thermal fan thermal_sys i2c_hid hid i2c_core
Feb 04 10:05:58 fujitsu01 kernel: CPU: 4 PID: 28 Comm: ksoftirqd/4 Not tainted 3.16.0-4-amd64 #1 Debian 3.16.39-1
Feb 04 10:05:58 fujitsu01 kernel: Hardware name: FUJITSU D3417-B1/D3417-B1, BIOS V5.0.0.11 R1.12.0 for D3417-B1x 02/09/2016
Feb 04 10:05:58 fujitsu01 kernel: task: 880fde868c20 ti: 880fde87 task.ti: 880fde87
Feb 04 10:05:58 fujitsu01 kernel: RIP: 0010:[] [] pick_next_task_fair+0x6b8/0x820
Feb 04 10:05:58 fujitsu01 kernel: RSP: 0018:880fde873de0 EFLAGS: 00010046
Feb 04 10:05:58 fujitsu01 kernel: RAX: 00012279 RBX: 880fdaafd1c0 RCX:
Feb 04 10:05:58 fujitsu01 kernel: RDX: 0001 RSI: 880fdb0d3f28 RDI: 880f23500c98
Feb 04 10:05:58 fujitsu01 kernel: RBP: 880fdb0d3f00 R08: R09: b80a
Feb 04 10:05:58 fujitsu01 kernel: R10: 021f R11: 0008 R12:
Feb 04 10:05:58 fujitsu01 kernel: R13: R14: R15: 88102e512f40
Feb 04 10:05:58 fujitsu01 kernel: FS: () GS:88102e50() knlGS:
Feb 04 10:05:58 fujitsu01 kernel: CS: 0010 DS: ES: CR0: 80050033
Feb 04 10:05:58 fujitsu01 kernel: CR2: 0078 CR3: 000c209f9000 CR4: 003407e0
Feb 04 10:05:58 fujitsu01 kernel: DR0: DR1: DR2:
Feb 04 10:05:58 fujitsu01 kernel: DR3: DR6: fffe0ff0 DR7: 0400
Feb 04 10:05:58 fujitsu01 kernel: Stack:
Feb 04 10:05:58 fujitsu01 kernel: 880f23500c20 0001810a0964 880fde868c20 00012f40
Feb 04 10:05:58 fujitsu01 kernel: 88102e512fb8 8101ca75 880fde869078 880fde868c20
Feb 04 10:05:58 fujitsu01 kernel: 88102e512f40 0004
Feb 04 10:05:58 fujitsu01 kernel: Call Trace:
Feb 04 10:05:58 fujitsu01 kernel: [] ? sched_clock+0x5/0x10
Feb 04 10:05:58 fujitsu01 kernel: [] ? __schedule+0x106/0x6f0
Feb 04 10:05:58 fujitsu01 kernel: [] ? smpboot_thread_fn+0xc6/0x190
Feb 04 10:05:58 fujitsu01 kernel: [] ? SyS_setgroups+0x170/0x170
Feb 04 10:05:58 fujitsu01 kernel: [] ? kthread+0xbd/0xe0
Feb 04 10:05:58 fujitsu01 kernel: [] ? kthread_create_on_node+0x180/0x180
Feb 04 10:05:58 fujitsu01 kernel: [] ? ret_from_fork+0x58/0x90
Feb 04 10:05:58 fujitsu01 kernel: [] ? kthread_create_on_node+0x180/0x180
Feb 04 10:05:58 fujitsu01 kernel: Code: 49 8b 7c 24 78 48 39 fd 74 2f 44 8b 73 68 45 8b 6c 24 68 45 39 ee 0f 8e c7 00 00 00 48 89 ef 48 89 de e8 ac 91 ff ff 48 8b 5b 70 <49> 8b 7c 24 78 48 8b 6b 78 48 39 fd 75 d1 48 85 ed 74 cc 4c 89
Feb 04 10:05:58 fujitsu01 kernel: RIP [] pick_next_task_fair+0x6b8/0x820
```

Please find all Linux kernel messages attached.

There is a similar report on the Linux kernel mailing list (LKML) [1] for Linux 3.10.0.

Please tell me if you need more information, or how I can help to get a fix submitted.

Thanks,

Paul

[1] https://lkml.org/lkml/2016/5/31/990

Feb 03 09:57:13 barberini01 kernel: Initializing cgroup subsys cpuset
Feb 03 09:57:13 barberini01 kernel: Initializing cgroup subsys cpu
Feb 03 09:57:13 barberini01 kernel: Initializing cgroup subsys cpuacct
Feb 03 09:57:13 barberini01 kernel: Linux version 3.16.0-4-amd64 (debian-ker...@lists.debian.org) (gcc version 4.8.4 (Debian 4.8.4-1) ) #1 SMP Debian 3.16.39-1 (2016-12-30)
Feb 03 09:57:13 barberini01 kernel: Command line: BOOT_IMAGE=/boot/vmlinuz-3.16.0-4-amd64 root=UUID=37c4a852-27a8-40da-bfc5-03a047fd8faa ro quiet noisapnp cgroup_enable=memory swapaccount=1
Feb 03 09:57:13 barberini01 kernel: e820: BIOS-provided physical RAM map:
Feb 03 09:57:13 barberini01 kernel: