Package: src:linux
Version: 3.16.56-1
Severity: important
Tags: upstream fixed-upstream patch

The following crash at boot was reported to me by someone who has had
trouble submitting it using reportbug:

[    0.819406] divide error: 0000 [#1] SMP
[    0.821156] Modules linked in:
[    0.822474] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.16.0-6-amd64 #1 Debian 3.16.56-1
[    0.823392] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[    0.823392] task: ffff8803de66f2d0 ti: ffff8803de6c0000 task.ti: ffff8803de6c0000
[    0.823392] RIP: 0010:[<ffffffff81924144>]  [<ffffffff81924144>] init_intel_microcode+0x4b/0x5c
[    0.823392] RSP: 0000:ffff8803de6c3e60  EFLAGS: 00010206
[    0.823392] RAX: 0000000001900000 RBX: ffffffff8181b040 RCX: 0000000000000000
[    0.823392] RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000282
[    0.823392] RBP: ffff8803d07f3180 R08: 000000000000ffff R09: 000000000000ffff
[    0.823392] R10: ffffffff8170378c R11: 000000000000015b R12: ffffffff81923f40
[    0.823392] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    0.823392] FS:  0000000000000000(0000) GS:ffff8803e0400000(0000) knlGS:0000000000000000
[    0.823392] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.823392] CR2: ffff8803e07ff000 CR3: 0000000001812000 CR4: 0000000000160670
[    0.823392] Stack:
[    0.823392]  ffffffff81923f6f 0000000000000018 ffff8803de6c3ec0 ffff8803de6c3e80
[    0.823392]  ffffea000d59bd28 ffffffff8181b040 ffff8803d07f3180 ffffffff81923f40
[    0.823392]  0000000000000000 0000000000000000 ffffffff8181b040 ffffffff8100214a
[    0.823392] Call Trace:
[    0.823392]  [<ffffffff81923f6f>] ? microcode_init+0x2f/0x1b9
[    0.823392]  [<ffffffff81923f40>] ? mtrr_trim_uncached_memory+0x2b4/0x2b4
[    0.823392]  [<ffffffff8100214a>] ? do_one_initcall+0xda/0x210
[    0.823392]  [<ffffffff81915800>] ? initcall_blacklist+0x6f/0xb2
[    0.823392]  [<ffffffff8108b346>] ? parse_args+0x236/0x4f0
[    0.823392]  [<ffffffff819160d0>] ? kernel_init_freeable+0x189/0x20a
[    0.823392]  [<ffffffff81529170>] ? rest_init+0x80/0x80
[    0.823392]  [<ffffffff8152917a>] ? kernel_init+0xa/0xf0
[    0.823392]  [<ffffffff81539abe>] ? ret_from_fork+0x6e/0xa0
[    0.823392]  [<ffffffff81529170>] ? rest_init+0x80/0x80
[    0.823392] Code: 00 40 74 14 0f b6 31 48 c7 c7 f8 32 71 81 31 c0 e8 7c e3 c0 ff 31 c0 c3 8b 81 90 00 00 00 0f b7 89 a8 00 00 00 31 d2 48 c1 e0 0a <48> f7 f1 89 05 8b e3 12 00 48 c7 c0 e0 d4 82 81 c3 48 c7 c0 40
[    0.823392] RIP  [<ffffffff81924144>] init_intel_microcode+0x4b/0x5c
[    0.823392]  RSP <ffff8803de6c3e60>
[    0.913472] ---[ end trace a991ea625763b5ba ]---

AWS support reproduced this on the c3.large and r3.large instance
types, which have 2 CPUs, but not on instances with more CPUs.
I couldn't reproduce it with a t2.micro instance (1 CPU).

The crash is apparently in the calc_llc_size_per_core() function
(inlined into init_intel_microcode()), which was added in Linux 4.15
by commit 7e702d17ed13 "x86/microcode/intel: Extend BDW late-loading
further with LLC size check" and backported into 3.16.55. The
processor's last-level cache size is divided by the number of cores,
and evidently the latter is not detected correctly (it appears to be
zero) on these EC2 instance types.

The microcode loader ought to be completely disabled on virtual
machines, since only the hypervisor can update the microcode. This
was fixed upstream in Linux 4.10 by commit a15a753539ec
"x86/microcode/AMD: Do not load when running on a hypervisor" and in
Linux 4.9.81, so the same regression has not occurred in stretch or
sid.

I intend to fix this by backporting commit a15a753539ec.

Ben.

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.16.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
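
Illustration of the divide error described above. The register dump
seems consistent with a zero core count: the faulting instruction is
a div by RCX, RCX is 0, and RAX holds 0x1900000 (a 25 MiB cache size
in bytes). The following is only a user-space sketch of that
arithmetic with a guard added, not the kernel source and not the
intended fix (which is the backport of a15a753539ec). The struct
cpu_info type, its field names and the function llc_size_per_core()
are hypothetical stand-ins.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two CPU fields involved in the crash. */
struct cpu_info {
    uint32_t cache_size_kb;  /* last-level cache size in KiB */
    uint16_t max_cores;      /* cores per package; 0 if not detected */
};

/*
 * Return the last-level cache size in bytes per core, or 0 when the
 * core count is unknown. Without the check, dividing by a zero core
 * count raises a divide error like the one in the oops.
 */
static uint64_t llc_size_per_core(const struct cpu_info *c)
{
    uint64_t llc_size = (uint64_t)c->cache_size_kb * 1024;

    if (c->max_cores == 0)
        return 0;

    return llc_size / c->max_cores;
}
```

With cache_size_kb = 25600 and max_cores = 0, as the registers
suggest, the guarded version returns 0 instead of faulting.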