Bug#898067: Division by zero in microcode loader when booting on EC2 c3.large/r3.large

Ben Hutchings Sun, 06 May 2018 09:18:38 -0700

Package: src:linux
Version: 3.16.56-1
Severity: important
Tags: upstream fixed-upstream patch


The following crash at boot was reported to me by someone who has had
trouble submitting it using reportbug:

[    0.819406] divide error: 0000 [#1] SMP
[    0.821156] Modules linked in:
[    0.822474] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.16.0-6-amd64 #1 Debian 3.16.56-1
[    0.823392] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[    0.823392] task: ffff8803de66f2d0 ti: ffff8803de6c0000 task.ti: ffff8803de6c0000
[    0.823392] RIP: 0010:[<ffffffff81924144>]  [<ffffffff81924144>] init_intel_microcode+0x4b/0x5c
[    0.823392] RSP: 0000:ffff8803de6c3e60  EFLAGS: 00010206
[    0.823392] RAX: 0000000001900000 RBX: ffffffff8181b040 RCX: 0000000000000000
[    0.823392] RDX: 0000000000000000 RSI: 000000000000000e RDI: 0000000000000282
[    0.823392] RBP: ffff8803d07f3180 R08: 000000000000ffff R09: 000000000000ffff
[    0.823392] R10: ffffffff8170378c R11: 000000000000015b R12: ffffffff81923f40
[    0.823392] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[    0.823392] FS:  0000000000000000(0000) GS:ffff8803e0400000(0000) knlGS:0000000000000000
[    0.823392] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.823392] CR2: ffff8803e07ff000 CR3: 0000000001812000 CR4: 0000000000160670
[    0.823392] Stack:
[    0.823392]  ffffffff81923f6f 0000000000000018 ffff8803de6c3ec0 ffff8803de6c3e80
[    0.823392]  ffffea000d59bd28 ffffffff8181b040 ffff8803d07f3180 ffffffff81923f40
[    0.823392]  0000000000000000 0000000000000000 ffffffff8181b040 ffffffff8100214a
[    0.823392] Call Trace:
[    0.823392]  [<ffffffff81923f6f>] ? microcode_init+0x2f/0x1b9
[    0.823392]  [<ffffffff81923f40>] ? mtrr_trim_uncached_memory+0x2b4/0x2b4
[    0.823392]  [<ffffffff8100214a>] ? do_one_initcall+0xda/0x210
[    0.823392]  [<ffffffff81915800>] ? initcall_blacklist+0x6f/0xb2
[    0.823392]  [<ffffffff8108b346>] ? parse_args+0x236/0x4f0
[    0.823392]  [<ffffffff819160d0>] ? kernel_init_freeable+0x189/0x20a
[    0.823392]  [<ffffffff81529170>] ? rest_init+0x80/0x80
[    0.823392]  [<ffffffff8152917a>] ? kernel_init+0xa/0xf0
[    0.823392]  [<ffffffff81539abe>] ? ret_from_fork+0x6e/0xa0
[    0.823392]  [<ffffffff81529170>] ? rest_init+0x80/0x80
[    0.823392] Code: 00 40 74 14 0f b6 31 48 c7 c7 f8 32 71 81 31 c0 e8 7c e3 c0 ff 31 c0 c3 8b 81 90 00 00 00 0f b7 89 a8 00 00 00 31 d2 48 c1 e0 0a <48> f7 f1 89 05 8b e3 12 00 48 c7 c0 e0 d4 82 81 c3 48 c7 c0 40
[    0.823392] RIP  [<ffffffff81924144>] init_intel_microcode+0x4b/0x5c
[    0.823392]  RSP <ffff8803de6c3e60>
[    0.913472] ---[ end trace a991ea625763b5ba ]---

AWS support reproduced this on the c3.large and r3.large instance
types, which have 2 CPUs, but not on instance types with more CPUs.
I couldn't reproduce it with a t2.micro instance (1 CPU).

The crash is apparently in the calc_llc_size_per_core() function
(inlined into init_intel_microcode()), which was added in Linux 4.15
by commit 7e702d17ed13 "x86/microcode/intel: Extend BDW late-loading
further with LLC size check" and backported into 3.16.55.  The
processor last-level-cache size is divided by the number of cores, and
evidently the latter is not detected correctly on these EC2 instance
types.
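
To illustrate the failure mode, here is a minimal userspace sketch of
that calculation with a guard against a zero core count.  This is not
the kernel code: the struct and field names (cpu_info, cache_size_kb,
max_cores) are hypothetical stand-ins, and returning 0 for an unknown
core count is just one possible policy.

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU data the kernel consults. */
struct cpu_info {
	unsigned int cache_size_kb;	/* last-level cache size in KiB */
	unsigned int max_cores;		/* detected core count, may be 0 */
};

/*
 * Compute the LLC size per core in bytes.  The unguarded form of this
 * division traps with a divide error when max_cores is 0, so the core
 * count is checked first (assumed policy: report 0 when unknown).
 */
static uint64_t llc_size_per_core(const struct cpu_info *c)
{
	uint64_t llc_size = (uint64_t)c->cache_size_kb * 1024;

	if (c->max_cores == 0)
		return 0;

	return llc_size / c->max_cores;
}
```

In the register dump above, RAX is 0x1900000 (a 25600 KiB cache
shifted left by 10) and RCX, the divisor, is 0, which matches the
"div" instruction marked in the Code: line.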

The microcode loader ought to be completely disabled on virtual
machines since only the hypervisor can update the microcode.  This was
fixed upstream in Linux 4.10 by commit a15a753539ec
"x86/microcode/AMD: Do not load when running on a hypervisor" and
in Linux 4.9.81, so the same regression has not occurred in stretch
or sid.
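
The idea behind that fix can be sketched as follows.  This is a
simplified illustration, not the upstream patch: the function name and
the disabled_on_cmdline parameter are hypothetical.  It relies only on
the fact that CPUID leaf 1 reports "hypervisor present" in ECX bit 31.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.1:ECX bit 31 is set when running under a hypervisor. */
#define CPUID_1_ECX_HYPERVISOR	(UINT32_C(1) << 31)

/*
 * Decide whether the microcode loader should run at all.
 * cpuid1_ecx is the ECX value returned by CPUID leaf 1;
 * disabled_on_cmdline models an explicit opt-out by the user.
 */
static bool microcode_loader_allowed(uint32_t cpuid1_ecx,
				     bool disabled_on_cmdline)
{
	if (disabled_on_cmdline)
		return false;

	/* A guest cannot update microcode; only the hypervisor can. */
	if (cpuid1_ecx & CPUID_1_ECX_HYPERVISOR)
		return false;

	return true;
}
```

With a check like this in place the loader is never initialised on a
Xen HVM guest, so the division in calc_llc_size_per_core() would not
be reached there.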

I intend to fix this by backporting commit a15a753539ec.

Ben.

-- System Information:
Debian Release: buster/sid
  APT prefers unstable-debug
  APT policy: (500, 'unstable-debug'), (500, 'stable-updates'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.16.0-1-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), LANGUAGE=en_GB.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled
