Josh,

On Thu, 17 Mar 2016, Josh Boyer wrote:
> We've had a report [1] of the mainline kernel crashing on a single-cpu
> QEMU machine (not kvm) in Fedora.  It looks as if the emulated machine
> is failing to provide a TSC and the calibrate_delay_is_known function
> is passing NULL to cpumask_any_but for the mask parameter.  At least
> that's all I've been able to discern thus far.
> 
> I was wondering if you had any insight into this issue, given your
> recent commit to change calibrate_delay_is_known to use
> topology_core_cpumask.  The backtrace is below.

> at           (null)
> [    0.010000] IP: [<ffffffff814698b5>] _find_next_bit.part.0+0x15/0x70
> [    0.010000] PGD 0
>
> [    0.010000] RSP: 0000:ffffffff81e03e40  EFLAGS: 00000246
> [    0.010000] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> [    0.010000] RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
> [    0.010000] RBP: ffffffff81e03e50 R08: ffffffffffffffff R09: 0000000000000000
> [    0.010000] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
> [    0.010000] R13: ffffffff82248960 R14: ffffffff822562e0 R15: 0000000000000000
> [    0.010000] FS:  0000000000000000(0000) GS:ffff88001ee00000(0000) knlGS:0000000000000000
> [    0.010000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.010000] CR2: 0000000000000000 CR3: 0000000001e06000 CR4: 00000000000006b0
> [    0.010000] Stack:
> [    0.010000]  ffffffff81e03e50 ffffffff81469928 ffffffff81e03e70 ffffffff81453d56
> [    0.010000]  0000000000000000 ffff88001f3fa780 ffffffff81e03e80 ffffffff81040495
> [    0.010000]  ffffffff81e03f40 ffffffff8100285a ffffffff810eefb3 ffffffff00000000
> [    0.010000] Call Trace:
> [    0.010000]  [<ffffffff81469928>] ? find_next_bit+0x18/0x20
> [    0.010000]  [<ffffffff81453d56>] cpumask_any_but+0x26/0x50

Yuck. That requires that topology_core_cpumask(cpu) is NULL.

#define topology_core_cpumask(cpu)        (per_cpu(cpu_core_map, cpu))

...

DEFINE_PER_CPU_READ_MOSTLY(cpumask_var_t, cpu_core_map);

So that can only result in a NULL pointer if you have CONFIG_CPUMASK_OFFSTACK
enabled and the allocation fails, which is not checked !?@!
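
To illustrate the failure mode, here is a rough userspace sketch, not the
kernel code. It assumes the OFFSTACK case, where the per-cpu variable is just a
pointer to a separately allocated bitmap. The names mask_var_t, core_map,
alloc_mask and mask_any_but are made up for the example, and the NULL guard in
mask_any_but is exactly what the oops suggests is missing on the real path:

```c
#include <assert.h>
#include <stdlib.h>

#define NR_CPUS 4

/* Hypothetical stand-in for cpumask_var_t with CONFIG_CPUMASK_OFFSTACK=y:
 * the variable is only a pointer, the bitmap itself is allocated later. */
typedef unsigned long *mask_var_t;

/* Models the per-cpu cpu_core_map; every entry starts out NULL. */
static mask_var_t core_map[NR_CPUS];

/* Models the allocation step. 'fail' simulates an out-of-memory condition.
 * Returns 1 on success, 0 on failure, so the caller can check the result. */
static int alloc_mask(mask_var_t *m, int fail)
{
    assert(m != NULL);
    *m = fail ? NULL : calloc(1, sizeof(unsigned long));
    return *m != NULL;
}

/* Returns the first CPU set in 'mask' other than 'skip', NR_CPUS if there is
 * none, or -1 if the mask was never allocated. Without the NULL check this
 * would dereference address 0, which matches CR2 in the oops above. */
static int mask_any_but(const unsigned long *mask, int skip)
{
    if (!mask)
        return -1;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpu != skip && (*mask & (1UL << cpu)))
            return cpu;
    }
    return NR_CPUS;
}
```

Calling mask_any_but(core_map[0], 0) before any allocation returns -1 instead
of crashing, and a failed alloc_mask() leaves the pointer NULL just like an
unchecked allocation would.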

I tried to reproduce with Richard's script, but so far no dice. Can you please
provide your kernel config?

Thanks,

        tglx
