On 17/07/14 18:36, Bruno Wolff III wrote:
I did a few quick boots this morning while taking a bunch of pictures. I have
gone through some of them and found one that shows the BUG_ON was triggered
at 5850, which is from:
BUG_ON(!cpumask_empty(sched_group_cpus(sg)));

You can see the JPEG at:
https://bugzilla.kernel.org/attachment.cgi?id=143331


Many thanks for testing this, Bruno!

So the memory of the cpumask of some sched_group(s) in your system has been altered between __visit_domain_allocation_hell()->__sdt_alloc() and build_sched_groups().

In the meantime, PeterZ has posted a patch which barfs when this happens and prints out the sched groups with their related cpus. It also includes the cpumask_clear(), so your machine should still boot fine.

If you could apply the patch:

https://lkml.org/lkml/2014/7/17/288

and then run it on your machine, that would give us more details, i.e. the information on which sched_group(s) and in which sched domain level (SMT and/or DIE) this issue occurs.


Another thing you could do is boot a build containing the cpumask_clear() in build_sched_groups() with 'earlyprintk=keep sched_debug' added to your kernel command line, and then extract the dmesg output of the scheduler-setup code:

Example:

[    0.119737] CPU0 attaching sched-domain:
[    0.119740]  domain 0: span 0-1 level SIBLING
[    0.119742]   groups: 0 (cpu_power = 588) 1 (cpu_power = 588)
[    0.119745]   domain 1: span 0-3 level MC
[    0.119747]    groups: 0-1 (cpu_power = 1176) 2-3 (cpu_power = 1176)
[    0.119751] CPU1 attaching sched-domain:
[    0.119752]  domain 0: span 0-1 level SIBLING
[    0.119753]   groups: 1 (cpu_power = 588) 0 (cpu_power = 588)
[    0.119756]   domain 1: span 0-3 level MC
[    0.119757]    groups: 0-1 (cpu_power = 1176) 2-3 (cpu_power = 1176)
[    0.119759] CPU2 attaching sched-domain:
[    0.119760]  domain 0: span 2-3 level SIBLING
[    0.119761]   groups: 2 (cpu_power = 588) 3 (cpu_power = 588)
[    0.119764]   domain 1: span 0-3 level MC
[    0.119765]    groups: 2-3 (cpu_power = 1176) 0-1 (cpu_power = 1176)
[    0.119767] CPU3 attaching sched-domain:
[    0.119768]  domain 0: span 2-3 level SIBLING
[    0.119769]   groups: 3 (cpu_power = 588) 2 (cpu_power = 588)
[    0.119772]   domain 1: span 0-3 level MC
[    0.119773]    groups: 2-3 (cpu_power = 1176) 0-1 (cpu_power = 1176)
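
If it helps when going through the output, here is a rough Python sketch (not part of any patch) that parses lines in the format of the example above and flags domains whose groups do not exactly cover the span. The function names are made up for illustration, and it assumes spans and groups are printed as cpu lists like "0-3". It also assumes non-overlapping levels such as SIBLING and MC as shown above; levels with intentionally overlapping groups would be flagged too.

```python
import re

# Patterns for the three kinds of sched_debug lines shown in the example.
CPU_RE = re.compile(r"CPU(\d+) attaching sched-domain:")
DOMAIN_RE = re.compile(r"domain (\d+): span (\S+) level (\w+)")
GROUP_RE = re.compile(r"(\S+) \(cpu_power = (\d+)\)")


def parse_cpulist(text):
    """Expand a cpu list such as '0-1,4' into a set of ints."""
    cpus = set()
    for part in text.split(","):
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def parse_sched_domains(lines):
    """Return {cpu: [{'level', 'span', 'groups'}, ...]} from dmesg lines."""
    result = {}
    domains = None
    for line in lines:
        m = CPU_RE.search(line)
        if m:
            domains = result.setdefault(int(m.group(1)), [])
            continue
        m = DOMAIN_RE.search(line)
        if m and domains is not None:
            domains.append({"level": m.group(3),
                            "span": parse_cpulist(m.group(2)),
                            "groups": []})
            continue
        if "groups:" in line and domains:
            text = line.split("groups:", 1)[1]
            domains[-1]["groups"] = [parse_cpulist(g)
                                     for g, _ in GROUP_RE.findall(text)]
    return result


def find_bad_domains(parsed):
    """List (cpu, level) pairs whose groups do not exactly cover the span."""
    bad = []
    for cpu, domains in sorted(parsed.items()):
        for d in domains:
            # Flatten all group members; duplicates or gaps mean a mismatch.
            members = [c for g in d["groups"] for c in g]
            if sorted(members) != sorted(d["span"]):
                bad.append((cpu, d["level"]))
    return bad
```

Feeding it the saved dmesg lines, e.g. find_bad_domains(parse_sched_domains(open("dmesg.txt"))), should return an empty list for output like the example; any (cpu, level) pair it returns would be worth a closer look.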

