Aboorva Devarajan <[email protected]> writes: > Four sysfs show() callbacks in hv-gpci take get_cpu_var(hv_gpci_reqb) > (which calls preempt_disable()) but only call the matching put_cpu_var() > on the error path under the 'out:' label. Every successful read leaks > one preempt_disable(): > > processor_bus_topology_show() > processor_config_show() > affinity_domain_via_virtual_processor_show() > affinity_domain_via_domain_show() > > (affinity_domain_via_partition_show() was already correct.) > > On a CONFIG_PREEMPT=y kernel, repeated reads raise preempt_count and > eventually return to userspace with preemption still disabled. The > next user-mode page fault then hits faulthandler_disabled() == 1, > gets forced to SIGSEGV, and the resulting coredump trips > 'BUG: scheduling while atomic' in call_usermodehelper_exec -> > wait_for_completion_state -> schedule: > > BUG: scheduling while atomic: <task>/<pid>/0x00000004 > ... > __schedule_bug+0x6c/0x90 > __schedule+0x58c/0x13a0 > schedule+0x48/0x1a0 > schedule_timeout+0x104/0x170 > wait_for_completion_state+0x16c/0x330 > call_usermodehelper_exec+0x254/0x2d0 > vfs_coredump+0x1050/0x2590 > get_signal+0xb9c/0xc80 > do_notify_resume+0xf8/0x470 > > Add an out_success label that calls put_cpu_var() before returning > the byte count, mirroring affinity_domain_via_partition_show().
Yup, that seems like the right thing to do; otherwise we leak preempt
count.

I guess these are the kind of issues that might start showing up now
that powerpc has moved to CONFIG_PREEMPTION=y with lazy preempt as the
default preemption mode. It would be nice to have some mechanism to
catch error handling paths where preempt counts might be getting leaked.

Nice catch! As for this patch, it looks good to me.

Reviewed-by: Ritesh Harjani (IBM) <[email protected]>
