On 9/22/23 11:28, Valentin Schneider wrote:
On 21/09/23 17:59, Eric DeVolder wrote:The design decision to use the atomic lock is described in the comment from kexec_internal.h, cited above. However, examining the code of __crash_kexec(): if (kexec_trylock()) { if (kexec_crash_image) { ... } kexec_unlock(); } reveals that the use of kexec_trylock() here is actually a "best effort" due to the atomic lock. This atomic lock, prior to crash hotplug, would almost always be assured (another kexec syscall could hold the lock and prevent this, but that is about it). So at the point where the capture kernel would be invoked, if the lock is not obtained, then kdump doesn't occur. It is possible to instead use a mutex with proper waiting, and utilize mutex_trylock() as the "best effort" in __crash_kexec(). The use of a mutex then avoids all the lock acquisition problems that were revealed by the crash hotplug activity.@Dave thanks for the Cc, I'd have missed this otherwise. Prior to the atomic thingie, we actually had a mutex and did mutex_trylock() in __crash_kexec(). I'm a bit confused as this looks like a revert of 05c6257433b7 ("panic, kexec: make __crash_kexec() NMI safe") with just the helpers kept in - this doesn't seem to address any of the original issues regarding NMIs? Sebastian raised some good points in [1] regarding these issues. The main hurdle pointed out there is, if we end up in the slowpath during the unlock, then we can can up acquiring the ->wait_lock which isn't NMI safe. This is even worse on PREEMPT_RT, as both trylock and the unlock can end up acquiring the ->wait_lock. [1]: https://lore.kernel.org/all/yqyz%[email protected]/
Having reviewed the references, it would seem that Baoquan's approach of a new lock to handle the hotplug activity is the way to go? Eric _______________________________________________ kexec mailing list [email protected] http://lists.infradead.org/mailman/listinfo/kexec
