On Tue, Jun 03, 2025 at 06:25:57PM +0200, David Hildenbrand wrote:
> On 03.06.25 17:59, Jiri Bohac wrote:
> I'd phrase it more like "Pages residing in CMA areas can usually not get
> long-term pinned, so long-term pinning is typically not a concern. BUGs in
> the kernel might still lead to long-term pinning of such pages if everything
> goes wrong."
...
> > If you want, I have no problem changing this to:
> >
> > +	mdelay(cma_dma_timeout_sec * 1000);
>
> Probably good enough. Or just hard-code 10s and call it a day. :)

Thanks for your comments, David. This would be the v5 of this patch:

Subject: [PATCH v5 4/5] kdump: wait for DMA to finish when using CMA

When re-using the CMA area for kdump there is a risk of pending DMA
into pinned user pages in the CMA area.

Pages residing in CMA areas can usually not get long-term pinned and
are instead migrated away from the CMA area, so long-term pinning is
typically not a concern. (BUGs in the kernel might still lead to
long-term pinning of such pages if everything goes wrong.)

Pages pinned without FOLL_LONGTERM remain in the CMA and may possibly
be the source or destination of a pending DMA transfer.

Although there is no clear specification of how long a page may be
pinned without FOLL_LONGTERM, pinning without the flag shows an intent
of the caller to only use the memory for short-lived DMA transfers, not
a transfer initiated by a device asynchronously at a random time in the
future.

Add a delay of CMA_DMA_TIMEOUT_SEC seconds before starting the kdump
kernel, giving such short-lived DMA transfers time to finish before the
CMA memory is re-used by the kdump kernel.

Set CMA_DMA_TIMEOUT_SEC to 10 seconds - chosen arbitrarily as a huge
margin for a DMA transfer that still does not increase the kdump time
too significantly.

Signed-off-by: Jiri Bohac <jbo...@suse.cz>
---
Changes since v4:
- reworded the paragraph about long-term pinning
- simplified crash_cma_clear_pending_dma()

---
Changes since v3:
- renamed CMA_DMA_TIMEOUT_MSEC to CMA_DMA_TIMEOUT_SEC, change delay to
  10 seconds
- introduce a cma_dma_timeout_sec initialized to CMA_DMA_TIMEOUT_SEC
  to make the timeout trivially tunable if needed in the future

---
 include/linux/crash_core.h |  3 +++
 kernel/crash_core.c        | 15 +++++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 44305336314e..805a07042c96 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -56,6 +56,9 @@ static inline unsigned int crash_get_elfcorehdr_size(void) { return 0; }
 /* Alignment required for elf header segment */
 #define ELF_CORE_HEADER_ALIGN   4096
 
+/* Default value for cma_dma_timeout_sec */
+#define CMA_DMA_TIMEOUT_SEC 10
+
 extern int crash_exclude_mem_range(struct crash_mem *mem,
 		unsigned long long mstart, unsigned long long mend);
 
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 335b8425dd4b..540fd75a4a0d 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -21,6 +21,7 @@
 #include <linux/reboot.h>
 #include <linux/btf.h>
 #include <linux/objtool.h>
+#include <linux/delay.h>
 
 #include <asm/page.h>
 #include <asm/sections.h>
@@ -33,6 +34,11 @@
 /* Per cpu memory for storing cpu states in case of system crash. */
 note_buf_t __percpu *crash_notes;
 
+/* time to wait for possible DMA to finish before starting the kdump kernel
+ * when a CMA reservation is used
+ */
+unsigned int cma_dma_timeout_sec = CMA_DMA_TIMEOUT_SEC;
+
 #ifdef CONFIG_CRASH_DUMP
 
 int kimage_crash_copy_vmcoreinfo(struct kimage *image)
@@ -97,6 +103,14 @@ int kexec_crash_loaded(void)
 }
 EXPORT_SYMBOL_GPL(kexec_crash_loaded);
 
+static void crash_cma_clear_pending_dma(void)
+{
+	if (!crashk_cma_cnt)
+		return;
+
+	mdelay(cma_dma_timeout_sec * 1000);
+}
+
 /*
  * No panic_cpu check version of crash_kexec().  This function is called
  * only when panic_cpu holds the current CPU number; this is the only CPU
@@ -119,6 +133,7 @@ void __noclone __crash_kexec(struct pt_regs *regs)
 			crash_setup_regs(&fixed_regs, regs);
 			crash_save_vmcoreinfo();
 			machine_crash_shutdown(&fixed_regs);
+			crash_cma_clear_pending_dma();
 			machine_kexec(kexec_crash_image);
 		}
 		kexec_unlock();
-- 
Jiri Bohac <jbo...@suse.cz>
SUSE Labs, Prague, Czechia
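
For readers following the thread, the gating logic of
crash_cma_clear_pending_dma() can be modelled in userspace. The sketch
below is not part of the patch: crash_cma_delay_ms() is a hypothetical
helper that returns the delay in milliseconds instead of calling
mdelay(), the cma_cnt parameter stands in for crashk_cma_cnt, and the
widening to 64 bits before the multiplication is an assumption added
for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Same default as in the patch; everything else here is a userspace model. */
#define CMA_DMA_TIMEOUT_SEC 10

static_assert(CMA_DMA_TIMEOUT_SEC > 0, "default timeout must be non-zero");

/*
 * Hypothetical model of crash_cma_clear_pending_dma(): report how many
 * milliseconds the crash path would busy-wait. The kernel function calls
 * mdelay() with this value instead of returning it.
 */
static uint64_t crash_cma_delay_ms(unsigned int cma_cnt,
				   unsigned int timeout_sec)
{
	/* No CMA crash ranges in use: skip the wait entirely. */
	if (!cma_cnt)
		return 0;

	/* Assumption: widen first so a large tunable cannot wrap. */
	return (uint64_t)timeout_sec * 1000;
}
```

Calling crash_cma_delay_ms(0, CMA_DMA_TIMEOUT_SEC) models a kdump setup
without a CMA reservation, where no delay is added; any non-zero count
yields the full timeout converted to milliseconds.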