When hibernating with data center dGPUs, a large amount of VRAM data is
moved to shmem during dev_pm_ops.prepare(). These shmem pages take up so
much system memory that there is not enough free memory left to create
the hibernation image. This causes hibernation to fail and abort.

After dev_pm_ops.prepare(), call shrink_all_memory() to force the shmem
pages out to swap and reclaim them, so that there is enough system
memory for the hibernation image and fewer pages need to be copied into
it.

This patch can only flush and free about half of the shmem pages. It
would be better to flush and free more of them, or even all of them, so
that fewer pages have to be copied to the hibernation image and the
overall hibernation time is reduced.

Signed-off-by: Samuel Zhang <guoqing.zh...@amd.com>
---
 kernel/power/hibernate.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 10a01af63a80..913a298c1d01 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -370,6 +370,17 @@ static int create_image(int platform_mode)
 	return error;
 }
 
+static void shrink_shmem_memory(void)
+{
+	struct sysinfo info;
+	unsigned long pages, freed;
+
+	si_meminfo(&info);
+	pages = info.sharedram;
+	freed = shrink_all_memory(pages);
+	pr_debug("requested to reclaim %lu pages, freed %lu pages\n", pages, freed);
+}
+
 /**
  * hibernation_snapshot - Quiesce devices and create a hibernation image.
  * @platform_mode: If set, use platform driver to prepare for the transition.
@@ -411,6 +422,8 @@ int hibernation_snapshot(int platform_mode)
 		goto Thaw;
 	}
 
+	shrink_shmem_memory();
+
 	suspend_console();
 	pm_restrict_gfp_mask();
 
-- 
2.43.5