Mikulas Patocka <[email protected]>:
https://lore.kernel.org/linux-pm/[email protected]/ :
> Askar Safin requires swap and hibernation on the dm-integrity device mapper
> target because he needs to protect his data.
>
> This hits two problems:
> 1. The kernel doesn't send the flush bio to the hibernation device after
>    writing the image and before powering off - this is easy to fix
> 2. The dm-integrity target keeps parts of the device in-memory - it keeps
>    a journal and a dm-bufio cache in memory. If we hibernate and resume,
>    the content of memory no longer matches the data on the hibernate
>    partition and that may cause spurious errors - this is hard to fix

Let me add some more info on this patchset.

First of all, I have already solved the problem for myself: I wrote a
hackish patch which fixes it. The patch is tested on my real hardware
under load. I have been using it successfully for 2 weeks (I hibernated
many times during this period). The patch is absolutely rock solid, and
I am absolutely sure it is correct. Unfortunately, it is not generic: it
is tied to my particular configuration, it hard-codes paths (!!!), and
hence it is not upstreamable. Here is the patch for your information:
https://zerobin.net/?ad6142bd67df015a#68Az6yBUxHA3AXB7jY1+clSRnR745olFHAByxwPGM08=
Feel free to use code from it.

So I personally am not in a hurry: I already have a solution which works
for me. (But I am still available for testing.)

Your patch has a problem: after the "notify_swap_device" call, pages can
still be swapped out. It is the "pm_restrict_gfp_mask" call in
"hibernation_snapshot" that prevents further swapping. Thus
"notify_swap_device" should be called after "pm_restrict_gfp_mask" (but
read on).

I attempted to create a test case which would expose this problem, and I
was unable to do so. Still, I believe this is a real problem.

Also, our problem is very similar to the reason for introducing
"filesystems_freeze"
( https://elixir.bootlin.com/linux/v6.18-rc7/source/kernel/power/hibernate.c#L824 ).
See the problem description here:
https://lore.kernel.org/all/0a76e074ef262ca857c61175dd3d0dc06b67ec42.ca...@hansenpartnership.com/ .
See also https://lwn.net/Articles/1018341/ .
(See also this huge thread: https://lore.kernel.org/all/[email protected]/ .)

The "filesystems_freeze" logic is already implemented in mainline. It is
gated behind /sys/power/freeze_filesystems .

As you can see, the authors of "filesystems_freeze" attempted to solve a
similar problem. Thus, we should probably flush buffers in the
"filesystems_freeze" call. Ideally, flushing of dm-integrity should be
correctly ordered with freezing of filesystems to support complex
storage hierarchies (i.e. swap on dm-integrity on a loop device on some
filesystem, etc).

But... the call to "filesystems_freeze" happens before the
"pm_restrict_gfp_mask" call.

So... at this point I gave up, and I don't know what to do (i.e. what
the upstream kernel should do).

Feel free to ask any questions.

I changed CCs for further exposure.

--
Askar Safin
