When splitting a huge page, we should mark all the resulting small-page PTEs dirty if the original huge PMD had the dirty bit set. Otherwise the original dirty bit is lost across the split.
CC: Andrea Arcangeli <[email protected]> CC: Andrew Morton <[email protected]> CC: "Kirill A. Shutemov" <[email protected]> CC: Michal Hocko <[email protected]> CC: Zi Yan <[email protected]> CC: Huang Ying <[email protected]> CC: Dan Williams <[email protected]> CC: Naoya Horiguchi <[email protected]> CC: "Jérôme Glisse" <[email protected]> CC: "Aneesh Kumar K.V" <[email protected]> CC: Konstantin Khlebnikov <[email protected]> CC: Souptick Joarder <[email protected]> CC: [email protected] CC: [email protected] Signed-off-by: Peter Xu <[email protected]> --- To the reviewers: I'm new to the mm world so sorry if this patch is making silly mistakes, however it did solve a problem for me when testing with a customized Linux tree mostly based on Andrea's userfault write-protect work. Without the change, my customized QEMU/tcg tree will not be able to do correct UFFDIO_WRITEPROTECT and then QEMU will get a SIGBUS when faulting multiple times. With the change (or of course disabling THP) then UFFDIO_WRITEPROTECT will be able to correctly resolve the write protections then it runs well. Any comment would be welcomed. TIA. --- mm/huge_memory.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/huge_memory.c b/mm/huge_memory.c index c3bc7e9c9a2a..0754a16923d5 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2176,6 +2176,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd, entry = pte_mkold(entry); if (soft_dirty) entry = pte_mksoft_dirty(entry); + if (dirty) + entry = pte_mkdirty(entry); } pte = pte_offset_map(&_pmd, addr); BUG_ON(!pte_none(*pte)); -- 2.17.1
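For reviewers less familiar with the split path, here is a minimal user-space sketch of the invariant the patch restores. It is an analogy only, not kernel code: the names (HUGE_NR, entry_t, F_DIRTY, F_SOFTDIRTY, split_huge) are invented for illustration. It models one huge entry being split into 512 small entries and checks that per-entry flags such as dirty survive the split, which is what the added pte_mkdirty() call guarantees for the real PTEs:

/*
 * User-space analogy of the fix (NOT kernel code; all names are
 * invented): when a "huge" entry is split into HUGE_NR "small"
 * entries, every flag carried by the huge entry must be copied to
 * each small entry, or that information is silently lost.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define HUGE_NR     512         /* small entries per huge entry (2M/4K on x86-64) */
#define F_DIRTY     (1u << 0)   /* stands in for the dirty bit */
#define F_SOFTDIRTY (1u << 1)   /* stands in for the soft-dirty bit */

typedef uint32_t entry_t;

/* Split one huge entry into HUGE_NR small ones, carrying flags over. */
static void split_huge(entry_t huge, entry_t small[HUGE_NR])
{
	for (int i = 0; i < HUGE_NR; i++) {
		entry_t e = 0;
		if (huge & F_SOFTDIRTY)
			e |= F_SOFTDIRTY;  /* the kernel already did this */
		if (huge & F_DIRTY)
			e |= F_DIRTY;      /* the analogue of the two added lines */
		small[i] = e;
	}
}

int main(void)
{
	entry_t small[HUGE_NR];

	split_huge(F_DIRTY, small);
	for (int i = 0; i < HUGE_NR; i++)
		assert(small[i] & F_DIRTY);  /* dirty bit must survive the split */
	puts("dirty bit propagated to all small entries");
	return 0;
}

Built with a plain cc, the assertion passes only because split_huge() copies F_DIRTY; removing that branch reproduces the information loss the patch fixes for the huge-PMD case.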

