On Mon, Nov 10, 2025 at 06:54:57PM +0530, Dev Jain wrote: > > On 10/11/25 5:31 pm, Lorenzo Stoakes wrote: > > > > ofc you are addressing this with the !cc->is_khugepaged, but feels like > > we'd be > > better off just doing it in madvise_collapse(). > > I suppose the common case is that writeback is not needed, and I don't know > what > is the overhead of filemap_write_and_wait_range() in that case, but your
Low. > suggestion will force that unconditional overhead and skip all the early exits > possible in hpage_collapse_scan_file()? Which are? PMD-mapped folio at start of range, scan abort, non-LRU, spurious ref count I don't think this matters. And we're trading for putting _yet more_ logic that belongs elsewhere in the wrong place. I mean I'd actually be pretty good with us putting it literally in madvise.c, but since we defer the collapse to khugepaged.c then madvise_collapse(). > > > I wonder also if doing I/O without getting the mmap lock again and > > revalidating > > is wise, as the state of things might have changed significantly. > > > > So maybe need a hugepage_vma_revalidate() as well? > > The file collapse path (apart from collapse_pte_mapped_thp) is solely > concerned > about doing the collapse in the page cache, for which information about the > mm or > the vma is not needed, that is why no locking is required. The get_file() in The user has asked specifically to collapse pages in a VMA's range. Yes you can go check the mapping of a pinned file which you're keeping pinned during this operation (wise? Not sure it is). This would be the first time in this operation we are doing a _synchronous_ I/O operation where we sleep holding a pin. So no, I think we really do need to revalidate. 'Collapse some random file we no longer map at this address' is probably not great semantics, also of course, we are revalidating at each PMD anyway. Maybe even do a addr -= HPAGE_PMD_SIZE; continue + take it from the top? Maybe David has thoughts... > khugepaged_scan_mm_slot() seems to be serving the same function as > maybe_unlock_mmap_for_io(). > So I don't think this is needed? We're talking about the MADV_COLLAPSE case so don't understand this? I may be missing something here (happens a lot ;)! > > > > > > > > + > > > result = alloc_charge_folio(&new_folio, mm, cc); > > > if (result != SCAN_SUCCEED) > > > goto out; > > > > > > base-commit: e9a6fb0bcdd7609be6969112f3fbfcce3b1d4a7c > > > -- > > > 2.43.0 > > > > > Thanks, Lorenzo Cheers, Lorenzo
