Re: [PATCH v4 00/25] Page folios
On Fri, Mar 05, 2021 at 04:18:36AM +, Matthew Wilcox (Oracle) wrote: > Our type system does not currently distinguish between tail pages and > head or single pages. This is a problem because we call compound_head() > multiple times (and the compiler cannot optimise it out), bloating the > kernel. It also makes programming hard as it is often unclear whether > a function operates on an individual page, or an entire compound page. I've pushed a new version out here: https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio I think it takes into account everyone's comments so far. It even compiles.
Re: [PATCH v4 00/25] Page folios
On Mon, Mar 15, 2021 at 07:40:14PM +, Matthew Wilcox wrote: > The reason I didn't go with 'head' is that traditionally 'head' implies > that there are tail pages. It would be weird to ask 'if (HeadHead(head))' > That's currently spelled 'if (FolioMulti(folio))'. But it can be changed > if there's a really better alternative. It'll make me more grumpy if > somebody comes up with a really good alternative in six months. The folio name keeps growing on me. Still not perfect, but I do like that it is: a) short and b) very different from page
Re: [PATCH v4 00/25] Page folios
On Mon, Mar 15, 2021 at 07:40:14PM +, Matthew Wilcox wrote: > I would agree that the conversion is both straightforward and noisy. > There are some minor things that crop up, like noticing that we get > the accounting wrong for writeback of compound pages. That's not > entirely unexpected since no filesystem supports both compound pages > and writeback today. And this is where the abstraction that the "folio" introduces is required - filesystem people don't want to have to deal with all the complexity and subtlety of compound pages when large page support is added to the page cache. As Willy says, this will be a never-ending source of data corruption bugs Hence from the filesystem side of things, I think this abstraction is absolutely necessary. Especially because handling buffered IO in 4kB pages really, really sucks from a performance persepctive and the sooner we have native high-order page support in the page cache the better. These days buffered IO often can't even max out a cheap NVMe SSD because of the CPU cost of per-page state management in the page cache. So the sooner we have large page support to mitigate the worst of the overhead for streaming buffered IO, the better. FWIW, like others, I'm not a fan of "folio" as a name, but I can live with it because it so jarringly different to "pages" that we're not going to confuse the two of them. But if we want a better name, my suggestion would be for a "struct cage" as in Compound pAGE Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: [PATCH v4 00/25] Page folios
Michal Hocko writes: > bikeshedding) because it hasn't really resonated with the udnerlying > concept. Maybe just me as a non native speaker... page_head would have > been so much more straightforward but not something I really care > about. Yes. page_head explains exactly what it is. But Folio? Huh? -Andi
Re: [PATCH v4 00/25] Page folios
On Mon, Mar 15, 2021 at 07:09:04PM +, Christoph Hellwig wrote: > On Mon, Mar 15, 2021 at 01:38:04PM +0100, Michal Hocko wrote: > > I tend to agree here as well. The level compoud_head has spread out > > silently is just too large. There are people coming up with all sorts of > > optimizations to workaround that, and they are quite right that this is > > somehing worth doing, but last attempts I have seen were very focused on > > specific page flags handling which is imho worse wrt maintainability > > than a higher level and type safe abstraction. I find it quite nice that > > this doesn't really have to be a flag day conversion but it can be done > > incrementally. > > > > I didn't get review the series yet and I cannot really promise anything > > but from what I understand the conversion should be pretty > > straightforward, albeit noisy. > > > > One thing that was really strange to me when seeing the concept for the > > first time was the choice of naming (no I do not want to start any > > bikeshedding) because it hasn't really resonated with the udnerlying > > concept. Maybe just me as a non native speaker... page_head would have > > been so much more straightforward but not something I really care about. > > That pretty much summarizes my opinion as well. I'll need to find some > time to review the series as well. If it's easier for you, I'm trying to keep https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio up to date. Not all of those 111 patches are suitable for upstreaming, but it might give you a better idea of where I'm going than if I only posted the first 70-80 of them. Stopping at "mm/memory: Use a folio in copy_pte_range()" nets us almost 10kb of text reduction for the UEK-derived config, about 3.3kb on an allnoconfig (which is a little over 0.1% on a 2.4MB kernel). The reason I didn't go with 'head' is that traditionally 'head' implies that there are tail pages. It would be weird to ask 'if (HeadHead(head))' That's currently spelled 'if (FolioMulti(folio))'. But it can be changed if there's a really better alternative. It'll make me more grumpy if somebody comes up with a really good alternative in six months. I would agree that the conversion is both straightforward and noisy. There are some minor things that crop up, like noticing that we get the accounting wrong for writeback of compound pages. That's not entirely unexpected since no filesystem supports both compound pages and writeback today.
Re: [PATCH v4 00/25] Page folios
On Mon, Mar 15, 2021 at 01:38:04PM +0100, Michal Hocko wrote: > I tend to agree here as well. The level compoud_head has spread out > silently is just too large. There are people coming up with all sorts of > optimizations to workaround that, and they are quite right that this is > somehing worth doing, but last attempts I have seen were very focused on > specific page flags handling which is imho worse wrt maintainability > than a higher level and type safe abstraction. I find it quite nice that > this doesn't really have to be a flag day conversion but it can be done > incrementally. > > I didn't get review the series yet and I cannot really promise anything > but from what I understand the conversion should be pretty > straightforward, albeit noisy. > > One thing that was really strange to me when seeing the concept for the > first time was the choice of naming (no I do not want to start any > bikeshedding) because it hasn't really resonated with the udnerlying > concept. Maybe just me as a non native speaker... page_head would have > been so much more straightforward but not something I really care about. That pretty much summarizes my opinion as well. I'll need to find some time to review the series as well.
Re: [PATCH v4 00/25] Page folios
On Mon, Mar 15, 2021 at 02:55:01PM +0300, Kirill A. Shutemov wrote: > I'm with Matthew on this. I would really want to drop the number of places > where we call compoud_head(). I hope we can get rid of the page flag > policy hack I made. I can't see that far ahead too clearly, but I do think that at some point we'll actually distinguish between folio flags and page flags. For example, we won't have a FolioHWPoison, because we won't keep a folio together if one page in it has become defective. Nor will we have a PageUptodate because we'll only care about whether a folio is uptodate. And at that point, we won't want page flag policies.
Re: [PATCH v4 00/25] Page folios
On Mon 15-03-21 14:55:01, Kirill A. Shutemov wrote: > On Sat, Mar 13, 2021 at 07:09:01PM -0800, Hugh Dickins wrote: > > On Sat, 13 Mar 2021, Andrew Morton wrote: > > > On Fri, 5 Mar 2021 04:18:36 + "Matthew Wilcox (Oracle)" > > > wrote: > > > > > > > Our type system does not currently distinguish between tail pages and > > > > head or single pages. This is a problem because we call compound_head() > > > > multiple times (and the compiler cannot optimise it out), bloating the > > > > kernel. It also makes programming hard as it is often unclear whether > > > > a function operates on an individual page, or an entire compound page. > > > > > > > > This patch series introduces the struct folio, which is a type that > > > > represents an entire compound page. This initial set reduces the kernel > > > > size by approximately 6kB, although its real purpose is adding > > > > infrastructure to enable further use of the folio. > > > > > > Geeze it's a lot of noise. More things to remember and we'll forever > > > have a mismash of `page' and `folio' and code everywhere converting > > > from one to the other. Ongoing addition of folio > > > accessors/manipulators to overlay the existing page > > > accessors/manipulators, etc. > > > > > > It's unclear to me that it's all really worth it. What feedback have > > > you seen from others? > > > > My own feeling and feedback have been much like yours. > > > > I don't get very excited by type safety at this level; and although > > I protested back when all those compound_head()s got tucked into the > > *PageFlag() functions, the text size increase was not very much, and > > I never noticed any adverse performance reports. > > > > To me, it's distraction, churn and friction, ongoing for years; but > > that's just me, and I'm resigned to the possibility that it will go in. > > Matthew is not alone in wanting to pursue it: let others speak. > > I'm with Matthew on this. I would really want to drop the number of places > where we call compoud_head(). I hope we can get rid of the page flag > policy hack I made. I tend to agree here as well. The level compoud_head has spread out silently is just too large. There are people coming up with all sorts of optimizations to workaround that, and they are quite right that this is somehing worth doing, but last attempts I have seen were very focused on specific page flags handling which is imho worse wrt maintainability than a higher level and type safe abstraction. I find it quite nice that this doesn't really have to be a flag day conversion but it can be done incrementally. I didn't get review the series yet and I cannot really promise anything but from what I understand the conversion should be pretty straightforward, albeit noisy. One thing that was really strange to me when seeing the concept for the first time was the choice of naming (no I do not want to start any bikeshedding) because it hasn't really resonated with the udnerlying concept. Maybe just me as a non native speaker... page_head would have been so much more straightforward but not something I really care about. -- Michal Hocko SUSE Labs
Re: [PATCH v4 00/25] Page folios
On Sat, Mar 13, 2021 at 07:09:01PM -0800, Hugh Dickins wrote: > On Sat, 13 Mar 2021, Andrew Morton wrote: > > On Fri, 5 Mar 2021 04:18:36 + "Matthew Wilcox (Oracle)" > > wrote: > > > > > Our type system does not currently distinguish between tail pages and > > > head or single pages. This is a problem because we call compound_head() > > > multiple times (and the compiler cannot optimise it out), bloating the > > > kernel. It also makes programming hard as it is often unclear whether > > > a function operates on an individual page, or an entire compound page. > > > > > > This patch series introduces the struct folio, which is a type that > > > represents an entire compound page. This initial set reduces the kernel > > > size by approximately 6kB, although its real purpose is adding > > > infrastructure to enable further use of the folio. > > > > Geeze it's a lot of noise. More things to remember and we'll forever > > have a mismash of `page' and `folio' and code everywhere converting > > from one to the other. Ongoing addition of folio > > accessors/manipulators to overlay the existing page > > accessors/manipulators, etc. > > > > It's unclear to me that it's all really worth it. What feedback have > > you seen from others? > > My own feeling and feedback have been much like yours. > > I don't get very excited by type safety at this level; and although > I protested back when all those compound_head()s got tucked into the > *PageFlag() functions, the text size increase was not very much, and > I never noticed any adverse performance reports. > > To me, it's distraction, churn and friction, ongoing for years; but > that's just me, and I'm resigned to the possibility that it will go in. > Matthew is not alone in wanting to pursue it: let others speak. I'm with Matthew on this. I would really want to drop the number of places where we call compoud_head(). I hope we can get rid of the page flag policy hack I made. -- Kirill A. Shutemov
Re: [PATCH v4 00/25] Page folios
On Sat, 13 Mar 2021, Andrew Morton wrote: > On Fri, 5 Mar 2021 04:18:36 + "Matthew Wilcox (Oracle)" > wrote: > > > Our type system does not currently distinguish between tail pages and > > head or single pages. This is a problem because we call compound_head() > > multiple times (and the compiler cannot optimise it out), bloating the > > kernel. It also makes programming hard as it is often unclear whether > > a function operates on an individual page, or an entire compound page. > > > > This patch series introduces the struct folio, which is a type that > > represents an entire compound page. This initial set reduces the kernel > > size by approximately 6kB, although its real purpose is adding > > infrastructure to enable further use of the folio. > > Geeze it's a lot of noise. More things to remember and we'll forever > have a mismash of `page' and `folio' and code everywhere converting > from one to the other. Ongoing addition of folio > accessors/manipulators to overlay the existing page > accessors/manipulators, etc. > > It's unclear to me that it's all really worth it. What feedback have > you seen from others? My own feeling and feedback have been much like yours. I don't get very excited by type safety at this level; and although I protested back when all those compound_head()s got tucked into the *PageFlag() functions, the text size increase was not very much, and I never noticed any adverse performance reports. To me, it's distraction, churn and friction, ongoing for years; but that's just me, and I'm resigned to the possibility that it will go in. Matthew is not alone in wanting to pursue it: let others speak. Hugh
Re: [PATCH v4 00/25] Page folios
On Sat, Mar 13, 2021 at 12:36:58PM -0800, Andrew Morton wrote: > On Fri, 5 Mar 2021 04:18:36 + "Matthew Wilcox (Oracle)" > wrote: > > > Our type system does not currently distinguish between tail pages and > > head or single pages. This is a problem because we call compound_head() > > multiple times (and the compiler cannot optimise it out), bloating the > > kernel. It also makes programming hard as it is often unclear whether > > a function operates on an individual page, or an entire compound page. > > > > This patch series introduces the struct folio, which is a type that > > represents an entire compound page. This initial set reduces the kernel > > size by approximately 6kB, although its real purpose is adding > > infrastructure to enable further use of the folio. > > Geeze it's a lot of noise. More things to remember and we'll forever > have a mismash of `page' and `folio' and code everywhere converting > from one to the other. Ongoing addition of folio > accessors/manipulators to overlay the existing page > accessors/manipulators, etc. > > It's unclear to me that it's all really worth it. What feedback have > you seen from others? Mmm. The thing is, the alternative is ongoing bugs. And inefficiencies. Today, we have code everywhere converting from tail pages to head pages -- we just don't notice it because it's all wrapped up in macros. I have over 10kB in text size reductions in my tree (yes, it's a monster series of patches), almost all from removing those conversions. And it's far from done. And these conversions are all in hot paths, like handling page faults and read(). For example: filemap_fault 19801289-691 it's two-thirds the size it was! Surely that's not all in the hot path, but still it's going to have some kind of effect. As well, we have code today that _looks_ right but is buggy. Take a look at vfs_dedupe_file_range_compare(). There's nothing wrong with it at first glance, until you realise that vfs_dedupe_get_page() might return a tail page, and you can't look at page->mapping for a tail page. Nor page->index, so vfs_lock_two_pages() is also broken. As far as feedback, I really want more. Particularly from filesystem people. I don't think a lot of them realise yet that I'm going to change 15 of the 22 address_space_ops to work with folios instead of pages. Individual filesystems can keep working with pages, of course, until they enable the "use multipage folios" bit.
Re: [PATCH v4 00/25] Page folios
On Fri, 5 Mar 2021 04:18:36 + "Matthew Wilcox (Oracle)" wrote: > Our type system does not currently distinguish between tail pages and > head or single pages. This is a problem because we call compound_head() > multiple times (and the compiler cannot optimise it out), bloating the > kernel. It also makes programming hard as it is often unclear whether > a function operates on an individual page, or an entire compound page. > > This patch series introduces the struct folio, which is a type that > represents an entire compound page. This initial set reduces the kernel > size by approximately 6kB, although its real purpose is adding > infrastructure to enable further use of the folio. Geeze it's a lot of noise. More things to remember and we'll forever have a mismash of `page' and `folio' and code everywhere converting from one to the other. Ongoing addition of folio accessors/manipulators to overlay the existing page accessors/manipulators, etc. It's unclear to me that it's all really worth it. What feedback have you seen from others?
[PATCH v4 00/25] Page folios
Our type system does not currently distinguish between tail pages and head or single pages. This is a problem because we call compound_head() multiple times (and the compiler cannot optimise it out), bloating the kernel. It also makes programming hard as it is often unclear whether a function operates on an individual page, or an entire compound page. This patch series introduces the struct folio, which is a type that represents an entire compound page. This initial set reduces the kernel size by approximately 6kB, although its real purpose is adding infrastructure to enable further use of the folio. The big correctness proof that exists in this patch series is that we never lock or wait for writeback on a tail page. This is important as we would miss wakeups due to being on the wrong page waitqueue if we ever did. I analysed the text size reduction using a config based on Oracle UEK with all modules changed to built-in. That's obviously not a kernel which makes sense to run, but it serves to compare the effects on (many common) filesystems & drivers, not just the core. add/remove: 33510/33499 grow/shrink: 1831/1898 up/down: 888762/-894719 (-5957) For a Debian config, just comparing vmlinux.o gives a reduction of 3828 bytes of text and 72 bytes of data: textdata bss dec hex filename 161258794421122 1846344 22393345155b201 linus/vmlinux.o 161220514421050 1846344 22389445155a2c5 folio/vmlinux.o For nfs (a module I happened to notice was particularly affected), it's a reduction of 588 bytes of text and 16 bytes of data: 257142 59228 408 316778 4d56a linus/fs/nfs/nfs.o 256554 59212 408 316174 4d30e folio/fs/nfs/nfs.o Current tree at: https://git.infradead.org/users/willy/pagecache.git/shortlog/refs/heads/folio (contains another ~70 patches on top of this batch) v4: - Rebase on current Linus tree (including swap fix) - Analyse each patch in terms of its effects on kernel text size. A few were modified to improve their effect. In particular, where pushing calls to page_folio() into the callers resulted in unacceptable size increases, the wrapper was placed in mm/folio-compat.c. This lets us see all the places which are good targets for conversion to folios. - Some of the patches were reordered, split or merged in order to make more logical sense. - Use nth_page() for folio_next() if we're using SPARSEMEM and not VMEMMAP (Zi Yan) - Increment and decrement page stats in units of pages instead of units of folios (Zi Yan) v3: - Rebase on next-20210127. Two major sources of conflict, the generic_file_buffered_read refactoring (in akpm tree) and the fscache work (in dhowells tree). v2: - Pare patch series back to just infrastructure and the page waiting parts. Matthew Wilcox (Oracle) (25): mm: Introduce struct folio mm: Add folio_pgdat and folio_zone mm/vmstat: Add functions to account folio statistics mm/debug: Add VM_BUG_ON_FOLIO and VM_WARN_ON_ONCE_FOLIO mm: Add put_folio mm: Add get_folio mm: Create FolioFlags mm: Handle per-folio private data mm: Add folio_index, folio_page and folio_contains mm/util: Add folio_mapping and folio_file_mapping mm/memcg: Add folio wrappers for various memcontrol functions mm/filemap: Add unlock_folio mm/filemap: Add lock_folio mm/filemap: Add lock_folio_killable mm/filemap: Convert lock_page_async to lock_folio_async mm/filemap: Convert end_page_writeback to end_folio_writeback mm/filemap: Add wait_on_folio_locked & wait_on_folio_locked_killable mm/page-writeback: Add wait_on_folio_writeback mm/page-writeback: Add wait_for_stable_folio mm/filemap: Convert wait_on_page_bit to wait_on_folio_bit mm/filemap: Add __lock_folio_or_retry mm/filemap: Convert wake_up_page_bit to wake_up_folio_bit mm/page-writeback: Convert test_clear_page_writeback to take a folio mm/filemap: Convert page wait queues to be folios cachefiles: Switch to wait_page_key fs/afs/write.c | 21 ++-- fs/cachefiles/rdwr.c | 13 +-- fs/io_uring.c | 2 +- include/linux/fscache.h| 5 + include/linux/memcontrol.h | 30 + include/linux/mm.h | 88 ++- include/linux/mm_types.h | 33 ++ include/linux/mmdebug.h| 20 include/linux/page-flags.h | 106 ++ include/linux/pagemap.h| 197 +++-- include/linux/vmstat.h | 83 ++ mm/Makefile| 2 +- mm/filemap.c | 218 ++--- mm/folio-compat.c | 37 +++ mm/memory.c| 8 +- mm/page-writeback.c| 58 +- mm/swapfile.c | 6 +- mm/util.c | 20 ++-- 18 files changed, 666 insertions(+), 281 deletions(-) create mode 100644 mm/folio-compat.c -- 2.30.0