Thank you for your prompt and patient response!

> At this point, f2fs has no concept of head/tail pages. Because it
> doesn't tell the VFS that it can handle large folios, it will only see
> order-0 pages. The page->private member will go away, so filesystems
> cannot depend on being able to access it. They only get folio->private,
> and it's recommended (but not required) that they use that to point to
> their own private per-folio struct.

Yes, I understand that we should treat all pages represented by a folio
as a whole. The folio structure itself acts as the head page, and
operations and flags applied to the folio are effectively applied to all
pages within it, except for operations that must track per-page
attributes such as dirty or uptodate state.

My earlier concern was whether the special flags f2fs keeps in ->private
need to be tracked per page; if they do, collapsing them into a single
per-folio value would lose information. Let me give a specific example.
PAGE_PRIVATE_ONGOING_MIGRATION indicates that a page is undergoing block
migration during garbage collection. Initially I was worried about what
would happen if some pages in a folio were under garbage collection
while others were not. After looking at how this flag is actually used
in the f2fs code, however, it seems sufficient for the folio's private
field to record that the folio as a whole is in the migration phase of
garbage collection.

As for PAGE_PRIVATE_INLINE_INODE, the name alone tells us it is only
ever applied to metadata pages, so for now we can simply fix the folio
order for metadata folios at 0.
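To make this concrete, here is a minimal sketch of how the per-page
helpers generated by PAGE_PRIVATE_SET_FUNC/PAGE_PRIVATE_GET_FUNC in
fs/f2fs/f2fs.h might translate to folios, with each flag describing the
folio as a whole (the folio_* names below are my own invention, not
existing APIs):

static inline void folio_set_f2fs_gcing(struct folio *folio)
{
	/* Attach a non-pointer ->private value if none is present yet. */
	if (!folio_test_private(folio))
		folio_attach_private(folio, NULL);
	set_bit(PAGE_PRIVATE_NOT_POINTER,
		(unsigned long *)&folio->private);
	set_bit(PAGE_PRIVATE_ONGOING_MIGRATION,
		(unsigned long *)&folio->private);
}

static inline bool folio_test_f2fs_gcing(struct folio *folio)
{
	unsigned long priv = (unsigned long)folio_get_private(folio);

	/* The flag bits only mean anything for a non-pointer ->private. */
	return folio_test_private(folio) &&
		test_bit(PAGE_PRIVATE_NOT_POINTER, &priv) &&
		test_bit(PAGE_PRIVATE_ONGOING_MIGRATION, &priv);
}

The only behavioural difference from today's page-based helpers is that
one set of bits now covers every page in the folio, which, as argued
above, appears to be acceptable for PAGE_PRIVATE_ONGOING_MIGRATION.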
> I do think the best approach is to extend iomap and then have f2fs use
> iomap, but I appreciate that is several large jobs. It's worth it
> because it completely insulates f2fs from having to deal with
> pages/folios (except for metadata)

Well, for iomap I have several questions.

First of all, how should we define "having f2fs use iomap"? Does it mean
rewriting the address_space_operations using iomap-based APIs? Let me
take buffered read as a specific example. The essential difference
between the traditional buffered read path and the iomap-based one is
whether aops->readahead uses mpage_readpages or iomap_readahead. If
using iomap in f2fs implies using iomap_readahead, I am wondering
whether iomap_readahead supports files based on indirect pointers.

I believe this question is very important. I recently studied the
iomap-related buffered read code in XFS and in the ext4 large folios
patch set, and my conclusion is that the current mainline implementation
of iomap_readahead is built entirely on the assumption that a file's
data blocks are allocated as extents. (The author of the ext4 large
folios patch explicitly restricted the iomap buffered read path to
extent-based files.) It seems to completely lack support for files whose
data blocks are reached through indirect pointers. I am not sure whether
I have missed something crucial or whether my understanding of iomap's
readahead logic is simply not deep enough, so I would like to confirm
this with you.

This matters because f2fs is a filesystem based entirely on indirect
pointers (with an additional NAT table layer on top). The on-disk
concept of an "extent" simply does not exist in f2fs. (f2fs's extent
cache is also not the same concept as an extent, which can be a point
of confusion.) This is different from XFS, Btrfs, and even ext4. If
there are currently no iomap APIs that support indirect pointers, then
using iomap to bring folio support to f2fs in the short term is almost
completely infeasible. I also sent you an email previously to discuss
this matter:
https://lore.kernel.org/linux-f2fs-devel/CAMLCH1FThw2hH3pNm_dYxDPRbQ=mpxxadzsgsxhpa4obzk8...@mail.gmail.com/T/#t

I have also listened to the Linux Foundation talk "Challenges and Ideas
in Transitioning EXT* and other FS to iomap", which mentioned that iomap
is being optimized for the mapping performance of files based on
indirect pointers. I am curious whether there are any iomap patches in
flight that address the handling of indirect pointer mappings.

Next, I would like to discuss the design of the extended iomap
structure. Assuming we extend the fields of iomap_folio_state (for
example by adding f2fs's page_private flags; sorry, I haven't fully
worked out the specific design yet), we would not be able to use iomap's
various ifs_* APIs (such as ifs_alloc) directly on the extended
structure. I am wondering if we could write some adaptation-layer APIs:
for example, adapter functions that handle the f2fs-specific part of the
extended iomap_folio_state and then delegate the rest of the work to
iomap's own APIs.

If iomap indeed does not support indirect-pointer-based files, then as
far as enabling large folio support in f2fs at this stage goes, I
believe that in the short term, making f2fs's traditional buffered read
path support large folios is the more appropriate and pragmatic interim
solution. (I haven't yet studied the other address_space_operations in
depth, so let's put them aside for now.)
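That said, one thing I would like you to confirm: as far as I can tell,
the iomap iteration model itself does not strictly require extents.
iomap_iter() keeps re-invoking ->iomap_begin() until the requested range
is covered, so an indirect-pointer filesystem could legally report a
single block per call and pay only the extra callback overhead. If that
is correct, something like the following rough sketch would at least be
functional, if slow. (f2fs_lookup_data_blkaddr() is a hypothetical
helper standing in for the dnode walk via f2fs_get_dnode_of_data();
NEW_ADDR and the other special block addresses are glossed over.)

static int f2fs_buffered_iomap_begin(struct inode *inode, loff_t pos,
		loff_t length, unsigned int flags,
		struct iomap *iomap, struct iomap *srcmap)
{
	unsigned int blkbits = inode->i_blkbits;
	block_t blkaddr;
	int err;

	/* Resolve exactly one block; iomap_iter() will call us again. */
	err = f2fs_lookup_data_blkaddr(inode, pos >> blkbits, &blkaddr);
	if (err)
		return err;

	iomap->offset = round_down(pos, 1 << blkbits);
	iomap->length = 1 << blkbits;
	iomap->bdev = inode->i_sb->s_bdev;

	if (blkaddr == NULL_ADDR) {
		iomap->type = IOMAP_HOLE;
		iomap->addr = IOMAP_NULL_ADDR;
	} else {
		iomap->type = IOMAP_MAPPED;
		iomap->addr = (u64)blkaddr << blkbits;
	}
	return 0;
}

As for the adaptation layer, the shape I have in mind is roughly the
following (again purely hypothetical; struct iomap_folio_state is
currently private to fs/iomap/buffered-io.c, and ifs_free() kfree()s
folio->private directly, so iomap would first need to export the
structure or grow allocation/free hooks before anything like this could
work):

struct f2fs_folio_state {
	/* f2fs per-block PAGE_PRIVATE_* bits, tracked like ifs->state */
	unsigned long *private_flags;
	/*
	 * iomap's state must come last because it ends in a flexible
	 * array; folio->private would point at this embedded member so
	 * that generic iomap code keeps working unmodified.
	 */
	struct iomap_folio_state ifs;
};

static inline struct f2fs_folio_state *F2FS_IFS(struct folio *folio)
{
	return container_of((struct iomap_folio_state *)folio->private,
			    struct f2fs_folio_state, ifs);
}

Access to the f2fs flags would then go through F2FS_IFS() while the
ifs_* calls keep operating on &fps->ifs; the allocation and freeing
paths are exactly where the adapter functions would be needed.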
Furthermore, I think we may need to embed calls to iomap APIs and iomap
data structures within these f2fs functions, for example using the
extended iomap_folio_state structure and the related APIs directly in
f2fs_mpage_readpages. I understand that iomap was not designed for this
kind of usage, but I feel it may be difficult to avoid in the short
term. To illustrate: besides buffered I/O, the garbage collection
process in f2fs also generates a significant amount of I/O that goes
through the page cache, and garbage collection has its own APIs for
interacting with the page cache. Completely refactoring them to follow
the framework provided by iomap would also be challenging.

If this approach would pollute the interfaces for a future migration of
f2fs to iomap, then I think our current focus could instead be on
enabling large folio support for f2fs's traditional buffered read and
buffered write paths, as well as garbage collection, using the solution
I proposed. This should not interfere with iomap, since iomap uses a
completely separate set of interfaces for buffered read and buffered
write. If you have a better solution, I would be very grateful if you
could share your insights.

> Ah, you need a tool called b4. Your distro may have it packaged,
> or you can get it from:
>
> https://git.kernel.org/pub/scm/utils/b4/b4.git

Thanks for the recommendation; I've learned a lot from this tool. That
said, when applying patches with the combination of b4 am and git am,
they sometimes do not apply cleanly. Each series seems to depend heavily
on the author's own kernel tree and on their earlier patches; the ext4
large folio support series appears to be such a case. So sometimes it is
still necessary to resolve conflicts manually?

I apologize for the length of this reply. It also seems that this
discussion has drifted somewhat from the original subject of the thread;
if you think it would be better to start a new one, please let me know.

Best regards.

On Wed, Apr 2, 2025 at 11:10 Matthew Wilcox <wi...@infradead.org> wrote:
>
> On Tue, Apr 01, 2025 at 10:17:42PM +0800, Nanzhe Zhao wrote:
> > Based on my understanding after studying the code related to F2FS's
> > use of the private field of the page structure, it appears that F2FS
> > employs this field in a specific way. If the private field is not
> > interpreted as a pointer, it seems it could be used to store
> > additional flag bits. A key observation is that these functions seem
> > to apply to tail pages as well. Therefore, as you mentioned, if we
> > are using folios to manage multiple pages, it seems reasonable to
> > consider adding a similar field within the iomap_folio_state
> > structure. This would be analogous to how it currently tracks the
> > uptodate and dirty states for each subpage, allowing us to track the
> > state of these private fields for each subpage as well. Because it
> > looks just like F2FS is utilizing the private field as a way to
> > extend the various state flags of a page in memory. Perhaps it would
> > be more appropriate to directly name this new structure
> > f2fs_folio_state? This is because I'm currently unsure whether it
> > will interact with existing iomap APIs or if we will need to develop
> > F2FS-specific APIs for it.
>
> At this point, f2fs has no concept of head/tail pages. Because it
> doesn't tell the VFS that it can handle large folios, it will only see
> order-0 pages.
> The page->private member will go away, so filesystems
> cannot depend on being able to access it. They only get folio->private,
> and it's recommended (but not required) that they use that to point to
> their own private per-folio struct.
>
> I do think the best approach is to extend iomap and then have f2fs use
> iomap, but I appreciate that is several large jobs. It's worth it
> because it completely insulates f2fs from having to deal with
> pages/folios (except for metadata)
>
> > > You're right that f2fs needs per-block dirty tracking if it is to
> > > support large folios.
> >
> > I feel that we need to consider more than just this aspect. In fact,
> > it might be because we are still in the early stages of F2FS folio
> > support, so it leaves me the impression that the current F2FS folio
> > implementation is essentially just replacing struct page at the
> > interface level. It effectively acts just like a single page, or in
> > other words, a folio of order 0.
>
> Right, that's the current approach. We're taking it because the page
> APIs are being removed. The f2fs developers have chosen to work on
> other projects instead of supporting large folios (which is their
> right), but they can't hold up the conversion of the entire filesystem
> stack from pages to folios, so they're getting the minimal conversion
> and can work on large folios when they have time.
>
> > As you can see in f2fs_mpage_readpages, after each folio is processed
> > in the loop, the nr_pages counter is only decremented by 1.
> > Therefore, it's clear that when the allocated folios in the page
> > cache are all iterated through, nr_pages still has remaining value,
> > and the loop continues. This naturally leads to a segmentation fault
> > at index = folio_index(folio); due to dereferencing a null pointer.
> > Furthermore, only the first page of each folio is submitted for I/O;
> > the remaining pages are not filled with data from disk.
>
> Yes, there are lots of places in f2fs that assume a folio only has a
> single page.
>
> > I am planning to prepare patches to address these issues and submit
> > them soon. I noticed you recently submitted a big bunch of patches on
> > folio. I would like to debug and test based on your patch. Therefore,
> > I was wondering if it would be possible for you to share your
> > modified F2FS code directly, or perhaps provide a link to your Git
> > repository? Manually copying and applying so many patches from the
> > mailing list would be quite cumbersome.
>
> Ah, you need a tool called b4. Your distro may have it packaged,
> or you can get it from:
>
> https://git.kernel.org/pub/scm/utils/b4/b4.git

_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel