/me grumbles about top-posting...

On Fri, Sep 9, 2016 at 1:35 PM, Matthew Wilcox <[email protected]> wrote:
> I feel like we're not only building on shifting sands, but we haven't decided
> whether we're building a Pyramid or a Sphinx.
>
> I thought after Storage Summit, we had broad agreement that we were moving to
> a primary DAX API that was not BH (nor indeed iomap) based.  We would still
> have DAX helpers for block based filesystems (because duplicating all that
> code between filesystems is pointless), but I now know of three filesystems
> which are not block based that are interested in using DAX.  Jared Hulbert's
> AXFS is a nice public example.
>
> I posted a prototype of this here:
>
> https://groups.google.com/d/msg/linux.kernel/xFFHVCQM7Go/ZQeDVYTnFgAJ
>
> It is, of course, woefully out of date, but some of the principles in it are
> still good (and I'm working to split it into digestible chunks).
>
> The essence:
>
> 1. VFS or VM calls filesystem (eg ->fault())
> 2. Filesystem calls DAX (eg dax_fault())
> 3. DAX looks in radix tree, finds no information.
> 4. DAX calls (NEW!) mapping->a_ops->populate_pfns
> 5a. Filesystem (if not block based) does its own thing to find out the PFNs
>     corresponding to the requested range, then inserts them into the radix
>     tree (possible helper in DAX code)
> 5b. Filesystem (if block based) looks up its internal data structure (eg
>     extent tree) and calls dax_create_pfns() (see giant patch from yesterday,
>     only instead of passing a get_block_t, the filesystem has already filled
>     in a bh which describes the entire extent that this access happens to
>     land in).
> 6b. DAX takes care of calling bdev_direct_access() from dax_create_pfns().
>
> Now, notice that there's no interaction with the rest of the filesystem here.
> We can swap out BHs and iomaps relatively trivially; there's no call for
> making grand changes, like converting ext2 over to iomap.  The BH or iomap is
> only used for communicating the extent from the filesystem to DAX.
>
> Do we have agreement that this is the right way to go?

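For anyone skimming, steps 2 through 6b of the quoted flow can be sketched as a rough userspace mock. Nothing here is kernel code: every mock_* name is made up for illustration, a flat array stands in for the radix tree, 0 is treated as "no entry" (PFN 0 is valid on real hardware), the extent struct stands in for the bh or iomap, and the place where the proposal would call bdev_direct_access() is plain arithmetic.

```c
#include <assert.h>

#define MOCK_NR_PAGES 16
#define MOCK_NO_PFN   0UL  /* assumption: 0 means "no entry" in this mock only */

struct mock_mapping;

/* Hypothetical stand-in for address_space_operations with the proposed hook. */
struct mock_a_ops {
	int (*populate_pfns)(struct mock_mapping *mapping, unsigned long pgoff);
};

struct mock_mapping {
	unsigned long pfns[MOCK_NR_PAGES];  /* stand-in for the radix tree */
	const struct mock_a_ops *a_ops;
	int populate_calls;                 /* bookkeeping for the example only */
};

/* Stand-in for the bh/iomap: describes one whole extent. */
struct mock_extent {
	unsigned long first_pgoff;
	unsigned long nr_pages;
	unsigned long first_pfn;
};

/* Step 6b: DAX-side helper. The proposal would call bdev_direct_access()
 * here; the mock just derives the PFNs from the extent. */
static int mock_dax_create_pfns(struct mock_mapping *mapping,
				const struct mock_extent *ext)
{
	unsigned long i;

	for (i = 0; i < ext->nr_pages; i++) {
		unsigned long pgoff = ext->first_pgoff + i;

		if (pgoff >= MOCK_NR_PAGES)
			return -1;
		mapping->pfns[pgoff] = ext->first_pfn + i;
	}
	return 0;
}

/* Step 5b: block based filesystem looks up its extent and hands it to DAX. */
static int mock_blockfs_populate_pfns(struct mock_mapping *mapping,
				      unsigned long pgoff)
{
	/* Pretend extent tree: pages 4..7 live at PFNs 100..103. */
	static const struct mock_extent ext = { 4, 4, 100 };

	mapping->populate_calls++;
	if (pgoff < ext.first_pgoff || pgoff >= ext.first_pgoff + ext.nr_pages)
		return -1;
	return mock_dax_create_pfns(mapping, &ext);
}

/* Step 5a: non block based filesystem finds the PFN its own way and
 * inserts it directly. */
static int mock_nonblock_populate_pfns(struct mock_mapping *mapping,
				       unsigned long pgoff)
{
	mapping->populate_calls++;
	if (pgoff >= MOCK_NR_PAGES)
		return -1;
	mapping->pfns[pgoff] = 1000 + pgoff;
	return 0;
}

const struct mock_a_ops mock_blockfs_aops = {
	.populate_pfns = mock_blockfs_populate_pfns,
};

const struct mock_a_ops mock_nonblock_aops = {
	.populate_pfns = mock_nonblock_populate_pfns,
};

/* Steps 2-4: DAX checks the tree first and only asks the filesystem to
 * populate it on a miss. Returns MOCK_NO_PFN on failure. */
unsigned long mock_dax_fault(struct mock_mapping *mapping, unsigned long pgoff)
{
	assert(mapping && mapping->a_ops && mapping->a_ops->populate_pfns);

	if (pgoff >= MOCK_NR_PAGES)
		return MOCK_NO_PFN;
	if (mapping->pfns[pgoff] == MOCK_NO_PFN &&
	    mapping->a_ops->populate_pfns(mapping, pgoff) != 0)
		return MOCK_NO_PFN;
	return mapping->pfns[pgoff];
}
```

In this mock, a fault on page 5 of a block based mapping populates the whole 4..7 extent in one call, so a later fault on page 7 is served from the tree without going back to the filesystem. The DAX side never looks inside the filesystem; it only sees the extent it was handed.
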
My $0.02...

So we are still struggling to get the current dax implementation right
(pmd faulting, dirty entry cleaning, etc), and this seems like a rewrite
that sets us up for future features without addressing the current bugs
and todo items.  In comparison, the iomap conversion work seems
incremental and preserves current development momentum.  I agree with
you that continuing to touch ext2 is not a good idea, but I'm not yet
convinced that now is the time to go do dax-2.0 when we haven't finished
shipping dax-1.0.

