On Wed, Feb 5, 2014 at 2:31 AM, Glauber Costa <[email protected]> wrote:
> Hi
>
> I've been recently trying to devise some mechanism to reuse the ARC
> buffers directly into a file back mapping (created by mmap, for instance).
> My main goal is not to have duplication between what is in the ARC and what
> is in the page cache (and to be honest, the OS I am working on does not
> have a page cache, so my real goal is to keep it this way).
>
> It seems like Solaris and BSD never did that, but I could not find any
> indication about the why.
> That's the kind of thing I am pretty sure was thought about before, so I
> wonder if the lack of an implementation like that is due to a major
> showstopper found by you guys.
>
> So before I dive too deeply into this, can anybody advise me on this?

As I recall, the main reason we kept the ZFS cache separate from the page cache was to avoid the complexity related to the different locking models. If you are designing mmap from scratch, I imagine you could avoid that.

Read-only mmap should be relatively straightforward. When the page is faulted in, you can just keep the dbuf (dmu_buf_impl_t) held so that it stays in memory, and then find its page_t and map it into the process's address space.

Write-back mmap (i.e. PROT_WRITE + MAP_SHARED) will be trickier, because you can't modify the page while the dbuf is being written (due to checksums, RAID-Z, etc.). Nor can you have transactions of indefinite length (e.g. create a transaction when the page is first stored to, commit it when pageout gets around to flushing it). I guess you could do something like mark the dbuf dirty and then, when syncing context (dbuf_sync_leaf()) gets around to writing it, copy the data from the page to a new arc_buf that's used only while writing it out.

--matt
_______________________________________________
developer mailing list
[email protected]
http://lists.open-zfs.org/mailman/listinfo/developer
