On Sun, Apr 30, 2000 at 04:46:37AM -0400, Erez Zadok wrote:

> Background: my stacking code for linux is minimal.  I only stack on things I
> absolutely have to.  By "stack on" I mean that I save a link/pointer to a
> lower-level object in the private data field of an upper-level object.  I do
> so for struct file, inode, dentry, etc.  But I do NOT stack on pages.  Doing
> so would complicate stacking considerably.  So far I was able to avoid this
> b/c every function that deals with pages also passes a struct file/dentry to
> it so I can find the correct lower page.

You shouldn't need to "stack on" pages anyway, I wouldn't have thought.
For each page you can reference mapping->host, which should point to the
hosting structure (at the moment always an inode, but this may change).

> The new method, sync_page() is only passed a struct page.  So I cannot stack
> on it!  If I have to stack on it, I'll have to either
> 
> (1) complicate my stacking code considerably by stacking on pages.  This is
>     impossible for my stackable compression file system, b/c the mapping of
>     upper and lower pages is not 1-to-1.

Why can your sync_page implementation not grab the inode from mapping->host
and then call sync_page on the underlying fs' page(s) that hold the data?

> (2) change the kernel so that every instance of sync_page is passed the
>     corresponding struct file.  This isn't pretty either.

I'd like to the see the address_space methods /lose/ the struct file /
struct dentry pointer, but it may be there are situations which require
it.

> Luckily, sync_page isn't used too much.  Only nfs seems to use it at the
> moment.  All other file systems which define ->sync_page use
> block_sync_page() which is defined as:
> 
> int block_sync_page(struct page *page)
> {
>       run_task_queue(&tq_disk);
>       return 0;
> }
> 
> This is confusing.  Why would block_sync_page ignore the page argument and
> call something else.  The name "block_sync_page" might be misleading.  The
> only thing I can think of is that block_sync_page is a placeholder for for a
> time when it would actually do something with the page.

No, this looks OK to me. sync_page seems to be designed to ensure that any
async I/O on a page is complete. For normal "block-mapped" filesystems (which
is what the block_* helpers in buffer.c are for), pages with "in flight" I/O
will have requests on the appropriate block device's queue. Running tq_disk
should ensure these are completed. If tq_disk is ever split up (I believe Jens
Axboe is working on this), block_sync_page will need to look at
page->mapping->host to determine the device, I assume.

Non block-mapped filesystems (e.g. nfs) will need to do something different.

> Anyway, since sync_page appears to be an optional function, I've tried my
> stacking without defining my own ->sync_page.  Preliminary results show it
> seems to work.  However, if at any point I'd have to define ->sync_page page
> and have to call the lower file system's ->sync_page, I'd urge a change in
> the prototype of this method that would make it possible for me to stack
> this operation.
> 
> Also, I don't understand what's ->sync_page for in the first place.  The
> name of the fxn implies it might be something like a commit_write.

I presume it's been added to replace explicit tq_disk runs, which will allow
finer-grained syncing as described above, and also to ensure correctness in
nfs and other similar filesystems.

Reply via email to