Re: new VFS method sync_page and stacking

Benjamin C.R. LaHaise Mon, 01 May 2000 11:02:03 -0700
On Mon, 1 May 2000, Roman V. Shaposhnick wrote:

>   I see what you mean. And I completely agree with only one exception: we
> should not use and we should not think of address_space as a generic cache
> till we implement the interface for address_space_operations that:
>  
>     1. can work with *any* type of host object
>     2. at the same time can work with stackable or derived ( in C++
>        terminology ) host objects like file->dentry->inode->phis.
>     3. and has a reasonable and kludge-free specification.
> 
>   I agree that providing such interface to the address_space will simplify 
> things a lot, but for now I see no evidence of somebody doing this. 

If we remove the struct file *'s from the interface, it becomes quite
trivial to do this.

> > Hmm. Take ->readpage() for example. It is used to fill a page which "belongs"
> > to an object. That object is referenced in page->mapping->host. For inode
> > data, the host is the inode structure. When should readpage() ever need to
> > see anything other than the object to which the pages belong? It doesn't make
> > sense (to me). 
> 
>    I disagree, mainly because it is sometimes hard to tell where an object
> "begins" and where it "ends". Complex objects are often derived from
> simple ones, and it is very common practice to use the highest level
> available for the sake of optimization. E.g. in Linux a userspace file is
> built around a dentry, which in turn is built around an inode, etc. Can we
> say that a file does not include an inode directly? Yes. Can we say that a
> file does not include an inode at all? No.
> 
>  Let me show you an example. Consider, that:
>    1. you have a network-based filesystem with a state oriented protocol. 
>    2. to do something with a file you need a special pointer to it called 
>       file_id.
>    3. before reading/writing to/from a file you should open its file_id
>       for reading, writing or both. After you open file_id you can only
>       read/write from/to it; you cannot do anything else with it.
>    
>  I guess you will agree with me that the best place for an "opened" file_id
> is file->private_data? Ok. Now the question is, how can I access the opened
> file_id if all methods ( readpage, writepage, prepare_write and commit_write )
> get a first argument of type inode?

Holding the file_id in file->private_data is just plain wrong in the
presence of delayed writes (especially on NFS).  Consider the case where a
temporary file is opened and deleted but then used for either read/write
or backing an mmap.  Does it not make sense that when the file is finally
close()'d/munmap()'d by the last user that the contents should be merely
discarded?  If you're tying the ability to complete write()'s to the
struct file * that was opened, you'll never be able to detect this case
(which is trivial if the file_id token is placed in the inode or address
space).  An address_space should be able to do everything without
requiring a struct file.  Granted, this raises the problem that Al pointed
out about signing RPC requests, but if a copy of the authentication token
is made at open() time into the lower layer for the filesystem, this
problem goes away.
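
As a rough illustration of the argument above, here is a minimal userspace
sketch, not kernel code.  Every name in it (demo_inode, demo_file,
demo_open, demo_writeback and so on) is hypothetical and invented for this
example.  It assumes the file_id and a copy of the opener's credential are
stored in the per-object structure at open time, so that writeback needs no
file handle, and the last close of an unlinked object can simply discard
its dirty pages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct demo_inode {
    int file_id;      /* protocol handle, kept per object, not per open */
    int cred;         /* copy of the opener's auth token, made at open */
    int nlink;        /* directory links still pointing at the object */
    int open_count;   /* open file handles still attached */
    int dirty_pages;  /* pages waiting for delayed write */
    int sent_pages;   /* pages actually sent to the server */
    int last_cred;    /* credential used for the most recent writeback */
};

struct demo_file {
    struct demo_inode *inode;  /* deliberately holds no file_id */
};

/* The first opener's file_id and credential are copied into the inode. */
void demo_open(struct demo_file *f, struct demo_inode *ino,
               int file_id, int cred)
{
    if (ino->open_count == 0) {
        ino->file_id = file_id;
        ino->cred = cred;
    }
    ino->open_count++;
    f->inode = ino;
}

/* A write only dirties pages; nothing is sent yet (delayed write). */
void demo_write(struct demo_file *f, int pages)
{
    f->inode->dirty_pages += pages;
}

/* Models a call from page reclaim: only the inode is available here. */
int demo_writeback(struct demo_inode *ino)
{
    int n = ino->dirty_pages;

    ino->last_cred = ino->cred;  /* sign with the stored copy */
    ino->sent_pages += n;
    ino->dirty_pages = 0;
    return n;
}

void demo_unlink(struct demo_inode *ino)
{
    if (ino->nlink > 0)
        ino->nlink--;
}

/* Returns the number of dirty pages discarded instead of written. */
int demo_close(struct demo_file *f)
{
    struct demo_inode *ino = f->inode;
    int discarded = 0;

    assert(ino != NULL && ino->open_count > 0);
    f->inode = NULL;
    if (--ino->open_count == 0) {
        if (ino->nlink == 0) {       /* deleted temp file: drop data */
            discarded = ino->dirty_pages;
            ino->dirty_pages = 0;
        } else {
            demo_writeback(ino);     /* still linked: flush it */
        }
    }
    return discarded;
}
```

In this model demo_writeback() takes only the inode, so it can be driven
without any open handle, and demo_close() can see that the object is both
unlinked and unused because the state lives in one place.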

This is important because you can't have a struct file when getting calls
from the page reclaim action in shrink_mmap.

> > Inode data pages are per-inode, not per-dentry or per-file-struct.
> 
>   Frankly, inode data pages are file pages, because it is userspace files we
> care about. Nothing more, nothing less. 

No, they are not.  address_spaces are a generic way of caching pages of a
vm object.  Files happen to fit nicely into vm objects.
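
To make the "generic cache of a vm object" idea concrete, here is a small
userspace sketch.  It is not the kernel's struct address_space; the types
and functions (demo_mapping, demo_page, demo_blob, demo_fill_page) are
hypothetical names made up for illustration.  The point it tries to show is
that a readpage-style method can take only the page and reach whatever
object owns it through page->mapping->host, whether or not that object is
a file.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 16   /* tiny page size, for illustration only */

struct demo_page;

/* Hypothetical analogue of address_space_operations. */
struct demo_mapping_ops {
    int (*readpage)(struct demo_page *page);
};

/* Hypothetical analogue of address_space: the host can be any object. */
struct demo_mapping {
    void *host;
    const struct demo_mapping_ops *ops;
};

struct demo_page {
    struct demo_mapping *mapping;
    size_t index;                 /* page number within the object */
    char data[DEMO_PAGE_SIZE];
};

/* One possible host type: a plain in-memory blob, not a file at all. */
struct demo_blob {
    const char *bytes;
    size_t len;
};

/* Fills the page from its host; returns the number of bytes copied. */
static int blob_readpage(struct demo_page *page)
{
    const struct demo_blob *b = page->mapping->host;
    size_t off = page->index * DEMO_PAGE_SIZE;
    size_t n = 0;

    memset(page->data, 0, DEMO_PAGE_SIZE);
    if (off < b->len) {
        n = b->len - off;
        if (n > DEMO_PAGE_SIZE)
            n = DEMO_PAGE_SIZE;
        memcpy(page->data, b->bytes + off, n);
    }
    return (int)n;
}

const struct demo_mapping_ops demo_blob_ops = { .readpage = blob_readpage };

/* Generic cache code: it never learns what kind of object the host is. */
int demo_fill_page(struct demo_page *page)
{
    return page->mapping->ops->readpage(page);
}
```

A caller sets mapping.host to a demo_blob, points mapping.ops at
demo_blob_ops, and calls demo_fill_page() on a page; an inode-backed host
would plug in the same way with its own ops table.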

                -ben