On Thu, Feb 08, 2007 at 09:33:05AM +0530, Suparna Bhattacharya wrote:
> On Wed, Feb 07, 2007 at 01:05:44PM -0500, Chris Mason wrote:
> > On Wed, Feb 07, 2007 at 10:38:45PM +0530, Suparna Bhattacharya wrote:
> > > > + * The flags parameter is a bitmask of:
> > > > + *
> > > > + * DIO_PLACEHOLDERS (use placeholder pages for locking)
> > > > + * DIO_CREATE (pass create=1 to get_block for filling holes or extending)
> > > 
> > > A little more explanation about why these options are needed, and examples
> > > of when one would specify each of these options would be good.
> > 
> > I'll extend the comments in the patch, but for discussion here:
> > 
> > DIO_PLACEHOLDERS:  placeholders are inserted into the page cache to
> > synchronize the DIO with buffered writes.  From a locking point of view,
> > this is similar to inserting and locking pages in the address space
> > corresponding to the DIO.
> > 
> > Placeholders guard against concurrent allocations and truncates during
> > the DIO.  You don't need placeholders if truncates and allocations are
> > impossible (for example, on a block device).
> 
> Likewise placeholders may not be needed if the underlying filesystem
> already takes care of locking to synchronize DIO vs buffered.

True, although I don't think any FS covers 100% of the cases right now.

> 
> > 
> > DIO_CREATE: placeholders make it possible for filesystems to safely fill
> > holes and extend the file via get_block during the DIO.  If DIO_CREATE
> > is turned on, get_block will be called with create=1, allowing the FS to
> > allocate blocks during the DIO.
> 
> When would one NOT specify DIO_CREATE, and what are the implications ?
> The purpose of having an option of NOT allowing the FS to allocate blocks
> during DIO is not very intuitive from the standpoint of the caller.
> (the block device case could be an example, but then create=1 could not do
> any harm or add extra overhead, so why bother ?)

DIO has fallen back to buffered IO for so long that I wanted filesystems
to explicitly choose the create=1 for now.  A good example is my patch
for ext3, where the ext3 get_block routine needed to be changed to start
a transaction instead of finding the current trans in
current->journal_info.  The reiserfs DIO get_block needed to be told not
to expect i_mutex to be held, etc etc.

> 
> Is there still a valid case where we fall back to buffered IO to fill holes
> - to me that seems to be the only situation where create=0 must be enforced.

Right, when create=0 we fall back, otherwise we don't.
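
As a rough illustration of that rule, here is a userspace toy, not code
from the patch.  The flag bit values, toy_get_block() and toy_dio_write()
are all made up for this sketch; only the flag names and the
create=0/create=1 behaviour come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values: the thread names these flags but not their bits. */
#define DIO_PLACEHOLDERS 0x01u /* use placeholder pages for locking */
#define DIO_CREATE       0x02u /* pass create=1 to get_block */
#define DIO_DROP_I_MUTEX 0x04u /* drop i_mutex for writes inside i_size */
#define DIO_ALL_FLAGS    (DIO_PLACEHOLDERS | DIO_CREATE | DIO_DROP_I_MUTEX)

enum dio_result {
    DIO_DONE_DIRECT,       /* the write was completed as direct IO */
    DIO_FALLBACK_BUFFERED, /* the caller must redo the write as buffered IO */
};

/*
 * Toy stand-in for a filesystem get_block routine: a block can be mapped
 * if it is already allocated, or if the caller allows allocation (create=1).
 */
static bool toy_get_block(bool block_allocated, int create)
{
    return block_allocated || create;
}

/* Toy model of one DIO write to a single block. */
static enum dio_result toy_dio_write(unsigned int flags, bool block_allocated)
{
    int create;

    assert((flags & ~DIO_ALL_FLAGS) == 0); /* reject unknown bits */

    create = (flags & DIO_CREATE) ? 1 : 0;
    if (toy_get_block(block_allocated, create))
        return DIO_DONE_DIRECT;

    /* create=0 and the block is a hole: fall back to buffered IO */
    return DIO_FALLBACK_BUFFERED;
}
```

With these definitions, toy_dio_write(DIO_PLACEHOLDERS, false) reports the
buffered fallback, while adding DIO_CREATE to the mask lets the same write
finish as direct IO.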

> 
> > 
> > DIO_DROP_I_MUTEX: If the write is inside of i_size, i_mutex is dropped
> > during the DIO and taken again before returning.
> 
> Again an example of when one would not specify this (block device and
> XFS ?) would be useful.

You would leave it off if the FS can't fill a hole or extend the file
without i_mutex, or if the caller has already dropped i_mutex itself.  I
think this is only XFS right now; the long term goal is to make
placeholders fast enough for XFS to use.
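
To make the DIO_DROP_I_MUTEX condition concrete, a small userspace
sketch follows.  It is not taken from the patch: the bit value and both
helper names are invented, and it assumes "inside of i_size" means the
whole write ends at or before i_size.

```c
#include <assert.h>
#include <stdbool.h>

#define DIO_DROP_I_MUTEX 0x04u /* hypothetical bit value */

/*
 * True when the whole range [pos, pos + len) ends at or before i_size.
 * This reading of "inside of i_size" is an assumption of the sketch.
 */
static bool write_inside_i_size(long long pos, long long len, long long i_size)
{
    assert(pos >= 0 && len > 0);
    return pos + len <= i_size;
}

/*
 * Toy decision: i_mutex may be dropped for the duration of the DIO only if
 * the caller asked for it and the write cannot extend the file.
 */
static bool toy_should_drop_i_mutex(unsigned int flags, long long pos,
                                    long long len, long long i_size)
{
    return (flags & DIO_DROP_I_MUTEX) != 0 &&
           write_inside_i_size(pos, len, i_size);
}
```

A write that would extend the file keeps i_mutex held in this model, and a
caller that never sets the flag (for example, one that has already dropped
i_mutex on its own) is left alone.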

-chris
