On Thu, 10 Aug 2000, Chris Mason wrote:
> [ please let me know if you want anything in the cut text answered ;-) ]

If you've got anything to say about map/unmap, something I haven't looked at at
all yet, I'm interested.  Otherwise, it's back to the usual digging...

I hope it's ok with you to cc back to the list... It's called 'trolling for
more feedback'.  I'm also trying to be a little bit didactic about the design
process for the benefit of list lurkers. :-)

> On 8/9/00, 4:55:13 PM, Daniel Phillips <[EMAIL PROTECTED]> 

> > In Ext2 when you unmerge you also have to drill down into the inode's index tree
> > and reallocate the tail block.  Unless someone thinks this is a really bad idea,
> > I'm going to handle this by letting the 'create' parameter of ext2_get_block
> > take 3 values instead of 2:
> 
> >   0: don't create a block, return null if it's not there (read)
> >   1: create a block if there isn't one there (write)
> >   2: create a block even if there's already one there (copy on write)
> 
> I'm not sure how the copy on write case is any different from the write 
> case.  You can't use the generic functions (or writepage at all) without 
> unmerging the tail on write.

Since I'm going to modify the version of ext2_get_block that's used everywhere,
I can't afford to start doing copy-on-write when the existing code is
expecting a normal overwrite.  So I only want the copy-on-write to happen in
exactly one place, where a tail block has to be unmerged.  Since the caller
knows whether this is the case or not and the callee would have to discover it
through a series of tests, I think it's better to just *tell* the callee what
to do.
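
To make the three-valued 'create' idea concrete, here is a toy sketch in plain
userspace C.  It is not the real ext2_get_block: the enum names, toy_inode,
toy_alloc and toy_get_block are all made up for illustration, and a flat array
stands in for the inode's index tree.  It only shows how the callee can act on
what the caller tells it instead of testing for a merged tail itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three 'create' values (the mail only says 0, 1, 2). */
enum get_block_mode {
    GET_BLOCK_READ  = 0,   /* don't create; report "no block" if unmapped */
    GET_BLOCK_WRITE = 1,   /* create a block only if none is mapped */
    GET_BLOCK_COW   = 2    /* always allocate a fresh block (copy on write) */
};

#define NO_BLOCK    0UL
#define TOY_NBLOCKS 8

/* Toy inode: a flat array stands in for the real index tree (assumption). */
struct toy_inode {
    unsigned long map[TOY_NBLOCKS];  /* logical block -> physical block */
    unsigned long next_free;         /* trivial bump allocator */
};

static unsigned long toy_alloc(struct toy_inode *inode)
{
    return inode->next_free++;
}

/* Returns the physical block for iblock, or NO_BLOCK if there is none. */
unsigned long toy_get_block(struct toy_inode *inode, size_t iblock,
                            enum get_block_mode mode)
{
    unsigned long cur;

    assert(iblock < TOY_NBLOCKS);
    cur = inode->map[iblock];

    switch (mode) {
    case GET_BLOCK_READ:
        return cur;                          /* may be NO_BLOCK */
    case GET_BLOCK_WRITE:
        if (cur == NO_BLOCK)
            cur = inode->map[iblock] = toy_alloc(inode);
        return cur;                          /* existing block is reused */
    case GET_BLOCK_COW:
        /* Remap unconditionally; copying the data and releasing the old
         * block are left to the caller and not modelled here. */
        return inode->map[iblock] = toy_alloc(inode);
    }
    return NO_BLOCK;
}
```

Existing callers that pass 0 or 1 see no change in behaviour; only the one
caller that passes 2 gets a new block when one is already mapped.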

In Tux2 I do copy-on-write religiously for every block write, except where the
block has already been modified in the same phase, but that's another story.

> > I'm not quite ready to code this yet because I'm not clear on what the special
> > handling of 'metadata' is all about.  But getting close.
> 
> Metadata is special mostly because it is in the buffer cache, which is 
> not synchronized with the page cache at all.  File tails effectively 
> become metadata for the same reason.  The task of keeping things in sync, 
> without deadlocks is what makes tails hard ;-)

Thank you.  Yes, I can see that, but I also think I'm off to a pretty good start
as far as avoiding races and deadlocks goes.  I haven't been very specific
about the details of synchronization yet, but I'll get there soon.  In the
meantime, if you've spotted anything that's obviously wrong, please shout.

> > The unmerge that happens in ext2_file_write looks like this:
> >       <start exclusive>
> >       ext2_get_block (inode, iblock, bh_result, 2) /* force tail to new block */
> >       <clear this inode's tail offset>
> >       <delete this inode from the merge list>
> >       <end exclusive>
> 
> > The reason the unmerge is done here and not at a lower level in ext2_get_block
> > is, I don't want to have get_block go diving down into the innards of a page
> > it's not being asked to do I/O on.  That could get pretty messy.
> 
> Well, you'll have to do the same thing in writepage.  Unmerging the tail 
> only requires reads from the buffer cache, so it is not too bad inside 
> get_block.

Correct me if I'm wrong, but I think that if writepage ever needs to unmerge a
tail then it's an error.  The tail should have already been unmerged by
ext2_file_write, which must necessarily precede the writepage.  Similarly for
file_mmap: I should unmerge the tail as soon as a file is ever mmapped, and then
an unmerge can't come from that source either.
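
Roughly, the invariant I have in mind looks like the toy model below.  Again
this is only a sketch in userspace C under my own assumptions: the struct
fields and the toy_* functions are invented for illustration, the exclusion is
shown only as comments, and a bump counter stands in for the block allocator.
The write and mmap entry points unmerge the tail up front, so writepage can
treat a still-merged tail as an error rather than handling it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy inode state; all field names are hypothetical. */
struct toy_inode {
    bool          tail_merged;    /* tail shares a block with other tails */
    unsigned      tail_offset;    /* offset of the tail in the shared block */
    bool          on_merge_list;  /* inode is linked on the merge list */
    unsigned long tail_block;     /* block currently holding the tail */
};

static unsigned long toy_next_block = 100;  /* stand-in allocator */

/* Mirrors the quoted unmerge sequence; locking is shown only as comments. */
static void toy_unmerge_tail(struct toy_inode *inode)
{
    assert(inode != NULL);
    if (!inode->tail_merged)
        return;                              /* nothing to do */
    /* <start exclusive> */
    inode->tail_block = toy_next_block++;    /* get_block with create == 2 */
    inode->tail_offset = 0;                  /* clear the tail offset */
    inode->on_merge_list = false;            /* drop from the merge list */
    inode->tail_merged = false;
    /* <end exclusive> */
}

/* Write and mmap both unmerge before any page can be dirtied. */
int toy_file_write(struct toy_inode *inode)
{
    toy_unmerge_tail(inode);
    return 0;
}

int toy_file_mmap(struct toy_inode *inode)
{
    toy_unmerge_tail(inode);
    return 0;
}

/* writepage never unmerges: a merged tail here means a caller skipped a step. */
int toy_writepage(struct toy_inode *inode)
{
    if (inode->tail_merged)
        return -1;                           /* treated as an error */
    return 0;
}
```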

-- 
Daniel
