On Thu, 10 Aug 2000, Chris Mason wrote:
> [ please let me know if you want anything in the cut text answered ;-) ]
If you've got anything to say about map/unmap, which I haven't looked at at
all yet, I'm interested. Otherwise, it's back to the usual digging...
I hope it's ok with you to cc back to the list... It's called 'trolling for
more feedback'. I'm also trying to be a little bit didactic about the design
process for the benefit of list lurkers. :-)
> On 8/9/00, 4:55:13 PM, Daniel Phillips <[EMAIL PROTECTED]>
> > In Ext2 when you unmerge you also have to drill down into the inode's index tree
> > and reallocate the tail block. Unless someone thinks this is a really bad idea,
> > I'm going to handle this by letting the 'create' parameter of ext2_get_block
> > take 3 values instead of 2:
>
> > 0: don't create a block, return null if it's not there (read)
> > 1: create a block if there isn't one there (write)
> > 2: create a block even if there's already one there (copy on write)
>
> I'm not sure how the copy on write case is any different from the write
> case. You can't use the generic functions (or writepage at all) without
> unmerging the tail on write.
Since I'm going to modify the version of ext2_get_block that's used everywhere,
I can't afford to start doing copy-on-write when the existing code is
expecting a normal overwrite. So I only want the copy-on-write to happen in
exactly one place, where a tail block has to be unmerged. Since the caller
knows whether this is the case or not and the callee would have to discover it
through a series of tests, I think it's better to just *tell* the callee what
to do.
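
To make the three values concrete, here is a rough sketch in plain C. It is
a toy model and not ext2 code: the names GB_READ/GB_WRITE/GB_COW, the
toy_inode struct and toy_get_block are all made up for illustration, and the
real ext2_get_block takes a buffer_head and walks the index tree instead of a
flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for the three 'create' values (not in ext2). */
enum create_mode {
    GB_READ  = 0,   /* don't create; report a hole if nothing is mapped */
    GB_WRITE = 1,   /* create only if nothing is mapped yet */
    GB_COW   = 2,   /* always allocate a fresh block (copy on write) */
};

#define TOY_NBLOCKS 8

/* Toy inode: a flat logical-to-physical map, 0 meaning "hole". */
struct toy_inode {
    unsigned long map[TOY_NBLOCKS];
    unsigned long next_free;    /* stand-in for the block allocator */
};

/* Returns the physical block for iblock, or 0 if there is none. */
static unsigned long toy_get_block(struct toy_inode *inode, size_t iblock,
                                   int create)
{
    unsigned long cur;

    assert(inode != NULL);
    if (iblock >= TOY_NBLOCKS)
        return 0;

    cur = inode->map[iblock];
    if (create == GB_READ)
        return cur;             /* may be 0: caller sees a hole */
    if (create == GB_WRITE && cur != 0)
        return cur;             /* normal overwrite in place */

    /* Write to a hole, or copy on write: map a new block. */
    inode->map[iblock] = inode->next_free++;
    return inode->map[iblock];
}
```

The point of the sketch is that existing callers passing 0 or 1 see exactly
the old behaviour, and only the one caller that passes 2 gets a new block
forced on it.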
In Tux2 I do copy-on-write religiously for every block write, except where the
block has already been modified in the same phase, but that's another story.
> > I'm not quite ready to code this yet because I'm not clear on what the special
> > handling of 'metadata' is all about. But getting close.
>
> Metadata is special mostly because it is in the buffer cache, which is
> not synchronized with the page cache at all. File tails effectively
> become metadata for the same reason. The task of keeping things in sync,
> without deadlocks is what makes tails hard ;-)
Thank you. Yes, I can see that, but I also think I'm off to a pretty good start
as far as avoiding races and deadlocks goes. I haven't been very specific
about the details of synchronization yet, but I'll get there soon. In the
meantime, if you've spotted anything that's obviously wrong, please shout.
> > The unmerge that happens in ext2_file_write looks like this:
> > <start exclusive>
> > ext2_get_block (inode, iblock, bh_result, 2) /* force tail to new block */
> > <clear this inode's tail offset>
> > <delete this inode from the merge list>
> > <end exclusive>
>
> > The reason the unmerge is done here and not at a lower level in ext2_get_block
> > is, I don't want to have get_block go diving down into the innards of a page
> > it's not being asked to do I/O on. That could get pretty messy.
>
> Well, you'll have to do the same thing in writepage. Unmerging the tail
> only requires reads from the buffer cache, so it is not too bad inside
> get_block.
Correct me if I'm wrong, but I think that if writepage ever needs to unmerge a
tail then it's an error. The tail should have already been unmerged by
ext2_file_write, which must necessarily precede the writepage. Similarly for
file_mmap: I should unmerge the tail as soon as a file is mmapped, and then
an unmerge can't come from that source either.
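
Here is the invariant I have in mind, again as a toy in plain C rather than
kernel code. Everything in it is hypothetical: tail_inode, the toy_* function
names, the 'exclusive' counter standing in for whatever lock ends up being
used, and the -1 return standing in for a real error code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy inode state: just enough to track a merged tail. */
struct tail_inode {
    bool tail_merged;           /* is the tail packed into a shared block? */
    unsigned tail_offset;       /* offset of the tail in the shared block */
    unsigned long tail_block;   /* block currently holding the tail */
    int exclusive;              /* stand-in for the exclusive section */
};

static unsigned long toy_next_free = 200;   /* stand-in allocator */

/* The unmerge sequence from the pseudocode, done in one place only. */
static void toy_unmerge_tail(struct tail_inode *inode)
{
    if (!inode->tail_merged)
        return;
    inode->exclusive++;                     /* <start exclusive> */
    inode->tail_block = toy_next_free++;    /* get_block(..., 2) */
    inode->tail_offset = 0;                 /* clear tail offset */
    inode->tail_merged = false;             /* drop from merge list */
    inode->exclusive--;                     /* <end exclusive> */
}

/* Both entry points unmerge before any page can be dirtied. */
static int toy_file_write(struct tail_inode *inode)
{
    toy_unmerge_tail(inode);
    return 0;
}

static int toy_file_mmap(struct tail_inode *inode)
{
    toy_unmerge_tail(inode);
    return 0;
}

/* writepage never unmerges; a merged tail here is treated as a bug. */
static int toy_writepage(const struct tail_inode *inode)
{
    return inode->tail_merged ? -1 : 0;
}
```

If that holds up, writepage only has to check for a merged tail and
complain, and it never has to go digging in the buffer cache itself.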
--
Daniel