Alexander Viro wrote:
> On Thu, 10 Aug 2000, Daniel Phillips wrote:
> 
> > Correct me if I'm wrong, but I think that if writepage ever needs to unmerge a
> > tail then it's an error.  The tail should have already been unmerged by the
> > ext2_file_write, which must necessarily precede the writepage.  Similarly for
> > file_mmap: I should unmerge the tail as soon as a file is ever mmapped, and then
> > an unmerge can't come from that source either.
> 
> As for the latter - you shouldn't. THE thing about mmap() is that it is
> not allowed to expand the file. Ever.

Good to see that stated in no uncertain terms.  I hadn't read that part of the
code, so I wasn't sure about it.  I suppose POSIX must have something to say
about it, but I've been unsuccessful so far in getting hold of any useful POSIX
docs.  (Can somebody give me a pointer?)

It now sounds like file_mmap may not have to be changed at all.
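
The "mmap never expands the file" rule can be checked from user space.  The
following is a minimal sketch, not kernel code: the function name
size_after_store_past_eof and the scratch path are made up for illustration,
and it assumes the requested length is smaller than one page.  It stores a
byte past EOF but inside the last mapped page and then reports the file size.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a file of `len` bytes, map one full page of it
 * MAP_SHARED, store one byte inside the file and one byte past EOF (still
 * inside the mapped page), then return the file size as seen by fstat().
 * Returns -1 on any error.  Assumes 0 < len and len + 1 < page size.
 */
long size_after_store_past_eof(const char *path, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    long result = -1;
    struct stat st;
    char *map;
    int synced;
    int fd;

    if (page <= 0 || len == 0 || len + 1 >= (size_t)page)
        return -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t)len) != 0)   /* explicit resize: this may grow a file */
        goto out;

    map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out;

    map[0] = 'x';         /* inside the file: written back to it */
    map[len + 1] = 'y';   /* past EOF, same page: must not grow the file */

    synced = msync(map, (size_t)page, MS_SYNC);
    munmap(map, (size_t)page);
    if (synced == 0 && fstat(fd, &st) == 0)
        result = (long)st.st_size;

out:
    close(fd);
    unlink(path);         /* the scratch file is only needed for the check */
    return result;
}
```

Calling size_after_store_past_eof("/tmp/mmap_tail_demo.tmp", 100) should give
back the length that was set with ftruncate, since only an explicit write or
truncate can move EOF.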

> So all you have is to make sure that your file has no holes _past_ the end of tail. 

I don't get it.  The tail should have been unmerged by file_write or
ext2_truncate, and nothing else should be able to create such a hole.  Did I
miss something?

> ...Then every writepage() may
>         * hit the existing full block (no problem)
>         * need to allocate the full block (also no problem)
>         * write to the tail without changing the length (no unmerging)

Yes, that was the plan.
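
The three cases can be written down as a small classifier.  This is only a
sketch under stated assumptions: the enum, the function name and the `mapped`
flag are hypothetical, it treats any partial last block as a merged tail, and
it takes for granted that file_write or truncate has already unmerged whenever
the length changed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three writepage() cases listed above. */
enum wp_case {
    WP_EXISTING_FULL_BLOCK,   /* block already allocated: just write it */
    WP_ALLOCATE_FULL_BLOCK,   /* hole in the file: allocate a full block */
    WP_TAIL_IN_PLACE          /* write into the tail, length unchanged */
};

/*
 * Classify a write to logical block `block` of a file of `file_size` bytes.
 * `mapped` says whether the block already has an on-disk block behind it.
 * Assumption: a partial last block is always stored as a merged tail.
 */
enum wp_case classify_writepage_block(uint64_t block, uint64_t file_size,
                                      uint32_t block_size, bool mapped)
{
    uint64_t tail_block = file_size / block_size;
    bool has_tail = (file_size % block_size) != 0;

    if (has_tail && block == tail_block)
        return WP_TAIL_IN_PLACE;      /* no unmerge: the length is not changing */

    return mapped ? WP_EXISTING_FULL_BLOCK : WP_ALLOCATE_FULL_BLOCK;
}
```

None of the three outcomes includes an unmerge, which is the point: if
writepage ever found itself needing one, an earlier step failed to do its job.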

> It doesn't come for free - to make sure that writepage() will never have
> to unmerge you must unmerge() on expanding ->truncate().

Yes, did I forget to say that?  Truncate has to unmerge in every case except
where the truncate happens to hit inside the tail.  You could relax that a bit,
and say 'except when truncate hits between the beginning of the tail and the
beginning of the next merged tail' but then you'd either have to keep a map of
all the other tails or you'd have to walk the merge ring.  It's much easier to
use the simpler condition; the tiny optimization isn't worth the extra work and
complexity.  Probably.
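
As a sketch, the simpler condition might look like this.  The function name
and parameters are hypothetical, and it reads "inside the tail" as any new
size from the start of the tail's block up to the current end of file; it
also assumes a partial last block is always a merged tail.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check for ->truncate(): does changing the size from
 * `old_size` to `new_size` require unmerging the tail first?
 * Assumption: the tail is the partial last block [tail_start, old_size).
 */
bool truncate_needs_unmerge(uint64_t old_size, uint64_t new_size,
                            uint32_t block_size)
{
    uint64_t tail_start = old_size - (old_size % block_size);

    if (tail_start == old_size)
        return false;   /* file ends on a block boundary: no tail at all */

    /* New end lands inside the existing tail: it only shrinks in place. */
    if (new_size >= tail_start && new_size <= old_size)
        return false;

    /* Everything else, including any expanding truncate, unmerges. */
    return true;
}
```

The relaxed variant would raise the upper bound from old_size to the start of
the next merged tail in the shared block, which is exactly the information
that needs the extra map or a walk of the merge ring.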

> So there... I had to start doing something like that for UFS pre-patches.
> Hopefully it will be portable to ext2 - I'd like to try 32Kb blocks/4Kb
> fragments, for one thing. Problems are mostly the same, except that I'm
> trying to deal with allocation units larger than PAGE_CACHE_SIZE - it adds
> some fun. OTOH, I can almost freely use buffer_cache - I'm not interested
> in fragments below the 512 bytes...

It's good to know that mm handling for large filesystem blocks is on the way...

-- 
Daniel
