Re: [RFC] api for consistent lvm snapshots

Daniel Phillips Wed, 04 Oct 2000 07:25:09 -0700
Chris Mason wrote:
> 
> --On 10/04/00 02:23:30 +0200 Daniel Phillips <[EMAIL PROTECTED]> wrote:
> 
> > Chris Mason wrote:
> >>
> >> --On 10/03/00 21:13:04 +0200 Daniel Phillips
> >> <[EMAIL PROTECTED]> wrote:
> >>
> >> > Chris Mason wrote:
> >> >> Heinz Mauelshagen and I have come up with an API for LVM to use
> >> >> for creating consistent snapshots.  The idea is to block FS
> >> >> modifications while the snapshot is being created, and to give the FS
> >> >> the chance to flush everything (including all pending transactions)
> >> >> to disk before LVM starts doing copy on write.
> >> > [...]
> >> >
> >> >
> >> > What is the thinking behind having a sync_super sandwich?  I've always
> >> > wondered about that.
> >> >
> >>
> >> For filesystems that don't have a write_super_lockfs call back, I wanted
> >> to make fsync_dev_lockfs == fsync_dev.
> >
> > I was actually asking why there are two calls to sync_supers, one at the
> > beginning and one at the end.  That sounds like one too many to me.
> >
> 
> The first one is a call to sync_supers, which will skip the call to the FS
> write_super method if the super is not dirty.  The second is a call to
> sync_supers_lockfs, which will always call the FS write_super_lockfs
> method.

So if the superblock *is* dirty it will be written twice?
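
The sequence described above can be modelled in user space to count how many superblock methods run. This is a toy sketch, not the kernel code: only write_super, write_super_lockfs, sync_supers, sync_supers_lockfs and fsync_dev_lockfs are named in the thread, and every toy_ identifier is hypothetical. It assumes a filesystem without write_super_lockfs falls back to the ordinary dirty check. Whether the second call physically rewrites the superblock depends on the filesystem's own method.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one superblock; all toy_ names are hypothetical. */
struct toy_super {
    bool dirty;       /* superblock marked dirty */
    bool has_lockfs;  /* FS provides a write_super_lockfs method */
    int calls;        /* number of superblock methods invoked */
};

/* Stands in for the FS write_super method. */
static void toy_write_super(struct toy_super *sb)
{
    sb->calls++;
    sb->dirty = false;
}

/* Stands in for the FS write_super_lockfs method. */
static void toy_write_super_lockfs(struct toy_super *sb)
{
    sb->calls++;
    sb->dirty = false;
}

/* First pass: skips the FS method when the super is not dirty. */
static void toy_sync_supers(struct toy_super *sb)
{
    if (sb->dirty)
        toy_write_super(sb);
}

/* Second pass: always calls the lockfs method when the FS has one.
 * The fallback to the plain dirty check is an assumption. */
static void toy_sync_supers_lockfs(struct toy_super *sb)
{
    if (sb->has_lockfs)
        toy_write_super_lockfs(sb);
    else
        toy_sync_supers(sb);
}

/* The "sandwich": inode and buffer flushing between the two passes
 * is elided. Returns the number of superblock methods invoked. */
static int toy_fsync_dev_lockfs(struct toy_super *sb)
{
    assert(sb != NULL);
    toy_sync_supers(sb);
    toy_sync_supers_lockfs(sb);
    return sb->calls;
}
```

Under these assumptions, a dirty superblock on a filesystem with write_super_lockfs sees two method calls, and a clean one sees one.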

[...]
> It could have been done to avoid the first call to sync_supers, but this is
> not a speed sensitive function, and I wanted to make it really clear this
> was fsync_dev + something extra (if the FS provided it).

In Tux2 write_super doesn't make any sense until after the dirty dcache
entries are flushed and the dirty inodes are in turn flushed.  In fact,
from Tux2's point of view, the whole oddly constructed flush mechanism
that currently exists doesn't help - it just gets in the way.  For
example, fsync will try to write all dirty buffers, even when that violates
priority ordering constraints.  I simply disable that 'feature' - it's
death for filesystem integrity.  For me, bdflush and friends are just a
menace that has to be held at bay.  I don't see why this would be
different for any other filesystem that pretends to close all windows of
crash vulnerability, including yours.

What we need is a sensible method/callback/library arrangement for the
sync like we now have for read/write/mmap.  What we have now is far from
sensible.  Syncing should be done one superblock at a time, not across
the entire system like it is now.  IOW, it's currently sliced
horizontally while it really needs to be sliced vertically.  We need
a sync_filesystem method and it should default to a generic_sync_super
that does the current dumb sync.  You should then put your improvements
in as a method override, not just make the current messy arrangement
even messier.
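
A minimal sketch of what such a per-superblock arrangement might look like, in user-space C. Only the names sync_filesystem and generic_sync_super come from the proposal above. The struct layouts, the counters, sync_one_filesystem and ordered_sync are hypothetical and exist only to show the dispatch.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_super;

/* Hypothetical per-filesystem method table. */
struct sketch_super_ops {
    /* NULL means "use the generic sync". */
    int (*sync_filesystem)(struct sketch_super *sb);
};

struct sketch_super {
    const struct sketch_super_ops *ops;
    int generic_syncs;  /* times the generic path ran */
    int custom_syncs;   /* times an override ran */
};

/* Default: stands in for the existing flush-everything sync,
 * restricted to this one superblock. */
static int generic_sync_super(struct sketch_super *sb)
{
    sb->generic_syncs++;
    return 0;
}

/* Vertical slice: sync exactly one filesystem, using its own
 * method when it has one. */
static int sync_one_filesystem(struct sketch_super *sb)
{
    assert(sb != NULL);
    if (sb->ops && sb->ops->sync_filesystem)
        return sb->ops->sync_filesystem(sb);
    return generic_sync_super(sb);
}

/* Hypothetical override for a filesystem that must control its
 * own write ordering. */
static int ordered_sync(struct sketch_super *sb)
{
    sb->custom_syncs++;
    return 0;
}

static const struct sketch_super_ops ordered_ops = {
    .sync_filesystem = ordered_sync,
};
```

A filesystem that leaves the method unset gets the generic behaviour unchanged, while one that sets it is never touched by the generic flush.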

Um, this isn't the only place where support for dumb filesystems messes
things up for smart filesystems, it's just the most painful.  Dumb
filesystems should have the right to remain dumb, but smart filesystems
shouldn't have to pay a price for that.  It would be so easy to fix this
right now,
why don't we?  

Of course, "right now" == 2.5, not 2.4.

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]