On Tue, Jan 05, 2016 at 06:36:04PM +0100, Jan Kara wrote:
>   Hi,
> 
> On Mon 04-01-16 17:22:19, Dave Chinner wrote:
> > I've been looking at implementing the lazytime mount option for XFS,
> > and I'm struggling to work out what it is supposed to mean.
> > 
> > AFAICT, on ext4, lazytime means that pure timestamp updates are not
> > journalled and they are only ever written back when the inode is
> > otherwise dirtied and written, or they are timestamp dirty for 24
> > hours which triggers writeback.
> > 
> > This poses a couple of problems for XFS:
> > 
> >     1. we log every timestamp change, so there is no mechanism
> >        for delayed/deferred update.
> > 
> >     2. we track dirty metadata in the journal, not via the VFS
> >        dirty inode lists, so all the infrastructure written for
> >        ext4 to do periodic flushing is useless to us.
> > 
> > These are solvable problems, but what I'm not sure about is exactly
> > what the intended semantics of lazytime durability are. That is,
> > exactly what guarantees are we giving userspace about timestamp
> > updates when lazytime is used? The guarantees we have to give will
> > greatly influence the XFS implementation, so I really need to nail
> > down what we are expected to provide userspace. Can we:
> > 
> >     a) just ignore all durability concerns?
> >     b) if not, do we only need to care about the 24 hour
> >        writeback and unmount?
> >     c) if not, are fsync/sync/syncfs/freeze/unmount supposed
> >        to provide durability of all metadata changes?
> >     d) do we have to care about ordering - if we fsync one inode
> >        with 1 hour old timestamps, do we also need to guarantee
> >        that all the inodes with older dirty timestamps also get
> >        made durable?
> 
> So the intended semantics are:
> 1) fsync / sync / freeze / unmount will write the timestamp updates even
>    with lazytime. So unless a crash happens, timestamps are guaranteed to be
>    consistent. Also, sync / fsync guarantee that all changes reach the disk.
> 2) We periodically write back timestamps (once per 24 hours) to avoid
>    overly large timestamp inconsistencies in case of a crash.

Ok, so it's supposed to be a delayed timestamp update mechanism
without any specific ordering guarantees, not an opportunistic
timestamp update mechanism.

I can work with that.
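
To make that reading concrete, here is a rough userspace model of the
semantics as I understand them from the above. It is only a sketch, not
kernel code: the class and method names (LazytimeModel, touch,
periodic_writeback, crash) are made up for illustration, timestamps are
plain integers in seconds, and "disk" is just a second field on the
inode. The one assumption taken straight from the thread is the 24 hour
expiry; fsync flushing only the inode it is called on reflects the "no
specific ordering guarantees" interpretation.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# 24-hour expiry for dirty timestamps, as described in the thread.
DIRTY_EXPIRE_SECONDS = 24 * 60 * 60


@dataclass
class Inode:
    mem_time: int = 0                 # timestamp held in memory
    disk_time: int = 0                # timestamp last made durable
    dirtied_at: Optional[int] = None  # when the timestamp first went dirty


@dataclass
class LazytimeModel:
    """Hypothetical toy model of lazytime durability, not a real API."""
    inodes: Dict[str, Inode] = field(default_factory=dict)

    def touch(self, ino: str, now: int) -> None:
        """Pure timestamp update: changes memory only, nothing is written."""
        inode = self.inodes.setdefault(ino, Inode())
        inode.mem_time = now
        if inode.dirtied_at is None:
            inode.dirtied_at = now

    def _flush(self, inode: Inode) -> None:
        inode.disk_time = inode.mem_time
        inode.dirtied_at = None

    def fsync(self, ino: str) -> None:
        """Make one inode's timestamps durable; other inodes are untouched."""
        self._flush(self.inodes[ino])

    def sync(self) -> None:
        """sync / freeze / unmount: flush every dirty timestamp."""
        for inode in self.inodes.values():
            if inode.dirtied_at is not None:
                self._flush(inode)

    def periodic_writeback(self, now: int) -> None:
        """Flush timestamps that have been dirty for 24 hours or longer."""
        for inode in self.inodes.values():
            if (inode.dirtied_at is not None
                    and now - inode.dirtied_at >= DIRTY_EXPIRE_SECONDS):
                self._flush(inode)

    def crash(self) -> None:
        """Drop everything that was not yet durable."""
        for inode in self.inodes.values():
            inode.mem_time = inode.disk_time
            inode.dirtied_at = None
```

In this model a crash can lose at most roughly 24 hours of timestamp
updates for an inode that was never otherwise synced, and an fsync on
one inode says nothing about older dirty timestamps on other inodes.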

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com