On Sun, Apr 15, 2018 at 07:10:52PM -0500, Vijay Chidambaram wrote:
> Thanks! As I mentioned before, this is useful. I have a follow-up
> question. Consider the following workload:
>  creat foo
>  link (foo, A/bar)
>  fsync(foo)
>  crash
> In this case, after the file system recovers, do we expect foo's link
> count to be 2 or 1? 

So, strictly ordered behaviour:

create foo:
        - creates dirent in inode B and new inode A in an atomic
          transaction sequence #1

link foo -> A/bar
        - creates dirent in inode C and bumps inode A link count in
          an atomic transaction seqeunce #2.

fsync foo
        - looks at inode A, sees it's "last modification" sequence
          counter as #2
        - flushes all transactions up to and including #2 to the

See the dependency chain? Both the inodes and dirents in the create
operation and the link operation are chained to the inode foo via
the atomic transactions. Hence when we flush foo, we also flush the
dependent changes because of the change atomicity requirements....

> I would say 2,

Correct, for strict ordering. But....

> but POSIX is silent on this,

Well, it's not silent, POSIX explicitly allows for fsync() to do
nothing and report success. Hence we can't really look to POSIX to
define how fsync() should behave.

> so
> thought I would confirm. The tricky part here is we are not calling
> fsync() on directory A.

Right. But directory A has a dependent change linked to foo. If we
fsync() foo, we are persisting the link count change in that file,
and hence all the other changes related to that link count change
must also be flushed. Similarly, all the cahnges related to the
creation on foo must be flushed, too.

> In this case, its not a symlink; its a hard link, so I would say the
> link count for foo should be 2.

Right - that's the "reference counted object dependency" I refered
to. i.e. it's a bi-direction atomic dependency - either we show both
the new dirent and the link count change, or we show neither of
them.  Hence fsync on one object implies that we are also persisting
the related changes in the other object, too.

> But btrfs and F2FS show link count of
> 1 after a crash.

That may be valid if the dirent A/bar does not exist after recovery,
but it also means fsync() hasn't actually guaranteed inode changes
made prior to the fsync to be persistent on disk. i.e. that's a
violation of ordered metadata semantics and probably a bug.


Dave Chinner

Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
Linux-f2fs-devel mailing list

Reply via email to