On Mon, Apr 16, 2018 at 7:07 PM, Dave Chinner <da...@fromorbit.com> wrote:
> On Sun, Apr 15, 2018 at 07:10:52PM -0500, Vijay Chidambaram wrote:
>> Thanks! As I mentioned before, this is useful. I have a follow-up
>> question. Consider the following workload:
>>  creat foo
>>  link (foo, A/bar)
>>  fsync(foo)
>>  crash
>> In this case, after the file system recovers, do we expect foo's link
>> count to be 2 or 1?
> So, strictly ordered behaviour:
> create foo:
>         - creates dirent in inode B and new inode A in an atomic
>           transaction sequence #1
> link foo -> A/bar
>         - creates dirent in inode C and bumps inode A link count in
>           an atomic transaction sequence #2.
> fsync foo
>         - looks at inode A, sees its "last modification" sequence
>           counter as #2
>         - flushes all transactions up to and including #2 to the
>           journal.
> See the dependency chain? Both the inodes and dirents in the create
> operation and the link operation are chained to the inode foo via
> the atomic transactions. Hence when we flush foo, we also flush the
> dependent changes because of the change atomicity requirements....
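A rough way to picture the sequence-counter behaviour described above is
the toy model below. It is only a sketch under stated assumptions, not
how any real file system implements its journal: the Journal class, its
methods, and the names foo_inode, parent_dir and dir_A are all
hypothetical and chosen for illustration. It shows that flushing "up to
and including" foo's last-modification sequence number carries the
dirent for A/bar along with the link count change.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """Toy ordered-metadata journal. Every name here is hypothetical."""
    pending: list = field(default_factory=list)   # (seq, changes) not yet durable
    durable: list = field(default_factory=list)   # transactions that survive a crash
    next_seq: int = 1
    last_mod: dict = field(default_factory=dict)  # inode -> seq of its last change

    def commit(self, inodes, changes):
        """Record one atomic transaction that touches the given inodes."""
        seq = self.next_seq
        self.next_seq += 1
        self.pending.append((seq, changes))
        for ino in inodes:
            self.last_mod[ino] = seq
        return seq

    def fsync(self, ino):
        """Flush every transaction up to and including the inode's last one."""
        upto = self.last_mod.get(ino, 0)
        self.durable += [t for t in self.pending if t[0] <= upto]
        self.pending = [t for t in self.pending if t[0] > upto]

    def recover(self):
        """Replay durable transactions; anything still pending is lost."""
        state = {"nlink": {}, "dirents": set()}
        for _seq, changes in self.durable:
            for kind, key, value in changes:
                if kind == "nlink":
                    state["nlink"][key] = value
                else:
                    state["dirents"].add((key, value))
        return state


j = Journal()
# creat foo: dirent in the parent directory plus the new inode (sequence #1)
j.commit(["foo_inode", "parent_dir"],
         [("dirent", "parent_dir", "foo"), ("nlink", "foo_inode", 1)])
# link(foo, A/bar): dirent in directory A plus link count bump (sequence #2)
j.commit(["foo_inode", "dir_A"],
         [("dirent", "dir_A", "bar"), ("nlink", "foo_inode", 2)])
j.fsync("foo_inode")   # foo's last modification is #2, so #1 and #2 are flushed
state = j.recover()    # what the model says is visible after a crash
```

In this model, state["nlink"]["foo_inode"] is 2 and ("dir_A", "bar") is
in state["dirents"], i.e. both halves of the link operation are
recovered together.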
>> I would say 2,
> Correct, for strict ordering. But....
>> but POSIX is silent on this,
> Well, it's not silent, POSIX explicitly allows for fsync() to do
> nothing and report success. Hence we can't really look to POSIX to
> define how fsync() should behave.
>> so
>> thought I would confirm. The tricky part here is we are not calling
>> fsync() on directory A.
> Right. But directory A has a dependent change linked to foo. If we
> fsync() foo, we are persisting the link count change in that file,
> and hence all the other changes related to that link count change
> must also be flushed. Similarly, all the changes related to the
> creation of foo must be flushed, too.
>> In this case, it's not a symlink; it's a hard link, so I would say the
>> link count for foo should be 2.
> Right - that's the "reference counted object dependency" I referred
> to. i.e. it's a bi-directional atomic dependency - either we show both
> the new dirent and the link count change, or we show neither of
> them.  Hence fsync on one object implies that we are also persisting
> the related changes in the other object, too.
>> But btrfs and F2FS show link count of
>> 1 after a crash.
> That may be valid if the dirent A/bar does not exist after recovery,
> but it also means fsync() hasn't actually guaranteed inode changes
> made prior to the fsync to be persistent on disk. i.e. that's a
> violation of ordered metadata semantics and probably a bug.

Great, this matches our understanding perfectly. We have separately
posted to the btrfs mailing list to confirm it is a bug. Thanks!
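
For anyone who wants to reproduce the pre-crash half of the workload,
here is a minimal sketch in Python. It assumes directory A already
exists before the workload starts (the script creates it in a temporary
directory), and the helper name run_workload is hypothetical. It does
not simulate the crash itself; checking the link count after recovery
needs a separate crash-injection setup on the file system under test.

```python
import os
import tempfile


def run_workload(root):
    """creat foo; link(foo, A/bar); fsync(foo). Returns foo's link count."""
    foo = os.path.join(root, "foo")
    dir_a = os.path.join(root, "A")
    os.mkdir(dir_a)  # assumption: A exists before the workload begins

    fd = os.open(foo, os.O_CREAT | os.O_WRONLY, 0o644)  # creat foo
    try:
        os.link(foo, os.path.join(dir_a, "bar"))        # link(foo, A/bar)
        os.fsync(fd)                                    # fsync(foo)
        return os.fstat(fd).st_nlink                    # in-memory view, pre-crash
    finally:
        os.close(fd)


with tempfile.TemporaryDirectory() as root:
    nlink_before_crash = run_workload(root)
```

Before the crash the link count is 2 on any file system that supports
hard links. The open question in this thread is only whether that value,
together with the dirent A/bar, is still there after recovery.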

Linux-f2fs-devel mailing list