On 2019/12/8 21:51, Gao Xiang via Linux-f2fs-devel wrote:
> Hi,
>
> On Sun, Dec 08, 2019 at 09:15:55PM +0800, Hongwei Qin wrote:
>> Hi,
>>
>> On Sun, Dec 8, 2019 at 12:01 PM Chao Yu <c...@kernel.org> wrote:
>>>
>>> Hello,
>>>
>>> On 2019-12-7 18:10, [sender name garbled] wrote:
>>>> Hi F2FS experts,
>>>> The following confuses me:
>>>>
>>>> A typical fsync() goes like this:
>>>> 1) Issue data block IOs
>>>> 2) Wait for completion
>>>> 3) Issue chained node block IOs
>>>> 4) Wait for completion
>>>> 5) Issue flush command
>>>>
>>>> In order to preserve data consistency under sudden power failure, it
>>>> requires that the storage device persists data blocks prior to node blocks.
>>>> Otherwise, under sudden power failure, it's possible that the persisted
>>>> node block points to NULL data blocks.
>>>
>>> Firstly, it doesn't break POSIX semantics, right? Since fsync() didn't return
>>> successfully before the sudden power-cut, we cannot guarantee that data is
>>> fully persisted in such a condition.
>>>
>>> However, what you want looks like atomic write semantics, which mostly
>>> databases want to guarantee during db file updates.
>>>
>>> F2FS supports atomic_write via ioctl, which is used by SQLite officially; I
>>> guess you can check its implementation details.
>>>
>>> Thanks,
>>>
>>
>> Thanks for your kind reply.
>> It's true that if we meet a power failure before fsync() completes,
>> POSIX doesn't require the FS to recover the file. However, consider the
>> following situation:
>>
>> 1) Data block IOs (Not persisted)
>> 2) Node block IOs (All Persisted)
>> 3) Power failure
>>
>> Since the node blocks are all persisted before the power failure, the node
>> chain isn't broken. Note that this file's new data is not properly
>> persisted before the crash. So the recovery process should be able to
>> recognize this situation and avoid recovering this file. However, since
>> the node chain is not broken, perhaps the recovery process will regard
>> this file as recoverable?
>
> To my own limited understanding, I'm afraid that seems true in the extreme case.
> Without a proper FLUSH command, newer nodes could be recovered even though no
> newer data was persisted.
>
> So if fsync() is not successful, the old data should be read,
> but in this case unexpected data (not A or A'; it could be random data
> C) will be considered valid since its node is OK.
>
> It seems it should FLUSH data before the related node chain is written, or
> introduce some data checksum.
>
> If I am wrong, kindly correct me...

Yes, I guess if the user wants a stronger consistency guarantee from fsync() than
the POSIX one, we can refactor fsync_mode=strict a bit to handle fsync() IOs
like we do for atomic write IOs, keeping a strict data/node IO order.

But note that such a consistency guarantee is weak: after a sudden power-cut, the
recovered file may contain mixed old and new data (fsynced data only partially
persisted), which may also crash the applications.

Thanks,

>
> Thanks,
> Gao Xiang
>
> _______________________________________________
> Linux-f2fs-devel mailing list
> Linux-f2fs-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
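
For readers following the thread, the scenario discussed above (node persisted, data not, power lost before the final flush) can be walked through with a toy model. This is only a sketch under stated assumptions: `Device`, `run`, `recover`, and the block labels are hypothetical names invented for illustration, the "device" is a dictionary with a volatile write cache, and none of this is F2FS code or its actual recovery logic. It assumes the new data goes to a freshly allocated block that still holds leftover content C, as in Gao Xiang's "not A or A'" remark.

```python
from dataclasses import dataclass, field

# Hypothetical block labels for this sketch only.
DATA = "data@100"   # newly allocated data block for the fsynced write
NODE = "node@7"     # node block that will point at DATA


@dataclass
class Device:
    """Toy block device with a volatile write cache (not a real driver model)."""
    media: dict = field(default_factory=dict)   # contents that survive power loss
    cache: dict = field(default_factory=dict)   # completed writes, not yet durable

    def write(self, addr, payload):
        # A write "completes" once it is in the cache; durability is not implied.
        self.cache[addr] = payload

    def flush(self):
        # A flush makes every cached write durable.
        self.media.update(self.cache)
        self.cache.clear()

    def power_cut(self, survivors=()):
        # On power loss only the cached writes named in `survivors` reach the media.
        for addr in survivors:
            if addr in self.cache:
                self.media[addr] = self.cache[addr]
        self.cache.clear()


def recover(dev):
    """Toy roll-forward: trust any persisted node and follow its pointer."""
    node = dev.media.get(NODE)
    if node is None:
        return "A"                      # no new node found: the old file content is kept
    return dev.media[node["points_to"]]


def run(flush_before_node):
    dev = Device()
    dev.media[DATA] = "C"               # leftover content in the newly allocated block
    dev.write(DATA, "A'")               # steps 1-2: issue the data IO and wait
    if flush_before_node:
        dev.flush()                     # extra barrier: data is durable before the node
    dev.write(NODE, {"points_to": DATA})  # steps 3-4: issue the node IO and wait
    # Power fails before step 5; the device happened to persist only the node.
    dev.power_cut(survivors=[NODE])
    return recover(dev)


print(run(flush_before_node=False))     # C
print(run(flush_before_node=True))      # A'
```

Without the intermediate flush, the toy recovery follows the intact node to a block that never received A' and returns C. With the flush, the same crash returns A'. The model covers a single data block; with several blocks, a crash during the data phase can still leave a mix of old and new content, which is the weakness noted in the reply above.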