On 2018/2/5 17:37, Yunlong Song wrote:
>
>> OK, details as I explained before:
>>
>> atomic_commit                   GC
>> - file_write_and_wait_range
>>                                 - move_data_block
>>                                  - f2fs_submit_page_write
>>                                  - f2fs_update_data_blkaddr
>>                                  - set_page_dirty
>> - fsync_node_pages
>>
>> 1. atomic writes data page #1 & update node #1
>> 2. GC data page #2 & update node #2
>> 3. page #1 & node #1 & #2 can be committed into nand flash before page #2 be
>> committed.
>>
>> After a sudden pow-cut, database transaction will be inconsistent. So I think
>> there will be better to exclude gc/atomic_write to each other, with a lock
>> instead of flag checking.
>>
>
> I do not understand why this transaction is inconsistent, is it a
> problem that page #2 is not committed into nand flash? Since normal

Yes, node #2 contains the newly updated LBAx of page #2, but if page #2 is not
committed to LBAx, then after recovery page #2's block address in node #2 will
point to LBAx, which contains random data, resulting in a corrupted db file.

> gc also has this problem:
>
> Suppose that there is db file A, f2fs_gc moves data page #1 of db file
> A. But if write checkpoint only commit node page #1 and then a sudden

f2fs will ensure that GCed data is persisted during checkpoint, so migrated
page #1 and updated node #1 will both be committed in this checkpoint. Please
check the WB_DATA_TYPE macro to see how we define the data types that cp
guarantees to write back.

> power-cut happens. Data page #1 is not committed to nand flash, but
> node page #1 is committed. Is the db transaction broken and
> inconsistent?
>
> Come back to your example, I think data page 2 of atomic file does not
> belong to this transaction, so even node page 2 is committed, it is just

If only node #2 is committed, it will be harmful to the db transaction for the
reason I gave above.

Thanks,

> the same problem as what I have listed above(db file A), and it does not
> break this transaction.
>
>> Thanks,
>>
>>>>>>
>>>>>> So how about just using dio_rwsem[WRITE] during atomic committing to
>>>>>> exclude
>>>>>> GCing data block of atomic opened file?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>>
>>>>>>> Signed-off-by: Yunlong Song <yunlong.s...@huawei.com>
>>>>>>> ---
>>>>>>>  fs/f2fs/data.c | 5 ++---
>>>>>>>  fs/f2fs/gc.c   | 6 ++++--
>>>>>>>  2 files changed, 6 insertions(+), 5 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
>>>>>>> index 7435830..edafcb6 100644
>>>>>>> --- a/fs/f2fs/data.c
>>>>>>> +++ b/fs/f2fs/data.c
>>>>>>> @@ -1580,14 +1580,13 @@ bool should_update_outplace(struct inode
>>>>>>> *inode, struct f2fs_io_info *fio)
>>>>>>>          return true;
>>>>>>>      if (S_ISDIR(inode->i_mode))
>>>>>>>          return true;
>>>>>>> -    if (f2fs_is_atomic_file(inode))
>>>>>>> -        return true;
>>>>>>>      if (fio) {
>>>>>>>          if (is_cold_data(fio->page))
>>>>>>>              return true;
>>>>>>>          if (IS_ATOMIC_WRITTEN_PAGE(fio->page))
>>>>>>>              return true;
>>>>>>> -    }
>>>>>>> +    } else if (f2fs_is_atomic_file(inode))
>>>>>>> +        return true;
>>>>>>>      return false;
>>>>>>>  }
>>>>>>>
>>>>>>> diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
>>>>>>> index b9d93fd..84ab3ff 100644
>>>>>>> --- a/fs/f2fs/gc.c
>>>>>>> +++ b/fs/f2fs/gc.c
>>>>>>> @@ -622,7 +622,8 @@ static void move_data_block(struct inode *inode,
>>>>>>> block_t bidx,
>>>>>>>      if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>>>>>>          goto out;
>>>>>>>
>>>>>>> -    if (f2fs_is_atomic_file(inode))
>>>>>>> +    if (f2fs_is_atomic_file(inode) &&
>>>>>>> +            !f2fs_is_commit_atomic_write(inode))
>>>>>>>          goto out;
>>>>>>>
>>>>>>>      if (f2fs_is_pinned_file(inode)) {
>>>>>>> @@ -729,7 +730,8 @@ static void move_data_page(struct inode *inode,
>>>>>>> block_t bidx, int gc_type,
>>>>>>>      if (!check_valid_map(F2FS_I_SB(inode), segno, off))
>>>>>>>          goto out;
>>>>>>>
>>>>>>> -    if (f2fs_is_atomic_file(inode))
>>>>>>> +    if (f2fs_is_atomic_file(inode) &&
>>>>>>> +            !f2fs_is_commit_atomic_write(inode))
>>>>>>>          goto out;
>>>>>>>      if (f2fs_is_pinned_file(inode)) {
>>>>>>>          if (gc_type == FG_GC)
>>>>>>>
>>>>>>
>>>>>> .
>>>>>>
>>>>>
>>>>
>>>>
>>>> .
>>>>
>>>
>>
>>
>> .
>>
>
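
For anyone following the thread, the ordering problem discussed above (node #2
reaching flash before migrated data page #2 does) can be illustrated with a
small user-space toy model. This is only a sketch, not f2fs code: struct flash,
flash_init(), gc_migrate(), read_page2_after_crash() and the GARBAGE value are
all hypothetical names made up for illustration, and the "flash" is just an
array in memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only, not f2fs code: every name here is hypothetical. */
enum { NR_BLOCKS = 8, GARBAGE = 0xDEAD };

struct flash {
    uint32_t data[NR_BLOCKS];   /* persisted data blocks, indexed by LBA */
    uint32_t node_blkaddr;      /* persisted node #2: LBA of data page #2 */
};

/* Start state: page #2 lives at old_lba and node #2 points there. */
static void flash_init(struct flash *f, uint32_t old_lba, uint32_t payload)
{
    for (int i = 0; i < NR_BLOCKS; i++)
        f->data[i] = GARBAGE;   /* unwritten blocks hold random data */
    f->data[old_lba] = payload;
    f->node_blkaddr = old_lba;
}

/*
 * GC moves page #2 to new_lba. The data write and the node update are
 * modelled as persisting independently, so a power cut can land between them.
 */
static void gc_migrate(struct flash *f, uint32_t new_lba, uint32_t payload,
                       bool data_persisted, bool node_persisted)
{
    if (data_persisted)
        f->data[new_lba] = payload;
    if (node_persisted)
        f->node_blkaddr = new_lba;
}

/* What recovery would read back for page #2 after the power cut. */
static uint32_t read_page2_after_crash(const struct flash *f)
{
    return f->data[f->node_blkaddr];
}
```

In this model, gc_migrate() with node_persisted set and data_persisted clear
corresponds to step 3 of the scenario: node #2 points at LBAx, but LBAx still
holds GARBAGE, so read_page2_after_crash() returns GARBAGE. When both are
persisted together, or when only the data is persisted and the node still
points at the old block, the original payload is read back.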