Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file
-Original Message- From: Jaegeuk Kim [mailto:jaeg...@kernel.org] Sent: Tuesday, September 23, 2014 12:53 PM To: Chao Yu Cc: 'Huang Ying'; linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Subject: Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file On Mon, Sep 22, 2014 at 05:20:19PM +0800, Chao Yu wrote: -Original Message- From: Huang Ying [mailto:ying.hu...@intel.com] Sent: Monday, September 22, 2014 3:39 PM To: Chao Yu Cc: 'Jaegeuk Kim'; linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Subject: Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file On Mon, 2014-09-22 at 15:24 +0800, Chao Yu wrote: Hi Jaegeuk, Huang, -Original Message- From: Jaegeuk Kim [mailto:jaeg...@kernel.org] Sent: Thursday, September 18, 2014 1:51 PM To: linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Cc: Jaegeuk Kim; Huang Ying Subject: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file This patch revisited whole the recovery information during the f2fs_sync_file. In this patch, there are three information to make a decision. a) IS_CHECKPOINTED, /* is it checkpointed before? */ b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */ c) HAS_LAST_FSYNC,/* has the latest node fsync mark? */ And, the scenarios for our rule are based on: [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) 2. inode(x) | CP | inode(F) | dnode(F) 3. inode(x) | CP | dnode(F) | inode(x) | inode(F) 4. inode(x) | CP | dnode(F) | inode(F) 5. CP | inode(x) | dnode(F) | inode(DF) 6. CP | inode(DF) | dnode(F) 7. CP | dnode(F) | inode(DF) 8. CP | dnode(F) | inode(x) | inode(DF) No sure, do we missed these cases: inode(x) | CP | inode(F) | dnode(x) - write inode(F) CP | inode(DF) | dnode(x) - write inode(F) In these cases we will write another inode with fsync flag because our last dnode is written out to disk by bdi-flusher (HAS_LAST_FSYNC is not marked). But this appended inode is not useful. AFAIK, HAS_LAST_FSYNC(AKA fsync_done) is introduced in commit 479f40c44ae3 (f2fs: skip unnecessary node writes during fsync) to avoid writting multiple unneeded inode page by multiple redundant fsync calls. But for now, its role can be taken by HAS_FSYNCED_INODE. So, can we remove this flag to simplify our logic of fsync flow? Then in fsync flow, the rule can be: If CPed before, there must be a inode(F) written in warm node chain; How about inode(x) | CP | dnode(F) Oh, I missed this one, thanks for remindering that. There is another case: inode(x) | CP | dnode(F) | dnode(x) - write inode(F) It seems we will append another unneeded inode(F) in this patch also due to no HAS_LAST_FSYNC in nat entry cache of inode. As the current rule for roll-forward recovery, we need inode(F) to find the latest mark. This can also be used to distinguish fsynced inode from writebacked inodes. Got it now, thanks for your explanation! If not CPed before, there must be a inode(DF) written in warm node chain. For example below: 1) checkpoint 2) create a, change a 3) fsync a 4) open a, change a Do we want recovery to stop at dnode(F) in step 3) or stop at dnode(x) produced by step 4)? To my understanding, we will recover all info related to fsynced nodes in warm node chain. So will produce to step 4 if changed nodes in step 4 are flushed to device. To Huang, My mistaken for the wrong explanation due to my misunderstanding about the recover code, sorry about that! Current rule should stop at 3) fsync a. It won't recover 4)'s inode, since it was just writebacked. If we'd like to recover whole the inode and its data, we should traverse all the recovery paths from the sketch. I am missing some codes when traverse the recover flow. You're right, and thanks for correcting me. Regards, Yu Thanks, Thanks, Yu Best Regards, Huang, Ying For example, #3, the three conditions should be changed as follows. inode(x) | CP | dnode(F) | inode(x) | inode(F) a)x o o o o b)x x x x o c)x o o x o If f2fs_sync_file stops --^, it should write inode(F)--^ So, the need_inode_block_update should return true, since c) get_nat_flag(e, HAS_LAST_FSYNC), is false
Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file
Hi Jaegeuk, Huang, -Original Message- From: Jaegeuk Kim [mailto:jaeg...@kernel.org] Sent: Thursday, September 18, 2014 1:51 PM To: linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Cc: Jaegeuk Kim; Huang Ying Subject: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file This patch revisited whole the recovery information during the f2fs_sync_file. In this patch, there are three information to make a decision. a) IS_CHECKPOINTED, /* is it checkpointed before? */ b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */ c) HAS_LAST_FSYNC,/* has the latest node fsync mark? */ And, the scenarios for our rule are based on: [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) 2. inode(x) | CP | inode(F) | dnode(F) 3. inode(x) | CP | dnode(F) | inode(x) | inode(F) 4. inode(x) | CP | dnode(F) | inode(F) 5. CP | inode(x) | dnode(F) | inode(DF) 6. CP | inode(DF) | dnode(F) 7. CP | dnode(F) | inode(DF) 8. CP | dnode(F) | inode(x) | inode(DF) No sure, do we missed these cases: inode(x) | CP | inode(F) | dnode(x) - write inode(F) CP | inode(DF) | dnode(x) - write inode(F) In these cases we will write another inode with fsync flag because our last dnode is written out to disk by bdi-flusher (HAS_LAST_FSYNC is not marked). But this appended inode is not useful. AFAIK, HAS_LAST_FSYNC(AKA fsync_done) is introduced in commit 479f40c44ae3 (f2fs: skip unnecessary node writes during fsync) to avoid writting multiple unneeded inode page by multiple redundant fsync calls. But for now, its role can be taken by HAS_FSYNCED_INODE. So, can we remove this flag to simplify our logic of fsync flow? Then in fsync flow, the rule can be: If CPed before, there must be a inode(F) written in warm node chain; If not CPed before, there must be a inode(DF) written in warm node chain. Thanks, Yu For example, #3, the three conditions should be changed as follows. inode(x) | CP | dnode(F) | inode(x) | inode(F) a)x o o o o b)x x x x o c)x o o x o If f2fs_sync_file stops --^, it should write inode(F)--^ So, the need_inode_block_update should return true, since c) get_nat_flag(e, HAS_LAST_FSYNC), is false. For example, #8, CP | alloc | dnode(F) | inode(x) | inode(DF) a)o xx x x b)x x x o c)o o x o If f2fs_sync_file stops ---^, it should write inode(DF)--^ Note that, the roll-forward policy should follow this rule, which means, if there are any missing blocks, we doesn't need to recover that inode. Signed-off-by: Huang Ying ying.hu...@intel.com Signed-off-by: Jaegeuk Kim jaeg...@kernel.org --- fs/f2fs/data.c | 3 --- fs/f2fs/f2fs.h | 6 +++--- fs/f2fs/file.c | 10 ++ fs/f2fs/node.c | 56 +++- fs/f2fs/node.h | 21 - 5 files changed, 56 insertions(+), 40 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 0e37658..fdc3dbe 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1089,9 +1089,6 @@ static ssize_t f2fs_direct_IO(int rw, struct kiocb *iocb, if (check_direct_IO(inode, rw, iter, offset)) return 0; - /* clear fsync mark to recover these blocks */ - fsync_mark_clear(F2FS_I_SB(inode), inode-i_ino); - trace_f2fs_direct_IO_enter(inode, offset, count, rw); err = blockdev_direct_IO(rw, iocb, inode, iter, offset, get_data_block); diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h index 9fc1bcd..fd44083 100644 --- a/fs/f2fs/f2fs.h +++ b/fs/f2fs/f2fs.h @@ -1224,9 +1224,9 @@ struct dnode_of_data; struct node_info; bool available_free_memory(struct f2fs_sb_info *, int); -int is_checkpointed_node(struct f2fs_sb_info *, nid_t); -bool fsync_mark_done(struct f2fs_sb_info *, nid_t); -void fsync_mark_clear(struct f2fs_sb_info *, nid_t); +bool is_checkpointed_node(struct f2fs_sb_info *, nid_t); +bool has_fsynced_inode(struct f2fs_sb_info *, nid_t); +bool need_inode_block_update(struct f2fs_sb_info *, nid_t); void get_node_info(struct f2fs_sb_info *, nid_t, struct node_info *); int get_dnode_of_data(struct dnode_of_data *, pgoff_t, int); int truncate_inode_blocks(struct inode *, pgoff_t); diff --git a/fs/f2fs/file.c b/fs/f2fs/file.c index af06e22..3035c79 100644 --- a/fs/f2fs/file.c +++ b/fs/f2fs/file.c @@ -207,15 +207,17 @@ int f2fs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) up_write(fi-i_sem); } } else { - /* if there is no written node page, write its inode page */ - while (!sync_node_pages(sbi, ino, wbc
Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file
-Original Message- From: Huang Ying [mailto:ying.hu...@intel.com] Sent: Monday, September 22, 2014 3:39 PM To: Chao Yu Cc: 'Jaegeuk Kim'; linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Subject: Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file On Mon, 2014-09-22 at 15:24 +0800, Chao Yu wrote: Hi Jaegeuk, Huang, -Original Message- From: Jaegeuk Kim [mailto:jaeg...@kernel.org] Sent: Thursday, September 18, 2014 1:51 PM To: linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Cc: Jaegeuk Kim; Huang Ying Subject: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file This patch revisited whole the recovery information during the f2fs_sync_file. In this patch, there are three information to make a decision. a) IS_CHECKPOINTED, /* is it checkpointed before? */ b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */ c) HAS_LAST_FSYNC,/* has the latest node fsync mark? */ And, the scenarios for our rule are based on: [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) 2. inode(x) | CP | inode(F) | dnode(F) 3. inode(x) | CP | dnode(F) | inode(x) | inode(F) 4. inode(x) | CP | dnode(F) | inode(F) 5. CP | inode(x) | dnode(F) | inode(DF) 6. CP | inode(DF) | dnode(F) 7. CP | dnode(F) | inode(DF) 8. CP | dnode(F) | inode(x) | inode(DF) No sure, do we missed these cases: inode(x) | CP | inode(F) | dnode(x) - write inode(F) CP | inode(DF) | dnode(x) - write inode(F) In these cases we will write another inode with fsync flag because our last dnode is written out to disk by bdi-flusher (HAS_LAST_FSYNC is not marked). But this appended inode is not useful. AFAIK, HAS_LAST_FSYNC(AKA fsync_done) is introduced in commit 479f40c44ae3 (f2fs: skip unnecessary node writes during fsync) to avoid writting multiple unneeded inode page by multiple redundant fsync calls. But for now, its role can be taken by HAS_FSYNCED_INODE. So, can we remove this flag to simplify our logic of fsync flow? Then in fsync flow, the rule can be: If CPed before, there must be a inode(F) written in warm node chain; How about inode(x) | CP | dnode(F) Oh, I missed this one, thanks for remindering that. There is another case: inode(x) | CP | dnode(F) | dnode(x) - write inode(F) It seems we will append another unneeded inode(F) in this patch also due to no HAS_LAST_FSYNC in nat entry cache of inode. If not CPed before, there must be a inode(DF) written in warm node chain. For example below: 1) checkpoint 2) create a, change a 3) fsync a 4) open a, change a Do we want recovery to stop at dnode(F) in step 3) or stop at dnode(x) produced by step 4)? To my understanding, we will recover all info related to fsynced nodes in warm node chain. So will produce to step 4 if changed nodes in step 4 are flushed to device. Thanks, Yu Best Regards, Huang, Ying For example, #3, the three conditions should be changed as follows. inode(x) | CP | dnode(F) | inode(x) | inode(F) a)x o o o o b)x x x x o c)x o o x o If f2fs_sync_file stops --^, it should write inode(F)--^ So, the need_inode_block_update should return true, since c) get_nat_flag(e, HAS_LAST_FSYNC), is false. For example, #8, CP | alloc | dnode(F) | inode(x) | inode(DF) a)o xx x x b)x x x o c)o o x o If f2fs_sync_file stops ---^, it should write inode(DF)--^ Note that, the roll-forward policy should follow this rule, which means, if there are any missing blocks, we doesn't need to recover that inode. Signed-off-by: Huang Ying ying.hu...@intel.com Signed-off-by: Jaegeuk Kim jaeg...@kernel.org --- fs/f2fs/data.c | 3 --- fs/f2fs/f2fs.h | 6 +++--- fs/f2fs/file.c | 10 ++ fs/f2fs/node.c | 56 +++- fs/f2fs/node.h | 21 - 5 files changed, 56 insertions(+), 40 deletions(-) diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c index 0e37658..fdc3dbe 100644 --- a/fs/f2fs/data.c +++ b/fs/f2fs/data.c @@ -1089,9 +1089,6 @@ static ssize_t f2fs_direct_IO(int rw, struct kiocb *iocb, if (check_direct_IO(inode, rw, iter, offset)) return 0; - /* clear fsync mark to recover these blocks */ - fsync_mark_clear(F2FS_I_SB(inode), inode-i_ino); - trace_f2fs_direct_IO_enter(inode, offset, count, rw); err
Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file
On Mon, Sep 22, 2014 at 05:20:19PM +0800, Chao Yu wrote: -Original Message- From: Huang Ying [mailto:ying.hu...@intel.com] Sent: Monday, September 22, 2014 3:39 PM To: Chao Yu Cc: 'Jaegeuk Kim'; linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Subject: Re: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file On Mon, 2014-09-22 at 15:24 +0800, Chao Yu wrote: Hi Jaegeuk, Huang, -Original Message- From: Jaegeuk Kim [mailto:jaeg...@kernel.org] Sent: Thursday, September 18, 2014 1:51 PM To: linux-ker...@vger.kernel.org; linux-fsde...@vger.kernel.org; linux-f2fs-devel@lists.sourceforge.net Cc: Jaegeuk Kim; Huang Ying Subject: [f2fs-dev] [PATCH 2/3] f2fs: fix conditions to remain recovery information in f2fs_sync_file This patch revisited whole the recovery information during the f2fs_sync_file. In this patch, there are three information to make a decision. a) IS_CHECKPOINTED, /* is it checkpointed before? */ b) HAS_FSYNCED_INODE, /* is the inode fsynced before? */ c) HAS_LAST_FSYNC, /* has the latest node fsync mark? */ And, the scenarios for our rule are based on: [Term] F: fsync_mark, D: dentry_mark 1. inode(x) | CP | inode(x) | dnode(F) 2. inode(x) | CP | inode(F) | dnode(F) 3. inode(x) | CP | dnode(F) | inode(x) | inode(F) 4. inode(x) | CP | dnode(F) | inode(F) 5. CP | inode(x) | dnode(F) | inode(DF) 6. CP | inode(DF) | dnode(F) 7. CP | dnode(F) | inode(DF) 8. CP | dnode(F) | inode(x) | inode(DF) No sure, do we missed these cases: inode(x) | CP | inode(F) | dnode(x) - write inode(F) CP | inode(DF) | dnode(x) - write inode(F) In these cases we will write another inode with fsync flag because our last dnode is written out to disk by bdi-flusher (HAS_LAST_FSYNC is not marked). But this appended inode is not useful. AFAIK, HAS_LAST_FSYNC(AKA fsync_done) is introduced in commit 479f40c44ae3 (f2fs: skip unnecessary node writes during fsync) to avoid writting multiple unneeded inode page by multiple redundant fsync calls. But for now, its role can be taken by HAS_FSYNCED_INODE. So, can we remove this flag to simplify our logic of fsync flow? Then in fsync flow, the rule can be: If CPed before, there must be a inode(F) written in warm node chain; How about inode(x) | CP | dnode(F) Oh, I missed this one, thanks for remindering that. There is another case: inode(x) | CP | dnode(F) | dnode(x) - write inode(F) It seems we will append another unneeded inode(F) in this patch also due to no HAS_LAST_FSYNC in nat entry cache of inode. As the current rule for roll-forward recovery, we need inode(F) to find the latest mark. This can also be used to distinguish fsynced inode from writebacked inodes. If not CPed before, there must be a inode(DF) written in warm node chain. For example below: 1) checkpoint 2) create a, change a 3) fsync a 4) open a, change a Do we want recovery to stop at dnode(F) in step 3) or stop at dnode(x) produced by step 4)? To my understanding, we will recover all info related to fsynced nodes in warm node chain. So will produce to step 4 if changed nodes in step 4 are flushed to device. Current rule should stop at 3) fsync a. It won't recover 4)'s inode, since it was just writebacked. If we'd like to recover whole the inode and its data, we should traverse all the recovery paths from the sketch. Thanks, Thanks, Yu Best Regards, Huang, Ying For example, #3, the three conditions should be changed as follows. inode(x) | CP | dnode(F) | inode(x) | inode(F) a)x o o o o b)x x x x o c)x o o x o If f2fs_sync_file stops --^, it should write inode(F)--^ So, the need_inode_block_update should return true, since c) get_nat_flag(e, HAS_LAST_FSYNC), is false. For example, #8, CP | alloc | dnode(F) | inode(x) | inode(DF) a)o xx x x b)x x x o c)o o x o If f2fs_sync_file stops ---^, it should write inode(DF)--^ Note that, the roll-forward policy should follow this rule, which means, if there are any missing blocks, we doesn't need to recover that inode. Signed-off-by: Huang Ying ying.hu...@intel.com Signed-off-by: Jaegeuk Kim jaeg...@kernel.org --- fs/f2fs/data.c | 3 --- fs/f2fs/f2fs.h | 6 +++--- fs/f2fs/file.c | 10 ++ fs/f2fs/node.c | 56