Re: [Ocfs2-devel] [patch 11/28] ocfs2: extend enough credits for freeing one truncate record while replaying truncate records
On Sat, Aug 29, 2015 at 10:38:09AM +0800, Joseph Qi wrote: > Hi Mark, > > On 2015/8/29 7:55, Mark Fasheh wrote: > > On Wed, Aug 26, 2015 at 03:11:48PM -0700, Andrew Morton wrote: > >> From: Xue jiufei> >> Subject: ocfs2: extend enough credits for freeing one truncate record > >> while replaying truncate records > >> > >> Now function ocfs2_replay_truncate_records() first modifies tl_used, then > >> calls ocfs2_extend_trans() to extend transactions for gd and alloc inode > >> used for freeing clusters. jbd2_journal_restart() may be called and it > >> may happen that tl_used in truncate log is decreased but the clusters are > >> not freed, which means these clusters are lost. So we should avoid > >> extending transactions in these two operations. > >> > >> Signed-off-by: joyce.xue > >> Cc: Mark Fasheh > >> Cc: Joel Becker > >> Signed-off-by: Andrew Morton > >> --- > >> > >> fs/ocfs2/alloc.c | 19 --- > >> 1 file changed, 8 insertions(+), 11 deletions(-) > >> > >> diff -puN > >> fs/ocfs2/alloc.c~extend-enough-credits-for-freeing-one-truncate-record-while-replaying-truncate-records > >> fs/ocfs2/alloc.c > >> --- > >> a/fs/ocfs2/alloc.c~extend-enough-credits-for-freeing-one-truncate-record-while-replaying-truncate-records > >> +++ a/fs/ocfs2/alloc.c > > > > > > > >> @@ -6063,7 +6060,7 @@ int __ocfs2_flush_truncate_log(struct oc > >>goto out_mutex; > >>} > >> > >> - handle = ocfs2_start_trans(osb, OCFS2_TRUNCATE_LOG_UPDATE); > >> + handle = ocfs2_start_trans(osb, OCFS2_TRUNCATE_LOG_FLUSH_ONE_REC); > >>if (IS_ERR(handle)) { > >>status = PTR_ERR(handle); > >>mlog_errno(status); > > > > Why is this particular change here? > > --Mark > > > I think Joyce wants to get enough credits at first and then call > ocfs2_replay_truncate_records. Oh ok I see. That's fine then, the patch otherwise looked good to me, thanks. Reviewed-by: Mark Fasheh --Mark -- Mark Fasheh ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] [PATCH V2] ocfs2: fix a tiny case that inode can not removed.
On Mon, Aug 31, 2015 at 03:36:15PM +0800, jiangyiwen wrote: > When running dirop_fileop_racer we found a case that inode > can not removed. > > 2 nodes, say Node A and Node B, mount the same ocfs2 volume. Create > two dirs /race/1/ and /race/2/ in the filesystem. > > Node ANode B > rm -r /race/2/ > mv /race/1/ /race/2/ > call ocfs2_unlink(), get > the EX mode of /race/2/ > wait for B unlock /race/2/ > decrease i_nlink of /race/2/ to 0, > and add inode of /race/2/ into > orphan dir, unlock /race/2/ > got EX mode of /race/2/. because > /race/1/ is dir, so inc i_nlink > of /race/2/ and update into disk, > unlock /race/2/ > because i_nlink of /race/2/ > is not zero, this inode will > always remain in orphan dir > > This patch fixes this case by test whether i_nlink of new dir is zero. > > Signed-off-by: Yiwen JiangLooks great, thanks Reviewed-by: Mark Fasheh --Mark -- Mark Fasheh ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
[Ocfs2-devel] ocfs2: clean up redundant NULL checks before kfree
NULL check before kfree is redundant and so clean them up. Signed-off-by: Joseph Qi--- fs/ocfs2/alloc.c| 2 +- fs/ocfs2/suballoc.c | 6 ++ 2 files changed, 3 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/alloc.c b/fs/ocfs2/alloc.c index 5997c00..73f6684 100644 --- a/fs/ocfs2/alloc.c +++ b/fs/ocfs2/alloc.c @@ -6175,7 +6175,7 @@ bail: iput(tl_inode); brelse(tl_bh); - if (status < 0 && (*tl_copy)) { + if (status < 0) { kfree(*tl_copy); *tl_copy = NULL; mlog_errno(status); diff --git a/fs/ocfs2/suballoc.c b/fs/ocfs2/suballoc.c index 4479029..68a6a80 100644 --- a/fs/ocfs2/suballoc.c +++ b/fs/ocfs2/suballoc.c @@ -149,10 +149,8 @@ void ocfs2_free_ac_resource(struct ocfs2_alloc_context *ac) brelse(ac->ac_bh); ac->ac_bh = NULL; ac->ac_resv = NULL; - if (ac->ac_find_loc_priv) { - kfree(ac->ac_find_loc_priv); - ac->ac_find_loc_priv = NULL; - } + kfree(ac->ac_find_loc_priv); + ac->ac_find_loc_priv = NULL; } void ocfs2_free_alloc_context(struct ocfs2_alloc_context *ac) -- 1.8.4.3 ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] [patch 11/28] ocfs2: extend enough credits for freeing one truncate record while replaying truncate records
On Tue, 1 Sep 2015 10:54:03 -0700 Mark Fashehwrote: > > > Why is this particular change here? > > > --Mark > > > > > I think Joyce wants to get enough credits at first and then call > > ocfs2_replay_truncate_records. > > Oh ok I see. That's fine then, the patch otherwise looked good to me, > thanks. > > Reviewed-by: Mark Fasheh Cool. And where are we at with http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-extend-transaction-for-ocfs2_remove_rightmost_path-and-ocfs2_update_edge_lengths-before-to-avoid-inconsistency-between-inode-and-et.patch ? ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] ocfs2 version issue
Hi Goldwyn, On 09/01/2015 06:32 PM, Goldwyn Rodrigues wrote: > Hi Junxiao, > > On 08/31/2015 09:22 PM, Junxiao Bi wrote: >> Hi Goldwyn, >> >> Ocfs2 kernel version is removed from commit >> ff8fb335221e2c446b0d4cbea26be371fd2feb64 ("ocfs2: remove versioning >> information"), but Oracle CRS depends on this version, and this made >> Oracle CRS can't be installed. So i think we should revert this commit >> and sync this version with ocfs2-tools. Do you have any other concern >> about this? > > We removed the version because it did not make sense. Even if we put in > the version back we will have to maintain it and will have a similar > case where the version number in the kernel is way behind the tools > versions because of lack of updating. Add to it the confusion of users > who would not know which version to quote. > > I suggest the Oracle CRS should be modified to use the kernel version as > opposed to the ocfs2 kernel module specific versions. But this still fail for already released software, customer may upgrade to a new kernel but still use the old software. My fault to say to sync ocfs2 kernel version with tools, we really can make that two versions separated as Srini said. But we can sync their version once now since kernel version is not updated for a long time, then in the future, each do itself versioning. Thanks, Junxiao. > > ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] [patch 14/28] ocfs2: take inode lock in ocfs2_iop_set/get_acl()
On 08/31/2015 12:44 PM, Mark Fasheh wrote: > On Wed, Aug 26, 2015 at 03:11:57PM -0700, Andrew Morton wrote: >> From: Tariq Saeed>> Subject: ocfs2: take inode lock in ocfs2_iop_set/get_acl() >> >> Orabug: 20189959 >> >> This bug in mainline code is pointed out by Mark Fasheh. When >> ocfs2_iop_set_acl() and ocfs2_iop_get_acl() are entered from VFS layer, >> inode lock is not held. This seems to be regression from older kernels. >> The patch is to fix that. >> >> Signed-off-by: Tariq Saeed >> Cc: Mark Fasheh >> Cc: Joel Becker >> Signed-off-by: Andrew Morton > Thank you for fixing this Tariq, > > Reviewed-by: Mark Fasheh > > -- > Mark Fasheh Hi Mark, I realized that taking inode lock at vfs entry points opens up a self deadlock window if a remote conversion req to EX is blocked. The reason is this code path. fchmod|fchmodat -> chmod_common -> notify_change -> ocfs2_setattr (takes inode lock EX) << -> posix_acl_chmod -> get_acl -> ocfs2_iop_get_acl (inode lock PR blocks behind remote EX conv) * -> ocfs2_iop_set_acl (inode lock EX blocks behind remote EX conv) * * - self deadlock I think this can be solved by introducing a flag OCFS2_LOCK_RECURSIVE to ocfs2_cluster_lock(). The meaning of this flag is this. If the requesting level is <= lockres->l_level, in that case ignore OCFS2_LOCK_BLOCKED and just inc the holder count. This will work for all req levels if l_level is EX. if (lockres->l_flags & OCFS2_LOCK_BLOCKED && !ocfs2_may_continue_on_blocked_lock(lockres, level) || !(arg_flags & (OCFS2_LOCK_RECURSIVE) ... ocfs2_iop_get|set_acl() will pass OCFS2_LOCK_RECURSIVE to ocfs2_cluster_lock(). I am looking for suggestions. Thanks -Tariq Saeed ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] ocfs2 version issue
Hi Goldwyn On 09/01/2015 03:32 AM, Goldwyn Rodrigues wrote: > Hi Junxiao, > > On 08/31/2015 09:22 PM, Junxiao Bi wrote: >> Hi Goldwyn, >> >> Ocfs2 kernel version is removed from commit >> ff8fb335221e2c446b0d4cbea26be371fd2feb64 ("ocfs2: remove versioning >> information"), but Oracle CRS depends on this version, and this made >> Oracle CRS can't be installed. So i think we should revert this commit >> and sync this version with ocfs2-tools. Do you have any other concern >> about this? > We removed the version because it did not make sense. I am not sure how other kernel modules work, but for me a OCFS2 maintaining it's own version is a good idea :). This version number defines the feature-set it currently supports. It also allows a feature-set to be backported to older kernels(if need arises). > Even if we put in > the version back we will have to maintain it and will have a similar > case where the version number in the kernel is way behind the tools > versions because of lack of updating. Add to it the confusion of users > who would not know which version to quote. I agree that this got out of sync so probably we should fix that part than to remove it all together. I also don't see why tools version has to dictate kernel module version or vise-versa. Isn't it practical for a kernel module to implement a new feature-set but doesn't require any tools changes or vise-versa ? ;) > I suggest the Oracle CRS should be modified to use the kernel version as > opposed to the ocfs2 kernel module specific versions. It may not be just Oracle CRS ... there may be other applications that might be using this. The problem is that we don't want a customer to upgrade a kernel and run into a failure which just looks bad on kernel/filesystem :(. One may argue to write notes/readme's to inform customers but it's not practical and is a painful for some customers to update application software and kernel at the same time. Anyway, if we decide to remove the versioning forever then it may be a good idea to deprecate it first so applications have enough time to incorporate these changes. But my vote would be for ocfs2 to maintain it's own version number :) ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] [patch 09/28] ocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_lengths() before to avoid inconsistency between inode and et
Hi Mark, Since Joyce is on holiday, I'll try to answer your doubt. On 2015/9/1 5:31, Mark Fasheh wrote: > On Wed, Aug 26, 2015 at 03:11:43PM -0700, Andrew Morton wrote: >> From: Xue jiufei>> Subject: ocfs2: extend transaction for ocfs2_remove_rightmost_path() and >> ocfs2_update_edge_lengths() before to avoid inconsistency between inode and >> et >> >> I found that jbd2_journal_restart() is called in some places without >> keeping things consistently before. However, jbd2_journal_restart() may >> commit the handle's transaction and restart another one. If the first >> transaction is committed successfully while another not, it may cause >> filesystem inconsistency or read only. This is an effort to fix this kind >> of problems. >> >> >> This patch (of 3): >> >> The following functions will be called while truncating an extent: >> ocfs2_remove_btree_range >> -> ocfs2_start_trans >> -> ocfs2_remove_extent >> -> ocfs2_truncate_rec >>-> ocfs2_extend_rotate_transaction >> -> jbd2_journal_restart if jbd2_journal_extend fail >>-> ocfs2_rotate_tree_left >> -> ocfs2_remove_rightmost_path >> -> ocfs2_extend_rotate_transaction >>-> ocfs2_unlink_subtree >> -> ocfs2_update_edge_lengths >> -> ocfs2_extend_trans >> -> jbd2_journal_restart if jbd2_journal_extend fail >> -> ocfs2_et_update_clusters >> -> ocfs2_commit_trans >> >> jbd2_journal_restart() may be called and it may happened that the buffers >> dirtied in ocfs2_truncate_rec() are committed while buffers dirtied in >> ocfs2_et_update_clusters() are not, the total clusters on extent tree and >> i_clusters in ocfs2_dinode is inconsistency. So the clusters got from >> ocfs2_dinode is incorrect, and it also cause read-only problem when call >> ocfs2_commit_truncate() with the error message: "Inode %llu has empty >> extent block at %llu". >> >> We should extend enough credits for function ocfs2_remove_rightmost_path >> and ocfs2_update_edge_lengths to avoid this inconsistency. >> >> Signed-off-by: joyce.xue >> Cc: Mark Fasheh >> Cc: Joel Becker >> Signed-off-by: Andrew Morton >> --- >> >> fs/ocfs2/alloc.c | 82 + >> 1 file changed, 54 insertions(+), 28 deletions(-) >> >> diff -puN >> fs/ocfs2/alloc.c~ocfs2-extend-transaction-for-ocfs2_remove_rightmost_path-and-ocfs2_update_edge_lengths-before-to-avoid-inconsistency-between-inode-and-et >> fs/ocfs2/alloc.c >> --- >> a/fs/ocfs2/alloc.c~ocfs2-extend-transaction-for-ocfs2_remove_rightmost_path-and-ocfs2_update_edge_lengths-before-to-avoid-inconsistency-between-inode-and-et >> +++ a/fs/ocfs2/alloc.c >> @@ -2526,21 +2526,6 @@ static int ocfs2_update_edge_lengths(han >> struct ocfs2_extent_block *eb; >> u32 range; >> >> -/* >> - * In normal tree rotation process, we will never touch the >> - * tree branch above subtree_index and ocfs2_extend_rotate_transaction >> - * doesn't reserve the credits for them either. >> - * >> - * But we do have a special case here which will update the rightmost >> - * records for all the bh in the path. >> - * So we have to allocate extra credits and access them. >> - */ >> -ret = ocfs2_extend_trans(handle, subtree_index); >> -if (ret) { >> -mlog_errno(ret); >> -goto out; >> -} >> - >> ret = ocfs2_journal_access_path(et->et_ci, handle, path); >> if (ret) { >> mlog_errno(ret); >> @@ -2967,7 +2952,7 @@ static int __ocfs2_rotate_tree_left(hand >> right_path->p_node[subtree_root].bh->b_blocknr, >> right_path->p_tree_depth); >> >> -ret = ocfs2_extend_rotate_transaction(handle, subtree_root, >> +ret = ocfs2_extend_rotate_transaction(handle, 0, > > I don't understand why you changed the subtree depth parameter here to zero. > > Also, I don't understand why it's zero in all the calls below either. Is > there something wrong with the way the math in > ocfs2_extend_rotate_transaction() is working out? The credits in ocfs2_extend_rotate_transaction is calculated as (path->p_tree_depth - subtree_depth) * 2 + 1 + op_credits. So changing the subtree_depth parameter to 0 means we get extra credits in ocfs2_truncate_rec ASAP. Then extending credits in ocfs2_update_edge_lengths is no longer needed. In other words, Joyce wants to resolve the issue by extending enough credits at the very beginning. Thanks, Joseph > > >>orig_credits, left_path); >> if (ret) { >> mlog_errno(ret); >> @@ -3040,21 +3025,9 @@ static int ocfs2_remove_rightmost_path(h >> struct ocfs2_extent_block *eb; >> struct ocfs2_extent_list *el; >> >> - >> ret =
Re: [Ocfs2-devel] [patch 11/28] ocfs2: extend enough credits for freeing one truncate record while replaying truncate records
On Tue, Sep 01, 2015 at 03:12:07PM -0700, Andrew Morton wrote: > On Tue, 1 Sep 2015 10:54:03 -0700 Mark Fashehwrote: > > > > > Why is this particular change here? > > > > --Mark > > > > > > > I think Joyce wants to get enough credits at first and then call > > > ocfs2_replay_truncate_records. > > > > Oh ok I see. That's fine then, the patch otherwise looked good to me, > > thanks. > > > > Reviewed-by: Mark Fasheh > > Cool. And where are we at with > http://ozlabs.org/~akpm/mmots/broken-out/ocfs2-extend-transaction-for-ocfs2_remove_rightmost_path-and-ocfs2_update_edge_lengths-before-to-avoid-inconsistency-between-inode-and-et.patch > > ? I had a couple questions, we're waiting for a response: https://www.mail-archive.com/ocfs2-devel@oss.oracle.com/msg10167.html --Mark -- Mark Fasheh ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] ocfs2: clean up redundant NULL checks before kfree
On Tue, Sep 01, 2015 at 02:54:08PM +0800, Joseph Qi wrote: > NULL check before kfree is redundant and so clean them up. > > Signed-off-by: Joseph QiThanks, Reviewed-by: Mark Fasheh --Mark -- Mark Fasheh ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel
Re: [Ocfs2-devel] ocfs2 version issue
Hi Junxiao, On 08/31/2015 09:22 PM, Junxiao Bi wrote: > Hi Goldwyn, > > Ocfs2 kernel version is removed from commit > ff8fb335221e2c446b0d4cbea26be371fd2feb64 ("ocfs2: remove versioning > information"), but Oracle CRS depends on this version, and this made > Oracle CRS can't be installed. So i think we should revert this commit > and sync this version with ocfs2-tools. Do you have any other concern > about this? We removed the version because it did not make sense. Even if we put in the version back we will have to maintain it and will have a similar case where the version number in the kernel is way behind the tools versions because of lack of updating. Add to it the confusion of users who would not know which version to quote. I suggest the Oracle CRS should be modified to use the kernel version as opposed to the ocfs2 kernel module specific versions. -- Goldwyn ___ Ocfs2-devel mailing list Ocfs2-devel@oss.oracle.com https://oss.oracle.com/mailman/listinfo/ocfs2-devel