Hi Joseph,

The following locking order can cause a deadlock.

Node A                      Node B                      Node C
Super lock EX
ocfs2_commit_thread
  ocfs2_commit_cache
    jbd2_journal_flush

When the journal is aborted, jbd2_journal_flush fails with an -EIO error.
The commit path then does not call wake_up(&osb->dc_event), so the lock is
not downconverted from EX to NL. While Node B requires the EX lock or the
PR lock, this may cause the nodes to hang.
So if Node A is reset, Node B and Node C will be normal.

Thanks a lot
________________________________
zhangguanghui

From: Joseph Qi <joseph...@huawei.com>
Date: 2015-12-18 09:05
To: zhangguanghui 10102 (CCPL) <zhang.guang...@h3c.com>
CC: ocfs2-devel@oss.oracle.com
Subject: Re: [Ocfs2-devel] ocfs2 cannot continue when JBD2 has aborted the journal,

Hi Guanghui,
Could you please describe the problem you encountered more specifically?
I don't think this change is in a fair way.

On 2015/12/17 13:33, Zhangguanghui wrote:
> Hi all,
>
> There is a small race when JBD2 has aborted the journal before
> jbd2_journal_flush, because of an unstable storage link and I/O stress.
> While the JBD2 state is aborted, an -EIO error is returned, which
> may cause all cluster nodes to hang. So I think that once
> JBD2 has aborted the journal, ocfs2 cannot continue and should trigger
> ocfs2_abort.
>
> Thanks. Any ideas about this patch?
>
> description:
>
> ocfs2_commit_thread
>   ocfs2_commit_cache
>     jbd2_journal_flush
>
> --- journal.c 2015-12-17 11:36:39.140542941 +0800
> +++ journal.c.diff 2015-12-17 11:39:21.308542922 +0800
> @@ -328,6 +328,9 @@
>      if (status < 0) {
>          up_write(&journal->j_trans_barrier);
>          mlog_errno(status);
> +        if (is_journal_aborted(journal)) {
> +            ocfs2_abort(osb->sb, "Detect aborted journal,while committing cache.");
> +        }
>          goto finally;
>      }
>
> zhangguanghui
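
For readers following the thread, the control flow under discussion can be sketched as a small userspace model. This is an illustration under stated assumptions, not the kernel code: `model_journal`, `model_osb`, `model_journal_flush` and `model_commit_cache` are hypothetical names, and the boolean fields only stand in for `wake_up(&osb->dc_event)` and for the effect of `ocfs2_abort`. It shows that the early exit on a flush error skips the wake-up in both variants, and that the proposed change only adds an explicit abort on that path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these are NOT the real kernel structures. */
struct model_journal {
    bool aborted;            /* stands in for is_journal_aborted() */
};

struct model_osb {
    struct model_journal journal;
    bool downconvert_woken;  /* stands in for wake_up(&osb->dc_event) */
    bool fs_aborted;         /* stands in for the effect of ocfs2_abort() */
};

/* Models jbd2_journal_flush(): fails with -EIO once the journal is aborted. */
static int model_journal_flush(const struct model_journal *j)
{
    return j->aborted ? -EIO : 0;
}

/*
 * Models the commit path from the thread. When abort_on_error is true,
 * the error branch also marks the filesystem as aborted, which is the
 * behaviour proposed in the quoted patch.
 */
static int model_commit_cache(struct model_osb *osb, bool abort_on_error)
{
    int status;

    assert(osb != NULL);

    status = model_journal_flush(&osb->journal);
    if (status < 0) {
        if (abort_on_error && osb->journal.aborted)
            osb->fs_aborted = true;
        return status;       /* early exit: the wake-up below is skipped */
    }

    osb->downconvert_woken = true;
    return 0;
}
```

Calling `model_commit_cache(&osb, false)` on a model whose journal is aborted returns `-EIO` and leaves `downconvert_woken` false, which corresponds to the hang described above; passing `true` additionally sets `fs_aborted`.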

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel