Hi Eric,

> 在 2016年10月19日,下午1:19,Eric Ren <z...@suse.com> 写道:
> Hi all!
> Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
> results in another deadlock as we have discussed in the recent thread:
>    https://oss.oracle.com/pipermail/ocfs2-devel/2016-October/012454.html
> Before this one, a similiar deadlock has been fixed by Junxiao:
>    commit c25a1e0671fb ("ocfs2: fix posix_acl_create deadlock")
>    commit 5ee0fbd50fdf ("ocfs2: revert using ocfs2_acl_chmod to avoid inode 
> cluster lock hang")
> We are in the situation that we have to avoid recursive cluster locking, but
> there is no way to check if a cluster lock has been taken by a precess 
> already.
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have proved out, it's very hard to handle the routines
> that are called directly by vfs. For instance:
>    const struct inode_operations ocfs2_file_iops = {
>            .permission     = ocfs2_permission,
>            .get_acl        = ocfs2_iop_get_acl,
>            .set_acl        = ocfs2_iop_set_acl,
>    };
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().
> Possibly, there are three solutions I can think of.  The first one is to
> implement the inode permission routine for ocfs2 itself, replacing the
> existing generic_permission(); this will bring lots of changes and
> involve too many trivial vfs functions into ocfs2 code. Frown on this.
> The second one is, what I am trying now, to keep track of the processes who
> lock/unlock a cluster lock by the following draft patches. But, I quickly
> find out that a cluster locking which has been taken by processA can be 
> unlocked
> by processB. For example, systemfiles like journal:0000 is locked during 
> mout, and
> unlocked during umount. 
I had ever implemented generic recursive locking support, please check the 
patch at https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html 
<https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html> , the 
issue that locking and unlocking in different processes was considered. But it 
was rejected by Mark as recursive locking is not allowed in ocfs2/kernel . 
> The thrid one is to revert that problematic commit! It looks like 
> get/set_acl()
> are always been called by other vfs callback like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's 
> true;-)
Not sure whether get/set_acl() will be called directly by vfs. Even not now, we 
can’t make sure that in the future. So revert it may be a little risky. But if 
refactor is complicated, then this maybe the only way we can do.

> Hope for your input to solve this problem;-)
> Thanks,
> Eric

Ocfs2-devel mailing list

Reply via email to