Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-11-08 Thread Eric Ren

Hi all,

On 10/19/2016 01:19 PM, Eric Ren wrote:

> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().
>
> Possibly, there are three solutions I can think of.  The first one is to
> implement the inode permission routine for ocfs2 itself, replacing the
> existing generic_permission(); this will bring lots of changes and
> pull too many trivial vfs functions into ocfs2 code. Frown on this.
>
> The second one, which I am trying now, is to keep track of the processes that
> lock/unlock a cluster lock, using the following draft patches. But I quickly
> found out that a cluster lock taken by processA can be unlocked by processB.
> For example, system files like the journal are locked during mount and
> unlocked during umount.

We can avoid the problem above by:

1) not keeping track of system file inodes:

   if (!(OCFS2_I(inode)->ip_flags & OCFS2_INODE_SYSTEM_FILE)) {
           /* track the lock holder here */
   }

2) only keeping track of the inode metadata lockres:

   OCFS2_I(inode)->ip_inode_lockres;

because the inode open lockres can also be acquired and released by different
processes.
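
To make the two rules above concrete, here is a minimal userspace sketch of per-process holder tracking. It is not the draft patch and not the ocfs2 API: struct lockres_sketch, MAX_HOLDERS and the three helpers are hypothetical names invented for illustration, and the struct is assumed to model only the inode metadata lockres (rule 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_HOLDERS 8   /* hypothetical fixed capacity, for the sketch only */

/* Hypothetical model of the inode metadata lockres (rule 2). */
struct lockres_sketch {
    bool is_system_file;        /* rule 1: system files are not tracked */
    int holders[MAX_HOLDERS];   /* pids that currently hold the lock */
    size_t nr_holders;
};

/* Record that pid took the lock; returns false if nothing was recorded. */
static bool track_holder(struct lockres_sketch *res, int pid)
{
    assert(res != NULL);
    if (res->is_system_file || res->nr_holders == MAX_HOLDERS)
        return false;
    res->holders[res->nr_holders++] = pid;
    return true;
}

/* Forget pid when it drops the lock; returns false if pid was not tracked. */
static bool untrack_holder(struct lockres_sketch *res, int pid)
{
    for (size_t i = 0; i < res->nr_holders; i++) {
        if (res->holders[i] == pid) {
            /* Order does not matter, so fill the gap with the last entry. */
            res->holders[i] = res->holders[--res->nr_holders];
            return true;
        }
    }
    return false;
}

/* The check a recursive-locking detector would run before locking. */
static bool is_held_by(const struct lockres_sketch *res, int pid)
{
    for (size_t i = 0; i < res->nr_holders; i++)
        if (res->holders[i] == pid)
            return true;
    return false;
}
```

A caller would run is_held_by() before taking the lock and skip or complain about the second acquisition. Because system files are never recorded, a lock taken at mount by one process and dropped at umount by another never leaves a stale holder behind.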

Eric


> The third one is to revert that problematic commit! It looks like get/set_acl()
> are always called by other vfs callbacks like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)
>
> Hope for your input to solve this problem;-)
>
> Thanks,
> Eric


___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel




Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-28 Thread Eric Ren

Hi Christoph!

Thanks for your attention.

On 10/28/2016 02:20 PM, Christoph Hellwig wrote:

> Hi Eric,
>
> I've added linux-fsdevel to the cc list as this should get a bit
> broader attention.
>
> On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
>> Mostly, we can avoid recursive locking by writing code carefully. However, as
>> the deadlock issues have shown, it's very hard to handle the routines
>> that are called directly by vfs. For instance:
>>
>> const struct inode_operations ocfs2_file_iops = {
>>         .permission = ocfs2_permission,
>>         .get_acl    = ocfs2_iop_get_acl,
>>         .set_acl    = ocfs2_iop_set_acl,
>> };
>>
>> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
>> The problem is that the call chain of ocfs2_permission() includes *_acl().
>
> What do you actually protect in ocfs2_permission?  It's a trivial
> wrapper around generic_permission which just looks at the VFS inode.

Yes, it is.

https://github.com/torvalds/linux/blob/master/fs/ocfs2/file.c#L1321
---
ocfs2_permission()
  ocfs2_inode_lock()
  generic_permission()
  ocfs2_inode_unlock()
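
The nesting in this call chain can be sketched in plain userspace C. This is only an illustration under loud assumptions: struct cluster_lock_sketch and the sketch_* functions are hypothetical stand-ins, the lock is modelled as a simple non-recursive flag, and a second acquisition returns -EDEADLK so the problem is observable instead of modelling how a real cluster lock would behave.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the inode cluster lock; not the ocfs2 API. */
struct cluster_lock_sketch {
    bool held;
};

static int sketch_lock(struct cluster_lock_sketch *l)
{
    if (l->held)
        return -EDEADLK;   /* assumption: report re-acquisition, do not hang */
    l->held = true;
    return 0;
}

static void sketch_unlock(struct cluster_lock_sketch *l)
{
    assert(l->held);
    l->held = false;
}

/* Models ocfs2_iop_get_acl(): takes the inode lock on its own. */
static int sketch_get_acl(struct cluster_lock_sketch *l)
{
    int err = sketch_lock(l);
    if (err)
        return err;
    /* ... the ACL would be read here ... */
    sketch_unlock(l);
    return 0;
}

/* Models generic_permission(): may end up in the get_acl callback. */
static int sketch_generic_permission(struct cluster_lock_sketch *l)
{
    return sketch_get_acl(l);
}

/* Models ocfs2_permission(): lock, generic check, unlock. */
static int sketch_permission(struct cluster_lock_sketch *l)
{
    int err = sketch_lock(l);
    if (err)
        return err;
    err = sketch_generic_permission(l);   /* re-enters sketch_lock() */
    sketch_unlock(l);
    return err;
}
```

Calling sketch_get_acl() on its own succeeds, while sketch_permission() fails with -EDEADLK because the ACL callback tries to take a lock its caller already holds.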


> I think the right fix is to remove ocfs2_permission entirely and use
> the default VFS implementation.  That both solves your locking problem,
> and it will also get you RCU lookup instead of dropping out of
> RCU mode all the time.

But, from my understanding, the ocfs2_inode_lock()/ocfs2_inode_unlock() pair is
used to prevent any concurrent changes to the permission of the inode on
another cluster node while we are checking it. It's a common case for
cluster filesystems, such as GFS2:
https://github.com/torvalds/linux/blob/master/fs/gfs2/inode.c#L1777

Thanks for your suggestion again!
Eric


Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-28 Thread Christoph Hellwig
Hi Eric,

I've added linux-fsdevel to the cc list as this should get a bit
broader attention.

On Wed, Oct 19, 2016 at 01:19:40PM +0800, Eric Ren wrote:
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have shown, it's very hard to handle the routines
> that are called directly by vfs. For instance:
> 
> const struct inode_operations ocfs2_file_iops = {
>         .permission = ocfs2_permission,
>         .get_acl    = ocfs2_iop_get_acl,
>         .set_acl    = ocfs2_iop_set_acl,
> };
> 
> 
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().

What do you actually protect in ocfs2_permission?  It's a trivial
wrapper around generic_permission which just looks at the VFS inode.

I think the right fix is to remove ocfs2_permission entirely and use
the default VFS implementation.  That both solves your locking problem,
and it will also get you RCU lookup instead of dropping out of
RCU mode all the time.
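
As an illustration of what "use the default VFS implementation" means, here is a small userspace sketch of the dispatch, assuming the VFS falls back to its generic check when a filesystem leaves the .permission callback unset. All *_sketch names and the simplified mode-bit check are hypothetical; this is not kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct inode_sketch;

/* Hypothetical, cut-down operations table. */
struct inode_ops_sketch {
    /* NULL means the filesystem has no permission hook of its own. */
    int (*permission)(struct inode_sketch *inode, int mask);
};

struct inode_sketch {
    const struct inode_ops_sketch *i_op;
    int mode_bits;       /* simplified: rwx bits for the caller only */
    int fs_hook_calls;   /* counts trips through the filesystem hook */
};

/* Simplified stand-in for generic_permission(): plain bit test. */
static int generic_permission_sketch(struct inode_sketch *inode, int mask)
{
    return (inode->mode_bits & mask) == mask ? 0 : -EACCES;
}

/* Simplified dispatch: filesystem hook if present, generic check if not. */
static int inode_permission_sketch(struct inode_sketch *inode, int mask)
{
    assert(inode != NULL);
    if (inode->i_op && inode->i_op->permission)
        return inode->i_op->permission(inode, mask);
    return generic_permission_sketch(inode, mask);
}

/* A wrapper in the style of ocfs2_permission(): extra work, same check. */
static int wrapper_permission_sketch(struct inode_sketch *inode, int mask)
{
    inode->fs_hook_calls++;   /* stands in for the lock/unlock pair */
    return generic_permission_sketch(inode, mask);
}
```

With the hook unset, the permission check never enters filesystem code, so there is no lock to take recursively. Whether that is acceptable for a cluster filesystem is the open question in this thread.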



Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-24 Thread Eric Ren
Hi all,


On 10/19/2016 01:19 PM, Eric Ren wrote:
> The third one is to revert that problematic commit! It looks like get/set_acl()
> are always called by other vfs callbacks like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)

After looking into it more, I found that get/set_acl() can be invoked directly
from vfs, for instance:

fsetxattr()
  setxattr()
    vfs_setxattr()
      __vfs_setxattr()
        handler->set(handler, dentry, inode, name, value, size)
          // posix_acl_access_xattr_handler.set = posix_acl_xattr_set
          posix_acl_xattr_set()
            set_posix_acl()
              inode->i_op->set_acl()


So, this problem looks really hard to solve:-/
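
The dilemma can be sketched in userspace C: an ACL callback that locks internally breaks when its caller already holds the lock, and one that does not lock runs unprotected when the VFS calls it directly. Everything here is a hypothetical model (struct inode_sketch, the acl_op_* variants and the two path_* callers are invented names), not the ocfs2 code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-inode state used to observe both failure modes. */
struct inode_sketch {
    bool locked;            /* is the (modelled) inode lock held? */
    int relock_attempts;    /* callback tried to take a lock already held */
    int unlocked_updates;   /* callback ran with no lock held at all */
};

typedef void (*acl_op_fn)(struct inode_sketch *);

/* Variant A: the callback takes the inode lock itself. */
static void acl_op_locking(struct inode_sketch *inode)
{
    if (inode->locked) {
        inode->relock_attempts++;   /* recursive locking would happen here */
        return;
    }
    inode->locked = true;
    /* ... ACL update under the lock ... */
    inode->locked = false;
}

/* Variant B: the callback relies on its caller for locking (the revert). */
static void acl_op_no_lock(struct inode_sketch *inode)
{
    if (!inode->locked)
        inode->unlocked_updates++;  /* nobody protected this update */
}

/* Path 1: reached from a filesystem routine that already holds the lock. */
static void path_nested(struct inode_sketch *inode, acl_op_fn op)
{
    assert(!inode->locked);
    inode->locked = true;
    op(inode);
    inode->locked = false;
}

/* Path 2: reached directly from the VFS, as in the setxattr chain above. */
static void path_direct(struct inode_sketch *inode, acl_op_fn op)
{
    op(inode);
}
```

Variant A is only safe on the direct path and variant B only on the nested path, which is why neither keeping nor reverting the commit fixes both cases on its own.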

Eric

>
> Hope for your input to solve this problem;-)
>
> Thanks,
> Eric


Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-19 Thread Eric Ren
Hi Junxiao,

On 10/19/2016 02:57 PM, Junxiao Bi wrote:
> I once implemented generic recursive locking support; please check the
> patch at
> https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html
> The issue of locking and unlocking in different processes was considered.
> But it was rejected by Mark, as recursive locking is not allowed in
> ocfs2/kernel.

Yes, I remember it. The difference is that I just want a function to check for
recursive locking rather than to support recursive locking;-)

Honestly, I still don't fully understand your patch. Back then, it was the
complexity of your patch that concerned me. Besides, it looks like the
"PR+EX" + "non-block" request cannot be handled well?

>> The third one is to revert that problematic commit! It looks like get/set_acl()
>> are always called by other vfs callbacks like ocfs2_permission(). I think
>> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)
> Not sure whether get/set_acl() will be called directly by vfs. Even if not now,
> we can't be sure about the future. So reverting it may be a little risky.
> But if the refactoring is complicated, then this may be the only thing we can do.

Agree. Let's investigate it more;-)

Thanks,
Junxiao
>
> Thanks,
> Junxiao.
>> Hope for your input to solve this problem;-)
>>
>> Thanks,
>> Eric
>>
>



Re: [Ocfs2-devel] [RFC] Should we revert commit "ocfs2: take inode lock in ocfs2_iop_set/get_acl()"? or other ideas?

2016-10-19 Thread Junxiao Bi
Hi Eric,

> On Oct 19, 2016, at 1:19 PM, Eric Ren wrote:
> 
> Hi all!
> 
> Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()")
> results in another deadlock as we have discussed in the recent thread:
>    https://oss.oracle.com/pipermail/ocfs2-devel/2016-October/012454.html
> 
> Before this one, a similar deadlock was fixed by Junxiao:
>    commit c25a1e0671fb ("ocfs2: fix posix_acl_create deadlock")
>    commit 5ee0fbd50fdf ("ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang")
> 
> We are in the situation that we have to avoid recursive cluster locking, but
> there is no way to check if a cluster lock has already been taken by a process.
> 
> Mostly, we can avoid recursive locking by writing code carefully. However, as
> the deadlock issues have shown, it's very hard to handle the routines
> that are called directly by vfs. For instance:
> 
>    const struct inode_operations ocfs2_file_iops = {
>            .permission = ocfs2_permission,
>            .get_acl    = ocfs2_iop_get_acl,
>            .set_acl    = ocfs2_iop_set_acl,
>    };
> 
> 
> ocfs2_permission() and ocfs2_iop_get/set_acl() both call ocfs2_inode_lock().
> The problem is that the call chain of ocfs2_permission() includes *_acl().
> 
> Possibly, there are three solutions I can think of.  The first one is to
> implement the inode permission routine for ocfs2 itself, replacing the
> existing generic_permission(); this will bring lots of changes and
> involve too many trivial vfs functions into ocfs2 code. Frown on this.
> 
> The second one, which I am trying now, is to keep track of the processes that
> lock/unlock a cluster lock, using the following draft patches. But I quickly
> found out that a cluster lock taken by processA can be unlocked by processB.
> For example, system files like the journal are locked during mount and
> unlocked during umount.

I once implemented generic recursive locking support; please check the patch at
https://oss.oracle.com/pipermail/ocfs2-devel/2015-December/011408.html
The issue of locking and unlocking in different processes was considered. But it
was rejected by Mark, as recursive locking is not allowed in ocfs2/kernel.
> 
> The third one is to revert that problematic commit! It looks like get/set_acl()
> are always called by other vfs callbacks like ocfs2_permission(). I think
> we can do this if it's true, right? Anyway, I'll try to work out if it's true;-)

Not sure whether get/set_acl() will be called directly by vfs. Even if not now, we
can't be sure about the future. So reverting it may be a little risky. But if the
refactoring is complicated, then this may be the only thing we can do.

Thanks,
Junxiao.
> 
> Hope for your input to solve this problem;-)
> 
> Thanks,
> Eric
> 
