Re: [Cluster-devel] [PATCH] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread James Morris
On Thu, 15 Aug 2019, Mark Salyzyn wrote:

> Good Idea, but using the same argument structure for set and get I would be
> concerned about the loss of compiler protection for the buffer argument;

Agreed, I missed that.

> struct getxattr_args {
>   struct dentry *dentry;
>   struct inode *inode;
>   const char *name;
>   void *buffer;
>   size_t size;
>   int flags;

Does 'get' need flags?

-- 
James Morris




Re: [Cluster-devel] [PATCH] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread Mark Salyzyn

On 8/15/19 12:20 PM, James Morris wrote:

On Tue, 13 Aug 2019, Greg Kroah-Hartman wrote:


On Mon, Aug 12, 2019 at 12:32:49PM -0700, Mark Salyzyn wrote:

--- a/include/linux/xattr.h
+++ b/include/linux/xattr.h
@@ -30,10 +30,10 @@ struct xattr_handler {
const char *prefix;
int flags;  /* fs private flags */
bool (*list)(struct dentry *dentry);
-   int (*get)(const struct xattr_handler *, struct dentry *dentry,
+   int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
   struct inode *inode, const char *name, void *buffer,
-  size_t size);
-   int (*set)(const struct xattr_handler *, struct dentry *dentry,
+  size_t size, int flags);
+   int (*set)(const struct xattr_handler *handler, struct dentry *dentry,
   struct inode *inode, const char *name, const void *buffer,
   size_t size, int flags);

Wow, 7 arguments.  Isn't there some nice rule of thumb that says once
you get more then 5, a function becomes impossible to understand?

Surely this could be a structure passed in here somehow, that way when
you add the 8th argument in the future, you don't have to change
everything yet again?  :)

I don't have anything concrete to offer as a replacement fix for this,
but to me this just feels really wrong...

How about something like:

struct xattr_gs_args {
struct dentry *dentry;
struct inode *inode;
const char *name;
const void *buffer;
size_t size;
int flags;
};

int (*get)(const struct xattr_handler *handler, struct xattr_gs_args *args);
int (*set)(const struct xattr_handler *handler, struct xattr_gs_args *args);

Good Idea, but using the same argument structure for set and get I would 
be concerned about the loss of compiler protection for the buffer 
argument; it is void* for get, and const void* for set. And if we pulled 
out buffer (and size since it is paired with it) from the structure to 
solve, the 'mixed' argument approach (resulting in 4 args) adds to the 
difficulty/complexity.


Good news is the same structure(s) can get passed to __vfs_getxattr and 
__vfs_setxattr, so one less issue with getting the argument order 
correct from the caller.


From an optimization standpoint, passing an argument to a pointer to a 
structure assembled on the stack constrains the compiler. Whereas 
individual arguments allow for the optimization to place all the 
arguments into registers. All modern processors have no issue with tens 
of arguments.


So, I will look into what the patch set will look like by splitting into 
set and get, and trying to reuse the structure down the call chain.


struct getxattr_args {
struct dentry *dentry;
struct inode *inode;
const char *name;
void *buffer;
size_t size;
int flags;
};

struct setxattr_args {
struct dentry *dentry;
struct inode *inode;
const char *name;
const void *buffer;
size_t size;
int flags;
};

-- Mark





Re: [Cluster-devel] [PATCH] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread Greg Kroah-Hartman
On Fri, Aug 16, 2019 at 05:20:36AM +1000, James Morris wrote:
> On Tue, 13 Aug 2019, Greg Kroah-Hartman wrote:
> 
> > On Mon, Aug 12, 2019 at 12:32:49PM -0700, Mark Salyzyn wrote:
> > > --- a/include/linux/xattr.h
> > > +++ b/include/linux/xattr.h
> > > @@ -30,10 +30,10 @@ struct xattr_handler {
> > >   const char *prefix;
> > >   int flags;  /* fs private flags */
> > >   bool (*list)(struct dentry *dentry);
> > > - int (*get)(const struct xattr_handler *, struct dentry *dentry,
> > > + int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
> > >  struct inode *inode, const char *name, void *buffer,
> > > -size_t size);
> > > - int (*set)(const struct xattr_handler *, struct dentry *dentry,
> > > +size_t size, int flags);
> > > + int (*set)(const struct xattr_handler *handler, struct dentry *dentry,
> > >  struct inode *inode, const char *name, const void *buffer,
> > >  size_t size, int flags);
> > 
> > Wow, 7 arguments.  Isn't there some nice rule of thumb that says once
> > you get more then 5, a function becomes impossible to understand?
> > 
> > Surely this could be a structure passed in here somehow, that way when
> > you add the 8th argument in the future, you don't have to change
> > everything yet again?  :)
> > 
> > I don't have anything concrete to offer as a replacement fix for this,
> > but to me this just feels really wrong...
> 
> How about something like:
> 
> struct xattr_gs_args {
>   struct dentry *dentry;
>   struct inode *inode;

As he said in a later message, dentry and inode is redundant, only 1 is
needed (dentry I think?)

>   const char *name;
>   const void *buffer;
>   size_t size;
>   int flags;
> };
> 
> int (*get)(const struct xattr_handler *handler, struct xattr_gs_args *args);
> int (*set)(const struct xattr_handler *handler, struct xattr_gs_args *args);

But yes, that would be much much better.

thanks,

greg k-h



Re: [Cluster-devel] [PATCH] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread James Morris
On Tue, 13 Aug 2019, Greg Kroah-Hartman wrote:

> On Mon, Aug 12, 2019 at 12:32:49PM -0700, Mark Salyzyn wrote:
> > --- a/include/linux/xattr.h
> > +++ b/include/linux/xattr.h
> > @@ -30,10 +30,10 @@ struct xattr_handler {
> > const char *prefix;
> > int flags;  /* fs private flags */
> > bool (*list)(struct dentry *dentry);
> > -   int (*get)(const struct xattr_handler *, struct dentry *dentry,
> > +   int (*get)(const struct xattr_handler *handler, struct dentry *dentry,
> >struct inode *inode, const char *name, void *buffer,
> > -  size_t size);
> > -   int (*set)(const struct xattr_handler *, struct dentry *dentry,
> > +  size_t size, int flags);
> > +   int (*set)(const struct xattr_handler *handler, struct dentry *dentry,
> >struct inode *inode, const char *name, const void *buffer,
> >size_t size, int flags);
> 
> Wow, 7 arguments.  Isn't there some nice rule of thumb that says once
> you get more then 5, a function becomes impossible to understand?
> 
> Surely this could be a structure passed in here somehow, that way when
> you add the 8th argument in the future, you don't have to change
> everything yet again?  :)
> 
> I don't have anything concrete to offer as a replacement fix for this,
> but to me this just feels really wrong...

How about something like:

struct xattr_gs_args {
struct dentry *dentry;
struct inode *inode;
const char *name;
const void *buffer;
size_t size;
int flags;
};

int (*get)(const struct xattr_handler *handler, struct xattr_gs_args *args);
int (*set)(const struct xattr_handler *handler, struct xattr_gs_args *args);


-- 
James Morris




Re: [GIT PULL] iomap: small fixes for 5.3-rc5

2019-08-15 Thread pr-tracker-bot
The pull request you sent on Thu, 15 Aug 2019 10:10:24 -0700:

> git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/iomap-5.3-fixes-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/4ec1fa692dc7dc915c3485a7fad928924fc13de2

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


[GIT PULL] iomap: small fixes for 5.3-rc5

2019-08-15 Thread Darrick J. Wong
Hi Linus,

Here's a single update to the MAINTAINERS entry for iomap.
I test-merged it with this morning's master and didn't see any
conflicts.  Please let me know if you encounter any funniness.

--D

The following changes since commit 609488bc979f99f805f34e9a32c1e3b71179d10b:

  Linux 5.3-rc2 (2019-07-28 12:47:02 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/iomap-5.3-fixes-1

for you to fetch changes up to 9a67b72552f8d019948453e56ca7db8c7e5a94ba:

  MAINTAINERS: iomap: Remove fs/iomap.c record (2019-08-13 08:15:07 -0700)


Changes since last update:
- Update MAINTAINERS now that we've removed fs/iomap.c.


Denis Efremov (1):
  MAINTAINERS: iomap: Remove fs/iomap.c record

 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)


[Cluster-devel] [PATCH v4] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread Mark Salyzyn
Add a flag option to get xattr method that could have a bit flag of
XATTR_NOSECURITY passed to it.  XATTR_NOSECURITY is generally then
set in the __vfs_getxattr path.

This handles the case of a union filesystem driver that is being
requested by the security layer to report back the xattr data.

For the use case where access is to be blocked by the security layer.

The path then could be security(dentry) ->
__vfs_getxattr(dentry...XATTR_NOSECUIRTY) ->
handler->get(dentry...XATTR_NOSECURITY) ->
__vfs_getxattr(lower_dentry...XATTR_NOSECUIRTY) ->
lower_handler->get(lower_dentry...XATTR_NOSECUIRTY)
which would report back through the chain data and success as
expected, the logging security layer at the top would have the
data to determine the access permissions and report back the target
context that was blocked.

Without the get handler flag, the path on a union filesystem would be
the errant security(dentry) -> __vfs_getxattr(dentry) ->
handler->get(dentry) -> vfs_getxattr(lower_dentry) -> nested ->
security(lower_dentry, log off) -> lower_handler->get(lower_dentry)
which would report back through the chain no data, and -EACCES.

For selinux for both cases, this would translate to a correctly
determined blocked access. In the first case with this change a correct avc
log would be reported, in the second legacy case an incorrect avc log
would be reported against an uninitialized u:object_r:unlabeled:s0
context making the logs cosmetically useless for audit2allow.

This patch series is inert and is the wide-spread addition of the
flags option for xattr functions, and a replacement of _vfs_getxattr
with __vfs_getxattr(...XATTR_NOSECURITY).

Signed-off-by: Mark Salyzyn 
Cc: Stephen Smalley 
Cc: linux-ker...@vger.kernel.org
Cc: kernel-t...@android.com
Cc: linux-security-mod...@vger.kernel.org
Cc: Jan Kara 
Cc: sta...@vger.kernel.org # 4.4, 4.9, 4.14 & 4.19
---
v4:
- ifdef __KERNEL__ around XATTR_NOSECURITY to
  keep it colocated in uapi headers.

v3:
- poor aim on ubifs not ubifs_xattr_get, but static xattr_get

v2:
- Missed a spot: ubifs, erofs and afs.

v1:
- Removed from an overlayfs patch set, and made independent.
  Expect this to be the basis of some security improvements.
---
 drivers/staging/erofs/xattr.c |  3 ++-
 fs/9p/acl.c   |  3 ++-
 fs/9p/xattr.c |  3 ++-
 fs/afs/xattr.c|  8 +++
 fs/btrfs/xattr.c  |  3 ++-
 fs/ceph/xattr.c   |  3 ++-
 fs/cifs/xattr.c   |  2 +-
 fs/ecryptfs/inode.c   |  6 --
 fs/ecryptfs/mmap.c|  2 +-
 fs/ext2/xattr_trusted.c   |  2 +-
 fs/ext2/xattr_user.c  |  2 +-
 fs/ext4/xattr_security.c  |  2 +-
 fs/ext4/xattr_trusted.c   |  2 +-
 fs/ext4/xattr_user.c  |  2 +-
 fs/f2fs/xattr.c   |  4 ++--
 fs/fuse/xattr.c   |  4 ++--
 fs/gfs2/xattr.c   |  3 ++-
 fs/hfs/attr.c |  2 +-
 fs/hfsplus/xattr.c|  3 ++-
 fs/hfsplus/xattr_trusted.c|  3 ++-
 fs/hfsplus/xattr_user.c   |  3 ++-
 fs/jffs2/security.c   |  3 ++-
 fs/jffs2/xattr_trusted.c  |  3 ++-
 fs/jffs2/xattr_user.c |  3 ++-
 fs/jfs/xattr.c|  5 +++--
 fs/kernfs/inode.c |  3 ++-
 fs/nfs/nfs4proc.c |  6 --
 fs/ocfs2/xattr.c  |  9 +---
 fs/orangefs/xattr.c   |  3 ++-
 fs/overlayfs/super.c  |  8 ---
 fs/posix_acl.c|  2 +-
 fs/reiserfs/xattr_security.c  |  3 ++-
 fs/reiserfs/xattr_trusted.c   |  3 ++-
 fs/reiserfs/xattr_user.c  |  3 ++-
 fs/squashfs/xattr.c   |  2 +-
 fs/ubifs/xattr.c  |  3 ++-
 fs/xattr.c| 36 +++
 fs/xfs/xfs_xattr.c|  3 ++-
 include/linux/xattr.h |  9 
 include/uapi/linux/xattr.h|  7 --
 mm/shmem.c|  3 ++-
 net/socket.c  |  3 ++-
 security/commoncap.c  |  6 --
 security/integrity/evm/evm_main.c |  3 ++-
 security/selinux/hooks.c  | 11 ++
 security/smack/smack_lsm.c|  5 +++--
 46 files changed, 126 insertions(+), 84 deletions(-)

diff --git a/drivers/staging/erofs/xattr.c b/drivers/staging/erofs/xattr.c
index df40654b9fbb..69440065432c 100644
--- a/drivers/staging/erofs/xattr.c
+++ b/drivers/staging/erofs/xattr.c
@@ -463,7 +463,8 @@ int erofs_getxattr(struct inode *inode, int index,
 
 static int erofs_xattr_generic_get(const struct xattr_handler *handler,
   struct dentry *unused, struct inode *inode,
-  const char *name, void *buffer, size_t size)
+  const char *name, void *buffer, size_t size,
+  int flags)
 {
struct 

Re: [Cluster-devel] [PATCH v2] Add flags option to get xattr method paired to __vfs_getxattr

2019-08-15 Thread Jan Kara
On Wed 14-08-19 07:54:16, Mark Salyzyn wrote:
> On 8/14/19 4:00 AM, Jan Kara wrote:
> > On Tue 13-08-19 07:55:06, Mark Salyzyn wrote:
> > ...
> > > diff --git a/fs/xattr.c b/fs/xattr.c
> > > index 90dd78f0eb27..71f887518d6f 100644
> > > --- a/fs/xattr.c
> > > +++ b/fs/xattr.c
> > ...
> > >   ssize_t
> > >   __vfs_getxattr(struct dentry *dentry, struct inode *inode, const char 
> > > *name,
> > > -void *value, size_t size)
> > > +void *value, size_t size, int flags)
> > >   {
> > >   const struct xattr_handler *handler;
> > > -
> > > - handler = xattr_resolve_name(inode, );
> > > - if (IS_ERR(handler))
> > > - return PTR_ERR(handler);
> > > - if (!handler->get)
> > > - return -EOPNOTSUPP;
> > > - return handler->get(handler, dentry, inode, name, value, size);
> > > -}
> > > -EXPORT_SYMBOL(__vfs_getxattr);
> > > -
> > > -ssize_t
> > > -vfs_getxattr(struct dentry *dentry, const char *name, void *value, 
> > > size_t size)
> > > -{
> > > - struct inode *inode = dentry->d_inode;
> > >   int error;
> > > + if (flags & XATTR_NOSECURITY)
> > > + goto nolsm;
> > Hum, is it OK for XATTR_NOSECURITY to skip even the xattr_permission()
> > check? I understand that for reads of security xattrs it actually does not
> > matter in practice but conceptually that seems wrong to me as
> > XATTR_NOSECURITY is supposed to skip just security-module checks to avoid
> > recursion AFAIU.
> 
> Good catch I think.
> 
> I was attempting to make this change purely inert, no change in
> functionality, only a change in API. Adding a call to xattr_permission would
> incur a change in overall functionality, as it would introduce into the
> current and original __vfs_getxattr a call to xattr_permission that was not
> there before.
> 
> (I will have to defer the real answer and requirements to the security
> folks)
> 
> AFAIK you are correct, and to make the call would reduce the attack surface,
> trading a very small amount of CPU utilization, for a much larger amount of
> trust.
> 
> Given the long history of this patch set (for overlayfs) and the large
> amount of stakeholders, I would _prefer_ to submit a followup independent
> functionality/security change to _vfs_get_xattr _after_ this makes it in.

You're right. The problem was there before. So ack to changing this later.

> > > diff --git a/include/uapi/linux/xattr.h b/include/uapi/linux/xattr.h
> > > index c1395b5bd432..1216d777d210 100644
> > > --- a/include/uapi/linux/xattr.h
> > > +++ b/include/uapi/linux/xattr.h
> > > @@ -17,8 +17,9 @@
> > >   #if __UAPI_DEF_XATTR
> > >   #define __USE_KERNEL_XATTR_DEFS
> > > -#define XATTR_CREATE 0x1 /* set value, fail if attr already 
> > > exists */
> > > -#define XATTR_REPLACE0x2 /* set value, fail if attr does not 
> > > exist */
> > > +#define XATTR_CREATE  0x1/* set value, fail if attr already 
> > > exists */
> > > +#define XATTR_REPLACE 0x2/* set value, fail if attr does not 
> > > exist */
> > > +#define XATTR_NOSECURITY 0x4 /* get value, do not involve security 
> > > check */
> > >   #endif
> > It seems confusing to export XATTR_NOSECURITY definition to userspace when
> > that is kernel-internal flag. I'd just define it in include/linux/xattr.h
> > somewhere from the top of flags space (like 0x4000).
> > 
> > Otherwise the patch looks OK to me (cannot really comment on the security
> > module aspect of this whole thing though).
> 
> Good point. However, we do need to keep these flags together to reduce
> maintenance risk, I personally abhor two locations for flags bits even if
> one comes from the opposite bit-side; collisions are undetectable at build
> time. Although I have not gone through the entire thought experiment, I am
> expecting that fuse could possibly benefit from this flag (if exposed) since
> it also has a security recursion. That said, fuse is probably the example of
> a gaping wide attack surface if user space had access to it ... your
> xattr_permissions call addition requested above would be realistically, not
> just pedantically, required!

Yeah, flags bits in two places are bad as well. So maybe at least
#ifdef __KERNEL__ bit around the definitiona and a comment that it is
kernel internal flag?

Honza
-- 
Jan Kara 
SUSE Labs, CR



Re: [Cluster-devel] [GFS2 PATCH 2/2] gfs2: Use async glocks for rename

2019-08-15 Thread Steven Whitehouse

Hi,

On 15/08/2019 14:41, Bob Peterson wrote:

I just noticed the parameter comments for gfs2_glock_async_wait
are wrong, and I've fixed them in a newer version. I can post
the new version after I get people's initial reactions.

Bob


Overall this looks like a much better approach. We know that this 
doesn't happen very often, so the path which involves the timeout should 
be very rarely taken. The problem is how to select a suitable timeout... 
is 2 secs enough? Can we land up with a DoS in certain situations? 
Hopefully not, but we should take care.


The shared wait queue might also be an issue in terms of contention, so 
it might be worth looking at how to avoid that. Generally though, this 
is looking very promising I think,


Steve.



- Original Message -

Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can
reverse the roles of which directories are "old" and which are "new"
for the purposes of rename. This can cause deadlocks where two nodes
can lock old-then-new but since their roles are reversed, they wait
for each other.

This patch fixes the problem by acquiring all gfs2_rename's inode
glocks asychronously and waits for all glocks to be acquired.
That way all inodes are locked regardless of the order.

Signed-off-by: Bob Peterson 
---

(snip)

+ * gfs2_glock_async_wait - wait on multiple asynchronous glock acquisitions
+ * @gh: the glock holder

(snip)

+int gfs2_glock_async_wait(unsigned int num_gh, struct gfs2_holder *ghs)






Re: [Cluster-devel] [GFS2 PATCH 2/2] gfs2: Use async glocks for rename

2019-08-15 Thread Bob Peterson
I just noticed the parameter comments for gfs2_glock_async_wait
are wrong, and I've fixed them in a newer version. I can post
the new version after I get people's initial reactions.

Bob

- Original Message -
> Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can
> reverse the roles of which directories are "old" and which are "new"
> for the purposes of rename. This can cause deadlocks where two nodes
> can lock old-then-new but since their roles are reversed, they wait
> for each other.
> 
> This patch fixes the problem by acquiring all gfs2_rename's inode
> glocks asychronously and waits for all glocks to be acquired.
> That way all inodes are locked regardless of the order.
> 
> Signed-off-by: Bob Peterson 
> ---
(snip)
> + * gfs2_glock_async_wait - wait on multiple asynchronous glock acquisitions
> + * @gh: the glock holder
(snip)
> +int gfs2_glock_async_wait(unsigned int num_gh, struct gfs2_holder *ghs)



[Cluster-devel] [GFS2 PATCH 0/2] gfs2: gfs2_rename lock ordering

2019-08-15 Thread Bob Peterson
These two patches fix a gfs2 deadlock with gfs2_rename.
Comments for vfs function vfs_rename() explain situations in which AB-BA
deadlocks are possible, but prevented by a mutex, s_vfs_rename_mutex.

While that's all true at a single-node level, gfs2 can make processes
running on two distinct cluster nodes behave as if they're on the same
node. But since different nodes have different s_vfs_rename_mutexes
it's still possible to create a similar deadlock across nodes:
One node's gfs2_rename locks the two directory inodes based on their roles
of parent and child, in that order. But then the other node can reverse
their roles and request the same inodes, but in the opposite order.
IOW, both nodes lock parent-then-child, but since their roles switch,
it creates another AB-BA deadlock in which they wait forever for each other.

This patch set resolves the deadlock by submitting both deadlocks to dlm
asychronously. Then it waits for them both to be locked using a new event
that checks whether both glocks are locked properly.

The new event has a timeout, and if the timeout expires, it returns
-ESTALE, which signifies the locking has a conflict, and vfs retries
the locking operation.

The first patch separates the rgrp glock from the inode glocks because
that needs to happen last, rather than asychronously. The second patch
implements the async glock requests and the new event wait.

Bob Peterson (2):
  gfs2: separate holder for rgrps in gfs2_rename
  gfs2: Use async glocks for rename

 fs/gfs2/glock.c  | 40 
 fs/gfs2/glock.h  |  1 +
 fs/gfs2/incore.h |  1 +
 fs/gfs2/inode.c  | 28 ++--
 fs/gfs2/ops_fstype.c |  1 +
 5 files changed, 65 insertions(+), 6 deletions(-)

-- 
2.21.0



[Cluster-devel] [GFS2 PATCH 2/2] gfs2: Use async glocks for rename

2019-08-15 Thread Bob Peterson
Because s_vfs_rename_mutex is not cluster-wide, multiple nodes can
reverse the roles of which directories are "old" and which are "new"
for the purposes of rename. This can cause deadlocks where two nodes
can lock old-then-new but since their roles are reversed, they wait
for each other.

This patch fixes the problem by acquiring all gfs2_rename's inode
glocks asychronously and waits for all glocks to be acquired.
That way all inodes are locked regardless of the order.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/glock.c  | 40 
 fs/gfs2/glock.h  |  1 +
 fs/gfs2/incore.h |  1 +
 fs/gfs2/inode.c  | 13 +
 fs/gfs2/ops_fstype.c |  1 +
 5 files changed, 52 insertions(+), 4 deletions(-)

diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index a27dbd3dec01..3334101c9921 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -305,6 +305,11 @@ static void gfs2_holder_wake(struct gfs2_holder *gh)
clear_bit(HIF_WAIT, >gh_iflags);
smp_mb__after_atomic();
wake_up_bit(>gh_iflags, HIF_WAIT);
+   if (gh->gh_flags & GL_ASYNC) {
+   struct gfs2_sbd *sdp = gh->gh_gl->gl_name.ln_sbd;
+
+   wake_up(>sd_async_glock_wait);
+   }
 }
 
 /**
@@ -952,6 +957,41 @@ int gfs2_glock_wait(struct gfs2_holder *gh)
return gh->gh_error;
 }
 
+static int all_glocks_held(unsigned int num_gh, struct gfs2_holder *ghs)
+{
+   struct gfs2_glock *gl = ghs[0].gh_gl;
+   int i;
+
+   for (i = 0; i < num_gh; i++) {
+   if (test_bit(HIF_WAIT, [i].gh_iflags) ||
+   !test_bit(HIF_HOLDER, [i].gh_iflags) ||
+   gl->gl_state != ghs[i].gh_state)
+   return 0;
+   if (ghs[i].gh_error)
+   return ghs[i].gh_error;
+   }
+   return 1;
+}
+
+/**
+ * gfs2_glock_async_wait - wait on multiple asynchronous glock acquisitions
+ * @gh: the glock holder
+ *
+ * Returns: 0 on success
+ */
+
+int gfs2_glock_async_wait(unsigned int num_gh, struct gfs2_holder *ghs)
+{
+   struct gfs2_glock *gl = ghs[0].gh_gl;
+   struct gfs2_sbd *sdp = gl->gl_name.ln_sbd;
+
+   might_sleep();
+   if (!wait_event_timeout(sdp->sd_async_glock_wait,
+   all_glocks_held(num_gh, ghs), HZ * 2))
+   return -ESTALE;
+   return 0;
+}
+
 /**
  * handle_callback - process a demote request
  * @gl: the glock
diff --git a/fs/gfs2/glock.h b/fs/gfs2/glock.h
index e4e0bed5257c..a997c42726a4 100644
--- a/fs/gfs2/glock.h
+++ b/fs/gfs2/glock.h
@@ -190,6 +190,7 @@ extern void gfs2_holder_uninit(struct gfs2_holder *gh);
 extern int gfs2_glock_nq(struct gfs2_holder *gh);
 extern int gfs2_glock_poll(struct gfs2_holder *gh);
 extern int gfs2_glock_wait(struct gfs2_holder *gh);
+extern int gfs2_glock_async_wait(unsigned int num_gh, struct gfs2_holder *ghs);
 extern void gfs2_glock_dq(struct gfs2_holder *gh);
 extern void gfs2_glock_dq_wait(struct gfs2_holder *gh);
 extern void gfs2_glock_dq_uninit(struct gfs2_holder *gh);
diff --git a/fs/gfs2/incore.h b/fs/gfs2/incore.h
index 7a993d7c022e..6b450065b9d5 100644
--- a/fs/gfs2/incore.h
+++ b/fs/gfs2/incore.h
@@ -725,6 +725,7 @@ struct gfs2_sbd {
struct gfs2_glock *sd_freeze_gl;
struct work_struct sd_freeze_work;
wait_queue_head_t sd_glock_wait;
+   wait_queue_head_t sd_async_glock_wait;
atomic_t sd_glock_disposal;
struct completion sd_locking_init;
struct completion sd_wdack;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 1ced4d0a3b04..cb184d6bed9b 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -1388,16 +1388,18 @@ static int gfs2_rename(struct inode *odir, struct 
dentry *odentry,
}
 
num_gh = 1;
-   gfs2_holder_init(odip->i_gl, LM_ST_EXCLUSIVE, 0, ghs);
+   gfs2_holder_init(odip->i_gl, LM_ST_EXCLUSIVE, GL_ASYNC, ghs);
if (odip != ndip) {
-   gfs2_holder_init(ndip->i_gl, LM_ST_EXCLUSIVE, 0, ghs + num_gh);
+   gfs2_holder_init(ndip->i_gl, LM_ST_EXCLUSIVE,GL_ASYNC,
+ghs + num_gh);
num_gh++;
}
-   gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, 0, ghs + num_gh);
+   gfs2_holder_init(ip->i_gl, LM_ST_EXCLUSIVE, GL_ASYNC, ghs + num_gh);
num_gh++;
 
if (nip) {
-   gfs2_holder_init(nip->i_gl, LM_ST_EXCLUSIVE, 0, ghs + num_gh);
+   gfs2_holder_init(nip->i_gl, LM_ST_EXCLUSIVE, GL_ASYNC,
+ghs + num_gh);
num_gh++;
/* grab the resource lock for unlink flag twiddling 
 * this is the case of the target file already existing
@@ -1414,6 +1416,9 @@ static int gfs2_rename(struct inode *odir, struct dentry 
*odentry,
if (error)
goto out_gunlock;
}
+   error = gfs2_glock_async_wait(num_gh, ghs);
+   if (error)
+   goto out_gunlock;
 

[Cluster-devel] [GFS2 PATCH 1/2] gfs2: separate holder for rgrps in gfs2_rename

2019-08-15 Thread Bob Peterson
Before this patch, gfs2_rename added a holder for the rgrp glock to
its array of holders, ghs. There's nothing wrong with that, but this
patch separates it into a separate holder. This was done to ensure
it's always locked last as per the proper glock lock ordering,
and also to pave the way for a future patch in which we will
lock the non-rgrp glocks asynchronously.

Signed-off-by: Bob Peterson 
---
 fs/gfs2/inode.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 2e2a8a2fb51d..1ced4d0a3b04 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -1348,7 +1348,7 @@ static int gfs2_rename(struct inode *odir, struct dentry 
*odentry,
struct gfs2_inode *ip = GFS2_I(d_inode(odentry));
struct gfs2_inode *nip = NULL;
struct gfs2_sbd *sdp = GFS2_SB(odir);
-   struct gfs2_holder ghs[5], r_gh;
+   struct gfs2_holder ghs[4], r_gh, rd_gh;
struct gfs2_rgrpd *nrgd;
unsigned int num_gh;
int dir_rename = 0;
@@ -1357,6 +1357,7 @@ static int gfs2_rename(struct inode *odir, struct dentry 
*odentry,
int error;
 
gfs2_holder_mark_uninitialized(_gh);
+   gfs2_holder_mark_uninitialized(_gh);
if (d_really_is_positive(ndentry)) {
nip = GFS2_I(d_inode(ndentry));
if (ip == nip)
@@ -1404,7 +1405,8 @@ static int gfs2_rename(struct inode *odir, struct dentry 
*odentry,
 */
nrgd = gfs2_blk2rgrpd(sdp, nip->i_no_addr, 1);
if (nrgd)
-   gfs2_holder_init(nrgd->rd_gl, LM_ST_EXCLUSIVE, 0, ghs + 
num_gh++);
+   gfs2_holder_init(nrgd->rd_gl, LM_ST_EXCLUSIVE, 0,
+_gh);
}
 
for (x = 0; x < num_gh; x++) {
@@ -1413,6 +1415,12 @@ static int gfs2_rename(struct inode *odir, struct dentry 
*odentry,
goto out_gunlock;
}
 
+   if (gfs2_holder_initialized(_gh)) {
+   error = gfs2_glock_nq(_gh);
+   if (error)
+   goto out_gunlock;
+   }
+
error = -ENOENT;
if (ip->i_inode.i_nlink == 0)
goto out_gunlock;
@@ -1541,6 +1549,9 @@ static int gfs2_rename(struct inode *odir, struct dentry 
*odentry,
gfs2_quota_unlock(ndip);
 out_gunlock:
gfs2_dir_no_add();
+   if (gfs2_holder_initialized(_gh))
+   gfs2_glock_dq_uninit(_gh);
+
while (x--) {
gfs2_glock_dq(ghs + x);
gfs2_holder_uninit(ghs + x);
-- 
2.21.0



Re: [PATCH v5 03/18] gfs2: add compat_ioctl support

2019-08-15 Thread Bob Peterson
- Original Message -
> Out of the four ioctl commands supported on gfs2, only FITRIM
> works in compat mode.
> 
> Add a proper handler based on the ext4 implementation.
> 
> Fixes: 6ddc5c3ddf25 ("gfs2: getlabel support")
> Signed-off-by: Arnd Bergmann 
> ---
>  fs/gfs2/file.c | 24 
>  1 file changed, 24 insertions(+)

Hi,

Reviewed-by: Bob Peterson 

Regards,

Bob Peterson