Re: [Cluster-devel] [GFS2 PATCH 2/2] GFS2: Revert 35e478f Flush pending glock work when evicting an inode

2015-11-20 Thread Steven Whitehouse

Hi,

On 19/11/15 18:42, Bob Peterson wrote:

Commit 35e478f was added because there were cases where
the inode's glock still had pending delayed work.
The theory is that the delayed work referenced the inode after it
had been deleted from memory by gfs2_evict_inode during its call to
filemap_fdatawrite. I'm guessing that's wrong. I'm guessing that
the delayed work was referencing the glock after it had been
freed due to the call to gfs2_glock_put() in gfs2_evict_inode().
The previous patch made the last put be done from the delayed
work queue, so that should make 35e478f obsolete. The other thing
to consider is that vfs evict() calls invalidate_inode_buffers
before calling the file system-specific gfs2_evict_inode(), and
therefore we should have no dirty buffers for the inode itself.
(We might have dirty buffers for the glock address space, though).
Not only are there likely to be dirty buffers still in the meta data 
address space, that is a requirement for good performance when unlinking 
lots of inodes.



The reason why the call to flush_delayed_work can't be done here
is that it can block, waiting like so:

kernel: Call Trace:
kernel: [] schedule_timeout+0x215/0x2e0
kernel: [] ? shrink_inactive_list+0x53a/0x830
kernel: [] wait_for_common+0x123/0x180
kernel: [] ? default_wake_function+0x0/0x20
kernel: [] wait_for_completion+0x1d/0x20
kernel: [] flush_work+0x77/0xc0
kernel: [] ? wq_barrier_func+0x0/0x20
kernel: [] flush_delayed_work+0x54/0x70
kernel: [] gfs2_clear_inode+0x62/0xf0 [gfs2]

The delayed work can depend on actions from DLM, and DLM can
hang indefinitely on a fencing action.

Signed-off-by: Bob Peterson 
---
  fs/gfs2/super.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 46e5004..4a9b090 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -1613,7 +1613,6 @@ out:
clear_inode(inode);
gfs2_dir_hash_inval(ip);
ip->i_gl->gl_object = NULL;
-   flush_delayed_work(&ip->i_gl->gl_work);
gfs2_glock_add_to_lru(ip->i_gl);
if (queue_delayed_work(gfs2_glock_workqueue,
   &ip->i_gl->gl_work, 0) == 0)


I'm not convinced that it is safe to do this, without finding some other 
way to flush the workqueue. All glock work items on the queue hold a ref 
count to the glock, so it should not be possible to have a case where 
the glock is referenced by work queue activity after it has been freed  
- unless I've misunderstood the explanation above,


Steve.



Re: [Cluster-devel] [GFS2 PATCH 1/2] GFS2: Make gfs2_clear_inode() queue the final put

2015-11-20 Thread Steven Whitehouse

Hi,

On 19/11/15 18:42, Bob Peterson wrote:

This patch changes function gfs2_clear_inode() so that instead
of calling gfs2_glock_put directly() most of the time, it queues
the glock to the delayed work queue. That avoids a possible
deadlock where it calls dlm during a fence operation:
dlm waits for a fence operation, the fence operation waits for
memory, the shrinker waits for gfs2 to free an inode from memory,
but gfs2 waits for dlm.

Signed-off-by: Bob Peterson 
---
  fs/gfs2/glock.c | 34 +-
  fs/gfs2/glock.h |  1 +
  fs/gfs2/super.c |  5 -
  3 files changed, 22 insertions(+), 18 deletions(-)

[snip]
Most of the patch seems to just rename the workqueue which makes it 
tricky to spot the other changes. However, the below code seems to be 
the new bit..



diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
index 9d5c3f7..46e5004 100644
--- a/fs/gfs2/super.c
+++ b/fs/gfs2/super.c
@@ -24,6 +24,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -1614,7 +1615,9 @@ out:
ip->i_gl->gl_object = NULL;
flush_delayed_work(&ip->i_gl->gl_work);
gfs2_glock_add_to_lru(ip->i_gl);
-   gfs2_glock_put(ip->i_gl);
+   if (queue_delayed_work(gfs2_glock_workqueue,
+  &ip->i_gl->gl_work, 0) == 0)
+   gfs2_glock_put(ip->i_gl);
ip->i_gl = NULL;
if (ip->i_iopen_gh.gh_gl) {
ip->i_iopen_gh.gh_gl->gl_object = NULL;


which replaces a put with a queue & put if the queue fails (due to it 
being already on the queue) which doesn't look quite right to be since 
if calling gfs2_glock_put() was not safe before, then calling it 
conditionally like this is still no safer I think?


Steve.