Re: [Cluster-devel] [GFS2 PATCH v2 14/15] GFS2: Hold onto iopen glock longer when dinode creation fails

2015-10-16 Thread Andrew W Elble

Bob,

 Unsure if this is related to my other issues, but I should probably
   at least pass this along:

void gfs2_glock_put(struct gfs2_glock *gl)
{
 ...
 GLOCK_BUG_ON(gl, !list_empty(&gl->gl_holders)); <- this line
 ...
}

> @@ -883,6 +880,14 @@ fail_free_acls:
>   posix_acl_release(acl);
>  fail_free_vfs_inode:
>   free_vfs_inode = 1;
> + /* We hold off until the very end to release the iopen glock. That
> +  * keeps other processes from acquiring it in EX mode and deleting
> +  * it while we're still using it. Since gfs2_delete_inode already
> +  * handles the iopen vs. inode glocks in any order, the lock order
> +  * does not matter. It must be done before iput, though, otherwise
> +  * we might get a segfault trying to dereference it. */
> + if (ip && ip->i_iopen_gh.gh_gl) /* if holder is linked to the glock */

via this line:

> + gfs2_glock_put(ip->i_iopen_gh.gh_gl); 
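
To make the failure condition concrete, here is a minimal userspace sketch of what the assertion guards against: dropping the last reference to a glock while a holder is still queued on it. The `model_glock` struct and both helper functions are hypothetical stand-ins, not the kernel's `struct gfs2_glock` or the real `gfs2_glock_put()`, and the holder list is reduced to a simple counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel's struct gfs2_glock. */
struct model_glock {
    int refcount;   /* references to the glock itself */
    int nholders;   /* holders still queued (stands in for gl_holders) */
};

/* True when the next put would be the final one while a holder is
 * still queued, i.e. the situation the BUG_ON is meant to catch. */
static bool model_put_would_bug(const struct model_glock *gl)
{
    return gl->refcount == 1 && gl->nholders > 0;
}

/* Drop one reference; returns true if this was the final put. */
static bool model_glock_put(struct model_glock *gl)
{
    assert(gl->refcount > 0);
    if (--gl->refcount > 0)
        return false;           /* other references remain */
    /* Models GLOCK_BUG_ON(gl, !list_empty(&gl->gl_holders)). */
    assert(gl->nholders == 0);
    return true;
}
```

In this model, an extra put on the error path while the iopen holder is still queued would trip the second assertion, which matches the `H:` line printed just before the BUG in the log below.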


[209071.114484] gfs2: G:  s:SH n:5/396fcce f:Iqob t:SH d:EX/0 a:0 v:0 r:-128 m:200
[209071.114493] gfs2:  H: s:SH f:EH e:0 p:40735 [nfsd] gfs2_glock_nq_init+0x11/0x40 [gfs2]
[209071.114529] ------------[ cut here ]------------
[209071.114530] kernel BUG at fs/gfs2/glock.c:208!
[209071.114531] invalid opcode:  [#1] SMP 
[209071.114555] Modules linked in: gfs2 dlm sctp drbd(OE) cts rpcsec_gss_krb5 
nfsv4 dns_resolver nfs fscache dm_service_time iscsi_tcp libiscsi_tcp libiscsi 
scsi_transport_iscsi 8021q garp mrp stp llc bonding nf_log_ipv4 nf_log_common 
xt_LOG ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_filter nf_conntrack_ftp 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ip_tables dm_multipath 
x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul 
crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper 
ablk_helper cryptd iTCO_wdt ipmi_devintf iTCO_vendor_support nfsd sb_edac 
ipmi_si auth_rpcgss nfs_acl lpc_ich ipmi_msghandler dcdbas mfd_core edac_core 
pcspkr mei_me lockd wmi mei grace shpchp acpi_power_meter acpi_pad sunrpc 
binfmt_misc xfs mgag200 sr_mod cdrom syscopyarea sysfillrect
[209071.114561]  sysimgblt i2c_algo_bit drm_kms_helper ttm sd_mod ahci ixgbe 
drm tg3 libahci mdio dca libata ptp megaraid_sas i2c_core pps_core dm_mirror 
dm_region_hash dm_log dm_mod
[209071.114563] CPU: 4 PID: 40735 Comm: nfsd Tainted: GW  OE   4.1.10_5 #1
[209071.114563] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.2.2 01/16/2014
[209071.114564] task: 880035480d90 ti: 881f5c76c000 task.ti: 881f5c76c000
[209071.114568] RIP: 0010:[]  [] gfs2_glock_put+0x139/0x160 [gfs2]
[209071.114569] RSP: 0018:881f5c76fa98  EFLAGS: 00010296
[209071.114569] RAX:  RBX: 88144026a940 RCX: 
5298
[209071.114570] RDX: 52985298 RSI: 0286 RDI: 
0286
[209071.114585] RBP: 881f5c76fab8 R08: 004a R09: 
81dbd15e
[209071.114585] R10: 1cb8 R11: 0001 R12: 

[209071.114585] R13: 883fed457000 R14: 88144026a970 R15: 
881f1cb3c590
[209071.114586] FS:  () GS:881fff88() 
knlGS:
[209071.114587] CS:  0010 DS:  ES:  CR0: 80050033
[209071.114587] CR2: 7f83fbd67000 CR3: 0197e000 CR4: 
001406e0
[209071.114588] Stack:
[209071.114589]  8804501a98c0 883fed457000  
883fed457000
[209071.114590]  881f5c76fc08 a087d04e 881ed6512470 
8804501a98e0
[209071.114590]  881ed6512470 881eff86 881ed6512470 
8000fd5b5c80
[209071.114591] Call Trace:
[209071.114596]  [] gfs2_create_inode+0x77e/0x11b0 [gfs2]
[209071.114600]  [] ? gfs2_create_inode+0xd6/0x11b0 [gfs2]
[209071.114603]  [] gfs2_create+0x3b/0x40 [gfs2]
[209071.114607]  [] ? security_inode_create+0x1f/0x30
[209071.114609]  [] vfs_create+0xd5/0x140
[209071.114618]  [] do_nfsd_create+0x481/0x600 [nfsd]
[209071.114623]  [] nfsd4_open+0x24a/0x830 [nfsd]
[209071.114628]  [] nfsd4_proc_compound+0x4d7/0x7e0 [nfsd]
[209071.114632]  [] nfsd_dispatch+0xc3/0x210 [nfsd]
[209071.114658]  [] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
[209071.114666]  [] svc_process_common+0x440/0x6d0 [sunrpc]
[209071.114673]  [] svc_process+0x113/0x1b0 [sunrpc]
[209071.114676]  [] nfsd+0xff/0x170 [nfsd]
[209071.114680]  [] ? nfsd_destroy+0x80/0x80 [nfsd]
[209071.114682]  [] kthread+0xc9/0xe0
[209071.114683]  [] ? kthread_create_on_node+0x180/0x180
[209071.114685]  [] ret_from_fork+0x42/0x70
[209071.114687]  [] ? kthread_create_on_node+0x180/0x180
[209071.114696] Code: 49 8b 04 24 48 85 c0 75 e9 eb b8 0f 1f 80 00 00 00 00 f3 
90 48 8b 10 83 e2 01 75 f6 e9 44 ff ff ff 48 89 de 31 ff e8 17 fb ff ff <0f> 0b 
49 83 7c 24 50 00 74 89 48 89 de 31 ff e8 03 fb ff ff 0f 
[209071.114699] RIP  [] gfs2_glock_put+0x139/0x160 [gfs2]
[209071.114699]  RSP 


Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.

Re: [Cluster-devel] [GFS2 PATCH v2 14/15] GFS2: Hold onto iopen glock longer when dinode creation fails

2015-10-16 Thread Andrew W Elble
> Hi Andy,
>
> Thanks. I'll investigate it.
>
> BTW, I haven't found any more blatant bugs during testing, however I'm
> debugging another issue. It seems as if I still have a reference counter
> issue somewhere because if I slam GFS2 hard enough, I can get it to
> accumulate millions of glocks that are never freed (unless memory pressure
> causes the glock shrinker to be called). Having the extra glocks is causing
> undue strain on the dlm, to the point where dlm can't keep up.
> So it seems likely that I'll do another revision here anyway.
> I'll keep you posted.

Bob,

  Would that resemble something like this?

S glocks        nondisk    inode   rgrp    iopen  flock  quota  jrnl    Total
S ------------  -------  -------  -----  -------  -----  -----  ----  -------
S    Unlocked:        1    67677    304        0      0    460     0    68442
S      Locked:        2  1638575    630  1637848      0    780     1  3277836
S       Total:        3  1706252    934  1637848      0   1240     1  3346278
S
S     Held EX:        0        1      0        0      0      0     1        2
S     Held SH:        1        1      0  1637847      0      0     0  1637849
S     Held DF:        0        0      0        0      0      0     0        0
S   G Waiting:        0        0      0        0      0      0     0        0
S   P Waiting:        0        0      0        0      0      0     0        0
S    DLM wait:        0

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912



Re: [Cluster-devel] [GFS2 PATCH 06/15] GFS2: Prevent gl_delete work for re-used inodes

2015-10-12 Thread Andrew W Elble

Bob,

   The deadlock is slightly different - but still occurs with your
   patches in place. I'm pretty sure this is what's happening:

nfsd (unlink) has i_mutex
fs/nfsd/vfs.c:nfsd_unlink()
-> fh_lock_nested(fhp, I_MUTEX_PARENT);

nfsd (lookup) gets a glock
fs/gfs2/inode.c:gfs2_inode_lookup()
-> error = gfs2_glock_nq_init(io_gl, LM_ST_SHARED, GL_EXACT, &ip->i_iopen_gh);

nfsd (unlink) needs conflicting glock
fs/gfs2/super.c:gfs2_evict_inode()
-> if (ip->i_iopen_gh.gh_gl &&
       test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
           gfs2_glock_dq_wait(&ip->i_iopen_gh);

client (lookup) waiting on i_mutex
fs/nfsd/vfs.c:nfsd_lookup_dentry()
-> fh_lock_nested(fhp, I_MUTEX_PARENT);

-> deadlock
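
The four steps above boil down to two tasks each holding what the other wants. As a rough illustration, the sketch below models that as a wait-for graph and checks it for a cycle. Everything in it is hypothetical (the `model_task` struct, the two resource names, the helper functions), and the `gfs2_glock_dq_wait()` wait is simplified to "waits for the iopen glock".

```c
#include <assert.h>
#include <stdbool.h>

#define NO_RESOURCE (-1)

/* Hypothetical resource ids for this illustration only. */
enum model_resource { RES_I_MUTEX, RES_IOPEN_GLOCK };

struct model_task {
    const char *name;
    int holds;      /* resource currently held, or NO_RESOURCE */
    int waits_for;  /* resource being waited on, or NO_RESOURCE */
};

/* Index of the task holding 'res', or -1 if nobody holds it. */
static int owner_of(const struct model_task *t, int n, int res)
{
    for (int i = 0; i < n; i++)
        if (t[i].holds == res)
            return i;
    return -1;
}

/* Follow "waits for -> owner" edges from each task for at most n hops;
 * arriving back at the starting task means there is a cycle. */
static bool has_wait_cycle(const struct model_task *t, int n)
{
    for (int start = 0; start < n; start++) {
        int cur = start;
        for (int step = 0; step < n; step++) {
            if (t[cur].waits_for == NO_RESOURCE)
                break;
            cur = owner_of(t, n, t[cur].waits_for);
            if (cur < 0)
                break;
            if (cur == start)
                return true;
        }
    }
    return false;
}
```

Filling the table with an unlink task (holds `RES_I_MUTEX`, waits for `RES_IOPEN_GLOCK`) and a lookup task (holds `RES_IOPEN_GLOCK`, waits for `RES_I_MUTEX`) reports a cycle, which is the shape the two backtraces below show.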

G:  s:EX n:2/2f4935a f:yIqob t:EX d:EX/0 a:0 v:0 r:10 m:200
 H: s:EX f:H e:0 p:34415 [nfsd] gfs2_evict_inode+0x160/0x4d0 [gfs2]
 I: n:329967/49582938 t:8 f:0x00 d:0x s:500

G:  s:SH n:5/2f4935a f:DIqob t:SH d:UN/3833211000 a:0 v:0 r:4 m:200
 H: s:SH f:EH e:0 p:34414 [nfsd] gfs2_inode_lookup+0xee/0x1f0 [gfs2]

nfs client host a:
313189 11:47:26.499858000 x.y.z.a -> x.y.z.q NFS 342 V4 Call REMOVE DH: 0x46fbb746/0353fd0043cc75dd8203b16b5bd4c197-cache-mod_custom-e2acfa1435db9601a6b9645e9f8be86f.php

nfs client host b:
539106 11:49:28.390748000 x.y.z.b -> x.y.z.q NFS 362 V4 Call LOOKUP DH: 0x46fbb746/0353fd0043cc75dd8203b16b5bd4c197-cache-mod_custom-ea521049a8a64b325300eab10b4ac871.php

crash> bt
PID: 34414  TASK: 881fdc7428b0  CPU: 38  COMMAND: "nfsd"
 #0 [881f2da57b70] __schedule at 8165bbc4
 #1 [881f2da57bc0] schedule at 8165c267
 #2 [881f2da57be0] schedule_preempt_disabled at 8165c59e
 #3 [881f2da57bf0] __mutex_lock_slowpath at 8165e0d5
 #4 [881f2da57c50] mutex_lock at 8165e173
 #5 [881f2da57c70] nfsd_lookup_dentry at a035454f [nfsd]
 #6 [881f2da57cf0] nfsd_lookup at a0354989 [nfsd]
 #7 [881f2da57d40] nfsd4_lookup at a0361a2a [nfsd]
 #8 [881f2da57d50] nfsd4_proc_compound at a0363d57 [nfsd]
 #9 [881f2da57db0] nfsd_dispatch at a034ff83 [nfsd]
#10 [881f2da57df0] svc_process_common at a0188260 [sunrpc]
#11 [881f2da57e60] svc_process at a0188603 [sunrpc]
#12 [881f2da57e90] nfsd at a034f98f [nfsd]
#13 [881f2da57ec0] kthread at 81096989
#14 [881f2da57f50] ret_from_fork at 81660462

crash> bt
PID: 34415  TASK: 881fec1a6c80  CPU: 24  COMMAND: "nfsd"
 #0 [881f2db779f0] __schedule at 8165bbc4
 #1 [881f2db77a40] schedule at 8165c267
 #2 [881f2db77a60] bit_wait at 8165ca7c
 #3 [881f2db77a70] __wait_on_bit at 8165c705
 #4 [881f2db77ac0] out_of_line_wait_on_bit at 8165c7a2
 #5 [881f2db77b30] gfs2_glock_dq_wait at a0850553 [gfs2]
 #6 [881f2db77b50] gfs2_evict_inode at a08697d5 [gfs2]
 #7 [881f2db77bf0] evict at 811fcbcb
 #8 [881f2db77c20] iput at 811fd52b
 #9 [881f2db77c50] d_delete at 811f8e38
#10 [881f2db77c80] vfs_unlink at 811edf79
#11 [881f2db77cd0] nfsd_unlink at a0355dcf [nfsd]
#12 [881f2db77d10] nfsd4_remove at a0362ebd [nfsd]
#13 [881f2db77d50] nfsd4_proc_compound at a0363d57 [nfsd]
#14 [881f2db77db0] nfsd_dispatch at a034ff83 [nfsd]
#15 [881f2db77df0] svc_process_common at a0188260 [sunrpc]
#16 [881f2db77e60] svc_process at a0188603 [sunrpc]
#17 [881f2db77e90] nfsd at a034f98f [nfsd]
#18 [881f2db77ec0] kthread at 81096989
#19 [881f2db77f50] ret_from_fork at 81660462

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912



Re: [Cluster-devel] [GFS2 PATCH 06/15] GFS2: Prevent gl_delete work for re-used inodes

2015-10-06 Thread Andrew W Elble
Bob Peterson <rpete...@redhat.com> writes:

> Hi Andrew,
>
> Actually, I've found a few bugs and problems with that last patch set
> and revised my patches last week. I've also added the glock flag, but
> used "x" rather than "-" because I'm not sure I like punctuation marks there,
> but nothing else makes sense either. The other changes are for the other
> thing you spotted (which I caught in testing). The proper way to do it
> is to initialize the i_gl to ip->i_gl in the evict code, and not have the
> if at all. That affects two of the patches:

I was wondering about that. I'll get that changed.

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912



Re: [Cluster-devel] [GFS2 PATCH 13/15] gfs2: Use new variable i_gl instead of ip->i_gl

2015-10-06 Thread Andrew W Elble

Bob Peterson <rpete...@redhat.com> writes:

> This patch adds a new variable to function gfs2_evict_inode that
> simplifies the references to ip->i_gl. This is just for readability
> and to clarify future patches.
>
> Signed-off-by: Bob Peterson <rpete...@redhat.com>
> ---
>  fs/gfs2/super.c | 13 +++++++------
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 06bd72b..79ee54b 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c



> @@ -1616,7 +1617,7 @@ out:
>   ip->i_gl->gl_object = NULL;
>   flush_delayed_work(&ip->i_gl->gl_work);
>   gfs2_glock_add_to_lru(ip->i_gl);
> - gfs2_glock_put(ip->i_gl);

if (i_gl)

> + gfs2_glock_put(i_gl);
>   ip->i_gl = NULL;
>       if (ip->i_iopen_gh.gh_gl) {
>   ip->i_iopen_gh.gh_gl->gl_object = NULL;

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912



Re: [Cluster-devel] [GFS2 PATCH 06/15] GFS2: Prevent gl_delete work for re-used inodes

2015-10-06 Thread Andrew W Elble

Bob Peterson <rpete...@redhat.com> writes:

> This patch adds a new glock flag GLF_INODE_DELETING which signifies
> when a glock is being used to change an inode from unlinked to
> deleted. The flag is used in a few places:

This is the change I made to what we're testing:

diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
index 20c007d..80f2ee7 100644
--- a/fs/gfs2/trace_gfs2.h
+++ b/fs/gfs2/trace_gfs2.h
@@ -57,7 +57,8 @@
{(1UL << GLF_QUEUED),   "q" },  \
{(1UL << GLF_LRU),  "L" },  \
{(1UL << GLF_OBJECT),   "o" },  \
-   {(1UL << GLF_BLOCKING), "b" })
+{(1UL << GLF_BLOCKING),"b" },  \
+    {(1UL << GLF_INODE_DELETING),  "-" })
 
 #ifndef NUMPTY
 #define NUMPTY
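
For readers unfamiliar with how these table entries end up in the `f:` field of a glock dump (for example `f:Iqob`), the following is a small userspace sketch of a mask-to-character table. The bit numbers, the `model_` names and the helper are all hypothetical; only the letters are taken from the hunk above, and the kernel's own tracing macro works differently in detail.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical bit numbers; the real GLF_* values live in the kernel. */
enum model_glf {
    MODEL_GLF_QUEUED,
    MODEL_GLF_LRU,
    MODEL_GLF_OBJECT,
    MODEL_GLF_BLOCKING,
    MODEL_GLF_INODE_DELETING,
};

struct model_flag_name {
    unsigned long mask;
    char ch;
};

/* Same shape as the table in the hunk: one (mask, letter) pair per flag. */
static const struct model_flag_name model_flag_names[] = {
    { 1UL << MODEL_GLF_QUEUED,         'q' },
    { 1UL << MODEL_GLF_LRU,            'L' },
    { 1UL << MODEL_GLF_OBJECT,         'o' },
    { 1UL << MODEL_GLF_BLOCKING,       'b' },
    { 1UL << MODEL_GLF_INODE_DELETING, '-' },
};

/* Write one letter per set flag into buf, in table order. */
static void model_show_flags(unsigned long flags, char *buf, size_t len)
{
    size_t pos = 0;
    size_t n = sizeof(model_flag_names) / sizeof(model_flag_names[0]);

    if (len == 0)
        return;
    for (size_t i = 0; i < n; i++)
        if ((flags & model_flag_names[i].mask) && pos + 1 < len)
            buf[pos++] = model_flag_names[i].ch;
    buf[pos] = '\0';
}
```

With the new entry in place, a glock that has the deleting flag set shows one extra character at the end of its flag string, which is what makes it easy to spot in a dump.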


-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912



[Cluster-devel] GFS2 deadlock

2015-10-05 Thread Andrew W Elble
We've just run into a deadlock.

It seems very similar to the one referenced in commit
44ad37d69b2cc421d5b5c7ad7fed16230685b092

Is it possible that, in fs/gfs2/export.c:gfs2_get_dentry(),

140  inode = gfs2_ilookup(sb, inum->no_addr, 0);

should be:

140  inode = gfs2_ilookup(sb, inum->no_addr, 1);

?
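
To spell out the difference being asked about, here is a userspace sketch under one stated assumption: that the third argument to `gfs2_ilookup()` selects a non-blocking mode in which an inode that is currently being freed is skipped rather than waited for. The `model_inode` struct, the result enum and `model_ilookup()` are hypothetical and only mimic that behaviour; they are not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a cached inode. */
struct model_inode {
    unsigned long long no_addr;
    bool freeing;   /* eviction in progress */
};

enum model_lookup_result {
    MODEL_FOUND,        /* usable inode returned */
    MODEL_NOT_CACHED,   /* no such inode in the cache */
    MODEL_SKIPPED,      /* non-blocking: freeing inode ignored */
    MODEL_WOULD_SLEEP,  /* blocking: caller would wait for eviction */
};

/* Assumption: non_block means "do not wait on an inode being freed". */
static enum model_lookup_result
model_ilookup(const struct model_inode *cache, size_t n,
              unsigned long long no_addr, bool non_block)
{
    for (size_t i = 0; i < n; i++) {
        if (cache[i].no_addr != no_addr)
            continue;
        if (!cache[i].freeing)
            return MODEL_FOUND;
        return non_block ? MODEL_SKIPPED : MODEL_WOULD_SLEEP;
    }
    return MODEL_NOT_CACHED;
}
```

In this model the blocking call on a freeing inode is the case that corresponds to the `__wait_on_freeing_inode` frame in the first backtrace below; whether skipping is actually safe for the export path is the open question.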

I have a dump if more information would help.

Same inode:
This is gfs2_inode->i_iopen_gh.gh_gl
G:  s:SH n:5/3157699 f:DIqob t:SH d:UN/104484397000 a:0 v:0 r:3 m:200
 H: s:SH f:EH e:0 p:24919 [nfsd] gfs2_inode_lookup+0x10e/0x210 [gfs2]

this is gfs2_inode->i_gl
G:  s:EX n:2/3157699 f:yIqob t:EX d:EX/0 a:0 v:0 r:4 m:200
 H: s:EX f:H e:0 p:24920 [nfsd] gfs2_evict_inode+0x124/0x400 [gfs2]
  I: n:81596/51738265 t:8 f:0x00 d:0x s:500

This is doing SEQ/PUTFH/GETATTR:

crash> bt
PID: 24919  TASK: 881f9e11d160  CPU: 32  COMMAND: "nfsd"
 #0 [883f62443950] __schedule at 8165aaf4
 #1 [883f624439a0] schedule at 8165b1a7
 #2 [883f624439a8] __wait_on_freeing_inode at 811fbe1c
 #3 [883f62443a30] find_inode at 811fbed1
 #4 [883f62443a80] ilookup5_nowait at 811fbf61
 #5 [883f62443ab0] ilookup5 at 811fcb33
 #6 [883f62443ad0] gfs2_ilookup at a080d1db [gfs2]
 #7 [883f62443af0] gfs2_get_dentry at a0806a11 [gfs2]
 #8 [883f62443b10] gfs2_fh_to_dentry at a0806b2c [gfs2]
 #9 [883f62443b30] exportfs_decode_fh at 81262ef2
#10 [883f62443ca0] fh_verify at a057e977 [nfsd]
#11 [883f62443d20] nfsd4_putfh at a058ce6d [nfsd]
#12 [883f62443d50] nfsd4_proc_compound at a058ed57 [nfsd]
#13 [883f62443db0] nfsd_dispatch at a057af83 [nfsd]
#14 [883f62443df0] svc_process_common at a01a2bb0 [sunrpc]
#15 [883f62443e60] svc_process at a01a2f53 [sunrpc]
#16 [883f62443e90] nfsd at a057a98f [nfsd]
#17 [883f62443ec0] kthread at 81096919
#18 [883f62443f50] ret_from_fork at 8165f3a2

This is doing SEQ/PUTFH/REMOVE:

crash> bt
PID: 24920  TASK: 881febf843d0  CPU: 32  COMMAND: "nfsd"
 #0 [883f62447a00] __schedule at 8165aaf4
 #1 [883f62447a50] schedule at 8165b1a7
 #2 [883f62447a58] bit_wait at 8165b9bc
 #3 [883f62447a70] bit_wait at 8165b9bc
 #4 [883f62447a80] __wait_on_bit at 8165b645
 #5 [883f62447ad0] out_of_line_wait_on_bit at 8165b6e2
 #6 [883f62447b40] gfs2_glock_dq_wait at a07ff4f3 [gfs2]
 #7 [883f62447b60] gfs2_evict_inode at a0818111 [gfs2]
 #8 [883f62447bf0] evict at 811fc9eb
 #9 [883f62447c20] iput at 811fd34b
#10 [883f62447c50] d_delete at 811f8c58
#11 [883f62447c80] vfs_unlink at 811ee8f9
#12 [883f62447cd0] nfsd_unlink at a0580dcf [nfsd]
#13 [883f62447d10] nfsd4_remove at a058debd [nfsd]
#14 [883f62447d50] nfsd4_proc_compound at a058ed57 [nfsd]
#15 [883f62447db0] nfsd_dispatch at a057af83 [nfsd]
#16 [883f62447df0] svc_process_common at a01a2bb0 [sunrpc]
#17 [883f62447e60] svc_process at a01a2f53 [sunrpc]
#18 [883f62447e90] nfsd at a057a98f [nfsd]
#19 [883f62447ec0] kthread at 81096919
#20 [883f62447f50] ret_from_fork at ffffffff8165f3a2

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912



[Cluster-devel] 3.18.5 kernel panic: fs/gfs2/acl.c:76

2015-02-06 Thread Andrew W Elble

3.18.5 kernel crashing on acl deletion:

null pointer dereference in fs/gfs2/acl.c:76

to replicate:

Prereq: gfs2 filesystem w/ acl mount option turned on.

Execute:

mkdir testdir
setfacl -m d:u::rwx,d:g::rwx,d:g:wheel:rwx,d:m::rwx,d:o::--- testdir
setfattr -x system.posix_acl_default testdir

fix we're using currently:

---
 fs/gfs2/acl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/acl.c b/fs/gfs2/acl.c
index 3088e2a..8339754 100644
--- a/fs/gfs2/acl.c
+++ b/fs/gfs2/acl.c
@@ -73,7 +73,7 @@ int gfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 
 	BUG_ON(name == NULL);
 
-	if (acl->a_count > GFS2_ACL_MAX_ENTRIES(GFS2_SB(inode)))
+	if ((acl) && (acl->a_count > GFS2_ACL_MAX_ENTRIES(GFS2_SB(inode))))
 		return -E2BIG;
 
 	if (type == ACL_TYPE_ACCESS) {
-- 
1.9.2
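
For context on why the extra check matters, the sketch below models the pattern in userspace, on the assumption that removing the default ACL reaches the set-ACL path with a NULL `acl` pointer (which is what the oops suggests). The `model_acl` struct, the `MODEL_ACL_MAX_ENTRIES` limit and `model_set_acl()` are hypothetical and stand in for the kernel types.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct posix_acl. */
struct model_acl {
    unsigned int a_count;
};

/* Hypothetical limit; the real one comes from GFS2_ACL_MAX_ENTRIES(). */
#define MODEL_ACL_MAX_ENTRIES 25u

/* Size-check the ACL only when there is one. A NULL acl is treated
 * as "remove the ACL", so there is nothing to count. */
static int model_set_acl(const struct model_acl *acl)
{
    if (acl && acl->a_count > MODEL_ACL_MAX_ENTRIES)
        return -E2BIG;
    return 0;
}
```

Without the `acl &&` guard, the NULL case would dereference `acl->a_count`, which is the kind of access the reproducer above triggers with `setfattr -x`.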

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912