Re: [Cluster-devel] [GFS2 PATCH v2 14/15] GFS2: Hold onto iopen glock longer when dinode creation fails
Bob,

Unsure if this is related to my other issues, but I should probably at
least pass this along:

void gfs2_glock_put(struct gfs2_glock *gl)
{
...
        GLOCK_BUG_ON(gl, !list_empty(&gl->gl_holders)); <- this line
...
}

> @@ -883,6 +880,14 @@ fail_free_acls:
>          posix_acl_release(acl);
>  fail_free_vfs_inode:
>          free_vfs_inode = 1;
> +        /* We hold off until the very end to release the iopen glock. That
> +         * keeps other processes from acquiring it in EX mode and deleting
> +         * it while we're still using it. Since gfs2_delete_inode already
> +         * handles the iopen vs. inode glocks in any order, the lock order
> +         * does not matter. It must be done before iput, though, otherwise
> +         * we might get a segfault trying to dereference it. */
> +        if (ip && ip->i_iopen_gh.gh_gl) /* if holder is linked to the glock */

via this line:

> +                gfs2_glock_put(ip->i_iopen_gh.gh_gl);

[209071.114484] gfs2: G: s:SH n:5/396fcce f:Iqob t:SH d:EX/0 a:0 v:0 r:-128 m:200
[209071.114493] gfs2: H: s:SH f:EH e:0 p:40735 [nfsd] gfs2_glock_nq_init+0x11/0x40 [gfs2]
[209071.114529] [ cut here ]
[209071.114530] kernel BUG at fs/gfs2/glock.c:208!
[209071.114531] invalid opcode: [#1] SMP
[209071.114555] Modules linked in: gfs2 dlm sctp drbd(OE) cts rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache dm_service_time iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp mrp stp llc bonding nf_log_ipv4 nf_log_common xt_LOG ipt_REJECT nf_reject_ipv4 xt_conntrack iptable_filter nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack ip_tables dm_multipath x86_pkg_temp_thermal coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt ipmi_devintf iTCO_vendor_support nfsd sb_edac ipmi_si auth_rpcgss nfs_acl lpc_ich ipmi_msghandler dcdbas mfd_core edac_core pcspkr mei_me lockd wmi mei grace shpchp acpi_power_meter acpi_pad sunrpc binfmt_misc xfs mgag200 sr_mod cdrom syscopyarea sysfillrect
[209071.114561] sysimgblt i2c_algo_bit drm_kms_helper ttm sd_mod ahci ixgbe drm tg3 libahci mdio dca libata ptp megaraid_sas i2c_core pps_core dm_mirror dm_region_hash dm_log dm_mod
[209071.114563] CPU: 4 PID: 40735 Comm: nfsd Tainted: GW OE 4.1.10_5 #1
[209071.114563] Hardware name: Dell Inc. PowerEdge R720/0X3D66, BIOS 2.2.2 01/16/2014
[209071.114564] task: 880035480d90 ti: 881f5c76c000 task.ti: 881f5c76c000
[209071.114568] RIP: 0010:[] [] gfs2_glock_put+0x139/0x160 [gfs2]
[209071.114569] RSP: 0018:881f5c76fa98 EFLAGS: 00010296
[209071.114569] RAX: RBX: 88144026a940 RCX: 5298
[209071.114570] RDX: 52985298 RSI: 0286 RDI: 0286
[209071.114585] RBP: 881f5c76fab8 R08: 004a R09: 81dbd15e
[209071.114585] R10: 1cb8 R11: 0001 R12:
[209071.114585] R13: 883fed457000 R14: 88144026a970 R15: 881f1cb3c590
[209071.114586] FS: () GS:881fff88() knlGS:
[209071.114587] CS: 0010 DS: ES: CR0: 80050033
[209071.114587] CR2: 7f83fbd67000 CR3: 0197e000 CR4: 001406e0
[209071.114588] Stack:
[209071.114589] 8804501a98c0 883fed457000 883fed457000
[209071.114590] 881f5c76fc08 a087d04e 881ed6512470 8804501a98e0
[209071.114590] 881ed6512470 881eff86 881ed6512470 8000fd5b5c80
[209071.114591] Call Trace:
[209071.114596] [] gfs2_create_inode+0x77e/0x11b0 [gfs2]
[209071.114600] [] ? gfs2_create_inode+0xd6/0x11b0 [gfs2]
[209071.114603] [] gfs2_create+0x3b/0x40 [gfs2]
[209071.114607] [] ? security_inode_create+0x1f/0x30
[209071.114609] [] vfs_create+0xd5/0x140
[209071.114618] [] do_nfsd_create+0x481/0x600 [nfsd]
[209071.114623] [] nfsd4_open+0x24a/0x830 [nfsd]
[209071.114628] [] nfsd4_proc_compound+0x4d7/0x7e0 [nfsd]
[209071.114632] [] nfsd_dispatch+0xc3/0x210 [nfsd]
[209071.114658] [] ? svc_tcp_adjust_wspace+0x12/0x30 [sunrpc]
[209071.114666] [] svc_process_common+0x440/0x6d0 [sunrpc]
[209071.114673] [] svc_process+0x113/0x1b0 [sunrpc]
[209071.114676] [] nfsd+0xff/0x170 [nfsd]
[209071.114680] [] ? nfsd_destroy+0x80/0x80 [nfsd]
[209071.114682] [] kthread+0xc9/0xe0
[209071.114683] [] ? kthread_create_on_node+0x180/0x180
[209071.114685] [] ret_from_fork+0x42/0x70
[209071.114687] [] ? kthread_create_on_node+0x180/0x180
[209071.114696] Code: 49 8b 04 24 48 85 c0 75 e9 eb b8 0f 1f 80 00 00 00 00 f3 90 48 8b 10 83 e2 01 75 f6 e9 44 ff ff ff 48 89 de 31 ff e8 17 fb ff ff <0f> 0b 49 83 7c 24 50 00 74 89 48 89 de 31 ff e8 03 fb ff ff 0f
[209071.114699] RIP [] gfs2_glock_put+0x139/0x160 [gfs2]
[209071.114699] RSP

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.
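For readers following the oops, here is a minimal userspace sketch of the condition that the GLOCK_BUG_ON line guards. It is not the kernel code: `struct model_glock`, `model_glock_put` and `model_holder_dequeue` are hypothetical names, and the sketch assumes the check fires when the final reference is dropped while a holder (the `H:` line in the dump) is still queued on the glock.

```c
#include <assert.h>

/* Hypothetical stand-in for a glock: a reference count plus the number of
 * holders still queued on it (cf. gl_holders). Not struct gfs2_glock. */
struct model_glock {
    int refcount;
    int nholders;
};

enum put_result { PUT_STILL_REFERENCED, PUT_FREED, PUT_WOULD_BUG };

/* Drop one reference. Dropping the last one while a holder is still queued
 * is the state this sketch assumes the kernel check complains about. */
static enum put_result model_glock_put(struct model_glock *gl)
{
    assert(gl->refcount > 0);
    gl->refcount--;
    if (gl->refcount > 0)
        return PUT_STILL_REFERENCED;
    if (gl->nholders != 0)
        return PUT_WOULD_BUG;
    return PUT_FREED;
}

/* Remove one queued holder, if any. */
static void model_holder_dequeue(struct model_glock *gl)
{
    if (gl->nholders > 0)
        gl->nholders--;
}
```

In this model, calling `model_holder_dequeue()` before the final `model_glock_put()` yields `PUT_FREED`, while putting first yields `PUT_WOULD_BUG`. Whether that ordering is the right fix for the patch under discussion is not something the sketch can establish.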
Re: [Cluster-devel] [GFS2 PATCH v2 14/15] GFS2: Hold onto iopen glock longer when dinode creation fails
> Hi Andy,
>
> Thanks. I'll investigate it.
>
> BTW, I haven't found any more blatant bugs during testing, however I'm
> debugging another issue. It seems as if I still have a reference counter
> issue somewhere because if I slam GFS2 hard enough, I can get it to
> accumulate millions of glocks that are never freed (unless memory pressure
> causes the glock shrinker to be called). Having the extra glocks is causing
> undue strain on the dlm, to the point where dlm can't keep up.
> So it seems likely that I'll do another revision here anyway.
> I'll keep you posted.

Bob,

Would that resemble something like this?

S  glocks     nondisk    inode   rgrp    iopen  flock  quota  jrnl    Total
S  ---------  -------  -------  -----  -------  -----  -----  ----  -------
S  Unlocked:        1    67677    304        0      0    460     0    68442
S  Locked:          2  1638575    630  1637848      0    780     1  3277836
S  Total:           3  1706252    934  1637848      0   1240     1  3346278
S
S  Held EX:         0        1      0        0      0      0     1        2
S  Held SH:         1        1      0  1637847      0      0     0  1637849
S  Held DF:         0        0      0        0      0      0     0        0
S  G Waiting:       0        0      0        0      0      0     0        0
S  P Waiting:       0        0      0        0      0      0     0        0
S  DLM wait:        0

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
Re: [Cluster-devel] [GFS2 PATCH 06/15] GFS2: Prevent gl_delete work for re-used inodes
Bob,

The deadlock is slightly different - but still occurs with your patches
in place. I'm pretty sure this is what's happening:

nfsd (unlink) has i_mutex
fs/nfsd/vfs.c:nfsd_unlink()
-> fh_lock_nested(fhp, I_MUTEX_PARENT);

nfsd (lookup) gets a glock
fs/gfs2/inode.c:gfs2_inode_lookup()
-> error = gfs2_glock_nq_init(io_gl, LM_ST_SHARED, GL_EXACT, &ip->i_iopen_gh);

nfsd (unlink) needs conflicting glock
fs/gfs2/super.c:gfs2_evict_inode()
-> if (ip->i_iopen_gh.gh_gl &&
       test_bit(HIF_HOLDER, &ip->i_iopen_gh.gh_iflags)) {
           gfs2_glock_dq_wait(&ip->i_iopen_gh);

client (lookup) waiting on i_mutex
fs/nfsd/vfs.c:nfsd_lookup_dentry()
-> fh_lock_nested(fhp, I_MUTEX_PARENT);

-> deadlock

G: s:EX n:2/2f4935a f:yIqob t:EX d:EX/0 a:0 v:0 r:10 m:200
 H: s:EX f:H e:0 p:34415 [nfsd] gfs2_evict_inode+0x160/0x4d0 [gfs2]
 I: n:329967/49582938 t:8 f:0x00 d:0x s:500
G: s:SH n:5/2f4935a f:DIqob t:SH d:UN/3833211000 a:0 v:0 r:4 m:200
 H: s:SH f:EH e:0 p:34414 [nfsd] gfs2_inode_lookup+0xee/0x1f0 [gfs2]

nfs client host a:
313189 11:47:26.499858000 x.y.z.a -> x.y.z.q NFS 342 V4 Call REMOVE DH: 0x46fbb746/0353fd0043cc75dd8203b16b5bd4c197-cache-mod_custom-e2acfa1435db9601a6b9645e9f8be86f.php

nfs client host b:
539106 11:49:28.390748000 x.y.z.b -> x.y.z.q NFS 362 V4 Call LOOKUP DH: 0x46fbb746/0353fd0043cc75dd8203b16b5bd4c197-cache-mod_custom-ea521049a8a64b325300eab10b4ac871.php

crash> bt
PID: 34414 TASK: 881fdc7428b0 CPU: 38 COMMAND: "nfsd"
 #0 [881f2da57b70] __schedule at 8165bbc4
 #1 [881f2da57bc0] schedule at 8165c267
 #2 [881f2da57be0] schedule_preempt_disabled at 8165c59e
 #3 [881f2da57bf0] __mutex_lock_slowpath at 8165e0d5
 #4 [881f2da57c50] mutex_lock at 8165e173
 #5 [881f2da57c70] nfsd_lookup_dentry at a035454f [nfsd]
 #6 [881f2da57cf0] nfsd_lookup at a0354989 [nfsd]
 #7 [881f2da57d40] nfsd4_lookup at a0361a2a [nfsd]
 #8 [881f2da57d50] nfsd4_proc_compound at a0363d57 [nfsd]
 #9 [881f2da57db0] nfsd_dispatch at a034ff83 [nfsd]
#10 [881f2da57df0] svc_process_common at a0188260 [sunrpc]
#11 [881f2da57e60] svc_process at a0188603 [sunrpc]
#12 [881f2da57e90] nfsd at a034f98f [nfsd]
#13 [881f2da57ec0] kthread at 81096989
#14 [881f2da57f50] ret_from_fork at 81660462

crash> bt
PID: 34415 TASK: 881fec1a6c80 CPU: 24 COMMAND: "nfsd"
 #0 [881f2db779f0] __schedule at 8165bbc4
 #1 [881f2db77a40] schedule at 8165c267
 #2 [881f2db77a60] bit_wait at 8165ca7c
 #3 [881f2db77a70] __wait_on_bit at 8165c705
 #4 [881f2db77ac0] out_of_line_wait_on_bit at 8165c7a2
 #5 [881f2db77b30] gfs2_glock_dq_wait at a0850553 [gfs2]
 #6 [881f2db77b50] gfs2_evict_inode at a08697d5 [gfs2]
 #7 [881f2db77bf0] evict at 811fcbcb
 #8 [881f2db77c20] iput at 811fd52b
 #9 [881f2db77c50] d_delete at 811f8e38
#10 [881f2db77c80] vfs_unlink at 811edf79
#11 [881f2db77cd0] nfsd_unlink at a0355dcf [nfsd]
#12 [881f2db77d10] nfsd4_remove at a0362ebd [nfsd]
#13 [881f2db77d50] nfsd4_proc_compound at a0363d57 [nfsd]
#14 [881f2db77db0] nfsd_dispatch at a034ff83 [nfsd]
#15 [881f2db77df0] svc_process_common at a0188260 [sunrpc]
#16 [881f2db77e60] svc_process at a0188603 [sunrpc]
#17 [881f2db77e90] nfsd at a034f98f [nfsd]
#18 [881f2db77ec0] kthread at 81096989
#19 [881f2db77f50] ret_from_fork at 81660462

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
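The lock-order inversion described in this message can be modelled as a cycle in a wait-for graph. The sketch below is a userspace illustration only: the enum names and `model_has_deadlock` are hypothetical, and it simplifies by giving each lock a single owner, which is not true of a glock held in SH mode.

```c
/* Two tasks and two locks, named after the roles in the report. */
enum { TASK_UNLINK, TASK_LOOKUP, NTASKS };
enum { LOCK_I_MUTEX, LOCK_IOPEN_GLOCK, NLOCKS };
#define NONE (-1)

/* owner[l]     : task currently holding lock l, or NONE
 * waits_for[t] : lock that task t is blocked on, or NONE
 * Returns 1 if following "blocked on a lock owned by ..." edges from any
 * task leads back to that same task, i.e. the tasks wait on each other. */
static int model_has_deadlock(const int owner[NLOCKS],
                              const int waits_for[NTASKS])
{
    for (int start = 0; start < NTASKS; start++) {
        int task = start;
        for (int hops = 0; hops < NTASKS; hops++) {
            int lock = waits_for[task];
            if (lock == NONE)
                break;          /* this task is runnable */
            task = owner[lock];
            if (task == NONE)
                break;          /* lock is free, no edge to follow */
            if (task == start)
                return 1;       /* cycle found */
        }
    }
    return 0;
}
```

Filling in the report's state (unlink owns i_mutex and waits on the iopen glock, lookup owns the iopen glock and waits on i_mutex) makes the function return 1. Removing either wait edge makes it return 0.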
Re: [Cluster-devel] [GFS2 PATCH 06/15] GFS2: Prevent gl_delete work for re-used inodes
Bob Peterson <rpete...@redhat.com> writes:

> Hi Andrew,
>
> Actually, I've found a few bugs and problems with that last patch set
> and revised my patches last week. I've also added the glock flag, but
> used "x" rather than "-" because I'm not sure I like punctuation marks there,
> but nothing else makes sense either. The other changes are for the other
> thing you spotted (which I caught in testing). The proper way to do it
> is to initialize the i_gl to ip->i_gl in the evict code, and not have the
> if at all. That affects two of the patches:

I was wondering about that. I'll get that changed.

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
Re: [Cluster-devel] [GFS2 PATCH 13/15] gfs2: Use new variable i_gl instead of ip->i_gl
Bob Peterson <rpete...@redhat.com> writes:

> This patch adds a new variable to function gfs2_evict_inode that
> simplifies the references to ip->i_gl. This is just for readability
> and to clarify future patches.
>
> Signed-off-by: Bob Peterson <rpete...@redhat.com>
> ---
>  fs/gfs2/super.c | 13 +++--
>  1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c
> index 06bd72b..79ee54b 100644
> --- a/fs/gfs2/super.c
> +++ b/fs/gfs2/super.c
> @@ -1616,7 +1617,7 @@ out:
>          ip->i_gl->gl_object = NULL;
>          flush_delayed_work(&ip->i_gl->gl_work);
>          gfs2_glock_add_to_lru(ip->i_gl);
> -        gfs2_glock_put(ip->i_gl);

if (i_gl)

> +        gfs2_glock_put(i_gl);
>          ip->i_gl = NULL;
>          if (ip->i_iopen_gh.gh_gl) {
>                  ip->i_iopen_gh.gh_gl->gl_object = NULL;

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
Re: [Cluster-devel] [GFS2 PATCH 06/15] GFS2: Prevent gl_delete work for re-used inodes
Bob Peterson <rpete...@redhat.com> writes:

> This patch adds a new glock flag GLF_INODE_DELETING which signifies
> when a glock is being used to change an inode from unlinked to
> deleted. The flag is used in a few places:

This is the change I made to what we're testing:

diff --git a/fs/gfs2/trace_gfs2.h b/fs/gfs2/trace_gfs2.h
index 20c007d..80f2ee7 100644
--- a/fs/gfs2/trace_gfs2.h
+++ b/fs/gfs2/trace_gfs2.h
@@ -57,7 +57,8 @@
         {(1UL << GLF_QUEUED),           "q" },  \
         {(1UL << GLF_LRU),              "L" },  \
         {(1UL << GLF_OBJECT),           "o" },  \
-        {(1UL << GLF_BLOCKING),         "b" })
+        {(1UL << GLF_BLOCKING),         "b" },  \
+        {(1UL << GLF_INODE_DELETING),   "-" })

 #ifndef NUMPTY
 #define NUMPTY

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
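The table in this diff pairs each flag bit with the letter shown in the `f:` field of a glock dump (for example `f:Iqob`). The sketch below shows that mapping in userspace, as an illustration only: the `MODEL_GLF_*` bit positions, `model_flag_names` and `model_show_flags` are hypothetical, and only the letters q, L, o, b and "-" are taken from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit positions; the real GLF_* values are not given here. */
enum model_glf_bit {
    MODEL_GLF_QUEUED,
    MODEL_GLF_LRU,
    MODEL_GLF_OBJECT,
    MODEL_GLF_BLOCKING,
    MODEL_GLF_INODE_DELETING
};

struct model_flag_name {
    unsigned long mask;
    char letter;
};

/* Letters follow the diff above, including "-" for the new flag. */
static const struct model_flag_name model_flag_names[] = {
    { 1UL << MODEL_GLF_QUEUED,         'q' },
    { 1UL << MODEL_GLF_LRU,            'L' },
    { 1UL << MODEL_GLF_OBJECT,         'o' },
    { 1UL << MODEL_GLF_BLOCKING,       'b' },
    { 1UL << MODEL_GLF_INODE_DELETING, '-' },
};

/* Write one letter per set flag into buf, in table order, always
 * NUL-terminating. Returns the number of letters written. */
static size_t model_show_flags(unsigned long flags, char *buf, size_t len)
{
    size_t n = 0;

    assert(buf != NULL && len > 0);
    for (size_t i = 0;
         i < sizeof(model_flag_names) / sizeof(model_flag_names[0]); i++) {
        if ((flags & model_flag_names[i].mask) && n + 1 < len)
            buf[n++] = model_flag_names[i].letter;
    }
    buf[n] = '\0';
    return n;
}
```

With the queued, object, blocking and inode-deleting bits set, the sketch renders the letters q, o, b and "-" in that order, which is how the new flag would become visible in a dump.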
[Cluster-devel] GFS2 deadlock
We've just run into a deadlock. It seems very similar to the one
referenced in commit 44ad37d69b2cc421d5b5c7ad7fed16230685b092

is it possible that fs/gfs2/export.c:gfs2_get_dentry()

140 inode = gfs2_ilookup(sb, inum->no_addr, 0);

should be:

140 inode = gfs2_ilookup(sb, inum->no_addr, 1);

?

I have a dump if more information would help.

same inode:

this is gfs2_inode->i_iopen_gh->gh_gl
G: s:SH n:5/3157699 f:DIqob t:SH d:UN/104484397000 a:0 v:0 r:3 m:200
 H: s:SH f:EH e:0 p:24919 [nfsd] gfs2_inode_lookup+0x10e/0x210 [gfs2]

this is gfs2_inode->i_gl
G: s:EX n:2/3157699 f:yIqob t:EX d:EX/0 a:0 v:0 r:4 m:200
 H: s:EX f:H e:0 p:24920 [nfsd] gfs2_evict_inode+0x124/0x400 [gfs2]
 I: n:81596/51738265 t:8 f:0x00 d:0x s:500

This is doing SEQ/PUTFH/GETATTR:

crash> bt
PID: 24919 TASK: 881f9e11d160 CPU: 32 COMMAND: "nfsd"
 #0 [883f62443950] __schedule at 8165aaf4
 #1 [883f624439a0] schedule at 8165b1a7
 #2 [883f624439a8] __wait_on_freeing_inode at 811fbe1c
 #3 [883f62443a30] find_inode at 811fbed1
 #4 [883f62443a80] ilookup5_nowait at 811fbf61
 #5 [883f62443ab0] ilookup5 at 811fcb33
 #6 [883f62443ad0] gfs2_ilookup at a080d1db [gfs2]
 #7 [883f62443af0] gfs2_get_dentry at a0806a11 [gfs2]
 #8 [883f62443b10] gfs2_fh_to_dentry at a0806b2c [gfs2]
 #9 [883f62443b30] exportfs_decode_fh at 81262ef2
#10 [883f62443ca0] fh_verify at a057e977 [nfsd]
#11 [883f62443d20] nfsd4_putfh at a058ce6d [nfsd]
#12 [883f62443d50] nfsd4_proc_compound at a058ed57 [nfsd]
#13 [883f62443db0] nfsd_dispatch at a057af83 [nfsd]
#14 [883f62443df0] svc_process_common at a01a2bb0 [sunrpc]
#15 [883f62443e60] svc_process at a01a2f53 [sunrpc]
#16 [883f62443e90] nfsd at a057a98f [nfsd]
#17 [883f62443ec0] kthread at 81096919
#18 [883f62443f50] ret_from_fork at 8165f3a2

This is doing SEQ/PUTFH/REMOVE:

crash> bt
PID: 24920 TASK: 881febf843d0 CPU: 32 COMMAND: "nfsd"
 #0 [883f62447a00] __schedule at 8165aaf4
 #1 [883f62447a50] schedule at 8165b1a7
 #2 [883f62447a58] bit_wait at 8165b9bc
 #3 [883f62447a70] bit_wait at 8165b9bc
 #4 [883f62447a80] __wait_on_bit at 8165b645
 #5 [883f62447ad0] out_of_line_wait_on_bit at 8165b6e2
 #6 [883f62447b40] gfs2_glock_dq_wait at a07ff4f3 [gfs2]
 #7 [883f62447b60] gfs2_evict_inode at a0818111 [gfs2]
 #8 [883f62447bf0] evict at 811fc9eb
 #9 [883f62447c20] iput at 811fd34b
#10 [883f62447c50] d_delete at 811f8c58
#11 [883f62447c80] vfs_unlink at 811ee8f9
#12 [883f62447cd0] nfsd_unlink at a0580dcf [nfsd]
#13 [883f62447d10] nfsd4_remove at a058debd [nfsd]
#14 [883f62447d50] nfsd4_proc_compound at a058ed57 [nfsd]
#15 [883f62447db0] nfsd_dispatch at a057af83 [nfsd]
#16 [883f62447df0] svc_process_common at a01a2bb0 [sunrpc]
#17 [883f62447e60] svc_process at a01a2f53 [sunrpc]
#18 [883f62447e90] nfsd at a057a98f [nfsd]
#19 [883f62447ec0] kthread at 81096919
#20 [883f62447f50] ret_from_fork at ffffffff8165f3a2

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
[Cluster-devel] 3.18.5 kernel panic: fs/gfs2/acl.c:76
3.18.5 kernel crashing on acl deletion:
null pointer dereference in fs/gfs2/acl.c:76

to replicate:

Prereq: gfs2 filesystem w/ acl mount option turned on.

Execute:

mkdir testdir
setfacl -m d:u::rwx,d:g::rwx,d:g:wheel:rwx,d:m::rwx,d:o::--- testdir
setfattr -x system.posix_acl_default testdir

fix we're using currently:

---
 fs/gfs2/acl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/gfs2/acl.c b/fs/gfs2/acl.c
index 3088e2a..8339754 100644
--- a/fs/gfs2/acl.c
+++ b/fs/gfs2/acl.c
@@ -73,7 +73,7 @@ int gfs2_set_acl(struct inode *inode, struct posix_acl *acl, int type)

         BUG_ON(name == NULL);

-        if (acl->a_count > GFS2_ACL_MAX_ENTRIES(GFS2_SB(inode)))
+        if ((acl) && (acl->a_count > GFS2_ACL_MAX_ENTRIES(GFS2_SB(inode))))
                 return -E2BIG;

         if (type == ACL_TYPE_ACCESS) {
-- 
1.9.2

Thanks,

Andy

-- 
Andrew W. Elble
awe...@discipline.rit.edu
Infrastructure Engineer, Communications Technical Lead
Rochester Institute of Technology
PGP: BFAD 8461 4CCF DC95 DA2C B0EB 965B 082E 863E C912
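The one-line fix can be exercised outside the kernel with a small stand-in. The sketch below is an assumption-laden illustration: `struct model_acl` and `model_check_acl_size` are hypothetical, `max_entries` replaces the GFS2_ACL_MAX_ENTRIES() macro, and a NULL acl is taken to mean "remove the ACL", as in the setfattr -x reproducer.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct posix_acl; only the entry count matters. */
struct model_acl {
    unsigned int a_count;
};

/* Size check with the NULL guard from the patch: a NULL acl (deletion)
 * skips the entry-count comparison instead of dereferencing the pointer. */
static int model_check_acl_size(const struct model_acl *acl,
                                unsigned int max_entries)
{
    if (acl && acl->a_count > max_entries)
        return -E2BIG;
    return 0;
}
```

Passing NULL returns 0 rather than crashing, an oversized ACL still returns -E2BIG, and an ACL within the limit returns 0.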