Hello,

We have been having trouble with our GFS2 filesystem for several days (log attached):

- the FS ran for a long time without trouble (since 2014-11-03)

- the FS was grown from 3 TB to 4 TB about 6 months ago

- it seems to happen on only one node, “nebula3”

- I ran an fsck when just fencing the node was not sufficient (2 crashes
  on the same day)

The nodes run an up-to-date Ubuntu Trusty Tahr.

Do you have any idea what could be causing this?

Regards.

-- 
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF

Feb 10 09:08:08 nebula3 kernel: [53799.437568] GFS2: buf_blk = 0x3248 old_state=0, new_state=0
Feb 10 09:08:08 nebula3 kernel: [53799.437577] GFS2: rgrp=0x3ff67bbd bi_start=0x0
Feb 10 09:08:08 nebula3 kernel: [53799.437579] GFS2: bi_offset=0x80 bi_len=0xf80
Feb 10 09:08:08 nebula3 kernel: [53799.437585] CPU: 9 PID: 48112 Comm: rm Not tainted 3.13.0-77-generic #121-Ubuntu
Feb 10 09:08:08 nebula3 kernel: [53799.437588] Hardware name: Dell Inc. PowerEdge M620/0T36VK, BIOS 2.2.7 01/21/2014
Feb 10 09:08:08 nebula3 kernel: [53799.437591]  000000003ff6ae0b ffff8816d6421af0 ffffffff81725138 000000003ff6ae0b
Feb 10 09:08:08 nebula3 kernel: [53799.437599]  ffff8816d6421b48 ffffffffa05b0bbf ffff8817cb61f100 00000000a05b7977
Feb 10 09:08:08 nebula3 kernel: [53799.437605]  ffff8817cb638198 0000000000003248 ffff8816d671f000 0000000000000020
Feb 10 09:08:08 nebula3 kernel: [53799.437611] Call Trace:
Feb 10 09:08:08 nebula3 kernel: [53799.437629]  [<ffffffff81725138>] dump_stack+0x45/0x56
Feb 10 09:08:08 nebula3 kernel: [53799.437650]  [<ffffffffa05b0bbf>] rgblk_free+0x1ff/0x230 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437663]  [<ffffffffa05b2f34>] __gfs2_free_blocks+0x34/0x120 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437671]  [<ffffffffa058f076>] recursive_scan+0x5b6/0x6a0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437679]  [<ffffffffa058ef2c>] recursive_scan+0x46c/0x6a0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437691]  [<ffffffffa05ad4f5>] ? gfs2_quota_hold+0x175/0x1f0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437699]  [<ffffffffa058f25a>] trunc_dealloc+0xfa/0x120 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437708]  [<ffffffffa059a98e>] ? gfs2_glock_wait+0x3e/0x80 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437718]  [<ffffffffa059c190>] ? gfs2_glock_nq+0x280/0x430 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437726]  [<ffffffffa0590ef0>] gfs2_file_dealloc+0x10/0x20 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437737]  [<ffffffffa05b3db3>] gfs2_evict_inode+0x2b3/0x3e0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437746]  [<ffffffffa05b3c13>] ? gfs2_evict_inode+0x113/0x3e0 [gfs2]
Feb 10 09:08:08 nebula3 kernel: [53799.437755]  [<ffffffff811d99b0>] evict+0xb0/0x1b0
Feb 10 09:08:08 nebula3 kernel: [53799.437760]  [<ffffffff811da1c5>] iput+0xf5/0x180
Feb 10 09:08:08 nebula3 kernel: [53799.437767]  [<ffffffff811ceb1e>] do_unlinkat+0x18e/0x2b0
Feb 10 09:08:08 nebula3 kernel: [53799.437775]  [<ffffffff811bbb76>] ? filp_close+0x56/0x70
Feb 10 09:08:08 nebula3 kernel: [53799.437780]  [<ffffffff811cfa4b>] SyS_unlinkat+0x1b/0x40
Feb 10 09:08:08 nebula3 kernel: [53799.437788]  [<ffffffff81735d1d>] system_call_fastpath+0x1a/0x1f
Feb 10 09:08:08 nebula3 kernel: [53799.437794] GFS2: fsid=yggdrasil:datastores.2: fatal: filesystem consistency error
Feb 10 09:08:08 nebula3 kernel: [53799.437794] GFS2: fsid=yggdrasil:datastores.2:   RG = 1073118141
Feb 10 09:08:08 nebula3 kernel: [53799.437794] GFS2: fsid=yggdrasil:datastores.2:   function = gfs2_setbit, file = /build/linux-faWYrf/linux-3.13.0/fs/gfs2/rgrp.c, line = 103
Feb 10 09:08:08 nebula3 kernel: [53799.437797] GFS2: fsid=yggdrasil:datastores.2: about to withdraw this file system
Feb 10 09:08:08 nebula3 kernel: [53799.441715] GFS2: fsid=yggdrasil:datastores.2: gfs2_evict_inode: -5
Feb 10 09:08:10 nebula3 kernel: [53801.764726] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 09:08:11 nebula3 kernel: [53802.249691] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 09:08:11 nebula3 kernel: [53802.254133] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 09:08:12 nebula3 kernel: [53803.330583] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5

[...] Node restarted

Feb 10 11:17:05 nebula3 kernel: [ 6703.936206] GFS2: fsid=yggdrasil:datastores.2: fatal: filesystem consistency error
Feb 10 11:17:05 nebula3 kernel: [ 6703.936206] GFS2: fsid=yggdrasil:datastores.2:   inode = 11514 30312500
Feb 10 11:17:05 nebula3 kernel: [ 6703.936206] GFS2: fsid=yggdrasil:datastores.2:   function = gfs2_dinode_dealloc, file = /build/linux-OTIHGI/linux-3.13.0/fs/gfs2/super.c, line = 1371
Feb 10 11:17:05 nebula3 kernel: [ 6703.936216] GFS2: fsid=yggdrasil:datastores.2: about to withdraw this file system
Feb 10 11:17:05 nebula3 kernel: [ 6703.975181] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 11:17:05 nebula3 kernel: [ 6704.073107] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 11:17:05 nebula3 kernel: [ 6704.076098] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5
Feb 10 11:17:05 nebula3 kernel: [ 6704.078946] GFS2: fsid=yggdrasil:datastores.2: dirty_inode: glock -5

[...] All nodes down + fsck.gfs2 run on the FS

[...] 5 days later

Feb 15 09:22:27 nebula3 kernel: [411282.308290] GFS2: buf_blk = 0x2089 old_state=0, new_state=0
Feb 15 09:22:27 nebula3 kernel: [411282.308295] GFS2: rgrp=0xc0c5667 bi_start=0x0
Feb 15 09:22:27 nebula3 kernel: [411282.308296] GFS2: bi_offset=0x80 bi_len=0xf80
Feb 15 09:22:27 nebula3 kernel: [411282.308300] CPU: 9 PID: 11494 Comm: rm Tainted: G        W     3.13.0-78-generic #122-Ubuntu
Feb 15 09:22:27 nebula3 kernel: [411282.308301] Hardware name: Dell Inc. PowerEdge M620/0T36VK, BIOS 2.2.7 01/21/2014
Feb 15 09:22:27 nebula3 kernel: [411282.308303]  000000000c0c7705 ffff882e055f9a48 ffffffff81725768 000000000c0c76f6
Feb 15 09:22:27 nebula3 kernel: [411282.308309]  ffff882e055f9aa0 ffffffffa05bcbbf ffff8817de436e00 00000000a05c3977
Feb 15 09:22:27 nebula3 kernel: [411282.308312]  ffff8817de414d48 0000000000002089 ffff882d6f2ff000 0000000000000010
Feb 15 09:22:27 nebula3 kernel: [411282.308315] Call Trace:
Feb 15 09:22:27 nebula3 kernel: [411282.308327]  [<ffffffff81725768>] dump_stack+0x45/0x56
Feb 15 09:22:27 nebula3 kernel: [411282.308340]  [<ffffffffa05bcbbf>] rgblk_free+0x1ff/0x230 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308348]  [<ffffffffa05bef34>] __gfs2_free_blocks+0x34/0x120 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308352]  [<ffffffffa059b076>] recursive_scan+0x5b6/0x6a0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308356]  [<ffffffffa059af2c>] recursive_scan+0x46c/0x6a0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308360]  [<ffffffffa059af2c>] recursive_scan+0x46c/0x6a0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308367]  [<ffffffffa05b94f5>] ? gfs2_quota_hold+0x175/0x1f0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308371]  [<ffffffffa059b25a>] trunc_dealloc+0xfa/0x120 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308377]  [<ffffffffa05a698e>] ? gfs2_glock_wait+0x3e/0x80 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308382]  [<ffffffffa05a8190>] ? gfs2_glock_nq+0x280/0x430 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308387]  [<ffffffffa059cef0>] gfs2_file_dealloc+0x10/0x20 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308393]  [<ffffffffa05bfdb3>] gfs2_evict_inode+0x2b3/0x3e0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308398]  [<ffffffffa05bfc13>] ? gfs2_evict_inode+0x113/0x3e0 [gfs2]
Feb 15 09:22:27 nebula3 kernel: [411282.308403]  [<ffffffff811d9a40>] evict+0xb0/0x1b0
Feb 15 09:22:27 nebula3 kernel: [411282.308406]  [<ffffffff811da255>] iput+0xf5/0x180
Feb 15 09:22:27 nebula3 kernel: [411282.308410]  [<ffffffff811cebae>] do_unlinkat+0x18e/0x2b0
Feb 15 09:22:27 nebula3 kernel: [411282.308415]  [<ffffffff811bbc06>] ? filp_close+0x56/0x70
Feb 15 09:22:27 nebula3 kernel: [411282.308418]  [<ffffffff811cfadb>] SyS_unlinkat+0x1b/0x40
Feb 15 09:22:27 nebula3 kernel: [411282.308421]  [<ffffffff8173635d>] system_call_fastpath+0x1a/0x1f
Feb 15 09:22:27 nebula3 kernel: [411282.308424] GFS2: fsid=yggdrasil:datastores.1: fatal: filesystem consistency error
Feb 15 09:22:27 nebula3 kernel: [411282.308424] GFS2: fsid=yggdrasil:datastores.1:   RG = 202135143
Feb 15 09:22:27 nebula3 kernel: [411282.308424] GFS2: fsid=yggdrasil:datastores.1:   function = gfs2_setbit, file = /build/linux-OTIHGI/linux-3.13.0/fs/gfs2/rgrp.c, line = 103
Feb 15 09:22:27 nebula3 kernel: [411282.308426] GFS2: fsid=yggdrasil:datastores.1: about to withdraw this file system
Feb 15 09:22:27 nebula3 kernel: [411282.483258] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:27 nebula3 kernel: [411282.627372] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:27 nebula3 kernel: [411282.876874] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:27 nebula3 kernel: [411282.879708] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:28 nebula3 kernel: [411283.383218] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:28 nebula3 kernel: [411283.397423] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
Feb 15 09:22:28 nebula3 kernel: [411283.399253] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
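
A note on reading the traces above: in each gfs2_setbit withdrawal, the rgrp= value printed before the call trace (hexadecimal) and the RG = value in the consistency error (decimal) appear to be the same resource group address. The small Python sketch below only checks that arithmetic for the two incidents; the dictionary, its keys, and the function name are illustrative and not part of any GFS2 tool.

```python
# Values copied from the kernel log above: (rgrp= in hex, RG = in decimal).
# WITHDRAWALS and same_resource_group are hypothetical names for illustration.
WITHDRAWALS = {
    "Feb 10 09:08:08": ("0x3ff67bbd", 1073118141),
    "Feb 15 09:22:27": ("0xc0c5667", 202135143),
}


def same_resource_group(rgrp_hex: str, rg_decimal: int) -> bool:
    """Return True when the hex rgrp= value equals the decimal RG value."""
    return int(rgrp_hex, 16) == rg_decimal


for stamp, (rgrp_hex, rg_decimal) in WITHDRAWALS.items():
    print(stamp, rgrp_hex, rg_decimal, same_resource_group(rgrp_hex, rg_decimal))
```

Both pairs match, so within each incident the two messages refer to the same resource group, while the Feb 10 and Feb 15 incidents involve different ones.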

-- 
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster