Daniel Dehennin <daniel.dehen...@baby-gnu.org> writes:

> Thanks, I'm using 3.1.6.
>
> Tonight I'll build the version 3.1.8 from Git[1] and run “fsck.gfs2 -p” on
> the fs.

Hello,

I preferred to run fsck on the filesystem twice[1], rather than the
“gfs2_edit savemeta”:

1. “fsck.gfs2 -p <BLOCK DEVICE>” was quick
2. “fsck.gfs2 -f -p <BLOCK DEVICE>” took 4 hours

The cluster was brought back up afterwards and everything worked fine
until yesterday:

    Feb 18 19:13:22 nebula3 kernel: [293848.682606] GFS2: buf_blk = 0x2089 old_state=0, new_state=0
    Feb 18 19:13:22 nebula3 kernel: [293848.682612] GFS2: rgrp=0xc0c5667 bi_start=0x0
    Feb 18 19:13:22 nebula3 kernel: [293848.682614] GFS2: bi_offset=0x80 bi_len=0xf80
    Feb 18 19:13:22 nebula3 kernel: [293848.682619] CPU: 6 PID: 7057 Comm: kworker/6:8 Tainted: G        W     3.13.0-78-generic #122-Ubuntu
    Feb 18 19:13:22 nebula3 kernel: [293848.682621] Hardware name: Dell Inc. PowerEdge M620/0T36VK, BIOS 2.2.7 01/21/2014
    Feb 18 19:13:22 nebula3 kernel: [293848.682637] Workqueue: delete_workqueue delete_work_func [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682640]  000000000c0c7705 ffff8811256c59d8 ffffffff81725768 000000000c0c76f6
    Feb 18 19:13:22 nebula3 kernel: [293848.682648]  ffff8811256c5a30 ffffffffa05bebbf ffff880f5ffe9200 00000000a05c5977
    Feb 18 19:13:22 nebula3 kernel: [293848.682653]  ffff880f1ee574c8 0000000000002089 ffff882e8c622000 0000000000000010
    Feb 18 19:13:22 nebula3 kernel: [293848.682658] Call Trace:
    Feb 18 19:13:22 nebula3 kernel: [293848.682668]  [<ffffffff81725768>] dump_stack+0x45/0x56
    Feb 18 19:13:22 nebula3 kernel: [293848.682681]  [<ffffffffa05bebbf>] rgblk_free+0x1ff/0x230 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682693]  [<ffffffffa05c0f34>] __gfs2_free_blocks+0x34/0x120 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682700]  [<ffffffffa059d076>] recursive_scan+0x5b6/0x6a0 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682707]  [<ffffffffa059cf2c>] recursive_scan+0x46c/0x6a0 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682714]  [<ffffffff8133a0f1>] ? submit_bio+0x71/0x150
    Feb 18 19:13:22 nebula3 kernel: [293848.682720]  [<ffffffff811f6146>] ? bio_alloc_bioset+0x196/0x2a0
    Feb 18 19:13:22 nebula3 kernel: [293848.682727]  [<ffffffff811f11d0>] ? _submit_bh+0x150/0x200
    Feb 18 19:13:22 nebula3 kernel: [293848.682734]  [<ffffffffa059cf2c>] recursive_scan+0x46c/0x6a0 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682744]  [<ffffffffa05bb4f5>] ? gfs2_quota_hold+0x175/0x1f0 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682752]  [<ffffffffa059d25a>] trunc_dealloc+0xfa/0x120 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682760]  [<ffffffffa05a898e>] ? gfs2_glock_wait+0x3e/0x80 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682769]  [<ffffffffa05aa190>] ? gfs2_glock_nq+0x280/0x430 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682777]  [<ffffffffa059eef0>] gfs2_file_dealloc+0x10/0x20 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682787]  [<ffffffffa05c1db3>] gfs2_evict_inode+0x2b3/0x3e0 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682796]  [<ffffffffa05c1c13>] ? gfs2_evict_inode+0x113/0x3e0 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682802]  [<ffffffff811d9a40>] evict+0xb0/0x1b0
    Feb 18 19:13:22 nebula3 kernel: [293848.682807]  [<ffffffff811da255>] iput+0xf5/0x180
    Feb 18 19:13:22 nebula3 kernel: [293848.682815]  [<ffffffffa05a90ec>] delete_work_func+0x5c/0x90 [gfs2]
    Feb 18 19:13:22 nebula3 kernel: [293848.682822]  [<ffffffff81083cd2>] process_one_work+0x182/0x450
    Feb 18 19:13:22 nebula3 kernel: [293848.682827]  [<ffffffff81084ac1>] worker_thread+0x121/0x410
    Feb 18 19:13:22 nebula3 kernel: [293848.682832]  [<ffffffff810849a0>] ? rescuer_thread+0x430/0x430
    Feb 18 19:13:22 nebula3 kernel: [293848.682837]  [<ffffffff8108b8a2>] kthread+0xd2/0xf0
    Feb 18 19:13:22 nebula3 kernel: [293848.682841]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
    Feb 18 19:13:22 nebula3 kernel: [293848.682846]  [<ffffffff817362a8>] ret_from_fork+0x58/0x90
    Feb 18 19:13:22 nebula3 kernel: [293848.682850]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
    Feb 18 19:13:22 nebula3 kernel: [293848.682855] GFS2: fsid=yggdrasil:datastores.1: fatal: filesystem consistency error
    Feb 18 19:13:22 nebula3 kernel: [293848.682855] GFS2: fsid=yggdrasil:datastores.1:   RG = 202135143
    Feb 18 19:13:22 nebula3 kernel: [293848.682855] GFS2: fsid=yggdrasil:datastores.1:   function = gfs2_setbit, file = /build/linux-OTIHGI/linux-3.13.0/fs/gfs2/rgrp.c, line = 103
    Feb 18 19:13:22 nebula3 kernel: [293848.682859] GFS2: fsid=yggdrasil:datastores.1: about to withdraw this file system
    Feb 18 19:13:22 nebula3 kernel: [293848.699050] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
    Feb 18 19:13:22 nebula3 kernel: [293848.705401] GFS2: fsid=yggdrasil:datastores.1: dirty_inode: glock -5
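
A side note for anyone reading the trace: the “rgrp=” value near the top
and the “RG =” value in the consistency error look like the same number,
once in hex and once in decimal. A minimal Python sketch to check that,
assuming nothing beyond the two log fragments copied from above with
their wrapped lines rejoined (the variable names are only for
illustration):

```python
import re

# Two fragments copied from the kernel log above (wrapped lines rejoined).
log = """\
GFS2: rgrp=0xc0c5667 bi_start=0x0
GFS2: fsid=yggdrasil:datastores.1:   RG = 202135143
"""

# Hex value from the "rgrp=" line, decimal value from the "RG =" line.
rgrp_hex = int(re.search(r"rgrp=(0x[0-9a-f]+)", log).group(1), 16)
rg_dec = int(re.search(r"RG = (\d+)", log).group(1))

print(hex(rg_dec))         # decimal RG value rendered as hex
print(rgrp_hex == rg_dec)  # True if both lines name the same number
```

If the comparison prints True, both messages carry the same value, so
that is the number to look for in the fsck logs or the savemeta output.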

Now, the “always faulty node” is down and I'm doing the
“gfs2_edit savemeta” from the other node.

I'm wondering whether I should upgrade the kernels to a version much
newer than 3.13.0.

My Ubuntu Trusty has kernels available up to 4.2.0.

Regards.

Footnotes: 
[1]  The logs are attached to this email

-- 
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF

-- 
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
