Daniel Dehennin <daniel.dehen...@baby-gnu.org> writes:

[...]

> We are using 3.1.6-0ubuntu1.
>
> Running an fsck is quite expensive for us, 4 hours with the shared FS
> unusable.
>
> I forgot to say that it stores qcow2 images, so there should be no
> concurrency on the file system, except in some directories when
> creating or accessing subdirectories:
>
>     <GFS2 mount point>/<DIRECTORY OF RUNNING VMs>/<VM ID>/<QCOW2 images>
>
> Only the <DIRECTORY OF RUNNING VMs> should have concurrent write
> accesses, everything under <VM ID> is accessed only by one node at a
> time, except for monitoring which is read only.
>
> So “looks like it is trying to free a block that is already marked as
> being free” looks strange.

The kernel has now logged a warning as well, in case it helps:

Feb 15 14:13:07 nebula3 kernel: [16423.261927] ------------[ cut here ]------------
Feb 15 14:13:07 nebula3 kernel: [16423.261943] WARNING: CPU: 8 PID: 4410 at /build/linux-OTIHGI/linux-3.13.0/mm/page_alloc.c:1604 get_page_from_freelist+0x924/0x930()
Feb 15 14:13:07 nebula3 kernel: [16423.261945] Modules linked in: vhost_net vhost macvtap macvlan gfs2 dlm sctp configfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables dm_round_robin openvswitch gre vxlan ip_tunnel nfsd auth_rpcgss nfs_acl nfs lockd sunrpc fscache bonding x86_pkg_temp_thermal intel_powerclamp ipmi_devintf gpio_ich coretemp dcdbas kvm_intel kvm crct10dif_pclmul crc32_pclmul aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd dm_multipath joydev scsi_dh mei_me shpchp mei sb_edac ipmi_si edac_core lpc_ich acpi_power_meter mac_hid wmi iTCO_wdt iTCO_vendor_support ses enclosure hid_generic qla2xxx usbhid hid ahci scsi_transport_fc libahci bnx2x tg3 megaraid_sas ptp scsi_tgt pps_core mdio libcrc32c
Feb 15 14:13:07 nebula3 kernel: [16423.262017] CPU: 8 PID: 4410 Comm: rm Not tainted 3.13.0-78-generic #122-Ubuntu
Feb 15 14:13:07 nebula3 kernel: [16423.262019] Hardware name: Dell Inc. PowerEdge M620/0T36VK, BIOS 2.2.7 01/21/2014
Feb 15 14:13:07 nebula3 kernel: [16423.262022]  0000000000000009 ffff882e5f9f7820 ffffffff81725768 0000000000000000
Feb 15 14:13:07 nebula3 kernel: [16423.262028]  ffff882e5f9f7858 ffffffff810678bd 0000000000000004 00000000000035de
Feb 15 14:13:07 nebula3 kernel: [16423.262033]  0000000000000001 ffff88187fffbf00 0000000000000000 ffff882e5f9f7868
Feb 15 14:13:07 nebula3 kernel: [16423.262037] Call Trace:
Feb 15 14:13:07 nebula3 kernel: [16423.262046]  [<ffffffff81725768>] dump_stack+0x45/0x56
Feb 15 14:13:07 nebula3 kernel: [16423.262052]  [<ffffffff810678bd>] warn_slowpath_common+0x7d/0xa0
Feb 15 14:13:07 nebula3 kernel: [16423.262056]  [<ffffffff8106799a>] warn_slowpath_null+0x1a/0x20
Feb 15 14:13:07 nebula3 kernel: [16423.262060]  [<ffffffff81159134>] get_page_from_freelist+0x924/0x930
Feb 15 14:13:07 nebula3 kernel: [16423.262091]  [<ffffffff8101289e>] ? __switch_to+0x3fe/0x4d0
Feb 15 14:13:07 nebula3 kernel: [16423.262096]  [<ffffffff811592c4>] __alloc_pages_nodemask+0x184/0xb80
Feb 15 14:13:07 nebula3 kernel: [16423.262102]  [<ffffffff8114f86e>] ? find_get_page+0x1e/0xa0
Feb 15 14:13:07 nebula3 kernel: [16423.262111]  [<ffffffff8114fe00>] ? find_lock_page+0x30/0x70
Feb 15 14:13:07 nebula3 kernel: [16423.262115]  [<ffffffff81150404>] ? find_or_create_page+0x34/0x90
Feb 15 14:13:07 nebula3 kernel: [16423.262125]  [<ffffffff8136aa2e>] ? radix_tree_lookup_slot+0xe/0x10
Feb 15 14:13:07 nebula3 kernel: [16423.262134]  [<ffffffff81198153>] alloc_pages_current+0xa3/0x160
Feb 15 14:13:07 nebula3 kernel: [16423.262144]  [<ffffffff8115432e>] __get_free_pages+0xe/0x50
Feb 15 14:13:07 nebula3 kernel: [16423.262157]  [<ffffffff8117125e>] kmalloc_order_trace+0x2e/0xa0
Feb 15 14:13:07 nebula3 kernel: [16423.262170]  [<ffffffff810ab0f5>] ? wake_up_bit+0x25/0x30
Feb 15 14:13:07 nebula3 kernel: [16423.262177]  [<ffffffff811a3301>] __kmalloc+0x211/0x230
Feb 15 14:13:07 nebula3 kernel: [16423.262192]  [<ffffffffa05c15f6>] gfs2_rlist_alloc+0x26/0x70 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262199]  [<ffffffffa059cd5d>] recursive_scan+0x29d/0x6a0 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262206]  [<ffffffffa059cf2c>] recursive_scan+0x46c/0x6a0 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262217]  [<ffffffffa05bb4f5>] ? gfs2_quota_hold+0x175/0x1f0 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262224]  [<ffffffffa059d25a>] trunc_dealloc+0xfa/0x120 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262232]  [<ffffffffa05a898e>] ? gfs2_glock_wait+0x3e/0x80 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262240]  [<ffffffffa05aa190>] ? gfs2_glock_nq+0x280/0x430 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262247]  [<ffffffffa059eef0>] gfs2_file_dealloc+0x10/0x20 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262257]  [<ffffffffa05c1db3>] gfs2_evict_inode+0x2b3/0x3e0 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262276]  [<ffffffffa05c1c13>] ? gfs2_evict_inode+0x113/0x3e0 [gfs2]
Feb 15 14:13:07 nebula3 kernel: [16423.262286]  [<ffffffff811d9a40>] evict+0xb0/0x1b0
Feb 15 14:13:07 nebula3 kernel: [16423.262290]  [<ffffffff811da255>] iput+0xf5/0x180
Feb 15 14:13:07 nebula3 kernel: [16423.262296]  [<ffffffff811cebae>] do_unlinkat+0x18e/0x2b0
Feb 15 14:13:07 nebula3 kernel: [16423.262305]  [<ffffffff811bbc06>] ? filp_close+0x56/0x70
Feb 15 14:13:07 nebula3 kernel: [16423.262310]  [<ffffffff811cfadb>] SyS_unlinkat+0x1b/0x40
Feb 15 14:13:07 nebula3 kernel: [16423.262315]  [<ffffffff8173635d>] system_call_fastpath+0x1a/0x1f
Feb 15 14:13:07 nebula3 kernel: [16423.262318] ---[ end trace 346ccba5c58117dc ]---
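
To make a trace like this easier to read, here is a minimal Python sketch that lists the stack frames with the module each one belongs to. It assumes the syslog format shown above, and it also copes with records that a mail client has wrapped across lines. The names `rejoin` and `frames` are illustrative only and do not come from any existing tool. Frames that the kernel prints with a leading "?" are addresses found on the stack that may not belong to the real call chain, so the sketch flags them as unreliable.

```python
import re

# A log record starts with a syslog timestamp such as "Feb 15 14:13:07 ".
RECORD_START = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d ")
# A stack frame looks like "[<addr>] [? ]symbol+0xOFF/0xSIZE [module]".
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\? )?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(\w+)\])?"
)


def rejoin(lines):
    """Merge wrapped continuation lines back into one record per entry."""
    records = []
    for line in lines:
        line = line.strip("\n")
        if RECORD_START.match(line) or not records:
            records.append(line)
        else:
            # Continuation of the previous record: join with a single space.
            records[-1] = records[-1].rstrip() + " " + line.strip()
    return records


def frames(records):
    """Return (symbol, module, reliable) for each call-trace frame."""
    result = []
    for record in records:
        match = FRAME.search(record)
        if match:
            unreliable, symbol, module = match.groups()
            result.append((symbol, module or "kernel", unreliable is None))
    return result


# Three records copied from the trace above, in their wrapped form.
sample = [
    "Feb 15 14:13:07 nebula3 kernel: [16423.262177]  [<ffffffff811a3301>] ",
    "__kmalloc+0x211/0x230",
    "Feb 15 14:13:07 nebula3 kernel: [16423.262192]  [<ffffffffa05c15f6>] ",
    "gfs2_rlist_alloc+0x26/0x70 [gfs2]",
    "Feb 15 14:13:07 nebula3 kernel: [16423.262217]  [<ffffffffa05bb4f5>] ? ",
    "gfs2_quota_hold+0x175/0x1f0 [gfs2]",
]

for symbol, module, reliable in frames(rejoin(sample)):
    note = "" if reliable else " (unreliable)"
    print(f"{module:8} {symbol}{note}")
```

Filtering the result on `module == "gfs2"` and `reliable` leaves only the GFS2 frames that are definitely part of the call chain.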

Regards.
-- 
Daniel Dehennin
Get my GPG key: gpg --recv-keys 0xCC1E9E5B7A6FE2DF
Fingerprint: 3E69 014E 5C23 50E8 9ED6  2AAD CC1E 9E5B 7A6F E2DF

-- 
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
