Hi,
Sorry to resurrect an old issue, but I've just come across the same (or
very similar-looking) problem. I'm also on an OpenStack Swift storage
node with lots of small writes to SSDs, as in the original report, running
Debian Stretch in our case with the following kernel:
Linux swift-storage-1 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux
Kernel logs said:
[4769736.560752] XFS (sdc1): Metadata corruption detected at xfs_attr3_leaf_write_verify+0xe8/0x100 [xfs], xfs_attr3_leaf block 0xe7dd89b0
[4769736.563285] XFS (sdc1): Unmount and run xfs_repair
[4769736.564554] XFS (sdc1): First 64 bytes of corrupted metadata buffer:
[4769736.565818] ffff960ab1d0d000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00  ................
[4769736.567064] ffff960ab1d0d010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00  ..... ..........
[4769736.568272] ffff960ab1d0d020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[4769736.569446] ffff960ab1d0d030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[4769736.570611] XFS (sdc1): xfs_do_force_shutdown(0x8) called from line 1339 of file /build/linux-YDazDa/linux-4.9.82/fs/xfs/xfs_buf.c. Return address = 0xffffffffc06c1ada
[4769736.573226] XFS (sdc1): Corruption of in-memory data detected. Shutting down filesystem
[4769736.574419] XFS (sdc1): Please umount the filesystem and rectify the problem(s)
As per the message, I unmounted the filesystem and ran xfs_repair on it. The
first run of xfs_repair told me to mount the filesystem to replay the log,
which I did. I then unmounted it and ran xfs_repair again:
~$ sudo xfs_repair /dev/sdc1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
The filesystem now seems to be back up and running OK. Is there any more
information I could provide to help track down this issue?
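In case it helps anyone reading the dump, here is a rough decode of the
first 32 bytes of the corrupted buffer from the kernel log above. It is only
a sketch: it assumes the non-CRC (v4) attribute leaf header layout (forw,
back, magic, pad, count, usedbytes, firstused, holes, pad1, then three
base/size freemap entries, all big-endian), and the function name
decode_attr_leaf_hdr is made up for illustration.

```python
import struct

# First 32 bytes of the buffer as printed in the kernel log
# (the remaining 32 bytes of the dump are all zero).
DUMP_HEX = (
    "00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00 "
    "10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00"
)

# ASSUMPTION: v4 (non-CRC) attr leaf header, big-endian on disk:
#   forw, back              : 2 x u32
#   magic, pad              : 2 x u16
#   count, usedbytes,
#   firstused               : 3 x u16
#   holes, pad1             : 2 x u8
#   freemap[3] (base, size) : 6 x u16
HDR = struct.Struct(">IIHHHHHBB6H")


def decode_attr_leaf_hdr(hex_text):
    """Decode the assumed 32-byte attr leaf header from a hex string."""
    raw = bytes.fromhex(hex_text)
    fields = HDR.unpack(raw[:HDR.size])
    forw, back, magic, _pad, count, usedbytes, firstused, holes, _pad1 = fields[:9]
    maps = fields[9:]
    return {
        "forw": forw,
        "back": back,
        "magic": magic,
        "count": count,
        "usedbytes": usedbytes,
        "firstused": firstused,
        "holes": holes,
        # Pair up (base, size) for the three freemap slots.
        "freemap": [(maps[i], maps[i + 1]) for i in range(0, 6, 2)],
    }


if __name__ == "__main__":
    hdr = decode_attr_leaf_hdr(DUMP_HEX)
    print("magic=0x%04x count=%d firstused=%d freemap=%s"
          % (hdr["magic"], hdr["count"], hdr["firstused"], hdr["freemap"]))
```

Under that assumed layout, the magic is 0xfbee, the entry count is 0,
firstused is 4096 and the first freemap entry is (32, 4064). That would make
the block an empty attr leaf, though I can't say from this alone whether
that is what the write verifier objected to.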
Thanks,
Chris
--
https://bugs.launchpad.net/bugs/1596550
Title:
Metadata corruption detected at xfs_attr3_leaf_write_verify+0xd7/0xf0
Status in linux package in Ubuntu:
Confirmed
Bug description:
We noticed an XFS metadata corruption when we ran a lot of small write
IOs on SSDs in our OpenStack Swift environment:
[1468860.211158] XFS (sdax): Metadata corruption detected at xfs_attr3_leaf_write_verify+0xd7/0xf0 [xfs], block 0x7c99480
[1468860.211195] XFS (sdax): Unmount and run xfs_repair
[1468860.211215] XFS (sdax): First 64 bytes of corrupted metadata buffer:
[1468860.211247] ffff880630f66000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00  ................
[1468860.211268] ffff880630f66010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00  ..... ..........
[1468860.211289] ffff880630f66020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[1468860.211309] ffff880630f66030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[1468860.211328] XFS (sdax): xfs_do_force_shutdown(0x8) called from line 1254 of file /build/linux-lts-xenial-7RlTta/linux-lts-xenial-4.4.0/fs/xfs/xfs_buf.c. Return address = 0xffffffffc068f616
[1468860.212214] XFS (sdax): Corruption of in-memory data detected. Shutting down filesystem
[1468860.212232] XFS (sdax): Please umount the filesystem and rectify the problem(s)
[1468860.212323] XFS (sdax): xfs_do_force_shutdown(0x1) called from line 315 of file /build/linux-lts-xenial-7RlTta/linux-lts-xenial-4.4.0/fs/xfs/xfs_trans_buf.c. Return address = 0xffffffffc06bdda2
[1468860.261436] XFS (sdax): xfs_log_force: error -5 returned.
This error is reported with linux-generic-lts-xenial @4.4.0.22.12 on an
XFS filesystem formatted with an inode size of 1024 bytes and mounted with
rw,noatime,nodiratime,attr2,nobarrier,inode64,logbufs=8,sunit=512,swidth=512,noquota
For us this issue seems to be reproducible after several hours of stress
testing.
cat /proc/version_signature
Ubuntu 4.4.0-22.40~14.04.1-generic 4.4.8
Description: Ubuntu 14.04.3 LTS
Release: 14.04
---
AlsaDevices:
total 0
crw-rw---- 1 root audio 116, 1 Jun 10 13:19 seq
crw-rw---- 1 root audio 116, 33 Jun 10 13:19 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.14.1-0ubuntu3.11
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq',
'/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
DistroRelease: Ubuntu 14.04
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL380 Gen9
Package: linux (not installed)
PciMultimedia:
ProcEnviron:
TERM=screen
PATH=(custom, no user)
XDG_RUNTIME_DIR=<set>
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcFB:
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-22-generic
root=/dev/mapper/lxc-root00 ro biosdevname=1 net.ifnames=0
usbcore.autosuspend=-1 vga=normal nomodeset nomdmonddf nomdmonisw
crashkernel=1024M-:128M
ProcVersionSignature: Ubuntu 4.4.0-22.40~14.04.1-generic 4.4.8
RelatedPackageVersions:
linux-restricted-modules-4.4.0-22-generic N/A
linux-backports-modules-4.4.0-22-generic N/A
linux-firmware 1.127.15
RfKill: Error: [Errno 2] No such file or directory
Tags: trusty
Uname: Linux 4.4.0-22-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:
_MarkForUpload: True
dmi.bios.date: 07/20/2015
dmi.bios.vendor: HP
dmi.bios.version: P89
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias:
dmi:bvnHP:bvrP89:bd07/20/2015:svnHP:pnProLiantDL380Gen9:pvr:cvnHP:ct23:cvr:
dmi.product.name: ProLiant DL380 Gen9
dmi.sys.vendor: HP