Public bug reported:
I have reproduced this multiple times on both Ubuntu 19.10 (5.3.0-16)
and Ubuntu 20.04 beta (5.4.0-24)
sudo duperemove --lookup-extents=yes -d -r --hashfile /mnt/storage/hashes
[...]
0x556dfa962980] Dedupe 2 extents (id: b285a8ff) with target: (0, 2255716352),
"/mnt/storage/from-restic/10350278/tmp/fast/c14-sd-92016-2-home/2017-06-17-233341/2017-06-17-233341/edwin/.mu/xapian/postlist.DB"
[0x556de5dc7300] (154238/154243) Try to dedupe extents with id 3fc7b31b
[0x556de5dc7300] Dedupe 3 extents (id: 3fc7b31b) with target: (0, 2463842304),
"/mnt/storage/restore/25147310-89a6-4a93-9cd8-a582125c12e7/home/edwin/Downloads/ubuntu-19.10-desktop-amd64.iso"
[0x556dfa962860] (154239/154243) Try to dedupe extents with id f5615456
[0x556dfa962860] Dedupe 1 extents (id: f5615456) with target: (0, 5477811089),
"/mnt/storage/restore/25147310-89a6-4a93-9cd8-a582125c12e7/home/edwin/nokia/backup.ab"
[0x556dfa92bb00] (154240/154243) Try to dedupe extents with id a9595eaa
[0x556dfa92bb00] Dedupe 1 extents (id: a9595eaa) with target: (0, 11050147840),
"/mnt/storage/restore/25147310-89a6-4a93-9cd8-a582125c12e7/home/edwin/laptop-backup.squashfs"
[0x556dfa962920] (154241/154243) Try to dedupe extents with id 229145cb
[0x556dfa962920] Dedupe 1 extents (id: 229145cb) with target: (0, 13606766116),
"/mnt/storage/from-restic/0024a369/mnt/fast/listing-sorted0"
[0x556de5dc7180] (154242/154243) Try to dedupe extents with id 2eee56f0
[0x556de5dc7180] Dedupe 2 extents (id: 2eee56f0) with target: (0, 13606766163),
"/mnt/storage/from-restic/0024a369/mnt/fast/listing-sorted"
[0x556dfa92bb60] (154243/154243) Try to dedupe extents with id b6a1de37
[0x556dfa92bb60] Dedupe 1 extents (id: b6a1de37) with target: (0, 13890395789),
"/mnt/storage/from-restic/0024a369/mnt/fast/listing0"
[0x556dfa962980] Dedupe for file
"/mnt/storage/from-restic/0fcf01c9/tmp/fast/c14-sd-92016-2-home/2017-05-23-233346/2017-05-23-233346/edwin/.mu/xapian/postlist.DB"
had status (-5) "Input/output error".
[0x556dfa962980] Dedupe for file
"/mnt/storage/from-restic/04683ed4/tmp/fast/c14-sd-92016-2-home/2017-06-07-233355/2017-06-07-233355/edwin/.mu/xapian/postlist.DB"
had status (-5) "Input/output error".
Error 5: Input/output error while opening
"/mnt/storage/from-restic/10350278/tmp/fast/c14-sd-92016-2-home/2017-06-17-233341/2017-06-17-233341/edwin/.mu/xapian/postlist.DB"
(write=0)
Error 5: Input/output error while opening
"/mnt/storage/from-restic/04683ed4/tmp/fast/c14-sd-92016-2-home/2017-06-07-233355/2017-06-07-233355/edwin/.mu/xapian/postlist.DB"
(write=0)
Error 5: Input/output error while opening
"/mnt/storage/from-restic/0fcf01c9/tmp/fast/c14-sd-92016-2-home/2017-05-23-233346/2017-05-23-233346/edwin/.mu/xapian/postlist.DB"
(write=0)
[0x556de5dc7240] Dedupe for file
"/mnt/storage/restore/root.squashfs/home/edwin/cycles_benchmark_20160228.zip"
had status (-5) "Input/output error".
[0x556de5dc7240] Dedupe for file
"/mnt/storage/restore/root.squashfs/home/edwin/.phoronix-test-suite/installed-tests/pts/blender-1.4.1/cycles_benchmark_20160228.zip"
had status (-5) "Input/output error".
[0x556de5dc7240] Dedupe for file
"/mnt/storage/restore/root.squashfs/home/edwin/.phoronix-test-suite/installed-tests/pts/blender-1.7.0/cycles_benchmark_20160228.zip"
had status (-5) "Input/output error".
[0x556dfa9629e0] Dedupe for file
"/mnt/storage/from-restic/0024a369/mnt/fast/photos/Camera/VID_20180825_121050.mp4"
had status (-5) "Input/output error".
[...]
The kernel shows that XFS data structures got corrupted and it unmounted
itself to protect it from further damage:
[104763.870005] INFO: task pool:97746 blocked for more than 724 seconds.
[104763.870008] Tainted: G W 5.4.0-24-generic #28-Ubuntu
[104763.870009] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
this message.
[104763.870011] pool D 0 97746 47956 0x00000000
[104763.870013] Call Trace:
[104763.870019] __schedule+0x2e3/0x740
[104763.870021] schedule+0x42/0xb0
[104763.870022] rwsem_down_read_slowpath+0x16c/0x4a0
[104763.870051] ? xfs_vn_fiemap+0x32/0x80 [xfs]
[104763.870052] down_read+0x85/0xa0
[104763.870071] xfs_ilock+0x8a/0x100 [xfs]
[104763.870088] xfs_vn_fiemap+0x32/0x80 [xfs]
[104763.870090] do_vfs_ioctl+0x2c0/0x670
[104763.870091] ksys_ioctl+0x67/0x90
[104763.870093] __x64_sys_ioctl+0x1a/0x20
[104763.870094] do_syscall_64+0x57/0x190
[104763.870096] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[104763.870097] RIP: 0033:0x7fe0d575637b
[104763.870101] Code: Bad RIP value.
[104763.870102] RSP: 002b:00007fe0c9441bf8 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[104763.870103] RAX: ffffffffffffffda RBX: 00007fe090006370 RCX:
00007fe0d575637b
[104763.870104] RDX: 00007fe0c9441c30 RSI: 00000000c020660b RDI:
0000000000000012
[104763.870104] RBP: 0000556debad9350 R08: 0000000000000000 R09:
00007fe09004bf28
[104763.870105] R10: 0000000000000000 R11: 0000000000000246 R12:
00007fe09004bf28
[104763.870105] R13: 0000000000000000 R14: 0000556de3ef0b46 R15:
0000000012c60000
[108129.210852] XFS (dm-4): Internal error xfs_trans_cancel at line 1048 of
file fs/xfs/xfs_trans.c. Caller xfs_reflink_remap_extent+0x100/0x560 [xfs]
[108129.210856] CPU: 15 PID: 97752 Comm: pool Tainted: G W
5.4.0-24-generic #28-Ubuntu
[108129.210857] Hardware name: System manufacturer System Product Name/TUF
GAMING X570-PLUS, BIOS 1405 11/19/2019
[108129.210858] Call Trace:
[108129.210862] dump_stack+0x6d/0x9a
[108129.210883] xfs_error_report+0x36/0x40 [xfs]
[108129.210902] ? xfs_reflink_remap_extent+0x100/0x560 [xfs]
[108129.210920] xfs_trans_cancel+0x134/0x160 [xfs]
[108129.210937] xfs_reflink_remap_extent+0x100/0x560 [xfs]
[108129.210954] ? xfs_bmapi_read+0x202/0x340 [xfs]
[108129.210971] xfs_reflink_remap_blocks+0x11e/0x300 [xfs]
[108129.210989] xfs_file_remap_range+0x110/0x2c0 [xfs]
[108129.210992] vfs_dedupe_file_range_one+0x12d/0x150
[108129.210993] vfs_dedupe_file_range+0x156/0x1e0
[108129.210994] do_vfs_ioctl+0x340/0x670
[108129.210996] ksys_ioctl+0x67/0x90
[108129.210997] __x64_sys_ioctl+0x1a/0x20
[108129.210998] do_syscall_64+0x57/0x190
[108129.211000] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[108129.211001] RIP: 0033:0x7fe0d575637b
[108129.211003] Code: 0f 1e fa 48 8b 05 15 3b 0d 00 64 c7 00 26 00 00 00 48 c7
c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d e5 3a 0d 00 f7 d8 64 89 01 48
[108129.211004] RSP: 002b:00007fe089ffac38 EFLAGS: 00000246 ORIG_RAX:
0000000000000010
[108129.211005] RAX: ffffffffffffffda RBX: 00007fe07004adc0 RCX:
00007fe0d575637b
[108129.211006] RDX: 00007fe07000c690 RSI: 00000000c0189436 RDI:
0000000000000017
[108129.211006] RBP: 00007fe070084920 R08: 0000000000000002 R09:
00007fe070084970
[108129.211007] R10: 0000000000000000 R11: 0000000000000246 R12:
00007fe070084958
[108129.211007] R13: 00007fe070084960 R14: 00007fe070084970 R15:
00007fe07000c690
[108129.230787] XFS (dm-4): xfs_do_force_shutdown(0x8) called from line 1049 of
file fs/xfs/xfs_trans.c. Return address = 00000000e903a147
[108129.230789] XFS (dm-4): Corruption of in-memory data detected. Shutting
down filesystem
[108129.230790] XFS (dm-4): Please unmount the filesystem and rectify the
problem(s)
[124069.735928] XFS (dm-4): Unmounting Filesystem
[124073.911647] XFS (dm-4): Mounting V5 Filesystem
[124074.026243] XFS (dm-4): Starting recovery (logdev: internal)
[124074.382784] XFS (dm-4): Ending recovery (logdev: internal)
[124074.632650] xfs filesystem being mounted at /mnt/storage supports
timestamps until 2038 (0x7fffffff)
[124076.224530] XFS (dm-4): Unmounting Filesystem
I've got a filesystem where I've unpacked some of my Restic backups to
test the feasibility of using XFS+reflinks as a replacement for Restic.
xfs_info /mnt/storage:
meta-data=/dev/mapper/storage-backup isize=512 agcount=4, agsize=183141376
blks
= sectsz=4096 attr=2, projid32bit=1
= crc=1 finobt=1, sparse=1, rmapbt=0
= reflink=1
data = bsize=4096 blocks=732565504, imaxpct=5
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0, ftype=1
log =internal log bsize=4096 blocks=357698, version=2
= sectsz=4096 sunit=1 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
Initially I had a write-through lvmcache device in front of the above, but I
was able to reproduce it even after removing it and running xfs_repair and
repeating the operation.
Takes about a day or so to reproduce the issue by running duperemove again, but
so far it reproduces 100%.
The latest xfs_repair output looks like this (initially it had a lot more
'clearing reflink flag on inode N' output:
sudo xfs_repair /dev/mapper/storage-backup 2>&1| tee dedupe-bug/dmesg-5
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- check for inodes claiming duplicate blocks...
- agno = 3
- agno = 0
- agno = 2
- agno = 1
clearing reflink flag on inodes when possible
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- traversing filesystem ...
- traversal finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
I'll also report this upstream but wanted to report here first.
ProblemType: Bug
DistroRelease: Ubuntu 20.04
Package: linux-image-5.4.0-24-generic 5.4.0-24.28
ProcVersionSignature: Ubuntu 5.4.0-24.28-generic 5.4.30
Uname: Linux 5.4.0-24-generic x86_64
ApportVersion: 2.20.11-0ubuntu27
Architecture: amd64
AudioDevicesInUse:
USER PID ACCESS COMMAND
/dev/snd/controlC1: edwin 2183 F.... pulseaudio
/dev/snd/controlC2: edwin 2183 F.... pulseaudio
/dev/snd/controlC0: edwin 2183 F.... pulseaudio
CasperMD5CheckResult: skip
CurrentDesktop: ubuntu:GNOME
Date: Sat Apr 18 10:30:00 2020
InstallationDate: Installed on 2020-03-26 (22 days ago)
InstallationMedia: Ubuntu 19.10 "Eoan Ermine" - Release amd64 (20191017)
IwConfig:
enp6s0 no wireless extensions.
lo no wireless extensions.
MachineType: System manufacturer System Product Name
ProcFB: 0 amdgpudrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.4.0-24-generic
root=/dev/mapper/storage-ubuntu--1910--root ro quiet splash audit=0 vt.handoff=7
RelatedPackageVersions:
linux-restricted-modules-5.4.0-24-generic N/A
linux-backports-modules-5.4.0-24-generic N/A
linux-firmware 1.187
RfKill:
SourcePackage: linux
UpgradeStatus: Upgraded to focal on 2020-04-16 (1 days ago)
dmi.bios.date: 11/19/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 1405
dmi.board.asset.tag: Default string
dmi.board.name: TUF GAMING X570-PLUS
dmi.board.vendor: ASUSTeK COMPUTER INC.
dmi.board.version: Rev X.0x
dmi.chassis.asset.tag: Default string
dmi.chassis.type: 3
dmi.chassis.vendor: Default string
dmi.chassis.version: Default string
dmi.modalias:
dmi:bvnAmericanMegatrendsInc.:bvr1405:bd11/19/2019:svnSystemmanufacturer:pnSystemProductName:pvrSystemVersion:rvnASUSTeKCOMPUTERINC.:rnTUFGAMINGX570-PLUS:rvrRevX.0x:cvnDefaultstring:ct3:cvrDefaultstring:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: System Product Name
dmi.product.sku: SKU
dmi.product.version: System Version
dmi.sys.vendor: System manufacturer
** Affects: linux (Ubuntu)
Importance: Undecided
Status: New
** Tags: amd64 apport-bug focal
** Attachment added: "kern.log"
https://bugs.launchpad.net/bugs/1873555/+attachment/5356403/+files/kern.log
--
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1873555
Title:
linux 5.3.0-46 and 5.4.0-24: XFS with reflinks and duperemove: in-
memory data corruption
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1873555/+subscriptions
--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs