Hello,

I have some Fedora Linux VMs (current versions, latest updates) in a virtual test infrastructure on VirtualBox. There I run different VMs with different filesystems (ext4, xfs, zfs, bcachefs and btrfs).

I had a problem with the underlying hardware: around 1000 4k blocks could no longer be read. I migrated the whole disk with ddrescue, which worked well.

Of course I was expecting some data loss in the VMs, but I wanted to get them into a consistent state.

The following filesystems were easily brought into a consistent state with their corresponding repair/scrub tools:
- ext4
- xfs
- zfs

Unfortunately, 2 filesystems can't be brought into a state where the filesystem repair tools report "everything fine" (of course with some data loss, but that's fine):
- btrfs
- bcachefs

commands run with bcachefs (git version):
git log -n1 | head -n1
commit 1e058db4b603f8992b781b4654b48221dd04407a
./bcachefs version
1.12.0

But bcachefs never got into a consistent state, even with newer versions. A check with an older version (1.7.0) also ran for a long time.

To reproduce the problem I created a new filesystem and copied some files there:
mkfs.bcachefs -f /dev/sdb
time cp -Rap /usr /mnt

Afterwards I created a (quick & dirty) script "corrupt_device.sh" to corrupt the device in the same manner as the original failure (1000 randomly chosen 4k blocks are overwritten with random data).
Script: see below

~/corrupt_device.sh
./bcachefs fsck -pf /dev/sdb
./bcachefs fsck -pfR /dev/sdb
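
To see whether repeated repair passes ever converge, the fsck runs above could be wrapped in a small loop. This is only a sketch: repeat_until_clean is a hypothetical helper name, and it assumes that the repair command exits with status 0 once it finds nothing left to fix (the usual fsck convention; I have not verified that bcachefs fsck follows it in every version).

```shell
#!/usr/bin/env bash
# Sketch: rerun a repair command until it exits 0 or max_passes is reached.
# repeat_until_clean is a hypothetical helper, not part of bcachefs-tools.
# Assumption: exit status 0 means "no errors found".
repeat_until_clean() {
  local max_passes=$1 pass=1 status
  shift
  while [ "$pass" -le "$max_passes" ]; do
    "$@"                       # run the repair command as given
    status=$?
    echo "pass ${pass}: exit status ${status}"
    if [ "$status" -eq 0 ]; then
      return 0                 # converged: command reported a clean run
    fi
    pass=$((pass + 1))
  done
  return 1                     # still not clean after max_passes runs
}
```

Example call: repeat_until_clean 5 ./bcachefs fsck -pf /dev/sdb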

Result: the problem is reproducible. bcachefs can't be brought into a consistent state even after several runs of the repair.

You can also try to reproduce it and create a test case out of it.

Any ideas on how to repair it and what can be done to get it into a consistent state?

Thanks.

Ciao,
Gerhard

Script corrupt_device.sh:
#!/usr/bin/env bash

RANDOM_DEVICE=/dev/urandom
OUTPUT_DEVICE=/dev/sdb
COUNT=1000
BLOCK_SIZE=4096

# Size of the device in bytes
MAX_BLOCK_SIZE=$(blockdev --getsize64 "${OUTPUT_DEVICE}")

echo "# Configured maximum size=${MAX_BLOCK_SIZE}"
# Highest valid block number (blocks are numbered from 0)
MAX_BLOCK_NUMBER=$((MAX_BLOCK_SIZE/BLOCK_SIZE - 1))
echo "# Maximum block number=${MAX_BLOCK_NUMBER}"

# Overwrite COUNT randomly chosen blocks with random data
for ((BLOCK_NUMBER=1; BLOCK_NUMBER<=COUNT; BLOCK_NUMBER++)); do
  BLOCK=$(shuf --input-range=0-"${MAX_BLOCK_NUMBER}" --head-count=1)
  dd if="${RANDOM_DEVICE}" of="${OUTPUT_DEVICE}" bs="${BLOCK_SIZE}" seek="${BLOCK}" count=1 > /dev/null 2>&1
done
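
For a test case it may be safer to corrupt a regular image file instead of a real device, and to record which blocks were hit so a failing run can be compared with later ones. A minimal sketch, assuming GNU coreutils (stat -c, shuf, dd with status=none); corrupt_image and its arguments are hypothetical names, not part of the script above:

```shell
#!/usr/bin/env bash
# Sketch: overwrite <count> distinct random blocks of an image file and
# print the block numbers that were overwritten, one per line.
# corrupt_image is a hypothetical helper; it assumes GNU coreutils.
corrupt_image() {
  local image=$1 count=$2 block_size=${3:-4096}
  local size blocks block
  size=$(stat -c %s "$image")            # image size in bytes
  blocks=$((size / block_size))          # valid block numbers: 0..blocks-1
  # shuf without --repeat never picks the same block twice
  shuf --input-range=0-$((blocks - 1)) --head-count="$count" |
  while read -r block; do
    # conv=notrunc keeps the image at its original size
    dd if=/dev/urandom of="$image" bs="$block_size" seek="$block" \
       count=1 conv=notrunc status=none
    echo "$block"
  done
}
```

Example call: corrupt_image test.img 1000 > corrupted_blocks.txt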

