Re: Question: raid1 behaviour on failure
Satoru Takeuchi wrote on 2016/04/22 11:21 +0900:
On 2016/04/21 20:58, Qu Wenruo wrote:
On 04/21/2016 03:45 PM, Satoru Takeuchi wrote:
On 2016/04/21 15:23, Satoru Takeuchi wrote:
On 2016/04/20 14:17, Matthias Bodenbinder wrote:
On 18.04.2016 at 09:22, Qu Wenruo wrote:
BTW, it would be better to post the dmesg for better debug.

So here we go. I did the same test again. Here is a full log of what I did. It seems to be a bug in btrfs.

Sequence of events:
1. mount the raid1 (2 discs with different sizes)
2. unplug the biggest drive (hotplug)
3. try to copy something to the degraded raid1
4. plug the device in again (hotplug)

This scenario does not work. The disc array is NOT redundant! I can not work with it while a drive is missing, and I can not reattach the device so that everything works again. The btrfs module crashes during the test.

I am using LMDE2 with backports:
btrfs-tools 4.4-1~bpo8+1
linux-image-4.4.0-0.bpo.1-amd64

Matthias

rakete - root - /root
1# mount /mnt/raid1/

Journal:
Apr 20 07:01:16 rakete kernel: BTRFS info (device sdi): enabling auto defrag
Apr 20 07:01:16 rakete kernel: BTRFS info (device sdi): disk space caching is enabled
Apr 20 07:01:16 rakete kernel: BTRFS: has skinny extents

rakete - root - /mnt/raid1
3# ll
insgesamt 0
drwxrwxr-x 1 root root   36 Nov 14  2014 AfterShot2(64-bit)
drwxrwxr-x 1 root root 5082 Apr 17 09:06 etc
drwxr-xr-x 1 root root  108 Mär 24 07:31 var

4# btrfs fi show
Label: none  uuid: 16d5891f-5d52-4b29-8591-588ddf11e73d
	Total devices 3 FS bytes used 1.60GiB
	devid 1 size 698.64GiB used 3.03GiB path /dev/sdg
	devid 2 size 465.76GiB used 3.03GiB path /dev/sdh
	devid 3 size 232.88GiB used 0.00B path /dev/sdi

unplug device sdg:
Apr 20 07:03:05 rakete kernel: Buffer I/O error on dev sdf1, logical block 243826688, lost sync page write
Apr 20 07:03:05 rakete kernel: JBD2: Error -5 detected when updating journal superblock for sdf1-8.
Apr 20 07:03:05 rakete kernel: Aborting journal on device sdf1-8.
Apr 20 07:03:05 rakete kernel: Buffer I/O error on dev sdf1, logical block 243826688, lost sync page write
Apr 20 07:03:05 rakete kernel: JBD2: Error -5 detected when updating journal superblock for sdf1-8.
Apr 20 07:03:05 rakete umount[16405]: umount: /mnt/raid1: target is busy
Apr 20 07:03:05 rakete umount[16405]: (In some cases useful info about processes that
Apr 20 07:03:05 rakete umount[16405]: use the device is found by lsof(8) or fuser(1).)
Apr 20 07:03:05 rakete systemd[1]: mnt-raid1.mount mount process exited, code=exited status=32
Apr 20 07:03:05 rakete systemd[1]: Failed unmounting /mnt/raid1.
Apr 20 07:03:24 rakete kernel: usb 3-1: new SuperSpeed USB device number 3 using xhci_hcd
Apr 20 07:03:24 rakete kernel: usb 3-1: New USB device found, idVendor=152d, idProduct=0567
Apr 20 07:03:24 rakete kernel: usb 3-1: New USB device strings: Mfr=10, Product=11, SerialNumber=5
Apr 20 07:03:24 rakete kernel: usb 3-1: Product: USB to ATA/ATAPI Bridge
Apr 20 07:03:24 rakete kernel: usb 3-1: Manufacturer: JMicron
Apr 20 07:03:24 rakete kernel: usb 3-1: SerialNumber: 152D00539000
Apr 20 07:03:24 rakete kernel: usb-storage 3-1:1.0: USB Mass Storage device detected
Apr 20 07:03:24 rakete kernel: usb-storage 3-1:1.0: Quirks match for vid 152d pid 0567: 500
Apr 20 07:03:24 rakete kernel: scsi host9: usb-storage 3-1:1.0
Apr 20 07:03:24 rakete mtp-probe[16424]: checking bus 3, device 3: "/sys/devices/pci:00/:00:1c.5/:04:00.0/usb3/3-1"
Apr 20 07:03:24 rakete mtp-probe[16424]: bus: 3, device: 3 was not an MTP device
Apr 20 07:03:25 rakete kernel: scsi 9:0:0:0: Direct-Access WDC WD20 02FAEX-007BA00125 PQ: 0 ANSI: 6
Apr 20 07:03:25 rakete kernel: scsi 9:0:0:1: Direct-Access WDC WD50 01AALS-00L3B20125 PQ: 0 ANSI: 6
Apr 20 07:03:25 rakete kernel: scsi 9:0:0:2: Direct-Access SAMSUNG SP2504C 0125 PQ: 0 ANSI: 6
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: Attached scsi generic sg6 type 0
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: Attached scsi generic sg7 type 0
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Write Protect is off
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Mode Sense: 67 00 10 08
Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: Attached scsi generic sg8 type 0
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] 976773168 512-byte logical blocks: (500 GB/466 GiB)
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] No Caching mode page found
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Assuming drive cache: write through
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Write Protect is off
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Mode Sense: 67 00 10 08
Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: [sdk] 488395055 512-byte logical blocks: (250 GB/233 GiB)
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] No Caching mode page found
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj]
About fi du and reflink/dedupe
Hi Mark,

Thanks for your contribution to the btrfs-filesystem-du command.

However, there seems to be some strange behavior related to reflink (and, further on, to in-band dedupe). (And the root cause lies quite deep in the kernel backref resolving code.)

["Exclusive" value not really exclusive]

When a file has 2 file extents, and the 2nd file extent points to the 1st one, fi du gives a wrong answer. The following commands create such a file easily:

# mkfs.btrfs -f /dev/sdb5
# mount /dev/sdb5 /mnt/test
# xfs_io -f -c "pwrite 0 128K" /mnt/test/tmp
# xfs_io -c "reflink /mnt/test/tmp 0 128K 128K" /mnt/test/tmp
# btrfs fi du /mnt/test
     Total   Exclusive  Set shared  Filename
 256.00KiB   256.00KiB           -  /mnt/test//tmp
 256.00KiB   256.00KiB       0.00B  /mnt/test/

Total seems to be OK, but I am confused by the exclusive value. The above method creates only one real data extent, which takes 128K, so following the qgroup definition its exclusive value should be 128K rather than 256K.

fi du uses the FIEMAP ioctl to get the extent map, and uses the SHARED flag to determine whether an extent is shared. However, that SHARED flag doesn't handle a case like this, where ino/root are all the same and only the extent offset differs. What's more, if we modify btrfs_check_shared() to return the SHARED flag for such a case, we get a 0 exclusive value, which is quite strange. (I assume the exclusive value should be 128K.)

[Slow btrfs_check_shared() performance]

In the above case, btrfs fi du returns very fast. But when the file is in-band deduped and its size grows to 1G, btrfs_check_shared() takes a very long time to return, as it does a full backref walk. This would be a super huge problem for in-band dedupe.

[Possible solution]

Would you please consider judging shared extents in user space, instead of relying on the SHARED flag from fiemap?

The workflow would be like:
1) Call fiemap, skipping the FIEMAP_EXTENT_SHARED flag (although we would still need to modify the kernel to avoid btrfs_check_shared())
2) Get the disk bytenr of each extent and record it in a user-space bytenr pool
3) Compare each file extent's disk bytenr against the bytenr pool

And, like qgroup, use this to build rfer/excl data for each inode. At least this method would handle the exclusive value above and avoid the year-long fiemap ioctl calls in the in-band dedupe case.

Thanks,
Qu
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
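A user-space sketch of steps 2) and 3): once fiemap has reported each file extent's disk bytenr, the total/exclusive split can be derived from a bytenr pool without consulting the kernel's SHARED flag. The input format, function name, and accounting rules below are illustrative assumptions, not an existing tool: each stdin line is "<inode> <disk_bytenr> <extent_len>", "total" counts logical bytes per reference, and an extent is exclusive to an inode when no other inode references its bytenr.

```shell
# Build the bytenr pool and per-inode total/exclusive figures.
# Reads "<inode> <disk_bytenr> <extent_len>" records on stdin.
extent_accounting() {
	awk '
	{
		total[$1] += $3              # logical bytes, counted per reference
		len[$2] = $3                 # size of the underlying data extent
		if (owner[$2] == "")
			owner[$2] = $1       # first inode seen for this bytenr
		else if (owner[$2] != $1)
			owner[$2] = "SHARED" # referenced from more than one inode
	}
	END {
		for (b in owner)
			if (owner[b] != "SHARED")
				excl[owner[b]] += len[b]   # unique extents count once
		for (ino in total)
			printf "inode %s total %d excl %d\n", ino, total[ino], excl[ino] + 0
	}'
}
```

For the reflinked file above (modeled as inode 257 with one 128K data extent referenced twice), this reports total 262144 and excl 131072, i.e. the 128K exclusive value the qgroup definition expects.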
Re: Raid5 replace disk problems
Jussi Kansanen posted on Thu, 21 Apr 2016 18:09:31 +0300 as excerpted:

> The replace operation is super slow (no other load) with avg. 3x20MB/s
> (old disks) reads and 1.4MB/s write (new disk) with CFQ scheduler. Using
> deadline schd. the performance is better with avg. 3x40MB/s reads and
> 4MB/s write (both schds. with default queue/nr_requests).
>
> Write speed seems slow but guess it's possible if there's a lot of random
> writes, but why is the difference between data read vs. written so large?
> According to iostat, replace reads 35 times more data than it writes to
> the new disk.
>
> Info:
>
> kernel 4.5 (now 4.5.2, no change)
> btrfs-progs 4.5.1

[Just a btrfs-using admin and list regular, not a dev. Also, raid56 isn't my own use-case, but I am following it in general on the list.]

Keep in mind that btrfs raid56 mode (aka parity raid mode) remains less mature and stable than the non-parity raid modes such as raid1 and raid10, and of course single-device mode with single data and single or dup metadata as well. It's certainly /not/ considered stable enough for production usage at this point, and other alternatives such as btrfs raid1 or raid10, or use of a separate raid layer (btrfs raid1 on top of a pair of mdraid0s is one interesting solution), are actively recommended.

And you're not the first to report super-slow replace/restripe for raid56, either. It's a known bug, tho as it doesn't seem to affect everyone, it has been hard to pin down appropriately and fix. The worst part is that for those affected, replace and restripe are so slow that they cease to be real-world practical, and endanger the entire array, because at that speed there's a relatively large chance that another device may fail before the replace is completed, failing the entire array as more devices have failed than it can handle. Which means that from a reliability perspective it effectively degrades to slow raid0 as soon as the first device drops out, with no practical way of recovering back to raid5/6 mode.

I don't recall seeing the memory issue reported before in relation to raid56, but it isn't horribly surprising either. IIRC there have been some recent memory fix patches, so 4.6 might be better, but I wouldn't count on it.

I'd really just recommend getting off of raid56 mode for now, until it has had somewhat longer to mature. (I'm previously on record as suggesting that people wait at least a year, ~5 kernel cycles, from nominal full raid56 support for it to stabilize, and then ask about the current state on the list, before trying to use it for anything but testing with throw-away data. With raid56 being nominally complete in 3.19, that would have been 4.4 at the earliest, and for a short time around then it did look reasonable, but then this bug with extremely long replace/restripe times began showing up on the list, and until that's traced down and fixed, I just don't see anyone responsible using it, except of course for testing and hopefully fixing this thing. I honestly don't know how long that will be, or if there are other bugs lurking as well, but given that 4.6 is nearing release and I don't believe the bug has even been fully traced down yet, 4.8 is definitely the earliest I'd say to consider it again, and a more conservative recommendation might be to ask again around 4.10.)
--
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman
Re: [PATCH] btrfs: Test that qgroup counts are valid after snapshot creation
Mark Fasheh wrote on 2016/04/21 16:53 -0700:
> Thank you for the review, comments are below.
>
> On Wed, Apr 20, 2016 at 09:48:54AM +0900, Satoru Takeuchi wrote:
>> On 2016/04/20 7:25, Mark Fasheh wrote:
>>> +# Force a small leaf size to make it easier to blow out our root
>>> +# subvolume tree
>>> +_scratch_mkfs "--nodesize 16384"
>>
>> nodesize 16384 is the default value. Do you
>> intend another value, for example 4096?
>
> "future proofing" I suppose - if we up the default, the for loop below may not create a level 1 tree. If we force it smaller than 16K, I believe that may mean we can't run this test on some kernels with a page size larger than the typical 4k.
> 	--Mark
> --
> Mark Fasheh

Sorry for the late reply.

Unfortunately, on a system with a 64K page size, it will fail (both mount and mkfs) if we use a 16K nodesize. IIRC, like some other btrfs qgroup test cases, we use a 64K nodesize as the safest nodesize.

And to create a level 1 tree, the idea is to use inline file extents to rapidly create it: 16 4K files should create a level 1 tree, although in that case max_inline=4096 would have to be added to the mount options.

Thanks,
Qu
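Qu's inline-extent idea can be sketched as a transcript. The device, mount point, and 64K nodesize here are illustrative assumptions, not commands from the thread; with max_inline=4096, each small file keeps its data inline in the fs tree leaves, so a handful of 4K files pushes the tree past level 0 far faster than hundreds of 1M buffered writes:

```shell
# mkfs.btrfs -f -n 65536 /dev/sdb5
# mount -o max_inline=4096 /dev/sdb5 /mnt/test
# for i in $(seq 1 16); do
>     xfs_io -f -c "pwrite 0 4k" /mnt/test/inline$i > /dev/null
> done
```

The resulting fs tree level can then be inspected with a tool such as btrfs-debug-tree to confirm it has reached level 1.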
[PATCH v2] btrfs: Test that qgroup counts are valid after snapshot creation
This has been broken since Linux v4.1. We may have worked out a solution on the btrfs list, but in the meantime sending a test to expose the issue seems like a good idea.

Changes from v1 to v2:
 - cleanups
 - added 122.out

Signed-off-by: Mark Fasheh
---
 tests/btrfs/122     | 88 +
 tests/btrfs/122.out |  1 +
 tests/btrfs/group   |  1 +
 3 files changed, 90 insertions(+)
 create mode 100755 tests/btrfs/122
 create mode 100644 tests/btrfs/122.out

diff --git a/tests/btrfs/122 b/tests/btrfs/122
new file mode 100755
index 000..82252ab
--- /dev/null
+++ b/tests/btrfs/122
@@ -0,0 +1,88 @@
+#! /bin/bash
+# FS QA Test No. btrfs/122
+#
+# Test that qgroup counts are valid after snapshot creation. This has
+# been broken in btrfs since Linux v4.1
+#
+#---
+# Copyright (C) 2016 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1	# failure is the default!
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+_cleanup()
+{
+	cd /
+	rm -f $tmp.*
+}
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# remove previous $seqres.full before test
+rm -f $seqres.full
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+
+rm -f $seqres.full
+
+# Force a small leaf size to make it easier to blow out our root
+# subvolume tree
+_scratch_mkfs "--nodesize 16384"
+_scratch_mount
+_run_btrfs_util_prog quota enable $SCRATCH_MNT
+
+mkdir "$SCRATCH_MNT/snaps"
+
+# First make some simple snapshots - the bug was initially reproduced like this
+_run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT "$SCRATCH_MNT/snaps/empty1"
+_run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT "$SCRATCH_MNT/snaps/empty2"
+
+# This forces the fs tree out past level 0, adding at least one tree
+# block which must be properly accounted for when we make our next
+# snapshots.
+mkdir "$SCRATCH_MNT/data"
+for i in `seq 0 640`; do
+	$XFS_IO_PROG -f -c "pwrite 0 1M" "$SCRATCH_MNT/data/file$i" > /dev/null 2>&1
+done
+
+# Snapshot twice.
+_run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT "$SCRATCH_MNT/snaps/snap1"
+_run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT "$SCRATCH_MNT/snaps/snap2"
+
+_scratch_unmount
+
+# generate a qgroup report and look for inconsistent groups
+$BTRFS_UTIL_PROG check --qgroup-report $SCRATCH_DEV 2>&1 | \
+	grep -q -E "Counts for qgroup.*are different"
+if [ $? -ne 0 ]; then
+	status=0
+fi
+
+exit
diff --git a/tests/btrfs/122.out b/tests/btrfs/122.out
new file mode 100644
index 000..2b1890e
--- /dev/null
+++ b/tests/btrfs/122.out
@@ -0,0 +1 @@
+QA output created by 122
diff --git a/tests/btrfs/group b/tests/btrfs/group
index 9403daa..f7e8cff 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -122,3 +122,4 @@
 119 auto quick snapshot metadata qgroup
 120 auto quick snapshot metadata
 121 auto quick snapshot qgroup
+122 auto quick snapshot qgroup
--
2.1.4
Re: [PATCH] btrfs: Test that qgroup counts are valid after snapshot creation
Thank you for the review, comments are below.

On Wed, Apr 20, 2016 at 09:48:54AM +0900, Satoru Takeuchi wrote:
> On 2016/04/20 7:25, Mark Fasheh wrote:
> >+# Force a small leaf size to make it easier to blow out our root
> >+# subvolume tree
> >+_scratch_mkfs "--nodesize 16384"
>
> nodesize 16384 is the default value. Do you
> intend another value, for example 4096?

"future proofing" I suppose - if we up the default, the for loop below may not create a level 1 tree. If we force it smaller than 16K, I believe that may mean we can't run this test on some kernels with a page size larger than the typical 4k.
	--Mark

--
Mark Fasheh
Re: btrfs forced readonly + errno=-28 No space left
On Thu, Apr 21, 2016 at 6:53 AM, Martin Svec wrote:
> Hello,
>
> we use btrfs subvolumes for rsync-based backups. During backups btrfs often fails with "No space
> left" error and goes to readonly mode (dmesg output is below) while there's still plenty of
> unallocated space:

Are you snapshotting near the time of the enospc? If so, it's a known problem that's been around for a while. There are some suggestions in the archives, but I think the main thing is to back off the workload momentarily, take the snapshot, and then resume the workload. I don't think it has to come to a complete stop, but it's a lot more reproducible with heavy writes.

--
Chris Murphy
RE: btrfs forced readonly + errno=-28 No space left
> we use btrfs subvolumes for rsync-based backups. During backups btrfs often fails with "No space
> left" error and goes to readonly mode (dmesg output is below) while there's still plenty of
> unallocated space

I have the same use case and the same issue, with no real solution that I've found. However, mounting with nospace_cache greatly reduces the problem. For me the frequency has gone from every other rsync giving "No space" to about 1 in 6, after which I delete some snapshots and start again.
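For reference, the workaround above can be sketched as an fstab entry; the device and mount point are placeholders, and nospace_cache is simply appended to whatever options are already in use:

```
# /etc/fstab - disable the v1 free-space cache for the backup filesystem
/dev/sdx1   /mnt/backup   btrfs   defaults,noatime,nospace_cache   0  0
```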
Re: Question: raid1 behaviour on failure
On 21.04.2016 at 07:43, Qu Wenruo wrote:
> There are already unmerged patches which will partly do the mdadm-level
> behavior, like automatically changing to degraded mode without making the fs RO.
>
> The original patchset:
> http://comments.gmane.org/gmane.comp.file-systems.btrfs/48335

The description of this patch says:

"Although the one-size-fit-all solution is quite safe, it's too strict if data and metadata has different duplication level."
...
"This patchset will introduce a new per-chunk degradable check for btrfs, allow above case to succeed, and it's quite small anyway."

My raid1 is "-m raid1 -d raid1", both at the same duplication level. Would that patch make any difference?

And: what do I need to do to test this in Debian stable? I am not a programmer, but I know how to use git and how to compile with proper configuration directions.

Matthias

> Or the latest patchset inside Anand Jain's auto-replace patchset:
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/55446
>
> Thanks,
> Qu
Re: Question: raid1 behaviour on failure
On 21.04.2016 at 13:28, Henk Slager wrote:
>> Can anyone explain this behavior?
>
> All 4 drives (WD20, WD75, WD50, SP2504C) get a disconnect twice in
> this test. What is on WD20 is unclear to me, but the raid1 array is
> {WD75, WD50, SP2504C}.
> So the test as described by Matthias is not what actually happens.
> In fact, the whole btrfs fs is 'disconnected on the lower layers of
> the kernel' but there is no unmount. You can see the scsi items go
> from 8?.0.0.x to 9.0.0.x to 10.0.0.x. In the 9.0.0.x state, the tools
> show 1 dev missing (WD75), but in fact the whole fs state is messed
> up. So as indicated by Anand already, it is a bad test, and it is what
> one can expect from an unpatched 4.4.0 kernel. (I'm curious to know
> how md raidX would handle this.)
>
> a) My best guess is that the 4 drives are in a USB-connected drivebay
> and that Matthias unplugged WD75 (so cut its power and SATA
> connection), did the file copy trial, and then plugged the WD75 back
> into the drivebay. The (un)plug of a harddisk is then assumed to
> trigger a USB link re-init by the chipset in the drivebay.
>
> b) Another possibility is that the (un)plug of WD75 caused the host
> USB chipset to re-init the USB link due to (too big?) changes in
> electrical current. And likely separate USB cables and maybe some
> SATA.
>
> c) Or some flaw in the LMDE2 distribution in combination with btrfs. I
> don't know what is in linux-image-4.4.0-0.bpo.1-amd64.

Just to clarify my setup: my HDs are mounted in a FANTEC QB-35US3-6G case. According to the handbook it has "Hot-Plug for USB / eSATA interface". It is equipped with 4 HDs, 3 of which are part of the raid1. The fourth HD is a 2 TB device with an ext4 filesystem and no relevance for this thread.

Matthias
Re: btrfs-progs confusing message
On 04/21/2016 04:02 AM, Austin S. Hemmelgarn wrote:
> On 2016-04-20 16:23, Konstantin Svist wrote:
>> Pretty much all commands print out the usage message when no device is
>> specified:
>>
>> [root@host ~]# btrfs scrub start
>> btrfs scrub start: too few arguments
>> usage: btrfs scrub start [-BdqrRf] [-c ioprio_class -n ioprio_classdata] |
>> ...
>>
>> However, balance doesn't:
>>
>> [root@host ~]# btrfs balance start
>> ERROR: can't access 'start': No such file or directory
>
> And this is an example of why backwards compatibility can be a pain.
> The original balance command was 'btrfs filesystem balance', and had
> no start, stop, or similar sub-commands. This got changed to the
> current incarnation when support for filters was added. For
> backwards compatibility reasons, we decided to still accept balance
> with no arguments other than the path as being the same as running
> 'btrfs balance start' on that path, and then made the old name an
> alias to the new one, with the restriction that you can't pass in
> filters through that interface. What is happening here is that
> balance is trying to interpret 'start' as a path, not a command, hence
> the message about not being able to access 'start'.

So since this is still detected as an error, why not print usage info at this point?
Raid5 replace disk problems
Hello,

I have a 4x 2TB HDD raid5 array and one of the disks started going bad (according to SMART; no read/write errors seen by btrfs). After replacing the disk with a new one I ran "btrfs replace", which resulted in a kernel crash at about 0.5% done:

BTRFS info (device dm-10): dev_replace from (devid 4) to /dev/mapper/bcrypt_sdj1 started
WARNING: CPU: 1 PID: 30627 at fs/btrfs/inode.c:9125 btrfs_destroy_inode+0x271/0x290()
Modules linked in: algif_skcipher af_alg evdev xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack x86_pkg_temp_thermal kvm_intel kvm irqbypass ghash_clmulni_intel psmouse iptable_filter ip_tables x_tables fan thermal battery processor button autofs4
CPU: 1 PID: 30627 Comm: umount Not tainted 4.5.0 #1
Hardware name: System manufacturer System Product Name/P8Z77-V LE PLUS, BIOS 0910 03/18/2014
 813971f9 817f2b34 8107ab78 8800d55daa00
 8800cb990998 880212d5b800 8801fcc0ff58 812dbfc1
 8800d55daa00
Call Trace:
 [] ? dump_stack+0x46/0x5d
 [] ? warn_slowpath_common+0x78/0xb0
 [] ? btrfs_destroy_inode+0x271/0x290
 [] ? btrfs_put_block_group_cache+0x72/0xa0
 [] ? close_ctree+0x146/0x330
 [] ? generic_shutdown_super+0x5f/0xe0
 [] ? kill_anon_super+0x9/0x10
 [] ? btrfs_kill_super+0xd/0x90
 [] ? deactivate_locked_super+0x2f/0x60
 [] ? cleanup_mnt+0x36/0x80
 [] ? task_work_run+0x6c/0x90
 [] ? exit_to_usermode_loop+0x8a/0x90
 [] ? int_ret_from_sys_call+0x25/0x8f
---[ end trace 6a7dec9450d45f9c ]---

Replace continues automatically after reboot, but it ends up using all of memory around every 6% of progress (8 hours) and crashes the system:

BTRFS info (device dm-10): continuing dev_replace from (devid 4) to /dev/mapper/bcrypt_sdj1 @0%
Apr 20 14:03:48 localhost kernel: BTRFS warning (device dm-4): devid 4 uuid e02b8898-c6ce-4c95-956d-24217c470b8a is missing
Apr 20 14:03:52 localhost kernel: BTRFS info (device dm-4): continuing dev_replace from (devid 4) to /dev/mapper/bcrypt_sdj1 @6%
Apr 20 22:38:41 localhost kernel: BTRFS warning (device dm-4): devid 4 uuid e02b8898-c6ce-4c95-956d-24217c470b8a is missing
Apr 20 22:38:46 localhost kernel: BTRFS info (device dm-4): continuing dev_replace from (devid 4) to /dev/mapper/bcrypt_sdj1 @12%
Apr 21 13:14:51 localhost kernel: BTRFS warning (device dm-4): devid 4 uuid e02b8898-c6ce-4c95-956d-24217c470b8a is missing
Apr 21 13:14:55 localhost kernel: BTRFS info (device dm-4): continuing dev_replace from (devid 4) to /dev/mapper/bcrypt_sdj1 @18%

The issue is related to the "bio-1" slab using all of memory:

/proc/meminfo:
MemTotal:        8072852 kB
MemFree:          646108 kB
...
Slab:            6235188 kB
SReclaimable:      49320 kB
SUnreclaim:      6185868 kB

/proc/slabinfo:
# name <active_objs> <num_objs> <objsize> ... : tunables ... : slabdata ...
bio-1  17588753 17588964 320 12 1 : tunables 0 0 0 : slabdata 1465747 1465747 0

The replace operation is super slow (no other load), with avg. 3x20MB/s reads (old disks) and 1.4MB/s write (new disk) with the CFQ scheduler. Using the deadline scheduler the performance is better, with avg. 3x40MB/s reads and 4MB/s write (both schedulers with default queue/nr_requests).

Write speed seems slow, but I guess it's possible if there are a lot of random writes. But why is the difference between data read vs. written so large? According to iostat, replace reads 35 times more data than it writes to the new disk.
Info:
kernel 4.5 (now 4.5.2, no change)
btrfs-progs 4.5.1
dm-crypted partitions, 4k aligned
mount opts: defaults,noatime,compress=lzo
8GB RAM

btrfs fi usage /bstorage/
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
Overall:
    Device size:          9.10TiB
    Device allocated:     0.00B
    Device unallocated:   9.10TiB
    Device missing:       1.82TiB
    Used:                 0.00B
    Free (estimated):     0.00B (min: 8.00EiB)
    Data ratio:           0.00
    Metadata ratio:       0.00
    Global reserve:       512.00MiB (used: 0.00B)

Data,RAID5: Size:1.52TiB, Used:1.46TiB
   /dev/mapper/bcrypt_sdg1 520.00GiB
   /dev/mapper/bcrypt_sdh1 520.00GiB
   /dev/mapper/bcrypt_sdi1 520.00GiB
   missing 520.00GiB

Metadata,RAID5: Size:4.03GiB, Used:1.96GiB
   /dev/mapper/bcrypt_sdg1 1.34GiB
   /dev/mapper/bcrypt_sdh1 1.34GiB
   /dev/mapper/bcrypt_sdi1 1.34GiB
   missing 1.34GiB

System,RAID5: Size:76.00MiB, Used:128.00KiB
   /dev/mapper/bcrypt_sdg1 36.00MiB
   /dev/mapper/bcrypt_sdh1 36.00MiB
   /dev/mapper/bcrypt_sdi1 36.00MiB
   missing 4.00MiB

Unallocated:
   /dev/mapper/bcrypt_sdg1 1.31TiB
   /dev/mapper/bcrypt_sdh1 1.31TiB
   /dev/mapper/bcrypt_sdi1 1.31TiB
   /dev/mapper/bcrypt_sdj1 1.82TiB
   missing 1.31TiB

btrfs fi show
Re: "/tmp/mnt.", and not honouring compression
On Thu, 2016-03-31 at 23:43 +0100, Duncan wrote:
> Chris Murray posted on Thu, 31 Mar 2016 21:49:29 +0100 as excerpted:
>> I'm using Proxmox, based on Debian. Kernel version 4.2.8-1-pve. Btrfs v3.17.
>
> The problem itself is beyond my level, but aiming for the obvious low-hanging fruit...
>
> On this list, which is forward looking as btrfs remains stabilizing, not yet fully stable and mature, kernel support comes in four tracks: mainstream and btrfs development trees, mainstream current, mainstream LTS, and everything else.
>
> Mainstream and btrfs development trees should be obvious. This covers mainstream current git and rc kernels as well as btrfs-integration and linux-next. Generally only recommended for bleeding-edge testers willing to lose what they're testing.
>
> Mainstream current follows the latest mainstream releases, with generally the latest two kernel series being best supported. With 4.5 out, that's 4.5 and 4.4.
>
> Mainstream LTS follows the mainstream LTS series, and until recently, again the latest two were best supported. That's the 4.4 and 4.1 LTS series. However, as btrfs has matured, the previous LTS series, 3.18, hasn't turned out so bad and remains reasonably well supported as well, tho depending on the issue, you may still be asked to upgrade and see if it's still there in 4.1 or 4.4.
>
> Then there's "everything else", which is where a 4.2 kernel such as you're running comes in. These kernels are either long-ago history (pre-3.18 LTS, for instance) in btrfs terms, or out of their mainstream kernel support windows, which is where 4.2 is. While we recognize that various distros claiming btrfs support may still be using these kernels, because we're mainline focused we don't track what patches they may or may not have backported, and thus aren't in a particularly good position to support them.
> If you're relying on your distro's support in such a case, that's where you need to look, as they know what they've backported and what they haven't, and are thus in a far better position to provide support.
>
> As for the list, we still do the best we can with these "everything else" kernels, but unless it's a known problem recognized on-sight, that's most often simply to recommend upgrading to something that's better supported and trying to duplicate the problem there.
>
> Meanwhile, for long-term enterprise-level stability, btrfs isn't likely to be a good choice in any case, as it really is still stabilizing and the expectation is that people running it will be upgrading to get the newer patches. If that's not feasible, as it may not be for the enterprise-stability-level use-case, then it's very likely that btrfs isn't a good match for the use-case anyway, as it's simply not to that level of stability yet. A more mature filesystem such as ext4, ext3, the old reiserfs which I still use on some spinning rust here (all my btrfs are on ssd), xfs, etc, is very likely to be a more appropriate choice for that use-case.
>
> For kernel 4.2, that leaves you with a few choices:
>
> 1) Ask your distro for btrfs support if they offer it on the out-of-mainline-support kernels which they've obviously chosen to use instead of the LTS series that /are/ still mainline supported.
>
> 2) Upgrade to the supported 4.4 LTS kernel series.
>
> 3) Downgrade to the older supported 4.1 LTS kernel series.
>
> 4) Decide btrfs is inappropriate for your use-case and switch to a fully stable and mature filesystem.
>
> 5) Continue with 4.2 and muddle thru, using our "best effort" help where you can, and doing without or getting it elsewhere if the opportunity presents itself or you have money to buy it from a qualified provider.
>
> Personally I'd choose option 2, upgrading to 4.4, but that's just me.
> The other choices may work better for you.
>
> As for btrfs-progs userspace, when the filesystem is working it's not as critical, since other than filesystem creation with mkfs.btrfs, most operational commands simply invoke kernel code to do the real work. However, once problems appear, a newer version can be critical, as patches to deal with newly discovered problems continue to be added to tools such as btrfs check (for detecting and repairing problems) and btrfs restore (for recovery of files off an unmountable filesystem). And newer userspace is designed to work with older kernels, so newer isn't a problem in that regard.
>
> As a result, to keep userspace from getting /too/ far behind, and because userspace release version numbers are synced with kernel version, a good rule of thumb is to run a userspace version similar to that of your kernel, or newer. Assuming you're already following the current or LTS track kernel recommendations, that will keep you reasonably current, and you can always upgrade to the newest
btrfs forced readonly + errno=-28 No space left
Hello, we use btrfs subvolumes for rsync-based backups. During backups btrfs often fails with a "No space left" error and goes into read-only mode (dmesg output is below), while there's still plenty of unallocated space:

$ btrfs fi df /backup
Data, single: total=15.75TiB, used=15.72TiB
System, DUP: total=8.00MiB, used=1.91MiB
Metadata, DUP: total=148.00GiB, used=146.20GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

$ btrfs fi show /dev/md2
Label: none uuid: 32892e65-f78d-45a3-a7c4-980fedc14e63
Total devices 1 FS bytes used 15.86TiB
devid1 size 21.83TiB used 16.03TiB path /dev/md2

$ btrfs fi usage /backup
Overall:
    Device size:         21.83TiB
    Device allocated:    16.02TiB
    Device unallocated:   5.81TiB
    Device missing:       0.00B
    Used:                15.94TiB
    Free (estimated):     5.89TiB (min: 2.98TiB)
    Data ratio:           1.00
    Metadata ratio:       2.00
    Global reserve:       512.00MiB (used: 296.64MiB)

Data,single: Size:15.73TiB, Used:15.65TiB
   /dev/md2 15.73TiB

Metadata,DUP: Size:148.00GiB, Used:146.07GiB
   /dev/md2 296.00GiB

System,DUP: Size:8.00MiB, Used:1.91MiB
   /dev/md2 16.00MiB

Unallocated:
   /dev/md2 5.81TiB

It usually helps to rebalance 100% of the metadata, but the error reappears after a few days or weeks. I also tried "btrfs check --repair", but it requires approx. 45 GB of RAM/swap and crashes after several days of swapping.
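The condition behind this kind of ENOSPC can be seen directly in the numbers above: metadata chunks are nearly full and the global reserve is being dipped into, even though plenty of unallocated device space remains. A minimal sketch of that check, using the figures from the `btrfs fi df` / `fi usage` output (the 2 GiB headroom threshold is an illustrative assumption, not a btrfs constant):

```python
# Detect metadata pressure from `btrfs fi df` style numbers.

GIB = 2**30
MIB = 2**20

meta_total = 148.00 * GIB        # Metadata, DUP: total
meta_used = 146.20 * GIB         # Metadata, DUP: used
reserve_used = 296.64 * MIB      # Global reserve: used (from fi usage)

headroom = meta_total - meta_used
meta_pressure = headroom < 2 * GIB or reserve_used > 0
print(f"metadata headroom: {headroom / GIB:.2f} GiB, pressure: {meta_pressure}")
# With under 2 GiB of metadata headroom and part of the 512 MiB global
# reserve consumed, a heavy rsync run can exhaust metadata space faster
# than new metadata chunks get allocated, producing errno=-28.
```

This is why a 100% metadata rebalance temporarily helps: it repacks metadata chunks and frees headroom, until the backup workload fills them again.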
Btrfs runs on top of a single MD RAID1 device and is mounted with the following options:

$ cat /proc/mounts
/dev/md2 /backup btrfs rw,noatime,compress=lzo,space_cache,clear_cache,enospc_debug,subvolid=5,subvol=/ 0 0

Kernel version: 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.3-7~bpo8+1 (2016-01-19) x86_64 GNU/Linux (jessie-backports)

[2151517.510044] BTRFS info (device md2): disk space caching is enabled
[2151517.510047] BTRFS: has skinny extents
[2266753.904426] use_block_rsv: 307 callbacks suppressed
[2266753.904430] [ cut here ]
[2266753.904453] WARNING: CPU: 7 PID: 17513 at /build/linux-kTc2b3/linux-4.3.3/fs/btrfs/extent-tree.c:7637 btrfs_alloc_tree_block+0x107/0x480 [btrfs]()
[2266753.904481] BTRFS: block rsv returned -28
[2266753.904483] Modules linked in: binfmt_misc xt_comment xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 iptable_filter xt_conntrack iptable_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack ip6table_filter ip6table_mangle ip6table_raw iptable_mangle ip6_tables ip_tables x_tables nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support sha256_ssse3 sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ast ttm drm_kms_helper drm i2c_ismt i2c_i801 joydev evdev tpm_tis ipmi_si tpm serio_raw acpi_cpufreq ipmi_msghandler 8250_fintek lpc_ich mfd_core shpchp pcspkr processor button autofs4 xfs libcrc32c btrfs xor raid6_pq dm_mod
[2266753.904552] raid10 raid1 hid_generic usbhid hid md_mod sg sd_mod ahci libahci crc32c_intel ehci_pci mpt2sas ehci_hcd raid_class libata scsi_transport_sas igb i2c_algo_bit usbcore dca ptp usb_common scsi_mod pps_core
[2266753.904574] CPU: 7 PID: 17513 Comm: kworker/u16:10 Tainted: GW 4.3.0-0.bpo.1-amd64 #1 Debian 4.3.3-7~bpo8+1
[2266753.904576] Hardware name: Supermicro SSG-5018A-AR12L/A1SA7, BIOS 1.0a 07/09/2014
[2266753.904597] Workqueue:
btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[2266753.904600] 071448ef 812e1889 880003637868
[2266753.904604] 81074451 880265c3e000 8800036378c0 4000
[2266753.904608] 880339498970 0001 810744dc a0341c18
[2266753.904612] Call Trace:
[2266753.904620] [] ? dump_stack+0x40/0x57
[2266753.904625] [] ? warn_slowpath_common+0x81/0xb0
[2266753.904629] [] ? warn_slowpath_fmt+0x5c/0x80
[2266753.904643] [] ? btrfs_alloc_tree_block+0x107/0x480 [btrfs]
[2266753.904649] [] ? __switch_to+0x25c/0x590
[2266753.904662] [] ? __btrfs_cow_block+0x145/0x5e0 [btrfs]
[2266753.904674] [] ? btrfs_cow_block+0x10f/0x1b0 [btrfs]
[2266753.904687] [] ? btrfs_search_slot+0x1fd/0xa30 [btrfs]
[2266753.904705] [] ? insert_state+0xbd/0x130 [btrfs]
[2266753.904718] [] ? lookup_inline_extent_backref+0xee/0x650 [btrfs]
[2266753.904723] [] ? __set_page_dirty_nobuffers+0xe1/0x140
[2266753.904728] [] ? kmem_cache_alloc+0x21c/0x440
[2266753.904741] [] ? __btrfs_free_extent.isra.66+0x11d/0xd60 [btrfs]
[2266753.904754] [] ? update_block_group.isra.65+0x127/0x350 [btrfs]
[2266753.904773] [] ? btrfs_merge_delayed_refs+0x66/0x5e0 [btrfs]
[2266753.904787] [] ? __btrfs_run_delayed_refs+0x8b1/0x1080 [btrfs]
[2266753.904801] [] ? btrfs_run_delayed_refs+0x78/0x2b0 [btrfs]
Re: Question: raid1 behaviour on failure
On 04/21/2016 03:45 PM, Satoru Takeuchi wrote:
On 2016/04/21 15:23, Satoru Takeuchi wrote:
On 2016/04/20 14:17, Matthias Bodenbinder wrote:
Am 18.04.2016 um 09:22 schrieb Qu Wenruo:
BTW, it would be better to post the dmesg for better debug.

So here we go. I did the same test again. Here is a full log of what I did. It seems to me like a bug in btrfs.

Sequence of events:
1. mount the raid1 (2 discs with different sizes)
2. unplug the biggest drive (hotplug)
3. try to copy something to the degraded raid1
4. plug in the device again (hotplug)

This scenario does not work. The disc array is NOT redundant! I can not work with it while a drive is missing, and I can not reattach the device so that everything works again. The btrfs module crashes during the test.

I am using LMDE2 with backports:
btrfs-tools 4.4-1~bpo8+1
linux-image-4.4.0-0.bpo.1-amd64

Matthias

rakete - root - /root
1# mount /mnt/raid1/

Journal:
Apr 20 07:01:16 rakete kernel: BTRFS info (device sdi): enabling auto defrag
Apr 20 07:01:16 rakete kernel: BTRFS info (device sdi): disk space caching is enabled
Apr 20 07:01:16 rakete kernel: BTRFS: has skinny extents

rakete - root - /mnt/raid1
3# ll
insgesamt 0
drwxrwxr-x 1 root root 36 Nov 14 2014 AfterShot2(64-bit)
drwxrwxr-x 1 root root 5082 Apr 17 09:06 etc
drwxr-xr-x 1 root root 108 Mär 24 07:31 var

4# btrfs fi show
Label: none uuid: 16d5891f-5d52-4b29-8591-588ddf11e73d
Total devices 3 FS bytes used 1.60GiB
devid1 size 698.64GiB used 3.03GiB path /dev/sdg
devid2 size 465.76GiB used 3.03GiB path /dev/sdh
devid3 size 232.88GiB used 0.00B path /dev/sdi

unplug device sdg:
Apr 20 07:03:05 rakete kernel: Buffer I/O error on dev sdf1, logical block 243826688, lost sync page write
Apr 20 07:03:05 rakete kernel: JBD2: Error -5 detected when updating journal superblock for sdf1-8.
Apr 20 07:03:05 rakete kernel: Aborting journal on device sdf1-8.
Apr 20 07:03:05 rakete kernel: Buffer I/O error on dev sdf1, logical block 243826688, lost sync page write
Apr 20 07:03:05 rakete kernel: JBD2: Error -5 detected when updating journal superblock for sdf1-8.
Apr 20 07:03:05 rakete umount[16405]: umount: /mnt/raid1: target is busy
Apr 20 07:03:05 rakete umount[16405]: (In some cases useful info about processes that
Apr 20 07:03:05 rakete umount[16405]: use the device is found by lsof(8) or fuser(1).)
Apr 20 07:03:05 rakete systemd[1]: mnt-raid1.mount mount process exited, code=exited status=32
Apr 20 07:03:05 rakete systemd[1]: Failed unmounting /mnt/raid1.
Apr 20 07:03:24 rakete kernel: usb 3-1: new SuperSpeed USB device number 3 using xhci_hcd
Apr 20 07:03:24 rakete kernel: usb 3-1: New USB device found, idVendor=152d, idProduct=0567
Apr 20 07:03:24 rakete kernel: usb 3-1: New USB device strings: Mfr=10, Product=11, SerialNumber=5
Apr 20 07:03:24 rakete kernel: usb 3-1: Product: USB to ATA/ATAPI Bridge
Apr 20 07:03:24 rakete kernel: usb 3-1: Manufacturer: JMicron
Apr 20 07:03:24 rakete kernel: usb 3-1: SerialNumber: 152D00539000
Apr 20 07:03:24 rakete kernel: usb-storage 3-1:1.0: USB Mass Storage device detected
Apr 20 07:03:24 rakete kernel: usb-storage 3-1:1.0: Quirks match for vid 152d pid 0567: 500
Apr 20 07:03:24 rakete kernel: scsi host9: usb-storage 3-1:1.0
Apr 20 07:03:24 rakete mtp-probe[16424]: checking bus 3, device 3: "/sys/devices/pci:00/:00:1c.5/:04:00.0/usb3/3-1"
Apr 20 07:03:24 rakete mtp-probe[16424]: bus: 3, device: 3 was not an MTP device
Apr 20 07:03:25 rakete kernel: scsi 9:0:0:0: Direct-Access WDC WD20 02FAEX-007BA00125 PQ: 0 ANSI: 6
Apr 20 07:03:25 rakete kernel: scsi 9:0:0:1: Direct-Access WDC WD50 01AALS-00L3B20125 PQ: 0 ANSI: 6
Apr 20 07:03:25 rakete kernel: scsi 9:0:0:2: Direct-Access SAMSUNG SP2504C 0125 PQ: 0 ANSI: 6
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: Attached scsi generic sg6 type 0
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: Attached scsi generic sg7 type 0
Apr 20 07:03:25
rakete kernel: sd 9:0:0:0: [sdf] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Write Protect is off
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Mode Sense: 67 00 10 08
Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: Attached scsi generic sg8 type 0
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] 976773168 512-byte logical blocks: (500 GB/466 GiB)
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] No Caching mode page found
Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Assuming drive cache: write through
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Write Protect is off
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Mode Sense: 67 00 10 08
Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: [sdk] 488395055 512-byte logical blocks: (250 GB/233 GiB)
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] No Caching mode page found
Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Assuming drive cache: write through
Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: [sdk] Write
Re: btrfs-progs confusing message
On 2016-04-20 16:23, Konstantin Svist wrote:
Pretty much all commands print out the usage message when no device is specified:

[root@host ~]# btrfs scrub start
btrfs scrub start: too few arguments
usage: btrfs scrub start [-BdqrRf] [-c ioprio_class -n ioprio_classdata] | ...

However, balance doesn't:

[root@host ~]# btrfs balance start
ERROR: can't access 'start': No such file or directory

And this is an example of why backwards compatibility can be a pain. The original balance command was 'btrfs filesystem balance', and had no start, stop, or similar sub-commands. This got changed to the current incarnation when support for filters was added. For backwards compatibility reasons, we decided to still accept balance with no arguments other than the path as being the same as running 'btrfs balance start' on that path, and then made the old name an alias to the new one, with the restriction that you can't pass in filters through that interface. What is happening here is that balance is trying to interpret start as a path, not a command, hence the message about not being able to access 'start'.
-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
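The backwards-compatibility quirk described above can be sketched in a few lines. This is an illustrative model, not btrfs-progs internals: the function name, the subcommand set, and the exact fall-through order are assumptions made to show why a lone `start` argument ends up being probed as a path.

```python
# Hypothetical sketch of the balance argument dispatch: the legacy
# `btrfs balance <path>` form is still accepted, so a bare argument
# is tried as a mount point instead of being rejected as a bad
# subcommand, which produces the confusing "can't access" error.
import os

SUBCOMMANDS = {"start", "pause", "cancel", "resume", "status"}

def balance_dispatch(args):
    # Modern form: `balance <subcommand> <path>`.
    if len(args) == 2 and args[0] in SUBCOMMANDS:
        return f"balance {args[0]} on {args[1]}"
    # Legacy alias: `balance <path>` alone means a full balance, so a
    # bare `balance start` falls here and 'start' is probed as a path.
    if len(args) == 1:
        if not os.path.exists(args[0]):
            return f"ERROR: can't access '{args[0]}': No such file or directory"
        return f"full balance on {args[0]}"
    return "usage: btrfs balance start [filters] <path>"

print(balance_dispatch(["start"]))          # 'start' treated as a path
print(balance_dispatch(["start", "/mnt"]))  # normal subcommand dispatch
```

The fix in a real CLI would be to check the argument against the subcommand table before falling back to the legacy path interpretation, or to emit a usage message when the lone argument matches a known subcommand name.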
Re: KERNEL PANIC + CORRUPTED BTRFS?
lenovomi wrote on 2016/04/21 08:53 +0200:
Hello Liu, please find both files stored here: https://drive.google.com/folderview?id=0B6RZ_9vVuTEcMDV6eGNmRlZ0ZjQ=sharing
Thanks

On Thu, Apr 21, 2016 at 7:27 AM, Liu Bo wrote:
On Wed, Apr 20, 2016 at 10:09:07PM +0200, lenovomi wrote:
Hi Chris, please find below attached the complete log while executing all the btrfs commands; all of them failed. ;-(
https://bpaste.net/show/4d8877a49b80
https://bpaste.net/show/7e2e5aa30741
https://bpaste.net/show/482e91b25fc5
https://bpaste.net/show/5093cc3daa5a
https://bpaste.net/show/a24935eb5a1b

It's not that easy to corrupt both metadata copies, which are located on two different drives, with an unclean shutdown, because we always do COW. From what I can tell from the above results, the two copies of raid1 remain consistent and identical, but somehow there are some problems in the checksum field.
-
root@heap-unreal:/home/heap/btrfs-progs# ./btrfs check --readonly /dev/sda
checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC, dev bytenr 2972268003328, devid 2
checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC, dev bytenr 2972268003328, devid 2
checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC, dev bytenr 2973311336448, devid 1
checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC, dev bytenr 2972268003328, devid 2
-
In order to verify that, please follow this and show us what you get.
1. dd if=/dev/sdb of=/tmp/corrupt-dev2.txt bs=1 skip=2972268003327 count=16384
2. dd if=/dev/sdd of=/tmp/corrupt-dev1.txt bs=1 skip=2973311336447 count=16384
3. od -x /tmp/corrupt-dev2.txt
4. od -x /tmp/corrupt-dev1.txt

It seems that your dump command is wrong. Your skip caused a one-byte offset. Just as lenovomi's dump shows:
: 00ec 2ab0 bf00 ..*.
It should be ec2a b0bf.
So Lenovomi, please dump it again with correct command: dd if=/dev/sdb of=/tmp/corrupt-dev2.txt bs=1 skip=2972268003328 count=16384 dd if=/dev/sdd of=/tmp/corrupt-dev1.txt bs=1 skip=2973311336448 count=16384 Thanks, Qu Thanks, -liubo Thanks On Tue, Apr 12, 2016 at 9:58 PM, Chris Murphy wrote: On Tue, Apr 12, 2016 at 9:48 AM, lenovomi wrote: root@ubuntu:/home/ubuntu# btrfs restore -D -v /dev/sda /mnt/usb/ checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC Csum didn't match Couldn't read tree root Could not open root, trying backup super warning, device 2 is missing warning devid 2 not found already warning devid 3 not found already warning devid 4 not found already checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC Csum didn't match Couldn't read tree root Could not open root, trying backup super warning, device 2 is missing warning devid 2 not found already warning devid 3 not found already warning devid 4 not found already checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC checksum verify failed on 17802818387968 found FF45E2D3 wanted BFB02AEC Csum didn't match Couldn't read tree root Could not open root, trying backup super Why are devices 2, 3, 4 missing? I think there's a known issue where btrfs fi show might see drives as available that other tools won't see. Try 'btrfs dev scan' and then repeat the restore command with -D just to see if the missing device warnings go away. If devices are missing, it's kinda hard to do a restore. If these are hard drives, there should be supers 0, 1, 2 and they should all be the same. 
But they may not be the same on each drive, so it's worth checking: btrfs-show-super -f And then also btrfs-find-root -- Chris Murphy -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
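Qu's point about the one-byte skip is easy to see on a throwaway file (a sketch for illustration only — file name and payload are made up, not the real devices): shifting skip by one rotates every byte of the dump, which is exactly how ec2a b0bf turns into 00ec 2ab0 in od's output.

```shell
# Toy demonstration of an off-by-one `skip`: the payload ec 2a b0 bf
# starts at byte offset 2 of the file.
printf 'XX\354\052\260\277' > /tmp/skipdemo.bin

echo "correct skip=2:"
dd if=/tmp/skipdemo.bin bs=1 skip=2 count=4 2>/dev/null | od -An -tx1
# prints: ec 2a b0 bf

echo "off-by-one skip=1:"
dd if=/tmp/skipdemo.bin bs=1 skip=1 count=4 2>/dev/null | od -An -tx1
# prints: 58 ec 2a b0  -- every byte shifted by one
```

With bs=1, skip counts single bytes, so the skip value must equal the byte offset exactly; skip=...327 instead of ...328 dumps one stray byte before the block of interest.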
Re: Question: raid1 behaviour on failure
On 04/21/2016 01:15 PM, Matthias Bodenbinder wrote: On 20.04.2016 at 15:32, Anand Jain wrote: 1. mount the raid1 (2 discs with different sizes) 2. unplug the biggest drive (hotplug) Btrfs won't know that you have unplugged a disk. Though it experiences IO failures, it won't close the bdev. Well, as far as I can tell mdadm can handle this use case. I tested that. I have an mdadm raid5 running. I accidentally unplugged a sata cable from one of the devices and the raid still worked. I did not even notice that the cable was unplugged until a few hours later. Then I plugged the cable in again and that was it. mdadm recovered the raid5 without any problem. -> This is redundancy! Yep. I meant to say it's a bug in btrfs that it won't know about the missing device (after mount). Please do test the hot spare patch set; it has a few first steps (yep, not complete) to handle a failed device while the FS is mounted. 3. try to copy something to the degraded raid1 This will work as long as you do _not_ run unmount/mount. I did not unmount the raid1 when I tried to copy something. As you can see from the sequence of events: I removed the drive and immediately afterwards tried to copy something to the degraded array. This copy failed with a crash of the btrfs module. -> This is NOT redundancy. The unmount and mount operations came afterwards. In a nutshell I have to say that the btrfs behaviour is by no means compliant with my understanding of redundancy. A known issue. Your testing / validating of the hot spare patch set will help. Thanks, Anand
Re: KERNEL PANIC + CORRUPTED BTRFS?
I ran:

od -x /tmp/corrupt-dev2.txt > a
od -x /tmp/corrupt-dev1.txt > b
cmp a b; diff a b

It looks like both files are identical, which would mean both metadata copies got corrupted in the same way?

thanks

On Thu, Apr 21, 2016 at 8:53 AM, lenovomi wrote:
> Hello Liu,
>
> please find both files stored here:
>
> https://drive.google.com/folderview?id=0B6RZ_9vVuTEcMDV6eGNmRlZ0ZjQ=sharing
>
> Thanks
>
> On Thu, Apr 21, 2016 at 7:27 AM, Liu Bo wrote:
> [snip]
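For what it's worth, the two dumps can also be compared as raw binaries rather than via their od text output. A small sketch on throwaway files (the file names here are made up for illustration):

```shell
# Compare two binary dumps directly: `cmp -s` only sets the exit status
# (0 = identical), while `cmp -l` lists each differing byte as
# "<1-based offset> <octal old> <octal new>".
printf 'abcd' > /tmp/dump1.bin
printf 'abed' > /tmp/dump2.bin

if cmp -s /tmp/dump1.bin /tmp/dump2.bin; then
    echo "identical"
else
    cmp -l /tmp/dump1.bin /tmp/dump2.bin   # byte 3 differs: 0143 ('c') vs 0145 ('e')
fi
```

Comparing the raw dumps avoids any chance of od formatting (word grouping, endianness) masking or inventing a difference.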
Re: KERNEL PANIC + CORRUPTED BTRFS?
Hello Liu,

please find both files stored here:

https://drive.google.com/folderview?id=0B6RZ_9vVuTEcMDV6eGNmRlZ0ZjQ=sharing

Thanks

On Thu, Apr 21, 2016 at 7:27 AM, Liu Bo wrote:
> [snip]
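Chris's suggestion to check whether the supers agree can also be done by hand: the btrfs superblock mirrors sit at fixed offsets of 64KiB, 64MiB and 256GiB from the start of each device. A hedged sketch (not from the thread; /dev/sdX is a placeholder for the real device):

```shell
# Checksum the first 4KiB at each btrfs superblock mirror offset.
# Three identical lines => the mirrors agree; a differing line is the
# one worth dumping with btrfs-show-super (which can select a specific
# mirror, if your btrfs-progs supports that).
DEV=/dev/sdX
for off in 65536 67108864 274877906944; do
    dd if="$DEV" bs=1 skip="$off" count=4096 2>/dev/null | md5sum
done
```

Note that on a device smaller than 256GiB the third mirror simply does not exist, so the last dd reads nothing and its checksum will trivially differ.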
Re: Question: raid1 behaviour on failure
On 2016/04/20 14:17, Matthias Bodenbinder wrote: On 18.04.2016 at 09:22, Qu Wenruo wrote: BTW, it would be better to post the dmesg for better debug. So here we go. I did the same test again. Here is a full log of what I did. It seems to me like a bug in btrfs. Sequence of events: 1. mount the raid1 (2 discs with different sizes) 2. unplug the biggest drive (hotplug) 3. try to copy something to the degraded raid1 4. plug in the device again (hotplug) This scenario does not work. The disc array is NOT redundant! I cannot work with it while a drive is missing and I cannot reattach the device so that everything works again. The btrfs module crashes during the test. I am using LMDE2 with backports: btrfs-tools 4.4-1~bpo8+1 linux-image-4.4.0-0.bpo.1-amd64 Matthias rakete - root - /root 1# mount /mnt/raid1/ Journal: Apr 20 07:01:16 rakete kernel: BTRFS info (device sdi): enabling auto defrag Apr 20 07:01:16 rakete kernel: BTRFS info (device sdi): disk space caching is enabled Apr 20 07:01:16 rakete kernel: BTRFS: has skinny extents rakete - root - /mnt/raid1 3# ll insgesamt 0 drwxrwxr-x 1 root root 36 Nov 14 2014 AfterShot2(64-bit) drwxrwxr-x 1 root root 5082 Apr 17 09:06 etc drwxr-xr-x 1 root root 108 Mär 24 07:31 var 4# btrfs fi show Label: none uuid: 16d5891f-5d52-4b29-8591-588ddf11e73d Total devices 3 FS bytes used 1.60GiB devid 1 size 698.64GiB used 3.03GiB path /dev/sdg devid 2 size 465.76GiB used 3.03GiB path /dev/sdh devid 3 size 232.88GiB used 0.00B path /dev/sdi unplug device sdg: Apr 20 07:03:05 rakete kernel: Buffer I/O error on dev sdf1, logical block 243826688, lost sync page write Apr 20 07:03:05 rakete kernel: JBD2: Error -5 detected when updating journal superblock for sdf1-8. Apr 20 07:03:05 rakete kernel: Aborting journal on device sdf1-8. Apr 20 07:03:05 rakete kernel: Buffer I/O error on dev sdf1, logical block 243826688, lost sync page write Apr 20 07:03:05 rakete kernel: JBD2: Error -5 detected when updating journal superblock for sdf1-8.
Apr 20 07:03:05 rakete umount[16405]: umount: /mnt/raid1: target is busy Apr 20 07:03:05 rakete umount[16405]: (In some cases useful info about processes that Apr 20 07:03:05 rakete umount[16405]: use the device is found by lsof(8) or fuser(1).) Apr 20 07:03:05 rakete systemd[1]: mnt-raid1.mount mount process exited, code=exited status=32 Apr 20 07:03:05 rakete systemd[1]: Failed unmounting /mnt/raid1. Apr 20 07:03:24 rakete kernel: usb 3-1: new SuperSpeed USB device number 3 using xhci_hcd Apr 20 07:03:24 rakete kernel: usb 3-1: New USB device found, idVendor=152d, idProduct=0567 Apr 20 07:03:24 rakete kernel: usb 3-1: New USB device strings: Mfr=10, Product=11, SerialNumber=5 Apr 20 07:03:24 rakete kernel: usb 3-1: Product: USB to ATA/ATAPI Bridge Apr 20 07:03:24 rakete kernel: usb 3-1: Manufacturer: JMicron Apr 20 07:03:24 rakete kernel: usb 3-1: SerialNumber: 152D00539000 Apr 20 07:03:24 rakete kernel: usb-storage 3-1:1.0: USB Mass Storage device detected Apr 20 07:03:24 rakete kernel: usb-storage 3-1:1.0: Quirks match for vid 152d pid 0567: 500 Apr 20 07:03:24 rakete kernel: scsi host9: usb-storage 3-1:1.0 Apr 20 07:03:24 rakete mtp-probe[16424]: checking bus 3, device 3: "/sys/devices/pci:00/:00:1c.5/:04:00.0/usb3/3-1" Apr 20 07:03:24 rakete mtp-probe[16424]: bus: 3, device: 3 was not an MTP device Apr 20 07:03:25 rakete kernel: scsi 9:0:0:0: Direct-Access WDC WD20 02FAEX-007BA00125 PQ: 0 ANSI: 6 Apr 20 07:03:25 rakete kernel: scsi 9:0:0:1: Direct-Access WDC WD50 01AALS-00L3B20125 PQ: 0 ANSI: 6 Apr 20 07:03:25 rakete kernel: scsi 9:0:0:2: Direct-Access SAMSUNG SP2504C 0125 PQ: 0 ANSI: 6 Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: Attached scsi generic sg6 type 0 Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: Attached scsi generic sg7 type 0 Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB) Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Write Protect is off Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Mode 
Sense: 67 00 10 08 Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: Attached scsi generic sg8 type 0 Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] 976773168 512-byte logical blocks: (500 GB/466 GiB) Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] No Caching mode page found Apr 20 07:03:25 rakete kernel: sd 9:0:0:0: [sdf] Assuming drive cache: write through Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Write Protect is off Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Mode Sense: 67 00 10 08 Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: [sdk] 488395055 512-byte logical blocks: (250 GB/233 GiB) Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] No Caching mode page found Apr 20 07:03:25 rakete kernel: sd 9:0:0:1: [sdj] Assuming drive cache: write through Apr 20 07:03:25 rakete kernel: sd 9:0:0:2: [sdk] Write Protect is off Apr 20 07:03:25 rakete kernel: sd
Re: Question: raid1 behaviour on failure
Liu Bo wrote on 2016/04/20 23:02 -0700: On Thu, Apr 21, 2016 at 01:43:56PM +0800, Qu Wenruo wrote: Matthias Bodenbinder wrote on 2016/04/21 07:22 +0200: On 20.04.2016 at 09:25, Qu Wenruo wrote: Unfortunately, this is the designed behavior. The fs is rw just because it doesn't hit any critical problem. If you try to touch a file and then sync the fs, btrfs will become RO immediately. Btrfs fails to read the space cache, and fails to make a new dir. The failure on cow_block in mkdir is critical, and btrfs becomes RO. All expected behavior so far. You may try the degraded mount option, but AFAIK it may not handle a case like yours. This really scares me. "Expected behaviour"? So you are saying: if one of the drives in the raid1 dies without btrfs noticing, the redundancy is lost. Let's say the power unit of a disc dies. This disc will disappear from the raid1 pretty much as suddenly as in my test case here. No difference. You are saying that in this case btrfs should behave exactly like this? If that is the case I eventually need to rethink my interpretation of redundancy. Matthias The "expected behavior" just means the abort-transaction behavior for critical errors is expected. And you should know, btrfs is not doing full block-level RAID1; it's doing RAID at the chunk level. Which needs to consider more things than full block-level RAID1, and it's more flexible than block-level raid1. (For example, you can use 3 devices with different sizes to do btrfs RAID1 and get more available size than mdadm raid1.) You may think the behavior is totally insane for btrfs RAID1, but don't forget, btrfs can have different metadata/data profiles. (And even more, there is already a plan to support different profiles for different subvolumes.) In case your metadata is RAID1, your data can still be RAID0, and in that case a missing device can still cause huge problems. From a user's point of view, what you're saying is more an excuse and kind of irrelevant. Stop doing that please, try to fix the insane behavior instead. Thanks, -liubo Didn't you see that I have already submitted the first version of the per-chunk degradable patchset, a long time ago, to address the problem? And you should blame the person who is blocking the patchset from merging by refusing to split them. Thanks, Qu There are already unmerged patches which will partly do the mdadm-level behavior, like automatically changing to degraded mode without making the fs RO. The original patchset: http://comments.gmane.org/gmane.comp.file-systems.btrfs/48335 Or the latest patchset inside Anand Jain's auto-replace patchset: http://thread.gmane.org/gmane.comp.file-systems.btrfs/55446 Thanks, Qu
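Qu's aside about mixed-size devices is easy to quantify with a back-of-the-envelope model (an approximation, not btrfs's real chunk allocator): if raid1 keeps exactly two copies of every chunk and the allocator always pairs the two devices with the most free space, usable capacity is roughly min(total/2, total - largest), whereas a block-level 3-way mirror is limited by the smallest member. Using the sizes from Matthias's "btrfs fi show" (698 + 465 + 232 GiB, rounded down):

```shell
# Rough usable-capacity model for the 3-device raid1 example above.
d1=698; d2=465; d3=232
total=$((d1 + d2 + d3))
largest=$d1
rest=$((total - largest))
# If the largest device outweighs the rest, pairing is limited by the
# rest; otherwise roughly half the total space is usable.
if [ "$rest" -ge "$largest" ]; then usable=$((total / 2)); else usable=$rest; fi
echo "btrfs raid1 usable: ~${usable} GiB"    # ~697 GiB
echo "mdadm 3-way raid1 usable: ${d3} GiB"   # limited by the smallest disk
```

So on these three disks, chunk-level raid1 can in principle mirror about 697 GiB of data, versus 232 GiB for a conventional 3-way mirror.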
Re: Question: raid1 behaviour on failure
On Thu, Apr 21, 2016 at 01:43:56PM +0800, Qu Wenruo wrote:
> [snip]
>
> In case your metadata is RAID1, your data can still be RAID0, and in that
> case a missing device can still cause huge problems.

From a user's point of view, what you're saying is more an excuse and kind of irrelevant.

Stop doing that please, try to fix the insane behavior instead.

Thanks,

-liubo

> There are already unmerged patches which will partly do the mdadm-level
> behavior, like automatically changing to degraded mode without making the
> fs RO.
>
> The original patchset:
> http://comments.gmane.org/gmane.comp.file-systems.btrfs/48335
>
> Or the latest patchset inside Anand Jain's auto-replace patchset:
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/55446
>
> Thanks,
> Qu