Re: Need help with potential ~45TB dataloss
On Tue, Dec 4, 2018 at 3:09 AM Patrick Dijkgraaf wrote:
>
> Hi Chris,
>
> See the output below. Any suggestions based on it?

If they're SATA drives, they may not support SCT ERC; and if they're SAS, depending on what controller they're behind, smartctl might need a hint to properly ask the drive for SCT ERC status. The simplest way to find out is to run 'smartctl -x' on one drive, assuming they're all the same basic make/model other than size.

--
Chris Murphy
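Chris's point about controller-dependent hints can be mechanized. Below is a small, hypothetical helper (not a smartmontools feature) that retries a query with a few common `-d` device-type hints until one works; the hint list and the helper's name are assumptions for illustration, not something from this thread:

```shell
# try_with_hints: run a command, appending "-d <hint>" for each common
# smartctl device-type hint, and report the first hint that succeeds.
# Hypothetical helper; hint list is an example, not exhaustive.
try_with_hints() {
  local hint
  for hint in sat scsi auto; do
    if "$@" -d "$hint" >/dev/null 2>&1; then
      echo "$hint"
      return 0
    fi
  done
  return 1
}

# Real-world use might look like (device name taken from the thread):
#   hint=$(try_with_hints smartctl -l scterc /dev/sde) &&
#     smartctl -d "$hint" -l scterc /dev/sde
```

If every hint fails, the drive (or the controller in front of it) most likely does not expose SCT ERC at all, which matches several of the results below.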
Re: Need help with potential ~45TB dataloss
Hi Chris,

See the output below. Any suggestions based on it? Thanks!

--
Groet / Cheers, Patrick Dijkgraaf

On Mon, 2018-12-03 at 20:16 -0700, Chris Murphy wrote:
> Also useful information for the autopsy, perhaps not for fixing, is
> whether the SCT ERC value for every drive is less than the kernel's
> SCSI driver block device command timeout value. It's super important
> that the drive reports an explicit read failure before the read
> command is considered failed by the kernel. If the drive is still
> trying to do a read and the kernel command timer times out, the kernel
> will just reset the whole link and we lose the outcome of the hanging
> command. Only upon an explicit read error can Btrfs, or md RAID, know
> which device and physical sector has a problem, and therefore how to
> reconstruct the block and fix the bad sector with a write of known
> good data.
>
> smartctl -l scterc /device/

Seems to not work:

[root@cornelis ~]# for disk in /dev/sd{e..x}; do echo ${disk}; smartctl -l scterc ${disk}; done
/dev/sde
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
[the identical smartctl banner is elided for the remaining devices]
/dev/sdf
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdg
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdh
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdi
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdj
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdk
Smartctl open device: /dev/sdk failed: No such device
/dev/sdl
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdm
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdn
SCT Error Recovery Control command not supported
/dev/sdo
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdp
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdq
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sdr
SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed
/dev/sds
SMART WRITE LOG does not return COUNT and LBA_LOW
Re: Need help with potential ~45TB dataloss
Hi, thanks again. Please see answers inline. -- Groet / Cheers, Patrick Dijkgraaf On Mon, 2018-12-03 at 08:35 +0800, Qu Wenruo wrote: > > On 2018/12/2 下午5:03, Patrick Dijkgraaf wrote: > > Hi Qu, > > > > Thanks for helping me! > > > > Please see the reponses in-line. > > Any suggestions based on this? > > > > Thanks! > > > > > > On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote: > > > On 2018/11/30 下午9:53, Patrick Dijkgraaf wrote: > > > > Hi all, > > > > > > > > I have been a happy BTRFS user for quite some time. But now I'm > > > > facing > > > > a potential ~45TB dataloss... :-( > > > > I hope someone can help! > > > > > > > > I have Server A and Server B. Both having a 20-devices BTRFS > > > > RAID6 > > > > filesystem. Because of known RAID5/6 risks, Server B was a > > > > backup > > > > of > > > > Server A. > > > > After applying updates to server B and reboot, the FS would not > > > > mount > > > > anymore. Because it was "just" a backup. I decided to recreate > > > > the > > > > FS > > > > and perform a new backup. Later, I discovered that the FS was > > > > not > > > > broken, but I faced this issue: > > > > https://patchwork.kernel.org/patch/10694997/ > > > > > > > > > > > > > > Sorry for the inconvenience. > > > > > > I didn't realize the max_chunk_size limit isn't reliable at that > > > timing. > > > > No problem, I should not have jumped to the conclusion to recreate > > the > > backup volume. > > > > > > Anyway, the FS was already recreated, so I needed to do a new > > > > backup. > > > > During the backup (using rsync -vah), Server A (the source) > > > > encountered > > > > an I/O error and my rsync failed. In an attempt to "quick fix" > > > > the > > > > issue, I rebooted Server A after which the FS would not mount > > > > anymore. > > > > > > Did you have any dmesg about that IO error? > > > > Yes there was. But I omitted capturing it... The system is now > > rebooted > > and I can't retrieve it anymore. :-( > > > > > And how is the reboot scheduled? 
Forced power off or normal > > > reboot > > > command? > > > > The system was rebooted using a normal reboot command. > > Then the problem is pretty serious. > > Possibly already corrupted before. > > > > > I documented what I have tried, below. I have not yet tried > > > > anything > > > > except what is shown, because I am afraid of causing more harm > > > > to > > > > the FS. > > > > > > Pretty clever, no btrfs check --repair is a pretty good move. > > > > > > > I hope somebody here can give me advice on how to (hopefully) > > > > retrieve my data... > > > > > > > > Thanks in advance! > > > > > > > > == > > > > > > > > [root@cornelis ~]# btrfs fi show > > > > Label: 'cornelis-btrfs' uuid: ac643516-670e-40f3-aa4c- > > > > f329fc3795fd > > > > Total devices 1 FS bytes used 463.92GiB > > > > devid1 size 800.00GiB used 493.02GiB path > > > > /dev/mapper/cornelis-cornelis--btrfs > > > > > > > > Label: 'data' uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5 > > > > Total devices 20 FS bytes used 44.85TiB > > > > devid1 size 3.64TiB used 3.64TiB path /dev/sdn2 > > > > devid2 size 3.64TiB used 3.64TiB path /dev/sdp2 > > > > devid3 size 3.64TiB used 3.64TiB path /dev/sdu2 > > > > devid4 size 3.64TiB used 3.64TiB path /dev/sdx2 > > > > devid5 size 3.64TiB used 3.64TiB path /dev/sdh2 > > > > devid6 size 3.64TiB used 3.64TiB path /dev/sdg2 > > > > devid7 size 3.64TiB used 3.64TiB path /dev/sdm2 > > > > devid8 size 3.64TiB used 3.64TiB path /dev/sdw2 > > > > devid9 size 3.64TiB used 3.64TiB path /dev/sdj2 > > > > devid 10 size 3.64TiB used 3.64TiB path /dev/sdt2 > > > > devid 11 size 3.64TiB used 3.64TiB path /dev/sdk2 > > > > devid 12 size 3.64TiB used 3.64TiB path /dev/sdq2 > > > > devid 13 size 3.64TiB used 3.64TiB path /dev/sds2 > > > > devid 14 size 3.64TiB used 3.64TiB path /dev/sdf2 > > > > devid 15 size 7.28TiB used 588.80GiB path /dev/sdr2 > > > > devid 16 size 7.28TiB used 588.80GiB path /dev/sdo2 > > > > devid 17 size 7.28TiB used 588.80GiB path /dev/sdv2 > > > > devid 
18 size 7.28TiB used 588.80GiB path /dev/sdi2 > > > > devid 19 size 7.28TiB used 588.80GiB path /dev/sdl2 > > > > devid 20 size 7.28TiB used 588.80GiB path /dev/sde2 > > > > > > > > [root@cornelis ~]# mount /dev/sdn2 /mnt/data > > > > mount: /mnt/data: wrong fs type, bad option, bad superblock on > > > > /dev/sdn2, missing codepage or helper program, or other error. > > > > > > What is the dmesg of the mount failure? > > > > [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): disk space > > caching > > is enabled > > [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): has skinny > > extents > > [Sun Dec 2 09:41:08 2018] BTRFS error (device sdn2): parent > >
Re: Need help with potential ~45TB dataloss
Also useful information for the autopsy, perhaps not for fixing, is whether the SCT ERC value for every drive is less than the kernel's SCSI driver block device command timeout value. It's super important that the drive reports an explicit read failure before the read command is considered failed by the kernel. If the drive is still trying to do a read and the kernel command timer times out, the kernel will just reset the whole link and we lose the outcome of the hanging command. Only upon an explicit read error can Btrfs, or md RAID, know which device and physical sector has a problem, and therefore how to reconstruct the block and fix the bad sector with a write of known good data.

smartctl -l scterc /device/

and

cat /sys/block/sda/device/timeout

Only if SCT ERC is enabled with a value below 30, or if the kernel command timer is changed to be well above 30 (like 180, which is absolutely crazy but a separate conversation), can we be sure that there haven't just been resets going on for a while, preventing bad sectors from being fixed up all along and contributing to the problem. This comes up on the linux-raid (mainly md driver) list all the time, and it contributes to lost RAIDs all the time. And arguably it leads to unnecessary data loss even in the single-device desktop/laptop use case.

Chris Murphy
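The per-drive comparison Chris describes can be scripted. This is a sketch under the assumption that smartctl reports an enabled ERC limit as a "Read: <deciseconds>" line (e.g. "Read: 70 (7.0 seconds)"); "Disabled" or any error is treated as risky:

```shell
# Compare each drive's SCT ERC read limit (smartctl reports it in
# tenths of a second) against the kernel's per-device command timeout
# (in seconds). Sketch only; adjust parsing for your smartctl version.
erc_is_safe() {
  # erc_is_safe <erc_deciseconds> <kernel_timeout_seconds>
  [ $(( $1 / 10 )) -lt "$2" ]
}

for disk in /dev/sd?; do
  dev=${disk#/dev/}
  erc=$(smartctl -l scterc "$disk" 2>/dev/null | awk '/Read:/ {print $2; exit}')
  tmo=$(cat "/sys/block/${dev}/device/timeout" 2>/dev/null)
  tmo=${tmo:-30}   # 30s is the usual kernel default
  case "$erc" in
    ''|*[!0-9]*) echo "$disk: ERC disabled/unreadable -> risky" ;;
    *) if erc_is_safe "$erc" "$tmo"; then
         echo "$disk: OK (ERC $(( erc / 10 ))s < timeout ${tmo}s)"
       else
         echo "$disk: risky (ERC >= kernel timeout)"
       fi ;;
  esac
done
```

To act on a risky result, the usual mitigations are `smartctl -l scterc,70,70 /dev/sdX` (a 7-second cap, where the drive supports it) or raising `/sys/block/sdX/device/timeout` well above the drive's worst-case internal retry time, as Chris alludes to.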
Re: Need help with potential ~45TB dataloss
On 2018/12/3 4:30 AM, Andrei Borzenkov wrote:
> 02.12.2018 23:14, Patrick Dijkgraaf wrote:
>> I have some additional info.
>>
>> I found the reason the FS got corrupted. It was a single failing drive,
>> which caused the entire cabinet (containing 7 drives) to reset. So the
>> FS suddenly lost 7 drives.
>
> This remains a mystery to me. btrfs is marketed to be always consistent
> on disk - you either have the previous full transaction or the current
> full transaction. If the current transaction was interrupted, the
> promise is that you are left with the previous valid, consistent
> transaction.
>
> Obviously this is not what happens in practice. Which nullifies the
> main selling point of btrfs.
>
> Unless this is expected behavior, it sounds like some barriers are
> missing and summary data is updated before (and without waiting for)
> subordinate data. And if it is expected behavior ...

There is one (unfortunately) known problem for RAID5/6 and one special problem for RAID6.

The common problem is the write hole. For a RAID5 stripe like:

Disk 1 | Disk 2 | Disk 3
DATA1  | DATA2  | PARITY

If we have written something into DATA1, but power loss happened before we updated PARITY on disk 3, then we can't tolerate the loss of disk 2, since DATA1 no longer matches PARITY. Without the ability to know exactly which blocks we have written, the write hole problem exists for any parity-based solution, including BTRFS RAID5/6.

From the guys on the mailing list, other RAID5/6 implementations keep their own on-disk record of which blocks have been updated, and after a power loss they rebuild the involved stripes. Since btrfs doesn't have such an ability, we need to scrub the whole fs to regain the disk-loss tolerance (and hope there won't be another power loss during it).

The RAID6-specific problem is the missing rebuild retry logic. (Not any more after the 4.16 kernel, but btrfs-progs support is still missing.) For a RAID6 stripe like:

Disk 1 | Disk 2 | Disk 3 | Disk 4
DATA1  | DATA2  | P      | Q

If reading DATA1 fails, we have 3 ways to rebuild the data:
1) Using DATA2 and P (just as in RAID5)
2) Using P and Q
3) Using DATA2 and Q

However, until 4.16 we wouldn't retry all possible ways to rebuild it. (Thanks to Liu Bo for solving this problem.)

Thanks,
Qu

>
>> I have removed the failed drive, so the RAID is now degraded. I hope
>> the data is still recoverable... ☹
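Qu's two points can be made concrete with single-byte toy "disks". The sketch below uses plain XOR parity only (real RAID6 Q parity needs Galois-field arithmetic, which is deliberately omitted), so it illustrates the failure modes rather than the actual btrfs implementation:

```shell
# 1) The write hole: parity goes stale if power dies between the data
#    write and the parity write.
d1=170 d2=85
p=$(( d1 ^ d2 ))          # parity committed together with the data
good=$(( d1 ^ p ))        # lose disk 2 now: rebuild is correct (85)
d1=240                    # new DATA1 reaches the platter...
bad=$(( d1 ^ p ))         # ...but the parity update didn't: garbage
echo "good=$good bad=$bad"

# 2) The retry logic: for one lost block in the 2-data stripe Qu draws,
#    any two survivors suffice, but pre-4.16 kernels only tried the
#    first (RAID5-style) combination. Toy enumeration, names only:
rebuild_paths() {
  local lost=$1; shift
  local s=("$@") i j
  for (( i = 0; i < ${#s[@]}; i++ )); do
    for (( j = i + 1; j < ${#s[@]}; j++ )); do
      echo "rebuild $lost from ${s[i]}+${s[j]}"
    done
  done
}
rebuild_paths DATA1 DATA2 P Q
```

With more data disks per stripe the rebuild needs all survivors rather than a pair; the pairwise enumeration above only matches Qu's two-data example.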
Re: Need help with potential ~45TB dataloss
On 2018/12/3 8:35 AM, Qu Wenruo wrote: > > > On 2018/12/2 5:03 PM, Patrick Dijkgraaf wrote: >> Hi Qu, >> >> Thanks for helping me! >> >> Please see the responses in-line. >> Any suggestions based on this? >> >> Thanks! >> >> >> On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote: >>> On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote: Hi all, I have been a happy BTRFS user for quite some time. But now I'm facing a potential ~45TB dataloss... :-( I hope someone can help! I have Server A and Server B. Both having a 20-devices BTRFS RAID6 filesystem. I forgot one important thing here, especially for RAID6. If one data device is corrupted, RAID6 will normally try to rebuild it the RAID5 way, and if another disk is also corrupted, it may not recover correctly. The current way to recover is to try *all* combinations. IIRC Liu Bo tried such a patch but it was not merged. This means the current RAID6 can only handle two missing devices in its best case. But for corruption, it can only be as good as RAID5. Thanks, Qu > Because of known RAID5/6 risks, Server B was a backup of Server A. After applying updates to server B and reboot, the FS would not mount anymore. Because it was "just" a backup, I decided to recreate the FS and perform a new backup. Later, I discovered that the FS was not broken, but I faced this issue: https://patchwork.kernel.org/patch/10694997/ >>> >>> Sorry for the inconvenience. >>> >>> I didn't realize the max_chunk_size limit isn't reliable at that timing. >> >> No problem, I should not have jumped to the conclusion to recreate the backup volume. >> Anyway, the FS was already recreated, so I needed to do a new backup. During the backup (using rsync -vah), Server A (the source) encountered an I/O error and my rsync failed. In an attempt to "quick fix" the issue, I rebooted Server A after which the FS would not mount anymore. >>> >>> Did you have any dmesg about that IO error? >> >> Yes there was. But I omitted capturing it...
The system is now rebooted >> and I can't retrieve it anymore. :-( >> >>> And how is the reboot scheduled? Forced power off or normal reboot >>> command? >> >> The system was rebooted using a normal reboot command. > > Then the problem is pretty serious. > > Possibly already corrupted before. > >> I documented what I have tried, below. I have not yet tried anything except what is shown, because I am afraid of causing more harm to the FS. >>> >>> Pretty clever, no btrfs check --repair is a pretty good move. >>> I hope somebody here can give me advice on how to (hopefully) retrieve my data... Thanks in advance! == [root@cornelis ~]# btrfs fi show Label: 'cornelis-btrfs' uuid: ac643516-670e-40f3-aa4c-f329fc3795fd Total devices 1 FS bytes used 463.92GiB devid1 size 800.00GiB used 493.02GiB path /dev/mapper/cornelis-cornelis--btrfs Label: 'data' uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5 Total devices 20 FS bytes used 44.85TiB devid1 size 3.64TiB used 3.64TiB path /dev/sdn2 devid2 size 3.64TiB used 3.64TiB path /dev/sdp2 devid3 size 3.64TiB used 3.64TiB path /dev/sdu2 devid4 size 3.64TiB used 3.64TiB path /dev/sdx2 devid5 size 3.64TiB used 3.64TiB path /dev/sdh2 devid6 size 3.64TiB used 3.64TiB path /dev/sdg2 devid7 size 3.64TiB used 3.64TiB path /dev/sdm2 devid8 size 3.64TiB used 3.64TiB path /dev/sdw2 devid9 size 3.64TiB used 3.64TiB path /dev/sdj2 devid 10 size 3.64TiB used 3.64TiB path /dev/sdt2 devid 11 size 3.64TiB used 3.64TiB path /dev/sdk2 devid 12 size 3.64TiB used 3.64TiB path /dev/sdq2 devid 13 size 3.64TiB used 3.64TiB path /dev/sds2 devid 14 size 3.64TiB used 3.64TiB path /dev/sdf2 devid 15 size 7.28TiB used 588.80GiB path /dev/sdr2 devid 16 size 7.28TiB used 588.80GiB path /dev/sdo2 devid 17 size 7.28TiB used 588.80GiB path /dev/sdv2 devid 18 size 7.28TiB used 588.80GiB path /dev/sdi2 devid 19 size 7.28TiB used 588.80GiB path /dev/sdl2 devid 20 size 7.28TiB used 588.80GiB path /dev/sde2 [root@cornelis ~]# mount /dev/sdn2 /mnt/data mount: /mnt/data: wrong 
fs type, bad option, bad superblock on /dev/sdn2, missing codepage or helper program, or other error. >>> >>> What is the dmesg of the mount failure? >> >> [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): disk space caching >> is enabled >> [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): has skinny extents >> [Sun Dec 2 09:41:08 2018] BTRFS error (device sdn2): parent transid >> verify failed on 46451963543552 wanted 114401 found 114173 >> [Sun Dec 2 09:41:08 2018] BTRFS critical (device sdn2):
Re: Need help with potential ~45TB dataloss
On 2018/12/2 下午5:03, Patrick Dijkgraaf wrote: > Hi Qu, > > Thanks for helping me! > > Please see the reponses in-line. > Any suggestions based on this? > > Thanks! > > > On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote: >> On 2018/11/30 下午9:53, Patrick Dijkgraaf wrote: >>> Hi all, >>> >>> I have been a happy BTRFS user for quite some time. But now I'm >>> facing >>> a potential ~45TB dataloss... :-( >>> I hope someone can help! >>> >>> I have Server A and Server B. Both having a 20-devices BTRFS RAID6 >>> filesystem. Because of known RAID5/6 risks, Server B was a backup >>> of >>> Server A. >>> After applying updates to server B and reboot, the FS would not >>> mount >>> anymore. Because it was "just" a backup. I decided to recreate the >>> FS >>> and perform a new backup. Later, I discovered that the FS was not >>> broken, but I faced this issue: >>> https://patchwork.kernel.org/patch/10694997/ >>> >> >> Sorry for the inconvenience. >> >> I didn't realize the max_chunk_size limit isn't reliable at that >> timing. > > No problem, I should not have jumped to the conclusion to recreate the > backup volume. > >>> Anyway, the FS was already recreated, so I needed to do a new >>> backup. >>> During the backup (using rsync -vah), Server A (the source) >>> encountered >>> an I/O error and my rsync failed. In an attempt to "quick fix" the >>> issue, I rebooted Server A after which the FS would not mount >>> anymore. >> >> Did you have any dmesg about that IO error? > > Yes there was. But I omitted capturing it... The system is now rebooted > and I can't retrieve it anymore. :-( > >> And how is the reboot scheduled? Forced power off or normal reboot >> command? > > The system was rebooted using a normal reboot command. Then the problem is pretty serious. Possibly already corrupted before. > >>> I documented what I have tried, below. I have not yet tried >>> anything >>> except what is shown, because I am afraid of causing more harm to >>> the FS. 
>> >> Pretty clever, no btrfs check --repair is a pretty good move. >> >>> I hope somebody here can give me advice on how to (hopefully) >>> retrieve my data... >>> >>> Thanks in advance! >>> >>> == >>> >>> [root@cornelis ~]# btrfs fi show >>> Label: 'cornelis-btrfs' uuid: ac643516-670e-40f3-aa4c-f329fc3795fd >>> Total devices 1 FS bytes used 463.92GiB >>> devid1 size 800.00GiB used 493.02GiB path >>> /dev/mapper/cornelis-cornelis--btrfs >>> >>> Label: 'data' uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5 >>> Total devices 20 FS bytes used 44.85TiB >>> devid1 size 3.64TiB used 3.64TiB path /dev/sdn2 >>> devid2 size 3.64TiB used 3.64TiB path /dev/sdp2 >>> devid3 size 3.64TiB used 3.64TiB path /dev/sdu2 >>> devid4 size 3.64TiB used 3.64TiB path /dev/sdx2 >>> devid5 size 3.64TiB used 3.64TiB path /dev/sdh2 >>> devid6 size 3.64TiB used 3.64TiB path /dev/sdg2 >>> devid7 size 3.64TiB used 3.64TiB path /dev/sdm2 >>> devid8 size 3.64TiB used 3.64TiB path /dev/sdw2 >>> devid9 size 3.64TiB used 3.64TiB path /dev/sdj2 >>> devid 10 size 3.64TiB used 3.64TiB path /dev/sdt2 >>> devid 11 size 3.64TiB used 3.64TiB path /dev/sdk2 >>> devid 12 size 3.64TiB used 3.64TiB path /dev/sdq2 >>> devid 13 size 3.64TiB used 3.64TiB path /dev/sds2 >>> devid 14 size 3.64TiB used 3.64TiB path /dev/sdf2 >>> devid 15 size 7.28TiB used 588.80GiB path /dev/sdr2 >>> devid 16 size 7.28TiB used 588.80GiB path /dev/sdo2 >>> devid 17 size 7.28TiB used 588.80GiB path /dev/sdv2 >>> devid 18 size 7.28TiB used 588.80GiB path /dev/sdi2 >>> devid 19 size 7.28TiB used 588.80GiB path /dev/sdl2 >>> devid 20 size 7.28TiB used 588.80GiB path /dev/sde2 >>> >>> [root@cornelis ~]# mount /dev/sdn2 /mnt/data >>> mount: /mnt/data: wrong fs type, bad option, bad superblock on >>> /dev/sdn2, missing codepage or helper program, or other error. >> >> What is the dmesg of the mount failure? 
> > [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): disk space caching > is enabled > [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): has skinny extents > [Sun Dec 2 09:41:08 2018] BTRFS error (device sdn2): parent transid > verify failed on 46451963543552 wanted 114401 found 114173 > [Sun Dec 2 09:41:08 2018] BTRFS critical (device sdn2): corrupt leaf: > root=2 block=46451963543552 slot=0, unexpected item end, have > 1387359977 expect 16283 OK, this shows that one of the copies has a mismatched generation while the other copy is completely corrupted. > [Sun Dec 2 09:41:08 2018] BTRFS warning (device sdn2): failed to read > tree root > [Sun Dec 2 09:41:08 2018] BTRFS error (device sdn2): open_ctree failed > >> And have you tried -o ro,degraded ? > > Tried it just now, gives the exact same error. > >>> [root@cornelis ~]# btrfs check /dev/sdn2 >>> Opening filesystem to check... >>> parent transid verify failed on
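For readers unfamiliar with the error being quoted: the transid check can be modeled trivially. This toy function is purely illustrative and not btrfs code; it only shows why a stale child block (generation 114173) is rejected by a parent that expects generation 114401:

```shell
# Toy model of the parent-transid check: a parent node records the
# transaction generation it expects its child block to carry; a child
# whose last update never reached disk still carries an older one, so
# the read is rejected instead of silently returning stale metadata.
check_transid() {
  # check_transid <wanted> <found>
  if [ "$1" -eq "$2" ]; then
    echo "ok"
  else
    echo "parent transid verify failed: wanted $1 found $2"
  fi
}
check_transid 114401 114173   # the generations from the thread's dmesg
```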
Re: Need help with potential ~45TB dataloss
02.12.2018 23:14, Patrick Dijkgraaf wrote:
> I have some additional info.
>
> I found the reason the FS got corrupted. It was a single failing drive,
> which caused the entire cabinet (containing 7 drives) to reset. So the
> FS suddenly lost 7 drives.

This remains a mystery to me. btrfs is marketed to be always consistent on disk - you either have the previous full transaction or the current full transaction. If the current transaction was interrupted, the promise is that you are left with the previous valid, consistent transaction.

Obviously this is not what happens in practice. Which nullifies the main selling point of btrfs.

Unless this is expected behavior, it sounds like some barriers are missing and summary data is updated before (and without waiting for) subordinate data. And if it is expected behavior ...

> I have removed the failed drive, so the RAID is now degraded. I hope
> the data is still recoverable... ☹
Re: Need help with potential ~45TB dataloss
I have some additional info. I found the reason the FS got corrupted. It was a single failing drive, which caused the entire cabinet (containing 7 drives) to reset. So the FS suddenly lost 7 drives. I have removed the failed drive, so the RAID is now degraded. I hope the data is still recoverable... ☹ -- Groet / Cheers, Patrick Dijkgraaf On Sun, 2018-12-02 at 10:03 +0100, Patrick Dijkgraaf wrote: > Hi Qu, > > Thanks for helping me! > > Please see the reponses in-line. > Any suggestions based on this? > > Thanks! > > > On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote: > > On 2018/11/30 下午9:53, Patrick Dijkgraaf wrote: > > > Hi all, > > > > > > I have been a happy BTRFS user for quite some time. But now I'm > > > facing > > > a potential ~45TB dataloss... :-( > > > I hope someone can help! > > > > > > I have Server A and Server B. Both having a 20-devices BTRFS > > > RAID6 > > > filesystem. Because of known RAID5/6 risks, Server B was a backup > > > of > > > Server A. > > > After applying updates to server B and reboot, the FS would not > > > mount > > > anymore. Because it was "just" a backup. I decided to recreate > > > the > > > FS > > > and perform a new backup. Later, I discovered that the FS was not > > > broken, but I faced this issue: > > > https://patchwork.kernel.org/patch/10694997/ > > > > > > > > > > Sorry for the inconvenience. > > > > I didn't realize the max_chunk_size limit isn't reliable at that > > timing. > > No problem, I should not have jumped to the conclusion to recreate > the > backup volume. > > > > Anyway, the FS was already recreated, so I needed to do a new > > > backup. > > > During the backup (using rsync -vah), Server A (the source) > > > encountered > > > an I/O error and my rsync failed. In an attempt to "quick fix" > > > the > > > issue, I rebooted Server A after which the FS would not mount > > > anymore. > > > > Did you have any dmesg about that IO error? > > Yes there was. But I omitted capturing it... 
The system is now > rebooted > and I can't retrieve it anymore. :-( > > > And how is the reboot scheduled? Forced power off or normal reboot > > command? > > The system was rebooted using a normal reboot command. > > > > I documented what I have tried, below. I have not yet tried > > > anything > > > except what is shown, because I am afraid of causing more harm to > > > the FS. > > > > Pretty clever, no btrfs check --repair is a pretty good move. > > > > > I hope somebody here can give me advice on how to (hopefully) > > > retrieve my data... > > > > > > Thanks in advance! > > > > > > == > > > > > > [root@cornelis ~]# btrfs fi show > > > Label: 'cornelis-btrfs' uuid: ac643516-670e-40f3-aa4c- > > > f329fc3795fd > > > Total devices 1 FS bytes used 463.92GiB > > > devid1 size 800.00GiB used 493.02GiB path > > > /dev/mapper/cornelis-cornelis--btrfs > > > > > > Label: 'data' uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5 > > > Total devices 20 FS bytes used 44.85TiB > > > devid1 size 3.64TiB used 3.64TiB path /dev/sdn2 > > > devid2 size 3.64TiB used 3.64TiB path /dev/sdp2 > > > devid3 size 3.64TiB used 3.64TiB path /dev/sdu2 > > > devid4 size 3.64TiB used 3.64TiB path /dev/sdx2 > > > devid5 size 3.64TiB used 3.64TiB path /dev/sdh2 > > > devid6 size 3.64TiB used 3.64TiB path /dev/sdg2 > > > devid7 size 3.64TiB used 3.64TiB path /dev/sdm2 > > > devid8 size 3.64TiB used 3.64TiB path /dev/sdw2 > > > devid9 size 3.64TiB used 3.64TiB path /dev/sdj2 > > > devid 10 size 3.64TiB used 3.64TiB path /dev/sdt2 > > > devid 11 size 3.64TiB used 3.64TiB path /dev/sdk2 > > > devid 12 size 3.64TiB used 3.64TiB path /dev/sdq2 > > > devid 13 size 3.64TiB used 3.64TiB path /dev/sds2 > > > devid 14 size 3.64TiB used 3.64TiB path /dev/sdf2 > > > devid 15 size 7.28TiB used 588.80GiB path /dev/sdr2 > > > devid 16 size 7.28TiB used 588.80GiB path /dev/sdo2 > > > devid 17 size 7.28TiB used 588.80GiB path /dev/sdv2 > > > devid 18 size 7.28TiB used 588.80GiB path /dev/sdi2 > > > devid 19 size 7.28TiB 
used 588.80GiB path /dev/sdl2 > > > devid 20 size 7.28TiB used 588.80GiB path /dev/sde2 > > > > > > [root@cornelis ~]# mount /dev/sdn2 /mnt/data > > > mount: /mnt/data: wrong fs type, bad option, bad superblock on > > > /dev/sdn2, missing codepage or helper program, or other error. > > > > What is the dmesg of the mount failure? > > [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): disk space > caching > is enabled > [Sun Dec 2 09:41:08 2018] BTRFS info (device sdn2): has skinny > extents > [Sun Dec 2 09:41:08 2018] BTRFS error (device sdn2): parent transid > verify failed on 46451963543552 wanted 114401 found 114173 > [Sun Dec 2 09:41:08 2018] BTRFS critical (device sdn2): corrupt > leaf: > root=2 block=46451963543552 slot=0, unexpected item end, have > 1387359977 expect 16283 > [Sun Dec 2 09:41:08 2018] BTRFS warning
Re: Need help with potential ~45TB dataloss
Hi Qu, Thanks for helping me! Please see the reponses in-line. Any suggestions based on this? Thanks! On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote: > On 2018/11/30 下午9:53, Patrick Dijkgraaf wrote: > > Hi all, > > > > I have been a happy BTRFS user for quite some time. But now I'm > > facing > > a potential ~45TB dataloss... :-( > > I hope someone can help! > > > > I have Server A and Server B. Both having a 20-devices BTRFS RAID6 > > filesystem. Because of known RAID5/6 risks, Server B was a backup > > of > > Server A. > > After applying updates to server B and reboot, the FS would not > > mount > > anymore. Because it was "just" a backup. I decided to recreate the > > FS > > and perform a new backup. Later, I discovered that the FS was not > > broken, but I faced this issue: > > https://patchwork.kernel.org/patch/10694997/ > > > > Sorry for the inconvenience. > > I didn't realize the max_chunk_size limit isn't reliable at that > timing. No problem, I should not have jumped to the conclusion to recreate the backup volume. > > Anyway, the FS was already recreated, so I needed to do a new > > backup. > > During the backup (using rsync -vah), Server A (the source) > > encountered > > an I/O error and my rsync failed. In an attempt to "quick fix" the > > issue, I rebooted Server A after which the FS would not mount > > anymore. > > Did you have any dmesg about that IO error? Yes there was. But I omitted capturing it... The system is now rebooted and I can't retrieve it anymore. :-( > And how is the reboot scheduled? Forced power off or normal reboot > command? The system was rebooted using a normal reboot command. > > I documented what I have tried, below. I have not yet tried > > anything > > except what is shown, because I am afraid of causing more harm to > > the FS. > > Pretty clever, no btrfs check --repair is a pretty good move. > > > I hope somebody here can give me advice on how to (hopefully) > > retrieve my data... > > > > Thanks in advance! 
> > > > == > > > > [root@cornelis ~]# btrfs fi show > > Label: 'cornelis-btrfs' uuid: ac643516-670e-40f3-aa4c-f329fc3795fd > > Total devices 1 FS bytes used 463.92GiB > > devid1 size 800.00GiB used 493.02GiB path > > /dev/mapper/cornelis-cornelis--btrfs > > > > Label: 'data' uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5 > > Total devices 20 FS bytes used 44.85TiB > > devid1 size 3.64TiB used 3.64TiB path /dev/sdn2 > > devid2 size 3.64TiB used 3.64TiB path /dev/sdp2 > > devid3 size 3.64TiB used 3.64TiB path /dev/sdu2 > > devid4 size 3.64TiB used 3.64TiB path /dev/sdx2 > > devid5 size 3.64TiB used 3.64TiB path /dev/sdh2 > > devid6 size 3.64TiB used 3.64TiB path /dev/sdg2 > > devid7 size 3.64TiB used 3.64TiB path /dev/sdm2 > > devid8 size 3.64TiB used 3.64TiB path /dev/sdw2 > > devid9 size 3.64TiB used 3.64TiB path /dev/sdj2 > > devid 10 size 3.64TiB used 3.64TiB path /dev/sdt2 > > devid 11 size 3.64TiB used 3.64TiB path /dev/sdk2 > > devid 12 size 3.64TiB used 3.64TiB path /dev/sdq2 > > devid 13 size 3.64TiB used 3.64TiB path /dev/sds2 > > devid 14 size 3.64TiB used 3.64TiB path /dev/sdf2 > > devid 15 size 7.28TiB used 588.80GiB path /dev/sdr2 > > devid 16 size 7.28TiB used 588.80GiB path /dev/sdo2 > > devid 17 size 7.28TiB used 588.80GiB path /dev/sdv2 > > devid 18 size 7.28TiB used 588.80GiB path /dev/sdi2 > > devid 19 size 7.28TiB used 588.80GiB path /dev/sdl2 > > devid 20 size 7.28TiB used 588.80GiB path /dev/sde2 > > > > [root@cornelis ~]# mount /dev/sdn2 /mnt/data > > mount: /mnt/data: wrong fs type, bad option, bad superblock on > > /dev/sdn2, missing codepage or helper program, or other error. > > What is the dmesg of the mount failure? 
[Sun Dec  2 09:41:08 2018] BTRFS info (device sdn2): disk space caching is enabled
[Sun Dec  2 09:41:08 2018] BTRFS info (device sdn2): has skinny extents
[Sun Dec  2 09:41:08 2018] BTRFS error (device sdn2): parent transid verify failed on 46451963543552 wanted 114401 found 114173
[Sun Dec  2 09:41:08 2018] BTRFS critical (device sdn2): corrupt leaf: root=2 block=46451963543552 slot=0, unexpected item end, have 1387359977 expect 16283
[Sun Dec  2 09:41:08 2018] BTRFS warning (device sdn2): failed to read tree root
[Sun Dec  2 09:41:08 2018] BTRFS error (device sdn2): open_ctree failed

> And have you tried -o ro,degraded ?

Tried it just now; it gives the exact same error.

> > [root@cornelis ~]# btrfs check /dev/sdn2
> > Opening filesystem to check...
> > parent transid verify failed on 46451963543552 wanted 114401 found
> > 114173
> > parent transid verify failed on 46451963543552 wanted 114401 found
> > 114173
> > checksum verify failed on 46451963543552 found A8F2A769 wanted
> > 4C111ADF
> > checksum verify failed on 46451963543552 found 32153BE8 wanted
> > 8B07ABE4
> > checksum verify failed on
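[An editorial sketch, not part of the original thread: the repeated
"parent transid verify failed ... wanted 114401 found 114173" line
means the parent node expects a child block written in transaction
generation 114401, but the copy found on disk is from an older
generation. A quick shell check of how far behind that stale copy is:]

```shell
# Generations taken from the dmesg output above; the block on disk is
# hundreds of transactions older than what the current tree expects,
# which is why an old tree-root copy is unlikely to be fully usable.
wanted=114401
found=114173
echo "stale by $(( wanted - found )) transactions"
```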
Re: Need help with potential ~45TB dataloss
On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
> Hi all,
>
> I have been a happy BTRFS user for quite some time. But now I'm
> facing a potential ~45TB dataloss... :-(
> I hope someone can help!
>
> I have Server A and Server B. Both have a 20-device BTRFS RAID6
> filesystem. Because of the known RAID5/6 risks, Server B was a
> backup of Server A.
> After applying updates to Server B and rebooting, the FS would not
> mount anymore. Because it was "just" a backup, I decided to recreate
> the FS and perform a new backup. Later, I discovered that the FS was
> not broken, but that I had hit this issue:
> https://patchwork.kernel.org/patch/10694997/

Sorry for the inconvenience.

I didn't realize the max_chunk_size limit wasn't reliable at the time.

> Anyway, the FS was already recreated, so I needed to do a new
> backup. During the backup (using rsync -vah), Server A (the source)
> encountered an I/O error and my rsync failed. In an attempt to
> "quick fix" the issue, I rebooted Server A, after which the FS would
> not mount anymore.

Did you have any dmesg about that IO error?

And how was the reboot done? Forced power off or normal reboot
command?

> I documented what I have tried, below. I have not yet tried anything
> except what is shown, because I am afraid of causing more harm to
> the FS.

Pretty clever; no btrfs check --repair is a pretty good move.

> I hope somebody here can give me advice on how to (hopefully)
> retrieve my data...
>
> Thanks in advance!
>
> ==
>
> [root@cornelis ~]# btrfs fi show
> Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
>         Total devices 1 FS bytes used 463.92GiB
>         devid    1 size 800.00GiB used 493.02GiB path
> /dev/mapper/cornelis-cornelis--btrfs
>
> Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
>         Total devices 20 FS bytes used 44.85TiB
>         devid    1 size 3.64TiB used 3.64TiB path /dev/sdn2
>         devid    2 size 3.64TiB used 3.64TiB path /dev/sdp2
>         devid    3 size 3.64TiB used 3.64TiB path /dev/sdu2
>         devid    4 size 3.64TiB used 3.64TiB path /dev/sdx2
>         devid    5 size 3.64TiB used 3.64TiB path /dev/sdh2
>         devid    6 size 3.64TiB used 3.64TiB path /dev/sdg2
>         devid    7 size 3.64TiB used 3.64TiB path /dev/sdm2
>         devid    8 size 3.64TiB used 3.64TiB path /dev/sdw2
>         devid    9 size 3.64TiB used 3.64TiB path /dev/sdj2
>         devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
>         devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
>         devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
>         devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
>         devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
>         devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
>         devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
>         devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
>         devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
>         devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
>         devid   20 size 7.28TiB used 588.80GiB path /dev/sde2
>
> [root@cornelis ~]# mount /dev/sdn2 /mnt/data
> mount: /mnt/data: wrong fs type, bad option, bad superblock on
> /dev/sdn2, missing codepage or helper program, or other error.

What is the dmesg of the mount failure?

And have you tried -o ro,degraded ?

>
> [root@cornelis ~]# btrfs check /dev/sdn2
> Opening filesystem to check...
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
> have=75208089814272
> Couldn't read tree root

Would you please also paste the output of "btrfs ins dump-super
/dev/sdn2"?

It looks like your tree root (or at least some tree root nodes/leaves)
got corrupted.

> ERROR: cannot open file system

And since it's your tree root that is corrupted, you could also try
"btrfs-find-root" to get a good old copy of your tree root.

But I suspect the corruption happened before you noticed, so the old
tree root may not help much.

Also, the output of "btrfs ins dump-tree -t root" will help.

Thanks,
Qu

> [root@cornelis ~]# btrfs restore /dev/sdn2 /mnt/data/
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
> have=75208089814272
> Couldn't read
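[An editorial sketch, not part of the original thread: Qu's three
requests can be gathered in one pass before any repair attempt. The
helper name and the idea of saving output to files are assumptions of
this sketch; the device path is the one from the thread. The helper
only prints the commands, so they can be reviewed before running on
the live system:]

```shell
# Print the diagnostic commands Qu asks for ("btrfs ins" is the short
# form of "btrfs inspect-internal"). Redirect each printed command's
# output to a file on the live box, e.g. `... > dump-super.txt`.
diag_cmds() {
    dev=$1
    printf '%s\n' \
        "btrfs inspect-internal dump-super $dev" \
        "btrfs-find-root $dev" \
        "btrfs inspect-internal dump-tree -t root $dev"
}

diag_cmds /dev/sdn2
```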
Need help with potential ~45TB dataloss
Hi all,

I have been a happy BTRFS user for quite some time. But now I'm facing
a potential ~45TB dataloss... :-(
I hope someone can help!

I have Server A and Server B. Both have a 20-device BTRFS RAID6
filesystem. Because of the known RAID5/6 risks, Server B was a backup
of Server A.
After applying updates to Server B and rebooting, the FS would not
mount anymore. Because it was "just" a backup, I decided to recreate
the FS and perform a new backup. Later, I discovered that the FS was
not broken, but that I had hit this issue:
https://patchwork.kernel.org/patch/10694997/

Anyway, the FS was already recreated, so I needed to do a new backup.
During the backup (using rsync -vah), Server A (the source)
encountered an I/O error and my rsync failed. In an attempt to "quick
fix" the issue, I rebooted Server A, after which the FS would not
mount anymore.

I documented what I have tried, below. I have not yet tried anything
except what is shown, because I am afraid of causing more harm to the
FS.

I hope somebody here can give me advice on how to (hopefully) retrieve
my data...

Thanks in advance!
==

[root@cornelis ~]# btrfs fi show
Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
        Total devices 1 FS bytes used 463.92GiB
        devid    1 size 800.00GiB used 493.02GiB path /dev/mapper/cornelis-cornelis--btrfs

Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
        Total devices 20 FS bytes used 44.85TiB
        devid    1 size 3.64TiB used 3.64TiB path /dev/sdn2
        devid    2 size 3.64TiB used 3.64TiB path /dev/sdp2
        devid    3 size 3.64TiB used 3.64TiB path /dev/sdu2
        devid    4 size 3.64TiB used 3.64TiB path /dev/sdx2
        devid    5 size 3.64TiB used 3.64TiB path /dev/sdh2
        devid    6 size 3.64TiB used 3.64TiB path /dev/sdg2
        devid    7 size 3.64TiB used 3.64TiB path /dev/sdm2
        devid    8 size 3.64TiB used 3.64TiB path /dev/sdw2
        devid    9 size 3.64TiB used 3.64TiB path /dev/sdj2
        devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
        devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
        devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
        devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
        devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
        devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
        devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
        devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
        devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
        devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
        devid   20 size 7.28TiB used 588.80GiB path /dev/sde2

[root@cornelis ~]# mount /dev/sdn2 /mnt/data
mount: /mnt/data: wrong fs type, bad option, bad superblock on
/dev/sdn2, missing codepage or helper program, or other error.

[root@cornelis ~]# btrfs check /dev/sdn2
Opening filesystem to check...
parent transid verify failed on 46451963543552 wanted 114401 found 114173
parent transid verify failed on 46451963543552 wanted 114401 found 114173
checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
bad tree block 46451963543552, bytenr mismatch, want=46451963543552, have=75208089814272
Couldn't read tree root
ERROR: cannot open file system

[root@cornelis ~]# btrfs restore /dev/sdn2 /mnt/data/
parent transid verify failed on 46451963543552 wanted 114401 found 114173
parent transid verify failed on 46451963543552 wanted 114401 found 114173
checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
bad tree block 46451963543552, bytenr mismatch, want=46451963543552, have=75208089814272
Couldn't read tree root
Could not open root, trying backup super
warning, device 14 is missing
warning, device 13 is missing
warning, device 12 is missing
warning, device 11 is missing
warning, device 10 is missing
warning, device 9 is missing
warning, device 8 is missing
warning, device 7 is missing
warning, device 6 is missing
warning, device 5 is missing
warning, device 4 is missing
warning, device 3 is missing
warning, device 2 is missing
checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
bad tree block 22085632, bytenr mismatch, want=22085632, have=1147797504
ERROR: cannot read chunk root
Could not open root, trying backup super
warning, device 14 is missing
warning, device 13 is missing
warning, device 12 is missing
warning, device 11 is missing
warning, device 10 is missing
warning, device 9 is missing
warning, device 8 is missing
warning, device 7 is missing
warning, device 6 is missing
warning, device 5 is missing
warning, device 4
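[An editorial sketch, not part of the original thread: if
btrfs-find-root, as suggested earlier in the thread, does report
candidate tree-root bytenrs, each candidate can be passed to btrfs
restore's -t option with -D (dry run) and -i (ignore errors) before
attempting a real copy. The bytenr values below are hypothetical
placeholders, not values from this thread; the sketch only prints the
commands for review:]

```shell
# Build dry-run restore commands, one per candidate tree-root bytenr.
# Replace CANDIDATES with the bytenrs btrfs-find-root actually reports.
CANDIDATES="111111111 222222222"
for bytenr in $CANDIDATES; do
    echo "btrfs restore -t $bytenr -D -i /dev/sdn2 /mnt/data/"
done
```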