Re: Re: unable to mount btrfs partition, please help :(
On Sun, Mar 20, 2016 at 1:31 PM, Patrick Tschackert wrote:
> My raid is done with the scrub now, this is what i get:
>
> $ cat /sys/block/md0/md/mismatch_cnt
> 311936608

I think this is an assembly problem. Read errors don't result in mismatch counts; an md mismatch count happens when there's a mismatch between a data strip and its parity strip(s). So this is a lot of mismatches.

I think you need to take this problem to the linux-raid@ list; I don't think anyone on this list is going to be able to help with this portion of the problem. I'm only semi-literate with this, and you need to find out why there are so many mismatches and confirm whether the array is being assembled correctly.

In your writeup for the list you can include the URL for the first post to this list. I wouldn't repeat any of the VM crashing stuff because it's not really relevant. You'll need to include the kernel you were using at the time of the problem, the kernel you're using for the scrub, the version of mdadm, all the device metadata (-E for each device) and the array metadata (-D), and smartctl -A for each device to show bad sectors. (You could put smartctl -x for each drive into a file and then put the file up somewhere like Dropbox or Google Drive, or individually pastebin them if you can keep it all separate; -x is really verbose but sometimes contains read error information.)

The summary line is basically: this was working; after a VM crash followed by shutdown -r now, the Btrfs filesystem won't mount. A drive was faulty and was rebuilt onto a spare. You just did a check scrub and have all these errors in mismatch_cnt. The question is: how to confirm the array is properly assembled? Because that's too many errors, and the file system on that array will not mount.

Further complicating matters, even after the rebuild you have another drive that has some read errors.
Those weren't being fixed this whole time (during the rebuild, for example), likely because of the timeout vs SCT ERC misconfiguration; otherwise they would have been fixed.

> I also attached my dmesg output to this mail. Here's an excerpt:
> [12235.372901] sd 7:0:0:0: [sdh] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> [12235.372906] sd 7:0:0:0: [sdh] tag#15 Sense Key : Medium Error [current] [descriptor]
> [12235.372909] sd 7:0:0:0: [sdh] tag#15 Add. Sense: Unrecovered read error - auto reallocate failed
> [12235.372913] sd 7:0:0:0: [sdh] tag#15 CDB: Read(16) 88 00 00 00 00 00 af b2 bb 48 00 00 05 40 00 00
> [12235.372916] blk_update_request: I/O error, dev sdh, sector 2947727304
> [12235.372941] ata8: EH complete
> [12266.856747] ata8.00: exception Emask 0x0 SAct 0x7fff SErr 0x0 action 0x0
> [12266.856753] ata8.00: irq_stat 0x4008
> [12266.856756] ata8.00: failed command: READ FPDMA QUEUED
> [12266.856762] ata8.00: cmd 60/40:d8:08:17:b5/05:00:af:00:00/40 tag 27 ncq 688128 in
>                res 41/40:00:18:1b:b5/00:00:af:00:00/40 Emask 0x409 (media error)
> [12266.856765] ata8.00: status: { DRDY ERR }
> [12266.856767] ata8.00: error: { UNC }
> [12266.858112] ata8.00: configured for UDMA/133

What do you get for

smartctl -x /dev/sdh

I see this too:

[11440.088441] ata8.00: status: { DRDY }
[11440.088443] ata8.00: failed command: READ FPDMA QUEUED
[11440.088447] ata8.00: cmd 60/40:c8:e8:bc:15/05:00:ab:00:00/40 tag 25 ncq 688128 in
               res 50/00:00:00:00:00/00:00:00:00:00/a0 Emask 0x1 (device error)

That's weird. You have several other identical-model drives, so I doubt this is some sort of NCQ incompatibility with this model drive; no other drive is complaining like this. So I wonder if there's just something wrong with this drive aside from the bad sectors (?). I can't really tell, but it's suspicious.

> If I understand correctly, my /dev/sdh drive is having trouble.
> Could this be the problem? Should I set the drive to failed and rebuild on a
> spare disk?
You need to really slow down and understand the problem first. Every data loss case I've ever come across with md/mdadm raid6 was user-induced, because people changed too much stuff too fast without consulting people who know better. They got impatient. So I suggest going to the linux-raid@ list and asking there what's going on. The less you change the better, because most of the changes md/mdadm makes are irreversible.

--
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
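[The data-gathering Chris describes for the linux-raid report can be sketched as a short loop. This is a dry run that only prints the commands, so it needs neither root nor the actual drives; the device names sda..sdk and md0 are taken from this thread and will differ on other systems.]

```shell
# Dry run: print (rather than execute) the data-gathering commands for the
# linux-raid report, so nothing here needs root or the drives present.
cmds="mdadm -D /dev/md0"
for d in sda sdb sdc sdd sde sdf sdg sdh sdi sdj sdk; do
    cmds="$cmds
mdadm -E /dev/$d
smartctl -x /dev/$d"
done
printf '%s\n' "$cmds"
```

To capture the real output for the list, drop the dry run and redirect the actual commands into a file, e.g. `... > raid-report.txt 2>&1` as root.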
Re: unable to mount btrfs partition, please help :(
On Sun, Mar 20, 2016 at 6:19 AM, Martin Steigerwald wrote:
> On Sunday, 20 March 2016 10:18:26 CET Patrick Tschackert wrote:
>> > I think in retrospect the safe way to do these kinds of Virtual Box
>> > updates, which require kernel module updates, would have been to
>> > shutdown the VM and stop the array. *shrug*
>>
>> After this, I think I'll just do away with the virtual machine on this host,
>> as the app contained in that vm can also run on the host. I tried to be
>> fancy, and it seems to needlessly complicate things.
>
> I am not completely sure and I have no exact reference anymore, but I think I
> have read more than once about fs benchmarks running faster in Virtualbox than
> on the physical system, which may point at an at least incomplete fsync()
> implementation for writing into Virtualbox image files.
>
> I never found any proof of this, nor did I specifically seek to research it.
> So it may be true or not.

Sure, but that would only affect the guest's file system, the one inside the VDI. It's the host-managed filesystem that's busted.

--
Chris Murphy
Re: unable to mount btrfs partition, please help :(
On Sun, Mar 20, 2016 at 3:18 AM, Patrick Tschackert wrote:
> Thanks for answering again!
> So, first of all I installed a newer kernel from the backports as per
> Nicholas D Steeves' suggestion:
>
> $ apt-get install -t jessie-backports linux-image-4.3.0-0.bpo.1-amd64
>
> After rebooting:
> $ uname -a
> Linux vmhost 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) x86_64 GNU/Linux
>
> But the problem with mounting the filesystem persists :(
>
>> OK I went back and read this again: host is managing the md raid5, the
>> guest is writing Btrfs to an "encrypted container" but what is that? A
>> LUKS encrypted LVM LV that's directly used by Virtual Box as a raw
>> device? It's hard to say what layer broke this. But the VM crashing is
>> in effect like a power failure, and it's an open question (for me) how
>> this setup deals with barriers. A shutdown -r now should still cleanly
>> stop the array so I wouldn't expect there to be an array problem but
>> then you also report a device failure. Bad luck.
>
> The host is managing an md raid 6 (/dev/md0), and I had an encrypted volume
> (via cryptsetup) on top of that device.
> The host mounted the btrfs filesystem contained in that volume, and the VM
> wrote to the filesystem as well using a virtualbox shared folder.

OK, well, to me the VM doesn't seem related offhand. Ultimately it's only the host writing to the filesystem, even for the shared folder. The guest VM has no direct access to do Btrfs writes; it's something like a network-style shared folder.

> After this, I think I'll just do away with the virtual machine on this host,
> as the app contained in that vm can also run on the host.
> I tried to be fancy, and it seems to needlessly complicate things.

virt-manager or gnome-boxes work better, although you lose the shared folder; you'll have to come up with a workaround, like using NFS.
> $ for i in /sys/class/scsi_generic/*/device/timeout; do echo 120 > "$i"; done
> (I know this isn't persistent across reboots...)

Correct.

--
Chris Murphy
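[To make the 120-second timeout survive reboots, a udev rule is the usual approach. The rule text and file name below are a sketch modeled on common linux-raid list advice, not something from this thread, so verify it against your own udev version before relying on it.]

```shell
# Print a udev rule that re-applies the 120 s SCSI command timeout when a
# whole disk (sdX, not a partition) appears; save the output as e.g.
# /etc/udev/rules.d/60-disk-timeout.rules (file name is an assumption).
rule='ACTION=="add", SUBSYSTEM=="block", KERNEL=="sd[a-z]", ATTR{device/timeout}="120"'
echo "$rule"
```

After installing the rule, `udevadm control --reload` and a reboot (or re-plug) should apply it; check with `cat /sys/block/sdX/device/timeout`.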
Re: unable to mount btrfs partition, please help :(
On Sunday, 20 March 2016 10:18:26 CET Patrick Tschackert wrote:
> > I think in retrospect the safe way to do these kinds of Virtual Box
> > updates, which require kernel module updates, would have been to
> > shutdown the VM and stop the array. *shrug*
>
> After this, I think I'll just do away with the virtual machine on this host,
> as the app contained in that vm can also run on the host. I tried to be
> fancy, and it seems to needlessly complicate things.

I am not completely sure and I have no exact reference anymore, but I think I have read more than once about fs benchmarks running faster in Virtualbox than on the physical system, which may point at an at least incomplete fsync() implementation for writing into Virtualbox image files.

I never found any proof of this, nor did I specifically seek to research it. So it may be true or not.

Thanks,
--
Martin
Re: unable to mount btrfs partition, please help :(
Thanks for answering, I already upgraded to a backports kernel as mentioned here:
https://mail-archive.com/linux-btrfs@vger.kernel.org/msg51748.html

I now have:

$ uname -a
Linux vmhost 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) x86_64 GNU/Linux

As I wrote here
https://mail-archive.com/linux-btrfs@vger.kernel.org/msg51748.html
the problem still persists :(

Cheers,
Patrick

Sent: Sunday, 20 March 2016, 13:11
From: "Martin Steigerwald" <mar...@lichtvoll.de>
To: "Chris Murphy" <li...@colorremedies.com>
Cc: "Patrick Tschackert" <killing-t...@gmx.de>, "Btrfs BTRFS" <linux-btrfs@vger.kernel.org>
Subject: Re: unable to mount btrfs partition, please help :(

On Saturday, 19 March 2016 19:34:55 CET Chris Murphy wrote:
> >>> $ uname -a
> >>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
> >>> (2016-02-29) x86_64 GNU/Linux
> >>
> >> This is old. You should upgrade to something newer, ideally 4.5 but
> >> 4.4.6 is good also, and then the oldest I'd suggest is 4.1.20.
> >
> > Shouldn't I be able to get the newest kernel by executing "apt-get update
> > && apt-get dist-upgrade"? That's what I ran just now, and it doesn't
> > install a newer kernel. Do I really have to manually upgrade to a newer
> > one?
>
> I'm not sure. You might do a list search for debian, as I know debian
> users are using newer kernels that they didn't build themselves.

Try a backports¹ kernel. Add backports and do

apt-cache search linux-image

I use a 4.3 backport kernel successfully on two server VMs which use BTRFS.

[1] http://backports.debian.org/

Thx,
--
Martin
Re: unable to mount btrfs partition, please help :(
On Saturday, 19 March 2016 19:34:55 CET Chris Murphy wrote:
> >>> $ uname -a
> >>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
> >>> (2016-02-29) x86_64 GNU/Linux
> >>
> >> This is old. You should upgrade to something newer, ideally 4.5 but
> >> 4.4.6 is good also, and then the oldest I'd suggest is 4.1.20.
> >
> > Shouldn't I be able to get the newest kernel by executing "apt-get update
> > && apt-get dist-upgrade"? That's what I ran just now, and it doesn't
> > install a newer kernel. Do I really have to manually upgrade to a newer
> > one?
>
> I'm not sure. You might do a list search for debian, as I know debian
> users are using newer kernels that they didn't build themselves.

Try a backports¹ kernel. Add backports and do

apt-cache search linux-image

I use a 4.3 backport kernel successfully on two server VMs which use BTRFS.

[1] http://backports.debian.org/

Thx,
--
Martin
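[Enabling backports on jessie amounts to adding one APT source line and installing with `-t`. A sketch; the archive URL is an assumption, since jessie-backports has since moved from the main mirrors to archive.debian.org:]

```shell
# Print the APT source line for jessie-backports; append it to
# /etc/apt/sources.list (or a file in sources.list.d/), then run:
#   apt-get update
#   apt-get -t jessie-backports install linux-image-amd64
src='deb http://archive.debian.org/debian jessie-backports main'
echo "$src"
```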
Re: unable to mount btrfs partition, please help :(
Thanks for answering again! So, first of all I installed a newer kernel from the backports as per Nicholas D Steeves' suggestion:

$ apt-get install -t jessie-backports linux-image-4.3.0-0.bpo.1-amd64

After rebooting:

$ uname -a
Linux vmhost 4.3.0-0.bpo.1-amd64 #1 SMP Debian 4.3.5-1~bpo8+1 (2016-02-23) x86_64 GNU/Linux

But the problem with mounting the filesystem persists :(

> OK I went back and read this again: host is managing the md raid5, the
> guest is writing Btrfs to an "encrypted container" but what is that? A
> LUKS encrypted LVM LV that's directly used by Virtual Box as a raw
> device? It's hard to say what layer broke this. But the VM crashing is
> in effect like a power failure, and it's an open question (for me) how
> this setup deals with barriers. A shutdown -r now should still cleanly
> stop the array so I wouldn't expect there to be an array problem but
> then you also report a device failure. Bad luck.

The host is managing an md raid 6 (/dev/md0), and I had an encrypted volume (via cryptsetup) on top of that device. The host mounted the btrfs filesystem contained in that volume, and the VM wrote to the filesystem as well using a virtualbox shared folder. The vm then crashed, but I shut down the host with "shutdown -r now". After the reboot, one disk of the array was no longer present, but I managed to rebuild/restore using a spare disk. The RAID now seems to be healthy.

> I think in retrospect the safe way to do these kinds of Virtual Box
> updates, which require kernel module updates, would have been to
> shutdown the VM and stop the array. *shrug*

After this, I think I'll just do away with the virtual machine on this host, as the app contained in that vm can also run on the host. I tried to be fancy, and it seems to needlessly complicate things.

> These drives are technically not suitable for use in any kind of raid
> except linear and raid 0 (which have no redundancy so they aren't
> really raid).
> You'd have to dig up drive specs, assuming they're published, to see
> what the recovery times are for the drive models when a bad sector is
> encountered. But it's typical for such drives to exceed 30 seconds for
> recovery, with some drives reported to have 2+ minute recoveries. To
> properly configure them, you'll have to increase the kernel's SCSI
> command timer to at least 120 to make sure there's sufficient time to
> wait for the drive to explicitly spit back a read error to the kernel.
> Otherwise, the kernel gives up after 30 seconds, resets the link to
> the drive, and any possibility of fixing up the bad sector via the
> raid read-error fixup mechanism is thwarted. It's really common; the
> linux-raid@ list has many of these kinds of threads with this
> misconfiguration as the source problem.
>
> For the first listing of drives, yes. And 120 second delays might be
> too long for your use case, but that's the reality.
>
> You should change the command timer for the drives that do not support
> configurable SCT ERC. And then do a scrub check. And then check both
> cat /sys/block/mdX/md/mismatch_cnt, which ideally should be 0, and
> also check kernel messages for libata read errors.

So I did this:

$ cat /sys/block/md0/md/mismatch_cnt
0

$ for i in /sys/class/scsi_generic/*/device/timeout; do echo 120 > "$i"; done
(I know this isn't persistent across reboots...)

$ echo check > /sys/block/md0/md/sync_action

$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4]
md0 : active raid6 sda[0] sdf[12](S) sdg[11](S) sdj[9] sdh[7] sdi[6] sdk[10] sde[4] sdd[3] sdc[2] sdb[1]
      20510948416 blocks super 1.2 level 6, 64k chunk, algorithm 2 [9/9] [U]
      [>] check = 1.0% (30812476/2930135488) finish=340.6min speed=141864K/sec

unused devices:

So the raid is currently doing a scrub, which will take a few hours.

> Hmm, not good. See this similar thread.
> http://www.spinics.net/lists/linux-btrfs/msg51711.html
>
> backups in all superblocks have the same chunk_root, no alternative
> chunk root to try.
>
> So at the moment I think it's worth trying a newer kernel version and
> mounting normally; then mounting with -o recovery; then -o recovery,ro.
> If that doesn't work, you're best off waiting for a developer to give
> advice on the next step; 'btrfs rescue chunk-recover' seems most
> appropriate, but again, someone else a while back had success with
> zero-log, and it's hard to say if the two cases are really similar;
> maybe that person just got lucky. Both of those change the file system
> in irreversible ways; that's why I suggest waiting or asking on IRC.

Thanks again for taking the time to answer. I'll wait while my RAID is doing the scrub; maybe a dev will answer (like you said). The friendly people on IRC couldn't help and sent me here.
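[While the check runs, the percent-complete can be pulled out of /proc/mdstat with a one-line sed. Shown here against the sample status line copied from the mdstat output above rather than the live file, so the snippet runs anywhere:]

```shell
# Extract the completion percentage from an mdstat "check" status line.
# On a live system, substitute `cat /proc/mdstat` for the sample line.
line='[>] check =  1.0% (30812476/2930135488) finish=340.6min speed=141864K/sec'
pct=$(printf '%s\n' "$line" | sed -n 's/.*check = *\([0-9.]*\)%.*/\1/p')
echo "$pct"
```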
Re: unable to mount btrfs partition, please help :(
Patrick Tschackert posted on Sat, 19 Mar 2016 23:15:33 +0100 as excerpted:

> I'm growing increasingly desperate, can anyone help me?

No need to be desperate. As the sysadmin's rule of backups states, in its simple form, you either have at least one level of backup, or you are by your (in)action defining the data not backed up as worth less than the time, hassle and resources necessary to do that backup.

Therefore, there are only two possibilities:

1) You have a backup. No sweat. You can use it if you need to, so no desperation needed.

2) You don't have a backup. No sweat. By not having a backup, your actions defined the data at risk as worth less than the time, hassle and resources necessary for that backup, so if you lose the data, you can still be happy, because you saved what you defined as of most importance: the time, resources and hassle of doing that backup. Since you saved what you yourself defined, by your own actions, as of most value to you, either way you have what was most valuable to you, and can thus be happy to have the valuable stuff, even if you lost what was therefore much more trivial.

There are no other possibilities. Your words might lie; your actions don't. Either way, you saved the valuable stuff and thus have no reason to be desperate.

And of course btrfs, while stabilizing, is not yet fully stable and mature. While it is stable enough to be potentially suitable for those who have tested backups, or who are only using it with trivial data they can afford to lose anyway, it's certainly not at the level of stability of the more mature filesystems the above sysadmin's rule of backups was designed for. So that rule applies even MORE strongly to btrfs than it does to more mature and stable filesystems.
(FWIW, there's a more complex version of the rule that takes relative risk into account and covers multiple levels of backup, where either the risk is high enough or the data valuable enough to warrant it, but the simple form just says that if you don't have at least one backup, you are by that lack of backup defining the data at risk as not worth the time and trouble to do it.)

And there's no way that not knowing the btrfs status changes that either, because if you didn't know the status, it can only be because you didn't care enough about the reliability of the filesystem you were entrusting your data to, to research it. After all, both the btrfs wiki and the kernel btrfs option stress the need for backups if you're choosing btrfs, as does this list, repeatedly. So the only way someone couldn't know is if they didn't care enough to /bother/ to know, which again defines the data stored on the filesystem as of only trivial value, worth so little that it's not worth researching a new filesystem you plan on storing it on.

So there's no reason to be desperate. It'll only stress you out and increase your blood pressure. Either you considered the data valuable enough to have a backup, or you didn't. There is no third option. And either way, it's not worth stressing out over, because you either have that backup and thus don't need to stress, or you yourself defined the data as trivial by not having it.

> $ uname -a
> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4 (2016-02-29) x86_64 GNU/Linux
>
> $ btrfs --version
> btrfs-progs v4.4

As CMurphy says, that's an old kernel, not really supported by the list. With btrfs still stabilizing, the code is still changing pretty fast, and old kernels are known buggy kernels. The list focuses on the mainline kernel and its two primary tracks, the LTS kernel series and the current kernel series. On the current kernel track, the last two kernels are best supported. With 4.5 just out, that's 4.5 and 4.4.
On the LTS track, the two latest LTS kernel series are recommended, with 4.4 being the latest LTS kernel and 4.1 the one previous to that. However, 3.18 was the one previous to that and has been reasonably stable, so while the two latest LTS series remain recommended, we're still trying to support 3.18 too, for those who need to go that far back.

But 3.16 is previous to that, and is really too far back to be practically supported well by the list, as btrfs really is still stabilizing and our focus is forward, not backward. That doesn't mean we won't try to support it; it simply means that when there's a problem, the first recommendation, as you've seen, is likely to be to try a newer kernel.

Of course various distros do offer support for btrfs on older kernels, and we recognize that. However, our focus is on mainline, and we don't track which patches the various distros have backported and which they haven't, so we're not in a particularly good position to provide support for them, at least back further than the mainline kernels we support. If you wish to use btrfs on such old kernels, then, our recommendation is to get that support
Re: unable to mount btrfs partition, please help :(
On 19 March 2016 at 21:34, Chris Murphy wrote:
> On Sat, Mar 19, 2016 at 5:35 PM, Patrick Tschackert wrote:
>>>> $ uname -a
>>>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
>>>> (2016-02-29) x86_64 GNU/Linux
>>>
>>> This is old. You should upgrade to something newer, ideally 4.5 but
>>> 4.4.6 is good also, and then the oldest I'd suggest is 4.1.20.
>>
>> Shouldn't I be able to get the newest kernel by executing "apt-get update &&
>> apt-get dist-upgrade"?
>> That's what I ran just now, and it doesn't install a newer kernel. Do I
>> really have to manually upgrade to a newer one?
>
> I'm not sure. You might do a list search for debian, as I know debian
> users are using newer kernels that they didn't build themselves.
>
>> On top of the sticky situation i'm already in, i'm not sure if I trust
>> myself manually building a new kernel. Should I?

If you enable Debian backports, which I assume you have since you're running the version of btrfs-progs that was backported without a warning not to use it with old kernels... well, if backports are enabled then you can try:

apt-get install -t jessie-backports linux-image-4.3.0-0.bpo.1-amd64

linux-4.3.x was a complete mess on my laptop (Thinkpad X220, quite well supported), and I'm not sure if it was driver-related or btrfs-related. I actually started tracking linux-4.4 at rc1, it was so bad. If you don't want to try building your own kernel, I'd file a bug report against linux-image-amd64 asking for a backport of linux-4.4, which is in Stretch/testing; I'm surprised it hasn't been backported yet... The only issue I remember is an error message when booting, I think because the microcode interface changed between 4.3.x and 4.4.x. Installing microcode-related packages from backports is how I think I worked around this.
Alternatively, if you want to build your own kernel, you might be able to install linux-image from backports, download and untar linux-4.1.x somewhere, and then copy the config from /boot/config-4.3* to somedir/linux-4.1.x/.config.

I uploaded two scripts to github that I've been using for ages to track the upstream LTS kernel branch that Debian didn't choose. You can find them here: https://github.com/sten0/lts-convenience

All those syncs and btrfs sub sync lines are there because I always seem to run into strange issues when adding and removing snapshots.

Cheers,
Nicholas
Re: unable to mount btrfs partition, please help :(
On Sat, Mar 19, 2016 at 5:35 PM, Patrick Tschackert wrote:
> Hi Chris,
>
> thank you for answering so quickly!
>
>> Try 'btrfs check' without any options first.
>
> $ btrfs check /dev/mapper/storage
> checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
> checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
> bytenr mismatch, want=36340960788480, have=4530277753793296986
> Couldn't read chunk tree
> Couldn't open file system
>
>> To me it seems the problem is instigated by lower layers either not
>> completing critical writes at the time of the power failure, or didn't
>> rebuild correctly.
>
> There wasn't a power failure, a VM crashed whilst writing to the btrfs
> filesys.

OK I went back and read this again: the host is managing the md raid5, and the guest is writing Btrfs to an "encrypted container", but what is that? A LUKS-encrypted LVM LV that's directly used by Virtual Box as a raw device? It's hard to say what layer broke this. But the VM crashing is in effect like a power failure, and it's an open question (for me) how this setup deals with barriers. A shutdown -r now should still cleanly stop the array, so I wouldn't expect there to be an array problem, but then you also report a device failure. Bad luck.

I think in retrospect the safe way to do these kinds of Virtual Box updates, which require kernel module updates, would have been to shut down the VM and stop the array. *shrug*

>> You should check the SCT ERC setting on each drive with 'smartctl -l
>> scterc /dev/sdX' and also the kernel command timer setting with 'cat
>> /sys/block/sdX/device/timeout' for each device. The SCT ERC value
>> must be less than the command timer. It's a common misconfiguration
>> with raid setups.
>
> $ smartctl -l scterc /dev/sda (sdb, sdc, sde, sdg)
> gives me
>
> smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
> Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control command not supported

These drives are technically not suitable for use in any kind of raid except linear and raid 0 (which have no redundancy, so they aren't really raid). You'd have to dig up drive specs, assuming they're published, to see what the recovery times are for the drive models when a bad sector is encountered. But it's typical for such drives to exceed 30 seconds for recovery, with some drives reported to have 2+ minute recoveries. To properly configure them, you'll have to increase the kernel's SCSI command timer to at least 120 to make sure there's sufficient time to wait for the drive to explicitly spit back a read error to the kernel. Otherwise, the kernel gives up after 30 seconds, resets the link to the drive, and any possibility of fixing up the bad sector via the raid read-error fixup mechanism is thwarted. It's really common; the linux-raid@ list has many of these kinds of threads with this misconfiguration as the source problem.

>
> while
> $ smartctl -l scterc /dev/sdf (sdh, sdi, sdj, sdk)
> gives me
>
> smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
> Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
>
> SCT Error Recovery Control:
>            Read:     70 (7.0 seconds)
>           Write:     70 (7.0 seconds)

These drives are suitable for raid out of the box.

>
> $ cat /sys/block/sdX/device/timeout
> gives me "30" for every device
>
> Does that mean my settings for the device timeouts are wrong?

For the first listing of drives, yes. And 120-second delays might be too long for your use case, but that's the reality. You should change the command timer for the drives that do not support configurable SCT ERC. And then do a scrub check.
And then check both cat /sys/block/mdX/md/mismatch_cnt, which ideally should be 0, and also check kernel messages for libata read errors.

>
>> After that's fixed you should do a scrub, and I'm thinking it's best
>> to do only a check, which means 'echo check >
>> /sys/block/mdX/md/sync_action' rather than issuing repair, which
>> assumes data strips are correct and parity strips are wrong and
>> rebuilds all parity strips.
>
> I don't quite understand, I thought a scrub could only be done on a mounted
> filesys?

You have two scrubs. There's a Btrfs scrub, and an md scrub. I'm referring to the latter.

> Do you really mean executing the command "echo check >
> /sys/block/md0/md/sync_action"? At the moment it says "idle" in that file.
> Also, the btrfs filesys sits in an encrypted container, so the setup looks
> like this:
>
> /dev/md0 (this is the Raid device)
> /dev/mapper/storage (after cryptsetup luksOpen, this is where the filesys
> should be mounted from)
> /media/storage (i always mounted the filesystem into this folder by executing
> "mount /dev/mapper/storage /media/storage")
>
> Apologies if I didn't make that clear enough in my initial email

Ok so the host is writing Btrfs to
Re: unable to mount btrfs partition, please help :(
Hi Chris,

thank you for answering so quickly!

> Try 'btrfs check' without any options first.

$ btrfs check /dev/mapper/storage
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
checksum verify failed on 36340960788480 found 8F8E1006 wanted 4AA1BC89
bytenr mismatch, want=36340960788480, have=4530277753793296986
Couldn't read chunk tree
Couldn't open file system

> To me it seems the problem is instigated by lower layers either not
> completing critical writes at the time of the power failure, or didn't
> rebuild correctly.

There wasn't a power failure; a VM crashed whilst writing to the btrfs filesys. I then rebooted the whole system via "shutdown -r now", after which the filesystem wasn't mountable. The rebuild/restore of the raid seemed to go just fine, though.

> You should check the SCT ERC setting on each drive with 'smartctl -l
> scterc /dev/sdX' and also the kernel command timer setting with 'cat
> /sys/block/sdX/device/timeout' for each device. The SCT ERC value
> must be less than the command timer. It's a common misconfiguration
> with raid setups.

$ smartctl -l scterc /dev/sda (sdb, sdc, sde, sdg)
gives me

smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control command not supported

while
$ smartctl -l scterc /dev/sdf (sdh, sdi, sdj, sdk)
gives me

smartctl 6.4 2014-10-07 r4002 [x86_64-linux-3.16.0-4-amd64] (local build)
Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org

SCT Error Recovery Control:
           Read:     70 (7.0 seconds)
          Write:     70 (7.0 seconds)

$ cat /sys/block/sdX/device/timeout
gives me "30" for every device

Does that mean my settings for the device timeouts are wrong?
> After that's fixed you should do a scrub, and I'm thinking it's best
> to do only a check, which means 'echo check >
> /sys/block/mdX/md/sync_action' rather than issuing repair which
> assumes data strips are correct and parity strips are wrong and
> rebuilds all parity strips.

I don't quite understand, I thought a scrub could only be done on a mounted filesys? Do you really mean executing the command "echo check > /sys/block/md0/md/sync_action"? At the moment it says "idle" in that file.

Also, the btrfs filesys sits in an encrypted container, so the setup looks like this:

/dev/md0 (this is the Raid device)
/dev/mapper/storage (after cryptsetup luksOpen, this is where the filesys should be mounted from)
/media/storage (i always mounted the filesystem into this folder by executing "mount /dev/mapper/storage /media/storage")

Apologies if I didn't make that clear enough in my initial email

>> $ uname -a
>> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
>> (2016-02-29) x86_64 GNU/Linux
>
> This is old. You should upgrade to something newer, ideally 4.5 but
> 4.4.6 is good also, and then the oldest I'd suggest is 4.1.20.

Shouldn't I be able to get the newest kernel by executing "apt-get update && apt-get dist-upgrade"? That's what I ran just now, and it doesn't install a newer kernel. Do I really have to manually upgrade to a newer one? On top of the sticky situation i'm already in, i'm not sure if I trust myself manually building a new kernel. Should I?
> What do you get for
> btrfs-find-root /dev/mdX
> btrfs-show-super -fa /dev/mdX

$ btrfs-find-root /dev/mapper/storage
Couldn't read chunk tree
Open ctree failed

$ btrfs-show-super -fa /dev/mapper/storage
superblock: bytenr=65536, device=/dev/mapper/storage
---------------------------------------------------------
csum			0xf3887f83 [match]
bytenr			65536
flags			0x1 ( WRITTEN )
magic			_BHRfS_M [match]
fsid			9868d803-78d1-40c3-b1ee-a4ce3363df87
label			
generation		1322969
root			24022309593088
sys_array_size		97
chunk_root_generation	1275381
root_level		2
chunk_root		36340959809536
chunk_root_level	2
log_root		0
log_root_transid	0
log_root_level		0
total_bytes		21003208163328
bytes_used		17670843191296
sectorsize		4096
nodesize		4096
leafsize		4096
stripesize		4096
root_dir		6
num_devices		1
compat_flags		0x0
compat_ro_flags		0x0
incompat_flags		0x1 ( MIXED_BACKREF )
csum_type		0
csum_size		4
cache_generation	1322969
uuid_tree_generation	1322969
dev_item.uuid		c1123f55-46ce-4931-8722-7387fee07608
dev_item.fsid		9868d803-78d1-40c3-b1ee-a4ce3363df87 [match]
dev_item.type		0
dev_item.total_bytes	21003208163328
dev_item.bytes_used	17886424858624
dev_item.io_align	4096
dev_item.io_width	4096
dev_item.sector_size	4096
Re: unable to mount btrfs partition, please help :(
On Sat, Mar 19, 2016 at 4:15 PM, Patrick Tschackert wrote:
> I'm growing increasingly desperate, can anyone help me? I'm thinking
> of trying one or more of the following, but would like an informed
> opinion:
> 1) btrfs check --fix-crc
> 2) btrfs-check --init-csum-tree
> 3) btrfs rescue chunk-recover
> 4) btrfs-check --repair
> 5) btrfs rescue zero-log

None of the above. Try 'btrfs check' without any options first.

To me it seems the problem is instigated by lower layers either not
completing critical writes at the time of the power failure, or didn't
rebuild correctly.

You should check the SCT ERC setting on each drive with 'smartctl -l
scterc /dev/sdX' and also the kernel command timer setting with 'cat
/sys/block/sdX/device/timeout' also for each device. The SCT ERC value
must be less than the command timer. It's a common misconfiguration
with raid setups.

After that's fixed you should do a scrub, and I'm thinking it's best
to do only a check, which means 'echo check >
/sys/block/mdX/md/sync_action' rather than issuing repair, which
assumes data strips are correct and parity strips are wrong and
rebuilds all parity strips.

> $ uname -a
> Linux vmhost 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u4
> (2016-02-29) x86_64 GNU/Linux

This is old. You should upgrade to something newer, ideally 4.5 but
4.4.6 is good also, and then the oldest I'd suggest is 4.1.20.

> $ btrfs --version
> btrfs-progs v4.4

Good.
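The check-scrub step above can be sketched like this. It runs against the raw md array, so neither the LUKS container nor the filesystem needs to be open; `md0` matches this thread. The reporting helper takes a file path so it can be exercised on a saved copy of mismatch_cnt instead of the live sysfs node.

```shell
#!/bin/sh
# Hedged sketch of the read-only md "check" scrub. "check" only compares
# data strips against parity and counts differences; "repair" would
# rewrite parity, which is what we want to avoid until correct assembly
# is confirmed.

md_check() {
    md=${1:-md0}
    # Needs root; progress appears in /proc/mdstat, and the scrub is done
    # when sync_action reads "idle" again.
    echo check > "/sys/block/$md/md/sync_action"
}

# report_mismatches PATH -- takes a path to a mismatch_cnt file so it can
# be tried on a copy; the live path is /sys/block/md0/md/mismatch_cnt
report_mismatches() {
    cnt=$(cat "$1")
    if [ "$cnt" -eq 0 ]; then
        echo "array consistent"
    else
        echo "$cnt mismatches - investigate assembly before repairing"
    fi
}

# Demo on a saved copy (the count reported at the top of this thread):
printf '311936608\n' > "${TMPDIR:-/tmp}/mismatch_cnt"
report_mismatches "${TMPDIR:-/tmp}/mismatch_cnt"
```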
> $ btrfs fi show
> Label: none  uuid: 9868d803-78d1-40c3-b1ee-a4ce3363df87
>         Total devices 1 FS bytes used 16.07TiB
>         devid 1 size 19.10TiB used 16.27TiB path /dev/mapper/storage
>
> excerpt from dmesg:
> [  151.970916] BTRFS: device fsid 9868d803-78d1-40c3-b1ee-a4ce3363df87
>                devid 1 transid 1322969 /dev/dm-0
> [  163.105784] BTRFS info (device dm-0): disk space caching is enabled
> [  165.304968] BTRFS: bad tree block start 4530277753793296986 36340960788480
> [  165.305233] BTRFS: bad tree block start 4530277753793296986 36340960788480
> [  165.305281] BTRFS: failed to read chunk tree on dm-0
> [  165.331407] BTRFS: open_ctree failed

Yeah, this typically isn't a good message. There's one surprising (to me)
case where someone had luck getting this fixed with btrfs-zero-log, which
is unexpected. But I think it's very premature to make changes to the
file system until you have more information.

What do you get for

btrfs-find-root /dev/mdX
btrfs-show-super -fa /dev/mdX

-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html