Re: Frequent btrfs corruption on a USB flash drive
>> Device is GOOD
>>
>> I also created a big file with dd using /dev/urandom with the same size
>> as my flash drive, copied it once and read it three times. The SHA-1
>> checksum is always the same and matches the original one on the hard disk.
>>
>> So after much testing I feel I can conclude that my USB flash drive is
>> not fake and it is not defective.
>>
> For what it's worth, there's multiple other things that could cause similar
> issues. I've had a number of cases where bad USB hubs or poorly designed
> (or just buggy or failing) USB controllers caused similar data corruption,
> the most recent one being an issue with both a bad USB 2.0 hub (which did
> not properly implement the USB standard, counterfeit USB devices come in all
> types) and a malfunctioning USB 3.0 controller (which did not properly
> account for things that didn't properly implement the standard and had no
> recovery code to handle this in the drivers). I ended up in most cases
> checking the ports using other USB devices (at least a keyboard, a mouse,
> and a USB serial adapter).

Like Austin, I also want to note that there might be USB-related issues that only pop up after some time and not in tests.

For example, this weekend I connected a 2.5-inch 500G drive with its Y-cable to an H87M-Pro board that is fed by an 80+ Gold PSU, despite the many 'bad sectors' I remembered it having 2 years ago in a btrfs raid1 setup. This 500G disk has worked well for almost 2 years connected to a 7-inch eeepc4G, XFS formatted. But with the H87M-Pro I just now saw that it dropped off the USB bus every now and then, causing trouble for Btrfs.

For connecting hard disks to phones, I once bought an externally powered hub, and I put that between the board and the 500G disk. That made it all stable: no disconnects, and Btrfs works fine as expected.

I had similar issues on another PC with a Sandisk Extreme 64G USB3 stick, but that was likely a protocol issue.

So maybe try the stick with your use case in another hardware setup; hopefully it then stays stable for longer than the few days it does now.

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 19:57, Chris Murphy wrote:
> Use F3 to test flash:
> http://oss.digirati.com.br/f3/

I tested my USB flash drive with F3 as you suggested, and there's no indication it is a fake device.

---
# f3probe --destructive /dev/sdb
F3 probe 6.0
Copyright (C) 2010 Digirati Internet LTDA.
This is free software; see the source for copying conditions.

WARNING: Probing normally takes from a few seconds to 15 minutes, but
         it can take longer. Please be patient.

Good news: The device `/dev/sdb' is the real thing

Device geometry:
         *Usable* size: 57.69 GB (120979456 blocks)
        Announced size: 57.69 GB (120979456 blocks)
                Module: 64.00 GB (2^36 Bytes)
Approximate cache size: 0.00 Byte (0 blocks), need-reset=no
   Physical block size: 512.00 Byte (2^9 Bytes)

Probe time: 2'23"
--
$ f3read /run/media/fturco/a7d8a7b1-e0c2-4fbb-879f-e17046989f3c
                  SECTORS ok/corrupted/changed/overwritten
Validating file 1.h2w ... 2097152/0/ 0/ 0
Validating file 2.h2w ... 2097152/0/ 0/ 0
Validating file 3.h2w ... 2097152/0/ 0/ 0
Validating file 4.h2w ... 2097152/0/ 0/ 0
Validating file 5.h2w ... 2097152/0/ 0/ 0
Validating file 6.h2w ... 2097152/0/ 0/ 0
Validating file 7.h2w ... 2097152/0/ 0/ 0
Validating file 8.h2w ... 2097152/0/ 0/ 0
Validating file 9.h2w ... 2097152/0/ 0/ 0
Validating file 10.h2w ... 2097152/0/ 0/ 0
Validating file 11.h2w ... 2097152/0/ 0/ 0
Validating file 12.h2w ... 2097152/0/ 0/ 0
Validating file 13.h2w ... 2097152/0/ 0/ 0
Validating file 14.h2w ... 2097152/0/ 0/ 0
Validating file 15.h2w ... 2097152/0/ 0/ 0
Validating file 16.h2w ... 2097152/0/ 0/ 0
Validating file 17.h2w ... 2097152/0/ 0/ 0
Validating file 18.h2w ... 2097152/0/ 0/ 0
Validating file 19.h2w ... 2097152/0/ 0/ 0
Validating file 20.h2w ... 2097152/0/ 0/ 0
Validating file 21.h2w ... 2097152/0/ 0/ 0
Validating file 22.h2w ... 2097152/0/ 0/ 0
Validating file 23.h2w ... 2097152/0/ 0/ 0
Validating file 24.h2w ... 2097152/0/ 0/ 0
Validating file 25.h2w ... 2097152/0/ 0/ 0
Validating file 26.h2w ... 2097152/0/ 0/ 0
Validating file 27.h2w ... 2097152/0/ 0/ 0
Validating file 28.h2w ... 2097152/0/ 0/ 0
Validating file 29.h2w ... 2097152/0/ 0/ 0
Validating file 30.h2w ... 2097152/0/ 0/ 0
Validating file 31.h2w ... 2097152/0/ 0/ 0
Validating file 32.h2w ... 2097152/0/ 0/ 0
Validating file 33.h2w ... 2097152/0/ 0/ 0
Validating file 34.h2w ... 2097152/0/ 0/ 0
Validating file 35.h2w ... 2097152/0/ 0/ 0
Validating file 36.h2w ... 2097152/0/ 0/ 0
Validating file 37.h2w ... 2097152/0/ 0/ 0
Validating file 38.h2w ... 2097152/0/ 0/ 0
Validating file 39.h2w ... 2097152/0/ 0/ 0
Validating file 40.h2w ... 2097152/0/ 0/ 0
Validating file 41.h2w ... 2097152/0/ 0/ 0
Validating file 42.h2w ... 2097152/0/ 0/ 0
Validating file 43.h2w ... 2097152/0/ 0/ 0
Validating file 44.h2w ... 2097152/0/ 0/ 0
Validating file 45.h2w ... 2097152/0/ 0/ 0
Validating file 46.h2w ... 2097152/0/ 0/ 0
Validating file 47.h2w ... 2097152/0/ 0/ 0
Validating file 48.h2w ... 2097152/0/ 0/ 0
Validating file 49.h2w ... 2097152/0/ 0/ 0
Validating file 50.h2w ... 2097152/0/ 0/ 0
Validating file 51.h2w ... 2097152/0/ 0/ 0
Validating file 52.h2w ... 2097152/0/ 0/ 0
Validating file 53.h2w ... 2097152/0/ 0/ 0
Validating file 54.h2w ... 2097152/0/ 0/ 0
Validating file 55.h2w ... 2097152/0/ 0/ 0
Validating file 56.h2w ... 1364266/0/ 0/ 0

  Data OK: 55.65 GB (116707626 sectors)
Data LOST: 0.00 Byte (0 sectors)
	       Corrupted: 0.00 Byte (0 sectors)
	Slightly changed: 0.00 Byte (0 sectors)
	     Overwritten: 0.00 Byte (0 sectors)
Average reading speed: 34.73 MB/s

> Read more, and also includes a much faster alternative for GNOME:
> https://blogs.gnome.org/hughsie/2015/01/28/detecting-fake-flash/

I also tested my flash drive with gnome-multi-writer-probe, and it says it is not fake:

# gnome-multi-writer-probe /dev/sdb
Device is GOOD

I also created a big
Re: Frequent btrfs corruption on a USB flash drive
On Thu, Jul 7, 2016 at 4:38 PM, Andrew E. Mileski wrote:
> On 2016-07-07 17:13, Francesco Turco wrote:
>>
>> On 2016-07-07 23:11, Andrew E. Mileski wrote:
>>>
>>> How large is this USB flash device?
>>
>> 64 GB.
>>
>
> I don't know if there is an official recommended minimum size for btrfs, but
> I would expect 64 GB to be okay.

In my similar case, it was a 16GiB stick, but the Btrfs on LUKS partition was maybe 4GiB. Again I used -M and ran into zero problems in ~6 months of almost daily usage. I was not using rsync, though; I was using it for an encrypted /home.

--
Chris Murphy
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 17:13, Francesco Turco wrote:
> On 2016-07-07 23:11, Andrew E. Mileski wrote:
>> How large is this USB flash device?
>
> 64 GB.

I don't know if there is an official recommended minimum size for btrfs, but I would expect 64 GB to be okay.

I've personally set my minimum recommendation for btrfs at 120 GB based on my experience with failures in various flash devices from 4 to 30 GB.

If you want to experiment, I have a theory that formatting single volumes with "-m single" can avoid a potential controller race in one specific situation, plus it helps to reduce the metadata overhead on smaller devices.

Lastly, the last two USB issues I investigated were both fixed by replacing the cables. Something to try if it is a cabled device.

~~
Andrew E. Mileski
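For anyone who wants to try the two formatting variants mentioned in this thread ("-m single" here, "-M" in Chris's messages), a small dry-run helper might look like the sketch below. It only prints the command so it can be reviewed before being run as root. The function name and the mode names are made up for illustration, and the device path in the usage note is a placeholder. Both flags are standard mkfs.btrfs options: "-m single" keeps one copy of metadata instead of a duplicated one, and "-M" puts data and metadata in mixed block groups.

```sh
#!/bin/sh
# btrfs_mkfs_cmd MODE DEVICE
# Print (do not run) a mkfs.btrfs command line for one of two experiments.
# MODE names are hypothetical labels chosen for this sketch:
#   single-metadata -> "-m single" (one metadata copy, no duplication)
#   mixed           -> "-M"        (mixed data+metadata block groups)
btrfs_mkfs_cmd() {
    mode=$1
    device=$2
    case $mode in
        single-metadata) echo "mkfs.btrfs -m single $device" ;;
        mixed)           echo "mkfs.btrfs -M $device" ;;
        *)
            echo "usage: btrfs_mkfs_cmd single-metadata|mixed DEVICE" >&2
            return 2
            ;;
    esac
}
```

Usage would be something like `btrfs_mkfs_cmd mixed /dev/mapper/your-luks-device`, then running the printed command by hand once the device path has been double-checked, since mkfs destroys whatever is on it.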
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 09:49, Francesco Turco wrote:
> I have a USB flash drive with an encrypted Btrfs filesystem where I store daily backups. My problem is that this btrfs filesystem gets corrupted very often, after a few days of usage. Usually I just reformat it and move along, but this time I'd like to understand the root cause of the problem and fix it.

How large is this USB flash device?

I've had issues with btrfs and small devices, where a 1 GB data chunk is relatively large.

~~
Andrew E. Mileski
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 23:11, Andrew E. Mileski wrote:
> How large is this USB flash device?

64 GB.

--
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 20:25, Chris Murphy wrote:
> On Thu, Jul 7, 2016 at 8:55 AM, Francesco Turco wrote:
>> Perhaps I
>> should try to rule out a hardware problem by filling my USB flash drive
>> with a large random file and then checking if its SHA-1 checksum
>> corresponds to the original copy on the hard disk. But first I probably
>> should back up the current Btrfs filesystem with the dd command. Can I
>> proceed?
>
> https://btrfs.wiki.kernel.org/index.php/Gotchas

Thank you for the link, I didn't know that using LVM snapshots or mounting dd copies can create problems! That could explain some of the problems I had in the past.

--
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34
Re: Frequent btrfs corruption on a USB flash drive
On Thu, Jul 7, 2016 at 8:55 AM, Francesco Turco wrote:

> I'm not sure. Commands don't fail explicitly when I use ext4, but I
> agree with you that I may get corruption silently nonetheless.

Use the XFS v5 format, which is the default in xfsprogs 3.2.3 and later. It at least checksums metadata.

> Perhaps I
> should try to rule out a hardware problem by filling my USB flash drive
> with a large random file and then checking if its SHA-1 checksum
> corresponds to the original copy on the hard disk. But first I probably
> should back up the current Btrfs filesystem with the dd command. Can I
> proceed?

https://btrfs.wiki.kernel.org/index.php/Gotchas

>> Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or
>> is it just raw encryption, or even something completely different?), on
>> a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it),
>> correct?
>
> I'm using a btrfs filesystem on a GUID partition encrypted with LUKS.
> It's a Kingston USB flash drive connected directly to my desktop machine
> via USB. It's definitely not an SSD or an HDD, and I'm not using any
> adapter.

First, definitely check to make sure it's not fake. It's a well-known brand and there's a lot of incentive to make fake Kingston devices.

I have a Kingston DTR500 and have used it in the same use case you have, Btrfs on LUKS, for maybe 6 months with no corruptions. In my case I formatted with -M (mixed bg), and it was with kernels older than 4.x, but otherwise it sounds the same. Granted, individual units of the same model can differ a lot, let alone different models. But if it's a Btrfs bug, it might be a regression.

I wonder if this might be a use case for one of the integrity check mount options? It slows things down a lot, but the extra checking might help pinpoint at least the moment something bad is happening.

--
Chris Murphy
Re: Frequent btrfs corruption on a USB flash drive
On Thu, Jul 7, 2016 at 7:49 AM, Francesco Turco wrote:

> $ btrfs filesystem show
> /run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3
> $

Try it with sudo. I think it's a bug that 'btrfs fi show' returns silently for non-root. It should produce an error that root privileges are needed, or it should work for unprivileged users.

> Btrfs-check reports many errors. I attached the output to this e-mail
> message.
>
> Output from dmesg:
>
> $ dmesg | tail
> [18756.159963] BTRFS error (device dm-4): bad tree block start
> 6592115285688248773 35323904

The problem happened before this, so I think we need the entire dmesg.

> I checked this USB flash drive with badblocks in non-destructive
> read-write mode. No errors.

Use F3 to test flash:
http://oss.digirati.com.br/f3/

Some distros have it in their repo; Fedora does. It's a bit unintuitive: what you need to do is use the write binary to write the test files to the stick (this is destructive) and then use the read binary to read back the written files.

Read more here; the post also includes a much faster alternative for GNOME:
https://blogs.gnome.org/hughsie/2015/01/28/detecting-fake-flash/

--
Chris Murphy
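Since `dmesg | tail` only shows the aftermath, one way to find where the trouble starts is to save the whole kernel log to a file and print the first "BTRFS error" line together with the lines just before it. The following is a minimal sketch; the function name, the default of 20 context lines, and the file name in the usage note are all arbitrary choices, not anything btrfs-specific.

```sh
#!/bin/sh
# first_btrfs_error LOGFILE [CONTEXT]
# Print the first line containing "BTRFS error" in LOGFILE, preceded by up to
# CONTEXT earlier lines (default 20), so that whatever the kernel logged just
# before the first failure (USB resets, dm messages, ...) is visible.
first_btrfs_error() {
    log=$1
    context=${2:-20}
    # Line number of the first match; empty when there is none.
    line=$(grep -n -m 1 'BTRFS error' "$log" | cut -d : -f 1)
    if [ -z "$line" ]; then
        echo "no BTRFS error lines in $log"
        return 1
    fi
    start=$((line - context))
    [ "$start" -ge 1 ] || start=1
    sed -n "${start},${line}p" "$log"
}
```

Usage: save the log right after the problem shows up with `dmesg > dmesg-full.txt` (this may need root on some systems), then run `first_btrfs_error dmesg-full.txt 50`. The full file is still what should be attached to the list; the helper only makes it easier to see what came first.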
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 10:55, Francesco Turco wrote:
> On 2016-07-07 16:27, Austin S. Hemmelgarn wrote:
>> This seems odd, are you trying to access anything over NFS or some other network filesystem protocol here? If not, then I believe you've found a bug, because I'm pretty certain we shouldn't be returning -ESTALE for anything.
>
> No, I don't use NFS or any other network filesystem.

OK, I'm going to try and check the kernel code to figure out if there's any other case we might return that in. I'm pretty certain that there's nowhere BTRFS should return that though, which means you've either hit a bug or have some other hardware issue (given past experience, I think it's more likely that you've hit a bug).

>> The question here is: Do you get any data corruption when using ext4? Quite often when there's a hardware issue, you won't see _any_ indication of it other than corrupted files when using something like ext4 or XFS, but it will show up almost immediately with BTRFS because we validate checksums on almost everything. There have been at least a couple of times I've found disk issues while converting from ext4 to BTRFS that I didn't know existed before, and then going back was able to reliably reproduce using other tools.
>>
>> Also, FWIW, badblocks is not necessarily a reliable test method for flash drives, they often handle serialized reads like badblocks does very well even when failing.
>
> I'm not sure. Commands don't fail explicitly when I use ext4, but I agree with you that I may get corruption silently nonetheless. Perhaps I should try to rule out a hardware problem by filling my USB flash drive with a large random file and then checking if its SHA-1 checksum corresponds to the original copy on the hard disk. But first I probably should back up the current Btrfs filesystem with the dd command. Can I proceed?

Yeah, I would suggest backing up the filesystem. Be careful, though, that you don't have both copies of the filesystem visible to the system at the same time once you've finished creating the backup copy, as there are potential issues if both are visible while you try to mount the FS.

As far as checking the drive, I'd do essentially what you had said, with two extra parts:
1. Calculate the checksum of the data on the drive multiple times and make sure that it matches each time as well as matching the original file (if it doesn't match the original file, but each calculation from the drive matches, then the issue is something in the write path only).
2. Do so multiple times so you can be sure to cover _every_ block. Most flash drives have a pool of spare blocks that are used for wear leveling, and if the issue is in one of those, this is the only way to find it.

You might also try doing some testing with FIO or iozone; those tend to exercise a wider variety of things than stuff like badblocks or dd. Also, since you'll have a backup copy of the FS, you might consider running a destructive test with badblocks (it works a bit more reliably on flash devices this way, just make sure to run it multiple times too), both with and without the -B option (-B affects how things are buffered; if you see errors with it enabled but none without it, then you probably have some bad RAM).

>> Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or is it just raw encryption, or even something completely different?), on a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it), correct?
>
> I'm using a btrfs filesystem on a GUID partition encrypted with LUKS. It's a Kingston USB flash drive connected directly to my desktop machine via USB. It's definitely not an SSD or an HDD, and I'm not using any adapter.

OK, that both simplifies things, and makes them a bit more complicated. If it had been an SSD or HDD connected through an adapter, the preferred method of checking would be to pull it out and put it directly in the system to verify the drive. However, since it's a regular flash drive, if it is the drive, it will probably be significantly less expensive to replace.
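The repeated-checksum procedure from step 1 above could be scripted roughly as follows. This is only a sketch: the function names are invented for illustration, and it assumes the reference file lives on a trusted disk and the copy on the stick under test. One important caveat on a real system: re-reading a file that was just written is usually served from the page cache, so the stick should be unmounted and remounted (or caches dropped as root) before each read, otherwise the flash is never actually exercised.

```sh
#!/bin/sh
# sha1_of FILE: print only the hex SHA-1 digest of FILE.
sha1_of() {
    sha1sum "$1" | cut -d ' ' -f 1
}

# verify_copy REFERENCE COPY [ROUNDS]
# Hash COPY several times (default 3) and compare against REFERENCE.
# Returns 0 if every read matches, 1 on any mismatch, 2 on unreadable input.
verify_copy() {
    reference=$1
    copy=$2
    rounds=${3:-3}
    if [ ! -r "$reference" ] || [ ! -r "$copy" ]; then
        echo "cannot read $reference or $copy" >&2
        return 2
    fi
    want=$(sha1_of "$reference")
    first=
    status=0
    i=1
    while [ "$i" -le "$rounds" ]; do
        got=$(sha1_of "$copy")
        [ -n "$first" ] || first=$got
        if [ "$got" != "$first" ]; then
            # Reads disagree with each other: the read path is unstable.
            echo "read $i: $got (differs from read 1)"
            status=1
        elif [ "$got" != "$want" ]; then
            # Reads agree but are wrong: points at the write path.
            echo "read $i: $got (stable, but not the reference)"
            status=1
        else
            echo "read $i: $got (matches reference)"
        fi
        i=$((i + 1))
    done
    return "$status"
}
```

The two failure messages mirror the distinction made in step 1: digests that are stable but wrong suggest the data was damaged on the way in, while digests that change between reads suggest the read side (or the bus) is flaky. For step 2, the whole write-then-verify cycle would be repeated with a fresh random file each time.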
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 16:27, Austin S. Hemmelgarn wrote:
> This seems odd, are you trying to access anything over NFS or some other
> network filesystem protocol here? If not, then I believe you've found a
> bug, because I'm pretty certain we shouldn't be returning -ESTALE for
> anything.

No, I don't use NFS or any other network filesystem.

> The question here is: Do you get any data corruption when using ext4?
> Quite often when there's a hardware issue, you won't see _any_
> indication of it other than corrupted files when using something like
> ext4 or XFS, but it will show up almost immediately with BTRFS because
> we validate checksums on almost everything. There have been at least a
> couple of times I've found disk issues while converting from ext4 to
> BTRFS that I didn't know existed before, and then going back was able to
> reliable reproduce using other tools.
>
> Also, FWIW, badblocks is not necessarily a reliable test method for
> flash drives, they often handle serialized reads like badblocks does
> very well even when failing.

I'm not sure. Commands don't fail explicitly when I use ext4, but I agree with you that I may get corruption silently nonetheless. Perhaps I should try to rule out a hardware problem by filling my USB flash drive with a large random file and then checking if its SHA-1 checksum corresponds to the original copy on the hard disk. But first I probably should back up the current Btrfs filesystem with the dd command. Can I proceed?

> Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or
> is it just raw encryption, or even something completely different?), on
> a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it),
> correct?

I'm using a btrfs filesystem on a GUID partition encrypted with LUKS. It's a Kingston USB flash drive connected directly to my desktop machine via USB. It's definitely not an SSD or an HDD, and I'm not using any adapter.

--
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34
Re: Frequent btrfs corruption on a USB flash drive
On 2016-07-07 09:49, Francesco Turco wrote:
> I have a USB flash drive with an encrypted Btrfs filesystem where I store daily backups. My problem is that this btrfs filesystem gets corrupted very often, after a few days of usage. Usually I just reformat it and move along, but this time I'd like to understand the root cause of the problem and fix it.
>
> I can mount the partition without problems, but then when using commands such as rsync or even humble ls I get the following error message:
>
> $ rsync /home/fturco/Buffer/E-book/ /run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Buffer/E-book/ --recursive
> rsync: readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Riviste") failed: Stale file handle (116)
> rsync: readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Backup") failed: Stale file handle (116)
> rsync: readdir("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Calibre (TMSU)"): Input/output error (5)

This seems odd. Are you trying to access anything over NFS or some other network filesystem protocol here? If not, then I believe you've found a bug, because I'm pretty certain we shouldn't be returning -ESTALE for anything.

> The previous command gets stuck and I had to manually stop it.
>
> The following command doesn't return any output, but its exit code is 1 (failure):
>
> $ btrfs filesystem show /run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3
> $

Something is definitely wrong here. Unless Parabola has seriously modified btrfs-progs, this should be spitting out info about the devices and filesystem usage. This may be a result of the errors seen by check, but I doubt that.

> Btrfs-check reports many errors. I attached the output to this e-mail message.

Looking at this, I see a couple of things I know it should fix correctly (the 'errors 2001' stuff is fixable, and I'm pretty certain that the 'errors 200' thing is too, and I think it will fix the bytenr mismatch stuff mostly safely), but there's enough I'm not sure about that I can't in good conscience recommend that you run check with --repair, as it may make things worse. Hopefully someone who actually understands what the other things mean can provide more help on that.

> Output from dmesg:
>
> $ dmesg | tail
> [18756.159963] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
> [18756.160828] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
> [18756.161821] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
> [18756.163047] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
> [18756.163921] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
> [18756.164806] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
> [18756.165673] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
> [18756.166548] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
> [18757.950603] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
> [18757.951492] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
>
> I checked this USB flash drive with badblocks in non-destructive read-write mode. No errors.
>
> If I format this partition as Ext4 instead of Btrfs I can use it without problems, but my goal is to use Btrfs on all devices.

The question here is: Do you get any data corruption when using ext4? Quite often when there's a hardware issue, you won't see _any_ indication of it other than corrupted files when using something like ext4 or XFS, but it will show up almost immediately with BTRFS because we validate checksums on almost everything. There have been at least a couple of times I've found disk issues while converting from ext4 to BTRFS that I didn't know existed before, and then going back was able to reliably reproduce using other tools.

Also, FWIW, badblocks is not necessarily a reliable test method for flash drives; they often handle serialized reads like badblocks does very well even when failing.

Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or is it just raw encryption, or even something completely different?), on a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it), correct?

> My GNU/Linux distribution is Parabola GNU/Linux-libre.
> Kernel version is: 4.6.3.
> Btrfs-progs version is: 4.6
>
> Please tell me if you need other details. Thanks.
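With longer logs it can help to tally the "bad tree block start" messages rather than read them one by one. A rough sketch follows; the function name is made up, and it assumes the last two fields of each message are the value found on disk and the expected block address, which is how they line up with the `want=35323904, have=8533404122473270145` line in the btrfs check output from this thread.

```sh
#!/bin/sh
# tally_bad_tree_blocks: read kernel log text on stdin and count how often
# each (found value, expected bytenr) pair occurs in
# "bad tree block start" messages. Other lines are ignored.
tally_bad_tree_blocks() {
    awk '/bad tree block start/ {
             # Second-to-last field: value read; last field: expected bytenr.
             count[$(NF - 1) " " $NF]++
         }
         END {
             for (pair in count) print count[pair], pair
         }' | sort -k 2
}
```

Run as `dmesg | tally_bad_tree_blocks`. On the ten lines quoted above it prints two rows, `5 6592115285688248773 35323904` and `5 8533404122473270145 35323904`: every error is for the same block, 35323904, and only two distinct bogus values are ever read back, in strict alternation.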
Frequent btrfs corruption on a USB flash drive
I have a USB flash drive with an encrypted Btrfs filesystem where I store daily backups. My problem is that this btrfs filesystem gets corrupted very often, after a few days of usage. Usually I just reformat it and move along, but this time I'd like to understand the root cause of the problem and fix it.

I can mount the partition without problems, but then when using commands such as rsync or even humble ls I get the following error message:

$ rsync /home/fturco/Buffer/E-book/ /run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Buffer/E-book/ --recursive
rsync: readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Riviste") failed: Stale file handle (116)
rsync: readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Backup") failed: Stale file handle (116)
rsync: readdir("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Calibre (TMSU)"): Input/output error (5)

The previous command gets stuck and I had to manually stop it.

The following command doesn't return any output, but its exit code is 1 (failure):

$ btrfs filesystem show /run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3
$

Btrfs-check reports many errors. I attached the output to this e-mail message.

Output from dmesg:

$ dmesg | tail
[18756.159963] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
[18756.160828] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
[18756.161821] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
[18756.163047] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
[18756.163921] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
[18756.164806] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
[18756.165673] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
[18756.166548] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904
[18757.950603] BTRFS error (device dm-4): bad tree block start 6592115285688248773 35323904
[18757.951492] BTRFS error (device dm-4): bad tree block start 8533404122473270145 35323904

I checked this USB flash drive with badblocks in non-destructive read-write mode. No errors.

If I format this partition as Ext4 instead of Btrfs I can use it without problems, but my goal is to use Btrfs on all devices.

My GNU/Linux distribution is Parabola GNU/Linux-libre.
Kernel version is: 4.6.3.
Btrfs-progs version is: 4.6

Please tell me if you need other details. Thanks.

--
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34

# btrfs check --readonly /dev/mapper/luks-08e23ed4-a2a1-41f0-a5f6-794ff0647ada
Checking filesystem on /dev/mapper/luks-08e23ed4-a2a1-41f0-a5f6-794ff0647ada
UUID: 5283147c-b7b4-448f-97b0-b235344a56a3
checking extents
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
checksum verify failed on 35274752 found E8B38F1B wanted B3F4F728
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
bytenr mismatch, want=35274752, have=6970279768983377651
checksum verify failed on 35291136 found 6B9667D1 wanted CDED2E29
checksum verify failed on 35291136 found 6B9667D1 wanted CDED2E29
checksum verify failed on 35291136 found 607F5103 wanted F21126A3
checksum verify failed on 35291136 found 6B9667D1 wanted CDED2E29
bytenr mismatch, want=35291136, have=16962852950865328208
checksum verify failed on 35307520 found 088ACE59 wanted 22164173
checksum verify failed on 35307520 found 088ACE59 wanted 22164173
checksum verify failed on 35307520 found F59BACEE wanted E647A1CD
checksum verify failed on 35307520 found 088ACE59 wanted 22164173
bytenr mismatch, want=35307520, have=16013504349018505369
checksum verify failed on 35323904 found CA154283 wanted 10E9FA6B
checksum verify failed on 35323904 found CA154283 wanted 10E9FA6B
checksum verify failed on 35323904 found 4DA7B234 wanted 794014C7
checksum verify failed on 35323904 found 4DA7B234 wanted 794014C7
bytenr mismatch, want=35323904, have=8533404122473270145
parent transid verify failed on 35340288 wanted 44 found 37
parent transid verify failed on 35340288 wanted 44 found 37
parent transid verify failed on 35340288 wanted 44 found 37
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
leaf parent key incorrect 35340288
bad block 35340288
Errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
checksum