On 2017-10-19 09:48, Zoltan wrote:
Hi,
On Thu, Oct 19, 2017 at 1:01 PM, Peter Grandi <p...@btrfs.list.sabi.co.uk>
wrote:
What the OP was doing was using "unreliable" both for the case
where the device "lies" and the case where the device does not
"lie" but reports a failure. Both of these are malfunctions in a
wide sense:
* The [block] device "lies" as to its status or what it has done.
* The [block] device reports truthfully that an action has failed.
Thanks for making this point, it made me realize that I had a different
assumption than the one you use in your reasoning. I assumed that when
writes to a USB device fail due to a temporary disconnection, the
kernel can actually recognize that a write error happened. So are you
saying that a write error due to USB problems can go completely
unnoticed? That seems very strange to me; are USB drives really that
unreliable or is that some software limitation?
It depends on what type of write error happens.
If the data gets corrupted on its way over the bus, or the device just
drops the write, or you have a bogus storage device (this is actually a
pretty big issue with flash drives and SD cards; check [1] and [2] for
more info on this, and [3] for a tool you can use to check things), then
it generally won't be detected by the kernel, but might be by the
filesystem driver when it tries to read the data back.
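
To make the "write silently lost, only noticed on read" case concrete,
here is a minimal Python sketch of the write-then-read-back idea. It is
not the code of the tool in [3], just an illustration; BLOCK_SIZE,
block_pattern, write_blocks and verify_blocks are names made up for this
example. Note that on a real device you would also have to make sure the
reads come from the device and not from the page cache, which this
sketch does not attempt.

```python
import hashlib
import os

BLOCK_SIZE = 4096  # hypothetical block size chosen for this sketch


def block_pattern(index: int, seed: bytes = b"demo") -> bytes:
    """Deterministic pseudo-random content for block number `index`."""
    out = b""
    counter = 0
    while len(out) < BLOCK_SIZE:
        out += hashlib.sha256(
            seed + index.to_bytes(8, "little") + counter.to_bytes(4, "little")
        ).digest()
        counter += 1
    return out[:BLOCK_SIZE]


def write_blocks(path: str, count: int) -> None:
    """Write `count` pattern blocks and ask the OS to flush them out."""
    with open(path, "wb") as f:
        for i in range(count):
            f.write(block_pattern(i))
        f.flush()
        os.fsync(f.fileno())  # a lying device can still acknowledge this


def verify_blocks(path: str, count: int) -> list:
    """Return the indices of blocks that do not match what was written."""
    bad = []
    with open(path, "rb") as f:
        for i in range(count):
            if f.read(BLOCK_SIZE) != block_pattern(i):
                bad.append(i)
    return bad
```

The point of using a per-block pattern derived from the block index is
that a dropped, corrupted, or wrapped-around write shows up as a mismatch
at a specific block, even though no write call ever returned an error.
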
However, a disconnect does not go completely undetected (and this is
where the big issue with BTRFS comes in). The kernel will detect the
disconnect, issue a bus reset (which will cause performance issues with
other USB devices on the same controller), and generally recover. The
disappearance of the device doesn't get propagated up to the filesystem
correctly, though, and that is what causes the biggest issue with BTRFS.
Because BTRFS just knows writes are suddenly failing for some reason, it
doesn't try to release the device so that things get properly cleaned up
in the kernel. Thus, when the same device reappears (as it will when the
disconnect was due to a transient bus error, which happens a lot), it
shows up as a different device node, which gets scanned for filesystems
by udev, and BTRFS then gets really confused because it now sees 3 (or
more) devices for a 2 device filesystem. That final state is what's so
dangerous about using USB devices with BTRFS right now, as it's pretty
much guaranteed to result in data corruption.
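
As a rough illustration of the "3 devices for a 2 device filesystem"
state, here is a small Python sketch that flags a filesystem member
claimed by more than one device node. This is not how BTRFS or udev
actually tracks devices; the function name and the (node, fs_uuid,
devid) tuples are hypothetical inputs invented for the example.

```python
from collections import defaultdict


def find_duplicate_members(seen):
    """Group device nodes by (filesystem UUID, device id).

    `seen` is an iterable of (device_node, fs_uuid, devid) tuples, a
    hypothetical stand-in for what a device scan might report.
    Returns only the members that are claimed by more than one node.
    """
    by_member = defaultdict(list)
    for node, fs_uuid, devid in seen:
        by_member[(fs_uuid, devid)].append(node)
    return {k: sorted(v) for k, v in by_member.items() if len(v) > 1}


# Hypothetical scan result: the stale node /dev/sdb was never released,
# and the same physical device came back as /dev/sdd.
scan = [
    ("/dev/sdb", "fs-uuid-1", 1),
    ("/dev/sdc", "fs-uuid-1", 2),
    ("/dev/sdd", "fs-uuid-1", 1),
]
duplicates = find_duplicate_members(scan)
```

With the example scan above, member 1 of the filesystem is claimed by
both the stale and the new node, which is the ambiguous situation
described above.
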
[1] https://fightflashfraud.wordpress.com/
[2] https://sosfakeflash.wordpress.com/
[3] http://oss.digirati.com.br/f3/