I use SMART (smartmontools etc.) and its tests to keep track of and warn
me of such issues. It's way more likely to catch incipient media
failures long before scrub would. It's also more likely to correct
situations before they become visible to userspace. It's also a way
better full-platter scan that involves less real-time delay and won't
bog down a running system.

Don't put too much trust in SMART: sectors can rot unexpectedly even when SMART reports that the drive is fine.

I had exactly this issue recently:

1) one of the drives in the server failed and was replaced

2) "btrfs device delete missing" (which basically moves data from the remaining drive to the new one) kept failing with an I/O error

3) according to SMART, the drive reporting the I/O error was fine (no reallocated sectors, no warnings, etc.)


So, scrub to the rescue: it printed the "broken" files, and after I removed them manually, "btrfs device delete missing" was able to finish.
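
For anyone hitting the same problem, the sequence looked roughly like the sketch below. This is not a transcript of what I ran: the mount point /mnt/data, the DRY_RUN switch and the helper function names are made up for illustration. By default it only prints the commands. I am also assuming that the damaged files show up in the kernel log (dmesg) during the scrub, which may vary between kernel versions.

```shell
#!/bin/sh
# Sketch only. MNT is a hypothetical mount point; DRY_RUN and the
# function names below are illustrative, not part of btrfs-progs.
MNT="${MNT:-/mnt/data}"
DRY_RUN="${DRY_RUN:-1}"    # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Step 1: scrub in the foreground (-B) with per-device statistics (-d).
scrub_step() {
    run btrfs scrub start -B -d "$MNT"
}

# Step 2: show the per-device result of the last scrub.
# (Assumption: the affected file paths are reported in dmesg.)
status_step() {
    run btrfs scrub status -d "$MNT"
}

# Step 3: once the damaged files are removed or restored from backup,
# retry dropping the missing device.
delete_missing_step() {
    run btrfs device delete missing "$MNT"
}
```

After sourcing the file, calling scrub_step, status_step and delete_missing_step in turn prints each command; setting DRY_RUN=0 makes the same calls execute for real. The steps are kept separate on purpose, because the manual cleanup of the damaged files has to happen between the scrub and the final delete.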

It probably makes sense to run scrub periodically, just as most distributions already do for mdraid arrays.
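
A periodic scrub could be as simple as the sketch below, called from cron or a systemd timer. The MOUNTS list and the SCRUB override are hypothetical names I picked for the example, and I am assuming that "btrfs scrub start -B" exits non-zero when the scrub fails or finds errors it cannot correct; check the man page for your btrfs-progs version.

```shell
#!/bin/sh
# Sketch only. MOUNTS (space-separated mount points) and SCRUB are
# hypothetical knobs for this example; set SCRUB=echo to preview.
MOUNTS="${MOUNTS:-/mnt/data}"
SCRUB="${SCRUB:-btrfs scrub start -B -d}"

scrub_all() {
    failed=0
    for mnt in $MOUNTS; do
        # -B keeps the scrub in the foreground, so its exit status
        # is available here; $SCRUB is left unquoted on purpose so
        # that the command and its options are split into words.
        if ! $SCRUB "$mnt"; then
            echo "scrub reported a problem on $mnt" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

With SCRUB=echo the function just prints each mount point it would scrub. In real use, a non-zero return from scrub_all is the cue to look at "btrfs scrub status" and the kernel log.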


--
Tomasz Chmielewski
http://www.sslrack.com
