On Wed, Mar 23, 2016 at 10:51 AM, Brad Templeton <brad...@gmail.com> wrote:
> Thanks for the assist. To reiterate what I said in private:
>
> a) I am fairly sure I swapped drives by adding the 6TB drive and then
> removing the 2TB drive, which would not have made the 6TB think it was
> only 2TB. The btrfs statistics commands have shown from the beginning
> the size of the device as 6TB, and that after the remove, it had 4TB
> unallocated.
I agree this seems to be consistent with what's been reported.

> So I am looking for other options, or if people have commands I might
> execute to diagnose this (as it seems to be a flaw in balance) let me know.

What version of btrfs-progs is this? I'm vaguely curious what 'btrfs check' reports (without --repair). Any version is OK, but it's better to use something fairly recent since the check code continues to change a lot.

Another thing you could try is a newer kernel. Maybe there's a related bug in 4.2.0. I think it's more likely this is just an edge case bug that's always been there, but it's valuable to know whether recent kernels exhibit the problem.

And before proceeding with a change in layout (converting to another profile), I suggest taking an image of the metadata with btrfs-image; it might come in handy for a developer.

> Some options remaining open to me:
>
> a) I could re-add the 2TB device, which is still there. Then balance
> again, which hopefully would move a lot of stuff. Then remove it again
> and hopefully the new stuff would distribute mostly to the large drive.
> Then I could try balance again.

Yeah, doing this will require -f to wipe the signature from that drive when you add it. But I don't think this is a case of needing more free space; I think it might be due to the odd number of drives that are also fairly different in size. But then what happens when you delete the 2TB drive after the balance? Do you end up right back in this same situation?

> b) It was suggested I could (with a good backup) convert the drive to
> non-RAID1 to free up tons of space and then re-convert. What's the
> precise procedure for that? Perhaps I can do it with a limit to see how
> it works as an experiment? Any way to specifically target the blocks
> that have their two copies on the 2 smaller drives for conversion?

btrfs balance start -f -dconvert=single -mconvert=single <mountpoint>   ## -f is required to force the reduction in redundancy
btrfs balance start -dconvert=raid1 -mconvert=raid1 <mountpoint>

There is the devid= filter, but I'm not sure of the consequences of limiting the conversion to two of three devices; that's kinda confusing, and it's sufficiently an edge case that I wonder how many bugs you're looking to find today? :-)

> c) Finally, I could take a full-full backup (my normal backups don't
> bother with cached stuff and certain other things that you can recover)
> and take the system down for a while to just wipe and restore the
> volumes. That doesn't find the bug, however.

I'd have the full backup no matter which choice you make. At any time, for any reason, any filesystem can face plant without warning. But yes, this should definitely work, or else you've definitely found a bug. Finding the bug in your current scenario is harder because the history of this volume makes it really non-deterministic, whereas if you start with a 3-disk volume at mkfs time and then reproduce this problem, for sure it's a bug. And fairly straightforward to reproduce.

I still recommend a newer kernel and progs though, just because there's no work being done on 4.2 anymore. I suggest kernel 4.4.6 and btrfs-progs 4.4.1. And then if you reproduce it, it's not just a bug, it's a current bug.

--
Chris Murphy
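
For reference, a minimal sketch of the diagnose-then-convert sequence discussed above, assuming the filesystem lives on /dev/sdb, /dev/sdc and /dev/sdd and is normally mounted at /mnt/pool (all paths are illustrative, not from the original report):

# Run a read-only check (no --repair) and grab a metadata image while the
# filesystem is unmounted, before changing anything.
umount /mnt/pool
btrfs check /dev/sdb
btrfs-image /dev/sdb /root/pool-metadata.img

# Remount, drop to the single profile, then convert back to raid1.
mount /dev/sdb /mnt/pool
btrfs balance start -f -dconvert=single -mconvert=single /mnt/pool
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/pool

# See how allocation ended up per device.
btrfs filesystem usage /mnt/pool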
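
And a rough sketch of the from-scratch reproduction suggested above, using small loop devices of deliberately unequal size to stand in for the real drives (sizes and paths are arbitrary):

# Three backing files of unequal size.
truncate -s 2G /tmp/d1.img
truncate -s 4G /tmp/d2.img
truncate -s 6G /tmp/d3.img
losetup -f --show /tmp/d1.img    # prints e.g. /dev/loop0
losetup -f --show /tmp/d2.img    # prints e.g. /dev/loop1
losetup -f --show /tmp/d3.img    # prints e.g. /dev/loop2

# Make a 3-device raid1 filesystem, mount it, and fill it with enough
# data that the smallest device fills up.
mkfs.btrfs -d raid1 -m raid1 /dev/loop0 /dev/loop1 /dev/loop2
mount /dev/loop0 /mnt/test
# ... copy a few GB of data into /mnt/test ...

# Then check whether a full balance spreads chunks the way you'd expect.
btrfs balance start /mnt/test
btrfs filesystem usage /mnt/test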