Re: btrfsck does not fix
Hi Chris,

thanks for your reply.

>> ./btrfs filesystem show /dev/sdb1
>> Label: none  uuid: 989306aa-d291-4752-8477-0baf94f8c42f
>>         Total devices 2 FS bytes used 3.47TiB
>>         devid    1 size 2.73TiB used 1.74TiB path /dev/sdb1
>>         devid    2 size 2.73TiB used 1.74TiB path /dev/sdc1
>
> I don't understand the no spare part. You have 3.47T of data, and yet
> the single device size is 2.73T. There is no way to migrate 1.74T from
> sdc1 to sdb1 because there isn't enough space.

Fair point. I summed up manually (with du) and apparently missed some
data. I can move the 0.8TiB out of the way. I just don't have 3.5TiB
'spare'.

>> btrfs device delete /dev/sdc1 /mnt/BTRFS/rsnapshot/
>> btrfs device delete /dev/sdc1 /mnt/BTRFS/backups/
>> btrfs device delete /dev/sdc1 /mnt/BTRFS/Video/
>> btrfs filesystem balance start /mnt/BTRFS/Video/
>
> I don't understand this sequence because I don't know what you've
> mounted where,

I'm sorry. Here you go:

/btrfs subvolume list /mnt/BTRFS/Video
ID 256 gen 226429 top level 5 path Video          -- /mnt/BTRFS/Video/
ID 1495 gen 226141 top level 5 path rsnapshot     -- /mnt/BTRFS/rsnapshot
ID gen 226429 top level 256 path Snapshot         -- not mounted
ID 5845 gen 226375 top level 5 path backups       -- /mnt/BTRFS/backups

> but in any case maybe it's a bug that you're not getting errors for
> each of these commands because you can't delete sdc1 from a raid0
> volume.

That makes sense. I read that procedure somewhere in the -totally
unvalidated- Internet. In case the missing error message is a bug: is
this list the right place to report it, or is there a bug tracker?

> You'd first have to convert the data, metadata, and system profiles to
> single (metadata can be set to dup). And then you'd be able to delete a
> device so long as there's room on remaining devices, which you don't
> have.

Yes, but I can create that space. So, for me the next steps would be to:
- generate enough room on the filesystem
- btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/BTRFS/Video
- btrfs device delete /dev/sdc1 /mnt/BTRFS/Video
Right?

>> next, I'm doing the balance for the subvolume /mnt/BTRFS/backups
>
> You told us above you deleted that subvolume. So how are you balancing
> it?

Yes, that was my understanding from my research: you tell btrfs that you
want to remove one disc from the filesystem and then balance it to move
the data onto the remaining disc. I did find this logical. I was
expecting that I possibly need a further command to tell btrfs that it's
not a raid anymore, but I thought this could also be automagical. I
understand that's not the way it is implemented, but it's not a crazy
idea, is it?

> And also, balance applies to a mountpoint, and even if you mount a
> subvolume to that mountpoint, the whole file system is balanced. Not
> just the mounted subvolume.

That is confusing. (I mean: I understand what you are saying, but it's
counterintuitive.) Why is this the case?

>> In parallel, I try to delete /mnt/BTRFS/rsnapshot, but it fails:
>> btrfs subvolume delete /mnt/BTRFS/rsnapshot/
>> Delete subvolume '/mnt/BTRFS/rsnapshot'
>> ERROR: cannot delete '/mnt/BTRFS/rsnapshot' - Inappropriate ioctl for device
>>
>> Why's that?
>> But even more: How do I free sdc1 now?!
>
> Well I'm pretty confused because again, I can't tell if your paths
> refer to subvolumes or if they refer to mount points.

Now I am confused. These paths are the paths to which I mounted the
subvolumes. My (abbreviated) fstab:

UUID=xy /mnt/BTRFS/Video     btrfs subvol=Video
UUID=xy /mnt/BTRFS/rsnapshot btrfs subvol=rsnapshot
UUID=xy /mnt/BTRFS/backups   btrfs subvol=backups

> The balance and device delete commands all refer to a mount point,
> which is the path returned by the df command.

So this:

/dev/sdb1  5,5T  3,5T  2,0T  64% /mnt/BTRFS/Video
/dev/sdb1  5,5T  3,5T  2,0T  64% /mnt/BTRFS/backups
/dev/sdc1  5,5T  3,5T  2,0T  64% /mnt/BTRFS/rsnapshot

> The subvolume delete command needs a path to subvolume that starts
> with the mount point.

Sorry, this I do not understand, no matter how hard I think about it.
What would it be in my case?

Thanks for your help! I appreciate it.
Greetings, Hendrik -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
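The space argument in the message above comes down to simple arithmetic. A minimal sketch in Python, assuming the data would be stored once (single profile) on whatever devices remain and ignoring metadata overhead; `space_to_free` is a hypothetical helper for illustration, not part of btrfs-progs:

```python
def space_to_free(fs_used_tib, remaining_device_sizes_tib):
    """Return how many TiB must be moved off the filesystem before it
    could fit on the remaining devices (0.0 if it already fits).

    Assumption: one copy of the data, no metadata or chunk overhead.
    """
    capacity = sum(remaining_device_sizes_tib)
    return max(0.0, round(fs_used_tib - capacity, 2))


# Figures from `btrfs filesystem show` in the message: 3.47 TiB used,
# and only /dev/sdb1 (2.73 TiB) would remain once /dev/sdc1 is removed.
print(space_to_free(3.47, [2.73]))  # 0.74
```

Under these assumptions at least 0.74 TiB has to leave the filesystem first, which is in line with the roughly 0.8 TiB mentioned in the message.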
Re: Provide a better free space estimate on RAID1
On Sun, 9 Feb 2014 06:38:53 +0000 (UTC)
Duncan 1i5t5.dun...@cox.net wrote:

> RAID or multi-device filesystems aren't 1970s features and break 1970s
> behavior and the assumptions associated with it. If you're not prepared
> to deal with those broken assumptions, don't. Use mdraid or dmraid or
> lvm or whatever to combine your multiple devices into one logical
> devices as presented, and put your filesystem (either traditional
> filesystem, or even btrfs using traditional single-device functionality)
> on top of the single device the layer beneath the filesystem presents.
> Problem solved! =:^)
>
> Note that df only lists a single device as well, not the multiple
> component devices of the filesystem. That's broken functionality by your
> definition, too, and again, using some other layer like lvm or mdraid to
> present multiple devices as a single virtual device, with a traditional
> single-device filesystem layout on top of that single device... solves
> the problem!

No reason BTRFS can't work well in a similar simplistic usage scenario.

You seem to insist there is no way around it being too flexible for its
own good, but all those advanced features absolutely don't *have* to get
in the way of everyday usage for users who don't require them.

> Meanwhile, what I've done here is use one of df's commandline options to
> set its block size to 2 MiB, and further used bash's alias functionality
> to setup an alias accordingly:
>
> alias df='df -B2M'
>
> $ df /h
> Filesystem     2M-blocks  Used Available Use% Mounted on
> /dev/sda6          20480 12186      7909  61% /h
>
> $ sudo btrfs fi show /h
> Label: hm0238gcnx+35l0  uuid: ce23242a-b0a9-423f-a9c3-7db2729f48d6
>         Total devices 2 FS bytes used 11.90GiB
>         devid    1 size 20.00GiB used 14.78GiB path /dev/sda6
>         devid    2 size 20.00GiB used 14.78GiB path /dev/sdb6
>
> $ sudo btrfs fi df /h
> Data, RAID1: total=14.00GiB, used=11.49GiB
> System, RAID1: total=32.00MiB, used=16.00KiB
> Metadata, RAID1: total=768.00MiB, used=414.94MiB
>
> On btrfs such as the above I can read the 2M blocks as 1M and be happy.
>
> On btrfs such as my /boot, which aren't raid1 (I have two separate
> /boots, one on each device, with grub2 configured separately for each to
> provide a backup), or if I df my media partitions still on reiserfs on
> the old spinning rust, I can either double the figures DF gives me, or
> add a second -B option at the CLI, overriding the aliased option.

Congratulations, you broke your df readings on all other filesystems to
fix them on btrfs.

> If I wanted something fully automated, it'd be easy enough to setup a
> script that checked what filesystem I was df-ing, matched that against a
> table of filesystems to preferred df block sizes, and supplied the
> appropriate -BxX option accordingly.

I am not sure this would work well in the network share scenario
described earlier, with clients which in the real world are largely
Windows-based.

-- 
With respect,
Roman
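The "table of filesystems to preferred df block sizes" idea quoted above can be sketched in a few lines. This is only an illustration in Python: the table contents, the `profile` argument and the `df_command` helper are assumptions made here, and how the filesystem type and RAID profile are detected is left to the caller. The only real interface relied on is df's `-B` block-size option, which the quoted alias already uses.

```python
# Hypothetical lookup table: which `df -B` block size makes the printed
# numbers readable as 1 MiB units of usable space.
PREFERRED_BLOCK_SIZE = {
    ("btrfs", "raid1"): "2M",   # every block is stored twice
    ("btrfs", "single"): "1M",
}
DEFAULT_BLOCK_SIZE = "1M"       # reiserfs, ext4, and anything not listed


def df_command(path, fstype, profile="single"):
    """Build the df command line for `path`; does not run anything."""
    block = PREFERRED_BLOCK_SIZE.get((fstype, profile), DEFAULT_BLOCK_SIZE)
    return ["df", f"-B{block}", path]


print(" ".join(df_command("/h", "btrfs", "raid1")))   # df -B2M /h
print(" ".join(df_command("/boot", "btrfs")))         # df -B1M /boot
print(" ".join(df_command("/media", "reiserfs")))     # df -B1M /media
```

A wrapper like this only changes what a local shell user sees. It does nothing for programs that call statfs themselves, which is the objection raised about network clients.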
Re: Provide a better free space estimate on RAID1
Duncan 1i5t5.dun...@cox.net wrote:

> Roman Mamedov posted on Sun, 09 Feb 2014 04:10:50 +0600 as excerpted:
>
>> If you need to perform a btrfs-specific operation, you can easily use
>> the btrfs-specific tools to prepare for it, specifically use btrfs fi
>> df which could provide every imaginable interpretation of free space
>> estimate and then some.
>>
>> UNIX 'df' and the 'statfs' call on the other hand should keep the
>> behavior people are accustomed to rely on since 1970s.
>
> Which it does... on filesystems that only have 1970s filesystem
> features. =:^)
>
> RAID or multi-device filesystems aren't 1970s features and break 1970s
> behavior and the assumptions associated with it. If you're not prepared
> to deal with those broken assumptions, don't. Use mdraid or dmraid or
> lvm or whatever to combine your multiple devices into one logical
> devices as presented, and put your filesystem (either traditional
> filesystem, or even btrfs using traditional single-device functionality)
> on top of the single device the layer beneath the filesystem presents.
> Problem solved! =:^)
>
> Note that df only lists a single device as well, not the multiple
> component devices of the filesystem. That's broken functionality by your
> definition, too, and again, using some other layer like lvm or mdraid to
> present multiple devices as a single virtual device, with a traditional
> single-device filesystem layout on top of that single device... solves
> the problem!
>
> Meanwhile, what I've done here is use one of df's commandline options to
> set its block size to 2 MiB, and further used bash's alias functionality
> to setup an alias accordingly:
>
> alias df='df -B2M'
>
> $ df /h
> Filesystem     2M-blocks  Used Available Use% Mounted on
> /dev/sda6          20480 12186      7909  61% /h
>
> $ sudo btrfs fi show /h
> Label: hm0238gcnx+35l0  uuid: ce23242a-b0a9-423f-a9c3-7db2729f48d6
>         Total devices 2 FS bytes used 11.90GiB
>         devid    1 size 20.00GiB used 14.78GiB path /dev/sda6
>         devid    2 size 20.00GiB used 14.78GiB path /dev/sdb6
>
> $ sudo btrfs fi df /h
> Data, RAID1: total=14.00GiB, used=11.49GiB
> System, RAID1: total=32.00MiB, used=16.00KiB
> Metadata, RAID1: total=768.00MiB, used=414.94MiB
>
> On btrfs such as the above I can read the 2M blocks as 1M and be happy.
>
> On btrfs such as my /boot, which aren't raid1 (I have two separate
> /boots, one on each device, with grub2 configured separately for each to
> provide a backup), or if I df my media partitions still on reiserfs on
> the old spinning rust, I can either double the figures DF gives me, or
> add a second -B option at the CLI, overriding the aliased option.
>
> If I wanted something fully automated, it'd be easy enough to setup a
> script that checked what filesystem I was df-ing, matched that against a
> table of filesystems to preferred df block sizes, and supplied the
> appropriate -BxX option accordingly. As I guess most admins after a few
> years, I've developed quite a library of scripts/aliases for various
> things I do routinely enough to warrant it, and this would be just one
> more joining the list. =:^)

Well done... And a good idea, I didn't think of it yet. But it's my idea
of fixing it in user space. :-)

I usually leave the discussion when people start to argue with pointers
to unix tradition... That's like starting a systemd discussion and
telling me that systemd is broken by design while mentioning in the same
sentence that sysvinit is working perfectly fine. The latter doesn't do
so. The first is a matter of personal taste but is in no case broken...
But... Well...

> But of course it's your system in question, and you can patch btrfs to
> output anything you like, in any format you like. No need to bother with
> df's -B option if you'd prefer to patch the kernel instead. Me, I'll
> stick to the -B option. =:^)

That's essentially the FOSS idea. Actually, I don't want df behavior
being broken for me. It uses the statfs syscall, which returns blocks.
Cutting the returned values in half lies about the properties of the
device - for EVERY application out there, no matter which assumptions are
being made about the returned values. This breaks the statfs syscall.
User-space should simply not rely on the assumption that 1k of user data
occupies 1k worth of blocks (that's not true anyway because meta-data
has to be allocated, too).

When I first had contact with unix, df returned used/free blocks - native
BLOCKS! No option to make it human readable. No forced intention that it
would show you usable space for actual written data. The blocks were
given as 512-byte sectors. I've been okay with that. I knew: if I cut the
values in half, I'd get about the size of data I perhaps could fit on the
device. If it had been a property of the device that 512 bytes of user
data would write two blocks, nobody would have cared about df displaying
wrong values.

-- 
Replies to list only preferred.
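The point that df reports blocks rather than bytes of user data can be seen directly from the values the statfs family of calls returns. A small sketch using Python's standard `os.statvfs`; the helper name `space_in_bytes` and the choice of `/` as the path are assumptions for illustration:

```python
import os


def space_in_bytes(path):
    """Convert the block counts reported for `path` into bytes.

    f_blocks, f_bfree and f_bavail are counts of f_frsize-sized units;
    nothing in them says how much user data those blocks can hold.
    """
    st = os.statvfs(path)
    return {
        "total": st.f_blocks * st.f_frsize,
        "free": st.f_bfree * st.f_frsize,
        "available": st.f_bavail * st.f_frsize,
    }


for name, value in space_in_bytes("/").items():
    print(f"{name:>9}: {value} bytes")
```

Any tool that wants "space usable for my data" has to apply its own interpretation on top of these raw counts, which is the disagreement in this thread.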
Re: Provide a better free space estimate on RAID1
Roman Mamedov r...@romanrm.net wrote:

>> When I started to use unix, df returned blocks, not bytes. Without your
>> proposed patch, it does that right. With your patch, it does it wrong.
>
> It returns total/used/available space that is usable/used/available
> by/for user data.

No, it does not. It returns space allocatable to the filesystem. That's
user data and meta data. That can be far from your expectations depending
on how allocation on the filesystem works.

-- 
Replies to list only preferred.
Re: [RFC PATCH 2/2] Revert Btrfs: remove transaction from btrfs send
On Sun, Feb 9, 2014 at 2:39 AM, Shilong Wang wangshilong1...@gmail.com wrote:
> 2014-02-08 23:46 GMT+08:00 Wang Shilong wangshilong1...@gmail.com:
>> From: Wang Shilong wangsl.f...@cn.fujitsu.com
>>
>> This reverts commit 41ce9970a8a6a362ae8df145f7a03d789e9ef9d2.
>> Previously I was thinking we can use the readonly root's commit root
>> safely, while that is not true: the readonly root may be cowed in the
>> following cases.
>>
>> 1. snapshotting the send root will cow the source root.
>> 2. balance and device operations will also cow the readonly send root
>>    to relocate.
>>
>> So I have two ideas to make it safe for us to use the commit root.
>>
>> --approach 1:
>> make it protected by a transaction and end the transaction properly,
>> and we re-search the next item from the root node (see
>> btrfs_search_slot_for_read()).
>>
>> --approach 2:
>> add another counter to the local root structure to sync snapshot with
>> send, and add a global counter to sync send with exclusive device
>> operations.
>>
>> So with approach 2, send can use the commit root safely, because we
>> make sure the send root can not be cowed during send. Unfortunately, it
>> makes the code *ugly* and more complex to maintain.
>>
>> Making snapshot and send mutually exclusive, and device operations and
>> send mutually exclusive, is a little confusing for common users.
>>
>> So why not drop back to the previous way.
>>
>> Cc: Josef Bacik jba...@fb.com
>> Signed-off-by: Wang Shilong wangsl.f...@cn.fujitsu.com
>> ---
>> Josef, if we reach agreement to adopt this approach, please revert
>> Filipe's patch (Btrfs: make some tree searches in send.c more
>> efficient) from btrfs-next.
>
> Oops, this patch guarantees that searches of commit roots are all
> protected by a transaction. Filipe's patch is ok; we need to update
> Josef's previous patch.

Hi Shilong,

I am confused. Can you explain why that optimization patch is a problem,
either with or without your patch or any other patch currently flying
around?

Either before or after the optimization, we search through the commit
root and after a key search we process a key while holding the leaf's
extent buffer. Both approaches call btrfs_next_leaf too (either directly
or via btrfs_search_slot_for_read).

> Thanks
> Wang

-- 
Filipe David Manana,

Reasonable men adapt themselves to the world.
Unreasonable men adapt the world to themselves.
That's why all progress depends on unreasonable men.
Re: [RFC PATCH 2/2] Revert Btrfs: remove transaction from btrfs send
2014-02-09 21:52 GMT+08:00 Filipe David Manana fdman...@gmail.com:
> On Sun, Feb 9, 2014 at 2:39 AM, Shilong Wang wangshilong1...@gmail.com wrote:
>> 2014-02-08 23:46 GMT+08:00 Wang Shilong wangshilong1...@gmail.com:
>>> From: Wang Shilong wangsl.f...@cn.fujitsu.com
>>>
>>> This reverts commit 41ce9970a8a6a362ae8df145f7a03d789e9ef9d2.
>>> [...]
>>> Josef, if we reach agreement to adopt this approach, please revert
>>> Filipe's patch(Btrfs: make some tree searches in send.c more
>>> efficient) from btrfs-next.
>>
>> Oops, this patch guarantee searching commit roots are all protected by
>> transaction, Filipe's patch is ok, we need update Josef's previous
>> patch.
>
> Hi Shilong,
>
> I am confused. Can you explain why that optimization patch is a problem,
> either with or without your patch or any other patch currently flying
> around?
>
> Either before or after the optimization, we search through the commit
> root and after a key search we process a key while holding the leaf's
> extent buffer. Both approaches call btrfs_next_leaf too (either directly
> or via btrfs_search_slot_for_read).

Sorry, my mistake: your patch does not have a problem. You did not notice
my follow-up comments in this thread; we need to update Josef's previous
patch, not yours. ^_^

Thanks,
Wang
[PATCH][V3] Provide a better free space estimate [was]Re: Provide a better free space estimate on RAID1
On 02/07/2014 05:40 AM, Roman Mamedov wrote:
> On Thu, 06 Feb 2014 20:54:19 +0100
> Goffredo Baroncelli kreij...@libero.it wrote:
>
[...]

As Roman pointed out, df shows the raw space available. However, when a
RAID level is used, the space available to the user is less. This patch
tries to address this by correcting the estimate on the basis of the
RAID level.

This is my third revision of this patch. In this revision, I addressed
the bugs related to an incorrect evaluation of the free space in the
case of RAID1 [1] and DUP.

I have to point out that the free space estimation is quite approximate,
because it assumes that:

a) all the new files are allocated in data chunks
b) the free space will not be consumed by metadata
c) the already allocated chunks are not evaluated for the free space
   estimation

These assumptions are unrelated to my patch.

I performed some tests with a filesystem composed of 7 51GB disks. Here
are my df results:

Profile: single
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  512K  348G   1% /mnt/btrfs1

Profile: raid1
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  1.3M  175G   1% /mnt/btrfs1

Profile: raid10
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  2.3M  177G   1% /mnt/btrfs1

Profile: raid5
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  2.0M  298G   1% /mnt/btrfs1

Profile: raid6
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  1.8M  248G   1% /mnt/btrfs1

Profile: DUP (only one 50GB disk was used)
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc         51G  576K   26G   1% /mnt/btrfs1

My patch is below.

BR
G.Baroncelli

[1] The bug predates my patch; try to see what happens when you create a
RAID1 filesystem with three disks.

Changes history:
V1  First issue
V2  Correct an (old) bug when in RAID10 the number of disks isn't a
    multiple of 4
V3  Correct the free space estimation in RAID1 (when the number of disks
    is odd) and DUP

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d71a11d..4064a5f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1481,10 +1481,16 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 		num_stripes = nr_devices;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID1) {
 		min_stripes = 2;
-		num_stripes = 2;
+		num_stripes = nr_devices;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
 		min_stripes = 4;
-		num_stripes = 4;
+		num_stripes = nr_devices;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+		min_stripes = 3;
+		num_stripes = nr_devices;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+		min_stripes = 4;
+		num_stripes = nr_devices;
 	}
 
 	if (type & BTRFS_BLOCK_GROUP_DUP)
@@ -1560,9 +1566,44 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 		if (devices_info[i].max_avail >= min_stripe_size) {
 			int j;
-			u64 alloc_size;
+			u64 alloc_size, delta;
+			int k, div;
+
+			/*
+			 * Depending by the RAID profile, we use some
+			 * disk space as redundancy:
+			 * RAID1, RAID10, DUP - half of space used as redundancy
+			 * RAID5 - 1 stripe used as redundancy
+			 * RAID6 - 2 stripes used as redundancy
+			 * RAID0,LINEAR - no redundancy
+			 */
+			if (type & BTRFS_BLOCK_GROUP_RAID1) {
+				k = num_stripes;
+				div = 2;
+			} else if (type & BTRFS_BLOCK_GROUP_DUP) {
+				k = num_stripes;
+				div = 2;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
+				k = num_stripes;
+				div = 2;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+				k = num_stripes-1;
+				div = 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+				k = num_stripes-2;
+				div = 1;
+			} else { /* RAID0/LINEAR */
+				k = num_stripes;
+				div = 1;
+			}
+
+			delta =
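The redundancy table in the patch comment above (the `k` and `div` values per profile) can be tried out on the 7 x 51GB test setup. The following is a simplified sketch in Python, not the kernel logic: it assumes equally sized, empty devices and takes the estimate to be `device size * k / div`, whereas the patch works chunk by chunk. The function names are made up for this illustration.

```python
def redundancy_factors(profile, num_stripes):
    """Return (k, div) as in the patch comment: k data stripes out of
    num_stripes, with the result divided by div for mirrored profiles."""
    if profile in ("raid1", "raid10", "dup"):
        return num_stripes, 2          # half of the space is redundancy
    if profile == "raid5":
        return num_stripes - 1, 1      # one stripe of parity
    if profile == "raid6":
        return num_stripes - 2, 1      # two stripes of parity
    return num_stripes, 1              # raid0 / single: no redundancy


def estimate_usable_gb(profile, devices, device_gb):
    """Upper-bound estimate for equally sized, empty devices."""
    k, div = redundancy_factors(profile, devices)
    return device_gb * k / div


for profile in ("single", "raid1", "raid10", "raid5", "raid6"):
    print(profile, estimate_usable_gb(profile, 7, 51))
# single 357.0, raid1 178.5, raid10 178.5, raid5 306.0, raid6 255.0
print("dup", estimate_usable_gb("dup", 1, 51))  # dup 25.5
```

The df figures reported in the message (348G, 175G, 177G, 298G, 248G, 26G) are a little lower than or close to these upper bounds, as one would expect once chunks that are already allocated are taken out.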
[GIT PULL] Btrfs
Hi Linus,

Please pull my for-linus branch:

git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus

This is a small collection of fixes.

Josef Bacik (2) commits (+4/-5):
    Btrfs: don't loop forever if we can't run because of the tree mod log (+1/-0)
    Btrfs: fix assert screwup for the pending move stuff (+3/-5)

David Sterba (1) commits (+1/-1):
    btrfs: reserve no transaction units in btrfs_ioctl_set_features

Filipe David Borba Manana (1) commits (+2/-0):
    Btrfs: fix data corruption when reading/updating compressed extents

Jeff Mahoney (1) commits (+2/-2):
    btrfs: commit transaction after setting label and features

Total: (5) commits (+9/-8)

 fs/btrfs/compression.c | 2 ++
 fs/btrfs/extent-tree.c | 1 +
 fs/btrfs/ioctl.c       | 6 +++---
 fs/btrfs/send.c        | 8 +++-----
 4 files changed, 9 insertions(+), 8 deletions(-)
Error: could not do orphan cleanup -22
There was a similar discussion about this error in January 2013, but it
related to a kernel panic. I don't know if I encountered the same thing.

These errors from the system journal bother me:

Feb 09 22:18:53 melforce kernel: BTRFS error (device sdb3): Error removing orphan entry, stopping orphan cleanup
Feb 09 22:18:53 melforce kernel: BTRFS critical (device sdb3): could not do orphan cleanup -22

I run kernel 3.12.10. I'll explain what I did at that moment.
Subvolumes were already mounted at /home and /var and I mounted the root
subvolume at /mnt/btr2. Then I executed the ls command on /home/btr2. ls
gave me "invalid argument" errors, but still displayed the contents. The
next time I ran ls (right away), there were no more errors.

Another example is a script that mounts the same thing and then takes
snapshots. If I run the script manually, it never fails. If I run it from
a cron job, one of the snapshot commands fails, telling me that
/home/btr2/var isn't accessible (I don't remember the exact error
message; I can look if it shows up again).

Someone said in the January thread that -22 error messages are harmless,
but in this case userspace tools break, so I wouldn't consider this
totally harmless.
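The -22 in the kernel message and the "invalid argument" errors from ls are the same error seen from two sides: the kernel reports failures as negative errno values, and 22 is EINVAL on Linux. A quick way to decode such a number with Python's standard library (the helper name `describe_kernel_error` is made up for this note):

```python
import errno
import os


def describe_kernel_error(code):
    """Translate a kernel-style negative errno (e.g. -22) into its
    symbolic name and the message userspace tools print for it."""
    number = abs(code)
    name = errno.errorcode.get(number, "unknown")
    return f"{name}: {os.strerror(number)}"


print(describe_kernel_error(-22))  # on Linux: EINVAL: Invalid argument
```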
BTRFS with RAID1 cannot boot when removing drive
Hello,

I am experimenting with BTRFS and RAID1 on my Debian Wheezy (with
backported kernel 3.12-0.bpo.1-amd64) using a motherboard with UEFI.
However, I haven't managed to make the system boot when the first hard
drive is removed.

I have installed Debian with the following partitions on the first hard
drive (no BTRFS subsystem):
/dev/sda1: for / (BTRFS)
/dev/sda2: for /home (BTRFS)
/dev/sda3: for swap

Then I added another drive for a RAID1 configuration (with btrfs
balance) and I installed grub on the second hard drive with
grub-install /dev/sdb.

If I boot from sdb, it takes sda1 as the root filesystem.
If I switch the cables, it always takes the first hard drive as the root
filesystem (now sdb).
If I disconnect /dev/sda, the system doesn't boot, with a message saying
that it hasn't found the UUID:

Scanning for BTRFS filesystems...
mount: mounting /dev/disk/by-uuid/c64fca2a-5700-4cca-abac-3a61f2f7486c on /root failed: Invalid argument

Can you tell me what I have done incorrectly? Is it because of UEFI? If
yes, I haven't understood how I can correct it in a simple way.

As an extra question, I also don't see how I can configure the system to
get the correct swap in case of a disk failure. Should I force both swap
partitions to have the same UUID?

Many thanks in advance!
Here are some outputs for info:

btrfs filesystem show
Label: none  uuid: 743d6b3b-71a7-4869-a0af-83549555284b
        Total devices 2 FS bytes used 27.96MB
        devid    1 size 897.98GB used 3.03GB path /dev/sda2
        devid    2 size 897.98GB used 3.03GB path /dev/sdb2

Label: none  uuid: c64fca2a-5700-4cca-abac-3a61f2f7486c
        Total devices 2 FS bytes used 3.85GB
        devid    1 size 27.94GB used 7.03GB path /dev/sda1
        devid    2 size 27.94GB used 7.03GB path /dev/sdb1

blkid
/dev/sda1: UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c UUID_SUB=77ffad34-681c-4c43-9143-9b73da7d1ae3 TYPE=btrfs
/dev/sda3: UUID=469715b2-2fa3-4462-b6f5-62c04a60a4a2 TYPE=swap
/dev/sda2: UUID=743d6b3b-71a7-4869-a0af-83549555284b UUID_SUB=744510f5-5bd5-4df4-b8c4-0fc1a853199a TYPE=btrfs
/dev/sdb1: UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c UUID_SUB=2615fd98-f2ad-4e7b-84bc-0ee7f9770ca0 TYPE=btrfs
/dev/sdb2: UUID=743d6b3b-71a7-4869-a0af-83549555284b UUID_SUB=8783a7b1-57ef-4bcc-ae7f-be20761e9a19 TYPE=btrfs
/dev/sdb3: UUID=56fbbe2f-7048-488f-b263-ab2eb000d1e1 TYPE=swap

cat /etc/fstab
# file system                               mount point  type   options   dump pass
UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c   /            btrfs  defaults  0    1
UUID=743d6b3b-71a7-4869-a0af-83549555284b   /home        btrfs  defaults  0    2
UUID=469715b2-2fa3-4462-b6f5-62c04a60a4a2   none         swap   sw        0    0

cat /boot/grub/grub.cfg
#
# DO NOT EDIT THIS FILE
#
# It is automatically generated by grub-mkconfig using templates
# from /etc/grub.d and settings from /etc/default/grub
#

### BEGIN /etc/grub.d/00_header ###
if [ -s $prefix/grubenv ]; then
  load_env
fi
set default=0
if [ ${prev_saved_entry} ]; then
  set saved_entry=${prev_saved_entry}
  save_env saved_entry
  set prev_saved_entry=
  save_env prev_saved_entry
  set boot_once=true
fi

function savedefault {
  if [ -z ${boot_once} ]; then
    saved_entry=${chosen}
    save_env saved_entry
  fi
}

function load_video {
  insmod vbe
  insmod vga
  insmod video_bochs
  insmod video_cirrus
}

insmod part_msdos
insmod btrfs
set root='(hd1,msdos1)'
search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
if loadfont /usr/share/grub/unicode.pf2 ; then
  set gfxmode=640x480
  load_video
  insmod gfxterm
  insmod part_msdos
  insmod btrfs
  set root='(hd1,msdos1)'
  search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
  set locale_dir=($root)/boot/grub/locale
  set lang=fr_FR
  insmod gettext
fi
terminal_output gfxterm
set timeout=5
### END /etc/grub.d/00_header ###

### BEGIN /etc/grub.d/05_debian_theme ###
insmod part_msdos
insmod btrfs
set root='(hd1,msdos1)'
search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
insmod png
if background_image /usr/share/images/desktop-base/joy-grub.png; then
  set color_normal=white/black
  set color_highlight=black/white
else
  set menu_color_normal=cyan/blue
  set menu_color_highlight=white/blue
fi
### END /etc/grub.d/05_debian_theme ###

### BEGIN /etc/grub.d/10_linux ###
menuentry 'Debian GNU/Linux, with Linux 3.12-0.bpo.1-amd64' --class debian --class gnu-linux --class gnu --class os {
	load_video
	insmod gzio
	insmod part_msdos
	insmod btrfs
	set root='(hd1,msdos1)'
	search --no-floppy --fs-uuid --set=root c64fca2a-5700-4cca-abac-3a61f2f7486c
	echo	'Chargement de Linux 3.12-0.bpo.1-amd64 ...'
	linux	/boot/vmlinuz-3.12-0.bpo.1-amd64 root=UUID=c64fca2a-5700-4cca-abac-3a61f2f7486c ro quiet
	echo	'Chargement du disque mémoire initial ...'
	initrd
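For the swap question, the blkid and fstab output in this message already shows why a missing first disk leaves the system without swap: only the UUID of /dev/sda3 appears in fstab. A small Python sketch that cross-references the two listings; the dictionaries are copied by hand from the output above, and `usable_swap` is a hypothetical helper, not an existing tool:

```python
# Swap partitions as reported by blkid in the message above.
BLKID_SWAP = {
    "/dev/sda3": "469715b2-2fa3-4462-b6f5-62c04a60a4a2",
    "/dev/sdb3": "56fbbe2f-7048-488f-b263-ab2eb000d1e1",
}
# Swap UUIDs referenced by /etc/fstab in the message above.
FSTAB_SWAP_UUIDS = {"469715b2-2fa3-4462-b6f5-62c04a60a4a2"}


def usable_swap(present, blkid_swap=BLKID_SWAP, fstab_uuids=FSTAB_SWAP_UUIDS):
    """Return the swap partitions that fstab would still activate when
    only the partitions listed in `present` are attached."""
    return sorted(dev for dev in present if blkid_swap.get(dev) in fstab_uuids)


print(usable_swap(["/dev/sda3", "/dev/sdb3"]))  # ['/dev/sda3']
print(usable_swap(["/dev/sdb3"]))               # []
```

The device names here are only labels taken from the blkid listing; after a disk is removed the kernel may well assign different names, which is why the comparison is done on UUIDs.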
Re: [PATCH] xfstests: Btrfs: add test for large metadata blocks
On Sat, Feb 08, 2014 at 09:30:51AM +0100, Koen De Wit wrote:
> On 02/07/2014 11:49 PM, Dave Chinner wrote:
> >On Fri, Feb 07, 2014 at 06:14:45PM +0100, Koen De Wit wrote:
> >
> >	echo -n $xattr_value | md5sum
> >	${ATTR_PROG} -Lq -s attr_$char -V $xattr_value $file
> >	${ATTR_PROG} -Lq -g attr_$char $file | md5sum
> >	${ATTR_PROG} -Lq -g attr_$char $lnkfile | md5sum
> >
> >is all that needs to be done here.
>
> The problem with this is that the length of the output will depend on
> the page size. The code above runs for every valid leafsize, which can
> be any multiple of the page size up to 64KB, as defined in the loop
> initialization:
>     for leafsize in `seq $pagesize_kb $pagesize_kb 64`; do

That's only a limit on the mkfs leafsize parameter, yes? And the
limitation is that the leaf size can't be smaller than page size? So
really, the attribute sizes that are being tested are independent of the
mkfs parameters being tested. i.e:

	for attrsize in `seq 4 4 64`; do
		if [ $attrsize -lt $pagesize ]; then
			leafsize=$pagesize
		else
			leafsize=$attrsize
		fi
		$BTRFS_MKFS_PROG -l $leafsize $SCRATCH_DEV

And now the test executes a fixed loop, testing the same attribute sizes
on all the filesystems under test. i.e. the attribute sizes being tested
are *independent* of the mkfs parameters being tested. Always test the
same attribute sizes, the mkfs parameters simply vary by page size.

> >>+_scratch_unmount
> >>+
> >>+# Some illegal leafsizes
> >>+
> >>+_scratch_mkfs -l 0 2>> $seqres.full
> >>+echo $?
> >
> >Same again - you are dumping the error output into a different file,
> >then detecting the error manually. pass the output of _scratch_mkfs
> >through a filter, and let errors cause golden output mismatches.
>
> I did this to make the golden output not depend on the output of
> mkfs.btrfs, inspired by
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfstests.git;a=commit;h=fd7a8e885732475c17488e28b569ac1530c8eb59
> and
> http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfstests.git;a=commit;h=78d86b996c9c431542fdbac11fa08764b16ceb7d
> However, in my opinion the test should simply be updated if the
> output of mkfs.btrfs changes, so I agree with you and I fixed this
> in v2.

While I agree with the sentiment, I'm questioning the implementation.
i.e. you've done this differently to every other test that needs to
check for failures. run_check would be just fine, as would be simply
filtering the output of mkfs.

FWIW, the method for detecting the cp error in the second commit is for
a very specific case. It could have also been done with a filter, as we
have done in the past with such error messages. So what's good for one
case is not necessarily the right way to handle the output for another.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
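The fixed loop proposed in this reply, where the attribute size always walks 4, 8, ..., 64 and the leaf size is never allowed to drop below the page size, can be tabulated to see which combinations a given machine would run. A sketch in Python, assuming all sizes are in KB as in the quoted `seq` loops; `attr_leaf_pairs` is a name invented for this illustration:

```python
def attr_leaf_pairs(pagesize_kb, step_kb=4, max_kb=64):
    """List (attrsize, leafsize) pairs for the fixed loop in the reply:
    the attribute size is independent of the machine, and the leaf size
    is the attribute size raised to at least the page size."""
    return [(attr, max(attr, pagesize_kb))
            for attr in range(step_kb, max_kb + 1, step_kb)]


print(attr_leaf_pairs(4)[:3])    # [(4, 4), (8, 8), (12, 12)]
print(attr_leaf_pairs(16)[:5])   # [(4, 16), (8, 16), (12, 16), (16, 16), (20, 20)]
```

With a 16KB page the sketch produces leaf sizes such as 20 that are not a multiple of the page size, so a real test would probably need to round the leaf size up to the next valid value; that refinement is not part of the loop as posted.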
[PATCH] Btrfs: faster/more efficient insertion of file extent items
This is an extension to my previous commit titled:

  Btrfs: faster file extent item replace operations
  (hash 1acae57b161ef1282f565ef907f72aeed0eb71d9)

Instead of inserting the new file extent item if we deleted existing file extent items covering our target file range, also allow to insert the new file extent item if we didn't find any existing items to delete and replace_extent != 0, since in this case our caller would do another tree search to insert the new file extent item anyway, therefore just combine the two tree searches into a single one, saving cpu time, reducing lock contention and reducing btree node/leaf COW operations.

This covers the case where applications keep doing tail append writes to files, which for example is the case of Apache CouchDB (its database and view index files are always open with O_APPEND).

Signed-off-by: Filipe David Borba Manana fdman...@gmail.com
---
 fs/btrfs/file.c | 52 ++--
 1 file changed, 30 insertions(+), 22 deletions(-)

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 0165b86..006af2f 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -720,7 +720,7 @@ int __btrfs_drop_extents(struct btrfs_trans_handle *trans,
 	if (drop_cache)
 		btrfs_drop_extent_cache(inode, start, end - 1, 0);
 
-	if (start >= BTRFS_I(inode)->disk_i_size)
+	if (start >= BTRFS_I(inode)->disk_i_size && !replace_extent)
 		modify_tree = 0;
 
 	while (1) {
@@ -938,34 +938,42 @@ next_slot:
 		 * Set path->slots[0] to first slot, so that after the delete
 		 * if items are move off from our leaf to its immediate left or
 		 * right neighbor leafs, we end up with a correct and adjusted
-		 * path->slots[0] for our insertion.
+		 * path->slots[0] for our insertion (if replace_extent != 0).
 		 */
 		path->slots[0] = del_slot;
 		ret = btrfs_del_items(trans, root, path, del_slot, del_nr);
 		if (ret)
 			btrfs_abort_transaction(trans, root, ret);
+	}
 
-		leaf = path->nodes[0];
-		/*
-		 * leaf eb has flag EXTENT_BUFFER_STALE if it was deleted (that
-		 * is, its contents got pushed to its neighbors), in which case
-		 * it means path->locks[0] == 0
-		 */
-		if (!ret && replace_extent && leafs_visited == 1 &&
-		    path->locks[0] &&
-		    btrfs_leaf_free_space(root, leaf) >=
-		    sizeof(struct btrfs_item) + extent_item_size) {
-
-			key.objectid = ino;
-			key.type = BTRFS_EXTENT_DATA_KEY;
-			key.offset = start;
-			setup_items_for_insert(root, path, &key,
-					       &extent_item_size,
-					       extent_item_size,
-					       sizeof(struct btrfs_item) +
-					       extent_item_size, 1);
-			*key_inserted = 1;
+	leaf = path->nodes[0];
+	/*
+	 * If btrfs_del_items() was called, it might have deleted a leaf, in
+	 * which case it unlocked our path, so check path->locks[0] matches a
+	 * write lock.
+	 */
+	if (!ret && replace_extent && leafs_visited == 1 &&
+	    (path->locks[0] == BTRFS_WRITE_LOCK_BLOCKING ||
+	     path->locks[0] == BTRFS_WRITE_LOCK) &&
+	    btrfs_leaf_free_space(root, leaf) >=
+	    sizeof(struct btrfs_item) + extent_item_size) {
+
+		key.objectid = ino;
+		key.type = BTRFS_EXTENT_DATA_KEY;
+		key.offset = start;
+		if (!del_nr && path->slots[0] < btrfs_header_nritems(leaf)) {
+			struct btrfs_key slot_key;
+
+			btrfs_item_key_to_cpu(leaf, &slot_key, path->slots[0]);
+			if (btrfs_comp_cpu_keys(&key, &slot_key) > 0)
+				path->slots[0]++;
 		}
+		setup_items_for_insert(root, path, &key,
+				       &extent_item_size,
+				       extent_item_size,
+				       sizeof(struct btrfs_item) +
+				       extent_item_size, 1);
+		*key_inserted = 1;
 	}
 
 	if (!replace_extent || !(*key_inserted))
-- 
1.7.9.5
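The idea in the commit message, reusing the position found by the first search instead of searching again for the insert, can be illustrated outside the kernel with a sorted array standing in for a leaf. The following is a minimal sketch under that assumption only; `find_slot` and `insert_at_slot` are hypothetical names, the keys are plain file offsets, and none of the btrfs locking or COW handling is modelled.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Lower bound: index of the first key >= target, or n if there is none. */
size_t find_slot(const uint64_t *keys, size_t n, uint64_t target)
{
	size_t lo = 0, hi = n;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (keys[mid] < target)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/*
 * Insert key at a slot that an earlier search already located, so no
 * second search is needed. Returns 0 on success, -1 if the array (the
 * stand-in for a leaf) is full or the slot is out of range.
 */
int insert_at_slot(uint64_t *keys, size_t *n, size_t cap,
		   size_t slot, uint64_t key)
{
	if (*n >= cap || slot > *n)
		return -1;
	/* Shift the tail one position to the right to open the slot. */
	memmove(&keys[slot + 1], &keys[slot], (*n - slot) * sizeof(keys[0]));
	keys[slot] = key;
	(*n)++;
	return 0;
}
```

For a tail append the search returns a slot equal to the item count, which is the case the patch targets: nothing is deleted, yet the slot is already known and there is room, so the insert can proceed right away.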
Re: Provide a better free space estimate on RAID1
Roman Mamedov posted on Sun, 09 Feb 2014 15:20:00 +0600 as excerpted: On Sun, 9 Feb 2014 06:38:53 + (UTC) Duncan 1i5t5.dun...@cox.net wrote: RAID or multi-device filesystems aren't 1970s features and break 1970s behavior and the assumptions associated with it. If you're not prepared to deal with those broken assumptions, don't. Use mdraid or dmraid or lvm or whatever to combine your multiple devices into one logical devices as presented, and put your filesystem (either traditional filesystem, or even btrfs using traditional single-device functionality) on top of the single device the layer beneath the filesystem presents. Problem solved! =:^) No reason BTRFS can't work well in a similar simplistic usage scenario. You seem to insist there is no way around it being too flexible for its own good, but all those advanced features absolutely don't *have* to get in the way of everyday usage for users who don't require them. Not really. I'm more insisting that I've not seen a good kernel-space solution to the problem yet, and believe that it's a userspace or wetware problem. And I provided a userspace/wetware solution that works for me, too. =:^) Meanwhile, what I've done here is use one of df's commandline options to set its block size to 2 MiB, and further used bash's alias functionality to setup an alias accordingly: alias df='df -B2M' $ df /h Filesystem 2M-blocks Used Available Use% Mounted on /dev/sda6 20480 12186 7909 61% /h On btrfs such as the above I can read the 2M blocks as 1M and be happy. On btrfs such as my /boot, which aren't raid1 (I have two separate /boots, one on each device, with grub2 configured separately for each to provide a backup), or if I df my media partitions still on reiserfs on the old spinning rust, I can either double the figures DF gives me, or add a second -B option at the CLI, overriding the aliased option. Congratulations, you broke your df readings on all other filesystems to fix them on btrfs. No. It clearly says 2M blocks. 
Nothing's broken at all, except perhaps the user's wetware. I just find it a easier to do the doubling in wetware on the occasion it's needed, in MiB, then halving on more frequent occasions (since all my core automounted filesystems that I'd normally be doing df on are btrfs raid1), larger KiB or byte units, and don't need to do that wetware halving often enough to have gone to the trouble of setting up the software-scripted version I propose below. If I wanted something fully automated, it'd be easy enough to setup a script that checked what filesystem I was df-ing, matched that against a table of filesystems to preferred df block sizes, and supplied the appropriate -BxX option accordingly. I am not sure this would work well in the network share scenario described earlier, with clients which in the real world are largely Windows-based. So patch the window-based stuff... oh, you've let them be your master (in the context of my sig below) and you can't... Well, servant by choice, I guess... There's freedom if you want it... which in fact you are using to do your kernel patches. Try patching the MS Windows kernel and distributing those patches, and see how far you get! =:^( FWIW/IMO, in the business context Ernie Ball made the right decision. One BSA audit was enough. He said no more, and the company moved to free as in freedom software and isn't beholden to the whims of any servantware or the BSA auditors enforcing it, any longer. =:^) But as I said, your systems (or your company's systems), play servant with them and be subject to the BSA gestapo (or the equivalent in your country) if you will. No skin off my nose. shrug Meanwhile, you said it yourself, users aren't normally concerned about this. And others pointed out that to the degree users /are/ concerned, they should be looking at their quotas, not filesystem level usage. 
And admins, assuming they're proper admins, not the simple here's my MCSE, I'm certified to do anything, and if I can't do it, it's not possible, types, should have the wetware resources to either deal with the problem there, or script their own solutions, offloading it from wetware to installation-specific userspace software scripts as necessary. All that said, it's worth noting that there ARE already API changes proposed and working their way thru the pipeline, that would expose various bits of necessary data to userspace in a standardized API that filesystems other than btrfs could make use of as well, with the intent of then updating coreutils (the package containing df) and friends to allow them to make use of the information exposed by this API to improve their default information output and allow for additional CLI level options as appropriate. Presumably other userspace apps, including the GUIs over time, would follow the same course. But the key is, getting a standardized modern API ready
Re: [PATCH 1/2] btrfs-progs: Add missing devices check for mounted btrfs.
On Fri, 07 Feb 2014 17:34:46 +0800, Anand Jain wrote: IMO btrfs-progs shouldn't add its intelligence to know if disk is missing. If btrfs-kernel doesn't know when disk is missing that's a bug to fix in btrfs-kernel. yes that indeed true as of now in btrfs-kernel. btrfs kernel has no idea when disk goes missing, just -EIO doesn't tell btrfs that. I am trying to fix this first. But the problem is there isn't good way with in btrfs/FS to know when disk goes missing. did I miss anything ? Yes, kernel detection is the best way. But since it has no better way to detect missing device, I think the btrfs-progs way fix is good enough for now. Since btrfs fi show with -d options will scan the /dev to find fs and check missing disks, I think adds some user-land check even using the ioctl way is still somewhat reasonable. Thanks Qu Thanks, Anand On 02/07/2014 02:45 PM, Qu Wenruo wrote: In btrfs/003 of xfstest, it will check whether btrfs fi show can find missing devices. But before the patch, btrfs-progs will not check whether device missing if given a mounted btrfs mountpoint/block device. This patch fixes the bug and will pass btrfs/003. 
Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
Cc: Anand Jain anand.j...@oracle.com
---
 cmds-filesystem.c | 12
 1 file changed, 12 insertions(+)

diff --git a/cmds-filesystem.c b/cmds-filesystem.c
index 384d1b9..4c9933d 100644
--- a/cmds-filesystem.c
+++ b/cmds-filesystem.c
@@ -363,6 +363,8 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
 		char *label, char *path)
 {
 	int i;
+	int fd;
+	int missing;
 	char uuidbuf[BTRFS_UUID_UNPARSED_SIZE];
 	struct btrfs_ioctl_dev_info_args *tmp_dev_info;
 	int ret;
@@ -385,6 +387,14 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
 
 	for (i = 0; i < fs_info->num_devices; i++) {
 		tmp_dev_info = (struct btrfs_ioctl_dev_info_args *)&dev_info[i];
+
+		/* Add check for missing devices even mounted */
+		fd = open((char *)tmp_dev_info->path, O_RDONLY);
+		if (fd < 0) {
+			missing = 1;
+			continue;
+		}
+		close(fd);
 		printf("\tdevid %4llu size %s used %s path %s\n",
 			tmp_dev_info->devid,
 			pretty_size(tmp_dev_info->total_bytes),
@@ -392,6 +402,8 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info,
 			tmp_dev_info->path);
 	}
 
+	if (missing)
+		printf("\t*** Some devices missing\n");
 	printf("\n");
 	return 0;
 }
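The userspace check in this patch boils down to "a device whose path cannot be opened is treated as missing". A standalone sketch of that heuristic is shown here; `count_missing_devices` is a hypothetical helper, not a btrfs-progs function, and it takes a plain array of paths rather than the ioctl structures. Note that the patch as posted declares `missing` without an initial value, whereas the sketch starts its counter at 0 before the loop.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the device paths that cannot be opened
 * read-only. Any open() failure is counted as "missing", so a permission
 * error looks the same as an absent device; this is a heuristic only.
 */
int count_missing_devices(const char *const *paths, size_t n)
{
	int missing = 0;	/* initialised before the loop */
	size_t i;

	for (i = 0; i < n; i++) {
		int fd = open(paths[i], O_RDONLY);

		if (fd < 0) {
			missing++;
			continue;
		}
		close(fd);
	}
	return missing;
}
```

A caller could print the "Some devices missing" line whenever the return value is non-zero.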
Re: [PATCH 2/2] btrfs-progs: Add -p/--print-missing options for btrfs fi show
On fri, 07 Feb 2014 17:26:11 +0800, Anand Jain wrote: Whats needed is more comprehensive btrfs fi show which shows the flags (including missing) per disk. Yes indeed. And also show the FS/Raid status. Which I am working on. sorry -p feature would be covered by default in the coming revamp of btrfs fi show. That's all right. Just a kind remind, if the output format changes, don't forget to modify the related xfstests testcase. Thanks, Qu Thanks, Anand On 02/07/2014 02:46 PM, Qu Wenruo wrote: Since a mounted btrfs filesystem contains all the devices info even a device is removed after mount(like btrfs/003 in xfstests), we can use the info to print the known missing device if possible. So -p/--print-missing options are added to print possible missing devices. Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com --- cmds-filesystem.c | 26 -- man/btrfs.8.in| 4 +++- 2 files changed, 23 insertions(+), 7 deletions(-) diff --git a/cmds-filesystem.c b/cmds-filesystem.c index 4c9933d..77b142c 100644 --- a/cmds-filesystem.c +++ b/cmds-filesystem.c @@ -360,7 +360,7 @@ static u64 calc_used_bytes(struct btrfs_ioctl_space_args *si) static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info, struct btrfs_ioctl_dev_info_args *dev_info, struct btrfs_ioctl_space_args *space_info, -char *label, char *path) +char *label, char *path, int print_missing) { int i; int fd; @@ -392,7 +392,14 @@ static int print_one_fs(struct btrfs_ioctl_fs_info_args *fs_info, fd = open((char *)tmp_dev_info-path, O_RDONLY); if (fd 0) { missing = 1; -continue; +if (print_missing) +printf(\tdevid %4llu size %s used %s path %s (missing)\n, + tmp_dev_info-devid, + pretty_size(tmp_dev_info-total_bytes), + pretty_size(tmp_dev_info-bytes_used), + tmp_dev_info-path); +else +continue; } close(fd); printf(\tdevid %4llu size %s used %s path %s\n, @@ -440,7 +447,7 @@ static int check_arg_type(char *input) return BTRFS_ARG_UNKNOWN; } -static int btrfs_scan_kernel(void *search) +static int btrfs_scan_kernel(void *search, 
int print_missing) { int ret = 0, fd; FILE *f; @@ -477,7 +484,8 @@ static int btrfs_scan_kernel(void *search) fd = open(mnt-mnt_dir, O_RDONLY); if ((fd != -1) !get_df(fd, space_info_arg)) { print_one_fs(fs_info_arg, dev_info_arg, -space_info_arg, label, mnt-mnt_dir); + space_info_arg, label, mnt-mnt_dir, + print_missing); kfree(space_info_arg); memset(label, 0, sizeof(label)); } @@ -500,6 +508,7 @@ static const char * const cmd_show_usage[] = { Show the structure of a filesystem, -d|--all-devices show only disks under /dev containing btrfs filesystem, -m|--mounted show only mounted btrfs, +-p|--print-missing show known missing device if possible, If no argument is given, structure of all present filesystems is shown., NULL }; @@ -513,6 +522,7 @@ static int cmd_show(int argc, char **argv) int ret; int where = BTRFS_SCAN_LBLKID; int type = 0; +int print_missing = 0; char mp[BTRFS_PATH_NAME_MAX + 1]; char path[PATH_MAX]; @@ -521,9 +531,10 @@ static int cmd_show(int argc, char **argv) static struct option long_options[] = { { all-devices, no_argument, NULL, 'd'}, { mounted, no_argument, NULL, 'm'}, +{ print-missing, no_argument, NULL, 'p'}, { NULL, no_argument, NULL, 0 }, }; -int c = getopt_long(argc, argv, dm, long_options, +int c = getopt_long(argc, argv, dmp, long_options, long_index); if (c 0) break; @@ -534,6 +545,9 @@ static int cmd_show(int argc, char **argv) case 'm': where = BTRFS_SCAN_MOUNTED; break; +case 'p': +print_missing = 1; +break; default: usage(cmd_show_usage); } @@ -571,7 +585,7 @@ static int cmd_show(int argc, char **argv) goto devs_only; /* show mounted btrfs */ -ret = btrfs_scan_kernel(search); +ret = btrfs_scan_kernel(search, print_missing); if (search !ret) return 0; diff --git a/man/btrfs.8.in b/man/btrfs.8.in index 8fea115..db2e355 100644 --- a/man/btrfs.8.in +++ b/man/btrfs.8.in @@ -25,7 +25,7 @@ btrfs \- control a btrfs filesystem .PP \fBbtrfs\fP \fBfilesystem df\fP\fI path\fP .PP -\fBbtrfs\fP \fBfilesystem show\fP 
[\fI--mounted\fP|\fI--all-devices\fP|\fIuuid\fP]\fP +\fBbtrfs\fP \fBfilesystem show\fP [\fI--mounted\fP|\fI--all-devices\fP|\fI--print-missing\fP|\fIuuid\fP]\fP .PP \fBbtrfs\fP \fBfilesystem
Issue with btrfs balance
I just recently discovered something about btrfs filesystem balance that (as far as I can see) isn't documented anywhere, and doesn't necessarily have an obvious (to the average user) explanation. Apparently, trying to use -mconvert=dup or -sconvert=dup on a multi-device filesystem using one of the RAID profiles for metadata fails with a statement to look at the kernel log, which doesn't show anything at all about the failure.

Based on what I've been able to understand from the source, it appears that the kernel stops you from converting to a dup profile for metadata in this case because it thinks that such a profile doesn't work on multiple devices, despite the fact that you can take a single-device filesystem, add a device, and it will still work fine even without converting the metadata/system profiles.

I feel that, at the very least, this should be documented, and the kernel should give some indication of what went wrong. Ideally, this should be changed to allow converting to dup, so that when converting a multi-device filesystem to single-device, you never have to have metadata or system chunks use a single profile.
Re: BTRFS with RAID1 cannot boot when removing drive
Saint Germain posted on Sun, 09 Feb 2014 22:40:55 +0100 as excerpted:

I am experimenting with BTRFS and RAID1 on my Debian Wheezy (with backported kernel 3.12-0.bpo.1-amd64) using a motherboard with UEFI.

My systems don't do UEFI, but I do run GPT partitions and use grub2 for booting, with grub2-core installed to a BIOS/reserved type partition (instead of as an EFI service as it would be with UEFI). And I have root filesystem btrfs two-device raid1 mode working fine here, tested bootable with only one device of the two available. So while I can't help you directly with UEFI, I know the rest of it can/does work.

One more thing: I do have a (small) separate btrfs /boot, actually two of them as I setup a separate /boot on each of the two devices in order to have a backup /boot, since grub can only point to one /boot by default, and while pointing to another in grub's rescue mode is possible, I didn't want to have to deal with that if the first /boot was corrupted, as it's easier to simply point the BIOS at a different drive entirely and load its (independently installed and configured) grub and /boot.

But grub2's btrfs module reads raid1 mode just fine as I can access files on the btrfs raid1 mode rootfs directly from grub without issue, so that's not a problem. But I strongly suspect I know what is... and it's a relatively easy fix. See below. =:^)

However I haven't managed to make the system boot when removing the first hard drive. I have installed Debian with the following partition on the first hard drive (no BTRFS subsystem):
/dev/sda1: for / (BTRFS)
/dev/sda2: for /home (BTRFS)
/dev/sda3: for swap
Then I added another drive for a RAID1 configuration (with btrfs balance) and I installed grub on the second hard drive with grub-install /dev/sdb.

Just for clarification as you don't mention it specifically, altho your btrfs filesystem show information suggests you did it this way, are your partition layouts identical on both drives?
That's what I've done here, and I definitely find that easiest to manage and even just to think about, tho it's definitely not a requirement. But using different partition layouts does significantly increase management complexity, so it's useful to avoid if possible. =:^)

If I boot on sdb, it takes sda1 as the root filesystem
If I switched the cable, it always takes the first hard drive as the root filesystem (now sdb)

That's normal /appearance/, but that /appearance/ doesn't fully reflect reality. The problem is that mount output (and /proc/self/mounts), fstab, etc, were designed with single-device filesystems in mind, and multi-device btrfs has to be made to fit the existing rules as best it can.

So what's actually happening is that for a btrfs composed of multiple devices, since there's only one device slot for the kernel to list devices, it only displays the first one it happens to come across, even tho the filesystem will normally (unless degraded) require that all component devices be available and logically assembled into the filesystem before it can be mounted.

When you boot on sdb, naturally, the sdb component is the first component of the multi-device filesystem that the kernel finds, so it's the one listed, even tho the filesystem is actually composed of more devices, not just that one. When you switch the cables, the first one is, at least on your system, always the first device component of the filesystem detected, so it's always the one occupying the single device slot available for display, even tho the filesystem has actually assembled all devices into the complete filesystem before mounting.

If I disconnect /dev/sda, the system doesn't boot with a message saying that it hasn't found the UUID:

    Scanning for BTRFS filesystems...
    mount: mounting /dev/disk/by-uuid/c64fca2a-5700-4cca-abac-3a61f2f7486c on /root failed: Invalid argument

Can you tell me what I have done incorrectly ? Is it because of UEFI ? If yes I haven't understood how I can correct it in a simple way.
As you haven't mentioned it and the grub config below doesn't mention it either, I'm almost certain that you're simply not aware of the degraded mount option, and when/how it should be used. And if you're not aware of that, chances are you're not aware of the btrfs wiki, and the multitude of other very helpful information it has available. I'd suggest you spend some time reading it, as it'll very likely save you quite some btrfs administration questions and headaches down the road, as you continue to work with btrfs. Bookmark it and refer to it often! =:^) https://btrfs.wiki.kernel.org (Click on the guides and usage information in contents under section 5, documentation.) Here's the mount options page. Note that the kernel btrfs documentation also includes mount options: https://btrfs.wiki.kernel.org/index.php/Mount_options $KERNELDIR/Documentation/filesystems/btrfs.txt You should be able to mount a two-device
[PATCH 0/3] make 'btrfs fi show /mnt/point/' works with ending '/' character
Before this patchset, 'btrfs fi show' works with '/mnt/point' but not with '/mnt/point/', which is very annoying since tab completion adds a trailing '/' to a directory.

This patchset reuses the find_mount_root function, with some small modifications to ignore the trailing '/' only when needed.

Qu Wenruo (3):
  btrfs-progs: move find_mount_root to utils.[ch]
  btrfs-progs: Add path_is_mp option for find_mount_root.
  btrfs-progs: reuse find_mount_root to determine arg type and so on.

 cmds-filesystem.c |  7 +
 cmds-receive.c    |  2 +-
 cmds-send.c       | 53 ++---
 cmds-subvolume.c  |  2 +-
 commands.h        |  1 -
 utils.c           | 88 +--
 utils.h           |  1 +
 7 files changed, 86 insertions(+), 68 deletions(-)

-- 
1.8.5.4
[PATCH 2/3] btrfs-progs: Add path_is_mp option for find_mount_root.
Add a path_is_mp option to find_mount_root, allowing the path to be treated as a mount point. If no restricted(*) match is found, it will return -ENOENT.

*: A restricted match allows only the trailing '/' to differ, since path completion often adds a '/' at the end but mount points in /proc/self/mounts do not have one.
e.g. /mnt/data and /mnt/data/ is a restricted match, but /mnt/data and /mnt/data/something is not a restricted match.

Signed-off-by: Qu Wenruo quwen...@cn.fujitsu.com
---
 cmds-receive.c   |  2 +-
 cmds-send.c      |  4 ++--
 cmds-subvolume.c |  2 +-
 utils.c          | 32
 utils.h          |  2 +-
 5 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index cce37a7..810fd59 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -843,7 +843,7 @@ static int do_receive(struct btrfs_receive *r, const char *tomnt, int r_fd)
 		goto out;
 	}
 
-	ret = find_mount_root(dest_dir_full_path, &r->root_path);
+	ret = find_mount_root(dest_dir_full_path, &r->root_path, 0);
 	if (ret < 0) {
 		ret = -EINVAL;
 		fprintf(stderr, "ERROR: failed to determine mount point "
diff --git a/cmds-send.c b/cmds-send.c
index 9d49ce9..967f45a 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -349,7 +349,7 @@ static int init_root_path(struct btrfs_send *s, const char *subvol)
 	if (s->root_path)
 		goto out;
 
-	ret = find_mount_root(subvol, &s->root_path);
+	ret = find_mount_root(subvol, &s->root_path, 0);
 	if (ret < 0) {
 		ret = -EINVAL;
 		fprintf(stderr, "ERROR: failed to determine mount point "
@@ -584,7 +584,7 @@ int cmd_send(int argc, char **argv)
 		goto out;
 	}
 
-	ret = find_mount_root(subvol, &mount_root);
+	ret = find_mount_root(subvol, &mount_root, 0);
 	if (ret < 0) {
 		fprintf(stderr, "ERROR: find_mount_root failed on %s: %s\n",
 			subvol,
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 0bd76f2..a78a535 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -931,7 +931,7 @@ static int cmd_subvol_show(int argc, char **argv)
 		goto out;
 	}
 
-	ret = find_mount_root(fullpath, &mnt);
+	ret = find_mount_root(fullpath, &mnt, 0);
 	if (ret < 0) {
 		fprintf(stderr, "ERROR: find_mount_root failed on %s: %s\n",
 			fullpath, strerror(-ret));
diff --git a/utils.c b/utils.c
index 8f06d4e..e39ad79 100644
--- a/utils.c
+++ b/utils.c
@@ -2111,7 +2111,13 @@ int lookup_ino_rootid(int fd, u64 *rootid)
 	return 0;
 }
 
-int find_mount_root(const char *path, char **mount_root)
+/*
+ * Find the mount root of a given path.
+ * Return 0 when found and restore the mount root into mount_root.
+ * If path_is_mp is set, path will be treated as mount point and compare
+ * in a restricted way.
+ */
+int find_mount_root(const char *path, char **mount_root, int path_is_mp)
 {
 	FILE *mnttab;
 	int fd;
@@ -2144,17 +2150,27 @@ int find_mount_root(const char *path, char **mount_root)
 	endmntent(mnttab);
 
 	if (!longest_match) {
-		fprintf(stderr,
-			"ERROR: Failed to find mount root for path %s.\n",
-			path);
-		return -ENOENT;
+		ret = -ENOENT;
+		goto out;
+	}
+
+	/* Only the last '/' in path may differs if path_is_mp */
+	if (path_is_mp && strlen(path) != strlen(longest_match)) {
+		if (strlen(path) != strlen(longest_match) + 1 ||
+		    path[strlen(path) - 1] != '/') {
+			ret = -ENOENT;
+			goto out;
+		}
 	}
 
 	ret = 0;
-	*mount_root = realpath(longest_match, NULL);
-	if (!*mount_root)
-		ret = -errno;
+	if (mount_root) {
+		*mount_root = realpath(longest_match, *mount_root);
+		if (!*mount_root)
+			ret = -errno;
+	}
 
+out:
 	free(longest_match);
 	return ret;
 }
diff --git a/utils.h b/utils.h
index e074732..b69712e 100644
--- a/utils.h
+++ b/utils.h
@@ -96,6 +96,6 @@ int ask_user(char *question);
 int lookup_ino_rootid(int fd, u64 *rootid);
 int btrfs_scan_lblkid(int update_kernel);
 int get_btrfs_mount(const char *dev, char *mp, size_t mp_size);
-int find_mount_root(const char *path, char **mount_root);
+int find_mount_root(const char *path, char **mount_root, int path_is_mp);
 
 #endif
-- 
1.8.5.4
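The restricted match described in this commit message can be tried in isolation. The sketch below uses a hypothetical standalone helper, `is_restricted_match`, rather than the patched find_mount_root; unlike the patch, which compares only lengths because longest_match is already known to be a prefix of path, this version also compares the string contents so that it can be called on arbitrary inputs.

```c
#include <string.h>

/*
 * Hypothetical helper: return 1 if path equals mount_dir, or differs from
 * it only by one trailing '/' on path; return 0 otherwise.
 */
int is_restricted_match(const char *path, const char *mount_dir)
{
	size_t plen = strlen(path);
	size_t mlen = strlen(mount_dir);

	if (plen == mlen)
		return strcmp(path, mount_dir) == 0;
	/* Exactly one extra character, and it must be the trailing '/'. */
	if (plen == mlen + 1 && path[plen - 1] == '/')
		return strncmp(path, mount_dir, mlen) == 0;
	return 0;
}
```

With this rule "/mnt/data/" matches the mount point "/mnt/data", while "/mnt/data/something" and "/mnt/data//" do not.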
[PATCH 1/3] btrfs-progs: move find_mount_root to utils.[ch]
Move find_mount_root to utils.[ch] for general use. Signed-off-by: Qu Wenruo quwen...@cn.fuijitsu.com --- cmds-send.c | 49 + commands.h | 1 - utils.c | 48 utils.h | 1 + 4 files changed, 50 insertions(+), 49 deletions(-) diff --git a/cmds-send.c b/cmds-send.c index fc9a01e..9d49ce9 100644 --- a/cmds-send.c +++ b/cmds-send.c @@ -39,6 +39,7 @@ #include ioctl.h #include commands.h #include list.h +#include utils.h #include send.h #include send-utils.h @@ -57,54 +58,6 @@ struct btrfs_send { struct subvol_uuid_search sus; }; -int find_mount_root(const char *path, char **mount_root) -{ - FILE *mnttab; - int fd; - struct mntent *ent; - int len; - int ret; - int longest_matchlen = 0; - char *longest_match = NULL; - - fd = open(path, O_RDONLY | O_NOATIME); - if (fd 0) - return -errno; - close(fd); - - mnttab = setmntent(/proc/self/mounts, r); - if (!mnttab) - return -errno; - - while ((ent = getmntent(mnttab))) { - len = strlen(ent-mnt_dir); - if (strncmp(ent-mnt_dir, path, len) == 0) { - /* match found */ - if (longest_matchlen len) { - free(longest_match); - longest_matchlen = len; - longest_match = strdup(ent-mnt_dir); - } - } - } - endmntent(mnttab); - - if (!longest_match) { - fprintf(stderr, - ERROR: Failed to find mount root for path %s.\n, - path); - return -ENOENT; - } - - ret = 0; - *mount_root = realpath(longest_match, NULL); - if (!*mount_root) - ret = -errno; - - free(longest_match); - return ret; -} - static int get_root_id(struct btrfs_send *s, const char *path, u64 *root_id) { struct subvol_info *si; diff --git a/commands.h b/commands.h index 23c1201..db70043 100644 --- a/commands.h +++ b/commands.h @@ -126,5 +126,4 @@ int cmd_rescue(int argc, char **argv); int test_issubvolume(char *path); /* send.c */ -int find_mount_root(const char *path, char **mount_root); char *get_subvol_name(char *mnt, char *full_path); diff --git a/utils.c b/utils.c index 75b37f3..8f06d4e 100644 --- a/utils.c +++ b/utils.c @@ -2110,3 +2110,51 @@ int lookup_ino_rootid(int fd, u64 
*rootid) return 0; } + +int find_mount_root(const char *path, char **mount_root) +{ + FILE *mnttab; + int fd; + struct mntent *ent; + int len; + int ret; + int longest_matchlen = 0; + char *longest_match = NULL; + + fd = open(path, O_RDONLY | O_NOATIME); + if (fd 0) + return -errno; + close(fd); + + mnttab = setmntent(/proc/self/mounts, r); + if (!mnttab) + return -errno; + + while ((ent = getmntent(mnttab))) { + len = strlen(ent-mnt_dir); + if (strncmp(ent-mnt_dir, path, len) == 0) { + /* match found */ + if (longest_matchlen len) { + free(longest_match); + longest_matchlen = len; + longest_match = strdup(ent-mnt_dir); + } + } + } + endmntent(mnttab); + + if (!longest_match) { + fprintf(stderr, + ERROR: Failed to find mount root for path %s.\n, + path); + return -ENOENT; + } + + ret = 0; + *mount_root = realpath(longest_match, NULL); + if (!*mount_root) + ret = -errno; + + free(longest_match); + return ret; +} diff --git a/utils.h b/utils.h index 512c51b..e074732 100644 --- a/utils.h +++ b/utils.h @@ -96,5 +96,6 @@ int ask_user(char *question); int lookup_ino_rootid(int fd, u64 *rootid); int btrfs_scan_lblkid(int update_kernel); int get_btrfs_mount(const char *dev, char *mp, size_t mp_size); +int find_mount_root(const char *path, char **mount_root); #endif -- 1.8.5.4 -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
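The core of find_mount_root is a longest-prefix search over the mount directories. A minimal sketch of just that step follows, with the mount table replaced by a plain array so it does not depend on /proc/self/mounts; `longest_mount_match` is a hypothetical name. It keeps the same plain strncmp prefix test as the function being moved, which means a mount point "/mnt/data" also matches the unrelated path "/mnt/database".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the index of the longest entry in dirs that
 * is a string prefix of path, or -1 if no entry matches.
 */
int longest_mount_match(const char *path, const char *const *dirs, size_t n)
{
	size_t best_len = 0;
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		size_t len = strlen(dirs[i]);

		/* Same test as the original loop: prefix match, longest wins. */
		if (strncmp(dirs[i], path, len) == 0 && best_len < len) {
			best_len = len;
			best = (int)i;
		}
	}
	return best;
}
```

A stricter variant could additionally require that the character following the prefix in path is '/' or the end of the string; that is a possible refinement, not something this patch does.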