Re: Provide a better free space estimate on RAID1

2014-02-10 Thread Roman Mamedov
On Mon, 10 Feb 2014 00:02:38 + (UTC)
Duncan 1i5t5.dun...@cox.net wrote:

 Meanwhile, you said it yourself, users aren't normally concerned about 
 this.

I think you're mistaken here: the point that users aren't looking at
the free space, and hence that it is not important to provide a correct
estimate, was made by someone else, not me. Personally I found that a bit too
surreal to try to answer seriously; much like the rest of your message.

-- 
With respect,
Roman


signature.asc
Description: PGP signature


Re: Provide a better free space estimate on RAID1

2014-02-09 Thread Roman Mamedov
On Sun, 9 Feb 2014 06:38:53 + (UTC)
Duncan 1i5t5.dun...@cox.net wrote:

 RAID or multi-device filesystems aren't 1970s features and break 1970s 
 behavior and the assumptions associated with it.  If you're not prepared 
 to deal with those broken assumptions, don't.  Use mdraid or dmraid or lvm 
 or whatever to combine your multiple devices into one logical device as 
 presented, and put your filesystem (either traditional filesystem, or 
 even btrfs using traditional single-device functionality) on top of the 
 single device the layer beneath the filesystem presents.  Problem solved! 
 =:^)
 
 Note that df only lists a single device as well, not the multiple 
 component devices of the filesystem.  That's broken functionality by your 
 definition, too, and again, using some other layer like lvm or mdraid to 
 present multiple devices as a single virtual device, with a traditional 
 single-device filesystem layout on top of that single device... solves 
 the problem!

No reason BTRFS can't work well in a similar simplistic usage scenario.

You seem to insist there is no way around it being too flexible for its own
good, but all those advanced features absolutely don't *have* to get in the
way of everyday usage for users who don't require them.

 Meanwhile, what I've done here is use one of df's commandline options to 
 set its block size to 2 MiB, and further used bash's alias functionality 
 to setup an alias accordingly:
 
 alias df='df -B2M'
 
 $ df /h
 Filesystem 2M-blocks  Used Available Use% Mounted on
 /dev/sda6  20480 12186  7909  61% /h
 
 $ sudo btrfs fi show /h
 Label: hm0238gcnx+35l0  uuid: ce23242a-b0a9-423f-a9c3-7db2729f48d6
 Total devices 2 FS bytes used 11.90GiB
 devid    1 size 20.00GiB used 14.78GiB path /dev/sda6
 devid    2 size 20.00GiB used 14.78GiB path /dev/sdb6
 
 $ sudo btrfs fi df /h
 Data, RAID1: total=14.00GiB, used=11.49GiB
 System, RAID1: total=32.00MiB, used=16.00KiB
 Metadata, RAID1: total=768.00MiB, used=414.94MiB
 
 
 On btrfs such as the above I can read the 2M blocks as 1M and be happy.

 On btrfs such as my /boot, which aren't raid1 (I have two separate 
 /boots, one on each device, with grub2 configured separately for each to 
 provide a backup), or if I df my media partitions still on reiserfs on 
 the old spinning rust, I can either double the figures DF gives me, or 
 add a second -B option at the CLI, overriding the aliased option.

Congratulations, you broke your df readings on all other filesystems to fix
them on btrfs.

 If I wanted something fully automated, it'd be easy enough to setup a 
 script that checked what filesystem I was df-ing, matched that against a 
 table of filesystems to preferred df block sizes, and supplied the 
 appropriate -BxX option accordingly.

I am not sure this would work well in the network share scenario described
earlier, with clients which in the real world are largely Windows-based.
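[Editorial note: the per-filesystem block-size table described in the quoted text above could, for local use at least, be sketched as a small wrapper. This is a hypothetical sketch, assuming GNU coreutils df and util-linux findmnt; the fstype-to-block-size table is illustrative only.]

```shell
#!/bin/sh
# Hypothetical df wrapper: pick a df block size per filesystem type so
# that btrfs raid1 figures can be read at face value. The table below
# is an example only; adjust it to your own mix of filesystems.
blocksize_for() {
    case "$1" in
        btrfs) echo 2M ;;   # raid1: read the 2M-blocks column as MiB
        *)     echo 1M ;;   # everything else: plain MiB
    esac
}

fsdf() {
    fstype=$(findmnt -n -o FSTYPE --target "$1") || return 1
    df -B"$(blocksize_for "$fstype")" "$1"
}

# Example (only runs where findmnt is available):
command -v findmnt >/dev/null && fsdf /
```

A real version would also have to inspect the btrfs profile (e.g. via btrfs fi df), since a single-profile btrfs such as Duncan's /boot should not be halved.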

-- 
With respect,
Roman




Re: Provide a better free space estimate on RAID1

2014-02-09 Thread Kai Krakow
Duncan 1i5t5.dun...@cox.net wrote:

 Roman Mamedov posted on Sun, 09 Feb 2014 04:10:50 +0600 as excerpted:
 
  If you need to perform a btrfs-specific operation, you can easily use
  the btrfs-specific tools to prepare for it, specifically btrfs fi
  df, which could provide every imaginable interpretation of a free
  space estimate and then some.
  
  UNIX 'df' and the 'statfs' call on the other hand should keep the
  behavior people have been accustomed to relying on since the 1970s.
 
 Which it does... on filesystems that only have 1970s filesystem features.
 =:^)
 
 RAID or multi-device filesystems aren't 1970s features and break 1970s
 behavior and the assumptions associated with it.  If you're not prepared
 to deal with those broken assumptions, don't.  Use mdraid or dmraid or lvm
  or whatever to combine your multiple devices into one logical device as
 presented, and put your filesystem (either traditional filesystem, or
 even btrfs using traditional single-device functionality) on top of the
 single device the layer beneath the filesystem presents.  Problem solved!
 =:^)
 
 Note that df only lists a single device as well, not the multiple
 component devices of the filesystem.  That's broken functionality by your
 definition, too, and again, using some other layer like lvm or mdraid to
 present multiple devices as a single virtual device, with a traditional
 single-device filesystem layout on top of that single device... solves
 the problem!
 
 
 Meanwhile, what I've done here is use one of df's commandline options to
 set its block size to 2 MiB, and further used bash's alias functionality
 to setup an alias accordingly:
 
 alias df='df -B2M'
 
 $ df /h
 Filesystem 2M-blocks  Used Available Use% Mounted on
 /dev/sda6  20480 12186  7909  61% /h
 
 $ sudo btrfs fi show /h
 Label: hm0238gcnx+35l0  uuid: ce23242a-b0a9-423f-a9c3-7db2729f48d6
 Total devices 2 FS bytes used 11.90GiB
  devid    1 size 20.00GiB used 14.78GiB path /dev/sda6
  devid    2 size 20.00GiB used 14.78GiB path /dev/sdb6
 
 $ sudo btrfs fi df /h
 Data, RAID1: total=14.00GiB, used=11.49GiB
 System, RAID1: total=32.00MiB, used=16.00KiB
 Metadata, RAID1: total=768.00MiB, used=414.94MiB
 
 
 On btrfs such as the above I can read the 2M blocks as 1M and be happy.
 On btrfs such as my /boot, which aren't raid1 (I have two separate
 /boots, one on each device, with grub2 configured separately for each to
 provide a backup), or if I df my media partitions still on reiserfs on
 the old spinning rust, I can either double the figures DF gives me, or
 add a second -B option at the CLI, overriding the aliased option.
 
 If I wanted something fully automated, it'd be easy enough to setup a
 script that checked what filesystem I was df-ing, matched that against a
 table of filesystems to preferred df block sizes, and supplied the
 appropriate -BxX option accordingly.  As I guess most admins after a few
 years, I've developed quite a library of scripts/aliases for various
 things I do routinely enough to warrant it, and this would be just one
 more joining the list. =:^)

Well done... And a good idea, I hadn't thought of it yet. But it matches my 
idea of fixing it in user space. :-)

I usually leave the discussion when people start to argue by appealing to 
unix tradition... That's like starting a systemd discussion and telling 
me that systemd is broken by design while mentioning in the same sentence 
that sysvinit works perfectly fine. The latter does not. The former is a 
matter of personal taste but is in no case broken... But... Well...

 But of course it's your system in question, and you can patch btrfs to
 output anything you like, in any format you like.  No need to bother with
 df's -B option if you'd prefer to patch the kernel instead.  Me, I'll
 stick to the -B option.  =:^)

That's essentially the FOSS idea. Actually, I don't want df behavior being 
broken for me. It uses the statfs syscall, which returns blocks. Cutting the 
returned values in half lies about the properties of the device - for EVERY 
application out there, no matter which assumptions are made about the 
returned values. This breaks the statfs syscall. User space should simply not 
rely on the assumption that 1k of user data occupies 1k worth of blocks 
(that's not true anyway, because metadata has to be allocated, too). When I 
first came into contact with unix, df returned used/free blocks - native 
BLOCKS! No option to make it human readable. No pretense that it would show 
you usable space for actual written data. The blocks were given as 512-byte 
sectors. I was okay with that. I knew: if I cut the values in half, I'd 
get about the size of the data I could perhaps fit on the device. If it had 
been a property of the device that 512 bytes of user data wrote two blocks, 
nobody would have cared about df displaying wrong values.
 
-- 
Replies to list only preferred.


Re: Provide a better free space estimate on RAID1

2014-02-09 Thread Kai Krakow
Roman Mamedov r...@romanrm.net wrote:

 When I started to use unix, df returned blocks, not bytes. Without your
 proposed patch, it does that right. With your patch, it does it wrong.
 
 It returns total/used/available space that is usable/used/available by/for
 user data.

No, it does not. It returns the space allocatable by the filesystem. That's 
user data and metadata. That can be far from your expectations, depending on 
how allocation on the filesystem works.

-- 
Replies to list only preferred.

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH][V3] Provide a better free space estimate [was]Re: Provide a better free space estimate on RAID1

2014-02-09 Thread Goffredo Baroncelli
On 02/07/2014 05:40 AM, Roman Mamedov wrote:
 On Thu, 06 Feb 2014 20:54:19 +0100
 Goffredo Baroncelli kreij...@libero.it wrote:
 
[...]

As Roman pointed out, df shows the raw space available. However,
when a RAID level is used, the space available to the user is
less.
This patch tries to address that by correcting the estimate
on the basis of the RAID level.

This is my third revision of this patch. In this revision, I
addressed the bugs related to an incorrect evaluation of the
free space in the case of RAID1 [1] and DUP.

I have to point out that the free space estimation is quite
approximate, because it assumes:

a) all new files are allocated in data chunks
b) the free space will not be consumed by metadata
c) the already allocated chunks are not evaluated for the free
space estimation

These assumptions are unrelated to my patch.

I performed some tests with a filesystem composed of seven 51GB disks.
Here are my df results:

Profile: single
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  512K  348G   1% /mnt/btrfs1

Profile: raid1
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  1.3M  175G   1% /mnt/btrfs1

Profile: raid10
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  2.3M  177G   1% /mnt/btrfs1

Profile: raid5
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  2.0M  298G   1% /mnt/btrfs1

Profile: raid6
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdb        351G  1.8M  248G   1% /mnt/btrfs1


Profile: DUP (only one 50GB disk was used)
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc         51G  576K   26G   1% /mnt/btrfs1
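[Editorial note: the per-profile redundancy factors used in the patch can be cross-checked with a small user-space sketch. The function below is hypothetical and assumes roughly 50GiB of free space per disk while ignoring metadata, which is why it only approximates the df figures above.]

```python
def usable_estimate(total_free, n_disks, profile):
    """Rough usable bytes for a RAID profile, mirroring the redundancy
    factors in the patch (metadata reservations are ignored)."""
    if profile in ("raid1", "raid10", "dup"):
        return total_free // 2                         # half is a mirror copy
    if profile == "raid5":
        return total_free * (n_disks - 1) // n_disks   # 1 stripe of parity
    if profile == "raid6":
        return total_free * (n_disks - 2) // n_disks   # 2 stripes of parity
    return total_free                                  # single/raid0

GIB = 1 << 30
for p in ("single", "raid1", "raid5", "raid6"):
    print(p, usable_estimate(350 * GIB, 7, p) // GIB, "GiB")
# -> single 350, raid1 175, raid5 300, raid6 250 (GiB)
```

These are close to the 348G/175G/298G/248G figures reported above; the remainder is space already set aside for metadata chunks.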


Below my patch.

BR
G.Baroncelli

[1] the bug predates my patch; try to see what happens when you
create a RAID1 filesystem with three disks.

Changes history:
V1  First issue
V2  Corrected an (old) bug when the number of RAID10 disks isn't
a multiple of 4
V3  Corrected the free space estimation in RAID1 (when the
number of disks is odd) and DUP



diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d71a11d..4064a5f 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1481,10 +1481,16 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 		num_stripes = nr_devices;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID1) {
 		min_stripes = 2;
-		num_stripes = 2;
+		num_stripes = nr_devices;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
 		min_stripes = 4;
-		num_stripes = 4;
+		num_stripes = nr_devices;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+		min_stripes = 3;
+		num_stripes = nr_devices;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+		min_stripes = 4;
+		num_stripes = nr_devices;
 	}
 
 	if (type & BTRFS_BLOCK_GROUP_DUP)
@@ -1560,9 +1566,44 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 
 		if (devices_info[i].max_avail >= min_stripe_size) {
 			int j;
-			u64 alloc_size;
+			u64 alloc_size, delta;
+			int k, div;
+
+			/*
+			 * Depending on the RAID profile, we use some
+			 * disk space as redundancy:
+			 * RAID1, RAID10, DUP - half of space used as redundancy
+			 * RAID5              - 1 stripe used as redundancy
+			 * RAID6              - 2 stripes used as redundancy
+			 * RAID0, LINEAR      - no redundancy
+			 */
+			if (type & BTRFS_BLOCK_GROUP_RAID1) {
+				k = num_stripes;
+				div = 2;
+			} else if (type & BTRFS_BLOCK_GROUP_DUP) {
+				k = num_stripes;
+				div = 2;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
+				k = num_stripes;
+				div = 2;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+				k = num_stripes - 1;
+				div = 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+				k = num_stripes - 2;
+				div = 1;
+			} else { /* RAID0/LINEAR */
+				k = num_stripes;
+				div = 1;
+			}
+
+			delta = 

Re: Provide a better free space estimate on RAID1

2014-02-09 Thread Duncan
Roman Mamedov posted on Sun, 09 Feb 2014 15:20:00 +0600 as excerpted:

 On Sun, 9 Feb 2014 06:38:53 + (UTC)
 Duncan 1i5t5.dun...@cox.net wrote:
 
 RAID or multi-device filesystems aren't 1970s features and break 1970s
 behavior and the assumptions associated with it.  If you're not
 prepared to deal with those broken assumptions, don't.  Use mdraid or
 dmraid or lvm or whatever to combine your multiple devices into one
  logical device as presented, and put your filesystem (either
 traditional filesystem, or even btrfs using traditional single-device
 functionality) on top of the single device the layer beneath the
 filesystem presents.  Problem solved! =:^)
 
 No reason BTRFS can't work well in a similar simplistic usage scenario.
 
 You seem to insist there is no way around it being too flexible for its
 own good, but all those advanced features absolutely don't *have* to
 get in the way of everyday usage for users who don't require them.

Not really.  I'm more insisting that I've not seen a good kernel-space 
solution to the problem yet, and believe that it's a userspace or wetware 
problem.

And I provided a userspace/wetware solution that works for me, too. =:^)

 Meanwhile, what I've done here is use one of df's commandline options
 to set its block size to 2 MiB, and further used bash's alias
 functionality to setup an alias accordingly:
 
 alias df='df -B2M'
 
  $ df /h
  Filesystem 2M-blocks  Used Available Use% Mounted on
  /dev/sda6  20480 12186  7909  61% /h

 On btrfs such as the above I can read the 2M blocks as 1M and be happy.

 On btrfs such as my /boot, which aren't raid1 (I have two separate
 /boots, one on each device, with grub2 configured separately for each
 to provide a backup), or if I df my media partitions still on reiserfs
 on the old spinning rust, I can either double the figures DF gives me,
 or add a second -B option at the CLI, overriding the aliased option.
 
 Congratulations, you broke your df readings on all other filesystems to
 fix them on btrfs.

No.  It clearly says 2M blocks.  Nothing's broken at all, except perhaps 
the user's wetware.

I just find it easier to do the doubling in wetware, in MiB, on the 
occasions it's needed, than halving in the larger KiB or byte units on 
the more frequent occasions (since all my core automounted filesystems 
that I'd normally be doing df on are btrfs raid1), and I don't need to 
do that wetware halving often enough to have gone to the trouble of 
setting up the software-scripted version I propose below.

 If I wanted something fully automated, it'd be easy enough to setup a
 script that checked what filesystem I was df-ing, matched that against
 a table of filesystems to preferred df block sizes, and supplied the
 appropriate -BxX option accordingly.
 
 I am not sure this would work well in the network share scenario
 described earlier, with clients which in the real world are largely
 Windows-based.

So patch the Windows-based stuff... oh, you've let them be your master (in 
the context of my sig below) and you can't...  Well, servant by choice, I 
guess...  There's freedom if you want it... which in fact you are using 
to do your kernel patches.  Try patching the MS Windows kernel and 
distributing those patches, and see how far you get! =:^(

FWIW/IMO, in the business context Ernie Ball made the right decision.  
One BSA audit was enough.  He said no more, and the company moved to free 
as in freedom software and isn't beholden to the whims of any servantware 
or the BSA auditors enforcing it, any longer. =:^)

But as I said, your systems (or your company's systems), play servant 
with them and be subject to the BSA gestapo (or the equivalent in your 
country) if you will.  No skin off my nose.  shrug


Meanwhile, you said it yourself, users aren't normally concerned about 
this.  And others pointed out that to the degree users /are/ concerned, 
they should be looking at their quotas, not filesystem level usage.

And admins, assuming they're proper admins, not the simple "here's my 
MCSE, I'm certified to do anything, and if I can't do it, it's not 
possible" types, should have the wetware resources to either deal with 
the problem there, or script their own solutions, offloading it from 
wetware to installation-specific userspace software scripts as necessary.


All that said, it's worth noting that there ARE already API changes 
proposed and working their way thru the pipeline, that would expose 
various bits of necessary data to userspace in a standardized API that 
filesystems other than btrfs could make use of as well, with the intent 
of then updating coreutils (the package containing df) and friends to 
allow them to make use of the information exposed by this API to improve 
their default information output and allow for additional CLI level 
options as appropriate.  Presumably other userspace apps, including the 
GUIs over time, would follow the same course.

But the key is, getting a standardized modern API ready 

Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Roman Mamedov
On Fri, 7 Feb 2014 12:08:12 +0600
Roman Mamedov r...@romanrm.net wrote:

  Earlier conventions would have stated Size ~900GB, and Avail ~900GB. But 
  that's not exactly true either, is it?
 
 Much better, and matching the user expectations of how RAID1 should behave,
 without a major gotcha blowing up in their face the first minute they are
 trying it out. In fact, the next step I planned was to find out how to adjust
 Size and Used as well on all my machines, to show what you just mentioned.

OK done; again, this is just what I will personally use from now on (and for
anyone who finds this helpful).



--- fs/btrfs/super.c.orig   2014-02-06 01:28:36.636164982 +0600
+++ fs/btrfs/super.c    2014-02-08 17:16:50.361931959 +0600
@@ -1481,6 +1481,11 @@
 	}
 
 	kfree(devices_info);
+
+	if (type & BTRFS_BLOCK_GROUP_RAID1) {
+		do_div(avail_space, min_stripes);
+	}
+
 	*free_bytes = avail_space;
 	return 0;
 }
@@ -1491,8 +1496,10 @@
 	struct btrfs_super_block *disk_super = fs_info->super_copy;
 	struct list_head *head = &fs_info->space_info;
 	struct btrfs_space_info *found;
+	u64 total_size;
 	u64 total_used = 0;
 	u64 total_free_data = 0;
+	u64 type;
 	int bits = dentry->d_sb->s_blocksize_bits;
 	__be32 *fsid = (__be32 *)fs_info->fsid;
 	int ret;
@@ -1512,7 +1519,13 @@
 	rcu_read_unlock();
 
 	buf->f_namelen = BTRFS_NAME_LEN;
-	buf->f_blocks = btrfs_super_total_bytes(disk_super) >> bits;
+	total_size = btrfs_super_total_bytes(disk_super);
+	type = btrfs_get_alloc_profile(fs_info->tree_root, 1);
+	if (type & BTRFS_BLOCK_GROUP_RAID1) {
+		do_div(total_size, 2);
+		do_div(total_used, 2);
+	}
+	buf->f_blocks = total_size >> bits;
 	buf->f_bfree = buf->f_blocks - (total_used >> bits);
 	buf->f_bsize = dentry->d_sb->s_blocksize;
 	buf->f_type = BTRFS_SUPER_MAGIC;




2x1TB RAID1 with a 1GB file:

Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   912G  1.1G  911G   1% /mnt/p2
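[Editorial note: in user-space terms, the effect of the patch on the statfs numbers amounts to the following hypothetical sketch; a "1TB" drive is roughly 931GiB of raw space, and the real df output is further reduced by partitioning and metadata, hence 912G rather than 931G.]

```python
def raid1_statfs(total_raw, used_raw):
    """Halve raw totals and usage, as the patch does for RAID1, so that
    df reports space usable for (mirrored) user data."""
    return total_raw // 2, used_raw // 2

GIB = 1 << 30
# 2 x 1TB devices, one 1GB file stored twice (once per mirror):
total, used = raid1_statfs(2 * 931 * GIB, 2 * 1 * GIB)
print(total // GIB, "GiB size,", used // GIB, "GiB used")
# -> 931 GiB size, 1 GiB used
```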


-- 
With respect,
Roman




Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Roman Mamedov
On Fri, 07 Feb 2014 21:32:42 +0100
Kai Krakow hurikhan77+bt...@gmail.com wrote:

 It should show the raw space available. Btrfs also supports compression and 
 doesn't try to be smart about how much compressed data would fit in the free 
 space of the drive. If one is using RAID1, it's supposed to fill up with a 
 rate of 2:1. If one is using compression, it's supposed to fill up with a 
 rate of maybe 1:5 for mostly text files.

Imagine a small business with some 30-40 employees. There is a piece of paper
near the door at the office so that everyone sees it when entering or leaving,
which says:

Dear employees,

Please keep in mind that on the fileserver '\\DepartmentC', in the directory
'\PublicStorage7' the free space you see as being available needs to be divided
by two; On the server '\\DepartmentD', in '\StorageArchive' and '\VideoFiles',
multiplied by two-thirds. For more details please contact the IT operations
team. Further assistance will be provided at the monthly training seminar.

Regards,
John S., CTO.

-- 
With respect,
Roman




Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Hugo Mills
On Sat, Feb 08, 2014 at 05:33:10PM +0600, Roman Mamedov wrote:
 On Fri, 07 Feb 2014 21:32:42 +0100
 Kai Krakow hurikhan77+bt...@gmail.com wrote:
 
  It should show the raw space available. Btrfs also supports compression and 
  doesn't try to be smart about how much compressed data would fit in the 
  free 
  space of the drive. If one is using RAID1, it's supposed to fill up with a 
  rate of 2:1. If one is using compression, it's supposed to fill up with a 
  rate of maybe 1:5 for mostly text files.
 
 Imagine a small business with some 30-40 employees. There is a piece of paper
 near the door at the office so that everyone sees it when entering or leaving,
 which says:
 
 Dear employees,
 
 Please keep in mind that on the fileserver '\\DepartmentC', in the directory
 '\PublicStorage7' the free space you see as being available needs to be 
 divided
 by two; On the server '\\DepartmentD', in '\StorageArchive' and '\VideoFiles',
 multiplied by two-thirds. For more details please contact the IT operations
 team. Further assistance will be provided at the monthly training seminar.
 
 Regards,
 John S., CTO.

   In my experience, nobody who uses a shared filesystem *ever* looks
at the amount of free space on it, until it fills up, at which point
they may look at the free space and see 0. Or most likely, they'll
be alerted to the issue by an email from the systems people saying,
please will everyone delete unnecessary files from the shared drive,
because it's full up.

   Having a more accurate estimate of the free space is a laudable
aim, and in principle I agree with attempts to do it, but I think the
argument above isn't exactly a strong one in practice.

   Even in the current code with only one RAID setting available for
data, if you have parity RAID, you've got to look at the number of
drives with available free space to make an estimate of available
space. I think your best bet, ultimately, is to write code to give
either a pessimistic (lower bound) or optimistic (upper bound)
estimate of available space based on the profiles in use and the
current distribution of free/unallocated space, and stick with that. I
think I'd prefer to see a pessimistic bound, although that could break
anything like an installer that attempts to see how much free space
there is before proceeding.

   Hugo.
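[Editorial note: the pessimistic/optimistic pair Hugo suggests could be prototyped in user space along these lines. This is a hypothetical sketch for the RAID1 case only, working from the unallocated space on each device.]

```python
def raid1_bounds(per_device_free):
    """Lower and upper bounds on usable RAID1 space, given the
    unallocated bytes on each device."""
    total = sum(per_device_free)
    optimistic = total // 2                 # everything pairs up perfectly
    # Pessimistic: each chunk must land on two devices, so one oversized
    # device is capped by the space available on all the others combined.
    pessimistic = min(optimistic, total - max(per_device_free))
    return pessimistic, optimistic

# Three devices with 200, 20 and 20 (GiB, say) unallocated:
print(raid1_bounds([200, 20, 20]))  # -> (40, 120)
```

With equally sized devices the two bounds coincide; they diverge exactly in the unbalanced cases where the current single-number estimate is most misleading.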

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
 --- This year,  I'm giving up Lent. --- 




Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Goffredo Baroncelli
On 02/07/2014 05:40 AM, Roman Mamedov wrote:
 On Thu, 06 Feb 2014 20:54:19 +0100
 Goffredo Baroncelli kreij...@libero.it wrote:
 
[...]

Even if I am not entirely convinced, I updated Roman's PoC in order
to take into account all the RAID levels.

I performed some tests with seven 48.8GB disks. Here are my df results:

Profile: single
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        342G  512K  340G   1% /mnt/btrfs1

Profile: raid1
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        342G  1.3M  147G   1% /mnt/btrfs1

Profile: raid10
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        342G  2.3M  102G   1% /mnt/btrfs1

Profile: raid5
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        342G  2.0M  291G   1% /mnt/btrfs1

Profile: raid6
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        342G  1.8M  243G   1% /mnt/btrfs1

Note that RAID1 can only use 6 of the disks and RAID10 only four, but I think
that is due to a pre-existing bug.
The mixed mode (data and metadata in the same chunk) is still unsupported.

Below is my patch.

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d71a11d..e5c58b3 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1485,6 +1485,12 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 	} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
 		min_stripes = 4;
 		num_stripes = 4;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+		min_stripes = 3;
+		num_stripes = nr_devices;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+		min_stripes = 4;
+		num_stripes = nr_devices;
 	}
 
 	if (type & BTRFS_BLOCK_GROUP_DUP)
@@ -1561,8 +1567,30 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 		if (devices_info[i].max_avail >= min_stripe_size) {
 			int j;
 			u64 alloc_size;
+			int k;
 
-			avail_space += devices_info[i].max_avail * num_stripes;
+			/*
+			 * Depending on the RAID profile, we use some
+			 * disk space as redundancy:
+			 * RAID1, RAID10, DUP - half of space used as redundancy
+			 * RAID5              - 1 stripe used as redundancy
+			 * RAID6              - 2 stripes used as redundancy
+			 * RAID0, LINEAR      - no redundancy
+			 */
+			if (type & BTRFS_BLOCK_GROUP_RAID1) {
+				k = num_stripes >> 1;
+			} else if (type & BTRFS_BLOCK_GROUP_DUP) {
+				k = num_stripes >> 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
+				k = num_stripes >> 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+				k = num_stripes - 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+				k = num_stripes - 2;
+			} else { /* RAID0/LINEAR */
+				k = num_stripes;
+			}
+			avail_space += devices_info[i].max_avail * k;
 			alloc_size = devices_info[i].max_avail;
 			for (j = i + 1 - num_stripes; j <= i; j++)
 				devices_info[j].max_avail -= alloc_size;




-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


[PATCH][V2] Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Goffredo Baroncelli
On 02/07/2014 05:40 AM, Roman Mamedov wrote:
 On Thu, 06 Feb 2014 20:54:19 +0100
 Goffredo Baroncelli kreij...@libero.it wrote:
 
[...]

Even if I am not entirely convinced, I updated Roman's PoC in order
to take into account all the RAID levels.

The test filesystem is composed of seven 51GB disks. Here are my df results:

Profile: single
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        351G  512K  348G   1% /mnt/btrfs1

Profile: raid1
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        351G  1.3M  150G   1% /mnt/btrfs1

Profile: raid10
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        351G  2.3M  153G   1% /mnt/btrfs1

Profile: raid5
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        351G  2.0M  298G   1% /mnt/btrfs1

Profile: raid6
Filesystem      Size  Used Avail Use% Mounted on
/dev/vdc        351G  1.8M  248G   1% /mnt/btrfs1


Note that RAID1 and RAID10 can only use an even number of disks.
The mixed mode (data and metadata in the same chunk)
returns strange results.

Below my patch.

BR
G.Baroncelli

Changes history:
V1  First issue
V2  Corrected an (old) bug when the number of RAID10 disks isn't
a multiple of 4


diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index d71a11d..aea9afa 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1481,10 +1481,16 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 		num_stripes = nr_devices;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID1) {
 		min_stripes = 2;
-		num_stripes = 2;
+		num_stripes = nr_devices & ~1llu;
 	} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
 		min_stripes = 4;
-		num_stripes = 4;
+		num_stripes = nr_devices & ~1llu;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+		min_stripes = 3;
+		num_stripes = nr_devices;
+	} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+		min_stripes = 4;
+		num_stripes = nr_devices;
 	}
 
 	if (type & BTRFS_BLOCK_GROUP_DUP)
@@ -1561,8 +1567,30 @@ static int btrfs_calc_avail_data_space(struct btrfs_root *root, u64 *free_bytes)
 		if (devices_info[i].max_avail >= min_stripe_size) {
 			int j;
 			u64 alloc_size;
+			int k;
 
-			avail_space += devices_info[i].max_avail * num_stripes;
+			/*
+			 * Depending on the RAID profile, we use some
+			 * disk space as redundancy:
+			 * RAID1, RAID10, DUP - half of space used as redundancy
+			 * RAID5              - 1 stripe used as redundancy
+			 * RAID6              - 2 stripes used as redundancy
+			 * RAID0, LINEAR      - no redundancy
+			 */
+			if (type & BTRFS_BLOCK_GROUP_RAID1) {
+				k = num_stripes >> 1;
+			} else if (type & BTRFS_BLOCK_GROUP_DUP) {
+				k = num_stripes >> 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID10) {
+				k = num_stripes >> 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID5) {
+				k = num_stripes - 1;
+			} else if (type & BTRFS_BLOCK_GROUP_RAID6) {
+				k = num_stripes - 2;
+			} else { /* RAID0/LINEAR */
+				k = num_stripes;
+			}
+			avail_space += devices_info[i].max_avail * k;
 			alloc_size = devices_info[i].max_avail;
 			for (j = i + 1 - num_stripes; j <= i; j++)
 				devices_info[j].max_avail -= alloc_size;



-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Kai Krakow
Hugo Mills h...@carfax.org.uk wrote:

 On Sat, Feb 08, 2014 at 05:33:10PM +0600, Roman Mamedov wrote:
 On Fri, 07 Feb 2014 21:32:42 +0100
 Kai Krakow hurikhan77+bt...@gmail.com wrote:
 
  It should show the raw space available. Btrfs also supports compression
  and doesn't try to be smart about how much compressed data would fit in
  the free space of the drive. If one is using RAID1, it's supposed to
  fill up with a rate of 2:1. If one is using compression, it's supposed
  to fill up with a rate of maybe 1:5 for mostly text files.
 
 Imagine a small business with some 30-40 employees. There is a piece of
 paper near the door at the office so that everyone sees it when entering
 or leaving, which says:
 
 Dear employees,
 
 Please keep in mind that on the fileserver '\\DepartmentC', in the
 directory '\PublicStorage7' the free space you see as being available
 needs to be divided by two; On the server '\\DepartmentD', in
 '\StorageArchive' and '\VideoFiles', multiplied by two-thirds. For more
 details please contact the IT operations team. Further assistance will be
 provided at the monthly training seminar.
 
 Regards,
 John S, CTO.'
 
In my experience, nobody who uses a shared filesystem *ever* looks
 at the amount of free space on it, until it fills up, at which point
 they may look at the free space and see 0. Or most likely, they'll
 be alerted to the issue by an email from the systems people saying,
 please will everyone delete unnecessary files from the shared drive,
 because it's full up.

That is exactly the point, from my practical experience. Only sysadmins watch 
these numbers, and they know how to handle them.

Imagine the future: btrfs supports different RAID levels per subvolume, and we 
need to figure out where to place a new subvolume. I need raw numbers for 
that, which a transformed df won't tell me. Things become very difficult.

Free space is a number unimportant to end users. They won't look at it. They 
start to cry and call the helpdesk when an application says "Disk is full" - 
and you cannot even save your open document, because: disk full.

The only way to solve this is to apply quotas to users and let the 
sysadmins do the space usage planning. That will work.

I still think there should be an extra utility which estimates the 
usable free space - or an option added to df to show that.

Roman's argument is only one view of the problem. My argument (sysadmin 
space planning) is exactly the opposite view. In the future, free space 
prediction will only become more complicated, involve more code, and introduce 
bugs... It should be done in user space. Df should report raw numbers.

Storage space is cheap these days. You should just throw another disk at the 
array if free space falls below a certain threshold. End users do not care 
for free space. They just cry when it's full - no matter how accurate the 
numbers had been before. They will certainly not cry if they copied 2 MB to 
the disk but 4 MB had been taken. In a shared storage space this is probably 
always the case anyway, because at that very same moment someone else also 
copied 2 MB to the volume. So what?

Having a more accurate estimate of the free space is a laudable
 aim, and in principle I agree with attempts to do it, but I think the
 argument above isn't exactly a strong one in practice.

I do not disagree either. But I think it should go into a separate utility, or 
there should be a new API call in the kernel to get the predicted usable free 
space based on the current usage pattern. Df is meant as a utility to get 
accurate numbers; it should not tell you guessed ones.

However you design a df calculator in btrfs, it could always be too 
pessimistic or too optimistic (and could even switch unpredictably between 
both situations). So whatever you do, it is always inaccurate. It will never 
be able to tell you exactly the numbers you need. If disk space is low: add 
disks. Clean up. Whatever. Just do not try to fill up your FS until only 
1 kB is left - btrfs doesn't like that anyway. So: use quotas.

Picking up the piece-of-paper example: you still have to tell your employees 
that the free space numbers aren't exact anyway, so their best option is to 
simply not look at them - they are better off just trying the copy.

Besides: if you want to fix this, what about the early-ENOSPC problem which 
exists by design (allocation in chunks)? You'd need to fix that, too.

-- 
Replies to list only preferred.



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Kai Krakow
Chris Murphy li...@colorremedies.com schrieb:

 
 On Feb 6, 2014, at 11:08 PM, Roman Mamedov r...@romanrm.net wrote:
 
  And what
 if I am accessing that partition on a server via a network CIFS/NFS share
 and don't even *have a way to find out* any of that.
 
 That's the strongest argument. And if the user is using
 Explorer/Finder/Nautilus to copy files to the share, I'm pretty sure all
 three determine if there's enough free space in advance of starting the
 copy. So if it thinks there's free space, it will start to copy and then
 later fail midstream when there's no more space. And then the user's copy
 task is in a questionable state as to what's been copied, depending on how
 the file copies are being threaded.

This problem was already solved for remote file systems maybe 20-30 
years ago: you cannot know how much space will be left at the end of a copy by 
looking at the numbers before the copy - it may have been used up by another 
user copying a file at the same time. The problem was solved by 
applying hard and soft quotas: the sysadmin does optimistic (or possibly 
even pessimistic) planning and applies quotas. Soft quotas can be exceeded for 
(maybe) 7 days, after which you need to free up space again before adding new 
data. Hard quotas are the hard cutoff - you cannot pass that barrier. Df 
will show you what's free within your soft quota. Problem solved. If you need 
better numbers, there are quota commands instead of df. Why break with this 
design choice?

If you manage a central shared storage for end users, you should really 
start thinking about quotas. Without, you cannot even exactly plan your 
backups.

If df shows transformed/guessed numbers to the sysadmins, things start to 
become very complicated and unpredictable.



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Kai Krakow
Martin Steigerwald mar...@lichtvoll.de schrieb:

 While I understand that there is *never* a guarentee that a given free
 space can really be allocated by a process cause other processes can
 allocate space as well in the mean time, and while I understand that its
 difficult to provide an accurate to provide exact figures as soon as RAID
 settings can be set per subvolume, it still think its important to improve
 on the figures.

The question here: does the free space indicator fail predictably or 
unpredictably? It will do the latter with this change.



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Roman Mamedov
On Sat, 08 Feb 2014 22:35:40 +0100
Kai Krakow hurikhan77+bt...@gmail.com wrote:

 Imagine the future: Btrfs supports different RAID levels per subvolume. We 
 need to figure out where to place a new subvolume. I need raw numbers for 
 it. Df won't tell me that now. Things become very difficult now.

If you need to perform a btrfs-specific operation, you can easily use the 
btrfs-specific tools to prepare for it - specifically 'btrfs fi df', which 
could provide every imaginable interpretation of the free space estimate, 
and then some.

UNIX 'df' and the 'statfs' call, on the other hand, should keep the behavior 
people have been accustomed to relying on since the 1970s.

-- 
With respect,
Roman




Re: Provide a better free space estimate on RAID1

2014-02-08 Thread cwillu
Everyone who has actually looked at what the statfs syscall returns
and how df (and everyone else) uses it, keep talking.  Everyone else,
go read that source code first.

There is _no_ combination of values you can return in statfs which
will not be grossly misleading in some common scenario that someone
cares about.


Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Kai Krakow
Roman Mamedov r...@romanrm.net schrieb:

 It should show the raw space available. Btrfs also supports compression
 and doesn't try to be smart about how much compressed data would fit in
 the free space of the drive. If one is using RAID1, it's supposed to fill
 up with a rate of 2:1. If one is using compression, it's supposed to fill
 up with a rate of maybe 1:5 for mostly text files.
 
 Imagine a small business with some 30-40 employees. There is a piece of
 paper near the door at the office so that everyone sees it when entering
 or leaving, which says:
 
 Dear employees,
 
 Please keep in mind that on the fileserver '\\DepartmentC', in the
 directory '\PublicStorage7' the free space you see as being available
 needs to be divided by two; On the server '\\DepartmentD', in
 '\StorageArchive' and '\VideoFiles', multiplied by two-thirds. For more
 details please contact the IT operations team. Further assistance will be
 provided at the monthly training seminar.

Dear employees,

Please keep in mind that when you run out of space on the fileserver 
'\\DepartmentC', when you free up space in the directory '\PublicStorage7' 
the free space you gain on '\StorageArchive' is only one third of the amount 
you deleted, and in '\VideoFiles', you gain only one half. For more details 
please contact the IT operations team. Further assistance will be provided 
at the monthly training seminar.

Regards,
John S, CTO.

The exercise of why is left to the reader...

The proposed fix does not actually fix the problem. It simply shifts it, 
introducing the need for another fix somewhere else, which in turn probably 
introduces the need for yet another fix, and so forth... This will become an 
endless effort of fixing and tuning.

It simply does not work, because btrfs' design does not allow it. Feel free 
to fix it, but be prepared for the reincarnation of this problem when per-
subvolume raid levels are introduced. The problem has to be fixed in user 
space or with a new API call.



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Kai Krakow
cwillu cwi...@cwillu.com schrieb:

 Everyone who has actually looked at what the statfs syscall returns
 and how df (and everyone else) uses it, keep talking.  Everyone else,
 go read that source code first.
 
 There is _no_ combination of values you can return in statfs which
 will not be grossly misleading in some common scenario that someone
 cares about.

Thanks man! statfs returns free blocks, so let's stick with that. The df 
command, as people try to understand it, is broken by design on btrfs. One 
has to live with that. The df command as it has worked since the 1970s returns 
free blocks - and it does that perfectly fine on btrfs without the proposed 
fix.

User space should not try to be smart about how many blocks get written to 
the filesystem when it writes xyz bytes to it. It has been that way since 
1970 (or whenever), and it will stay that way in the future. And a good 
file-copying GUI should give you the choice of "I know better, copy 
anyway" (like every other unix utility).

Your pointer says everything there is to say about it.



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Kai Krakow
Roman Mamedov r...@romanrm.net schrieb:

 UNIX 'df' and the 'statfs' call, on the other hand, should keep the behavior
 people have been accustomed to relying on since the 1970s.

When I started to use unix, df returned blocks, not bytes. Without your 
proposed patch, it does that right. With your patch, it does it wrong.
 


Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Roman Mamedov
On Sun, 09 Feb 2014 00:32:47 +0100
Kai Krakow hurikhan77+bt...@gmail.com wrote:

 When I started to use unix, df returned blocks, not bytes. Without your 
 proposed patch, it does that right. With your patch, it does it wrong.

It returns the total/used/available space that is usable/used/available by/for 
user data. Whether that is in sectors, blocks, kilobytes, megabytes or some 
other unit is a secondary detail, unrelated to the change currently being 
discussed and not affected by it.



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Roman Mamedov
On Sun, 09 Feb 2014 00:17:29 +0100
Kai Krakow hurikhan77+bt...@gmail.com wrote:

 Dear employees,
 
 Please keep in mind that when you run out of space on the fileserver 
 '\\DepartmentC', when you free up space in the directory '\PublicStorage7' 
 the free space you gain on '\StorageArchive' is only one third of the amount 
 you deleted, and in '\VideoFiles', you gain only one half.

But that's simply incorrect. With my 2nd patch, which also adjusts the 
reported 'total' and 'used' sizes, the 'total' space, 'used' space and the 
space freed up as 'available' after a file deletion will all match up perfectly.

 The exercise of why is left to the reader...
 
 The proposed fix simply does not fix the problem. It simply shifts it 
 introducing the need for another fix somewhere else, which in turn probably 
 also introduces another need for a fix, and so forth... This will become an 
 endless effort of fixing and tuning.

Not sure what exactly becomes problematic if a 2-device RAID1 tells the user 
they can store 1 TB of their data on it, and no longer lies about the 
possibility of storing 2 TB on it, as it does currently.

Two 1TB disks in RAID1. Total space 1TB. Can store of my data: 1TB.
Wrote 100 GB of files? 100 GB used, 900 GB available, 1TB total.
Deleted 50 GB of those? 50 GB used, 950 GB available, 1TB total.

Can't see anything horribly broken about this behavior.

For when you need to get to the bottom of things, as mentioned earlier
there's always 'btrfs fi df'.

 Feel free to fix it, but be prepared for the reincarnation of this problem when
 per-subvolume raid levels are introduced.

AFAIK no one has even begun to write any code to implement those yet.



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Chris Murphy

On Feb 8, 2014, at 6:55 PM, Roman Mamedov r...@romanrm.net wrote:
 
 Not sure what exactly becomes problematic if a 2-device RAID1 tells the user
 they can store 1 TB of their data on it, and is no longer lying about the 
 possibility to store 2 TB on it as currently.
 
 Two 1TB disks in RAID1.

OK, but while we don't have a top-level switch for variable raid on a volume 
yet, the on-disk format doesn't consider the device to be raid1 at all. Neither 
the device, nor the volume, nor the subvolume has this attribute. It's a function 
of the data, metadata or system chunk via their profiles.

I can do a partial conversion on a volume, and even could do this multiple 
times and end up with some chunks in every available option, some chunks 
single, some raid1, some raid0, some raid5. All I have to do is cancel the 
conversion before each conversion is complete, successively shortening the time.

And it's not fair to say this has no application because such conversions take 
a long time. I might not want to fully do a conversion all at once. There's no 
requirement that I do so.

In any case I object to the language being used that implicitly indicates the 
'raidness' is a device or disk attribute.


Chris Murphy



Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Chris Murphy

On Feb 8, 2014, at 7:21 PM, Chris Murphy li...@colorremedies.com wrote:

 we don't have a top level switch for variable raid on a volume yet

That wasn't good wording. We don't have a controllable way to set variable raid 
levels; I'd consider the interrupted-convert model uncontrollable.


Chris Murphy


Re: Provide a better free space estimate on RAID1

2014-02-08 Thread Duncan
Roman Mamedov posted on Sun, 09 Feb 2014 04:10:50 +0600 as excerpted:

 If you need to perform a btrfs-specific operation, you can easily use
 the btrfs-specific tools to prepare for it, specifically use btrfs fi
 df which could give provide every imaginable interpretation of free
 space estimate and then some.
 
  UNIX 'df' and the 'statfs' call, on the other hand, should keep the
  behavior people have been accustomed to relying on since the 1970s.

Which it does... on filesystems that only have 1970s filesystem features. 
=:^)

RAID or multi-device filesystems aren't 1970s features and break 1970s 
behavior and the assumptions associated with it.  If you're not prepared 
to deal with those broken assumptions, don't.  Use mdraid or dmraid or lvm 
or whatever to combine your multiple devices into one logical device as 
presented, and put your filesystem (either traditional filesystem, or 
even btrfs using traditional single-device functionality) on top of the 
single device the layer beneath the filesystem presents.  Problem solved! 
=:^)

Note that df only lists a single device as well, not the multiple 
component devices of the filesystem.  That's broken functionality by your 
definition, too, and again, using some other layer like lvm or mdraid to 
present multiple devices as a single virtual device, with a traditional 
single-device filesystem layout on top of that single device... solves 
the problem!


Meanwhile, what I've done here is use one of df's commandline options to 
set its block size to 2 MiB, and further used bash's alias functionality 
to setup an alias accordingly:

alias df='df -B2M'

$ df /h
Filesystem 2M-blocks  Used Available Use% Mounted on
/dev/sda6  20480 12186  7909  61% /h

$ sudo btrfs fi show /h
Label: hm0238gcnx+35l0  uuid: ce23242a-b0a9-423f-a9c3-7db2729f48d6
Total devices 2 FS bytes used 11.90GiB
devid1 size 20.00GiB used 14.78GiB path /dev/sda6
devid2 size 20.00GiB used 14.78GiB path /dev/sdb6

$ sudo btrfs fi df /h
Data, RAID1: total=14.00GiB, used=11.49GiB
System, RAID1: total=32.00MiB, used=16.00KiB
Metadata, RAID1: total=768.00MiB, used=414.94MiB


On btrfs such as the above I can read the 2M blocks as 1M and be happy.  
On btrfs such as my /boot, which aren't raid1 (I have two separate 
/boots, one on each device, with grub2 configured separately for each to 
provide a backup), or if I df my media partitions still on reiserfs on 
the old spinning rust, I can either double the figures DF gives me, or 
add a second -B option at the CLI, overriding the aliased option.

If I wanted something fully automated, it'd be easy enough to setup a 
script that checked what filesystem I was df-ing, matched that against a 
table of filesystems to preferred df block sizes, and supplied the 
appropriate -BxX option accordingly.  As I guess most admins after a few 
years, I've developed quite a library of scripts/aliases for various 
things I do routinely enough to warrant it, and this would be just one 
more joining the list. =:^)


But of course it's your system in question, and you can patch btrfs to 
output anything you like, in any format you like.  No need to bother with 
df's -B option if you'd prefer to patch the kernel instead.  Me, I'll 
stick to the -B option.  =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Provide a better free space estimate on RAID1

2014-02-07 Thread Martin Steigerwald
Am Donnerstag, 6. Februar 2014, 22:30:46 schrieb Chris Murphy:
 On Feb 6, 2014, at 9:40 PM, Roman Mamedov r...@romanrm.net wrote:
  On Thu, 06 Feb 2014 20:54:19 +0100
 
  Goffredo Baroncelli kreij...@libero.it wrote:
  
 
   I agree with you about the need for a solution. However, your patch to
   me seems even worse than the actual code.
  
 
   For example you cannot take into account the mix of data/linear and
   metadata/dup (with the pathological case of small files stored in the
   metadata chunks), nor different profile levels like raid5/6 (or the
   future raidNxM). And do not forget the compression...
 
  
 
  Every estimate first and foremost should be measured by how precise it is,
  or in this case wrong by how many gigabytes. The actual code returns a
  result that is pretty much always wrong by 2x, after the patch it will be
  close within gigabytes to the correct value in the most common use case
  (data raid1, metadata raid1 and that's it). Of course that PoC is nowhere
  near the final solution, what I can't agree with is if another option is
  somewhat better, but not ideally perfect, then it's worse than the
  current one, even considering the current one is absolutely broken.
 
 Is the glass half empty or is it half full?
 
 From the original post, context is a 2x 1TB raid volume:
 
 Filesystem  Size  Used Avail Use% Mounted on
 /dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2
 
 Earlier conventions would have stated Size ~900GB, and Avail ~900GB. But
 that's not exactly true either, is it? It's merely a convention to cut the
 storage available in half, while keeping data file sizes the same as if
 they were on a single device without raid.
 
 On Btrfs the file system Size is reported as the total storage stack size,
 and that's not incorrect. And the amount Avail is likewise not wrong
 because that space is not otherwise occupied which is the definition of
 available. It's linguistically consistent, it's just not a familiar
 convention.

I see one issue with it:

There are installers and applications that check available disk space prior to 
installing. This won't work with the current df figures that BTRFS delivers.

While I understand that there is *never* a guarantee that a given free space 
can really be allocated by a process because other processes can allocate space 
as well in the meantime, and while I understand that it is difficult to provide 
exact figures as soon as RAID settings can be set per subvolume, I still think 
it is important to improve on the figures.

In the longer term I'd like a function / syscall to ask the filesystem the 
following question:

"I am about to write 200 MB in this directory, am I likely to succeed with 
that?"

This way an application can ask a question specific to a directory, which 
allows BTRFS to provide a more accurate estimate.


I understand that there is something like that for single files (fallocate), 
but there is nothing like this for writing a certain amount of data in several 
files / directories.

Thanks,
-- 
Martin 'Helios' Steigerwald - http://www.Lichtvoll.de
GPG: 03B0 0D6C 0040 0710 4AFA  B82F 991B EAAC A599 84C7


Re: Provide a better free space estimate on RAID1

2014-02-07 Thread Frank Kingswood

On 06/02/14 19:54, Goffredo Baroncelli wrote:

Hi Roman

On 02/06/2014 01:45 PM, Roman Mamedov wrote:

There's not a lot of code to include (as my 3-line patch demonstrates), it
could just as easily be removed when it's obsolete. But I did not have any
high hopes of defeating the broken by design philosophy, that's why I didn't
submit it as a real patch for inclusion but rather just as a helpful hint for
people to add to their own kernels if they want this change to happen.


I agree with you about the need for a solution. However, your patch to me 
seems even worse than the actual code.

For example you cannot take into account the mix of data/linear and metadata/dup 
(with the pathological case of small files stored in the metadata chunks), nor 
different profile levels like raid5/6 (or the future raidNxM).
And do not forget the compression...


Just because the solution that Roman provided is not perfect does not 
mean that it is no good at all. For common use cases this will give a 
much better estimate of free space than the current code does, at a 
trivial cost in code size.


It has the benefit of giving a simple estimate without doing any further 
work or disk activity (no need to walk all chunks).


Adding a couple of more lines of code will make it work equally well 
with other RAID levels, maybe that would be more acceptable?


Frank



Re: Provide a better free space estimate on RAID1

2014-02-07 Thread Chris Murphy

On Feb 6, 2014, at 11:08 PM, Roman Mamedov r...@romanrm.net wrote:

  And what
 if I am accessing that partition on a server via a network CIFS/NFS share and
 don't even *have a way to find out* any of that.

That's the strongest argument. And if the user is using 
Explorer/Finder/Nautilus to copy files to the share, I'm pretty sure all three 
determine if there's enough free space in advance of starting the copy. So if 
it thinks there's free space, it will start to copy and then later fail 
midstream when there's no more space. And then the user's copy task is in a 
questionable state as to what's been copied, depending on how the file copies 
are being threaded.

And due to Btrfs metadata requirements even when deleting, we actually need an 
Avail estimate that accounts for phantom future metadata as if it's currently 
in use, otherwise we don't really have the right indication of whether or not 
files can be copied. 


Chris Murphy


Re: Provide a better free space estimate on RAID1

2014-02-07 Thread Kai Krakow
Josef Bacik jba...@fb.com schrieb:

 
 On 02/05/2014 03:15 PM, Roman Mamedov wrote:
 Hello,

 On a freshly-created RAID1 filesystem of two 1TB disks:

 # df -h /mnt/p2/
 Filesystem  Size  Used Avail Use% Mounted on
 /dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2

 I cannot write 2TB of user data to that RAID1, so this estimate is
 clearly misleading. I got tired of looking at the bogus disk free space
 on all my RAID1 btrfs systems, so today I decided to do something about
 this:

  --- fs/btrfs/super.c.orig	2014-02-06 01:28:36.636164982 +0600
  +++ fs/btrfs/super.c	2014-02-06 01:28:58.304164370 +0600
  @@ -1481,6 +1481,11 @@
   	}
   
   	kfree(devices_info);
  +
  +	if (type & BTRFS_BLOCK_GROUP_RAID1) {
  +		do_div(avail_space, min_stripes);
  +	}
  +
   	*free_bytes = avail_space;
   	return 0;
   }
 
 This needs to be more flexible, and also this causes the problem where
 now you show the actual usable amount of space _but_ you are also
 showing twice the amount of used space.  I'm ok with going in this
 direction, but we need to convert everybody over so it works for raid10
 as well and the used values need to be adjusted.  Thanks,

It should show the raw space available. Btrfs also supports compression and 
doesn't try to be smart about how much compressed data would fit in the free 
space of the drive. If one is using RAID1, it's supposed to fill up with a 
rate of 2:1. If one is using compression, it's supposed to fill up with a 
rate of maybe 1:5 for mostly text files.

However, btrfs should probably provide its own df utility (like btrfs fi 
df) which is smart about disk usage and tries to predict the usable space. 
But df should stay with actually showing the _free_ space, not _usable_ 
space (the latter is unpredictable anyway, based on the usage patterns 
applied to the drive, so it can be a rough guess at its best, like looking 
into a crystal ball).

The point here is about free vs. usable space, the first being a known 
number, the latter only being a prediction based on current usage. I'd like 
to stay with free space actually showing raw free space.



Re: Provide a better free space estimate on RAID1

2014-02-06 Thread Roman Mamedov
On Thu, 06 Feb 2014 09:38:15 +0200
Brendan Hide bren...@swiftspirit.co.za wrote:

 This is a known issue: 
 https://btrfs.wiki.kernel.org/index.php/FAQ#Why_does_df_show_incorrect_free_space_for_my_RAID_volume.3F
 Btrfs is still considered experimental 

It's long overdue to start tackling these snags and 'stop hiding behind the
experimental flag' [1], which is also no longer present as of 3.13.

[1] http://www.spinics.net/lists/linux-btrfs/msg30396.html

 this is just one of those caveats we've learned to adjust to.

Sure, but it's hard to argue this particular behavior is anything but broken 
from the user perspective, even if it's "broken by design", and there are a 
number of very smart and future-proof reasons to keep it broken today.

Personally I tired of trying to keep in mind which partitions are btrfs raid1,
and mentally divide the displayed free space by two for those. That's what
computers are for, to keep track of such things.

 The change could work well for now and I'm sure it has been considered. 
 I guess the biggest end-user issue is that you can, at a whim, change 
 the model for new blocks - raid0/5/6,single etc and your value from 5 
 minutes ago is far out from your new value without having written 
 anything or taken up any space. Not a show-stopper problem, really.

Changing the allocation profile for new blocks is a serious change you don't do
accidentally, it's about the same importance level as e.g. resizing the
filesystem. And no one is really surprised when the 'df' result changes after
an FS resize.

 The biggest dev issue is that future features will break this behaviour, 

That could be years away.

 such as the per-subvolume RAID profiles you mentioned. It is difficult 
 to motivate including code (for which there's a known workaround) where 
 we know it will be obsoleted.

There's not a lot of code to include (as my 3-line patch demonstrates), it
could just as easily be removed when it's obsolete. But I did not have any
high hopes of defeating the broken by design philosophy, that's why I didn't
submit it as a real patch for inclusion but rather just as a helpful hint for
people to add to their own kernels if they want this change to happen.



Re: Provide a better free space estimate on RAID1

2014-02-06 Thread Goffredo Baroncelli
Hi Roman

On 02/06/2014 01:45 PM, Roman Mamedov wrote:
 On Thu, 06 Feb 2014 09:38:15 +0200
[...]
 
 There's not a lot of code to include (as my 3-line patch demonstrates), it
 could just as easily be removed when it's obsolete. But I did not have any
 high hopes of defeating the broken by design philosophy, that's why I didn't
 submit it as a real patch for inclusion but rather just as a helpful hint for
 people to add to their own kernels if they want this change to happen.


I agree with you about the need for a solution. However, your patch seems to me 
even worse than the actual code.

For example, it cannot take into account the mix of data/single and metadata/DUP 
(with the pathological case of small files stored in the metadata chunks), nor 
different profile levels like raid5/6 (or a future raidNxM).
And do not forget compression...

The situation is very complex. I am inclined to use a different approach.

As you know, btrfs allocates space in chunks. Each chunk has its own ratio 
between the space occupied on disk and the space available to the 
filesystem: for SINGLE the ratio is 1; for DUP/RAID1/RAID10 it is 2; for 
raid5 it is n/(n-1) (where n is the stripe count); for raid6 it is n/(n-2).

Because a filesystem can have chunks with different ratios, we can compute a 
global ratio as the size-weighted average of the per-chunk ratios:

for each chunk i:
    all_chunks_size += chunk_size[i]

for each chunk i:
    global_ratio += chunk_ratio[i] * chunk_size[i] / all_chunks_size

If we assume that this ratio stays roughly constant over the life of the 
filesystem, we can use it to estimate the space available to the user as:

free_space = (all_disks_size - all_chunks_size) / global_ratio


The code above is a simplification, because we should also take into account the 
space still available in each _already_allocated_ chunk.
We could further refine this estimate, taking into account also the total 
file sizes and the space they consume in the chunks (this can differ 
due to compression)

Even though not perfect, it would be a better estimate than the current one. 


BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli (kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Provide a better free space estimate on RAID1

2014-02-06 Thread Josef Bacik


On 02/05/2014 03:15 PM, Roman Mamedov wrote:

Hello,

On a freshly-created RAID1 filesystem of two 1TB disks:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2

I cannot write 2TB of user data to that RAID1, so this estimate is clearly
misleading. I got tired of looking at the bogus disk free space on all my
RAID1 btrfs systems, so today I decided to do something about this:

--- fs/btrfs/super.c.orig	2014-02-06 01:28:36.636164982 +0600
+++ fs/btrfs/super.c	2014-02-06 01:28:58.304164370 +0600
@@ -1481,6 +1481,11 @@
 	}
 
 	kfree(devices_info);
+
+	if (type & BTRFS_BLOCK_GROUP_RAID1) {
+		do_div(avail_space, min_stripes);
+	}
+
 	*free_bytes = avail_space;
 	return 0;
 }


This needs to be more flexible, and it also causes the problem that now you 
show the actual usable amount of space _but_ you are also showing twice the 
amount of used space.  I'm OK with going in this direction, but we need to 
convert everybody over so it works for raid10 as well, and the used values 
need to be adjusted.  Thanks,


Josef


Re: Provide a better free space estimate on RAID1

2014-02-06 Thread Roman Mamedov
On Thu, 06 Feb 2014 20:54:19 +0100
Goffredo Baroncelli kreij...@libero.it wrote:

 I agree with you about the need for a solution. However, your patch seems 
 to me even worse than the actual code.
 
 For example, it cannot take into account the mix of data/single and 
 metadata/DUP (with the pathological case of small files stored in the 
 metadata chunks), nor different profile levels like raid5/6 (or a future 
 raidNxM).
 And do not forget compression...

Every estimate should first and foremost be measured by how precise it is, or
in this case, by how many gigabytes it is wrong. The actual code returns a
result that is pretty much always wrong by 2x; after the patch it will be
within gigabytes of the correct value in the most common use case (data raid1,
metadata raid1, and that's it). Of course that PoC is nowhere near the final
solution. What I can't agree with is the notion that if another option is
somewhat better, but not perfectly ideal, then it's worse than the current
one, even when the current one is absolutely broken.

 The situation is very complex. I am inclined to use a different approach.
 
 As you know, btrfs allocates space in chunks. Each chunk has its own ratio 
 between the space occupied on disk and the space available to the 
 filesystem: for SINGLE the ratio is 1; for DUP/RAID1/RAID10 it is 2; 
 for raid5 it is n/(n-1) (where n is the stripe count); for raid6 
 it is n/(n-2)
 
 Because a filesystem can have chunks with different ratios, we can compute 
 a global ratio as the size-weighted average of the per-chunk ratios

 We could further refine this estimate, taking into account also the total 
 file sizes and the space they consume in the chunks (this can differ 
 due to compression)

I wonder what the performance implications of all that would be. I feel a
simpler approach could work.

-- 
With respect,
Roman




Re: Provide a better free space estimate on RAID1

2014-02-06 Thread Chris Murphy

On Feb 6, 2014, at 9:40 PM, Roman Mamedov r...@romanrm.net wrote:

 On Thu, 06 Feb 2014 20:54:19 +0100
 Goffredo Baroncelli kreij...@libero.it wrote:
 
 I agree with you about the need for a solution. However, your patch seems 
 to me even worse than the actual code.
 
 For example, it cannot take into account the mix of data/single and 
 metadata/DUP (with the pathological case of small files stored in the 
 metadata chunks), nor different profile levels like raid5/6 (or a future 
 raidNxM).
 And do not forget compression...
 
 Every estimate should first and foremost be measured by how precise it is, or
 in this case, by how many gigabytes it is wrong. The actual code returns a
 result that is pretty much always wrong by 2x; after the patch it will be
 within gigabytes of the correct value in the most common use case (data raid1,
 metadata raid1, and that's it). Of course that PoC is nowhere near the final
 solution. What I can't agree with is the notion that if another option is
 somewhat better, but not perfectly ideal, then it's worse than the current
 one, even when the current one is absolutely broken.

Is the glass half empty or is it half full?

From the original post, context is a 2x 1TB raid volume:

Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2

Earlier conventions would have stated Size ~900GB, and Avail ~900GB. But that's 
not exactly true either, is it? It's merely a convention to cut the storage 
available in half, while keeping data file sizes the same as if they were on a 
single device without raid.

On Btrfs the file system Size is reported as the total storage stack size, and 
that's not incorrect. And the amount Avail is likewise not wrong, because that 
space is not otherwise occupied, which is the definition of available. It's 
linguistically consistent; it's just not a familiar convention.

What I don't care for is the fact that btrfs fi df doesn't report total and 
used for raid1; the user has to mentally double the displayed values. I think 
the doubling should already be computed, since that's what total and used mean, 
rather than needing secret-decoder-ring knowledge to understand the situation.

Anyway, there still isn't a terribly good solution for this issue. But I don't 
find the argument that it's absolutely broken very compelling. And I disagree 
that upending Used+Avail=Size as you suggest is a good alternative. How is that 
going to work, by the way?

Your idea:
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  912G   1% /mnt/p2

If I copy 500GB to this file system, what do you propose df shows me? Clearly 
Size stays the same, and Avail presumably becomes 412G. But what does Used go 
to? 500G? Or 1T? And when full, will it say Size 1.8T, Used 900G, Avail 11M? So 
why, at Size 1.8T with only 900G used, is it full? That doesn't make sense. 
Nor does Used increasing at twice the rate Avail goes down.

I also don't think it's useful to fix the problem more than once.



Chris Murphy



Re: Provide a better free space estimate on RAID1

2014-02-06 Thread Roman Mamedov
On Thu, 6 Feb 2014 22:30:46 -0700
Chris Murphy li...@colorremedies.com wrote:

 From the original post, context is a 2x 1TB raid volume:
 
 Filesystem  Size  Used Avail Use% Mounted on
 /dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2
 
 Earlier conventions would have stated Size ~900GB, and Avail ~900GB. But 
 that's not exactly true either, is it?

Much better, and matching user expectations of how RAID1 should behave,
without a major gotcha blowing up in their face the first minute they try it
out. In fact, the next step I planned was finding out how to adjust Size and
Used as well on all my machines, to show what you just mentioned. I get that
btrfs is special and its RAID1 is not the usual RAID1 either, but that's
not a good reason to break the 'df' behavior; do whatever you want in
'btrfs fi df', but if I'm not mistaken, the UNIX 'df' was always about user
data: how much of my data I have already stored on this partition, and how much
more can I store. If that's not possible to tell exactly, then try to be
reasonably close to the truth, not deliberately off by 2x.

 On Btrfs ...the amount Avail is likewise not wrong, because that space is not 
 otherwise occupied, which is the definition of available.

That's not a definition of available that's directly useful to anyone; it's
rather a filesystem-designer-level implementation detail, if anything.

What usually interests me is: I have a 100 GB file, can I fit it on this
filesystem, yes/no? Sure, let's find out, just check 'df'. Oh wait, not so
fast, let's remember, was this btrfs? Is that the one with RAID1 or not?...
And what if I am accessing that partition on a server via a network CIFS/NFS
share and don't even *have a way to find out* any of that?


-- 
With respect,
Roman




Provide a better free space estimate on RAID1

2014-02-05 Thread Roman Mamedov
Hello,

On a freshly-created RAID1 filesystem of two 1TB disks:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2

I cannot write 2TB of user data to that RAID1, so this estimate is clearly
misleading. I got tired of looking at the bogus disk free space on all my
RAID1 btrfs systems, so today I decided to do something about this:

--- fs/btrfs/super.c.orig	2014-02-06 01:28:36.636164982 +0600
+++ fs/btrfs/super.c	2014-02-06 01:28:58.304164370 +0600
@@ -1481,6 +1481,11 @@
 	}
 
 	kfree(devices_info);
+
+	if (type & BTRFS_BLOCK_GROUP_RAID1) {
+		do_div(avail_space, min_stripes);
+	}
+
 	*free_bytes = avail_space;
 	return 0;
 }


After:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  912G   1% /mnt/p2

Until per-subvolume RAID profiles are implemented, this estimate will be
correct, and even after, it should be closer to the truth than assuming the
user will fill their RAID1 FS only with subvolumes of single or raid0 profiles.

If anyone likes, feel free to reimplement my PoC patch in a better way, e.g.
integrate it into the calculation 'while' block of that function immediately
before it (whose logic I couldn't yet grasp, as it lacks comments), rather
than just tacking it onto the tail.

-- 
With respect,
Roman




Re: Provide a better free space estimate on RAID1

2014-02-05 Thread Brendan Hide

On 2014/02/05 10:15 PM, Roman Mamedov wrote:

Hello,

On a freshly-created RAID1 filesystem of two 1TB disks:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2

I cannot write 2TB of user data to that RAID1, so this estimate is clearly
misleading. I got tired of looking at the bogus disk free space on all my
RAID1 btrfs systems, so today I decided to do something about this:

...

After:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  912G   1% /mnt/p2

Until per-subvolume RAID profiles are implemented, this estimate will be
correct, and even after, it should be closer to the truth than assuming the
user will fill their RAID1 FS only with subvolumes of single or raid0 profiles.

This is a known issue: 
https://btrfs.wiki.kernel.org/index.php/FAQ#Why_does_df_show_incorrect_free_space_for_my_RAID_volume.3F


Btrfs is still considered experimental - this is just one of those 
caveats we've learned to adjust to.


The change could work well for now and I'm sure it has been considered. 
I guess the biggest end-user issue is that you can, at a whim, change 
the model for new blocks - raid0/5/6, single, etc. - and your value from 5 
minutes ago is far off from your new value, without having written 
anything or taken up any space. Not a show-stopper problem, really.


The biggest dev issue is that future features will break this behaviour, 
such as the per-subvolume RAID profiles you mentioned. It is difficult 
to justify including code (for which there's a known workaround) when 
we know it will be obsoleted.


--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
