Re: btrfs is using 25% more disk than it should

2014-12-23 Thread Zygo Blaxell
On Sat, Dec 20, 2014 at 06:28:22AM -0500, Josef Bacik wrote: We now have two extents with the same bytenr but with different lengths. [...] Then there is the problem of actually returning the free space. Now if we drop all of the refs for an extent we know the space is free and we return

Re: btrfs is using 25% more disk than it should

2014-12-20 Thread Daniele Testa
Ok, so this is what I did:
1. Copied the sparse 315GB file (with 302GB inside) to another server
2. Re-formatted the btrfs partition
3. chattr +C on the parent dir
4. Copied the 315GB file back to the btrfs partition (the file is not sparse any more due to the copying)
This is the end result:

Re: btrfs is using 25% more disk than it should

2014-12-20 Thread Josef Bacik
On 12/20/2014 01:18 AM, Daniele Testa wrote: But I read somewhere that compression should be turned off on mounts that just store large VM-images. Is that wrong? It doesn't really matter frankly. Usually virt images are preallocated with fallocate which means compression doesn't happen as

Re: btrfs is using 25% more disk than it should

2014-12-20 Thread Robert White
On 12/19/2014 01:17 PM, Josef Bacik wrote: tl;dr: Cow means you can in the worst case end up using 2 * filesize - blocksize of data on disk and the file will appear to be filesize. Thanks, Isn't the worst case more like N^log(N) (where N is the file size in blocks) in the pernicious case?

Re: btrfs is using 25% more disk than it should

2014-12-20 Thread Josef Bacik
On 12/20/2014 12:52 AM, Zygo Blaxell wrote: On Fri, Dec 19, 2014 at 04:17:08PM -0500, Josef Bacik wrote: And for your inode you now have this
inode 256, file offset 0, size 4k, offset 0, diskbytenr (123+302g), disklen 4k
inode 256, file offset 4k, size 302g-4k, offset 4k, diskbytenr 123,

Re: btrfs is using 25% more disk than it should

2014-12-20 Thread Josef Bacik
On 12/20/2014 06:23 AM, Robert White wrote: On 12/19/2014 01:17 PM, Josef Bacik wrote: tl;dr: Cow means you can in the worst case end up using 2 * filesize - blocksize of data on disk and the file will appear to be filesize. Thanks, Isn't the worst case more like N^log(N) (where N is the file size in

Re: btrfs is using 25% more disk than it should

2014-12-20 Thread Robert White
On 12/20/2014 03:39 AM, Josef Bacik wrote: On 12/20/2014 06:23 AM, Robert White wrote: On 12/19/2014 01:17 PM, Josef Bacik wrote: tl;dr: Cow means you can in the worst case end up using 2 * filesize - blocksize of data on disk and the file will appear to be filesize. Thanks, Isn't the

Re: btrfs is using 25% more disk than it should

2014-12-20 Thread Robert White
On 12/19/2014 01:10 PM, Josef Bacik wrote: On 12/18/2014 09:59 AM, Daniele Testa wrote: Hey, I am hoping you guys can shed some light on my issue. I know that it's a common question that people see differences in the disk used when running different calculations, but I still think that my

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Phillip Susi
On 12/18/2014 9:59 AM, Daniele Testa wrote: As seen above, I have a 410GB SSD mounted at /opt/drives/ssd. On that partition, I have one single sparse file, taking 302GB of space (max 315GB). The snapshots directory is completely empty. So you

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Daniele Testa
No, I don't have any snapshots or subvolumes. Only that single file. The file has both checksums and datacow on it. I will do chattr +C on the parent dir and re-create the file to make sure all files are marked as nodatacow. Should I also turn off checksums with the mount-flags if this

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Phillip Susi
On 12/19/2014 2:59 PM, Daniele Testa wrote: No, I don't have any snapshots or subvolumes. Only that single file. The file has both checksums and datacow on it. I will do chattr +C on the parent dir and re-create the file to make sure all files

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Josef Bacik
On 12/18/2014 09:59 AM, Daniele Testa wrote: Hey, I am hoping you guys can shed some light on my issue. I know that it's a common question that people see differences in the disk used when running different calculations, but I still think that my issue is weird. root@s4 / # mount /dev/md3 on

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Josef Bacik
On 12/19/2014 02:59 PM, Daniele Testa wrote: No, I don't have any snapshots or subvolumes. Only that single file. The file has both checksums and datacow on it. I will do chattr +C on the parent dir and re-create the file to make sure all files are marked as nodatacow. Should I also turn off

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Josef Bacik
On 12/19/2014 04:10 PM, Josef Bacik wrote: On 12/18/2014 09:59 AM, Daniele Testa wrote: Hey, I am hoping you guys can shed some light on my issue. I know that it's a common question that people see differences in the disk used when running different calculations, but I still think that my

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Phillip Susi
On 12/19/2014 4:15 PM, Josef Bacik wrote: Please God don't turn off checksums. Checksums are tracked in metadata anyway, they won't show up in the data accounting. Our csums are 8 bytes per block, so basic math says you are going to max out
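The "basic math" on checksum overhead can be sketched roughly as follows. The 8 bytes per block figure is taken from the message above; the 4 KiB block size and reading "302GB" as 302 GiB are assumptions borrowed from the 4k example elsewhere in this thread, and `csum_bytes` is a hypothetical helper, not a btrfs interface.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def csum_bytes(data_bytes, block_size=4096, csum_size=8):
    """Estimate checksum metadata for data_bytes of file data.

    block_size is an assumed 4 KiB; csum_size is the 8 bytes per
    block quoted in the thread.
    """
    blocks = -(-data_bytes // block_size)  # round up to whole blocks
    return blocks * csum_size


overhead = csum_bytes(302 * GIB)
print(f"{overhead / MIB:.0f} MiB of checksums for 302 GiB of data")
print(f"ratio: {overhead / (302 * GIB):.4%}")
```

Under these assumptions the checksums come to roughly 0.2% of the data size, and they are stored as metadata, so they cannot account for a 25% difference in data usage.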

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Josef Bacik
On 12/19/2014 04:53 PM, Phillip Susi wrote: On 12/19/2014 4:15 PM, Josef Bacik wrote: Please God don't turn off checksums. Checksums are tracked in metadata anyway, they won't show up in the data accounting. Our csums are 8 bytes per block, so

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Duncan
Daniele Testa posted on Sat, 20 Dec 2014 03:59:42 +0800 as excerpted: The file has both checksums and datacow on it. I will do chattr +C on the parent dir and re-create the file to make sure all files are marked as nodatacow. Should I also turn off checksums with the mount-flags if this

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Duncan
Josef Bacik posted on Fri, 19 Dec 2014 16:17:08 -0500 as excerpted: tl;dr: Cow means you can in the worst case end up using 2 * filesize - blocksize of data on disk and the file will appear to be filesize. Thanks for the tl;dr /and/ the very sensible longer explanation. That's a very nice

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Zygo Blaxell
On Fri, Dec 19, 2014 at 04:17:08PM -0500, Josef Bacik wrote: And for your inode you now have this
inode 256, file offset 0, size 4k, offset 0, diskbytenr (123+302g), disklen 4k
inode 256, file offset 4k, size 302g-4k, offset 4k, diskbytenr 123, disklen 302g
and in your extent tree you
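The extent layout quoted above can be modelled with a small sketch that shows where the "2 * filesize - blocksize" worst case comes from. This is a simplified illustration, not btrfs code: `FileExtent` and `bytes_pinned_on_disk` are hypothetical names, and the model assumes an on-disk extent stays fully allocated for as long as any part of it is still referenced.

```python
from dataclasses import dataclass

K = 1024
G = 1024 ** 3
FILESIZE = 302 * G
BLOCK = 4 * K


@dataclass(frozen=True)
class FileExtent:
    file_offset: int    # where this piece starts in the file
    size: int           # bytes of the file this piece covers
    extent_offset: int  # offset into the on-disk extent
    disk_bytenr: int    # start of the on-disk extent
    disk_len: int       # full length of the on-disk extent


def bytes_pinned_on_disk(extents):
    """Sum the full length of every on-disk extent still referenced."""
    pinned = {e.disk_bytenr: e.disk_len for e in extents}
    return sum(pinned.values())


# One 302g extent written in a single pass.
before = [FileExtent(0, FILESIZE, 0, 123, FILESIZE)]

# After rewriting the first 4k (the two items quoted above): the new
# block lands in a new extent, the old extent is still fully pinned.
after = [
    FileExtent(0, BLOCK, 0, 123 + FILESIZE, BLOCK),
    FileExtent(BLOCK, FILESIZE - BLOCK, BLOCK, 123, FILESIZE),
]

# Worst case: everything except one block has been rewritten, and that
# last block keeps the whole original extent alive.
worst = [
    FileExtent(0, FILESIZE - BLOCK, 0, 123 + FILESIZE, FILESIZE - BLOCK),
    FileExtent(FILESIZE - BLOCK, BLOCK, FILESIZE - BLOCK, 123, FILESIZE),
]

for name, layout in (("before", before), ("after", after), ("worst", worst)):
    print(name, bytes_pinned_on_disk(layout))
```

In this model the file always appears to be 302g, while the space pinned on disk grows from filesize to filesize + 4k after the first overwrite and peaks at 2 * filesize - blocksize.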

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Daniele Testa
But I read somewhere that compression should be turned off on mounts that just store large VM-images. Is that wrong? Btw, I am not pre-allocating space for the images. I use sparse files with: dd if=/dev/zero of=drive.img bs=1 count=1 seek=300G It creates the file in a few ms. Is it better to
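For readers unfamiliar with the dd invocation above, here is a minimal sketch of the same idea: seek past the end of an empty file, write a single byte, and compare the apparent size with the space actually allocated. `make_sparse` is a hypothetical helper, the demo uses 16 MiB rather than 300G, and the allocated figure depends on the filesystem, so no particular value should be expected.

```python
import os
import tempfile


def make_sparse(path, seek_bytes):
    """Write one zero byte at offset seek_bytes, like dd bs=1 count=1 seek=N.

    Returns (apparent size, allocated bytes). st_blocks is counted in
    512-byte units.
    """
    with open(path, "wb") as f:
        f.seek(seek_bytes)
        f.write(b"\0")
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


with tempfile.TemporaryDirectory() as tmp:
    apparent, allocated = make_sparse(
        os.path.join(tmp, "drive.img"), 16 * 1024 * 1024
    )
    print("apparent size:", apparent)
    print("allocated on disk:", allocated)
```

On filesystems that support holes the allocated figure is usually far below the apparent size, which is why the file appears in a few ms and why `ls` and `du` can disagree about it.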

Re: btrfs is using 25% more disk than it should

2014-12-19 Thread Duncan
Daniele Testa posted on Sat, 20 Dec 2014 14:18:31 +0800 as excerpted: Anyways, would disabling CoW (by putting +C on the parent dir) prevent the performance issues and 2*filesize issue? It should, provided you don't then start snapshotting the file (which I don't believe you intend to do but

btrfs is using 25% more disk than it should

2014-12-18 Thread Daniele Testa
Hey, I am hoping you guys can shed some light on my issue. I know that it's a common question that people see differences in the disk used when running different calculations, but I still think that my issue is weird. root@s4 / # mount /dev/md3 on /opt/drives/ssd type btrfs