Re: [zfs-discuss] Data size grew.. with compression on

2009-04-17 Thread Robert Milkowski
Hello Will,

Monday, April 13, 2009, 6:44:47 PM, you wrote:

WM On Mon, Apr 13, 2009 at 07:03, Robert Milkowski mi...@task.gda.pl wrote:
 Hello Daniel,

 Thursday, April 9, 2009, 3:35:07 PM, you wrote:

 DR Jonathan schrieb:
 OpenSolaris Forums wrote:
 if you have a snapshot of your files and rsync the same files again,
 you need to use --inplace rsync option , otherwise completely new
 blocks will be allocated for the new files. that`s because rsync will
 write entirely new file and rename it over the old one.

 ZFS will allocate new blocks either way

 DR No it won't. --inplace doesn't rewrite blocks identical on source and
 DR target but only blocks which have been changed.

 Yes, it will. Inplace in rsync has nothing to do with how ZFS works.
WM But it has big consequences on how rsync uses the file system, and
WM thus big consequences on how ZFS behaves.  --inplace means rsync walks
WM through the file on the receiving end until it finds a mismatch, and
WM only then does it write new blocks to disk.

agree.

btw: what I meant by "Yes, it will" is that every time rsync modifies
any part of a file, zfs will allocate new fs blocks for those modifications
- regardless of whether --inplace was used or not. However, --inplace is
more effective on zfs+snapshots, as making a new full copy of the file is avoided.


 Now with --inplace you're telling rsync to overwrite any changed blocks
 directly in the original file instead of making a full copy of the
 file. Every time you overwrite some data, zfs will allocate new blocks
 only for those blocks and keep the original blocks as long as they are
 referenced by at least one snapshot.
WM Exactly.  But the consequence of this is that with no --inplace,
WM rsync+snapshots balloon space usage under ZFS, and with --inplace you
WM don't get that behavior.

But it's not specific to zfs - it will happen one way or
another with any file system that supports snapshots (unless you have
a fs with built-in dedup).

Nevertheless, I of course agree that --inplace makes sense when updating
relatively small portions of files.


-- 
Best regards,
 Robert Milkowski
   http://milek.blogspot.com

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Data size grew.. with compression on

2009-04-13 Thread Robert Milkowski
Hello Daniel,

Thursday, April 9, 2009, 3:35:07 PM, you wrote:

DR Jonathan schrieb:
 OpenSolaris Forums wrote:
 if you have a snapshot of your files and rsync the same files again,
 you need to use --inplace rsync option , otherwise completely new
 blocks will be allocated for the new files. that`s because rsync will
 write entirely new file and rename it over the old one.
 
 ZFS will allocate new blocks either way

DR No it won't. --inplace doesn't rewrite blocks identical on source and 
DR target but only blocks which have been changed.

Yes, it will. Inplace in rsync has nothing to do with how ZFS works.


DR I use rsync to synchronize a directory with a few large files (each up
DR to 32 GB). Data normally gets appended to one file until it reaches the
DR size limit of 32 GB. Before I used --inplace a snapshot needed on 
DR average ~16 GB. Now with --inplace it is just a few kBytes.


That's because without --inplace rsync will copy the file first, then
apply changes to it, and if successful will replace the old file. So if
the old file is still referenced by a snapshot, you will end up with a new
copy of the file and the old copy still being kept in the pool.

Now with --inplace you're telling rsync to overwrite any changed blocks
directly in the original file instead of making a full copy of the
file. Every time you overwrite some data, zfs will allocate new blocks
only for those blocks and keep the original blocks as long as they are
referenced by at least one snapshot.
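The two cases can be sketched with a toy copy-on-write model (illustrative Python only, not ZFS internals; block IDs here are just counters): rewriting the whole file allocates new blocks for every record, while an in-place update allocates new blocks only for the changed records, with the snapshot pinning the old ones either way.

```python
import itertools

def rewrite_whole_file(nblocks, alloc):
    # no --inplace: rsync writes a brand-new file, then renames it over
    return [next(alloc) for _ in range(nblocks)]

def update_in_place(blocks, changed, alloc):
    # --inplace: only overwritten records get fresh blocks (COW)
    return [next(alloc) if i in changed else b for i, b in enumerate(blocks)]

def demo():
    alloc = itertools.count()
    original = [next(alloc) for _ in range(1000)]   # a 1000-block file
    snapshot = list(original)                       # snapshot pins these

    full = rewrite_whole_file(len(original), alloc)
    inpl = update_in_place(original, changed={3, 42}, alloc=alloc)

    new_full = len(set(full) - set(snapshot))       # blocks the snapshot can't share
    new_inpl = len(set(inpl) - set(snapshot))
    return new_full, new_inpl

print(demo())   # (1000, 2): full rewrite vs. two changed records
```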


-- 
Best regards,
 Robert Milkowski
   http://milek.blogspot.com



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-13 Thread Bob Friesenhahn

On Mon, 13 Apr 2009, Robert Milkowski wrote:


That's because without --inplace rsync will copy the file first, then
apply changes to it, and if successful will replace the old file. So if
the old file is still referenced by a snapshot, you will end up with a new
copy of the file and the old copy still being kept in the pool.

Now with --inplace you're telling rsync to overwrite any changed blocks
directly in the original file instead of making a full copy of the
file. Every time you overwrite some data, zfs will allocate new blocks
only for those blocks and keep the original blocks as long as they are
referenced by at least one snapshot.


It should be noted that using --inplace without also employing zfs 
snapshots makes rsync almost useless as a backup mechanism.  If there 
is a problem with reading all or part of the original file, then rsync 
is likely to destroy the backup file as well.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] Data size grew.. with compression on

2009-04-13 Thread Will Murnane
On Mon, Apr 13, 2009 at 07:03, Robert Milkowski mi...@task.gda.pl wrote:
 Hello Daniel,

 Thursday, April 9, 2009, 3:35:07 PM, you wrote:

 DR Jonathan schrieb:
 OpenSolaris Forums wrote:
 if you have a snapshot of your files and rsync the same files again,
 you need to use --inplace rsync option , otherwise completely new
 blocks will be allocated for the new files. that`s because rsync will
 write entirely new file and rename it over the old one.

 ZFS will allocate new blocks either way

 DR No it won't. --inplace doesn't rewrite blocks identical on source and
 DR target but only blocks which have been changed.

 Yes, it will. Inplace in rsync has nothing to do with how ZFS works.
But it has big consequences on how rsync uses the file system, and
thus big consequences on how ZFS behaves.  --inplace means rsync walks
through the file on the receiving end until it finds a mismatch, and
only then does it write new blocks to disk.

 Now with --inplace you're telling rsync to overwrite any changed blocks
 directly in the original file instead of making a full copy of the
 file. Every time you overwrite some data, zfs will allocate new blocks
 only for those blocks and keep the original blocks as long as they are
 referenced by at least one snapshot.
Exactly.  But the consequence of this is that with no --inplace,
rsync+snapshots balloon space usage under ZFS, and with --inplace you
don't get that behavior.

Perhaps ZFS could do some very simplistic de-dup here: it has the
B-tree entry for the file in question when it goes to overwrite a
piece of it, so it could calculate the checksum of the new block and
see if it matches the checksum of the block it is overwriting.  This
is a much simpler case than the global de-dup that's in the works:
only one checksum needs to be compared to one other, versus the
needle-in-a-haystack problem of dedup across the whole pool.
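A sketch of that idea (illustrative Python only; `write_block` is a hypothetical name, and real ZFS block pointers are not a dict): compare the new data's checksum with the checksum already stored for the block being overwritten, and skip the allocation when they match.

```python
import hashlib

def write_block(blocks, checksums, idx, data):
    """Return True if a new block would be allocated, False if skipped."""
    digest = hashlib.sha256(data).digest()
    if checksums.get(idx) == digest:
        return False                # identical content: keep the old block
    blocks[idx] = data              # COW would allocate a fresh block here
    checksums[idx] = digest
    return True

blocks, checksums = {}, {}
print(write_block(blocks, checksums, 0, b"hello"))   # True: first write
print(write_block(blocks, checksums, 0, b"hello"))   # False: same data, deduped
print(write_block(blocks, checksums, 0, b"world"))   # True: data changed
```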

Will


Re: [zfs-discuss] Data size grew.. with compression on

2009-04-13 Thread Daniel Rock

Will Murnane schrieb:

Perhaps ZFS could do some very simplistic de-dup here: it has the
B-tree entry for the file in question when it goes to overwrite a
piece of it, so it could calculate the checksum of the new block and
see if it matches the checksum for the block it is overwriting.  This
is a much simpler case than the global de-dup that's in the works:
only one checksum needs to be compared to one other, compared to the
needle-in-haystack problem of dedup on the whole pool.


Beware that the default checksum algorithm is not cryptographically 
safe. You can easily generate blocks with different content but the same 
checksum.
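The failure mode is easy to demonstrate with a toy checksum (a plain byte sum here, even weaker than fletcher2, but the same class of problem): distinct blocks can collide, so a checksum-only comparison could silently treat changed data as unchanged. A collision-resistant hash (ZFS supports `checksum=sha256`) closes that hole.

```python
import hashlib

def weak_sum(data):
    # Toy non-cryptographic checksum: an order-insensitive byte sum.
    # Stands in for a weak algorithm; it is NOT fletcher2 itself.
    return sum(data) & 0xFFFFFFFF

a, b = b"block-AB", b"block-BA"       # different content, permuted bytes
print(weak_sum(a) == weak_sum(b))     # True: the weak checksum collides
print(hashlib.sha256(a).digest() ==
      hashlib.sha256(b).digest())     # False: sha256 tells them apart
```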



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-10 Thread Harry Putnam
David Magda dma...@ee.ryerson.ca writes:

 On Apr 7, 2009, at 16:43, OpenSolaris Forums wrote:

 if you have a snapshot of your files and rsync the same files again,
 you need to use --inplace rsync option , otherwise completely new
 blocks will be allocated for the new files. that`s because rsync
 will write entirely new file and rename it over the old one.


 not sure if this applies here, but i think it`s worth mentioning and
 not obvious.

 With ZFS new blocks will always be allocated: it's copy-on-write (COW)
 file system.

So who is right here... Daniel Rock says he can see on disk that it
doesn't work that way... that is, only a small amount of space is taken
when rsyncing this way.
See his post:

  From: Daniel Rock sola...@deadcafe.de
  Subject: Re: Data size grew.. with compression on
  Newsgroups: gmane.os.solaris.opensolaris.zfs
  To: zfs-discuss@opensolaris.org
  Date: Thu, 09 Apr 2009 16:35:07 +0200
  Message-ID: 49de079b.2040...@deadcafe.de

  [...]

  Jonathan wrote:
   ZFS will allocate new blocks either way
  
  Daniel R replied: No it won't. --inplace doesn't rewrite blocks
  identical on source and target but only blocks which have been
  changed.
  
  I use rsync to synchronize a directory with a few large files (each
  up to 32 GB). Data normally gets appended to one file until it
  reaches the size limit of 32 GB. Before I used --inplace a snapshot
  needed on average ~16 GB. Now with --inplace it is just a few
  kBytes.




Re: [zfs-discuss] Data size grew.. with compression on

2009-04-10 Thread Toby Thain


On 10-Apr-09, at 2:03 PM, Harry Putnam wrote:


David Magda dma...@ee.ryerson.ca writes:


On Apr 7, 2009, at 16:43, OpenSolaris Forums wrote:


if you have a snapshot of your files and rsync the same files again,
you need to use --inplace rsync option , otherwise completely new
blocks will be allocated for the new files. that`s because rsync
will write entirely new file and rename it over the old one.




not sure if this applies here, but i think it`s worth mentioning and
not obvious.


With ZFS new blocks will always be allocated: it's a copy-on-write (COW)
file system.


So who is right here...


As far as I can see - the effect of --inplace would be that new  
blocks are allocated for the deltas, not the whole file, so Daniel  
Rock's finding does not contradict OpenSolaris Forums. But in  
either case, COW is involved.


--Toby


Daniel Rock says he can see on disk that it
doesn't work that way... that is only a small amount of space is taken
when rsyncing in this way.
See his post:

  From: Daniel Rock sola...@deadcafe.de
  Subject: Re: Data size grew.. with compression on
  Newsgroups: gmane.os.solaris.opensolaris.zfs
  To: zfs-discuss@opensolaris.org
  Date: Thu, 09 Apr 2009 16:35:07 +0200
  Message-ID: 49de079b.2040...@deadcafe.de

  [...]

  Jonathan wrote:

ZFS will allocate new blocks either way


  Daniel R replied: No it won't. --inplace doesn't rewrite blocks
  identical on source and target but only blocks which have been
  changed.

  I use rsync to synchronize a directory with a few large files (each
  up to 32 GB). Data normally gets appended to one file until it
  reaches the size limit of 32 GB. Before I used --inplace a snapshot
  needed on average ~16 GB. Now with --inplace it is just a few
  kBytes.






Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread Harry Putnam
Jeff Bonwick jeff.bonw...@sun.com writes:

  Yes, I made note of that in my OP on this thread.  But is it enough to
  end up with 8gb of non-compressed files measuring 8gb on
  reiserfs(linux) and the same data showing nearly 9gb when copied to a
  zfs filesystem with compression on.  
 
 whoops.. a hefty exaggeration it only shows about 16mb difference.
 But still since zfs side is compressed, that seems like quite a lot..

 That's because ZFS reports *all* space consumed by a file, including
 all metadata (dnodes, indirect blocks, etc).  For an 8G file stored
 in 128K blocks, there are 8G / 128K = 64K block pointers, each of
 which is 128 bytes, and is two-way replicated (via ditto blocks),
 for a total of 64K * 128 * 2 = 16M.  So this is exactly as expected.

All good info, thanks.  Still, one thing doesn't quite work in your line
of reasoning: the data on the gentoo linux end is uncompressed,
whereas it is compressed on the zfs side.

A number of the files are themselves compressed formats such as jpg,
mpg, avi, and pdf, which aren't going to compress much further,
but thousands of the files are text files (html).  So
compression should show some size reduction.

Your calculation appears to be based on both ends being uncompressed.



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread Jonathan
OpenSolaris Forums wrote:
 if you rsync data to zfs over existing files, you need to take
 something more into account:
 
 if you have a snapshot of your files and rsync the same files again,
 you need to use --inplace rsync option , otherwise completely new
 blocks will be allocated for the new files. that`s because rsync will
 write entirely new file and rename it over the old one.

ZFS will allocate new blocks either way, check here
http://all-unix.blogspot.com/2007/03/zfs-cow-and-relate-features.html
for more information about how Copy-On-Write works.

Jonathan


Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread Daniel Rock

Jonathan schrieb:

OpenSolaris Forums wrote:

if you have a snapshot of your files and rsync the same files again,
you need to use --inplace rsync option , otherwise completely new
blocks will be allocated for the new files. that`s because rsync will
write entirely new file and rename it over the old one.


ZFS will allocate new blocks either way


No it won't. --inplace doesn't rewrite blocks identical on source and 
target but only blocks which have been changed.


I use rsync to synchronize a directory with a few large files (each up 
to 32 GB). Data normally gets appended to one file until it reaches the 
size limit of 32 GB. Before I used --inplace a snapshot needed on 
average ~16 GB. Now with --inplace it is just a few kBytes.



Daniel


Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread Jonathan
Daniel Rock wrote:
 Jonathan schrieb:
 OpenSolaris Forums wrote:
 if you have a snapshot of your files and rsync the same files again,
 you need to use --inplace rsync option , otherwise completely new
 blocks will be allocated for the new files. that`s because rsync will
 write entirely new file and rename it over the old one.

 ZFS will allocate new blocks either way
 
 No it won't. --inplace doesn't rewrite blocks identical on source and
 target but only blocks which have been changed.
 
 I use rsync to synchronize a directory with a few large files (each up
 to 32 GB). Data normally gets appended to one file until it reaches the
 size limit of 32 GB. Before I used --inplace a snapshot needed on
 average ~16 GB. Now with --inplace it is just a few kBytes.

It appears I may have misread the initial post.  I don't really know how
I misread it, but I think I missed the snapshot portion of the message
and got confused.  I now understand the interaction between snapshots,
rsync, and --inplace being discussed.

My apologies,
Jonathan


Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread Greg Mason

Harry,

ZFS will only compress data if it is able to gain more than 12% of space 
by compressing the data (I may be wrong on the exact percentage). If ZFS 
can't get at least that 12% compression, it doesn't bother and will 
just store the block uncompressed.


Also, the default ZFS compression algorithm isn't gzip, so you aren't 
going to get the greatest compression possible, but it is quite fast.


Depending on the type of data, it may not compress well at all, leading 
ZFS to store that data completely uncompressed.
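For reference, the threshold Greg is recalling is 12.5% (one eighth of the block). A hedged sketch of the store-or-not decision, using zlib as a stand-in for ZFS's default lzjb:

```python
import random
import zlib

BLOCK = 128 * 1024   # default recordsize

def stored_size(block):
    # Sketch of the ZFS rule: keep the compressed copy only if it saves
    # at least 1/8 (12.5%) of the block; otherwise store it uncompressed.
    compressed = zlib.compress(block)       # stand-in for lzjb
    if len(compressed) <= len(block) - len(block) // 8:
        return len(compressed)
    return len(block)

html = (b"<p>hello world</p>" * 8192)[:BLOCK]   # text compresses well
noise = random.Random(0).randbytes(BLOCK)       # jpg/mpg-like data doesn't

print(stored_size(html) < BLOCK)     # True: stored compressed
print(stored_size(noise) == BLOCK)   # True: stored raw, gain under 12.5%
```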


-Greg



All good info thanks.  Still one thing doesn't quite work in your line
of reasoning.   The data on the gentoo linux end is uncompressed.
Whereas it is compressed on the zfs side.

A number of the files are themselves compressed formats such as jpg
mpg avi pdf maybe a few more, which aren't going to compress further
to speak of, but thousands of the files are text files (html).  So
compression should show some downsize.

Your calculation appears to be based on both ends being uncompressed.



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread reader
Greg Mason gma...@msu.edu writes:

 Harry,

 ZFS will only compress data if it is able to gain more than 12% of
 space by compressing the data (I may be wrong on the exact
 percentage). If ZFS can't get at least that 12% compression, it
 doesn't bother and will just store the block uncompressed.

 Also, the default ZFS compression algorithm isn't gzip, so you aren't
 going to get the greatest compression possible, but it is quite fast.

 Depending on the type of data, it may not compress well at all,
 leading ZFS to store that data completely uncompressed.

Thanks for another little addition to my knowledge of zfs.  Good stuff
to know.



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread reader
Jonathan jonat...@kc8onw.net writes:

 It appears I may have misread the initial post.  I don't really know how
 I misread it, but I think I missed the snapshot portion of the message
 and got confused.  I understand the interaction between snapshots,
 rsync, and --inplace being discussed now.

I don't think you did misread it. The initial post had nothing to do
with snapshots.  It had only to do with a single run of rsync from a
linux box to a zfs filesystem and noticing that the data had grown even
though the zfs filesystem has compression turned on.

I'm not sure how snapshots crept in here.. but I'm interested to know
more about the interaction with rsync in the case of snapshots.

It was a post authored by OpenSolaris Forums
(Message-ID: 1811927823.191239282659293.javamail.tweb...@sf-app2)
that first mentioned snapshots.



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-09 Thread David Magda

On Apr 7, 2009, at 16:43, OpenSolaris Forums wrote:

if you have a snapshot of your files and rsync the same files again,  
you need to use --inplace rsync option , otherwise completely new  
blocks will be allocated for the new files. that`s because rsync  
will write entirely new file and rename it over the old one.




not sure if this applies here, but i think it`s worth mentioning and  
not obvious.


With ZFS new blocks will always be allocated: it's a copy-on-write (COW)
file system.




Re: [zfs-discuss] Data size grew.. with compression on

2009-04-08 Thread Richard Elling

Harry Putnam wrote:

Robert Milkowski mi...@task.gda.pl writes:

  

If a block doesn't compress by more than 12.5% it won't be
compressed at all. Also, in zfs you need extra space for checksums, etc.

How did the OP come up with how much data is being used?



OP, just used `du -sh' at both ends of the transfer.  On origin end it
is gentoo Linux running reiserfs filesystem
  


The size allocated is dependent on file system features. For example, a
zero-filled file named zeros copied to 3 different file systems shows:

$ du -sh /A/zeros
0K /A/zeros
$ du -sh /B/zeros
16K /B/zeros
$ du -sh /C/zeros
32K /C/zeros

Which is correct? All of them :-)

File system   zfs options
-----------   -------------------------
A             compression=on
B             compression=off, copies=1
C             compression=off, copies=2

-- richard



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-08 Thread Harry Putnam
Richard Elling richard.ell...@gmail.com writes:

 Harry Putnam wrote:
 Robert Milkowski mi...@task.gda.pl writes:

   
 If a block doesn't compress by more than 12.5% it won't be
 compressed at all. Also, in zfs you need extra space for checksums, etc.

 How did the OP come up with how much data is being used?
 

 OP, just used `du -sh' at both ends of the transfer.  On origin end it
 is gentoo Linux running reiserfs filesystem
   

 The size allocated is dependent on file system features. For example, a
 zero-filled file named zeros copied to 3 different file systems shows:

 $ du -sh /A/zeros
 0K /A/zeros
 $ du -sh /B/zeros
 16K /B/zeros
 $ du -sh /C/zeros
 32K /C/zeros

Yes, I made note of that in my OP on this thread.  But is it enough to
explain 8gb of non-compressed files measuring 8gb on
reiserfs (linux) while the same data shows nearly 9gb when copied to a
zfs filesystem with compression on?



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-08 Thread Harry Putnam
Harry Putnam rea...@newsguy.com writes:

 Richard Elling richard.ell...@gmail.com writes:

 Harry Putnam wrote:
 Robert Milkowski mi...@task.gda.pl writes:

   
 If a block doesn't compress by more than 12.5% it won't be
 compressed at all. Also, in zfs you need extra space for checksums, etc.

 How did the OP come up with how much data is being used?
 

 OP, just used `du -sh' at both ends of the transfer.  On origin end it
 is gentoo Linux running reiserfs filesystem
   

 The size allocated is dependent on file system features. For example, a
 zero-filled file named zeros copied to 3 different file systems shows:

 $ du -sh /A/zeros
 0K /A/zeros
 $ du -sh /B/zeros
 16K /B/zeros
 $ du -sh /C/zeros
 32K /C/zeros

 Yes, I made note of that in my OP on this thread.  But is it enough to
 end up with 8gb of non-compressed files measuring 8gb on
 reiserfs(linux) and the same data showing nearly 9gb when copied to a
 zfs filesystem with compression on.  

whoops.. a hefty exaggeration - it only shows about a 16mb difference.
But still, since the zfs side is compressed, that seems like quite a lot..



Re: [zfs-discuss] Data size grew.. with compression on

2009-04-08 Thread Jeff Bonwick
  Yes, I made note of that in my OP on this thread.  But is it enough to
  end up with 8gb of non-compressed files measuring 8gb on
  reiserfs(linux) and the same data showing nearly 9gb when copied to a
  zfs filesystem with compression on.  
 
 whoops.. a hefty exaggeration it only shows about 16mb difference.
 But still since zfs side is compressed, that seems like quite a lot..

That's because ZFS reports *all* space consumed by a file, including
all metadata (dnodes, indirect blocks, etc).  For an 8G file stored
in 128K blocks, there are 8G / 128K = 64K block pointers, each of
which is 128 bytes, and is two-way replicated (via ditto blocks),
for a total of 64K * 128 * 2 = 16M.  So this is exactly as expected.
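Jeff's arithmetic checks out; an illustrative recomputation:

```python
# Recomputing Jeff's metadata estimate: an 8 GiB file in 128 KiB blocks
# needs 64 Ki block pointers; each is 128 bytes and two-way replicated
# via ditto blocks, for about 16 MiB of metadata.
GiB, KiB, MiB = 2**30, 2**10, 2**20

pointers = (8 * GiB) // (128 * KiB)   # 65536 block pointers
metadata = pointers * 128 * 2         # 128 B each, two copies
print(pointers, metadata // MiB)      # 65536 16
```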

Jeff


Re: [zfs-discuss] Data size grew.. with compression on

2009-04-02 Thread Harry Putnam
Robert Milkowski mi...@task.gda.pl writes:

 If a block doesn't compress by more than 12.5% it won't be
 compressed at all. Also, in zfs you need extra space for checksums, etc.

 How did the OP come up with how much data is being used?

OP (me) just used `du -sh' at both ends of the transfer.  On the origin end it
is gentoo Linux running the reiserfs filesystem.

The receiver end is Osol.11 110 using zfs, so there I used ../gnu/bin/du -sh



[zfs-discuss] Data size grew.. with compression on

2009-03-30 Thread Harry Putnam
I rsynced an 11gb pile of data from a remote linux machine to a zfs
filesystem with compression turned on.

The data appears to have grown in size rather than been compressed.

Many, even most, of the files are formats that are already compressed,
such as mpg, jpg, avi, and several others.  But also many text files
(*.html) are in there.  So I didn't expect much compression, but I also
didn't expect the size to grow.

I realize these are different filesystems that may report
differently.  Reiserfs on the linux machine and zfs on osol.

in bytes:

 Osol: 11542196307
linux: 11525114469
------------------
 diff:    17081838

Or (if I got the math right) about 16.29 MB bigger on the zfs side
with compression on.
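The arithmetic above checks out:

```python
# Difference between the two du totals, in bytes and binary megabytes.
osol, linux = 11542196307, 11525114469
diff = osol - linux
print(diff, round(diff / 2**20, 2))   # 17081838 16.29
```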




Re: [zfs-discuss] Data size grew.. with compression on

2009-03-30 Thread Brad Plecs

I've run into this too... I believe the issue is that the block
size/allocation unit size in ZFS is much larger than the default size
on older filesystems (ufs, ext2, ext3).

The result is that if you have lots of small files smaller than the
block size, they take up more total space on the filesystem because
they occupy at least the block size amount.

See the 'recordsize' ZFS filesystem property, though re-reading the
man pages, I'm not 100% sure that tuning this property will have the
intended effect.
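The per-file rounding effect can be put in numbers with a simple worst-case model (each file occupies a whole number of fixed-size blocks; note that ZFS actually sizes a small file's single block to fit the data, so treat this as an upper bound, not exact ZFS behavior):

```python
import math

def allocated(file_sizes, blocksize):
    # Worst-case allocation: every file rounds up to whole blocks.
    return sum(math.ceil(s / blocksize) * blocksize for s in file_sizes)

sizes = [3000] * 10_000                 # ten thousand small html-ish files
print(allocated(sizes, 4096))           # 40960000 bytes with 4 KiB blocks
print(allocated(sizes, 128 * 1024))     # 1310720000 bytes with 128 KiB
```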

BP 


 I rsynced an 11gb pile of data from a remote linux machine to a zfs
 filesystem with compression turned on.
 
 The data appears to have grown in size rather than been compressed.
 
 Many, even most of the files are formats that are already compressed,
 such as mpg jpg avi and several others.  But also many text files
 (*.html) are in there.  So didn't expect much compression but also
 didn't expect the size to grow.
 
 I realize these are different filesystems that may report
 differently.  Reiserfs on the linux machine and zfs on osol.
 
 in bytes:
 
  Osol:11542196307
 linux:11525114469
 =
  17081838
 
 Or (If I got the math right) about  16.29 MB bigger on the zfs side
 with compression on.
 
 

-- 
bpl...@cs.umd.edu


Re: [zfs-discuss] Data size grew.. with compression on

2009-03-30 Thread Jeff Bonwick
Right.

Another difference to be aware of is that ZFS reports the total
space consumed, including space for metadata -- typically around 1%.
Traditional filesystems like ufs and ext2 preallocate metadata and
don't count it as using space.  I don't know how reiserfs does its
bookkeeping, but I wouldn't be surprised if it followed that model.

Jeff

On Mon, Mar 30, 2009 at 02:57:31PM -0400, Brad Plecs wrote:
 
 I've run into this too... I believe the issue is that the block
 size/allocation unit size in ZFS is much larger than the default size
 on older filesystems (ufs, ext2, ext3).
 
 The result is that if you have lots of small files smaller than the
 block size, they take up more total space on the filesystem because
 they occupy at least the block size amount.
 
 See the 'recordsize' ZFS filesystem property, though re-reading the
 man pages, I'm not 100% sure that tuning this property will have the
 intended effect.
 
 BP 
 
 
  I rsynced an 11gb pile of data from a remote linux machine to a zfs
  filesystem with compression turned on.
  
  The data appears to have grown in size rather than been compressed.
  
  Many, even most of the files are formats that are already compressed,
  such as mpg jpg avi and several others.  But also many text files
  (*.html) are in there.  So didn't expect much compression but also
  didn't expect the size to grow.
  
  I realize these are different filesystems that may report
  differently.  Reiserfs on the linux machine and zfs on osol.
  
  in bytes:
  
   Osol:11542196307
  linux:11525114469
  =
   17081838
  
  Or (If I got the math right) about  16.29 MB bigger on the zfs side
  with compression on.
  
  
 
 -- 
 bpl...@cs.umd.edu