Re: [BackupPC-users] Compression level

2007-12-05 Thread John Pettitt
Craig Barratt wrote:
 Rich writes:

   
 I don't think BackupPC will update the pool with the smaller file even
 though it knows the source was identical, and some tests I just did
 backing up /tmp seem to agree.  Once a file has been compressed and
 copied into the pool, it is not replaced by more highly compressed
 copies later.  Does anyone know otherwise?
 

 You're right.

 Each file in the pool is only compressed once, at the current
 compression level.  Matching pool files is done by comparing
 uncompressed file contents, not compressed files.

 It's done this way because compression is typically a lot more
 expensive than uncompressing.  Changing the compression level
 will only apply to new additions to the pool.

 To benchmark compression ratios you could remove all the files
 in the pool between runs, but of course you should only do that
 on a test setup, not a production installation.

 Craig
   
The other point to keep in mind is that unless you actually need
compression for disk-space reasons, leaving it off will often be faster
on a CPU-bound server.  Since a script is provided
(BackupPC_compressPool) to compress the pool later, you can safely leave
compression off until you need the disk space.

John



Re: [BackupPC-users] Compression level

2007-12-05 Thread Rich Rauenzahn

John Pettitt wrote:
  

What happens is the newly transferred file is compared against candidates
in the pool with the same hash value, and if one exists it's just
linked; the new file is not compressed.  It seems to me that if you
want to change the compression in the pool, the way to go is to modify
the BackupPC_compressPool script, which compresses an uncompressed pool,
to instead re-compress a compressed pool.  There is some juggling that
goes on to maintain the correct inode in the pool so that all the links
remain valid, and this script already does that.

  
You're sure?  That isn't my observation.  At least with rsync, the files
in the 'new' subdirectory of the backup are already compressed, and I
vaguely recall reading the code and noticing it compresses them during
the transfer (but on the server side, as it receives the data).  After
the whole rsync session is finished, the NewFiles hash list is
compared with the pool.  Identical files (determined by the hash of the
uncompressed data) are then linked to the pool.
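
Roughly, and only as an illustration: in Python rather than BackupPC's
actual Perl, with made-up helper names and a simplified pool layout (the
real code derives the pool filename from a partial MD5 of the
uncompressed data and compares full contents, not just digests), the
link step looks something like this:

    import hashlib
    import os
    import zlib

    def uncompressed_digest(path):
        """MD5 of the uncompressed contents of a zlib-compressed file."""
        decomp = zlib.decompressobj()
        md5 = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(65536), b""):
                md5.update(decomp.decompress(chunk))
        md5.update(decomp.flush())
        return md5.hexdigest()

    def link_or_add(new_file, pool_candidates, pool_dir):
        """Hard-link new_file against a matching pool entry, or add it."""
        digest = uncompressed_digest(new_file)
        for cand in pool_candidates:          # pool files with the same hash value
            if uncompressed_digest(cand) == digest:
                os.unlink(new_file)           # drop the freshly compressed copy ...
                os.link(cand, new_file)       # ... and link to the pool file instead
                return cand
        pool_path = os.path.join(pool_dir, digest)
        os.link(new_file, pool_path)          # first copy becomes the pool entry
        return pool_path

Either way a given piece of content ends up compressed exactly once,
which is consistent with Craig's description.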


If that is all true, then it seems like there is an opportunity to
compare the size of the existing file in the pool with that of the new
file, and keep the smaller one.
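
A hypothetical sketch of that idea (not something BackupPC does today;
keep_smaller and both path arguments are made up): overwrite the pool
file's contents in place so its inode, and therefore every hard link
from older backups, is preserved, then link the backup tree to it as
usual:

    import os
    import shutil

    def keep_smaller(pool_file, new_file):
        """Keep whichever compressed copy is smaller, preserving the pool inode."""
        if os.path.getsize(new_file) < os.path.getsize(pool_file):
            with open(new_file, "rb") as src, open(pool_file, "r+b") as dst:
                shutil.copyfileobj(src, dst)  # overwrite contents, same inode
                dst.truncate()                # trim any leftover bytes
        os.unlink(new_file)
        os.link(pool_file, new_file)          # the backup tree links to the pool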


Rich


Re: [BackupPC-users] Compression level

2007-12-05 Thread John Pettitt
Rich Rauenzahn wrote:


 I know BackupPC will sometimes need to re-transfer a file (for instance,
 if it is a 2nd copy in another location).  I assume it then
 re-compresses it on the re-transfer, as my understanding is that the
 compression happens as the file is written to disk(?).

 Would it make sense to add to the enhancement request list the ability
 to replace the existing file in the pool with the new file's contents if
 the newly compressed/transferred file is smaller?  I assume this could
 be done during the pool check at the end of the backup... then if some
 backups use a higher level of compression, the smallest version of the
 file is always preferred (OK, usually preferred, because the transfer is
 avoided with rsync if the file is in the same place as before).

 Rich

   
What happens is the newly transferred file is compared against candidates
in the pool with the same hash value, and if one exists it's just
linked; the new file is not compressed.  It seems to me that if you
want to change the compression in the pool, the way to go is to modify
the BackupPC_compressPool script, which compresses an uncompressed pool,
to instead re-compress a compressed pool.  There is some juggling that
goes on to maintain the correct inode in the pool so that all the links
remain valid, and this script already does that.
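
To make the inode point concrete, here is a minimal sketch (Python, not
the actual Perl; it also pretends a pool file is a plain zlib stream,
which real BackupPC pool files are not -- another reason to adapt
BackupPC_compressPool rather than roll your own) of re-compressing a
pool file without breaking its hard links:

    import os
    import zlib

    def recompress_in_place(path, level=9):
        """Re-compress a pool file at a higher level without changing its inode."""
        inode_before = os.stat(path).st_ino
        with open(path, "rb") as f:
            data = zlib.decompress(f.read())      # simplified: assumes plain zlib
        smaller = zlib.compress(data, level)
        if len(smaller) >= os.path.getsize(path):
            return False                          # no gain; leave the file alone
        with open(path, "r+b") as f:              # rewrite the same inode in place,
            f.write(smaller)                      # never unlink-and-recreate, so
            f.truncate()                          # every backup's hard link survives
        assert os.stat(path).st_ino == inode_before
        return True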

John




Re: [BackupPC-users] Compression level

2007-12-04 Thread Rich Rauenzahn
[EMAIL PROTECTED] wrote:

 Hello,

 I would like some information about compression levels.

 I am still running several tests on compression and I would like your
 opinion on something: I see very little difference between level 1 and
 level 9.  I thought the difference would be bigger.

 For example, with a directory (1 GB, 1308 files: Excel, Word, PDF,
 BMP, JPG, ZIP, ...):

 at level 9 I get 54.4% compression (original size: 1018.4 MB /
 compressed size: 464.5 MB)
 at level 1 I get 52.8% compression (original size: 1018.4 MB /
 compressed size: 480.5 MB)

 Do you think that's correct / normal?
I'll ask this again:  How are you ensuring that each compression test 
isn't reusing the compressed files that are already in the pool?  What 
is your test methodology?

I don't think BackupPC will update the pool with the smaller file even
though it knows the source was identical, and some tests I just did
backing up /tmp seem to agree.  Once a file has been compressed and
copied into the pool, it is not replaced by more highly compressed
copies later.  Does anyone know otherwise?

Rich
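
For what it's worth, the small gap Romain sees between levels 1 and 9 is
easy to reproduce outside BackupPC, especially on data that is already
compressed (JPG, ZIP, much of PDF).  A quick stand-alone check with
Python's zlib, pointed at whatever sample files you like on the command
line:

    import sys
    import zlib

    # Compare zlib level 1 vs level 9 on the files given on the command line.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            data = f.read()
        s1 = len(zlib.compress(data, 1))
        s9 = len(zlib.compress(data, 9))
        saved = 100.0 * (s1 - s9) / max(s1, 1)
        print(f"{path}: {len(data)} raw, {s1} at level 1, {s9} at level 9 "
              f"({saved:.1f}% smaller at level 9)")

On already-compressed formats the savings at any level are small, so the
difference between levels gets squeezed down to a few percent.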



Re: [BackupPC-users] Compression level

2007-12-04 Thread Craig Barratt
Rich writes:

 I don't think BackupPC will update the pool with the smaller file even
 though it knows the source was identical, and some tests I just did
 backing up /tmp seem to agree.  Once a file has been compressed and
 copied into the pool, it is not replaced by more highly compressed
 copies later.  Does anyone know otherwise?

You're right.

Each file in the pool is only compressed once, at the current
compression level.  Matching pool files is done by comparing
uncompressed file contents, not compressed files.

It's done this way because compression is typically a lot more
expensive than uncompressing.  Changing the compression level
will only apply to new additions to the pool.
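
A tiny demonstration of why that matching still works when the
compression level changes (plain Python zlib; /etc/services is just a
convenient test file, any file will do):

    import hashlib
    import zlib

    with open("/etc/services", "rb") as f:   # any convenient test file
        data = f.read()
    c1 = zlib.compress(data, 1)
    c9 = zlib.compress(data, 9)

    # The compressed streams (and their hashes) generally differ between levels ...
    print(hashlib.md5(c1).hexdigest() == hashlib.md5(c9).hexdigest())
    # ... but the uncompressed contents are identical, which is what gets compared.
    print(zlib.decompress(c1) == zlib.decompress(c9))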

To benchmark compression ratios you could remove all the files
in the pool between runs, but of course you should only do that
on a test setup, not a production installation.

Craig



Re: [BackupPC-users] Compression level

2007-12-04 Thread romain.pichard
Hello,

I'm sorry, I forgot to explain that I delete all the files in the pool
between two tests with different compression levels, and all the links
are cleaned up by BackupPC_nightly, etc.

It's on a test setup, of course.

Thanks a lot for your help.
Regards,

Romain




Craig Barratt [EMAIL PROTECTED] wrote on 05/12/2007 08:00 to Rich
Rauenzahn, cc Romain PICHARD and backuppc-users@lists.sourceforge.net:

Rich writes:

 I don't think BackupPC will update the pool with the smaller file even
 though it knows the source was identical, and some tests I just did
 backing up /tmp seem to agree.  Once a file has been compressed and
 copied into the pool, it is not replaced by more highly compressed
 copies later.  Does anyone know otherwise?

You're right.

Each file in the pool is only compressed once, at the current
compression level.  Matching pool files is done by comparing
uncompressed file contents, not compressed files.

It's done this way because compression is typically a lot more
expensive than uncompressing.  Changing the compression level
will only apply to new additions to the pool.

To benchmark compression ratios you could remove all the files
in the pool between runs, but of course you should only do that
on a test setup, not a production installation.

Craig



 