I stand corrected.  Your script, with a check for a link count of
one, will compress those files to save space.  But you would still
want to search only the older trees, not the 'most recent' one,
because you would not want to compress a file that would normally
be linked to on the next pass.

I hope I don't get yelled at for top posting....

Cheers!

--
Steve Herber    [email protected]                work: 206-221-7262
Software Engineer, UW Medicine, IT Services     home: 425-454-2399

On Wed, 14 Oct 2009, Mark Foster wrote:

Steve Herber wrote:
I think this recommendation is counterproductive.

The purpose of rsnapshot is to have directories for each snapshot
time, linked together with hard links - so each file name in each
directory is just a pointer to a single inode for the unchanged file.
Run an ls -li on some rsnapshot files and you will see link counts
matching each unchanged snapshot.  Compressing any one directory
entry creates the corresponding compressed file, but with a new,
unique inode, breaking the chain of hard links to the original inode
and taking up more space than before.  And if you happen to compress
the 'most recent' directory structure that rsync uses as its previous
tree, your next rsync will not find the pre-existing files and will
have to copy them over again, taking up even more space.

A ha! You think I don't understand how rsnapshot works but I do.
OTOH, I'm not convinced you understand how what I'm proposing works.
The compound find statement only considers files with one link. Those
are the "delta" I was talking about. It's just the files with only
one link that are safe to compress within the rsnapshot tree.
Furthermore, doing so in hourly.1 ensures it happens early and
persists as time goes on, but doesn't affect the integrity of
hourly.0, where the rsync overlay happens.
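Sketched on a throwaway tree (the layout mimics an rsnapshot root,
but the directory and file names are assumptions for the demo; real
use would point find at the hourly.1 under your rsnapshot_root):

```shell
root=$(mktemp -d)
mkdir -p "$root/hourly.0" "$root/hourly.1"
echo "unchanged" > "$root/hourly.0/shared.txt"
ln "$root/hourly.0/shared.txt" "$root/hourly.1/shared.txt"  # link count 2
echo "new this cycle" > "$root/hourly.1/delta.txt"          # link count 1
# Compress only the single-link "delta" files; hard-linked files and
# anything already gzipped (from earlier passes) are left alone:
find "$root/hourly.1" -type f -links 1 ! -name '*.gz' -exec gzip {} \;
ls "$root/hourly.1"   # delta.txt.gz alongside the untouched shared.txt
```

The `! -name '*.gz'` test keeps the job idempotent, so a compressed
file that survives into later rotations is not gzipped a second time.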

Of course this does have the downside of needing to uncompress files
if a restore is necessary. Depending on your requirements, that may
be a worthwhile trade-off.
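The extra restore step is just a decompression; a sketch on a
throwaway file (the file name is hypothetical, and in real use the
.gz would live inside the snapshot tree):

```shell
d=$(mktemp -d)
echo "precious data" > "$d/report.txt"
gzip "$d/report.txt"                      # as the compression pass would
# Restore: decompress to the destination, leaving the snapshot copy intact:
gunzip -c "$d/report.txt.gz" > "$d/restored-report.txt"
cat "$d/restored-report.txt"              # prints "precious data"
```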

