Hi,
On Mon, Dec 19, 2011 at 12:37:34PM -0600, John Knutson wrote:
> Does anyone out there have any experience using rsync to copy HDF5
> files? I've been trying to use rsync to make back-ups of hdf5 files
> as they grow, but instead of the expected fairly constant time
> required for each update, the rsync time increases as the HDF5 file
> grows. This suggests to me that rsync is re-transferring data
> instead of just transferring differences. That, or as I add data to
> the HDF5 file, changes are being made to numerous locations in the
> file.
>
> I thought maybe the problem was that the time spent doing checksums
> was causing the increase as the files grew in size, but the rsync
> output indicates a linear increase in actual data transferred as
> well, just like the run time.
>
> The files in question contain multiple data sets that are being
> updated, each of which is stored as chunked, compressed data.
I don't think it's much of a surprise that rsync can't produce
small deltas for compressed binary files. If you change just a
single bit in the input, the compressed output before and after
can be radically different, so a "diff" would be huge. It's rather
well known that rsync has problems with these kinds of files;
that's why gzip has an option named '--rsyncable', which makes
it output files that are a bit larger but can be rsynced more
effectively. I don't know how you compress those files; perhaps,
if you use gzip, you can try that option?
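
To check this on your own data, here is a minimal Python sketch of
the effect, not a model of rsync itself. It flips one bit in some
made-up CSV-like data, then counts how many fixed-size blocks of
the old version still occur anywhere in the new version, once for
the raw bytes and once for the zlib-compressed bytes. The helper
names (make_data, reusable_fraction), the 1024-byte block size and
the test data are all assumptions for illustration. How far the
compressed figure drops depends on the data, the zlib version and
where the change falls, so no expected output is given.

```python
import random
import zlib


def make_data(n_lines=20000, seed=0):
    """Build deterministic, compressible CSV-like test data (hypothetical)."""
    rng = random.Random(seed)
    lines = (f"{i},{rng.random():.6f}\n" for i in range(n_lines))
    return "".join(lines).encode("ascii")


def reusable_fraction(old, new, block_size=1024):
    """Fraction of block-aligned chunks of `old` found at any offset in `new`.

    This loosely mimics the idea behind rsync's block matching; real rsync
    uses rolling checksums and picks its own block size.
    """
    last_start = len(old) - block_size
    blocks = [old[i:i + block_size] for i in range(0, last_start + 1, block_size)]
    if not blocks:
        return 0.0
    found = sum(1 for block in blocks if new.find(block) != -1)
    return found / len(blocks)


original = make_data()

# Flip a single bit near the start of the data.
changed = bytearray(original)
changed[100] ^= 0x01
modified = bytes(changed)

raw_fraction = reusable_fraction(original, modified)
packed_fraction = reusable_fraction(zlib.compress(original),
                                    zlib.compress(modified))

print(f"reusable blocks, raw data:        {raw_fraction:.1%}")
print(f"reusable blocks, compressed data: {packed_fraction:.1%}")
```

If the compressed figure comes out much lower than the raw one,
that is the behaviour described above: a tiny change in the input
leaves few unchanged blocks for a delta transfer to reuse.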
Best regards, Jens
--
\ Jens Thoms Toerring ________ [email protected]
\_______________________________ http://toerring.de
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org