Hi,

On Mon, Dec 19, 2011 at 12:37:34PM -0600, John Knutson wrote:
> Does anyone out there have any experience using rsync to copy HDF5
> files?  I've been trying to use rsync to make back-ups of HDF5 files
> as they grow, but instead of the expected fairly constant time
> required for each update, the rsync time increases as the HDF5 file
> grows.  This suggests to me that rsync is re-transferring data
> instead of just transferring differences.  That, or as I add data to
> the HDF5 file, changes are being made to numerous locations in the
> file.
> 
> I thought maybe the problem was that the time spent doing checksums
> was causing the increase as the files grew in size, but the rsync
> output indicates a linear increase in actual data transferred as
> well, just like the run time.
> 
> The files in question contain multiple data sets that are being
> updated, each of which is stored as chunked, compressed data.

I don't think it's much of a surprise that rsync can't produce
small deltas for compressed binary files. If you change just a
single bit in the input, the compressed output before and after
can be radically different, so a "diff" would be huge. It's rather
well known that rsync has problems with these kinds of files;
that's why gzip has an option named '--rsyncable', which makes
it output files that are a bit larger but can be rsync'ed more
effectively. I don't know how you compress those files, but if
you use gzip, perhaps you can try that option?
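
To see the effect for yourself, here is a rough Python sketch. It is
not rsync's actual algorithm: unmatched_fraction() is a made-up helper
that only compares aligned fixed-size blocks and ignores the rolling
checksum, and the sample records are invented for the demonstration.
It flips one bit in some data and reports what share of blocks no
longer matches, once for the raw bytes and once for the zlib-compressed
bytes. How large the second number gets depends on the data.

```python
import zlib


def unmatched_fraction(old: bytes, new: bytes, block_size: int = 64) -> float:
    """Fraction of aligned blocks in `new` that do not occur in `old`.

    A crude stand-in for rsync's delta step (assumption: fixed, aligned
    blocks only; the real tool also matches blocks at arbitrary offsets).
    """
    old_blocks = {old[i:i + block_size] for i in range(0, len(old), block_size)}
    new_offsets = range(0, len(new), block_size)
    missing = sum(new[i:i + block_size] not in old_blocks for i in new_offsets)
    return missing / max(len(new_offsets), 1)


# Hypothetical sample data: numbered text records, so every block is unique.
original = b"".join(b"record %06d value %06d\n" % (i, i * 7) for i in range(2000))

# Flip a single bit near the start of the data.
changed = bytearray(original)
changed[10] ^= 0x01
modified = bytes(changed)

raw_fraction = unmatched_fraction(original, modified)
packed_fraction = unmatched_fraction(zlib.compress(original), zlib.compress(modified))

print(f"uncompressed: {raw_fraction:.1%} of blocks changed")
print(f"compressed:   {packed_fraction:.1%} of blocks changed")
```

For the raw bytes only the one block containing the flipped bit fails
to match. For the compressed bytes the mismatch can extend well beyond
the point of the change, which is the part rsync would have to resend.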

                          Best regards, Jens
-- 
  \   Jens Thoms Toerring  ________      [email protected]
   \_______________________________      http://toerring.de

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
