Does anyone out there have experience using rsync to copy HDF5 files? I've been trying to use rsync to make backups of HDF5 files as they grow, but instead of the fairly constant time I expected for each update, the rsync time increases as the HDF5 file grows. This suggests that rsync is re-transferring data instead of transferring only the differences, or that, as I add data to the HDF5 file, changes are being made in numerous locations throughout the file.

I thought the increase might just be the time spent computing checksums as the files grew, but the rsync output shows the amount of data actually transferred growing linearly as well, just like the run time.
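For reference, this is roughly how the per-run numbers can be checked; a minimal sketch, assuming made-up paths (the --stats flag itself is a standard rsync option):

    #!/usr/bin/env python3
    # Sketch: run rsync with --stats and report how much data was sent
    # "literally" vs. matched against the existing copy on the far end.
    # SRC and DEST are made-up placeholders.
    import re
    import subprocess

    SRC = "/data/growing.h5"
    DEST = "backup-host:/backups/"

    def sync_and_report():
        result = subprocess.run(
            ["rsync", "-av", "--stats", SRC, DEST],
            capture_output=True, text=True, check=True,
        )
        # --stats output includes lines like "Literal data: N bytes"
        # and "Matched data: N bytes"; watching these over successive
        # runs shows whether whole regions are being resent.
        for line in result.stdout.splitlines():
            if re.match(r"\s*(Literal|Matched) data:", line):
                print(line.strip())

    if __name__ == "__main__":
        sync_and_report()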

The files in question contain multiple data sets that are being updated, each of which is stored as chunked, compressed data.
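For concreteness, here is a minimal h5py sketch of the kind of layout I mean; the names, sizes, and dtype are illustrative, not my actual files:

    # An extendable, chunked, gzip-compressed dataset that is appended
    # to over time (illustrative only).
    import numpy as np
    import h5py

    with h5py.File("growing.h5", "a") as f:
        if "samples" not in f:
            f.create_dataset(
                "samples",
                shape=(0,),
                maxshape=(None,),     # unlimited, so the dataset can grow
                chunks=(16384,),      # fixed chunk size, in elements
                compression="gzip",
                dtype="f8",
            )
        dset = f["samples"]
        new = np.random.random(1000)  # stand-in for the new data
        old_len = dset.shape[0]
        dset.resize((old_len + new.size,))
        dset[old_len:] = new
        # Each append rewrites at least the last compressed chunk and
        # updates the chunk index (B-tree) and object header, which sit
        # at other offsets, so the changes are scattered through the file.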

The only thing I can think of to adjust on the rsync end is the checksum block size, to try to match it to the block size used in the HDF5 file, which is unknown to me at the moment.
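If it helps, the chunk shapes can be read back with h5py (or h5dump -p -H) and fed to rsync's --block-size option. A sketch, again with placeholder paths, and with the caveat that compressed chunks vary in size on disk, so matching the nominal chunk size is only approximate:

    # Read each dataset's chunk shape and pass a comparable
    # --block-size to rsync.
    import math
    import subprocess
    import h5py

    SRC = "/data/growing.h5"
    DEST = "backup-host:/backups/"

    chunk_bytes = []
    with h5py.File(SRC, "r") as f:
        def visit(name, obj):
            if isinstance(obj, h5py.Dataset) and obj.chunks:
                chunk_bytes.append(math.prod(obj.chunks) * obj.dtype.itemsize)
        f.visititems(visit)

    # Cap conservatively; some rsync versions limit the maximum block size.
    block = min(chunk_bytes + [128 * 1024])
    subprocess.run(
        ["rsync", "-av", "--stats", f"--block-size={block}", SRC, DEST],
        check=True,
    )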

Alternatively, I could make the files smaller, but that would not be my first choice, as it would be a major design change.

If anyone has suggestions on how to resolve this "creeping transfer time" issue, I'd appreciate it.

