Does anyone out there have any experience using rsync to copy HDF5
files? I've been trying to use rsync to make backups of HDF5 files as
they grow, but instead of the roughly constant time I expected for
each update, the rsync time increases as the HDF5 file grows. This
suggests to me that rsync is re-transferring data instead of just
transferring differences. Either that, or adding data to the HDF5 file
is touching numerous locations throughout the file.
I thought maybe the problem was that the time spent computing checksums
was causing the increase as the files grew, but the rsync output
indicates a linear increase in the actual data transferred as well, just
like the run time.
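
For concreteness, here is a minimal sketch of the kind of check I mean
(the paths are placeholders, and it assumes rsync is on the PATH): the
--stats output breaks a transfer down into "Literal data" and "Matched
data", so if the literal bytes grow along with the file, it really is
re-transfer rather than checksum overhead. Note that rsync defaults to
--whole-file for purely local copies, so --no-whole-file is needed to
force the delta algorithm in that case.

    #!/usr/bin/env python3
    # Sketch: run rsync with --stats and pull out the delta-transfer
    # numbers. SRC and DST are placeholder paths.
    import re
    import subprocess

    SRC = "data.h5"          # hypothetical growing HDF5 file
    DST = "backup/data.h5"   # hypothetical backup copy

    result = subprocess.run(
        ["rsync", "--archive", "--no-whole-file", "--stats", SRC, DST],
        capture_output=True, text=True, check=True,
    )

    # --stats prints lines such as "Literal data: 1,234 bytes" and
    # "Matched data: 5,678 bytes" (exact formatting varies by version).
    for label in ("Literal data", "Matched data", "Total file size"):
        m = re.search(rf"{label}:\s*([\d,.]+)", result.stdout)
        if m:
            print(f"{label}: {m.group(1)}")
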
The files in question contain multiple datasets that are being updated,
each of which is stored as chunked, compressed data.
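
In case the write pattern matters, the updates look roughly like the
sketch below (h5py; the dataset names, shapes, and chunk sizes are made
up for illustration): extendable, chunked, gzip-compressed datasets that
get resized and appended to on each update.

    import h5py
    import numpy as np

    # Illustrative append pattern; names, shapes, and chunk sizes are
    # placeholders. Each dataset is extendable along axis 0, chunked,
    # and gzip-compressed.
    with h5py.File("data.h5", "a") as f:
        if "temperature" not in f:
            f.create_dataset(
                "temperature",
                shape=(0, 128),
                maxshape=(None, 128),   # unlimited along the first axis
                chunks=(256, 128),
                compression="gzip",
            )
        dset = f["temperature"]

        new_rows = np.random.random((100, 128))
        dset.resize(dset.shape[0] + new_rows.shape[0], axis=0)
        dset[-new_rows.shape[0]:] = new_rows
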
The only thing I can think of to fiddle with on the rsync end is the
checksum block size, trying to make it closer to the size of the blocks
in the HDF5 file, which is unknown to me at the moment.
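
To find out what those block sizes actually are, something like the
sketch below (again h5py, with an assumed file name) would report each
dataset's chunk shape and its uncompressed size in bytes, which could
then be compared against rsync's --block-size/-B setting; h5ls -v or
h5dump -p -H should show the same layout information from the command
line. Since the chunks are compressed, their on-disk size will be
smaller and variable, so this is only a rough guide.

    import math
    import h5py

    # Print each chunked dataset's chunk shape and uncompressed chunk
    # size in bytes ("data.h5" is a placeholder file name).
    def report_chunks(name, obj):
        if isinstance(obj, h5py.Dataset) and obj.chunks is not None:
            chunk_bytes = math.prod(obj.chunks) * obj.dtype.itemsize
            print(f"{name}: chunks={obj.chunks}, "
                  f"compression={obj.compression}, "
                  f"~{chunk_bytes} bytes/chunk uncompressed")

    with h5py.File("data.h5", "r") as f:
        f.visititems(report_chunks)
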
Alternatively, I could make the files smaller, but that would not be my
first choice, as it would be a major design change.
If anyone has any suggestions as to how to resolve this "creeping
transfer time" issue, I'd appreciate it.