On Tue, Aug 05, 2014 at 03:43:12PM +0200, Ekin Akoglu wrote:
> Dear Alex,
> 
> Thank you for the reply. I think I need to make something clearer. I did
> not cross-compare HDF5 files between the Mac OS X and Debian Linux
> systems. What I meant was that I tried the example I described in my
> previous mail on each of those Unix-like systems, and the data files
> differed on both. To me, this could be a bug, either in the diff tool or in
> HDF5. But I remember that in the past I did not encounter such a problem
> under Git (I think for the releases before 1.8.12). If more
> information is required, I can run trials with earlier HDF5 versions (<=
> 1.8.11) and report the results.

Dear Ekin,

As an HDF5 user, I would not expect HDF5 files with the same data to match
under a tool like diff (or the diff tool of git or any other VCS). HDF5 is a
structured binary format that uses more advanced techniques than plain ASCII
files, which is why we use it. As a consequence, you should not expect
byte-for-byte equality between files, even between files created in the same
way.

Also, recent versions of HDF5 track creation and modification times for the
data. That would certainly cause a difference.
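
To illustrate the two points above, here is a minimal sketch in Python. It
does not use HDF5 at all: it assumes a made-up toy container (the "TOY1"
header layout and all function names are hypothetical) that stores a creation
timestamp next to the payload, which is roughly the situation described above.
The two timestamps are arbitrary values chosen for the example.

```python
import hashlib
import struct
import time

# Toy header layout (NOT the HDF5 on-disk format): 4-byte magic,
# 8-byte creation time, 4-byte payload length, all little-endian.
HEADER = "<4sQI"


def write_toy_container(payload, created_at=None):
    """Serialize payload with a creation timestamp in the header."""
    if created_at is None:
        created_at = int(time.time())
    return struct.pack(HEADER, b"TOY1", created_at, len(payload)) + payload


def read_payload(blob):
    """Return only the stored data, ignoring the timestamp."""
    magic, _created_at, length = struct.unpack_from(HEADER, blob, 0)
    if magic != b"TOY1":
        raise ValueError("not a toy container")
    offset = struct.calcsize(HEADER)
    return blob[offset:offset + length]


def same_bytes(a, b):
    """Byte-level comparison, the view a tool like diff has."""
    return hashlib.sha256(a).digest() == hashlib.sha256(b).digest()


def same_data(a, b):
    """Data-level comparison, the view a tool like h5diff has."""
    return read_payload(a) == read_payload(b)


# The same three doubles written twice, about a minute apart.
data = struct.pack("<3d", 1.0, 2.0, 3.0)
first = write_toy_container(data, created_at=1407246192)
second = write_toy_container(data, created_at=1407246255)

print("identical bytes:", same_bytes(first, second))
print("identical data: ", same_data(first, second))
```

The byte-level check reports a difference while the data-level check does
not, because only the embedded timestamp changed between the two writes.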

It may have worked for you at some point, but it is not a design goal of HDF5.
Keeping HDF5 data under version control has been discussed in these places,
for instance:
http://scicomp.stackexchange.com/questions/8524/are-hdf5-files-suitable-for-git-revision-control
http://stackoverflow.com/questions/540535/managing-large-binary-files-with-git

If you still want to manage your data with git, which is fine, you may want to
use git-annex, which handles binary data (although it does not provide
text-based diffing): https://git-annex.branchable.com/


Regards,

Pierre

> 
> Thank you,
> 
> Ekin
> 
> 
> On 5 August 2014 15:26, Stohr, Alexander <[email protected]> wrote:
> 
> > Without any deep knowledge of the details of the subject…
> >
> > HDF5 is a container format that uses a variety of techniques to store
> > the data. Some of them are, e.g., B-trees, chunking, and variable sizes
> > for length values.
> >
> > Even if the data is exactly the same, and any reader will see the same
> > data, there can still be many cases where the encapsulation differs:
> >
> > A B-tree can have a different layout.
> >
> > A chunking value can be tuned differently depending on the platform, file
> > system, or even the compiler used.
> >
> > The size of a length value might be selected differently by default.
> >
> > Low-level parsing of the container format will reveal where the
> > difference originates.
> >
> > This is not a bug; it is a feature.
> >
> > Maybe your invalid approach to the comparison is the real “bug”. ;-)
> >
> > regards, Alex.
> >
> > *From:* Hdf-forum [mailto:[email protected]] *On
> > Behalf Of* Ekin Akoglu
> > *Sent:* Tuesday, 5 August 2014 15:15
> > *To:* HDF Users Discussion List
> > *Subject:* [Hdf-forum] Identical HDF5 files according to "h5diff" differ
> > in comparison with "diff" Unix command
> >
> > Dear all,
> >
> > For two versions of the same HDF5 file, an h5diff comparison outputs "0
> > differences found"; however, when compared with the Unix "diff" command,
> > they differ. This is inconvenient under a version control system.
> > Do you have any suggestions as to why diff and h5diff disagree? As far as I
> > remember, this was not the case in the past, and I remember managing HDF5
> > data files without problems under Git; however, I cannot recall with which
> > version of the HDF5 library.
> >
> > I tried this as follows:
> >
> > I compiled my Fortran program (using GNU Fortran 4.8.2) and ran it to
> > create the HDF5 data file as output. I moved the data file to another
> > directory. Then I re-ran my program (without recompiling) and compared
> > the newly created HDF5 data file with the old one using the "diff" tool on
> > Mac OS X (10.9.4) and Linux (Debian Wheezy 7.6 x64), and they did differ.
> > Why?
> >
> > My HDF5 version is 1.8.12 and my diff version is GNU diffutils 2.8.1.
> >

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
