> Another possibility would be to use HDF5 as a data container. It
> supports the fletcher32 filter [1], which basically computes a checksum
> for every data chunk written to disk and then always checks that the data
> read satisfies the checksum kept on-disk. So, if the HDF5 layer doesn't
> complain, you are basically safe.
>
> There are at least two usable HDF5 interfaces for Python and NumPy:
> PyTables [2] and h5py [3]. PyTables does have support for that right
> out-of-the-box. Not sure about h5py though (a quick search in the docs
> doesn't reveal anything).
>
> [1] http://rfc.sunsite.dk/rfc/rfc1071.html
> [2] http://www.pytables.org
> [3] http://h5py.alfven.org
>
> Hope it helps,
>

Just to confirm that h5py does in fact have fletcher32; it's one of
the options you can specify when creating a dataset, although it could
use better documentation:

http://h5py.alfven.org/docs/guide/hl.html#h5py.highlevel.Group.create_dataset

Like other checksums, fletcher32 provides error detection but not
error correction. You'll still need to throw away data which can't be
read. However, I believe that you can still read sections of the
dataset which aren't corrupted.

Andrew Collette
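
For illustration, here is a minimal sketch of turning on fletcher32 when
creating a dataset with h5py. The file name, dataset name, array size and
chunk shape are placeholders made up for this example, and it assumes an
h5py version whose create_dataset accepts the fletcher32 keyword. The
in-memory "core" driver is used only so that nothing is written to disk.

```python
import numpy as np
import h5py

# Example data; the size and dtype are arbitrary choices for this sketch.
data = np.arange(1000, dtype="float64")

# driver="core" with backing_store=False keeps the whole file in memory.
with h5py.File("fletcher32_demo.h5", "w", driver="core",
               backing_store=False) as f:
    # fletcher32=True asks HDF5 to store a checksum with every chunk.
    # Checksums are per chunk, so the dataset uses chunked storage.
    dset = f.create_dataset("data", data=data, chunks=(100,),
                            fletcher32=True)

    # The dataset reports whether the checksum filter is enabled.
    checksum_enabled = dset.fletcher32

    # Reading goes through the filter, which verifies each chunk touched.
    roundtrip = dset[...]

print("fletcher32 enabled:", checksum_enabled)
print("data intact:", np.array_equal(roundtrip, data))
```

Since the checksum is kept per chunk, a mismatch should only affect reads
that touch the damaged chunk; HDF5 is expected to report an error for such
a read rather than hand back the data silently.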
