On Sunday 07 December 2008, Brennan Williams wrote:
> OK so maybe I should....
>
> (1) not add some sort of checksum type functionality to my read/write
> methods
>
> these read/write methods simply read/write numpy arrays to a
> binary file which contains one or more numpy arrays (and nothing
> else).
>
> (2) replace my binary files with either HDF5 or PyTables
>
> But....
>
> my app is being used by clients on existing projects - in one case
> there are over 900 of these numpy binary files in just one project,
> albeit each file is pretty small (200KB or so)
>
> so.. questions.....
>
> How can I transparently (or at least with minimum user-pain) replace
> my existing read/write methods with PyTables or HDF5?
>
> My initial thoughts are...
>
> (a) have an app version number and a data format version number which
> I can check against.
>
> (b) if data format version < 1.0 then read from old binary files
>
> (c) if app version number > 1.0 then write to new PyTables or HDF5
> files
>
> (d) get clients to open existing project and then save existing
> project to semi-transparently convert from old to new formats.
Yeah.  That would work perfectly.  Also, there is a function in PyTables
named 'isHDF5File(filename)' that lets you know whether a file is in
HDF5 format or not.  You might want to use it and avoid bothering with
data format/app version issues.

Cheers, Francesc

>
> Francesc Alted wrote:
> > On Friday 05 December 2008, Andrew Collette wrote:
> >>> Another possibility would be to use HDF5 as a data container.  It
> >>> supports the fletcher32 filter [1] which basically computes a
> >>> checksum for every data chunk written to disk and then always
> >>> checks that the data read satisfies the checksum kept on-disk.
> >>> So, if the HDF5 layer doesn't complain, you are basically safe.
> >>>
> >>> There are at least two usable HDF5 interfaces for Python and
> >>> NumPy: PyTables [2] and h5py [3].  PyTables does have support for
> >>> that right out-of-the-box.  Not sure about h5py though (a quick
> >>> search in docs doesn't reveal anything).
> >>>
> >>> [1] http://rfc.sunsite.dk/rfc/rfc1071.html
> >>> [2] http://www.pytables.org
> >>> [3] http://h5py.alfven.org
> >>>
> >>> Hope it helps,
> >>
> >> Just to confirm that h5py does in fact have fletcher32; it's one
> >> of the options you can specify when creating a dataset, although
> >> it could use better documentation:
> >>
> >> http://h5py.alfven.org/docs/guide/hl.html#h5py.highlevel.Group.create_dataset
> >
> > My bad.  I've searched for 'fletcher' instead of 'fletcher32'.  I
> > naively thought that the search tool in Sphinx allowed for partial
> > name finding.  In fact, it is a pity it does not.
> >
> > Cheers,

-- 
Francesc Alted
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
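
As a rough illustration of the "detect the format instead of tracking
version numbers" idea from the reply, here is a minimal sketch that
picks a reader per file.  It does not call PyTables; it hand-rolls the
check by looking for the 8-byte HDF5 signature at offset 0, 512, 1024,
and so on, which is where the HDF5 file format allows the superblock to
start.  The names looks_like_hdf5, load_arrays, read_legacy and
read_hdf5 are hypothetical, and it assumes the old binary files never
happen to carry that signature at one of those offsets.

```python
import os

# 8-byte signature that opens every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def looks_like_hdf5(path):
    """Return True if an HDF5 signature sits at offset 0, 512, 1024, ...

    Hypothetical stand-in for a library check such as PyTables'
    isHDF5File(filename); it only inspects the signature bytes.
    """
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as fh:
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # A user block may precede the superblock, so the next
            # candidate offsets are 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    return False


def load_arrays(path, read_legacy, read_hdf5):
    """Dispatch to the matching reader.

    read_legacy and read_hdf5 are caller-supplied callables (assumed
    here): the existing binary reader and the new HDF5-based one.
    """
    reader = read_hdf5 if looks_like_hdf5(path) else read_legacy
    return reader(path)
```

With something like this, old projects keep loading through the
existing reader, and any file saved by the new code is recognised on
the next open without a separate data format version field.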