[Hi Jakob. It seems that you sent this from an unsubscribed address. Please subscribe first before sending to the mailing list. Thanks.]
-------------------------------------------------------------------------
Re: 64-bit bug in PyTables/Numpy?
From: Jakob van Santen <vansan...@wisc.edu>
To: pytables-users@lists.sourceforge.net
Date: Tuesday 23:19:36

Hello,

a follow-up on this issue: this turned out to be due to the way the file was written. 64-bit integers were being written as 8-byte integers with only 32 bits of precision. The HDF5 library noticed that the type had only 4 significant bytes and so wrote out only the lower word in H5TBread_records(). Since PyTables prepares the data area with numpy.empty and not numpy.zeros, the memory is not zeroed. This is fine as long as types always have precision == 8*width, but it breaks otherwise.

This is more of a pseudo-bug in HDF5; it would seem more logical to pad the field with zeroes than to simply leave the padding bytes unwritten.

Cheers,
Jakob

On 23 Mar 2010, at 14:38, Jakob van Santen wrote:

> Hello,
>
> I came across what appears to be a bug in the handling of 64-bit integers in PyTables. This happens on 64-bit OS X and on 32- and 64-bit RHEL 4/5, with NumPy 1.3 and PyTables 2.1.2. I have a table that contains a sort of block-offset index; the 'start' and 'stop' columns are UInt64s. When I open the file and read the table into freshly allocated memory, the contents are fine:
>
> In [1]: import tables
>
> In [2]: f=tables.openFile('public_html/pasties/garbage.hd5')
>
> In [3]: f.root.__I3Index__.CalibratedATWD_exp.read(0,5)
> Out[3]:
> array([(111481L, 659348L, 1, 0L, 8L), (111481L, 659367L, 1, 8L, 26L),
>        (111481L, 659391L, 1, 26L, 29L), (111481L, 659430L, 1, 29L, 43L),
>        (111481L, 659456L, 1, 43L, 66L)],
>       dtype=[('Run', '<u4'), ('Event', '<u4'), ('exists', '|u1'), ('start', '<u8'), ('stop', '<u8')])
>
> but if I repeat the operation (or even read in another table), the upper 32 bits are gibberish:
>
> In [4]: f.root.__I3Index__.CalibratedATWD_exp.read(0,5)
> Out[4]:
> array([(111481L, 659348L, 1, 5188146770730811392L, 432345564227567624L),
>        (111481L, 659367L, 1, 8L, 3377699720527898L),
>        (111481L, 659391L, 1, 26L, 19791209299997L),
>        (111481L, 659430L, 1, 29L, 103079215147L),
>        (111481L, 659456L, 1, 43L, 66L)],
>       dtype=[('Run', '<u4'), ('Event', '<u4'), ('exists', '|u1'), ('start', '<u8'), ('stop', '<u8')])
>
> The file can be found here: http://www.icecube.wisc.edu/~jvansanten/pasties/garbage.hd5 . Has anyone else seen this sort of thing before? Why are only the lowest 4 bytes being read in?
>
> Cheers,
> Jakob

--
Francesc Alted
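
A minimal sketch of the mechanism described above, in plain NumPy (this is not PyTables' or HDF5's actual code path): a simulated read fills only the 4 significant bytes of each 8-byte field, once into a zeroed buffer and once into an uninitialized one. The helper name read_low_words_into and the planted 0xDEADBEEF words are illustrative assumptions, and a little-endian layout (as in the '<u8' columns above) is assumed.

import numpy as np

def read_low_words_into(buf, values):
    """Simulate a read that writes only the low 32 bits of each 8-byte
    field, leaving the upper padding bytes of the buffer untouched."""
    # View the uint64 buffer as (low word, high word) pairs; on a
    # little-endian machine column 0 is the low half of each value.
    words = buf.view(np.uint32).reshape(-1, 2)
    words[:, 0] = values   # only the significant bytes are written
    # words[:, 1] is deliberately left alone, like the unwritten padding

values = np.array([8, 26, 29, 43, 66], dtype=np.uint32)

# A zeroed buffer (numpy.zeros) gives the correct result: the untouched
# high words are already 0.
clean = np.zeros(5, dtype=np.uint64)
read_low_words_into(clean, values)
print(clean)               # [ 8 26 29 43 66]

# An uninitialized buffer (numpy.empty) keeps whatever bytes were already
# in memory. Its contents are unpredictable, so plant recognizable "stale"
# words in the high halves to make the demo deterministic.
dirty = np.empty(5, dtype=np.uint64)
dirty.view(np.uint32).reshape(-1, 2)[:, 1] = 0xDEADBEEF
read_low_words_into(dirty, values)
print(dirty)               # low 32 bits correct, high 32 bits are garbage

Since only the padding bytes are left unwritten, the lower word of each affected column is still valid; masking off the upper 32 bits of the values read back should recover the usable part of a file that was already written this way.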