[Hi Jakob. It seems that you sent this from an unsubscribed address. Please subscribe first before sending to the mailing list. Thanks.]
-------------------------------------------------------------------------
Re: 64-bit bug in PyTables/Numpy?
From: Jakob van Santen <vansan...@wisc.edu>
To: pytables-users@lists.sourceforge.net
Date: Tuesday 23:19:36

Hello,

a follow-up on this issue: this turned out to be due to the way the file was written. 64-bit integers were being written as 8-byte integers with only 32 bits of precision. The HDF5 library noticed that the type had only 4 significant bytes and so wrote out only the lower word in H5TBread_records(). Since PyTables prepares the data area with numpy.empty and not numpy.zeros, the memory is not zeroed. This is fine as long as types always have precision == 8*width, but it breaks otherwise.

This is more of a pseudo-bug in HDF5; it would seem more logical to pad the field with zeroes than to simply leave the padding bytes unwritten.

Cheers,
Jakob

On 23 Mar 2010, at 14:38, Jakob van Santen wrote:

> Hello,
>
> I came across what appears to be a bug in the handling of 64-bit integers in PyTables. This happens on 64-bit OS X and on 32- and 64-bit RHEL 4/5, with NumPy 1.3 and PyTables 2.1.2. I have a table that contains a sort of block-offset index; the 'start' and 'stop' columns are UInt64s. When I open the file and read the table into freshly allocated memory, the contents are fine:
>
> In [1]: import tables
>
> In [2]: f=tables.openFile('public_html/pasties/garbage.hd5')
>
> In [3]: f.root.__I3Index__.CalibratedATWD_exp.read(0,5)
> Out[3]:
> array([(111481L, 659348L, 1, 0L, 8L), (111481L, 659367L, 1, 8L, 26L),
>        (111481L, 659391L, 1, 26L, 29L), (111481L, 659430L, 1, 29L, 43L),
>        (111481L, 659456L, 1, 43L, 66L)],
>       dtype=[('Run', '<u4'), ('Event', '<u4'), ('exists', '|u1'), ('start', '<u8'), ('stop', '<u8')])
>
> but if I repeat the operation (or even read in another table), the upper 32 bits are gibberish:
>
> In [4]: f.root.__I3Index__.CalibratedATWD_exp.read(0,5)
> Out[4]:
> array([(111481L, 659348L, 1, 5188146770730811392L, 432345564227567624L),
>        (111481L, 659367L, 1, 8L, 3377699720527898L),
>        (111481L, 659391L, 1, 26L, 19791209299997L),
>        (111481L, 659430L, 1, 29L, 103079215147L),
>        (111481L, 659456L, 1, 43L, 66L)],
>       dtype=[('Run', '<u4'), ('Event', '<u4'), ('exists', '|u1'), ('start', '<u8'), ('stop', '<u8')])
>
> The file can be found here: http://www.icecube.wisc.edu/~jvansanten/pasties/garbage.hd5 . Has anyone else seen this sort of thing before? Why are only the lowest 4 bytes being read in?
>
> Cheers,
> Jakob

--
Francesc Alted
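
A minimal sketch of the mechanism described above, in plain NumPy (this is not PyTables' or HDF5's actual code path): a simulated read fills only the 4 significant bytes of each 8-byte field, once into a zeroed buffer and once into an uninitialized one. The helper name read_low_words_into and the planted 0xDEADBEEF words are illustrative assumptions, and a little-endian layout (as in the '<u8' columns above) is assumed.

import numpy as np

def read_low_words_into(buf, values):
    """Simulate a read that writes only the low 32 bits of each 8-byte
    field, leaving the upper padding bytes of the buffer untouched."""
    # View the uint64 buffer as (low word, high word) pairs; on a
    # little-endian machine column 0 is the low half of each value.
    words = buf.view(np.uint32).reshape(-1, 2)
    words[:, 0] = values   # only the significant bytes are written
    # words[:, 1] is deliberately left alone, like the unwritten padding

values = np.array([8, 26, 29, 43, 66], dtype=np.uint32)

# A zeroed buffer (numpy.zeros) gives the correct result: the untouched
# high words are already 0.
clean = np.zeros(5, dtype=np.uint64)
read_low_words_into(clean, values)
print(clean)               # [ 8 26 29 43 66]

# An uninitialized buffer (numpy.empty) keeps whatever bytes were already
# in memory. Its contents are unpredictable, so plant recognizable "stale"
# words in the high halves to make the demo deterministic.
dirty = np.empty(5, dtype=np.uint64)
dirty.view(np.uint32).reshape(-1, 2)[:, 1] = 0xDEADBEEF
read_low_words_into(dirty, values)
print(dirty)               # low 32 bits correct, high 32 bits are garbage

Since only the padding bytes are left unwritten, the lower word of each affected column is still valid; masking off the upper 32 bits of the values read back should recover the usable part of a file that was already written this way.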