Re: [Numpy-discussion] complex numpy.ndarray dtypes

2008-10-02 Thread Francesc Alted
On Thursday 02 October 2008, John Gu wrote:
 Hello,

 I am using numpy in conjunction with pyTables.  The data that I read
 in from pyTables seem to have the following dtype:

 p = hdf5.root.myTable.read()

 p.__class__
 <type 'numpy.ndarray'>

 p[0].__class__
 <type 'numpy.void'>

 p.dtype
 dtype([('time', 'f4'), ('obs1', 'f4'), ('obs2', 'f8'), ('obs3',
 'f4')])

 p.shape
 (61230,)

 The manner in which I access a particular column is p['time'] or
 p['obs1']. I have a couple of questions regarding this data
 structure: 1) how do I restructure the array into a 61230 x 4 array
 that can be indexed using [r,c] notation?

In your example, the table (record array in NumPy jargon) is 
inhomogeneous (all fields are 'f4' except 'obs2', which is 'f8').  In 
that case, you can obtain a homogeneous array by doing something like:

In [44]: a = numpy.array([(1,2),(3,4)], dtype=[('obs1','f4'),
('obs2','f8')])

In [45]: b = numpy.array([(val['obs1'], val['obs2']) for val in a], 
dtype='f4')

In [46]: b
Out[46]:
array([[ 1.,  2.],
       [ 3.,  4.]], dtype=float32)
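
The list comprehension above visits every record in Python. A possible 
alternative, sketched here under the assumption that every field can be 
cast safely to 'f4', is to stack the named columns side by side so the 
casting happens column by column:

```python
import numpy as np

# Same inhomogeneous example as above: one 'f4' field and one 'f8' field.
a = np.array([(1, 2), (3, 4)], dtype=[('obs1', 'f4'), ('obs2', 'f8')])

# Pull out each named column, cast it to the common type, and place the
# columns next to each other to get a plain (rows, fields) array.
b = np.column_stack([a[name].astype('f4') for name in a.dtype.names])

print(b.shape)
print(b.dtype)
print(b[1, 0])   # [r, c] indexing now works
```

Newer NumPy releases also provide 
numpy.lib.recfunctions.structured_to_unstructured for this kind of 
conversion, which may be worth checking if your version has it.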

If your table is homogeneous, there is a simpler way:

In [41]: a = numpy.array([(1,2),(3,4)], dtype=[('obs1','f4'),
('obs2','f4')])

In [42]: d = a.view(('f4',2))

In [43]: d
Out[43]:
array([[ 1.,  2.],
       [ 3.,  4.]], dtype=float32)

which is faster:

In [68]: timeit d = a.view(('f4',2))
10 loops, best of 3: 11.5 µs per loop

In [69]: timeit b=numpy.array([(val['obs1'], val['obs2']) for val in a], 
dtype='f4')
1 loops, best of 3: 39.8 µs per loop
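
The speed difference is plausibly explained by the fact that a view only 
reinterprets the existing buffer, while the list comprehension builds a 
new array. A small sketch to check this; it spells the view as 
view('f4') followed by reshape, which I would expect to give the same 
result as the view(('f4',2)) call above:

```python
import numpy as np

# Homogeneous example: both fields are 'f4'.
a = np.array([(1, 2), (3, 4)], dtype=[('obs1', 'f4'), ('obs2', 'f4')])

# Reinterpret the same bytes as plain float32, then give them a 2-D shape.
d = a.view('f4').reshape(len(a), -1)

# True means that d and a use the same memory, i.e. nothing was copied.
print(np.shares_memory(a, d))

# Because the memory is shared, writing through d changes a as well.
d[0, 1] = 20.0
print(a['obs2'][0])
```

Keep in mind that this sharing cuts both ways: if you need an 
independent array, call d.copy().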

 2) What kind of dtype is pyTables using?  How do I create a similar
 array that can be indexed by a named column?  I tried various ways:

 a = array([[1,2],[3,4]],
 dtype=dtype([('obs1','f4'),('obs2','f4')]))
 ---
 <type 'exceptions.TypeError'> Traceback (most recent call last)

 p:\AsiaDesk\johngu\projects\deltaForce\<ipython console> in
 <module>()

 <type 'exceptions.TypeError'>: expected a readable buffer object

Yeah, the error message is too terse in this case.  The record array 
constructor needs to know where each record starts and ends, and it 
does so by mapping each tuple to one record.  So, your example must be 
rewritten with tuples instead of inner lists:

In [70]: a = numpy.array([(1,2),(3,4)], dtype=[('obs1','f4'),
('obs2','f4')])

In [71]: a
Out[71]:
array([(1.0, 2.0), (3.0, 4.0)],
      dtype=[('obs1', 'f4'), ('obs2', 'f4')])
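
If the data already exists as a list of lists, one way to apply the same 
rule is to turn each inner list into a tuple first. This is only a 
sketch; the name rows is a placeholder for your own data:

```python
import numpy as np

# Hypothetical input: a list of lists, as in the failing attempt.
rows = [[1, 2], [3, 4]]
dt = np.dtype([('obs1', 'f4'), ('obs2', 'f4')])

# Convert every inner list to a tuple so that each one maps to one record.
a = np.array([tuple(row) for row in rows], dtype=dt)

print(a.shape)        # one entry per record
print(a['obs1'])      # access a whole column by name
print(a[1]['obs2'])   # access one field of one record
```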

Have a look at:

http://www.scipy.org/RecordArrays

for more info on record arrays.

 I did find some documentation about array type descriptors when
 reading from files... it seems like these array types are specific to
 arrays created when reading from some sort of file / buffer?  Any
 help is appreciated.  Thanks!

I'm not sure what you are asking here.  At any rate, it might be 
useful to have a look at the complex dtype examples in:

http://www.scipy.org/Numpy_Example_List#head-f9175c69cccd74b9e4ee92e2a060af27c7447b76
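
If the question is whether such dtypes only appear when reading from a 
file or buffer: as far as I know they do not, and the same descriptor can 
be built directly in memory.  A minimal sketch that reuses the field 
layout from your table (the length 5 and the fill values are arbitrary):

```python
import numpy as np

# Same field names and types as the dtype reported for the table,
# created here without any file or buffer involved.
table_dtype = np.dtype([('time', 'f4'), ('obs1', 'f4'),
                        ('obs2', 'f8'), ('obs3', 'f4')])

# Allocate an empty table of 5 records and fill two columns by name.
p = np.zeros(5, dtype=table_dtype)
p['time'] = np.arange(5)
p['obs2'] = 2.5

print(p.dtype.names)
print(p.dtype.itemsize)   # bytes per record
print(p[3])
```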

Hope that helps,

-- 
Francesc Alted
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] complex numpy.ndarray dtypes

2008-10-01 Thread John Gu
Hello,

I am using numpy in conjunction with pyTables.  The data that I read in from
pyTables seem to have the following dtype:

p = hdf5.root.myTable.read()

p.__class__
<type 'numpy.ndarray'>

p[0].__class__
<type 'numpy.void'>

p.dtype
dtype([('time', 'f4'), ('obs1', 'f4'), ('obs2', 'f8'), ('obs3', 'f4')])

p.shape
(61230,)

The manner in which I access a particular column is p['time'] or p['obs1'].
I have a couple of questions regarding this data structure: 1) how do I
restructure the array into a 61230 x 4 array that can be indexed using [r,c]
notation?  2) What kind of dtype is pyTables using?  How do I create a
similar array that can be indexed by a named column?  I tried various ways:

a = array([[1,2],[3,4]], dtype=dtype([('obs1','f4'),('obs2','f4')]))
---
<type 'exceptions.TypeError'> Traceback (most recent call last)

p:\AsiaDesk\johngu\projects\deltaForce\<ipython console> in <module>()

<type 'exceptions.TypeError'>: expected a readable buffer object

I did find some documentation about array type descriptors when reading from
files... it seems like these array types are specific to arrays created when
reading from some sort of file / buffer?  Any help is appreciated.  Thanks!

John