On Tue, Nov 3, 2009 at 11:43 AM, David Warde-Farley <d...@cs.toronto.edu> wrote:
> On 2-Nov-09, at 11:35 PM, Thomas Robitaille wrote:
>
> > But if I want to specify the data types:
> >
> > np.rec.fromrecords([(1,'hello'),(2,'world')],dtype=[('a',np.int8),
> > ('b',np.str)])
> >
> > the string field is set to a length of zero:
> >
> > rec.array([(1, ''), (2, '')], dtype=[('a', '|i1'), ('b', '|S0')])
> >
> > I need to specify datatypes for all numerical types since I care about
> > int8/16/32, etc, but I would like to benefit from the auto string
> > length detection that works if I don't specify datatypes. I tried
> > replacing np.str by None but no luck. I know I can specify '|S5' for
> > example, but I don't know in advance what the string length should be
> > set to.
>
> This is a limitation of the way the dtype code works, and AFAIK
> there's no easy fix. In some code I wrote recently I had to loop
> through the entire list of records i.e. max(len(foo[2]) for foo in
> records).
>

Not to shamelessly plug my own project ... but more robust string type
detection is one of the features of Tabular
(http://bitbucket.org/elaine/tabular/), and is one of the (kinds of)
reasons we wrote the package. Perhaps using Tabular could be useful to
you?

Dan
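
For reference, the record scan David describes could be sketched roughly as below. This is only an illustration, not code from NumPy or Tabular: the helper name dtype_with_string_widths is made up, the builtin str is used as the "measure this column" placeholder, and a unicode 'U<n>' field is assumed (as on Python 3) where the original thread used '|S<n>'.

```python
import numpy as np

def dtype_with_string_widths(records, fields):
    """Build a dtype list, sizing each string field from the data.

    `fields` is a list of (name, type) pairs. Any field whose type is the
    builtin `str` is treated as a placeholder (an assumption of this
    sketch) and replaced by 'U<n>', where n is the longest value found
    in that column.
    """
    dtype = []
    for i, (name, typ) in enumerate(fields):
        if typ is str:
            # Scan the whole column, as in max(len(foo[i]) for foo in records).
            width = max((len(rec[i]) for rec in records), default=0)
            # Keep at least one character so the field is never zero-length.
            dtype.append((name, 'U%d' % max(width, 1)))
        else:
            # Numeric types (int8/16/32, ...) are passed through unchanged.
            dtype.append((name, typ))
    return dtype

records = [(1, 'hello'), (2, 'world')]
dtype = dtype_with_string_widths(records, [('a', np.int8), ('b', str)])
arr = np.rec.fromrecords(records, dtype=dtype)
```

The cost is one extra pass over the records per string column, which is the trade-off David mentions; the numeric fields keep exactly the types that were requested.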
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion