[Numpy-discussion] numpy.loadtext() fails with dtype + usecols

Ryan May Fri, 18 Jul 2008 15:17:05 -0700

Hi,

I was trying to use loadtxt() today to read in some text data, and I had a problem when I specified a dtype that contained only as many elements as there are columns in usecols. The example below shows the problem:


import numpy as np
import StringIO
data = '''STID RELH TAIR
JOE 70.1 25.3
BOB 60.5 27.9
'''
f = StringIO.StringIO(data)
names = ['stid', 'temp']
dtypes = ['S4', 'f8']
arr = np.loadtxt(f, usecols=(0,2),dtype=zip(names,dtypes), skiprows=1)

With current 1.1 (and SVN head), this yields:

IndexError                                Traceback (most recent call last)

/home/rmay/<ipython console> in <module>()

/usr/lib64/python2.5/site-packages/numpy/lib/io.pyc in loadtxt(fname, dtype, comments, delimiter, converters, skiprows, usecols, unpack)

    309                             for j in xrange(len(vals))]
    310         if usecols is not None:
--> 311             row = [converterseq[j](vals[j]) for j in usecols]
    312         else:
    313             row = [converterseq[j](val) for j,val in enumerate(vals)]


IndexError: list index out of range
------------------------------------------

I've added a patch that checks for usecols and, if it is present, correctly creates the converters dictionary to map each specified column to the converter for the corresponding field in the dtype. With the attached patch, this works fine:


>>> arr
array([('JOE', 25.300000000000001), ('BOB', 27.899999999999999)],
      dtype=[('stid', '|S4'), ('temp', '<f8')])

Comments?  Can I get this in for 1.1.1?

Thanks,

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

--- io.py.bak	2008-07-18 18:12:17.000000000 -0400
+++ io.py	2008-07-16 22:49:13.000000000 -0400
@@ -292,8 +292,13 @@
     if converters is None:
         converters = {}
         if dtype.names is not None:
-            converterseq = [_getconv(dtype.fields[name][0]) \
-                            for name in dtype.names]
+            if usecols is None:
+                converterseq = [_getconv(dtype.fields[name][0]) \
+                                for name in dtype.names]
+            else:
+                converters.update([(col,_getconv(dtype.fields[name][0])) \
+                                    for col,name in zip(usecols, dtype.names)])
+
 
     for i,line in enumerate(fh):
         if i<skiprows: continue
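
For illustration, the column-to-converter mapping that the patch builds can be sketched outside NumPy in plain Python 3. This is only a sketch under stated assumptions: build_converters and parse_row are hypothetical helper names, not functions in numpy/lib/io.py, and str and float stand in for whatever _getconv would return for each dtype field.

```python
data = """STID RELH TAIR
JOE 70.1 25.3
BOB 60.5 27.9
"""

def build_converters(field_convs, usecols=None):
    """Return {file column index: converter} (hypothetical helper).

    Without usecols, field i converts column i. With usecols, field i
    converts column usecols[i], which is the pairing the patch sets up.
    """
    if usecols is None:
        return dict(enumerate(field_convs))
    return dict(zip(usecols, field_convs))

def parse_row(line, converters, usecols=None):
    """Convert only the selected columns of one whitespace-separated line."""
    vals = line.split()
    cols = range(len(vals)) if usecols is None else usecols
    return tuple(converters[j](vals[j]) for j in cols)

field_convs = [str, float]   # stand-ins for the 'stid' and 'temp' converters
usecols = (0, 2)

# Indexing the two-element per-field list by file column fails for column 2,
# which mirrors the IndexError in the traceback.
vals = "JOE 70.1 25.3".split()
try:
    [field_convs[j](vals[j]) for j in usecols]
    failed = False
except IndexError:
    failed = True

# Keying the converters by file column avoids the problem.
converters = build_converters(field_convs, usecols)
rows = [parse_row(line, converters, usecols) for line in data.splitlines()[1:]]
print(failed)  # True
print(rows)    # [('JOE', 25.3), ('BOB', 27.9)]
```

The point of keying by column rather than by position is that the lookup stays valid however sparse usecols is.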

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
