> On Nov 25, 2008, at 2:37 PM, Ryan May wrote:
>> What about doing the parsing and type inference in a loop and holding
>> onto the already split lines? Then loop through the lines with the
>> converters that were finally chosen? In addition to making my usecase
>> work, this has the benefit of not doing the I/O twice.
>
> You mean, filling a list and relooping on it if we need to ? Sounds
> like a plan, but doesn't it create some extra temporaries we may not
> want ?
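
Roughly, the idea would look something like the sketch below. This is not
the actual loadtxt code: the function name, the CANDIDATES ladder, and the
int -> float -> str fallback order are all made up for illustration, and it
assumes every row has the same number of fields.

```python
import io

# Hypothetical converter ladder: strictest type first, str as the catch-all.
CANDIDATES = (int, float, str)

def infer_and_convert(fh, delimiter=None, comments="#"):
    """Read fh once, infer one converter per column, then convert the kept rows."""
    rows = []    # already-split lines, kept so the file is only read once
    levels = []  # per-column index into CANDIDATES
    for line in fh:
        line = line.split(comments)[0].strip()
        if not line:
            continue
        fields = line.split(delimiter)
        if not levels:
            levels = [0] * len(fields)
        elif len(fields) != len(levels):
            raise ValueError("inconsistent number of columns")
        for i, field in enumerate(fields):
            # Upgrade this column's converter until it accepts the field.
            # str() never raises, so the loop always terminates.
            while True:
                try:
                    CANDIDATES[levels[i]](field)
                    break
                except ValueError:
                    levels[i] += 1
        rows.append(fields)
    # Second loop: apply the converters that were finally chosen.
    converters = [CANDIDATES[k] for k in levels]
    return [tuple(conv(f) for conv, f in zip(converters, row)) for row in rows]

print(infer_and_convert(io.StringIO("1 2\n3 4.5\n")))
# → [(1, 2.0), (3, 4.5)]
```

Note that the second column only turns out to be float on the second row,
which is why the first row has to be held and converted afterwards.
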
It shouldn't create any *extra* temporaries, since we already make a list
of lists before creating the final array. It just introduces an extra
looping step. (I'd reuse the existing list of lists.)

> Depends on how we do it. We could have a modified np.loadtxt that
> takes some of the ideas of the file I send you (the StringConverter,
> for example), then I could have a numpy.ma.io that would take care of
> the missing data. And something in scikits.timeseries for the dates...
>
> The new np.loadtxt could use the default of the initial one, or we
> could create yet another function (np.loadfromtxt) that would match
> what I was suggesting, and np.loadtxt would be a special stripped
> downcase with dtype=float by default.
>
> thoughts?

My personal opinion is that, as long as it doesn't make loadtxt too
unwieldy, we should just add a few of the options to loadtxt() itself.
I'm working on tweaking loadtxt() to add the auto dtype and the names,
relying heavily on your StringConverter class (nice code, btw). If my
understanding of StringConverter is correct, tweaking the new loadtxt
for ma or timeseries would only require passing in modified versions of
StringConverter.

I'll post that when I'm done, and we can see whether it looks like too
much functionality stapled together.

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion