On Thu, Mar 6, 2008 at 12:36 PM, <[EMAIL PROTECTED]> wrote:
> Proposed solution:
> -------------------------
>
> It's probably not the best way (noob, that's me), but this situation could
> be fixed by:
>
> 1) add a fill keyword to loadtxt such that
>
>     def loadtxt(..., fill=-999):
>
> 2) add the following after the line "vals = line.split(delimiter)" (line 713
> in core/numeric.py, numpy 1.0.4)
>
> ======================
> for j in range(len(vals)):
>     if vals[j] == '':
>         vals[j] = fill
> ======================
>
> Testing:
> -------------------------
>
> Load an 18,000-line ASCII dataset, 22 float variables on each line, skipping
> the first column (it's a time stamp).
>
> Timings using %timeit in ipython:
>
> Reading an ASCII file with no missing values using the current version of
> loadtxt:
> ***10 loops, best of 3: 704 ms per loop
>
> Reading an ASCII file with no missing values using the proposed changes to
> loadtxt:
> ***10 loops, best of 3: 802 ms per loop
>
> The changes do create a slight performance hit for those who use loadtxt to
> read in nicely behaved ASCII data. If this is an issue, could a loadtxt2
> function be added?
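For readers following along, the proposed change can be sketched as a small standalone function (a sketch only: the name `split_with_fill` is hypothetical, not numpy code; `fill` and the -999 default come from the post above):

    # Sketch of the proposed missing-value handling, outside of loadtxt.
    def split_with_fill(line, delimiter=',', fill=-999):
        """Split a delimited line, replacing empty fields with `fill`."""
        vals = line.split(delimiter)
        # Empty string means the field was missing between two delimiters.
        return [fill if v == '' else v for v in vals]

    print(split_with_fill('1.0,,3.5'))  # ['1.0', -999, '3.5']

The patched loadtxt would then convert the surviving strings to floats as usual, with -999 standing in for the missing fields.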
I haven't used loadtxt so I don't have an opinion on changing it. But would
this be faster than a for loop?

    vals = [(z, fill)[z == ''] for z in vals]

(Note: the test needs to be `z == ''`, not `z is ''` — identity comparison on
strings isn't reliable.)
_______________________________________________
Numpy-discussion mailing list
[email protected]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
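To unpack the tuple-indexing trick suggested in the reply: `(z, fill)[cond]` selects index 0 when `cond` is False and index 1 when True, since bool is a subclass of int (a sketch with made-up sample data, not from the thread):

    # (z, fill)[z == ''] keeps z for non-empty fields, fill for empty ones.
    fill = -999
    vals = ['1.0', '', '3.5', '']
    filled = [(z, fill)[z == ''] for z in vals]
    print(filled)  # ['1.0', -999, '3.5', -999]

The comparison must be equality (`==`); `z is ''` only works when CPython happens to intern the empty string, so it can silently miss empty fields.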
