On Thu, Mar 6, 2008 at 12:36 PM, <[EMAIL PROTECTED]> wrote:
> Proposed solution:
> -------------------------
>
> It's probably not the best way (noob, that's me), but this situation could
> be fixed by:
>
> 1) add a fill keyword to loadtxt such that
>
>     def loadtxt(..., fill=-999):
>
> 2) add the following after the line "vals = line.split(delimiter)" (line 713
> in core/numeric.py, numpy 1.0.4)
>
> ======================
> for j in range(len(vals)):
>     if vals[j] == '':
>         vals[j] = fill
> ======================
>
> Testing:
> -------------------------
>
> Load an 18,000-line ASCII dataset, 22 float variables on each line, skipping
> the first column (it's a time stamp).
>
> Timings using %timeit in ipython:
>
> Reading an ASCII file with no missing values using the current version of
> loadtxt:
> ***10 loops, best of 3: 704 ms per loop
>
> Reading an ASCII file with no missing values using the proposed changes to
> loadtxt:
> ***10 loops, best of 3: 802 ms per loop
>
> The changes do create a slight performance hit for those who use loadtxt to
> read in nicely behaved ASCII data. If this is an issue, could a loadtxt2
> function be added?
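For readers following along, the proposed change can be sketched as a small standalone function (a sketch only: the name `split_with_fill` is hypothetical, not numpy code; `fill` and the -999 default come from the post above):

    # Sketch of the proposed missing-value handling, outside of loadtxt.
    def split_with_fill(line, delimiter=',', fill=-999):
        """Split a delimited line, replacing empty fields with `fill`."""
        vals = line.split(delimiter)
        # Empty string means the field was missing between two delimiters.
        return [fill if v == '' else v for v in vals]

    print(split_with_fill('1.0,,3.5'))  # ['1.0', -999, '3.5']

The patched loadtxt would then convert the surviving strings to floats as usual, with -999 standing in for the missing fields.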
I haven't used loadtxt so I don't have an opinion on changing it. But would
this be faster than a for loop?

    vals = [(z, fill)[z == ''] for z in vals]

(Note: the test needs to be `z == ''`, not `z is ''` — identity comparison on
strings isn't reliable.)
_______________________________________________
Numpy-discussion mailing list
[email protected]
http://projects.scipy.org/mailman/listinfo/numpy-discussion
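To unpack the tuple-indexing trick suggested in the reply: `(z, fill)[cond]` selects index 0 when `cond` is False and index 1 when True, since bool is a subclass of int (a sketch with made-up sample data, not from the thread):

    # (z, fill)[z == ''] keeps z for non-empty fields, fill for empty ones.
    fill = -999
    vals = ['1.0', '', '3.5', '']
    filled = [(z, fill)[z == ''] for z in vals]
    print(filled)  # ['1.0', -999, '3.5', -999]

The comparison must be equality (`==`); `z is ''` only works when CPython happens to intern the empty string, so it can silently miss empty fields.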
