[Numpy-discussion] Filling gaps

2009-02-12 Thread A B
Hi,
Are there any routines to fill in the gaps in an array? The simplest
would be carrying the last known observation forward, so that
0,0,10,8,0,0,7,0
becomes
0,0,10,8,8,8,7,7
Another option would be interpolating the missing values from the
previous and next known observations (their mean).
Thanks.
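
A minimal sketch of the interpolation variant, assuming (as in the example above) that 0 marks a gap. The name `fill_interp` is hypothetical. It uses `np.interp`, which interpolates linearly between the neighbouring known values; for a gap of one element that equals the mean of the previous and next observations. Gaps before the first or after the last known value take that nearest known value.

```python
import numpy as np

def fill_interp(x, miss=0):
    """Fill entries equal to `miss` by linear interpolation (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    known = x != miss
    if not known.any():
        # Nothing to interpolate from; return the data unchanged.
        return x.copy()
    pos = np.arange(x.shape[0])
    y = x.copy()
    # Interpolate at the missing positions from the known positions/values.
    y[~known] = np.interp(pos[~known], pos[known], x[known])
    return y

print(fill_interp([4, 0, 8]))
```

Note that the result is a float array, since interpolated values are generally not integers.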
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Filling gaps

2009-02-12 Thread Pierre GM

On Feb 12, 2009, at 8:22 PM, A B wrote:

> Hi,
> Are there any routines to fill in the gaps in an array. The simplest
> would be by carrying the last known observation forward.
> 0,0,10,8,0,0,7,0
> 0,0,10,8,8,8,7,7
> Or by somehow interpolating the missing values based on the previous
> and next known observations (mean).
> Thanks.


The functions `forward_fill` and `backward_fill` in scikits.timeseries
should do what you want. They also work on MaskedArray objects,
so you don't need an actual time series.
The catch is that you need to install scikits.timeseries, of course.
More info here: http://pytseries.sourceforge.net/
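
For reference, a small sketch of how the example data could be turned into a MaskedArray with plain NumPy, assuming 0 is the marker for a missing value. It only builds the masked array; it does not import or call scikits.timeseries.

```python
import numpy as np

x = np.array([0, 0, 10, 8, 0, 0, 7, 0])

# Mask every entry equal to 0 (assumed here to mean "missing").
mx = np.ma.masked_equal(x, 0)

print(mx)          # masked entries are displayed as --
print(mx.mask)     # boolean array, True where the value is missing
print(mx.count())  # number of unmasked (known) values
```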


Re: [Numpy-discussion] Filling gaps

2009-02-12 Thread Keith Goodman
On Thu, Feb 12, 2009 at 5:22 PM, A B python6...@gmail.com wrote:
> Are there any routines to fill in the gaps in an array. The simplest
> would be by carrying the last known observation forward.
> 0,0,10,8,0,0,7,0
> 0,0,10,8,8,8,7,7

Here's an obvious hack for 1d arrays:

def fill_forward(x, miss=0):
    y = x.copy()
    for i in range(x.shape[0]):
        if y[i] == miss:
            y[i] = y[i-1]
    return y

Seems to work:

>>> x
   array([ 0,  0, 10,  8,  0,  0,  7,  0])
>>> fill_forward(x)
   array([ 0,  0, 10,  8,  8,  8,  7,  7])


Re: [Numpy-discussion] Filling gaps

2009-02-12 Thread Keith Goodman
On Thu, Feb 12, 2009 at 5:52 PM, Keith Goodman kwgood...@gmail.com wrote:
> On Thu, Feb 12, 2009 at 5:22 PM, A B python6...@gmail.com wrote:
>> Are there any routines to fill in the gaps in an array. The simplest
>> would be by carrying the last known observation forward.
>> 0,0,10,8,0,0,7,0
>> 0,0,10,8,8,8,7,7
>
> Here's an obvious hack for 1d arrays:
>
> def fill_forward(x, miss=0):
>     y = x.copy()
>     for i in range(x.shape[0]):
>         if y[i] == miss:
>             y[i] = y[i-1]
>     return y
>
> Seems to work:
>
> >>> x
>    array([ 0,  0, 10,  8,  0,  0,  7,  0])
> >>> fill_forward(x)
>    array([ 0,  0, 10,  8,  8,  8,  7,  7])

I guess that should be

    for i in range(1, x.shape[0]):

instead of

    for i in range(x.shape[0]):

to avoid replacing the first element of the array, if it is missing,
with the last (for i = 0, y[i-1] is y[-1]).
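
Put together, the loop version with the corrected range might look like the sketch below. The second test array is made up here to show the case where the two ranges differ: its first element is missing and its last is not.

```python
import numpy as np

def fill_forward(x, miss=0):
    """Carry the last known value forward over entries equal to `miss`."""
    y = x.copy()
    # Start at 1 so that y[i - 1] never wraps around to y[-1].
    for i in range(1, y.shape[0]):
        if y[i] == miss:
            y[i] = y[i - 1]
    return y

x = np.array([0, 0, 10, 8, 0, 0, 7, 0])
print(fill_forward(x))

# Leading gap stays unfilled, since there is no earlier observation.
print(fill_forward(np.array([0, 5, 0, 3])))
```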


Re: [Numpy-discussion] Filling gaps

2009-02-12 Thread Keith Goodman
On Thu, Feb 12, 2009 at 6:04 PM, Keith Goodman kwgood...@gmail.com wrote:
> On Thu, Feb 12, 2009 at 5:52 PM, Keith Goodman kwgood...@gmail.com wrote:
>> On Thu, Feb 12, 2009 at 5:22 PM, A B python6...@gmail.com wrote:
>>> Are there any routines to fill in the gaps in an array. The simplest
>>> would be by carrying the last known observation forward.
>>> 0,0,10,8,0,0,7,0
>>> 0,0,10,8,8,8,7,7
>>
>> Here's an obvious hack for 1d arrays:
>>
>> def fill_forward(x, miss=0):
>>     y = x.copy()
>>     for i in range(x.shape[0]):
>>         if y[i] == miss:
>>             y[i] = y[i-1]
>>     return y
>>
>> Seems to work:
>>
>> >>> x
>>    array([ 0,  0, 10,  8,  0,  0,  7,  0])
>> >>> fill_forward(x)
>>    array([ 0,  0, 10,  8,  8,  8,  7,  7])
>
> I guess that should be
>
>     for i in range(1, x.shape[0]):
>
> instead of
>
>     for i in range(x.shape[0]):
>
> to avoid replacing the first element of the array, if it is missing,
> with the last.

For large 1d x arrays, this might be faster:

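import numpy as np

def fill_forward2(x, miss=0):
    y = x.copy()
    # Each pass shortens every run of missing values by one element.
    # Note: this loops forever if every element equals miss.
    while np.any(y == miss):
        idx = np.where(y == miss)[0]
        # idx-1 is -1 for index 0, so y[0] is filled from y[-1].
        y[idx] = y[idx-1]
    return y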

But it does replace the first element of the array, if it is missing,
with the last, because idx-1 wraps around to -1 for index 0.

We could speed it up by computing (y == miss) only once per iteration.
(But I bet the np.where is the bottleneck.)
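
One way to avoid the while loop altogether is sketched below. This is a different technique from the one in the message above: it records the index of each known entry and takes a running maximum with `np.maximum.accumulate`, which gives, for every position, the index of the most recent known value. The name `fill_forward_vectorized` is hypothetical, and 0 is again assumed to mark a gap.

```python
import numpy as np

def fill_forward_vectorized(x, miss=0):
    """Loop-free forward fill for 1d arrays (hypothetical helper)."""
    x = np.asarray(x)
    known = x != miss
    # Index of each known entry; 0 as a placeholder where the value is missing.
    idx = np.where(known, np.arange(x.shape[0]), 0)
    # Running maximum: index of the most recent known entry so far.
    idx = np.maximum.accumulate(idx)
    # Leading gaps point at index 0, which is itself missing, so they stay as is.
    return x[idx]

x = np.array([0, 0, 10, 8, 0, 0, 7, 0])
print(fill_forward_vectorized(x))
```

This version computes the mask once, never wraps around to the last element, and terminates even when every element is missing.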