[Numpy-discussion] fast duplicate of array

2010-01-23 Thread Alan G Isaac
Suppose x and y are conformable 2d arrays.
I now want x to become a duplicate of y.
I could create a new array:
x = y.copy()
or I could assign the values of y to x:
x[:,:] = y

As expected the latter is faster (no array creation).
Are there better ways?

Thanks,
Alan Isaac
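
A minimal sketch of how the two statements in the question differ, beyond speed. The name `alias` is hypothetical and is only there to show which buffer each statement touches: slice assignment writes into the array x already refers to, while `x = y.copy()` rebinds the name x to a new array.

```python
import numpy as np

y = np.arange(6.0).reshape(2, 3)
x = np.zeros_like(y)
alias = x                # hypothetical second reference to the same array

x[:, :] = y              # in place: fills the existing buffer
filled_in_place = alias is x and np.array_equal(alias, y)

x = y.copy()             # rebinding: x now names a newly allocated array
x[0, 0] = -1.0           # changes only the new array
rebound = alias is not x
print(filled_in_place, rebound)
```

If other code holds a reference to the old x, only the in-place form updates what that code sees.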
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] fast duplicate of array

2010-01-23 Thread Anne Archibald
2010/1/23 Alan G Isaac ais...@american.edu:
 Suppose x and y are conformable 2d arrays.
 I now want x to become a duplicate of y.
 I could create a new array:
 x = y.copy()
 or I could assign the values of y to x:
 x[:,:] = y

 As expected the latter is faster (no array creation).
 Are there better ways?

If both arrays are C contiguous, or more generally contiguous blocks
of memory with the same strided structure, you might get faster
copying by flattening them first, so that it can go in a single
memcpy(). For really large arrays that use complete pages, some
low-level hackery involving memmap() might be able to make a shared
copy-on-write copy at almost no cost until you start modifying one
array or the other. But both of these tricks are intended for the
regime where copying the data is the expensive part, not fabricating
the array object; for that, I'm not sure you can accelerate things
much.

Anne
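
A small sketch of the precondition described above, assuming a NumPy recent enough to provide `np.shares_memory`. It checks that two arrays are C contiguous with the same strides, and shows that flattening such an array gives a view, whereas flattening a transposed array forces a copy. The variable names are illustrative only.

```python
import numpy as np

y = np.random.random((1000, 1000))
x = np.empty_like(y)

# Both arrays are C contiguous with identical strides,
# so each one is a single block of memory with the same layout.
same_layout = (x.flags.c_contiguous and y.flags.c_contiguous
               and x.strides == y.strides)

# reshape(-1) on a C-contiguous array returns a view, not a copy.
xf = x.reshape(-1)
yf = y.reshape(-1)
flat_is_view = np.shares_memory(xf, x) and np.shares_memory(yf, y)

# The transpose is not C contiguous, so flattening it has to copy.
yt = y.T
transposed_flat_is_view = np.shares_memory(yt.reshape(-1), yt)
print(same_layout, flat_is_view, transposed_flat_is_view)
```

Whether the flat assignment then ends up as a single memcpy() is up to NumPy's internals; the sketch only verifies the layout conditions.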


Re: [Numpy-discussion] fast duplicate of array

2010-01-23 Thread Alan G Isaac
On 1/23/2010 5:01 PM, Anne Archibald wrote:
 If both arrays are C contiguous, or more generally contiguous blocks
 of memory with the same strided structure, you might get faster
 copying by flattening them first, so that it can go in a single
 memcpy().

I may misunderstand this.  Did you just mean
x.flat = y.flat
?

If so, I find that to be *much* slower.

Thanks,
Alan


import time
import numpy as np

x = np.random.random((1000,1000))
y = x.copy()
t0 = time.perf_counter()  # time.clock() was removed in Python 3.8
for t in range(1000): x = y.copy()
print(time.perf_counter() - t0)
t0 = time.perf_counter()
for t in range(1000): x[:,:] = y
print(time.perf_counter() - t0)
t0 = time.perf_counter()
for t in range(1000): x.flat = y.flat
print(time.perf_counter() - t0)


Re: [Numpy-discussion] fast duplicate of array

2010-01-23 Thread Keith Goodman
On Sat, Jan 23, 2010 at 2:31 PM, Alan G Isaac ais...@american.edu wrote:
 On 1/23/2010 5:01 PM, Anne Archibald wrote:
 If both arrays are C contiguous, or more generally contiguous blocks
 of memory with the same strided structure, you might get faster
 copying by flattening them first, so that it can go in a single
 memcpy().

 I may misunderstand this.  Did you just mean
 x.flat = y.flat
 ?

 If so, I find that to be *much* slower.

 Thanks,
 Alan


 x = np.random.random((1000,1000))
 y = x.copy()
 t0 = time.clock()
 for t in range(1000): x = y.copy()
 print(time.clock() - t0)
 t0 = time.clock()
 for t in range(1000): x[:,:] = y
 print(time.clock() - t0)
 t0 = time.clock()
 for t in range(1000): x.flat = y.flat
 print(time.clock() - t0)

I don't know what a view is, but it is fast:

x = y.view()

def speed():
    import numpy as np
    import time
    x = np.random.random((1000,1000))
    y = x.copy()
    # time.perf_counter() replaces time.clock(), which was removed in Python 3.8
    t0 = time.perf_counter()
    for t in range(1000): x = y.copy()
    print(time.perf_counter() - t0)
    t0 = time.perf_counter()
    for t in range(1000): x[:,:] = y
    print(time.perf_counter() - t0)
    t0 = time.perf_counter()
    for t in range(1000): x.flat = y.flat
    print(time.perf_counter() - t0)
    t0 = time.perf_counter()
    for t in range(1000): x = y.view()
    print(time.perf_counter() - t0)

 speed()
1.3
2.07
15.0
0.01


Re: [Numpy-discussion] fast duplicate of array

2010-01-23 Thread Charles R Harris
On Sat, Jan 23, 2010 at 4:00 PM, Keith Goodman kwgood...@gmail.com wrote:

 On Sat, Jan 23, 2010 at 2:31 PM, Alan G Isaac ais...@american.edu wrote:
  On 1/23/2010 5:01 PM, Anne Archibald wrote:
  If both arrays are C contiguous, or more generally contiguous blocks
  of memory with the same strided structure, you might get faster
  copying by flattening them first, so that it can go in a single
  memcpy().
 
  I may misunderstand this.  Did you just mean
  x.flat = y.flat
  ?
 
  If so, I find that to be *much* slower.
 
  Thanks,
  Alan
 
 
  x = np.random.random((1000,1000))
  y = x.copy()
  t0 = time.clock()
  for t in range(1000): x = y.copy()
  print(time.clock() - t0)
  t0 = time.clock()
  for t in range(1000): x[:,:] = y
  print(time.clock() - t0)
  t0 = time.clock()
  for t in range(1000): x.flat = y.flat
  print(time.clock() - t0)

 I don't know what a view is, but it is fast:

 x = y.view()


In this case x isn't a copy of y, it is a reference to the same data in
memory. It is fast because no copying is done.

Chuck
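
A short illustration of the point above, as a sketch with made-up names (`v`, `c`): a view is a new array object on the same buffer, so a later change to y shows up in the view but not in a copy.

```python
import numpy as np

y = np.zeros((2, 2))
v = y.view()     # new array object, same underlying buffer
c = y.copy()     # new array object, new buffer

y[0, 0] = 42.0   # modify the original afterwards
print(v[0, 0])   # the view sees the change
print(c[0, 0])   # the copy does not
```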


Re: [Numpy-discussion] fast duplicate of array

2010-01-23 Thread Anne Archibald
2010/1/23 Alan G Isaac ais...@american.edu:
 On 1/23/2010 5:01 PM, Anne Archibald wrote:
 If both arrays are C contiguous, or more generally contiguous blocks
 of memory with the same strided structure, you might get faster
 copying by flattening them first, so that it can go in a single
 memcpy().

 I may misunderstand this.  Did you just mean
 x.flat = y.flat
 ?

No, .flat constructs an iterator that traverses the object as if it
were flat. I had in mind accessing the underlying data through views
that were flat:

In [3]: x = np.random.random((1000,1000))

In [4]: y = np.random.random((1000,1000))

In [5]: xf = x.view()

In [6]: xf.shape = (-1,)

In [7]: yf = y.view()

In [8]: yf.shape = (-1,)

In [9]: yf[:] = xf[:]

This may still use a loop instead of a memcpy(), in which case you'd
want to look for an explicit memcpy()-based implementation, but when
manipulating multidimensional arrays you have (in principle, anyway)
nested loops which may not be executed in the cache-optimal order.
Ideally numpy would automatically notice when operations can be done
on flattened versions of arrays and get rid of some of the looping and
indexing, but I wouldn't count on it. At one point I remember finding
that the loops were reordered not for cache coherence but to make the
inner loop over the biggest dimension (to minimize looping overhead).

Anne
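
A hedged sketch for timing the flat-view assignment against the plain 2d assignment with `timeit`. Two things here are not from the thread: the views are made with `reshape(-1)` rather than by setting `.shape`, and `np.copyto` is included as another in-place option, assuming a NumPy version that provides it. Here x is the destination, as in the original question. No timings are claimed; results depend on NumPy version and hardware.

```python
import timeit
import numpy as np

x = np.random.random((1000, 1000))
y = np.random.random((1000, 1000))

# Flat views onto the same buffers (both arrays are C contiguous).
xf = x.reshape(-1)
yf = y.reshape(-1)

def assign_2d():
    x[:, :] = y          # 2d slice assignment

def assign_flat():
    xf[:] = yf           # same data, addressed through 1d views

def copyto():
    np.copyto(x, y)      # in-place copy, not mentioned in the thread

runs = 20
for fn in (assign_2d, assign_flat, copyto):
    seconds = timeit.timeit(fn, number=runs)
    print(fn.__name__, seconds / runs)
```

All three variants leave x owning its original buffer, filled with the values of y.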




Re: [Numpy-discussion] fast duplicate of array

2010-01-23 Thread Alan G Isaac
On 1/23/2010 6:00 PM, Keith Goodman wrote:
 x = y.view()

Thanks, but I'm not looking for a view.
And I need x to own its data.

Alan
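
For reference, a minimal sketch of how to check the ownership requirement stated above, using the `flags.owndata` and `base` attributes of an array. The variable names are illustrative.

```python
import numpy as np

y = np.random.random((3, 3))
x = np.empty_like(y)

x[:, :] = y                    # fill x in place
x_owns = x.flags.owndata       # x still owns its buffer
x_has_no_base = x.base is None

v = y.view()
v_owns = v.flags.owndata       # a view borrows y's buffer
v_base_is_y = v.base is y
print(x_owns, x_has_no_base, v_owns, v_base_is_y)
```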




Re: [Numpy-discussion] fast duplicate of array

2010-01-23 Thread Alan G Isaac
On 1/23/2010 7:29 PM, Anne Archibald wrote:
 I had in mind accessing the underlying data through views
 that were flat:

 In [3]: x = np.random.random((1000,1000))

 In [4]: y = np.random.random((1000,1000))

 In [5]: xf = x.view()

 In [6]: xf.shape = (-1,)

 In [7]: yf = y.view()

 In [8]: yf.shape = (-1,)

 In [9]: yf[:] = xf[:]


Yup, that's a bit faster.
Thanks,
Alan
