Re: [Numpy-discussion] Aligned array allocations

2015-01-22 Thread Sturla Molden
Antoine Pitrou solip...@pitrou.net wrote:

> By always using an aligned allocator there is some overhead:
> - all arrays occupy a bit more memory by a small average amount
>   (probably 16 bytes average on a 64-bit machine, for a 16 byte
>   guaranteed alignment)

NumPy arrays are Python objects. They have an overhead anyway, much more
than this, and 16 bytes are not worse than adding a couple of pointers to
the struct. In the big picture this tiny overhead does not matter.

> - array resizes can be more expensive in CPU time, when the physical
>   start changes and its alignment changes too

We are using Python. If we were worried about small inefficiencies we would
not be using it. Resizing an ndarray is rare anyway. ndarrays are not used
like Python lists or in place of lists; we use lists in the same way as
anyone else who uses Python. So an ndarray resize can afford to be more
expensive than a list append.

Also the NumPy community expects an ndarray resize to be expensive and O(n)
due to its current behavior: If an array has a view, realloc is out of the
question.
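
To illustrate the point about views, here is a minimal sketch using the
public ndarray.resize method and np.resize function. The variable names are
only illustrative, and the in-place refusal relies on NumPy's default
reference check (refcheck=True) under CPython.

```python
import numpy as np

a = np.zeros(4)
v = a[:2]  # a view that shares a's buffer

# With a live view, the in-place resize is refused rather than
# reallocating memory that the view still points to.
try:
    a.resize(8)
    resized_in_place = True
except ValueError:
    resized_in_place = False

# The fallback is a fresh allocation plus a copy, which is O(n).
b = np.resize(a, 8)

print(resized_in_place)  # False
print(b.shape)           # (8,)
```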

:-) 

Sturla

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Aligned array allocations

2015-01-19 Thread Antoine Pitrou

Hello,

In https://github.com/numpy/numpy/issues/5312 there's a request for an
aligned allocator in Numpy (more than the default alignment of the
platform's memory allocator). The reason is that on modern
vectorization instruction sets, a certain alignment is required for
optimal performance (even though unaligned data still works: it's just
that performance is degraded... by how much will depend on the CPU
micro-architecture). For example Intel recommends a 32-byte alignment
for AVX loads and stores.

In https://github.com/numpy/numpy/pull/5457 I have proposed a patch to
wrap the system allocator in an aligned allocator. The proposed scheme
makes the alignment configurable at runtime (through a Python API),
because different platforms may have different desirable alignments,
and it is not reasonable for Numpy to know about them all, nor for
users to recompile Numpy each time they have a different CPU.
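
For readers who want aligned memory without a patched Numpy, the usual
workaround can be sketched in pure Python: over-allocate a byte buffer and
slice it at the first aligned address. This is not the scheme implemented in
the pull request, and the helper name aligned_empty and its parameters are
made up for this illustration.

```python
import numpy as np

def aligned_empty(n_items, dtype=np.float64, align=32):
    """Return an uninitialized 1-D array whose data pointer is a multiple of `align`.

    Hypothetical helper, not part of Numpy: it over-allocates by `align`
    bytes and takes a view starting at the first aligned address.
    """
    dtype = np.dtype(dtype)
    nbytes = n_items * dtype.itemsize
    # Over-allocating by `align` bytes guarantees an aligned start exists.
    raw = np.empty(nbytes + align, dtype=np.uint8)
    # Number of bytes to skip to reach the next multiple of `align`.
    offset = (-raw.ctypes.data) % align
    # The returned array keeps `raw` alive through its base chain.
    return raw[offset:offset + nbytes].view(dtype)

a = aligned_empty(1000, np.float64, align=32)
print(a.ctypes.data % 32)  # 0
```

The cost is visible here: every allocation carries `align` extra bytes, which
is the memory overhead discussed below.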

By always using an aligned allocator there is some overhead:
- all arrays occupy a bit more memory by a small average amount
  (probably 16 bytes average on a 64-bit machine, for a 16 byte
  guaranteed alignment)
- array resizes can be more expensive in CPU time, when the physical
  start changes and its alignment changes too

There is also a limitation: while the physical start of an array will
always be aligned, this can be defeated when taking a view starting at
a non-zero index.
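
The limitation can be seen with a small sketch. It builds a 32-byte-aligned
base array by over-allocating a byte buffer (an illustrative workaround, not
the allocator from the pull request) and then inspects the data pointer of a
view that starts at index 1.

```python
import numpy as np

ALIGN = 32
N = 16  # number of float64 items, 8 bytes each

# Build a base array whose data pointer is a multiple of ALIGN.
raw = np.empty(N * 8 + ALIGN, dtype=np.uint8)
offset = (-raw.ctypes.data) % ALIGN
base = raw[offset:offset + N * 8].view(np.float64)

# A view starting at index 1 begins one item (8 bytes) further along.
view = base[1:]

print(base.ctypes.data % ALIGN)  # 0
print(view.ctypes.data % ALIGN)  # 8
```

Only views whose start index is a multiple of ALIGN / itemsize (here, every
4th float64) keep the alignment of the base array.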

(note that to take advantage of certain instruction set features such
as AVX, Numpy may need to be compiled with specific compiler flags...
but Numpy's allocations also affect other packages such as Numba which
is able to generate code at runtime)

I would like to know if people are interested in this feature, and if
the proposed approach is acceptable.

Regards

Antoine.

