Den 18.02.2012 23:54, skrev Travis Oliphant:
> Another factor.   the decision to make an extra layer of indirection makes 
> small arrays that much slower.   I agree with Mark that in a core library we 
> need to go the other way with small arrays being completely allocated in the 
> data-structure itself (reducing the number of pointer de-references).
>

I am not sure there is much overhead to

    double *const data = (double*)PyArray_DATA(array);

If C code calls PyArray_DATA(array) more often than needed, the fix is not to 
store the data inside the struct, but rather to fix the real problem. For 
example, the Cython syntax for NumPy arrays will, under the hood, unbox 
the ndarray struct into local variables. That gives the fastest data 
access. The NumPy core could e.g. have macros that take care of the 
unboxing.
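
A minimal sketch of that unboxing style in plain C, assuming a 1-D array 
of doubles whose dtype the caller has already checked (the function name 
sum_1d is made up for illustration, not anything in NumPy):

    #include <Python.h>
    #include <numpy/arrayobject.h>

    static double sum_1d(PyArrayObject *array)
    {
        /* Hoist the API calls out of the hot loop: pointer and
           stride are read once, into locals. */
        char *const data = PyArray_BYTES(array);
        const npy_intp n = PyArray_DIM(array, 0);
        const npy_intp stride = PyArray_STRIDE(array, 0);  /* in bytes */
        double acc = 0.0;
        npy_intp i;

        for (i = 0; i < n; i++)
            acc += *(double *)(data + i * stride);  /* no per-element API calls */
        return acc;
    }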

But for the purpose of cache use, it could be smart to make sure the 
data buffer is allocated directly after the PyObject struct (or at least 
in the vicinity of it), so it is loaded into cache along with the 
PyObject. That is, prefetched before dereferencing PyArray_DATA(array). 
But with respect to placement we must keep in mind that the PyObject can 
be subclassed. Putting e.g. 4 kB of static buffer space inside the 
PyArrayObject struct would bloat every ndarray.
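
As a sketch of the layout idea only (not NumPy's actual allocator), a 
single malloc can place a small data buffer directly behind the header 
via a C99 flexible array member, so header and data end up adjacent in 
memory; the small_array type here is made up:

    #include <stdlib.h>
    #include <string.h>

    typedef struct {
        size_t nbytes;    /* size of the trailing buffer in bytes */
        /* ... other header fields ... */
        double data[];    /* buffer follows the header in memory */
    } small_array;

    static small_array *small_array_new(size_t n)
    {
        /* One allocation covers header plus data. */
        small_array *a = malloc(sizeof *a + n * sizeof(double));
        if (a == NULL)
            return NULL;
        a->nbytes = n * sizeof(double);
        memset(a->data, 0, a->nbytes);
        return a;
    }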

Sturla
