On 18.02.2012 23:54, Travis Oliphant wrote:
> Another factor: the decision to make an extra layer of indirection makes
> small arrays that much slower. I agree with Mark that in a core library we
> need to go the other way with small arrays being completely allocated in the
> data-structure itself (reducing the number of pointer de-references).
>
I am not sure there is much overhead to

    double *const data = (double*)PyArray_DATA(array);

If C code calls PyArray_DATA(array) more often than needed, the fix is not to store the data inside the struct, but rather to fix the real problem. For example, the Cython syntax for NumPy arrays will, under the hood, unbox the ndarray struct into local variables. That gives the fastest data access. The NumPy core could e.g. have macros that take care of the unboxing.

But for the purpose of cache use, it could be smart to make sure the data buffer is allocated directly after the PyObject struct (or at least in the vicinity of it), so it will be loaded into cache along with the PyObject. That is, prefetched before dereferencing PyArray_DATA(array).

But with respect to placement we must keep in mind that the PyObject can be subclassed. Putting e.g. 4 kB of static buffer space inside the PyArrayObject struct will bloat every ndarray.

Sturla
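
To make the two ideas in this message concrete, here is a minimal sketch in plain C. It shows reading the struct fields into locals once before the loop, and allocating the buffer directly after the header with a single malloc. It does not use the NumPy C API: toy_array, toy_array_new and toy_array_sum are hypothetical names invented for illustration, and the layout is an assumption, not how PyArrayObject is defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an array object; NOT the real PyArrayObject. */
typedef struct {
    size_t  n;             /* number of elements */
    double *data;          /* points at inline_buf in this sketch */
    double  inline_buf[];  /* C99 flexible array member: buffer follows header */
} toy_array;

/* One malloc for header plus buffer, so the data sits right after the
   header in memory. Returns NULL on size overflow or allocation failure. */
toy_array *toy_array_new(size_t n)
{
    if (n > (SIZE_MAX - sizeof(toy_array)) / sizeof(double))
        return NULL;
    toy_array *a = malloc(sizeof(toy_array) + n * sizeof(double));
    if (a == NULL)
        return NULL;
    a->n = n;
    a->data = a->inline_buf;
    for (size_t i = 0; i < n; i++)
        a->data[i] = 0.0;
    return a;
}

/* "Unbox" the fields into locals once; the loop body then works only on
   the locals and never goes back through the struct. */
double toy_array_sum(const toy_array *a)
{
    assert(a != NULL);
    const double *const data = a->data;
    const size_t n = a->n;
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += data[i];
    return total;
}
```

A caller would allocate with toy_array_new(n), fill a->data, call toy_array_sum(a), and release the whole object with a single free(a). Note that in standard C a flexible array member must be the last member of its struct, so a derived struct cannot simply append fields after it. That is one concrete form of the subclassing concern raised above.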