On 2/5/2011 1:02 PM, Eric Firing wrote:
On 02/04/2011 02:03 PM, Christoph Gohlke wrote:
[...]
How about these changes to color.py (attached)? This avoids copies, uses
in-place operations, and calculates in single precision when normalizing
small-integer and float32 arrays. Something similar could be done for
LogNorm. Do masked arrays support in-place operations?
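
To make the idea concrete: the color.py change itself is Python/NumPy and is not reproduced here. The C++ sketch below only illustrates the same principle, rescaling a buffer to [0, 1] in place and in single precision, so no temporary array is allocated. The name normalize_in_place and its signature are hypothetical, and vmin < vmax is assumed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from color.py): rescale every element to
// (x - vmin) / (vmax - vmin), overwriting the input. Assumes vmin < vmax.
void normalize_in_place(std::vector<float> &data, float vmin, float vmax)
{
    // Compute the reciprocal once so the loop does one subtract and one
    // multiply per element, all in single precision.
    const float scale = 1.0f / (vmax - vmin);
    for (std::size_t i = 0; i < data.size(); ++i)
    {
        data[i] = (data[i] - vmin) * scale;
    }
}
```

Because the result is written back into the input buffer, memory traffic is half that of a double-precision version of the same loop; whether that yields a measurable speedup depends on the machine.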
Christoph
Christoph,
Thank you.
Done (with slight modifications) in 8946 (trunk).
I was surprised by the speedup in normalizing large arrays when using
float32 versus float64: a factor of 10 on my machine for a (1000,1000)
array, timed with ipython %timeit. Because of the way %timeit repeats
its tests, I suspect it may exaggerate cache effects.
Eric
Please consider the attached patch for the _image.frombyte function. It
avoids temporary copies for non-contiguous input arrays. Copying a
1024x1024 slice out of a contiguous 4096x4096 RGBA or RGB array (a
common case for zooming/panning) is about 7x faster. Copying contiguous
RGB input arrays is ~2x faster. Tested on win32-py2.7.
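
For readers who want the gist without the Python/NumPy C API, here is a minimal standalone sketch of the row-by-row copy. It is not the patch itself: the helper name copy_window_rgba and its parameters are made up for illustration, and it assumes that pixels within a row are adjacent in memory and that only the rows are strided.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper, not part of _image.cpp: copy a rows x cols window of
// 8-bit pixels into a packed RGBA buffer. `src` points at the first pixel of
// the window, `row_stride` is the byte distance between consecutive rows of
// the parent image, and `channels` is 3 (RGB) or 4 (RGBA).
std::vector<std::uint8_t> copy_window_rgba(const std::uint8_t *src,
                                           std::size_t rows, std::size_t cols,
                                           std::size_t row_stride,
                                           std::size_t channels)
{
    assert(channels == 3 || channels == 4);
    std::vector<std::uint8_t> dst(rows * cols * 4);
    std::uint8_t *out = dst.data();
    for (std::size_t r = 0; r < rows; ++r)
    {
        const std::uint8_t *in = src + r * row_stride;
        if (channels == 4)
        {
            // RGBA: each row of the window is already packed, so one block
            // copy per row is enough.
            std::memcpy(out, in, cols * 4);
            out += cols * 4;
        }
        else
        {
            // RGB: copy three bytes per pixel and append an opaque alpha.
            for (std::size_t c = 0; c < cols; ++c)
            {
                *out++ = *in++;
                *out++ = *in++;
                *out++ = *in++;
                *out++ = 255;
            }
        }
    }
    return dst;
}
```

The point is that the source is read directly through its row stride, so no contiguous temporary of the window has to be built before the copy into the destination buffer.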
Christoph
Index: _image.cpp
===================================================================
--- _image.cpp (revision 8964)
+++ _image.cpp (working copy)
@@ -1083,7 +1083,7 @@
Py::Object x = args[0];
int isoutput = Py::Int(args[1]);
- PyArrayObject *A = (PyArrayObject *) PyArray_ContiguousFromObject(x.ptr(),
PyArray_UBYTE, 3, 3);
+ PyArrayObject *A = (PyArrayObject *) PyArray_FromObject(x.ptr(),
PyArray_UBYTE, 3, 3);
if (A == NULL)
{
throw Py::ValueError("Array must have 3 dimensions");
@@ -1102,35 +1102,86 @@
agg::int8u *arrbuf;
agg::int8u *buffer;
+ agg::int8u *dstbuf;
arrbuf = reinterpret_cast<agg::int8u *>(A->data);
size_t NUMBYTES(imo->colsIn * imo->rowsIn * imo->BPP);
- buffer = new agg::int8u[NUMBYTES];
+ buffer = dstbuf = new agg::int8u[NUMBYTES];
if (buffer == NULL) //todo: also handle allocation throw
{
throw Py::MemoryError("_image_module::frombyte could not allocate memory");
}
- const size_t N = imo->rowsIn * imo->colsIn * imo->BPP;
- size_t i = 0;
- if (A->dimensions[2] == 4)
+ if (PyArray_ISCONTIGUOUS(A))
{
- memmove(buffer, arrbuf, N);
+ if (A->dimensions[2] == 4)
+ {
+ memmove(dstbuf, arrbuf, imo->rowsIn * imo->colsIn * 4);
+ }
+ else
+ {
+ size_t i = imo->rowsIn * imo->colsIn;
+ while (i--)
+ {
+ *dstbuf++ = *arrbuf++;
+ *dstbuf++ = *arrbuf++;
+ *dstbuf++ = *arrbuf++;
+ *dstbuf++ = 255;
+ }
+ }
}
+ else if ((A->strides[1] == 4) && (A->strides[2] == 1))
+ {
+ const size_t N = imo->colsIn * 4;
+ const size_t stride = A->strides[0];
+ for (size_t rownum = 0; rownum < imo->rowsIn; rownum++)
+ {
+ memmove(dstbuf, arrbuf, N);
+ arrbuf += stride;
+ dstbuf += N;
+ }
+ }
+ else if ((A->strides[1] == 3) && (A->strides[2] == 1))
+ {
+ const size_t stride = A->strides[0] - imo->colsIn * 3;
+ for (size_t rownum = 0; rownum < imo->rowsIn; rownum++)
+ {
+ for (size_t colnum = 0; colnum < imo->colsIn; colnum++)
+ {
+ *dstbuf++ = *arrbuf++;
+ *dstbuf++ = *arrbuf++;
+ *dstbuf++ = *arrbuf++;
+ *dstbuf++ = 255;
+ }
+ arrbuf += stride;
+ }
+ }
else
{
- while (i < N)
+ PyArrayIterObject *iter;
+ iter = (PyArrayIterObject *)PyArray_IterNew((PyObject *)A);
+ if (A->dimensions[2] == 4)
{
- memmove(buffer, arrbuf, 3);
- buffer += 3;
- arrbuf += 3;
- *buffer++ = 255;
- i += 4;
+ while (iter->index < iter->size) {
+ *dstbuf++ = *((unsigned char *)iter->dataptr);
+ PyArray_ITER_NEXT(iter);
+ }
}
- buffer -= N;
- arrbuf -= imo->rowsIn * imo->colsIn;
+ else
+ {
+ while (iter->index < iter->size) {
+ *dstbuf++ = *((unsigned char *)iter->dataptr);
+ PyArray_ITER_NEXT(iter);
+ *dstbuf++ = *((unsigned char *)iter->dataptr);
+ PyArray_ITER_NEXT(iter);
+ *dstbuf++ = *((unsigned char *)iter->dataptr);
+ PyArray_ITER_NEXT(iter);
+ *dstbuf++ = 255;
+ }
+ }
+ Py_DECREF(iter);
}
if (isoutput)
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users