On 11/08/15 22:36, Andreas Kloeckner wrote:
Henry Gomersall <[email protected]> writes:
>I've noticed that using e.g. clmath._atan2(out, in1, in2, queue) with a
>pre-allocated `out` array is nearly twice as fast as using
>clmath.atan2(in1, in2, queue), even when a memory pool is used to
>allocate the Array.
Oops. Thanks for reporting this.

It turns out that the Python reimplementation of the memory pool was
pretty broken--it didn't actually do much. With the fixed version now in
git, things look considerably better. In particular, when I try your
test code (on the AMD CPU implementation), the time difference between
the explicit out argument and the mempool version is now only about 10%.
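For anyone following along, the win from a memory pool is that freed buffers of a given size get recycled instead of going back to the allocator, so repeated temporary allocations (like the `out` array `clmath.atan2` creates on each call) become cheap. The toy sketch below illustrates only that idea; it is not PyOpenCL's `MemoryPool`, and all names in it are made up for illustration:

```python
# Toy sketch of the memory-pool idea: recycle freed buffers of the same
# size rather than allocating fresh ones. Hypothetical names throughout;
# this is not PyOpenCL's implementation.
class ToyMemoryPool:
    def __init__(self):
        self._free = {}   # size -> list of recycled buffers
        self.hits = 0     # requests served from the pool
        self.misses = 0   # requests that needed a fresh allocation

    def allocate(self, size):
        bucket = self._free.get(size)
        if bucket:
            self.hits += 1
            return bucket.pop()
        self.misses += 1
        return bytearray(size)  # stand-in for a real device allocation

    def free(self, buf):
        # Return the buffer to the pool for later reuse.
        self._free.setdefault(len(buf), []).append(buf)

pool = ToyMemoryPool()
a = pool.allocate(1024)
pool.free(a)
b = pool.allocate(1024)  # same-sized request, served from the pool
print(pool.hits, pool.misses)
```

If the pool is "broken" in the sense Andreas describes (i.e. it never actually recycles), every call pays the full allocation cost, which matches the roughly 2x gap Henry measured between the preallocated-`out` and mempool paths.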


I'm struggling to reproduce your improvements. Simply installing from master with `pip install -e .` (or similar) is actually substantially slower than the PyPI installation.

I noticed that the PyPI installation (`pip install pyopencl`) uses Boost, the configuration for which seems to have been stripped out of master. Am I missing something about the use of libffi when building from source that would explain the slowdown I've noticed?

(It's something like half the speed now, so not insignificant!)

Cheers,

Henry

_______________________________________________
PyOpenCL mailing list
[email protected]
http://lists.tiker.net/listinfo/pyopencl
