On 30.09.2016 23:09, josef.p...@gmail.com wrote: > On Fri, Sep 30, 2016 at 9:38 AM, Julian Taylor > <jtaylor.deb...@googlemail.com> wrote: >> hi, >> Temporary arrays generated in expressions are expensive as the imply >> extra memory bandwidth which is the bottleneck in most numpy operations. >> For example: >> >> r = a + b + c >> >> creates the b + c temporary and then adds a to it. >> This can be rewritten to be more efficient using inplace operations: >> >> r = b + c >> r += a > > general question (I wouldn't understand the details even if I looked.) > > how is this affected by broadcasting and type promotion? > > Some of the main reasons that I don't like to use inplace operation in > general is that I'm often not sure when type promotion occurs and when > arrays expand during broadcasting. > > for example b + c is 1-D, a is 2-D, and r has the broadcasted shape. > another case when I switch away from broadcasting is when b + c is int > or bool and a is float. Thankfully, we get error messages for casting > now.
the temporary is only avoided when the casting follows the safe rule, so it should be the same as what you get without inplace operations. E.g. float32-temporary + float64 will not be converted to the unsafe float32 += float64 which a normal inplace operations would allow. But float64-temp + float32 is transformed. Currently the only broadcasting that will be transformed is temporary + scalar value, otherwise it will only work on matching array sizes. Though there is not really anything that prevents full broadcasting but its not implemented yet in the PR. > >> >> This saves some memory bandwidth and can speedup the operation by 50% >> for very large arrays or even more if the inplace operation allows it to >> be completed completely in the cpu cache. > > I didn't realize the difference can be so large. That would make > streamlining some code worth the effort. > > Josef > > >> >> The problem is that inplace operations are a lot less readable so they >> are often only used in well optimized code. But due to pythons >> refcounting semantics we can actually do some inplace conversions >> transparently. >> If an operand in python has a reference count of one it must be a >> temporary so we can use it as the destination array. CPython itself does >> this optimization for string concatenations. >> >> In numpy we have the issue that we can be called from the C-API directly >> where the reference count may be one for other reasons. >> To solve this we can check the backtrace until the python frame >> evaluation function. If there are only numpy and python functions in >> between that and our entry point we should be able to elide the temporary. >> >> This PR implements this: >> https://github.com/numpy/numpy/pull/7997 >> >> It currently only supports Linux with glibc (which has reliable >> backtraces via unwinding) and maybe MacOS depending on how good their >> backtrace is. On windows the backtrace APIs are different and I don't >> know them but in theory it could also be done there. >> >> A problem is that checking the backtrace is quite expensive, so should >> only be enabled when the involved arrays are large enough for it to be >> worthwhile. In my testing this seems to be around 180-300KiB sized >> arrays, basically where they start spilling out of the CPU L2 cache. >> >> I made a little crappy benchmark script to test this cutoff in this branch: >> https://github.com/juliantaylor/numpy/tree/elide-bench >> >> If you are interested you can run it with: >> python setup.py build_ext -j 4 --inplace >> ipython --profile=null check.ipy >> >> At the end it will plot the ratio between elided and non-elided runtime. >> It should get larger than one around 180KiB on most cpus. >> >> If no one points out some flaw in the approach, I'm hoping to get this >> into the next numpy version. >> >> cheers, >> Julian >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion >
signature.asc
Description: OpenPGP digital signature
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion