Much obliged, everyone. I've noted all of these points in my post about my high-performance talk, with updates linking back to these threads: http://ianozsvald.com/2014/02/23/high-performance-python-at-pydatalondon-2014/
Re. Ronan - I'll use jitviewer, I just haven't had time. I hope to cover it in the JIT section of my book (in a month or so). (For anyone following along, I've put a rough sketch of the element-wise vs vectorised loop at the bottom of this mail, below the quoted thread.)

On 25 February 2014 11:36, Maciej Fijalkowski <fij...@gmail.com> wrote:
> FYI it's fixed
>
> On Sat, Feb 22, 2014 at 8:48 PM, Maciej Fijalkowski <fij...@gmail.com> wrote:
>> On Sat, Feb 22, 2014 at 7:45 PM, Ronan Lamy <ronan.l...@gmail.com> wrote:
>>> Hello Ian,
>>>
>>> On 20/02/14 20:40, Ian Ozsvald wrote:
>>>
>>>> Hi Armin. The point of the question was not to remove numpy but to
>>>> understand the behaviour :-) I've already done a set of benchmarks
>>>> with lists and with numpy; I've copied the results below. I'm using
>>>> the same Julia code throughout (there's a note about the code below).
>>>> PyPy on lists is indeed very compelling.
>>>>
>>>> One observation I've made of beginners (and I did the same) is that
>>>> iterating over numpy arrays seems natural until you learn it is
>>>> horribly slow. Then you learn to vectorise. Some of the current tools
>>>> handle the non-vectorised case really well and that's something I want
>>>> to mention.
>>>>
>>>> For Julia I've used lists and numpy. Using a numpy array rather than an
>>>> `array` makes sense as arrays won't hold a complex type (and messing
>>>> with decomposing the complex elements into two arrays gets even
>>>> sillier) and the example is still trivial for a reader to understand.
>>>> numpy arrays (and Python arrays) are good because they use much less
>>>> RAM than big lists. The reason why my example code above made lists
>>>> and then turned them into numpy arrays... that's because I was lazy and
>>>> hadn't finished tidying this demo (my bad!).
>>>
>>> I agree that your code looks rather sensible (at least, to people who
>>> haven't internalised yet all the "stupid" implementation details concerning
>>> arrays, lists, iteration and vectorisation). So it's a bit of a shame that
>>> PyPy doesn't do better.
>>>
>>>> I don't mind that my use of numpy is silly, I'm just curious to
>>>> understand why pypynumpy diverges from the results of the other
>>>> compiler technologies. The simple answer might be 'because pypynumpy
>>>> is young' and that'd be fine - at least I'd have an answer if someone
>>>> asks the question in my talk. If someone has more details, that'd be
>>>> really interesting too. Is there a fundamental reason why pypynumpy
>>>> couldn't do the example as fast as cython/numba/pythran?
>>>
>>> To answer such questions, the best way is to use the jitviewer
>>> (https://bitbucket.org/pypy/jitviewer). Looking at the trace for the inner
>>> loop, I can see every operation on a scalar triggers a dict lookup to obtain
>>> its dtype. This looks like self-inflicted pain coming from the broken objspace
>>> abstraction rather than anything fundamental. Fixing that should improve
>>> speed by about an order of magnitude.
>>>
>>> Cheers,
>>> Ronan
>>
>> Hi Ronan.
>>
>> You can't blame objspace for everything ;-) It looks like it's easily
>> fixable. I'm in transit right now but I can fix it once I'm home. Ian
>> - please come with more broken examples, they usually come from stupid
>> reasons!
>>
>> Cheers,
>> fijal
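P.S. For anyone following along, here is roughly the shape of the two styles being contrasted above. This is only a minimal sketch, not the actual benchmark code from my talk; the Julia constant, grid bounds, iteration cap and function names in it are illustrative stand-ins.

import numpy as np

# Illustrative values only; the real benchmark uses its own constant,
# bounds and iteration cap.
C = complex(-0.62772, -0.42193)
MAX_ITER = 300


def julia_elementwise(zs):
    """The 'natural' beginner style: loop over the numpy array one
    scalar at a time. CPython+numpy punishes this; it's the case that
    pypynumpy, cython, numba and pythran are being compared on."""
    output = np.zeros(len(zs), dtype=np.int32)
    for i in range(len(zs)):
        z = zs[i]
        n = 0
        while abs(z) < 2 and n < MAX_ITER:
            z = z * z + C
            n += 1
        output[i] = n
    return output


def julia_vectorised(zs):
    """The vectorised rewrite that numpy on CPython rewards:
    whole-array operations, no Python-level inner loop."""
    zs = zs.copy()
    output = np.zeros(zs.shape, dtype=np.int32)
    for _ in range(MAX_ITER):
        alive = np.abs(zs) < 2          # points that haven't escaped yet
        if not alive.any():
            break
        zs[alive] = zs[alive] * zs[alive] + C
        output[alive] += 1
    return output


if __name__ == "__main__":
    # Small complex grid so the pure-Python loop finishes quickly.
    side = np.linspace(-1.8, 1.8, 100)
    zs = (side[np.newaxis, :] + 1j * side[:, np.newaxis]).ravel()
    assert (julia_elementwise(zs) == julia_vectorised(zs)).all()
    print("both versions agree on", zs.size, "points")

# To see what PyPy's JIT does with the element-wise loop (as Ronan did),
# the usual jitviewer workflow is something like the following -- check
# the jitviewer README for the exact invocation:
#   PYPYLOG=jit-log-opt,jit-backend:julia.log pypy julia_sketch.py
#   jitviewer.py julia.log

The element-wise loop is the kind of inner loop Ronan inspected in jitviewer, where each scalar operation was triggering a dict lookup for its dtype.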
--
Ian Ozsvald (A.I. researcher)
i...@ianozsvald.com

http://IanOzsvald.com
http://MorConsulting.com/
http://Annotate.IO
http://SocialTiesApp.com/
http://TheScreencastingHandbook.com
http://FivePoundApp.com/
http://twitter.com/IanOzsvald
http://ShowMeDo.com