Oops! Having just done an svn update, I now see that David appears to have done most of this about a week ago...
I'm behind the times.

-tim

Tim Hochberg wrote:

>I've finally got around to looking at numexpr again. Specifically, I'm
>looking at Francesc Altet's numexpr-0.2, with the idea of harmonizing
>the two versions. Let me go through his list of enhancements and comment
>(my comments are dedented):
>
> - Addition of a boolean type. This allows better array copying times
>   for large arrays (lightweight computations are typically bounded by
>   memory bandwidth).
>
>Adding this to numexpr looks like a no-brainer. The behaviour of booleans
>differs from that of integers, so in addition to being more memory
>efficient, this enables boolean &, |, ~, etc. to work properly.
>
> - Enhanced performance for strided and unaligned data, especially for
>   lightweight computations (e.g. 'a>10'). With this and the addition of
>   the boolean type, we can get up to 2x better times than previous
>   versions. Also, most of the supported computations go faster than
>   with numpy or numarray, even the simplest ones.
>
>Francesc, if you're out there, can you briefly describe what this
>support consists of? It's been long enough since I was messing with this
>that it's going to take me a while to untangle NumExpr_run, where I
>expect it's lurking, so any hints would be appreciated.
>
> - Addition of ~, & and | operators (a la numarray.where)
>
>Sounds good.
>
> - Support for both numpy and numarray (use the flag --force-numarray
>   in setup.py).
>
>At first glance this looks like it doesn't make things too messy, so I'm
>in favor of incorporating this.
>
> - Added a new benchmark for testing boolean expressions and
>   strided/unaligned arrays: boolean_timing.py
>
>Benchmarks are always good.
>
> Things that I want to address in the future:
>
> - Add tests on strided and unaligned data (currently only tested
>   manually)
>
>Yep! Tests are good.
>
> - Add types for int16, int64 (on 32-bit platforms), float32,
>   complex64 (single prec.)
>
>I have some specific ideas about how this should be accomplished.
>Basically, I don't think we want to support every type in the same way,
>since this is going to make the case statement blow up to an enormous
>size. This may slow things down, and at a minimum it will make things
>less comprehensible. My thinking is that we only add casts for the extra
>types and do the computations at high precision. Thus adding two int16
>numbers compiles to two OP_CAST_Ffs followed by an OP_ADD_FFF, and then
>an OP_CAST_fF. The details are left as an exercise for the reader ;-).
>So, adding int16, float32, and complex64 should only require the
>addition of 6 casting opcodes plus appropriate modifications to the
>compiler.
>
>For large arrays, this should have most of the benefits of giving each
>type its own opcode, since the memory bandwidth is still small, while
>keeping the interpreter relatively simple.
>
>Unfortunately, int64 doesn't fit under this scheme; is it used enough to
>matter? I hate to pile a whole pile of new opcodes on for something
>that's rarely used.
>
>Regards,
>
>-tim
>
>_______________________________________________
>Numpy-discussion mailing list
>Numpy-discussion@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/numpy-discussion
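The cast-then-compute scheme described in the message above can be sketched in a few lines of numpy (a hypothetical illustration only; `add_int16_via_casts` is not part of numexpr, and the comments map steps to the opcode names from the message):

```python
import numpy as np

def add_int16_via_casts(a, b):
    """Sketch of the proposed cast-based handling of small types.

    Instead of a dedicated int16 add opcode, the two int16 inputs are
    upcast to the interpreter's high-precision type (float64 here),
    added at that precision, and the result is cast back down.
    """
    hi_a = a.astype(np.float64)     # OP_CAST_Ff: int16 -> float64
    hi_b = b.astype(np.float64)     # OP_CAST_Ff: int16 -> float64
    hi_sum = hi_a + hi_b            # OP_ADD_FFF: add at high precision
    return hi_sum.astype(np.int16)  # OP_CAST_fF: float64 -> int16

a = np.array([1, 2, 30000], dtype=np.int16)
b = np.array([4, 5, 6], dtype=np.int16)
out = add_int16_via_casts(a, b)  # int16 array [5, 7, 30006]
```

This also illustrates why int64 doesn't fit the scheme: float64 carries only 53 bits of mantissa, so large int64 values would not survive the round trip through the high-precision type.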