El dt 09 de 01 del 2007 a les 23:19 +0900, en/na David Cournapeau va escriure: > Hi, > > I am finally implementing a C function to replace the current slow > implementation of clip in python as promised a few weeks ago. The idea > is to implement it as the following: > > def clip(input, min, max): > a = input.copy() > putmask(a, a <= min, min) > putmask(a, a >= max, max) > return a > > I don't have that much experience in writing general numpy functions, so > I was wondering of other people could advise me on the following points. >
Sorry, but not real experience writing extensions directly in C. However, you may want to experiment using numexpr for doing what you want. It's relatively easy to extend numexpr and adding a new opcode to its virtual machine. I'm attaching a patch for implementing such a clip routine (only for floating point types, but given the example, it should be straightforward to add support for integers as well). Also, you should note that using the fancy indexing of numpy seems to perform better than the putmask approach. Below are my figures for a small benchmark (also attached) for testing the performance of clip using several approaches: time (putmask)--> 1.38 time (where)--> 2.713 time (numexpr where)--> 1.291 time (fancy+assign)--> 0.967 time (numexpr clip)--> 0.596 It is interesting to see there how fancy-indexing + assignation is quite more efficient than putmask. Unfortunately, integrating this clip version officially in numexpr doesn't seem realistic as the main authors are trying hard to avoid cluttering the VM with too many opcodes (they don't want to overload the instruction cache of the CPU). Nevertheless, as the implementation of clip in numexpr should be rather optimal, its performance can still be used as a reference for other implementations. HTH and good luck with your own clip implementation, -- Francesc Altet | Be careful about using the following code -- Carabos Coop. V. | I've only proven that it works, www.carabos.com | I haven't tested it. -- Donald Knuth
import numpy as N from numexpr import evaluate from time import time M = 1000000; niter = 5 lower = 5; upper = M-7 a = N.random.random(M)*M t1 = time() for i in xrange(niter): b = a.copy() N.putmask(b, b<=lower, lower) N.putmask(b, b>=upper, upper) print "time (putmask)-->", round(time()-t1, 3) t1 = time() for i in xrange(niter): t=N.where(a<=lower, lower, a) b=N.where(t>=upper, upper, t) print "time (where)-->", round(time()-t1, 3) t1 = time() for i in xrange(niter): t=evaluate("where(a<=lower, lower, a)") b=evaluate("where(t>=upper, upper, t)") print "time (numexpr where)-->", round(time()-t1, 3) t1 = time() for i in xrange(niter): b=a.copy() b[b<=lower] = lower b[b>=upper] = upper print "time (fancy+assign)-->", round(time()-t1, 3) t1 = time() for i in xrange(niter): b = evaluate("clip(a, lower, upper)") print "time (numexpr clip)-->", round(time()-t1, 3)
Index: interp_body.c =================================================================== --- interp_body.c (revision 2513) +++ interp_body.c (working copy) @@ -177,6 +177,10 @@ case OP_WHERE_FFFF: VEC_ARG3(f_dest = f1 ? f2 : f3); + case OP_CLIP_FFFF: VEC_ARG3(f_dest = f1; + if (f_dest <= f2) f_dest = f2; + if (f_dest >= f3) f_dest = f3); + case OP_FUNC_FF: VEC_ARG1(f_dest = functions_f[arg2](f1)); case OP_FUNC_FFF: VEC_ARG2(f_dest = functions_ff[arg3](f1, f2)); Index: interpreter.c =================================================================== --- interpreter.c (revision 2513) +++ interpreter.c (working copy) @@ -64,6 +64,7 @@ OP_SQRT_FF, OP_ARCTAN2_FFF, OP_WHERE_FFFF, + OP_CLIP_FFFF, OP_FUNC_FF, OP_FUNC_FFF, @@ -181,6 +182,9 @@ case OP_WHERE_FFFF: if (n == 0 || n == 1 || n == 2 || n == 3) return 'f'; break; + case OP_CLIP_FFFF: + if (n == 0 || n == 1 || n == 2 || n == 3) return 'f'; + break; case OP_FUNC_FF: if (n == 0 || n == 1) return 'f'; if (n == 2) return 'n'; @@ -1340,6 +1344,7 @@ add_op("sqrt_ff", OP_SQRT_FF); add_op("arctan2_fff", OP_ARCTAN2_FFF); add_op("where_ffff", OP_WHERE_FFFF); + add_op("clip_ffff", OP_CLIP_FFFF); add_op("func_ff", OP_FUNC_FF); add_op("func_fff", OP_FUNC_FFF); Index: expressions.py =================================================================== --- expressions.py (revision 2513) +++ expressions.py (working copy) @@ -104,6 +104,12 @@ return ConstantNode(numpy.where(a, b, c)) return FuncNode('where', [a,b,c]) [EMAIL PROTECTED] +def clip_func(a, b, c): + if isinstance(a, ConstantNode): + raise ValueError("too many dimensions") + return FuncNode('clip', [a,b,c]) + def encode_axis(axis): if isinstance(axis, ConstantNode): axis = axis.value @@ -211,6 +217,7 @@ 'fmod' : func(numpy.fmod, 'float'), 'where' : where_func, + 'clip' : clip_func, 'complex' : func(complex, 'complex'),
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion