If you revise the macro as Stefan suggests, would you post the revision as a response here?
On Thursday, January 14, 2016 at 8:09:23 AM UTC-5, [email protected] wrote:
>
> This macro:
>
> macro clenshaw(x, c...)
>     bk1,bk2 = :(zero(t)),:(zero(t))
>     N = length(c)
>     for k = N:-1:2
>         bk2, bk1 = bk1, :(muladd(t,$bk1,$(esc(c[k]))-$bk2))
>     end
>     ex = :(muladd(t/2,$bk1,$(esc(c[1]))-$bk2))
>     Expr(:block, :(t = $(esc(2))*$(esc(x))), ex)
> end
>
> implements Clenshaw's algorithm to sum Chebyshev series. It successfully
> "unrolls" the loop, but is impractical for more than 24 coefficients. The
> resulting LLVM code is theoretically only 50% longer than unrolling
> Horner's rule:
>
> f(x) = @evalpoly(x,1.0,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/9,1/10,1/11,1/12,1/13,1/14,1/15,1/16,1/17,1/18,1/19,1/20)
>
> @code_llvm f(1.0)
>
> g(x) = @clenshaw(x,1.0,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/9,1/10,1/11,1/12,1/13,1/14,1/15,1/16,1/17,1/18,1/19,1/20)
>
> @code_llvm g(1.0)
>
> How could I write the macro differently? How else could I end up with the
> same efficient LLVM code? using a staged function?
>
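For comparing any revised macro against a known-good result, here is a minimal, non-unrolled sketch of the same recurrence as the quoted macro. The name clenshaw_loop is hypothetical, and I am assuming the coefficients are ordered so that c[1] multiplies T_0(x), c[2] multiplies T_1(x), and so on, which is how the quoted macro appears to treat them. It is only a reference for checking values, not a replacement for the unrolled version.

```julia
# Hypothetical reference version (not from the original post):
# evaluates c[1]*T_0(x) + c[2]*T_1(x) + ... + c[N]*T_{N-1}(x)
# using the same recurrence that the quoted macro unrolls.
function clenshaw_loop(x, c...)
    t = 2x
    bk1 = zero(t)
    bk2 = zero(t)
    # Backward recurrence: b_k = c_k + 2x*b_{k+1} - b_{k+2}
    for k in length(c):-1:2
        bk2, bk1 = bk1, muladd(t, bk1, c[k] - bk2)
    end
    # Final step uses x = t/2 rather than 2x
    return muladd(t / 2, bk1, c[1] - bk2)
end
```

A revised macro could then be spot-checked with something like clenshaw_loop(x, 1.0, 1/2, 1/3) ≈ @clenshaw(x, 1.0, 1/2, 1/3) for a few values of x.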
