If you revise the macro as Stefan suggests, would you post the revision as 
a response here?

On Thursday, January 14, 2016 at 8:09:23 AM UTC-5, 
[email protected] wrote:
>
> This macro:
>
> macro clenshaw(x, c...)
>     bk1,bk2 = :(zero(t)),:(zero(t))
>     N = length(c)
>     for k = N:-1:2
>         bk2, bk1 = bk1, :(muladd(t,$bk1,$(esc(c[k]))-$bk2))
>     end
>     ex = :(muladd(t/2,$bk1,$(esc(c[1]))-$bk2))
>     Expr(:block, :(t = $(esc(2))*$(esc(x))), ex)
> end
>
> implements Clenshaw's algorithm to sum Chebyshev series. It successfully 
> "unrolls" the loop, but is impractical for more than 24 coefficients. The 
> resulting LLVM code is theoretically only 50% longer than unrolling 
> Horner's rule:
>
> f(x) = 
> @evalpoly(x,1.0,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/9,1/10,1/11,1/12,1/13,1/14,1/15,1/16,1/17,1/18,1/19,1/20)
>
> @code_llvm f(1.0)
>
> g(x) = 
> @clenshaw(x,1.0,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/9,1/10,1/11,1/12,1/13,1/14,1/15,1/16,1/17,1/18,1/19,1/20)
>
> @code_llvm g(1.0)
>
> How could I write the macro differently? How else could I end up with the 
> same efficient LLVM code? Would a staged function work?
>
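
One possible revision, offered as a sketch and not necessarily what Stefan has in mind: in the quoted macro each new bk1 expression embeds the previous bk1 and bk2 expressions in full, so the size of the generated expression grows roughly like the Fibonacci numbers in the number of coefficients. Binding each intermediate value to a temporary keeps the expansion linear. The gensym'd names t and b below are introduced only for this illustration.

```julia
# Sketch: Clenshaw summation of a Chebyshev series, unrolled at macro
# expansion time with one temporary per recurrence step.
macro clenshaw(x, c...)
    t = gensym(:t)                          # temporary holding 2x
    stmts = Any[:($t = 2 * $(esc(x)))]
    bk1 = bk2 = :(zero($t))                 # b_{N+1} = b_{N+2} = 0
    for k = length(c):-1:2
        b = gensym(:b)                      # fresh name for b_k
        # b_k = 2x*b_{k+1} + c_k - b_{k+2}; earlier results are referenced
        # by name instead of being spliced in as whole sub-expressions
        push!(stmts, :($b = muladd($t, $bk1, $(esc(c[k])) - $bk2)))
        bk2, bk1 = bk1, b
    end
    # final step: x*b_2 + c_1 - b_3
    push!(stmts, :(muladd($t / 2, $bk1, $(esc(c[1])) - $bk2)))
    Expr(:block, stmts...)
end

g(x) = @clenshaw(x, 1.0, 1/2, 1/3, 1/4, 1/5)
```

Usage is unchanged from the original, e.g. @code_llvm g(1.0). Since every statement now refers to at most two earlier temporaries, the expanded expression has one assignment per coefficient.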
