I get this LLVM code:

ulia> @code_llvm g(1.0)

define double @julia_g_23965(double) {
top:
  %1 = fmul double %0, 2.000000e+00
  %2 = fmul double %1, 5.000000e-01
  %3 = fmul fast double %0, 1.000000e-01
  %4 = fadd fast double %3, 0x3FAAF286BCA1AF28
  %5 = fmul fast double %1, %4
  %6 = fadd fast double %5, 0x3F76C16C16C16C10
  %7 = fsub double 0x3FAE1E1E1E1E1E1E, %4
  %8 = fmul fast double %1, %6
  %9 = fadd fast double %7, %8
  %10 = fsub double 6.250000e-02, %6
  %11 = fmul fast double %1, %9
  %12 = fadd fast double %10, %11
  %13 = fsub double 0x3FB1111111111111, %9
  %14 = fmul fast double %1, %12
  %15 = fadd fast double %13, %14
  %16 = fsub double 0x3FB2492492492492, %12
  %17 = fmul fast double %1, %15
  %18 = fadd fast double %16, %17
  %19 = fsub double 0x3FB3B13B13B13B14, %15
  %20 = fmul fast double %1, %18
  %21 = fadd fast double %19, %20
  %22 = fsub double 0x3FB5555555555555, %18
  %23 = fmul fast double %1, %21
  %24 = fadd fast double %22, %23
  %25 = fsub double 0x3FB745D1745D1746, %21
  %26 = fmul fast double %1, %24
  %27 = fadd fast double %25, %26
  %28 = fsub double 1.000000e-01, %24
  %29 = fmul fast double %1, %27
  %30 = fadd fast double %28, %29
  %31 = fsub double 0x3FBC71C71C71C71C, %27
  %32 = fmul fast double %1, %30
  %33 = fadd fast double %31, %32
  %34 = fsub double 1.250000e-01, %30
  %35 = fmul fast double %1, %33
  %36 = fadd fast double %34, %35
  %37 = fsub double 0x3FC2492492492492, %33
  %38 = fmul fast double %1, %36
  %39 = fadd fast double %37, %38
  %40 = fsub double 0x3FC5555555555555, %36
  %41 = fmul fast double %1, %39
  %42 = fadd fast double %40, %41
  %43 = fsub double 2.000000e-01, %39
  %44 = fmul fast double %1, %42
  %45 = fadd fast double %43, %44
  %46 = fsub double 2.500000e-01, %42
  %47 = fmul fast double %1, %45
  %48 = fadd fast double %46, %47
  %49 = fsub double 0x3FD5555555555555, %45
  %50 = fmul fast double %1, %48
  %51 = fadd fast double %49, %50
  %52 = fsub double 5.000000e-01, %48
  %53 = fmul fast double %1, %51
  %54 = fadd fast double %52, %53
  %55 = fsub double 1.000000e+00, %51
  %56 = fmul fast double %2, %54
  %57 = fadd fast double %55, %56
  ret double %57
}


Is that not what you were hoping for?

On Thu, Jan 14, 2016 at 8:09 AM, <[email protected]> wrote:

> This macro:
>
> macro clenshaw(x, c...)
>     bk1,bk2 = :(zero(t)),:(zero(t))
>     N = length(c)
>     for k = N:-1:2
>         bk2, bk1 = bk1, :(muladd(t,$bk1,$(esc(c[k]))-$bk2))
>     end
>     ex = :(muladd(t/2,$bk1,$(esc(c[1]))-$bk2))
>     Expr(:block, :(t = $(esc(2))*$(esc(x))), ex)
> end
>
> implements Clenshaw's algorithm to sum Chebyshev series. It successfully
> "unrolls" the loop, but is impractical for more than 24 coefficients. The
> resulting LLVM code is theoretically only 50% longer than unrolling
> Horner's rule:
>
> f(x) =
> @evalpoly(x,1.0,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/9,1/10,1/11,1/12,1/13,1/14,1/15,1/16,1/17,1/18,1/19,1/20)
>
> @code_llvm f(1.0)
>
> g(x) =
> @clenshaw(x,1.0,1/2,1/3,1/4,1/5,1/6,1/7,1/8,1/9,1/10,1/11,1/12,1/13,1/14,1/15,1/16,1/17,1/18,1/19,1/20)
>
> @code_llvm g(1.0)
>
> How could I write the macro differently? How else could I end up with the
> same efficient LLVM code? using a staged function?
>

Reply via email to