Hello Christoph,
I can confirm this behavior on 0.5.0-rc0. Notably, the generated LLVM code
is identical for both cases:
julia> @code_llvm test1(1_000_000)
define double @julia_test1_68549(i64) #0 {
top:
%1 = icmp slt i64 %0, 1
br i1 %1, label %L2, label %if.preheader
if.preheader: ; preds = %top
br label %if
L2.loopexit: ; preds = %if
br label %L2
L2: ; preds = %L2.loopexit, %top
%s.0.lcssa = phi double [ 0.000000e+00, %top ], [ %3, %L2.loopexit ]
ret double %s.0.lcssa
if: ; preds = %if.preheader, %if
%s.04 = phi double [ %3, %if ], [ 0.000000e+00, %if.preheader ]
%"#temp#.03" = phi i64 [ %2, %if ], [ 1, %if.preheader ]
%2 = add i64 %"#temp#.03", 1
%3 = fadd double %s.04, 0x3F8BAD7BCA428034
%4 = icmp eq i64 %"#temp#.03", %0
br i1 %4, label %L2.loopexit, label %if
}
julia> @code_llvm test2(1_000_000,g1)
define double @julia_test2_68608(i64) #0 {
top:
%1 = icmp slt i64 %0, 1
br i1 %1, label %L2, label %if.preheader
if.preheader: ; preds = %top
br label %if
L2.loopexit: ; preds = %if
br label %L2
L2: ; preds = %L2.loopexit, %top
%s.0.lcssa = phi double [ 0.000000e+00, %top ], [ %3, %L2.loopexit ]
ret double %s.0.lcssa
if: ; preds = %if.preheader, %if
%s.04 = phi double [ %3, %if ], [ 0.000000e+00, %if.preheader ]
%"#temp#.03" = phi i64 [ %2, %if ], [ 1, %if.preheader ]
%2 = add i64 %"#temp#.03", 1
%3 = fadd double %s.04, 0x3F8BAD7BCA428034
%4 = icmp eq i64 %"#temp#.03", %0
br i1 %4, label %L2.loopexit, label %if
}
However, the native code generated for the two functions is *not* identical,
which is indeed very strange.
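
A minimal sketch for reproducing the comparison, in case it helps. It reuses
the two functions from the original post, compares the bytes allocated per
call with @allocated, and dumps the native code with @code_native. The
`using InteractiveUtils` line is an assumption about running this as a script
on Julia 0.7 or later; on 0.5 the macro is available without it and the line
should be removed. The names alloc1 and alloc2 are only illustrative.

```julia
# Assumption: Julia 0.7 or later, run as a script. On 0.5, @code_native is
# available from Base and this line should be removed.
using InteractiveUtils

function test1(N)
    r = 0.234; s = 0.0
    for n = 1:N
        s += r^3 + r^5
    end
    return s
end

function test2(N, f1)
    r = 0.234; s = 0.0
    for n = 1:N
        s += f1(r)
    end
    return s
end

g1(r) = r^3 + r^5

# Warm up both functions so compilation is not part of the measurement.
test1(10)
test2(10, g1)

# Bytes allocated by one call of each variant (illustrative variable names).
alloc1 = @allocated test1(1_000_000)
alloc2 = @allocated test2(1_000_000, g1)
println("test1: ", alloc1, " bytes allocated")
println("test2: ", alloc2, " bytes allocated")

# Print the native code of both variants for a side-by-side comparison.
@code_native test1(1_000_000)
@code_native test2(1_000_000, g1)
```

If the allocation counts differ while @code_llvm shows the same IR, comparing
the two @code_native listings is probably the quickest way to see where the
two versions diverge.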
On Tuesday, August 2, 2016 at 12:33:24 PM UTC+8, Christoph Ortner wrote:
>
> Below are two tests: in the first, a simple polynomial is "hard-coded"; in
> the second, it is passed as a function. I would expect the two to be
> equivalent, but the second case is significantly faster. Can anybody
> explain what is going on? @code_warntype doesn't show anything that would
> explain it.
>
> function test1(N)
>     r = 0.234; s = 0.0
>     for n = 1:N
>         s += r^3 + r^5
>     end
>     return s
> end
>
>
> function test2(N, f1)
>     r = 0.234; s = 0.0
>     for n = 1:N
>         s += f1(r)
>     end
>     return s
> end
>
>
> g1(r) = r^3 + r^5
>
>
> test1(10)
> test2(10, g1)
>
>
> println("Test1: hard-coded functions")
> @time test1(1_000_000)
> @time test1(1_000_000)
> @time test1(1_000_000)
>
>
> println("Test2: pass functions")
> @time test2(1_000_000, g1)
> @time test2(1_000_000, g1)
> @time test2(1_000_000, g1)
>
>
>
>
> # $ julia5 weird_test2.jl
> # Test1: hard-coded functions
> # 0.086683 seconds (4.00 M allocations: 61.043 MB, 50.75% gc time)
> # 0.142487 seconds (4.00 M allocations: 61.035 MB, 76.91% gc time)
> # 0.025388 seconds (4.00 M allocations: 61.035 MB, 4.28% gc time)
> # Test2: pass functions
> # 0.000912 seconds (5 allocations: 176 bytes)
> # 0.000860 seconds (5 allocations: 176 bytes)
> # 0.000846 seconds (5 allocations: 176 bytes)
>