Hello Christoph, I can confirm this behavior on 0.5.0-rc0. Notably, the generated LLVM code is identical for both cases:
julia> @code_llvm test1(1_000_000)

define double @julia_test1_68549(i64) #0 {
top:
  %1 = icmp slt i64 %0, 1
  br i1 %1, label %L2, label %if.preheader

if.preheader:                                     ; preds = %top
  br label %if

L2.loopexit:                                      ; preds = %if
  br label %L2

L2:                                               ; preds = %L2.loopexit, %top
  %s.0.lcssa = phi double [ 0.000000e+00, %top ], [ %3, %L2.loopexit ]
  ret double %s.0.lcssa

if:                                               ; preds = %if.preheader, %if
  %s.04 = phi double [ %3, %if ], [ 0.000000e+00, %if.preheader ]
  %"#temp#.03" = phi i64 [ %2, %if ], [ 1, %if.preheader ]
  %2 = add i64 %"#temp#.03", 1
  %3 = fadd double %s.04, 0x3F8BAD7BCA428034
  %4 = icmp eq i64 %"#temp#.03", %0
  br i1 %4, label %L2.loopexit, label %if
}

julia> @code_llvm test2(1_000_000,g1)

define double @julia_test2_68608(i64) #0 {
top:
  %1 = icmp slt i64 %0, 1
  br i1 %1, label %L2, label %if.preheader

if.preheader:                                     ; preds = %top
  br label %if

L2.loopexit:                                      ; preds = %if
  br label %L2

L2:                                               ; preds = %L2.loopexit, %top
  %s.0.lcssa = phi double [ 0.000000e+00, %top ], [ %3, %L2.loopexit ]
  ret double %s.0.lcssa

if:                                               ; preds = %if.preheader, %if
  %s.04 = phi double [ %3, %if ], [ 0.000000e+00, %if.preheader ]
  %"#temp#.03" = phi i64 [ %2, %if ], [ 1, %if.preheader ]
  %2 = add i64 %"#temp#.03", 1
  %3 = fadd double %s.04, 0x3F8BAD7BCA428034
  %4 = icmp eq i64 %"#temp#.03", %0
  br i1 %4, label %L2.loopexit, label %if
}

However, the generated native code is *not* identical, which does indeed look very strange.

On Tuesday, August 2, 2016 at 12:33:24 PM UTC+8, Christoph Ortner wrote:
>
> Below are two tests, in the first a simple polynomial is "hard-coded", in
> the second it is passed as a function. I would expect the two to be
> equivalent, but the second case is significantly faster. Can anybody
> explain what is going on? @code_warntype doesn't show anything that would
> explain it?
>
> function test1(N)
>     r = 0.234; s = 0.0
>     for n = 1:N
>         s += r^3 + r^5
>     end
>     return s
> end
>
>
> function test2(N, f1)
>     r = 0.234; s = 0.0
>     for n = 1:N
>         s += f1(r)
>     end
>     return s
> end
>
>
> g1(r) = r^3 + r^5
>
>
> test1(10)
> test2(10, g1)
>
>
> println("Test1: hard-coded functions")
> @time test1(1_000_000)
> @time test1(1_000_000)
> @time test1(1_000_000)
>
>
> println("Test2: pass functions")
> @time test2(1_000_000, g1)
> @time test2(1_000_000, g1)
> @time test2(1_000_000, g1)
>
>
> # $ julia5 weird_test2.jl
> # Test1: hard-coded functions
> #   0.086683 seconds (4.00 M allocations: 61.043 MB, 50.75% gc time)
> #   0.142487 seconds (4.00 M allocations: 61.035 MB, 76.91% gc time)
> #   0.025388 seconds (4.00 M allocations: 61.035 MB, 4.28% gc time)
> # Test2: pass functions
> #   0.000912 seconds (5 allocations: 176 bytes)
> #   0.000860 seconds (5 allocations: 176 bytes)
> #   0.000846 seconds (5 allocations: 176 bytes)
>
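
For anyone who wants to repeat the comparison, here is a minimal sketch of how the LLVM and native output of the two functions could be captured as strings and put side by side. It assumes a recent Julia 1.x (where `code_llvm` and `code_native` live in `InteractiveUtils` and accept an IO as first argument), so the results need not match what 0.5.0-rc0 shows. The helper names `llvm_of` and `native_of` are made up for illustration and are not part of any library.

```julia
using InteractiveUtils  # code_llvm / code_native outside the REPL (Julia 1.x assumption)

# The two functions from the quoted message.
function test1(N)
    r = 0.234; s = 0.0
    for n = 1:N
        s += r^3 + r^5
    end
    return s
end

function test2(N, f1)
    r = 0.234; s = 0.0
    for n = 1:N
        s += f1(r)
    end
    return s
end

g1(r) = r^3 + r^5

# Hypothetical helpers: capture the compiler output as strings instead of printing it.
llvm_of(f, types)   = sprint(io -> code_llvm(io, f, types))
native_of(f, types) = sprint(io -> code_native(io, f, types))

llvm1   = llvm_of(test1, (Int,))
llvm2   = llvm_of(test2, (Int, typeof(g1)))
native1 = native_of(test1, (Int,))
native2 = native_of(test2, (Int, typeof(g1)))

# Function names and debug comments differ between the two, so a plain string
# equality check is too strict; print the sizes here and diff the text by eye.
println("LLVM IR length:     ", length(llvm1), " vs ", length(llvm2))
println("native code length: ", length(native1), " vs ", length(native2))

# Warm up first so that compilation is not counted, then measure allocations.
test1(10); test2(10, g1)
println("test1 allocated: ", @allocated(test1(1_000_000)), " bytes")
println("test2 allocated: ", @allocated(test2(1_000_000, g1)), " bytes")
```

Writing the four strings to files and running a diff tool on them is probably the easiest way to see where the native code of the two versions diverges.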