I think even powers are treated differently: for example, if I replace `r^3 + r^5` with `r^2*r + r^2^2*r`, then I get the expected performance.
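
For anyone who wants to try the replacement described above, here is a minimal sketch. The function names `sum_pow` and `sum_mul` are made up for this example and do not appear elsewhere in the thread; the loop body is the one from `test1` quoted below. Note that `r^2^2` parses as `r^(2^2)`, i.e. `r^4`, so both versions compute the same polynomial up to floating-point rounding.

```julia
# Hypothetical names, for illustration only.

# Odd powers written directly, as in test1 from the thread.
function sum_pow(N)
    r = 0.234; s = 0.0
    for n = 1:N
        s += r^3 + r^5
    end
    return s
end

# Same polynomial using only even powers and multiplication.
# r^2^2 is r^(2^2) == r^4, so r^2^2*r == r^5.
function sum_mul(N)
    r = 0.234; s = 0.0
    for n = 1:N
        s += r^2*r + r^2^2*r
    end
    return s
end

# Compile both before timing.
sum_pow(10)
sum_mul(10)

@time sum_pow(1_000_000)
@time sum_mul(1_000_000)
```

Whether a gap shows up will depend on the Julia version; the timings quoted in this thread are from 0.5.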

On Tuesday, 2 August 2016 13:29:24 UTC+1, Mauro wrote:
>
> On Tue, 2016-08-02 at 14:07, Mauro wrote:
> > Oh, I see, this is in this issue:
> > https://github.com/JuliaLang/julia/issues/17751
>
> It's not, after all, sorry! (a bit related though)
>
> > And another weird thing, this seems to be dependent on using x^3. Running:
> >
> > function test0(N)
> >     r = 0.234
> >     s = 0.0
> >     for n = 1:N
> >         s += r^3
> >     end
> >     s
> > end
> > #test0(BigInt(10))
> > test0(10)
> > @time test0(1_000_000)
> > @time test0(1_000_000)
> > @time test0(1_000_000)
> >
> > gives:
> >   0.130325 seconds (4.00 M allocations: 61.043 MB, 65.15% gc time)
> >   0.198170 seconds (4.00 M allocations: 61.035 MB, 79.65% gc time)
> >   0.036850 seconds (4.00 M allocations: 61.035 MB, 4.88% gc time)
> >
> > whereas using the 4-th power:
> >
> >         s += r^4
> >
> > gives:
> >   0.001198 seconds (134 allocations: 7.719 KB)
> >   0.001188 seconds (5 allocations: 176 bytes)
> >   0.000886 seconds (5 allocations: 176 bytes)
> >
> > On Tue, 2016-08-02 at 10:52, Mauro wrote:
> >> Yes, test2 is slower on 0.4 (expected), test1 is slower on 0.5 (weird)
> >> on my Linux (Intel) machine. Christoph is right, in 0.5 these should
> >> perform the same. And certainly, no function should be 1000x slower on
> >> 0.5 than on 0.4.
> >>
> >> A quick search did not turn up a bug report:
> >> https://github.com/JuliaLang/julia/issues?q=is%3Aopen+is%3Aissue+label%3Aregression+label%3Aperformance
> >> Please file one!
> >>
> >> Other weird thing is that (on 0.5) @code_warntype and @code_llvm return
> >> both exactly the same code. However, @code_native is very short for
> >> test2 and long for test1.
> >>
> >> Weirder: if I re-include the file, then both functions run fast and then
> >> the @code_native. Or, running below file, I get equal results:
> >>
> >> function test1(N)
> >>     r = 0.234; s = 0.0
> >>     for n = 1:N
> >>         s += r^3 + r^5
> >>     end
> >>     return s
> >> end
> >>
> >> function test2(N, f1)
> >>     r = 0.234; s = 0.0
> >>     for n = 1:N
> >>         s += f1(r)
> >>     end
> >>     return s
> >> end
> >>
> >> g1(r) = r^3 + r^5
> >>
> >> test1(BigInt(10)) # <--- force one compilation for something else than Int
> >> test1(10)         # then compilation for Int produces a fast function
> >> test2(10, g1)
> >> @time 1 # warm-up @time itself
> >>
> >> println("Test1: hard-coded functions")
> >> @time test1(1_000_000)
> >> @time test1(1_000_000)
> >> @time test1(1_000_000)
> >>
> >> println("Test2: pass functions")
> >> @time test2(1_000_000, g1)
> >> @time test2(1_000_000, g1)
> >> @time test2(1_000_000, g1)
> >>
> >> So, it seems that the first compilation is messed up but subsequent
> >> compilations are fine.
> >>
> >> On Tue, 2016-08-02 at 07:01, Eric Forgy wrote:
> >>> I still don't understand the details of the new functions in v0.5, but I'd
> >>> be inclined to think this test depends on whether you're on v0.4.6 or
> >>> v0.5.0.
> >>>
> >>> On Tuesday, August 2, 2016 at 12:53:27 PM UTC+8, Greg Plowman wrote:
> >>>>
> >>>> I get timing/allocations the other way around. (test1, hard-coded version
> >>>> is fast without allocation)
> >>>> @code_warntype for test2 shows type-instability for s (because return type
> >>>> cannot be inferred for f1)
> >>>>
> >>>> On Tuesday, August 2, 2016 at 2:33:24 PM UTC+10, Christoph Ortner wrote:
> >>>>
> >>>>> Below are two tests, in the first a simple polynomial is "hard-coded", in
> >>>>> the second it is passed as a function. I would expect the two to be
> >>>>> equivalent, but the second case is significantly faster. Can anybody
> >>>>> explain what is going on? @code_warntype doesn't show anything that would
> >>>>> explain it?
> >>>>>
> >>>>> function test1(N)
> >>>>>     r = 0.234; s = 0.0
> >>>>>     for n = 1:N
> >>>>>         s += r^3 + r^5
> >>>>>     end
> >>>>>     return s
> >>>>> end
> >>>>>
> >>>>> function test2(N, f1)
> >>>>>     r = 0.234; s = 0.0
> >>>>>     for n = 1:N
> >>>>>         s += f1(r)
> >>>>>     end
> >>>>>     return s
> >>>>> end
> >>>>>
> >>>>> g1(r) = r^3 + r^5
> >>>>>
> >>>>> test1(10)
> >>>>> test2(10, g1)
> >>>>>
> >>>>> println("Test1: hard-coded functions")
> >>>>> @time test1(1_000_000)
> >>>>> @time test1(1_000_000)
> >>>>> @time test1(1_000_000)
> >>>>>
> >>>>> println("Test2: pass functions")
> >>>>> @time test2(1_000_000, g1)
> >>>>> @time test2(1_000_000, g1)
> >>>>> @time test2(1_000_000, g1)
> >>>>>
> >>>>> # $ julia5 weird_test2.jl
> >>>>> # Test1: hard-coded functions
> >>>>> #   0.086683 seconds (4.00 M allocations: 61.043 MB, 50.75% gc time)
> >>>>> #   0.142487 seconds (4.00 M allocations: 61.035 MB, 76.91% gc time)
> >>>>> #   0.025388 seconds (4.00 M allocations: 61.035 MB, 4.28% gc time)
> >>>>> # Test2: pass functions
> >>>>> #   0.000912 seconds (5 allocations: 176 bytes)
> >>>>> #   0.000860 seconds (5 allocations: 176 bytes)
> >>>>> #   0.000846 seconds (5 allocations: 176 bytes)
