On Tue, 2016-08-02 at 14:07, Mauro wrote:
> Oh, I see, this is in this issue:
> https://github.com/JuliaLang/julia/issues/17751
It's not in that issue after all, sorry! (It is a bit related, though.)
And another weird thing: this seems to depend on using x^3. Running:
function test0(N)
    r = 0.234
    s = 0.0
    for n = 1:N
        s += r^3
    end
    s
end
#test0(BigInt(10))
test0(10)
@time test0(1_000_000)
@time test0(1_000_000)
@time test0(1_000_000)
gives:
0.130325 seconds (4.00 M allocations: 61.043 MB, 65.15% gc time)
0.198170 seconds (4.00 M allocations: 61.035 MB, 79.65% gc time)
0.036850 seconds (4.00 M allocations: 61.035 MB, 4.88% gc time)
whereas using the 4th power instead:
s += r^4
gives:
0.001198 seconds (134 allocations: 7.719 KB)
0.001188 seconds (5 allocations: 176 bytes)
0.000886 seconds (5 allocations: 176 bytes)
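
One possible way to narrow this down, as a sketch only: run the r^3 loop next to the same loop with the power written out as r*r*r, and compare the bytes allocated. The names test0_pow and test0_mul are made up for this illustration, and how either variant behaves on 0.4 or 0.5 has not been checked here.

```julia
# Hypothetical variants of test0; the names test0_pow and test0_mul
# are invented for this sketch and do not appear in the thread.
function test0_pow(N)
    r = 0.234
    s = 0.0
    for n = 1:N
        s += r^3          # the power call under suspicion
    end
    return s
end

function test0_mul(N)
    r = 0.234
    s = 0.0
    for n = 1:N
        s += r * r * r    # same quantity up to rounding, without calling ^
    end
    return s
end

# Compile both for Int before measuring.
test0_pow(10)
test0_mul(10)

# @allocated reports the number of bytes allocated by the call.
println("r^3:   ", @allocated test0_pow(1_000_000))
println("r*r*r: ", @allocated test0_mul(1_000_000))
```

If only the first number is large, that would point at the ^ call rather than at the loop or the accumulation into s.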
> On Tue, 2016-08-02 at 10:52, Mauro wrote:
>> Yes, test2 is slower on 0.4 (expected), test1 is slower on 0.5 (weird)
>> on my Linux (Intel) machine. Christoph is right, in 0.5 these should
>> perform the same. And certainly, no function should be 1000x slower on
>> 0.5 than on 0.4.
>>
>> A quick search did not turn up a bug report:
>> https://github.com/JuliaLang/julia/issues?q=is%3Aopen+is%3Aissue+label%3Aregression+label%3Aperformance
>> Please file one!
>>
>> Another weird thing is that (on 0.5) @code_warntype and @code_llvm both
>> return exactly the same code. However, @code_native is very short for
>> test2 and long for test1.
>>
>> Weirder: if I re-include the file, then both functions run fast and then
>> the @code_native. Or, running the file below, I get equal results:
>>
>> function test1(N)
>>     r = 0.234; s = 0.0
>>     for n = 1:N
>>         s += r^3 + r^5
>>     end
>>     return s
>> end
>>
>>
>> function test2(N, f1)
>>     r = 0.234; s = 0.0
>>     for n = 1:N
>>         s += f1(r)
>>     end
>>     return s
>> end
>>
>> g1(r) = r^3 + r^5
>>
>>
>> test1(BigInt(10)) # <--- force one compilation for something other than Int
>> test1(10) # then compilation for Int produces a fast function
>> test2(10, g1)
>> @time 1 # warm-up @time itself
>>
>> println("Test1: hard-coded functions")
>> @time test1(1_000_000)
>> @time test1(1_000_000)
>> @time test1(1_000_000)
>>
>> println("Test2: pass functions")
>> @time test2(1_000_000, g1)
>> @time test2(1_000_000, g1)
>> @time test2(1_000_000, g1)
>>
>> So, it seems that the first compilation is messed up but subsequent
>> compilations are fine.
>>
>> On Tue, 2016-08-02 at 07:01, Eric Forgy wrote:
>>> I still don't understand the details of the new functions in v0.5, but I'd
>>> be inclined to think this test depends on whether you're on v0.4.6 or
>>> v0.5.0.
>>>
>>> On Tuesday, August 2, 2016 at 12:53:27 PM UTC+8, Greg Plowman wrote:
>>>>
>>>> I get timing/allocations the other way around. (test1, hard-coded version
>>>> is fast without allocation)
>>>> @code_warntype for test2 shows type-instability for s (because return type
>>>> cannot be inferred for f1)
>>>>
>>>>
>>>> On Tuesday, August 2, 2016 at 2:33:24 PM UTC+10, Christoph Ortner wrote:
>>>>
>>>>> Below are two tests, in the first a simple polynomial is "hard-coded", in
>>>>> the second it is passed as a function. I would expect the two to be
>>>>> equivalent, but the second case is significantly faster. Can anybody
>>>>> explain what is going on? @code_warntype doesn't show anything that would
>>>>> explain it?
>>>>>
>>>>> function test1(N)
>>>>>     r = 0.234; s = 0.0
>>>>>     for n = 1:N
>>>>>         s += r^3 + r^5
>>>>>     end
>>>>>     return s
>>>>> end
>>>>>
>>>>>
>>>>> function test2(N, f1)
>>>>>     r = 0.234; s = 0.0
>>>>>     for n = 1:N
>>>>>         s += f1(r)
>>>>>     end
>>>>>     return s
>>>>> end
>>>>>
>>>>>
>>>>> g1(r) = r^3 + r^5
>>>>>
>>>>>
>>>>> test1(10)
>>>>> test2(10, g1)
>>>>>
>>>>>
>>>>> println("Test1: hard-coded functions")
>>>>> @time test1(1_000_000)
>>>>> @time test1(1_000_000)
>>>>> @time test1(1_000_000)
>>>>>
>>>>>
>>>>> println("Test2: pass functions")
>>>>> @time test2(1_000_000, g1)
>>>>> @time test2(1_000_000, g1)
>>>>> @time test2(1_000_000, g1)
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> # $ julia5 weird_test2.jl
>>>>> # Test1: hard-coded functions
>>>>> # 0.086683 seconds (4.00 M allocations: 61.043 MB, 50.75% gc time)
>>>>> # 0.142487 seconds (4.00 M allocations: 61.035 MB, 76.91% gc time)
>>>>> # 0.025388 seconds (4.00 M allocations: 61.035 MB, 4.28% gc time)
>>>>> # Test2: pass functions
>>>>> # 0.000912 seconds (5 allocations: 176 bytes)
>>>>> # 0.000860 seconds (5 allocations: 176 bytes)
>>>>> # 0.000846 seconds (5 allocations: 176 bytes)
>>>>>