Actually, the performance drop there is just because of the ever-looming threat of global variable inefficiency :P
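To make the global-variable point concrete, here is a minimal sketch. It is not from the thread: it uses Julia 1.x syntax (`struct` rather than the 2016-era `type`) and does not use SelfFunctions at all. The names MyType, t_global, t_const, read_global, read_const and read_arg are made up for illustration. It asks inference for the return type of each function through Base.return_types, which is a reflection helper rather than a benchmarking tool:

```julia
# Hypothetical stand-in for the thread's `mytype`, in Julia 1.x syntax.
struct MyType
    x::Float64
end

t_global = MyType(0.0)        # non-const global: its type could change at any time
const t_const = MyType(0.0)   # const global: its type is fixed

# Reads the non-const global, so inference cannot assume a type for it.
read_global() = 1 + t_global.x

# Reads the const global, so the field type is known.
read_const() = 1 + t_const.x

# Receives the value as an argument; the method is compiled for MyType.
read_arg(t::MyType) = 1 + t.x

println(first(Base.return_types(read_global, Tuple{})))      # Any
println(first(Base.return_types(read_const, Tuple{})))       # Float64
println(first(Base.return_types(read_arg, Tuple{MyType})))   # Float64
```

An inferred `Any` is the usual sign that the call will go through dynamic dispatch at run time, which is the kind of overhead a benchmark run against a non-const global tends to pick up.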
Because the variable t is global, f1 can't optimize for its type. But f2 has a type declared on its argument, so it can. Remove the ::mytype on f2 or make t a const and you'll see no significant performance difference. I went ahead and added the type declaration to @selftype anyway, because why not.

On Sunday, July 24, 2016 at 11:58:27 UTC-3, Marius Millea wrote:
>
> I think you are right btw, the compiler got rid of the wrapper function
> for the "+" call, since all I see above is Base.add_float.
>
> On Sun, Jul 24, 2016 at 4:55 PM, Marius Millea <[email protected]> wrote:
>
>> Here's my very simple test case. I will also try on my actual code.
>>
>> using SelfFunctions
>> using TimeIt
>>
>> @selftype self type mytype
>>     x::Float64
>> end
>>
>> t = mytype(0)
>>
>>
>> # Test @self'ed version:
>>
>> @self @inline function f1()
>>     1+x
>> end
>>
>> @timeit f1(t)
>> println(@code_warntype(f1(t)))
>>
>> # 1000000 loops, best of 3: 100.64 ns per loop
>> # Variables:
>> #   sf::SelfFunctions.SelfFunction{###f1_selfimpl#271}
>> #   args::Tuple{mytype}
>> #
>> # Body:
>> #   begin
>> #       # meta: location /home/marius/workspace/selffunctions/test.jl ##f1_selfimpl#271 11
>> #       SSAValue(1) = (Core.getfield)((Core.getfield)(args::Tuple{mytype},1)::mytype,:x)::Float64
>> #       # meta: pop location
>> #       return (Base.box)(Base.Float64,(Base.add_float)((Base.box)(Float64,(Base.sitofp)(Float64,1)),SSAValue(1)))
>> #   end::Float64
>>
>>
>> # Test non-@self'ed version:
>>
>> @inline function f2(t::mytype)
>>     1+t.x
>> end
>>
>> @timeit f2(t)
>> println(@code_warntype(f2(t)))
>>
>> # 10000000 loops, best of 3: 80.13 ns per loop
>> # Variables:
>> #   #self#::#f2
>> #   t::mytype
>> #
>> # Body:
>> #   begin
>> #       return (Base.box)(Base.Float64,(Base.add_float)((Base.box)(Float64,(Base.sitofp)(Float64,1)),(Core.getfield)(t::mytype,:x)::Float64))
>> #   end::Float64
>> # nothing
>>
>>
>> I'm not sure if it's the creation of the SSAValue intermediate value or
>> the extra getfield lookup, but you can see it slows down from ~80 to
>> ~100ns.
>>
>>
>> Marius
>>
>>
>> On Sunday, July 24, 2016 at 3:52:38 PM UTC+2, Fábio Cardeal wrote:
>>>
>>> The compiler is pretty smart about removing these extra function calls,
>>> so I didn't get any extra overhead on my test cases. I went ahead and added
>>> `@inline` to the selfcall declarations. You can also do this:
>>>
>>> @self @inline function inc2()
>>>     inc()
>>>     inc()
>>> end
>>>
>>> Update from the gist and try using some @inlines and see if it helps.
>>> You can also send me your test cases if you want.
>>>
>>> In general, these techniques of adding and using compile-time
>>> information shouldn't cause any definite slowdown, even if we need to do
>>> some tweaking with the meta tags. The compiler isn't perfect about this
>>> yet, but I think our case is covered. (I hope?)
