Here's my very simple test case. I will also try it on my actual code.
using SelfFunctions
using TimeIt
@selftype self type mytype
    x::Float64
end
t = mytype(0)
# Test @self'ed version:
@self @inline function f1()
    1+x
end
@timeit f1(t)
println(@code_warntype(f1(t)))
# 1000000 loops, best of 3: 100.64 ns per loop
# Variables:
# sf::SelfFunctions.SelfFunction{###f1_selfimpl#271}
# args::Tuple{mytype}
#
# Body:
# begin
# # meta: location /home/marius/workspace/selffunctions/test.jl ##f1_selfimpl#271 11
# SSAValue(1) = (Core.getfield)((Core.getfield)(args::Tuple{mytype},1)::mytype,:x)::Float64
# # meta: pop location
# return (Base.box)(Base.Float64,(Base.add_float)((Base.box)(Float64,(Base.sitofp)(Float64,1)),SSAValue(1)))
# end::Float64
# Test non-@self'ed version:
@inline function f2(t::mytype)
    1+t.x
end
@timeit f2(t)
println(@code_warntype(f2(t)))
# 10000000 loops, best of 3: 80.13 ns per loop
# Variables:
# #self#::#f2
# t::mytype
#
# Body:
# begin
# return (Base.box)(Base.Float64,(Base.add_float)((Base.box)(Float64,(Base.sitofp)(Float64,1)),(Core.getfield)(t::mytype,:x)::Float64))
# end::Float64
# nothing
I'm not sure if it's the creation of the SSAValue intermediate value or the
extra getfield lookup, but you can see it slows down from ~80 ns to ~100 ns per call.
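
One caveat about timings like these: `t` is a non-constant global, so a call made from global scope may be dominated by dynamic dispatch instead of the function body. A possible cross-check is to time many calls from inside a function, where the argument type is known to the compiler. The sketch below is only an illustration under assumptions: it is written for a current Julia release (where `struct` replaces `type`), it covers only the plain `f2` variant, and `MyType` and `call_many` are hypothetical names that do not come from the test above.

```julia
# Hypothetical stand-in for `mytype`; `struct` is the current spelling of `type`.
struct MyType
    x::Float64
end

# Same body as f2 in the test case.
@inline f2(t::MyType) = 1 + t.x

# Call f2 n times from inside a function so the argument type is concrete.
# Accumulating the results keeps the compiler from discarding the calls.
function call_many(t::MyType, n::Int)
    acc = 0.0
    for _ in 1:n
        acc += f2(t)
    end
    return acc
end

t = MyType(0)
call_many(t, 1)                      # warm-up call so compilation is not timed
n = 10_000_000
elapsed = @elapsed call_many(t, n)   # total seconds for n calls
println("approx. ", 1e9 * elapsed / n, " ns per call")
```

If the per-call figure from a loop like this is far below the ~80 ns reported above, the 20 ns gap between the two versions may be measurement overhead and not a cost of the `@self` wrapper itself.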
Marius
On Sunday, July 24, 2016 at 3:52:38 PM UTC+2, Fábio Cardeal wrote:
>
> The compiler is pretty smart about removing these extra function calls, so
> I didn't get any extra overhead on my test cases. I went ahead and added
> `@inline` to the selfcall declarations. You can also do this:
>
> @self @inline function inc2()
>     inc()
>     inc()
> end
>
> Update from the gist and try using some @inlines and see if it helps. You
> can also send me your test cases if you want.
>
> In general, these techniques of adding and using compile time information
> shouldn't cause any definite slowdown, even if we need to do some tweaking
> with the meta tags. The compiler isn't perfect about this yet, but I think
> our case is covered. (I hope?)
>