There is no runtime overhead in cases where the types are known at compile
time. In terms of performance, the key thing to understand about Julia is:
* The first time you call a function f(x,y) with a given set of argument
types, Julia compiles a new version of f that is specialized (partially
evaluated) for those argument types (which are propagated as far as
possible by type inference). Functions called within f with known types
(via type inference) are compiled recursively as needed, and possibly
inlined.
* This compiled version is cached and re-used for subsequent calls with the
same argument types.
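These two points can be observed directly. A rough sketch (timings are illustrative only and depend on your machine and Julia version; the function name f mirrors the example below):

```julia
# First call: Julia compiles a specialization of f for (Int, Int).
f(x, y) = x + y

@time f(3, 4)      # first call with Ints: includes compilation time
@time f(7, 12)     # same argument types: reuses the cached code, much faster
@time f(3.5, 2.7)  # new argument types: compiles a (Float64, Float64) version
```

The first `@time` for each new type signature is dominated by compilation; subsequent calls with the same types run the cached machine code.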
So, for example, consider f(x, y) = x + y, where we know that + is
heavily overloaded (it has many methods). If you call f(3,4), Julia
compiles a specialized version of f(x,y) specifically for Int arguments.
The dispatch for + is determined at compile time, and in fact the + call
is inlined. The resulting machine code is essentially equivalent to what
a C compiler would produce from
int64_t f(int64_t x, int64_t y) { return x + y; }, as can be verified by
running code_native(f, (Int,Int)), which produces something like:
    push RBP
    mov  RBP, RSP
    add  RDI, RSI
    mov  RAX, RDI
    pop  RBP
    ret
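You can inspect this compile-time resolution yourself. A sketch (the exact IR and assembly vary by Julia version and CPU):

```julia
f(x, y) = x + y

# The type-inferred, optimized IR: the call to + has been resolved
# statically to the integer-add intrinsic for Int arguments.
code_typed(f, (Int, Int))

# The generated machine code (roughly the assembly shown above):
code_native(f, (Int, Int))
```

The macro forms `@code_typed f(3, 4)` and `@code_native f(3, 4)` are equivalent conveniences that infer the argument types from a call expression.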
The next time you call f with integer arguments, e.g. f(7,12), this
compiled code is re-used. If you call f with a different set of argument
types, e.g. f(3.5, 2.7), then Julia compiles a different version of f
specialized for those argument types. At this point, Julia has both the
(Int,Int) and the (Float64,Float64) compiled versions cached.
Similarly, if you define another function g(x,y) = x + f(x,y) and then call
g(3,4), the compiled version of g will call the precompiled f(Int,Int) code
[or actually it will probably inline it], with no runtime dispatch.
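A sketch of that last point (the inlining claim can be checked by inspecting the typed IR; the exact output varies by Julia version):

```julia
f(x, y) = x + y
g(x, y) = x + f(x, y)

g(3, 4)  # == 3 + (3 + 4) == 10

# The typed IR for g typically shows f's body inlined directly,
# reduced to two integer adds with no runtime dispatch:
code_typed(g, (Int, Int))
```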
--SGJ