I'm wondering if there are any general rules out there in Julia land 
concerning when it is better to *parametrize* a function on a type versus 
making a more generic function that uses typeof, eltype, as needed. I've 
done some initial cursory examination, which you can see below if you're 
interested. Based on this little bit of investigation, it does seem like 
the code generation does effectively inline type calls when the types are 
fully known, which means that in many cases, the code generated by 
parameterized functions will be identical to their generic siblings. 
Nevertheless, I did reveal some small differences, so I'm wondering if 
there are cases where those differences are worth considering when making 
implementation choices.

Any thoughts, whether derived from experience, knowledge of Julia 
internals, or wild-ass guessing, are appreciated.

Regards,
Michael Grant

-------

One very simple example is the definition of eltype in number.jl:
eltype(x::Number)=typeof(x)
One could also implement the same functionality as follows:
eltype{T<:Number}(x::T)=T
Which one is better? Well, if I'm understanding this little test 
properly---neither!
eltype2{T<:Number}(x::T) = T
eltype3(x::Number) = typeof(x)
code_llvm(eltype2,(Complex{Float32},))
code_llvm(eltype3,(Complex{Float32},))
The output of code_llvm looks to be identical in both cases:
define %jl_value_t* @"julia_eltype3;39061"(%Complex.4) {
top:
  ret %jl_value_t* inttoptr (i64 140332346166688 to %jl_value_t*), !dbg !2208
}
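For anyone reproducing this on a current release, here is a minimal sketch of the same comparison. It assumes Julia 1.x, where the f{T}(x::T) form used above is written with a where clause; the function names are the ones from this post. The IR it prints will not match the dump above character for character.

```julia
using InteractiveUtils  # provides code_llvm outside the REPL

# Assumption: Julia 1.x syntax; the post's eltype2{T<:Number}(x::T) becomes a where clause.
eltype2(x::T) where {T<:Number} = T
eltype3(x::Number) = typeof(x)

x = Complex{Float32}(1, 2)

# Both definitions return the concrete type of the argument.
@assert eltype2(x) === Complex{Float32}
@assert eltype3(x) === Complex{Float32}

# Print the LLVM IR of each specialization so the two can be compared by eye.
code_llvm(stdout, eltype2, (Complex{Float32},))
code_llvm(stdout, eltype3, (Complex{Float32},))
```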
Great! Now, let's try a slightly more complex example, inspired by the sign 
function:
sign2{T<:Real}(x::T) = ifelse(x < 0, oftype(T,-1), ifelse(x > 0, one(T), x))
sign3(x::Real) = ifelse(x < 0, oftype(x,-1), ifelse(x > 0, one(x), x))
code_llvm(sign2,(Float64,))
code_llvm(sign3,(Float64,))
Once again, identical code:
define double @"julia_sign3;39079"(double) {
top:
  %1 = fcmp uge double %0, 0.000000e+00, !dbg !2208
  %2 = fcmp oge double %0, 0x43E0000000000000, !dbg !2208
  %3 = fcmp ogt double %0, 0.000000e+00, !dbg !2208
  %4 = or i1 %2, %3, !dbg !2208
  %5 = select i1 %4, double 1.000000e+00, double %0, !dbg !2208
  %6 = select i1 %1, double %5, double -1.000000e+00, !dbg !2208
  ret double %6, !dbg !2208
}
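A quick way to check that the two forms also agree in behaviour, not just in generated code, is sketched below. It assumes Julia 1.x syntax, and it uses convert(T, -1) in place of oftype(T, -1), since on current releases oftype expects a value rather than a type as its first argument. Only signed inputs are tried, because ifelse evaluates both branches and converting -1 to an unsigned type would throw.

```julia
# Assumption: Julia 1.x syntax; convert(T, -1) stands in for the post's oftype(T, -1).
sign2(x::T) where {T<:Real} = ifelse(x < 0, convert(T, -1), ifelse(x > 0, one(T), x))
sign3(x::Real) = ifelse(x < 0, oftype(x, -1), ifelse(x > 0, one(x), x))

# The two forms should agree in both value and type for a few signed inputs.
for x in (-2.5, 0.0, 3.0, -2.5f0, -7, 0, 4)
    @assert sign2(x) === sign3(x)
end
```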
Win! And yet, both of these examples rely on the fact that types are fully 
resolved. What if they aren't? I can look at a generic version of eltype3 by 
typing
code_llvm(eltype3,(Number,))
but I can't do
code_llvm(eltype2,(Number,))
because no such definition exists. So suppose I create a new wrapper 
function:
eltype4(x::Number) = eltype2(x)
so that I can see what kind of generic dispatch code will be generated for 
eltype2 when necessary. Now, different code is indeed generated. Here's the 
resulting code for eltype3:
define %jl_value_t* @"julia_eltype3;39082"(%jl_value_t*, %jl_value_t**, i32) {
top:
  %3 = alloca [3 x %jl_value_t*], align 8
  %.sub = getelementptr inbounds [3 x %jl_value_t*]* %3, i64 0, i64 0
  %4 = getelementptr [3 x %jl_value_t*]* %3, i64 0, i64 2, !dbg !2208
  store %jl_value_t* inttoptr (i64 2 to %jl_value_t*), %jl_value_t** %.sub, align 8
  %5 = load %jl_value_t*** @jl_pgcstack, align 8, !dbg !2208
  %6 = getelementptr [3 x %jl_value_t*]* %3, i64 0, i64 1, !dbg !2208
  %.c = bitcast %jl_value_t** %5 to %jl_value_t*, !dbg !2208
  store %jl_value_t* %.c, %jl_value_t** %6, align 8, !dbg !2208
  store %jl_value_t** %.sub, %jl_value_t*** @jl_pgcstack, align 8, !dbg !2208
  store %jl_value_t* null, %jl_value_t** %4, align 8
  %7 = icmp eq i32 %2, 1, !dbg !2208
  br i1 %7, label %ifcont, label %else, !dbg !2208


else:                                             ; preds = %top
  call void @jl_error(i8* getelementptr inbounds ([26 x i8]* @_j_str0, i64 0, i64 0)), !dbg !2208
  unreachable, !dbg !2208


ifcont:                                           ; preds = %top
  %8 = load %jl_value_t** %1, align 8, !dbg !2208
  %9 = load %jl_value_t** inttoptr (i64 140332374695360 to %jl_value_t**), align 64, !dbg !2209
  %10 = getelementptr inbounds %jl_value_t* %9, i64 1, i32 0, !dbg !2209
  %11 = load %jl_value_t** %10, align 8, !dbg !2209
  %12 = bitcast %jl_value_t* %11 to %jl_value_t* (%jl_value_t*, %jl_value_t**, i32)*, !dbg !2209
  store %jl_value_t* %8, %jl_value_t** %4, align 8, !dbg !2209
  %13 = call %jl_value_t* %12(%jl_value_t* %9, %jl_value_t** %4, i32 1), !dbg !2209
  %14 = load %jl_value_t** %6, align 8, !dbg !2209
  %15 = getelementptr inbounds %jl_value_t* %14, i64 0, i32 0, !dbg !2209
  store %jl_value_t** %15, %jl_value_t*** @jl_pgcstack, align 8, !dbg !2209
  ret %jl_value_t* %13, !dbg !2209
}
The code for eltype4 is identical until we get to ifcont:
ifcont:                                           ; preds = %top
  %8 = load %jl_value_t** %1, align 8, !dbg !2208
  store %jl_value_t* %8, %jl_value_t** %4, align 8, !dbg !2209
  %9 = call %jl_value_t* @jl_apply_generic(%jl_value_t* inttoptr (i64 140332531469120 to %jl_value_t*), %jl_value_t** %4, i32 1), !dbg !2209
  %10 = load %jl_value_t** %6, align 8, !dbg !2209
  %11 = getelementptr inbounds %jl_value_t* %10, i64 0, i32 0, !dbg !2209
  store %jl_value_t** %11, %jl_value_t*** @jl_pgcstack, align 8, !dbg !2209
  ret %jl_value_t* %9, !dbg !2209
}
Now we see the invocation of @jl_apply_generic to handle the dispatching of 
eltype2, I assume. Looking at the native code, the differences are also 
minor, though of course there are a couple of calls I don't understand.
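The situation where the argument type is only known to be a Number can be reproduced without reading IR. The sketch below assumes Julia 1.x syntax and stores values in a container with an abstract element type, so the concrete type of each element is only known at run time. The container xs is a hypothetical example, not something from the tests above.

```julia
# Assumption: Julia 1.x syntax for the post's eltype2 and its wrapper eltype4.
eltype2(x::T) where {T<:Number} = T
eltype4(x::Number) = eltype2(x)

# Number[...] has an abstract element type, so the compiler cannot know each
# element's concrete type ahead of time.
xs = Number[1, 2.0f0, 3.0 + 4.0im]
@assert !isconcretetype(eltype(xs))

# Each call still returns the concrete type of the value it was given.
types = [eltype4(x) for x in xs]
@assert types == [Int, Float32, ComplexF64]
```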

From looks alone, I see no reason to strongly prefer one form over the 
other.
