Hi!
I'm looking for advice on the best way to use keyword arguments, given
that type specialization and dispatch do not happen for them.
The use case is writing a contingency table function, but I think this
is a fairly standard situation. The method takes varargs holding the
vectors from which to build the (cross-)classification, plus an
optional vector of weights and, for convenience, a vector indicating
which subset of the vectors' elements the function should operate on
(this is not strictly necessary, but it avoids repeatedly indexing the
vectors and weights with the same indices).
At the moment, the method I've written looks like this [1]:
    function freqtable(x::AbstractVector...;
                       weights::Union(Nothing, AbstractVector{Float64}) = nothing,
                       subset::Union(Nothing, AbstractVector{Int},
                                     AbstractVector{Bool}) = nothing)
Two issues are bothering me:
1) Since there's no type specialization for keyword arguments, the inner
loop walking over the input vectors is not as optimized as it could be.
I can use if branches so that the basic case, which does not depend on
weights or subset, has its own dedicated, faster loop; but that creates
a lot of code duplication, and still does not make the other cases fast.
2) Even if there were type specialization for keyword arguments, the
subset argument would not be easy to handle cleanly. Ideally, I would
define two methods like this (omitting types):
    function freqtable(x...; weights)
    function freqtable(x...; weights, subset)
and the second one would simply take an array view of x and weights and
pass them to the first method. Since dispatch does not happen on
keyword arguments, this does not work, and I must handle it inside the
function, either by replacing x and weights with array views (type
instability) or by branching, again with a lot of code duplication.
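To illustrate the dispatch point, here is a minimal sketch. The function name demo is hypothetical, and the syntax assumes a current Julia release rather than the one used above. Two definitions that differ only in their keyword arguments do not coexist as separate methods; the second one replaces the first.

```julia
# `demo` is a hypothetical name used only for this illustration.
demo(x...; weights = nothing) = :first
demo(x...; weights = nothing, subset = nothing) = :second

# Both definitions share the same positional signature, so only one method exists.
n_methods = length(methods(demo))

# The call reaches whichever definition was evaluated last.
result = demo(1, 2)
```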
One solution I thought of is to create a workhorse function
do_freqtable(weights, subset, x...), which would be specialized on its
argument types. It feels a bit like working around a limitation of
keyword arguments, but it might be OK, and it can easily be changed in
the future.
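The workhorse idea could look roughly like the sketch below. This is only an illustration under stated assumptions: it uses current Julia syntax (Union{...}-era, with view), returns a Dict keyed by tuples of values rather than a real table type, and accepts any Real weights. It is not the code from Tables.jl. The keyword function does nothing but forward its arguments positionally, so dispatch and specialization happen on do_freqtable.

```julia
# Hypothetical sketch: only the names freqtable and do_freqtable come from
# the discussion above; the Dict-based result is an assumption.

# Public entry point: keyword arguments are forwarded as positional ones.
function freqtable(x::AbstractVector...; weights = nothing, subset = nothing)
    isempty(x) && throw(ArgumentError("at least one vector is required"))
    do_freqtable(weights, subset, x...)
end

# Subset given: take views once, then dispatch to a no-subset method.
function do_freqtable(weights, subset::AbstractVector, x::AbstractVector...)
    w = weights === nothing ? nothing : view(weights, subset)
    do_freqtable(w, nothing, map(v -> view(v, subset), x)...)
end

# No weights, no subset: dedicated counting loop.
function do_freqtable(::Nothing, ::Nothing, x::AbstractVector...)
    counts = Dict{Tuple{map(eltype, x)...}, Int}()
    for key in zip(x...)
        counts[key] = get(counts, key, 0) + 1
    end
    counts
end

# Weights, no subset: same loop shape, accumulating weights instead of counts.
function do_freqtable(weights::AbstractVector{<:Real}, ::Nothing, x::AbstractVector...)
    counts = Dict{Tuple{map(eltype, x)...}, Float64}()
    for (w, key) in zip(weights, zip(x...))
        counts[key] = get(counts, key, 0.0) + w
    end
    counts
end

# Example calls
t   = freqtable([1, 1, 2], ["a", "a", "b"])
tw  = freqtable([1, 1, 2]; weights = [0.5, 1.5, 2.0])
ts  = freqtable([1, 1, 2]; subset = [1, 3])
tws = freqtable([1, 1, 2]; weights = [0.5, 1.5, 2.0], subset = [2, 3])
```

The subset method contains no counting logic of its own, which is what avoids the duplication described in point 2.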
What do you think about this? Are there plans to change how keyword
arguments work that would make this easier? At the moment it looks like
they work very well for functions where performance does not matter
much (e.g. plot), but not so well in cases where it does.
Thanks for your feedback.
[1] https://github.com/nalimilan/Tables.jl/blob/master/src/freqtable.jl#L3
AbstractVector{Float64} is quite restrictive; this is due to
https://github.com/JuliaLang/julia/issues/3738