Hi Julia Users,
I'm one of the core devs in the BioJulia organisation, with a background in
evolutionary biology/genetics, and together with a few other contributors I'm
writing Bio.jl's Phylo submodule.
The primary type of this submodule is the Phylogeny, a composite type used to
describe a model of evolution. At the very minimum it looks like this:
type PhyNode
    children::Vector{PhyNode}
    parent::PhyNode
    function PhyNode(children::Vector{PhyNode} = PhyNode[],
                     parent = nothing)
        x = new()
        if parent != nothing
            graft!(parent, x)
        else
            x.parent = x
        end
        x.children = PhyNode[]
        for child in children
            graft!(x, child)
        end
        return x
    end
end
type Phylogeny
    root::PhyNode
    rooted::Bool
    rerootable::Bool
    Phylogeny() = new(PhyNode(), false, true)
end
PhyNodes are objects which link to their children and to their parent - they
are the individual pieces that form the tree structure. The Phylogeny type
describes the overall tree: it has a field referencing the PhyNode that forms
the root of the tree, and it records whether the tree is rooted in the
phylogenetic sense and whether the phylogeny is re-rootable. So far so good.
We can represent the structure of a phylogeny - a model of how various
species are related through history.
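To make the linking concrete, here is a minimal sketch of the node structure, written in current Julia (1.x) syntax, so `mutable struct` stands in for `type`. The body of `graft!` and the `isroot` helper are assumptions made for illustration only; the post uses `graft!` without showing it, and the real Bio.jl version may well differ.

```julia
# Simplified stand-in for the PhyNode type above (Julia 1.x syntax).
mutable struct PhyNode
    children::Vector{PhyNode}
    parent::PhyNode
    function PhyNode()
        x = new()              # fields start out undefined
        x.children = PhyNode[]
        x.parent = x           # a node without a parent points to itself
        return x
    end
end

# Assumed behaviour of graft!: attach `child` below `parent`.
function graft!(parent::PhyNode, child::PhyNode)
    push!(parent.children, child)
    child.parent = parent
    return parent
end

# Hypothetical helper: a node is a root if it is its own parent.
isroot(x::PhyNode) = x.parent === x

root = PhyNode()
tip = PhyNode()
graft!(root, tip)

println(isroot(root))           # is the top node still its own parent?
println(tip.parent === root)    # does the tip link back to the root?
println(length(root.children))  # number of children under the root
```

The self-reference for parentless nodes avoids needing a nullable `parent` field, at the cost of a special case when walking up the tree.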
Here is where I'd like comments from julia-users, if possible: a phylogeny
is often annotated with additional information, like branch lengths,
confidence intervals, sequences, labels, colours for plotting, and so on.
Well, we can do this with a Dict, and use PhyNodes as keys:
typealias NodeAnnotation{T} Dict{PhyNode, T}
We can then store these annotations in the Phylogeny type like this:
type Phylogeny{S <: AbstractString}
    root::PhyNode
    rooted::Bool
    rerootable::Bool
    annotations::Dict{S, Any}
end
However, I don't like the type uncertainty of Any because if I'm correct,
it could propagate up through a user's code. But we will always have some
uncertainty, because we don't know in advance what the user might want to
annotate the Phylogeny with - could be anything from simple float values,
to other composite types.
Am I correct that the type uncertainty involved in getting and setting such
annotations would propagate through the user's code whenever they deal with
annotations?
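One way to check this is to ask what type inference concludes for a small function that reads from such a Dict. The sketch below is written in current Julia (1.x) syntax and uses stand-ins throughout: `Int` keys play the role of PhyNode, and `branch_length` and `branch_length_asserted` are hypothetical user functions, not part of Bio.jl. `Base.return_types` is an unexported Base function that reports the inferred return types.

```julia
# Stand-in for the `annotations` field; Int keys play the role of PhyNode.
annotations = Dict{String, Any}(
    "branch lengths" => Dict{Int, Float64}(1 => 0.5))

# Hypothetical user code that reads one annotation value.
branch_length(anns, node) = anns["branch lengths"][node]

# The same lookup, with a type assertion on the retrieved Dict.
branch_length_asserted(anns, node) =
    (anns["branch lengths"]::Dict{Int, Float64})[node]

# Ask inference what each function returns for these argument types.
argtypes = (Dict{String, Any}, Int)
untyped = Base.return_types(branch_length, argtypes)
asserted = Base.return_types(branch_length_asserted, argtypes)

println(untyped)   # inferred return types without the assertion
println(asserted)  # inferred return types with the assertion
```

In this sketch the unasserted lookup is inferred as `Any`, so everything computed from its result is dynamically dispatched, while the asserted version is inferred as `Float64`. That suggests the uncertainty does propagate, but only until the next type assertion or function call boundary.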
If so, we have tried to think of ways to get around this. One idea was to
store the NodeAnnotations in the phylogeny according to the type of their
values, and then provide getter and setter methods that make the return
type predictable from the types of the parameters passed in the method:
type Phylogeny{S<:AbstractString}
    root::PhyNode
    rooted::Bool
    rerootable::Bool
    annotations::Dict{Type, Dict{S, NodeAnnotation{Any}}}
end
function setannotation!{T}(x::Phylogeny, name::ASCIIString,
                           ann::NodeAnnotation{T})
    if haskey(x.annotations,T)
        x.annotations[T][name] = ann
    else
        x.annotations[T] = [name => ann]
    end
end
function getannotations{T}(x::Phylogeny, name::ASCIIString, ::Type{T})
    x.annotations[T][name]::Dict{PhyNode, T}
end
This seems to work, and would indeed make getting and setting more type
predictable. The only annoying part is that the Dicts get converted:
julia> setannotation!(tree, "Node Names", NodeAnnotation{ASCIIString}())
Dict{PhyNode,ASCIIString} with 0 entries
julia> tree
Phylogeny{ASCIIString}(PhyNode(),false,false,Dict{Type{T},Dict{ASCIIString,
Dict{PhyNode,Any}}}(ASCIIString=>Dict("Node Names"=>Dict{PhyNode,Any}())))
You see Dict{PhyNode, ASCIIString} got converted to Dict{PhyNode, Any}.
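The conversion can be reproduced without any of the Phylogeny machinery. The sketch below uses current Julia (1.x) syntax with stand-in types (`Int` keys in place of PhyNode, `String` in place of ASCIIString), and the variable names are made up for illustration. Assigning into a Dict whose declared value type is `Dict{Int, Any}` converts the incoming Dict to that type.

```julia
# Outer container whose values are declared as Dict{Int, Any},
# mirroring Dict{S, NodeAnnotation{Any}} in the Phylogeny type.
outer = Dict{String, Dict{Int, Any}}()

# A concretely typed annotation, like NodeAnnotation{ASCIIString}.
node_names = Dict{Int, String}(1 => "root")

# setindex! converts the value to the declared value type.
outer["Node Names"] = node_names
stored = outer["Node Names"]

println(typeof(stored))        # type of what was actually stored
println(stored === node_names) # is it the same object that was passed in?

# Later changes to the original are not seen by the stored copy.
node_names[2] = "tip"
println(haskey(stored, 2))

# A getter-style assertion back to the concrete type.
assertion_failed = try
    stored::Dict{Int, String}
    false
catch err
    err isa TypeError
end
println(assertion_failed)
```

Two things follow in this sketch. The stored Dict is a converted copy rather than the original object, so the two can drift apart. And because `Dict{Int, Any}` is not a `Dict{Int, String}`, an assertion like the one in getannotations fails with a TypeError unless T is Any.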
If anyone has comments on this or has advice on how to prevent type
uncertainty propagating, please do share. How should we be approaching this?
Many thanks,
Ben.