As an update: we have tested two ways of fetching annotations, one that 
enforces the return type and one that does not. I don't understand why, but 
the version that does not enforce the type is faster. This puzzles me, 
because that version allocates memory and yet is still faster:

function getannotations{T}(x::Phylogeny, name::ASCIIString, ::Type{T})
    x.annotations[T][name]::T 
end



function getannotations(x::Phylogeny, name::ASCIIString)
    for (k, v) in x.annotations
        if haskey(v, name)
            return v[name]
        end
    end
    error("No such key in the phylogeny")
end


julia> @time for i in 1:10000; a = tree["Node Names", ASCIIString]; end

  0.002090 seconds

julia> @time for i in 1:10000; a = tree["Node Names"]; end

  0.001367 seconds (10.00 k allocations: 312.500 KB)
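
One possible factor in these numbers is that both loops run at global scope, where `tree` is presumably a non-constant global, so neither call can be specialised ahead of time and the type assertion only adds a runtime check. Below is a minimal sketch of a fairer comparison, not the Bio.jl code. It assumes a hypothetical `Store` type standing in for `Phylogeny`, hypothetical helper names (`typed_get`, `untyped_get`, `run_typed`, `run_untyped`), and current Julia syntax (`struct`, `String`, `where`) rather than the 0.4-era syntax used above.

```julia
# Hypothetical stand-in for Phylogeny: annotations grouped by value type.
struct Store
    annotations::Dict{DataType, Dict{String, Any}}
end

# Typed getter: the assertion lets the compiler narrow the return type to T.
typed_get(s::Store, name::String, ::Type{T}) where {T} = s.annotations[T][name]::T

# Untyped getter: scans every inner Dict and returns whatever it finds.
function untyped_get(s::Store, name::String)
    for (_, v) in s.annotations
        haskey(v, name) && return v[name]
    end
    error("No such key in the store")
end

# Wrap each loop in a function so the measurement is not taken at global scope.
function run_typed(s::Store, n::Int)
    a = typed_get(s, "Node Names", String)
    for _ in 2:n
        a = typed_get(s, "Node Names", String)
    end
    return a
end

function run_untyped(s::Store, n::Int)
    a = untyped_get(s, "Node Names")
    for _ in 2:n
        a = untyped_get(s, "Node Names")
    end
    return a
end

store = Store(Dict{DataType, Dict{String, Any}}(
    String => Dict{String, Any}("Node Names" => "root")))

# Ask the compiler what it infers for each getter.
typed_ret = Base.return_types(typed_get, (Store, String, Type{String}))[1]
untyped_ret = Base.return_types(untyped_get, (Store, String))[1]
println("typed getter infers: ", typed_ret)
println("untyped getter infers: ", untyped_ret)

# Warm up once so compilation is excluded, then time each version.
run_typed(store, 1)
run_untyped(store, 1)
@time run_typed(store, 10_000)
@time run_untyped(store, 10_000)
```

Timing inside functions after a warm-up call keeps compilation and global-variable lookups out of the measurement, which may change which version comes out ahead.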

On Thursday, September 24, 2015 at 8:17:52 PM UTC+1, Ben Ward wrote:
>
> Hi Julia Users,
>
> I'm one of the Core-Devs in the BioJulia organisation, with a background 
> in evolutionary biology/genetics, and, with a few other contributors I'm 
> writing Bio.jl's Phylo submodule.
>
> The primary type of this submodule is the Phylogeny, a composite type 
> used to describe a model of evolution. At the very minimum it looks 
> like this:
>
> type PhyNode
>     children::Vector{PhyNode}
>     parent::PhyNode
>     
>     function PhyNode(children::Vector{PhyNode} = PhyNode[],
>                      parent = nothing)
>         x = new()
>         if parent != nothing
>             graft!(parent, x)
>         else
>             x.parent = x
>         end
>         x.children = PhyNode[]
>         for child in children
>             graft!(x, child)
>         end
>         return x
>     end
> end
>
> type Phylogeny
>     root::PhyNode
>     rooted::Bool
>     rerootable::Bool
>
>     Phylogeny() = new(PhyNode(), false, true)
> end
>
> PhyNodes are types which link to their children and to their parent - they 
> are the individual objects that form the tree structure. The Phylogeny type 
> describes the overall tree, and contains a variable pointing to a PhyNode 
> that forms the root of the tree, and determines whether the tree is rooted 
> in the phylogenetic sense, and whether the phylogeny is re-rootable. So far 
> so good. We can represent the structure of a phylogeny - a model of how 
> various species are related through history.
>
> Here is where I'd like comments from the julia-users, if possible: With a 
> phylogeny, often additional information is annotated to the tree, like 
> branch lengths, confidence intervals, sequences, labels, colours for 
> plotting, and so on. Well, we can do this with a Dict, and use PhyNodes as 
> keys:
>
> typealias NodeAnnotation{T} Dict{PhyNode, T}
>
> We can then store these annotations in the Phylogeny type like this:
> type Phylogeny{S <: AbstractString}
>     root::PhyNode
>     rooted::Bool
>     rerootable::Bool
>     annotations::Dict{S, Any}
> end
>
> However, I don't like the type uncertainty of Any because if I'm correct, 
> it could propagate up through a user's code. But we will always have some 
> uncertainty, because we don't know in advance what the user might want to 
> annotate the Phylogeny with - could be anything from simple float values, 
> to other composite types.
>
> Am I correct that the type uncertainty in getting and setting such 
> annotations would propagate through the user's code?
> If so, we have tried to think of ways to get around this. One idea was to 
> store the NodeAnnotations in the phylogeny according to the type of their 
> values, and then provide getter and setter methods that make the return 
> type predictable from the types of the parameters passed in the method:
>
> type Phylogeny{S<:AbstractString}
>     root::PhyNode
>     rooted::Bool
>     rerootable::Bool
>     annotations::Dict{Type, Dict{S, NodeAnnotation{Any}}}
> end
>
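> function setannotation!{T}(x::Phylogeny, name::ASCIIString, ann::NodeAnnotation{T})
>     if haskey(x.annotations, T)
>         # A group for this value type already exists: add or overwrite the entry.
>         x.annotations[T][name] = ann
>     else
>         # First annotation of this value type: create the inner Dict.
>         x.annotations[T] = Dict(name => ann)
>     end
> end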
>
> function getannotations{T}(x::Phylogeny, name::ASCIIString, ::Type{T})
>     x.annotations[T][name]::Dict{PhyNode, T}
> end
>
> This seems like it works and would indeed make getting and setting more 
> type predictable, the only annoying part is that Dicts get converted:
>
> julia> setannotation!(tree, "Node Names", NodeAnnotation{ASCIIString}())
> Dict{PhyNode,ASCIIString} with 0 entries
>
>
> julia> tree
> Phylogeny{ASCIIString}(PhyNode(),false,false,Dict{Type{T},Dict{ASCIIString,Dict{PhyNode,Any}}}(ASCIIString=>Dict("Node Names"=>Dict{PhyNode,Any}())))
>
> You see Dict{PhyNode, ASCIIString} got converted to Dict{PhyNode, Any}.
>
> If anyone has comments on this or has advice on how to prevent type 
> uncertainty propagating, please do share. How should we be approaching this?
>
> Many thanks,
> Ben.
>
