Hi all,

I'm implementing BSP trees, and I've run into a style issue which I'm
uncertain about.

For my current purposes, the trees will be holding floats, but I wanted to
write things in a way which allowed other possibilities later. At first, I
made an explicit Leaf type:

*Version 1*


abstract Leaf


type FloatLeaf <: Leaf

        value::Float64

end


abstract Split # abstract type for space partition


# starting out with axis-aligned splits, to implement arbitrary-angle later.

# Axis-aligned splits make rectangles, hence RectSplit

type RectSplit <: Split

        var::Int

        loc::Float64

end


# internal tree node. Two sides of a split are pos and neg.

type TreeNode

        split::Split # split definition

        pos::Union(TreeNode, Leaf)  # negative side

        neg::Union(TreeNode, Leaf)  # positive side

end


(This is actually a bit simplified; I also had a special Nothing type to
indicate an empty case. The NA type from dataframes was not appropriate
because I wanted different behavior.)

I implemented functions on the trees mainly by using multi-dispatch in a
pattern-matching style, matching against TreeNode or Leaf types to
differentiate between the recursive case and the base case.

I eventually decided that the Leaf type seemed rather pointless, because it
was just a container for a value. Using a Leaf type created some annoyance
with defining functions to operate on leaves, when the obvious thing to do
was just apply those functions to the leaf values. Also, I was a little
worried that putting stuff inside a Leaf container would be a small
efficiency hit (probably not much?).

End result, I modified to this:

*Version 2*

abstract Split


type RectSplit <: Split

        var::Int

        loc::Float64

end


type TreeNode{a}

        split::Split # split definition

        pos::Union(TreeNode{a}, a)  # negative side

        neg::Union(TreeNode{a}, a)  # positive side

end


This is shorter, which seems like a good sign. It also manages to indicate
that all leafs should have the same type, which the previous version
didn't. However, I couldn't keep using multiple dispatch to define the
different cases for my functions. The closest thing I could do would be to
make the cases which match to Leaf type match to Float instead. This works
when the tree is only holding floats, but I wanted to handle more general
cases.

Since this didn't seem right, I switched most of my code to use if-else in
one function rather than multiple dispatch on the types. Leaving the leaf
case for the final "else" allows me to write things in a generic way (no
explicit assumption that I'm dealing with Floats in functions that don't
need to interact with this assumption).

However, this approach still doesn't work if I want to write a method which
applies to trees (be they internal nodes or leaf nodes) but allows other
function definitions for other types. Anything which isn't a TreeNode is
now assumed to be a leaf, the way I've done it. This isn't actually a
problem yet, but it could be at some point.

Does this mean I was wrong to switch to the 2nd version?

Any other suggestions for how this should go?

Best,

Abram

Reply via email to