One really nice thing about Julia is that it's often straightforward to
transliterate Python (or MATLAB) code fairly literally and end up with
working code that has similar, or sometimes better, performance.
Another nice thing is that Julia lets you move fairly smoothly from
Python- or MATLAB-like code to C-like code that performs much better, in
the places where this matters. There's usually some work to do to get
there, though. That is, you can't generally expect inefficiently
vectorized MATLAB-style code, or Python-style code that does tons of Dict
lookups, to run orders of magnitude faster in Julia than it did in the
original language.

To put it differently, "Julia makes code run fast" is too simplistic, but
"Julia lets me write expressive high-level code", "Julia lets me write
performant low-level code", and "Julia lets me move smoothly back and
forth between high-level and low-level code" are all true in my
experience.

Okay, so finally, the reason for this sermon: if you want to, I think you
could wring a lot more performance out of that Viterbi code. I haven't
actually tried this yet, but I think you could replace a lot of your data
structures with arrays and vectors of floats and ints, and then you'd
probably see speedups more like 2x, 10x, or 100x instead of 1.26x.

If you treat your states, ["Healthy", "Fever"], and your observations,
["normal", "cold", "dizzy"], like enums, i.e. think of those strings as
mapping onto integers, then your start probability becomes a vector of 2
floats, your transition probability a 2x2 float array, and your emission
probability a 2x3 float array.
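
Here is a minimal sketch of that encoding, assuming Julia 1.x syntax. The
constant names are hypothetical, and the probabilities are the ones from
the usual Healthy/Fever textbook example, which I'm assuming matches your
program; substitute your own values if not.

```julia
# Hypothetical integer codes standing in for the state and observation strings.
HEALTHY, FEVER = 1, 2
NORMAL, COLD, DIZZY = 1, 2, 3

# Assumed values from the standard Healthy/Fever example (not taken from the post).
# start_p[s]: probability of starting in state s.
start_p = [0.6, 0.4]

# trans_p[from, to]: probability of moving from state `from` to state `to`.
trans_p = [0.7 0.3;
           0.4 0.6]

# emit_p[state, observation]: probability of seeing `observation` in `state`.
emit_p = [0.5 0.4 0.1;
          0.1 0.3 0.6]

# An observation sequence is now just a Vector{Int}.
obs = [NORMAL, COLD, DIZZY]
```

Every lookup is then a plain array index such as emit_p[FEVER, DIZZY]
rather than a string-keyed Dict lookup.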

The big improvement will come from treating "paths" as an int array. Right
now, it's a Dict{K,Any}(), and you allocate a new Dict on each iteration,
which is probably really hurting you. You could translate it to an Nx2
array, where N is the number of observations and 2 is the number of
states, and allocate it all once at the beginning. At each step n, you
then fill in the nth row with integers that tell you which column to look
in on the row above to keep following the path backwards.
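
Here is a rough, untuned sketch of that back-pointer layout, assuming
Julia 1.x syntax. The function name viterbi, the variable names, and the
example probabilities (the usual Healthy/Fever textbook values) are my
assumptions, not taken from your code.

```julia
# Sketch: Viterbi with a preallocated N x S integer back-pointer array.
# obs holds observation indices; start_p[s], trans_p[from, to] and
# emit_p[state, obs] are plain float arrays.
function viterbi(obs::Vector{Int}, start_p::Vector{Float64},
                 trans_p::Matrix{Float64}, emit_p::Matrix{Float64})
    N = length(obs)
    S = length(start_p)
    prob = zeros(Float64, N, S)  # prob[n, s]: best path probability ending in s at step n
    back = zeros(Int, N, S)      # back[n, s]: column to look in on row n-1

    for s in 1:S
        prob[1, s] = start_p[s] * emit_p[s, obs[1]]
    end

    for n in 2:N, s in 1:S
        best_p, best_prev = -1.0, 0
        for prev in 1:S
            p = prob[n-1, prev] * trans_p[prev, s]
            if p > best_p
                best_p, best_prev = p, prev
            end
        end
        prob[n, s] = best_p * emit_p[s, obs[n]]
        back[n, s] = best_prev
    end

    # Follow the back-pointers from the most probable final state.
    path = Vector{Int}(undef, N)
    path[N] = argmax(prob[N, :])
    for n in N:-1:2
        path[n-1] = back[n, path[n]]
    end
    return prob[N, path[N]], path
end

# Assumed example values: states 1 = Healthy, 2 = Fever;
# observations 1 = normal, 2 = cold, 3 = dizzy.
start_p = [0.6, 0.4]
trans_p = [0.7 0.3;
           0.4 0.6]
emit_p = [0.5 0.4 0.1;
          0.1 0.3 0.6]
p, path = viterbi([1, 2, 3], start_p, trans_p, emit_p)
```

With those assumed values, path comes out as [1, 1, 2] (Healthy, Healthy,
Fever) with probability about 0.01512. The only allocations are the two
N x S arrays and the path vector up front, plus one small row copy in the
argmax call; nothing is allocated inside the main loop.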

If you try any of this, I'd love to know how it goes. If you're after
performance here, I suspect you'll be impressed by the results.

Best,
Jason

On Saturday, September 20, 2014 9:41:02 AM UTC-7, Jason Trenouth wrote:
>
> Hi,
>
> I converted some Python programs to Julia recently. The (probably
> incorrect) ramblings are here:
>
> http://a-coda.tumblr.com/post/93907978846/julia-dream
>
> http://a-coda.tumblr.com/post/97973293291/a-recurring-dream
>
> tl;dr - similar in style, but Julia can be a lot faster
>
> __Jason
>
>