Re: [sage-support] possible bug in DiscreteHiddenMarkovModel - NaN produced in output matrices

William Stein Thu, 08 May 2014 16:15:27 -0700

On Thu, May 8, 2014 at 3:50 PM, Jesse Hersch <[email protected]> wrote:
> Hi there,
>
> I think I may have found a bug in the class hmm.DiscreteHiddenMarkovModel.
> The repro is below.  It probably has something to do with one emission value
> being much more common than the others, but that shouldn't be invalid from
> my understanding of HMMs.


I could be wrong, but I don't think the implementation of Baum-Welch
is wrong.  The Baum-Welch algorithm [1], run in double precision (which
is all the HMM code in Sage uses), can overflow, given the sort of
computations that are involved.

[1] http://en.wikipedia.org/wiki/Baum%E2%80%93Welch_algorithm
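
To make the double-precision point concrete, here is a rough sketch in
plain Python of a scaled forward pass.  It is not the Sage code: the
names scaled_forward and c_div are made up for illustration, and the
emission values are rounded from the trained model printed in the quoted
output.  It assumes each step is normalised by the sum of the forward
probabilities, and it shows one way (not necessarily the one hit here) a
NaN can appear: if every state gives probability exactly 0.0 to an
observed symbol, the normaliser is 0 and the division is 0/0.

```python
import math

def c_div(x, y):
    # Assumption: emulate C double division for 0/0, which yields NaN.
    # Plain Python would raise ZeroDivisionError instead. Only 0/0 can
    # occur below, because the forward values are non-negative.
    if y == 0.0:
        return float('nan')
    return x / y

def scaled_forward(A, B, pi, obs):
    """Log-likelihood of obs from a scaled forward pass (illustrative only)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                     for j in range(n)]
        scale = sum(alpha)
        alpha = [c_div(a, scale) for a in alpha]
        # A zero or NaN normaliser poisons the running log-likelihood.
        log_prob += math.log(scale) if scale > 0 else float('nan')
    return log_prob

# Values rounded from the model after the first training run in the report.
A = [[0.1955, 0.8045], [0.1975, 0.8025]]
B = [[0.0002, 0.9992, 0.0, 0.0006], [0.0136, 0.9455, 0.0, 0.0409]]
pi = [0.9812, 0.0188]

print(scaled_forward(A, B, pi, [1, 1, 3, 1]))  # finite
print(scaled_forward(A, B, pi, [1, 1, 2, 1]))  # nan: symbol 2 has probability 0.0
```

Once a single NaN enters the forward values, every later step and every
re-estimated matrix entry derived from them is NaN as well.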

You can see the Sage implementation of Baum-Welch by typing

   model.baum_welch??

after running your code below, or visiting this link:

  https://github.com/sagemath/sage/blob/master/src/sage/stats/hmm/hmm.pyx

The entire implementation, starting around line 1250, is only about 1-2
pages and is a straightforward translation of the standard algorithm.

 -- William

>
> I am running Sage Version 6.2 on Linux (CentOS).  I built it from source
> yesterday.  I am a sage newbie!
>
> Why am I reporting the bug here?  Because the "report a problem" link in the
> sage notebook points here: http://ask.sagemath.org/questions/ but I cannot
> post there because I am a new user (karma < 10).  That page says to use
> this list instead.  :)
>
> repro:
>
> print version()
>
> # here are two emission sequences.  each observable has 4 possible values: 0-3.
> # 1 is much more common than 0, 2, 3, obviously
> sequences = [
>     [1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
>      1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 3, 1, 1,
>      1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
>     [1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 1, 1, 3, 1, 1, 1, 1, 1, 1,
> 3, 1, 1, 1, 1, 1, 3, 3, 2, 3, 1, 3, 1,
>      3, 1, 3, 3, 3, 1, 1, 3, 3, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
>      1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1]]
>
> transitions = [[0.2, 0.8], [0.2, 0.8]]
> pi = [.4, .6]
> b = [[.1, .7, .1, .1], [.1, .7, .1, .1]]
> model = hmm.DiscreteHiddenMarkovModel(A=transitions, B=b, pi=pi,
> emission_symbols=None, normalize=True)
>
> print 'initial state for hmm:\n', model
>
> # training on the first sequence goes ok.
> # but after the second sequence, all elements of the transition, emission,
> # and pi matrices are NaN.
> for i, seq in enumerate(sequences):
>     print '\nbaum_welch on sequence ', i
>     model.baum_welch(obs=seq, max_iter=1000)
>     print model
>
>
> And here is the output.  See the many NaNs in the final model.
>
> Sage Version 6.2, Release Date: 2014-05-06
> initial state for hmm:
> Discrete Hidden Markov Model with 2 States and 4 Emissions
> Transition matrix:
> [0.2 0.8]
> [0.2 0.8]
> Emission matrix:
> [0.1 0.7 0.1 0.1]
> [0.1 0.7 0.1 0.1]
> Initial probabilities: [0.4000, 0.6000]
>
> baum_welch on sequence  0
> (-18.660162393780404, 128)
> Discrete Hidden Markov Model with 2 States and 4 Emissions
> Transition matrix:
> [0.195469702114 0.804530297886]
> [0.197500250574 0.802499749426]
> Emission matrix:
> [0.000195677912721    0.999217288349               0.0 0.000587033738163]
> [  0.0136321925931    0.945471229628               0.0   0.0408965777794]
> Initial probabilities: [0.9812, 0.0188]
>
> baum_welch on sequence  1
> (nan, 1000)
> Discrete Hidden Markov Model with 2 States and 4 Emissions
> Transition matrix:
> [NaN NaN]
> [NaN NaN]
> Emission matrix:
> [NaN NaN NaN NaN]
> [NaN NaN NaN NaN]
> Initial probabilities: [nan, nan]
>
> --
> You received this message because you are subscribed to the Google Groups
> "sage-support" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sage-support.
> For more options, visit https://groups.google.com/d/optout.



-- 
William Stein
Professor of Mathematics
University of Washington
http://wstein.org

