On Thu, May 8, 2014 at 3:50 PM, Jesse Hersch <[email protected]> wrote:
> Hi there,
>
> I think I may have found a bug in the class hmm.DiscreteHiddenMarkovModel.
> The repro is below. It probably has something to do with one emission value
> being much more common than the others, but that shouldn't be invalid from
> my understanding of HMMs.
I could be wrong, but I don't think the implementation of Baum-Welch is
wrong. The Baum-Welch algorithm [1], run with double precision numbers (which
is all the HMM code in Sage uses), can lead to overflow, given the sort of
computations that are involved.

[1] http://en.wikipedia.org/wiki/Baum%E2%80%93Welch_algorithm

You can see the Sage implementation of Baum-Welch by typing

    model.baum_welch??

after running your code below, or by visiting this link:

https://github.com/sagemath/sage/blob/master/src/sage/stats/hmm/hmm.pyx

The entire implementation, starting around line 1250, is only about 1-2
pages and is a straightforward translation of the standard algorithm.

 -- William

>
> I am running Sage Version 6.2 on Linux (CentOS). I built it from source
> yesterday. I am a sage newbie!
>
> Why am I reporting the bug here? Because the "report a problem" link in the
> sage notebook points here: http://ask.sagemath.org/questions/ but I cannot
> post there because of being a new user (karma < 10). That page says to use
> this list instead. :)
>
> repro:
>
> print version()
>
> # here are two emission sequences. each observable has 4 possible values: 0-3.
> # 1 is much more common than 0,2,3 obviously
> sequences = [
> [1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 3, 1, 1,
> 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
> [1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3, 1, 1, 3, 1, 1, 1, 1, 1, 1,
> 3, 1, 1, 1, 1, 1, 3, 3, 2, 3, 1, 3, 1,
> 3, 1, 3, 3, 3, 1, 1, 3, 3, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1]]
>
> transitions = [[0.2, 0.8], [0.2, 0.8]]
> pi = [.4, .6]
> b = [[.1, .7, .1, .1], [.1, .7, .1, .1]]
> model = hmm.DiscreteHiddenMarkovModel(A=transitions, B=b, pi=pi,
>                                       emission_symbols=None, normalize=True)
>
> print 'initial state for hmm:\n', model
>
> # training on the first sequence goes ok.
> # but after the second sequence, all elements of the transition, emission,
> # and pi matrices are NaN.
> for i, seq in enumerate(sequences):
>     print '\nbaum_welch on sequence ', i
>     model.baum_welch(obs=seq, max_iter=1000)
>     print model
>
>
> And here is the output.
> See the many NaN in the final model.
>
> Sage Version 6.2, Release Date: 2014-05-06
> initial state for hmm:
> Discrete Hidden Markov Model with 2 States and 4 Emissions
> Transition matrix:
> [0.2 0.8]
> [0.2 0.8]
> Emission matrix:
> [0.1 0.7 0.1 0.1]
> [0.1 0.7 0.1 0.1]
> Initial probabilities: [0.4000, 0.6000]
>
> baum_welch on sequence 0
> (-18.660162393780404, 128)
> Discrete Hidden Markov Model with 2 States and 4 Emissions
> Transition matrix:
> [0.195469702114 0.804530297886]
> [0.197500250574 0.802499749426]
> Emission matrix:
> [0.000195677912721 0.999217288349 0.0 0.000587033738163]
> [ 0.0136321925931 0.945471229628 0.0 0.0408965777794]
> Initial probabilities: [0.9812, 0.0188]
>
> baum_welch on sequence 1
> (nan, 1000)
> Discrete Hidden Markov Model with 2 States and 4 Emissions
> Transition matrix:
> [NaN NaN]
> [NaN NaN]
> Emission matrix:
> [NaN NaN NaN NaN]
> [NaN NaN NaN NaN]
> Initial probabilities: [nan, nan]
>
> --
> You received this message because you are subscribed to the Google Groups
> "sage-support" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sage-support.
> For more options, visit https://groups.google.com/d/optout.

--
William Stein
Professor of Mathematics
University of Washington
http://wstein.org
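
For readers following the thread, here is a minimal plain-Python sketch of one way a double-precision forward pass can end in NaN with the numbers from the output above. It is not Sage's code: `forward_scaled` and `smooth` are hypothetical helpers, the scaled forward recursion is an assumption about how such a computation is typically organized, and the pseudo-count `eps` is an arbitrary illustrative value. The matrices are the ones printed after training on sequence 0, where symbol 2 has emission probability 0.0 in both states, while sequence 1 contains a 2.

```python
import math


def forward_scaled(A, B, pi, obs):
    """Scaled forward pass; returns the log-likelihood of obs.

    Hypothetical helper for illustration only, not Sage's implementation.
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            # Every state gives this symbol probability 0, so normalizing
            # would be 0/0. With C doubles that is NaN; plain Python would
            # raise ZeroDivisionError, so NaN is returned explicitly here.
            return float("nan")
        alpha = [a / scale for a in alpha]
        log_like += math.log(scale)
    return log_like


def smooth(B, eps=1e-6):
    """Add a small pseudo-count to each emission and renormalize the rows."""
    out = []
    for row in B:
        total = sum(row) + eps * len(row)
        out.append([(p + eps) / total for p in row])
    return out


# Model printed after baum_welch on sequence 0 (copied from the output).
A = [[0.195469702114, 0.804530297886], [0.197500250574, 0.802499749426]]
B = [[0.000195677912721, 0.999217288349, 0.0, 0.000587033738163],
     [0.0136321925931, 0.945471229628, 0.0, 0.0408965777794]]
pi = [0.9812, 0.0188]

ok = forward_scaled(A, B, pi, [1, 1, 3, 1])           # no symbol 2: finite
bad = forward_scaled(A, B, pi, [1, 3, 2, 3])          # contains a 2: NaN
fixed = forward_scaled(A, smooth(B), pi, [1, 3, 2, 3])  # finite again

print(ok, bad, fixed)
```

If this is what happens in the repro, keeping every emission probability strictly positive between training runs (for example with a pseudo-count as in `smooth`) would be one possible workaround; whether it suits a given model is a judgment call.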
