Hello there, 

I've had a little look at the Mahout source.

I've been using the mahout command line to train my HMM, and this invokes
the main method of the
org.apache.mahout.classifier.sequencelearning.hmm.BaumWelchTrainer class.

This class then parses the command-line options and calls
HmmTrainer.trainBaumWelch(model, observationsArray, epsilon, maxIterations, true);

The final parameter is "boolean scaled" - set to true, this should use the
log-scaled algorithm logScaledBaumWelch(observedSequence, iteration, alpha, beta);

So it looks to me like this is a problem with the logScaled implementation. 
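
As a quick sanity check on that theory, here's a tiny sketch (my own
illustration, not the Mahout source) of how a zero-probability transition
can poison a log-scaled forward pass: Math.log(0) is -Infinity, and as
soon as the arithmetic combines two infinities the result is NaN, which
then propagates through every re-estimation step:

    public class LogScaleNaNDemo {
        public static void main(String[] args) {
            double p = 0.0;                         // an unobserved transition
            double logP = Math.log(p);              // -Infinity
            double logAlpha = Math.log(0.5) + logP; // still -Infinity

            // a normalisation step that subtracts one -Infinity
            // from another yields NaN:
            System.out.println(logAlpha - logP);    // NaN
        }
    }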

I'll carry on having a little dig around and see what I can see. 

Simon 


-----Original Message-----
From: Ted Dunning [mailto:[email protected]] 
Sent: 07 January 2013 18:49
To: [email protected]
Subject: Re: HMM - baum welch and hmmpredict

I think that the log prob version would handle this better.

Note that even with the extra transitions, you get *very* small probs.
Without those transitions, you are going to get underflow very quickly.
With log probs, the system should recognize the underflow correctly without
having to actually store it.  Since the log probabilities can only step a
finite and bounded amount each time, they shouldn't ever get to -Inf even
if that would make them happy.
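
A quick illustration of the difference (an untested sketch, nothing to do
with the Mahout code):

    public class UnderflowDemo {
        public static void main(String[] args) {
            double p = 1.0;     // straight product of probabilities
            double logP = 0.0;  // the same product, kept in log space
            for (int t = 0; t < 500; t++) {
                p *= 1e-3;              // hits exactly 0.0 after ~108 steps
                logP += Math.log(1e-3); // just steps down by ~6.9 each time
            }
            System.out.println(p);      // 0.0 -- the information is gone
            System.out.println(logP);   // about -3453.9, still finite
        }
    }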


On Mon, Jan 7, 2013 at 7:22 AM, <[email protected]> wrote:

> Hi there,
>
> Ted's insight on the synthetic data set causing underflow appears to 
> be correct.
>
> - If I train using a pattern "0 0 0 0 1 1 1 1 2 2 2 2 2 0 0 0 0 1 1 1
> 1 2 2 2 2 2 .... <<repeat 20 times or so>>" for 3 hidden states and 3
> observable states:
>
> >mahout baumwelch -i pattern.txt -o out.txt -nh 3 -no 3
>
> I get
>
> >...Initial probabilities:
> 0 1 2
> NaN NaN NaN
> Transition matrix:
>   0 1 2
> 0 NaN NaN NaN
> 1 NaN NaN NaN
> 2 NaN NaN NaN
> Emission matrix:
>   0 1 2
> 0 NaN NaN NaN
> 1 NaN NaN NaN
> 2 NaN NaN NaN
>
>
> but if I introduce just one transition from the 2 2 2 2 state to the
> 1 1 1 1 state, and one from the 0 0 0 0 state to the 2 2 2 2 state:
>
> " 0 0 0 0 1 1 1 1 2 2 2 2 2 0 0 0 0 2 1 1 1 1 0 2 2 2 2 2 0 0 0 0 1 1
> 1 1 2 2 2 2 2 0 0 0 0 1 1 1 1 2 2 2 2 2.... "
>
> then I get :
> >....Initial probabilities:
> 0 1 2
> 1.0 1.1123861859130155E-36 1.656560305889225E-39
> Transition matrix:
>   0 1 2
> 0 0.7547069426685107 2.1036132690268842E-14 0.24529305733146825
> 1 0.19502627742281503 0.8049737225771799 5.174976422070503E-15
> 2 5.730060776423029E-13 0.2500232192495272 0.7499767807498997
> Emission matrix:
>   0 1 2
> 0 0.9810688873254809 0.010907170378894576 0.008023942295624495
> 1 4.0644469515616997E-7 2.1553200545010613E-8 0.9999995720021043
> 2 8.750057097530634E-5 0.9973079487546356 0.002604550674389073
>
> Which looks much more like it, and I can generate a prediction with
>
> >mahout hmmpredict -o out.txt -m newmodel.mod -l 100
>
> >cat out.txt
>
> >0 0 0 0 0 0 0 0 1 2 2 2 2 0 1 1 1 1 1 1 2 0 1 1 2 2 0 0 1 1 1 1 1 1 1
> 1 1 1 1 1 1 2 0 0 1 1 2 2 2 2 2 2 2 2 2 2 2 0 0 0 0 0 1 1 1 1 1 1 1 2
> 2 0 1 1 2 2 2 2 2 2 2 2 0 0 1 1 1 2 2 2 2 0 1 1 2 2 2 0 1 1
>
> Which looks sensible.
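>
> In case anyone wants to reproduce this, a quick sanity check (my own
> snippet, nothing from Mahout's internals) that counts how often each
> symbol-to-symbol transition occurs in the training data. For this
> synthetic set the observations track the hidden states, so any zero
> cell marks a transition the trainer never sees:
>
>     int numStates = 3;
>     int[] seq = {0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2 /* ... */};
>     int[][] counts = new int[numStates][numStates];
>     for (int t = 1; t < seq.length; t++) {
>         counts[seq[t - 1]][seq[t]]++;
>     }
>     for (int[] row : counts) {
>         System.out.println(java.util.Arrays.toString(row));
>     }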
>
> Newbie stuff, but hope it's handy!
>
> Simon
>
> ----
> Dr. Simon Thompson
>
> ________________________________________
> From: Ted Dunning [[email protected]]
> Sent: 06 January 2013 20:16
> To: [email protected]
> Subject: Re: HMM - baum welch and hmmpredict
>
> It sounds like you are getting some numerical stability issues with
> the training program.  With HMMs, the most common problem that leads
> to this is numerical underflow.  I haven't looked at this in detail,
> however, so I can't comment very knowledgeably.  It is possible that
> the current implementation has no regularization, which might lead to
> problems for synthetic data sets such as your counting example, because
> there are no observations for some transitions and the trainer may try
> to represent this as -Inf in log space.
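>
> The standard workaround, if anyone wants to experiment, is the usual
> pseudo-count (Laplace) smoothing applied during re-estimation, so no
> probability ever comes out exactly zero. A rough sketch, not anything
> Mahout currently does:
>
>     // illustrative only -- smooth re-estimated transition counts so
>     // that unobserved transitions get a small floor instead of 0
>     static double[][] smooth(double[][] counts, double pseudoCount) {
>         int n = counts.length;
>         double[][] probs = new double[n][n];
>         for (int i = 0; i < n; i++) {
>             double rowSum = n * pseudoCount;
>             for (int j = 0; j < n; j++) rowSum += counts[i][j];
>             for (int j = 0; j < n; j++)
>                 probs[i][j] = (counts[i][j] + pseudoCount) / rowSum;
>         }
>         return probs;
>     }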
>
> I can say that the Mahout HMM implementations are a student project 
> and have not seen much run-time or critical review.  That means that 
> the probability of serious bugs in the implementation is much higher 
> than code that is heavily used such as the recommender or the math 
> library.  The student who did the work is good, but that doesn't take 
> the place of wide usage.
>
> On Sat, Jan 5, 2013 at 11:44 AM, <[email protected]> wrote:
>
> > Hi there,
> >
> > I've got a couple of questions about the hmm elements of Mahout.
> >
> > - when I get models that are made of NaN I guess this is telling me 
> > that the algorithm can't make a prediction?
> > - I can train models with 1 or 2 hidden states, and once or twice
> > with 3 hidden states... but when I try to train anything more complex
> > it always seems to come back with NaNs - even with data sets like
> > 1 2 3 4 5 1 2 3 4 5 1 2... which in my simple-minded view should work
> > well for 4 or 5 hidden states: what am I doing wrong?
> > - I have used hmmpredict to produce some... predictions! But how can
> > I give it a sequence and then ask for the next state? Or should I
> > simply use the code to create a custom predictor of my own?
> >
> > All the best,
> >
> > Simon
> >
> >
> > ----
> > Dr. Simon Thompson
> > Chief Researcher, Customer Experience.
> > BT Research.
> > BT plc. PP11J. MLBG BT Adastral Park, Martlesham Heath.
> > IP5 3RE
> >
>
