It sounds like you are running into numerical stability problems in the
training program.  With HMMs, the most common cause of this is numerical
underflow.  I haven't looked at this in detail, however, so I can't comment
very knowledgeably.  It is possible that the current implementation has no
regularization, which could cause problems for synthetic data sets such as
your counting example: some transitions are never observed, so their
estimated probability is zero and the trainer may try to represent that as
-Inf in log space.  Arithmetic on -Inf can then produce the NaNs you are
seeing.
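
As a rough illustration of that failure mode, here is a minimal sketch in
plain Python.  It is not Mahout's code: the function name, the pseudocount
value, and the simplification of treating the sequence as a fully observed
chain (rather than running Baum-Welch over hidden states) are all assumptions
made for the example.  It only shows how an unobserved transition becomes
-Inf in log space, how that can turn into NaN, and how a small pseudocount
(one simple form of regularization) keeps everything finite.

```python
import math

def transition_log_probs(seq, n_symbols, pseudocount=0.0):
    """Estimate log transition probabilities from symbol-to-symbol counts.

    Hypothetical helper for illustration only; not part of Mahout.
    """
    counts = [[pseudocount] * n_symbols for _ in range(n_symbols)]
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1.0
    log_probs = []
    for row in counts:
        total = sum(row)
        # A zero count has probability 0, which is -inf in log space.
        log_probs.append([
            math.log(c / total) if c > 0 else float("-inf")
            for c in row
        ])
    return log_probs

# The counting example 1 2 3 4 5 1 2 ..., shifted to 0-based symbols.
seq = [0, 1, 2, 3, 4] * 4

raw = transition_log_probs(seq, 5)
smoothed = transition_log_probs(seq, 5, pseudocount=0.01)

# The transition 0 -> 2 never occurs, so its unsmoothed log probability is -inf.
unseen = raw[0][2]
# Subtracting -inf from -inf (as can happen when normalizing in log space)
# gives NaN, which then spreads through any later arithmetic.
poisoned = unseen - unseen

print(unseen, poisoned, smoothed[0][2])
```

With the pseudocount, every entry of `smoothed` is a finite number, so later
log-space arithmetic has no -Inf values to turn into NaN.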

I can say that the Mahout HMM implementations are a student project and
have not seen much run time or critical review.  That means that the
probability of serious bugs in the implementation is much higher than in
code that is heavily used, such as the recommender or the math library.  The
student who did the work is good, but that doesn't take the place of wide
usage.

On Sat, Jan 5, 2013 at 11:44 AM, <[email protected]> wrote:

> Hi there,
>
> I've got a couple of questions about the hmm elements of Mahout.
>
> - when I get models that are made of NaN I guess this is telling me that
> the algorithm can't make a prediction?
> - I can train models with 1 hidden state, or 2 hidden states and once or
> twice with 3 hidden states.. but when I try to train anything more complex
> it always seems to come back with NaNs - even with data sets like 1 2 3 4
> 5 1 2 3 4 5 1 2... which in my simple minded view should work well for 4
> or 5 hidden states : what am I doing wrong?
> - I have used hmmpredict to produce some... predictions! but how can I
> give it a sequence and then ask for the next state? Or should I simply use
> the code to create a custom predictor of my own?
>
> All the best,
>
> Simon
>
>
> ----
> Dr. Simon Thompson
> Chief Researcher, Customer Experience.
> BT Research.
> BT plc. PP11J. MLBG BT Adastral Park, Martlesham Heath.
> IP5 3RE
