Re: HMM investigations

Dhruv Sun, 24 Jul 2011 11:25:41 -0700

Svetlomir and Ted -- I was not trying to be rude, sorry if I came across
that way because of my exuberance. I apologize.

I was eager to help and may have acted too fast and misunderstood the
question, so I turn to both of you for a little clarification.

I'm unsure whether the X's refer to hidden states or to training
instances. Since the hidden sequence in an HMM is always a Markov chain, and
a Markov chain was explicitly mentioned in Svetlomir's original question, I
assumed he meant that X1 and X2 were two separate hidden state sequences.
To quote:

-----------
X1----X1----X1----...X1  (Markov Chain for input parameter 1 => monitoring
X1's changes over time)

X2----X2----X2----...X2  (Markov Chain for input parameter 2 => monitoring
X2's changes over time)
-----------

Further, since X1 and X2 were not described as having any relationship with
each other, and since they were observations of two different parameters, I
took X1 and X2 to be two separate hidden state sequences drawn from two
disjoint hidden vocabularies. As I understood it, the user wants to train a
model on some training set and then feed Y to the trained model for
decoding, to arrive at the most likely sequences of states, X1 and X2, that
emitted Y.

My answer followed from that reading: a single training run cannot produce
two separate models for X1 and X2, each with the distributions needed to
decode which sequence of X1, or which sequence of X2, produced Y. Hence I
suggested using only one set of hidden states, combining the X1s and X2s,
and training the model on that. Depending on the application domain, this
may or may not make sense, which is why I was doubtful about formulating the
problem as an HMM and suggested alternatives.
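
To make the "one combined set" idea concrete, here is a minimal sketch in
Python. It assumes the two sequences are aligned in time and that each
combined hidden state is simply the pair (X1 value, X2 value). The state
names and the encode/decode helpers are hypothetical and are not part of
Mahout.

```python
from itertools import product

# Hypothetical vocabularies for the two parameters (illustration only).
x1_states = ["low", "high"]
x2_states = ["off", "on"]

# Every combined hidden state is a pair (x1, x2). A library that expects
# integer state IDs can use the index of the pair in this list.
combined_states = list(product(x1_states, x2_states))
state_id = {pair: i for i, pair in enumerate(combined_states)}


def encode(x1_seq, x2_seq):
    """Merge two aligned sequences into one sequence of combined state IDs."""
    if len(x1_seq) != len(x2_seq):
        raise ValueError("sequences must have the same length")
    return [state_id[(a, b)] for a, b in zip(x1_seq, x2_seq)]


def decode(ids):
    """Split a sequence of combined state IDs back into the two sequences."""
    pairs = [combined_states[i] for i in ids]
    return [a for a, _ in pairs], [b for _, b in pairs]
```

Note that the combined set grows as the product of the two vocabulary
sizes, which is one reason this may not suit every domain.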

However:

If X's are two separate input sequences for training, then yes, the current
implementation is capable of training the HMM. If Y is the output, then one
can decode, after training, the sequence of hidden states which most likely
produced Y.
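
For reference, the decoding step can be sketched with the standard Viterbi
algorithm. This is a self-contained Python illustration, not Mahout's
implementation. The parameter names (init, trans, emit) are assumptions:
init[s] is the initial probability of state s, trans[r][s] the probability
of moving from r to s, and emit[s][o] the probability of state s emitting
observation o.

```python
import math


def _log(p):
    # Log-probability, with log(0) mapped to -infinity.
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, init, trans, emit):
    """Return the most likely hidden state sequence for the observations."""
    if not obs:
        return []
    n = len(init)
    # Best log-score of any path ending in each state after the first symbol.
    score = [_log(init[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev = score
        score, ptr = [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(trans[best][s]) + _log(emit[s][o]))
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(range(n), key=lambda s: score[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path
```

Working in log space avoids underflow on long sequences.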

For the output probability question, my answer was to call
getEmissionMatrix().get(hiddenState, emittedState) on the trained HmmModel
to look up the output probability for a particular hidden state. I believe
this is not what the user wanted?
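
To show what that lookup returns, here is a small Python sketch with a
plain nested list standing in for the emission matrix. The numbers and the
emission_probability helper are hypothetical. Each row is one hidden state
and each column one emitted symbol, so an entry is P(symbol | state) for a
single time step, not a probability for a whole sequence.

```python
# Hypothetical 2-state, 3-symbol emission matrix: emission[state][symbol].
emission = [
    [0.5, 0.4, 0.1],  # hidden state 0
    [0.1, 0.3, 0.6],  # hidden state 1
]


def emission_probability(matrix, hidden_state, emitted_state):
    """Look up P(emitted_state | hidden_state) in a row-per-state matrix."""
    return matrix[hidden_state][emitted_state]


# Each row is a distribution over symbols, so it should sum to 1.
row_sums = [sum(row) for row in emission]
```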

Dhruv

On Sun, Jul 24, 2011 at 12:56 PM, Ted Dunning <[email protected]> wrote:

> On Sun, Jul 24, 2011 at 7:52 AM, Dhruv <[email protected]> wrote:
>
> > ... If you look into the *definition* of HMM, the hidden sequence is
> > drawn from only one set. The hidden sequence's transitions can be
> > expressed as a joint probability p(s0, s1). Similarly the observed
> > sequence has a joint distribution with the hidden sequence such as
> > p(y0, s1) and so on.
> >
>
> I think gentler language might be a good idea here.  The question was not
> at all unreasonable.
>
>
> >
> > The hidden state transitions follow the Markov memorylessness property
> > and hence form a Markov Chain.
> >
> > In your case, you are trying to model your problem assuming that there
> > are two underlying state sequences affecting the observed output. This
> > doesn't fit into the HMM's definition and you probably want something
> > else.
> >
>
> Actually, what the original poster wanted is quite sensible.  While the
> output sequence is due to a single input sequence, that input sequence is
> not observable.  As such, we have a noisy channel problem where we want to
> estimate something about that original sequence.  The point of the Markov
> model is that it defines a distribution of output sequence given an input
> sequence (and model).  This distribution can be inverted so that given a
> particular output sequence, we can estimate the probability distribution of
> input sequences conditional on the output.
>
> The typical decoding algorithm for HMM's estimates only the maximum
> likelihood input sequence but this does not negate the fact that we have a
> distribution.  There are alternative decoding algorithms that allow a set
> of high probability sequences to be estimated or allow a partial
> probability lattice to be output that allows alternative sequences to be
> probed.
>
> > If you do want to fit your problem into the HMM framework, you need to
> > condense the X1 and X2 sequences into a single set and then condition
> > the Ys on it.
> >
>
> Not at all.
>
>
> > > 3. Can we get output probabilities from the HMM for a concrete state?
> > >
> >
> > Yes, after training, you can retrieve any of the trained model's
> > distributions as a Mahout Matrix type and use get(row, col).
> >
>
> This is not quite what the question was.
>
