My impression (and Svetlomir should correct me) is that the intent was to
use two HMMs on separate inputs and then use the decoded state sequences
from those as inputs to a third HMM.

If that is the question, then I think that Mahout's HMMs are sufficiently
object-ish that this should work.  Obviously, it will take multiple training
passes, one for each separate model.
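
A rough sketch of that cascade, in plain Python rather than Mahout's API.
Everything here is an assumption for illustration: the parameters are
hand-set instead of trained, the names (viterbi, stacked_decode, hmm1, hmm2,
hmm3) are hypothetical, and the pair of decoded states is encoded as the
single symbol 2*s1 + s2 so the third model sees one discrete observation
per time step.

```python
import math


def _log(p):
    # Log probability, with log(0) treated as negative infinity.
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start, trans, emit):
    """Most likely hidden state sequence for obs under a discrete HMM."""
    n = len(start)
    score = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for o in obs[1:]:
        prev, step, score = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(trans[r][s]))
            step.append(best)
            score.append(prev[best] + _log(trans[best][s]) + _log(emit[s][o]))
        back.append(step)
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for step in reversed(back):
        state = step[state]
        path.append(state)
    return path[::-1]


# Hypothetical hand-set parameters. In practice each of the three models
# would be estimated in its own training pass.
sticky = [[0.9, 0.1], [0.1, 0.9]]
hmm1 = dict(start=[0.5, 0.5], trans=sticky, emit=[[0.8, 0.2], [0.2, 0.8]])
hmm2 = dict(start=[0.5, 0.5], trans=sticky, emit=[[0.7, 0.3], [0.3, 0.7]])
# The third HMM observes the pair (s1, s2) encoded as the symbol 2*s1 + s2.
hmm3 = dict(start=[0.5, 0.5], trans=sticky,
            emit=[[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]])


def stacked_decode(obs1, obs2):
    """Decode each input with its own HMM, then decode the paired states."""
    s1 = viterbi(obs1, **hmm1)
    s2 = viterbi(obs2, **hmm2)
    combined = [2 * a + b for a, b in zip(s1, s2)]
    return s1, s2, viterbi(combined, **hmm3)
```

Calling stacked_decode(obs1, obs2) on two equal-length symbol sequences
returns the two lower-level state paths and the top-level path. The third
model can only be trained after the first two have been trained and used to
decode its training data.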

On Sun, Jul 24, 2011 at 11:25 AM, Dhruv <[email protected]> wrote:

> Svetlomir and Ted -- I was not trying to be rude, sorry if I came across
> that way because of my exuberance. I apologize.
>
> I was eager to help and may have acted too fast and misunderstood the
> question, so I turn to both of you for a little clarification.
>
> I'm confused whether the X's refer to the hidden states, or training
> instances. Since the hidden sequence is always a Markov Chain in HMMs, I
> assumed that Svetlomir meant that X1 and X2 were two separate hidden state
> sequences because Markov Chain was explicitly mentioned in his original
> question. To quote:
>
> -----------
> X1----X1----X1----...X1  (Markov Chain for input parameter 1 => monitoring
> X1's changes over time)
>
> X2----X2----X2----...X2  (Markov Chain for input parameter 2 => monitoring
> X2's changes over time)
> -----------
>
> Further, since X1 and X2 were not slated to have any relationship with each
> other and since they were the observations of two different parameters, I
> construed that X1 and X2 represented two separate hidden state sequences. I
> gathered that the hidden state sequences X1 and X2 are drawn from two
> disjoint hidden vocabulary sets. The user wants to discover the model on
> some training set and then, to the trained model, feed Y for decoding to
> arrive at the most likely sequence of states, X1 and X2 which emitted Y.
>
> In my answer, I continued along this line, saying that a single training
> run cannot produce two separate models for X1 and X2, each containing the
> distributions needed to decode which sequence of X1, or which sequence of
> X2, produced Y. Hence, I suggested using a single set of hidden states that
> combines the X1s and X2s and training the model on that. Given the domain
> of application, this may or may not make sense, so I was doubtful about
> formulating the problem as an HMM and suggested alternatives.
>
> However:
>
> If X's are two separate input sequences for training, then yes, the current
> implementation is capable of training the HMM. If Y is the output, then one
> can decode, after training, the sequence of hidden states which most likely
> produced Y.
>
> For the output probability question, my answer was to use the trained
> model's HmmModel.getEmissionMatrix().get(hiddenState, emittedState) call
> to look up the output probability for a particular hidden state. I believe
> this is not what the user wanted?
>
>
> Dhruv
>
> On Sun, Jul 24, 2011 at 12:56 PM, Ted Dunning <[email protected]>
> wrote:
>
> > On Sun, Jul 24, 2011 at 7:52 AM, Dhruv <[email protected]> wrote:
> >
> > > ... If you look into the *definition* of HMM, the hidden sequence is
> > > drawn from only one set. The hidden sequence's transitions can be
> > > expressed as a joint probability p(s0, s1). Similarly the observed
> > > sequence has a joint distribution with the hidden sequence such as
> > > p(y0, s1) and so on.
> > >
> >
> > I think gentler language might be a good idea here.  The question was
> > not at all unreasonable.
> >
> >
> > >
> > > The hidden state transitions follow the Markov memorylessness property
> > > and hence form a Markov Chain.
> > >
> > > In your case, you are trying to model your problem assuming that there
> > > are two underlying state sequences affecting the observed output. This
> > > doesn't fit into the HMM's definition and you probably want something
> > > else.
> > >
> >
> > Actually, what the original poster wanted is quite sensible.  While the
> > output sequence is due to a single input sequence, that input sequence
> > is not observable.  As such, we have a noisy channel problem where we
> > want to estimate something about that original sequence.  The point of
> > the Markov model is that it defines a distribution of output sequence
> > given an input sequence (and model).  This distribution can be inverted
> > so that given a particular output sequence, we can estimate the
> > probability distribution of input sequences conditional on the output.
> >
> > The typical decoding algorithm for HMMs estimates only the maximum
> > likelihood input sequence but this does not negate the fact that we
> > have a distribution.  There are alternative decoding algorithms that
> > allow a set of high probability sequences to be estimated or allow a
> > partial probability lattice to be output that allows alternative
> > sequences to be probed.
> >
> > > If you do want to fit your problem into the HMM framework, you need to
> > > condense the X1 and X2 sequences into a single set and then condition
> > > the Ys on it.
> > >
> >
> > Not at all.
> >
> >
> > > > 3. Can we get output probabilities from the HMM for a concrete state?
> > > >
> > >
> > > Yes, after training, you can retrieve any of the trained model's
> > > distributions as a Mahout Matrix type and use get(row, col).
> > >
> >
> > This is not quite what the question was.
> >
>
