Re: HMM investigations

Ted Dunning Sun, 24 Jul 2011 12:16:22 -0700

I remember this problem.

Is it possible for you to post some sample data?


On Sun, Jul 24, 2011 at 12:08 PM, Svetlomir Kasabov <
[email protected]> wrote:

>
> Hello again, and thanks to both of you for the replies; I really appreciate
> them. The most important thing is that you are trying to help, and how you
> do it is irrelevant :). I didn't feel angry or insulted.
>
>
> Yes, X1 and X2 are two independent hidden sequences, like
>
> BP -- BP -- BP (Blood Pressure)
> HR -- HR -- HR (Heart Rate)
> And I want to train the model to predict the probability of giving a drug Y
> to a patient (for example, with this sequence)
> Y=0 -- Y=0 -- Y=1
>
> I already tried this with logistic regression, but ended up with poor
> results (probably because of my small example set). Logistic regression also
> has no built-in notion of time series, which is why I must analyze the X's
> changes using percentiles and then train the logistic model on these
> percentiles. In this way I reduce the dimensions to only one. That's why I
> thought that HMMs could do this for me 'out of the box', staying in two
> dimensions, if they allow two hidden chains, like this:
>
> http://t3.gstatic.com/images?q=tbn:ANd9GcR8pu4bSm-MSyg3Pj0-aTyi8FaqUOy4U2bcKJBTBYKKvgAhyw6P
>
> or 'coupled' HMMs.
>
> I am not very experienced with HMMs, but I will read further into the
> literature and Mahout's API :).
>
> Maybe reducing the dimensions is not such a bad idea? I've read that we can
> do it with PCA (Principal Component Analysis). Is there Mahout code for
> this somewhere?
>
> Thanks a lot once again,
>
> Svetlomir.
>
>
>
> On 24.07.2011 20:46, Ted Dunning wrote:
>
>> My impression (and Svetlomir should correct me) is that the intent was to
>> use two HMMs on separate inputs and then use the decoded state sequences
>> from those as inputs to a third HMM.
>>
>> If that is the question, then I think that Mahout's HMMs are sufficiently
>> object-ish that this should work.  Obviously, it will take multiple
>> training passes to train each separate model.
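
The stacking idea quoted above (decode each input with its own HMM, then feed
the decoded states to a third model) might look roughly like the following.
This is a plain-Python sketch, not Mahout's API: the two-state parameters, the
symbol coding (0 = normal, 1 = elevated) and all names are assumptions made up
for illustration, and training of the three models is omitted.

```python
import math


def _log(p):
    # log that tolerates zero probabilities
    return math.log(p) if p > 0 else float("-inf")


def viterbi(obs, start, trans, emit):
    """Most likely hidden-state path of a discrete HMM (log-space Viterbi)."""
    n = len(start)
    # score[s] = best log-probability of any path that ends in state s
    score = [_log(start[s]) + _log(emit[s][obs[0]]) for s in range(n)]
    back = []
    for o in obs[1:]:
        prev, score, ptr = score, [], []
        for s in range(n):
            best = max(range(n), key=lambda r: prev[r] + _log(trans[r][s]))
            ptr.append(best)
            score.append(prev[best] + _log(trans[best][s]) + _log(emit[s][o]))
        back.append(ptr)
    # Trace the best path backwards from the best final state
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]


# Hypothetical toy parameters, reused for both chains to keep the sketch short
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.8, 0.2], [0.1, 0.9]]  # EMIT[state][symbol]

bp_obs = [0, 0, 1, 1]  # discretized blood pressure readings (assumed coding)
hr_obs = [0, 1, 1, 1]  # discretized heart rate readings (assumed coding)

bp_states = viterbi(bp_obs, START, TRANS, EMIT)
hr_states = viterbi(hr_obs, START, TRANS, EMIT)

# Pair the two decoded state sequences into one symbol stream that a third
# model could be trained on: symbol = bp_state * 2 + hr_state
third_input = [b * 2 + h for b, h in zip(bp_states, hr_states)]
```

Each of the three models would still need its own training pass, as noted in
the quoted text.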
>>
>> On Sun, Jul 24, 2011 at 11:25 AM, Dhruv<[email protected]>  wrote:
>>
>>> Svetlomir and Ted -- I was not trying to be rude, sorry if I came across
>>> that way because of my exuberance. I apologize.
>>>
>>> I was eager to help and may have acted too fast and misunderstood the
>>> question, so I turn to both of you for a little clarification.
>>>
>>> I'm confused whether the X's refer to the hidden states, or training
>>> instances. Since the hidden sequence is always a Markov Chain in HMMs, I
>>> assumed that Svetlomir meant that X1 and X2 were two separate hidden
>>> state sequences because Markov Chain was explicitly mentioned in his
>>> original question. To quote:
>>>
>>> -----------
>>> X1----X1----X1----...X1  (Markov Chain for input parameter 1 =>
>>> monitoring X1's changes over time)
>>>
>>> X2----X2----X2----...X2  (Markov Chain for input parameter 2 =>
>>> monitoring X2's changes over time)
>>> -----------
>>>
>>> Further, since X1 and X2 were not slated to have any relationship with
>>> each other and since they were the observations of two different
>>> parameters, I construed that X1 and X2 represented two separate hidden
>>> state sequences. I gathered that the hidden state sequences X1 and X2 are
>>> drawn from two disjoint hidden vocabulary sets. The user wants to discover
>>> the model on some training set and then, to the trained model, feed Y for
>>> decoding to arrive at the most likely sequence of states, X1 and X2, which
>>> emitted Y.
>>>
>>> In my answer, I continued with this line, saying that in one training run
>>> you can't arrive at two separate models for X1 and X2 which contain the
>>> requisite distributions for decoding, say, the sequence of X1 that
>>> produced Y or the sequence of X2 that produced Y. Hence, I suggested
>>> having only one set for the hidden states, combining X1s and X2s, and
>>> then training the model on it. Given the domain of application, this may
>>> or may not make sense, hence I was doubtful of formulating the problem as
>>> an HMM and suggested alternatives.
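
One way to picture the "single combined set" suggestion above: if the two
chains really were independent, the pair (X1, X2) would itself be a Markov
chain whose transition matrix is the Kronecker product of the two individual
ones. The sketch below is plain Python with made-up numbers; the names T_BP,
T_HR and joint_transitions are hypothetical and are not part of Mahout.

```python
def joint_transitions(t1, t2):
    """Transition matrix over state pairs (i, j), indexed as i * len(t2) + j.

    Assumes the two chains evolve independently of each other.
    """
    n1, n2 = len(t1), len(t2)
    joint = [[0.0] * (n1 * n2) for _ in range(n1 * n2)]
    for i in range(n1):
        for j in range(n2):
            for k in range(n1):
                for l in range(n2):
                    # P((i, j) -> (k, l)) = P(i -> k) * P(j -> l)
                    joint[i * n2 + j][k * n2 + l] = t1[i][k] * t2[j][l]
    return joint


# Hypothetical two-state chains for blood pressure and heart rate
T_BP = [[0.9, 0.1], [0.2, 0.8]]
T_HR = [[0.7, 0.3], [0.4, 0.6]]

JOINT = joint_transitions(T_BP, T_HR)  # 4 x 4 matrix over combined states
```

The combined state count is the product of the two, so this grows quickly
with more inputs or more states per input.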
>>>
>>> However:
>>>
>>> If X's are two separate input sequences for training, then yes, the
>>> current implementation is capable of training the HMM. If Y is the output,
>>> then one can decode, after training, the sequence of hidden states which
>>> most likely produced Y.
>>>
>>> For the output probability question, my answer was to use the trained
>>> model's HmmModel.getEmissionMatrix().get(hiddenState, emittedState)
>>> method to compute the output probability for a particular hidden state.
>>> I believe this is not what the user wanted?
>>>
>>>
>>> Dhruv
>>>
>>> On Sun, Jul 24, 2011 at 12:56 PM, Ted Dunning<[email protected]>
>>> wrote:
>>>
>>>> On Sun, Jul 24, 2011 at 7:52 AM, Dhruv<[email protected]>  wrote:
>>>>
>>>>> ... If you look into the *definition* of HMM, the hidden sequence is
>>>>> drawn from only one set. The hidden sequence's transitions can be
>>>>> expressed as a joint probability p(s0, s1). Similarly the observed
>>>>> sequence has a joint distribution with the hidden sequence such as
>>>>> p(y0, s1) and so on.
>>>>
>>>> I think gentler language might be a good idea here.  The question was
>>>> not at all unreasonable.
>>>>
>>>>
>>>>> The hidden state transitions follow the Markov memorylessness property
>>>>> and hence form a Markov Chain.
>>>>>
>>>>> In your case, you are trying to model your problem assuming that there
>>>>> are two underlying state sequences affecting the observed output. This
>>>>> doesn't fit into the HMM's definition and you probably want something
>>>>> else.
>>>>>
>>>> Actually, what the original poster wanted is quite sensible.  While the
>>>> output sequence is due to a single input sequence, that input sequence
>>>> is not observable.  As such, we have a noisy channel problem where we
>>>> want to estimate something about that original sequence.  The point of
>>>> the Markov model is that it defines a distribution of output sequence
>>>> given an input sequence (and model).  This distribution can be inverted
>>>> so that given a particular output sequence, we can estimate the
>>>> probability distribution of input sequences conditional on the output.
>>>>
>>>> The typical decoding algorithm for HMM's estimates only the maximum
>>>> likelihood input sequence but this does not negate the fact that we have
>>>> a distribution.  There are alternative decoding algorithms that allow a
>>>> set of high probability sequences to be estimated or allow a partial
>>>> probability lattice to be output that allows alternative sequences to be
>>>> probed.
>>>>
>>>>> If you do want to fit your problem into the HMM framework, you need to
>>>>> condense the X1 and X2 sequences into a single set and then condition
>>>>> the Ys on it.
>>>>
>>>> Not at all.
>>>>
>>>>
>>>>>> 3. Can we get output probabilities from the HMM for a concrete state?
>>>>>
>>>>> Yes, after training, you can retrieve any of the trained model's
>>>>> distributions as a Mahout Matrix type and use get(row, col).
>>>>
>>>> This is not quite what the question was.
>>>>
>>>>
>
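
The inversion discussed in the quoted thread (a probability distribution over
hidden states given the observed output, rather than a single best path) is
commonly computed with the forward-backward algorithm. The following is a
minimal plain-Python sketch, not Mahout's API; the two-state model and all
names are assumptions chosen only for illustration, and no scaling is applied,
so it is suitable for short sequences only.

```python
def state_posteriors(obs, start, trans, emit):
    """Return (posteriors, likelihood) for a discrete HMM.

    posteriors[t][s] = P(state at time t is s | whole observation sequence)
    likelihood       = P(observation sequence | model)
    """
    n, length = len(start), len(obs)

    # Forward pass: alpha[t][s] = P(o_0..o_t, state_t = s)
    alpha = [[start[s] * emit[s][obs[0]] for s in range(n)]]
    for t in range(1, length):
        alpha.append([
            emit[s][obs[t]] * sum(alpha[t - 1][r] * trans[r][s] for r in range(n))
            for s in range(n)
        ])

    # Backward pass: beta[t][s] = P(o_{t+1}..o_{T-1} | state_t = s)
    beta = [[1.0] * n for _ in range(length)]
    for t in range(length - 2, -1, -1):
        beta[t] = [
            sum(trans[s][r] * emit[r][obs[t + 1]] * beta[t + 1][r] for r in range(n))
            for s in range(n)
        ]

    likelihood = sum(alpha[-1])
    posteriors = [
        [alpha[t][s] * beta[t][s] / likelihood for s in range(n)]
        for t in range(length)
    ]
    return posteriors, likelihood


# Hypothetical toy model: two hidden states, two output symbols
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.8, 0.2], [0.1, 0.9]]  # EMIT[state][symbol]

posteriors, likelihood = state_posteriors([0, 1], START, TRANS, EMIT)
```

Each row of `posteriors` is a distribution over hidden states at one time
step, which is different from reading a single entry out of the emission
matrix.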
