Re: HMM investigations

Dhruv Sun, 24 Jul 2011 15:39:32 -0700

Hi Svetlomir,

Thanks for the clarification.


Since, in your case, HR, MAP, SAP etc. play the role of the hidden variables
but are in fact fully observed in the training data, you can simply count the
transitions and emissions to arrive at the required probability distributions.
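
A minimal sketch of that counting approach, in plain Python rather than
Mahout's API. It assumes the states and observations have already been
discretized into integer ids; the function name and the pseudo_count
smoothing parameter are my own choices for illustration.

```python
def estimate_hmm_by_counting(hidden_seqs, observed_seqs, n_hidden, n_observed,
                             pseudo_count=1.0):
    """Estimate HMM parameters from sequences whose hidden states are known.

    hidden_seqs / observed_seqs are parallel lists of integer sequences.
    Returns (initial, transition, emission) as row-normalized lists.
    """
    initial = [pseudo_count] * n_hidden
    transition = [[pseudo_count] * n_hidden for _ in range(n_hidden)]
    emission = [[pseudo_count] * n_observed for _ in range(n_hidden)]

    for hidden, observed in zip(hidden_seqs, observed_seqs):
        if len(hidden) != len(observed):
            raise ValueError("hidden and observed sequences must align")
        if not hidden:
            continue
        initial[hidden[0]] += 1
        # Count state -> next-state transitions.
        for prev, curr in zip(hidden, hidden[1:]):
            transition[prev][curr] += 1
        # Count state -> emitted-symbol pairs.
        for state, symbol in zip(hidden, observed):
            emission[state][symbol] += 1

    def normalize(row):
        total = sum(row)
        if total == 0:
            # No evidence at all for this row: fall back to uniform.
            return [1.0 / len(row)] * len(row)
        return [value / total for value in row]

    return (normalize(initial),
            [normalize(row) for row in transition],
            [normalize(row) for row in emission])
```

With a positive pseudo_count, transitions and emissions that never occur in
the training data still get a small non-zero probability.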

Is there a previous thread on Mahout where you have discussed this problem?
I need to understand exactly what you are trying to predict, the
cause-and-effect relationships, and so on. That information can help model
the problem in a way that is more amenable to Mahout's HMM trainers.


On Sun, Jul 24, 2011 at 3:44 PM, Svetlomir Kasabov <
[email protected]> wrote:

> So, that is my sample data. The column "instable" is the outcome variable;
> HR, SAP, MAP etc. are the minute-by-minute raw data. From these I extracted
> derived features (using percentiles) and created one training example from
> the data for i=1 to i=25 with instable = yes/no, and so on...
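
The feature extraction described above might look roughly like the sketch
below. The thread does not say which percentiles were used, so the 25/50/75
choice, the nearest-rank definition, and all function names are assumptions
for illustration only. The parser handles the decimal commas in the sample.

```python
import math


def parse_decimal_comma(text):
    """Convert a value such as '114,0' (decimal comma) to a float."""
    return float(text.replace(",", "."))


def percentile(values, p):
    """Nearest-rank percentile of a non-empty list, for 0 < p <= 100."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


def window_features(rows, columns=("HR", "SAP", "MAP"), percentiles=(25, 50, 75)):
    """Collapse one window of per-minute rows into a flat feature dict."""
    features = {}
    for column in columns:
        values = [parse_decimal_comma(row[column]) for row in rows]
        for p in percentiles:
            features[f"{column}_p{p}"] = percentile(values, p)
    return features


# First four rows of the "instable = yes" window from the sample data.
rows = [
    {"HR": "114,0", "SAP": "87,0", "MAP": "74,0"},
    {"HR": "113,0", "SAP": "89,0", "MAP": "70,0"},
    {"HR": "110,0", "SAP": "145,0", "MAP": "116,0"},
    {"HR": "109,0", "SAP": "202,0", "MAP": "201,0"},
]
features = window_features(rows)
```

Each 25-row window then becomes a single training example whose label is the
value of the "instable" column.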
>
>
> Thank you.
>
>
>
>
> i       instable        HR      SAP     MAP     ShockIndex              tStamp
> 1       yes     114,0   87,0    74,0    1,5405405405405406      14.Mrz.10,_11:29:00
> 2       yes     113,0   89,0    70,0    1,6142857142857143      14.Mrz.10,_11:30:00
> 3       yes     110,0   145,0   116,0   0,9482758620689655      14.Mrz.10,_11:31:00
> 4       yes     109,0   202,0   201,0   0,5422885572139303      14.Mrz.10,_11:32:00
> 5       yes     111,0   207,0   205,0   0,5414634146341464      14.Mrz.10,_11:33:00
> 6       yes     109,0   209,0   208,0   0,5240384615384616      14.Mrz.10,_11:34:00
> 7       yes     112,0   144,0   116,0   0,9655172413793104      14.Mrz.10,_11:35:00
> 8       yes     111,0   112,0   87,0    1,2758620689655173      14.Mrz.10,_11:36:00
> 9       yes     111,0   105,0   84,0    1,3214285714285714      14.Mrz.10,_11:37:00
> 10      yes     111,0   102,0   73,0    1,5205479452054795      14.Mrz.10,_11:38:00
> 11      yes     111,0   103,0   72,0    1,5416666666666667      14.Mrz.10,_11:39:00
> 12      yes     115,0   94,0    74,0    1,554054054054054       14.Mrz.10,_11:40:00
> 13      yes     113,0   91,0    67,0    1,6865671641791045      14.Mrz.10,_11:41:00
> 14      yes     109,0   124,0   101,0   1,0792079207920793      14.Mrz.10,_11:42:00
> 15      yes     109,0   147,0   123,0   0,8861788617886179      14.Mrz.10,_11:43:00
> 16      yes     110,0   93,0    69,0    1,5942028985507246      14.Mrz.10,_11:44:00
> 17      yes     108,0   91,0    74,0    1,4594594594594594      14.Mrz.10,_11:45:00
> 18      yes     109,0   83,0    69,0    1,5797101449275361      14.Mrz.10,_11:46:00
> 19      yes     110,0   94,0    70,0    1,5714285714285714      14.Mrz.10,_11:47:00
> 20      yes     109,0   104,0   73,0    1,4931506849315068      14.Mrz.10,_11:48:00
> 21      yes     107,0   103,0   68,0    1,5735294117647058      14.Mrz.10,_11:49:00
> 22      yes     109,0   94,0    69,0    1,5797101449275361      14.Mrz.10,_11:50:00
> 23      yes     108,0   90,0    66,0    1,6363636363636365      14.Mrz.10,_11:51:00
> 24      yes     109,0   97,0    73,0    1,4931506849315068      14.Mrz.10,_11:52:00
> 25      yes     110,0   105,0   73,0    1,5068493150684932      14.Mrz.10,_11:53:00
>
>
> 1       no      84,0    138,0   87,0    0,9655172413793104      22.Dez.10,_04:10:00
> 2       no      83,0    139,0   87,0    0,9540229885057471      22.Dez.10,_04:11:00
> 3       no      80,0    142,0   89,0    0,898876404494382       22.Dez.10,_04:12:00
> 4       no      82,0    142,0   87,0    0,9425287356321839      22.Dez.10,_04:13:00
> 5       no      81,0    140,0   87,0    0,9310344827586207      22.Dez.10,_04:14:00
> 6       no      77,0    138,0   85,0    0,9058823529411765      22.Dez.10,_04:15:00
> 7       no      80,0    143,0   89,0    0,898876404494382       22.Dez.10,_04:16:00
> 8       no      75,0    139,0   87,0    0,8620689655172413      22.Dez.10,_04:17:00
> 9       no      79,0    137,0   84,0    0,9404761904761905      22.Dez.10,_04:18:00
> 10      no      81,0    143,0   89,0    0,9101123595505618      22.Dez.10,_04:19:00
> 11      no      82,0    142,0   91,0    0,9010989010989011      22.Dez.10,_04:20:00
> 12      no      80,0    142,0   88,0    0,9090909090909091      22.Dez.10,_04:21:00
> 13      no      79,0    146,0   90,0    0,8777777777777778      22.Dez.10,_04:22:00
> 14      no      83,0    151,0   94,0    0,8829787234042553      22.Dez.10,_04:23:00
> 15      no      78,0    146,0   90,0    0,8666666666666667      22.Dez.10,_04:24:00
> 16      no      80,0    143,0   89,0    0,898876404494382       22.Dez.10,_04:25:00
> 17      no      81,0    143,0   88,0    0,9204545454545454      22.Dez.10,_04:26:00
> 18      no      79,0    143,0   88,0    0,8977272727272727      22.Dez.10,_04:27:00
> 19      no      85,0    145,0   90,0    0,9444444444444444      22.Dez.10,_04:28:00
> 20      no      82,0    138,0   88,0    0,9318181818181818      22.Dez.10,_04:29:00
> 21      no      81,0    146,0   91,0    0,8901098901098901      22.Dez.10,_04:30:00
> 22      no      83,0    135,0   86,0    0,9651162790697675      22.Dez.10,_04:31:00
> 23      no      80,0    143,0   89,0    0,898876404494382       22.Dez.10,_04:32:00
> 24      no      85,0    141,0   88,0    0,9659090909090909      22.Dez.10,_04:33:00
> 25      no      88,0    135,0   88,0    1,0                     22.Dez.10,_04:34:00
>
>
>
>
>
> On 24.07.2011 21:15, Ted Dunning wrote:
>
>> I remember this problem.
>>
>>
>> Is it possible for you to post some sample data?
>>
>> On Sun, Jul 24, 2011 at 12:08 PM, Svetlomir Kasabov<
>> [email protected]>  wrote:
>>
>>> Hello again, and thanks to both of you for the replies; I really
>>> appreciate them. The most important thing is that you are trying to help;
>>> how you do it is irrelevant :). I didn't feel angry/insulted.
>>>
>>>
>>> Yes, X1 and X2 are two independent hidden sequences, like
>>>
>>> BP -- BP -- BP (Blood Pressure)
>>> HR -- HR -- HR (Heart Rate)
>>> And I want to train the model to predict the probability of giving a drug
>>> Y to a patient (for example, with this sequence)
>>> Y=0 -- Y=0 -- Y=1
>>>
>>> I already tried this with logistic regression, but ended up with poor
>>> results (probably because of my small example set). Logistic regression
>>> also has no built-in notion of time series, which is why I must analyze
>>> the X's changes using percentiles and then train the logistic model with
>>> these percentiles. In this way I reduce the dimensions to only one.
>>> That's why I thought that HMMs could do this for me 'out of the box',
>>> staying in the dimension of 2, if they allow two hidden chains, like
>>> this:
>>>
>>> http://t3.gstatic.com/images?q=tbn:ANd9GcR8pu4bSm-MSyg3Pj0-aTyi8FaqUOy4U2bcKJBTBYKKvgAhyw6P
>>>
>>> or 'coupled' HMMs.
>>>
>>> I am not very experienced with HMMs, but I will read further into the
>>> literature and Mahout's API :).
>>>
>>> Maybe reducing the dimensions is not such a bad idea? I've read that we
>>> can do it with PCA (Principal Component Analysis). Is there Mahout code
>>> for this somewhere?
>>>
>>> Thanks a lot once again,
>>>
>>> Svetlomir.
>>>
>>>
>>>
>>> On 24.07.2011 20:46, Ted Dunning wrote:
>>>
>>>> My impression (and Svetlomir should correct me) is that the intent was
>>>> to use two HMM's on separate inputs and then use the decoded state
>>>> sequences from those as inputs to a third HMM.
>>>>
>>>> If that is the question, then I think that Mahout's HMM's are
>>>> sufficiently object-ish that this should work.  Obviously, it will take
>>>> multiple training passes to train each separate model.
>>>>
>>>> On Sun, Jul 24, 2011 at 11:25 AM, Dhruv<[email protected]>   wrote:
>>>>
>>>>> Svetlomir and Ted -- I was not trying to be rude, sorry if I came
>>>>> across that way because of my exuberance. I apologize.
>>>>>
>>>>> I was eager to help and may have acted too fast and misunderstood the
>>>>> question, so I turn to both of you for a little clarification.
>>>>>
>>>>> I'm confused whether the X's refer to the hidden states, or training
>>>>> instances. Since the hidden sequence is always a Markov Chain in HMMs,
>>>>> I assumed that Svetlomir meant that X1 and X2 were two separate hidden
>>>>> state sequences because Markov Chain was explicitly mentioned in his
>>>>> original question. To quote:
>>>>>
>>>>> -----------
>>>>> X1----X1----X1----...X1  (Markov Chain for input parameter 1 =>
>>>>> monitoring X1's changes over time)
>>>>>
>>>>> X2----X2----X2----...X2  (Markov Chain for input parameter 2 =>
>>>>> monitoring X2's changes over time)
>>>>> -----------
>>>>>
>>>>> Further, since X1 and X2 were not stated to have any relationship with
>>>>> each other and since they were the observations of two different
>>>>> parameters, I construed that X1 and X2 represented two separate hidden
>>>>> state sequences. I gathered that the hidden state sequences X1 and X2
>>>>> are drawn from two disjoint hidden vocabulary sets. The user wants to
>>>>> discover the model on some training set and then, to the trained model,
>>>>> feed Y for decoding to arrive at the most likely sequence of states, X1
>>>>> and X2, which emitted Y.
>>>>>
>>>>> In my answer, I continued along this line, saying that in one training
>>>>> run you can't arrive at two separate models for X1 and X2 containing
>>>>> the requisite distributions for decoding, say, the sequence of X1 that
>>>>> produced Y or the sequence of X2 that produced Y. Hence, I suggested
>>>>> having only one set for the hidden states, combining the X1s and X2s,
>>>>> and then training the model on it. Given the domain of application,
>>>>> this may or may not make sense, hence I was doubtful of formulating the
>>>>> problem as an HMM and suggested alternatives.
>>>>>
>>>>> However:
>>>>>
>>>>> If X's are two separate input sequences for training, then yes, the
>>>>> current implementation is capable of training the HMM. If Y is the
>>>>> output, then one can decode, after training, the sequence of hidden
>>>>> states which most likely produced Y.
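
For reference, decoding the most likely hidden state sequence is usually done
with the Viterbi algorithm. The following is a generic plain-Python sketch,
not Mahout's implementation; the function name and the list-based parameter
layout are assumptions made for illustration.

```python
import math


def viterbi(observations, initial, transition, emission):
    """Return the most likely hidden state path for a list of symbol ids.

    initial[s], transition[prev][curr] and emission[state][symbol] are
    probabilities; the computation runs in log space to avoid underflow.
    """
    if not observations:
        return []
    n_states = len(initial)

    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # Best log score of any path ending in each state at time 0.
    scores = [log(initial[s]) + log(emission[s][observations[0]])
              for s in range(n_states)]
    backpointers = []

    for symbol in observations[1:]:
        new_scores, pointers = [], []
        for curr in range(n_states):
            # Pick the predecessor that maximizes the path score into curr.
            best_prev = max(range(n_states),
                            key=lambda prev: scores[prev] + log(transition[prev][curr]))
            pointers.append(best_prev)
            new_scores.append(scores[best_prev]
                              + log(transition[best_prev][curr])
                              + log(emission[curr][symbol]))
        scores = new_scores
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

This returns only the single best path; it does not give a probability
distribution over alternative state sequences.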
>>>>>
>>>>> For the output probability question, my answer was to use the trained
>>>>> model's HmmModel.getEmissionMatrix().get(hiddenState, emittedState)
>>>>> method to compute the output probability for a particular hidden state.
>>>>> I believe this is not what the user wanted?
>>>>>
>>>>>
>>>>> Dhruv
>>>>>
>>>>> On Sun, Jul 24, 2011 at 12:56 PM, Ted Dunning<[email protected]>
>>>>> wrote:
>>>>>
>>>>>> On Sun, Jul 24, 2011 at 7:52 AM, Dhruv<[email protected]>   wrote:
>>>>>>
>>>>>>> ... If you look into the *definition* of HMM, the hidden sequence is
>>>>>>> drawn from only one set. The hidden sequence's transitions can be
>>>>>>> expressed as a joint probability p(s0, s1). Similarly the observed
>>>>>>> sequence has a joint distribution with the hidden sequence such as
>>>>>>> p(y0, s1) and so on.
>>>>>>
>>>>>> I think gentler language might be a good idea here.  The question was
>>>>>> not at all unreasonable.
>>>>>>
>>>>>>
>>>>>>> The hidden state transitions follow the Markov memorylessness
>>>>>>> property and hence form a Markov Chain.
>>>>>>>
>>>>>>> In your case, you are trying to model your problem assuming that
>>>>>>> there are two underlying state sequences affecting the observed
>>>>>>> output. This doesn't fit into the HMM's definition and you probably
>>>>>>> want something else.
>>>>>>
>>>>>> Actually, what the original poster wanted is quite sensible.  While
>>>>>> the output sequence is due to a single input sequence, that input
>>>>>> sequence is not observable.  As such, we have a noisy channel problem
>>>>>> where we want to estimate something about that original sequence.  The
>>>>>> point of the Markov model is that it defines a distribution of output
>>>>>> sequence given an input sequence (and model).  This distribution can
>>>>>> be inverted so that given a particular output sequence, we can
>>>>>> estimate the probability distribution of input sequences conditional
>>>>>> on the output.
>>>>>>
>>>>>> The typical decoding algorithm for HMM's estimates only the maximum
>>>>>> likelihood input sequence but this does not negate the fact that we
>>>>>> have a distribution.  There are alternative decoding algorithms that
>>>>>> allow a set of high probability sequences to be estimated or allow a
>>>>>> partial probability lattice to be output that allows alternative
>>>>>> sequences to be probed.
>>>>>>
>>>>>>> If you do want to fit your problem into the HMM framework, you need
>>>>>>> to condense the X1 and X2 sequences into a single set and then
>>>>>>> condition the Ys on it.
>>>>>>
>>>>>> Not at all.
>>>>>>
>>>>>>>> 3. Can we get output probabilities from the HMM for a concrete state?
>>>>>>>
>>>>>>> Yes, after training, you can retrieve any of the trained model's
>>>>>>> distributions as a Mahout Matrix type and use get(row, col).
>>>>>>
>>>>>> This is not quite what the question was.
>>>>>>
>
