Re: HMM investigations

Svetlomir Kasabov Sun, 24 Jul 2011 23:28:51 -0700

Hello Dhruv, and thanks for your cooperation,

the short description of my problem is this article (which is also my main source): http://www.ncbi.nlm.nih.gov/pubmed/19163540. I am trying to predict the administration of a drug Y (which indicates hemodynamic instability), based on the patient's vital signs such as blood pressure and heart rate.

The long description of the problem is here: www.multi-science.co.uk/acce-free.pdf. I will summarize the information in these two points:

1. In order to extract a training example for predicting unstable patients, the authors checked when a drug Y was given (for example, at time t), went back 4 hours, and used the data from t-4 to t-2. The authors used logistic regression for training. As we know, the equation Y = β0 + x1*β1 + ... + xn*βn is one-dimensional in character, but the problem is two-dimensional: for each patient, you have time as one dimension and multiple parameters, such as Systolic Arterial Pressure (SAP), Heart Rate (HR) etc., as the second dimension. So they used percentiles over the above-mentioned time segment in order to map, for example, HR's changes to x1.

2. On page 15, figure 2 of the PDF, you can see how well the chosen parameters (in percentiles) differentiate the stable from the unstable patients.
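
To make the percentile mapping in point 1 concrete, here is a minimal Python sketch. It is only an illustration: the chosen percentiles (10, 50, 90), the function names and the five-minute window are assumptions of mine, not the values or the method used in the article. The numbers are the first five HR and SAP readings of the unstable sample quoted further down in this thread.

```python
def percentile(values, p):
    """Return the p-th percentile (0..100) using linear interpolation."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def window_features(series, percentiles=(10, 50, 90)):
    """Flatten one time window of several vital signs into a feature dict.

    The percentile choice is a placeholder, not taken from the article.
    """
    features = {}
    for name, values in series.items():
        for p in percentiles:
            features[f"{name}_p{p}"] = percentile(values, p)
    return features


# First five minutes of HR and SAP from the unstable sample in this thread.
window = {
    "HR": [114.0, 113.0, 110.0, 109.0, 111.0],
    "SAP": [87.0, 89.0, 145.0, 202.0, 207.0],
}
features = window_features(window)
print(features)
```

Each window then becomes one flat row (x1..xn) for the logistic model, which is how the time dimension gets collapsed.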

@Ted: many thanks to you too. Your analysis is great. I will post more data in a few minutes...



Svetlomir.

On 25.07.2011 00:39, Dhruv wrote:

Hi Svetlomir,

Thanks for the clarification.

Since in your case the HR, MAP, SAP etc are the hidden variables but they
are completely observable, you can just count the transitions, emissions etc
to arrive at the required probability distributions.
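
Counting transitions and emissions from fully labelled sequences could look roughly like the sketch below. It is plain Python and not Mahout's trainer; the state labels, the observation symbols and the toy sequences are hypothetical, and no smoothing is applied, so events that never occur in the training data simply get no entry.

```python
from collections import Counter, defaultdict


def normalize(counts):
    """Turn raw counts into relative frequencies."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}


def estimate_hmm(sequences):
    """Estimate HMM parameters by counting.

    Each sequence is a list of (state, observation) pairs, i.e. the
    "hidden" state is known for every time step of the training data.
    """
    initial = Counter()
    transitions = defaultdict(Counter)
    emissions = defaultdict(Counter)
    for sequence in sequences:
        if not sequence:
            continue
        initial[sequence[0][0]] += 1
        for state, observation in sequence:
            emissions[state][observation] += 1
        for (prev_state, _), (next_state, _) in zip(sequence, sequence[1:]):
            transitions[prev_state][next_state] += 1
    return (
        normalize(initial),
        {state: normalize(c) for state, c in transitions.items()},
        {state: normalize(c) for state, c in emissions.items()},
    )


# Hypothetical toy data: (state, discretized heart-rate symbol) per minute.
sequences = [
    [("stable", "hr_normal"), ("stable", "hr_normal"), ("unstable", "hr_high")],
    [("stable", "hr_normal"), ("unstable", "hr_high"), ("unstable", "hr_high")],
]
initial, transitions, emissions = estimate_hmm(sequences)
print(initial, transitions, emissions)
```

With real data one would normally add pseudo-counts before normalizing, so that a transition that happens to be missing from a small training set does not end up with probability zero.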

Is there a previous thread on Mahout where you have discussed this problem?
I need to understand your requirements: what exactly you are trying to
predict, the cause-and-effect relationship, etc. That information can help
model the problem in a way that is more amenable to Mahout's HMM trainers.


On Sun, Jul 24, 2011 at 3:44 PM, Svetlomir Kasabov<[email protected]> wrote:

So, that is my sample data. The column "instable" is the outcome variable;
HR, SAP, MAP etc. are the minute-by-minute raw data. From these I extracted
derived features (using percentiles) and created a training example with the
data from i=1 to i=25 with instable = yes/no and so on...


Thank you.




i   instable  HR      SAP     MAP     ShockIndex           tStamp
1   yes       114,0   87,0    74,0    1,5405405405405406   14.Mrz.10,_11:29:00
2   yes       113,0   89,0    70,0    1,6142857142857143   14.Mrz.10,_11:30:00
3   yes       110,0   145,0   116,0   0,9482758620689655   14.Mrz.10,_11:31:00
4   yes       109,0   202,0   201,0   0,5422885572139303   14.Mrz.10,_11:32:00
5   yes       111,0   207,0   205,0   0,5414634146341464   14.Mrz.10,_11:33:00
6   yes       109,0   209,0   208,0   0,5240384615384616   14.Mrz.10,_11:34:00
7   yes       112,0   144,0   116,0   0,9655172413793104   14.Mrz.10,_11:35:00
8   yes       111,0   112,0   87,0    1,2758620689655173   14.Mrz.10,_11:36:00
9   yes       111,0   105,0   84,0    1,3214285714285714   14.Mrz.10,_11:37:00
10  yes       111,0   102,0   73,0    1,5205479452054795   14.Mrz.10,_11:38:00
11  yes       111,0   103,0   72,0    1,5416666666666667   14.Mrz.10,_11:39:00
12  yes       115,0   94,0    74,0    1,554054054054054    14.Mrz.10,_11:40:00
13  yes       113,0   91,0    67,0    1,6865671641791045   14.Mrz.10,_11:41:00
14  yes       109,0   124,0   101,0   1,0792079207920793   14.Mrz.10,_11:42:00
15  yes       109,0   147,0   123,0   0,8861788617886179   14.Mrz.10,_11:43:00
16  yes       110,0   93,0    69,0    1,5942028985507246   14.Mrz.10,_11:44:00
17  yes       108,0   91,0    74,0    1,4594594594594594   14.Mrz.10,_11:45:00
18  yes       109,0   83,0    69,0    1,5797101449275361   14.Mrz.10,_11:46:00
19  yes       110,0   94,0    70,0    1,5714285714285714   14.Mrz.10,_11:47:00
20  yes       109,0   104,0   73,0    1,4931506849315068   14.Mrz.10,_11:48:00
21  yes       107,0   103,0   68,0    1,5735294117647058   14.Mrz.10,_11:49:00
22  yes       109,0   94,0    69,0    1,5797101449275361   14.Mrz.10,_11:50:00
23  yes       108,0   90,0    66,0    1,6363636363636365   14.Mrz.10,_11:51:00
24  yes       109,0   97,0    73,0    1,4931506849315068   14.Mrz.10,_11:52:00
25  yes       110,0   105,0   73,0    1,5068493150684932   14.Mrz.10,_11:53:00


1   no        84,0    138,0   87,0    0,9655172413793104   22.Dez.10,_04:10:00
2   no        83,0    139,0   87,0    0,9540229885057471   22.Dez.10,_04:11:00
3   no        80,0    142,0   89,0    0,898876404494382    22.Dez.10,_04:12:00
4   no        82,0    142,0   87,0    0,9425287356321839   22.Dez.10,_04:13:00
5   no        81,0    140,0   87,0    0,9310344827586207   22.Dez.10,_04:14:00
6   no        77,0    138,0   85,0    0,9058823529411765   22.Dez.10,_04:15:00
7   no        80,0    143,0   89,0    0,898876404494382    22.Dez.10,_04:16:00
8   no        75,0    139,0   87,0    0,8620689655172413   22.Dez.10,_04:17:00
9   no        79,0    137,0   84,0    0,9404761904761905   22.Dez.10,_04:18:00
10  no        81,0    143,0   89,0    0,9101123595505618   22.Dez.10,_04:19:00
11  no        82,0    142,0   91,0    0,9010989010989011   22.Dez.10,_04:20:00
12  no        80,0    142,0   88,0    0,9090909090909091   22.Dez.10,_04:21:00
13  no        79,0    146,0   90,0    0,8777777777777778   22.Dez.10,_04:22:00
14  no        83,0    151,0   94,0    0,8829787234042553   22.Dez.10,_04:23:00
15  no        78,0    146,0   90,0    0,8666666666666667   22.Dez.10,_04:24:00
16  no        80,0    143,0   89,0    0,898876404494382    22.Dez.10,_04:25:00
17  no        81,0    143,0   88,0    0,9204545454545454   22.Dez.10,_04:26:00
18  no        79,0    143,0   88,0    0,8977272727272727   22.Dez.10,_04:27:00
19  no        85,0    145,0   90,0    0,9444444444444444   22.Dez.10,_04:28:00
20  no        82,0    138,0   88,0    0,9318181818181818   22.Dez.10,_04:29:00
21  no        81,0    146,0   91,0    0,8901098901098901   22.Dez.10,_04:30:00
22  no        83,0    135,0   86,0    0,9651162790697675   22.Dez.10,_04:31:00
23  no        80,0    143,0   89,0    0,898876404494382    22.Dez.10,_04:32:00
24  no        85,0    141,0   88,0    0,9659090909090909   22.Dez.10,_04:33:00
25  no        88,0    135,0   88,0    1,0                  22.Dez.10,_04:34:00





On 24.07.2011 21:15, Ted Dunning wrote:

I remember this problem.


Is it possible for you to post some sample data?

On Sun, Jul 24, 2011 at 12:08 PM, Svetlomir Kasabov<[email protected]> wrote:

Hello again, and thanks to both of you for the replies; I really appreciate
them. The most important thing is that you are trying to help, and how you
do this is irrelevant :). I didn't feel angry or insulted.


Yes, X1 and X2 are two independent hidden sequences, like

BP -- BP -- BP (Blood Pressure)
HR -- HR -- HR (Heart Rate)
And I want to train the model to predict the probability of giving a drug Y
to a patient (for example, with this sequence)
Y=0 -- Y=0 -- Y=1

I already tried this with logistic regression, but ended up with poor
results (probably because of my small example set). Logistic regression also
has no built-in notion of time series, and that's why I must analyze the X's
changes using percentiles and then train the logistic model with these
percentiles. In this way I reduce the dimensions to only one. That's why I
thought that HMMs could do this for me 'out of the box', staying in two
dimensions, if they allow two hidden chains, like this:

http://t3.gstatic.com/images?q=tbn:ANd9GcR8pu4bSm-MSyg3Pj0-aTyi8FaqUOy4U2bcKJBTBYKKvgAhyw6P
or 'coupled' HMMs.

I am not very experienced with HMMs, but I will read further into the
literature and Mahout's API :).

Maybe reducing the dimensions is not such a bad idea? I've read that we can
do it with PCA (Principal Component Analysis). Is there Mahout code for
this somewhere?

Thanks a lot once again,

Svetlomir.



On 24.07.2011 20:46, Ted Dunning wrote:

My impression (and Svetlomir should correct me) is that the intent was to
use two HMM's on separate inputs and then use the decoded state sequences
from those as inputs to a third HMM.

If that is the question, then I think that Mahout's HMM's are sufficiently
object-ish that this should work.  Obviously, it will take multiple training
passes to train each separate model.

On Sun, Jul 24, 2011 at 11:25 AM, Dhruv<[email protected]>    wrote:

Svetlomir and Ted -- I was not trying to be rude, sorry if I came across
that way because of my exuberance. I apologize.

I was eager to help and may have acted too fast and misunderstood the
question, so I turn to both of you for a little clarification.

I'm confused whether the X's refer to the hidden states, or training
instances. Since the hidden sequence is always a Markov Chain in HMMs, I
assumed that Svetlomir meant that X1 and X2 were two separate hidden state
sequences because Markov Chain was explicitly mentioned in his original
question. To quote:

-----------
X1----X1----X1----...X1  (Markov Chain for input parameter 1 =>
                          monitoring X1's changes over time)

X2----X2----X2----...X2  (Markov Chain for input parameter 2 =>
                          monitoring X2's changes over time)
-----------

Further, since X1 and X2 were not slated to have any relationship with each
other and since they were the observations of two different parameters, I
construed that X1 and X2 represented two separate hidden state sequences. I
gathered that the hidden state sequences X1 and X2 are drawn from two
disjoint hidden vocabulary sets. The user wants to discover the model on
some training set and then, to the trained model, feed Y for decoding to
arrive at the most likely sequence of states, X1 and X2, which emitted Y.

In my answer, I continued along this line, saying that in one training run
you can't arrive at two separate models for X1 and X2 which contain the
requisite distributions for decoding, say, the sequence of X1 that produced
Y or the sequence of X2 that produced Y. Hence, I suggested having only one
set for the hidden states, combining the X1s and X2s, and then training the
model on it. Given the domain of application, this may or may not make
sense, hence I was doubtful of formulating the problem as an HMM and
suggested alternatives.
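
As an illustration of what "combining X1s and X2s into one set" can mean in practice, here is a small Python sketch. It is only one possible reading: the discretized levels for blood pressure and heart rate are hypothetical, and the function names are mine. Every (BP, HR) pair is mapped to a single integer id, so two parallel sequences become one sequence over a joint alphabet.

```python
from itertools import product


def joint_alphabet(levels_a, levels_b):
    """Assign one integer id to every (a, b) combination."""
    return {pair: index for index, pair in enumerate(product(levels_a, levels_b))}


def encode_joint(seq_a, seq_b, alphabet):
    """Merge two equally long symbol sequences into one id sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [alphabet[pair] for pair in zip(seq_a, seq_b)]


# Hypothetical discretization of the two vital signs.
bp_levels = ("low", "normal", "high")
hr_levels = ("normal", "high")
alphabet = joint_alphabet(bp_levels, hr_levels)

bp = ["normal", "low", "low"]
hr = ["normal", "high", "high"]
joint = encode_joint(bp, hr, alphabet)
print(joint)
```

The price is that the joint set grows multiplicatively (3 x 2 = 6 symbols here), so with many parameters or fine-grained levels the counts per symbol get sparse quickly.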

However:

If X's are two separate input sequences for training, then yes, the current
implementation is capable of training the HMM. If Y is the output, then one
can decode, after training, the sequence of hidden states which most likely
produced Y.
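
For readers who have not seen the decoding step, the following is a minimal Viterbi sketch in plain Python. It is not Mahout's implementation, and all probabilities, state names and observation symbols are made-up toy values chosen only to show the mechanics.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) of the most likely hidden state sequence."""
    # best[t][s] = (probability of the best path ending in s at time t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            column[s] = max(
                (best[-1][r][0] * trans_p[r][s] * emit_p[s][obs], r)
                for r in states
            )
        best.append(column)
    # Pick the best final state, then follow the predecessors backwards.
    last = max(states, key=lambda s: best[-1][s][0])
    prob = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return prob, path


# Hypothetical toy model: hidden patient state, observed drug administration.
states = ("stable", "unstable")
start_p = {"stable": 0.8, "unstable": 0.2}
trans_p = {
    "stable": {"stable": 0.9, "unstable": 0.1},
    "unstable": {"stable": 0.2, "unstable": 0.8},
}
emit_p = {
    "stable": {"no_drug": 0.95, "drug": 0.05},
    "unstable": {"no_drug": 0.3, "drug": 0.7},
}
prob, path = viterbi(["no_drug", "no_drug", "drug"], states, start_p, trans_p, emit_p)
print(prob, path)
```

For long sequences the products underflow, so a real implementation would work with log probabilities or rescale each column.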

For the output probability question, my answer was to use the trained
model's HmmModel.getEmissionMatrix().get(hiddenState, emittedState) method
to compute the output probability for a particular hidden state. I believe
this is not what the user wanted?


Dhruv

On Sun, Jul 24, 2011 at 12:56 PM, Ted Dunning<[email protected]> wrote:

  On Sun, Jul 24, 2011 at 7:52 AM, Dhruv<[email protected]>    wrote:

... If you look into the *definition* of HMM, the hidden sequence is drawn
from only one set. The hidden sequence's transitions can be expressed as a
joint probability p(s0, s1). Similarly the observed sequence has a joint
distribution with the hidden sequence such as p(y0, s1) and so on.

I think gentler language might be a good idea here.  The question was not at
all unreasonable.


The hidden state transitions follow the Markov memorylessness property and
hence form a Markov Chain.

In your case, you are trying to model your problem assuming that there are
two underlying state sequences affecting the observed output. This doesn't
fit into the HMM's definition and you probably want something else.

Actually, what the original poster wanted is quite sensible.  While the
output sequence is due to a single input sequence, that input sequence is
not observable.  As such, we have a noisy channel problem where we want to
estimate something about that original sequence.  The point of the Markov
model is that it defines a distribution of output sequence given an input
sequence (and model).  This distribution can be inverted so that given a
particular output sequence, we can estimate the probability distribution of
input sequences conditional on the output.

The typical decoding algorithm for HMM's estimates only the maximum
likelihood input sequence but this does not negate the fact that we have a
distribution.  There are alternative decoding algorithms that allow a set of
high probability sequences to be estimated or allow a partial probability
lattice to be output that allows alternative sequences to be probed.
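
One simple way to look at that distribution, rather than at a single best path, is the forward-backward algorithm, which gives the posterior probability of each hidden state at each time step. The sketch below is plain Python with made-up toy probabilities; it illustrates the idea and is not a description of what Mahout provides.

```python
def posterior_marginals(observations, states, start_p, trans_p, emit_p):
    """Return, per time step, P(state | all observations)."""
    # Forward pass: alpha[t][s] = P(obs[0..t], state_t = s)
    alpha = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    for obs in observations[1:]:
        prev = alpha[-1]
        alpha.append({
            s: emit_p[s][obs] * sum(prev[r] * trans_p[r][s] for r in states)
            for s in states
        })
    # Backward pass: beta[t][s] = P(obs[t+1..] | state_t = s)
    beta = [{s: 1.0 for s in states}]
    for obs in reversed(observations[1:]):
        nxt = beta[0]
        beta.insert(0, {
            s: sum(trans_p[s][r] * emit_p[r][obs] * nxt[r] for r in states)
            for s in states
        })
    # Probability of the whole observation sequence under the model.
    evidence = sum(alpha[-1].values())
    return [
        {s: alpha[t][s] * beta[t][s] / evidence for s in states}
        for t in range(len(observations))
    ]


# Hypothetical toy model: hidden patient state, observed drug administration.
states = ("stable", "unstable")
start_p = {"stable": 0.8, "unstable": 0.2}
trans_p = {
    "stable": {"stable": 0.9, "unstable": 0.1},
    "unstable": {"stable": 0.2, "unstable": 0.8},
}
emit_p = {
    "stable": {"no_drug": 0.95, "drug": 0.05},
    "unstable": {"no_drug": 0.3, "drug": 0.7},
}
posteriors = posterior_marginals(
    ["no_drug", "no_drug", "drug"], states, start_p, trans_p, emit_p
)
for t, marginal in enumerate(posteriors):
    print(t, marginal)
```

Each row is a column of the lattice: instead of a hard decision it says how much probability mass sits on each state at that minute. As with Viterbi, long sequences need scaling or log-space arithmetic to avoid underflow.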

If you do want to fit your problem into the HMM framework, you need to
condense the X1 and X2 sequences into a single set and then condition the Ys
on it.

  Not at all.

  3. Can we get output probabilities from the HMM for a concrete state?

  Yes, after training, you can retrieve any of the trained model's
distributions as a Mahout Matrix type and use get(row, col).

  This is not quite what the question was.
