I remember this problem. Is it possible for you to post some sample data?
On Sun, Jul 24, 2011 at 12:08 PM, Svetlomir Kasabov <[email protected]> wrote:

> Hello again, and thanks to both of you for the replies; I really
> appreciate them. The most important thing is that you are trying to
> help; how you do it is irrelevant :). I didn't feel angry/insulted.
>
> Yes, X1 and X2 are two independent hidden sequences, like
>
> BP -- BP -- BP (Blood Pressure)
> HR -- HR -- HR (Heart Rate)
>
> And I want to train the model to predict the probability of giving a
> drug Y to a patient (for example, with this sequence)
>
> Y=0 -- Y=0 -- Y=1
>
> I already tried this with logistic regression, but ended up with poor
> results (probably because of my small example set). Logistic regression
> also has no built-in notion of time series, which is why I must analyze
> the X's changes using percentiles and then train the logistic model
> with these percentiles. In this way I reduce the dimensions to only
> one. That's why I thought that the HMM could do this for me 'out of the
> box', staying in dimension 2, if it allows two hidden chains, like
> this:
>
> http://t3.gstatic.com/images?q=tbn:ANd9GcR8pu4bSm-MSyg3Pj0-aTyi8FaqUOy4U2bcKJBTBYKKvgAhyw6P
>
> or 'coupled' HMMs.
>
> I am not very experienced with HMMs, but I will read further into the
> literature and Mahout's API :).
>
> Maybe reducing the dimensions is not such a bad idea? I've read that we
> can do it with PCA (Principal Component Analysis). Is there Mahout code
> for this somewhere?
>
> Thanks a lot once again,
>
> Svetlomir.
>
> On 24.07.2011 20:46, Ted Dunning wrote:
>
>> My impression (and Svetlomir should correct me) is that the intent was
>> to use two HMMs on separate inputs and then use the decoded state
>> sequences from those as inputs to a third HMM.
>>
>> If that is the question, then I think that Mahout's HMMs are
>> sufficiently object-ish that this should work. Obviously, it will take
>> multiple training passes to train each separate model.
>>
>> On Sun, Jul 24, 2011 at 11:25 AM, Dhruv <[email protected]> wrote:
>>
>>> Svetlomir and Ted -- I was not trying to be rude, sorry if I came
>>> across that way because of my exuberance. I apologize.
>>>
>>> I was eager to help and may have acted too fast and misunderstood the
>>> question, so I turn to both of you for a little clarification.
>>>
>>> I'm confused whether the X's refer to the hidden states or to
>>> training instances. Since the hidden sequence is always a Markov
>>> Chain in HMMs, I assumed that Svetlomir meant that X1 and X2 were two
>>> separate hidden state sequences, because Markov Chain was explicitly
>>> mentioned in his original question. To quote:
>>>
>>> -----------
>>> X1----X1----X1----...X1 (Markov Chain for input parameter 1 =>
>>> monitoring X1's changes over time)
>>>
>>> X2----X2----X2----...X2 (Markov Chain for input parameter 2 =>
>>> monitoring X2's changes over time)
>>> -----------
>>>
>>> Further, since X1 and X2 were not slated to have any relationship
>>> with each other and since they were the observations of two different
>>> parameters, I construed that X1 and X2 represented two separate
>>> hidden state sequences. I gathered that the hidden state sequences X1
>>> and X2 are drawn from two disjoint hidden vocabulary sets. The user
>>> wants to discover the model on some training set and then feed Y to
>>> the trained model for decoding, to arrive at the most likely sequence
>>> of states, X1 and X2, which emitted Y.
>>>
>>> In my answer, I continued along this line, saying that in one
>>> training you can't arrive at two separate models for X1 and X2 which
>>> contain the requisite distributions for decoding, say, the sequence
>>> of X1 that produced Y or the sequence of X2 that produced Y. Hence, I
>>> suggested having only one set for the hidden states, combining the
>>> X1s and X2s, and then training the model on it. Given the domain of
>>> application, this may or may not make sense, hence I was doubtful of
>>> formulating the problem as an HMM and suggested alternatives.
>>>
>>> However:
>>>
>>> If the X's are two separate input sequences for training, then yes,
>>> the current implementation is capable of training the HMM. If Y is
>>> the output, then one can decode, after training, the sequence of
>>> hidden states which most likely produced Y.
>>>
>>> For the output probability question, my answer was to use the trained
>>> model's HmmModel.getEmissionMatrix.get(hiddenState, emittedState)
>>> method to compute the output probability for a particular hidden
>>> state. I believe this is not what the user wanted?
>>>
>>> Dhruv
>>>
>>> On Sun, Jul 24, 2011 at 12:56 PM, Ted Dunning <[email protected]> wrote:
>>>
>>>> On Sun, Jul 24, 2011 at 7:52 AM, Dhruv <[email protected]> wrote:
>>>>
>>>>> ... If you look into the *definition* of HMM, the hidden sequence
>>>>> is drawn from only one set. The hidden sequence's transitions can
>>>>> be expressed as a joint probability p(s0, s1). Similarly the
>>>>> observed sequence has a joint distribution with the hidden sequence
>>>>> such as p(y0, s1) and so on.
>>>>
>>>> I think gentler language might be a good idea here. The question was
>>>> not at all unreasonable.
>>>>
>>>>> The hidden state transitions follow the Markov memorylessness
>>>>> property and hence form a Markov Chain.
>>>>>
>>>>> In your case, you are trying to model your problem assuming that
>>>>> there are two underlying state sequences affecting the observed
>>>>> output. This doesn't fit into the HMM's definition and you probably
>>>>> want something else.
>>>>
>>>> Actually, what the original poster wanted is quite sensible. While
>>>> the output sequence is due to a single input sequence, that input
>>>> sequence is not observable. As such, we have a noisy channel problem
>>>> where we want to estimate something about that original sequence.
>>>> The point of the Markov model is that it defines a distribution of
>>>> output sequence given an input sequence (and model). This
>>>> distribution can be inverted so that given a particular output
>>>> sequence, we can estimate the probability distribution of input
>>>> sequences conditional on the output.
>>>>
>>>> The typical decoding algorithm for HMMs estimates only the maximum
>>>> likelihood input sequence, but this does not negate the fact that we
>>>> have a distribution. There are alternative decoding algorithms that
>>>> allow a set of high probability sequences to be estimated or allow a
>>>> partial probability lattice to be output that allows alternative
>>>> sequences to be probed.
>>>>
>>>>> If you do want to fit your problem into the HMM framework, you need
>>>>> to condense the X1 and X2 sequences into a single set and then
>>>>> condition the Ys on it.
>>>>
>>>> Not at all.
>>>>
>>>>>> 3. Can we get output probabilities from the HMM for a concrete
>>>>>> state?
>>>>>
>>>>> Yes, after training, you can retrieve any of the trained model's
>>>>> distributions as a Mahout Matrix type and use get(row, col).
>>>>
>>>> This is not quite what the question was.
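To make the point in the quoted thread about inverting the distribution a bit more concrete, here is a minimal sketch of the forward-backward computation, which yields a per-step probability distribution over hidden states given the whole observed sequence rather than a single most likely path. It is plain Python and does not use Mahout's API; the function name, the two hidden states, the two observation symbols, and all the numbers are made up purely for illustration. It also skips scaling, so it is only suitable for short sequences.

```python
def forward_backward(initial, transition, emission, observations):
    """Return, for each time step, P(hidden state = i | all observations)."""
    n_states = len(initial)
    n_steps = len(observations)

    # Forward pass: alpha[t][i] = P(obs[0..t], state at t = i)
    alpha = [[0.0] * n_states for _ in range(n_steps)]
    for i in range(n_states):
        alpha[0][i] = initial[i] * emission[i][observations[0]]
    for t in range(1, n_steps):
        for j in range(n_states):
            reach_j = sum(alpha[t - 1][i] * transition[i][j] for i in range(n_states))
            alpha[t][j] = reach_j * emission[j][observations[t]]

    # Backward pass: beta[t][i] = P(obs[t+1..] | state at t = i)
    beta = [[1.0] * n_states for _ in range(n_steps)]
    for t in range(n_steps - 2, -1, -1):
        for i in range(n_states):
            beta[t][i] = sum(
                transition[i][j] * emission[j][observations[t + 1]] * beta[t + 1][j]
                for j in range(n_states)
            )

    # Total probability of the observed sequence under the model
    likelihood = sum(alpha[-1])

    # Posterior per step: alpha * beta, normalised by the likelihood
    return [
        [alpha[t][i] * beta[t][i] / likelihood for i in range(n_states)]
        for t in range(n_steps)
    ]


# Hypothetical toy model (illustrative numbers only):
# hidden states 0 and 1, observation symbols 0 and 1.
initial = [0.6, 0.4]
transition = [[0.7, 0.3], [0.4, 0.6]]
emission = [[0.9, 0.1], [0.2, 0.8]]

posteriors = forward_backward(initial, transition, emission, [0, 0, 1])
for t, row in enumerate(posteriors):
    print(t, [round(p, 3) for p in row])
```

Each printed row is a distribution over the hidden states at that step, so it sums to 1. A Viterbi-style decoder would instead return only the single most likely state sequence, which is the distinction drawn in the quoted reply.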
