Hey Maheshakya,
There are no functions directly in the API for this, so you'll have to be
willing to (slightly) roll up your sleeves here.
Because the HMM is Markovian, the only information you need to predict the
`n+1`-th observation in a sequence is the
posterior distribution over the hidden states at `time=n`. If `Y` is the
emission and `X` is the hidden state, then the Markov
property here is P(Y_{t+1} | X_{0:t}) = P(Y_{t+1} | X_t).
So after you fit your model, you want to extract the posterior over the last
hidden state from an observation
sequence, P(X_t | y_{0:t}). This value depends on the entire sequence of
emissions for that sequence, y_{0:t}, and you
can get it in sklearn from `x_last = model.predict_proba(y)[-1]`.
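To make that concrete, here's a toy numpy sketch of the filtering step itself (the start, transition, and emission matrices are made-up illustration values, not from a real fitted model; in sklearn they'd be the fitted model's `startprob_`, `transmat_`, and `emissionprob_` attributes). Note that at the final time step the filtered posterior P(X_t | y_{0:t}) coincides with the smoothed one, which is why taking `[-1]` of `predict_proba` works:

```python
import numpy as np

# Made-up 2-state multinomial HMM parameters, for illustration only.
startprob = np.array([0.6, 0.4])
transmat = np.array([[0.7, 0.3],
                     [0.2, 0.8]])
emissionprob = np.array([[0.9, 0.1],   # P(symbol | state 0)
                         [0.2, 0.8]])  # P(symbol | state 1)

def filtered_posterior(y):
    """Normalized forward pass: P(X_t | y_{0:t}) for each t."""
    alpha = startprob * emissionprob[:, y[0]]
    alpha /= alpha.sum()
    posteriors = [alpha]
    for obs in y[1:]:
        # Propagate through the transition matrix, then condition
        # on the new observation and renormalize.
        alpha = np.dot(alpha, transmat) * emissionprob[:, obs]
        alpha /= alpha.sum()
        posteriors.append(alpha.copy())
    return np.array(posteriors)

y = [0, 0, 1, 1]
x_last = filtered_posterior(y)[-1]  # posterior over the last hidden state
```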
Then, you need to propagate this probability vector forward in time. This comes
out to just a vector-matrix multiplication. If
you are going forward for a single step, you'd just do `x_next = np.dot(x_last,
model.transmat_)`. If you want to propagate
forward by more than one step, you can right-multiply by matrix powers of
`transmat_`.[1]
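The propagation step is just this (again with made-up numbers standing in for a fitted `transmat_` and for `x_last`):

```python
import numpy as np

transmat = np.array([[0.7, 0.3],    # would be model.transmat_
                     [0.2, 0.8]])
x_last = np.array([0.5, 0.5])       # posterior over the last hidden state

# One step ahead:
x_next = np.dot(x_last, transmat)

# k steps ahead, by right-multiplying with a matrix power:
k = 5
x_k = np.dot(x_last, np.linalg.matrix_power(transmat, k))
```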
Now, to predict the value of the emission at your new hidden-state
distribution, `x_next`, you need to deal with the emission
distributions. If you're using the multinomial emission model, you just take
the dot product `np.dot(x_next, model.emissionprob_)`.
If you're using Gaussian emissions or something similar, you'll need to weight
the per-state pdfs by the factors given from `x_next`.
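In code, both cases look roughly like this (the matrices and means are made-up illustration values; in sklearn they'd come from `emissionprob_` or `means_` on the fitted model):

```python
import numpy as np

x_next = np.array([0.3, 0.7])   # propagated hidden-state distribution

# Multinomial emissions: predictive distribution over symbols.
emissionprob = np.array([[0.9, 0.1],
                         [0.2, 0.8]])
y_dist = np.dot(x_next, emissionprob)   # P(Y_{t+1} = k | y_{0:t})
y_pred = y_dist.argmax()                # most likely next symbol

# Gaussian emissions: the predictive density is a mixture of the
# per-state Gaussians weighted by x_next; its mean is the weighted
# average of the state means.
means = np.array([[-1.0], [2.0]])       # would be model.means_
y_mean = np.dot(x_next, means)          # E[Y_{t+1} | y_{0:t}]
```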
-Robert
[1] In practice, if you're interested in the long-timescale dynamics
(propagating many steps out), then you only need to
propagate the dominant eigenvectors of the transition matrix.
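For the extreme case of propagating infinitely far out, the dominant eigenvector is the stationary distribution, which you can get directly from an eigendecomposition rather than by repeated multiplication (toy transition matrix again, for illustration):

```python
import numpy as np

transmat = np.array([[0.7, 0.3],
                     [0.2, 0.8]])

# The stationary distribution is the left eigenvector of transmat
# with eigenvalue 1; high matrix powers converge to it row by row.
evals, evecs = np.linalg.eig(transmat.T)
stationary = np.real(evecs[:, np.argmax(np.real(evals))])
stationary /= stationary.sum()
```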
On Oct 18, 2013, at 8:44 PM, Andreas Mueller wrote:
> Hi Maheshakya.
> Sorry for the late reply.
> I'm actually not so familiar with the HMM module, but Fred Mailhot and Robert
> McGibbon might be able to help you ;)
> Both should be possible but not entirely convenient with the current API. You
> can fit the model to the data, and then "predict"
> the hidden variables. Using the hidden variable of the last state, you can
> use "transmat" to get the distribution of the next state
> according to the model. Then, using the means (for continuous variables) you
> can infer the observed state.
> No guarantee of correctness, though ;)
>
> Cheers,
> Andy
>
>
> On 09/27/2013 05:11 AM, Maheshakya Wijewardena wrote:
>> Suppose there is a sequence of observations; for example, take
>> [1,2,3,5,5,5,2,3,2,3, ..., 3, 4] (these can even be real numbers). How do I
>> use the current implementation of HMM in scikit-learn to predict the next
>> value of this observation sequence? I have 2 questions regarding this.
>>
>> 1. Given a sequence of observations, predicting the next observation(as
>> mentioned above)
>>
>> 2. Given many sequences of n observations and n+1 observations of those
>> sequences, can HMM be used to predict the (n+1)th observation of a new
>> sequence of n observations? If so how?
>>
>> How do I use the HMM in Scikit-learn for the above tasks? I couldn't grasp
>> much about this from the documentation.
>>
>> Thank you.
>>
>>
>> Maheshakya
>>
>>
>> ------------------------------------------------------------------------------
>> October Webinars: Code for Performance
>> Free Intel webinars can help you accelerate application performance.
>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
>> from
>> the latest Intel processors and coprocessors. See abstracts and register >
>> http://pubads.g.doubleclick.net/gampad/clk?id=60133471&iu=/4140/ostg.clktrk
>>
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>