Mathieu,

 

I have finished implementing the HMM in my current Python framework and
have just obtained some simulation results to quantify the risks (I use
model.simulate()). (By the way, thanks to all for your help.)

 

I would now like to implement step 2, where I want to classify the
time series into classes:

 

Mathieu, you wrote the following (your process is not clear to me):

 

Step a

>>>> 

You can group your time series per class and train one HMM per class
with those time series.

>>>> 
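
A minimal sketch of the grouping step quoted above, assuming every training series already comes with a known class label. The names group_by_class, series and labels are hypothetical and are not part of scikit-learn; each resulting group would then be used to train one HMM.

```python
from collections import defaultdict


def group_by_class(series, labels):
    """Group training sequences by their known class label.

    `series` is a list of sequences and `labels` is the matching list of
    class labels (both are hypothetical inputs used for illustration).
    """
    if len(series) != len(labels):
        raise ValueError("series and labels must have the same length")
    groups = defaultdict(list)
    for seq, label in zip(series, labels):
        groups[label].append(seq)
    # Return a plain dict: class label -> list of sequences in that class.
    return dict(groups)


if __name__ == "__main__":
    toy_series = [[0.1, 0.2], [1.5, 1.7], [0.0, 0.3]]
    toy_labels = ["low", "high", "low"]
    for label, seqs in group_by_class(toy_series, toy_labels).items():
        print(label, len(seqs))
```

If no labels are available, this step cannot be applied as written, since the grouping relies on them.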

Question: How can I group my time series, given that grouping them is my
final objective? Which criteria should I use? The number of HMM states?
The means and/or variances of the states? Should I use PCA?
Probabilistic PCA?

 

Step b

>>> 

Then given a new time series, you can decide its class by the argmax of
the Bayes rule:

Class = argmax P(Class | Time Series) = argmax P(Time Series | Class) *
P(Class) / P(Time Series) = argmax P(Time Series | Class) * P(Class)

P(Time Series | Class) can be computed by the forward algorithm or can
be approximated by the Viterbi algorithm (which is more numerically
stable).

P(Class) can be computed by counting the number of time series in each
class.

>>> 
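
The decision rule quoted above can be sketched in log space, which avoids underflow when the likelihoods are very small. This is only an illustration: the function classify and its two dictionary arguments are hypothetical, and the log-likelihoods are assumed to come from the per-class HMMs (one value per class for the new series).

```python
import math


def classify(log_likelihoods, class_counts):
    """Return the class maximizing log P(series | class) + log P(class).

    log_likelihoods: dict class -> log P(Time Series | Class) for the new
        series, as computed by that class's HMM.
    class_counts: dict class -> number of training series in that class
        (assumed to be at least 1 for every class).
    """
    total = sum(class_counts.values())
    best_class, best_score = None, -math.inf
    for label, loglik in log_likelihoods.items():
        # P(Class) is estimated by counting the series in each class.
        log_prior = math.log(class_counts[label] / total)
        score = loglik + log_prior
        if score > best_score:
            best_class, best_score = label, score
    return best_class
```

P(Time Series) is dropped because it is the same for every class and so does not change the argmax.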

 

I am OK with the Bayes rule. But as I don't know how to tackle step a, I
struggle to understand and implement step b.

 

Thanks a lot in advance.

 

Didier

Didier Vila, PhD | Risk | CapQuest Group Ltd | Fleet 27 | Rye Close |
Fleet | Hampshire | GU51 2QQ | Fax: 0871 574 2992 | Email:
dv...@capquestco.com

 

From: mblon...@gmail.com [mailto:mblon...@gmail.com] On Behalf Of
Mathieu Blondel
Sent: 18 October 2012 06:00
To: Didier Vila
Cc: scikit-learn-general@lists.sourceforge.net
Subject: Re: [Scikit-learn-general] HMM: Determination of the state
numbers

 

 

On Thu, Oct 18, 2012 at 1:37 AM, Didier Vila <dv...@capquestco.com>
wrote:

Problem 1: My problem is a risk problem at the moment.

 

*        I want to represent the behaviour of my 20 000 time series and
generate some Monte Carlo simulations using method.sample().

*        Intuitively, I have to choose the number of states between 1
and 3 (max 4).

*        Then, I want to capture the risk for each time series and the
risk at the aggregate level (I will generate 100 Monte Carlo
simulations).

Alright, then optimizing "score" may make sense in that case.

 

        *        Question 1: I don't see how I can train and
cross-validate my HMM in scikit-learn (this is the first time I have used
scikit-learn for this purpose).

        *        Question 2: the length of each time series is 32 periods;
is that enough for cross-validation and validation?

 

Have a look at http://scikit-learn.org/0.12/modules/grid_search.html. 
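
As a rough illustration of what such a search does, here is a library-independent sketch that picks the number of states by mean held-out score. Everything in it is an assumption made for illustration: kfold_indices and select_n_states are hypothetical helpers, fit and score stand for whatever training and log-likelihood routines the HMM implementation provides, and the folds split across the series rather than within each 32-period series.

```python
def kfold_indices(n_items, n_folds):
    """Yield (train_idx, test_idx) pairs for contiguous k-fold splitting."""
    fold_sizes = [n_items // n_folds + (1 if i < n_items % n_folds else 0)
                  for i in range(n_folds)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


def select_n_states(series, candidates, fit, score, n_folds=5):
    """Pick the candidate number of states with the best mean held-out score.

    fit(train_series, n_states) -> model   (hypothetical callable)
    score(model, test_series) -> float     (e.g. held-out log-likelihood)
    """
    mean_scores = {}
    for n_states in candidates:
        fold_scores = []
        for train_idx, test_idx in kfold_indices(len(series), n_folds):
            model = fit([series[i] for i in train_idx], n_states)
            fold_scores.append(score(model, [series[i] for i in test_idx]))
        mean_scores[n_states] = sum(fold_scores) / len(fold_scores)
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores
```

With candidates such as [1, 2, 3, 4], the returned dictionary shows how the held-out score changes with the number of states.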

         

        Problem 2: Classification

        *        In the near future, I will try to classify time series,
but I have no idea how to handle the problem. Should I use an SVM? Can
you refer me to any paper?

 

You can group your time series per class and train one HMM per class
with those time series. Then given a new time series, you can decide its
class by the argmax of the Bayes rule:

Class = argmax P(Class | Time Series) = argmax P(Time Series | Class) *
P(Class) / P(Time Series) = argmax P(Time Series | Class) * P(Class)

 

P(Time Series | Class) can be computed by the forward algorithm or can
be approximated by the Viterbi algorithm (which is more numerically
stable).

P(Class) can be computed by counting the number of time series in each
class.
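
To make the difference between the two quantities concrete, here is a small pure-Python sketch of the forward algorithm and the Viterbi approximation, both in log space. It assumes a discrete-emission HMM given as plain lists (start, trans, emit), which is a simplification chosen for brevity; the function names are hypothetical and this is not the scikit-learn implementation.

```python
import math


def _log(p):
    """log(p) that maps zero probability to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def _logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_loglik(obs, start, trans, emit):
    """log P(obs | model), summing over all hidden state paths."""
    n = len(start)
    alpha = [_log(start[i]) + _log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        alpha = [
            _logsumexp([alpha[i] + _log(trans[i][j]) for i in range(n)])
            + _log(emit[j][o])
            for j in range(n)
        ]
    return _logsumexp(alpha)


def viterbi_loglik(obs, start, trans, emit):
    """Log-probability of the single most likely state path."""
    n = len(start)
    delta = [_log(start[i]) + _log(emit[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        delta = [
            max(delta[i] + _log(trans[i][j]) for i in range(n))
            + _log(emit[j][o])
            for j in range(n)
        ]
    return max(delta)
```

The Viterbi value replaces the sum over paths by a maximum, so it is never larger than the forward value. Working in log space keeps the forward sum from underflowing on longer series.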

 

        Generic questions: I was wondering if your algorithm is
developed in Python? Do you think your algorithm is relevant to my
business problem?

 

My method is useful for classifying time series which are made of
smaller parts whose label you don't know. So I don't think it would work
for you. 

 

Mathieu

 
