Mathieu, 

 

Thanks for your long answer and your research paper.

 

My data set consists of 20,000 time series (maximum length 32).

 

Problem 1: At the moment, my problem is a risk-modelling problem.

 

*        I want to represent the behaviour of my 20,000 time series and
generate some Monte Carlo simulations using the sample() method.

*        Intuitively, the number of states should be between 1 and 3
(4 at most).

*        Then, I want to capture the risk for each time series and the
risk at the aggregate level. (I will generate 100 Monte Carlo
simulations.)

*        Question 1: I don't see how I can train and cross-validate my
HMM in scikit-learn. (This is the first time I have used scikit-learn
for this purpose.)

*        Question 2: The length of each time series is 32 periods. Is
that enough for cross-validation and validation?
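
For illustration only, the simulation step described above can be
sketched in plain Python without any HMM library. Everything here is an
assumption: the two-state Gaussian parameters are hypothetical rather
than fitted, 5 series stand in for the 20,000, "risk" is taken to be the
95% quantile of the summed path, and sample_hmm and empirical_quantile
are made-up helper names.

```python
import math
import random


def sample_hmm(start_prob, trans_prob, means, stds, length, rng):
    """Draw one observation sequence from a 1-D Gaussian HMM."""
    states = list(range(len(start_prob)))
    state = rng.choices(states, weights=start_prob)[0]
    observations = []
    for _ in range(length):
        # Emit from the current state, then move to the next state.
        observations.append(rng.gauss(means[state], stds[state]))
        state = rng.choices(states, weights=trans_prob[state])[0]
    return observations


def empirical_quantile(values, q):
    """Return the smallest sample value with at least a fraction q at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


# Hypothetical parameters for a 2-state model (NOT fitted to any real data).
START = [0.8, 0.2]
TRANS = [[0.9, 0.1], [0.3, 0.7]]
MEANS = [1.0, 5.0]
STDS = [0.5, 2.0]

N_SERIES = 5   # stand-in for the 20,000 series
N_SIMS = 100   # number of Monte Carlo simulations
LENGTH = 32    # periods per series

rng = random.Random(0)
# totals[sim][series] is the sum of one simulated path.
totals = [
    [sum(sample_hmm(START, TRANS, MEANS, STDS, LENGTH, rng))
     for _ in range(N_SERIES)]
    for _ in range(N_SIMS)
]

# Risk per series: 95% quantile of that series' total across simulations.
per_series_q95 = [
    empirical_quantile([sim[j] for sim in totals], 0.95)
    for j in range(N_SERIES)
]
# Aggregate risk: 95% quantile of the total summed over all series.
aggregate_q95 = empirical_quantile([sum(sim) for sim in totals], 0.95)
print(per_series_q95, aggregate_q95)
```

With a fitted model, the library's own sampling routine would replace
sample_hmm; the per-series and aggregate quantile steps stay the same.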

 

Problem 2: Classification

*        In the near future, I will try to classify time series, but I
have no idea how to approach the problem. Should I use an SVM? Can you
recommend any papers?

 

Generic questions: Is your algorithm implemented in Python? Do you think
it is relevant to my business problem?

 

Thanks in advance.

 

Didier

 

From: Mathieu Blondel [mailto:math...@mblondel.org] 
Sent: 17 October 2012 17:07
To: scikit-learn-general@lists.sourceforge.net
Subject: Re: [Scikit-learn-general] HMM: Determination of the state
numbers

 

 

On Thu, Oct 18, 2012 at 12:18 AM, Didier Vila <dv...@capquestco.com>
wrote:

Should I maximize the "score" function over several candidate numbers of
states that make sense from a business point of view?


The best criterion to optimize is not "score" but the evaluation metric
you really care about for your application. For example, if what you
want to do is time-series classification, you should optimize the number
of states for classification accuracy. Also, be careful, you cannot use
your test data to choose the number of states. You need to use
validation data or cross-validation.
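
As a rough sketch of that advice, the number of states can be chosen by
k-fold cross-validation over whole sequences, keeping the test data out
of the loop entirely. The helper names k_fold_indices and
select_n_states are hypothetical, and fit and metric are placeholders
you would supply yourself: fit(train_sequences, n_states) returns a
trained model, and metric(model, validation_sequences) returns the
score you actually care about (higher is better).

```python
import random


def k_fold_indices(n_items, n_folds, seed=0):
    """Shuffle item indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def select_n_states(sequences, candidates, fit, metric, n_folds=5, seed=0):
    """Pick the candidate number of states with the best mean validation score.

    Folds are made of whole sequences, so no sequence is split between
    training and validation.
    """
    if n_folds > len(sequences):
        raise ValueError("n_folds must not exceed the number of sequences")
    folds = k_fold_indices(len(sequences), n_folds, seed)
    mean_scores = {}
    for n_states in candidates:
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [s for i, s in enumerate(sequences) if i not in held]
            valid = [sequences[i] for i in held_out]
            model = fit(train, n_states)              # user-supplied trainer
            fold_scores.append(metric(model, valid))  # user-supplied metric
        mean_scores[n_states] = sum(fold_scores) / len(fold_scores)
    best = max(mean_scores, key=mean_scores.get)
    return best, mean_scores
```

Splitting by sequence rather than within a sequence is a design choice
of this sketch, not something prescribed above. It means that short
series are less of a concern when there are many of them, since each
fold still contains many complete sequences.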

For some types of data, states have an intuitive meaning and the number
of states can be chosen easily. For example, if what you want to do is
mouse gesture recognition, you can use one state for your HMM modeling a
straight line and two states for your HMM modeling  a gesture with a
turn. You can also try to visualize your data and see if it has an
easy-to-identify number of "modes".

In an old paper of mine [*], I proposed a simple heuristic for choosing
the number of states, but it is only useful if you want to model several
object classes, each with its own HMM. In that case, choosing the optimal
number of states for all HMMs is a combinatorial problem, and my
heuristic reduces the search to a single parameter.

You may also want to try GMMHMM, but in that case you need to tune the
number of mixture components as well (more components give your HMM more
expressive power, so it may need fewer states).
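
One way to see the trade-off between states and mixture components is
to count free parameters. The sketch below assumes diagonal-covariance
Gaussian mixtures and the standard parameterization (start
probabilities, transition matrix, mixture weights, means, variances);
gmmhmm_free_parameters is a made-up helper, not a library function.

```python
def gmmhmm_free_parameters(n_states, n_mix, n_features=1):
    """Count free parameters of an HMM with diagonal Gaussian-mixture emissions."""
    start = n_states - 1                        # start probabilities sum to 1
    transitions = n_states * (n_states - 1)     # each transition row sums to 1
    weights = n_states * (n_mix - 1)            # mixture weights sum to 1 per state
    means = n_states * n_mix * n_features
    variances = n_states * n_mix * n_features   # diagonal covariances only
    return start + transitions + weights + means + variances


# Compare a few (states, components) settings for 1-D observations.
for n_states, n_mix in [(3, 1), (2, 2), (1, 3)]:
    print(n_states, n_mix, gmmhmm_free_parameters(n_states, n_mix))
```

Under these assumptions, 3 states with 1 component each give 14 free
parameters and 2 states with 2 components each give 13, so the two
settings have comparable capacity and both numbers need to be tuned
together on validation data.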

HTH,
Mathieu

[*] http://mblondel.org/publications/mblondel-icpr2010.pdf
