HMMs could be useful, but you have to define things a bit differently. First of all, HMMs want symbolic inputs and give you symbolic outputs. You don't get to see the internal state.
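To make the "symbolic inputs" point concrete, here is a rough sketch of turning a continuous series into integer symbols by clustering short windows with k-means, which is the quantization step I describe next. It is plain Python and does not use Mahout; every function name here (windows, kmeans, quantize, mean_sq_error) is made up for illustration, the window width and the toy data are arbitrary, and the evenly spaced centroid initialization is a simplification I chose to keep the example deterministic.

```python
def windows(series, width):
    """Every contiguous run of `width` values, as tuples."""
    return [tuple(series[i:i + width]) for i in range(len(series) - width + 1)]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the closest centroid; this index is the HMM symbol."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm. Initial centroids are evenly spaced
    data points (an assumption made for reproducibility)."""
    centroids = [points[(i * len(points)) // k] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids

def mean_sq_error(points, centroids):
    """Average squared distance from each point to its nearest centroid."""
    return sum(sq_dist(p, centroids[nearest(p, centroids)]) for p in points) / len(points)

def quantize(series, width, centroids):
    """Map a continuous series to one integer symbol per window."""
    return [nearest(w, centroids) for w in windows(series, width)]

# Toy data, purely illustrative.
train = [0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]
held_out = [0.0, 0.0, 10.0, 10.0]
width = 2

centroids = kmeans(windows(train, width), k=2)
print(quantize(train, width, centroids))  # -> [0, 0, 0, 1, 1, 1, 1]

# Compare these two numbers for several values of k when choosing k.
print(mean_sq_error(windows(train, width), centroids))
print(mean_sq_error(windows(held_out, width), centroids))
```

The symbol sequence that quantize returns is what you would feed to the HMM as its observation sequence.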
My first approach would be to use k-means clustering on short sequences of your observed continuous variables. Use as large a k as still gives you about the same squared error on held-out data as on the training data. You can then quantize your data using this clustering. That is the first step for your HMM.

The next step is to train the HMM. You need to give it many sequences of quantized state variables and the desired outputs at each time step. You have to guess at the number of hidden states.

The final step is to run the HMM on new data and evaluate.

On Wed, May 22, 2013 at 8:20 AM, yikes aroni <[email protected]> wrote:

> I'm not knowledgeable of statistics nor data analysis, so please be
> gentle! I am using Mahout to predict time series out of control state. I've
> had a fair amount of success classifying with SGD and Adaptive regression
> approaches but want to see if Hidden Markov Models can do a better job for
> my purposes. I have two questions.
>
> Question 1
> I train the model using HmmTrainer.trainSupervisedSequence(). The hidden
> state is the status: Out-of-Control (OOC) or Not-OOC for the next point in
> time. Thus when I use HmmEvaluator.decode(model, observedSequence, false),
> I look at the "state" associated with the last point in the
> observedSequence and take *that* as my prediction of State at t+1. First of
> all -- is this sensible? Or is there a better way to use the API to get a
> prediction of State at t+1 given Observations 0 through t after training?
>
> Question 2
> Once I get my prediction -- i.e., the state the model predicts will be
> associated with the last observation in my observation sequence -- how do I
> use the API to get the probability of that predicted state being correct?
> I've looked at various output from HmmUtils and HmmEvaluator, but not being
> strong in my knowledge of HMM, I'm not sure which (if any) are what I need.
> Ultimately, I want to be able to say something like "The predicted next
> state of this time series is OOC with a confidence of 0.37".
>
> thank you
>
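
On the "confidence of 0.37" part of Question 2: a number like that is normally a state probability from the forward algorithm, not something you can read off a single most-likely decoded path. Below is a rough sketch of that calculation in plain Python so you can see what is being computed. It is not Mahout code; the function name state_probabilities is hypothetical and the matrices are invented for illustration. In practice they would come from your trained model, with observations being the quantized symbols.

```python
def state_probabilities(initial, transition, emission, observations):
    """Forward algorithm with normalization at every step.

    initial[i]       = P(state_0 = i)
    transition[i][j] = P(state_{t+1} = j | state_t = i)
    emission[i][o]   = P(observation = o | state = i)

    Returns (filtered, predicted):
      filtered[i]  = P(state_t = i     | observations 0..t)
      predicted[j] = P(state_{t+1} = j | observations 0..t)
    """
    n = len(initial)

    def normalize(values):
        total = sum(values)
        if total == 0.0:
            raise ValueError("observation sequence has zero probability under the model")
        return [v / total for v in values]

    # Start with the prior weighted by how well each state explains the first symbol.
    filtered = normalize([initial[i] * emission[i][observations[0]] for i in range(n)])

    # For each later symbol: step the state forward, then reweight by the emission.
    for obs in observations[1:]:
        stepped = [sum(filtered[i] * transition[i][j] for i in range(n)) for j in range(n)]
        filtered = normalize([stepped[j] * emission[j][obs] for j in range(n)])

    # One more transition step with no observation gives the t+1 distribution.
    predicted = [sum(filtered[i] * transition[i][j] for i in range(n)) for j in range(n)]
    return filtered, predicted

# Made-up two-state model: state 0 = Not-OOC, state 1 = OOC; two symbols.
initial = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
emission = [[0.8, 0.2],
            [0.3, 0.7]]

filtered, predicted = state_probabilities(initial, transition, emission, [0, 0, 1])
print("P(state at t   = OOC):", filtered[1])
print("P(state at t+1 = OOC):", predicted[1])
```

If your hidden state at time t is already defined as the status of the next point, the filtered value is the one to report. If the hidden state is the status at time t itself, use the predicted value.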
