Dear Brian,

You cannot perform forecasts with the results of the function I sent you, because those results are in matrix form, while ms_forecast needs a results tlist (typed list). What is needed is therefore a results tlist with all the fields required to make forecasts. You will find enclosed a new run_ms_var function that does that. What I have done is replace the fields that change in the estimated results tlist, while keeping all invariant results (such as estimated parameters, t-stats, ...): I think I have done it properly, but I cannot assure you that this is the case.
Starting from the previous example, replace:

--> [y_hat,resid,PR,PR_STT,PR_STL]=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')

with:

--> newr=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')

and then make a forecast with:

--> rf=ms_forecast(newr,'2004m12')

Again, the function is rough and should be improved somehow.

Éric.
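Putting the pieces of this thread together, the complete sequence looks like this; every command is taken from the messages above and below, and c:/newms is only an example folder (use wherever you saved run_ms_var.sci):

--> getd('c:/newms')  // load the attached run_ms_var function
--> load(GROCERDIR+'\data\us_revu.dat')  // benchmark series
--> bounds('1967m4','2004m2')
--> nb_states=2  // two regimes
--> switch_var=2  // variances are switching
--> var_opt=3  // heteroskedastic var-cov matrix
--> r=ms_var('cte',3,'100*(log(us_revu)-lagts(2,log(us_revu)))',nb_states,switch_var,var_opt,'prt=initial;final','transf=stud')
--> newr=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')  // rebuild a results tlist
--> rf=ms_forecast(newr,'2004m12')  // forecast up to 2004m12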
2015-02-19 14:28 GMT+01:00 Brian Bouterse <[email protected]>:

> Hi Eric,
>
> Thank you so much for the function. The verification steps you demonstrate
> are convincing that the implementation produces the correct filtered
> probability result on the benchmark data. I've been able to reproduce your
> demo results and also apply it to my own data set. This is great!
>
> There is one more thing that I'm not sure how to do for the single-variable
> case. How can I take the results I have from run_ms_var() and use them
> with ms_forecast() to produce a single-variable filtered estimate? The
> results I have are [y_hat,resid,PR,PR_STT,PR_STL]. I imagine this could
> be done using the following pseudocode:
>
> for each time step in PR_STT:
>     select the regime with the highest filtered probability for this
>     time step (say regime N); this is like a maximum-likelihood selection
>     select the autoregressive parameters for regime N from the original
>     training step
>     forecast the next time step using the autoregressive parameters of
>     regime N
>
> This seems very similar to what ms_forecast() can do, but I'm not sure how
> to call ms_forecast given only the parameters
> [y_hat,resid,PR,PR_STT,PR_STL]. Is this possible?
>
> Perhaps one of the variables [y_hat,resid,PR,PR_STT,PR_STL] already
> contains what I am looking for, but I want to be sure that it is based on
> the filtered probabilities and does not consider data that comes later in
> the data set than the point of prediction. Does that make sense? In other
> words, I want to predict the specific value at time t while considering
> only data on the interval [0, t-1].
>
> Thanks again for everything you've done: writing this, helping me,
> responding so quickly, etc. This is really great.
>
> -Brian
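A rough Scilab sketch of the pseudocode above, assuming PR_STT comes back as a (T x nb_states) matrix with one row of filtered probabilities per date (check the orientation in your own output), and using the 'coeff' field that Eric mentions further down in the thread:

--> [pmax,regime]=max(PR_STT,'c')  // row-wise maximum: most likely regime at each date
// regime(t) is the maximum-likelihood regime for date t; the
// autoregressive parameters for that regime would then be taken from
// the estimation results (e.g. r('coeff')) to build the one-step-ahead
// forecast.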
> On Tue, Feb 17, 2015 at 3:50 PM, Eric Dubois <[email protected]> wrote:
>
>> Dear Brian,
>>
>> 1) Sorry, I indeed made a typo and meant y_mat, x_mat and z_mat.
>>
>> 2) I do not know exactly what you want, but you can calculate it from
>> the parameters and all the other inputs.
>>
>> 3) You will find attached a function run_ms_var that performs, I hope,
>> what you need: this function takes a results tlist from a ms_var
>> execution and a vector of endogenous variables to feed the VAR (your
>> benchmark data).
>>
>> I have checked that if you give as endogenous variables exactly the same
>> variables as the ones used for estimation, you recover the same y_hat,
>> filtered probabilities, etc.
>>
>> To use the function, save it in a folder, say c:/newms, and run in Scilab:
>> --> getd('c:/newms')
>>
>> To check what I mentioned above, run:
>> --> load(GROCERDIR+'\data\us_revu.dat')
>> --> bounds('1967m4','2004m2')
>> --> nb_states=2
>> --> switch_var=2 // variances are switching
>> --> var_opt=3 // heteroskedastic var-cov matrix
>> --> r=ms_var('cte',3,'100*(log(us_revu)-lagts(2,log(us_revu)))',nb_states,switch_var,var_opt,'prt=initial;final','transf=stud')
>> --> [y_hat,resid,PR,PR_STT,PR_STL]=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')
>> --> PR_STT-r('filtered probs')
>>
>> The function is rather rough (no header, no options, ...) and can be
>> improved, but I hope it answers your needs.
>>
>> Éric.
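A compact way to run the check Eric describes: with the same endogenous variable as in the estimation, the recomputed filtered probabilities should match the stored ones up to rounding error (this uses the first version of run_ms_var, which returns matrices):

--> [y_hat,resid,PR,PR_STT,PR_STL]=run_ms_var(r,'100*(log(us_revu)-lagts(2,log(us_revu)))')
--> max(abs(PR_STT-r('filtered probs')))  // should be (close to) 0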
>> 2015-02-17 15:03 GMT+01:00 Brian Bouterse <[email protected]>:
>>
>>> Hi Eric,
>>>
>>> Thanks for the reply! Yes, you understand my goals correctly, but one
>>> clarification: it would be better to have the estimated values directly
>>> instead of the filtered state probabilities. I usually get these with
>>> ms_forecast(r, n).
>>>
>>> I've been reading through the grocer code to determine how to write the
>>> function you suggest. I do need it sooner than a few weeks, so I'm
>>> attempting to write it myself. It seems straightforward except for the
>>> y_hat, x_hat and z_hat variables I need to provide to MSVAR_Filt().
>>> Here are some questions:
>>>
>>> 1) You say I need to feed MSVAR_Filt() with y_hat, x_hat and z_hat, but
>>> the variables in the function signature for MSVAR_Filt read as
>>> y_mat, x_mat, z_mat. Did you mean y_mat or y_hat?
>>>
>>> 2) y_hat (2nd output) is an output of MSVAR_Filt(). The function
>>> comments say that it is my estimated y. Is that the direct estimate I
>>> am looking for?
>>>
>>> 3) I read through ms_var() to see how to derive the y_hat, x_hat and
>>> z_hat variables that are needed, but I don't see any code in ms_var
>>> that derives these variables. Can you point out more specifically where
>>> these matrices are derived?
>>>
>>> Separate from those questions, I am wondering what kind of bias is
>>> introduced if I use the filtered probabilities from ms_var. Could I use
>>> those instead of predicting with data set A and evaluating with data
>>> set B? The reason I like the two-data-set methodology is that the
>>> training data (A) is separated from the evaluation data (B), so there
>>> cannot be any bias when measuring how the trained model generalizes,
>>> because the model never saw data set B. Chapter 23 says the filtered
>>> probabilities only use data up to that point in time, but they rely on
>>> estimates that were built from all the information available. It seems
>>> biased to evaluate the residuals using filtered (or smoothed)
>>> probabilities, because training and evaluating error on the same data
>>> set seems wrong. What do you think is the right way to use these tools
>>> to avoid bias when measuring model performance?
>>>
>>> Thanks for any information. Also, is there any possibility for us to
>>> chat on IRC? I'm 'bmbouter' in #scilab on freenode if you want to chat
>>> there. It would probably be faster than e-mail.
>>>
>>> Thanks!
>>> Brian
>>>
>>> On Thu, Feb 12, 2015 at 3:44 PM, Eric Dubois <[email protected]> wrote:
>>>
>>>> Dear Brian,
>>>>
>>>> If I have understood you well, you want:
>>>> - to estimate a ms_var model on a subset of your dataset;
>>>> - to recover the estimated parameters;
>>>> - and to calculate the filtered state probabilities on the other part
>>>> of your dataset with these parameters.
>>>>
>>>> This can be done:
>>>> - the function MSVAR_Filt calculates, among other things, the filtered
>>>> probabilities (5th output);
>>>> - the function needs, among other things, the parameters of the model;
>>>> they can be recovered from the output tlist of function ms_var: if you
>>>> give it the name res (with --> res=ms_var(...)), this is the field
>>>> 'coeff' in the output tlist (res('coeff') in this example).
>>>>
>>>> But the function MSVAR_Filt also has to be fed with matrices y_hat,
>>>> x_hat and z_hat that are derived from the matrices of endogenous and
>>>> exogenous variables (see function ms_var to see how it is done).
>>>>
>>>> If you are not in too much of a hurry, I can write the function that
>>>> gathers all these operations within a few weeks.
>>>>
>>>> Éric.
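To make the parameter-recovery step concrete, a minimal sketch (the estimation call and the 'coeff' field name are taken from the messages in this thread):

--> res=ms_var('cte',3,'100*(log(us_revu)-lagts(2,log(us_revu)))',nb_states,switch_var,var_opt,'prt=initial;final','transf=stud')
--> coeff=res('coeff')  // estimated parameters, one of the inputs MSVAR_Filt needs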
>>>> 2015-02-12 16:56 GMT+01:00 Brian Bouterse <[email protected]>:
>>>>
>>>>> I use GROCER's ms_var function to estimate a single-variable VAR
>>>>> model, and it estimates parameters as expected and as described by
>>>>> the manual. I want to train and evaluate my model on different data
>>>>> sets to avoid bias from training and benchmarking on the same data
>>>>> set. How can this be done?
>>>>>
>>>>> For example, consider data set A (month 1) and data set B (month 2)
>>>>> from a two-month sample. I would like to train on month 1 and then
>>>>> benchmark on month 2.
>>>>>
>>>>> I use ms_var to train on data set A. It gives me estimated parameters
>>>>> and filtered regime probabilities. That works well. How can I use the
>>>>> trained parameters to then estimate on month 2 data?
>>>>>
>>>>> I'm aware of the ms_forecast function, but it seems to forecast only
>>>>> using the results from an estimator like ms_var(). The forecasting
>>>>> will then only be done on the same data as was used for estimation.
>>>>> I want to use the trained parameters to produce estimates for a
>>>>> different data set.
>>>>>
>>>>> Thanks in advance. I really appreciate being able to use this
>>>>> software.
>>>>>
>>>>> -Brian

run_ms_var.sci
Description: Binary data
_______________________________________________
users mailing list
[email protected]
http://lists.scilab.org/mailman/listinfo/users
