Hi Leeds, I got your point. So if I see both the ACF and the PACF vanish after lag 3, should I try all possible models and then choose the one giving the minimum AIC, i.e. (1,3), (3,1), (3,3), (2,3), (3,2), (1,2), (2,1), (1,1), and (2,2)?

My second doubt is this: for the particular dataset that I provided, arima(data, order=c(2,1,2)) fails with an error, whereas arima(diff(data), order=c(2,0,2)) gives no problem:

> arima(data, order=c(2,1,2))
Error in arima(data, order = c(2, 1, 2)) : non-stationary AR part from CSS
> arima(diff(data), order=c(2,0,2))

Call:
arima(x = diff(data), order = c(2, 0, 2))

Coefficients:
         ar1      ar2      ma1     ma2  intercept
      0.1093  -0.3111  -0.1438  0.0632     0.0157
s.e.  0.5378   0.4464   0.5661  0.4796     0.0111

sigma^2 estimated as 0.01329:  log likelihood = 47.38,  aic = -82.76

Can anyone tell me what is wrong there?

Regards,

"Leeds, Mark (IED)" <[EMAIL PROTECTED]> wrote:

What Ripley says below is related to what I said about p and q both being greater than 1 being very unlikely. He is also right that those "rules" only work in the sense that if the ACF drops off after q lags, the implication is that p = 0, and if the PACF drops off after p lags, it is implied that q = 0. When the model is mixed, it is more complicated. Mixed models are more rare than common, but they could end up being the best model. That is another place where the AIC can be used. In other words, if it looks like your ACF drops off after lag 1 and your PACF drops off after lag 1, then it could be a p = 1, q = 1 model, but its AIC should be checked against (p = 1, q = 0) and (p = 0, q = 1), because selecting p = 1 and q = 1 this way is really flawed: the rules don't hold when both p and q are nonzero.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Prof Brian Ripley
Sent: Friday, August 31, 2007 4:38 AM
To: Megh Dal
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Choosing the optimum lag order of ARIMA model

On Fri, 31 Aug 2007, Megh Dal wrote:

> Dear all R users,
>
> I am really struggling to determine the most appropriate lag order of
> ARIMA model. My understanding is that, as for MA [q] model the auto
> correlation coeff vanishes after q lag, it says the MA order of a
> ARIMA model, and for a AR[p] model partial autocorrelation vanishes
> after p lags it helps to determine the AR lag. And most appropriate
> model choosed by this argument gives min AIC.

The last part is fallacious. Also, you are applying your rules to selecting the orders in ARMA models, and they apply only to pure MA or AR models. The R test file src/library/stats/tests/ts-tests.R has an example of model selection by AIC.

>
> Now I considered following data :
>
> 2.1948 2.2275 2.2669 2.2839 1.9481 2.1319 2.0238 2.3109 2.5727 2.5176
> 2.5728 2.6828 2.8221 2.879 2.8828 2.9955 2.9906 2.9861 3.0452 3.068
> 2.9569 3.0256 3.0977 2.985 2.9572 3.0877 3.1009 3.1149 2.8886 2.9631
> 3.0325 2.9175 2.7231 2.7905 2.8493 2.8208 2.8156 2.9115 2.701 2.6928
> 2.7881 2.723 2.7266 2.9494 3.113 3.0566 3.0358 3.05 3.0724 3.1365
> 3.1083 3.0257 3.2211 3.4269 3.327 3.1205 2.9997 3.0201 3.0803 3.2059
> 3.1997 3.038 3.1613 3.2802 3.2194
>
> ACF for 1st diff series:
> Autocorrelations of series 'diff(data1)', by lag
>      0      1      2      3      4      5      6      7      8      9     10
>  1.000 -0.022 -0.258 -0.016  0.066  0.034  0.035 -0.001 -0.089  0.028  0.222
>     11     12     13     14     15     16     17     18
> -0.132 -0.184 -0.038  0.048 -0.026 -0.041 -0.067  0.059
>
> PACF for 1st diff series:
> Partial autocorrelations of series 'diff(data1)', by lag
>      1      2      3      4      5      6      7      8      9     10     11
> -0.022 -0.258 -0.031 -0.002  0.026  0.057  0.021 -0.069  0.029  0.194 -0.124
>     12     13     14     15     16     17     18
> -0.100 -0.111 -0.043 -0.078 -0.056 -0.085  0.086
>
> On basis of that I choose ARIMA[2,1,2] for the original data
>
> But I got error while doing that :
>
> > arima(data1, c(2,1,2))
> Error in arima(data1, c(2, 1, 2)) : non-stationary AR part from CSS
>
> And AIC for other combination of lags are:
>
> > arima(data1, c(2,1,1))$aic
> [1] -84.83648
> > arima(data1, c(1,1,2))$aic
> [1] -84.35737
> > arima(data1, c(1,1,1))$aic
> [1] -83.79392
>
> Hence on basis of AIC criteria if I choose ARIMA[2,1,1] model, then
> the first rule that I said earlier does not support.
>
> Am I making anything wrong? Can anyone give me any suggestion on what
> is the "universal" rule for choosing the best lag?
>
> Regards,
>

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
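
For reference, the AIC comparison discussed in this thread could be sketched roughly as follows. This is only an illustration, not the example from ts-tests.R: the 0..2 range for p and q, the helper name fit_aic, and the use of method = "ML" (which skips the conditional-sum-of-squares starting values that raised the "non-stationary AR part from CSS" error) are all assumptions on my part. Fits that still fail are recorded as NA instead of stopping the loop.

```r
# Data as posted in the thread (65 observations).
data1 <- c(
  2.1948, 2.2275, 2.2669, 2.2839, 1.9481, 2.1319, 2.0238, 2.3109, 2.5727, 2.5176,
  2.5728, 2.6828, 2.8221, 2.879,  2.8828, 2.9955, 2.9906, 2.9861, 3.0452, 3.068,
  2.9569, 3.0256, 3.0977, 2.985,  2.9572, 3.0877, 3.1009, 3.1149, 2.8886, 2.9631,
  3.0325, 2.9175, 2.7231, 2.7905, 2.8493, 2.8208, 2.8156, 2.9115, 2.701,  2.6928,
  2.7881, 2.723,  2.7266, 2.9494, 3.113,  3.0566, 3.0358, 3.05,   3.0724, 3.1365,
  3.1083, 3.0257, 3.2211, 3.4269, 3.327,  3.1205, 2.9997, 3.0201, 3.0803, 3.2059,
  3.1997, 3.038,  3.1613, 3.2802, 3.2194
)

# Candidate (p, q) pairs with d = 1; the 0..2 range is an assumption.
grid <- expand.grid(p = 0:2, q = 0:2)

# Hypothetical helper: fit ARIMA(p, 1, q) by full maximum likelihood and
# return its AIC, or NA if the fit throws an error.
fit_aic <- function(p, q) {
  tryCatch(
    arima(data1, order = c(p, 1, q), method = "ML")$aic,
    error = function(e) NA_real_
  )
}

grid$aic <- mapply(fit_aic, grid$p, grid$q)

# Smallest AIC first; failed fits (NA) sort to the bottom.
grid <- grid[order(grid$aic), ]
print(grid)
```

Note that arima(diff(data), order = c(2,0,2)) is not quite the same model as arima(data, order = c(2,1,2)): with d = 0 an intercept is included by default, while with d = 1 the mean term is ignored, so the two calls start from different optimisation problems and their AIC values are not directly interchangeable.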