Re: [R] Rollapply
For anyone who may be interested . Gabor Grothendieck suggested a link and then provided additional help resulting in the following. Any mistakes are mine. The code will allow you to build a rolling regression and to pass a (different) predictor to that regression model. # DATA ## df = data.frame(x = c(70.67,70.54,69.87,69.51,70.69,72.66,72.65,73.36), y = c(78.01,77.07,77.35,76.72,77.49,78.70,77.78,79.58)) pred2 = c(70,71,72,73,72,70) width = 3 FUNCTIONS # embed.data.frame - function(df,width) apply(embed(1:nrow(df),width),1,function(idx)df[idx,]) f-function(df,pred2){ model - lm(y ~ x, data = df) v - coefficients(model) p - predict(model, data.frame(x=pred2),se=TRUE) data.frame(intcpt=v[1],slope=v[2],modfit=p$fit,modse=p$se.fit,row.names=NULL) } # MAPPLY mapply(f, embed.data.frame(df,width), pred2) Thanks Gabor. -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Wednesday, January 13, 2010 4:02 PM To: Brecknock, Peter Cc: r-help@r-project.org Subject: Re: [R] Rollapply See: http://tolstoy.newcastle.edu.au/R/help/04/03/1446.html On Wed, Jan 13, 2010 at 3:45 PM, Pete B peter.breckn...@bp.com wrote: Hi I would like to understand how to extend the function (FUN) I am using in rollapply below. ## With the following simplified data, test1 yields parameters for a rolling regression data = data.frame(Xvar=c(70.67,70.54,69.87,69.51,70.69,72.66,72.65,73.36), Yvar =c(78.01,77.07,77.35,76.72,77.49,78.70,77.78,79.58)) data.z = zoo(d) test1 = rollapply(data.z, width=3, FUN = function(z) coef(lm(z[,1]~z[,2], data=as.data.frame(z))), by.column = FALSE, align = right) print(test1) ## Rewriting this to call myfn1 gives test2 (and is consistent with test1 above) myfn1 = function(mydata){ dd = as.data.frame(mydata) l = lm(dd[,1]~dd[,2], data=dd) c = coef(l) } test2 = rollapply(data.z, width=3, FUN= myfn1, by.column = FALSE, align = right) print(test2) ## I would like to be able to use the predict function to obtain a prediction (and its std error) from the rolling regression I have just calculated. My effort below issues a warning that 'newdata' had 1 row but variable(s) found have 3 rows. (if I run this outside of rollapply I don't get this warning) Also, I don't see the predicted value or its se with print(fm2[[1]]). Again, if I run this outside of rollapply I am able to extract the predicted value. Xpred=c(70.67) myfn2 = function(mydata){ dd = as.data.frame(mydata) l = lm(dd[,1]~dd[,2], data=dd) c = coef(l) p = predict(l, data.frame(Xvar=Xpred),se=T) ret=c(l,c,p) } fm2 = rollapply(data.z, width=3, FUN= myfn2, by.column = FALSE, align = right) print(fm2[[1]]) Any insights would be gratefully received. Best regards Pete -- View this message in context: http://n4.nabble.com/Rollapply-tp1013345p1013345.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ARIMA, xreg and intercepts
David Stoffer describes some challenges with R's output when fitting ARIMA models for different orders (see Issue 2 at http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm). R doesn't fit an intercept in the model if there is any differencing. David describes a workaround using the xreg parameter to force R to calculate an intercept. Assume I have a variable y and 3 explanatory variables a, b and c. No intercept would be produced for the model fit = arima(y, order=c(1,1,0), xreg=c(a,b,c)) 1. If I wish to force an intercept to be output is the following correct? intercept = 1:length(y) fit2 = arima(y, order=c(1,1,0), xreg=c(intercept, a, b, c)) 2. If 1 is correct, is the following code equivalent? data = ts.intersect(diff(y),intercept, a,b,c) fit3 = arima(data[,1], order=c(1,0,0), xreg=[,c(2:5)]) 3. If I fit 2 and find the intercept is not significant would it be correct to use the following? fit = arima(y, order=c(1,1,0), xreg=c(a,b,c)) Thanks for your help Kind regards Pete This e-mail may contain confidential or proprietary information belonging to the BP group and is intended only for the use of the recipients named above. If you are not the intended recipient, please immediately notify the sender and either delete this email or return to the sender immediately. You may not review, copy or distribute this email. Within the bounds of law, this part of BP retains all emails and IMs and may monitor them to ensure compliance with BP's internal policies and for other legitimate business purposes. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lm output
Hi All I am running a linear regression using the lm object. In the event that my independent variable is the same across all observations the regression slope is returned as an NA. For example, if I have the following y=c(10,12,17) x=c(5,5,5) lm = lm(y~x) produces the following Coefficients: (Intercept)x 13 NA Other than post-processing the results, is there a way to output the slope as 0 rather than NA? Thanks Pete This e-mail may contain confidential or proprietary information belonging to the BP group and is intended only for the use of the recipients named above. If you are not the intended recipient, please immediately notify the sender and either delete this email or return to the sender immediately. You may not review, copy or distribute this email. Within the bounds of law, this part of BP retains all emails and IMs and may monitor them to ensure compliance with BP's internal policies and for other legitimate business purposes. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm output
Daniel Thanks very much for the reply. If the data fails the underlying assumptions of regression wouldn't it make sense to suppress all the output and not just the slope coefficient? Incidently, if I run this simple example in Excel it returns the slope as 0. Intuitively, this makes sense to me ... the best estimate of y would be its mean for any value of x. Kind regards Pete -Original Message- From: Daniel Malter [mailto:dan...@umd.edu] Sent: Friday, October 09, 2009 4:24 PM To: Brecknock, Peter; r-help@r-project.org Subject: AW: [R] lm output That comes out as an NA because X'X is not invertible because it is not full rank (one row/column is a linear combination of the other(s)). And that means there is no unique solution to the system. y=c(10,12,17) x=c(5,5,5) X=cbind(1,x) X t(X)%*%X solve(t(X)%*%X) Therefore, nope, there is now way to make this come out as a zero, because it fails the very assumptions of regression analysis. HTH, Daniel - cuncta stricte discussurus - -Ursprüngliche Nachricht- Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Im Auftrag von Brecknock, Peter Gesendet: Friday, October 09, 2009 5:12 PM An: r-help@r-project.org Betreff: [R] lm output Hi All I am running a linear regression using the lm object. In the event that my independent variable is the same across all observations the regression slope is returned as an NA. For example, if I have the following y=c(10,12,17) x=c(5,5,5) lm = lm(y~x) produces the following Coefficients: (Intercept)x 13 NA Other than post-processing the results, is there a way to output the slope as 0 rather than NA? Thanks Pete This e-mail may contain confidential or proprietary information belonging to the BP group and is intended only for the use of the recipients named above. If you are not the intended recipient, please immediately notify the sender and either delete this email or return to the sender immediately. You may not review, copy or distribute this email. Within the bounds of law, this part of BP retains all emails and IMs and may monitor them to ensure compliance with BP's internal policies and for other legitimate business purposes. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lm output
Fair point. Thanks. -Original Message- From: David Winsemius [mailto:dwinsem...@comcast.net] Sent: Friday, October 09, 2009 5:02 PM To: Brecknock, Peter Cc: Daniel Malter; r-help@r-project.org Subject: Re: [R] lm output On Oct 9, 2009, at 5:45 PM, Brecknock, Peter wrote: Daniel Thanks very much for the reply. If the data fails the underlying assumptions of regression wouldn't it make sense to suppress all the output and not just the slope coefficient? Incidently, if I run this simple example in Excel it returns the slope as 0. Intuitively, this makes sense to me ... the best estimate of y would be its mean for any value of x. R gave you the best estimate for y. It was Excel that gave you an estimate for something for which it had no basis. There is no mean of dy/dx because dx=0. -- David. Kind regards Pete -Original Message- From: Daniel Malter [mailto:dan...@umd.edu] Sent: Friday, October 09, 2009 4:24 PM To: Brecknock, Peter; r-help@r-project.org Subject: AW: [R] lm output That comes out as an NA because X'X is not invertible because it is not full rank (one row/column is a linear combination of the other(s)). And that means there is no unique solution to the system. y=c(10,12,17) x=c(5,5,5) X=cbind(1,x) X t(X)%*%X solve(t(X)%*%X) Therefore, nope, there is now way to make this come out as a zero, because it fails the very assumptions of regression analysis. HTH, Daniel - cuncta stricte discussurus - -Ursprüngliche Nachricht- Von: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] Im Auftrag von Brecknock, Peter Gesendet: Friday, October 09, 2009 5:12 PM An: r-help@r-project.org Betreff: [R] lm output Hi All I am running a linear regression using the lm object. In the event that my independent variable is the same across all observations the regression slope is returned as an NA. For example, if I have the following y=c(10,12,17) x=c(5,5,5) lm = lm(y~x) produces the following Coefficients: (Intercept)x 13 NA Other than post-processing the results, is there a way to output the slope as 0 rather than NA? Thanks Pete This e-mail may contain confidential or proprietary information belonging to the BP group and is intended only for the use of the recipients named above. If you are not the intended recipient, please immediately notify the sender and either delete this email or return to the sender immediately. You may not review, copy or distribute this email. Within the bounds of law, this part of BP retains all emails and IMs and may monitor them to ensure compliance with BP's internal policies and for other legitimate business purposes. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RExcel/ RCom
Hi All Apologies if this is not the correct mailing list for this question. I have installed RExcel and RCom from CRAN. R is indicating that I need an additional file/package statconnDCOM from rcom.univie.ac.at I have been trying to access this website over the last 2 days be it appears to be down. Does anyone know when this site is likely to be back up and/or know of an alternative source for the statconnDCOM file? Kind regards Pete This e-mail may contain confidential or proprietary information belonging to the BP group and is intended only for the use of the recipients named above. If you are not the intended recipient, please immediately notify the sender and either delete this email or return to the sender immediately. You may not review, copy or distribute this email. Within the bounds of law, this part of BP retains all emails and IMs and may monitor them to ensure compliance with BP's internal policies and for other legitimate business purposes. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Optimization Question
Hi All Apologies if this is not the correct list for this question. The Rglpk package offers the following example in its documentation library(Rglpk) ## Simple mixed integer linear program. ## maximize: 3 x_1 + 1 x_2 + 3 x_3 ## subject to: -1 x_1 + 2 x_2 + x_3 = 4 ## 4 x_2 - 3 x_3 = 2 ## x_1 - 3 x_2 + 2 x_3 = 3 ## x_1, x_3 are non-negative integers ## x_2 is a non-negative real number obj - c(3, 1, 3) mat - matrix(c(-1, 0, 1, 2, 4, -3, 1, -3, 2), nrow = 3) dir - c(=, =, =) rhs - c(4, 2, 3) types - c(I, C, I) max - TRUE Rglpk_solve_LP(obj, mat, dir, rhs, types, max) ## Same as before but with bounds replaced by ## -Inf x_1 = 4 ## 0 = x_2 = 100 ## 2 = x_3 Inf bounds - list(lower = list(ind = c(1L, 3L), val = c(-Inf, 2)), upper = list(ind = c(1L, 2L), val = c(4, 100))) Rglpk_solve_LP(obj, mat, dir, rhs, types, max, bounds) I have 2 questions 1. What is the purpose of the L in the bounds statement (e.g. 1L, 3L etc)? 2. Is it possible to further constrain a variable such that in the optimal solution to the objective function it will be a specific integer or an integer multiple of that integer. For example, x_3 must be 2 or 4,6,8,10 etc Thanks Pete This e-mail may contain confidential or proprietary information belonging to the BP group and is intended only for the use of the recipients named above. If you are not the intended recipient, please immediately notify the sender and either delete this email or return to the sender immediately. You may not review, copy or distribute this email. Within the bounds of law, this part of BP retains all emails and IMs and may monitor them to ensure compliance with BP's internal policies and for other legitimate business purposes. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.