Re: [R] Interpreting summary.lm for a 2 factor anova
Dear Sir, Alright. Best Regards, Ashim On Sun, Dec 4, 2016 at 10:59 AM, Richard M. Heibergerwrote: > As Petr Pikal mentioned, the difficulty in interpretation is entirely due > to the set of contrasts you chose.The default treatment contrasts are > not orthogonal and are therefore the most difficult to interpret. > The note in ?aov warns of this difficulty. > > sum contrasts will give you numbers that are easiest to interpret. > > > options(contrasts = c("contr.sum", "contr.poly")) > warpbreakssum.aov <- aov(breaks ~ wool * tension, data = warpbreaks) > coef(warpbreakssum.aov) > model.tables(warpbreakstreatment.aov, type="effects") > model.tables(warpbreakstreatment.aov, type="means") > > > John Fox showed the algebra using the default treatment contrasts > > For full understanding you will need to read in a text more about > sets of linear contrasts and their algebra. > I recommend Section 10.3 in mine, of course. > > Statistical Analysis and Data Display: > An Intermediate Course with Examples in R > Heiberger, Richard M., Holland, Burt > > http://www.springer.com/us/book/9781493921218 > > On Sat, Dec 3, 2016 at 11:46 PM, Ashim Kapoor > wrote: > > On Sun, Dec 4, 2016 at 10:03 AM, Ashim Kapoor > wrote: > > > >> Dear Sir, > >> > >> Many thanks for the explanation. Prior to your email (with some help > from > >> a friend of mine) I was able to figure this one out. If we look at the > >> model : - > >> > >> y = intercept + B1.woolB + B2. tensionM + B3.tensionH + B4. > woolB.TensionM > >> + B5.woolB.TensionH + error > >> > >> Here woolB, tensionM, tensionH are the dummy indicator variables similar > >> to how you have defined them. > >> > >> Now suppose we consider y1,..,yn, all in group A.L (say). > >> > >> Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 + > 0 + > >> 0 + 0 + 0. > >> > >> This should be : y1 + ... yn = n . intercept > > > > What was confusing me was how to compute the cell mean in woolB,tensionH > >> cell. > >> > >> If we have y_1,...,y_n all in group B.H then :- > >> > >> y_1+ ... + y_n = intercept + B1 + 0 + B3 + 0 + B5 > >> > >> This should be : y_1 + ... +y_n = n( intercept + B1 + 0 + B3 + 0 + B5 ) > > > > > >> Therefore average of group B.H = intercept + B1 + B3 + B5 > >> > >> Many thanks and Best Regards, > >> Ashim > >> > >> > >> > >> On Sat, Dec 3, 2016 at 7:15 PM, Fox, John wrote: > >> > >>> Dear Ashim, > >>> > >>> Sorry to chime in late, and my apologies if someone has already pointed > >>> this out, but here's the relationship between the cell means and the > model > >>> coefficients, using the row-basis of the model matrix: > >>> > >>> -- snip > >>> > >>> > means <- with( warpbreaks, tapply( breaks, interaction(wool, > tension), > >>> mean ) ) > >>> > x.A <- rep(c(0, 1), 3) > >>> > x.B1 <- rep(c(0, 1, 0), each=2) > >>> > x.B2 <- rep(c(0, 0, 1), each=2) > >>> > x.AB1 <- x.A*x.B1 > >>> > x.AB2 <- x.A*x.B2 > >>> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2) > >>> > X.basis > >>>x.A x.B1 x.B2 x.AB1 x.AB2 > >>> [1,] 1 000 0 0 > >>> [2,] 1 100 0 0 > >>> [3,] 1 010 0 0 > >>> [4,] 1 110 1 0 > >>> [5,] 1 001 0 0 > >>> [6,] 1 101 0 1 > >>> > solve(X.basis, means) > >>> x.A x.B1 x.B2 x.AB1 x.AB2 > >>> 44.6 -16.3 -20.6 -20.0 21.1 10.6 > >>> > coef(aov(breaks ~ wool * tension, data = warpbreaks)) > >>>(Intercept) woolB tensionM tensionH > woolB:tensionM > >>> 44.6 -16.3 -20.6 -20.0 > 21.1 > >>> woolB:tensionH > >>> 10.6 > >>> > >>> -- snip > >>> > >>> I hope this helps, > >>> John > >>> > >>> - > >>> John Fox, Professor > >>> McMaster University > >>> Hamilton, Ontario > >>> Canada L8S 4M4 > >>> Web: socserv.mcmaster.ca/jfox > >>> > >>> > >>> > >>> > -Original Message- > >>> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of > Ashim > >>> Kapoor > >>> > Sent: December 3, 2016 12:19 AM > >>> > To: David Winsemius > >>> > Cc: r-help@r-project.org > >>> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova > >>> > > >>> > Please allow me to rephrase myquery. > >>> > > >>> > > model.tables(model,"m") > >>> > Tables of means > >>> > Grand mean > >>> > > >>> > 28.14815 > >>> > > >>> > wool > >>> > wool > >>> > A B > >>> > 31.037 25.259 > >>> > > >>> > tension > >>> > tension > >>> > L M H > >>> > 36.39 26.39 21.67 > >>> > > >>> > wool:tension > >>> > tension > >>> > wool L M H > >>> >A 44.56 24.00 24.56 > >>> >B 28.22 28.78 18.78 > >>> > > > >>> > > >>> > > >>> > The above is the same as : > >>> > > >>> >
Re: [R] Interpreting summary.lm for a 2 factor anova
On Sun, Dec 4, 2016 at 10:03 AM, Ashim Kapoorwrote: > Dear Sir, > > Many thanks for the explanation. Prior to your email (with some help from > a friend of mine) I was able to figure this one out. If we look at the > model : - > > y = intercept + B1.woolB + B2. tensionM + B3.tensionH + B4. woolB.TensionM > + B5.woolB.TensionH + error > > Here woolB, tensionM, tensionH are the dummy indicator variables similar > to how you have defined them. > > Now suppose we consider y1,..,yn, all in group A.L (say). > > Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 + 0 + > 0 + 0 + 0. > > This should be : y1 + ... yn = n . intercept What was confusing me was how to compute the cell mean in woolB,tensionH > cell. > > If we have y_1,...,y_n all in group B.H then :- > > y_1+ ... + y_n = intercept + B1 + 0 + B3 + 0 + B5 > > This should be : y_1 + ... +y_n = n( intercept + B1 + 0 + B3 + 0 + B5 ) > Therefore average of group B.H = intercept + B1 + B3 + B5 > > Many thanks and Best Regards, > Ashim > > > > On Sat, Dec 3, 2016 at 7:15 PM, Fox, John wrote: > >> Dear Ashim, >> >> Sorry to chime in late, and my apologies if someone has already pointed >> this out, but here's the relationship between the cell means and the model >> coefficients, using the row-basis of the model matrix: >> >> -- snip >> >> > means <- with( warpbreaks, tapply( breaks, interaction(wool, tension), >> mean ) ) >> > x.A <- rep(c(0, 1), 3) >> > x.B1 <- rep(c(0, 1, 0), each=2) >> > x.B2 <- rep(c(0, 0, 1), each=2) >> > x.AB1 <- x.A*x.B1 >> > x.AB2 <- x.A*x.B2 >> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2) >> > X.basis >>x.A x.B1 x.B2 x.AB1 x.AB2 >> [1,] 1 000 0 0 >> [2,] 1 100 0 0 >> [3,] 1 010 0 0 >> [4,] 1 110 1 0 >> [5,] 1 001 0 0 >> [6,] 1 101 0 1 >> > solve(X.basis, means) >> x.A x.B1 x.B2 x.AB1 x.AB2 >> 44.6 -16.3 -20.6 -20.0 21.1 10.6 >> > coef(aov(breaks ~ wool * tension, data = warpbreaks)) >>(Intercept) woolB tensionM tensionH woolB:tensionM >> 44.6 -16.3 -20.6 -20.0 21.1 >> woolB:tensionH >> 10.6 >> >> -- snip >> >> I hope this helps, >> John >> >> - >> John Fox, Professor >> McMaster University >> Hamilton, Ontario >> Canada L8S 4M4 >> Web: socserv.mcmaster.ca/jfox >> >> >> >> > -Original Message- >> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim >> Kapoor >> > Sent: December 3, 2016 12:19 AM >> > To: David Winsemius >> > Cc: r-help@r-project.org >> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova >> > >> > Please allow me to rephrase myquery. >> > >> > > model.tables(model,"m") >> > Tables of means >> > Grand mean >> > >> > 28.14815 >> > >> > wool >> > wool >> > A B >> > 31.037 25.259 >> > >> > tension >> > tension >> > L M H >> > 36.39 26.39 21.67 >> > >> > wool:tension >> > tension >> > wool L M H >> >A 44.56 24.00 24.56 >> >B 28.22 28.78 18.78 >> > > >> > >> > >> > The above is the same as : >> > >> > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) ) >> > A.L B.L A.M B.M A.H B.H >> > 44.6 28.2 24.0 28.8 24.6 18.8 >> > >> > For reference: >> > >> > > model <- aov(breaks ~ wool * tension, data = warpbreaks) >> > > summary.lm(model) >> > >> > Call: >> > aov(formula = breaks ~ wool * tension, data = warpbreaks) >> > >> > Residuals: >> > Min 1Q Median 3Q Max >> > -19.5556 -6.8889 -0.6667 7.1944 25. >> > >> > Coefficients: >> >Estimate Std. Error t value Pr(>|t|) >> > (Intercept) 44.556 3.647 12.218 2.43e-16 *** >> > woolB -16.333 5.157 -3.167 0.002677 ** >> > tensionM-20.556 5.157 -3.986 0.000228 *** >> > tensionH-20.000 5.157 -3.878 0.000320 *** >> > woolB:tensionM 21.111 7.294 2.895 0.005698 ** >> > woolB:tensionH 10.556 7.294 1.447 0.154327 >> > --- >> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 >> > >> > Residual standard error: 10.94 on 48 degrees of freedom >> > Multiple R-squared: 0.3778,Adjusted R-squared: 0.3129 >> > F-statistic: 5.828 on 5 and 48 DF, p-value: 0.0002772 >> > >> > >> > Now I'll explain what is confusing me in the output of summary.lm. >> > >> > Coeff of Intercept = 44.556 = cell mean for A.L. This is the base. >> > >> > Coeff of woolB:L = -16.333 = 28.2 - 44.556. This is the difference >> of this >> > cell mean(B:L) from the base. >> > >> > Coeff of woolA:tensionM = -20.556 = 24.000- 44.556. This is the >> difference of >>
Re: [R] Interpreting summary.lm for a 2 factor anova
Dear Sir, Many thanks for the explanation. Prior to your email (with some help from a friend of mine) I was able to figure this one out. If we look at the model : - y = intercept + B1.woolB + B2. tensionM + B3.tensionH + B4. woolB.TensionM + B5.woolB.TensionH + error Here woolB, tensionM, tensionH are the dummy indicator variables similar to how you have defined them. Now suppose we consider y1,..,yn, all in group A.L (say). Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 + 0 + 0 + 0 + 0. What was confusing me was how to compute the cell mean in woolB,tensionH cell. If we have y_1,...,y_n all in group B.H then :- y_1+ ... + y_n = intercept + B1 + 0 + B3 + 0 + B5 Therefore average of group B.H = intercept + B1 + B3 + B5 Many thanks and Best Regards, Ashim On Sat, Dec 3, 2016 at 7:15 PM, Fox, Johnwrote: > Dear Ashim, > > Sorry to chime in late, and my apologies if someone has already pointed > this out, but here's the relationship between the cell means and the model > coefficients, using the row-basis of the model matrix: > > -- snip > > > means <- with( warpbreaks, tapply( breaks, interaction(wool, tension), > mean ) ) > > x.A <- rep(c(0, 1), 3) > > x.B1 <- rep(c(0, 1, 0), each=2) > > x.B2 <- rep(c(0, 0, 1), each=2) > > x.AB1 <- x.A*x.B1 > > x.AB2 <- x.A*x.B2 > > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2) > > X.basis >x.A x.B1 x.B2 x.AB1 x.AB2 > [1,] 1 000 0 0 > [2,] 1 100 0 0 > [3,] 1 010 0 0 > [4,] 1 110 1 0 > [5,] 1 001 0 0 > [6,] 1 101 0 1 > > solve(X.basis, means) > x.A x.B1 x.B2 x.AB1 x.AB2 > 44.6 -16.3 -20.6 -20.0 21.1 10.6 > > coef(aov(breaks ~ wool * tension, data = warpbreaks)) >(Intercept) woolB tensionM tensionH woolB:tensionM > 44.6 -16.3 -20.6 -20.0 21.1 > woolB:tensionH > 10.6 > > -- snip > > I hope this helps, > John > > - > John Fox, Professor > McMaster University > Hamilton, Ontario > Canada L8S 4M4 > Web: socserv.mcmaster.ca/jfox > > > > > -Original Message- > > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim > Kapoor > > Sent: December 3, 2016 12:19 AM > > To: David Winsemius > > Cc: r-help@r-project.org > > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova > > > > Please allow me to rephrase myquery. > > > > > model.tables(model,"m") > > Tables of means > > Grand mean > > > > 28.14815 > > > > wool > > wool > > A B > > 31.037 25.259 > > > > tension > > tension > > L M H > > 36.39 26.39 21.67 > > > > wool:tension > > tension > > wool L M H > >A 44.56 24.00 24.56 > >B 28.22 28.78 18.78 > > > > > > > > > The above is the same as : > > > > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) ) > > A.L B.L A.M B.M A.H B.H > > 44.6 28.2 24.0 28.8 24.6 18.8 > > > > For reference: > > > > > model <- aov(breaks ~ wool * tension, data = warpbreaks) > > > summary.lm(model) > > > > Call: > > aov(formula = breaks ~ wool * tension, data = warpbreaks) > > > > Residuals: > > Min 1Q Median 3Q Max > > -19.5556 -6.8889 -0.6667 7.1944 25. > > > > Coefficients: > >Estimate Std. Error t value Pr(>|t|) > > (Intercept) 44.556 3.647 12.218 2.43e-16 *** > > woolB -16.333 5.157 -3.167 0.002677 ** > > tensionM-20.556 5.157 -3.986 0.000228 *** > > tensionH-20.000 5.157 -3.878 0.000320 *** > > woolB:tensionM 21.111 7.294 2.895 0.005698 ** > > woolB:tensionH 10.556 7.294 1.447 0.154327 > > --- > > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > > > > Residual standard error: 10.94 on 48 degrees of freedom > > Multiple R-squared: 0.3778,Adjusted R-squared: 0.3129 > > F-statistic: 5.828 on 5 and 48 DF, p-value: 0.0002772 > > > > > > Now I'll explain what is confusing me in the output of summary.lm. > > > > Coeff of Intercept = 44.556 = cell mean for A.L. This is the base. > > > > Coeff of woolB:L = -16.333 = 28.2 - 44.556. This is the difference > of this > > cell mean(B:L) from the base. > > > > Coeff of woolA:tensionM = -20.556 = 24.000- 44.556. This is the > difference of > > this cell mean (A:M) from the base. > > > > Coeff of woolA:tensionH = -20.000 = 24.6 - 44.556. This is the > difference > > of this cell mean(A:H) from the base. > > > > This is where it stops being the difference from the base. > > > > Coeff of woolB:tensionM = 21.111 should turn out to be 28.8 - 44.556 > but > > this is -15.77822 > > > > Coeff of woolB:tensionH
[R] tcltk: use of .Tcl.callback
Dear, How to properly use the function: .Tcl.callback( ) to trigger a single R function with different values? Below is my stupid solution ... The expected behavior is this but I would like to know how to do it right. Thanks in advanced cleber ## library( tcltk ) top <- tktoplevel() dummyCopyCutR <- function( optionXXX ){ if( optionXXX=="copy" ) print("Option Copy") if( optionXXX=="cut" ) print("Option Cut" ) } strcmd <- strsplit( .Tcl.callback( dummyCopyCutR ), " %" )[[1]][1] tkbind( top, '', paste( strcmd, 'copy' ) ) tkbind( top, '', paste( strcmd, 'cut' ) ) ## --- Este email foi escaneado pelo Avast antivírus. https://www.avast.com/antivirus [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] error serialize (foreach)
As a follow up to this, I have been able to generate a toy example of reproducible code that generates the same problem. Below is just a sample to represent the issue, but my data and subsequent functions acting on the data are much more involved. I no longer have the error, but, the loop running in parallel is extremely slow relative to its serialized counterpart. I have narrowed down the problem to the fact that I am searching through a very large list, grabbing the data from that list by indexing to subset and then doing stuff to it. Both "work", but the parallel version is very, very slow. I believe I am sending data files to each core and the number of searches happening is prohibitive. I am very much stuck in the design-based way of how I would do this particular problem on a single core and am not sure if there is a better designed based approach for solving this problem in the parallel version. Any advice on better ways to work with the %dopar% version here? N <- 20 myList <- vector('list', N) names(myList) <- 1:N for(i in 1:N){ myList[[i]] <- rnorm(100) } nms <- 1:N library(foreach) library(doParallel) registerDoParallel(cores=7) result <- foreach(i = 1:3) %do% { dat <- myList[[which(names(myList) == nms[i])]] mean(dat) } result <- foreach(i = 1:3) %dopar% { dat <- myList[[which(names(myList) == nms[i])]] mean(dat) } -Original Message- From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold Sent: Saturday, December 03, 2016 4:26 PM To: r-help@r-project.org Subject: [R] error serialize (foreach) I have a portion of a foreach loop that I cannot run as parallel but works fine when serialized. Below is a representation of the problem as in this instance I cannot provide reproducible data to generate the same error, the actual data I am working with are confidential. Within each foreach loop are a series of custom functions acting on my data. When using %do% I get expected result but replacing it with %dopar% generates the error. I have searched archives and also stackexchange and see this is an issue that arises and I have tried a couple of the recommendations, like trying to use an outfile in makeCluster. But I am not having success. Oddly, (or perhaps not oddly), others portions of my program run in parallel and do not generate this same error library(foreach) library(doParallel) registerDoParallel(cores=3) # This portion runs and produces expected result result <- foreach(i = 1:N) %do% { tmp1 <- function1(...) tmp2 <- function2(...) tmp2 } # This portion generates error in serialize result <- foreach(i = 1:N) %dopar% { tmp1 <- function1(...) tmp2 <- function2(...) tmp2 } error in serialize(data, node$con) : error writing to connection [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] tcltk: use of .Tcl.callback
Dear, How to properly use the function: .Tcl.callback( ) to trigger a single R function with different values? Below is my stupid solution ... The expected behavior is this but I would like to know how to do it right. Thanks in advanced cleber ## library( tcltk ) top <- tktoplevel() dummyCopyCutR <- function( optionXXX ){ if( optionXXX=="copy" ) print("Option Copy") if( optionXXX=="cut" ) print("Option Cut" ) } strcmd <- strsplit( .Tcl.callback( dummyCopyCutR ), " %" )[[1]][1] tkbind( top, '', paste( strcmd, 'copy' ) ) tkbind( top, '', paste( strcmd, 'cut' ) ) ## --- Este email foi escaneado pelo Avast antivírus. https://www.avast.com/antivirus [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Bootstrap using ARIMA model
> On Dec 1, 2016, at 1:58 PM, Ashwini Patilwrote: > > Hi David, > > here is my code including what i did for the tsboot: > rm(list = ls()) > library(boot) > library(tseries) > library(TTR) > library(quantmod) > library(scales) > library(forecast) > library(zoo) > library(TSA) > security<-"NFLX" > startDate<-"2012-06-01" > endDate<-"2016-10-31" > qte_list<-c("AdjClose") > > data=get.hist.quote(instrument = security, startDate, endDate, quote = > qte_list, provider = "yahoo" ) > > func.ar<- ar(logret) > func.ar<- ar(logret) Error in ar.yw(x, aic = aic, order.max = order.max, na.action = na.action, : object 'logret' not found > func.model<-list(order = c(func.ar$order,0,0),ar=func.ar$ar) > func.res<- func.ar$resid[!is.na(func.ar$resid)] > func.res<-func.res - mean() > func<- function(logret,formula){ > d = logret > return(RSI(exp(logret))) > } > func.sim<-function(res,n.sim,ran.args){ > rg1<- function(n, res) sample(res, n, replace=TRUE) > ts.orig<-ran.args$ts > ts.mod<-ran.args$model > mean(ts.orig)+ts(arima.sim(model=ts.mod,n=n.sim, ran.gen=rg1, > res=as.vestor(res))) > } > myboot<-tsboot(exp(logret),func,R=500,sim="model", ran.gen=func.sim, > ran.args = List(ts=log(data[,1],model=func.sim)) > > > Best, > Ash > > > On Thu, Dec 1, 2016 at 1:50 PM, Bert Gunter wrote: > >> Just briefly to follow up David's comment, though this is mainly about >> statistics and therefore off topic here... >> >> Bootstrapping time series is a subtle issue that requires familiarity >> with the technical details-- and maybe even current research. The >> tsboot() function gives you several options from which you must choose >> *appropriately* -- or maybe choose something else entirely. The Help >> doc gives you a sense of the difficulties: >> >> *** >> Model based resampling is very similar to the parametric bootstrap and >> all simulation must be in one of the user specified functions. This >> avoids the complicated problem of choosing the block length but relies >> on an accurate model choice being made. >> >> Phase scrambling is described in Section 8.2.4 of Davison and Hinkley >> (1997). The types of statistic for which this method produces >> reasonable results is very limited and the other methods seem to do >> better in most situations. Other types of resampling in the frequency >> domain can be accomplished using the function boot with the argument >> sim = "parametric". >> >> >> Moral: If you don't know what you're doing, seek local expertise to >> help -- remote sites offering suggestions from those who aren't >> familiar with the details of your data and analysis goals (maybe you >> don't need to do this at all!) may lead you to irreproducible >> nonsense. >> >> Cheers, >> Bert >> Bert Gunter >> >> "The trouble with having an open mind is that people keep coming along >> and sticking things into it." >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) >> >> >> On Thu, Dec 1, 2016 at 7:45 AM, Ashwini Patil >> wrote: >>> Hi, >>> >>> I want to implement a bootstrap method for time series. >>> I am taking the adj close values from yahoo for NFLX and now I need to >>> bootstrap these values using ARIMA model. >>> >>> here is my code so far: >>> rm(list = ls()) >>> library(boot) >>> library(tseries) >>> library(TTR) >>> library(quantmod) >>> library(scales) >>> library(forecast) >>> library(zoo) >>> library(TSA) >>> security<-"NFLX" >>> startDate<-"2012-06-01" >>> endDate<-"2016-10-31" >>> qte_list<-c("AdjClose") >>> >>> data=get.hist.quote(instrument = security, startDate, endDate, quote = >>> qte_list, provider = "yahoo" ) >>> logret<-diff(log(data[,1])) >>> fit11<-auto.arima(logret, max.order=10) >>> >>> When i use auto.arima, I get an order of (0,0,0) with non-zero mean. >> After >>> this, I tried to use tsboot function but it is not yielding any answers. >>> >>> Any and all help is appreciated. >>> >>> Thank you! >>> >>>[[alternative HTML version deleted]] >>> >>> __ >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >>> https://stat.ethz.ch/mailman/listinfo/r-help >>> PLEASE do read the posting guide http://www.R-project.org/ >> posting-guide.html >>> and provide commented, minimal, self-contained, reproducible code. >> > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide
[R] error serialize (foreach)
I have a portion of a foreach loop that I cannot run as parallel but works fine when serialized. Below is a representation of the problem as in this instance I cannot provide reproducible data to generate the same error, the actual data I am working with are confidential. Within each foreach loop are a series of custom functions acting on my data. When using %do% I get expected result but replacing it with %dopar% generates the error. I have searched archives and also stackexchange and see this is an issue that arises and I have tried a couple of the recommendations, like trying to use an outfile in makeCluster. But I am not having success. Oddly, (or perhaps not oddly), others portions of my program run in parallel and do not generate this same error library(foreach) library(doParallel) registerDoParallel(cores=3) # This portion runs and produces expected result result <- foreach(i = 1:N) %do% { tmp1 <- function1(...) tmp2 <- function2(...) tmp2 } # This portion generates error in serialize result <- foreach(i = 1:N) %dopar% { tmp1 <- function1(...) tmp2 <- function2(...) tmp2 } error in serialize(data, node$con) : error writing to connection [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to setup a multiplicative dummy function in R
I think you need an offset term, or maybe I just don't understand your question. A sample data set, particularly if you can show us how your equation could be used to generate the sample data, would be helpful. -- Sent from my phone. Please excuse my brevity. On December 1, 2016 7:22:37 AM PST, lolo kokowrote: >Hello? > >Does anyone know how I can implement the below equation in R? I would >like to estimate the following equation: > > y=beta_ij * (1+gamma_j * dummy) * x_ij > >where y is continuous, and all the x variables (j of them) are i=3 >level categorical variables. The intuition is that instead of >estimating the additive value for a dummy variable, I would like to >estimate the multiplicative value for the dummy variable. Thus the >presence of the dummy would scale the beta. Note that for each x >variable there is only one gamma. > >For concreteness, you can imagine that y is a continious test score, x >are categorical variables indicating different types of education >achievements, each type of education achievement is categorised in 3 >levels (none, some, a lot), and the dummy indicates race. In this >model I believe that race affects test scores proportionally to >estimated beta of each education level. This avoids having to estimate >a gamma for each education achievement level. > >Is the solution to simply use nls {stats} and type out the equation? > >Hope the explanation makes sense, happy to explain further. > >Best wishes, > >Peter > >__ >R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see >https://stat.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide >http://www.R-project.org/posting-guide.html >and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] rstan error: C:/Rtools/mingw_64/bin/g++: not found
On 02.12.2016 12:23, S Ellison wrote: Apologies for posting a possibly package-specific question, but I'm not sure whether this is an R or rstan ussue. Running rstan under R 3.1.1 in windows 10 I get the well-known error "Compilation ERROR, function(s)/method(s) not created! C:/Rtools/mingw_64/bin/g++: not found" Are you sure this is R-3.1.1? I'd expect such a message for R >= 3.3.0 if you do not update some files, but for 3.1.1 the correct PATH should be sufficient. Nevertheless, there is some code in rstan that seems to deal with Rtools for some reason, so perhaps better ask its package maintainer. Best, Uwe Ligges The cause on my system is simple; g++ is not on my C:\ drive; it's on D: My system path correctly points to D:, running system('g++ -v') works fine. But the error message shows that either rstan or R is insisting on a specific call on C:. I suspect that something is causing either a package, or R, to use the wrong drive. It _may_ be related to the fact that R itself is in its usual place in 'C:\Program files'. Any pointers, either to an answer or to a better place to ask, would be welcome. Steve Ellison PS: I can see that there is a _fairly_ simple work-round, but I prefer Rtools where it is for system management reasons and this is (so far) the only place that the path variable is not correctly picked up. *** This email and any attachments are confidential. Any use...{{dropped:8}} __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] data
This should be reasonably efficient with 'dplyr': > library(dplyr) > input <- read.csv(text = "state,city,x + 1,12,100 + 1,12,100 + 1,12,200 + 1,13,200 + 1,13,100 + 1,13,100 + 1,14,200 + 2,21,200 + 2,21,200 + 2,21,100 + 2,23,100 + 2,23,200 + 2,34,200 + 2,34,100 + 2,35,100") > > result <- input %>% + group_by(state) %>% + summarise(nCities = length(unique(city)), + count = n(), + `100's` = sum(x == 100), + `200's` = sum(x == 200) + ) > result # A tibble: 2 × 5 state nCities count `100's` `200's` 1 1 3 7 4 3 2 2 4 8 4 4 Or you can also use data.table: > library(data.table) > input <- fread("state,city,x + 1,12,100 + 1,12,100 + 1,12,200 + 1,13,200 + 1,13,100 + 1,13,100 + 1,14,200 + 2,21,200 + 2,21,200 + 2,21,100 + 2,23,100 + 2,23,200 + 2,34,200 + 2,34,100 + 2,35,100") > > input[, .(nCities = length(unique(city)), + count = .N, + `100's` = sum(x == 100), + `200's` = sum(x == 200) + ) + , keyby = state + ] state nCities count 100's 200's 1: 1 3 7 4 3 2: 2 4 8 4 4 Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Sat, Dec 3, 2016 at 10:40 AM, Valwrote: > Hi all, > > I am trying to read and summarize a big data frame( >10M records) > > Here is the sample of my data > state,city,x > 1,12,100 > 1,12,100 > 1,12,200 > 1,13,200 > 1,13,100 > 1,13,100 > 1,14,200 > 2,21,200 > 2,21,200 > 2,21,100 > 2,23,100 > 2,23,200 > 2,34,200 > 2,34,100 > 2,35,100 > > I want get the total count by state, and the the number of cities > by state. The x variable is either 100 or 200 and count each > > The result should look like as follows. > > state,city,count,100's,200's > 1,3,7,4,3 > 2,4,8,4,4 > > At the present I am doing it in several steps and taking too long > > Is there an efficient way of doing this? > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/ > posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] data
Hi all, I am trying to read and summarize a big data frame( >10M records) Here is the sample of my data state,city,x 1,12,100 1,12,100 1,12,200 1,13,200 1,13,100 1,13,100 1,14,200 2,21,200 2,21,200 2,21,100 2,23,100 2,23,200 2,34,200 2,34,100 2,35,100 I want get the total count by state, and the the number of cities by state. The x variable is either 100 or 200 and count each The result should look like as follows. state,city,count,100's,200's 1,3,7,4,3 2,4,8,4,4 At the present I am doing it in several steps and taking too long Is there an efficient way of doing this? __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [ESS] how to edit openbugs model file?
Hi Paul: Use BUGS mode which is part of ESS… (require ‘ess-bugs-d) Then name your file model.bug Rodney __ ESS-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/ess-help
Re: [R] Interpreting summary.lm for a 2 factor anova
Dear Ashim, Sorry to chime in late, and my apologies if someone has already pointed this out, but here's the relationship between the cell means and the model coefficients, using the row-basis of the model matrix: -- snip > means <- with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) > ) > x.A <- rep(c(0, 1), 3) > x.B1 <- rep(c(0, 1, 0), each=2) > x.B2 <- rep(c(0, 0, 1), each=2) > x.AB1 <- x.A*x.B1 > x.AB2 <- x.A*x.B2 > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2) > X.basis x.A x.B1 x.B2 x.AB1 x.AB2 [1,] 1 000 0 0 [2,] 1 100 0 0 [3,] 1 010 0 0 [4,] 1 110 1 0 [5,] 1 001 0 0 [6,] 1 101 0 1 > solve(X.basis, means) x.A x.B1 x.B2 x.AB1 x.AB2 44.6 -16.3 -20.6 -20.0 21.1 10.6 > coef(aov(breaks ~ wool * tension, data = warpbreaks)) (Intercept) woolB tensionM tensionH woolB:tensionM 44.6 -16.3 -20.6 -20.0 21.1 woolB:tensionH 10.6 -- snip I hope this helps, John - John Fox, Professor McMaster University Hamilton, Ontario Canada L8S 4M4 Web: socserv.mcmaster.ca/jfox > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim Kapoor > Sent: December 3, 2016 12:19 AM > To: David Winsemius> Cc: r-help@r-project.org > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova > > Please allow me to rephrase myquery. > > > model.tables(model,"m") > Tables of means > Grand mean > > 28.14815 > > wool > wool > A B > 31.037 25.259 > > tension > tension > L M H > 36.39 26.39 21.67 > > wool:tension > tension > wool L M H >A 44.56 24.00 24.56 >B 28.22 28.78 18.78 > > > > > The above is the same as : > > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) ) > A.L B.L A.M B.M A.H B.H > 44.6 28.2 24.0 28.8 24.6 18.8 > > For reference: > > > model <- aov(breaks ~ wool * tension, data = warpbreaks) > > summary.lm(model) > > Call: > aov(formula = breaks ~ wool * tension, data = warpbreaks) > > Residuals: > Min 1Q Median 3Q Max > -19.5556 -6.8889 -0.6667 7.1944 25. > > Coefficients: >Estimate Std. Error t value Pr(>|t|) > (Intercept) 44.556 3.647 12.218 2.43e-16 *** > woolB -16.333 5.157 -3.167 0.002677 ** > tensionM-20.556 5.157 -3.986 0.000228 *** > tensionH-20.000 5.157 -3.878 0.000320 *** > woolB:tensionM 21.111 7.294 2.895 0.005698 ** > woolB:tensionH 10.556 7.294 1.447 0.154327 > --- > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > > Residual standard error: 10.94 on 48 degrees of freedom > Multiple R-squared: 0.3778,Adjusted R-squared: 0.3129 > F-statistic: 5.828 on 5 and 48 DF, p-value: 0.0002772 > > > Now I'll explain what is confusing me in the output of summary.lm. > > Coeff of Intercept = 44.556 = cell mean for A.L. This is the base. > > Coeff of woolB:L = -16.333 = 28.2 - 44.556. This is the difference of this > cell mean(B:L) from the base. > > Coeff of woolA:tensionM = -20.556 = 24.000- 44.556. This is the difference of > this cell mean (A:M) from the base. > > Coeff of woolA:tensionH = -20.000 = 24.6 - 44.556. This is the difference > of this cell mean(A:H) from the base. > > This is where it stops being the difference from the base. > > Coeff of woolB:tensionM = 21.111 should turn out to be 28.8 - 44.556 but > this is -15.77822 > > Coeff of woolB:tensionH = 10.556 should turn out to be 18.8 - 44.556 but > this is -25.77822 > > In the above 2 cases, we can't say that the coefficient = cell mean - base > case. > Can you tell me what should be the statement to be made ? > > > Best Regards, > Ashim > > PS : My apologies for emailing my query to this list. Can you tell me the > names > of a few (active) statistics help list ? > > On Sat, Dec 3, 2016 at 1:33 AM, David Winsemius > wrote: > > > > > > On Dec 2, 2016, at 9:09 AM, David Winsemius > > wrote: > > > > > >> > > >> On Dec 2, 2016, at 6:16 AM, Ashim Kapoor > wrote: > > >> > > >> Dear Pikal, > > >> > > >> All levels except the interactions are compared to the Intercept. > > >> I'm a little confused as to what's going on in interaction terms > > >> eg. the cell wool B : tension M. It's mean is : > > >> 28.78 and 28.78 - 44.56 = -15.78 != 21.111. > > >> > > >> It's something like 44.56 (intercept) -16.333 (wool B) -.20.556 > > >> (tension > > >> M) + 21.111 (woolB:tensionM) = 28.782. > >
Re: [ESS] how to edit openbugs model file?
Also, M-x R-mode but that would need to be done again every time... If you don't want to rename to model.R Add a line ## -*- R -*- at the top (or close enough to the very top; this tells emacs to use R-mode ; there are several (more flexible, more to write) versions of such so-called "file local variables" in emacs.. Emacs manuals and also Google "Emacs file local variables" should help. Martin Maechler, ETH Zurich On Fri, Dec 2, 2016 at 8:35 PM, Simon Bonnerwrote: > Hi Paul, > > An easy way to do this is to rename your file as model.R. The syntax for BUGS > is almost identical to the syntax for R, and BUGS doesn’t care what the model > extension is. > > Wish I could say that I thought of that but kudos go to Matthew Schofield. > > Cheers, > > Simon > > Simon Bonner > Assistant Professor of Environmetrics/Director MMASc > Department of Statistical and Actuarial Sciences/Department of Biology > University of Western Ontario > > Office: Western Science Centre rm 276 > > Email: sbonn...@uwo.ca | Telephone: 519-661-2111 x88205 | Fax: 519-661-3813 > Twitter: @bonnerstatslab | Website: http://simon.bonners.ca/bonner-lab/wpblog/ > > > On 2016-12-02, 1:21 PM, "ESS-help on behalf of Paul Johnson" > wrote: > > I have a student here who wants to edit an OpenBugs file named > "model.txt" in Emacs. I can't understand how to tell Emacs to use a > bugs mode so that things like M-; understand what to do. > > If you have an idea, let me know. > > We don't want to run the bugs code with an emacs session, we just want > to edit the bugs text model file and then we can run with R2OpenBugs > or something else. > > -- > Paul E. Johnson http://pj.freefaculty.org > Director, Center for Research Methods and Data Analysis > http://crmda.ku.edu > > To write to me directly, please address me at pauljohn at ku.edu. > > __ > ESS-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/ess-help > > > __ > ESS-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/ess-help __ ESS-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/ess-help