Re: [R] Interpreting summary.lm for a 2 factor anova

2016-12-03 Thread Ashim Kapoor
Dear Sir,

Alright.

Best Regards,
Ashim

On Sun, Dec 4, 2016 at 10:59 AM, Richard M. Heiberger 
wrote:

> As Petr Pikal mentioned, the difficulty in interpretation is entirely due
> to the set of contrasts you chose.The default treatment contrasts are
> not orthogonal and are therefore the most difficult to interpret.
> The note in ?aov warns of this difficulty.
>
> sum contrasts will give you numbers that are easiest to interpret.
>
>
> options(contrasts = c("contr.sum", "contr.poly"))
> warpbreakssum.aov <- aov(breaks ~ wool * tension, data = warpbreaks)
> coef(warpbreakssum.aov)
> model.tables(warpbreakstreatment.aov, type="effects")
> model.tables(warpbreakstreatment.aov, type="means")
>
>
> John Fox showed the algebra using the default treatment contrasts
>
> For full understanding you will need to read  in a text more about
> sets of linear contrasts and their algebra.
> I recommend Section 10.3 in mine, of course.
>
> Statistical Analysis and Data Display:
> An Intermediate Course with Examples in R
> Heiberger, Richard M., Holland, Burt
>
> http://www.springer.com/us/book/9781493921218
>
> On Sat, Dec 3, 2016 at 11:46 PM, Ashim Kapoor 
> wrote:
> > On Sun, Dec 4, 2016 at 10:03 AM, Ashim Kapoor 
> wrote:
> >
> >> Dear Sir,
> >>
> >> Many thanks for the explanation. Prior to your email (with some help
> from
> >> a friend of mine) I was able to figure this one out. If we look at the
> >> model : -
> >>
> >> y = intercept + B1.woolB + B2. tensionM + B3.tensionH + B4.
> woolB.TensionM
> >> + B5.woolB.TensionH + error
> >>
> >> Here woolB, tensionM, tensionH are the dummy indicator variables similar
> >> to how you have defined them.
> >>
> >> Now suppose we consider y1,..,yn, all in group A.L (say).
> >>
> >> Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 +
> 0 +
> >> 0 + 0 + 0.
> >>
> >> This should be : y1 + ... yn = n . intercept
> >
> > What was confusing me was how to compute the cell mean in woolB,tensionH
> >> cell.
> >>
> >> If we have y_1,...,y_n all in group B.H then :-
> >>
> >> y_1+ ... + y_n = intercept + B1 + 0 + B3 + 0 +  B5
> >>
> >> This should be : y_1 + ... +y_n = n( intercept + B1 + 0 + B3 + 0 +  B5 )
> >
> >
> >> Therefore average of group B.H = intercept + B1 + B3 + B5
> >>
> >> Many thanks and Best Regards,
> >> Ashim
> >>
> >>
> >>
> >> On Sat, Dec 3, 2016 at 7:15 PM, Fox, John  wrote:
> >>
> >>> Dear Ashim,
> >>>
> >>> Sorry to chime in late, and my apologies if someone has already pointed
> >>> this out, but here's the relationship between the cell means and the
> model
> >>> coefficients, using the row-basis of the model matrix:
> >>>
> >>> -- snip 
> >>>
> >>> > means <- with( warpbreaks, tapply( breaks, interaction(wool,
> tension),
> >>> mean ) )
> >>> > x.A <- rep(c(0, 1), 3)
> >>> > x.B1 <- rep(c(0, 1, 0), each=2)
> >>> > x.B2 <- rep(c(0, 0, 1), each=2)
> >>> > x.AB1 <- x.A*x.B1
> >>> > x.AB2 <- x.A*x.B2
> >>> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
> >>> > X.basis
> >>>x.A x.B1 x.B2 x.AB1 x.AB2
> >>> [1,] 1   000 0 0
> >>> [2,] 1   100 0 0
> >>> [3,] 1   010 0 0
> >>> [4,] 1   110 1 0
> >>> [5,] 1   001 0 0
> >>> [6,] 1   101 0 1
> >>> > solve(X.basis, means)
> >>> x.A  x.B1  x.B2 x.AB1 x.AB2
> >>>  44.6 -16.3 -20.6 -20.0  21.1  10.6
> >>> > coef(aov(breaks ~ wool * tension, data = warpbreaks))
> >>>(Intercept)  woolB   tensionM   tensionH
> woolB:tensionM
> >>>   44.6  -16.3  -20.6  -20.0
>  21.1
> >>> woolB:tensionH
> >>>   10.6
> >>>
> >>> -- snip 
> >>>
> >>> I hope this helps,
> >>>  John
> >>>
> >>> -
> >>> John Fox, Professor
> >>> McMaster University
> >>> Hamilton, Ontario
> >>> Canada L8S 4M4
> >>> Web: socserv.mcmaster.ca/jfox
> >>>
> >>>
> >>>
> >>> > -Original Message-
> >>> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> Ashim
> >>> Kapoor
> >>> > Sent: December 3, 2016 12:19 AM
> >>> > To: David Winsemius 
> >>> > Cc: r-help@r-project.org
> >>> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova
> >>> >
> >>> > Please allow me to rephrase myquery.
> >>> >
> >>> > > model.tables(model,"m")
> >>> > Tables of means
> >>> > Grand mean
> >>> >
> >>> > 28.14815
> >>> >
> >>> >  wool
> >>> > wool
> >>> >  A  B
> >>> > 31.037 25.259
> >>> >
> >>> >  tension
> >>> > tension
> >>> > L M H
> >>> > 36.39 26.39 21.67
> >>> >
> >>> >  wool:tension
> >>> > tension
> >>> > wool L M H
> >>> >A 44.56 24.00 24.56
> >>> >B 28.22 28.78 18.78
> >>> > >
> >>> >
> >>> >
> >>> > The above is the same as :
> >>> >
> >>> > 

Re: [R] Interpreting summary.lm for a 2 factor anova

2016-12-03 Thread Ashim Kapoor
On Sun, Dec 4, 2016 at 10:03 AM, Ashim Kapoor  wrote:

> Dear Sir,
>
> Many thanks for the explanation. Prior to your email (with some help from
> a friend of mine) I was able to figure this one out. If we look at the
> model : -
>
> y = intercept + B1.woolB + B2. tensionM + B3.tensionH + B4. woolB.TensionM
> + B5.woolB.TensionH + error
>
> Here woolB, tensionM, tensionH are the dummy indicator variables similar
> to how you have defined them.
>
> Now suppose we consider y1,..,yn, all in group A.L (say).
>
> Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 + 0 +
> 0 + 0 + 0.
>
> This should be : y1 + ... yn = n . intercept

What was confusing me was how to compute the cell mean in woolB,tensionH
> cell.
>
> If we have y_1,...,y_n all in group B.H then :-
>
> y_1+ ... + y_n = intercept + B1 + 0 + B3 + 0 +  B5
>
> This should be : y_1 + ... +y_n = n( intercept + B1 + 0 + B3 + 0 +  B5 )


> Therefore average of group B.H = intercept + B1 + B3 + B5
>
> Many thanks and Best Regards,
> Ashim
>
>
>
> On Sat, Dec 3, 2016 at 7:15 PM, Fox, John  wrote:
>
>> Dear Ashim,
>>
>> Sorry to chime in late, and my apologies if someone has already pointed
>> this out, but here's the relationship between the cell means and the model
>> coefficients, using the row-basis of the model matrix:
>>
>> -- snip 
>>
>> > means <- with( warpbreaks, tapply( breaks, interaction(wool, tension),
>> mean ) )
>> > x.A <- rep(c(0, 1), 3)
>> > x.B1 <- rep(c(0, 1, 0), each=2)
>> > x.B2 <- rep(c(0, 0, 1), each=2)
>> > x.AB1 <- x.A*x.B1
>> > x.AB2 <- x.A*x.B2
>> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
>> > X.basis
>>x.A x.B1 x.B2 x.AB1 x.AB2
>> [1,] 1   000 0 0
>> [2,] 1   100 0 0
>> [3,] 1   010 0 0
>> [4,] 1   110 1 0
>> [5,] 1   001 0 0
>> [6,] 1   101 0 1
>> > solve(X.basis, means)
>> x.A  x.B1  x.B2 x.AB1 x.AB2
>>  44.6 -16.3 -20.6 -20.0  21.1  10.6
>> > coef(aov(breaks ~ wool * tension, data = warpbreaks))
>>(Intercept)  woolB   tensionM   tensionH woolB:tensionM
>>   44.6  -16.3  -20.6  -20.0   21.1
>> woolB:tensionH
>>   10.6
>>
>> -- snip 
>>
>> I hope this helps,
>>  John
>>
>> -
>> John Fox, Professor
>> McMaster University
>> Hamilton, Ontario
>> Canada L8S 4M4
>> Web: socserv.mcmaster.ca/jfox
>>
>>
>>
>> > -Original Message-
>> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim
>> Kapoor
>> > Sent: December 3, 2016 12:19 AM
>> > To: David Winsemius 
>> > Cc: r-help@r-project.org
>> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova
>> >
>> > Please allow me to rephrase myquery.
>> >
>> > > model.tables(model,"m")
>> > Tables of means
>> > Grand mean
>> >
>> > 28.14815
>> >
>> >  wool
>> > wool
>> >  A  B
>> > 31.037 25.259
>> >
>> >  tension
>> > tension
>> > L M H
>> > 36.39 26.39 21.67
>> >
>> >  wool:tension
>> > tension
>> > wool L M H
>> >A 44.56 24.00 24.56
>> >B 28.22 28.78 18.78
>> > >
>> >
>> >
>> > The above is the same as :
>> >
>> > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
>> >  A.L  B.L  A.M  B.M  A.H  B.H
>> > 44.6 28.2 24.0 28.8 24.6 18.8
>> >
>> > For reference:
>> >
>> > > model <- aov(breaks ~ wool * tension, data = warpbreaks)
>> > > summary.lm(model)
>> >
>> > Call:
>> > aov(formula = breaks ~ wool * tension, data = warpbreaks)
>> >
>> > Residuals:
>> >  Min   1Q   Median   3Q  Max
>> > -19.5556  -6.8889  -0.6667   7.1944  25.
>> >
>> > Coefficients:
>> >Estimate Std. Error t value Pr(>|t|)
>> > (Intercept)  44.556  3.647  12.218 2.43e-16 ***
>> > woolB   -16.333  5.157  -3.167 0.002677 **
>> > tensionM-20.556  5.157  -3.986 0.000228 ***
>> > tensionH-20.000  5.157  -3.878 0.000320 ***
>> > woolB:tensionM   21.111  7.294   2.895 0.005698 **
>> > woolB:tensionH   10.556  7.294   1.447 0.154327
>> > ---
>> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> >
>> > Residual standard error: 10.94 on 48 degrees of freedom
>> > Multiple R-squared:  0.3778,Adjusted R-squared:  0.3129
>> > F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
>> >
>> >
>> > Now I'll explain what is confusing me in the output of summary.lm.
>> >
>> > Coeff of Intercept = 44.556  = cell mean for A.L. This is the base.
>> >
>> > Coeff of woolB:L = -16.333 = 28.2 - 44.556. This is the difference
>> of this
>> > cell mean(B:L) from the base.
>> >
>> > Coeff of woolA:tensionM = -20.556  = 24.000- 44.556. This is the
>> difference of
>> 

Re: [R] Interpreting summary.lm for a 2 factor anova

2016-12-03 Thread Ashim Kapoor
Dear Sir,

Many thanks for the explanation. Prior to your email (with some help from a
friend of mine) I was able to figure this one out. If we look at the model
: -

y = intercept + B1.woolB + B2. tensionM + B3.tensionH + B4. woolB.TensionM
+ B5.woolB.TensionH + error

Here woolB, tensionM, tensionH are the dummy indicator variables similar to
how you have defined them.

Now suppose we consider y1,..,yn, all in group A.L (say).

Then y1 + ... + yn = intercept => average(y1,...,yn) = intercept + 0 + 0 +
0 + 0 + 0.

What was confusing me was how to compute the cell mean in woolB,tensionH
cell.

If we have y_1,...,y_n all in group B.H then :-

y_1+ ... + y_n = intercept + B1 + 0 + B3 + 0 +  B5

Therefore average of group B.H = intercept + B1 + B3 + B5

Many thanks and Best Regards,
Ashim



On Sat, Dec 3, 2016 at 7:15 PM, Fox, John  wrote:

> Dear Ashim,
>
> Sorry to chime in late, and my apologies if someone has already pointed
> this out, but here's the relationship between the cell means and the model
> coefficients, using the row-basis of the model matrix:
>
> -- snip 
>
> > means <- with( warpbreaks, tapply( breaks, interaction(wool, tension),
> mean ) )
> > x.A <- rep(c(0, 1), 3)
> > x.B1 <- rep(c(0, 1, 0), each=2)
> > x.B2 <- rep(c(0, 0, 1), each=2)
> > x.AB1 <- x.A*x.B1
> > x.AB2 <- x.A*x.B2
> > X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
> > X.basis
>x.A x.B1 x.B2 x.AB1 x.AB2
> [1,] 1   000 0 0
> [2,] 1   100 0 0
> [3,] 1   010 0 0
> [4,] 1   110 1 0
> [5,] 1   001 0 0
> [6,] 1   101 0 1
> > solve(X.basis, means)
> x.A  x.B1  x.B2 x.AB1 x.AB2
>  44.6 -16.3 -20.6 -20.0  21.1  10.6
> > coef(aov(breaks ~ wool * tension, data = warpbreaks))
>(Intercept)  woolB   tensionM   tensionH woolB:tensionM
>   44.6  -16.3  -20.6  -20.0   21.1
> woolB:tensionH
>   10.6
>
> -- snip 
>
> I hope this helps,
>  John
>
> -
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>
>
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim
> Kapoor
> > Sent: December 3, 2016 12:19 AM
> > To: David Winsemius 
> > Cc: r-help@r-project.org
> > Subject: Re: [R] Interpreting summary.lm for a 2 factor anova
> >
> > Please allow me to rephrase myquery.
> >
> > > model.tables(model,"m")
> > Tables of means
> > Grand mean
> >
> > 28.14815
> >
> >  wool
> > wool
> >  A  B
> > 31.037 25.259
> >
> >  tension
> > tension
> > L M H
> > 36.39 26.39 21.67
> >
> >  wool:tension
> > tension
> > wool L M H
> >A 44.56 24.00 24.56
> >B 28.22 28.78 18.78
> > >
> >
> >
> > The above is the same as :
> >
> > with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
> >  A.L  B.L  A.M  B.M  A.H  B.H
> > 44.6 28.2 24.0 28.8 24.6 18.8
> >
> > For reference:
> >
> > > model <- aov(breaks ~ wool * tension, data = warpbreaks)
> > > summary.lm(model)
> >
> > Call:
> > aov(formula = breaks ~ wool * tension, data = warpbreaks)
> >
> > Residuals:
> >  Min   1Q   Median   3Q  Max
> > -19.5556  -6.8889  -0.6667   7.1944  25.
> >
> > Coefficients:
> >Estimate Std. Error t value Pr(>|t|)
> > (Intercept)  44.556  3.647  12.218 2.43e-16 ***
> > woolB   -16.333  5.157  -3.167 0.002677 **
> > tensionM-20.556  5.157  -3.986 0.000228 ***
> > tensionH-20.000  5.157  -3.878 0.000320 ***
> > woolB:tensionM   21.111  7.294   2.895 0.005698 **
> > woolB:tensionH   10.556  7.294   1.447 0.154327
> > ---
> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Residual standard error: 10.94 on 48 degrees of freedom
> > Multiple R-squared:  0.3778,Adjusted R-squared:  0.3129
> > F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
> >
> >
> > Now I'll explain what is confusing me in the output of summary.lm.
> >
> > Coeff of Intercept = 44.556  = cell mean for A.L. This is the base.
> >
> > Coeff of woolB:L = -16.333 = 28.2 - 44.556. This is the difference
> of this
> > cell mean(B:L) from the base.
> >
> > Coeff of woolA:tensionM = -20.556  = 24.000- 44.556. This is the
> difference of
> > this cell mean (A:M)  from the base.
> >
> > Coeff of woolA:tensionH = -20.000  = 24.6 - 44.556. This is the
> difference
> > of this cell mean(A:H) from the base.
> >
> > This is where it stops being the difference from the base.
> >
> > Coeff of woolB:tensionM = 21.111 should turn out to be 28.8 - 44.556
> but
> > this is -15.77822
> >
> > Coeff of woolB:tensionH 

[R] tcltk: use of .Tcl.callback

2016-12-03 Thread Cleber N.Borges
Dear,
How to properly use the function: .Tcl.callback( ) to trigger a single R
function with different values?
Below is my stupid solution ... The expected behavior is this but I
would like to know how to do it right.

Thanks in advanced

cleber

##

library( tcltk )
top <- tktoplevel()

dummyCopyCutR <- function( optionXXX ){
 if( optionXXX=="copy" ) print("Option Copy")
 if( optionXXX=="cut"  ) print("Option Cut" )
 }

strcmd <- strsplit( .Tcl.callback( dummyCopyCutR ), " %" )[[1]][1]

tkbind( top, '', paste( strcmd, 'copy' ) )
tkbind( top, '', paste( strcmd, 'cut'  ) )

##


---
Este email foi escaneado pelo Avast antivírus.
https://www.avast.com/antivirus

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] error serialize (foreach)

2016-12-03 Thread Doran, Harold
As a follow up to this, I have been able to generate a toy example of 
reproducible code that generates the same problem. Below is just a sample to 
represent the issue, but my data and subsequent functions acting on the data 
are much more involved. 

I no longer have the error, but, the loop running in parallel is extremely slow 
relative to its serialized counterpart.

I have narrowed down the problem to the fact that I am searching through a very 
large list, grabbing the data from that list by indexing to subset and then 
doing stuff to it. Both "work", but the parallel version is very, very slow. I 
believe I am sending data files to each core and the number of searches 
happening is prohibitive.

I am very much stuck in the design-based way of how I would do this particular 
problem on a single core and am not sure if there is a better designed based 
approach for solving this problem in the parallel version. 

Any advice on better ways to work with the %dopar% version here?

N <- 20
myList <- vector('list', N)
names(myList) <- 1:N
for(i in 1:N){
myList[[i]] <- rnorm(100)
}
nms <- 1:N
library(foreach)
library(doParallel)
registerDoParallel(cores=7)

result <- foreach(i = 1:3) %do% {
dat <- myList[[which(names(myList) == nms[i])]]
mean(dat)
}

result <- foreach(i = 1:3) %dopar% {
dat <- myList[[which(names(myList) == nms[i])]]
mean(dat)
}
-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Doran, Harold
Sent: Saturday, December 03, 2016 4:26 PM
To: r-help@r-project.org
Subject: [R] error serialize (foreach)

I have a portion of a foreach loop that I cannot run as parallel but works fine 
when serialized. Below is a representation of the problem as in this instance I 
cannot provide reproducible data to generate the same error, the actual data I 
am working with are confidential.

Within each foreach loop are a series of custom functions acting on my data. 
When using %do% I get expected result but replacing it with %dopar% generates 
the error.

I have searched archives and also stackexchange and see this is an issue that 
arises and I have tried a couple of the recommendations, like trying to use an 
outfile in makeCluster. But I am not having success.

Oddly, (or perhaps not oddly), others portions of my program run in parallel 
and do not generate this same error

library(foreach)
library(doParallel)
registerDoParallel(cores=3)

# This portion runs and produces expected result result <- foreach(i = 1:N) 
%do% {
tmp1 <- function1(...)
tmp2 <- function2(...)
tmp2
}

# This portion generates error in serialize result <- foreach(i = 1:N) %dopar% {
tmp1 <- function1(...)
tmp2 <- function2(...)
tmp2
}

error in serialize(data, node$con) : error writing to connection


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see 
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tcltk: use of .Tcl.callback

2016-12-03 Thread Cleber N.Borges
Dear,
How to properly use the function: .Tcl.callback( ) to trigger a single R
function with different values?
Below is my stupid solution ... The expected behavior is this but I
would like to know how to do it right.

Thanks in advanced

cleber

##

library( tcltk )
top <- tktoplevel()

dummyCopyCutR <- function( optionXXX ){
 if( optionXXX=="copy" ) print("Option Copy")
 if( optionXXX=="cut"  ) print("Option Cut" )
 }

strcmd <- strsplit( .Tcl.callback( dummyCopyCutR ), " %" )[[1]][1]

tkbind( top, '', paste( strcmd, 'copy' ) )
tkbind( top, '', paste( strcmd, 'cut'  ) )

##


---
Este email foi escaneado pelo Avast antivírus.
https://www.avast.com/antivirus

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Bootstrap using ARIMA model

2016-12-03 Thread David Winsemius

> On Dec 1, 2016, at 1:58 PM, Ashwini Patil  wrote:
> 
> Hi David,
> 
> here is my code including what i did for the tsboot:
> rm(list = ls())
> library(boot)
> library(tseries)
> library(TTR)
> library(quantmod)
> library(scales)
> library(forecast)
> library(zoo)
> library(TSA)
> security<-"NFLX"
> startDate<-"2012-06-01"
> endDate<-"2016-10-31"
> qte_list<-c("AdjClose")
> 
> data=get.hist.quote(instrument = security, startDate, endDate, quote =
> qte_list,   provider = "yahoo" )
> 
> func.ar<- ar(logret)

> func.ar<- ar(logret)
Error in ar.yw(x, aic = aic, order.max = order.max, na.action = na.action,  : 
  object 'logret' not found


> func.model<-list(order = c(func.ar$order,0,0),ar=func.ar$ar)
> func.res<- func.ar$resid[!is.na(func.ar$resid)]
> func.res<-func.res - mean()
> func<- function(logret,formula){
>  d = logret
>  return(RSI(exp(logret)))
> }
> func.sim<-function(res,n.sim,ran.args){
>  rg1<- function(n, res) sample(res, n, replace=TRUE)
>  ts.orig<-ran.args$ts
>  ts.mod<-ran.args$model
>  mean(ts.orig)+ts(arima.sim(model=ts.mod,n=n.sim, ran.gen=rg1,
> res=as.vestor(res)))
> }
> myboot<-tsboot(exp(logret),func,R=500,sim="model", ran.gen=func.sim,
> ran.args = List(ts=log(data[,1],model=func.sim))
> 
> 
> Best,
> Ash
> 
> 
> On Thu, Dec 1, 2016 at 1:50 PM, Bert Gunter  wrote:
> 
>> Just briefly to follow up David's comment, though this is mainly about
>> statistics and therefore off topic here...
>> 
>> Bootstrapping time series is a subtle issue that requires familiarity
>> with the technical details-- and maybe even current research. The
>> tsboot() function gives you several options from which you must choose
>> *appropriately* -- or maybe choose something else entirely. The Help
>> doc gives you a sense of the difficulties:
>> 
>> ***
>> Model based resampling is very similar to the parametric bootstrap and
>> all simulation must be in one of the user specified functions. This
>> avoids the complicated problem of choosing the block length but relies
>> on an accurate model choice being made.
>> 
>> Phase scrambling is described in Section 8.2.4 of Davison and Hinkley
>> (1997). The types of statistic for which this method produces
>> reasonable results is very limited and the other methods seem to do
>> better in most situations. Other types of resampling in the frequency
>> domain can be accomplished using the function boot with the argument
>> sim = "parametric".
>> 
>> 
>> Moral: If you don't know what you're doing, seek local expertise to
>> help -- remote sites offering suggestions from those who aren't
>> familiar with the details of your data and analysis goals (maybe you
>> don't need to do this at all!) may lead you to irreproducible
>> nonsense.
>> 
>> Cheers,
>> Bert
>> Bert Gunter
>> 
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> 
>> 
>> On Thu, Dec 1, 2016 at 7:45 AM, Ashwini Patil 
>> wrote:
>>> Hi,
>>> 
>>> I want to implement a bootstrap method for time series.
>>> I am taking the adj close values from yahoo for NFLX and now I need to
>>> bootstrap these values using ARIMA model.
>>> 
>>> here is my code so far:
>>> rm(list = ls())
>>> library(boot)
>>> library(tseries)
>>> library(TTR)
>>> library(quantmod)
>>> library(scales)
>>> library(forecast)
>>> library(zoo)
>>> library(TSA)
>>> security<-"NFLX"
>>> startDate<-"2012-06-01"
>>> endDate<-"2016-10-31"
>>> qte_list<-c("AdjClose")
>>> 
>>> data=get.hist.quote(instrument = security, startDate, endDate, quote =
>>> qte_list,   provider = "yahoo" )
>>> logret<-diff(log(data[,1]))
>>> fit11<-auto.arima(logret, max.order=10)
>>> 
>>> When i use auto.arima, I get an order of (0,0,0) with non-zero mean.
>> After
>>> this, I tried to use tsboot function but it is not yielding any answers.
>>> 
>>> Any and all help is appreciated.
>>> 
>>> Thank you!
>>> 
>>>[[alternative HTML version deleted]]
>>> 
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/
>> posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 

[R] error serialize (foreach)

2016-12-03 Thread Doran, Harold
I have a portion of a foreach loop that I cannot run as parallel but works fine 
when serialized. Below is a representation of the problem as in this instance I 
cannot provide reproducible data to generate the same error, the actual data I 
am working with are confidential.

Within each foreach loop are a series of custom functions acting on my data. 
When using %do% I get expected result but replacing it with %dopar% generates 
the error.

I have searched archives and also stackexchange and see this is an issue that 
arises and I have tried a couple of the recommendations, like trying to use an 
outfile in makeCluster. But I am not having success.

Oddly, (or perhaps not oddly), others portions of my program run in parallel 
and do not generate this same error

library(foreach)
library(doParallel)
registerDoParallel(cores=3)

# This portion runs and produces expected result
result <- foreach(i = 1:N) %do% {
tmp1 <- function1(...)
tmp2 <- function2(...)
tmp2
}

# This portion generates error in serialize
result <- foreach(i = 1:N) %dopar% {
tmp1 <- function1(...)
tmp2 <- function2(...)
tmp2
}

error in serialize(data, node$con) : error writing to connection


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to setup a multiplicative dummy function in R

2016-12-03 Thread Jeff Newmiller
I think you need an offset term, or maybe I just don't understand your 
question.  A sample data set, particularly if you can show us how your equation 
could be used to generate the sample data, would be helpful. 

-- 
Sent from my phone. Please excuse my brevity.

On December 1, 2016 7:22:37 AM PST, lolo koko  wrote:
>Hello?
>
>Does anyone know how I can implement the below equation in R? I would
>like to estimate the following equation:
>
>  y=beta_ij * (1+gamma_j * dummy) * x_ij
>
>where y is continuous, and all the x variables (j of them) are i=3
>level categorical variables. The intuition is that instead of
>estimating the additive value for a dummy variable, I would like to
>estimate the multiplicative value for the dummy variable. Thus the
>presence of the dummy would scale the beta. Note that for each x
>variable there is only one gamma.
>
>For concreteness, you can imagine that y is a continious test score, x
>are categorical variables indicating different types of education
>achievements, each type of education achievement is categorised in 3
>levels (none, some, a lot), and the dummy indicates race. In this
>model I believe that race affects test scores proportionally to
>estimated beta of each education level. This avoids having to estimate
>a gamma for each education achievement level.
>
>Is the solution to simply use nls {stats} and type out the equation?
>
>Hope the explanation makes sense, happy to explain further.
>
>Best wishes,
>
>Peter
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rstan error: C:/Rtools/mingw_64/bin/g++: not found

2016-12-03 Thread Uwe Ligges



On 02.12.2016 12:23, S Ellison wrote:

Apologies for posting a possibly package-specific question, but I'm not sure 
whether this is an R or rstan ussue.

Running rstan under R 3.1.1 in windows 10 I get the well-known error
"Compilation ERROR, function(s)/method(s) not created! C:/Rtools/mingw_64/bin/g++: 
not found"


Are you sure this is R-3.1.1?

I'd expect such a message for R >= 3.3.0 if you do not update some 
files, but for 3.1.1 the correct PATH should be sufficient.


Nevertheless, there is some code in rstan that seems to deal with Rtools 
for some reason, so perhaps better ask its package maintainer.


Best,
Uwe Ligges




The cause on my system is simple; g++ is not on my C:\ drive; it's on D:
My system path correctly points to D:, running


system('g++ -v')


works fine. But the error message shows that either rstan or R is insisting on 
a specific call on C:.

I suspect that something is causing either a package, or R, to use the wrong 
drive. It _may_ be related to the fact that R itself is in its usual place in 
'C:\Program files'.

Any pointers, either to an answer or to a better place to ask, would be welcome.

Steve Ellison

PS: I can see that there is a _fairly_ simple work-round, but I prefer Rtools 
where it is for system management reasons and this is (so far) the only place 
that the path variable is not correctly picked up.



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] data

2016-12-03 Thread jim holtman
This should be reasonably efficient with 'dplyr':

> library(dplyr)
> input <- read.csv(text = "state,city,x
+ 1,12,100
+ 1,12,100
+ 1,12,200
+ 1,13,200
+ 1,13,100
+ 1,13,100
+ 1,14,200
+ 2,21,200
+ 2,21,200
+ 2,21,100
+ 2,23,100
+ 2,23,200
+ 2,34,200
+ 2,34,100
+ 2,35,100")
>
> result <- input %>%
+ group_by(state) %>%
+ summarise(nCities = length(unique(city)),
+ count = n(),
+ `100's` = sum(x == 100),
+ `200's` = sum(x == 200)
+ )
> result
# A tibble: 2 × 5
  state nCities count `100's` `200's`

1 1   3 7   4   3
2 2   4 8   4   4


Or you can also use data.table:

> library(data.table)
> input <- fread("state,city,x
+ 1,12,100
+ 1,12,100
+ 1,12,200
+ 1,13,200
+ 1,13,100
+ 1,13,100
+ 1,14,200
+ 2,21,200
+ 2,21,200
+ 2,21,100
+ 2,23,100
+ 2,23,200
+ 2,34,200
+ 2,34,100
+ 2,35,100")
>
> input[, .(nCities = length(unique(city)),
+   count = .N,
+   `100's` = sum(x == 100),
+   `200's` = sum(x == 200)
+   )
+ , keyby = state
+ ]
   state nCities count 100's 200's
1: 1   3 7 4 3
2: 2   4 8 4 4



Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Sat, Dec 3, 2016 at 10:40 AM, Val  wrote:

> Hi all,
>
> I am trying to read and summarize  a big data frame( >10M records)
>
> Here is the sample of my data
> state,city,x
> 1,12,100
> 1,12,100
> 1,12,200
> 1,13,200
> 1,13,100
> 1,13,100
> 1,14,200
> 2,21,200
> 2,21,200
> 2,21,100
> 2,23,100
> 2,23,200
> 2,34,200
> 2,34,100
> 2,35,100
>
> I want  get  the total count by state, and the  the number of cities
> by state. The x variable is either 100 or 200 and count each
>
> The result should look like as follows.
>
> state,city,count,100's,200's
> 1,3,7,4,3
> 2,4,8,4,4
>
> At the present I am doing it  in several steps and taking too long
>
> Is there an efficient way of doing this?
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] data

2016-12-03 Thread Val
Hi all,

I am trying to read and summarize  a big data frame( >10M records)

Here is the sample of my data
state,city,x
1,12,100
1,12,100
1,12,200
1,13,200
1,13,100
1,13,100
1,14,200
2,21,200
2,21,200
2,21,100
2,23,100
2,23,200
2,34,200
2,34,100
2,35,100

I want  get  the total count by state, and the  the number of cities
by state. The x variable is either 100 or 200 and count each

The result should look like as follows.

state,city,count,100's,200's
1,3,7,4,3
2,4,8,4,4

At the present I am doing it  in several steps and taking too long

Is there an efficient way of doing this?

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [ESS] how to edit openbugs model file?

2016-12-03 Thread Sparapani, Rodney
Hi Paul:

Use BUGS mode which is part of ESS…
(require ‘ess-bugs-d)

Then name your file model.bug

Rodney

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help

Re: [R] Interpreting summary.lm for a 2 factor anova

2016-12-03 Thread Fox, John
Dear Ashim,

Sorry to chime in late, and my apologies if someone has already pointed this 
out, but here's the relationship between the cell means and the model 
coefficients, using the row-basis of the model matrix:

-- snip 

> means <- with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) 
> )
> x.A <- rep(c(0, 1), 3)
> x.B1 <- rep(c(0, 1, 0), each=2)
> x.B2 <- rep(c(0, 0, 1), each=2)
> x.AB1 <- x.A*x.B1
> x.AB2 <- x.A*x.B2
> X.basis <- cbind(1, x.A, x.B1, x.B2, x.AB1, x.AB2)
> X.basis
   x.A x.B1 x.B2 x.AB1 x.AB2
[1,] 1   000 0 0
[2,] 1   100 0 0
[3,] 1   010 0 0
[4,] 1   110 1 0
[5,] 1   001 0 0
[6,] 1   101 0 1
> solve(X.basis, means)
x.A  x.B1  x.B2 x.AB1 x.AB2 
 44.6 -16.3 -20.6 -20.0  21.1  10.6 
> coef(aov(breaks ~ wool * tension, data = warpbreaks))
   (Intercept)  woolB   tensionM   tensionH woolB:tensionM 
  44.6  -16.3  -20.6  -20.0   21.1 
woolB:tensionH 
  10.6

-- snip 

I hope this helps,
 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim Kapoor
> Sent: December 3, 2016 12:19 AM
> To: David Winsemius 
> Cc: r-help@r-project.org
> Subject: Re: [R] Interpreting summary.lm for a 2 factor anova
> 
> Please allow me to rephrase myquery.
> 
> > model.tables(model,"m")
> Tables of means
> Grand mean
> 
> 28.14815
> 
>  wool
> wool
>  A  B
> 31.037 25.259
> 
>  tension
> tension
> L M H
> 36.39 26.39 21.67
> 
>  wool:tension
> tension
> wool L M H
>A 44.56 24.00 24.56
>B 28.22 28.78 18.78
> >
> 
> 
> The above is the same as :
> 
> with( warpbreaks, tapply( breaks, interaction(wool, tension), mean ) )
>  A.L  B.L  A.M  B.M  A.H  B.H
> 44.6 28.2 24.0 28.8 24.6 18.8
> 
> For reference:
> 
> > model <- aov(breaks ~ wool * tension, data = warpbreaks)
> > summary.lm(model)
> 
> Call:
> aov(formula = breaks ~ wool * tension, data = warpbreaks)
> 
> Residuals:
>  Min   1Q   Median   3Q  Max
> -19.5556  -6.8889  -0.6667   7.1944  25.
> 
> Coefficients:
>Estimate Std. Error t value Pr(>|t|)
> (Intercept)  44.556  3.647  12.218 2.43e-16 ***
> woolB   -16.333  5.157  -3.167 0.002677 **
> tensionM-20.556  5.157  -3.986 0.000228 ***
> tensionH-20.000  5.157  -3.878 0.000320 ***
> woolB:tensionM   21.111  7.294   2.895 0.005698 **
> woolB:tensionH   10.556  7.294   1.447 0.154327
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> Residual standard error: 10.94 on 48 degrees of freedom
> Multiple R-squared:  0.3778,Adjusted R-squared:  0.3129
> F-statistic: 5.828 on 5 and 48 DF,  p-value: 0.0002772
> 
> 
> Now I'll explain what is confusing me in the output of summary.lm.
> 
> Coeff of Intercept = 44.556  = cell mean for A.L. This is the base.
> 
> Coeff of woolB:L = -16.333 = 28.2 - 44.556. This is the difference of this
> cell mean(B:L) from the base.
> 
> Coeff of woolA:tensionM = -20.556  = 24.000- 44.556. This is the difference of
> this cell mean (A:M)  from the base.
> 
> Coeff of woolA:tensionH = -20.000  = 24.6 - 44.556. This is the difference
> of this cell mean(A:H) from the base.
> 
> This is where it stops being the difference from the base.
> 
> Coeff of woolB:tensionM = 21.111 should turn out to be 28.8 - 44.556 but
> this is -15.77822
> 
> Coeff of woolB:tensionH = 10.556 should turn out to be  18.8 - 44.556 but
> this is -25.77822
> 
> In the above 2 cases, we can't say that the coefficient = cell mean - base 
> case.
> Can you tell me what should be the statement to be made ?
> 
> 
> Best Regards,
> Ashim
> 
> PS : My apologies for emailing my query to this list. Can you tell me the 
> names
> of a few (active) statistics help list ?
> 
> On Sat, Dec 3, 2016 at 1:33 AM, David Winsemius 
> wrote:
> 
> >
> > > On Dec 2, 2016, at 9:09 AM, David Winsemius 
> > wrote:
> > >
> > >>
> > >> On Dec 2, 2016, at 6:16 AM, Ashim Kapoor 
> wrote:
> > >>
> > >> Dear Pikal,
> > >>
> > >> All levels except the interactions are compared to the Intercept.
> > >> I'm a little confused as to what's going on in interaction terms
> > >> eg. the cell wool B : tension M. It's mean is :
> > >> 28.78 and 28.78 - 44.56 = -15.78 != 21.111.
> > >>
> > >> It's something like 44.56 (intercept) -16.333 (wool B) -.20.556
> > >> (tension
> > >> M)  + 21.111 (woolB:tensionM) = 28.782.
> > 

Re: [ESS] how to edit openbugs model file?

2016-12-03 Thread Martin Maechler
Also,
   M-x R-mode
but that would need to be done again every time...
If you don't want to rename to model.R
Add a line

## -*- R -*-

at the top (or close enough to the very top; this tells emacs to use
R-mode ;  there are several (more flexible, more to write) versions of
such so-called  "file local variables" in emacs.. Emacs manuals and
also Google "Emacs file local variables"  should help.

Martin Maechler,
ETH Zurich



On Fri, Dec 2, 2016 at 8:35 PM, Simon Bonner  wrote:
> Hi Paul,
>
> An easy way to do this is to rename your file as model.R. The syntax for BUGS 
> is almost identical to the syntax for R, and BUGS doesn’t care what the model 
> extension is.
>
> Wish I could say that I thought of that but kudos go to Matthew Schofield.
>
> Cheers,
>
> Simon
>
> Simon Bonner
> Assistant Professor of Environmetrics/Director MMASc
> Department of Statistical and Actuarial Sciences/Department of Biology
> University of Western Ontario
>
> Office: Western Science Centre rm 276
>
> Email: sbonn...@uwo.ca | Telephone: 519-661-2111 x88205 | Fax: 519-661-3813
> Twitter: @bonnerstatslab | Website: http://simon.bonners.ca/bonner-lab/wpblog/
>
>
> On 2016-12-02, 1:21 PM, "ESS-help on behalf of Paul Johnson" 
>  wrote:
>
> I have a student here who wants to edit an OpenBugs file named
> "model.txt" in Emacs. I can't understand how to tell Emacs to use a
> bugs mode so that things like M-; understand what to do.
>
> If you have an idea, let me know.
>
> We don't want to run the bugs code with an emacs session, we just want
> to edit the bugs text model file and then we can run with R2OpenBugs
> or something else.
>
> --
> Paul E. Johnson   http://pj.freefaculty.org
> Director, Center for Research Methods and Data Analysis 
> http://crmda.ku.edu
>
> To write to me directly, please address me at pauljohn at ku.edu.
>
> __
> ESS-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/ess-help
>
>
> __
> ESS-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/ess-help

__
ESS-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/ess-help