Re: [R] ANOVA in R

2018-10-10 Thread Jim Lemon
Hi again,
Two things, I named the data frame SR as shown in the model.
The other is for those who may wish to answer the OP. The mediafire
website is loaded with intrusive ads and perhaps malware.

Jim
On Thu, Oct 11, 2018 at 9:02 AM Jim Lemon  wrote:
>
> Hi Tranh,
> I'm not sure why you are converting your variables to factors, and I
> think the model you want is:
>
> lm(KIC~temperature+AC+AV+Thickness+temperature:AC+
>  temperature:AV+temperature:thickness+AC:AV+
>  AC:thickness+AV:thickness,SR)
>
> Note the colons (:) rather than asterisks (*) for the interactions.
>
> Jim
> On Wed, Oct 10, 2018 at 5:14 PM Thanh Tran  wrote:
> >
> > Hi eveyone,
> > I'm studying about variance (ANOVA) in R and have some questions to share.
> > I read an article investigating the effect of factors (temperature, Asphalt
> > content, Air voids, and sample thickness) on the hardness of asphalt
> > concrete in the tensile test (abbreviated as Kic). Each condition was
> > repeated four times (4 samples). In the paper, the authors used MINITAB to
> > analyze Anova. The authors use "adjusted sums of squares" calculate the
> > p-value I try to use ANOVA in R to analyze this data and get the result as
> > shown in Figure 4. The results are different from the results in the
> > article. Some papers say that in R, the default for ANOVA analysis is to
> > use "sequential sums of squares" to calculate the p-value.
> > So please help the following two questions: 1 / Introduction to code in R
> > for anova analysis uses "adjusted sums of squares". The main part of the
> > command in R / myself is as follows: > Tem = as.factor (temperature) > Ac =
> > as.factor (AC) > Av = as.factor (AV) > Thick = as.factor (Thickness) >
> > Twoway = lm (KIC ~ Tem + Ac + Av + Thick + Stamp + Ac + Stamp + Av + Stamp
> > + Thick + Ac * Av + Ac * Thick + Av * Thick) > anova (twoway) 2/ When to
> > use "sequential sums of squares" and when to use "adjusted sums of
> > squares". Some papers recommend using the "oa.design
> > "
> > function in R to check for "orthogonal" designs. If not, use "adjusted sums
> > of squares". I am still vague about this command, so look forward to
> > everyone's suggestion. If you could answer all two of my questions, I would
> > be most grateful. Ps: I have added a CSV file and the paper for practicing
> > R. http://www.mediafire.com/file/e5oe54p2c2wd4bc/Saha+research.csv
> > http://www.mediafire.com/file/39jlf9h539y9mdz/Homothetic+behaviour+investigation+on+fracture+toughness+of+asphalt+mixtures+using+semicircular+bending+test.pdf
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in R

2018-10-10 Thread Jim Lemon
Hi Tranh,
I'm not sure why you are converting your variables to factors, and I
think the model you want is:

lm(KIC~temperature+AC+AV+Thickness+temperature:AC+
 temperature:AV+temperature:thickness+AC:AV+
 AC:thickness+AV:thickness,SR)

Note the colons (:) rather than asterisks (*) for the interactions.

Jim
On Wed, Oct 10, 2018 at 5:14 PM Thanh Tran  wrote:
>
> Hi eveyone,
> I'm studying about variance (ANOVA) in R and have some questions to share.
> I read an article investigating the effect of factors (temperature, Asphalt
> content, Air voids, and sample thickness) on the hardness of asphalt
> concrete in the tensile test (abbreviated as Kic). Each condition was
> repeated four times (4 samples). In the paper, the authors used MINITAB to
> analyze Anova. The authors use "adjusted sums of squares" calculate the
> p-value I try to use ANOVA in R to analyze this data and get the result as
> shown in Figure 4. The results are different from the results in the
> article. Some papers say that in R, the default for ANOVA analysis is to
> use "sequential sums of squares" to calculate the p-value.
> So please help the following two questions: 1 / Introduction to code in R
> for anova analysis uses "adjusted sums of squares". The main part of the
> command in R / myself is as follows: > Tem = as.factor (temperature) > Ac =
> as.factor (AC) > Av = as.factor (AV) > Thick = as.factor (Thickness) >
> Twoway = lm (KIC ~ Tem + Ac + Av + Thick + Stamp + Ac + Stamp + Av + Stamp
> + Thick + Ac * Av + Ac * Thick + Av * Thick) > anova (twoway) 2/ When to
> use "sequential sums of squares" and when to use "adjusted sums of
> squares". Some papers recommend using the "oa.design
> "
> function in R to check for "orthogonal" designs. If not, use "adjusted sums
> of squares". I am still vague about this command, so look forward to
> everyone's suggestion. If you could answer all two of my questions, I would
> be most grateful. Ps: I have added a CSV file and the paper for practicing
> R. http://www.mediafire.com/file/e5oe54p2c2wd4bc/Saha+research.csv
> http://www.mediafire.com/file/39jlf9h539y9mdz/Homothetic+behaviour+investigation+on+fracture+toughness+of+asphalt+mixtures+using+semicircular+bending+test.pdf
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in R

2018-10-10 Thread Paul Johnson
We cannot read your message. Should post pure text, not html. Hm, my phone
now may post html, must try to stop. Your R code not legible. It seems to
be output? Lines all run together. I tried find articles you mention, but
"not found" resulted.

You should use aov() for fitting, then get post hoc comparisons.

In car package, Anova function will help. I may teach Anova soon, we'll see
if I have better answer then.

Paul Johnson
University of Kansas

On Wed, Oct 10, 2018, 1:14 AM Thanh Tran  wrote:

> Hi eveyone,
> I'm studying about variance (ANOVA) in R and have some questions to share.
> I read an article investigating the effect of factors (temperature, Asphalt
> content, Air voids, and sample thickness) on the hardness of asphalt
> concrete in the tensile test (abbreviated as Kic). Each condition was
> repeated four times (4 samples). In the paper, the authors used MINITAB to
> analyze Anova. The authors use "adjusted sums of squares" calculate the
> p-value I try to use ANOVA in R to analyze this data and get the result as
> shown in Figure 4. The results are different from the results in the
> article. Some papers say that in R, the default for ANOVA analysis is to
> use "sequential sums of squares" to calculate the p-value.
> So please help the following two questions: 1 / Introduction to code in R
> for anova analysis uses "adjusted sums of squares". The main part of the
> command in R / myself is as follows: > Tem = as.factor (temperature) > Ac =
> as.factor (AC) > Av = as.factor (AV) > Thick = as.factor (Thickness) >
> Twoway = lm (KIC ~ Tem + Ac + Av + Thick + Stamp + Ac + Stamp + Av + Stamp
> + Thick + Ac * Av + Ac * Thick + Av * Thick) > anova (twoway) 2/ When to
> use "sequential sums of squares" and when to use "adjusted sums of
> squares". Some papers recommend using the "oa.design
> <
> https://www.youtube.com/redirect?q=http%3A%2F%2Foa.design%2F_token=AaSAPDY-5UAsoHxN6BdwfyIJ7R98MTUzOTIxNDg2OUAxNTM5MTI4NDY5=comments
> >"
> function in R to check for "orthogonal" designs. If not, use "adjusted sums
> of squares". I am still vague about this command, so look forward to
> everyone's suggestion. If you could answer all two of my questions, I would
> be most grateful. Ps: I have added a CSV file and the paper for practicing
> R. http://www.mediafire.com/file/e5oe54p2c2wd4bc/Saha+research.csv
>
> http://www.mediafire.com/file/39jlf9h539y9mdz/Homothetic+behaviour+investigation+on+fracture+toughness+of+asphalt+mixtures+using+semicircular+bending+test.pdf
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA in R

2018-10-10 Thread Thanh Tran
Hi eveyone,
I'm studying about variance (ANOVA) in R and have some questions to share.
I read an article investigating the effect of factors (temperature, Asphalt
content, Air voids, and sample thickness) on the hardness of asphalt
concrete in the tensile test (abbreviated as Kic). Each condition was
repeated four times (4 samples). In the paper, the authors used MINITAB to
analyze Anova. The authors use "adjusted sums of squares" calculate the
p-value I try to use ANOVA in R to analyze this data and get the result as
shown in Figure 4. The results are different from the results in the
article. Some papers say that in R, the default for ANOVA analysis is to
use "sequential sums of squares" to calculate the p-value.
So please help the following two questions: 1 / Introduction to code in R
for anova analysis uses "adjusted sums of squares". The main part of the
command in R / myself is as follows: > Tem = as.factor (temperature) > Ac =
as.factor (AC) > Av = as.factor (AV) > Thick = as.factor (Thickness) >
Twoway = lm (KIC ~ Tem + Ac + Av + Thick + Stamp + Ac + Stamp + Av + Stamp
+ Thick + Ac * Av + Ac * Thick + Av * Thick) > anova (twoway) 2/ When to
use "sequential sums of squares" and when to use "adjusted sums of
squares". Some papers recommend using the "oa.design
"
function in R to check for "orthogonal" designs. If not, use "adjusted sums
of squares". I am still vague about this command, so look forward to
everyone's suggestion. If you could answer all two of my questions, I would
be most grateful. Ps: I have added a CSV file and the paper for practicing
R. http://www.mediafire.com/file/e5oe54p2c2wd4bc/Saha+research.csv
http://www.mediafire.com/file/39jlf9h539y9mdz/Homothetic+behaviour+investigation+on+fracture+toughness+of+asphalt+mixtures+using+semicircular+bending+test.pdf

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread Juan Telleria Ruiz de Aguirre
Thank you all for your **very good** answers:

Using aovp(..., perm="Exact") seems to be the way to go for small datasets,
and also I should definitely try ?kruskal.test.


Juan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread S Ellison
> This package uses a modified version of aov() function, which uses
> Permutation Tests 
> 
> I obtain different p-values for each run!

Could that be because you are defaulting to perm="Prob"?

I am not familiar with the package, but the manual is informative.
You may have missed something when reading it.

" ...The Exact method will be used by default when the number of observations 
is less than or equal to
maxExact, otherwise Prob will be used.
Prob:  Iterations terminate when the estimated standard error of the estimated 
proportion p is less
than p*Ca"

I would assume that probabilistic permutation is random and will change from 
run to run.
You could use set.seed() to stop that, but it's actually quite useful to see 
how much the results change.
If you want complete permutation, you'd need to force Exact (unless that does 
not mean what it sounds like for this package).
It looks like that requires you to set maxExact to at least your number of 
observations. But given that permutation  grows combinatorially,  that could 
take a _long_ time for a run; the Example in the help page does not complete in 
a useful time when maxExact is set to exceed the number of data points.

So I'd probably run it using Prob and simply note the range of results for a 
handful of runs to give you an indication of how far to trust the answers.

> Would it still be possible use the regular aov() by generating permutations
> in advance (Obtaining therefore a Normal Distribution thanks to the Central
> Limit Theorem)? And applying the aov() function afterwards? Does it have
> sense?
As a chemist, I'd guess No. And you'd be even more limited in number of 
permutations.

> Or maybe this issue could be due to unbalanced classes? I also tried to
> weight observations based on proportions, but the function failed.
No, it's nothing to do with balance, if the results change run to run with no 
change in the model. I'd guess that may exacerbate the permutaiton variability 
somewhat but it won't _cause_ it.

> Any alternative solution for performing a One-Way ANOVA Test over
> Non-Normal Data?
Yes; the traditional nonparametric test for one-way data (balanced) is the 
kruskal-wallis test - see ?kruskal.test.
Classical ANOVA on ranks can also be defended as a general 'nonparametric' 
approach, though I gather it can also be criticised. 



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread Meyners, Michael
Juan, 

Your question might be borderline for this list, as it ultimately rather seems 
a stats question coming in R disguise.

Anyway, the short answer is that you *expect* to get a different p value from a 
permutation test unless you are able to do all possible permutation and 
therefore use the so-called systematic reference set. That is rarely the case, 
and only for relatively small problems. 
The permutation test uses a random subset of all possible permutations. Given 
this randomness, you'll get a different p value. In order to get reproducible 
results, you may specify a seed (?set.seed), yet that is only reproducible with 
this environment. Someone else with a different software and/or code might come 
out with a different p. The higher the number of permutations used, the smaller 
the variation around the p values, however. For most applications, 1000 seem 
good enough to me, but sometimes I go higher (in particular if the p value is 
borderline and I really need a strict above/below alpha decision).

The permutations do not create an implicit normal distribution, but rather a 
null distribution that can (likely is depending on non-normality of your data) 
not normal. So your respective proposal does not appeal.

I don't think you need an alternative - the permutation test is just fine, and 
recognizing the randomness in the execution does not render the (relatively 
small) variability in p values a major issue.

You may want to have a look at the text book by Edgington & Onghena for details 
on permutation tests, and there are plenty of papers out there addressing them 
in various contexts, which will help to understand *why* you observe what you 
observe here. 

HTH, Michael



> -Original Message-
> From: R-help  On Behalf Of Juan Telleria Ruiz
> de Aguirre
> Sent: Montag, 3. September 2018 17:18
> To: R help Mailing list 
> Subject: [R] ANOVA Permutation Test
> 
> Dear R users,
> 
> I have the following Question related to Package lmPerm:
> 
> This package uses a modified version of aov() function, which uses
> Permutation Tests instead of Normal Theory Tests for fitting an Analysis of
> Variance (ANOVA) Model.
> 
> However, when I run the following code for a simple linear model:
> 
> library(lmPerm)
> 
> e$t_Downtime_per_Intervention_Successful %>%
>   aovp(
> formula = `Downtime per Intervention[h]` ~ `Working Hours`,
> data = .
>   ) %>%
>   summary()
> 
> I obtain different p-values for each run!
> 
> With a regular ANOVA Test, I obtain instead a constant F-statistic, but I do 
> not
> fulfill the required Normality Assumptions.
> 
> So my questions are:
> 
> Would it still be possible use the regular aov() by generating permutations in
> advance (Obtaining therefore a Normal Distribution thanks to the Central
> Limit Theorem)? And applying the aov() function afterwards? Does it have
> sense?
> 
> 
> Or maybe this issue could be due to unbalanced classes? I also tried to weight
> observations based on proportions, but the function failed.
> 
> 
> Any alternative solution for performing a One-Way ANOVA Test over Non-
> Normal Data?
> 
> 
> Thank you.
> 
> Juan
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread Michael Dewey

Dear Juan

I do not use the package but if it does permutation tests it presumably 
uses random numbers and since you are not setting the seed you would get 
different values for each run.


Michael

On 03/09/2018 16:17, Juan Telleria Ruiz de Aguirre wrote:

Dear R users,

I have the following Question related to Package lmPerm:

This package uses a modified version of aov() function, which uses
Permutation Tests instead of Normal Theory Tests for fitting an Analysis of
Variance (ANOVA) Model.

However, when I run the following code for a simple linear model:

library(lmPerm)

e$t_Downtime_per_Intervention_Successful %>%
   aovp(
 formula = `Downtime per Intervention[h]` ~ `Working Hours`,
 data = .
   ) %>%
   summary()

I obtain different p-values for each run!

With a regular ANOVA Test, I obtain instead a constant F-statistic, but I
do not fulfill the required Normality Assumptions.

So my questions are:

Would it still be possible use the regular aov() by generating permutations
in advance (Obtaining therefore a Normal Distribution thanks to the Central
Limit Theorem)? And applying the aov() function afterwards? Does it have
sense?


Or maybe this issue could be due to unbalanced classes? I also tried to
weight observations based on proportions, but the function failed.


Any alternative solution for performing a One-Way ANOVA Test over
Non-Normal Data?


Thank you.

Juan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Michael
http://www.dewey.myzen.co.uk/home.html

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA Permutation Test

2018-09-03 Thread Juan Telleria Ruiz de Aguirre
Dear R users,

I have the following Question related to Package lmPerm:

This package uses a modified version of aov() function, which uses
Permutation Tests instead of Normal Theory Tests for fitting an Analysis of
Variance (ANOVA) Model.

However, when I run the following code for a simple linear model:

library(lmPerm)

e$t_Downtime_per_Intervention_Successful %>%
  aovp(
formula = `Downtime per Intervention[h]` ~ `Working Hours`,
data = .
  ) %>%
  summary()

I obtain different p-values for each run!

With a regular ANOVA Test, I obtain instead a constant F-statistic, but I
do not fulfill the required Normality Assumptions.

So my questions are:

Would it still be possible use the regular aov() by generating permutations
in advance (Obtaining therefore a Normal Distribution thanks to the Central
Limit Theorem)? And applying the aov() function afterwards? Does it have
sense?


Or maybe this issue could be due to unbalanced classes? I also tried to
weight observations based on proportions, but the function failed.


Any alternative solution for performing a One-Way ANOVA Test over
Non-Normal Data?


Thank you.

Juan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test to decide whether to use multiple linear regression or linear mixed effects model

2017-08-15 Thread S Ellison
> I am trying to decide between using a multiple linear regression or a linear
> mixed effects model for my data:
> ... 
> but I keep getting the following error code:
> 
> Error in anova.lmlist (object, ...):
> 
> models were not all fitted to the same size of dataset

anova is defaulting to anova.lm, and that doesn't expect a mixed effects model.

Switch them round to put model2 first:

anova (model2, model1)


S Ellison



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA test to decide whether to use multiple linear regression or linear mixed effects model

2017-08-15 Thread Lucy McMahon
R-help:


I am trying to decide between using a multiple linear regression or a linear 
mixed effects model for my data:


model1 <- lm (responsevariable ~ predictor1 + predictor2 + predictor3 + 
predictor4, data= data)

model2 <- lme (responsevariable ~ predictor1 + predictor2 + predictor3 + 
predictor4, random = ~1 | site, data= data)


anova (model1, model2)


but I keep getting the following error code:

Error in anova.lmlist (object, ...):

models were not all fitted to the same size of dataset


Any advice would be appreciated, thanks.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA for one subject

2017-03-28 Thread Uri Eduardo Ramírez Pasos
Dear everyone,

I have a 2x3  design (medication x stimulus type) but very few subjects (6)
and I would like to perform intra-subject stats using r's aov (I've already
run the group analysis). Is there a conceptual problem with this, as long
as I don't interpret the results as representative of the population with
the disease I'm studying? Could anyone point me to any discussion of this
practice, as to its merits for example?

Many thanks,
U. Pasos

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova() Chi-square test

2017-01-20 Thread Sergio Ferreira Cardoso
Dear all,

Anova() for .car package retrieves Chi-square statistics when I'm testing a
model the significance of a multivariate .gls model
gls(x~1+2+3+x,corBrownian(phy=tree), ...).
Is this Chi-square a two-sided test?

Thank you.

Best,
Sérgio.

-- 
Com os melhores cumprimentos,
Sérgio Ferreira Cardoso.



Best regards,
Sérgio Ferreira Cardoso




MSc. Paleontology
Museu da Lourinhã, GEAL.
LATR/IST/CTN - Campus Tecnológico e Nuclear.

Lisboa, Portugal

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Anova() type iii SS plots and diagnostics

2016-07-19 Thread Fox, John
Dear Pamela,

I'm afraid that this question seems confused. Type III sums of squares are 
computed for a linear model -- it's the *linear model* that you'd check, in the 
usual manner, for outliers, etc., and this has nothing to do with how the ANOVA 
table for the model is computed.

Best,
 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Pamela
> Wong
> Sent: July 18, 2016 10:39 PM
> To: r-help@r-project.org
> Subject: [R] Anova() type iii SS plots and diagnostics
> 
> I am wondering if there is a way to plot results and model diagnostics (to 
> check
> for outliers, homoscedasticity, normality, collinearity) using type III sums 
> of
> squares in R __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova() type iii SS plots and diagnostics

2016-07-19 Thread Pamela Wong
I am wondering if there is a way to plot results and model diagnostics (to 
check for outliers, homoscedasticity, normality, collinearity) using type III 
sums of squares in R
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA for multiple repeated measurements

2015-06-23 Thread Thomas Evangelidis
​Greetings,

My dataset consist of the following columns:

Specimen C_flex   C_rigid   tau_flextau_rigidR_flex
R_rigid1 0.1782   0.29750.3290  0.3223   0.4338
0.51002 0.0527   0.10970.1780  0.1038   0.2364
0.1086.

where C, tau and R are statistical metrics that asses the performance of a
method in two modes, flex and rigid. Essentially my data are paired,
they consist of 18 specimens which were analysed twice by my method, once
in flex mode and once in rigid mode, and I quantified the performance
of each specimen in each mode using 3 statistical metrics (C, tau, R). What
I want is to assess weather the overall performance of the method in the
two modes (flex, rigid) is significantly different. Or in other words,
whether the difference (C_flex, tau_flex, R_flex) vs (C_rigid, tau_rigid,
R_rigid) is statistically significant. Could someone give a hint of what
type of ANOVA I must use in R or point me to a relevant post?

PS: the most relevant R tutorial I found was this:
http://rtutorialseries.blogspot.cz/2011/02/r-tutorial-series-two-way-repeated.html
The author's dataset is like this:

subject schoolAge10 schoolAge15 schoolAge20 workAge10 workAge15
workAge2011   5   5   3 1
   3522   5   5   3 1
   35

But I am not sure whether he assess the significance of the difference
(schoolAge10, schoolAge15, schoolAge20) vs (workAge10 workAge15,
workAge20), or schoolAge10 vs workAge10  schoolAge15 vs workAge15 
schoolAge20 vs workAge20.
​
thanks in advance,
Thomas

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] ANOVA for nested design

2015-05-15 Thread Jinsong Zhao

Hi there,

The following is a simple design. A and B are factors with their levels 
randomly selected. In other words, A and B are random.


The data is recorded in abc, as:
 dput(abc)
structure(list(water = c(12.1, 12.1, 12.8, 12.8, 14.4, 14.4,
14.7, 14.5, 23.1, 23.4, 28.1, 28.8), A = structure(c(1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c(1, 2,
3), class = factor), B = structure(c(1L, 1L, 2L, 2L, 3L,
3L, 4L, 4L, 5L, 5L, 6L, 6L), .Label = c(1, 2, 3, 4, 5,
6), class = factor)), .Names = c(water, A, B), row.names = c(NA,
-12L), class = data.frame)

I run ems from afex package:

 ems(n ~ A*B, nested=A/B, random=AB)
  VarianceComponent
Effect e B A
 A 1 n nb
 B 1 n
 e 1
attr(,terms)
[1] n a b

So, the ANOVA for this design is something like:

 summary(aov(water ~ A + Error(B), abc))

Error: B
  Df Sum Sq Mean Sq F value Pr(F)
A  2  416.8  208.39   22.68 0.0155 *
Residuals  3   27.69.19
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Error: Within
  Df Sum Sq Mean Sq F value Pr(F)
Residuals  6   0.31 0.05167
 summary(aov(water ~ B + Error(A), abc))

Error: A
  Df Sum Sq Mean Sq
B  2  416.8   208.4

Error: Within
  Df Sum Sq Mean Sq F value   Pr(F)
B  3  27.57   9.190   177.9 2.99e-06 ***
Residuals  6   0.31   0.052
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1


Is it possible to write the above two aov(...) into one aov(...)?

Any suggestions will be really appreciated. Thanks in advance.

Best regards,
Jinsong

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] ANOVA test

2014-12-27 Thread peter dalgaard

 On 27 Dec 2014, at 02:50 , Richard M. Heiberger r...@temple.edu wrote:
 
 Kristi,
 
 The easiest way to do what you are looking for is the aovSufficient function
 in the HH package.
 
 ## install.packages(HH) ## if you don't have it yet
 library(HH)
 B.aov - aovSufficient(mean ~ site, data=B, sd=B$SE, weights=c(3,3,3,3))
 summary(B.aov)
 

Ummm, you want SD, not SE, in there, don't you?

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA test

2014-12-26 Thread Kristi Glover
Hi R user,
I am wondering whether I can perform a simple ANOVA analysis in the data in 
which I  have mean + SE (+- Standard Error) for several groups. 
For this one,  I calculated upper and lower confidence interval and made three 
classes for each group (mean, upper and lower values). After that, I did ANOVA 
(simple Anova). I am wondering whether this is a wrong approach?  I have given 
an example


library(reshape)
B-structure(list(mean = c(0.0241262, 0.0433538, 0.2204764, 0.7830054
), SE = c(0.0209097, 0.0329281, 0.1003248, 0.3019256), site = structure(1:4, 
.Label = c(A, 
B, C, D), class = factor)), .Names = c(mean, SE, 
site), class = data.frame, row.names = c(NA, -4L))
attach(B)
B1-data.frame(B, Upper=mean+1.96*SE, Lower=mean-1.96*SE)
B2-subset(B1, select=c(-2))
B2
B3-melt(B2, id=c(site))
B3
Anova-aov(B3$value~B3$site)
summary(Anova)

Thanks 

  
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test

2014-12-26 Thread Bert Gunter
This is a statistical question primarily and, as such, is off topic
here. Either consult a local statistical expert or post to a
statistical site like stats.stackexchange.com  .

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Fri, Dec 26, 2014 at 6:38 AM, Kristi Glover
kristi.glo...@hotmail.com wrote:
 Hi R user,
 I am wondering whether I can perform a simple ANOVA analysis in the data in 
 which I  have mean + SE (+- Standard Error) for several groups.
 For this one,  I calculated upper and lower confidence interval and made 
 three classes for each group (mean, upper and lower values). After that, I 
 did ANOVA (simple Anova). I am wondering whether this is a wrong approach?  I 
 have given an example


 library(reshape)
 B-structure(list(mean = c(0.0241262, 0.0433538, 0.2204764, 0.7830054
 ), SE = c(0.0209097, 0.0329281, 0.1003248, 0.3019256), site = structure(1:4, 
 .Label = c(A,
 B, C, D), class = factor)), .Names = c(mean, SE,
 site), class = data.frame, row.names = c(NA, -4L))
 attach(B)
 B1-data.frame(B, Upper=mean+1.96*SE, Lower=mean-1.96*SE)
 B2-subset(B1, select=c(-2))
 B2
 B3-melt(B2, id=c(site))
 B3
 Anova-aov(B3$value~B3$site)
 summary(Anova)

 Thanks


 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test

2014-12-26 Thread peter dalgaard
You could at least say that no, that is patently wrong! There are ways to 
reconstruct an ANOVA from means, sd, and group sizes, but this isn't it. In 
fact, the group sizes are not even used in the code.

I agree about the need for a statistical expert. Fundamental misunderstandings 
seem to be present. 

-pd

 On 26 Dec 2014, at 16:23 , Bert Gunter gunter.ber...@gene.com wrote:
 
 This is a statistical question primarily and, as such, is off topic
 here. Either consult a local statistical expert or post to a
 statistical site like stats.stackexchange.com  .
 
 Cheers,
 Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 Clifford Stoll
 
 
 
 
 On Fri, Dec 26, 2014 at 6:38 AM, Kristi Glover
 kristi.glo...@hotmail.com wrote:
 Hi R user,
 I am wondering whether I can perform a simple ANOVA analysis in the data in 
 which I  have mean + SE (+- Standard Error) for several groups.
 For this one,  I calculated upper and lower confidence interval and made 
 three classes for each group (mean, upper and lower values). After that, I 
 did ANOVA (simple Anova). I am wondering whether this is a wrong approach?  
 I have given an example
 
 
 library(reshape)
 B-structure(list(mean = c(0.0241262, 0.0433538, 0.2204764, 0.7830054
 ), SE = c(0.0209097, 0.0329281, 0.1003248, 0.3019256), site = structure(1:4, 
 .Label = c(A,
 B, C, D), class = factor)), .Names = c(mean, SE,
 site), class = data.frame, row.names = c(NA, -4L))
 attach(B)
 B1-data.frame(B, Upper=mean+1.96*SE, Lower=mean-1.96*SE)
 B2-subset(B1, select=c(-2))
 B2
 B3-melt(B2, id=c(site))
 B3
 Anova-aov(B3$value~B3$site)
 summary(Anova)
 
 Thanks
 
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test

2014-12-26 Thread Richard M. Heiberger
Kristi,

The easiest way to do what you are looking for is the aovSufficient function
in the HH package.

## install.packages(HH) ## if you don't have it yet
library(HH)
B.aov - aovSufficient(mean ~ site, data=B, sd=B$SE, weights=c(3,3,3,3))
summary(B.aov)

You must have the sample size for each of the groups.  Your example did
not include sample sizes, so there is no justification for the degrees
of freedom in the residual.

I invented sizes of 3 observations per group.

See the pulmonary example in ?aovSufficient for a complete example.

There are several style issues that need to be commented on.
It is usually a bad idea to attach a data.frame.  In this example, you attached
B and then never used it, and also never detached it.  That has the potential
to mask objects farther down the search() list.

When you use aov(), or any other function that takes a data= argument,
it is best to use the data= argument.

Rich

On Fri, Dec 26, 2014 at 9:38 AM, Kristi Glover
kristi.glo...@hotmail.com wrote:
 Hi R user,
 I am wondering whether I can perform a simple ANOVA analysis in the data in 
 which I  have mean + SE (+- Standard Error) for several groups.
 For this one,  I calculated upper and lower confidence interval and made 
 three classes for each group (mean, upper and lower values). After that, I 
 did ANOVA (simple Anova). I am wondering whether this is a wrong approach?  I 
 have given an example


 library(reshape)
 B-structure(list(mean = c(0.0241262, 0.0433538, 0.2204764, 0.7830054
 ), SE = c(0.0209097, 0.0329281, 0.1003248, 0.3019256), site = structure(1:4, 
 .Label = c(A,
 B, C, D), class = factor)), .Names = c(mean, SE,
 site), class = data.frame, row.names = c(NA, -4L))
 attach(B)
 B1-data.frame(B, Upper=mean+1.96*SE, Lower=mean-1.96*SE)
 B2-subset(B1, select=c(-2))
 B2
 B3-melt(B2, id=c(site))
 B3
 Anova-aov(B3$value~B3$site)
 summary(Anova)

 Thanks


 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA and permutation tests : beware of traps

2014-09-23 Thread Stéphane Adamowicz
Recently, I came across a strange and potentially troublesome behaviour of the 
lm and aov functions that ask questions about calculation accuracy. Let us 
consider the 2 following datasets dat1  dat2 :

 (dat1 - data.frame(Y=c(1:3, 10+1:3), F=c(rep(A,3), rep(B,3
   Y F
1  1 A
2  2 A
3  3 A
4 11 B
5 12 B
6 13 B
 (dat2 - data.frame(Y=c(10+1:3, 1:3), F=c(rep(A,3), rep(B,3
   Y F
1 11 A
2 12 A
3 13 A
4  1 B
5  2 B
6  3 B

They only differ in the order of values that were exchanged between samples A 
and B. Thus the sd is 1 for each sample in either data sets, and the absolute 
mean difference |A-B| is 10 in both datasets.
Now, let us perform an anova to compare samples A and B in both datasets (of 
course, in such simple case, a bilateral T test would do the job, but an anova 
is nevertheless allowed and should give the same probability than Student's 
test):

 (anova1 - anova(lm(Y~F, dat1)))
Analysis of Variance Table

Response: Y
  Df Sum Sq Mean Sq F valuePr(F)
F  1150 150 150 0.0002552 ***
Residuals  4  4   1  

 (anova2 - anova(lm(Y~F, dat2)))
Analysis of Variance Table

Response: Y
  Df Sum Sq Mean Sq F valuePr(F)
F  1150 150 150 0.0002552 ***
Residuals  4  4   1

As expected, both datasets give a same anova table, but this is only apparent. 
Indeed :

 anova1$F[1] == anova2$F[1]
[1] FALSE
 anova1$F[1] - anova2$F[1]
[1] 5.684342e-14

In fact the F values differ slightly, and this holds also for the aov function. 
I checked also (not shown) that both the residual and factorial sums of squares 
differ between dat1 and dat2. Thus, for some undocumented reason (at least for 
the end user), the F values depend on the order of data!
While such tiny differences (e-14 in this example) are devoid of consequences 
on the risk evaluation by Fisher's distribution, they may have huge 
consequences on the risk evaluation by the permutation method. Indeed, the 
shift from continuous to discrete distributions is far from being insignificant.

For instance, the following code in R is at the basis of many permutation 
algorithms found in the internet and in teaching because it seems quite 
straightforward (see for example 
http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Permutation%20Anova/PermTestsAnova.html
http://www.usna.edu/Users/math/jager/courses/sm439/labs/Lab7.pdf
http://biostatistics1014.blogspot.fr/2013/04/one-way-anova-permutation-test-and.html
http://adn.biol.umontreal.ca/~numericalecology/Rcode/):

 nperm - 1000 # number of permutations
 Fperm - replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) # calculates 
 nperm F values
 (prob - (sum(anova1$F[1] = Fperm) + 1)/(nperm +1))  # risk calculation  
[1] 0.04695305

Of course, because of the sample function, repeating this code gives different 
prob values, but they remain always close to 5% instead of the exact 
probability of 10%. Indeed, there are only choose(6,3) = 20 possible 
permutations, but because they are symmetric, they give only 10 absolute mean 
differences. Thus, the only exact probabilities are 10%, 20% ... 100%. In our 
example where samples do not overlap, 10% is obviously the right answer.

Thus, the use of lm and aov functions in permutation methods does not seem a 
good idea as it results in biases that underestimate the exact risk. In the 
simple case of one-way anova, it is quite simple to remedy this problem. As the 
total Sums of squares and the degrees of freedom do not change with 
permutations, it is easier and much faster to compare the residual sums of 
squares. For instance, the exact probabilities can be calculated that way :

 combi - combn(6, 3, FUN=function(i) {append(dat1$Y[i], dat1$Y[-i])}) # all 
 permutations
 SCEResid - apply(combi, 2, FUN=function(x) sum(tapply(x, dat1$F, function(x) 
 sum((x - mean(x))^2 # all resi SS
 (prob - mean(SCEResid = SCEResid[1])) # risk calculation
[1] 0.1

10% is indeed the exact risk.

Finally, my problem is : How can we know if R libraries that use randomization 
procedures are not biased ? In the basic case of one way anova, it seems easy 
to submit the above example (by the way, the defunct lmPerm library does not 
succeed ...), but how can we check more complex anova models ?




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA and permutation tests : beware of traps

2014-09-23 Thread Joshua Wiley
Hi Stephane,

This is the well known result of limitted floating point precision (e.g.,
http://www.validlab.com/goldberg/addendum.html).  Using a test of
approximate rather than exact equality shows R yields the correct answer:

nperm - 1
Fperm - replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) #
calculates nperm F values
(prob - (sum(anova1$F[1] = (Fperm + .Machine$double.eps ^ 0.5)) +
1)/(nperm +1))  # risk calculation

yields

0.10009

I picked .Machine$double.eps ^ 0.5 somewhat arbitrarily as the default for
equality testing from ?all.equal

Cheers,

Josh



On Tue, Sep 23, 2014 at 5:14 PM, Stéphane Adamowicz 
stephane.adamow...@avignon.inra.fr wrote:

 Recently, I came across a strange and potentially troublesome behaviour of
 the lm and aov functions that ask questions about calculation accuracy. Let
 us consider the 2 following datasets dat1  dat2 :

  (dat1 - data.frame(Y=c(1:3, 10+1:3), F=c(rep(A,3), rep(B,3
Y F
 1  1 A
 2  2 A
 3  3 A
 4 11 B
 5 12 B
 6 13 B
  (dat2 - data.frame(Y=c(10+1:3, 1:3), F=c(rep(A,3), rep(B,3
Y F
 1 11 A
 2 12 A
 3 13 A
 4  1 B
 5  2 B
 6  3 B

 They only differ in the order of values that were exchanged between
 samples A and B. Thus the sd is 1 for each sample in either data sets, and
 the absolute mean difference |A-B| is 10 in both datasets.
 Now, let us perform an anova to compare samples A and B in both datasets
 (of course, in such simple case, a bilateral T test would do the job, but
 an anova is nevertheless allowed and should give the same probability than
 Student's test):

  (anova1 - anova(lm(Y~F, dat1)))
 Analysis of Variance Table

 Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
 F  1150 150 150 0.0002552 ***
 Residuals  4  4   1

  (anova2 - anova(lm(Y~F, dat2)))
 Analysis of Variance Table

 Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
 F  1150 150 150 0.0002552 ***
 Residuals  4  4   1

 As expected, both datasets give a same anova table, but this is only
 apparent. Indeed :

  anova1$F[1] == anova2$F[1]
 [1] FALSE
  anova1$F[1] - anova2$F[1]
 [1] 5.684342e-14

 In fact the F values differ slightly, and this holds also for the aov
 function. I checked also (not shown) that both the residual and factorial
 sums of squares differ between dat1 and dat2. Thus, for some undocumented
 reason (at least for the end user), the F values depend on the order of
 data!
 While such tiny differences (e-14 in this example) are devoid of
 consequences on the risk evaluation by Fisher's distribution, they may have
 huge consequences on the risk evaluation by the permutation method. Indeed,
 the shift from continuous to discrete distributions is far from being
 insignificant.

 For instance, the following code in R is at the basis of many permutation
 algorithms found in the internet and in teaching because it seems quite
 straightforward (see for example
 http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Permutation%20Anova/PermTestsAnova.html
 http://www.usna.edu/Users/math/jager/courses/sm439/labs/Lab7.pdf

 http://biostatistics1014.blogspot.fr/2013/04/one-way-anova-permutation-test-and.html
 http://adn.biol.umontreal.ca/~numericalecology/Rcode/):

  nperm - 1000 # number of permutations
  Fperm - replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) #
 calculates nperm F values
  (prob - (sum(anova1$F[1] = Fperm) + 1)/(nperm +1))  # risk calculation
 [1] 0.04695305

 Of course, because of the sample function, repeating this code gives
 different prob values, but they remain always close to 5% instead of the
 exact probability of 10%. Indeed, there are only choose(6,3) = 20 possible
 permutations, but because they are symmetric, they give only 10 absolute
 mean differences. Thus, the only exact probabilities are 10%, 20% ... 100%.
 In our example where samples do not overlap, 10% is obviously the right
 answer.

 Thus, the use of lm and aov functions in permutation methods does not seem
 a good idea as it results in biases that underestimate the exact risk. In
 the simple case of one-way anova, it is quite simple to remedy this
 problem. As the total Sums of squares and the degrees of freedom do not
 change with permutations, it is easier and much faster to compare the
 residual sums of squares. For instance, the exact probabilities can be
 calculated that way :

  combi - combn(6, 3, FUN=function(i) {append(dat1$Y[i], dat1$Y[-i])}) #
 all permutations
  SCEResid - apply(combi, 2, FUN=function(x) sum(tapply(x, dat1$F,
 function(x) sum((x - mean(x))^2 # all resi SS
  (prob - mean(SCEResid = SCEResid[1])) # risk calculation
 [1] 0.1

 10% is indeed the exact risk.

 Finally, my problem is : How can we know if R libraries that use
 randomization procedures are not biased ? In the basic case of one way
 anova, it seems easy to submit the above example (by the way, the defunct
 lmPerm library does not succeed ...), but how can we check more complex
 anova models 

Re: [R] ANOVA and permutation tests : beware of traps

2014-09-23 Thread Prof Brian Ripley
Beware of the trap of listening to people with no knowledge of basic 
numerical methods!


It really is basic that the results of floating-point computer 
calculations depends on the order in which they are done (and the 
compiler can change the order).  Using == on such calculations is warned 
about on its help page.


The same warning applies to using = if equality is important.

It bears repeating that the problems are exacerbated on platforms which 
use extended-precision registers for some but not all of their 
calculations (and most R platforms do).  The classic reference is pp. 
248ff of http://www.validlab.com/goldberg/paper.pdf (part of a Sun manual).



On 23/09/2014 08:14, Stéphane Adamowicz wrote:

Recently, I came across a strange and potentially troublesome behaviour of the lm 
and aov functions that ask questions about calculation accuracy. Let us consider 
the 2 following datasets dat1  dat2 :


(dat1 - data.frame(Y=c(1:3, 10+1:3), F=c(rep(A,3), rep(B,3

Y F
1  1 A
2  2 A
3  3 A
4 11 B
5 12 B
6 13 B

(dat2 - data.frame(Y=c(10+1:3, 1:3), F=c(rep(A,3), rep(B,3

Y F
1 11 A
2 12 A
3 13 A
4  1 B
5  2 B
6  3 B

They only differ in the order of values that were exchanged between samples A 
and B. Thus the sd is 1 for each sample in either data sets, and the absolute 
mean difference |A-B| is 10 in both datasets.
Now, let us perform an anova to compare samples A and B in both datasets (of 
course, in such simple case, a bilateral T test would do the job, but an anova 
is nevertheless allowed and should give the same probability than Student's 
test):


(anova1 - anova(lm(Y~F, dat1)))

Analysis of Variance Table

Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
F  1150 150 150 0.0002552 ***
Residuals  4  4   1


(anova2 - anova(lm(Y~F, dat2)))

Analysis of Variance Table

Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
F  1150 150 150 0.0002552 ***
Residuals  4  4   1

As expected, both datasets give a same anova table, but this is only apparent. 
Indeed :


anova1$F[1] == anova2$F[1]

[1] FALSE

anova1$F[1] - anova2$F[1]

[1] 5.684342e-14

In fact the F values differ slightly, and this holds also for the aov function. 
I checked also (not shown) that both the residual and factorial sums of squares 
differ between dat1 and dat2. Thus, for some undocumented reason (at least for 
the end user), the F values depend on the order of data!
While such tiny differences (e-14 in this example) are devoid of consequences 
on the risk evaluation by Fisher's distribution, they may have huge 
consequences on the risk evaluation by the permutation method. Indeed, the 
shift from continuous to discrete distributions is far from being insignificant.

For instance, the following code in R is at the basis of many permutation 
algorithms found in the internet and in teaching because it seems quite 
straightforward (see for example 
http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Permutation%20Anova/PermTestsAnova.html
http://www.usna.edu/Users/math/jager/courses/sm439/labs/Lab7.pdf
http://biostatistics1014.blogspot.fr/2013/04/one-way-anova-permutation-test-and.html
http://adn.biol.umontreal.ca/~numericalecology/Rcode/):


nperm - 1000# number of permutations
Fperm - replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) # calculates 
nperm F values
(prob - (sum(anova1$F[1] = Fperm) + 1)/(nperm +1))  # risk calculation
  

[1] 0.04695305

Of course, because of the sample function, repeating this code gives different 
prob values, but they remain always close to 5% instead of the exact 
probability of 10%. Indeed, there are only choose(6,3) = 20 possible 
permutations, but because they are symmetric, they give only 10 absolute mean 
differences. Thus, the only exact probabilities are 10%, 20% ... 100%. In our 
example where samples do not overlap, 10% is obviously the right answer.

Thus, the use of lm and aov functions in permutation methods does not seem a 
good idea as it results in biases that underestimate the exact risk. In the 
simple case of one-way anova, it is quite simple to remedy this problem. As the 
total Sums of squares and the degrees of freedom do not change with 
permutations, it is easier and much faster to compare the residual sums of 
squares. For instance, the exact probabilities can be calculated that way :


combi - combn(6, 3, FUN=function(i) {append(dat1$Y[i], dat1$Y[-i])}) # all 
permutations
SCEResid - apply(combi, 2, FUN=function(x) sum(tapply(x, dat1$F, function(x) 
sum((x - mean(x))^2 # all resi SS
(prob - mean(SCEResid = SCEResid[1])) # risk calculation

[1] 0.1

10% is indeed the exact risk.

Finally, my problem is : How can we know if R libraries that use randomization 
procedures are not biased ? In the basic case of one way anova, it seems easy 
to submit the above example (by the way, the defunct lmPerm library does not 
succeed ...), but how can we check more complex 

[R] ANOVA MS residuals estimation: two models

2014-08-27 Thread Rosario Garcia Gil
Hello

I have an experiment with two factors MANAGEMENT and REGION. And I have three 
MANAGEMENT types and three REGIONS. Like this.

REGION: A, B and C
MANAGEMENT: N, P, ST
the independent variable is called PHt

I have set the aov in R in two possible ways. The first model I understand, but 
I do not understand how the second model set the df for each variable. I 
understood that the second model is the most correct for a two-way ANOVA 
without repetition, is that correct?:

- aov(PHt~REGION+MANAGEMENT+REGION*MANAGEMENT, data=data)
- aov(PHt~REGION*MANAGEMENT+Error(subject/(REGION*MANAGEMENT)),data=data)

I attach the results of the two models.

Best regards
Rosario

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova

2014-08-21 Thread Lynn Govaert
Hi all,

I have troubles with doing an anova. So I have the following variables: a
variable group with two levels, a continuous variable trait and within each
group we have 12 organisms (clone) and for each clone we have 5 replicates.
So we want to see if for the variable trait the two groups differ,
including the information for the clones.

So the dataset looks like:

trait  group

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova

2014-08-21 Thread Lynn Govaert
I pressed enter to soon.

Again

Hi all,

I have troubles with doing an anova. So I have the following variables: a
variable group with two levels, a continuous variable trait and within each
group we have 12 organisms (clone) and for each clone we have 5 replicates.
So we want to see if for the variable trait the two groups differ,
including the information for the clones.

So the dataset looks like:

trait  groupclone
...  1   a1
...  1   a2
...
...  1   a12
...  1a1
...
...   2b1
...   2b2

etc

trait are just some numbers.
So clone is nested in group.

Then I don't understand how to make the anova,
I was thinking we want to see if there is an effect of group

So we built the model aov(trait ~group)
but because we also want to include the effect of clone which is a random
effect we do

aov(trait ~ group + Error(clone))
but because Clone is nested in group
we do
aov(trait ~ group + Error(group/clone))

but if I do this I get the following output


Error: PopulationTNFMF
Df  Sum Sq Mean Sq
PopulationTNFMF  1 0.00917 0.00917

Error: PopulationTNFMF:CloneTNFMF
  DfSum Sq   Mean Sq F value Pr(F)
Residuals  1 0.0001849 0.0001849

Error: Within
   Df  Sum Sq   Mean Sq F value Pr(F)
Residuals 117 0.02107 0.0001801


and I don't get any p-values, so I guess I'm doing something wrong.
Can someone help?

Thank you in advance
Lynn


2014-08-21 12:08 GMT+02:00 Lynn Govaert lynn.gova...@gmail.com:

 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  group


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova

2014-08-21 Thread Sarah Goslee
Hi Lynn,

Please read the posting guide, this document:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

and don't post in HTML. (See your own email copied below for why.)

Sarah

On Thu, Aug 21, 2014 at 6:08 AM, Lynn Govaert lynn.gova...@gmail.com wrote:
 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  group

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova

2014-08-21 Thread Bert Gunter
... and you do get P values; you just don't understand what you're
looking at, which is more or less what you said.

This is not a statistical help site. You should seek local statistical
consulting advice or post on a statistical help site (caveat emptor!)
like stats.stackexchange.com, not here.

Cheers,
Bert



Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Thu, Aug 21, 2014 at 3:13 AM, Lynn Govaert lynn.gova...@gmail.com wrote:
 I pressed enter to soon.

 Again

 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  groupclone
 ...  1   a1
 ...  1   a2
 ...
 ...  1   a12
 ...  1a1
 ...
 ...   2b1
 ...   2b2

 etc

 trait are just some numbers.
 So clone is nested in group.

 Then I don't understand how to make the anova,
 I was thinking we want to see if there is an effect of group

 So we built the model aov(trait ~group)
 but because we also want to include the effect of clone which is a random
 effect we do

 aov(trait ~ group + Error(clone))
 but because Clone is nested in group
 we do
 aov(trait ~ group + Error(group/clone))

 but if I do this I get the following output


 Error: PopulationTNFMF
 Df  Sum Sq Mean Sq
 PopulationTNFMF  1 0.00917 0.00917

 Error: PopulationTNFMF:CloneTNFMF
   DfSum Sq   Mean Sq F value Pr(F)
 Residuals  1 0.0001849 0.0001849

 Error: Within
Df  Sum Sq   Mean Sq F value Pr(F)
 Residuals 117 0.02107 0.0001801


 and I don't get any p-values, so I guess I'm doing something wrong.
 Can someone help?

 Thank you in advance
 Lynn


 2014-08-21 12:08 GMT+02:00 Lynn Govaert lynn.gova...@gmail.com:

 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  group


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA for RCBD, two factors: how to plot residuals?..

2014-07-16 Thread Manuel Lambert



Hello,
I'm having problems to analyse results from a RCB Design Experiment.I have 
three blocks. For each block: four treatments (factor A), randomized. And for 
each factor A treatment,  I have 6 differents treatments (factor C), randomized.
myAOV=aov(response ~ factorA*factorC + Block + Error(field), data)
The model i use would be response = factorA + factorC + factorA:factorC + Block 
(fixed effect) + Field (subplot, random effect).
I have some good results but my problem is the following: i would like to have 
a look at the residuals distribution and the command:
residuals(myAOV)
returns a NULL object. How to do then to extract the residuals? The function 
works if i take off Error(field) from my aov function but this doesn't give 
me relevant residuals right?
Thanks a lot!



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA for RCBD, two factors: how to plot residuals?..

2014-07-16 Thread Manuel Lambert



Hello,
I'm having problems to analyse results from a RCB Design Experiment.I have 
three blocks. For each block: four treatments (factor A), randomized. And for 
each factor A treatment,  I have 6 differents treatments (factor C), randomized.
myAOV=aov(response ~ factorA*factorC + Block + Error(field), data)
The model i use would be response = factorA + factorC + factorA:factorC + Block 
(fixed effect) + Field (subplot, random effect).
I have some good results but my problem is the following: i would like to have 
a look at the residuals distribution and the command:
residuals(myAOV)
returns a NULL object. How to do then to extract the residuals? The function 
works if i take off Error(field) from my aov function but this doesn't give 
me relevant residuals right?
Thanks a lot!



  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for RCBD, two factors: how to plot residuals?..

2014-07-16 Thread Richard M. Heiberger
## Manuel,

## Please look at the maiz example in ?mmc in the HH package.

install.packages(HH)  ## if necessary
library(HH)
?mmc

## After the full maiz example, then you need one more command

maiz.proj - proj(maiz.aov)
maiz.proj

## I think you are looking for
maiz.proj$Within[, Residuals]

## Rich

On Wed, Jul 16, 2014 at 9:46 AM, Manuel Lambert
lambert_man...@hotmail.com wrote:



 Hello,
 I'm having problems to analyse results from a RCB Design Experiment.I have 
 three blocks. For each block: four treatments (factor A), randomized. And for 
 each factor A treatment,  I have 6 differents treatments (factor C), 
 randomized.
 myAOV=aov(response ~ factorA*factorC + Block + Error(field), data)
 The model i use would be response = factorA + factorC + factorA:factorC + 
 Block (fixed effect) + Field (subplot, random effect).
 I have some good results but my problem is the following: i would like to 
 have a look at the residuals distribution and the command:
 residuals(myAOV)
 returns a NULL object. How to do then to extract the residuals? The function 
 works if i take off Error(field) from my aov function but this doesn't give 
 me relevant residuals right?
 Thanks a lot!




 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA for proportions with large mass on an extreme of [0, 1]

2014-06-21 Thread Joaquin Rapela

I am trying to perform an ANOVA on a dependent variable that has large mass
on the 1 side of the (0, 1] interval. I decided to use Fractional Regression
Models, as implemented in the package frm. This package seems well-suited for
my problem, but I don't see how to perform model comparisons of nested frm
models. Please, see data and code below.

I would like to do:

anova(model1, model2)

There is a function frm.ptest(model1, model2), but does not work with nested
models.

Are there alternatives to the frm package to perform ANOVAs on proportions
(with large mass on an extreme of [0, 1])?

Is there a way to model repeated measures (as in package lme4) when the 
dependent variable is a  proportion?

Data and code
-

con - url(http://sccn.ucsd.edu/~rapela/avshift/anovaDataFrame.RData;)
myData - get(load(con))
close(con)

myData - myData[!is.na(myData$alternationRate),]
y - myData$alternationRate

library(frm)

model1 - frm(y=y, x=model.matrix(~modality*condition+clusterID, data=myData)[, 
-1], linkfrac=logit, linkbin=logit, type=2P, inflation=1)
model2 - frm(y=y, x=model.matrix(~modality+condition+clusterID, data=myData)[, 
-1], linkfrac=logit, linkbin=logit, type=2P, inflation=1)

# this works
frm.ptest(model2, model3)

# but this does not
# frm.ptest(model1, model2)
#
# Error in frm.ptest(model1, model2) : 
# object 2 is nested in object 1 - no need to use the P test

Thanks, Joaquin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for proportions with large mass on an extreme of [0, 1]

2014-06-21 Thread Bert Gunter
Although your queries certainly intersect R, they are primarily about
statistical modeling, which is OT for this list. Your issues also
appear to be complex. I would therefore suggest that you eschew remote
Internet advice and consult a local statistical expert for help.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Fri, Jun 20, 2014 at 8:54 PM, Joaquin Rapela rap...@ucsd.edu wrote:

 I am trying to perform an ANOVA on a dependent variable that has large mass
 on the 1 side of the (0, 1] interval. I decided to use Fractional Regression
 Models, as implemented in the package frm. This package seems well-suited for
 my problem, but I don't see how to perform model comparisons of nested frm
 models. Please, see data and code below.

 I would like to do:

 anova(model1, model2)

 There is a function frm.ptest(model1, model2), but does not work with nested
 models.

 Are there alternatives to the frm package to perform ANOVAs on proportions
 (with large mass on an extreme of [0, 1])?

 Is there a way to model repeated measures (as in package lme4) when the 
 dependent variable is a  proportion?

 Data and code
 -

 con - url(http://sccn.ucsd.edu/~rapela/avshift/anovaDataFrame.RData;)
 myData - get(load(con))
 close(con)

 myData - myData[!is.na(myData$alternationRate),]
 y - myData$alternationRate

 library(frm)

 model1 - frm(y=y, x=model.matrix(~modality*condition+clusterID, 
 data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, inflation=1)
 model2 - frm(y=y, x=model.matrix(~modality+condition+clusterID, 
 data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, inflation=1)

 # this works
 frm.ptest(model2, model3)

 # but this does not
 # frm.ptest(model1, model2)
 #
 # Error in frm.ptest(model1, model2) :
 # object 2 is nested in object 1 - no need to use the P test

 Thanks, Joaquin

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for proportions with large mass on an extreme of [0, 1]

2014-06-21 Thread Joaquin Rapela
Bert,

My queries are directly related to R: 

1. Can the R package frm can be used to compare nested models. If so, how.

2. Are there alternative R packages to perform ANOVAs on a dependent variable
that is a proportion with significant mass one one extreme?

Also, my statistical problem is well defined (i.e., perform an ANOVA, ideally
considering repeated measures, on a dependent variable that is a proportion),
there is substantial literature addressing this problem (see some old and
newer articles below), and I am just looking for help on R packages
implementing the functionality described in this literature.

I believe my statistical problem is not complex. After all, ANOVAs are one of
the most common types of statistical analysis, and proportions frequently
appear in statistical problems.  

I just wanted to get some feedback from R users with expertise performing
ANOVAs on proportions. I trust that my statistical problem is not rare, that
several R users have worked on this problem, and that I will get very useful
feedback from them. If I get this feedback, it will show that my question is
no OT. Let's see what happens ...

Cordially, Joaquin

Papke, L. and J.M. Wooldridge (1996), Econometric methods for fractional
response variables with an application to 401(K) plan participation rates,
Journal of Applied Econometrics, 11(6), 619-32.

Ramalho, E.A., J.J.S. Ramalho and J.M.R. Murteira (2011), Alternative
estimating and testing empirical strategies for fractional regression
models, Journal of Economic Surveys, 25(1), 19-68.

On Sat, Jun 21, 2014 at 10:27:27AM -0700, Bert Gunter wrote:
 Although your queries certainly intersect R, they are primarily about
 statistical modeling, which is OT for this list. Your issues also
 appear to be complex. I would therefore suggest that you eschew remote
 Internet advice and consult a local statistical expert for help.
 
 Cheers,
 Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 Clifford Stoll
 
 
 
 
 On Fri, Jun 20, 2014 at 8:54 PM, Joaquin Rapela rap...@ucsd.edu wrote:
 
  I am trying to perform an ANOVA on a dependent variable that has large mass
  on the 1 side of the (0, 1] interval. I decided to use Fractional Regression
  Models, as implemented in the package frm. This package seems well-suited 
  for
  my problem, but I don't see how to perform model comparisons of nested frm
  models. Please, see data and code below.
 
  I would like to do:
 
  anova(model1, model2)
 
  There is a function frm.ptest(model1, model2), but does not work with nested
  models.
 
  Are there alternatives to the frm package to perform ANOVAs on proportions
  (with large mass on an extreme of [0, 1])?
 
  Is there a way to model repeated measures (as in package lme4) when the 
  dependent variable is a  proportion?
 
  Data and code
  -
 
  con - url(http://sccn.ucsd.edu/~rapela/avshift/anovaDataFrame.RData;)
  myData - get(load(con))
  close(con)
 
  myData - myData[!is.na(myData$alternationRate),]
  y - myData$alternationRate
 
  library(frm)
 
  model1 - frm(y=y, x=model.matrix(~modality*condition+clusterID, 
  data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, 
  inflation=1)
  model2 - frm(y=y, x=model.matrix(~modality+condition+clusterID, 
  data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, 
  inflation=1)
 
  # this works
  frm.ptest(model2, model3)
 
  # but this does not
  # frm.ptest(model1, model2)
  #
  # Error in frm.ptest(model1, model2) :
  # object 2 is nested in object 1 - no need to use the P test
 
  Thanks, Joaquin
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

-- 
Joaquin Rapela, PhD
Swartz Center for Computational Neuroscience
University of California San Diego
9500 Gilman Drive,
San Diego, CA 92093-0559
tel: (858) 822-7547
fax: (858) 822-7556
http://sccn.ucsd.edu/~rapela
--


   Tak fyr, and ber it in the derkeste hous
   Bitwix this and the Mount of Caucasus,
   And lat men shette the dores and go thenne,
   Yet wol the fyr as faire lye and brenne,
   As twenty thousand men mighte it biholde:
   His office naturel ay wol it holde,
   Up peril of my lyf, til that it dye.

 Geoffrey Chaucer
 The Canterbury Tales

__
R-help@r-project.org mailing list

Re: [R] ANOVA repeated mesures

2013-12-23 Thread PIKAL Petr
Hi

You have only single value for each participant in each gruppo and AFAIK you 
can not do statistic on single value.

You can check differences among participants

fit-lm(valor~participantes, data=df2)
 fit-lm(valor~participantes, data=df2)
 anova(fit)
Analysis of Variance Table

Response: valor
  Df Sum Sq Mean Sq F value Pr(F)
participantes  7  5.167  0.7381  0.3163 0.9358
Residuals 16 37.333  2.   

and in groups

 fit-lm(valor~grupo, data=df2)
 anova(fit)
Analysis of Variance Table

Response: valor
  Df Sum Sq Mean Sq F valuePr(F)
grupo  2  22.75 11.3750  12.095 0.0003202 ***
Residuals 21  19.75  0.9405  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Regards
Petr


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Ma Teresa Martinez Soriano
 Sent: Wednesday, December 18, 2013 4:17 PM
 To: r-help@r-project.org
 Subject: [R] ANOVA repeated mesures
 
 Hi to everyone, I am tring to make a Anova with repeated measures,my
 data set looks like:
 participantes - c(1, 2, 3, 4, 5, 6, 7, 8, 1, 2,
 3, 4,   5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8)
 grupo - factor(c(rep(A, 8), rep(B, 8), rep(C, 8)))valor - c(1,
 2, 4, 1, 1, 2, 2, 3, 3, 4, 4, 2, 3, 4, 4, 3, 4, 5, 3, 5, 5, 3,  4,
 6)df2 - data.frame(participantes, grupo, valor)
 I want to find if there is differences in the grupo  in each
 participant. I don't know how to find this p-value for each
 participant, could you help me, please??Thanks a lot
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA repeated mesures

2013-12-18 Thread Mª Teresa Martinez Soriano
Hi to everyone, I am tring to make a Anova with repeated measures,my data set 
looks like:
participantes - c(1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4,  
 5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8)
grupo - factor(c(rep(A, 8), rep(B, 8), rep(C, 8)))valor - c(1, 2, 4, 1, 
1, 2, 2, 3, 3, 4, 4, 2, 3, 4, 4, 3, 4, 5, 3, 5, 5, 3,  4, 6)df2 - 
data.frame(participantes, grupo, valor)
I want to find if there is differences in the grupo  in each participant. I 
don't know how to find this p-value for each participant, could you help me, 
please??Thanks a lot   
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA in R

2013-12-07 Thread Robin Mjelle
ID

a_t1a_t2b_t1b_t2
CACCCGTAGAACCGACCTTGCG_mmu-miR-99b-5p15781941234810941
CACCCGTAGAACCGACCTTGC_mmu-miR-99b-5p4424265643839
CACCCGTAGAACCGACCTTG_mmu-miR-99b-5p544366253
CCGTAGAACCGACCTTGCG_mmu-miR-99b-5p263333157
CGTAGAACCGACCTTGCG_mmu-miR-99b-5p232238187
CACCCGTAGAACCGACCT_mmu-miR-99b-5p172731189
ACCCGTAGAACCGACCTTGCG_mmu-miR-99b-5p14193094

I want to run an ANOVA test on the rows above. I have two conditions with
two replicates each. I assume that there is no difference between a and b
for the different rows. I want to get which rows a different between a and
b.
How do I approach this in R?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in R

2013-12-07 Thread Bert Gunter
I think you start by doing your homework:

1. Read An Inroduction to R (ships with R) or other R online
tutorial. There are many good ones.

2. Use R's Help system:
?aov
?lm
?anova

there will be relevant links in these docs that you should follow,
especially to the use of formulas for model specification and
summary() for printing results (including anova tables).

3. Alternatively, have a look at the Rcmdr package for a GUI interface
to R that may meet your needs while allowing you to skip the homework.

Cheers,
Bert




On Sat, Dec 7, 2013 at 8:30 AM, Robin Mjelle robinmje...@gmail.com wrote:
 ID

 a_t1a_t2b_t1b_t2
 CACCCGTAGAACCGACCTTGCG_mmu-miR-99b-5p15781941234810941
 CACCCGTAGAACCGACCTTGC_mmu-miR-99b-5p4424265643839
 CACCCGTAGAACCGACCTTG_mmu-miR-99b-5p544366253
 CCGTAGAACCGACCTTGCG_mmu-miR-99b-5p263333157
 CGTAGAACCGACCTTGCG_mmu-miR-99b-5p232238187
 CACCCGTAGAACCGACCT_mmu-miR-99b-5p172731189
 ACCCGTAGAACCGACCTTGCG_mmu-miR-99b-5p14193094

 I want to run an ANOVA test on the rows above. I have two conditions with
 two replicates each. I assume that there is no difference between a and b
 for the different rows. I want to get which rows a different between a and
 b.
 How do I approach this in R?

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova split by factors

2013-11-18 Thread catalin roibu
Hello R-users,
I have a problem with Anova in R and I don't know how to solve that. I want
to compute Anova for each experiment (exp). I try this code:

test-lapply(split(eg,eg$Exp),function(x) aov(masa.uscat.tr ~ Clona,data =
x))
or
test-by(eg,eg$Exp, function(x) aov(masa.uscat.tr~Clona,data=x))

I want to compute Anova summary for each experiment (exp) and I want to
compute Tuckey test for each Anova.

Thank you very much!

My data is like this:



Exp Plot Clona Prov masa uscat tr masa usc. Ram masa usc total
B 36 Max4 P Puieti 6.199848485 2.639325843 8.839174328
B 36 Max4 P Puieti 3.87875 1.4798 5.35855
B 36 Max4 P Puieti 7.822702703 3.32852071 11.15122341
B 36 Max4 P Puieti 5.645384615 1.995238095 7.640622711
B 36 Max4 P Puieti 10.2 3.514864865 13.71486486
B 36 Max4 P Puieti 8.815545455 3.35627907 12.17182452
B 36 Max4 P Puieti 5.0 1.607142857 6.64047619
B 36 Max4 P Puieti 6.693488372 2.630208333 9.323696705
B 36 Max4 P Puieti 6.021012658 1.60293578 7.623948438
B 36 Max4 P Puieti 10.20582524 3.768314607 13.97413985
B 33 Max4 Butasi 5.899417476 1.745394737 7.644812213
B 33 Max4 Butasi 3.261428571 1.335735294 4.597163866
B 33 Max4 Butasi 3.359508197 1.456641221 4.816149418
B 33 Max4 Butasi 5.036363636 2.097793103 7.13415674
B 33 Max4 Butasi 3.122162162 1.612012579 4.734174741
B 33 Max4 Butasi 5.042474227 3.246916427 8.289390653
B 33 Max4 Butasi 5.058255814 2.724299065 7.782554879
B 33 Max4 Butasi 4.977818182 1.713504274 6.691322455
B 33 Max4 Butasi 3.195294118 1.243411765 4.438705882
B 33 Max4 Butasi 1.831818182 1.009090909 2.840909091
B 30 AF2 Butasi 6.98195122 1.764 8.74595122
B 30 AF2 Butasi 5.85833 1.686623377 7.54495671
B 30 AF2 Butasi 10.625 3.04 13.665
C 39 AF6 Sade 10.27125 2.283193277 12.55444328
C 39 AF6 Sade 9.473488372 1.909414414 11.38290279
C 39 AF6 Sade 10.4825 2 12.4825
C 39 AF6 Sade 11.61579545 2.431136364 14.04693182
C 39 AF6 Sade 8.185074627 1.80933 9.99440796
C 39 AF6 Sade 11.04510638 2.23672956 13.28183594
C 39 AF6 Sade 9.06667 2.473785714 11.54045238
C 39 AF6 Sade 10.12787611 3.097631579 13.22550769
C 39 AF6 Sade 9.171290323 1.821226415 10.99251674
C 39 AF6 Sade 12.1846875 2.262590361 14.44727786
C 42 Pannonia Sade 9.275 2.482173913 11.75717391
C 42 Pannonia Sade 7.21 1.77 8.98
C 42 Pannonia Sade 11.36939394 3.111780822 14.48117476
C 42 Pannonia Sade 7.85296875 1.943475177 9.796443927
C 42 Pannonia Sade 8.25 2.54047619 10.79047619
C 42 Pannonia Sade 8.669277108 2.187071429 10.85634854
C 42 Pannonia Sade 8.510886076 2.05344 10.56432608
C 42 Pannonia Sade 9.36222 5.525531915 14.88775414
C 42 Pannonia Sade 11.08481928 2.573193277 13.65801255
C 42 Pannonia Sade 10.17462687 3.003225806 13.17785267
C 45 Monviso Sade 12.99693878 3.216083916 16.21302269
C 45 Monviso Sade 11.11456522 1.885714286 13.0002795
C 45 Monviso Sade 8.12933 1.53267 9.662
C 45 Monviso Sade 9.943043478 2.38300885 12.32605233
C 45 Monviso Sade 11.9805814 3.080923913 15.06150531
C 45 Monviso Sade 10.31376623 2.210526316 12.52429255
C 45 Monviso Sade 9.947586207 1.86083 11.80841954
C 45 Monviso Sade 12.24261538 2.166857143 14.40947253
C 45 Monviso Sade 13.56650602 2.414371257 15.98087728
C 45 Monviso Sade 10.87574257 2.922340426 13.798083
C 48 AF2 Sade 9.334545455 2.201822917 11.53636837
C 48 AF2 Sade 9.747640449 2.811780822 12.55942127
C 48 AF2 Sade 14.29541284 4.506885246 18.80229809
C 48 AF2 Sade 10.26451613 3.014322581 13.27883871
C 48 AF2 Sade 13.30924242 3.960661765 17.26990419
C 48 AF2 Sade 13.23228426 5.00546875 18.23775301
C 48 AF2 Sade 14.8277027 4.450447761 19.27815046
C 48 AF2 Sade 15.23669528 7.559322034 22.79601731
C 48 AF2 Sade 13.27198582 3.758252427 17.03023824
C 48 AF2 Sade 13.7136 5.69044 19.40408889
C 51 Max4 Sade 8.860884956 5.613586957 14.47447191
C 51 Max4 Sade 13.29153285 5.653061224 18.94459407
C 51 Max4 Sade 9.609850746 3.385714286 12.99556503

-- 
---
Catalin-Constantin ROIBU
Lecturer PhD, Forestry engineer
Forestry Faculty of Suceava
Str. Universitatii no. 13, Suceava, 720229, Romania
office phone +4 0230 52 29 78, ext. 531
mobile phone   +4 0745 53 18 01
   +4 0766 71 76 58
FAX:+4 0230 52 16 64
silvic.usv.ro

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova split by factors

2013-11-18 Thread ONKELINX, Thierry
Dear Catalin,

Have a look at the plyr package.

library(plyr)
dlply(
eg,
.(Exp),
function(x) {
aov(masa.uscat.tr~Clona,data=x)
}
)

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey


-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
catalin roibu
Verzonden: maandag 18 november 2013 13:24
Aan: r-help@r-project.org
Onderwerp: [R] Anova split by factors

Hello R-users,
I have a problem with Anova in R and I don't know how to solve that. I want to 
compute Anova for each experiment (exp). I try this code:

test-lapply(split(eg,eg$Exp),function(x) aov(masa.uscat.tr ~ Clona,data =
x))
or
test-by(eg,eg$Exp, function(x) aov(masa.uscat.tr~Clona,data=x))

I want to compute Anova summary for each experiment (exp) and I want to compute 
Tuckey test for each Anova.

Thank you very much!

My data is like this:



Exp Plot Clona Prov masa uscat tr masa usc. Ram masa usc total B 36 Max4 P 
Puieti 6.199848485 2.639325843 8.839174328 B 36 Max4 P Puieti 3.87875 1.4798 
5.35855 B 36 Max4 P Puieti 7.822702703 3.32852071 11.15122341 B 36 Max4 P 
Puieti 5.645384615 1.995238095 7.640622711 B 36 Max4 P Puieti 10.2 3.514864865 
13.71486486 B 36 Max4 P Puieti 8.815545455 3.35627907 12.17182452 B 36 Max4 P 
Puieti 5.0 1.607142857 6.64047619 B 36 Max4 P Puieti 6.693488372 
2.630208333 9.323696705 B 36 Max4 P Puieti 6.021012658 1.60293578 7.623948438 B 
36 Max4 P Puieti 10.20582524 3.768314607 13.97413985 B 33 Max4 Butasi 
5.899417476 1.745394737 7.644812213 B 33 Max4 Butasi 3.261428571 1.335735294 
4.597163866 B 33 Max4 Butasi 3.359508197 1.456641221 4.816149418 B 33 Max4 
Butasi 5.036363636 2.097793103 7.13415674 B 33 Max4 Butasi 3.122162162 
1.612012579 4.734174741 B 33 Max4 Butasi 5.042474227 3.246916427 8.289390653 B 
33 Max4 Butasi 5.058255814 2.724299065 7.782554879 B 33 Max4 Butasi 4.977818182 
1!
 .713504274 6.691322455 B 33 Max4 Butasi 3.195294118 1.243411765 4.438705882 B 
33 Max4 Butasi 1.831818182 1.009090909 2.840909091 B 30 AF2 Butasi 6.98195122 
1.764 8.74595122 B 30 AF2 Butasi 5.85833 1.686623377 7.54495671 B 30 AF2 
Butasi 10.625 3.04 13.665 C 39 AF6 Sade 10.27125 2.283193277 12.55444328 C 39 
AF6 Sade 9.473488372 1.909414414 11.38290279 C 39 AF6 Sade 10.4825 2 12.4825 C 
39 AF6 Sade 11.61579545 2.431136364 14.04693182 C 39 AF6 Sade 8.185074627 
1.80933 9.99440796 C 39 AF6 Sade 11.04510638 2.23672956 13.28183594 C 39 
AF6 Sade 9.06667 2.473785714 11.54045238 C 39 AF6 Sade 10.12787611 
3.097631579 13.22550769 C 39 AF6 Sade 9.171290323 1.821226415 10.99251674 C 39 
AF6 Sade 12.1846875 2.262590361 14.44727786 C 42 Pannonia Sade 9.275 
2.482173913 11.75717391 C 42 Pannonia Sade 7.21 1.77 8.98 C 42 Pannonia Sade 
11.36939394 3.111780822 14.48117476 C 42 Pannonia Sade 7.85296875 1.943475177 
9.796443927 C 42 Pannonia Sade 8.25 2.54047619 10.79047619 C 42 Pannonia !
 Sade 8.669277108 2.187071429 10.85634854 C 42 Pannonia Sade 8.51088607
6 2.05344 10.56432608 C 42 Pannonia Sade 9.36222 5.525531915 14.88775414 C 
42 Pannonia Sade 11.08481928 2.573193277 13.65801255 C 42 Pannonia Sade 
10.17462687 3.003225806 13.17785267 C 45 Monviso Sade 12.99693878 3.216083916 
16.21302269 C 45 Monviso Sade 11.11456522 1.885714286 13.0002795 C 45 Monviso 
Sade 8.12933 1.53267 9.662 C 45 Monviso Sade 9.943043478 2.38300885 
12.32605233 C 45 Monviso Sade 11.9805814 3.080923913 15.06150531 C 45 Monviso 
Sade 10.31376623 2.210526316 12.52429255 C 45 Monviso Sade 9.947586207 
1.86083 11.80841954 C 45 Monviso Sade 12.24261538 2.166857143 14.40947253 C 
45 Monviso Sade 13.56650602 2.414371257 15.98087728 C 45 Monviso Sade 
10.87574257 2.922340426 13.798083 C 48 AF2 Sade 9.334545455 2.201822917 
11.53636837 C 48 AF2 Sade 9.747640449 2.811780822 12.55942127 C 48 AF2 Sade 
14.29541284 4.506885246 18.80229809 C 48 AF2 Sade 10.26451613 3.014322581 
13.27883871 C 48 AF2 Sade 13.30924242 3.960661765 17.26990419 C 48 AF2 Sade 
13.232284!
 26 5.00546875 18.23775301 C 48 AF2 Sade 14.8277027 4.450447761 19.27815046 C 
48 AF2 Sade 15.23669528 7.559322034 22.79601731 C 48 AF2 Sade 13.27198582 
3.758252427 17.03023824 C 48 AF2 Sade 13.7136 5.69044 19.40408889 C 51 
Max4 Sade 8.860884956 5.613586957 14.47447191 C 51 Max4 Sade 13.29153285

[R] ANOVA nested model with random and fixed effects

2013-08-22 Thread Belyaeva Anna
Hello, 
I am confused with the design of experiment and I need your help. I study 
macroinvertebrates from 40 lakes. Each lake was sampled employing stratified 
random technique. Each lakes was divided into 3 different lake zones based on 
light penetration: near-shore area, transitional, and the deepest part of the 
lake. 5 random sediment samples were taken from the near shore area, 5 random 
samples in the transitional zone, 25 random samples in the deepest part of the 
lake. Sediment type and organisms' counts was recorded for each sample. 
To complicate more, 2 different types of equipment were used. Near-shore and 
transitional area were sampled with large-area heavy sampler, but the deepest 
part of the lake was sampled with small area sampler. I do not know if 
different type of equipment should figured in the model since all organisms 
counts were calculated as per sq/m.
I want to see if there is the sediment effect on organism counts if lake and 
zone are taken into account.I consider the lake to be the random effect . Zone 
is the meaningful, it carries information for benthos communities. Zone is 
nested within the lake. Sample is nested within the zone and lake. One sample - 
one sediment type.
What is the correct way make the model:
My guess is something like: aov(Count ~ Sediment+Error(Lake/Zone/Sample))  
Sincerely,
Anna
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA for mixture experiment

2013-07-15 Thread Zhiqiang Cui
Hi

I am trying to use ANOVA for mixture experiment.

The data can be seen below


Run Blend X1 X2 X3 Response

1 Pure 0.0 0.0 1.0 43.6

2 Pure 0.0 0.0 1.0 42.0

3 Pure 0.0 0.0 1.0 43.0

4 Binary 0.0 0.5 0.5 30.0

5 Binary 0.0 0.5 0.5 29.4

6 Binary 0.0 0.5 0.5 33.6

7 Pure 0.0 1.0 0.0 17.6

8 Pure 0.0 1.0 0.0 30.0

9 Pure 0.0 1.0 0.0 28.0

10 Binary 0.5 0.0 0.5 45.4

11 Binary 0.5 0.0 0.5 42.8

12 Binary 0.5 0.0 0.5 43.2

13 Binary 0.5 0.5 0.0 40.0

14 Binary 0.5 0.5 0.0 39.6

15 Binary 0.5 0.5 0.0 42.2

16 Pure 1.0 0.0 0.0 32.0

17 Pure 1.0 0.0 0.0 34.8

18 Pure 1.0 0.0 0.0 34.0



It was successful to get the prediction equation using mout = lm
(Response~-1+x1*x2*x3) to get

R=33.6 X1 + 25.2 X2 + 42.9X3 + 44.8X1X2 + 22.3X1X3.



But the ANOVA results is incorrect when I use

aov.out = aov(Response~-1+x1*x2*x3).



The original paper shows the results should be

Source DF SS MS F Pr  F

X1 1 3386.88 3386.88 362.02 0.0001

X2 1 1905.12 1905.12 203.64 0.0001

X3 1 5512.65 5512.65 589.24 0.0001

X1*X2 1 250.88 250.88 26.82 0.0002

X1*X3 1 61.98 61.98 6.62 0.0244

X2*X3 1 18.40 18.40 1.97 0.1861

Model 5 829.08 165.82 17.72 0.0001

Error 12 112.27 9.36

Total 17 941.35 17 941.35
I appreciate your help and instruction.

Best regards

Zhiqiang

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for mixture experiment

2013-07-15 Thread S Ellison
 But the ANOVA results is incorrect when I use
 
 aov.out = aov(Response~-1+x1*x2*x3).
First, this doesn't work at all. Your variables are in upper case and your data 
environment is not specified, so you must have done something else. I can get 
an answer using something like
anova(lm(Response~-1+X1*X2*X3-I(X1*X2*X3), data=d))
(I dropped the indetyerminable 3-way product explicitly)

Second, your answer differs from that listed by 'the original paper' at least 
partly because you have done something different from what has been done in the 
paper. You have asked for anova using Type I sums of squares. The paper has 
apparently calculated something other than a Type I sums of squares ANOVA.
 
Try calculating using Type 3 sums of squares using Anova from the car package. 
I did that using
Anova(lm(Response~-1+X1+X2+X3+I(X1*X2)+I(X1*X3)+I(X2*X3), data=d), type=3)

Then read the help page - especially the Warning section - for ?Anova very 
carefully ...


S Ellison


***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA Estimated effects may be unbalanced

2013-05-19 Thread Luis Fernando García Hernández
Dear All,

This is a data relating leg shaking on differenntre treatments. I reused
several individuals so I want to know 1) if there are significative
differences on shaking per treatment and 2) if the reused individuals
presented some effect or significative variation.

Nevertehless when I make

model1-aov(Legshaking~Idmale*Idfemale*tratament)

or

model1-aov(Legshaking~Idmale+Idfemale+tratament)

I get

13744 out of 13824 effects not estimable
Estimated effects may be unbalanced

What could be wrong here? Any advice about how to handle this data in R
will be welcome!

Thanks in advance
Legshaking  Idmale  Idfemaletreatment
35  751 762 a
0   746 743 a
9   614 672 a
21  210 675 a
29  678 651 a
5   638 103 a
2   713 626 a
41  621 630 a
28  634 630 a
15  633 718 a
5   637 644 a
11  720 214 a
10  618 48  a
5   654 679 a
2   723 641 a
26  652 659 a
36  756 20  a
32  768 17  a
0   747 21  a
42  710 644 a
0   603 80  b
0   764 753 b
11  604 671 b
2   223 626 b
4   678 676 b
0   711 622 b
0   712 629 b
0   706 705 b
2   601 650 b
3   711 643 b
0   613 691 b
6   717 690 b
10  225 632 b
0   635 622 b
0   640 717 b
2   621 646 b
0   637 671 b
0   702 227 b
1   763 754 b
7   346 643 b
14  750 761 d
9   747 742 d
6   756 757 d
0   680 711 d
19  707 691 d
8   617 624 d
14  619 650 d
44  643 690 d
6   716 670 d
6   6   680 d
1   620 641 d
6   636 628 d
3   614 719 d
10  719 682 d
2   616 628 d
10  752 760 d
10  724 675 d
13  49  632 d
5   709 642 d
0   611 646 d
9   741 749 c
16  617 670 c
0   99  647 c
10  619 681 c
11  617 214 c
3   615 676 c
3   708 650 c
2   615 642 c
2   712 625 c
3   714 624 c
12  715 645 c
0   4   625 c
4   715 703 c
16  633 651 c
1   636 629 c
1   640 646 c
1   722 703 c
4   721 677 c
4   726 650 c
9   725 674 c
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Estimated effects may be unbalanced

2013-05-19 Thread Jim Lemon

On 05/20/2013 10:52 AM, Luis Fernando García Hernández wrote:

Dear All,

This is a data relating leg shaking on differenntre treatments. I reused
several individuals so I want to know 1) if there are significative
differences on shaking per treatment and 2) if the reused individuals
presented some effect or significative variation.

Nevertehless when I make

model1-aov(Legshaking~Idmale*Idfemale*tratament)

or

model1-aov(Legshaking~Idmale+Idfemale+tratament)

I get

13744 out of 13824 effects not estimable
Estimated effects may be unbalanced

What could be wrong here? Any advice about how to handle this data in R
will be welcome!


Hi Luis,
Are you sure that you don't want to compare males and females as a group 
and not every individual male and female?


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Estimated effects may be unbalanced

2013-05-19 Thread Bert Gunter
I would say  that the OP should seek local statistical help, as he
appears to be out of his statistical depth. This would appear to be a
mixed effects models-type setup, but a local statistical expert would
be much better able to judge what the goals of the study were and what
sort of approach, including graphics, might be appropriate.

Seeking remote statistical help from this list when your statistical
background is weak is

a) Inappropriate -- this is not a statistical help list; it's for R software;

b) Highly risky, as relevant details  of the underlying context and
the way the data were obtained are likely to be overlooked. It is
**not** merely a question of finding the appropriate function and just
turning the crank, although that's the way many seems to approach data
analysis.

Cheers,
Bert
On Sun, May 19, 2013 at 6:00 PM, Jim Lemon j...@bitwrit.com.au wrote:
 On 05/20/2013 10:52 AM, Luis Fernando García Hernández wrote:

 Dear All,

 This is a data relating leg shaking on differenntre treatments. I reused
 several individuals so I want to know 1) if there are significative
 differences on shaking per treatment and 2) if the reused individuals
 presented some effect or significative variation.

 Nevertehless when I make

 model1-aov(Legshaking~Idmale*Idfemale*tratament)

 or

 model1-aov(Legshaking~Idmale+Idfemale+tratament)

 I get

 13744 out of 13824 effects not estimable
 Estimated effects may be unbalanced

 What could be wrong here? Any advice about how to handle this data in R
 will be welcome!

 Hi Luis,
 Are you sure that you don't want to compare males and females as a group and
 not every individual male and female?

 Jim

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova und Tukey HSD

2013-05-13 Thread Jochen Schreiber
Hallo zusammen,

ich mache zuerst mittels eine Anova eine Überprüfung ob in den Daten ein 
signifikanter Unterschied exisitiert. Danach will ich mittels des TukeyHSD 
rausfinden, zwischen welchen Gruppe der Unterschied vorliegt. Allerdings weiß 
ich nicht wie ich die Daten interpretieren soll. Hier erstmal mein R Code:

[code]
Value-c(-0.9944999814033508,-0.3585381469727,0.7063000202178955,-1.77435803833,-1.080299973487854,0.3055071525574,1.849046325684,-0.412440395355,0.5827999711036682,1.750669482422,-6.693999767303467,-0.877943869019,-1.3408000469207764,1.2560999393463135,-0.1004081062317,1.849046325684,-0.331905722046,0.4957999885082245,0.877943869019,0.7387999892234802,0.877943869019,0.9154000282287598,0.877943869019,0.7063000202178955,-1.3408000469207764,0.7063000202178955,-0.331905722046,-1.6448999643325806,0.412440395355,-1.6448999643325806,-0.877943869019,0.7487000226974487,0.4399000108242035,1.849046325684,-1.6448999643325806,-2.4323999881744385,1.2265000343322754,-0.4957999885082245,-9.999899864196777,-1.750669482422,-1.6448999643325806,-9.999899864196777,0.877943869019,-5.06279993057251,0.877943869019,-2.967745776367,-5.06279993057251,-6.693999767303467,-1.0990500450134277,0.9944999814033508,-0.467745776367,-0.3585381469727,-9.999899864196777,0.5827999711036682,0.7487000226974487,0.7387999892234802,-0.2533000111579895,-9.999899864196777,-1.0363999605178833,0.3055071525574,-1.174523162842,-0.806410490417,-9.999899864196777,-0.9944999814033508,-2.478300094604492,-0.150930858612,0.4957999885082245,-4.571800231933594,-6.324900150299072,-0.38530001044273376,-1.3408000469207764,-5.93179988861084,-6.693999767303467,-2.967745776367,0.877943869019,-0.05020405311584,-1.77435803833,-0.150930858612,-0.23725003004074097,-0.643268528748,1.2560999393463135,-0.1004081062317,0.4399000108242035,-0.7063000202178955,0.9154000282287598,-0.2181814033508,1.2265000343322754,-0.412440395355,0.1764581741333,-1.4758000373840332,-0.9944999814033508,-1.080299973487854,-0.643268528748,-9.999899864196777,-2.0536999702453613,-0.2181814033508,0.7487000226974487,0.02510202655792,-1.0363999605178833,-0.05020405311584,-0.7387999892234802,0.4957999885082245,-1.4758000373840332,-0.7063000202178955,0.1764581741333,-5.06279993057251,-0.643268528748,-1.4758000373840332,-0.9944999814033508,-0.2533000111579895,0.1764581741333,-0.331905722046,0.6776500344276428,0.3055071525574,-0.05020405311584,0.5827999711036682,1.2560999393463135,-0.4957999885082245,-0.38530001044273376,0.9944999814033508,-2.4323999881744385,1.126338964844,-0.9944999814033508,1.750669482422,1.080299973487854,-0.7387999892234802,-1.3408000469207764,0.612820980835,-2.0536999702453613,0.7063000202178955,-0.806410490417,-0.877943869019,-0.05020405311584,-2.967745776367,-0.877943869019,-2.0536999702453613,-1.3408000469207764,-1.3408000469207764,-0.38530001044273376,0.7063000202178955,-9.999899864196777,-0.467745776367,0.7721999883651733,0.02510202655792,1.126338964844,-6.324900150299072,-0.150930858612,-0.4399000108242035,-0.9944999814033508,-0.9944999814033508,-0.467745776367,-1.0363999605178833,-1.750669482422,1.2265000343322754,-0.877943869019,0.612820980835,-0.05020405311584,0.5827999711036682,-0.7063000202178955,-0.643268528748,-0.23725003004074097,0.02510202655792,0.412440395355,0.7721999883651733,-1.0990500450134277)
Value-c(Value,0.9944999814033508,-0.2533000111579895,1.2560999393463135,-0.2181814033508,-1.174523162842,-0.38530001044273376,-0.4399000108242035,-0.7063000202178955,-2.478300094604492,-2.4323999881744385,0.9154000282287598,-0.23725003004074097,-0.38530001044273376,-1.6448999643325806,-0.05020405311584,1.849046325684,-0.38530001044273376,-0.643268528748,-4.571800231933594,-6.693999767303467,-1.750669482422,1.080299973487854,0.412440395355,-1.3408000469207764,-5.93179988861084,-0.3585381469727,-0.643268528748,-0.412440395355,-1.0990500450134277,-0.9944999814033508,-0.806410490417)
Group-factor(c(rep('D',18),rep('C',1),rep('A',7),rep('B',34),rep('E',3),rep('F',4),rep('G',10),rep('H',2),rep('I',29),rep('J',16),rep('N',1),rep('M',1),rep('Z',2),rep('X',67),rep('O',1)))
mydata-data.frame(Group, Value)
summary(aov(Value ~Group,mydata))
TukeyHSD(aov(Value ~Group))
[/code]

Das Summary von Anova gibt folgenden Output:

[code]
 Df  Sum Sq Mean Sq F value  Pr(F)  
Group14  147.47 10.5339  1.8815 0.03089 *
Residuals   181 1013.38  5.5988  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
[/code]

Dort sehe ich das der Pr(F)-Wert unter 0.05 liegt daher existiert ein 
signifikanter Unterschied. Nun folgt ein Teil der Ausgabe vom TukeyHSD:

[code]
$Group
   diff lwrupr p adj
B-A 

[R] Anova und Tukey HSD

2013-05-13 Thread Jochen Schreiber
Hallo zusammen,

ich mache zuerst mittels eine Anova eine Überprüfung ob in den Daten ein 
signifikanter Unterschied exisitiert. Danach will ich mittels des TukeyHSD 
rausfinden, zwischen welchen Gruppe der Unterschied vorliegt. Allerdings weiß 
ich nicht wie ich die Daten interpretieren soll. Hier erstmal mein R Code:

[code]
Value-c(-0.9944999814033508,-0.3585381469727,0.7063000202178955,-1.77435803833,-1.080299973487854,0.3055071525574,1.849046325684,-0.412440395355,0.5827999711036682,1.750669482422,-6.693999767303467,-0.877943869019,-1.3408000469207764,1.2560999393463135,-0.1004081062317,1.849046325684,-0.331905722046,0.4957999885082245,0.877943869019,0.7387999892234802,0.877943869019,0.9154000282287598,0.877943869019,0.7063000202178955,-1.3408000469207764,0.7063000202178955,-0.331905722046,-1.6448999643325806,0.412440395355,-1.6448999643325806,-0.877943869019,0.7487000226974487,0.4399000108242035,1.849046325684,-1.6448999643325806,-2.4323999881744385,1.2265000343322754,-0.4957999885082245,-9.999899864196777,-1.750669482422,-1.6448999643325806,-9.999899864196777,0.877943869019,-5.06279993057251,0.877943869019,-2.967745776367,-5.06279993057251,-6.693999767303467,-1.0990500450134277,0.9944999814033508,-0.467745776367,-0.3585381469727,-9.999899864196777,0.5827999711036682,0.7487000226974487,0.7387999892234802,-0.2533000111579895,-9.999899864196777,-1.0363999605178833,0.3055071525574,-1.174523162842,-0.806410490417,-9.999899864196777,-0.9944999814033508,-2.478300094604492,-0.150930858612,0.4957999885082245,-4.571800231933594,-6.324900150299072,-0.38530001044273376,-1.3408000469207764,-5.93179988861084,-6.693999767303467,-2.967745776367,0.877943869019,-0.05020405311584,-1.77435803833,-0.150930858612,-0.23725003004074097,-0.643268528748,1.2560999393463135,-0.1004081062317,0.4399000108242035,-0.7063000202178955,0.9154000282287598,-0.2181814033508,1.2265000343322754,-0.412440395355,0.1764581741333,-1.4758000373840332,-0.9944999814033508,-1.080299973487854,-0.643268528748,-9.999899864196777,-2.0536999702453613,-0.2181814033508,0.7487000226974487,0.02510202655792,-1.0363999605178833,-0.05020405311584,-0.7387999892234802,0.4957999885082245,-1.4758000373840332,-0.7063000202178955,0.1764581741333,-5.06279993057251,-0.643268528748,-1.4758000373840332,-0.9944999814033508,-0.2533000111579895,0.1764581741333,-0.331905722046,0.6776500344276428,0.3055071525574,-0.05020405311584,0.5827999711036682,1.2560999393463135,-0.4957999885082245,-0.38530001044273376,0.9944999814033508,-2.4323999881744385,1.126338964844,-0.9944999814033508,1.750669482422,1.080299973487854,-0.7387999892234802,-1.3408000469207764,0.612820980835,-2.0536999702453613,0.7063000202178955,-0.806410490417,-0.877943869019,-0.05020405311584,-2.967745776367,-0.877943869019,-2.0536999702453613,-1.3408000469207764,-1.3408000469207764,-0.38530001044273376,0.7063000202178955,-9.999899864196777,-0.467745776367,0.7721999883651733,0.02510202655792,1.126338964844,-6.324900150299072,-0.150930858612,-0.4399000108242035,-0.9944999814033508,-0.9944999814033508,-0.467745776367,-1.0363999605178833,-1.750669482422,1.2265000343322754,-0.877943869019,0.612820980835,-0.05020405311584,0.5827999711036682,-0.7063000202178955,-0.643268528748,-0.23725003004074097,0.02510202655792,0.412440395355,0.7721999883651733,-1.0990500450134277)
Value-c(Value,0.9944999814033508,-0.2533000111579895,1.2560999393463135,-0.2181814033508,-1.174523162842,-0.38530001044273376,-0.4399000108242035,-0.7063000202178955,-2.478300094604492,-2.4323999881744385,0.9154000282287598,-0.23725003004074097,-0.38530001044273376,-1.6448999643325806,-0.05020405311584,1.849046325684,-0.38530001044273376,-0.643268528748,-4.571800231933594,-6.693999767303467,-1.750669482422,1.080299973487854,0.412440395355,-1.3408000469207764,-5.93179988861084,-0.3585381469727,-0.643268528748,-0.412440395355,-1.0990500450134277,-0.9944999814033508,-0.806410490417)
Group-factor(c(rep('D',18),rep('C',1),rep('A',7),rep('B',34),rep('E',3),rep('F',4),rep('G',10),rep('H',2),rep('I',29),rep('J',16),rep('N',1),rep('M',1),rep('Z',2),rep('X',67),rep('O',1)))
mydata-data.frame(Group, Value)
summary(aov(Value ~Group,mydata))
TukeyHSD(aov(Value ~Group))
[/code]

Das Summary von Anova gibt folgenden Output:

[code]
Df  Sum Sq Mean Sq F value  Pr(F)  
Group14  147.47 10.5339  1.8815 0.03089 *
Residuals   181 1013.38  5.5988  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
[/code]

Dort sehe ich das der Pr(F)-Wert unter 0.05 liegt daher existiert ein 
signifikanter Unterschied. Nun folgt ein Teil der Ausgabe vom TukeyHSD:

[code]
$Group
  diff lwrupr p adj
B-A 

Re: [R] Anova und Tukey HSD

2013-05-13 Thread Meyners, Michael
Jochen,

a) this is an English-spoken mailing list; other languages are not encouraged 
nor will they typically generate a lot of replies...
b) your code is fine, so this is not an R-issue; you are rather stuck with some 
of the stats background -- you might want to see a friendly local statistician 
for advice :-) or try a stats mailing list (see e.g. the posting guide)
c) your interpretation is ok -- if the CI does not contain 0, you'd have 
significance at the respective level. Alternatively, you might also use the p 
value (column p adj) and compare that to your significance level (and it 
seems odd that any search machine might fail to give you some links on the 
relationship of p values and CIs -- I got plenty at first try)
d) In brief: you are doing ~100 (rough estimate) paired comparisons. With that, 
you can expect TukeyHSD to be pretty conservative, and it does not come as a 
surprise that you are unable to identify pairs of different groups, even less 
as the p value from the F test is not extremely small (~3%; whatever extremely 
small may mean in this case). Google also easily gets me to this site (and 
many other useful ones): http://www.pmean.com/05/TukeyTest.html which might 
provide some further guidance.

Net, tough luck to encounter such a situation, but apparently none of the group 
differences is strong enough to still show up as significant after correcting 
for multiplicity (which is certainly called for) in this case where you have an 
inflated number of significance tests. 

HTH, Michael

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Jochen Schreiber
 Sent: Montag, 13. Mai 2013 14:25
 To: r-help@r-project.org
 Subject: [R] Anova und Tukey HSD
 
 Hallo zusammen,
 
 ich mache zuerst mittels eine Anova eine Überprüfung ob in den Daten ein
 signifikanter Unterschied exisitiert. Danach will ich mittels des TukeyHSD
 rausfinden, zwischen welchen Gruppe der Unterschied vorliegt. Allerdings
 weiß ich nicht wie ich die Daten interpretieren soll. Hier erstmal mein R 
 Code:
 
 [code]
 Value-c(-0.9944999814033508,-0.3585381469727,0.7063000202178955,-
 1.77435803833,-
 1.080299973487854,0.3055071525574,1.849046325684,-
 0.412440395355,0.5827999711036682,1.750669482422,-
 6.693999767303467,-0.877943869019,-
 1.3408000469207764,1.2560999393463135,-
 0.1004081062317,1.849046325684,-
 0.331905722046,0.4957999885082245,0.877943869019,0.73879998922
 34802,0.877943869019,0.9154000282287598,0.877943869019,0.70630
 00202178955,-1.3408000469207764,0.7063000202178955,-
 0.331905722046,-1.6448999643325806,0.412440395355,-
 1.6448999643325806,-
 0.877943869019,0.7487000226974487,0.4399000108242035,1.8490463
 25684,-1.6448999643325806,-2.4323999881744385,1.2265000343322754,-
 0.4957999885082245,-9.999899864196777,-1.750669482422,-
 1.6448999643325806,-9.999899864196777,0.877943869019,-
 5.06279993057251,0.877943869019,-2.967745776367,-
 5.06279993057251,-6.693999767303467,-
 1.0990500450134277,0.9944999814033508,-0.467745776367,-
 0.3585381469727,-
 9.999899864196777,0.5827999711036682,0.7487000226974487,0.738799989223
 4802,-0.2533000111579895,-9.999899864196777,-
 1.0363999605178833,0.3055071525574,-1.174523162842,-
 0.806410490417,-9.999899864196777,-0.9944999814033508,-
 2.478300094604492,-0.150930858612,0.4957999885082245,-
 4.571800231933594,-6.324900150299072,-0.38530001044273376,-
 1.3408000469207764,-5.93179988861084,-6.693999767303467,-
 2.967745776367,0.877943869019,-0.05020405311584,-
 1.77435803833,-0.150930858612,-0.23725003004074097,-
 0.643268528748,1.2560999393463135,-
 0.1004081062317,0.4399000108242035,-
 0.7063000202178955,0.9154000282287598,-
 0.2181814033508,1.2265000343322754,-
 0.412440395355,0.1764581741333,-1.4758000373840332,-
 0.9944999814033508,-1.080299973487854,-0.643268528748,-
 9.999899864196777,-2.0536999702453613,-
 0.2181814033508,0.7487000226974487,0.02510202655792,-
 1.0363999605178833,-0.05020405311584,-
 0.7387999892234802,0.4957999885082245,-1.4758000373840332,-
 0.7063000202178955,0.1764581741333,-5.06279993057251,-
 0.643268528748,-1.4758000373840332,-0.9944999814033508,-
 0.2533000111579895,0.1764581741333,-
 0.331905722046,0.6776500344276428,0.3055071525574,-
 0.05020405311584,0.5827999711036682,1.2560999393463135,-
 0.4957999885082245,-0.38530001044273376,0.9944999814033508,-
 2.4323999881744385,1.126338964844,-
 0.9944999814033508,1.750669482422,1.080299973487854,-
 0.7387999892234802,-1.3408000469207764,0.612820980835,-
 2.0536999702453613,0.7063000202178955,-0.806410490417,-
 0.877943869019,-0.05020405311584,-2.967745776367,-
 0.877943869019,-2.0536999702453613,-1.3408000469207764,-
 1.3408000469207764,-0.38530001044273376,0.7063000202178955,-
 9.999899864196777

[R] Anova unbalanced

2013-04-17 Thread paladini

Hello everybody,
I have got a data set with about 400 companies. Each company has a 
score for its enviroment comportment between 0 and 100. These companies 
belong to  about 15 different countries. I have e.g. 70 companies from 
UK and 5 from Luxembourg,- so the data set is pretty unbalanced and I 
want to do an ANOVA. Somthing like aov(enviromentscore~country). But the 
aov function is just for a balanced design.
So I wonder if I can use fit=lm(enviromentscore~country), anova (fit) 
instead? Would this be okay or can it also only be used with balanced 
data?


Thanking you in anticipation, best regards


Claudia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova unbalanced

2013-04-17 Thread Jose Iparraguirre
Dear Claudia,

Your question has been posed on many previous occasions. 
The (short) answer has always been the same: have a look at the Anova function 
in the car package but before doing that, get a copy of John Fox's Applied 
Regression Analysis and Generalized Linear Models book.
Best,

José


José Iparraguirre
Chief Economist
Age UK

T 020 303 31482
E jose.iparragui...@ageuk.org.uk
Twitter @jose.iparraguirre@ageuk


Tavis House, 1- 6 Tavistock Square
London, WC1H 9NB
www.ageuk.org.uk | ageukblog.org.uk | @ageukcampaigns 




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of paladini
Sent: 17 April 2013 10:47
To: r-help@r-project.org
Subject: [R] Anova unbalanced

Hello everybody,
I have got a data set with about 400 companies. Each company has a 
score for its enviroment comportment between 0 and 100. These companies 
belong to  about 15 different countries. I have e.g. 70 companies from 
UK and 5 from Luxembourg,- so the data set is pretty unbalanced and I 
want to do an ANOVA. Somthing like aov(enviromentscore~country). But the 
aov function is just for a balanced design.
So I wonder if I can use fit=lm(enviromentscore~country), anova (fit) 
instead? Would this be okay or can it also only be used with balanced 
data?

Thanking you in anticipation, best regards


Claudia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Please donate to the Syria Crisis Appeal by text or online:

To donate £5 by mobile, text SYRIA to 70800.  To donate online, please visit 

http://www.ageinternational.org.uk/syria

Over one million refugees are desperately in need of water, food, healthcare, 
warm clothing, 
blankets and shelter; Age International urgently needs your support to help 
affected older refugees.


Age International is a subsidiary charity of Age UK and a member of the 
Disasters Emergency Committee (DEC).  
The DEC launches and co-ordinates national fundraising appeals for public 
donations on behalf of its member agencies.

Texts cost £5 plus one standard rate message.  Age International will receive a 
minimum of £4.96.  
More info at ageinternational.org.uk/SyriaTerms



 

---
Age UK is a registered charity and company limited by guarantee, (registered 
charity number 1128267, registered company number 6825798). 
Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.

For the purposes of promoting Age UK Insurance, Age UK is an Appointed 
Representative of Age UK Enterprises Limited, Age UK is an Introducer 
Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth 
Access for the purposes of introducing potential annuity and health 
cash plans customers respectively.  Age UK Enterprises Limited, JLT Benefit 
Solutions Limited and Simplyhealth Access are all authorised and 
regulated by the Financial Services Authority. 
--

This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are 
addressed. If you receive a message in error, please advise the sender and 
delete immediately.

Except where this email is sent in the usual course of our business, any 
opinions expressed in this email are those of the author and do not 
necessarily reflect the opinions of Age UK or its subsidiaries and associated 
companies. Age UK monitors all e-mail transmissions passing 
through its network and may block or modify mails which are deemed to be 
unsuitable.

Age Concern England (charity number 261794) and Help the Aged (charity number 
272786) and their trading and other associated companies merged 
on 1st April 2009.  Together they have formed the Age UK Group, dedicated to 
improving the lives of people in later life.  The three national 
Age Concerns in Scotland, Northern Ireland and Wales have also merged with Help 
the Aged in these nations to form three registered charities: 
Age Scotland, Age NI, Age Cymru.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova comparisons

2013-02-23 Thread Robert Zimbardo
I have several linear models on the same data:

m1 - lm(y ~ poly(x,1))
m2 - lm(y ~ poly(x,2))
m3 - lm(y ~ poly(x,3))

What I don't understand is why

anova(m1, m2, m3, test=F)

- yields the same RSS and SS values, but a different p-value from anova(m1,
m2, test=F)
- when it also yields the SAME as anova(m2, m3, test=F)

What am I missing?

Rob

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova comparisons

2013-02-23 Thread Rolf Turner

On 02/23/2013 08:55 PM, Robert Zimbardo wrote:

I have several linear models on the same data:

m1 - lm(y ~ poly(x,1))
m2 - lm(y ~ poly(x,2))
m3 - lm(y ~ poly(x,3))

What I don't understand is why

anova(m1, m2, m3, test=F)

- yields the same RSS and SS values, but a different p-value from anova(m1,
m2, test=F)
- when it also yields the SAME as anova(m2, m3, test=F)

What am I missing?


A basic understanding of the theory of linear models.  This really has 
little

to do with R.  Go and read a good intro to linear modelling.

Insofar as your question has anything to do with R:

When you do

anova(m1, m2, m3, test=F)

the mean squared error from m3 is used as the denominator of the F 
statistic.


When you do

anova(m1, m2, test=F)

the mean squared error from m2 is used as the denominator of the F 
statistic.


cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova comparisons

2013-02-23 Thread peter dalgaard

On Feb 23, 2013, at 10:06 , Rolf Turner wrote:

 
 What am I missing?
 
 A basic understanding of the theory of linear models.  This really has little
 to do with R.  Go and read a good intro to linear modelling.
 
 Insofar as your question has anything to do with R:
 
 When you do
 
anova(m1, m2, m3, test=F)
 
 the mean squared error from m3 is used as the denominator of the F statistic.
 
 When you do
 
anova(m1, m2, test=F)
 
 the mean squared error from m2 is used as the denominator of the F statistic.

Maybe a bit quick there. It's actually about (computer) conventions for ANOVA 
tables; the theory would prefer you to do what the poster expects: Test 
sequentially and update the denominator MS along the route. However, lazy 
programmers (and before them lazy practising statisticians) decided that it is 
easier to use the MS from the biggest model throughout.

(There's also a post-hoc argument, that said MS is less susceptible to small 
departures from the model assumptions, but I'd say that the main reason is 
convenience.)

Even though it is a common convention, you do need exposure to at least one 
statistical computer program or one traditional analysis of variance textbook 
to see it. 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA repeated measures and post-hoc

2012-12-30 Thread Helios de Rosario
A (late) update to this question:

On Fri Aug 17 07:33:29, Henrik Singmann wrote:
 Hi Diego,

 I am struggeling with this question also for some time and there does

 not seem to be an easy and general solution to this problem. At least
I 
 haven't found one.
 However, if you have just one repeated-measures factor, use the
solution 
 describe by me here: http://stats.stackexchange.com/a/15532/442 
 Furthermore, you might wanna check the package phia with accompanying

 vignette (but it uses multivariate tests).

There is a new version of phia that also works with mixed-effects
models - but notice the caveat about the calculation of p-values for
such models, as explained in the vignette.

For more issues about mixed-effects models in R, there is a specifice
SIG-list:
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

 For running the ANOVA you could also check my new package afex.

 Cheers,
 Henrik
 
 
 Diego Bucci schrieb:
 Hi,
  I performed an ANOVA repeated measures but I still can't find any
good news regarding the possibility to perform multiple comparisons.
  Can anyone help me?
  Thanks


INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org

  Antes de imprimir este e-mail piense bien si es necesario hacerlo.
En cumplimiento de la Ley Orgánica 15/1999 reguladora de la Protección
de Datos de Carácter Personal, le informamos de que el presente mensaje
contiene información confidencial, siendo para uso exclusivo del
destinatario arriba indicado. En caso de no ser usted el destinatario
del mismo le informamos que su recepción no le autoriza a su divulgación
o reproducción por cualquier medio, debiendo destruirlo de inmediato,
rogándole lo notifique al remitente.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova

2012-11-27 Thread humming-bird
Hi everyone,

I am new to this forum and also new to statistics and I would appreciated it
if someone would take some time to answer my question.

I am analyzing companies in regard to their leverage. I categorized the
companies into 3 groups: small, mid and large. For the group small, I have
55 debt multiples, for mid 42 and for large 72. (Unfortunately I can not
provide my data because it is confidential.)
I am now trying to find out whether the mean debt multiples are
significantly different for the 3 different groups. For this reason I
calculated an anova table with the aov function and to display the results
for each pair I did the tukey.hsd function.
Now my question: Am I allowed to use these functions given that my data is
unbalanced? Can use I read several times that aov is only valid for balanced
data? If not, is there another function that I can use?

Thank you very much for your answers.


Call:
aov(formula = Debt.Ebitdax ~ Company.Size, data = Anetdebtx2003)

Terms:
Company.Size Residuals
Sum of Squares 302.3089 926.2174
Deg. of Freedom 2 166

Residual standard error: 2.362123 
Estimated effects may be unbalanced

 summary(Anovanetdebtx2003)
Df Sum Sq Mean Sq F value Pr(F) 
Company.Size 2 302.3 151.15 27.09 6.58e-11 ***
Residuals 166 926.2 5.58 
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


 TukeyHSD(Anovanetdebtx2003)
Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = Debt.Ebitdax ~ Company.Size, data = Anetdebtx2003)

$Company.Size
diff lwr upr p adj
Mid Market Buyout-Large Buyout -1.446292 -2.530922 -0.3616617 0.0054123
Small Buyout-Large Buyout -3.112143 -4.112545 -2.1117420 0.000
Small Buyout-Mid Market Buyout -1.665852 -2.810574 -0.5211300 0.0021037



--
View this message in context: http://r.789695.n4.nabble.com/Anova-tp4650940.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova

2012-11-27 Thread Jose Iparraguirre
Hi, 

You can use the car package and choose Type-III test.
Also, have a look at the package easyanova.
Regards,
José

José Iparraguirre
Chief Economist
Age UK


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of humming-bird
Sent: 27 November 2012 10:12
To: r-help@r-project.org
Subject: [R] Anova

Hi everyone,

I am new to this forum and also new to statistics and I would appreciated it
if someone would take some time to answer my question.

I am analyzing companies in regard to their leverage. I categorized the
companies into 3 groups: small, mid and large. For the group small, I have
55 debt multiples, for mid 42 and for large 72. (Unfortunately I can not
provide my data because it is confidential.)
I am now trying to find out whether the mean debt multiples are
significantly different for the 3 different groups. For this reason I
calculated an anova table with the aov function and to display the results
for each pair I did the tukey.hsd function.
Now my question: Am I allowed to use these functions given that my data is
unbalanced? Can use I read several times that aov is only valid for balanced
data? If not, is there another function that I can use?

Thank you very much for your answers.


Call:
aov(formula = Debt.Ebitdax ~ Company.Size, data = Anetdebtx2003)

Terms:
Company.Size Residuals
Sum of Squares 302.3089 926.2174
Deg. of Freedom 2 166

Residual standard error: 2.362123 
Estimated effects may be unbalanced

 summary(Anovanetdebtx2003)
Df Sum Sq Mean Sq F value Pr(F) 
Company.Size 2 302.3 151.15 27.09 6.58e-11 ***
Residuals 166 926.2 5.58 
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


 TukeyHSD(Anovanetdebtx2003)
Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = Debt.Ebitdax ~ Company.Size, data = Anetdebtx2003)

$Company.Size
diff lwr upr p adj
Mid Market Buyout-Large Buyout -1.446292 -2.530922 -0.3616617 0.0054123
Small Buyout-Large Buyout -3.112143 -4.112545 -2.1117420 0.000
Small Buyout-Mid Market Buyout -1.665852 -2.810574 -0.5211300 0.0021037



--
View this message in context: http://r.789695.n4.nabble.com/Anova-tp4650940.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

A Star for Christmas

Kick start the festive season by attending one of Age UK’s Carol Concerts, A 
Star for Christmas. Taking place at Manchester Cathedral on Saturday 1 December 
and London’s St Pancras Church (opposite Euston Station) on Thursday 6 
December, they will feature special musical performances, readings by your 
favourite celebrities and carols, followed by mince pies and wine.
Tickets are priced at £20 full price/ £10 concessions. For more information, 
please visit http://www.ageuk.org.uk/astarforchristmas  or contact the 
Fundraising Events Team on
020 303 31725.

Age UK  Improving later life
www.ageuk.org.uk

 

---
Age UK is a registered charity and company limited by guarantee, (registered 
charity number 1128267, registered company number 6825798). 
Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.

For the purposes of promoting Age UK Insurance, Age UK is an Appointed 
Representative of Age UK Enterprises Limited, Age UK is an Introducer 
Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth 
Access for the purposes of introducing potential annuity and health 
cash plans customers respectively.  Age UK Enterprises Limited, JLT Benefit 
Solutions Limited and Simplyhealth Access are all authorised and 
regulated by the Financial Services Authority. 
--

This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are 
addressed. If you receive a message in error, please advise the sender and 
delete immediately.

Except where this email is sent in the usual course of our business, any 
opinions expressed in this email are those of the author and do not 
necessarily reflect the opinions of Age UK or its subsidiaries and associated 
companies. Age UK monitors all e-mail transmissions passing 
through its network and may block or modify mails which are deemed to be 
unsuitable.

Age Concern England (charity number 261794) and Help the Aged (charity number 
272786) and their trading and other associated companies merged 
on 1st April 2009.  Together they have formed the Age UK Group, dedicated to 
improving the lives of people in later life.  The three national 
Age Concerns in Scotland, Northern Ireland and Wales have also merged with Help 
the Aged in these nations to form three registered charities: 
Age Scotland, Age NI, Age

Re: [R] Anova

2012-11-27 Thread S Ellison
 Now my question: Am I allowed to use these functions given 
 that my data is unbalanced? 

Unusually, Yes, assuming all the other requisite assumptions are reasonably 
well satisfied. One-way ANOVA interpretation is not much affected by imbalance 
because (among other things) with only one factor there is no 'model order' 
dependence or hypothesis separation to worry about. You don't need to use 
alternate sum-of-squares calculations (eg type II and III as in car) because 
types I, II and III sums of squares are all the same in a one-way case.

In a two-factor or higher analysis it's a (very) different story. 

However, one-way anova is still adversely affected by failures in homogeneity 
of variance, non-normality etc. 

S Ellison

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova test for variables with different lengths

2012-10-16 Thread Jose Iparraguirre
Hi Sachin,

You may find this tutorial useful: 
http://goanna.cs.rmit.edu.au/~fscholer/anova.php
And you'll need the car package; but become yourself familiar with Type I, II 
and III sums of squares models before running the Anova; the tutorial explains 
these in detail.
Hope it helps.

José


José Iparraguirre
Chief Economist
Age UK

T 020 303 31482
E jose.iparragui...@ageuk.org.uk
Twitter @jose.iparraguirre@ageuk


Tavis House, 1- 6 Tavistock Square
London, WC1H 9NB
www.ageuk.org.uk | ageukblog.org.uk | @ageukcampaigns 


For a copy of our new Economic Monitor and the full Chief Economist's report, 
visit the Age UK Knowledge Hub 
http://www.ageuk.org.uk/professional-resources-home/knowledge-hub-evidence-statistics/


For evidence and statistics on the older population, visit the Age UK Knowledge 
Hub 
http://www.ageuk.org.uk/professional-resources-home/knowledge-hub-evidence-statistics/


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Sachinthaka Abeywardana
Sent: 16 October 2012 04:18
To: r-help@r-project.org
Subject: [R] anova test for variables with different lengths

Hi all,

I want to test whether the MEAN of two different variables, (and
different number of observations) are the same. I am trying to use the
anova test but it doesn't seem to like that the number of observations
are different:

a=c(1:5)
b=c(1:3)
aov_test=aov(a~b)
Error in model.frame.default(formula = a ~ b, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'b')

Any ideas as to how I would go about doing this test?

Thanks,
Sachin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Age UK and YouthNet are official charities for the Virgin London Marathon 2013

We need you to Run for it. Join the team and help raise vital funds to bring 
generations together to combat loneliness and isolation.

Go to http://www.runforit.org.uk for more information or contact Helen Parson 
at helen.pars...@ageuk.org.uk or on 020 303 31369.

Age UK and YouthNet. A lifeline, online.

www.runforit.org.uk



Age UK Improving later life

www.ageuk.org.uk



---
Age UK is a registered charity and company limited by guarantee, (registered 
charity number 1128267, registered company number 6825798). 
Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.

For the purposes of promoting Age UK Insurance, Age UK is an Appointed 
Representative of Age UK Enterprises Limited, Age UK is an Introducer 
Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth 
Access for the purposes of introducing potential annuity and health 
cash plans customers respectively.  Age UK Enterprises Limited, JLT Benefit 
Solutions Limited and Simplyhealth Access are all authorised and 
regulated by the Financial Services Authority. 
--

This email and any files transmitted with it are confide...{{dropped:28}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova test for variables with different lengths

2012-10-16 Thread S Ellison
 

 I want to test whether the MEAN of two different variables, 
 (and different number of observations) are the same. I am 
 trying to use the anova test but it doesn't seem to like that 
 the number of observations are different:
 
 a=c(1:5)
 b=c(1:3)
 aov_test=aov(a~b)
 Error in model.frame.default(formula = a ~ b, 
 drop.unused.levels = TRUE) :
   variable lengths differ (found for 'b')
 
 -Original Message-
 You may find this tutorial useful: 
 http://goanna.cs.rmit.edu.au/~fscholer/anova.php
 And you'll need the car package; but become yourself familiar 
 with Type I, II and III sums of squares models before running 
 the Anova; the tutorial explains these in detail.
 Hope it helps.

Sadly, I doubt that it will, though it would be good advice if the OP had got 
as far as formulating the model correctly. 

But they haven't. The OP has tried to model a variable of length 5 using a 
predictor of length 3. (In fact what they've just done is a simple linear 
regression of variables with different length). This will not work, no  matter 
what the OP does about types of SS. 

First, a t test would do this job, assuming normality - though incidentally the 
variances differ so the default t.test will return a somewhat different result 
to anova, which effectively assumes equal variance by default.

Second, to use aov correctly, read ?formula and look at the examples for this 
and ?lm

Then, if you want to get the same result as an equal variance t test using 
ANOVA, you'd have to concatenate the two groups and then model with a predictor 
indicating the groups. In this instance

y - c(a, b)
g - factor( rep( letters[1:2], c(length(a), length(b) ) ), )
summary( aov(y~g) )

Since this is a one way problem the type of SS won't matter, but in other cases 
it would be crucial to at least understand why - and to what extent - anova can 
be  unsafe* on unbalanced data.


S Ellison

*unsafe reads as actively dangerous in this context.

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova, F statistics, P-values

2012-10-06 Thread Jhope
Thank you for the replies. 

I am actually trying to gain p-values and f values, and tried the below
script but unsuccessful.

1) I have read in another forum to use the package lmer but apparently it
does not exist. 
2) Then I tried the pvals.fnc but it is not a function. 
3) I read also that a glm model must first be created but was not able to
gain a P-value through summary but got F statistics. 
4) Is there a way to get the P-value and F statistics? Or does this require
two different executions? 
5) I am afraid to update my R version because I may loose my ALL saved
scripts, should I do so? 

Please advise, Jean


 install.packages(lmer)
Installing package(s) into
‘/Library/Frameworks/R.framework/Versions/2.13/Resources/library’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘lmer’ is not available (for R version 2.13.1)

 pvals.fnc(HSuccess ~ VegIndex, data = data.to.analyze)
Error: could not find function pvals.fnc
 Model1.glm - glm(cbind(Shells, TotalEggs-Shells) ~ HTL,
 data=data.to.analyze, family = binomial)
 summary(Model1.glm)




--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-tp4645130p4645242.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova

2012-10-05 Thread Jhope
Hi R-listers, 

I am trying to do an ANOVA for the following scatterplot and received the
following error:

library(car)
scatterplot(HSuccess ~ Veg, 
data = data.to.analyze,
xlab = Vegetation border (m), 
ylab = Hatching success (%))

anova(HSuccess ~ Veg, data=data.to.analyze)

Error in UseMethod(anova) : 
  no applicable method for 'anova' applied to an object of class formula

I am wondering if there is a better way to do this?
Please advise, Jean




--
View this message in context: http://r.789695.n4.nabble.com/Anova-tp4645130.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova

2012-10-05 Thread Rui Barradas

Hello,

You're thinking of ?aov. anova() does _not_ have a formula interface, it 
would be


anova(lm(HSuccess ~ Veg, data = data.to.analyze))

or

aov(HSuccess ~ Veg, data = data.to.analyze)

Hope this helps,

Rui Barradas
Em 05-10-2012 09:27, Jhope escreveu:

Hi R-listers,

I am trying to do an ANOVA for the following scatterplot and received the
following error:

library(car)
scatterplot(HSuccess ~ Veg,
 data = data.to.analyze,
 xlab = Vegetation border (m),
 ylab = Hatching success (%))

anova(HSuccess ~ Veg, data=data.to.analyze)

Error in UseMethod(anova) :
   no applicable method for 'anova' applied to an object of class formula

I am wondering if there is a better way to do this?
Please advise, Jean




--
View this message in context: http://r.789695.n4.nabble.com/Anova-tp4645130.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova

2012-10-05 Thread John Fox
Dear Jean,

On Fri, 5 Oct 2012 01:27:55 -0700 (PDT)
 Jhope jeanwaij...@gmail.com wrote:
 Hi R-listers, 
 
 I am trying to do an ANOVA for the following scatterplot and received the
 following error:
 
 library(car)
 scatterplot(HSuccess ~ Veg, 
 data = data.to.analyze,
 xlab = Vegetation border (m), 
 ylab = Hatching success (%))
 
 anova(HSuccess ~ Veg, data=data.to.analyze)
 
 Error in UseMethod(anova) : 
   no applicable method for 'anova' applied to an object of class formula
 
 I am wondering if there is a better way to do this?

anova() needs a model object, not a formula, as its first argument:

anova(lm(HSuccess ~ Veg, data=data.to.analyze))

Alternatively, you can use aov(), with summary(), to get the ANOVA table:

summary(aov(HSuccess ~ Veg, data=data.to.analyze))

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

 Please advise, Jean
 
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Anova-tp4645130.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-30 Thread arun
Hi,

I got your data from Nabble.  I meant to do it by dput() because the .csv files 
may not get through the R-help mailing list.

This is what I got:
dat1-read.delim(typohne.csv)
 str(dat1)
#'data.frame':    60 obs. of  3 variables:
# $ typ : Factor w/ 5 levels A,B,C,D,..: 1 2 3 4 5 1 2 3 4 5 ...
# $ abun: int  7 3 14 8 15 5 21 19 9 21 ...
# $ div : int  7 3 11 6 14 5 12 15 8 12 ...

summary(fm1-aov(abun~typ,data=dat1))
#    Df Sum Sq Mean Sq F value Pr(F)  
#typ  4    847  211.69   3.654 0.0104 *
#Residuals   55   3186   57.93 
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 myresults-TukeyHSD(fm1,typ,ordered=TRUE)
 myresults
#  Tukey multiple comparisons of means
#    95% family-wise confidence level
#    factor levels have been ordered

#Fit: aov(formula = abun ~ typ, data = dat1)
#
#$typ
# diff    lwr  upr p adj
#B-A  6.50 -2.2633699 15.26337 0.2382355
#D-A  6.75 -2.0133699 15.51337 0.2055350
#E-A  9.08  0.3199635 17.84670 0.0386441
#C-A 11.17  2.4032968 19.93004 0.0060292
#D-B  0.25 -8.5133699  9.01337 0.902
#E-B  2.58 -6.1800365 11.34670 0.9197106
#C-B  4.67 -4.0967032 13.43004 0.5657010
#E-D  2.33 -6.4300365 11.09670 0.9432637
#C-D  4.416667 -4.3467032 13.18004 0.6167517
#C-E  2.08 -6.6800365 10.84670 0.9619249
library(agricolae)
 HSD.test(fm1,typ,group=TRUE)
#Study:
#HSD Test for abun 
#Mean Square Error:  57.92879 
#typ,  means
#  abun   std.err replication
#A  6.25000 0.6976693  12
#B 12.75000 2.4061947  12
#C 17.41667 2.9062585  12
#D 13.0 2.6198543  12
#E 15.3 1.5970301  12
#alpha: 0.05 ; Df Error: 55 
#Critical Value of Studentized Range: 3.988545 
#Honestly Significant Difference: 8.76337 
#Means with the same letter are not significantly different.
#Groups, Treatments and means
#a      C      17.41667 
#a      E      15.3 
#ab      D      13 
#ab      B      12.75 
# b      A      6.25 


I am not sure why you didn't get the results as above.  Check str() to see 
whether the str() I got matches with yours.

Have a great day!
A.K.

    











- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 10:35 AM
Subject: Re: [R] Anova and tukey-grouping

Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, typ, group=TRUE)
Name:  typ 
ds.typabunmit$typ 

I don't get the values...!?!?



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-30 Thread Landi
typohne.csv http://r.789695.n4.nabble.com/file/n4644635/typohne.csv  

Hello arun kirshna,

I checked str() but I do not see any mistake.
Find attached a subset of my data (did you mean this with dput() ?)
Typ= treatment
abun= abundance
div= diversity, number of species

best wishes.



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644635.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-30 Thread Landi
Hello again !

It worked exactly the same way as yours!
I'm kind of astonished, because in my opinion our skripts are the same
(obviously not...).
Whatever! It works :-D

Tomorrow (as where I live it's already evening...sunday) will be a great
day, indeed, and I can go on with analysis !
I really appreciate the conversation with you and your support!



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644642.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova and tukey-grouping

2012-09-28 Thread Landi
Hello,

I am really new to R and it's still a challenge to me.
Currently I'm working on my Master's Thesis. My supervisor works with SAS
and is not familiar with R at all.

I want to run an Anova, a tukey-test and as a result I want to have the
tukey-grouping ( something like A - AB - B)

I came across the HSD.test in the agricolae-package, but... unfortunately I
do not get an output (like here in the answer
http://stats.stackexchange.com/questions/31547/how-to-obtain-the-results-of-a-tukey-hsd-post-hoc-test-in-a-table-showing-groupe
)

I did it like this:

##   ANOVA
anova.typabunmit-aov(ds.typabunmit$abun ~ ds.typabunmit$typ)
summary(anova.typabunmit)
summary.lm(anova.typabunmit)

## post HOC
tukey.typabunmit-TukeyHSD(anova.typabunmit)
tukey.typabunmit

## HSD
HSD.test(anova.typabunmit, abun, group=TRUE)



and the ONLY output is this:
Name:  abun 
 ds.typabunmit$typ 


I would be very pleased about some ides..:!





--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-28 Thread arun
HI,

I guess there is a mistake in your code.  You should have used typ instead of 
abun as abun is the dependent variable.
summary(fm1 - aov(breaks ~ wool + tension, data = warpbreaks))
myresults    -  TukeyHSD(fm1, tension, ordered = TRUE)
library(agricolae)

HSD.test(fm1,wool,group=TRUE)
#Study:
#HSD Test for breaks 
#Mean Square Error:  134.9578 
#wool,  means
#    breaks  std.err replication
#A 31.03704 3.050609  27
#B 25.25926 1.789963  27
#alpha: 0.05 ; Df Error: 50 
#Critical Value of Studentized Range: 2.840532 
#Honestly Significant Difference: 6.350628 
#Means with the same letter are not significantly different.
#Groups, Treatments and means
#a      A      31.037037037037 
#a      B      25.2592592592593 

 

A.K.



- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 5:41 AM
Subject: [R] Anova and tukey-grouping

Hello,

I am really new to R and it's still a challenge to me.
Currently I'm working on my Master's Thesis. My supervisor works with SAS
and is not familiar with R at all.

I want to run an Anova, a tukey-test and as a result I want to have the
tukey-grouping ( something like A - AB - B)

I came across the HSD.test in the agricolae-package, but... unfortunately I
do not get an output (like here in the answer
http://stats.stackexchange.com/questions/31547/how-to-obtain-the-results-of-a-tukey-hsd-post-hoc-test-in-a-table-showing-groupe
)

I did it like this:

##   ANOVA
anova.typabunmit-aov(ds.typabunmit$abun ~ ds.typabunmit$typ)
summary(anova.typabunmit)
summary.lm(anova.typabunmit)

## post HOC
tukey.typabunmit-TukeyHSD(anova.typabunmit)
tukey.typabunmit

## HSD
HSD.test(anova.typabunmit, abun, group=TRUE)



and the ONLY output is this:
Name:  abun 
ds.typabunmit$typ 


I would be very pleased about some ides..:!





--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-28 Thread Landi
Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, typ, group=TRUE)
Name:  typ 
 ds.typabunmit$typ 

I don't get the values...!?!?



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-28 Thread arun
Hi,

As I mentioned earlier, these are just guess work until you provide a subset of 
your data with dput().  Also, please check the structure of the data with str().

A.K.  






- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 10:35 AM
Subject: Re: [R] Anova and tukey-grouping

Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, typ, group=TRUE)
Name:  typ 
ds.typabunmit$typ 

I don't get the values...!?!?



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova problem with order of terms in model

2012-09-26 Thread Modi2020
Please check if your independent variables or Xs are independent.
If they are not that can affect the sequential decomposition. 



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-problem-with-order-of-terms-in-model-tp791744p4644320.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova for two mixture model objects

2012-08-23 Thread Maria Manuel Angélico
Hi,

I´m having problems trying to compare models obtained using the mixture model 
fitting function “mix”

 

model1- mix(data, data.par1,dist=norm)

model2- mix(data, data.par2,dist=norm)

 

anova(model1, model2)

 

When the number of parameters estimated for the two models is different I get 
an error message: 

“Error in anova.mix(model1, model2) : 

 binary operation on non-conformable arrays”

 

When both models have the same number of parameters estimated I get a warning 
saying that the models should be nested:

“Warning message:

In anova.mix(model1, model2) : The models are not nested”

 



I wonder if anyone can help

Thanks in advance

Maria



 

Maria Manuel Angélico 
IPMA, Portuguese Institute for Sea and Atmosphere
Lisboa , Portugal 
tel: ++ 351 213027068 
angel...@ipimar.pt

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA repeated measures and post-hoc

2012-08-16 Thread Robert Baer
Check out:
http://rtutorialseries.blogspot.com/2011/02/r-tutorial-series-two-way-repeated.html
On 8/15/2012 11:32 AM, Diego Bucci wrote:
 Hi,
 I performed an ANOVA repeated measures but I still can't find any good news 
 regarding the possibility to perform multiple comparisons.
 Can anyone help me?
 Thanks

 Diego Bucci
 Fisiologia Veterinaria
 Dipartimento di Scienze Mediche Veterinarie
 Università degli Studi di Bologna
 Via Tolara di Sopra, 50
 40064 Ozzano dell'Emilia, BO
 Tel. 00390512097904
 mail diego.buc...@unibo.it


   [[alternative HTML version deleted]]



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA repeated measures and post-hoc

2012-08-16 Thread Henrik Singmann

Hi Diego,

I am struggeling with this question also for some time and there does 
not seem to be an easy and general solution to this problem. At least I 
haven't found one.
However, if you have just one repeated-measures factor, use the solution 
describe by me here: http://stats.stackexchange.com/a/15532/442
Furthermore, you might wanna check the package phia with accompanying 
vignette (but it uses multivariate tests).

For running the ANOVA you could also check my new package afex.

Cheers,
Henrik


Diego Bucci schrieb:

Hi,
I performed an ANOVA repeated measures but I still can't find any good news 
regarding the possibility to perform multiple comparisons.
Can anyone help me?
Thanks

Diego Bucci
Fisiologia Veterinaria
Dipartimento di Scienze Mediche Veterinarie
Università degli Studi di Bologna
Via Tolara di Sopra, 50
40064 Ozzano dell'Emilia, BO
Tel. 00390512097904
mail diego.buc...@unibo.it


[[alternative HTML version deleted]]





--
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA repeated measures and post-hoc

2012-08-15 Thread Diego Bucci
Hi,
I performed an ANOVA repeated measures but I still can't find any good news 
regarding the possibility to perform multiple comparisons.
Can anyone help me?
Thanks

Diego Bucci
Fisiologia Veterinaria
Dipartimento di Scienze Mediche Veterinarie
Università degli Studi di Bologna
Via Tolara di Sopra, 50
40064 Ozzano dell'Emilia, BO
Tel. 00390512097904
mail diego.buc...@unibo.it


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova in unbalanced data

2012-08-14 Thread S Ellison

 

 -Original Message-
 Say I have the following data:
 
 a-data.frame(col1=c(rep(a,5),rep(b,7)),col2=runif(12))
 
 a_aov-aov(a$col2~a$col1)
 
 summary(aov)
 
 
 Note that there are 5 observations for a and 7 for b, thus is 
 unbalanced. What would be the correct way of doing anova for this set?
 

As this is a single factor case, the imbalance doesn't affect the 
interpretation. For two-way and higher models, it would affect the 
interpretation, and john fox's post (and a very large literature) then applies. 
But here, the usual variants and contrast choices will all return the same 
p-value, so aov works, as does 
anova(lm(col2~col1, data=a)) #note that the 'data' argument also works in aov


S

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova in unbalanced data

2012-08-13 Thread Sachinthaka Abeywardana
Hi all,

Say I have the following data:

a-data.frame(col1=c(rep(a,5),rep(b,7)),col2=runif(12))

a_aov-aov(a$col2~a$col1)

summary(aov)


Note that there are 5 observations for a and 7 for b, thus is
unbalanced. What would be the correct way of doing anova for this set?


Thanks,

Sachin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova in unbalanced data

2012-08-13 Thread arun
HI,

Check this link:

https://stat.ethz.ch/pipermail/r-help/2011-April/273858.html

A.K.



- Original Message -
From: Sachinthaka Abeywardana sachin.abeyward...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Monday, August 13, 2012 10:09 PM
Subject: [R] anova in unbalanced data

Hi all,

Say I have the following data:

a-data.frame(col1=c(rep(a,5),rep(b,7)),col2=runif(12))

a_aov-aov(a$col2~a$col1)

summary(aov)


Note that there are 5 observations for a and 7 for b, thus is
unbalanced. What would be the correct way of doing anova for this set?


Thanks,

Sachin

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Anova Type II and Contrasts

2012-07-06 Thread mails
the study design of the data I have to analyse is simple. There is 1 control 
group (CTRL) and 2 different treatment groups (TREAT_1 and TREAT_2). 
The data also includes 2 covariates COV1 and COV2. I have been asked to check 
if there is a linear or quadratic treatment effect in the data. 

I created a dummy data set to explain my situation: 

df1 - data.frame( 

Observation = c(rep(CTRL,15), rep(TREAT_1,13), rep(TREAT_2, 12)), 

COV1 = c(rep(A1, 30), rep(A2, 10)), 

COV2 = c(rep(B1, 5), rep(B2, 5), rep(B3, 10), rep(B1, 5), rep(B2, 5), 
rep(B3, 10)), 

Variable = c(3944133, 3632461, 3351754, 3655975, 3487722, 3644783, 3491138, 
3328894, 
3654507, 3465627, 3511446, 3507249, 3373233, 3432867, 
3640888, 

3677593, 3585096, 3441775, 3608574, 3669114, 4000812, 
3503511, 3423968, 
3647391, 3584604, 3548256, 3505411, 3665138, 

   4049955, 3425512, 3834061, 3639699, 3522208, 3711928, 
3576597, 3786781, 
   3591042, 3995802, 3493091, 3674475) 
) 

plot(Variable ~ Observation, data = df1) 

As you can see from the plot there is a linear relationship between the control 
and the treatment groups. To check if this linear effect is statistical 
significant I change the contrasts using the contr.poly() function and fit a 
linear model like this: 

contrasts(df1$Observation) - contr.poly(levels(df1$Observation)) 
lm1 - lm(log(Variable) ~ Observation, data = df1) 
summary.lm(lm1) 

From the summary we can see that the linear effect is statistically 
significant: 

Observation.L  0.029141   0.0123772.3550.024 *   
Observation.Q  0.002233   0.0124820.1790.859   

However, this first model does not include any of the two covariates. Including 
them results in a non-significant p-value for the linear relationship: 

lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1) 
summary.lm(lm2) 

Observation.L  0.041160.02624   1.5680.126 
Observation.Q  0.010030.01894   0.5300.600 
COV1A2-0.012030.04202  -0.2860.776 
COV2B2-0.020710.02202  -0.9410.354 
COV2B3-0.020830.02066  -1.0080.320   

So far so good. However, I have been told to conduct a Type II Anova rather 
than a Type I. To conduct a Type II Anova I used the Anova() function 
 from the car package. 

Anova(lm2, type=II) 

Anova Table (Type II tests) 

Response: log(Variable) 
Sum Sq Df F value Pr(F) 
Observation 0.006253  2  1.4651 0.2453 
COV1  0.000175  1  0.0820 0.7763 
COV2  0.002768  2  0.6485 0.5292 
Residuals  0.072555 34 

The problem here with using Type II is that you do not get a p-value for the 
linear and quadratic effect. 
So I do not know if the treatment effect is statistically linear and or 
quadratic. 

I found out that the following code produces the same p-value for Observation 
as the Anova() function. However, the result also does not include 
any p-values for the linear or quadratic effect: 

lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1) 
lm3 - lm(log(Variable) ~ COV1 + COV2, data = df1) 
anova(lm2, lm3) 


Does anybody know how to conduct a Type II anova and the contrasts function to 
obtain the p-values for the linear and quadratic effects? 

Help would be very much appreciated. 

Best Peter
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova Type II and Contrasts

2012-07-06 Thread David Winsemius

Dear Peter;

This is an exact duplicate of a question posted on SO. Cross-posting  
is deprecated on Rhelp.


--
David.

On Jul 6, 2012, at 11:06 AM, mails wrote:

the study design of the data I have to analyse is simple. There is 1  
control group (CTRL) and 2 different treatment groups (TREAT_1 and  
TREAT_2).
The data also includes 2 covariates COV1 and COV2. I have been asked  
to check if there is a linear or quadratic treatment effect in the  
data.


I created a dummy data set to explain my situation:

df1 - data.frame(

Observation = c(rep(CTRL,15), rep(TREAT_1,13), rep(TREAT_2,  
12)),


COV1 = c(rep(A1, 30), rep(A2, 10)),

COV2 = c(rep(B1, 5), rep(B2, 5), rep(B3, 10), rep(B1, 5),  
rep(B2, 5), rep(B3, 10)),


Variable = c(3944133, 3632461, 3351754, 3655975, 3487722, 3644783,  
3491138, 3328894,
   3654507, 3465627, 3511446, 3507249, 3373233,  
3432867, 3640888,


   3677593, 3585096, 3441775, 3608574, 3669114,  
4000812, 3503511, 3423968,

   3647391, 3584604, 3548256, 3505411, 3665138,

  4049955, 3425512, 3834061, 3639699, 3522208,  
3711928, 3576597, 3786781,

  3591042, 3995802, 3493091, 3674475)
)

plot(Variable ~ Observation, data = df1)

As you can see from the plot there is a linear relationship between  
the control and the treatment groups. To check if this linear effect  
is statistical
significant I change the contrasts using the contr.poly() function  
and fit a linear model like this:


contrasts(df1$Observation) - contr.poly(levels(df1$Observation))
lm1 - lm(log(Variable) ~ Observation, data = df1)
summary.lm(lm1)

From the summary we can see that the linear effect is statistically  
significant:


Observation.L  0.029141   0.0123772.3550.024 *
Observation.Q  0.002233   0.0124820.1790.859

However, this first model does not include any of the two  
covariates. Including them results in a non-significant p-value for  
the linear relationship:


lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
summary.lm(lm2)

Observation.L  0.041160.02624   1.5680.126
Observation.Q  0.010030.01894   0.5300.600
COV1A2-0.012030.04202  -0.2860.776
COV2B2-0.020710.02202  -0.9410.354
COV2B3-0.020830.02066  -1.0080.320

So far so good. However, I have been told to conduct a Type II Anova  
rather than a Type I. To conduct a Type II Anova I used the Anova()  
function

from the car package.

Anova(lm2, type=II)

Anova Table (Type II tests)

Response: log(Variable)
   Sum Sq Df F value Pr(F)
Observation 0.006253  2  1.4651 0.2453
COV1  0.000175  1  0.0820 0.7763
COV2  0.002768  2  0.6485 0.5292
Residuals  0.072555 34

The problem here with using Type II is that you do not get a p-value  
for the linear and quadratic effect.
So I do not know if the treatment effect is statistically linear and  
or quadratic.


I found out that the following code produces the same p-value for  
Observation as the Anova() function. However, the result also does  
not include

any p-values for the linear or quadratic effect:

lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
lm3 - lm(log(Variable) ~ COV1 + COV2, data = df1)
anova(lm2, lm3)


Does anybody know how to conduct a Type II anova and the contrasts  
function to obtain the p-values for the linear and quadratic effects?


Help would be very much appreciated.

Best Peter
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova Type II and Contrasts

2012-07-06 Thread John Fox
Dear Peter,

Because your model is additive, type-II and type-III tests are identical, 
and the t-tests for the linear and quadratic coefficients are interpretable.

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Fri, 6 Jul 2012 16:06:26 +0100
 mails mails00...@gmail.com wrote:
 the study design of the data I have to analyse is simple. There is 1 control 
 group (CTRL) and 2 different treatment groups (TREAT_1 and TREAT_2). 
 The data also includes 2 covariates COV1 and COV2. I have been asked to check 
 if there is a linear or quadratic treatment effect in the data. 
 
 I created a dummy data set to explain my situation: 
 
 df1 - data.frame( 
 
 Observation = c(rep(CTRL,15), rep(TREAT_1,13), rep(TREAT_2, 12)), 
 
 COV1 = c(rep(A1, 30), rep(A2, 10)), 
 
 COV2 = c(rep(B1, 5), rep(B2, 5), rep(B3, 10), rep(B1, 5), rep(B2, 
 5), rep(B3, 10)), 
 
 Variable = c(3944133, 3632461, 3351754, 3655975, 3487722, 3644783, 3491138, 
 3328894, 
 3654507, 3465627, 3511446, 3507249, 3373233, 3432867, 
 3640888, 
 
 3677593, 3585096, 3441775, 3608574, 3669114, 4000812, 
 3503511, 3423968, 
 3647391, 3584604, 3548256, 3505411, 3665138, 
 
4049955, 3425512, 3834061, 3639699, 3522208, 3711928, 
 3576597, 3786781, 
3591042, 3995802, 3493091, 3674475) 
 ) 
 
 plot(Variable ~ Observation, data = df1) 
 
 As you can see from the plot there is a linear relationship between the 
 control and the treatment groups. To check if this linear effect is 
 statistical 
 significant I change the contrasts using the contr.poly() function and fit a 
 linear model like this: 
 
 contrasts(df1$Observation) - contr.poly(levels(df1$Observation)) 
 lm1 - lm(log(Variable) ~ Observation, data = df1) 
 summary.lm(lm1) 
 
 From the summary we can see that the linear effect is statistically 
 significant: 
 
 Observation.L  0.029141   0.0123772.3550.024 *   
 Observation.Q  0.002233   0.0124820.1790.859   
 
 However, this first model does not include any of the two covariates. 
 Including them results in a non-significant p-value for the linear 
 relationship: 
 
 lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1) 
 summary.lm(lm2) 
 
 Observation.L  0.041160.02624   1.5680.126 
 Observation.Q  0.010030.01894   0.5300.600 
 COV1A2-0.012030.04202  -0.2860.776 
 COV2B2-0.020710.02202  -0.9410.354 
 COV2B3-0.020830.02066  -1.0080.320   
 
 So far so good. However, I have been told to conduct a Type II Anova rather 
 than a Type I. To conduct a Type II Anova I used the Anova() function 
  from the car package. 
 
 Anova(lm2, type=II) 
 
 Anova Table (Type II tests) 
 
 Response: log(Variable) 
 Sum Sq Df F value Pr(F) 
 Observation 0.006253  2  1.4651 0.2453 
 COV1  0.000175  1  0.0820 0.7763 
 COV2  0.002768  2  0.6485 0.5292 
 Residuals  0.072555 34 
 
 The problem here with using Type II is that you do not get a p-value for the 
 linear and quadratic effect. 
 So I do not know if the treatment effect is statistically linear and or 
 quadratic. 
 
 I found out that the following code produces the same p-value for Observation 
 as the Anova() function. However, the result also does not include 
 any p-values for the linear or quadratic effect: 
 
 lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1) 
 lm3 - lm(log(Variable) ~ COV1 + COV2, data = df1) 
 anova(lm2, lm3) 
 
 
 Does anybody know how to conduct a Type II anova and the contrasts function 
 to obtain the p-values for the linear and quadratic effects? 
 
 Help would be very much appreciated. 
 
 Best Peter
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA help

2012-06-20 Thread James Johnson
Hi All,

I have a microarray dataset as follows:

  expt1 expt2 expt3  expt4 expt 5 

gene1    val  val val  val val

gene2    val  val val  val    val

.
.
..
gene15000   val   val val  val val


The result is from the same organism in four different experiments.  Also, 
there are 4 replicates of each experiment. My aim was to find genes that are 
statistically significant across the four experiments. I carried out one-way 
anova as follows:


sTest-read.table(myData.dat,header = T, row.names = 1)
group - gl(4,4,16, label=c(Glucose,Citrate, Tide,Dawn))

gm1 - (x~group, data=sTest)

I received error messages that x is unknown, and did not know how to go 
further from here.

Please help!!!


James

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA help

2012-06-20 Thread David Winsemius


On Jun 19, 2012, at 5:36 PM, James Johnson wrote:


Hi All,

I have a microarray dataset as follows:

  expt1 expt2 expt3  expt4 expt 5

gene1val  val val  val val

gene2val  val val  valval

.
.
..
gene15000   val   val val  val val


The result is from the same organism in four different experiments.   
Also, there are 4 replicates of each experiment. My aim was to find  
genes that are statistically significant across the four  
experiments. I carried out one-way anova as follows:



sTest-read.table(myData.dat,header = T, row.names = 1)
group - gl(4,4,16, label=c(Glucose,Citrate, Tide,Dawn))

gm1 - (x~group, data=sTest)

I received error messages that x is unknown, and did not know how  
to go further from here.




This appears to be a college homework assignment and as such is quite  
possibily inappropriate submission to Rhelp. If not, you should post  
the results of str(sTest).


You should also read the Posting Guide and learn to post in plain text.


PLEASE do read the posting guide http://www.R-project.org/posting-guide.html



and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova of lme objects (model1, model2) gives different results depending on order of models

2012-06-01 Thread Chris Beeley

Well that's that cleared up then. Thanks to all.

Chris B.

On 31/05/2012 17:51, Albyn Jones wrote:

No, both yield the same result: reject the null hypothesis,
which always corresponds to the restricted (smaller) model.

albyn

On Thu, May 31, 2012 at 12:47:30PM +0100, Chris Beeley wrote:

Hello-

I understand that it's convention, when comparing two models using
the anova function anova(model1, model2), to put the more
complicated (for want of a better word) model as the second model.
However, I'm using lme in the nlme package and I've found that the
order of the models actually gives opposite results. I'm not sure if
this is supposed to be the case or if I have missed something
important, and I can't find anything in the Pinheiro and Bates book
or in ?anova, or in Google for that matter which unfortunately only
returns results about ANOVA which isn't much help. I'm using the
latest version of R and nlme, just checked both.

Here is the code and output:


PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit)

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit,
+  correlation=corAR1(form=~Date|Case))


anova(PHQmodel1, PHQmodel2) # accept model 2

 Model df  AIC  BIClogLik   Test
L.Ratio p-value
PHQmodel1 1  8 48784.57 48840.43 -24384.28
PHQmodel2 2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926.0001


PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit,
+  correlation=corAR1(form=~Date|Case))

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit)


anova(PHQmodel1, PHQmodel2) # accept model 2

  Model df  AIC  BIClogLik   Test
L.Ratio p-value
PHQmodel1 1  9 48284.68 48347.51 -24133.34
PHQmodel2 2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926.0001

In both cases I am led to accept model 2 even though they are
opposite models. Is it really just that you have to put them in the
right order? It just seems like if there were say four models you
wouldn't necessarily be able to determine the correct order.

Many thanks,
Chris Beeley, Institute of Mental Health, UK

...session info follows


sessionInfo()

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid  stats graphics  grDevices utils datasets
methods   base

other attached packages:
  [1] gridExtra_0.9  RColorBrewer_1.0-5 car_2.0-12
nnet_7.3-1 MASS_7.3-17
  [6] xtable_1.7-0   psych_1.2.4languageR_1.4
nlme_3.1-104   ggplot2_0.9.1

loaded via a namespace (and not attached):
  [1] colorspace_1.1-1 dichromat_1.2-4  digest_0.5.2 labeling_0.1
lattice_0.20-6   memoise_0.1
  [7] munsell_0.3  plyr_1.7.1   proto_0.3-9.2
reshape2_1.2.1   scales_0.2.1 stringr_0.6
[13] tools_2.15.0


packageDescription(nlme)

Package: nlme
Version: 3.1-104
Date: 2012-05-21
Priority: recommended
Title: Linear and Nonlinear Mixed Effects Models
Authors@R: c(person(Jose, Pinheiro, comment = S version),
person(Douglas, Bates, comment =
up to 2007), person(Saikat, DebRoy, comment = up
to 2002), person(Deepayan,
Sarkar, comment = up to 2005), person(R-core, email
= r-c...@r-project.org, role =
c(aut, cre)))
Author: Jose Pinheiro (S version), Douglas Bates (up to 2007),
Saikat DebRoy (up to 2002), Deepayan
Sarkar (up to 2005), the R Core team.
Maintainer: R-corer-c...@r-project.org
Description: Fit and compare Gaussian linear and nonlinear
mixed-effects models.
Depends: graphics, stats, R (= 2.13)
Imports: lattice
Suggests: Hmisc, MASS
LazyLoad: yes
LazyData: yes
License: GPL (= 2)
BugReports: http://bugs.r-project.org
Packaged: 2012-05-23 07:28:59 UTC; ripley
Repository: CRAN
Date/Publication: 2012-05-23 07:37:45
Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] anova of lme objects (model1, model2) gives different results depending on order of models

2012-05-31 Thread Chris Beeley

Hello-

I understand that it's convention, when comparing two models using the 
anova function anova(model1, model2), to put the more complicated (for 
want of a better word) model as the second model. However, I'm using lme 
in the nlme package and I've found that the order of the models actually 
gives opposite results. I'm not sure if this is supposed to be the case 
or if I have missed something important, and I can't find anything in 
the Pinheiro and Bates book or in ?anova, or in Google for that matter 
which unfortunately only returns results about ANOVA which isn't much 
help. I'm using the latest version of R and nlme, just checked both.


Here is the code and output:

 PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, 
random=~1|Case, na.action=na.omit)


 PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, 
random=~1|Case, na.action=na.omit,

+  correlation=corAR1(form=~Date|Case))

 anova(PHQmodel1, PHQmodel2) # accept model 2
Model df  AIC  BIClogLik   Test  
L.Ratio p-value

PHQmodel1 1  8 48784.57 48840.43 -24384.28
PHQmodel2 2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926 .0001

 PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, 
random=~1|Case, na.action=na.omit,

+  correlation=corAR1(form=~Date|Case))

 PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal, 
random=~1|Case, na.action=na.omit)


 anova(PHQmodel1, PHQmodel2) # accept model 2
 Model df  AIC  BIClogLik   Test  
L.Ratio p-value

PHQmodel1 1  9 48284.68 48347.51 -24133.34
PHQmodel2 2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926 .0001

In both cases I am led to accept model 2 even though they are opposite 
models. Is it really just that you have to put them in the right order? 
It just seems like if there were say four models you wouldn't 
necessarily be able to determine the correct order.


Many thanks,
Chris Beeley, Institute of Mental Health, UK

...session info follows

 sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United 
Kingdom.1252

[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid  stats graphics  grDevices utils datasets  
methods   base


other attached packages:
 [1] gridExtra_0.9  RColorBrewer_1.0-5 car_2.0-12 
nnet_7.3-1 MASS_7.3-17
 [6] xtable_1.7-0   psych_1.2.4languageR_1.4  
nlme_3.1-104   ggplot2_0.9.1


loaded via a namespace (and not attached):
 [1] colorspace_1.1-1 dichromat_1.2-4  digest_0.5.2 
labeling_0.1 lattice_0.20-6   memoise_0.1
 [7] munsell_0.3  plyr_1.7.1   proto_0.3-9.2
reshape2_1.2.1   scales_0.2.1 stringr_0.6

[13] tools_2.15.0

 packageDescription(nlme)
Package: nlme
Version: 3.1-104
Date: 2012-05-21
Priority: recommended
Title: Linear and Nonlinear Mixed Effects Models
Authors@R: c(person(Jose, Pinheiro, comment = S version), 
person(Douglas, Bates, comment =
   up to 2007), person(Saikat, DebRoy, comment = up to 
2002), person(Deepayan,
   Sarkar, comment = up to 2005), person(R-core, email = 
r-c...@r-project.org, role =

   c(aut, cre)))
Author: Jose Pinheiro (S version), Douglas Bates (up to 2007), Saikat 
DebRoy (up to 2002), Deepayan

   Sarkar (up to 2005), the R Core team.
Maintainer: R-core r-c...@r-project.org
Description: Fit and compare Gaussian linear and nonlinear mixed-effects 
models.

Depends: graphics, stats, R (= 2.13)
Imports: lattice
Suggests: Hmisc, MASS
LazyLoad: yes
LazyData: yes
License: GPL (= 2)
BugReports: http://bugs.r-project.org
Packaged: 2012-05-23 07:28:59 UTC; ripley
Repository: CRAN
Date/Publication: 2012-05-23 07:37:45
Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova of lme objects (model1, model2) gives different results depending on order of models

2012-05-31 Thread Albyn Jones
No, both yield the same result: reject the null hypothesis,
which always corresponds to the restricted (smaller) model.

albyn

On Thu, May 31, 2012 at 12:47:30PM +0100, Chris Beeley wrote:
 Hello-
 
 I understand that it's convention, when comparing two models using
 the anova function anova(model1, model2), to put the more
 complicated (for want of a better word) model as the second model.
 However, I'm using lme in the nlme package and I've found that the
 order of the models actually gives opposite results. I'm not sure if
 this is supposed to be the case or if I have missed something
 important, and I can't find anything in the Pinheiro and Bates book
 or in ?anova, or in Google for that matter which unfortunately only
 returns results about ANOVA which isn't much help. I'm using the
 latest version of R and nlme, just checked both.
 
 Here is the code and output:
 
  PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit)
 
  PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit,
 +  correlation=corAR1(form=~Date|Case))
 
  anova(PHQmodel1, PHQmodel2) # accept model 2
 Model df  AIC  BIClogLik   Test
 L.Ratio p-value
 PHQmodel1 1  8 48784.57 48840.43 -24384.28
 PHQmodel2 2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926 .0001
 
  PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit,
 +  correlation=corAR1(form=~Date|Case))
 
  PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit)
 
  anova(PHQmodel1, PHQmodel2) # accept model 2
  Model df  AIC  BIClogLik   Test
 L.Ratio p-value
 PHQmodel1 1  9 48284.68 48347.51 -24133.34
 PHQmodel2 2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926 .0001
 
 In both cases I am led to accept model 2 even though they are
 opposite models. Is it really just that you have to put them in the
 right order? It just seems like if there were say four models you
 wouldn't necessarily be able to determine the correct order.
 
 Many thanks,
 Chris Beeley, Institute of Mental Health, UK
 
 ...session info follows
 
  sessionInfo()
 R version 2.15.0 (2012-03-30)
 Platform: i386-pc-mingw32/i386 (32-bit)
 
 locale:
 [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
 Kingdom.1252
 [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
 [5] LC_TIME=English_United Kingdom.1252
 
 attached base packages:
 [1] grid  stats graphics  grDevices utils datasets
 methods   base
 
 other attached packages:
  [1] gridExtra_0.9  RColorBrewer_1.0-5 car_2.0-12
 nnet_7.3-1 MASS_7.3-17
  [6] xtable_1.7-0   psych_1.2.4languageR_1.4
 nlme_3.1-104   ggplot2_0.9.1
 
 loaded via a namespace (and not attached):
  [1] colorspace_1.1-1 dichromat_1.2-4  digest_0.5.2 labeling_0.1
 lattice_0.20-6   memoise_0.1
  [7] munsell_0.3  plyr_1.7.1   proto_0.3-9.2
 reshape2_1.2.1   scales_0.2.1 stringr_0.6
 [13] tools_2.15.0
 
  packageDescription(nlme)
 Package: nlme
 Version: 3.1-104
 Date: 2012-05-21
 Priority: recommended
 Title: Linear and Nonlinear Mixed Effects Models
 Authors@R: c(person(Jose, Pinheiro, comment = S version),
 person(Douglas, Bates, comment =
up to 2007), person(Saikat, DebRoy, comment = up
 to 2002), person(Deepayan,
Sarkar, comment = up to 2005), person(R-core, email
 = r-c...@r-project.org, role =
c(aut, cre)))
 Author: Jose Pinheiro (S version), Douglas Bates (up to 2007),
 Saikat DebRoy (up to 2002), Deepayan
Sarkar (up to 2005), the R Core team.
 Maintainer: R-core r-c...@r-project.org
 Description: Fit and compare Gaussian linear and nonlinear
 mixed-effects models.
 Depends: graphics, stats, R (= 2.13)
 Imports: lattice
 Suggests: Hmisc, MASS
 LazyLoad: yes
 LazyData: yes
 License: GPL (= 2)
 BugReports: http://bugs.r-project.org
 Packaged: 2012-05-23 07:28:59 UTC; ripley
 Repository: CRAN
 Date/Publication: 2012-05-23 07:37:45
 Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
Albyn Jones
Reed College
jo...@reed.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA question

2012-05-11 Thread Robert Latest
Hello all,

I'm very satisfied to say that my grip on both R and statistics is
showing the first hints of firmness, on a very greenhorn level.

I'm faced with a problem that I intend to analyze using ANOVA, and to
test my understanding of a primitive, one-way ANOVA I've written the
self-contained practice script below. It works as expected.

But here's my question: How can I not only get the values of the
coefficients for the different levels of the explanatory factor(s),
but also the corresponding standard errors and confidence levels?
Below I have started doing that on foot by looping over the levels
of my single factor, but I suppose this gets complicated and messy
with more complex models. Any ideas?

Thanks,
robert


set.seed(0)

N - 100 # sample size

MEAN - c(10, 20, 30, 40, 50)
VAR - c(20,20,1, 20, 20)
LABELS - c(A, B, C, D, E)

# create a data frame with labels
df - data.frame(Label=rep(LABELS, each=N))
df$Value - NA
# fill in random data for each factor level
for (i in 1:length(MEAN)) {
df$Value[(1+N*(i-1)):(N*i)] - rnorm(N, MEAN[i], sqrt(VAR[i]))
}



par(mfrow=c(2,2))
plot(df)  # Box plot of the data
plot(df$Value)# scatter plot

mod_aov - aov(Value ~ Label, data=df)

print(summary(mod_aov))
print(mod_aov$coefficients)

rsd - mod_aov$residuals

plot(rsd)

# find and print mean() and var() for each level
for (l in levels(df$Label)) {
index - df$Label == l

# Method 1: directly from data
smp - df$Value[index]  # extract sample for this label
ssq_smp -  var(smp)*(length(smp)-1) # sum of squares is variance
 # times d.f.
# Method 2: from ANOVA residuals
rsd_grp - rsd[index]# extract residuals
ssq_rsd - sum(rsd_grp **2)  # compute sum of squares

# print mean, variance, and difference between SSQs from the two
# methods.
write(sprintf(%s: mean=%5.1f var=%5.1f (%.2g), l,
mean(smp), var(smp),
ssq_smp-ssq_rsd), )
# ...and it works like expected! But is there a shortcut that would give me
# the same result in a one-liner?
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA question

2012-05-11 Thread ONKELINX, Thierry
Dear Robert,

It is easier to use lm instead of aov if you want coefficients for each group. 
Note that you can use rnorm vectorised.

set.seed(0)
N - 100 # sample size
MEAN - c(10, 20, 30, 40, 50)
VAR - c(20,20,1, 20, 20)
LABELS - factor(c(A, B, C, D, E))

# create a data frame with labels
df - data.frame(Label=rep(LABELS, each=N))
df$Value - rnorm(nrow(df), mean = MEAN[df$Label], sd = sqrt(VAR[df$Label]))

mod_aov - aov(Value ~ Label, data=df)
mod_lm - lm(Value ~ Label, data = df)
all.equal(anova(mod_aov), anova(mod_lm))

summary(mod_aov)
summary(mod_lm)

summary(mod_lm)$coef
confint(mod_lm)

#without intercept
mod_lm0 - lm(Value ~ 0 + Label, data = df)
summary(mod_lm0)$coef
confint(mod_lm0)

Best regards,

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
Robert Latest
Verzonden: vrijdag 11 mei 2012 9:37
Aan: r-help@r-project.org
Onderwerp: [R] ANOVA question

Hello all,

I'm very satisfied to say that my grip on both R and statistics is showing the 
first hints of firmness, on a very greenhorn level.

I'm faced with a problem that I intend to analyze using ANOVA, and to test my 
understanding of a primitive, one-way ANOVA I've written the self-contained 
practice script below. It works as expected.

But here's my question: How can I not only get the values of the coefficients 
for the different levels of the explanatory factor(s), but also the 
corresponding standard errors and confidence levels?
Below I have started doing that on foot by looping over the levels of my 
single factor, but I suppose this gets complicated and messy with more complex 
models. Any ideas?

Thanks,
robert


set.seed(0)

N - 100 # sample size

MEAN - c(10, 20, 30, 40, 50)
VAR - c(20,20,1, 20, 20)
LABELS - c(A, B, C, D, E)

# create a data frame with labels
df - data.frame(Label=rep(LABELS, each=N)) df$Value - NA # fill in random 
data for each factor level for (i in 1:length(MEAN)) {
df$Value[(1+N*(i-1)):(N*i)] - rnorm(N, MEAN[i], sqrt(VAR[i])) }



par(mfrow=c(2,2))
plot(df)  # Box plot of the data
plot(df$Value)# scatter plot

mod_aov - aov(Value ~ Label, data=df)

print(summary(mod_aov))
print(mod_aov$coefficients)

rsd - mod_aov$residuals

plot(rsd)

# find and print mean() and var() for each level for (l in levels(df$Label)) {
index - df$Label == l

# Method 1: directly from data
smp - df$Value[index]  # extract sample for this label
ssq_smp -  var(smp)*(length(smp)-1) # sum of squares is variance
 # times d.f.
# Method 2: from ANOVA residuals
rsd_grp - rsd[index]# extract residuals
ssq_rsd - sum(rsd_grp **2)  # compute sum of squares

# print mean, variance, and difference between SSQs from the two
# methods.
write(sprintf(%s: mean=%5.1f var=%5.1f (%.2g), l,
mean(smp), var(smp),
ssq_smp-ssq_rsd), )
# ...and it works like expected! But is there a shortcut that would give me # 
the same result in a one-liner?
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en 
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is 
door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the 
writer and may not be regarded as stating an official position of INBO, as long 
as the message is not confirmed by a duly signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA question

2012-05-11 Thread Robert Latest
Hello Thierry,

thanks for your answer! There is one thing, however, that I don't
understand. The values labeled B in my data are generated with
1/20th the variance of the others, yet the standard error and
confidence intervals are the same for all levels of the factor. How
come?


 summary(mod_lm0)$coef
   Estimate Std. Error   t value  Pr(|t|)
LabelA 10.10138  0.3937038  25.65730  5.752714e-93
LabelB 19.79629  0.3937038  50.28218 1.226942e-196
LabelC 30.06722  0.3937038  76.37016 4.825571e-276
LabelD 40.01442  0.3937038 101.63586  0.00e+00
LabelE 49.78282  0.3937038 126.44738  0.00e+00


Thanks,
robert

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA problem

2012-05-04 Thread robgriffin247
Hi,
I need to create a data frame containing the results of a number of ANOVA's
but I'm having some trouble setting it up (some being enough for me to spend
3 days trying with no progress and be left staring in to the abyss which
some people call a weekend, and what I will call 2 quiet days in the
office...)

The response variable is *V*. 
I need to do an ANOVA for each *G*. 
The fixed effect will be *S* (M or F) whilst also having the *S*L* and
*L* (1 or 2) as random effects. 
The anova of *G* /AB01 /would be some thing like: y=V, fixed=S, Random= L 
L*S...
The new data frame would then compile all the variance components for each
G, including total and residual variance.

here is the example dataframe using 2 G's, with 2 S values, 2 L, and 2
replicates for each.

df-as.data.frame(c(AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB02,AB02,AB02,AB02,AB02,AB02,AB02,AB02))
names(df)-G
df$L-as.numeric(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2))
df$S-(c(m,m,f,f,m,m,f,f,m,m,f,f,m,m,f,f))
df$R-as.numeric(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2))
df$V-as.numeric(c(1,2,12,21,5,6,12,34,1,6,52,41,5,43,13,24))

It is worth noting the actual data this will be used on is 1*G's, 
2*S's,  40*L's,  and 2*R's so hand writing an ANOVA for each G is not
preferred...

Here is a twitter link to a crudely drawn illustration of the aim
illustrated (using 3 Ls) in case I have confused you with words (through my
own poor understanding): 
https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large
https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large

Thanks in advance for your time,
Rob
(please save my weekend...)

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-problem-tp4609062.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA problem

2012-05-04 Thread Bert Gunter
Rob:

On Fri, May 4, 2012 at 9:18 AM, robgriffin247 rg.rfo...@hotmail.co.uk wrote:
 Hi,
 I need to create a data frame containing the results of a number of ANOVA's
 but I'm having some trouble setting it up (some being enough for me to spend
 3 days trying with no progress and be left staring in to the abyss which
 some people call a weekend, and what I will call 2 quiet days in the
 office...)

I would suggest staying out of the office and consulting a local
statistician Monday morning. As a poor second choice, post on a
statistics Help list (e.g. stats.stackexchange.com).

I haven't gone through your post in detail, but it appears to have
little to do with R and a **lot** to do with your lack of statistical
understanding. It appears that you need to formulate a scientifically
appropriate mixed effect model (the problem is never how to set up an
anova), and interaction with a local consultant is the best way to do
that.

I suppose you could also post this on the r-sig-mixed-models list, as
they often go beyond the R issues to the statistical modeling. But
remote consulting is a risky business, as despite the best of
intentions on both sides, incomplete or mis- communication can lead to
errors of the third kind (right answer -- wrong question).

Best,
Bert


 The response variable is *V*.
 I need to do an ANOVA for each *G*.
 The fixed effect will be *S* (M or F) whilst also having the *S*L* and
 *L* (1 or 2) as random effects.
 The anova of *G* /AB01 /would be some thing like: y=V, fixed=S, Random= L 
 L*S...
 The new data frame would then compile all the variance components for each
 G, including total and residual variance.

 here is the example dataframe using 2 G's, with 2 S values, 2 L, and 2
 replicates for each.

 df-as.data.frame(c(AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB02,AB02,AB02,AB02,AB02,AB02,AB02,AB02))
 names(df)-G
 df$L-as.numeric(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2))
 df$S-(c(m,m,f,f,m,m,f,f,m,m,f,f,m,m,f,f))
 df$R-as.numeric(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2))
 df$V-as.numeric(c(1,2,12,21,5,6,12,34,1,6,52,41,5,43,13,24))

 It is worth noting the actual data this will be used on is 1*G's,
 2*S's,  40*L's,  and 2*R's so hand writing an ANOVA for each G is not
 preferred...

 Here is a twitter link to a crudely drawn illustration of the aim
 illustrated (using 3 Ls) in case I have confused you with words (through my
 own poor understanding):
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large

 Thanks in advance for your time,
 Rob
 (please save my weekend...)

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/ANOVA-problem-tp4609062.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA problem

2012-05-04 Thread Richard M. Heiberger
The following constructs the data.frame that I think the original
poster asked for.
I don't understand the graph, so I didn't attempt it.

I agree with Bert that this might not make sense.  Specifically, the distinction
between AB01 and AB02 is not modeled, and that is probably the critical factor.

I made several style changes in the dataset.  The name df is a function name,
and its use as a data.frame name will lead the reader to confusion.
I constructed the data.frame directly, not by constructing vectors and putting
them together. I declared the factor variables to be factors.  For factors with
more than two levels, this assures that they get the right number of degrees of
freedom in the anova table.


rg - data.frame(G=c(AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB01,
   AB02,AB02,AB02,AB02,AB02,AB02,AB02,AB02),
 L=factor(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2)),
 S=factor(c(m,m,f,f,m,m,f,f,
   m,m,f,f,m,m,f,f)),
 R=factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)),
 V=c(1,2,12,21,5,6,12,34,1,6,52,41,5,43,13,24))

summary(aov(V ~ S * L, data=rg[1:8,])) ## no Error term, to be sure we
understand

rg.aov - lapply(split(rg, rg$G),
 function(x) aov(V ~ S*L + Error(L), data=x))
summary(rg.aov[[1]])  ## same Sums of Squares as above, but now with Error term


anovaSumsOfSquares - function(list.of.aov.objects) {
  t(sapply(rg.aov, function(y) {
tmpy -
  sapply(y[-1], function(x) {
tmp - summary(x)[[1]]
nt - sub( +$, , rownames(tmp))
result - tmp[,Sum Sq]
names(result) - nt
result})
c(tmpy[[1]], tmpy[[2]])
  }))
}
anovaSumsOfSquares(rg.aov)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA problem

2012-05-04 Thread rob.griffin.247
Gee Bert, thanks for the really helpful tip. But if you read my post properly
you'll note that I do know how ANOVA's work. 
 The anova of *G* /AB01 /would be some thing like: y=V, fixed=S, Random= L
 
 L*S...

I didn't want to show a full model formula in case it led people do the
wrong path to solving the real problem (seeing as there are several ways to
create mixed effects models which for some reason may not work with
solutions to the problem) which is how to actually get R to do ANOVA to
analyse the data for each value of G in the data frame given in the example
and then get R to give me the output data frame I desire, ergo, *it is
indeed an R problem.*

Perhaps you should read up on the R mailing list posting guidelines:
Questions about statistics: The R mailing lists are primarily intended for
questions and discussion about the R software. However, questions about
statistical methodology are sometimes posted. If the question is well-asked
and of interest to someone on the list, *it may elicit an informative
up-to-date answer*  so not rude and sarcastic ones then..

I will admit statistics is an element of the question I have posed, but it
is entirely in an R based context.
My understanding of statistics is perfectly acceptable thanks to numerous
courses taken through my undergraduate, masters, and PhD studies. If you're
not willing to help someone solve their problems then don't bother posting -
do you have nothing better to do with your time?

I would also suggest that my post has a lot more to do with R than your post
just moments ago which is solely about statistics and is of no relevance to
the R help forum. 
http://r.789695.n4.nabble.com/Off-Topic-Crime-Statistics-Don-t-Pay-td4609170.html
http://r.789695.n4.nabble.com/Off-Topic-Crime-Statistics-Don-t-Pay-td4609170.html
 

I know you regularly post on this forum and are often helpful, but sometimes
unhelpful posts are unnecessary. 

Rant over. 

As for everyone else:
Firstly, sorry about the above, it's been a long week. Secondly, I would
still really like some helpful answers from people who are interested in
helping me, and more constructive replies will be greatly appreciated.

On Fri, May 4, 2012 at 9:18 AM, robgriffin247 lt;rg.rforum@.cogt; wrote:
 Hi,
 I need to create a data frame containing the results of a number of
 ANOVA's
 but I'm having some trouble setting it up (some being enough for me to
 spend
 3 days trying with no progress and be left staring in to the abyss which
 some people call a weekend, and what I will call 2 quiet days in the
 office...)

I would suggest staying out of the office and consulting a local
statistician Monday morning. As a poor second choice, post on a
statistics Help list (e.g. stats.stackexchange.com).

I haven't gone through your post in detail, but it appears to have
little to do with R and a **lot** to do with your lack of statistical
understanding. It appears that you need to formulate a scientifically
appropriate mixed effect model (the problem is never how to set up an
anova), and interaction with a local consultant is the best way to do
that.

I suppose you could also post this on the r-sig-mixed-models list, as
they often go beyond the R issues to the statistical modeling. But
remote consulting is a risky business, as despite the best of
intentions on both sides, incomplete or mis- communication can lead to
errors of the third kind (right answer -- wrong question).

Best,
Bert


 The response variable is *V*.
 I need to do an ANOVA for each *G*.
 The fixed effect will be *S* (M or F) whilst also having the *S*L* and
 *L* (1 or 2) as random effects.
 The anova of *G* /AB01 /would be some thing like: y=V, fixed=S, Random= L
 
 L*S...
 The new data frame would then compile all the variance components for each
 G, including total and residual variance.

 here is the example dataframe using 2 G's, with 2 S values, 2 L, and 2
 replicates for each.

 df-as.data.frame(c(AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB02,AB02,AB02,AB02,AB02,AB02,AB02,AB02))
 names(df)-G
 df$L-as.numeric(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2))
 df$S-(c(m,m,f,f,m,m,f,f,m,m,f,f,m,m,f,f))
 df$R-as.numeric(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2))
 df$V-as.numeric(c(1,2,12,21,5,6,12,34,1,6,52,41,5,43,13,24))

 It is worth noting the actual data this will be used on is 1*G's,
 2*S's,  40*L's,  and 2*R's so hand writing an ANOVA for each G is not
 preferred...

 Here is a twitter link to a crudely drawn illustration of the aim
 illustrated (using 3 Ls) in case I have confused you with words (through
 my
 own poor understanding):
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large

 Thanks in advance for your time,
 Rob
 (please save my weekend...)

 --
 View this message in context:
 http://r.789695.n4.nabble.com/ANOVA-problem-tp4609062.html
 Sent from the R help mailing list archive at Nabble.com.

 

Re: [R] ANOVA problem

2012-05-04 Thread rob.griffin.247
Thanks Richard, that works great on the test data,
I'll try it out on the full dataset now and let you know how it goes.
Thanks a lot!

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-problem-tp4609062p4609400.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ANOVA Lack of fit test results not matching

2012-04-24 Thread Bart Joosen
Hi,

we have a validated program to do our calculations, but sometime I want to
use R to do some quick statistical calculations.
But for our linearity test, I can't reproduce in R.

Suppose the following data set:
dat -
structure(list(Level = structure(c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 
3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L), .Label = c(A, B, C, D, 
E), class = factor), x = c(1.6882, 1.8992, 2.1103, 2.3213, 
2.5323, 1.791, 1.99, 2.189, 2.388, 1.592, 1.6, 1.8, 2, 2.2, 2.4
), y = c(845467.4698, 951160.9668, 1059023.406, 1164772.671, 
1267586.471, 885310.2247, 980398.3656, 1078975.303, 1174925.069, 
785042.962, 802448.3644, 900011.1168, 998232.6022, 1098189.112, 
1200127.806)), .Names = c(Level, x, y), row.names = c(NA, 
-15L), class = data.frame)

Now I wanted to do a Lack of fit test (in our program: residual ANOVA).
I did some searching, and found: anova(lm(y~x + Level, dat)) and look at the
p-value for Level.
But the resulting value (F value 0.0704) doesn't corresponds with the F
value from our program (0.0599).
Also the MS and SS values don't match.

As it is called residual ANOVA, I tried to fit a model (mod - lm(y~x,dat))
and then did a regression of Level agains the residuals of the model: anova
(lm(resid(mod)~dat$Level)).
But again no match. Also the degrees of freedom dont match anywhere (in our
program: 3).

Here is the table from our program, any ideas about how to come to this
result?

Source  SS  DF  MS  F-Ratio p-Value
1   Total   8.601108e+008   13  66162368.020
2   Error (Intra)   8.449211e+008   10  84492107.090
3   Model (Inter)   1.518971e+007   3   5063237.787 0.059926
0.979704


Thanks

Bart
 

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-Lack-of-fit-test-results-not-matching-tp4582774p4582774.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in quantreg - faulty test for 'nesting'?

2012-04-21 Thread Roger Koenker
Yes, the models are nested, and yes I probably should have been more clever
about parsing formulae, for these cases.  I'll have a look at the code for glm, 
I
presume that lm() is also capable of groking this?   Meanwhile, I would like to
point out that any of these linear restrictions can be reformulated as an 
exclusion
restriction, see for example the remark on pages 5-6 of 

http://www.econ.uiuc.edu/~roger/research/ranks/ranktests.pdf

or the references cited there.  And using this sort of parameterization you can
use anova.rq().


Roger Koenker
rkoen...@illinois.edu




On Apr 20, 2012, at 7:59 PM, Galen Sher wrote:

 Thanks to Brian's suggestion of using the logLik() function, I've dug a
 little deeper. I definitely think f1 and f2 are nested models. For example,
 by adding x2 to fmla1, we obtain a formula that quite clearly nests fmla1
 and achieves the same log likelihood as that obtained for f2. Here is the
 extra code to show this:
 
 fmla3 = y~I(x1+x2)+x2
 f3=glm(fmla3)
 logLik(f1); logLik(f2); logLik(f3)
 
 If f2=f3, as the log likelihoods would suggest, then this gives us a
 workaround: define the intermediate formula fmla3 and the fit f3 as above,
 and then conduct the analysis of variance between models f1 and f3 instead
 of f1 and f2. This doesn't offend anova.rq() any longer:
 
 f3.qr = rq(fmla3)
 anova(f1.qr,f3.qr) #this is actually anova(f1.qr, f2.qr) which resulted in
 an error earlier
 
 -Galen
 
 On Fri, Apr 20, 2012 at 6:47 PM, Brian S Cade ca...@usgs.gov wrote:
 
 Galen:  Interesting, first time I tried it I couldn't get anova.glm to
 compute the p-values (got a warning) but I tried it again and it worked.
 Your larger (alternative hypothesis) model is  y = B0 + B1*x1 + B2*x2 + e
 and your smaller (null hypotheisis) model is y = B0 + B3*(x1 + x2).  So  I
 guess I see that what you're trying to test in this case is that B1 = B2.
 I don't think Roger Koenker anticipated such a test with anova.rq.  Other
 options besides using information criteria (AIC, BIC, etc) for comparing
 nonnested models include the Vuong test.  But not sure how readily the
 theory of Vuong's test (like a paired t-test) extends to quantile
 regression.
 
 Brian
 
 Brian S. Cade, PhD
 
 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818
 
 email:  brian_c...@usgs.gov
 tel:  970 226-9326
 
 
 From: Galen Sher galens...@gmail.com To: Brian S Cade ca...@usgs.gov
 Date: 04/20/2012 09:59 AM Subject: Re: [R] ANOVA in quantreg - faulty
 test for 'nesting'?
 --
 
 
 
 Thanks Brian. I think anova.glm() requires the user to specify the
 appropriate distribution. In the example above, if I use either of the
 following commands
 
 anova(f1,f2,test=Chisq) #or
 anova(f1,f2,test=F)
 
 then anova.glm() will compute and display p-values associated with the
 deviance statistics. The reason I thought these models are nested is
 because the first model can be thought of as the second model estimated
 under an additional linear equality constraint. I suppose that's less of an
 R question and more of a regression question.
 
 Thanks for the logLik suggestion. In the absence of more information I'll
 have to do that - I'm just wary of conducting the test myself!
 
 Regards,
 Galen
 
 On Fri, Apr 20, 2012 at 4:31 PM, Brian S Cade 
 *ca...@usgs.gov*ca...@usgs.gov
 wrote:
 Galen:  Actually don't see how the models are nested (ask yourself what
 parameter in the model with 3 parameters is set to zero in the 2 parameter
 model?) and indeed if I try your code anova.glm will compute the difference
 in deviance between the two models but it does not compute a probability
 value for that difference in deviance as that would not make sense for
 models that aren't nested.   Koenker's implementation of anova.rq
 immediately detects that the models aren't nested so doesn't even compute
 the deviance difference.  You could use the logLik function on the rq
 objects to get their log likelihoods or use AIC (BIC) to compare the
 quantile regression models.
 
 Brian
 
 Brian S. Cade, PhD
 
 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818
 
 email:  *brian_c...@usgs.gov* brian_c...@usgs.gov
 tel:  *970 226-9326* 970%20226-9326
 
  From: galen *galens...@gmail.com* galens...@gmail.com  To: *
 r-help@r-project.org* r-help@r-project.org  Date: 04/19/2012 02:40 PM
 Subject: [R] ANOVA in quantreg - faulty test for 'nesting'?  Sent by: *
 r-help-boun...@r-project.org* r-help-boun...@r-project.org
 
 --
 
 
 
 
 I am trying to implement an ANOVA on a pair of quantile regression models
 in
 R. The anova.rq() function performs a basic check to see whether the models
 are nested, but I think this check is failing in my case. I think my models
 are nested despite the anova.rqlist() function saying otherwise. Here is an
 example where the GLM ANOVA regards the models as nested

Re: [R] ANOVA in quantreg - faulty test for 'nesting'?

2012-04-20 Thread Galen Sher
Thanks to Brian's suggestion of using the logLik() function, I've dug a
little deeper. I definitely think f1 and f2 are nested models. For example,
by adding x2 to fmla1, we obtain a formula that quite clearly nests fmla1
and achieves the same log likelihood as that obtained for f2. Here is the
extra code to show this:

fmla3 = y~I(x1+x2)+x2
f3=glm(fmla3)
logLik(f1); logLik(f2); logLik(f3)

If f2=f3, as the log likelihoods would suggest, then this gives us a
workaround: define the intermediate formula fmla3 and the fit f3 as above,
and then conduct the analysis of variance between models f1 and f3 instead
of f1 and f2. This doesn't offend anova.rq() any longer:

f3.qr = rq(fmla3)
anova(f1.qr,f3.qr) #this is actually anova(f1.qr, f2.qr) which resulted in
an error earlier

-Galen

On Fri, Apr 20, 2012 at 6:47 PM, Brian S Cade ca...@usgs.gov wrote:

 Galen:  Interesting, first time I tried it I couldn't get anova.glm to
 compute the p-values (got a warning) but I tried it again and it worked.
 Your larger (alternative hypothesis) model is  y = B0 + B1*x1 + B2*x2 + e
 and your smaller (null hypotheisis) model is y = B0 + B3*(x1 + x2).  So  I
 guess I see that what you're trying to test in this case is that B1 = B2.
  I don't think Roger Koenker anticipated such a test with anova.rq.  Other
 options besides using information criteria (AIC, BIC, etc) for comparing
 nonnested models include the Vuong test.  But not sure how readily the
 theory of Vuong's test (like a paired t-test) extends to quantile
 regression.

 Brian

 Brian S. Cade, PhD

 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818

 email:  brian_c...@usgs.gov
 tel:  970 226-9326


  From: Galen Sher galens...@gmail.com To: Brian S Cade ca...@usgs.gov
 Date: 04/20/2012 09:59 AM Subject: Re: [R] ANOVA in quantreg - faulty
 test for 'nesting'?
 --



 Thanks Brian. I think anova.glm() requires the user to specify the
 appropriate distribution. In the example above, if I use either of the
 following commands

 anova(f1,f2,test=Chisq) #or
 anova(f1,f2,test=F)

 then anova.glm() will compute and display p-values associated with the
 deviance statistics. The reason I thought these models are nested is
 because the first model can be thought of as the second model estimated
 under an additional linear equality constraint. I suppose that's less of an
 R question and more of a regression question.

 Thanks for the logLik suggestion. In the absence of more information I'll
 have to do that - I'm just wary of conducting the test myself!

 Regards,
 Galen

 On Fri, Apr 20, 2012 at 4:31 PM, Brian S Cade 
 *ca...@usgs.gov*ca...@usgs.gov
 wrote:
 Galen:  Actually don't see how the models are nested (ask yourself what
 parameter in the model with 3 parameters is set to zero in the 2 parameter
 model?) and indeed if I try your code anova.glm will compute the difference
 in deviance between the two models but it does not compute a probability
 value for that difference in deviance as that would not make sense for
 models that aren't nested.   Koenker's implementation of anova.rq
 immediately detects that the models aren't nested so doesn't even compute
 the deviance difference.  You could use the logLik function on the rq
 objects to get their log likelihoods or use AIC (BIC) to compare the
 quantile regression models.

 Brian

 Brian S. Cade, PhD

 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818

 email:  *brian_c...@usgs.gov* brian_c...@usgs.gov
 tel:  *970 226-9326* 970%20226-9326

   From: galen *galens...@gmail.com* galens...@gmail.com  To: *
 r-help@r-project.org* r-help@r-project.org  Date: 04/19/2012 02:40 PM
 Subject: [R] ANOVA in quantreg - faulty test for 'nesting'?  Sent by: *
 r-help-boun...@r-project.org* r-help-boun...@r-project.org

  --




 I am trying to implement an ANOVA on a pair of quantile regression models
 in
 R. The anova.rq() function performs a basic check to see whether the models
 are nested, but I think this check is failing in my case. I think my models
 are nested despite the anova.rqlist() function saying otherwise. Here is an
 example where the GLM ANOVA regards the models as nested, but the quantile
 regression ANOVA tells me the models aren't nested:

 y = rnorm(100)
 x1 = rnorm(100)
 x2 = rnorm(100)

 fmla1 = y~I(x1+x2)
 fmla2 = y~x1+x2

 f1 = glm(fmla1)
 f2 = glm(fmla2)

 anova(f1,f2) #This works

 f1.qr = rq(fmla1)
 f2.qr = rq(fmla2)

 anova(f1.qr,f2.qr) #Error!
 #Error in anova.rqlist(object, ...) : Models aren't nested

 Are the models in fact not nested? If they are nested, is there an easy
 workaround I could use? Many thanks in anticipation.

 --
 View this message in context: *
 http://r.789695.n4.nabble.com/ANOVA-in-quantreg-faulty-test-for-nesting-tp4571994p4571994.html
 *http://r.789695.n4.nabble.com/ANOVA-in-quantreg-faulty-test-for-nesting

  1   2   3   4   >