Re: [R] ANOVA in R

2018-10-10 Thread Jim Lemon
Hi again,
Two things, I named the data frame SR as shown in the model.
The other is for those who may wish to answer the OP. The mediafire
website is loaded with intrusive ads and perhaps malware.

Jim
On Thu, Oct 11, 2018 at 9:02 AM Jim Lemon  wrote:
>
> Hi Tranh,
> I'm not sure why you are converting your variables to factors, and I
> think the model you want is:
>
> lm(KIC~temperature+AC+AV+Thickness+temperature:AC+
>  temperature:AV+temperature:thickness+AC:AV+
>  AC:thickness+AV:thickness,SR)
>
> Note the colons (:) rather than asterisks (*) for the interactions.
>
> Jim
> On Wed, Oct 10, 2018 at 5:14 PM Thanh Tran  wrote:
> >
> > Hi eveyone,
> > I'm studying about variance (ANOVA) in R and have some questions to share.
> > I read an article investigating the effect of factors (temperature, Asphalt
> > content, Air voids, and sample thickness) on the hardness of asphalt
> > concrete in the tensile test (abbreviated as Kic). Each condition was
> > repeated four times (4 samples). In the paper, the authors used MINITAB to
> > analyze Anova. The authors use "adjusted sums of squares" calculate the
> > p-value I try to use ANOVA in R to analyze this data and get the result as
> > shown in Figure 4. The results are different from the results in the
> > article. Some papers say that in R, the default for ANOVA analysis is to
> > use "sequential sums of squares" to calculate the p-value.
> > So please help the following two questions: 1 / Introduction to code in R
> > for anova analysis uses "adjusted sums of squares". The main part of the
> > command in R / myself is as follows: > Tem = as.factor (temperature) > Ac =
> > as.factor (AC) > Av = as.factor (AV) > Thick = as.factor (Thickness) >
> > Twoway = lm (KIC ~ Tem + Ac + Av + Thick + Stamp + Ac + Stamp + Av + Stamp
> > + Thick + Ac * Av + Ac * Thick + Av * Thick) > anova (twoway) 2/ When to
> > use "sequential sums of squares" and when to use "adjusted sums of
> > squares". Some papers recommend using the "oa.design
> > "
> > function in R to check for "orthogonal" designs. If not, use "adjusted sums
> > of squares". I am still vague about this command, so look forward to
> > everyone's suggestion. If you could answer all two of my questions, I would
> > be most grateful. Ps: I have added a CSV file and the paper for practicing
> > R. http://www.mediafire.com/file/e5oe54p2c2wd4bc/Saha+research.csv
> > http://www.mediafire.com/file/39jlf9h539y9mdz/Homothetic+behaviour+investigation+on+fracture+toughness+of+asphalt+mixtures+using+semicircular+bending+test.pdf
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in R

2018-10-10 Thread Jim Lemon
Hi Tranh,
I'm not sure why you are converting your variables to factors, and I
think the model you want is:

lm(KIC~temperature+AC+AV+Thickness+temperature:AC+
 temperature:AV+temperature:thickness+AC:AV+
 AC:thickness+AV:thickness,SR)

Note the colons (:) rather than asterisks (*) for the interactions.

Jim
On Wed, Oct 10, 2018 at 5:14 PM Thanh Tran  wrote:
>
> Hi eveyone,
> I'm studying about variance (ANOVA) in R and have some questions to share.
> I read an article investigating the effect of factors (temperature, Asphalt
> content, Air voids, and sample thickness) on the hardness of asphalt
> concrete in the tensile test (abbreviated as Kic). Each condition was
> repeated four times (4 samples). In the paper, the authors used MINITAB to
> analyze Anova. The authors use "adjusted sums of squares" calculate the
> p-value I try to use ANOVA in R to analyze this data and get the result as
> shown in Figure 4. The results are different from the results in the
> article. Some papers say that in R, the default for ANOVA analysis is to
> use "sequential sums of squares" to calculate the p-value.
> So please help the following two questions: 1 / Introduction to code in R
> for anova analysis uses "adjusted sums of squares". The main part of the
> command in R / myself is as follows: > Tem = as.factor (temperature) > Ac =
> as.factor (AC) > Av = as.factor (AV) > Thick = as.factor (Thickness) >
> Twoway = lm (KIC ~ Tem + Ac + Av + Thick + Stamp + Ac + Stamp + Av + Stamp
> + Thick + Ac * Av + Ac * Thick + Av * Thick) > anova (twoway) 2/ When to
> use "sequential sums of squares" and when to use "adjusted sums of
> squares". Some papers recommend using the "oa.design
> "
> function in R to check for "orthogonal" designs. If not, use "adjusted sums
> of squares". I am still vague about this command, so look forward to
> everyone's suggestion. If you could answer all two of my questions, I would
> be most grateful. Ps: I have added a CSV file and the paper for practicing
> R. http://www.mediafire.com/file/e5oe54p2c2wd4bc/Saha+research.csv
> http://www.mediafire.com/file/39jlf9h539y9mdz/Homothetic+behaviour+investigation+on+fracture+toughness+of+asphalt+mixtures+using+semicircular+bending+test.pdf
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in R

2018-10-10 Thread Paul Johnson
We cannot read your message. Should post pure text, not html. Hm, my phone
now may post html, must try to stop. Your R code not legible. It seems to
be output? Lines all run together. I tried find articles you mention, but
"not found" resulted.

You should use aov() for fitting, then get post hoc comparisons.

In car package, Anova function will help. I may teach Anova soon, we'll see
if I have better answer then.

Paul Johnson
University of Kansas

On Wed, Oct 10, 2018, 1:14 AM Thanh Tran  wrote:

> Hi eveyone,
> I'm studying about variance (ANOVA) in R and have some questions to share.
> I read an article investigating the effect of factors (temperature, Asphalt
> content, Air voids, and sample thickness) on the hardness of asphalt
> concrete in the tensile test (abbreviated as Kic). Each condition was
> repeated four times (4 samples). In the paper, the authors used MINITAB to
> analyze Anova. The authors use "adjusted sums of squares" calculate the
> p-value I try to use ANOVA in R to analyze this data and get the result as
> shown in Figure 4. The results are different from the results in the
> article. Some papers say that in R, the default for ANOVA analysis is to
> use "sequential sums of squares" to calculate the p-value.
> So please help the following two questions: 1 / Introduction to code in R
> for anova analysis uses "adjusted sums of squares". The main part of the
> command in R / myself is as follows: > Tem = as.factor (temperature) > Ac =
> as.factor (AC) > Av = as.factor (AV) > Thick = as.factor (Thickness) >
> Twoway = lm (KIC ~ Tem + Ac + Av + Thick + Stamp + Ac + Stamp + Av + Stamp
> + Thick + Ac * Av + Ac * Thick + Av * Thick) > anova (twoway) 2/ When to
> use "sequential sums of squares" and when to use "adjusted sums of
> squares". Some papers recommend using the "oa.design
> <
> https://www.youtube.com/redirect?q=http%3A%2F%2Foa.design%2F_token=AaSAPDY-5UAsoHxN6BdwfyIJ7R98MTUzOTIxNDg2OUAxNTM5MTI4NDY5=comments
> >"
> function in R to check for "orthogonal" designs. If not, use "adjusted sums
> of squares". I am still vague about this command, so look forward to
> everyone's suggestion. If you could answer all two of my questions, I would
> be most grateful. Ps: I have added a CSV file and the paper for practicing
> R. http://www.mediafire.com/file/e5oe54p2c2wd4bc/Saha+research.csv
>
> http://www.mediafire.com/file/39jlf9h539y9mdz/Homothetic+behaviour+investigation+on+fracture+toughness+of+asphalt+mixtures+using+semicircular+bending+test.pdf
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread Juan Telleria Ruiz de Aguirre
Thank you all for your **very good** answers:

Using aovp(..., perm="Exact") seems to be the way to go for small datasets,
and also I should definitely try ?kruskal.test.


Juan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread S Ellison
> This package uses a modified version of aov() function, which uses
> Permutation Tests 
> 
> I obtain different p-values for each run!

Could that be because you are defaulting to perm="Prob"?

I am not familiar with the package, but the manual is informative.
You may have missed something when reading it.

" ...The Exact method will be used by default when the number of observations 
is less than or equal to
maxExact, otherwise Prob will be used.
Prob:  Iterations terminate when the estimated standard error of the estimated 
proportion p is less
than p*Ca"

I would assume that probabilistic permutation is random and will change from 
run to run.
You could use set.seed() to stop that, but it's actually quite useful to see 
how much the results change.
If you want complete permutation, you'd need to force Exact (unless that does 
not mean what it sounds like for this package).
It looks like that requires you to set maxExact to at least your number of 
observations. But given that permutation  grows combinatorially,  that could 
take a _long_ time for a run; the Example in the help page does not complete in 
a useful time when maxExact is set to exceed the number of data points.

So I'd probably run it using Prob and simply note the range of results for a 
handful of runs to give you an indication of how far to trust the answers.

> Would it still be possible use the regular aov() by generating permutations
> in advance (Obtaining therefore a Normal Distribution thanks to the Central
> Limit Theorem)? And applying the aov() function afterwards? Does it have
> sense?
As a chemist, I'd guess No. And you'd be even more limited in number of 
permutations.

> Or maybe this issue could be due to unbalanced classes? I also tried to
> weight observations based on proportions, but the function failed.
No, it's nothing to do with balance, if the results change run to run with no 
change in the model. I'd guess that may exacerbate the permutaiton variability 
somewhat but it won't _cause_ it.

> Any alternative solution for performing a One-Way ANOVA Test over
> Non-Normal Data?
Yes; the traditional nonparametric test for one-way data (balanced) is the 
kruskal-wallis test - see ?kruskal.test.
Classical ANOVA on ranks can also be defended as a general 'nonparametric' 
approach, though I gather it can also be criticised. 



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread Meyners, Michael
Juan, 

Your question might be borderline for this list, as it ultimately rather seems 
a stats question coming in R disguise.

Anyway, the short answer is that you *expect* to get a different p value from a 
permutation test unless you are able to do all possible permutation and 
therefore use the so-called systematic reference set. That is rarely the case, 
and only for relatively small problems. 
The permutation test uses a random subset of all possible permutations. Given 
this randomness, you'll get a different p value. In order to get reproducible 
results, you may specify a seed (?set.seed), yet that is only reproducible with 
this environment. Someone else with a different software and/or code might come 
out with a different p. The higher the number of permutations used, the smaller 
the variation around the p values, however. For most applications, 1000 seem 
good enough to me, but sometimes I go higher (in particular if the p value is 
borderline and I really need a strict above/below alpha decision).

The permutations do not create an implicit normal distribution, but rather a 
null distribution that can (likely is depending on non-normality of your data) 
not normal. So your respective proposal does not appeal.

I don't think you need an alternative - the permutation test is just fine, and 
recognizing the randomness in the execution does not render the (relatively 
small) variability in p values a major issue.

You may want to have a look at the text book by Edgington & Onghena for details 
on permutation tests, and there are plenty of papers out there addressing them 
in various contexts, which will help to understand *why* you observe what you 
observe here. 

HTH, Michael



> -Original Message-
> From: R-help  On Behalf Of Juan Telleria Ruiz
> de Aguirre
> Sent: Montag, 3. September 2018 17:18
> To: R help Mailing list 
> Subject: [R] ANOVA Permutation Test
> 
> Dear R users,
> 
> I have the following Question related to Package lmPerm:
> 
> This package uses a modified version of aov() function, which uses
> Permutation Tests instead of Normal Theory Tests for fitting an Analysis of
> Variance (ANOVA) Model.
> 
> However, when I run the following code for a simple linear model:
> 
> library(lmPerm)
> 
> e$t_Downtime_per_Intervention_Successful %>%
>   aovp(
> formula = `Downtime per Intervention[h]` ~ `Working Hours`,
> data = .
>   ) %>%
>   summary()
> 
> I obtain different p-values for each run!
> 
> With a regular ANOVA Test, I obtain instead a constant F-statistic, but I do 
> not
> fulfill the required Normality Assumptions.
> 
> So my questions are:
> 
> Would it still be possible use the regular aov() by generating permutations in
> advance (Obtaining therefore a Normal Distribution thanks to the Central
> Limit Theorem)? And applying the aov() function afterwards? Does it have
> sense?
> 
> 
> Or maybe this issue could be due to unbalanced classes? I also tried to weight
> observations based on proportions, but the function failed.
> 
> 
> Any alternative solution for performing a One-Way ANOVA Test over Non-
> Normal Data?
> 
> 
> Thank you.
> 
> Juan
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Permutation Test

2018-09-03 Thread Michael Dewey

Dear Juan

I do not use the package but if it does permutation tests it presumably 
uses random numbers and since you are not setting the seed you would get 
different values for each run.


Michael

On 03/09/2018 16:17, Juan Telleria Ruiz de Aguirre wrote:

Dear R users,

I have the following Question related to Package lmPerm:

This package uses a modified version of aov() function, which uses
Permutation Tests instead of Normal Theory Tests for fitting an Analysis of
Variance (ANOVA) Model.

However, when I run the following code for a simple linear model:

library(lmPerm)

e$t_Downtime_per_Intervention_Successful %>%
   aovp(
 formula = `Downtime per Intervention[h]` ~ `Working Hours`,
 data = .
   ) %>%
   summary()

I obtain different p-values for each run!

With a regular ANOVA Test, I obtain instead a constant F-statistic, but I
do not fulfill the required Normality Assumptions.

So my questions are:

Would it still be possible use the regular aov() by generating permutations
in advance (Obtaining therefore a Normal Distribution thanks to the Central
Limit Theorem)? And applying the aov() function afterwards? Does it have
sense?


Or maybe this issue could be due to unbalanced classes? I also tried to
weight observations based on proportions, but the function failed.


Any alternative solution for performing a One-Way ANOVA Test over
Non-Normal Data?


Thank you.

Juan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Michael
http://www.dewey.myzen.co.uk/home.html

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test to decide whether to use multiple linear regression or linear mixed effects model

2017-08-15 Thread S Ellison
> I am trying to decide between using a multiple linear regression or a linear
> mixed effects model for my data:
> ... 
> but I keep getting the following error code:
> 
> Error in anova.lmlist (object, ...):
> 
> models were not all fitted to the same size of dataset

anova is defaulting to anova.lm, and that doesn't expect a mixed effects model.

Switch them round to put model2 first:

anova (model2, model1)


S Ellison



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova() type iii SS plots and diagnostics

2016-07-19 Thread Fox, John
Dear Pamela,

I'm afraid that this question seems confused. Type III sums of squares are 
computed for a linear model -- it's the *linear model* that you'd check, in the 
usual manner, for outliers, etc., and this has nothing to do with how the ANOVA 
table for the model is computed.

Best,
 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Pamela
> Wong
> Sent: July 18, 2016 10:39 PM
> To: r-help@r-project.org
> Subject: [R] Anova() type iii SS plots and diagnostics
> 
> I am wondering if there is a way to plot results and model diagnostics (to 
> check
> for outliers, homoscedasticity, normality, collinearity) using type III sums 
> of
> squares in R __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test

2014-12-27 Thread peter dalgaard

 On 27 Dec 2014, at 02:50 , Richard M. Heiberger r...@temple.edu wrote:
 
 Kristi,
 
 The easiest way to do what you are looking for is the aovSufficient function
 in the HH package.
 
 ## install.packages(HH) ## if you don't have it yet
 library(HH)
 B.aov - aovSufficient(mean ~ site, data=B, sd=B$SE, weights=c(3,3,3,3))
 summary(B.aov)
 

Ummm, you want SD, not SE, in there, don't you?

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test

2014-12-26 Thread Bert Gunter
This is a statistical question primarily and, as such, is off topic
here. Either consult a local statistical expert or post to a
statistical site like stats.stackexchange.com  .

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Fri, Dec 26, 2014 at 6:38 AM, Kristi Glover
kristi.glo...@hotmail.com wrote:
 Hi R user,
 I am wondering whether I can perform a simple ANOVA analysis in the data in 
 which I  have mean + SE (+- Standard Error) for several groups.
 For this one,  I calculated upper and lower confidence interval and made 
 three classes for each group (mean, upper and lower values). After that, I 
 did ANOVA (simple Anova). I am wondering whether this is a wrong approach?  I 
 have given an example


 library(reshape)
 B-structure(list(mean = c(0.0241262, 0.0433538, 0.2204764, 0.7830054
 ), SE = c(0.0209097, 0.0329281, 0.1003248, 0.3019256), site = structure(1:4, 
 .Label = c(A,
 B, C, D), class = factor)), .Names = c(mean, SE,
 site), class = data.frame, row.names = c(NA, -4L))
 attach(B)
 B1-data.frame(B, Upper=mean+1.96*SE, Lower=mean-1.96*SE)
 B2-subset(B1, select=c(-2))
 B2
 B3-melt(B2, id=c(site))
 B3
 Anova-aov(B3$value~B3$site)
 summary(Anova)

 Thanks


 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test

2014-12-26 Thread peter dalgaard
You could at least say that no, that is patently wrong! There are ways to 
reconstruct an ANOVA from means, sd, and group sizes, but this isn't it. In 
fact, the group sizes are not even used in the code.

I agree about the need for a statistical expert. Fundamental misunderstandings 
seem to be present. 

-pd

 On 26 Dec 2014, at 16:23 , Bert Gunter gunter.ber...@gene.com wrote:
 
 This is a statistical question primarily and, as such, is off topic
 here. Either consult a local statistical expert or post to a
 statistical site like stats.stackexchange.com  .
 
 Cheers,
 Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 Clifford Stoll
 
 
 
 
 On Fri, Dec 26, 2014 at 6:38 AM, Kristi Glover
 kristi.glo...@hotmail.com wrote:
 Hi R user,
 I am wondering whether I can perform a simple ANOVA analysis in the data in 
 which I  have mean + SE (+- Standard Error) for several groups.
 For this one,  I calculated upper and lower confidence interval and made 
 three classes for each group (mean, upper and lower values). After that, I 
 did ANOVA (simple Anova). I am wondering whether this is a wrong approach?  
 I have given an example
 
 
 library(reshape)
 B-structure(list(mean = c(0.0241262, 0.0433538, 0.2204764, 0.7830054
 ), SE = c(0.0209097, 0.0329281, 0.1003248, 0.3019256), site = structure(1:4, 
 .Label = c(A,
 B, C, D), class = factor)), .Names = c(mean, SE,
 site), class = data.frame, row.names = c(NA, -4L))
 attach(B)
 B1-data.frame(B, Upper=mean+1.96*SE, Lower=mean-1.96*SE)
 B2-subset(B1, select=c(-2))
 B2
 B3-melt(B2, id=c(site))
 B3
 Anova-aov(B3$value~B3$site)
 summary(Anova)
 
 Thanks
 
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA test

2014-12-26 Thread Richard M. Heiberger
Kristi,

The easiest way to do what you are looking for is the aovSufficient function
in the HH package.

## install.packages(HH) ## if you don't have it yet
library(HH)
B.aov - aovSufficient(mean ~ site, data=B, sd=B$SE, weights=c(3,3,3,3))
summary(B.aov)

You must have the sample size for each of the groups.  Your example did
not include sample sizes, so there is no justification for the degrees
of freedom in the residual.

I invented sizes of 3 observations per group.

See the pulmonary example in ?aovSufficient for a complete example.

There are several style issues that need to be commented on.
It is usually a bad idea to attach a data.frame.  In this example, you attached
B and then never used it, and also never detached it.  That has the potential
to mask objects farther down the search() list.

When you use aov(), or any other function that takes a data= argument,
it is best to use the data= argument.

Rich

On Fri, Dec 26, 2014 at 9:38 AM, Kristi Glover
kristi.glo...@hotmail.com wrote:
 Hi R user,
 I am wondering whether I can perform a simple ANOVA analysis in the data in 
 which I  have mean + SE (+- Standard Error) for several groups.
 For this one,  I calculated upper and lower confidence interval and made 
 three classes for each group (mean, upper and lower values). After that, I 
 did ANOVA (simple Anova). I am wondering whether this is a wrong approach?  I 
 have given an example


 library(reshape)
 B-structure(list(mean = c(0.0241262, 0.0433538, 0.2204764, 0.7830054
 ), SE = c(0.0209097, 0.0329281, 0.1003248, 0.3019256), site = structure(1:4, 
 .Label = c(A,
 B, C, D), class = factor)), .Names = c(mean, SE,
 site), class = data.frame, row.names = c(NA, -4L))
 attach(B)
 B1-data.frame(B, Upper=mean+1.96*SE, Lower=mean-1.96*SE)
 B2-subset(B1, select=c(-2))
 B2
 B3-melt(B2, id=c(site))
 B3
 Anova-aov(B3$value~B3$site)
 summary(Anova)

 Thanks


 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA and permutation tests : beware of traps

2014-09-23 Thread Joshua Wiley
Hi Stephane,

This is the well known result of limitted floating point precision (e.g.,
http://www.validlab.com/goldberg/addendum.html).  Using a test of
approximate rather than exact equality shows R yields the correct answer:

nperm - 1
Fperm - replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) #
calculates nperm F values
(prob - (sum(anova1$F[1] = (Fperm + .Machine$double.eps ^ 0.5)) +
1)/(nperm +1))  # risk calculation

yields

0.10009

I picked .Machine$double.eps ^ 0.5 somewhat arbitrarily as the default for
equality testing from ?all.equal

Cheers,

Josh



On Tue, Sep 23, 2014 at 5:14 PM, Stéphane Adamowicz 
stephane.adamow...@avignon.inra.fr wrote:

 Recently, I came across a strange and potentially troublesome behaviour of
 the lm and aov functions that ask questions about calculation accuracy. Let
 us consider the 2 following datasets dat1  dat2 :

  (dat1 - data.frame(Y=c(1:3, 10+1:3), F=c(rep(A,3), rep(B,3
Y F
 1  1 A
 2  2 A
 3  3 A
 4 11 B
 5 12 B
 6 13 B
  (dat2 - data.frame(Y=c(10+1:3, 1:3), F=c(rep(A,3), rep(B,3
Y F
 1 11 A
 2 12 A
 3 13 A
 4  1 B
 5  2 B
 6  3 B

 They only differ in the order of values that were exchanged between
 samples A and B. Thus the sd is 1 for each sample in either data sets, and
 the absolute mean difference |A-B| is 10 in both datasets.
 Now, let us perform an anova to compare samples A and B in both datasets
 (of course, in such simple case, a bilateral T test would do the job, but
 an anova is nevertheless allowed and should give the same probability than
 Student's test):

  (anova1 - anova(lm(Y~F, dat1)))
 Analysis of Variance Table

 Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
 F  1150 150 150 0.0002552 ***
 Residuals  4  4   1

  (anova2 - anova(lm(Y~F, dat2)))
 Analysis of Variance Table

 Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
 F  1150 150 150 0.0002552 ***
 Residuals  4  4   1

 As expected, both datasets give a same anova table, but this is only
 apparent. Indeed :

  anova1$F[1] == anova2$F[1]
 [1] FALSE
  anova1$F[1] - anova2$F[1]
 [1] 5.684342e-14

 In fact the F values differ slightly, and this holds also for the aov
 function. I checked also (not shown) that both the residual and factorial
 sums of squares differ between dat1 and dat2. Thus, for some undocumented
 reason (at least for the end user), the F values depend on the order of
 data!
 While such tiny differences (e-14 in this example) are devoid of
 consequences on the risk evaluation by Fisher's distribution, they may have
 huge consequences on the risk evaluation by the permutation method. Indeed,
 the shift from continuous to discrete distributions is far from being
 insignificant.

 For instance, the following code in R is at the basis of many permutation
 algorithms found in the internet and in teaching because it seems quite
 straightforward (see for example
 http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Permutation%20Anova/PermTestsAnova.html
 http://www.usna.edu/Users/math/jager/courses/sm439/labs/Lab7.pdf

 http://biostatistics1014.blogspot.fr/2013/04/one-way-anova-permutation-test-and.html
 http://adn.biol.umontreal.ca/~numericalecology/Rcode/):

  nperm - 1000 # number of permutations
  Fperm - replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) #
 calculates nperm F values
  (prob - (sum(anova1$F[1] = Fperm) + 1)/(nperm +1))  # risk calculation
 [1] 0.04695305

 Of course, because of the sample function, repeating this code gives
 different prob values, but they remain always close to 5% instead of the
 exact probability of 10%. Indeed, there are only choose(6,3) = 20 possible
 permutations, but because they are symmetric, they give only 10 absolute
 mean differences. Thus, the only exact probabilities are 10%, 20% ... 100%.
 In our example where samples do not overlap, 10% is obviously the right
 answer.

 Thus, the use of lm and aov functions in permutation methods does not seem
 a good idea as it results in biases that underestimate the exact risk. In
 the simple case of one-way anova, it is quite simple to remedy this
 problem. As the total Sums of squares and the degrees of freedom do not
 change with permutations, it is easier and much faster to compare the
 residual sums of squares. For instance, the exact probabilities can be
 calculated that way :

  combi - combn(6, 3, FUN=function(i) {append(dat1$Y[i], dat1$Y[-i])}) #
 all permutations
  SCEResid - apply(combi, 2, FUN=function(x) sum(tapply(x, dat1$F,
 function(x) sum((x - mean(x))^2 # all resi SS
  (prob - mean(SCEResid = SCEResid[1])) # risk calculation
 [1] 0.1

 10% is indeed the exact risk.

 Finally, my problem is : How can we know if R libraries that use
 randomization procedures are not biased ? In the basic case of one way
 anova, it seems easy to submit the above example (by the way, the defunct
 lmPerm library does not succeed ...), but how can we check more complex
 anova models 

Re: [R] ANOVA and permutation tests : beware of traps

2014-09-23 Thread Prof Brian Ripley
Beware of the trap of listening to people with no knowledge of basic 
numerical methods!


It really is basic that the results of floating-point computer 
calculations depends on the order in which they are done (and the 
compiler can change the order).  Using == on such calculations is warned 
about on its help page.


The same warning applies to using = if equality is important.

It bears repeating that the problems are exacerbated on platforms which 
use extended-precision registers for some but not all of their 
calculations (and most R platforms do).  The classic reference is pp. 
248ff of http://www.validlab.com/goldberg/paper.pdf (part of a Sun manual).



On 23/09/2014 08:14, Stéphane Adamowicz wrote:

Recently, I came across a strange and potentially troublesome behaviour of the lm 
and aov functions that ask questions about calculation accuracy. Let us consider 
the 2 following datasets dat1  dat2 :


(dat1 - data.frame(Y=c(1:3, 10+1:3), F=c(rep(A,3), rep(B,3

Y F
1  1 A
2  2 A
3  3 A
4 11 B
5 12 B
6 13 B

(dat2 - data.frame(Y=c(10+1:3, 1:3), F=c(rep(A,3), rep(B,3

Y F
1 11 A
2 12 A
3 13 A
4  1 B
5  2 B
6  3 B

They only differ in the order of values that were exchanged between samples A 
and B. Thus the sd is 1 for each sample in either data sets, and the absolute 
mean difference |A-B| is 10 in both datasets.
Now, let us perform an anova to compare samples A and B in both datasets (of 
course, in such simple case, a bilateral T test would do the job, but an anova 
is nevertheless allowed and should give the same probability than Student's 
test):


(anova1 - anova(lm(Y~F, dat1)))

Analysis of Variance Table

Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
F  1150 150 150 0.0002552 ***
Residuals  4  4   1


(anova2 - anova(lm(Y~F, dat2)))

Analysis of Variance Table

Response: Y
   Df Sum Sq Mean Sq F valuePr(F)
F  1150 150 150 0.0002552 ***
Residuals  4  4   1

As expected, both datasets give a same anova table, but this is only apparent. 
Indeed :


anova1$F[1] == anova2$F[1]

[1] FALSE

anova1$F[1] - anova2$F[1]

[1] 5.684342e-14

In fact the F values differ slightly, and this holds also for the aov function. 
I checked also (not shown) that both the residual and factorial sums of squares 
differ between dat1 and dat2. Thus, for some undocumented reason (at least for 
the end user), the F values depend on the order of data!
While such tiny differences (e-14 in this example) are devoid of consequences 
on the risk evaluation by Fisher's distribution, they may have huge 
consequences on the risk evaluation by the permutation method. Indeed, the 
shift from continuous to discrete distributions is far from being insignificant.

For instance, the following code in R is at the basis of many permutation 
algorithms found in the internet and in teaching because it seems quite 
straightforward (see for example 
http://www.uvm.edu/~dhowell/StatPages/More_Stuff/Permutation%20Anova/PermTestsAnova.html
http://www.usna.edu/Users/math/jager/courses/sm439/labs/Lab7.pdf
http://biostatistics1014.blogspot.fr/2013/04/one-way-anova-permutation-test-and.html
http://adn.biol.umontreal.ca/~numericalecology/Rcode/):


nperm - 1000# number of permutations
Fperm - replicate(n=nperm, anova(lm(sample(Y) ~ F, dat1))$F[1]) # calculates 
nperm F values
(prob - (sum(anova1$F[1] = Fperm) + 1)/(nperm +1))  # risk calculation
  

[1] 0.04695305

Of course, because of the sample function, repeating this code gives different 
prob values, but they remain always close to 5% instead of the exact 
probability of 10%. Indeed, there are only choose(6,3) = 20 possible 
permutations, but because they are symmetric, they give only 10 absolute mean 
differences. Thus, the only exact probabilities are 10%, 20% ... 100%. In our 
example where samples do not overlap, 10% is obviously the right answer.

Thus, the use of lm and aov functions in permutation methods does not seem a 
good idea as it results in biases that underestimate the exact risk. In the 
simple case of one-way anova, it is quite simple to remedy this problem. As the 
total Sums of squares and the degrees of freedom do not change with 
permutations, it is easier and much faster to compare the residual sums of 
squares. For instance, the exact probabilities can be calculated that way :


combi - combn(6, 3, FUN=function(i) {append(dat1$Y[i], dat1$Y[-i])}) # all 
permutations
SCEResid - apply(combi, 2, FUN=function(x) sum(tapply(x, dat1$F, function(x) 
sum((x - mean(x))^2 # all resi SS
(prob - mean(SCEResid = SCEResid[1])) # risk calculation

[1] 0.1

10% is indeed the exact risk.

Finally, my problem is : How can we know if R libraries that use randomization 
procedures are not biased ? In the basic case of one way anova, it seems easy 
to submit the above example (by the way, the defunct lmPerm library does not 
succeed ...), but how can we check more complex 

Re: [R] anova

2014-08-21 Thread Lynn Govaert
I pressed enter to soon.

Again

Hi all,

I have troubles with doing an anova. So I have the following variables: a
variable group with two levels, a continuous variable trait and within each
group we have 12 organisms (clone) and for each clone we have 5 replicates.
So we want to see if for the variable trait the two groups differ,
including the information for the clones.

So the dataset looks like:

trait  groupclone
...  1   a1
...  1   a2
...
...  1   a12
...  1a1
...
...   2b1
...   2b2

etc

trait are just some numbers.
So clone is nested in group.

Then I don't understand how to make the anova,
I was thinking we want to see if there is an effect of group

So we built the model aov(trait ~group)
but because we also want to include the effect of clone which is a random
effect we do

aov(trait ~ group + Error(clone))
but because Clone is nested in group
we do
aov(trait ~ group + Error(group/clone))

but if I do this I get the following output


Error: PopulationTNFMF
Df  Sum Sq Mean Sq
PopulationTNFMF  1 0.00917 0.00917

Error: PopulationTNFMF:CloneTNFMF
  DfSum Sq   Mean Sq F value Pr(F)
Residuals  1 0.0001849 0.0001849

Error: Within
   Df  Sum Sq   Mean Sq F value Pr(F)
Residuals 117 0.02107 0.0001801


and I don't get any p-values, so I guess I'm doing something wrong.
Can someone help?

Thank you in advance
Lynn


2014-08-21 12:08 GMT+02:00 Lynn Govaert lynn.gova...@gmail.com:

 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  group


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova

2014-08-21 Thread Sarah Goslee
Hi Lynn,

Please read the posting guide, this document:
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

and don't post in HTML. (See your own email copied below for why.)

Sarah

On Thu, Aug 21, 2014 at 6:08 AM, Lynn Govaert lynn.gova...@gmail.com wrote:
 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  group

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova

2014-08-21 Thread Bert Gunter
... and you do get P values; you just don't understand what you're
looking at, which is more or less what you said.

This is not a statistical help site. You should seek local statistical
consulting advice or post on a statistical help site (caveat emptor!)
like stats.stackexchange.com, not here.

Cheers,
Bert



Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Thu, Aug 21, 2014 at 3:13 AM, Lynn Govaert lynn.gova...@gmail.com wrote:
 I pressed enter to soon.

 Again

 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  groupclone
 ...  1   a1
 ...  1   a2
 ...
 ...  1   a12
 ...  1a1
 ...
 ...   2b1
 ...   2b2

 etc

 trait are just some numbers.
 So clone is nested in group.

 Then I don't understand how to make the anova,
 I was thinking we want to see if there is an effect of group

 So we built the model aov(trait ~group)
 but because we also want to include the effect of clone which is a random
 effect we do

 aov(trait ~ group + Error(clone))
 but because Clone is nested in group
 we do
 aov(trait ~ group + Error(group/clone))

 but if I do this I get the following output


 Error: PopulationTNFMF
 Df  Sum Sq Mean Sq
 PopulationTNFMF  1 0.00917 0.00917

 Error: PopulationTNFMF:CloneTNFMF
   DfSum Sq   Mean Sq F value Pr(F)
 Residuals  1 0.0001849 0.0001849

 Error: Within
Df  Sum Sq   Mean Sq F value Pr(F)
 Residuals 117 0.02107 0.0001801


 and I don't get any p-values, so I guess I'm doing something wrong.
 Can someone help?

 Thank you in advance
 Lynn


 2014-08-21 12:08 GMT+02:00 Lynn Govaert lynn.gova...@gmail.com:

 Hi all,

 I have troubles with doing an anova. So I have the following variables: a
 variable group with two levels, a continuous variable trait and within each
 group we have 12 organisms (clone) and for each clone we have 5 replicates.
 So we want to see if for the variable trait the two groups differ,
 including the information for the clones.

 So the dataset looks like:

 trait  group


 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for RCBD, two factors: how to plot residuals?..

2014-07-16 Thread Richard M. Heiberger
## Manuel,

## Please look at the maiz example in ?mmc in the HH package.

install.packages(HH)  ## if necessary
library(HH)
?mmc

## After the full maiz example, then you need one more command

maiz.proj - proj(maiz.aov)
maiz.proj

## I think you are looking for
maiz.proj$Within[, Residuals]

## Rich

On Wed, Jul 16, 2014 at 9:46 AM, Manuel Lambert
lambert_man...@hotmail.com wrote:



 Hello,
 I'm having problems to analyse results from a RCB Design Experiment.I have 
 three blocks. For each block: four treatments (factor A), randomized. And for 
 each factor A treatment,  I have 6 differents treatments (factor C), 
 randomized.
 myAOV=aov(response ~ factorA*factorC + Block + Error(field), data)
 The model i use would be response = factorA + factorC + factorA:factorC + 
 Block (fixed effect) + Field (subplot, random effect).
 I have some good results but my problem is the following: i would like to 
 have a look at the residuals distribution and the command:
 residuals(myAOV)
 returns a NULL object. How to do then to extract the residuals? The function 
 works if i take off Error(field) from my aov function but this doesn't give 
 me relevant residuals right?
 Thanks a lot!




 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for proportions with large mass on an extreme of [0, 1]

2014-06-21 Thread Bert Gunter
Although your queries certainly intersect R, they are primarily about
statistical modeling, which is OT for this list. Your issues also
appear to be complex. I would therefore suggest that you eschew remote
Internet advice and consult a local statistical expert for help.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom.
Clifford Stoll




On Fri, Jun 20, 2014 at 8:54 PM, Joaquin Rapela rap...@ucsd.edu wrote:

 I am trying to perform an ANOVA on a dependent variable that has large mass
 on the 1 side of the (0, 1] interval. I decided to use Fractional Regression
 Models, as implemented in the package frm. This package seems well-suited for
 my problem, but I don't see how to perform model comparisons of nested frm
 models. Please, see data and code below.

 I would like to do:

 anova(model1, model2)

 There is a function frm.ptest(model1, model2), but does not work with nested
 models.

 Are there alternatives to the frm package to perform ANOVAs on proportions
 (with large mass on an extreme of [0, 1])?

 Is there a way to model repeated measures (as in package lme4) when the 
 dependent variable is a  proportion?

 Data and code
 -

 con - url(http://sccn.ucsd.edu/~rapela/avshift/anovaDataFrame.RData;)
 myData - get(load(con))
 close(con)

 myData - myData[!is.na(myData$alternationRate),]
 y - myData$alternationRate

 library(frm)

 model1 - frm(y=y, x=model.matrix(~modality*condition+clusterID, 
 data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, inflation=1)
 model2 - frm(y=y, x=model.matrix(~modality+condition+clusterID, 
 data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, inflation=1)

 # this works
 frm.ptest(model2, model3)

 # but this does not
 # frm.ptest(model1, model2)
 #
 # Error in frm.ptest(model1, model2) :
 # object 2 is nested in object 1 - no need to use the P test

 Thanks, Joaquin

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for proportions with large mass on an extreme of [0, 1]

2014-06-21 Thread Joaquin Rapela
Bert,

My queries are directly related to R: 

1. Can the R package frm can be used to compare nested models. If so, how.

2. Are there alternative R packages to perform ANOVAs on a dependent variable
that is a proportion with significant mass one one extreme?

Also, my statistical problem is well defined (i.e., perform an ANOVA, ideally
considering repeated measures, on a dependent variable that is a proportion),
there is substantial literature addressing this problem (see some old and
newer articles below), and I am just looking for help on R packages
implementing the functionality described in this literature.

I believe my statistical problem is not complex. After all, ANOVAs are one of
the most common types of statistical analysis, and proportions frequently
appear in statistical problems.  

I just wanted to get some feedback from R users with expertise performing
ANOVAs on proportions. I trust that my statistical problem is not rare, that
several R users have worked on this problem, and that I will get very useful
feedback from them. If I get this feedback, it will show that my question is
no OT. Let's see what happens ...

Cordially, Joaquin

Papke, L. and J.M. Wooldridge (1996), Econometric methods for fractional
response variables with an application to 401(K) plan participation rates,
Journal of Applied Econometrics, 11(6), 619-32.

Ramalho, E.A., J.J.S. Ramalho and J.M.R. Murteira (2011), Alternative
estimating and testing empirical strategies for fractional regression
models, Journal of Economic Surveys, 25(1), 19-68.

On Sat, Jun 21, 2014 at 10:27:27AM -0700, Bert Gunter wrote:
 Although your queries certainly intersect R, they are primarily about
 statistical modeling, which is OT for this list. Your issues also
 appear to be complex. I would therefore suggest that you eschew remote
 Internet advice and consult a local statistical expert for help.
 
 Cheers,
 Bert
 
 Bert Gunter
 Genentech Nonclinical Biostatistics
 (650) 467-7374
 
 Data is not information. Information is not knowledge. And knowledge
 is certainly not wisdom.
 Clifford Stoll
 
 
 
 
 On Fri, Jun 20, 2014 at 8:54 PM, Joaquin Rapela rap...@ucsd.edu wrote:
 
  I am trying to perform an ANOVA on a dependent variable that has large mass
  on the 1 side of the (0, 1] interval. I decided to use Fractional Regression
  Models, as implemented in the package frm. This package seems well-suited 
  for
  my problem, but I don't see how to perform model comparisons of nested frm
  models. Please, see data and code below.
 
  I would like to do:
 
  anova(model1, model2)
 
  There is a function frm.ptest(model1, model2), but does not work with nested
  models.
 
  Are there alternatives to the frm package to perform ANOVAs on proportions
  (with large mass on an extreme of [0, 1])?
 
  Is there a way to model repeated measures (as in package lme4) when the 
  dependent variable is a  proportion?
 
  Data and code
  -
 
  con - url(http://sccn.ucsd.edu/~rapela/avshift/anovaDataFrame.RData;)
  myData - get(load(con))
  close(con)
 
  myData - myData[!is.na(myData$alternationRate),]
  y - myData$alternationRate
 
  library(frm)
 
  model1 - frm(y=y, x=model.matrix(~modality*condition+clusterID, 
  data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, 
  inflation=1)
  model2 - frm(y=y, x=model.matrix(~modality+condition+clusterID, 
  data=myData)[, -1], linkfrac=logit, linkbin=logit, type=2P, 
  inflation=1)
 
  # this works
  frm.ptest(model2, model3)
 
  # but this does not
  # frm.ptest(model1, model2)
  #
  # Error in frm.ptest(model1, model2) :
  # object 2 is nested in object 1 - no need to use the P test
 
  Thanks, Joaquin
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

-- 
Joaquin Rapela, PhD
Swartz Center for Computational Neuroscience
University of California San Diego
9500 Gilman Drive,
San Diego, CA 92093-0559
tel: (858) 822-7547
fax: (858) 822-7556
http://sccn.ucsd.edu/~rapela
--


   Tak fyr, and ber it in the derkeste hous
   Bitwix this and the Mount of Caucasus,
   And lat men shette the dores and go thenne,
   Yet wol the fyr as faire lye and brenne,
   As twenty thousand men mighte it biholde:
   His office naturel ay wol it holde,
   Up peril of my lyf, til that it dye.

 Geoffrey Chaucer
 The Canterbury Tales

__
R-help@r-project.org mailing list

Re: [R] ANOVA repeated mesures

2013-12-23 Thread PIKAL Petr
Hi

You have only single value for each participant in each gruppo and AFAIK you 
can not do statistic on single value.

You can check differences among participants

fit-lm(valor~participantes, data=df2)
 fit-lm(valor~participantes, data=df2)
 anova(fit)
Analysis of Variance Table

Response: valor
  Df Sum Sq Mean Sq F value Pr(F)
participantes  7  5.167  0.7381  0.3163 0.9358
Residuals 16 37.333  2.   

and in groups

 fit-lm(valor~grupo, data=df2)
 anova(fit)
Analysis of Variance Table

Response: valor
  Df Sum Sq Mean Sq F valuePr(F)
grupo  2  22.75 11.3750  12.095 0.0003202 ***
Residuals 21  19.75  0.9405  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Regards
Petr


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Ma Teresa Martinez Soriano
 Sent: Wednesday, December 18, 2013 4:17 PM
 To: r-help@r-project.org
 Subject: [R] ANOVA repeated mesures
 
 Hi to everyone, I am tring to make a Anova with repeated measures,my
 data set looks like:
 participantes - c(1, 2, 3, 4, 5, 6, 7, 8, 1, 2,
 3, 4,   5, 6, 7, 8, 1, 2, 3, 4, 5, 6, 7, 8)
 grupo - factor(c(rep(A, 8), rep(B, 8), rep(C, 8)))valor - c(1,
 2, 4, 1, 1, 2, 2, 3, 3, 4, 4, 2, 3, 4, 4, 3, 4, 5, 3, 5, 5, 3,  4,
 6)df2 - data.frame(participantes, grupo, valor)
 I want to find if there is differences in the grupo  in each
 participant. I don't know how to find this p-value for each
 participant, could you help me, please??Thanks a lot
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in R

2013-12-07 Thread Bert Gunter
I think you start by doing your homework:

1. Read An Inroduction to R (ships with R) or other R online
tutorial. There are many good ones.

2. Use R's Help system:
?aov
?lm
?anova

there will be relevant links in these docs that you should follow,
especially to the use of formulas for model specification and
summary() for printing results (including anova tables).

3. Alternatively, have a look at the Rcmdr package for a GUI interface
to R that may meet your needs while allowing you to skip the homework.

Cheers,
Bert




On Sat, Dec 7, 2013 at 8:30 AM, Robin Mjelle robinmje...@gmail.com wrote:
 ID

 a_t1a_t2b_t1b_t2
 CACCCGTAGAACCGACCTTGCG_mmu-miR-99b-5p15781941234810941
 CACCCGTAGAACCGACCTTGC_mmu-miR-99b-5p4424265643839
 CACCCGTAGAACCGACCTTG_mmu-miR-99b-5p544366253
 CCGTAGAACCGACCTTGCG_mmu-miR-99b-5p263333157
 CGTAGAACCGACCTTGCG_mmu-miR-99b-5p232238187
 CACCCGTAGAACCGACCT_mmu-miR-99b-5p172731189
 ACCCGTAGAACCGACCTTGCG_mmu-miR-99b-5p14193094

 I want to run an ANOVA test on the rows above. I have two conditions with
 two replicates each. I assume that there is no difference between a and b
 for the different rows. I want to get which rows a different between a and
 b.
 How do I approach this in R?

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

(650) 467-7374

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova split by factors

2013-11-18 Thread ONKELINX, Thierry
Dear Catalin,

Have a look at the plyr package.

library(plyr)
dlply(
eg,
.(Exp),
function(x) {
aov(masa.uscat.tr~Clona,data=x)
}
)

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey


-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
catalin roibu
Verzonden: maandag 18 november 2013 13:24
Aan: r-help@r-project.org
Onderwerp: [R] Anova split by factors

Hello R-users,
I have a problem with Anova in R and I don't know how to solve that. I want to 
compute Anova for each experiment (exp). I try this code:

test-lapply(split(eg,eg$Exp),function(x) aov(masa.uscat.tr ~ Clona,data =
x))
or
test-by(eg,eg$Exp, function(x) aov(masa.uscat.tr~Clona,data=x))

I want to compute Anova summary for each experiment (exp) and I want to compute 
Tuckey test for each Anova.

Thank you very much!

My data is like this:



Exp Plot Clona Prov masa uscat tr masa usc. Ram masa usc total B 36 Max4 P 
Puieti 6.199848485 2.639325843 8.839174328 B 36 Max4 P Puieti 3.87875 1.4798 
5.35855 B 36 Max4 P Puieti 7.822702703 3.32852071 11.15122341 B 36 Max4 P 
Puieti 5.645384615 1.995238095 7.640622711 B 36 Max4 P Puieti 10.2 3.514864865 
13.71486486 B 36 Max4 P Puieti 8.815545455 3.35627907 12.17182452 B 36 Max4 P 
Puieti 5.0 1.607142857 6.64047619 B 36 Max4 P Puieti 6.693488372 
2.630208333 9.323696705 B 36 Max4 P Puieti 6.021012658 1.60293578 7.623948438 B 
36 Max4 P Puieti 10.20582524 3.768314607 13.97413985 B 33 Max4 Butasi 
5.899417476 1.745394737 7.644812213 B 33 Max4 Butasi 3.261428571 1.335735294 
4.597163866 B 33 Max4 Butasi 3.359508197 1.456641221 4.816149418 B 33 Max4 
Butasi 5.036363636 2.097793103 7.13415674 B 33 Max4 Butasi 3.122162162 
1.612012579 4.734174741 B 33 Max4 Butasi 5.042474227 3.246916427 8.289390653 B 
33 Max4 Butasi 5.058255814 2.724299065 7.782554879 B 33 Max4 Butasi 4.977818182 
1!
 .713504274 6.691322455 B 33 Max4 Butasi 3.195294118 1.243411765 4.438705882 B 
33 Max4 Butasi 1.831818182 1.009090909 2.840909091 B 30 AF2 Butasi 6.98195122 
1.764 8.74595122 B 30 AF2 Butasi 5.85833 1.686623377 7.54495671 B 30 AF2 
Butasi 10.625 3.04 13.665 C 39 AF6 Sade 10.27125 2.283193277 12.55444328 C 39 
AF6 Sade 9.473488372 1.909414414 11.38290279 C 39 AF6 Sade 10.4825 2 12.4825 C 
39 AF6 Sade 11.61579545 2.431136364 14.04693182 C 39 AF6 Sade 8.185074627 
1.80933 9.99440796 C 39 AF6 Sade 11.04510638 2.23672956 13.28183594 C 39 
AF6 Sade 9.06667 2.473785714 11.54045238 C 39 AF6 Sade 10.12787611 
3.097631579 13.22550769 C 39 AF6 Sade 9.171290323 1.821226415 10.99251674 C 39 
AF6 Sade 12.1846875 2.262590361 14.44727786 C 42 Pannonia Sade 9.275 
2.482173913 11.75717391 C 42 Pannonia Sade 7.21 1.77 8.98 C 42 Pannonia Sade 
11.36939394 3.111780822 14.48117476 C 42 Pannonia Sade 7.85296875 1.943475177 
9.796443927 C 42 Pannonia Sade 8.25 2.54047619 10.79047619 C 42 Pannonia !
 Sade 8.669277108 2.187071429 10.85634854 C 42 Pannonia Sade 8.51088607
6 2.05344 10.56432608 C 42 Pannonia Sade 9.36222 5.525531915 14.88775414 C 
42 Pannonia Sade 11.08481928 2.573193277 13.65801255 C 42 Pannonia Sade 
10.17462687 3.003225806 13.17785267 C 45 Monviso Sade 12.99693878 3.216083916 
16.21302269 C 45 Monviso Sade 11.11456522 1.885714286 13.0002795 C 45 Monviso 
Sade 8.12933 1.53267 9.662 C 45 Monviso Sade 9.943043478 2.38300885 
12.32605233 C 45 Monviso Sade 11.9805814 3.080923913 15.06150531 C 45 Monviso 
Sade 10.31376623 2.210526316 12.52429255 C 45 Monviso Sade 9.947586207 
1.86083 11.80841954 C 45 Monviso Sade 12.24261538 2.166857143 14.40947253 C 
45 Monviso Sade 13.56650602 2.414371257 15.98087728 C 45 Monviso Sade 
10.87574257 2.922340426 13.798083 C 48 AF2 Sade 9.334545455 2.201822917 
11.53636837 C 48 AF2 Sade 9.747640449 2.811780822 12.55942127 C 48 AF2 Sade 
14.29541284 4.506885246 18.80229809 C 48 AF2 Sade 10.26451613 3.014322581 
13.27883871 C 48 AF2 Sade 13.30924242 3.960661765 17.26990419 C 48 AF2 Sade 
13.232284!
 26 5.00546875 18.23775301 C 48 AF2 Sade 14.8277027 4.450447761 19.27815046 C 
48 AF2 Sade 15.23669528 7.559322034 22.79601731 C 48 AF2 Sade 13.27198582 
3.758252427 17.03023824 C 48 AF2 Sade 13.7136 5.69044 19.40408889 C 51 
Max4 Sade 8.860884956 5.613586957 14.47447191 C 51 Max4 Sade 13.29153285 

Re: [R] ANOVA for mixture experiment

2013-07-15 Thread S Ellison
 But the ANOVA results is incorrect when I use
 
 aov.out = aov(Response~-1+x1*x2*x3).
First, this doesn't work at all. Your variables are in upper case and your data 
environment is not specified, so you must have done something else. I can get 
an answer using something like
anova(lm(Response~-1+X1*X2*X3-I(X1*X2*X3), data=d))
(I dropped the indetyerminable 3-way product explicitly)

Second, your answer differs from that listed by 'the original paper' at least 
partly because you have done something different from what has been done in the 
paper. You have asked for anova using Type I sums of squares. The paper has 
apparently calculated something other than a Type I sums of squares ANOVA.
 
Try calculating using Type 3 sums of squares using Anova from the car package. 
I did that using
Anova(lm(Response~-1+X1+X2+X3+I(X1*X2)+I(X1*X3)+I(X2*X3), data=d), type=3)

Then read the help page - especially the Warning section - for ?Anova very 
carefully ...


S Ellison


***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Estimated effects may be unbalanced

2013-05-19 Thread Jim Lemon

On 05/20/2013 10:52 AM, Luis Fernando García Hernández wrote:

Dear All,

This is a data relating leg shaking on differenntre treatments. I reused
several individuals so I want to know 1) if there are significative
differences on shaking per treatment and 2) if the reused individuals
presented some effect or significative variation.

Nevertehless when I make

model1-aov(Legshaking~Idmale*Idfemale*tratament)

or

model1-aov(Legshaking~Idmale+Idfemale+tratament)

I get

13744 out of 13824 effects not estimable
Estimated effects may be unbalanced

What could be wrong here? Any advice about how to handle this data in R
will be welcome!


Hi Luis,
Are you sure that you don't want to compare males and females as a group 
and not every individual male and female?


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Estimated effects may be unbalanced

2013-05-19 Thread Bert Gunter
I would say  that the OP should seek local statistical help, as he
appears to be out of his statistical depth. This would appear to be a
mixed effects models-type setup, but a local statistical expert would
be much better able to judge what the goals of the study were and what
sort of approach, including graphics, might be appropriate.

Seeking remote statistical help from this list when your statistical
background is weak is

a) Inappropriate -- this is not a statistical help list; it's for R software;

b) Highly risky, as relevant details  of the underlying context and
the way the data were obtained are likely to be overlooked. It is
**not** merely a question of finding the appropriate function and just
turning the crank, although that's the way many seems to approach data
analysis.

Cheers,
Bert
On Sun, May 19, 2013 at 6:00 PM, Jim Lemon j...@bitwrit.com.au wrote:
 On 05/20/2013 10:52 AM, Luis Fernando García Hernández wrote:

 Dear All,

 This is a data relating leg shaking on differenntre treatments. I reused
 several individuals so I want to know 1) if there are significative
 differences on shaking per treatment and 2) if the reused individuals
 presented some effect or significative variation.

 Nevertehless when I make

 model1-aov(Legshaking~Idmale*Idfemale*tratament)

 or

 model1-aov(Legshaking~Idmale+Idfemale+tratament)

 I get

 13744 out of 13824 effects not estimable
 Estimated effects may be unbalanced

 What could be wrong here? Any advice about how to handle this data in R
 will be welcome!

 Hi Luis,
 Are you sure that you don't want to compare males and females as a group and
 not every individual male and female?

 Jim

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova und Tukey HSD

2013-05-13 Thread Meyners, Michael
Jochen,

a) this is an English-spoken mailing list; other languages are not encouraged 
nor will they typically generate a lot of replies...
b) your code is fine, so this is not an R-issue; you are rather stuck with some 
of the stats background -- you might want to see a friendly local statistician 
for advice :-) or try a stats mailing list (see e.g. the posting guide)
c) your interpretation is ok -- if the CI does not contain 0, you'd have 
significance at the respective level. Alternatively, you might also use the p 
value (column p adj) and compare that to your significance level (and it 
seems odd that any search machine might fail to give you some links on the 
relationship of p values and CIs -- I got plenty at first try)
d) In brief: you are doing ~100 (rough estimate) paired comparisons. With that, 
you can expect TukeyHSD to be pretty conservative, and it does not come as a 
surprise that you are unable to identify pairs of different groups, even less 
as the p value from the F test is not extremely small (~3%; whatever extremely 
small may mean in this case). Google also easily gets me to this site (and 
many other useful ones): http://www.pmean.com/05/TukeyTest.html which might 
provide some further guidance.

Net, tough luck to encounter such a situation, but apparently none of the group 
differences is strong enough to still show up as significant after correcting 
for multiplicity (which is certainly called for) in this case where you have an 
inflated number of significance tests. 

HTH, Michael

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of Jochen Schreiber
 Sent: Montag, 13. Mai 2013 14:25
 To: r-help@r-project.org
 Subject: [R] Anova und Tukey HSD
 
 Hallo zusammen,
 
 ich mache zuerst mittels eine Anova eine Überprüfung ob in den Daten ein
 signifikanter Unterschied exisitiert. Danach will ich mittels des TukeyHSD
 rausfinden, zwischen welchen Gruppe der Unterschied vorliegt. Allerdings
 weiß ich nicht wie ich die Daten interpretieren soll. Hier erstmal mein R 
 Code:
 
 [code]
 Value-c(-0.9944999814033508,-0.3585381469727,0.7063000202178955,-
 1.77435803833,-
 1.080299973487854,0.3055071525574,1.849046325684,-
 0.412440395355,0.5827999711036682,1.750669482422,-
 6.693999767303467,-0.877943869019,-
 1.3408000469207764,1.2560999393463135,-
 0.1004081062317,1.849046325684,-
 0.331905722046,0.4957999885082245,0.877943869019,0.73879998922
 34802,0.877943869019,0.9154000282287598,0.877943869019,0.70630
 00202178955,-1.3408000469207764,0.7063000202178955,-
 0.331905722046,-1.6448999643325806,0.412440395355,-
 1.6448999643325806,-
 0.877943869019,0.7487000226974487,0.4399000108242035,1.8490463
 25684,-1.6448999643325806,-2.4323999881744385,1.2265000343322754,-
 0.4957999885082245,-9.999899864196777,-1.750669482422,-
 1.6448999643325806,-9.999899864196777,0.877943869019,-
 5.06279993057251,0.877943869019,-2.967745776367,-
 5.06279993057251,-6.693999767303467,-
 1.0990500450134277,0.9944999814033508,-0.467745776367,-
 0.3585381469727,-
 9.999899864196777,0.5827999711036682,0.7487000226974487,0.738799989223
 4802,-0.2533000111579895,-9.999899864196777,-
 1.0363999605178833,0.3055071525574,-1.174523162842,-
 0.806410490417,-9.999899864196777,-0.9944999814033508,-
 2.478300094604492,-0.150930858612,0.4957999885082245,-
 4.571800231933594,-6.324900150299072,-0.38530001044273376,-
 1.3408000469207764,-5.93179988861084,-6.693999767303467,-
 2.967745776367,0.877943869019,-0.05020405311584,-
 1.77435803833,-0.150930858612,-0.23725003004074097,-
 0.643268528748,1.2560999393463135,-
 0.1004081062317,0.4399000108242035,-
 0.7063000202178955,0.9154000282287598,-
 0.2181814033508,1.2265000343322754,-
 0.412440395355,0.1764581741333,-1.4758000373840332,-
 0.9944999814033508,-1.080299973487854,-0.643268528748,-
 9.999899864196777,-2.0536999702453613,-
 0.2181814033508,0.7487000226974487,0.02510202655792,-
 1.0363999605178833,-0.05020405311584,-
 0.7387999892234802,0.4957999885082245,-1.4758000373840332,-
 0.7063000202178955,0.1764581741333,-5.06279993057251,-
 0.643268528748,-1.4758000373840332,-0.9944999814033508,-
 0.2533000111579895,0.1764581741333,-
 0.331905722046,0.6776500344276428,0.3055071525574,-
 0.05020405311584,0.5827999711036682,1.2560999393463135,-
 0.4957999885082245,-0.38530001044273376,0.9944999814033508,-
 2.4323999881744385,1.126338964844,-
 0.9944999814033508,1.750669482422,1.080299973487854,-
 0.7387999892234802,-1.3408000469207764,0.612820980835,-
 2.0536999702453613,0.7063000202178955,-0.806410490417,-
 0.877943869019,-0.05020405311584,-2.967745776367,-
 0.877943869019,-2.0536999702453613,-1.3408000469207764,-
 1.3408000469207764,-0.38530001044273376,0.7063000202178955,-
 9.999899864196777,-
 

Re: [R] Anova unbalanced

2013-04-17 Thread Jose Iparraguirre
Dear Claudia,

Your question has been posed on many previous occasions. 
The (short) answer has always been the same: have a look at the Anova function 
in the car package but before doing that, get a copy of John Fox's Applied 
Regression Analysis and Generalized Linear Models book.
Best,

José


José Iparraguirre
Chief Economist
Age UK

T 020 303 31482
E jose.iparragui...@ageuk.org.uk
Twitter @jose.iparraguirre@ageuk


Tavis House, 1- 6 Tavistock Square
London, WC1H 9NB
www.ageuk.org.uk | ageukblog.org.uk | @ageukcampaigns 




-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of paladini
Sent: 17 April 2013 10:47
To: r-help@r-project.org
Subject: [R] Anova unbalanced

Hello everybody,
I have got a data set with about 400 companies. Each company has a 
score for its enviroment comportment between 0 and 100. These companies 
belong to  about 15 different countries. I have e.g. 70 companies from 
UK and 5 from Luxembourg,- so the data set is pretty unbalanced and I 
want to do an ANOVA. Somthing like aov(enviromentscore~country). But the 
aov function is just for a balanced design.
So I wonder if I can use fit=lm(enviromentscore~country), anova (fit) 
instead? Would this be okay or can it also only be used with balanced 
data?

Thanking you in anticipation, best regards


Claudia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Please donate to the Syria Crisis Appeal by text or online:

To donate £5 by mobile, text SYRIA to 70800.  To donate online, please visit 

http://www.ageinternational.org.uk/syria

Over one million refugees are desperately in need of water, food, healthcare, 
warm clothing, 
blankets and shelter; Age International urgently needs your support to help 
affected older refugees.


Age International is a subsidiary charity of Age UK and a member of the 
Disasters Emergency Committee (DEC).  
The DEC launches and co-ordinates national fundraising appeals for public 
donations on behalf of its member agencies.

Texts cost £5 plus one standard rate message.  Age International will receive a 
minimum of £4.96.  
More info at ageinternational.org.uk/SyriaTerms



 

---
Age UK is a registered charity and company limited by guarantee, (registered 
charity number 1128267, registered company number 6825798). 
Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.

For the purposes of promoting Age UK Insurance, Age UK is an Appointed 
Representative of Age UK Enterprises Limited, Age UK is an Introducer 
Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth 
Access for the purposes of introducing potential annuity and health 
cash plans customers respectively.  Age UK Enterprises Limited, JLT Benefit 
Solutions Limited and Simplyhealth Access are all authorised and 
regulated by the Financial Services Authority. 
--

This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are 
addressed. If you receive a message in error, please advise the sender and 
delete immediately.

Except where this email is sent in the usual course of our business, any 
opinions expressed in this email are those of the author and do not 
necessarily reflect the opinions of Age UK or its subsidiaries and associated 
companies. Age UK monitors all e-mail transmissions passing 
through its network and may block or modify mails which are deemed to be 
unsuitable.

Age Concern England (charity number 261794) and Help the Aged (charity number 
272786) and their trading and other associated companies merged 
on 1st April 2009.  Together they have formed the Age UK Group, dedicated to 
improving the lives of people in later life.  The three national 
Age Concerns in Scotland, Northern Ireland and Wales have also merged with Help 
the Aged in these nations to form three registered charities: 
Age Scotland, Age NI, Age Cymru.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova comparisons

2013-02-23 Thread Rolf Turner

On 02/23/2013 08:55 PM, Robert Zimbardo wrote:

I have several linear models on the same data:

m1 - lm(y ~ poly(x,1))
m2 - lm(y ~ poly(x,2))
m3 - lm(y ~ poly(x,3))

What I don't understand is why

anova(m1, m2, m3, test=F)

- yields the same RSS and SS values, but a different p-value from anova(m1,
m2, test=F)
- when it also yields the SAME as anova(m2, m3, test=F)

What am I missing?


A basic understanding of the theory of linear models.  This really has 
little

to do with R.  Go and read a good intro to linear modelling.

Insofar as your question has anything to do with R:

When you do

anova(m1, m2, m3, test=F)

the mean squared error from m3 is used as the denominator of the F 
statistic.


When you do

anova(m1, m2, test=F)

the mean squared error from m2 is used as the denominator of the F 
statistic.


cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova comparisons

2013-02-23 Thread peter dalgaard

On Feb 23, 2013, at 10:06 , Rolf Turner wrote:

 
 What am I missing?
 
 A basic understanding of the theory of linear models.  This really has little
 to do with R.  Go and read a good intro to linear modelling.
 
 Insofar as your question has anything to do with R:
 
 When you do
 
anova(m1, m2, m3, test=F)
 
 the mean squared error from m3 is used as the denominator of the F statistic.
 
 When you do
 
anova(m1, m2, test=F)
 
 the mean squared error from m2 is used as the denominator of the F statistic.

Maybe a bit quick there. It's actually about (computer) conventions for ANOVA 
tables; the theory would prefer you to do what the poster expects: Test 
sequentially and update the denominator MS along the route. However, lazy 
programmers (and before them lazy practising statisticians) decided that it is 
easier to use the MS from the biggest model throughout.

(There's also a post-hoc argument, that said MS is less susceptible to small 
departures from the model assumptions, but I'd say that the main reason is 
convenience.)

Even though it is a common convention, you do need exposure to at least one 
statistical computer program or one traditional analysis of variance textbook 
to see it. 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA repeated measures and post-hoc

2012-12-30 Thread Helios de Rosario
A (late) update to this question:

On Fri Aug 17 07:33:29, Henrik Singmann wrote:
 Hi Diego,

 I am struggeling with this question also for some time and there does

 not seem to be an easy and general solution to this problem. At least
I 
 haven't found one.
 However, if you have just one repeated-measures factor, use the
solution 
 describe by me here: http://stats.stackexchange.com/a/15532/442 
 Furthermore, you might wanna check the package phia with accompanying

 vignette (but it uses multivariate tests).

There is a new version of phia that also works with mixed-effects
models - but notice the caveat about the calculation of p-values for
such models, as explained in the vignette.

For more issues about mixed-effects models in R, there is a specifice
SIG-list:
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models

 For running the ANOVA you could also check my new package afex.

 Cheers,
 Henrik
 
 
 Diego Bucci schrieb:
 Hi,
  I performed an ANOVA repeated measures but I still can't find any
good news regarding the possibility to perform multiple comparisons.
  Can anyone help me?
  Thanks


INSTITUTO DE BIOMECÁNICA DE VALENCIA
Universidad Politécnica de Valencia • Edificio 9C
Camino de Vera s/n • 46022 VALENCIA (ESPAÑA)
Tel. +34 96 387 91 60 • Fax +34 96 387 91 69
www.ibv.org

  Antes de imprimir este e-mail piense bien si es necesario hacerlo.
En cumplimiento de la Ley Orgánica 15/1999 reguladora de la Protección
de Datos de Carácter Personal, le informamos de que el presente mensaje
contiene información confidencial, siendo para uso exclusivo del
destinatario arriba indicado. En caso de no ser usted el destinatario
del mismo le informamos que su recepción no le autoriza a su divulgación
o reproducción por cualquier medio, debiendo destruirlo de inmediato,
rogándole lo notifique al remitente.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova

2012-11-27 Thread Jose Iparraguirre
Hi, 

You can use the car package and choose Type-III test.
Also, have a look at the package easyanova.
Regards,
José

José Iparraguirre
Chief Economist
Age UK


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of humming-bird
Sent: 27 November 2012 10:12
To: r-help@r-project.org
Subject: [R] Anova

Hi everyone,

I am new to this forum and also new to statistics and I would appreciated it
if someone would take some time to answer my question.

I am analyzing companies in regard to their leverage. I categorized the
companies into 3 groups: small, mid and large. For the group small, I have
55 debt multiples, for mid 42 and for large 72. (Unfortunately I can not
provide my data because it is confidential.)
I am now trying to find out whether the mean debt multiples are
significantly different for the 3 different groups. For this reason I
calculated an anova table with the aov function and to display the results
for each pair I did the tukey.hsd function.
Now my question: Am I allowed to use these functions given that my data is
unbalanced? Can use I read several times that aov is only valid for balanced
data? If not, is there another function that I can use?

Thank you very much for your answers.


Call:
aov(formula = Debt.Ebitdax ~ Company.Size, data = Anetdebtx2003)

Terms:
Company.Size Residuals
Sum of Squares 302.3089 926.2174
Deg. of Freedom 2 166

Residual standard error: 2.362123 
Estimated effects may be unbalanced

 summary(Anovanetdebtx2003)
Df Sum Sq Mean Sq F value Pr(F) 
Company.Size 2 302.3 151.15 27.09 6.58e-11 ***
Residuals 166 926.2 5.58 
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


 TukeyHSD(Anovanetdebtx2003)
Tukey multiple comparisons of means
95% family-wise confidence level

Fit: aov(formula = Debt.Ebitdax ~ Company.Size, data = Anetdebtx2003)

$Company.Size
diff lwr upr p adj
Mid Market Buyout-Large Buyout -1.446292 -2.530922 -0.3616617 0.0054123
Small Buyout-Large Buyout -3.112143 -4.112545 -2.1117420 0.000
Small Buyout-Mid Market Buyout -1.665852 -2.810574 -0.5211300 0.0021037



--
View this message in context: http://r.789695.n4.nabble.com/Anova-tp4650940.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

A Star for Christmas

Kick start the festive season by attending one of Age UK’s Carol Concerts, A 
Star for Christmas. Taking place at Manchester Cathedral on Saturday 1 December 
and London’s St Pancras Church (opposite Euston Station) on Thursday 6 
December, they will feature special musical performances, readings by your 
favourite celebrities and carols, followed by mince pies and wine.
Tickets are priced at £20 full price/ £10 concessions. For more information, 
please visit http://www.ageuk.org.uk/astarforchristmas  or contact the 
Fundraising Events Team on
020 303 31725.

Age UK  Improving later life
www.ageuk.org.uk

 

---
Age UK is a registered charity and company limited by guarantee, (registered 
charity number 1128267, registered company number 6825798). 
Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.

For the purposes of promoting Age UK Insurance, Age UK is an Appointed 
Representative of Age UK Enterprises Limited, Age UK is an Introducer 
Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth 
Access for the purposes of introducing potential annuity and health 
cash plans customers respectively.  Age UK Enterprises Limited, JLT Benefit 
Solutions Limited and Simplyhealth Access are all authorised and 
regulated by the Financial Services Authority. 
--

This email and any files transmitted with it are confidential and intended 
solely for the use of the individual or entity to whom they are 
addressed. If you receive a message in error, please advise the sender and 
delete immediately.

Except where this email is sent in the usual course of our business, any 
opinions expressed in this email are those of the author and do not 
necessarily reflect the opinions of Age UK or its subsidiaries and associated 
companies. Age UK monitors all e-mail transmissions passing 
through its network and may block or modify mails which are deemed to be 
unsuitable.

Age Concern England (charity number 261794) and Help the Aged (charity number 
272786) and their trading and other associated companies merged 
on 1st April 2009.  Together they have formed the Age UK Group, dedicated to 
improving the lives of people in later life.  The three national 
Age Concerns in Scotland, Northern Ireland and Wales have also merged with Help 
the Aged in these nations to form three registered charities: 
Age Scotland, Age NI, Age 

Re: [R] Anova

2012-11-27 Thread S Ellison
 Now my question: Am I allowed to use these functions given 
 that my data is unbalanced? 

Unusually, Yes, assuming all the other requisite assumptions are reasonably 
well satisfied. One-way ANOVA interpretation is not much affected by imbalance 
because (among other things) with only one factor there is no 'model order' 
dependence or hypothesis separation to worry about. You don't need to use 
alternate sum-of-squares calculations (eg type II and III as in car) because 
types I, II and III sums of squares are all the same in a one-way case.

In a two-factor or higher analysis it's a (very) different story. 

However, one-way anova is still adversely affected by failures in homogeneity 
of variance, non-normality etc. 

S Ellison

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova test for variables with different lengths

2012-10-16 Thread Jose Iparraguirre
Hi Sachin,

You may find this tutorial useful: 
http://goanna.cs.rmit.edu.au/~fscholer/anova.php
And you'll need the car package; but become yourself familiar with Type I, II 
and III sums of squares models before running the Anova; the tutorial explains 
these in detail.
Hope it helps.

José


José Iparraguirre
Chief Economist
Age UK

T 020 303 31482
E jose.iparragui...@ageuk.org.uk
Twitter @jose.iparraguirre@ageuk


Tavis House, 1- 6 Tavistock Square
London, WC1H 9NB
www.ageuk.org.uk | ageukblog.org.uk | @ageukcampaigns 


For a copy of our new Economic Monitor and the full Chief Economist's report, 
visit the Age UK Knowledge Hub 
http://www.ageuk.org.uk/professional-resources-home/knowledge-hub-evidence-statistics/


For evidence and statistics on the older population, visit the Age UK Knowledge 
Hub 
http://www.ageuk.org.uk/professional-resources-home/knowledge-hub-evidence-statistics/


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Sachinthaka Abeywardana
Sent: 16 October 2012 04:18
To: r-help@r-project.org
Subject: [R] anova test for variables with different lengths

Hi all,

I want to test whether the MEAN of two different variables, (and
different number of observations) are the same. I am trying to use the
anova test but it doesn't seem to like that the number of observations
are different:

a=c(1:5)
b=c(1:3)
aov_test=aov(a~b)
Error in model.frame.default(formula = a ~ b, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'b')

Any ideas as to how I would go about doing this test?

Thanks,
Sachin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Age UK and YouthNet are official charities for the Virgin London Marathon 2013

We need you to Run for it. Join the team and help raise vital funds to bring 
generations together to combat loneliness and isolation.

Go to http://www.runforit.org.uk for more information or contact Helen Parson 
at helen.pars...@ageuk.org.uk or on 020 303 31369.

Age UK and YouthNet. A lifeline, online.

www.runforit.org.uk



Age UK Improving later life

www.ageuk.org.uk



---
Age UK is a registered charity and company limited by guarantee, (registered 
charity number 1128267, registered company number 6825798). 
Registered office: Tavis House, 1-6 Tavistock Square, London WC1H 9NA.

For the purposes of promoting Age UK Insurance, Age UK is an Appointed 
Representative of Age UK Enterprises Limited, Age UK is an Introducer 
Appointed Representative of JLT Benefit Solutions Limited and Simplyhealth 
Access for the purposes of introducing potential annuity and health 
cash plans customers respectively.  Age UK Enterprises Limited, JLT Benefit 
Solutions Limited and Simplyhealth Access are all authorised and 
regulated by the Financial Services Authority. 
--

This email and any files transmitted with it are confide...{{dropped:28}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova test for variables with different lengths

2012-10-16 Thread S Ellison
 

 I want to test whether the MEAN of two different variables, 
 (and different number of observations) are the same. I am 
 trying to use the anova test but it doesn't seem to like that 
 the number of observations are different:
 
 a=c(1:5)
 b=c(1:3)
 aov_test=aov(a~b)
 Error in model.frame.default(formula = a ~ b, 
 drop.unused.levels = TRUE) :
   variable lengths differ (found for 'b')
 
 -Original Message-
 You may find this tutorial useful: 
 http://goanna.cs.rmit.edu.au/~fscholer/anova.php
 And you'll need the car package; but become yourself familiar 
 with Type I, II and III sums of squares models before running 
 the Anova; the tutorial explains these in detail.
 Hope it helps.

Sadly, I doubt that it will, though it would be good advice if the OP had got 
as far as formulating the model correctly. 

But they haven't. The OP has tried to model a variable of length 5 using a 
predictor of length 3. (In fact what they've just done is a simple linear 
regression of variables with different length). This will not work, no  matter 
what the OP does about types of SS. 

First, a t test would do this job, assuming normality - though incidentally the 
variances differ so the default t.test will return a somewhat different result 
to anova, which effectively assumes equal variance by default.

Second, to use aov correctly, read ?formula and look at the examples for this 
and ?lm

Then, if you want to get the same result as an equal variance t test using 
ANOVA, you'd have to concatenate the two groups and then model with a predictor 
indicating the groups. In this instance

y - c(a, b)
g - factor( rep( letters[1:2], c(length(a), length(b) ) ), )
summary( aov(y~g) )

Since this is a one way problem the type of SS won't matter, but in other cases 
it would be crucial to at least understand why - and to what extent - anova can 
be  unsafe* on unbalanced data.


S Ellison

*unsafe reads as actively dangerous in this context.

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova, F statistics, P-values

2012-10-06 Thread Jhope
Thank you for the replies. 

I am actually trying to gain p-values and f values, and tried the below
script but unsuccessful.

1) I have read in another forum to use the package lmer but apparently it
does not exist. 
2) Then I tried the pvals.fnc but it is not a function. 
3) I read also that a glm model must first be created but was not able to
gain a P-value through summary but got F statistics. 
4) Is there a way to get the P-value and F statistics? Or does this require
two different executions? 
5) I am afraid to update my R version because I may loose my ALL saved
scripts, should I do so? 

Please advise, Jean


 install.packages(lmer)
Installing package(s) into
‘/Library/Frameworks/R.framework/Versions/2.13/Resources/library’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘lmer’ is not available (for R version 2.13.1)

 pvals.fnc(HSuccess ~ VegIndex, data = data.to.analyze)
Error: could not find function pvals.fnc
 Model1.glm - glm(cbind(Shells, TotalEggs-Shells) ~ HTL,
 data=data.to.analyze, family = binomial)
 summary(Model1.glm)




--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-tp4645130p4645242.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova

2012-10-05 Thread Rui Barradas

Hello,

You're thinking of ?aov. anova() does _not_ have a formula interface, it 
would be


anova(lm(HSuccess ~ Veg, data = data.to.analyze))

or

aov(HSuccess ~ Veg, data = data.to.analyze)

Hope this helps,

Rui Barradas
Em 05-10-2012 09:27, Jhope escreveu:

Hi R-listers,

I am trying to do an ANOVA for the following scatterplot and received the
following error:

library(car)
scatterplot(HSuccess ~ Veg,
 data = data.to.analyze,
 xlab = Vegetation border (m),
 ylab = Hatching success (%))

anova(HSuccess ~ Veg, data=data.to.analyze)

Error in UseMethod(anova) :
   no applicable method for 'anova' applied to an object of class formula

I am wondering if there is a better way to do this?
Please advise, Jean




--
View this message in context: http://r.789695.n4.nabble.com/Anova-tp4645130.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova

2012-10-05 Thread John Fox
Dear Jean,

On Fri, 5 Oct 2012 01:27:55 -0700 (PDT)
 Jhope jeanwaij...@gmail.com wrote:
 Hi R-listers, 
 
 I am trying to do an ANOVA for the following scatterplot and received the
 following error:
 
 library(car)
 scatterplot(HSuccess ~ Veg, 
 data = data.to.analyze,
 xlab = Vegetation border (m), 
 ylab = Hatching success (%))
 
 anova(HSuccess ~ Veg, data=data.to.analyze)
 
 Error in UseMethod(anova) : 
   no applicable method for 'anova' applied to an object of class formula
 
 I am wondering if there is a better way to do this?

anova() needs a model object, not a formula, as its first argument:

anova(lm(HSuccess ~ Veg, data=data.to.analyze))

Alternatively, you can use aov(), with summary(), to get the ANOVA table:

summary(aov(HSuccess ~ Veg, data=data.to.analyze))

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

 Please advise, Jean
 
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Anova-tp4645130.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-30 Thread arun
Hi,

I got your data from Nabble.  I meant to do it by dput() because the .csv files 
may not get through the R-help mailing list.

This is what I got:
dat1-read.delim(typohne.csv)
 str(dat1)
#'data.frame':    60 obs. of  3 variables:
# $ typ : Factor w/ 5 levels A,B,C,D,..: 1 2 3 4 5 1 2 3 4 5 ...
# $ abun: int  7 3 14 8 15 5 21 19 9 21 ...
# $ div : int  7 3 11 6 14 5 12 15 8 12 ...

summary(fm1-aov(abun~typ,data=dat1))
#    Df Sum Sq Mean Sq F value Pr(F)  
#typ  4    847  211.69   3.654 0.0104 *
#Residuals   55   3186   57.93 
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 myresults-TukeyHSD(fm1,typ,ordered=TRUE)
 myresults
#  Tukey multiple comparisons of means
#    95% family-wise confidence level
#    factor levels have been ordered

#Fit: aov(formula = abun ~ typ, data = dat1)
#
#$typ
# diff    lwr  upr p adj
#B-A  6.50 -2.2633699 15.26337 0.2382355
#D-A  6.75 -2.0133699 15.51337 0.2055350
#E-A  9.08  0.3199635 17.84670 0.0386441
#C-A 11.17  2.4032968 19.93004 0.0060292
#D-B  0.25 -8.5133699  9.01337 0.902
#E-B  2.58 -6.1800365 11.34670 0.9197106
#C-B  4.67 -4.0967032 13.43004 0.5657010
#E-D  2.33 -6.4300365 11.09670 0.9432637
#C-D  4.416667 -4.3467032 13.18004 0.6167517
#C-E  2.08 -6.6800365 10.84670 0.9619249
library(agricolae)
 HSD.test(fm1,typ,group=TRUE)
#Study:
#HSD Test for abun 
#Mean Square Error:  57.92879 
#typ,  means
#  abun   std.err replication
#A  6.25000 0.6976693  12
#B 12.75000 2.4061947  12
#C 17.41667 2.9062585  12
#D 13.0 2.6198543  12
#E 15.3 1.5970301  12
#alpha: 0.05 ; Df Error: 55 
#Critical Value of Studentized Range: 3.988545 
#Honestly Significant Difference: 8.76337 
#Means with the same letter are not significantly different.
#Groups, Treatments and means
#a      C      17.41667 
#a      E      15.3 
#ab      D      13 
#ab      B      12.75 
# b      A      6.25 


I am not sure why you didn't get the results as above.  Check str() to see 
whether the str() I got matches with yours.

Have a great day!
A.K.

    











- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 10:35 AM
Subject: Re: [R] Anova and tukey-grouping

Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, typ, group=TRUE)
Name:  typ 
ds.typabunmit$typ 

I don't get the values...!?!?



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-30 Thread Landi
typohne.csv http://r.789695.n4.nabble.com/file/n4644635/typohne.csv  

Hello arun kirshna,

I checked str() but I do not see any mistake.
Find attached a subset of my data (did you mean this with dput() ?)
Typ= treatment
abun= abundance
div= diversity, number of species

best wishes.



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644635.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-30 Thread Landi
Hello again !

It worked exactly the same way as yours!
I'm kind of astonished, because in my opinion our skripts are the same
(obviously not...).
Whatever! It works :-D

Tomorrow (as where I live it's already evening...sunday) will be a great
day, indeed, and I can go on with analysis !
I really appreciate the conversation with you and your support!



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644642.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-28 Thread arun
HI,

I guess there is a mistake in your code.  You should have used typ instead of 
abun as abun is the dependent variable.
summary(fm1 - aov(breaks ~ wool + tension, data = warpbreaks))
myresults    -  TukeyHSD(fm1, tension, ordered = TRUE)
library(agricolae)

HSD.test(fm1,wool,group=TRUE)
#Study:
#HSD Test for breaks 
#Mean Square Error:  134.9578 
#wool,  means
#    breaks  std.err replication
#A 31.03704 3.050609  27
#B 25.25926 1.789963  27
#alpha: 0.05 ; Df Error: 50 
#Critical Value of Studentized Range: 2.840532 
#Honestly Significant Difference: 6.350628 
#Means with the same letter are not significantly different.
#Groups, Treatments and means
#a      A      31.037037037037 
#a      B      25.2592592592593 

 

A.K.



- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 5:41 AM
Subject: [R] Anova and tukey-grouping

Hello,

I am really new to R and it's still a challenge to me.
Currently I'm working on my Master's Thesis. My supervisor works with SAS
and is not familiar with R at all.

I want to run an Anova, a tukey-test and as a result I want to have the
tukey-grouping ( something like A - AB - B)

I came across the HSD.test in the agricolae-package, but... unfortunately I
do not get an output (like here in the answer
http://stats.stackexchange.com/questions/31547/how-to-obtain-the-results-of-a-tukey-hsd-post-hoc-test-in-a-table-showing-groupe
)

I did it like this:

##   ANOVA
anova.typabunmit-aov(ds.typabunmit$abun ~ ds.typabunmit$typ)
summary(anova.typabunmit)
summary.lm(anova.typabunmit)

## post HOC
tukey.typabunmit-TukeyHSD(anova.typabunmit)
tukey.typabunmit

## HSD
HSD.test(anova.typabunmit, abun, group=TRUE)



and the ONLY output is this:
Name:  abun 
ds.typabunmit$typ 


I would be very pleased about some ides..:!





--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-28 Thread Landi
Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, typ, group=TRUE)
Name:  typ 
 ds.typabunmit$typ 

I don't get the values...!?!?



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova and tukey-grouping

2012-09-28 Thread arun
Hi,

As I mentioned earlier, these are just guess work until you provide a subset of 
your data with dput().  Also, please check the structure of the data with str().

A.K.  






- Original Message -
From: Landi ent-ar...@gmx.de
To: r-help@r-project.org
Cc: 
Sent: Friday, September 28, 2012 10:35 AM
Subject: Re: [R] Anova and tukey-grouping

Hello !

Thanks for your advice. I tried it, but the output is the same:
 HSD.test(anova.typabunmit, typ, group=TRUE)
Name:  typ 
ds.typabunmit$typ 

I don't get the values...!?!?



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-and-tukey-grouping-tp4644485p4644513.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova problem with order of terms in model

2012-09-26 Thread Modi2020
Please check if your independent variables or Xs are independent.
If they are not that can affect the sequential decomposition. 



--
View this message in context: 
http://r.789695.n4.nabble.com/Anova-problem-with-order-of-terms-in-model-tp791744p4644320.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA repeated measures and post-hoc

2012-08-16 Thread Robert Baer
Check out:
http://rtutorialseries.blogspot.com/2011/02/r-tutorial-series-two-way-repeated.html
On 8/15/2012 11:32 AM, Diego Bucci wrote:
 Hi,
 I performed an ANOVA repeated measures but I still can't find any good news 
 regarding the possibility to perform multiple comparisons.
 Can anyone help me?
 Thanks

 Diego Bucci
 Fisiologia Veterinaria
 Dipartimento di Scienze Mediche Veterinarie
 Università degli Studi di Bologna
 Via Tolara di Sopra, 50
 40064 Ozzano dell'Emilia, BO
 Tel. 00390512097904
 mail diego.buc...@unibo.it


   [[alternative HTML version deleted]]



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA repeated measures and post-hoc

2012-08-16 Thread Henrik Singmann

Hi Diego,

I am struggeling with this question also for some time and there does 
not seem to be an easy and general solution to this problem. At least I 
haven't found one.
However, if you have just one repeated-measures factor, use the solution 
describe by me here: http://stats.stackexchange.com/a/15532/442
Furthermore, you might wanna check the package phia with accompanying 
vignette (but it uses multivariate tests).

For running the ANOVA you could also check my new package afex.

Cheers,
Henrik


Diego Bucci schrieb:

Hi,
I performed an ANOVA repeated measures but I still can't find any good news 
regarding the possibility to perform multiple comparisons.
Can anyone help me?
Thanks

Diego Bucci
Fisiologia Veterinaria
Dipartimento di Scienze Mediche Veterinarie
Università degli Studi di Bologna
Via Tolara di Sopra, 50
40064 Ozzano dell'Emilia, BO
Tel. 00390512097904
mail diego.buc...@unibo.it


[[alternative HTML version deleted]]





--
Dipl. Psych. Henrik Singmann
PhD Student
Albert-Ludwigs-Universität Freiburg, Germany
http://www.psychologie.uni-freiburg.de/Members/singmann

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova in unbalanced data

2012-08-14 Thread S Ellison

 

 -Original Message-
 Say I have the following data:
 
 a-data.frame(col1=c(rep(a,5),rep(b,7)),col2=runif(12))
 
 a_aov-aov(a$col2~a$col1)
 
 summary(aov)
 
 
 Note that there are 5 observations for a and 7 for b, thus is 
 unbalanced. What would be the correct way of doing anova for this set?
 

As this is a single factor case, the imbalance doesn't affect the 
interpretation. For two-way and higher models, it would affect the 
interpretation, and john fox's post (and a very large literature) then applies. 
But here, the usual variants and contrast choices will all return the same 
p-value, so aov works, as does 
anova(lm(col2~col1, data=a)) #note that the 'data' argument also works in aov


S

***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova in unbalanced data

2012-08-13 Thread arun
HI,

Check this link:

https://stat.ethz.ch/pipermail/r-help/2011-April/273858.html

A.K.



- Original Message -
From: Sachinthaka Abeywardana sachin.abeyward...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Monday, August 13, 2012 10:09 PM
Subject: [R] anova in unbalanced data

Hi all,

Say I have the following data:

a-data.frame(col1=c(rep(a,5),rep(b,7)),col2=runif(12))

a_aov-aov(a$col2~a$col1)

summary(aov)


Note that there are 5 observations for a and 7 for b, thus is
unbalanced. What would be the correct way of doing anova for this set?


Thanks,

Sachin

    [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova Type II and Contrasts

2012-07-06 Thread David Winsemius

Dear Peter;

This is an exact duplicate of a question posted on SO. Cross-posting  
is deprecated on Rhelp.


--
David.

On Jul 6, 2012, at 11:06 AM, mails wrote:

the study design of the data I have to analyse is simple. There is 1  
control group (CTRL) and 2 different treatment groups (TREAT_1 and  
TREAT_2).
The data also includes 2 covariates COV1 and COV2. I have been asked  
to check if there is a linear or quadratic treatment effect in the  
data.


I created a dummy data set to explain my situation:

df1 - data.frame(

Observation = c(rep(CTRL,15), rep(TREAT_1,13), rep(TREAT_2,  
12)),


COV1 = c(rep(A1, 30), rep(A2, 10)),

COV2 = c(rep(B1, 5), rep(B2, 5), rep(B3, 10), rep(B1, 5),  
rep(B2, 5), rep(B3, 10)),


Variable = c(3944133, 3632461, 3351754, 3655975, 3487722, 3644783,  
3491138, 3328894,
   3654507, 3465627, 3511446, 3507249, 3373233,  
3432867, 3640888,


   3677593, 3585096, 3441775, 3608574, 3669114,  
4000812, 3503511, 3423968,

   3647391, 3584604, 3548256, 3505411, 3665138,

  4049955, 3425512, 3834061, 3639699, 3522208,  
3711928, 3576597, 3786781,

  3591042, 3995802, 3493091, 3674475)
)

plot(Variable ~ Observation, data = df1)

As you can see from the plot there is a linear relationship between  
the control and the treatment groups. To check if this linear effect  
is statistical
significant I change the contrasts using the contr.poly() function  
and fit a linear model like this:


contrasts(df1$Observation) - contr.poly(levels(df1$Observation))
lm1 - lm(log(Variable) ~ Observation, data = df1)
summary.lm(lm1)

From the summary we can see that the linear effect is statistically  
significant:


Observation.L  0.029141   0.0123772.3550.024 *
Observation.Q  0.002233   0.0124820.1790.859

However, this first model does not include any of the two  
covariates. Including them results in a non-significant p-value for  
the linear relationship:


lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
summary.lm(lm2)

Observation.L  0.041160.02624   1.5680.126
Observation.Q  0.010030.01894   0.5300.600
COV1A2-0.012030.04202  -0.2860.776
COV2B2-0.020710.02202  -0.9410.354
COV2B3-0.020830.02066  -1.0080.320

So far so good. However, I have been told to conduct a Type II Anova  
rather than a Type I. To conduct a Type II Anova I used the Anova()  
function

from the car package.

Anova(lm2, type=II)

Anova Table (Type II tests)

Response: log(Variable)
   Sum Sq Df F value Pr(F)
Observation 0.006253  2  1.4651 0.2453
COV1  0.000175  1  0.0820 0.7763
COV2  0.002768  2  0.6485 0.5292
Residuals  0.072555 34

The problem here with using Type II is that you do not get a p-value  
for the linear and quadratic effect.
So I do not know if the treatment effect is statistically linear and  
or quadratic.


I found out that the following code produces the same p-value for  
Observation as the Anova() function. However, the result also does  
not include

any p-values for the linear or quadratic effect:

lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1)
lm3 - lm(log(Variable) ~ COV1 + COV2, data = df1)
anova(lm2, lm3)


Does anybody know how to conduct a Type II anova and the contrasts  
function to obtain the p-values for the linear and quadratic effects?


Help would be very much appreciated.

Best Peter
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Anova Type II and Contrasts

2012-07-06 Thread John Fox
Dear Peter,

Because your model is additive, type-II and type-III tests are identical, 
and the t-tests for the linear and quadratic coefficients are interpretable.

I hope this helps,
 John


John Fox
Sen. William McMaster Prof. of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox/

On Fri, 6 Jul 2012 16:06:26 +0100
 mails mails00...@gmail.com wrote:
 the study design of the data I have to analyse is simple. There is 1 control 
 group (CTRL) and 2 different treatment groups (TREAT_1 and TREAT_2). 
 The data also includes 2 covariates COV1 and COV2. I have been asked to check 
 if there is a linear or quadratic treatment effect in the data. 
 
 I created a dummy data set to explain my situation: 
 
 df1 - data.frame( 
 
 Observation = c(rep(CTRL,15), rep(TREAT_1,13), rep(TREAT_2, 12)), 
 
 COV1 = c(rep(A1, 30), rep(A2, 10)), 
 
 COV2 = c(rep(B1, 5), rep(B2, 5), rep(B3, 10), rep(B1, 5), rep(B2, 
 5), rep(B3, 10)), 
 
 Variable = c(3944133, 3632461, 3351754, 3655975, 3487722, 3644783, 3491138, 
 3328894, 
 3654507, 3465627, 3511446, 3507249, 3373233, 3432867, 
 3640888, 
 
 3677593, 3585096, 3441775, 3608574, 3669114, 4000812, 
 3503511, 3423968, 
 3647391, 3584604, 3548256, 3505411, 3665138, 
 
4049955, 3425512, 3834061, 3639699, 3522208, 3711928, 
 3576597, 3786781, 
3591042, 3995802, 3493091, 3674475) 
 ) 
 
 plot(Variable ~ Observation, data = df1) 
 
 As you can see from the plot there is a linear relationship between the 
 control and the treatment groups. To check if this linear effect is 
 statistical 
 significant I change the contrasts using the contr.poly() function and fit a 
 linear model like this: 
 
 contrasts(df1$Observation) - contr.poly(levels(df1$Observation)) 
 lm1 - lm(log(Variable) ~ Observation, data = df1) 
 summary.lm(lm1) 
 
 From the summary we can see that the linear effect is statistically 
 significant: 
 
 Observation.L  0.029141   0.0123772.3550.024 *   
 Observation.Q  0.002233   0.0124820.1790.859   
 
 However, this first model does not include any of the two covariates. 
 Including them results in a non-significant p-value for the linear 
 relationship: 
 
 lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1) 
 summary.lm(lm2) 
 
 Observation.L  0.041160.02624   1.5680.126 
 Observation.Q  0.010030.01894   0.5300.600 
 COV1A2-0.012030.04202  -0.2860.776 
 COV2B2-0.020710.02202  -0.9410.354 
 COV2B3-0.020830.02066  -1.0080.320   
 
 So far so good. However, I have been told to conduct a Type II Anova rather 
 than a Type I. To conduct a Type II Anova I used the Anova() function 
  from the car package. 
 
 Anova(lm2, type=II) 
 
 Anova Table (Type II tests) 
 
 Response: log(Variable) 
 Sum Sq Df F value Pr(F) 
 Observation 0.006253  2  1.4651 0.2453 
 COV1  0.000175  1  0.0820 0.7763 
 COV2  0.002768  2  0.6485 0.5292 
 Residuals  0.072555 34 
 
 The problem here with using Type II is that you do not get a p-value for the 
 linear and quadratic effect. 
 So I do not know if the treatment effect is statistically linear and or 
 quadratic. 
 
 I found out that the following code produces the same p-value for Observation 
 as the Anova() function. However, the result also does not include 
 any p-values for the linear or quadratic effect: 
 
 lm2 - lm(log(Variable) ~ Observation + COV1 + COV2, data = df1) 
 lm3 - lm(log(Variable) ~ COV1 + COV2, data = df1) 
 anova(lm2, lm3) 
 
 
 Does anybody know how to conduct a Type II anova and the contrasts function 
 to obtain the p-values for the linear and quadratic effects? 
 
 Help would be very much appreciated. 
 
 Best Peter
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA help

2012-06-20 Thread David Winsemius


On Jun 19, 2012, at 5:36 PM, James Johnson wrote:


Hi All,

I have a microarray dataset as follows:

  expt1 expt2 expt3  expt4 expt 5

gene1val  val val  val val

gene2val  val val  valval

.
.
..
gene15000   val   val val  val val


The result is from the same organism in four different experiments.   
Also, there are 4 replicates of each experiment. My aim was to find  
genes that are statistically significant across the four  
experiments. I carried out one-way anova as follows:



sTest-read.table(myData.dat,header = T, row.names = 1)
group - gl(4,4,16, label=c(Glucose,Citrate, Tide,Dawn))

gm1 - (x~group, data=sTest)

I received error messages that x is unknown, and did not know how  
to go further from here.




This appears to be a college homework assignment and as such is quite  
possibily inappropriate submission to Rhelp. If not, you should post  
the results of str(sTest).


You should also read the Posting Guide and learn to post in plain text.


PLEASE do read the posting guide http://www.R-project.org/posting-guide.html



and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova of lme objects (model1, model2) gives different results depending on order of models

2012-06-01 Thread Chris Beeley

Well that's that cleared up then. Thanks to all.

Chris B.

On 31/05/2012 17:51, Albyn Jones wrote:

No, both yield the same result: reject the null hypothesis,
which always corresponds to the restricted (smaller) model.

albyn

On Thu, May 31, 2012 at 12:47:30PM +0100, Chris Beeley wrote:

Hello-

I understand that it's convention, when comparing two models using
the anova function anova(model1, model2), to put the more
complicated (for want of a better word) model as the second model.
However, I'm using lme in the nlme package and I've found that the
order of the models actually gives opposite results. I'm not sure if
this is supposed to be the case or if I have missed something
important, and I can't find anything in the Pinheiro and Bates book
or in ?anova, or in Google for that matter which unfortunately only
returns results about ANOVA which isn't much help. I'm using the
latest version of R and nlme, just checked both.

Here is the code and output:


PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit)

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit,
+  correlation=corAR1(form=~Date|Case))


anova(PHQmodel1, PHQmodel2) # accept model 2

 Model df  AIC  BIClogLik   Test
L.Ratio p-value
PHQmodel1 1  8 48784.57 48840.43 -24384.28
PHQmodel2 2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926.0001


PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit,
+  correlation=corAR1(form=~Date|Case))

PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,

random=~1|Case, na.action=na.omit)


anova(PHQmodel1, PHQmodel2) # accept model 2

  Model df  AIC  BIClogLik   Test
L.Ratio p-value
PHQmodel1 1  9 48284.68 48347.51 -24133.34
PHQmodel2 2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926.0001

In both cases I am led to accept model 2 even though they are
opposite models. Is it really just that you have to put them in the
right order? It just seems like if there were say four models you
wouldn't necessarily be able to determine the correct order.

Many thanks,
Chris Beeley, Institute of Mental Health, UK

...session info follows


sessionInfo()

R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] grid  stats graphics  grDevices utils datasets
methods   base

other attached packages:
  [1] gridExtra_0.9  RColorBrewer_1.0-5 car_2.0-12
nnet_7.3-1 MASS_7.3-17
  [6] xtable_1.7-0   psych_1.2.4languageR_1.4
nlme_3.1-104   ggplot2_0.9.1

loaded via a namespace (and not attached):
  [1] colorspace_1.1-1 dichromat_1.2-4  digest_0.5.2 labeling_0.1
lattice_0.20-6   memoise_0.1
  [7] munsell_0.3  plyr_1.7.1   proto_0.3-9.2
reshape2_1.2.1   scales_0.2.1 stringr_0.6
[13] tools_2.15.0


packageDescription(nlme)

Package: nlme
Version: 3.1-104
Date: 2012-05-21
Priority: recommended
Title: Linear and Nonlinear Mixed Effects Models
Authors@R: c(person(Jose, Pinheiro, comment = S version),
person(Douglas, Bates, comment =
up to 2007), person(Saikat, DebRoy, comment = up
to 2002), person(Deepayan,
Sarkar, comment = up to 2005), person(R-core, email
= r-c...@r-project.org, role =
c(aut, cre)))
Author: Jose Pinheiro (S version), Douglas Bates (up to 2007),
Saikat DebRoy (up to 2002), Deepayan
Sarkar (up to 2005), the R Core team.
Maintainer: R-corer-c...@r-project.org
Description: Fit and compare Gaussian linear and nonlinear
mixed-effects models.
Depends: graphics, stats, R (= 2.13)
Imports: lattice
Suggests: Hmisc, MASS
LazyLoad: yes
LazyData: yes
License: GPL (= 2)
BugReports: http://bugs.r-project.org
Packaged: 2012-05-23 07:28:59 UTC; ripley
Repository: CRAN
Date/Publication: 2012-05-23 07:37:45
Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova of lme objects (model1, model2) gives different results depending on order of models

2012-05-31 Thread Albyn Jones
No, both yield the same result: reject the null hypothesis,
which always corresponds to the restricted (smaller) model.

albyn

On Thu, May 31, 2012 at 12:47:30PM +0100, Chris Beeley wrote:
 Hello-
 
 I understand that it's convention, when comparing two models using
 the anova function anova(model1, model2), to put the more
 complicated (for want of a better word) model as the second model.
 However, I'm using lme in the nlme package and I've found that the
 order of the models actually gives opposite results. I'm not sure if
 this is supposed to be the case or if I have missed something
 important, and I can't find anything in the Pinheiro and Bates book
 or in ?anova, or in Google for that matter which unfortunately only
 returns results about ANOVA which isn't much help. I'm using the
 latest version of R and nlme, just checked both.
 
 Here is the code and output:
 
  PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit)
 
  PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit,
 +  correlation=corAR1(form=~Date|Case))
 
  anova(PHQmodel1, PHQmodel2) # accept model 2
 Model df  AIC  BIClogLik   Test
 L.Ratio p-value
 PHQmodel1 1  8 48784.57 48840.43 -24384.28
 PHQmodel2 2  9 48284.68 48347.51 -24133.34 1 vs 2 501.8926 .0001
 
  PHQmodel1=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit,
 +  correlation=corAR1(form=~Date|Case))
 
  PHQmodel2=lme(PHQ~Age+Gender+Date*Treatment, data=compfinal,
 random=~1|Case, na.action=na.omit)
 
  anova(PHQmodel1, PHQmodel2) # accept model 2
  Model df  AIC  BIClogLik   Test
 L.Ratio p-value
 PHQmodel1 1  9 48284.68 48347.51 -24133.34
 PHQmodel2 2  8 48784.57 48840.43 -24384.28 1 vs 2 501.8926 .0001
 
 In both cases I am led to accept model 2 even though they are
 opposite models. Is it really just that you have to put them in the
 right order? It just seems like if there were say four models you
 wouldn't necessarily be able to determine the correct order.
 
 Many thanks,
 Chris Beeley, Institute of Mental Health, UK
 
 ...session info follows
 
  sessionInfo()
 R version 2.15.0 (2012-03-30)
 Platform: i386-pc-mingw32/i386 (32-bit)
 
 locale:
 [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United
 Kingdom.1252
 [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
 [5] LC_TIME=English_United Kingdom.1252
 
 attached base packages:
 [1] grid  stats graphics  grDevices utils datasets
 methods   base
 
 other attached packages:
  [1] gridExtra_0.9  RColorBrewer_1.0-5 car_2.0-12
 nnet_7.3-1 MASS_7.3-17
  [6] xtable_1.7-0   psych_1.2.4languageR_1.4
 nlme_3.1-104   ggplot2_0.9.1
 
 loaded via a namespace (and not attached):
  [1] colorspace_1.1-1 dichromat_1.2-4  digest_0.5.2 labeling_0.1
 lattice_0.20-6   memoise_0.1
  [7] munsell_0.3  plyr_1.7.1   proto_0.3-9.2
 reshape2_1.2.1   scales_0.2.1 stringr_0.6
 [13] tools_2.15.0
 
  packageDescription(nlme)
 Package: nlme
 Version: 3.1-104
 Date: 2012-05-21
 Priority: recommended
 Title: Linear and Nonlinear Mixed Effects Models
 Authors@R: c(person(Jose, Pinheiro, comment = S version),
 person(Douglas, Bates, comment =
up to 2007), person(Saikat, DebRoy, comment = up
 to 2002), person(Deepayan,
Sarkar, comment = up to 2005), person(R-core, email
 = r-c...@r-project.org, role =
c(aut, cre)))
 Author: Jose Pinheiro (S version), Douglas Bates (up to 2007),
 Saikat DebRoy (up to 2002), Deepayan
Sarkar (up to 2005), the R Core team.
 Maintainer: R-core r-c...@r-project.org
 Description: Fit and compare Gaussian linear and nonlinear
 mixed-effects models.
 Depends: graphics, stats, R (= 2.13)
 Imports: lattice
 Suggests: Hmisc, MASS
 LazyLoad: yes
 LazyData: yes
 License: GPL (= 2)
 BugReports: http://bugs.r-project.org
 Packaged: 2012-05-23 07:28:59 UTC; ripley
 Repository: CRAN
 Date/Publication: 2012-05-23 07:37:45
 Built: R 2.15.0; x86_64-pc-mingw32; 2012-05-29 12:36:01 UTC; windows
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
Albyn Jones
Reed College
jo...@reed.edu

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA question

2012-05-11 Thread ONKELINX, Thierry
Dear Robert,

It is easier to use lm instead of aov if you want coefficients for each group. 
Note that you can use rnorm vectorised.

set.seed(0)
N - 100 # sample size
MEAN - c(10, 20, 30, 40, 50)
VAR - c(20,20,1, 20, 20)
LABELS - factor(c(A, B, C, D, E))

# create a data frame with labels
df - data.frame(Label=rep(LABELS, each=N))
df$Value - rnorm(nrow(df), mean = MEAN[df$Label], sd = sqrt(VAR[df$Label]))

mod_aov - aov(Value ~ Label, data=df)
mod_lm - lm(Value ~ Label, data = df)
all.equal(anova(mod_aov), anova(mod_lm))

summary(mod_aov)
summary(mod_lm)

summary(mod_lm)$coef
confint(mod_lm)

#without intercept
mod_lm0 - lm(Value ~ 0 + Label, data = df)
summary(mod_lm0)$coef
confint(mod_lm0)

Best regards,

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and 
Forest
team Biometrie  Kwaliteitszorg / team Biometrics  Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium
+ 32 2 525 02 51
+ 32 54 43 61 85
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey

-Oorspronkelijk bericht-
Van: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] Namens 
Robert Latest
Verzonden: vrijdag 11 mei 2012 9:37
Aan: r-help@r-project.org
Onderwerp: [R] ANOVA question

Hello all,

I'm very satisfied to say that my grip on both R and statistics is showing the 
first hints of firmness, on a very greenhorn level.

I'm faced with a problem that I intend to analyze using ANOVA, and to test my 
understanding of a primitive, one-way ANOVA I've written the self-contained 
practice script below. It works as expected.

But here's my question: How can I not only get the values of the coefficients 
for the different levels of the explanatory factor(s), but also the 
corresponding standard errors and confidence levels?
Below I have started doing that on foot by looping over the levels of my 
single factor, but I suppose this gets complicated and messy with more complex 
models. Any ideas?

Thanks,
robert


set.seed(0)

N - 100 # sample size

MEAN - c(10, 20, 30, 40, 50)
VAR - c(20,20,1, 20, 20)
LABELS - c(A, B, C, D, E)

# create a data frame with labels
df - data.frame(Label=rep(LABELS, each=N)) df$Value - NA # fill in random 
data for each factor level for (i in 1:length(MEAN)) {
df$Value[(1+N*(i-1)):(N*i)] - rnorm(N, MEAN[i], sqrt(VAR[i])) }



par(mfrow=c(2,2))
plot(df)  # Box plot of the data
plot(df$Value)# scatter plot

mod_aov - aov(Value ~ Label, data=df)

print(summary(mod_aov))
print(mod_aov$coefficients)

rsd - mod_aov$residuals

plot(rsd)

# find and print mean() and var() for each level for (l in levels(df$Label)) {
index - df$Label == l

# Method 1: directly from data
smp - df$Value[index]  # extract sample for this label
ssq_smp -  var(smp)*(length(smp)-1) # sum of squares is variance
 # times d.f.
# Method 2: from ANOVA residuals
rsd_grp - rsd[index]# extract residuals
ssq_rsd - sum(rsd_grp **2)  # compute sum of squares

# print mean, variance, and difference between SSQs from the two
# methods.
write(sprintf(%s: mean=%5.1f var=%5.1f (%.2g), l,
mean(smp), var(smp),
ssq_smp-ssq_rsd), )
# ...and it works like expected! But is there a shortcut that would give me # 
the same result in a one-liner?
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
* * * * * * * * * * * * * D I S C L A I M E R * * * * * * * * * * * * *
Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en 
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is 
door een geldig ondertekend document.
The views expressed in this message and any annex are purely those of the 
writer and may not be regarded as stating an official position of INBO, as long 
as the message is not confirmed by a duly signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA question

2012-05-11 Thread Robert Latest
Hello Thierry,

thanks for your answer! There is one thing, however, that I don't
understand. The values labeled B in my data are generated with
1/20th the variance of the others, yet the standard error and
confidence intervals are the same for all levels of the factor. How
come?


 summary(mod_lm0)$coef
   Estimate Std. Error   t value  Pr(|t|)
LabelA 10.10138  0.3937038  25.65730  5.752714e-93
LabelB 19.79629  0.3937038  50.28218 1.226942e-196
LabelC 30.06722  0.3937038  76.37016 4.825571e-276
LabelD 40.01442  0.3937038 101.63586  0.00e+00
LabelE 49.78282  0.3937038 126.44738  0.00e+00


Thanks,
robert

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA problem

2012-05-04 Thread Bert Gunter
Rob:

On Fri, May 4, 2012 at 9:18 AM, robgriffin247 rg.rfo...@hotmail.co.uk wrote:
 Hi,
 I need to create a data frame containing the results of a number of ANOVA's
 but I'm having some trouble setting it up (some being enough for me to spend
 3 days trying with no progress and be left staring in to the abyss which
 some people call a weekend, and what I will call 2 quiet days in the
 office...)

I would suggest staying out of the office and consulting a local
statistician Monday morning. As a poor second choice, post on a
statistics Help list (e.g. stats.stackexchange.com).

I haven't gone through your post in detail, but it appears to have
little to do with R and a **lot** to do with your lack of statistical
understanding. It appears that you need to formulate a scientifically
appropriate mixed effect model (the problem is never how to set up an
anova), and interaction with a local consultant is the best way to do
that.

I suppose you could also post this on the r-sig-mixed-models list, as
they often go beyond the R issues to the statistical modeling. But
remote consulting is a risky business, as despite the best of
intentions on both sides, incomplete or mis- communication can lead to
errors of the third kind (right answer -- wrong question).

Best,
Bert


 The response variable is *V*.
 I need to do an ANOVA for each *G*.
 The fixed effect will be *S* (M or F) whilst also having the *S*L* and
 *L* (1 or 2) as random effects.
 The anova of *G* /AB01 /would be some thing like: y=V, fixed=S, Random= L 
 L*S...
 The new data frame would then compile all the variance components for each
 G, including total and residual variance.

 here is the example dataframe using 2 G's, with 2 S values, 2 L, and 2
 replicates for each.

 df-as.data.frame(c(AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB02,AB02,AB02,AB02,AB02,AB02,AB02,AB02))
 names(df)-G
 df$L-as.numeric(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2))
 df$S-(c(m,m,f,f,m,m,f,f,m,m,f,f,m,m,f,f))
 df$R-as.numeric(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2))
 df$V-as.numeric(c(1,2,12,21,5,6,12,34,1,6,52,41,5,43,13,24))

 It is worth noting the actual data this will be used on is 1*G's,
 2*S's,  40*L's,  and 2*R's so hand writing an ANOVA for each G is not
 preferred...

 Here is a twitter link to a crudely drawn illustration of the aim
 illustrated (using 3 Ls) in case I have confused you with words (through my
 own poor understanding):
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large

 Thanks in advance for your time,
 Rob
 (please save my weekend...)

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/ANOVA-problem-tp4609062.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA problem

2012-05-04 Thread Richard M. Heiberger
The following constructs the data.frame that I think the original
poster asked for.
I don't understand the graph, so I didn't attempt it.

I agree with Bert that this might not make sense.  Specifically, the distinction
between AB01 and AB02 is not modeled, and that is probably the critical factor.

I made several style changes in the dataset.  The name df is a function name,
and its use as a data.frame name will lead the reader to confusion.
I constructed the data.frame directly, not by constructing vectors and putting
them together. I declared the factor variables to be factors.  For factors with
more than two levels, this assures that they get the right number of degrees of
freedom in the anova table.


rg - data.frame(G=c(AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB01,
   AB02,AB02,AB02,AB02,AB02,AB02,AB02,AB02),
 L=factor(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2)),
 S=factor(c(m,m,f,f,m,m,f,f,
   m,m,f,f,m,m,f,f)),
 R=factor(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2)),
 V=c(1,2,12,21,5,6,12,34,1,6,52,41,5,43,13,24))

summary(aov(V ~ S * L, data=rg[1:8,])) ## no Error term, to be sure we
understand

rg.aov - lapply(split(rg, rg$G),
 function(x) aov(V ~ S*L + Error(L), data=x))
summary(rg.aov[[1]])  ## same Sums of Squares as above, but now with Error term


anovaSumsOfSquares - function(list.of.aov.objects) {
  t(sapply(rg.aov, function(y) {
tmpy -
  sapply(y[-1], function(x) {
tmp - summary(x)[[1]]
nt - sub( +$, , rownames(tmp))
result - tmp[,Sum Sq]
names(result) - nt
result})
c(tmpy[[1]], tmpy[[2]])
  }))
}
anovaSumsOfSquares(rg.aov)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA problem

2012-05-04 Thread rob.griffin.247
Gee Bert, thanks for the really helpful tip. But if you read my post properly
you'll note that I do know how ANOVA's work. 
 The anova of *G* /AB01 /would be some thing like: y=V, fixed=S, Random= L
 
 L*S...

I didn't want to show a full model formula in case it led people do the
wrong path to solving the real problem (seeing as there are several ways to
create mixed effects models which for some reason may not work with
solutions to the problem) which is how to actually get R to do ANOVA to
analyse the data for each value of G in the data frame given in the example
and then get R to give me the output data frame I desire, ergo, *it is
indeed an R problem.*

Perhaps you should read up on the R mailing list posting guidelines:
Questions about statistics: The R mailing lists are primarily intended for
questions and discussion about the R software. However, questions about
statistical methodology are sometimes posted. If the question is well-asked
and of interest to someone on the list, *it may elicit an informative
up-to-date answer*  so not rude and sarcastic ones then..

I will admit statistics is an element of the question I have posed, but it
is entirely in an R based context.
My understanding of statistics is perfectly acceptable thanks to numerous
courses taken through my undergraduate, masters, and PhD studies. If you're
not willing to help someone solve their problems then don't bother posting -
do you have nothing better to do with your time?

I would also suggest that my post has a lot more to do with R than your post
just moments ago which is solely about statistics and is of no relevance to
the R help forum. 
http://r.789695.n4.nabble.com/Off-Topic-Crime-Statistics-Don-t-Pay-td4609170.html
http://r.789695.n4.nabble.com/Off-Topic-Crime-Statistics-Don-t-Pay-td4609170.html
 

I know you regularly post on this forum and are often helpful, but sometimes
unhelpful posts are unnecessary. 

Rant over. 

As for everyone else:
Firstly, sorry about the above, it's been a long week. Secondly, I would
still really like some helpful answers from people who are interested in
helping me, and more constructive replies will be greatly appreciated.

On Fri, May 4, 2012 at 9:18 AM, robgriffin247 lt;rg.rforum@.cogt; wrote:
 Hi,
 I need to create a data frame containing the results of a number of
 ANOVA's
 but I'm having some trouble setting it up (some being enough for me to
 spend
 3 days trying with no progress and be left staring in to the abyss which
 some people call a weekend, and what I will call 2 quiet days in the
 office...)

I would suggest staying out of the office and consulting a local
statistician Monday morning. As a poor second choice, post on a
statistics Help list (e.g. stats.stackexchange.com).

I haven't gone through your post in detail, but it appears to have
little to do with R and a **lot** to do with your lack of statistical
understanding. It appears that you need to formulate a scientifically
appropriate mixed effect model (the problem is never how to set up an
anova), and interaction with a local consultant is the best way to do
that.

I suppose you could also post this on the r-sig-mixed-models list, as
they often go beyond the R issues to the statistical modeling. But
remote consulting is a risky business, as despite the best of
intentions on both sides, incomplete or mis- communication can lead to
errors of the third kind (right answer -- wrong question).

Best,
Bert


 The response variable is *V*.
 I need to do an ANOVA for each *G*.
 The fixed effect will be *S* (M or F) whilst also having the *S*L* and
 *L* (1 or 2) as random effects.
 The anova of *G* /AB01 /would be some thing like: y=V, fixed=S, Random= L
 
 L*S...
 The new data frame would then compile all the variance components for each
 G, including total and residual variance.

 here is the example dataframe using 2 G's, with 2 S values, 2 L, and 2
 replicates for each.

 df-as.data.frame(c(AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB01,AB02,AB02,AB02,AB02,AB02,AB02,AB02,AB02))
 names(df)-G
 df$L-as.numeric(c(1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2))
 df$S-(c(m,m,f,f,m,m,f,f,m,m,f,f,m,m,f,f))
 df$R-as.numeric(c(1,2,1,2,1,2,1,2,1,2,1,2,1,2,1,2))
 df$V-as.numeric(c(1,2,12,21,5,6,12,34,1,6,52,41,5,43,13,24))

 It is worth noting the actual data this will be used on is 1*G's,
 2*S's,  40*L's,  and 2*R's so hand writing an ANOVA for each G is not
 preferred...

 Here is a twitter link to a crudely drawn illustration of the aim
 illustrated (using 3 Ls) in case I have confused you with words (through
 my
 own poor understanding):
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large
 https://twitter.com/#!/robgriffin247/status/198446041316593666/photo/1/large

 Thanks in advance for your time,
 Rob
 (please save my weekend...)

 --
 View this message in context:
 http://r.789695.n4.nabble.com/ANOVA-problem-tp4609062.html
 Sent from the R help mailing list archive at Nabble.com.

 

Re: [R] ANOVA problem

2012-05-04 Thread rob.griffin.247
Thanks Richard, that works great on the test data,
I'll try it out on the full dataset now and let you know how it goes.
Thanks a lot!

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-problem-tp4609062p4609400.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA in quantreg - faulty test for 'nesting'?

2012-04-21 Thread Roger Koenker
Yes, the models are nested, and yes I probably should have been more clever
about parsing formulae, for these cases.  I'll have a look at the code for glm, 
I
presume that lm() is also capable of groking this?   Meanwhile, I would like to
point out that any of these linear restrictions can be reformulated as an 
exclusion
restriction, see for example the remark on pages 5-6 of 

http://www.econ.uiuc.edu/~roger/research/ranks/ranktests.pdf

or the references cited there.  And using this sort of parameterization you can
use anova.rq().


Roger Koenker
rkoen...@illinois.edu




On Apr 20, 2012, at 7:59 PM, Galen Sher wrote:

 Thanks to Brian's suggestion of using the logLik() function, I've dug a
 little deeper. I definitely think f1 and f2 are nested models. For example,
 by adding x2 to fmla1, we obtain a formula that quite clearly nests fmla1
 and achieves the same log likelihood as that obtained for f2. Here is the
 extra code to show this:
 
 fmla3 = y~I(x1+x2)+x2
 f3=glm(fmla3)
 logLik(f1); logLik(f2); logLik(f3)
 
 If f2=f3, as the log likelihoods would suggest, then this gives us a
 workaround: define the intermediate formula fmla3 and the fit f3 as above,
 and then conduct the analysis of variance between models f1 and f3 instead
 of f1 and f2. This doesn't offend anova.rq() any longer:
 
 f3.qr = rq(fmla3)
 anova(f1.qr,f3.qr) #this is actually anova(f1.qr, f2.qr) which resulted in
 an error earlier
 
 -Galen
 
 On Fri, Apr 20, 2012 at 6:47 PM, Brian S Cade ca...@usgs.gov wrote:
 
 Galen:  Interesting, first time I tried it I couldn't get anova.glm to
 compute the p-values (got a warning) but I tried it again and it worked.
 Your larger (alternative hypothesis) model is  y = B0 + B1*x1 + B2*x2 + e
 and your smaller (null hypotheisis) model is y = B0 + B3*(x1 + x2).  So  I
 guess I see that what you're trying to test in this case is that B1 = B2.
 I don't think Roger Koenker anticipated such a test with anova.rq.  Other
 options besides using information criteria (AIC, BIC, etc) for comparing
 nonnested models include the Vuong test.  But not sure how readily the
 theory of Vuong's test (like a paired t-test) extends to quantile
 regression.
 
 Brian
 
 Brian S. Cade, PhD
 
 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818
 
 email:  brian_c...@usgs.gov
 tel:  970 226-9326
 
 
 From: Galen Sher galens...@gmail.com To: Brian S Cade ca...@usgs.gov
 Date: 04/20/2012 09:59 AM Subject: Re: [R] ANOVA in quantreg - faulty
 test for 'nesting'?
 --
 
 
 
 Thanks Brian. I think anova.glm() requires the user to specify the
 appropriate distribution. In the example above, if I use either of the
 following commands
 
 anova(f1,f2,test=Chisq) #or
 anova(f1,f2,test=F)
 
 then anova.glm() will compute and display p-values associated with the
 deviance statistics. The reason I thought these models are nested is
 because the first model can be thought of as the second model estimated
 under an additional linear equality constraint. I suppose that's less of an
 R question and more of a regression question.
 
 Thanks for the logLik suggestion. In the absence of more information I'll
 have to do that - I'm just wary of conducting the test myself!
 
 Regards,
 Galen
 
 On Fri, Apr 20, 2012 at 4:31 PM, Brian S Cade 
 *ca...@usgs.gov*ca...@usgs.gov
 wrote:
 Galen:  Actually don't see how the models are nested (ask yourself what
 parameter in the model with 3 parameters is set to zero in the 2 parameter
 model?) and indeed if I try your code anova.glm will compute the difference
 in deviance between the two models but it does not compute a probability
 value for that difference in deviance as that would not make sense for
 models that aren't nested.   Koenker's implementation of anova.rq
 immediately detects that the models aren't nested so doesn't even compute
 the deviance difference.  You could use the logLik function on the rq
 objects to get their log likelihoods or use AIC (BIC) to compare the
 quantile regression models.
 
 Brian
 
 Brian S. Cade, PhD
 
 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818
 
 email:  *brian_c...@usgs.gov* brian_c...@usgs.gov
 tel:  *970 226-9326* 970%20226-9326
 
  From: galen *galens...@gmail.com* galens...@gmail.com  To: *
 r-help@r-project.org* r-help@r-project.org  Date: 04/19/2012 02:40 PM
 Subject: [R] ANOVA in quantreg - faulty test for 'nesting'?  Sent by: *
 r-help-boun...@r-project.org* r-help-boun...@r-project.org
 
 --
 
 
 
 
 I am trying to implement an ANOVA on a pair of quantile regression models
 in
 R. The anova.rq() function performs a basic check to see whether the models
 are nested, but I think this check is failing in my case. I think my models
 are nested despite the anova.rqlist() function saying otherwise. Here is an
 example where the GLM ANOVA regards the models as nested

Re: [R] ANOVA in quantreg - faulty test for 'nesting'?

2012-04-20 Thread Galen Sher
Thanks to Brian's suggestion of using the logLik() function, I've dug a
little deeper. I definitely think f1 and f2 are nested models. For example,
by adding x2 to fmla1, we obtain a formula that quite clearly nests fmla1
and achieves the same log likelihood as that obtained for f2. Here is the
extra code to show this:

fmla3 = y~I(x1+x2)+x2
f3=glm(fmla3)
logLik(f1); logLik(f2); logLik(f3)

If f2=f3, as the log likelihoods would suggest, then this gives us a
workaround: define the intermediate formula fmla3 and the fit f3 as above,
and then conduct the analysis of variance between models f1 and f3 instead
of f1 and f2. This doesn't offend anova.rq() any longer:

f3.qr = rq(fmla3)
anova(f1.qr,f3.qr) #this is actually anova(f1.qr, f2.qr) which resulted in
an error earlier

-Galen

On Fri, Apr 20, 2012 at 6:47 PM, Brian S Cade ca...@usgs.gov wrote:

 Galen:  Interesting, first time I tried it I couldn't get anova.glm to
 compute the p-values (got a warning) but I tried it again and it worked.
 Your larger (alternative hypothesis) model is  y = B0 + B1*x1 + B2*x2 + e
 and your smaller (null hypotheisis) model is y = B0 + B3*(x1 + x2).  So  I
 guess I see that what you're trying to test in this case is that B1 = B2.
  I don't think Roger Koenker anticipated such a test with anova.rq.  Other
 options besides using information criteria (AIC, BIC, etc) for comparing
 nonnested models include the Vuong test.  But not sure how readily the
 theory of Vuong's test (like a paired t-test) extends to quantile
 regression.

 Brian

 Brian S. Cade, PhD

 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818

 email:  brian_c...@usgs.gov
 tel:  970 226-9326


  From: Galen Sher galens...@gmail.com To: Brian S Cade ca...@usgs.gov
 Date: 04/20/2012 09:59 AM Subject: Re: [R] ANOVA in quantreg - faulty
 test for 'nesting'?
 --



 Thanks Brian. I think anova.glm() requires the user to specify the
 appropriate distribution. In the example above, if I use either of the
 following commands

 anova(f1,f2,test=Chisq) #or
 anova(f1,f2,test=F)

 then anova.glm() will compute and display p-values associated with the
 deviance statistics. The reason I thought these models are nested is
 because the first model can be thought of as the second model estimated
 under an additional linear equality constraint. I suppose that's less of an
 R question and more of a regression question.

 Thanks for the logLik suggestion. In the absence of more information I'll
 have to do that - I'm just wary of conducting the test myself!

 Regards,
 Galen

 On Fri, Apr 20, 2012 at 4:31 PM, Brian S Cade 
 *ca...@usgs.gov*ca...@usgs.gov
 wrote:
 Galen:  Actually don't see how the models are nested (ask yourself what
 parameter in the model with 3 parameters is set to zero in the 2 parameter
 model?) and indeed if I try your code anova.glm will compute the difference
 in deviance between the two models but it does not compute a probability
 value for that difference in deviance as that would not make sense for
 models that aren't nested.   Koenker's implementation of anova.rq
 immediately detects that the models aren't nested so doesn't even compute
 the deviance difference.  You could use the logLik function on the rq
 objects to get their log likelihoods or use AIC (BIC) to compare the
 quantile regression models.

 Brian

 Brian S. Cade, PhD

 U. S. Geological Survey
 Fort Collins Science Center
 2150 Centre Ave., Bldg. C
 Fort Collins, CO  80526-8818

 email:  *brian_c...@usgs.gov* brian_c...@usgs.gov
 tel:  *970 226-9326* 970%20226-9326

   From: galen *galens...@gmail.com* galens...@gmail.com  To: *
 r-help@r-project.org* r-help@r-project.org  Date: 04/19/2012 02:40 PM
 Subject: [R] ANOVA in quantreg - faulty test for 'nesting'?  Sent by: *
 r-help-boun...@r-project.org* r-help-boun...@r-project.org

  --




 I am trying to implement an ANOVA on a pair of quantile regression models
 in
 R. The anova.rq() function performs a basic check to see whether the models
 are nested, but I think this check is failing in my case. I think my models
 are nested despite the anova.rqlist() function saying otherwise. Here is an
 example where the GLM ANOVA regards the models as nested, but the quantile
 regression ANOVA tells me the models aren't nested:

 y = rnorm(100)
 x1 = rnorm(100)
 x2 = rnorm(100)

 fmla1 = y~I(x1+x2)
 fmla2 = y~x1+x2

 f1 = glm(fmla1)
 f2 = glm(fmla2)

 anova(f1,f2) #This works

 f1.qr = rq(fmla1)
 f2.qr = rq(fmla2)

 anova(f1.qr,f2.qr) #Error!
 #Error in anova.rqlist(object, ...) : Models aren't nested

 Are the models in fact not nested? If they are nested, is there an easy
 workaround I could use? Many thanks in anticipation.

 --
 View this message in context: *
 http://r.789695.n4.nabble.com/ANOVA-in-quantreg-faulty-test-for-nesting-tp4571994p4571994.html
 *http://r.789695.n4.nabble.com/ANOVA-in-quantreg-faulty-test-for-nesting

Re: [R] ANOVA Data Error

2012-04-17 Thread David Winsemius


On Apr 17, 2012, at 1:03 PM, MatthewJudd wrote:


Hello I am having some trouble performing a simple ANOVA in R

I am working with 2001 census tract information regarding religion.   
I am
quite a rookie at this whole R experiance, but I would like to  
improve my

skills.

*I read in my data and subset it accordingly to select only the  
columns I

desired. *

newottawa 
=ottawa_census[,c(97,121,124,125,126,127,128,129,130,131,132,133)]

newottawa

*This is the formula  used to try and perform my ANOVA*

anovaottawa=aov(newottawa$RomanCatholic~newottawa$UnitedChurch)
summary(anovaottawa)

The (-) in the formula is actually a tilde in my R code (but it  
wont

show up properly in this email)

*My Error Message:*

Error in lm.fit(x, y, offset = offset, singular.ok =  
singular.ok, ...) :

 invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, numeric) :
 using type=numeric with a factor response will be ignored

I fear that my data is in the wrong format or something of that  
nature and

thus R will not allow me to perform analysis on it.


It would seem passing strange to have a variable named RomanCatholic  
that was a continuous variable. My guess is that you are not using the  
function properly. How much experience do you have with regression  
functions in R ... and a corollary question:  is this homework?


--
David.






David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA testing over nested MS term

2012-03-18 Thread peter dalgaard

On Mar 18, 2012, at 08:21 , shoreliner11 wrote:

 I'm still relatively new to R but was wondering if anyone could help me force
 R to compute the f-statistic etc using the  the nested term rather than the
 residual. In my particular case we were nesting a treatment effect by a
 replicated tank which was not non-significant enough (p0.25)to be dropped
 from the statistical model.

Ouch! Don't _ever_ take high p-values as proof of absence of effect. 

 
 Here's my code:
 summary(analysis1-aov(Area~Treatment/Tank))
 
 Area is the response variable, treatment is the main effect, nested within a
 replicated tank. Thanks so much for your help.
 

I.e. Tanks are numbered within Treatments, so that Treatment 1, Tank 1 is 
distinct from Treatment 2, Tank 1?

Something like aov(Area~Treatment + Error(Treatment:Tank)) should do it. 
There's a wart in aov() causing this sort of model to throw a warning that 
Error() model is singular, for reasons that I never quite fathomed. An 
explicit 

TrTa - interaction(Treatment,Tank)

and then using Error(TrTa) gets rid of the warning and provides equivalent 
output.


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/ANOVA-testing-over-nested-MS-term-tp4481767p4481767.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova analysis on factors...

2011-12-08 Thread Sarah Goslee
And a further problem with both approaches: is 11pm (23) really that
different from 1am?

On Thu, Dec 8, 2011 at 3:28 PM, Michael comtech@gmail.com wrote:
 Hi all,

 If we wanted to study the effect on the mean of the hourly data based on
 the hours within a day...

 and we wanted to do Anova analysis...

 We have two choices:

 Please see below:

 Why are these two approaches giving very different p-values? And which one
 shall I use?

 Thanks a lot!

 1. treating the hours as double/floating numbers:


 anova(lm(hourlydata~as.double(hours_factors)))

 Df Sum Sq Mean Sq F value Pr(F)

 as.double(hours_factors) 1 0.0002 0.00019876 1.3425 0.2466

 Residuals 14868 2.2013 0.00014806

 2. treating the hours as factors:



 anova(lm(hourlydata~hours_factors))

 Df Sum Sq Mean Sq F value Pr(F)

 hours_factors 9 0.00077 8.5979e-05 0.5806 0.8142

 Residuals 14860 2.20072 1.4810e-04

        [[alternative HTML version deleted]]

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] anova analysis on factors...

2011-12-08 Thread David Winsemius


On Dec 8, 2011, at 3:28 PM, Michael wrote:


Hi all,

If we wanted to study the effect on the mean of the hourly data  
based on

the hours within a day...

and we wanted to do Anova analysis...

We have two choices:


Who is we and how were these constraints imposed?



Please see below:

Why are these two approaches giving very different p-values?


They are markedly different statistical models.


And which one
shall I use?



Without knowing your situation better and the eventual purposes of  
this analysis, it would be difficult to give sensible advice. I  
suspect the answer is neither.


--
David.


Thanks a lot!

1. treating the hours as double/floating numbers:


anova(lm(hourlydata~as.double(hours_factors)))

Df Sum Sq Mean Sq F value Pr(F)

as.double(hours_factors) 1 0.0002 0.00019876 1.3425 0.2466

Residuals 14868 2.2013 0.00014806

2. treating the hours as factors:



anova(lm(hourlydata~hours_factors))

Df Sum Sq Mean Sq F value Pr(F)

hours_factors 9 0.00077 8.5979e-05 0.5806 0.8142

Residuals 14860 2.20072 1.4810e-04

[[alternative HTML version deleted]]




David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA and LSD TEST - contrasting results

2011-11-20 Thread peter dalgaard

On Nov 19, 2011, at 15:37 , Roberto wrote:

 First of all, thank you for the reply to my topic.
 
 Anova summary has shown no significative difference into Treatment (Pr
 0.374), otherwise LSD Test has shown difference between 2 Treatment.
 I suppose that's something strange.
 

It isn't. You need to read up on familywise error rate and its connection to 
LSD and Tukey HSD tests (Google gets you there soon enough; I doubt people in 
here will rewrite textbook material just for you).

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA and LSD TEST - contrasting results

2011-11-20 Thread Roberto
I didn't meant to rewrite textbook. I asked only for a clarification to my
doubt, because I have started to study statistic recently.

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-and-LSD-TEST-contrasting-results-tp4086090p4088810.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA and LSD TEST - contrasting results

2011-11-19 Thread John Sorkin
Roberto,
Please be more specific. What are the differences between ANOVA and LSD that 
give you concern?
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

 Roberto rmosce...@unitus.it 11/19/2011 6:10 AM 
Dear all,
concerne to my issue, I obtained different results between ANOVA and LSD
TEST (agricolae package), and I ask me why




anova - aov(SS ~ Treatment*Lot)
summary(anova)

 DfSum Sq   Mean Sq F value Pr(F)
Treatment   7 0.534 7.622e-06   1.107  0.374
Lot2 0.059 2.954e-06   0.429  0.654
Treatment:Lot 14 0.0001038 7.412e-06   1.077  0.401
Residuals48 0.0003304 6.883e-06

 LSD.test(anova, Treatment)

Study:

LSD t Test for SS 

Mean Square Error:  6.883013e-06 

Trattamento,  means and individual ( 95 %) CI

SS  std.err replicationLCLUCL
A30 0.02966558 0.0019037856   9 0.02583776 0.03349340
AM  0.02844520 0.0005186885   9 0.02740230 0.02948809
C30 0.02913658 0.0006792219   9 0.02777092 0.03050225
CM  0.02964944 0.0005398926   9 0.02856391 0.03073496
T30 0.02979624 0.0006920447   9 0.02840479 0.03118769
TS  0.02891764 0.0004981543   9 0.02791603 0.02991924
V30 0.03112188 0.0005157621   9 0.03008487 0.03215889
VM  0.03087886 0.0006874520   9 0.02949664 0.03226107

alpha: 0.05 ; Df Error: 48
Critical Value of t: 2.010635 

Least Significant Difference 0.002486659
Means with the same letter are not significantly different.

Groups, Treatments and means
aA1 0.03112188 
ab   A2  0.03087886 
ab   B1 0.02979624 
ab   C1 0.02966558 
ab   D1  0.02964944 
ab   D2 0.02913658 
ab   B2  0.02891764 
 b   C2  0.0284452 

how is it possible? my mistake to set anova?

Best regards,
Roberto


--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-and-LSD-TEST-contrasting-results-tp4086090p4086090.html
 
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA and LSD TEST - contrasting results

2011-11-19 Thread Roberto
First of all, thank you for the reply to my topic.

Anova summary has shown no significative difference into Treatment (Pr
0.374), otherwise LSD Test has shown difference between 2 Treatment.
I suppose that's something strange.

Roberto



--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-and-LSD-TEST-contrasting-results-tp4086090p4086436.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA from imported data has only 1 degree of freedom

2011-10-10 Thread shardman
Hi David,

Apologies again and thankyou for your help, I've edited my original post to
clarify what I was asking. What I meant was that the factor had only 1
degrees of freedom when it should have had 2 (14 in total), so you're right
there were 14 but not in the right place.

In SPSS you select one column as a factor and another as a dependent
variable so this wouldn't happen, it's easy to use but not that versatile
and very expensive. I've been told good things about R so I'm trying to
teach myself.

I followed your suggestion and I now have the results I need,
I'll take more care with posting in future,

Yours,
Sam

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-from-imported-data-has-only-1-degree-of-freedom-tp3887528p3888322.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA from imported data has only 1 degree of freedom

2011-10-10 Thread shardman
Thanks bbolker, that's really helpful. I'll look out for the book too, it
could be helpful!

Yours,
Sam

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-from-imported-data-has-only-1-degree-of-freedom-tp3887528p3891246.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA from imported data has only 1 degree of freedom

2011-10-09 Thread David Winsemius


On Oct 9, 2011, at 11:10 AM, shardman wrote:


Hi,

I'm trying to analyse some data I have imported into R from a .csv  
file but
when I carry out the aov command the results show only one degree of  
freedom
when there should be 14. Does anyone know why? I'd really appreciate  
some

help, the data is pasted below.



To me this appears likely to be homework. If so then please read your  
college's rules regarding  academic honesty. In any case, please read  
the Posting Guide.




/The imported table looks ike this this:/

   Order Transect Sample Abundance
1  Coleoptera1  113
2  Coleoptera1  212
3  Coleoptera1  311
4  Coleoptera1  413
5  Coleoptera1  5 6
6  Coleoptera2  118
7  Coleoptera2  218
8  Coleoptera2  316
9  Coleoptera2  421
10 Coleoptera2  511
11 Coleoptera3  119
12 Coleoptera3  216
13 Coleoptera3  3 9
14 Coleoptera3  432
15 Coleoptera3  529

/The command I am using is this:/

anova2-aov(Abundance~Transect,data=beetle)

/The results come out like this:/

Df Sum Sq Mean Sq F value  Pr(F)
Transect 1 250.00 250.000  7.2394 0.01852 *
Residuals   13 448.93  34.533
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Cheers,
Sam


--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-from-imported-data-has-only-1-degree-of-freedom-tp3887528p3887528.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA from imported data has only 1 degree of freedom

2011-10-09 Thread shardman
Hi David,

Thanks for your message.

I can assure you this is not homework. I'm working on an ecology project and
am trying to analyse the results from the fieldwork. I don't want other
people to do the work for me I was just hoping someone might be able to spot
where I have made a mistake, I'm still teaching myself to use R after having
used SPSS for a long time.

I did read the posting guidelines but couldn't see any reason not to ask for
help, I'm sorry if I've got it wrong,
Yours,
Sam

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-from-imported-data-has-only-1-degree-of-freedom-tp3887528p3887979.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA from imported data has only 1 degree of freedom

2011-10-09 Thread David Winsemius


On Oct 9, 2011, at 2:08 PM, shardman wrote:


Hi David,

Thanks for your message.

I can assure you this is not homework. I'm working on an ecology  
project and
am trying to analyse the results from the fieldwork. I don't want  
other
people to do the work for me I was just hoping someone might be able  
to spot
where I have made a mistake, I'm still teaching myself to use R  
after having

used SPSS for a long time.

I did read the posting guidelines but couldn't see any reason not to  
ask for

help, I'm sorry if I've got it wrong,


Please read the Posting Guide again. Nabble is not the standard venue  
where people read this list and you seem to have missed the part where  
the PG asked you to include relevant context in replies. It also also  
asks you to describe your scientific domain (which you have now done)  
and academic position (at least in part so the list can remain non- 
homework oriented).


My memory from your initial posting was that you had numeric  
identifiers for you groups and did not wrap the group variable name in  
factor(), so it was treated as a continuous variable. Doesn't SPSS  
have some mechanism to flag a variable as discrete?


Perhaps(from memory) something along these lines:
aov(outcome ~ factor(group), data=dat)

(Further comments from memory of earlier posting.)
You also were asking why there were not 14 df in the output but there  
were since 1+13 = 14. Surely you were not expecting 14 df to be  
associated with the grouping variable when there were only 3 groups?





Yours,
Sam

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-from-imported-data-has-only-1-degree-of-freedom-tp3887528p3887979.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA from imported data has only 1 degree of freedom

2011-10-09 Thread Ben Bolker
shardman samuelhardman at hotmail.com writes:

 
 Hi,
 
 I'm trying to analyse some data I have imported into R from a .csv file but
 when I carry out the aov command the results show only one degree of freedom
 when there should be 14. Does anyone know why? I'd really appreciate some
 help, the data is pasted below.
 
 /The imported table looks ike this this:/
 
 Order Transect Sample Abundance
 1  Coleoptera1  113
[snip]
 15 Coleoptera3  529
 
 /The command I am using is this:/
 
 anova2-aov(Abundance~Transect,data=beetle)
 

I was going to tell you to read up on the distinction between
factors and numeric variables in statistical models, but I can't
immediately find a reference on line (this is bound to be
in Peter Dalgaard's book or any other basic-stats-with-R
text).

str() will tell you whether the columns are numeric or factors.

Try

beetle - transform(beetle,Transect=factor(Transect))

[or ?read.csv and use colClasses explicitly to specify
the classes of the columns]

before running your anova.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA/ANCOVA Repeated Measure Mixed Model

2011-10-07 Thread Nick GH
Hi Daniel,

Thanks for your response, I don't completely follow. I also should have
split this into two questions.

Regarding my first question:



Daniel Malter wrote:
 
 You are modeling Condition * Stimulus * Group as fully interacted fixed
 effects... A simple random effect for the individual might suffice. 

From what I understand, you are suggesting that I am unnecessarily
complicating my statistics by including factors that are not necessarily
considered treatment levels. This would be correct, as I am mainly
interested in the effect of Group (the order of the two conditions).

You could try to model this with lmer (in the lme4 library) and inspect
whether such a complex error structure actually adds anything to your model. 

I have figured out how to download the lme4 library, but I am not clear how
I would implement or interpret the results using lmer

Second ANCOVA question


it may not make much sense to include one in the regression of the other if
if there is reason to believe that the one measured first does not affect
the one measured second.

Given that I am interested in how the order of Alert and Passive effect
stimulus responses, it seems like this would not be a prudent analysis.
However, there seems to be no effect of order in the passive case but an
effect of order in the alert case. So it might be a good option?

 It is therefore advisable that you talk to your local statistician, if any.

Unfortunately, I don't have that option.


Thanks again for your help,
Nick GH



--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-ANCOVA-Repeated-Measure-Mixed-Model-tp3880868p3882349.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA/ANCOVA Repeated Measure Mixed Model

2011-10-06 Thread Daniel Malter
You are modeling Condition * Stimulus * Group as fully interacted fixed
effects. I do not actually think that you would need a complex random effect
structure for this. A simple random effect for the individual might suffice.
You could try to model this with lmer (in the lme4 library) and inspect
whether such a complex error structure actually adds anything to your model.

If the alert/passive stages are distinct measures that were assessed in
strict temporal order, then you could include the passive stage result
(assuming passive was assessed first) in the regression of the active stage
(the Response for the same subject under the same stimulus). Otherwise this
it may not make much sense to include one in the regression of the other if
there was no strict temporal order or if there is reason to believe that the
one measured first does not affect the one measured second.

But after all, these answers depend on your context and what you are trying
to actually show. It is therefore advisable that you talk to your local
statistician, if any.

HTH,
Daniel


Nick GH wrote:
 
 Hello,
 
 I am trying to test some results I have for significance. It has been
 recommended that I use R and I am completely new to this.
 
 
Set-up:

Groups: two groups of 8 subjects (16 total)
Two conditions: alert and passive
Measurements: responses for three different stimuli (A, B, and C)
 measured in each condition
 
 
Experiment: Testing the order of conditions 

Group one: Alert A, B and C followed by Passive A, B, and C 
Group two: Passive A, B, and C followed by Alert A, B, and C
 
 Stimuli A, B and C are randomly interleaved in the experiment, does this
 matter in my ANOVA?
  
 I am interested in making a between and within group comparison of
 responses to A, B, and C
 
 heHere is what I am doing:

 
 My data is arranged in the following way
 
 Group Subject Condition Stimulus Response
 One  S1  Alert  A   _Value_
 One  S1  Alert  B   _Value_
 One  S1  Alert  C   _Value_
 One  S1  Passive  A   _Value_
 One  S1  Passive  B   _Value_
 One  S1  Passive  C   _Value_
 One  S2  Alert  A   _Value_
 ...
 
 Two  S9  Alert  A   _Value_
 Two  S9  Alert  B   _Value_
 Two  S9  Alert  C   _Value_
 ...
 
 This is the code I used:
 
 M400_500_anova = aov(Response ~ Condition * Stimulus * Group +
 Error(Subject/Condition * Stimulus ), data=My_Data)
 
 Is this correct?
 
 
ANCOVA question

 
 I haven't gotten as far with this one, but I want to do an ANCOVA using
 the passive responses as a co-variate, but I am not sure how to even do
 this with so many factors/groups/repeated measures. I am also not sure
 what function to use in R, or how to even write the command. 
 
 Thank you so much for your help!
 Nick GH
 

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-ANCOVA-Repeated-Measure-Mixed-Model-tp3880868p3881024.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA define as factor or not

2011-09-27 Thread Jean V Adams
HronnE wrote on 09/27/2011 06:27:38 AM:
 
 Hi all
 
 This is probably a simple problem but somehow I am having much trouble 
with
 finding a solution, so I seek your help!
 
 I have a data-set with continuous response variables. The explanatory
 variably is 4xpH treatments (so 8.08, 7.94, 7.81 and 7.71)  so  also
 continuous and not technically factorial. 
 
 However I have decided to do Anova's (as well as regression) to explore 
the
 effect of pH.
 
 What I don't understand is why the anova done in such a way:
 
 summary(aov(BioMass~pH))
 
 ... gives me completely different p-values if I define the pH as factor 
or
 not. And what would be the correct approach?
 
 Help on this subject would be much appreciated!
 
 Hronn


If pH is a numeric variable (see ?class) then your formula is instructing 
aov() to include it as a linear term with one degree of freedom, just like 
lm() does if you're fitting a regression.  If you want to treat pH as a 
categorical variable, then you must make sure that it is included in the 
model as a factor.  Then you will see that it uses three degrees of 
freedom.

BioMass - rnorm(100)
pH - sample(c(8.08, 7.94, 7.81, 7.71), size=100, replace=TRUE)
pH.f - as.factor(pH)

summary(aov(BioMass ~ pH))
summary(lm(BioMass ~ pH))

summary(aov(BioMass ~ pH.f))
summary(lm(BioMass ~ pH.f))


Jean
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA define as factor or not

2011-09-27 Thread HronnE
Thank you very much for a quick reply!

I had not realized the degrees of freedom changed. It was my lack of
understanding of the aov function.

I will continue defining the pH as factor for the ANOVA's. 

Cheers,

Hronn



-
-- 
Hrönn Egilsdóttir
PhD Student
Marine Research Institute
Skúlagata 4
121 Reykjavík
ICELAND
--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-define-as-factor-or-not-tp3846861p3847781.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA with many IV's

2011-06-08 Thread Dennis Murphy
Hi:

You can try something like this: assuming the factor variables of
interest and the response variable are in a data frame named df,

ivset - c(comma separated vector of factor names)
myaovs -  lapply(ivset, function(x) {
 form - as.formula(substitute(yvar ~ foo, list(foo = as.name(x
 aov(form, data = df)
   } )

This should generate a list of aov objects, one per factor in ivset.
From there, R has functions to extract pieces of output as needed.

Since no example data was presented, the above is untested, so caveat emptor.

HTH,
Dennis

On Wed, Jun 8, 2011 at 3:25 PM, mandakaye mandak...@gmail.com wrote:
 I'd like to conduct one-way ANOVA's on multiple IV's. Is there a function for
 aov (y~all of my IV's)? Thank you!

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/ANOVA-with-many-IV-s-tp3583788p3583788.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA with many IV's

2011-06-08 Thread Rolf Turner


It is not nearly as complicated as Dennis Murphy makes out.

Just do

aov(y ~ ., data=X)

where X is your data frame with one column (the response) name ``y''
and then any number of other columns which will then form the
predictors (which may be either numeric predictors or factors).

cheers,

Rolf Turner

On 09/06/11 16:14, Dennis Murphy wrote:

Hi:

You can try something like this: assuming the factor variables of
interest and the response variable are in a data frame named df,

ivset- c(comma separated vector of factor names)
myaovs-  lapply(ivset, function(x) {
  form- as.formula(substitute(yvar ~ foo, list(foo = as.name(x
  aov(form, data = df)
} )

This should generate a list of aov objects, one per factor in ivset.
 From there, R has functions to extract pieces of output as needed.

Since no example data was presented, the above is untested, so caveat emptor.

HTH,
Dennis

On Wed, Jun 8, 2011 at 3:25 PM, mandakayemandak...@gmail.com  wrote:

I'd like to conduct one-way ANOVA's on multiple IV's. Is there a function for
aov (y~all of my IV's)? Thank you!

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-with-many-IV-s-tp3583788p3583788.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA Residual SS and MS of 0

2011-05-24 Thread Bert Gunter
 Where do I begin troubleshooting such a problem?

1. By examining your data to see whether you have entered it correctly.

2. By reading the posting Guide at the bottom of this and every
message and doing as it asks (posting a portion of your data and R
code so that we can see what you did).

3. By consulting your local statistician so that she can help you
better understand the nature of your problem and whether you have
appropriate data and methodology to address it.

-- Bert



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/ANOVA-Residual-SS-and-MS-of-0-tp3548031p3548031.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-11 Thread Rovinpiper


Please read the first two paragraphs of Details from ?formula

Ok, I just did.

So, If I understand this properly, the term Plot*Day would include both the
main effects of a and b and their second order interactions. So it could be
written Plot + Day + Plot:Day. 

The term Plot:Day includes only the second order interactions of Plot and
Day. 

So would that make this model redundant:

Rs ~ Plot + Day + Plot:Day

Since it includes the main effects of Plot and Day twice?

Thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3514828.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-10 Thread Rovinpiper
So what is the difference between a colon and an asterisk in this code? For
that matter what does the slash mean?

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3512977.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-10 Thread Richard M. Heiberger
David,

Please read the first two paragraphs of Details from ?formula

You can try out the formulas with the notation

 expand.formula - function(f) colnames(attr(terms(f), factors))
 expand.formula(~a+b)
[1] a b
 expand.formula(~a:b)
[1] a:b
 expand.formula(~a*b)
[1] a   b   a:b
 expand.formula(~a/b)
[1] a   a:b


See a design text for more information.

Rich


On Tue, May 10, 2011 at 4:51 PM, Rovinpiper david.j.mee...@gmail.comwrote:

 So what is the difference between a colon and an asterisk in this code? For
 that matter what does the slash mean?

 --
 View this message in context:
 http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3512977.html
  Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-06 Thread peter dalgaard

On May 5, 2011, at 23:30 , Rovinpiper wrote:

 Thanks slre,
 
 I seem to be making some progress now.
 
 Using a colon instead of an asterisk in the code really changes things. I
 had been getting residual SS and MS of zero. Which is ridiculous. Now I get
 much more plausible values. 
 
 Also, When I used an asterisk instead of a colon It wouldn't give results
 for three way interactions. With colons it will.
 
 You are correct about plot being nested within treatment. There are six
 plots in each of 2 treatments.
 
 So, I guess I will have to perform a separate analysis to quantify the
 effect of treatment.
 

Beware that as you have highly significant effects of plot and its interaction 
with day, and plot being nested in treatment, you can't test for treatment or 
treatment:day effect with a systematic effect of plot  and plot:treatment in 
the model (you are only getting p values because of the sequential computation 
of the anova table - if you put plot before treatment, you'd get zero df). More 
likely, you want to make the plot terms random, as in ~treatment*day + 
Error(plot/day)


 Thanks again.
 
 Analysis of Variance Table
 
 Response: Combined.Rs
 
 Df  Sum Sq   Mean Sq   F value  Pr(F)
 Combined.Trt 1  
 52.80 52.805 96.26012.2e-16 ***
 Combined.Plot10 
 677.69   67.769 123.5380 2.2e-16 ***
 as.factor(Combined.Day)16 
 2817.47 176.092   321.0041 2.2e-16 ***
 Combined.Trt:as.factor(Combined.Day)  16   47.82
 2.989   5.44874.048e-10 ***
 Combined.Trt:Combined.Plot:as.factor(Combined.Day)80  455.42   5.693  
 10.3776   2.2e-16 ***
 Residuals   
 284 155.79   0.549   
 ---
 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3499649.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-05 Thread S Ellison


 Rovinpiper david.j.mee...@gmail.com 04/05/2011 22:43 
So this seems to indicate that I have what I want. I have two
respiration data points at each plot on each day.

Yes; if you had only Plot+Day you'd have a completely balanced full
factorial ... for Plot and Day.

But I think I now see an answer to your puzzlement. 

Your original ANOVA table was 
  Df  Sum
SqMean SqF value   Pr(F) 

Combined.Trt   1   52.80  
52.805 2.0186e+30   2.2e-16 *** 
as.factor(Combined.Plot) 10  677.6967.769  
  2.5907e+30   2.2e-16 *** 
as.factor(Combined.Day) 16  2817.47  176.092  
6.7317e+30   2.2e-16 *** 
Combined.Trt:as.factor(Combined.Day)   16   47.82  2.989  
1.1426e+29   2.2e-16 *** 
as.factor(Combined.Day):Combined.Plot 160  611.21   3.820 1.4604e+29 
2.2e-16 *** 

Residuals 2040.00  
0.000 

Now, your Day:Plot combination has 17*12=204 combinations. And you are
showing 204 residual DOF, which is exactly right for two observations
per group and 204 groups. That's exactly right for the model
Day+Plot+Day:Plot.

But you have Trt in the model as well. Where are the Treatment DoF
coming from? 
Clearly, the treatment DoF has not come out of the residual term. That
means the treatment DoF must have come from somewhere else. Since you've
lost 2 instead of 1 DoF from Plot and kept your n[Day]-1 degrees of
freedom for Day, I would guess that Plot is nested in Trt. If that is
the case, you'd 'obviously' expect n[Plot]-2 dof for Plot. And indeed i
could replicate your ANOVA table DoF with 

Trt-gl(2,204)
Plot-gl(12,34) #Note that Plot is NOT fully crossed with Trt; each Trt
has only 6 Plots
Day - gl(17,2,408)
y-rnorm(408)
anova(lm(y~Trt+Plot+Day+Trt:Day+Day:Plot))

So that would be one data structure that would explain your DoF.

But if that is indeed the structure, there are other concerns that you
may need to look out for. 
First, compare the two models
anova(lm(y~Trt+Plot+Day+Trt:Day+Day:Plot))
#and
anova(lm(y~Trt+Plot+Day+Day:Plot+Trt:Day))

Same terms in the model specification, but different DoF and Trt:Day
has completely vanished from the second ANOVA table. Something is
clearly either aliased, unbalanced or incorrectly specified. In my
presumed version of events, because the Plot is nested in Trt, Day:Plot
is also nested in Trt. To get _consistent_ results independent of model
order you would need to reflect that additional nesting in the model:

anova(lm(y~Trt+Plot+Day+Trt:Day:Plot+Trt:Day))
#vs
anova(lm(y~Trt+Plot+Day+Trt:Day+Trt:Day:Plot))

and this time, the two models give identical results for all rows in
the table. Somewhat reassuringly 
anova(lm(y~Day+Trt/Plot+Trt:Day+Trt:Day:Plot))
and even more reassuringly car's Anova does too.

But there's another more fundamental thing that may suggest this is
still not quite the right thing to do. If Plot is nested in Trt, it
matters whether plot is random or fixed when asking about the treatment
effect. I'd guess Plot is random. If plot is random and nested in Trt it
would not be appropriate to simply compare any calculated Trt MS with
the residual MS. Rather, one would compare the Trt MS with the Trt:Plot
interaction. Or perhaps resort to a mixed effects model using lme (if
Plot is the only random term) or lmer if Plot and Day are both random. 




***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA

2011-05-05 Thread David Winsemius


On May 5, 2011, at 2:31 PM, Asan Ramzan wrote:


Hello R-Help

How can i exctact and store the within group mean squared  
difference from an

anova summary table into a varible.


In the absence of an example and code it's speculation, but something  
along the lines of:


anova(fit)$`Mean Sq`   ##  might give you something to work with.



[[alternative HTML version deleted]]

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-05 Thread Rovinpiper
Thanks slre,

I seem to be making some progress now.

Using a colon instead of an asterisk in the code really changes things. I
had been getting residual SS and MS of zero. Which is ridiculous. Now I get
much more plausible values. 

Also, When I used an asterisk instead of a colon It wouldn't give results
for three way interactions. With colons it will.

You are correct about plot being nested within treatment. There are six
plots in each of 2 treatments.

So, I guess I will have to perform a separate analysis to quantify the
effect of treatment.

Thanks again.

Analysis of Variance Table

Response: Combined.Rs

Df  Sum Sq   Mean Sq   F value  Pr(F)
Combined.Trt 1  
52.80 52.805 96.26012.2e-16 ***
Combined.Plot10 
677.69   67.769 123.5380 2.2e-16 ***
as.factor(Combined.Day)16 
2817.47 176.092   321.0041 2.2e-16 ***
Combined.Trt:as.factor(Combined.Day)  16   47.82
2.989   5.44874.048e-10 ***
Combined.Trt:Combined.Plot:as.factor(Combined.Day)80  455.42   5.693  
10.3776   2.2e-16 ***
Residuals   
284 155.79   0.549   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 


--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3499649.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-04 Thread Rovinpiper
This response went to my email:

Without your data it's hard to say, but one possibility is that your
plots are nested within treatments instead of crossed, or that you have
something rather more cunning going on involving the Days. For example
if you had 8 days for six of your plots and another 8 days for the
remaining 6 plots, you may find that the total degrees of freedom aren't
quite what you expected, as those subgroups need an intercept each. (I
had this for a replicated latin square design - but I have to say that
my problem then was that lm() gavbe me too many df and an apparently
unbalanced design - I had to add the superset factor manually to get it
right)

Another possibility is that one of your plots has no data; try
table(Combined.Plot) to check.



--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3496870.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-04 Thread Rovinpiper
And I responded as follows:

Hi,

Thanks for your advice. I tried using table() to check for missing
data. Here are the results:

 table(Combined.Plot)

Combined.Plot

 60m   A1   B1   B3   B4   C5   C9   D2   D9 F60m   F8   Q7
 34 34   3434   343434   3434   34  34   34

 table(Combined.Day)

Combined.Day

  1 2.5  7.5   8.5  10.5 12.5 14.5 17.5 18.5 19.5 21.5 24.5 29.5
32.5 37.5 50.5 79.5
  24   2424   24242424   242424   24   24
24 24 24 24 24

So this seems to indicate that I have what I want. I have two
respiration data points at each plot on each day.

I set this data up so that I could make comparisons of all of the
samples on the same day. That's why I have numbers like 2.5.

I also made a table for comparison of Combined.Day and Combined.Plot.

 cbind(Combined.Day, Combined.Plot)
Combined.Day Combined.Plot
11  60m
2.5  2.560m
7.5  7.560m
8.5  8.560m
10.5 10.5   60m
12.5 12.5   60m
14.5 14.5   60m
17.5 17.5   60m
18.5 18.5   60m
19.5 19.5   60m
21.5 21.5   60m
24.5 24.5   60m
29.5 29.5   60m
32.5 32.5   60m
37.5 37.5   60m
50.5 50.5   60m
79.5 79.5   60m
11  A1
2.5  2.5A1
7.5  7.5A1
8.5  8.5A1
10.5 10.5   A1
12.5 12.5   A1
14.5 14.5   A1
17.5 17.5   A1
18.5 18.5   A1
19.5 19.5   A1
21.5 21.5   A1
24.5 24.5   A1
29.5 29.5   A1
32.5 32.5   A1
37.5 37.5   A1
50.5 50.5   A1
79.5 79.5   A1
11  B1
2.5  2.5B1
7.5  7.5B1
8.5  8.5B1
10.5 10.5   B1
12.5 12.5   B1
14.5 14.5   B1
17.5 17.5   B1
18.5 18.5   B1
19.5 19.5   B1
21.5 21.5   B1
24.5 24.5   B1
29.5 29.5   B1
32.5 32.5   B1
37.5 37.5   B1
50.5 50.5   B1
79.5 79.5   B1
11  B3
2.5  2.5B3
7.5  7.5B3
8.5  8.5B3
10.5 10.5   B3
12.5 12.5   B3
14.5 14.5   B3
17.5 17.5   B3
18.5 18.5   B3
19.5 19.5   B3
21.5 21.5   B3
24.5 24.5   B3
29.5 29.5   B3
32.5 32.5   B3
37.5 37.5   B3
50.5 50.5   B3
79.5 79.5   B3
11  B4
2.5  2.5B4
7.5  7.5B4
8.5  8.5B4
10.5 10.5   B4
12.5 12.5   B4
14.5 14.5   B4
17.5 17.5   B4
18.5 18.5   B4
19.5 19.5   B4
21.5 21.5   B4
24.5 24.5   B4
29.5 29.5   B4
32.5 32.5   B4
37.5 37.5   B4
50.5 50.5   B4
79.5 79.5   B4
11  C5
2.5  2.5C5
7.5  7.5C5
8.5  8.5C5
10.5 10.5   C5
12.5 12.5   C5
14.5 14.5   C5
17.5 17.5   C5
18.5 18.5   C5
19.5 19.5   C5
21.5 21.5   C5
24.5 24.5   C5
29.5 29.5   C5
32.5 32.5   C5
37.5 37.5   C5
50.5 50.5   C5
79.5 79.5   C5
11  C9
2.5  2.5C9
7.5  7.5C9
8.5  8.5C9
10.5 10.5   C9
12.5 12.5   C9
14.5 14.5   C9
17.5 17.5   C9
18.5 18.5   C9
19.5 19.5   C9
21.5 21.5   C9
24.5 24.5   C9
29.5 29.5   C9
32.5 32.5   C9
37.5 37.5   C9
50.5 50.5   C9
79.5 79.5   C9
11  D2
2.5  2.5D2
7.5  7.5D2
8.5  8.5D2
10.5 10.5   D2
12.5 12.5   D2
14.5 14.5   D2
17.5 17.5   D2
18.5 18.5   D2
19.5 19.5   D2
21.5 21.5   D2
24.5 24.5   D2
29.5 29.5   D2
32.5 32.5   D2
37.5 37.5   D2
50.5 50.5   D2
79.5 79.5   D2
11  D9
2.5  2.5D9
7.5  7.5D9
8.5  8.5D9
10.5 10.5   D9
12.5 12.5   D9
14.5 14.5   D9
17.5 17.5   D9
18.5 18.5   D9
19.5 19.5   D9
21.5 21.5   D9
24.5 24.5   D9
29.5 29.5   D9
32.5 32.5   D9
37.5 37.5   D9
50.5 50.5   D9
79.5 79.5   D9
11  F60m
2.5  2.5F60m
7.5  7.5F60m
8.5  8.5F60m
10.5 10.5   F60m
12.5 12.5   F60m
14.5 14.5   F60m
17.5 17.5   F60m
18.5 18.5   F60m
19.5 19.5   F60m
21.5 21.5   F60m
24.5 24.5   F60m
29.5 29.5   F60m
32.5 32.5   F60m
37.5 37.5   F60m
50.5 50.5   F60m
79.5 79.5   F60m
11  F8
2.5  2.5F8
7.5  7.5F8
8.5  8.5F8
10.5 10.5   F8
12.5 12.5   F8
14.5 14.5   F8
17.5 17.5   F8
18.5 18.5   F8
19.5 19.5   F8
21.5 21.5   F8
24.5 24.5   F8
29.5 29.5   F8
32.5 32.5   F8
37.5 37.5   F8
50.5 50.5   F8
79.5 79.5   F8
11  Q7
2.5  2.5Q7
7.5  7.5Q7
8.5  8.5Q7
10.5 10.5   Q7
12.5 12.5   Q7
14.5 14.5   Q7
17.5 17.5   Q7
18.5 18.5   Q7
19.5 19.5   Q7
21.5 21.5   Q7
24.5 24.5   Q7
29.5 29.5   Q7
32.5 32.5   Q7
37.5 37.5   Q7
50.5 50.5   Q7
79.5 79.5   Q7
11  60m
2.5  2.560m
7.5  7.560m
8.5  8.560m
10.5 10.5   60m
12.5 12.5   60m
14.5 14.5   60m
17.5 17.5   60m
18.5 18.5   60m
19.5 

Re: [R] ANOVA 1 too few degrees of freedom

2011-05-03 Thread Richard M. Heiberger
Most likely your combined.trt is linearly dependent on the combined.plot
factor.  Try

Anova.Trt.D.M.T.Pr.Model - aov(Combined.Rs ~  as.factor(Combined.Plot)
.
and see if combined.plot now has the 11 df you are anticipating.

Rich

On Tue, May 3, 2011 at 3:37 PM, Rovinpiper david.j.mee...@gmail.com wrote:

 I'm running an ANOVA on some data for respiration in a forest. I am having
 a
 problem with my degrees of freedom. For one of my variables I get one fewer
 degrees of freedom than I should.

 I have 12 plots and I therefore expected 11 degrees of freedom, but instead
 I got 10.

 Any ideas?

 I have some code and output below:


  class(Combined.Plot)
 [1] character

  levels(as.factor(Combined.Plot))
  [1] 60m  A1   B1   B3   B4   C5   C9   D2   D9   F60m
 F8   Q7

  nlevels(as.factor(Combined.Plot))
 [1] 12

  Anova.Trt.D.M.T.Pr.Model - aov(Combined.Rs~Combined.Trt +
  as.factor(Combined.Plot) + as.factor(Combined.Day) +
  Combined.Trt*as.factor(Combined.Day) +
  Combined.Plot*as.factor(Combined.Day))

 Warning message:
 In model.matrix.default(mt, mf, contrasts) :
  variable 'Combined.Plot' converted to a factor

  summary(Anova.Trt.D.M.T.Pr.Model)
   Df  Sum Sq
 Mean SqF value   Pr(F)

 Combined.Trt   1   52.80
 52.805 2.0186e+30   2.2e-16 ***
 as.factor(Combined.Plot) 10  677.6967.769
 2.5907e+30   2.2e-16 ***
 as.factor(Combined.Day) 16  2817.47  176.092
 6.7317e+30   2.2e-16 ***
 Combined.Trt:as.factor(Combined.Day)   16   47.82  2.989
 1.1426e+29   2.2e-16 ***
 as.factor(Combined.Day):Combined.Plot 160  611.21   3.820 1.4604e+29 
 2.2e-16 ***

 Residuals 2040.00
 0.000

 ---

 Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

 --
 View this message in context:
 http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3493349.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA 1 too few degrees of freedom

2011-05-03 Thread Rovinpiper
Hi Richard,

Thanks for your advice.

I think that your suggestion is that I run the ANOVA with Combined.Plot as a
factor. I have tried that does not alleviate the problem.

Did I understand you properly?

Do you have another idea?

Thanks,

David

--
View this message in context: 
http://r.789695.n4.nabble.com/ANOVA-1-too-few-degrees-of-freedom-tp3493349p3493632.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA for stratified cox regression

2011-03-10 Thread Thomas Lumley
On Fri, Mar 11, 2011 at 8:25 AM, Brian McLoone brianbmclo...@gmail.com wrote:
 This is a follow-up to a query that was posted regarding some problems that
 emerge when running anova analyses for cox models, posted by Mathias Gondan:

 Matthias Gondan wrote:

* Dear List, I have tried a stratified Cox Regression, it is working 
fine, except for** the Anova-Tests: Here the commands (should work 
out of the box): library(survival)** d = colon[colon$etype==2, ]** m 
= coxph(Surv(time, status) ~ strata(sex) + rx, data=d)** summary(m)** # 
Printout ok** anova(m, test='Chisq') This is the output of the anova 
command:   ** Analysis of Deviance Table**  Cox model: response is 
Surv(time, status)** Terms added sequentially (first to last)       
       Df Deviance Resid. Df Resid. Dev P(|Chi|)** NULL                    
       929     5233.5          ** strata(sex)   0                929        
             ** rx            2                927     5221.2          ** 
Warning message:** In is.na(coef(fit)) :**   is.na() auf nicht-(Liste 
oder Vektor) des Typs 'NULL' angewendet**      It should be possible 
to do Chi-Square-Tests in a stratified analysis, ** right?*


 We have run into the same problem.  In particular, we are running stratified
 coxph models, but we are not able to generate a Chisq probability for both
 models.

 drop1(Model,test='Chisq')
 Single term deletions

 Model:
 Surv(adjusted_days_60, Death_60_10) ~ strata(chol_60)
                Df    AIC    LRT Pr(Chi)
 none             7437.6
 strata(chol_60)  0 8513.2 1075.6

 As you can see, why is the probability of the Chisq - i.e., Pr(Chi) -
 missing?

Because you can't do that.  The strata() term corresponds to an
infinite-dimensional parameter, a new baseline hazard, so it isn't
like a straightforward likelihood ratio test.  It certainly doesn't
have a chisquared distribution with a fixed number of degrees of
freedom -- I don't know if anyone has even worked out the sampling
distribution of this deviance difference.

-thomas


-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA and Pseudoreplication in R

2011-02-25 Thread Bert Gunter
I can hopefully save bandwidth here by suggesting that this belongs on
the R-sig-mixed-models list.

-- Bert

As an aside, shouldn't you be figuring this out yourself or seeking
local consulting expertise?

On Fri, Feb 25, 2011 at 9:08 AM, Ben Ward benjamin.w...@bathspa.org wrote:
 Hi, As part of my dissertation, I'm going to be doing an Anova, comparing
 the dead zone diameters on plates of microbial growth with little paper
 disks loaded with antimicrobial, a clear zone appears where death occurs,
 the size depending on the strength and succeptibility. So it's basically 4
 different treatments, and I'm comparing the diameters (in mm) of circles.
 I'm concerned however, about Pseudoreplication and how to deal with it in R,
 (I thought of using the Error() term.

 I have four levels of one factor(called Treatment): NE.Dettol, EV.Dettol,
 NE.Garlic, EV.Garlic.   (NE.Dettol is E.coli not evolved to dettol,
 exposed to dettol to get dead zones. And the same for NE.Garlic, but with
 garlic, not dettol. EV.Dettol is E.coli that has been evolved against
 dettol, and then tested afterwards against dettol to get the dead zones.
 Same applies for EV.Garlic but with garlic).  You see from the four levels
 (or treatments) there are two chemicals involved. So my first concern is
 whether they should be analysed using two seperate ANOVA's.

 NE.Dettol and NE.Garlic are both the same organism - a lab stock E.coli,
 just exposed to two different chemicals.
 EV.Dettol and EV.Garlic, are in principle, likely to be two different forms
 of the organism after the many experimental doses of their respective
 chemical.

 For NE.Garlic and NE.Dettol I have 5, what I've called Lineages, basically
 seperate bottles of them (10 in total).
 Then I have 5 Bottles (Lineages) of EV.Dettol, and 5 of EV.Garlic. - This
 was done because there was the possiblity that, whilst I'm expecting them
 all to respond in a similar manner, there are many evolutionary paths to the
 same result, and previous research and reading shows that occasionally one
 or two react differently to the rest through random chance.
 The point I observed above (NE.Dettol and NE.Garlic are both the same
 organism...) is also applicable to the 5 bottles: The 5 bottles each of
 NE.Garlic and NE.Dettol are supposed to be all the same organism - from a
 stock one kept in store in the lab.
 There is potential though for the 5 of EV.Garlic, to be different from one
 another, and potential for the 5 EV.Dettol to be different from one another.

 The Lineage (bottle) is also a factor then, with 5 levels (1,2,3,4,5).
 Because they may be different.

 To get the measurements of the diamter of the zones. I take out a small
 amount from a tube and spread it on a plate, then take three paper disks,
 soaked in their respective chemical, either Dettol or Garlic. and press them
 and and incubate them.
 Then when the zones have appeared after a day or 2. I take 4 diameter
 measurements from each zone, across the zone at different angles, to take
 account for the fact, that there may be a weird shape, or not quite
 circular.

 I'm concerned about pseudoreplication, such as the multiple readings from
 one disk, and the 5 lineages - which might be different from one another in
 each of the Two EV. treatments, but not with NE. treatments.

 I read that I can remove pseudoreplication from  the multiple readings from
 each disk, by using the 4 readings on each disk, to produce a mean for the
 disks, and analyse those means - Exerciseing caution where there are extreme
 values. I think the 3 disks for each lineage themselves are not
 pseudoreplication, because they are genuinley 3 disks on a plate: the Disk
 Diffusion Test replicated 3 times - but the multiple readings from one disk
 if eel, is pseudoreplication. I've also read about including Error() terms
 in a formula.

 I'm unsure of the two NE. Treatments comming from the same culture does not
 introduce pseudoreplications at Treatment Factor Level, because of the two
 different antimicrobials used have two different effects.

 I was hoping for a more expert opinion on whether I have identified
 pseudoreplication correctly or if there is indeed pseudoreplication in the 5
 Lineages or anywhere else I haven't seen. And how best this is dealt with in
 R. At the minute my solution to the multiple readings from one disk is to
 simply make a new factor, with the means on and do Anova from that, or even
 take the means before I even load the dataset into R. I'm wondering if an
 Error() term would be correct.

 Thanks,
 Ben W.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list

Re: [R] anova() interpretation and error message

2011-02-06 Thread Peter Ehlers

See comments inline.

On 2011-02-06 03:17, Jinsong Zhao wrote:

Hi there,

I have a data frame as listed below:

Ca.P.Biomass.A
   P   Biomass
1 334.5567 0.287
2 737.5400 0.571
3 894.5300 0.639
4 782.3800 0.5836667
5 857.5900 0.600
6 829.2700 0.588

I have fit the data using logistic, Michaelis–Menten, and linear model,
they all give significance.

fm1- nls(Biomass~SSlogis(P, phi1, phi2, phi3), data=Ca.P.Biomass.A)
fm2- nls(Biomass~SSmicmen(P, phi1, phi2), data=Ca.P.Biomass.A)
fm3- lm(Biomass~P, data = Ca.P.Biomass.A)

I hope to compare the difference among the three models, and I using
anova(). As for the example here, the three models seem not have
significant difference. However, I am confused by the negative df in the
following ANOVA table. And my question is how to interpret the results,
if the Pr  0.05.

anova(fm1,fm2,fm3)
Analysis of Variance Table

Model 1: Biomass ~ SSlogis(P, phi1, phi2, phi3)
Model 2: Biomass ~ SSmicmen(P, phi1, phi2)
Model 3: Biomass ~ P
Res.Df Res.Sum Sq Df  Sum Sq F value Pr(F)
1  3 0.00063741
2  4 0.00087249 -1 -0.00023508  1.1064 0.3701
3  4 0.00142751  0  0.


Read the Details section in help(anova.lm) to see why
you get negative DF values. The reason is simply that
3 - 4 =  -1. Compare

 anova(fm1, fm2)

with

 anova(fm2, fm1)

That's why I prefer to list my models in order of increasing
complexity. Note also that for interpretation models should
be nested. Yours aren't.



And when the argument position changed, the anova() give different
results. It seems the anova() compare the first model with all other models.

anova(fm2,fm1,fm3)
Analysis of Variance Table

Model 1: Biomass ~ SSmicmen(P, phi1, phi2)
Model 2: Biomass ~ SSlogis(P, phi1, phi2, phi3)
Model 3: Biomass ~ P
Res.Df Res.Sum Sq Df  Sum Sq F value Pr(F)
1  4 0.00087249
2  3 0.00063741  1  0.00023508  1.1064 0.3701
3  4 0.00142751 -1 -0.00079010  3.7187 0.1494

When I put the fm3, a linear model, in the first position, and two nls
model following it, anova() give the following error message. It seems
abnormal.


Not so strange. If you look at

 methods(anova)

you will see that there are different functions for
different classes of model. So, if your first model is
of class nls, then anova.nls will be used; for a
first model of class lm anova.lm is used. The main
thing is not to mix models of different class. If you
want a linear model to compare with the nonlinear
ones, use nls() to estimate the linear model. But you
still have the problem of non-nestedness.



anova(fm3,fm1,fm2)
Analysis of Variance Table

Response: Biomass
Df   Sum Sq  Mean Sq F valuePr(F)
P  1 0.081163 0.081163  227.43 0.0001127 ***
Residuals  4 0.001428 0.000357
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In anova.lmlist(object, ...) :
models with response c(NULL, NULL) removed because response
differs from model 1

Any suggestions and comments will be really appreciated. Thanks in advance.



After the above comments, I have one more:

You have 6 observations and you want to fit relatively
complex models whose estimated coefficients will be
*extremely* sensitive to *one* of your observations
(I assume that you have looked at a plot of the data).
The only way this could make any sense is if there is
well established theory that specifies one particular
model and you want to check that your data are at
least not obviously inconsistent with that theory.
Fishing through several models with six data points
makes no sense. I think that the best you can conclude
from your data is that Biomass probably increases with P.

Peter Ehlers


Regards,
Jinsong

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA plotting

2011-01-20 Thread Sarah Goslee
I'm not sure what a real ANOVA diagram is supposed to look like, nor
do I know what your data look like.

But this might get you started:
fakedata - runif(100)
fakegroups - sample(rep(letters[1:5], each=20))
boxplot(fakedata ~ fakegroups)

If that isn't what you're after, a clearer explanation with a
reproducible example
would help us help you.

Sarah

On Thu, Jan 20, 2011 at 2:15 PM, Bulent Arikan bulent.ari...@gmail.com wrote:
 Dear List,

 I recently started using R and I have a simple question. I am running R (v.
 2.12.1) and Rcmdr (v.1-6.3) on Mac (Snow Leopard).

 I am using a data set I used before for practicing ANOVA with R, so I know
 what the results should look like. I can get ANOVA table using both Rcmdr
 and GUI. However, I cannot make R prepare the ANOVA diagram (with boxplots,
 showing the data points, including the outliers) for the dataset. If I use
 Rcmdr (ModelsGraphs) then I get some graphical representation for the ANOVA
 model I prepared in R (showing diagnostic plots, Q-Q plots, etc.). However,
 the real ANOVA diagram is not coming up. I tried using plot command in GUI
 but did not get what I wanted. I could not find an answer to this on-line or
 in my book. I apologize if this was covered recently in the mailing list, I
 just became a member.

 Thanks for all the help!




-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ANOVA plotting

2011-01-20 Thread Robert Baer
I recently started using R and I have a simple question. I am running R 
(v.

2.12.1) and Rcmdr (v.1-6.3) on Mac (Snow Leopard).

I am using a data set I used before for practicing ANOVA with R, so I 
know

what the results should look like. I can get ANOVA table using both Rcmdr
and GUI. However, I cannot make R prepare the ANOVA diagram (with 
boxplots,
showing the data points, including the outliers) for the dataset. If I 
use
Rcmdr (ModelsGraphs) then I get some graphical representation for the 
ANOVA
model I prepared in R (showing diagnostic plots, Q-Q plots, etc.). 
However,
the real ANOVA diagram is not coming up. I tried using plot command in 
GUI
but did not get what I wanted. I could not find an answer to this on-line 
or
in my book. I apologize if this was covered recently in the mailing list, 
I

just became a member.


Note that it is possible to do any Rcmdr menu operation from the command 
line simply by typing the command line output that results from doing the 
menu operation in Rcmdr.  Observe the red text in the output window.  The 
Rcmdr package supplies not only the Menu driven window, but some functions 
to accomplish the operations of the Window.  Observe the red text in the 
output window.


Assuming you want an interaction plot for 2-factor ANOVA and following on 
from Sarah's example paste the following code at the command line.  Note 
that it makes use of the Rcmdr convenience function plotMeans() so you need 
the Rcmdr package loaded:


 library(Rcmdr)
 fakedata - runif(100)
 fakegroups - sample(rep(letters[1:5], each=20))
 factor2 = sample(rep(letters[1:2], each=50))
 plotMeans(df$data, df$groups, df$factor2, error.bars=se)

For more information on how to use this function from the command line see 
its help file by typing

?plotMeans


Hope you can generalize this if you were actually asking a different 
question.  See posting guide for best practices on providing a reproducible 
example of what you want to accomplish.


Rob

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   3   >