Giovanni,

Have you tried Bert's suggestion 2)? His log(R) ~ A*B + C + D is NOT the same as your log(R) ~ A + B + I(A*B) + C + D.
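To make the difference between the two formulas concrete, here is a minimal sketch that lists the terms R builds from each one. The data frame `d` is a made-up balanced 2^4 toy design with a random response (an assumption for illustration only, not the `throughput` data from the thread), and the names `labels_star`, `labels_colon` and `labels_I` are likewise hypothetical.

```r
# Toy balanced 2^4 design with a random positive response.
# This is illustrative data only, NOT the poster's 'throughput' data.
set.seed(1)
d <- expand.grid(A = c("a1", "a2"), B = c("b1", "b2"),
                 C = c("c1", "c2"), D = c("d1", "d2"),
                 stringsAsFactors = TRUE)
d$R <- exp(rnorm(nrow(d)))

# Term labels implied by each formula (no model is fitted at this point).
labels_star  <- attr(terms(log(R) ~ A * B + C + D), "term.labels")
labels_colon <- attr(terms(log(R) ~ A + B + A:B + C + D), "term.labels")
labels_I     <- attr(terms(log(R) ~ A + B + I(A * B) + C + D), "term.labels")

print(labels_star)   # contains the interaction term A:B
print(labels_colon)  # same terms as the A * B version
print(labels_I)      # contains I(A * B) instead of A:B

# Multiplying two factors is not defined: R warns and returns only NAs,
# which is why I(A * B) cannot stand in for the interaction.
product_is_all_na <- all(is.na(suppressWarnings(d$A * d$B)))

# Main effects for all four factors plus only the A:B interaction.
fit <- aov(log(R) ~ A * B + C + D, data = d)
print(summary(fit))
```

On the real data one would replace `d` with `throughput`; whether the fit then succeeds still depends on every factor having at least two levels among the rows that are actually used.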
Note that I(A * B) means: create a new variable that is the arithmetic product of A and B. That is not meaningful if A and B are factors (hence the warning you got). So I(A * B) is not the interaction between A and B. You need A:B if you want the interaction (A*B is shorthand for A + B + A:B).

Thierry

> -----Original message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Giovanni Azua
> Sent: Monday 21 November 2011 17:00
> To: r-help@r-project.org
> Subject: Re: [R] [OT] 1 vs 2-way anova technical question
>
> Hello Bert,
>
> Thank you for taking the time to try to answer.
>
> 1) I know this, however if one is interested in only interaction between two specific factors then in R one uses I(A*B*C) meaning 3-way anova for that and not the implicit 2-ways that would otherwise be computed.
>
> 2) True, but it fails.
>
> 3) No, I don't have any factors with one level, I never said that. It would not be a 2^k experiment otherwise, my OP states this clearly, this is a 2^k experimental design ___2___
>
> 4) this is only your judgmental attitude that many people unfortunately have in some of these lists, focussing on ad-hominem judgements or even attacks to try to prove their superiority without actually answering nor adding any value to the question at hand. I have taken many graduate courses in subjects that have all Statistics in the title and passed all of them. However, as an experienced Software Engineer working for more than 10 years in the field, I can tell you that there is a huge difference between solving toy problems to implementing real-life complex projects. Same rules apply here, one thing is the toy examples one finds in R books and course exercises and another totally different story is the real life data I am trying to model. I'm a student in the quantitative part and learning, so I do have some gaps, I am curious and trying to learn and I think there is no shame in that.
> If this makes you upset maybe you should ask to split the list in two or more: "Advanced-PhD-black-belt-10th-dan-in-Statistics-and-R level" list and "newbies" list.
>
> Best regards,
> Giovanni
>
> On Nov 21, 2011, at 3:55 PM, Bert Gunter wrote:
>
> > Giovanni:
> >
> > 1. Please read ?formula and/or An Introduction to R for how to specify linear models in R.
> >
> > 2. Correct specification of what you want (if I understand correctly) is
> > log(R) ~ A*B + C + D
> >
> > 3. ... which presumably will also fail because some of your factors have only one level, which means that you cannot use them in your model.
> >
> > 4. ... which, in turn, suggests you don't know what you're doing statistically and should seek local assistance, especially in trying to interpret a fit to an unbalanced model (you can't do it as you probably think you can).
> >
> > I should say in your defense that posts on this list indicate that point 4 is a widely shared problem among posters here.
> >
> > Cheers,
> > Bert
> >
> > On Mon, Nov 21, 2011 at 5:02 AM, Giovanni Azua <brave...@gmail.com> wrote:
> >> Hello,
> >>
> >> Couple of clarifications:
> >> - A,B,C,D are factors and I am also interested in possible interactions but the model that comes out from aov R~A*B*C*D violates the model assumptions
> >> - My 2^k is unbalanced i.e. missing data and an additional level I also include in one of the factors i.e. C
> >> - I was referring in the OP to the 4-way interactions and not 2-way, I'm sorry for my confusion.
> >> - I tried to create an aov model with fewer interactions this way but I get the following error:
> >>
> >> model.aov <- aov(log(R)~A+B+I(A*B)+C+D,data=throughput)
> >> Error in `contrasts<-`(`*tmp*`, value = "contr.treatment") :
> >>   contrasts can be applied only to factors with 2 or more levels
> >> In addition: Warning message:
> >> In Ops.factor(A, B) : * not meaningful for factors
> >>
> >> Here I was trying to say: do a one-way anova except for the A and B factors for which I would like to get their 2-way interactions ...
> >>
> >> Thanks in advance,
> >> Best regards,
> >> Giovanni
> >>
> >> On Nov 21, 2011, at 12:04 PM, Giovanni Azua wrote:
> >>
> >>> Hello,
> >>>
> >>> I know there is plenty of people in this group who can give me a good answer :)
> >>>
> >>> I have a 2^k model where k=4 like this:
> >>> Model 1) R~A*B*C*D
> >>>
> >>> If I use the "*" in R among all elements it means to me to explore all interactions and include them in the model i.e. I think this would be the so called 2-way anova. However, if I do this, it leads to model violations i.e. the homoscedasticity is violated, the normality assumption of the sample errors i.e. residuals is violated etc. I tried correcting the issues using different standard transformations: log, sqrt, Box-Cox forms etc but none really improve the result. In this case even though the model assumptions do not hold, some of the interactions are found to significantly influence the response variable. But then shall I trust the results of this Model 1) given that the assumptions do not hold?
> >>>
> >>> Then I try this other model where I exclude the interactions (is this the 1-way anova?):
> >>> Model 2) R~A+B+C+D
> >>>
> >>> In this one the model assumptions hold except the existence of some outliers and a slightly heavy tail in the QQ-plot.
> >>>
> >>> Given that the assumptions for Model 1) do not hold, I assume I should ignore the results altogether for Model 1) or? or instead can I safely use the Sum Sq. of Model 1) to get my table of percent of variations?
> >>>
> >>> This to me was a bit counter-intuitive since I assumed that if there was collinearity among factors (and there is e.g. I(A*B*C)) the Model 1) and I included those interactions, my model would be more accurate ... ok this turned into a brand new topic of model selection but I am mostly interested in the question: if model is violated can I or must I not use the results e.g. Sum Sqr for that model?
> >>>
> >>> Can anyone advise please?
> >>>
> >>> btw I have bought most books on R and statistical analysis. I have researched them all and the ANOVA coverage is very shallow in most of them especially in the R-sy ones, they just offer a slightly pimped up version of the R-help.
> >>>
> >>> I am also unofficially following a course on ANOVA from the university I am registered in and most examples are too simplistic and either the assumptions just hold easily or the assumptions don't hold and nothing happens.
> >>>
> >>> Thanks in advance,
> >>> Best regards,
> >>> Giovanni
> >>>
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> >
> > Bert Gunter
> > Genentech Nonclinical Biostatistics
> >
> > Internal Contact Info:
> > Phone: 467-7374
> > Website:
> > http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm