[R] MASS library rob.cov ellipse

2007-07-02 Thread Jenifer Larson-Hall
I used to get a really useful graph when I ran the following command
using the MASS library:


Besides the regular output, a graph appeared that had the classical
correlation and the robust correlation, and two ellipses, one
surrounding the data that would be used in the classical correlation and
the other surrounding the data in the robust correlation. 

I've searched through the MASS library but don't see a separate command
that could produce this graph. Does anyone know whether one exists, or
did the graph just disappear in the newer version of R?
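
A plot of this kind can be approximated by hand. The sketch below is not the original command, which is missing from this message. It assumes that cov.rob from MASS supplies the robust estimate, it runs on simulated data, and the helper ellipse_points is hypothetical (written here for illustration, not part of MASS).

```r
library(MASS)

set.seed(1)
# Simulated bivariate data with a few outliers (not the poster's data)
x <- mvrnorm(50, mu = c(0, 0), Sigma = matrix(c(1, 0.8, 0.8, 1), 2))
x[1:4, ] <- cbind(rnorm(4, 3, 0.3), rnorm(4, -3, 0.3))

# Hypothetical helper: points on the ellipse that would cover about `level`
# of a bivariate normal with the given center and covariance
ellipse_points <- function(center, cov, level = 0.95, n = 200) {
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- cbind(cos(theta), sin(theta))
  radius <- sqrt(qchisq(level, df = 2))
  sweep(radius * circle %*% chol(cov), 2, center, "+")
}

classical <- list(center = colMeans(x), cov = cov(x))
robust <- cov.rob(x, cor = TRUE)  # robust center, covariance and correlation

ell_classical <- ellipse_points(classical$center, classical$cov)
ell_robust <- ellipse_points(robust$center, robust$cov)

# Empty plot sized to hold the data and both ellipses, then draw each layer
plot(rbind(x, ell_classical, ell_robust), type = "n",
     xlab = "x1", ylab = "x2")
points(x)
lines(ell_classical, lty = 2)
lines(ell_robust, lty = 1)
legend("topright", lty = c(2, 1),
       legend = c(sprintf("classical r = %.2f", cor(x)[1, 2]),
                  sprintf("robust r = %.2f", robust$cor[1, 2])))
```

The dashed ellipse follows the classical mean and covariance, so the outliers pull it out of shape; the solid one follows the robust fit.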

Thanks for any help,

Dr. Jenifer Larson-Hall
Assistant Professor of Linguistics
University of North Texas

R-help@stat.math.ethz.ch mailing list
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Comparing models in multiple regression and hierarchical linear regression

2006-11-06 Thread Jenifer Larson-Hall
I don’t know if this question properly belongs on this list, but I’ll ask it 
here because I’ve been using R to run linear regression models, and it is only 
in using R (after switching from using SPSS) that I have discovered the process 
of fitting a linear model. However, after reading Crawley (2002), Fox (2002), 
Verzani (2004), Dalgaard (2002) and of course searching the R-help archives I 
cannot find an answer to my question.
I have 5 explanatory variables (NR, NS, PA, KINDERWR, WM) and one 
response variable (G1L1WR). A simple main effects model finds that only PA is 
statistically significant, and an anova comparison between a 5-variable main 
effects model and a 1-variable main effects model finds no difference between 
the models. So it is possible to simplify the model to just G1L1WR ~ PA. This 
leaves me with a residual standard error of 0.3026 on 35 degrees of freedom and 
an adjusted R2 of 0.552.
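
The comparison described above can be sketched as follows. The variable names come from this message, but the data are simulated stand-ins (the real data set is not shown here), so the numbers will not match those reported.

```r
set.seed(42)
n <- 37  # chosen so that G1L1WR ~ PA leaves 35 residual df, as in the post

# Simulated stand-in data; only the column names are taken from the post
d <- data.frame(NR = rnorm(n), NS = rnorm(n), PA = rnorm(n),
                KINDERWR = rnorm(n), WM = rnorm(n))
d$G1L1WR <- 0.5 * d$PA + rnorm(n, sd = 0.3)

main_fit <- lm(G1L1WR ~ NR + NS + PA + KINDERWR + WM, data = d)
simple_fit <- lm(G1L1WR ~ PA, data = d)

# F test of the four dropped terms; a large p-value means the simpler
# model is not detectably worse than the 5-variable model
anova(simple_fit, main_fit)

# Residual standard error and adjusted R-squared of the simplified model
summary(simple_fit)
```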
I also decided, following Crawley’s (2002) advice, to create a maximal 
model, G1L1WR ~ NR*NS*PA*KINDERWR*WM. This full model is not a good fit, but a 
stepAIC through the model revealed the model which had a maximal fit:

maximal.fit=lm(formula = G1L1WR ~ NR + KINDERWR + NS + WM + PA + NR:KINDERWR + 

All of the terms of this model have statistically significant t-tests, the residual standard 
error has gone down to 0.2102, and the adjusted R2 has increased to 0.7839. An 
anova shows a clear difference between the simplified model and the maximal fit 
model. My question is, should I really pick the maximal fit over the simple 
model when it is really so much harder to understand? I guess there’s really no 
easy answer to that, but if that’s so, then my question is—would there be 
anything wrong with me saying that sometimes you might value parsimony and ease 
of understanding over best fit? Because I don’t really know what the maximal 
fit model buys you. It seems unintelligible to me. All of the terms are 
involved in interactions to some extent, but there are 4-way interactions and 
3-way interactions and 2-way interactions and I’m not sure even how to 
understand it. A nice tree model showed that at higher levels of PA, KINDERWR 
and NS affected scores. That I can understand, but that is not reflected in 
this model.
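
For concreteness, the stepwise search and the comparison with the simple model might look like the sketch below. It assumes stepAIC from MASS with its default backward search, and again uses simulated stand-in data, so the selected formula will differ from the one reported above.

```r
library(MASS)

set.seed(42)
n <- 37
# Simulated stand-in data; only the column names are taken from the post
d <- data.frame(NR = rnorm(n), NS = rnorm(n), PA = rnorm(n),
                KINDERWR = rnorm(n), WM = rnorm(n))
d$G1L1WR <- 0.5 * d$PA + 0.3 * d$PA * d$KINDERWR + rnorm(n, sd = 0.3)

# Maximal model with every interaction up to the 5-way term
full_fit <- lm(G1L1WR ~ NR * NS * PA * KINDERWR * WM, data = d)

# Backward elimination by AIC, starting from the maximal model
step_fit <- stepAIC(full_fit, trace = FALSE)
simple_fit <- lm(G1L1WR ~ PA, data = d)

formula(step_fit)            # terms that survived the search
anova(simple_fit, step_fit)  # F test between the two candidate models
AIC(simple_fit, step_fit)    # lower AIC is preferred
```

Passing k = log(n) to stepAIC penalises extra terms more heavily (a BIC-style criterion), which tends to favour the more parsimonious model.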

An auxiliary question, probably easier to answer, is how could I do 
hierarchical linear regression? The authors knew that PA would be the largest 
contributor to the response variable because of previous research, and their 
research question was whether PA would contribute anything AFTER the other 4 
variables had already eaten their piece of the response variable pie. I know 
how to do a hierarchical regression in SPSS, and want to show in parallel how 
to do this in R. I did search the R-help archives and didn’t find anything 
that would plainly tell me how to do hierarchical linear regression.
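
One common way to express this kind of blockwise (SPSS-style hierarchical) regression in R is to fit the models in sequence and compare them. The sketch below assumes that reading of "hierarchical" and uses simulated stand-in data with the variable names from this message.

```r
set.seed(42)
n <- 37
# Simulated stand-in data; only the column names are taken from the post
d <- data.frame(NR = rnorm(n), NS = rnorm(n), PA = rnorm(n),
                KINDERWR = rnorm(n), WM = rnorm(n))
d$G1L1WR <- 0.2 * d$NR + 0.5 * d$PA + rnorm(n, sd = 0.3)

# Block 1: the four variables entered first
block1 <- lm(G1L1WR ~ NR + NS + KINDERWR + WM, data = d)

# Block 2: add PA on top of block 1
block2 <- update(block1, . ~ . + PA)

# F test for what PA contributes after the other four variables
anova(block1, block2)

# Change in R-squared between the blocks
r2_change <- summary(block2)$r.squared - summary(block1)$r.squared
r2_change
```

Calling anova(block2) on its own gives sequential sums of squares in the order the terms appear in the formula, which is another way to see PA's contribution after the other four.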

Thanks in advance for any help.

Dr. Jenifer Larson-Hall
Assistant Professor of Linguistics
University of North Texas


[R] Counting observations split by a factor when there are NAs in the data

2006-07-10 Thread Jenifer Larson-Hall
I am a very novice R user, a social scientist (linguist) who is trying
to learn to use R after being very familiar with SPSS. Please be kind!

My concern:
I cannot figure out a way to get an accurate count of observations of
one column of data split by a factor when there are NAs in the data.

I know how to use commands like tapply and summaryBy to obtain other
summary statistics I am interested in, such as the following:
tapply(RLWTEST, list(STATUS), mean, na.rm=T)
summaryBy(RLWTEST~STATUS, data=lh.forgotten, FUN=c(mean, sd, min, max),

However, with tapply I know I cannot use length to get a count where
there are NAs. summaryBy appears to work the same way. I do know how to
get a count of the entire column using sum:

However, this does not give me a count split up by my factor (STATUS). I
have looked through Dalgaard (2002) and Verzani (2005), and have
searched the help files, but with no luck.
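
One way to get such a count is to pass tapply a function that counts the non-missing values in each group. The sketch below reuses the names lh.forgotten, RLWTEST and STATUS from this message, but the data values and the STATUS levels are made up for illustration.

```r
# Made-up stand-in for lh.forgotten; only the names come from the post
lh.forgotten <- data.frame(
  STATUS = factor(c("A", "A", "A", "B", "B", "C")),
  RLWTEST = c(10, NA, 12, 8, NA, NA)
)

# Count the non-NA observations of RLWTEST within each level of STATUS
counts <- tapply(lh.forgotten$RLWTEST, lh.forgotten$STATUS,
                 function(v) sum(!is.na(v)))
counts

# Equivalent: tabulate STATUS over the rows where RLWTEST is present
table(lh.forgotten$STATUS[!is.na(lh.forgotten$RLWTEST)])
```

Here length would report 3, 2 and 1 because it counts the NAs as well, whereas sum(!is.na(v)) counts only the observed values.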

Thank you in advance for your help. I love R and am interested in making
it more accessible to social scientist types like me. I know it can do
everything SPSS can and more, but sometimes the very simplest things
seem to be a lot harder in R.


Dr. Jenifer Larson-Hall
Assistant Professor of Linguistics
University of North Texas
