Roger Levy wrote:
Laura de Ruiter wrote:
Dear R-users and -experts,
I am performing a rather simple analysis on a small data set (pasted
below this email) and keep getting a result that I cannot explain.
Perhaps I am missing something here - it would be great if someone
could point out to me what I am doing wrong.
I want to test whether the factor "Info" (which has three levels:
"new", "given", "accessible") is a significant predictor for the
binary variable "DeaccYN". The random factor is "Subject". The
distribution of the data looks as follows:
-----------------------------------------------------------------------------
           Info
DeaccYN    given   new   accessible
no            25    42           21
yes           11     0            1
------------------------------------------------------------------------------
This is the model:
----------------------------------------------------------------------------------------------------------
deacc.lmer = lmer(DeaccYN ~ Info + (1|Subject), data = dat, family = "binomial")
-----------------------------------------------------------------------------------------------------------------
However, given the distribution above, this outcome seems rather weird
to me:
---------------------------------------------------------------------------------------------------------
summary (deacc.lmer)
Generalized linear mixed model fit using Laplace
Formula: DeaccYN ~ Info + (1 | Subject)
Data: dat
Family: binomial(logit link)
  AIC   BIC logLik deviance
 60.4 70.82  -26.2     52.4
Random effects:
 Groups  Name        Variance Std.Dev.
 Subject (Intercept) 0.18797  0.43356
number of obs: 100, groups: Subject, 21
Estimated scale (compare to 1 ) 0.7316067
Fixed effects:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)     -0.8635     0.3795 -2.2754   0.0229 *
Infonew        -18.7451  2764.2445 -0.0068   0.9946
Infoaccessible  -2.2496     1.1186 -2.0110   0.0443 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> [...]
----------------------------------------------------------------------------------------------------
Why should the difference between 25/11 and 21/1 be significant, but
the difference between 25/11 and 42/0 not? The standard error of 2764
seems very odd to me!
> [...]
I was wondering: Is it perhaps a problem for the model that there are
no cases in the DeaccYN == "yes" category for Info == "given"?
                                                       ^^^^^
I believe you mean "new" here.
And if this is the case, why?
Am I overlooking something here?
Dear Laura,
Independently of the issue that Florian is raising...you are right that
the lack of is a problem for the model (to be precise, it's a problem
for estimating the significance of the parameter estimate using the z
value).
Whoops, this should have read
"you are right that the lack of observations in the "yes/new"
category is a problem for the model..."
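To make the zero-cell problem concrete, here is a minimal sketch in R. It
rests on assumptions that are not part of the original analysis: it
rebuilds the data from the cell counts in the table above, drops the
Subject random effect, and uses a plain glm() instead of lmer(). The
object names (counts, full, null, wald, lrt) are made up for
illustration. The empirical log-odds of "yes" for "new" is
log(0/42) = -Inf, so the estimate for Infonew drifts toward minus
infinity and its standard error blows up, which makes the z value
uninformative. A likelihood-ratio comparison against an intercept-only
model is one commonly suggested alternative that does not depend on
that standard error.

```r
# Cell counts copied from the table in the original post.
counts <- data.frame(
  Info = factor(c("given", "new", "accessible"),
                levels = c("given", "new", "accessible")),
  yes  = c(11, 0, 1),
  no   = c(25, 42, 21)
)

# Empirical log-odds of "yes" per level of Info.
# The "new" cell is log(0 / 42) = -Inf: no finite estimate exists.
counts$logodds <- log(counts$yes / counts$no)

# Simplification (assumption): ordinary logistic regression, no random
# effect for Subject. Warnings about fitted probabilities of 0 or 1 are
# expected with an empty cell, so they are suppressed here.
full <- suppressWarnings(
  glm(cbind(yes, no) ~ Info, family = binomial, data = counts)
)
null <- glm(cbind(yes, no) ~ 1, family = binomial, data = counts)

# Wald table: the Infonew row carries a very large standard error.
wald <- coef(summary(full))
print(wald)

# Likelihood-ratio test of Info as a whole (2 degrees of freedom).
lrt <- anova(null, full, test = "Chisq")
print(lrt)
```

Comparing the Infonew row of wald with the test in lrt shows the
contrast between the two approaches. For the mixed model itself, the
analogous step would be to compare fits with and without Info, which
this sketch does not attempt.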
--
Roger Levy Email: rl...@ucsd.edu
Assistant Professor Phone: 858-534-7219
Department of Linguistics Fax: 858-534-4789
UC San Diego Web: http://ling.ucsd.edu/~rlevy
_______________________________________________
R-lang mailing list
R-lang@ling.ucsd.edu
http://pidgin.ucsd.edu/mailman/listinfo/r-lang