Re: [R] Using anova(f1, f2) to compare lmer models yields seemingly erroneous Chisq = 0, p = 1
Thank you all for your insight! I am glad to hear, at least, that I am doing something incorrectly (since the results do not make sense), and I am very grateful for your attempts to remedy my very limited (and admittedly self-taught) understanding of multilevel models and R.

As I mentioned in the problem statement, predictor.1 explains vastly more variance in outcome than predictor.2 (R2 = 15% vs. 5% in OLS regression, with very large N), and the multilevel model estimates are very similar to the OLS regression estimates. Therefore, I am quite confident that predictor.1 yields a much better model.

I understand that several of you are saying that anova() cannot be used to compare these two multilevel models. Is there *any* way to compare two predictors to see which better predicts the outcome in a multilevel model? f1's lower AIC and BIC, and its higher logLik, are concordant with the idea that predictor.1 is superior to predictor.2, as best as I understand it, but is there any way to test whether that difference is statistically significant? The only function I can find online for comparing models is anova(), but its output is nonsensical and, as you are all saying, it does not apply to my situation anyway.

Interestingly, anova() seems to work if I arbitrarily subset my observations, but when I use all the observations anova() generates Chisq = 0. This is probably a red herring, but I thought I would mention it in case it is not.

Also, I concede that I am confused about what you mean when you say that the two models (f1 and f2) are not nested, and that anova() therefore cannot be used. What would be an example of a nested model: comparing predictor.1 to a model with both predictor.1 and predictor.2? Surely there must also be a way to compare the predictive power of predictor.1 and predictor.2 to each other in a zero-order sense, but I am at a loss to identify it.

Alain Zuur wrote:

rapton wrote:

When I run two models, the output of each model is generated correctly as far as I can tell (e.g. summary(f1) and summary(f2) for the multilevel model output look perfectly reasonable), and in this case (see below) predictor.1 explains vastly more variance in outcome than predictor.2 (R2 = 15% vs. 5% in OLS regression, with very large N).

What I am utterly puzzled by is that when I run an anova comparing the two multilevel model fits, the Chisq comes back as 0, with p = 1. I am pretty sure that fit #1 (f1) is a much better predictor of the outcome than f2, which is reflected in the AIC, BIC, and logLik values. Why might anova be giving me this curious output? How can I fix it? I am sure I am making a dumb error somewhere, but I cannot figure out what it is. Any help or suggestions would be greatly appreciated!

-Matt

    f1 <- (lmer(outcome ~ predictor.1 + (1 | person), data=i))
    f2 <- (lmer(outcome ~ predictor.2 + (1 | person), data=i))
    anova(f1, f2)

    Data: i
    Models:
    f1: outcome ~ predictor.1 + (1 | person)
    f2: outcome ~ predictor.2 + (1 | person)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
    f1  6 45443 45489 -22715
    f2 25 47317 47511 -23633     0     19          1

Your models are not nested; it doesn't make sense to do this.

Alain

--
View this message in context: http://www.nabble.com/Using-anova%28f1%2C-f2%29-to-compare-lmer-models-yields-seemingly-erroneous-Chisq-%3D-0%2C-p-%3D-1-tp25297254p25338046.html
Sent from the R help mailing list archive at Nabble.com.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
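
On the question of what a nested comparison would look like, here is a minimal sketch. It assumes the lme4 package is installed and uses simulated data in place of the poster's data frame i; the names d and f12 and all the simulated effect sizes are made up for illustration. The idea is that a model containing both predictors includes each single-predictor model as a special case, so anova() can test those pairs, while f1 and f2 themselves can only be ranked descriptively by AIC or BIC.

```r
# Sketch only: simulated data stand in for the poster's data frame `i`.
library(lme4)

set.seed(1)
n_person <- 50   # hypothetical number of people
n_obs    <- 10   # hypothetical observations per person

d <- data.frame(person = factor(rep(seq_len(n_person), each = n_obs)))
d$predictor.1 <- rnorm(nrow(d))
d$predictor.2 <- rnorm(nrow(d))
person_effect <- rnorm(n_person)[as.integer(d$person)]
d$outcome <- 0.6 * d$predictor.1 + 0.2 * d$predictor.2 +
  person_effect + rnorm(nrow(d))

# Fit by maximum likelihood (REML = FALSE) so that likelihoods of models
# with different fixed effects are comparable.
f1  <- lmer(outcome ~ predictor.1 + (1 | person), data = d, REML = FALSE)
f2  <- lmer(outcome ~ predictor.2 + (1 | person), data = d, REML = FALSE)
f12 <- lmer(outcome ~ predictor.1 + predictor.2 + (1 | person),
            data = d, REML = FALSE)

# Nested comparisons: each single-predictor model is a special case of f12.
print(anova(f1, f12))   # does predictor.2 add anything beyond predictor.1?
print(anova(f2, f12))   # does predictor.1 add anything beyond predictor.2?

# Non-nested comparison of f1 and f2: descriptive ranking only, no p-value.
aic_table <- AIC(f1, f2)
print(aic_table)
```

The two anova() calls answer "does each predictor contribute once the other is in the model", which is not quite the zero-order question, but it is the comparison a likelihood-ratio test is designed for.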
[R] Using anova(f1, f2) to compare lmer models yields seemingly erroneous Chisq = 0, p = 1
Hello,

I am using R to analyze a large multilevel data set, using lmer() to model my data, and using anova() to compare the fit of various models. When I run two models, the output of each model is generated correctly as far as I can tell (e.g. summary(f1) and summary(f2) for the multilevel model output look perfectly reasonable), and in this case (see below) predictor.1 explains vastly more variance in outcome than predictor.2 (R2 = 15% vs. 5% in OLS regression, with very large N).

What I am utterly puzzled by is that when I run an anova comparing the two multilevel model fits, the Chisq comes back as 0, with p = 1. I am pretty sure that fit #1 (f1) is a much better predictor of the outcome than f2, which is reflected in the AIC, BIC, and logLik values. Why might anova be giving me this curious output? How can I fix it? I am sure I am making a dumb error somewhere, but I cannot figure out what it is. Any help or suggestions would be greatly appreciated!

-Matt

    f1 <- (lmer(outcome ~ predictor.1 + (1 | person), data=i))
    f2 <- (lmer(outcome ~ predictor.2 + (1 | person), data=i))
    anova(f1, f2)

    Data: i
    Models:
    f1: outcome ~ predictor.1 + (1 | person)
    f2: outcome ~ predictor.2 + (1 | person)
       Df   AIC   BIC logLik Chisq Chi Df Pr(>Chisq)
    f1  6 45443 45489 -22715
    f2 25 47317 47511 -23633     0     19          1
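
The numbers in the anova() output above can be checked by hand. The following is a small sketch in base R using only the values printed in the table. It assumes, as the reported Chisq = 0 suggests, that anova() treats f2 (25 parameters) as the larger model and truncates a negative likelihood-ratio statistic at zero; that truncation is an assumption about the installed lme4 version, not something the output itself confirms.

```r
# Values copied from the anova() output in the post.
logLik_f1 <- -22715; df_f1 <- 6
logLik_f2 <- -23633; df_f2 <- 25

# Likelihood-ratio statistic for the larger model (f2) against the smaller (f1).
# It is negative here because f2 fits worse despite having more parameters.
lr_raw <- 2 * (logLik_f2 - logLik_f1)

# Assumption: a negative statistic is reported as zero.
lr_reported <- max(0, lr_raw)
chi_df <- df_f2 - df_f1

# A chi-square statistic of 0 gives an upper-tail p-value of 1.
p_value <- pchisq(lr_reported, df = chi_df, lower.tail = FALSE)

# AIC recomputed from logLik and Df; agrees with the table to within rounding.
aic_f1 <- -2 * logLik_f1 + 2 * df_f1
aic_f2 <- -2 * logLik_f2 + 2 * df_f2

print(c(lr_raw = lr_raw, lr_reported = lr_reported,
        chi_df = chi_df, p_value = p_value))
print(c(aic_f1 = aic_f1, aic_f2 = aic_f2))
```

Read this way, Chisq = 0 with p = 1 is what a likelihood-ratio test produces when the model with more parameters has the lower log-likelihood, which can only happen when the smaller model is not a special case of the larger one.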
[R] Using loops to run functions over a list of variables
Hello,

I have a data set with many variables, and I often want to run a given function, like summary() or cor() or lmer(), on many combinations of one or more of these variables. For every combination of variables I want to analyze I have been writing out the code by hand, but given that I want to run many different functions over dozens and dozens of variable combinations, it is taking a lot of time and making for very inelegant code. There *has* to be a better way! I have tried looking through numerous message boards, but everything I've tried has failed.

It seems like loops would solve this problem nicely.

(1) Create a list of variables of interest
(2) Iterate through the list, running a given function on each variable

I have a data matrix which I have creatively called data. It has variables named focus and productive. If I run the function summary(), for instance, it works fine:

    summary(data$focus)
    summary(data$productive)

Both of these work. If I try to use a loop like:

    factors <- c("data$focus", "data$productive")
    for(i in 1:2){
      summary(get(factors[i]))
    }

It gives the following errors:

    Error in get(factors[i]) : variable "data$focus" was not found
    Error in summary(get(factors[i])) :
      error in evaluating the argument 'object' in selecting a method for function 'summary'

But data$focus *does* exist! I can run summary(data$focus) and it works perfectly. What am I doing wrong? Even if I get this working, is there a better way to do this, especially if I have dozens of variables to analyze? Any ideas would be greatly appreciated!
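
One possible way to do what the post describes is sketched below. The toy data frame and its values are invented for illustration; only the names data, focus and productive come from the post. The sketch relies on the fact that get() looks up an object by name and does not evaluate an expression such as data$focus, whereas data[[name]] extracts a column by its name as a string.

```r
# Toy stand-in for the poster's data; the values are made up.
data <- data.frame(focus = c(1, 2, 3, 4), productive = c(2, 4, 5, 9))

# Step (1): store the column names as strings, not as data$... expressions.
vars <- c("focus", "productive")

# Step (2): data[[v]] extracts the column whose name is stored in v.
for (v in vars) {
  # Results are not auto-printed inside a for loop, so print() is needed.
  print(summary(data[[v]]))
}

# lapply() does the same thing but collects the results in a named list.
summaries <- lapply(data[vars], summary)

# For functions of two variables, combn() enumerates every pair of names.
pairs <- combn(vars, 2, simplify = FALSE)
cors <- sapply(pairs, function(p) cor(data[[p[1]]], data[[p[2]]]))
names(cors) <- sapply(pairs, paste, collapse = ":")
print(cors)
```

With dozens of variables, only the vars vector needs to change; names(data) or a grep() over it would be one way to build that vector without typing each name.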