[R] R^2 for multilevel models
Hi there, In multiple regression one way to view R^2 is as (the square of) the correlation between original y's and the estimated y's. Suppose you fit a multilevel model with random intercept for each cluster. Would it be valid to compute an R^2 by using fixed effects plus the group intercepts to reduce the residuals? I suspect this has been done and, given its absence from the lmer output, refuted somewhere. A reference would be lovely. Code below. Many thanks, Andy # require(lme4) # First generate some data people = 100 obs = 4 # The random intercepts: sub_noise = data.frame(id = 1:people, rand_int = rnorm(people,0,60)) # Merge everything id = rep(1:people,obs) thedata = data.frame(id) thedata = merge(sub_noise, thedata) thedata$x = rep(1:obs,people) # Setup the relationship between x and y thedata$y = 23*thedata$x + 20 + thedata$rand_int + rnorm(people*obs, 0, 20) # plus residuals # Have a look plot(y~x, data = thedata) # Now fit a standard regression model lm1 = lm(y ~ x, data = thedata) summary(lm1) # Use the model to get the R^2 predicted = coef(lm1)[1] + thedata$x * coef(lm1)[2] plot(thedata$y, predicted) # It's a noisy mess cor(thedata$y, predicted)^2 # Now how about adjusting everyone's intercept towards the group # intercept? lmer1 = lmer(y ~ x + (1|id), data=thedata) summary(lmer1) # Get the random intercepts and stick them in a table ran_effects = data.frame(rownames(ranef(lmer1)$id), ranef(lmer1)$id[1]) names(ran_effects) = c(id, b) ran_effects_data = merge(thedata, ran_effects) # Now compute predicted.ml = fixef(lmer1)[1] + ran_effects_data$x * fixef(lmer1)[2] + ran_effects_data$b plot(thedata$y, predicted.ml) cor(thedata$y, predicted.ml)^2 # Looks much nicer __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] New mailing list: R for psychology research
Hello all, There's a new mailing list for researchers in psychology who are learning and using R. New users of R are especially welcome. To join, venture to http://www.jiscmail.ac.uk/archives/psych-r.html and click Join or leave the list (or change settings) You can also peruse the archives from that page. Best wishes, Andy -- Andy Fugard, Postgraduate Research Student Psychology (Room F15), The University of Edinburgh, 7 George Square, Edinburgh EH8 9JZ, UK Mobile: +44 (0)78 123 87190 http://www.possibly.me.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Estimating parameters of 2 phase Coxian using optim
Hi There, Perhaps the problem is the line loglik-log(p %*% expm(Q * y(i)) %*% q) You mention that y is a vector but here you're treating it as a function. Maybe try loglik-log(p %*% expm(Q * y[i]) %*% q) ? Don't have a clue about the correctness of the contents of cox2.lik... Andy On 6 Mar 2007, at 08:54, Laura Hill wrote: Hi, My name is Laura. I'm a PhD student at Queen's University Belfast and have just started learning R. I was wondering if somebody could help me to see where I am going wrong in my code for estimating the parameters [mu1, mu2, lambda1] of a 2-phase Coxian Distribution. cox2.lik-function(theta, y){ mu1-theta[1] mu2-theta[2] lambda1-theta[3] p-Matrix(c(1, 0), nrow=1, ncol=2) Q-Matrix(c(-(lambda1 + mu1), 0, lambda1, -mu2), nrow=2, ncol=2) q-Matrix(c(mu1, mu2), nrow=2, ncol=1) for (i in 1:length(y)){ loglik-log(p %*% expm(Q * y(i)) %*% q) return(loglik)} sumloglik-sum(loglik) return(-sumloglik) } I have installed the Matrix package. y is a vector of 240 survival times. In the R console I typed source(/private/var/automount/users/lhill07/Desktop/cox2.lik.R) optim(c(0.5, 0.5, 0.5), cox2.lik, y=y, method=BFGS) Error: could not find function y Error in expm(Q * y(i)) : error in evaluating the argument 'x' in selecting a method for function 'expm' Error in log(p %*% expm(Q * y(i)) %*% q) : error in evaluating the argument 'x' in selecting a method for function 'log' I'm sorry if I have missed something really obvious and I would appreciate any help that is offered. Hopefully I have given enough information. Many thanks in advance, Laura __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Andy Fugard, Postgraduate Research Student Psychology (Room F15), The University of Edinburgh, 7 George Square, Edinburgh EH8 9JZ, UK Mobile: +44 (0)78 123 87190 http://www.possibly.me.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Random Integers
On 25 Feb 2007, at 06:51, Anup Nandialath wrote: Is there an R function to generate random integers? Thanks in advance. The package Random does what you want, but requires a net connection. http://cran.r-project.org/src/contrib/Descriptions/random.html This package provides an interface to the true random number service provided by the random.org website created by Mads Haahr. The random.org web service samples atmospheric noise via radio tuned to an unused broadcasting frequency together with a skew correction algorithm due to John von Neumann. install.packages(random) library(random) ?random randomNumbers(10,1,6,1) V1 1 5 2 3 3 6 4 3 5 3 6 1 7 6 8 4 9 5 10 3 Andy -- Andy Fugard, Postgraduate Research Student Psychology (Room F15), The University of Edinburgh, 7 George Square, Edinburgh EH8 9JZ, UK Mobile: +44 (0)78 123 87190 http://www.possibly.me.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Combining Dataframes
Would need more info. Merge could still do the job; you might just have to call it approx a dozen - 1 times! Andy On 25 Feb 2007, at 16:27, Bert Jacobs wrote: Hi, What is the best way to combine several dataframes (approx a dozen, all having one column) into one? All dataframes have a different rowlength, and do not contain numbers. As this new dataframe should have the length of the dataframe with the most rows, the difference in rows with the other dataframes can be filled with the value NA. I've tried merge (only possible with 2 df) and cbind (gives error) Thx for helping me out. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Alberto Vieira Ferreira Monteiro Sent: 25 February 2007 16:52 To: r-help@stat.math.ethz.ch Subject: Re: [R] Random Integers Charles Annis, P.E. wrote: rpois(n, lambda) ... will do it. But you should tell us something about how you want your numbers to be distributed, since rpois() produces integers having a Poisson distribution. nitpick rpois does not generate random _integers_, it generates random _natural numbers_. /nitpick The question should be more descriptive. Random is half of the things we need to know, the other half is how deterministic you want your integers. For example, if you want to generate random integers in such a way that all integers have the same probability, then this can't be done. OTOH, if you want to simulate random integers that distribute like integers appear in Nature, then it's still not precise, but there are serious attempts to reproduce this behaviour. Check in the wikipedia (www.wikipedia.org) those distributions: Zip's law, Zeta distribution, Benford's law, Zipf-Mandelbrot law. The problem is that all of them generate positive random integers, but it's not difficult to extrapolate them to integers. Alberto Monteiro __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. -- Andy Fugard, Postgraduate Research Student Psychology (Room F15), The University of Edinburgh, 7 George Square, Edinburgh EH8 9JZ, UK Mobile: +44 (0)78 123 87190 http://www.possibly.me.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Repeated measures logistic regression
Dear all, I'm struggling to find the best (set of?) function(s) to do repeated measures logistic regression on some data from a psychology experiment. An artificial version of the data I've got is as follows. Firstly, each participant filled in a questionnaire, the result of which is a score. questionnaire ID Score 1 1 6 2 2 5 3 3 6 4 4 2 ... Secondly, each participant did a task which required a series of button-pushes. The response is binary. The factors CondA and CondB describe the structure of the stimulus: experiment ID CondA CondB Response 11a1b11 21a2b20 31a3b10 41a4b20 51a1b11 61a2b20 71a3b10 81a4b20 92a1b11 10 2a2b20 11 2a3b10 12 2a4b20 13 2a1b11 14 2a2b20 15 2a3b10 16 2a4b20 I would like to model how someone's score on the questionnaire relates to the responses they give in the button-pushing. I'm particularly interested in interactions between the type of the stimulus and the score. I combined the experiment and the questionnaire dataframe with a merge so now there an additional column. exp.q ID Score CondA CondB Response 11 6a1b11 21 6a2b20 31 6a3b10 41 6a4b20 51 6a1b11 61 6a2b20 71 6a3b10 81 6a4b20 92 5a1b11 10 2 5a2b20 11 2 5a3b10 12 2 5a4b20 ... Eventually, via glm, glmmPQL, and a few others, I ended up with lmer. My questions follow. I suspect (or hope) that I need to be pointed towards the relevant literature. I own Faraway's Extending the Linear Model with R and Crawley's Statistics: An Introduction using R. 1. Is the way I've combined the tables okay? I'm concerned that the repetition of the score is Bad but can't think of any other way to code things. 2. Is lmer the most appropriate function to use? 3. If so, does the following call capture what I'm trying to model? model1 = lmer(Response ~ CondA * CondB * Score + (1|Subject), data =exp.q, family = binomial) I just want to tell lmer, Look, this set of responses all comes from the same person: tell me the within-subject stuff that's going on and how that's affected by their score! 4. Is there any way to do stepwise model simplification? In the real data I have, there are several more predictors, including more than one questionnaire score and subscores. I have specific hypotheses about what could be going on, so I can live with manual editing of the formulae, but it's nice for exploratory purposes to do stepwise simplification. 5. What's the best way to discover and report the relative contribution of each predictor? I'm after an analogue of standardized betas (though I recently learned that they're thoroughly evil). 6. Is there anyway to get a p-value for goodness of fit? Many thanks for any help, Andy -- Andy Fugard, Postgraduate Research Student Psychology (Room F15), The University of Edinburgh, 7 George Square, Edinburgh EH8 9JZ, UK Mobile: +44 (0)78 123 87190 http://www.possibly.me.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.