[R] R^2 for multilevel models

2007-08-13 Thread Andy Fugard
Hi there,

In multiple regression one way to view R^2 is as (the square of) the 
correlation between original y's and the estimated y's.

Suppose you fit a multilevel model with random intercept for each 
cluster.  Would it be valid to compute an R^2 by using fixed effects 
plus the group intercepts to reduce the residuals?

I suspect this has been done and, given its absence from the lmer 
output, refuted somewhere.  A reference would be lovely.

Code below.

Many thanks,

Andy

#

require(lme4)

# First generate some data

people = 100
obs = 4

# The random intercepts:

sub_noise = data.frame(id = 1:people, rand_int = rnorm(people,0,60))

# Merge everything

id = rep(1:people,obs)
thedata = data.frame(id)
thedata = merge(sub_noise, thedata)
thedata$x = rep(1:obs,people)

# Setup the relationship between x and y

thedata$y = 23*thedata$x + 20 + thedata$rand_int
   + rnorm(people*obs, 0, 20) # plus residuals

# Have a look

plot(y~x, data = thedata)

# Now fit a standard regression model

lm1 = lm(y ~ x, data = thedata)
summary(lm1)

# Use the model to get the R^2
predicted = coef(lm1)[1] + thedata$x * coef(lm1)[2]
plot(thedata$y, predicted)

# It's a noisy mess

cor(thedata$y, predicted)^2

# Now how about adjusting everyone's intercept towards the group
# intercept?

lmer1 = lmer(y ~ x + (1|id), data=thedata)
summary(lmer1)

# Get the random intercepts and stick them in a table

ran_effects = data.frame(rownames(ranef(lmer1)$id), ranef(lmer1)$id[1])
names(ran_effects) = c(id, b)
ran_effects_data = merge(thedata, ran_effects)

# Now compute

predicted.ml = fixef(lmer1)[1] + ran_effects_data$x * fixef(lmer1)[2]
   + ran_effects_data$b
plot(thedata$y, predicted.ml)
cor(thedata$y, predicted.ml)^2

# Looks much nicer

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] New mailing list: R for psychology research

2007-05-14 Thread Andy Fugard
Hello all,

There's a new mailing list for researchers in psychology who are  
learning and using R.  New users of R are especially welcome.

To join, venture to

   http://www.jiscmail.ac.uk/archives/psych-r.html

and click

   Join or leave the list (or change settings)

You can also peruse the archives from that page.

Best wishes,

Andy

-- 
Andy Fugard, Postgraduate Research Student
Psychology (Room F15), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Estimating parameters of 2 phase Coxian using optim

2007-03-06 Thread Andy Fugard
Hi There,

Perhaps the problem is the line

 loglik-log(p %*% expm(Q * y(i)) %*% q)

You mention that y is a vector but here you're treating it as a  
function.  Maybe try

 loglik-log(p %*% expm(Q * y[i]) %*% q)

?

Don't have a clue about the correctness of the contents of cox2.lik...

Andy


On 6 Mar 2007, at 08:54, Laura Hill wrote:

 Hi,

 My name is Laura. I'm a PhD student at Queen's University Belfast  
 and have
 just started learning R. I was wondering if somebody could help me  
 to see
 where I am going wrong in my code for estimating the parameters  
 [mu1, mu2,
 lambda1] of a 2-phase Coxian Distribution.

 cox2.lik-function(theta, y){
 mu1-theta[1]

 mu2-theta[2]

 lambda1-theta[3]

 p-Matrix(c(1, 0), nrow=1, ncol=2)

 Q-Matrix(c(-(lambda1 + mu1), 0, lambda1, -mu2), nrow=2, ncol=2)

 q-Matrix(c(mu1, mu2), nrow=2, ncol=1)

 for (i in 1:length(y)){
 loglik-log(p %*% expm(Q * y(i)) %*% q)
 return(loglik)}

 sumloglik-sum(loglik)

 return(-sumloglik)
 }

 I have installed the Matrix package. y is a vector of 240 survival  
 times. In
 the R console I typed

 source(/private/var/automount/users/lhill07/Desktop/cox2.lik.R)

 optim(c(0.5, 0.5, 0.5), cox2.lik, y=y, method=BFGS)

 Error: could not find function y

 Error in expm(Q * y(i)) : error in evaluating the argument 'x' in  
 selecting
 a method for function 'expm'

 Error in log(p %*% expm(Q * y(i)) %*% q) :
 error in evaluating the argument 'x' in selecting a method for  
 function
 'log'


 I'm sorry if I have missed something really obvious and I would  
 appreciate
 any help that is offered. Hopefully I have given enough information.

 Many thanks in advance,

 Laura

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


--
Andy Fugard, Postgraduate Research Student
Psychology (Room F15), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Integers

2007-02-25 Thread Andy Fugard


On 25 Feb 2007, at 06:51, Anup Nandialath wrote:


 Is there an R function to generate random integers?  Thanks in  
 advance.

The package Random does what you want, but requires a net connection.

http://cran.r-project.org/src/contrib/Descriptions/random.html

This package provides an interface to the true random number service  
provided by the random.org website created by Mads Haahr. The  
random.org web service samples atmospheric noise via radio tuned to  
an unused broadcasting frequency together with a skew correction  
algorithm due to John von Neumann.

install.packages(random)
library(random)
?random

  randomNumbers(10,1,6,1)
V1
1   5
2   3
3   6
4   3
5   3
6   1
7   6
8   4
9   5
10  3

Andy

--
Andy Fugard, Postgraduate Research Student
Psychology (Room F15), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining Dataframes

2007-02-25 Thread Andy Fugard
Would need more info.  Merge could still do the job; you might just  
have to call it approx a dozen - 1 times!

Andy


On 25 Feb 2007, at 16:27, Bert Jacobs wrote:

 Hi,

 What is the best way to combine several dataframes (approx a dozen,  
 all
 having one column) into one? All dataframes have a different  
 rowlength, and
 do not contain numbers.
 As this new dataframe should have the length of the dataframe with  
 the most
 rows, the difference in rows with the other dataframes can be  
 filled with
 the value NA.

 I've tried merge (only possible with 2 df) and cbind (gives error)

 Thx for helping me out.


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Alberto Vieira
 Ferreira Monteiro
 Sent: 25 February 2007 16:52
 To: r-help@stat.math.ethz.ch
 Subject: Re: [R] Random Integers

 Charles Annis, P.E. wrote:

 rpois(n, lambda)

 ... will do it.  But you should tell us something about how you  
 want your
 numbers to be distributed, since rpois() produces integers having a
 Poisson
 distribution.

 nitpick
 rpois does not generate random _integers_, it generates random
 _natural numbers_.
 /nitpick

 The question should be more descriptive. Random is half of the  
 things
 we need to know, the other half is how deterministic you want your  
 integers.

 For example, if you want to generate random integers in such a way  
 that
 all integers have the same probability, then this can't be done.  
 OTOH, if
 you want to simulate random integers that distribute like integers  
 appear
 in Nature, then it's still not precise, but there are serious  
 attempts
 to reproduce this behaviour. Check in the wikipedia  
 (www.wikipedia.org)
 those distributions: Zip's law, Zeta distribution, Benford's law,
 Zipf-Mandelbrot law. The problem is that all of them generate positive
 random integers, but it's not difficult to extrapolate them to  
 integers.

 Alberto Monteiro

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


--
Andy Fugard, Postgraduate Research Student
Psychology (Room F15), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Repeated measures logistic regression

2007-02-25 Thread Andy Fugard
Dear all,

I'm struggling to find the best (set of?) function(s) to do repeated  
measures logistic regression on some data from a psychology experiment.

An artificial version of the data I've got is as follows.  Firstly,  
each participant filled in a questionnaire, the result of which is a  
score.

  questionnaire
ID Score
1   1 6
2   2 5
3   3 6
4   4 2
...

Secondly, each participant did a task which required a series of  
button-pushes.  The response is binary.  The factors CondA and CondB  
describe the structure of the stimulus:

  experiment
 ID CondA CondB Response
11a1b11
21a2b20
31a3b10
41a4b20
51a1b11
61a2b20
71a3b10
81a4b20
92a1b11
10   2a2b20
11   2a3b10
12   2a4b20
13   2a1b11
14   2a2b20
15   2a3b10
16   2a4b20

I would like to model how someone's score on the questionnaire  
relates to the responses they give in the button-pushing.  I'm  
particularly interested in interactions between the type of the  
stimulus and the score.

I combined the experiment and the questionnaire dataframe with a  
merge so now there an additional column.

  exp.q
 ID Score CondA CondB Response
11 6a1b11
21 6a2b20
31 6a3b10
41 6a4b20
51 6a1b11
61 6a2b20
71 6a3b10
81 6a4b20
92 5a1b11
10   2 5a2b20
11   2 5a3b10
12   2 5a4b20
...

Eventually, via glm, glmmPQL, and a few others, I ended up with  
lmer.  My questions follow.  I suspect (or hope) that I need to be  
pointed towards the relevant literature.  I own Faraway's Extending  
the Linear Model with R and Crawley's Statistics: An Introduction  
using R.

1. Is the way I've combined the tables okay?  I'm concerned that the  
repetition of the score is Bad but can't think of any other way to  
code things.

2. Is lmer the most appropriate function to use?

3. If so, does the following call capture what I'm trying to model?

model1 = lmer(Response ~ CondA * CondB * Score + (1|Subject),
   data =exp.q,
   family = binomial)

I just want to tell lmer, Look, this set of responses all comes from  
the same person: tell me the within-subject stuff that's going on and  
how that's affected by their score!

4. Is there any way to do stepwise model simplification?  In the real  
data I have, there are several more predictors, including more than  
one questionnaire score and subscores.  I have specific hypotheses  
about what could be going on, so I can live with manual editing of  
the formulae, but it's nice for exploratory purposes to do stepwise  
simplification.

5. What's the best way to discover and report the relative  
contribution of each predictor?  I'm after an analogue of  
standardized betas (though I recently learned that they're thoroughly  
evil).

6. Is there anyway to get a p-value for goodness of fit?

Many thanks for any help,

Andy

--
Andy Fugard, Postgraduate Research Student
Psychology (Room F15), The University of Edinburgh,
   7 George Square, Edinburgh EH8 9JZ, UK
Mobile: +44 (0)78 123 87190   http://www.possibly.me.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.