Re: [R] Functions ,Optim, Dataframe

2006-07-31 Thread Dieter Menne
Michael Papenfus mmpapenf at wisc.edu writes:

 
 I have defined the following function:
 
 fr-function(x) {
 u-x[1]
 v-x[2]
 sqrt(sum((plnorm(c(3,6),u,v)-c(.55,.85))^2))
 }
 which I then solve using optim
 y-optim(c(1,1),fr)
 
   y$par
 [1] 1.0029771 0.7610545
 This works fine.
 
 Now I want to use these two steps on a dataframe:
  mydat-data.frame(d1=c(3,5),d2=c(6,10),p1=c(.55,.05),p2=c(.85,.35))
   mydat
   d1 d2   p1   p2
 1  3  6 0.55 0.85
 2  5 10 0.05 0.35
 
 where for each row in mydat, I append the two parameter resulting from 
 optim into mydat.
 I want to do this for a larger dataset but thought I would start with a 
 simple two row dataframe.
 

I would prefer a loop in this case.

fr-function(x) {
sqrt(sum((plnorm(c(3,6),x[1],x[2])-c(x[3],x[4]))^2))
}
y-optim(c(1,2,0.55,0.85),fr)

mydat-data.frame(d1=c(1,0.5),d2=c(1,0.1),p1=c(.55,.05),p2=c(.85,.35))
myres-mydat   # simple way to allocate dataframe for results
names(myres) = paste(res,names(myres),sep=.)

for (i in 1:nrow(mydat)){
  y - optim(mydat[i,1:4],fr)
  myres[i,] - y$par
}
mydat = cbind(mydat,myres)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random structure of nested design in lme

2006-07-31 Thread ESCHEN Rene
Spencer,

Thank you for the kind and elaborate reply to my previous post.

I did consider the option you suggested and many variations. Depending on the 
order of the random factors, lme will either give the same output as the aov 
model for soiltype or for habitat, but not both in the same model. 

The closest I came was 

  anova(lme(NA.1~soiltype*habitat,random=~1|destination/soiltype))

However, it apppears that in this case the interaction is tested at the same 
level as soiltype.

In this post, a small sample dataset with a brief explanation of the meaning of 
the different column titles is included below. Also, I included both the aov 
model and the lme model.

Hopefully, this will help to get closer to a solution to my problem.

Best regards,

René Eschen.

___

#Small sample dataset
#
data=read.table(Sample dataset.csv,header=T) 
require(nlme)
soiltype=factor(soiltype)
habitat=factor(habitat)
destination=factor(destination)
origin=factor(origin)
summary(aov(response~soiltype*habitat+Error(destination+origin)))
anova(lme(response~soiltype*habitat,random=~1|destination/origin))
#
#habitat type is either 'arable' or 'grassland'
#destination indicates what site the soil was transplanted into, and is 
considered a random factor within habitat type
#soiltype is either 'arable' or 'grassland'
#origin indicates what site the soil was taken from, and is considered a 
random factor within soil type
#response is the response variable, typically some plant parameter such as 
growth rate or number of leaves, but in this example it is a random number 
between 0 and 1.
#
habitat   destination   soiltype  originresponse
1   1   1   1   0.63
1   2   1   1   0.76
1   3   1   1   0.14
2   4   1   1   0.27
2   5   1   1   0.88
2   6   1   1   0.41
1   1   1   2   0.47
1   2   1   2   0.48
1   3   1   2   0.76
2   4   1   2   0.83
2   5   1   2   0.88
2   6   1   2   0.57
1   1   1   3   0.80
1   2   1   3   0.31
1   3   1   3   0.22
2   4   1   3   0.53
2   5   1   3   0.97
2   6   1   3   0.30
1   1   2   4   0.46
1   2   2   4   0.99
1   3   2   4   0.56
2   4   2   4   0.32
2   5   2   4   0.46
2   6   2   4   0.64
1   1   2   5   0.03
1   2   2   5   0.41
1   3   2   5   0.24
2   4   2   5   0.60
2   5   2   5   0.04
2   6   2   5   0.30
1   1   2   6   0.97
1   2   2   6   0.60
1   3   2   6   0.22
2   4   2   6   0.16
2   5   2   6   0.58
2   6   2   6   0.21



-Original Message-
From: Spencer Graves [mailto:[EMAIL PROTECTED]
Sent: Sat 2006-07-22 20:03
To: ESCHEN Rene
Cc: Doran, Harold; r-help@stat.math.ethz.ch
Subject: Re: [R] Random structure of nested design in lme
 
  Have you considered the following:

  anova(lme(NA.1~soiltype*habitat,random=~1|destination/origin))

  This seems more closely to match the 'aov' command in your original 
post.  This model might be written in more detail as follows:

  NA.1[s, h, i,j,k] = b0 + ST[s] + H[h] +
ST.H[s[i],j[j] j] + d[i] + o[i,j] + e[i,j,k]

where b0 = a constant to be estimated,

  s = the soil type for that particular sample,

  h = the habitat for that sample,

  ST = soil type coefficients to be estimated subject to a constraint 
that they sum to 0,

  H = habitat coefficients to be estimated subject to the constraint 
that they sum to 0,

  ST.H = soil type by habitat interaction coefficients to be estimated 
subject to constraints that ST.H[s,.] sum to 0 and ST.H[., h] also sum 
to 0,

  d[i] = a random deviation associated with each destination, assuming 
the d's are all normal, independent, with mean 0 and unknown but 
constant variance s2.d

  o[i, j] = a random deviation associated with each destination / 
origin combination, assuming the o's are all normal, independent, with 
mean 0 and unknown variance s2.o,

and   e[i,j,j] = the standard unknown noise term, normal, independent 
with mean 0 and unknown variance s2.e.

  The model you wrote includes nested noise terms for soil type and 
habitat as well.  These terms are not estimable, which makes the answers 
garbage, but the 'lme' function does not check for replicates and 
therefore sometimes gives garbage answers without warning.

  To get more information from the fit, I suggest you first try 
'methods(class=lme)', and review help pages associated with what you 
see listed 

Re: [R] Reading multiple txt files into one data frame

2006-07-31 Thread Paul Lemmens
On 7/30/06, Kartik Pappu [EMAIL PROTECTED] wrote:
 Hello All,

 I have a device that spews out experimental data as a series of text
 files each of which contains one column with several rows of numeric
 data.  My problem is that for each trial it gives me one text file
 (and I run between 30 to 50 trials at a time) and I would ideally like
 to merge all these text files into one large data frame with each
 column representing a single trial.  It is not a problem if NA
 characters are added to make all the columna of eaqual length.  Right
 now I am doing this by opening each file individually and cutting and
 pasting the data into an excel file.  How can I do this in R assuming
 all my text files are in one directory.

 Is it also possible to customize the column headers.  For example if I
 have 32 trials and 16 are experimental and 16 are control and I want
 to name the columns Expt1, Expt2,... Expt16  and the control
 columns Cntl1,...Cntl16.

 Kartik

setwd(E:/Cooperation @ Delft-Nijmegen (Feb. 2006 - Sep.
2006)/Research/Study 20 - Roughness/Experiment 20a - Roughness Index
for CUReT textures/Statistics)

#  Concatenate the raw data files.
data.path = ../data files/
(datafiles - list.files(path=data.path, pattern=subject\_[0-9]+\.txt$))
exp20a - do.call('rbind',
  lapply(datafiles,
function(x) read.table(paste(data.path, x, sep=
rm(datafiles, data.path)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] standardized random effects with ranef.lme()

2006-07-31 Thread Doran, Harold
OK, I see how the standardized random effects are calculated. Here is
what I now see

library(nlme)

fm2 - lme(distance~age, Orthodont)

# unstandardized
age_ranef - ranef(fm2)[,2]

#standardized
age_Sranef - ranef(fm2, standard=TRUE)[,2]

# We can use these to solve for the standard error, because the formula
according to help for ranef.lme is

# Standardized_randomEffects = random_effects/standard error

age_ranef/age_Sranef

# OK, now note the values are exactly the same. Now, look at

VarCorr(fm2)

You can see the value used to standardize is the standard deviation of
the random effect for age.

Now, the help function does say divided by the corresponding standard
error. 

I've copied Doug Bates because the values in the stdDev column are the
standard deviations of the variance components and not standard errors
of those variance components. So, I'm not sure why the help says that
the standardized random effects are divided by the corresponding SE.
Maybe he can clarify if he has time.

I hope that helps
Harold


 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Doran, Harold
 Sent: Sunday, July 30, 2006 3:40 PM
 To: Spencer Graves; Dirk Enzmann
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] standardized random effects with ranef.lme()
 
  
   Why do the results differ although the estimates (random
  effects and
   thus their variances) are almost identical? I noticed that
  lme() does
   not compute the standard errors of the variances of the
  random effects
   - for several reasons, but if this is true, how does
  ranef() calculate
   the standardized random effects (the help says: 
  'standardized (i.e.
   divided by the corresponding estimated standard error)').
   
   Is there a way to obtain similar results as in MLWin (or: 
 should I 
   prefer the results of ranef() for certain reasons)?
 
 I think there are two different issues here. The lme function 
 does not produce a standard error of the variance component 
 as some other multilevel packages do. It is often recommended 
 in the multilevel literature to consider the p-value of the 
 variance components and fix
 or retain the variance if p  .05. There are good reasons not 
 to follow this practice.
 
 If you were using lmer(), you still wouldn't get this 
 statistic, but you could use the MCMCsamp() function to 
 examine the distribution of the random effects.
 
 The second issue (I think) is that the conditional variance 
 of the random effect is not the same as the standard error of 
 the variance of the random effect. From the definition in the 
 help of ranef.lme, I believe it is the random effect divided 
 by its conditional standard error. I don't know how to get 
 the posterior variance of the random effects in lme, but I do 
 in lmer, so we can experiment a bit. It has been a while 
 since I have really used lme and I do not think there is an 
 extractor function to get these and I didn't see the 
 variances in the model object.
 
 Let's work through an example to see what we get. Here is what I see
 
 library(nlme)
 data(Orthodont)
 detach(package:nlme)
 
 library(Matrix)
 fm1 - lmer(distance ~ age + (age|Subject), data = Orthodont) 
 # equivalent to # fm1 - lme(distance ~ age, data=Orthodont)
 
 # Extract the variances of the random effects qq - 
 attr(ranef(fm1, postVar = TRUE)[[1]], postVar)
 
 # divide the random effects by their standard error of age
 Sranef_lmer - ranef(fm1)[[1]][,2]/ sqrt(qq[2,2,])
 
 library(nlme)
 # Now, run the lme model
 fm2 - lme(distance ~ age, Orthodont)
 
 # get the standardized random effects from lme Sranef_lme - 
 ranef(fm2, standard=T)[2]
 
 cor(Sranef_lmer, Sranef_lme)
 [1] 1
 
 Notice the perfect correlation. But, the actual values in 
 Sranef_lme and Sranef_lmer are a bit different and I cannot 
 see why just yet. I need to go eat lunch, but I'll think about this. 
 
 Maybe somebody else sees something.
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] re 11. uniroot and function opposite signs warning

2006-07-31 Thread tfjbl
Nurza,
Try running a while loop steping out until you have a start and finish 
thats the function is opposite in sign. You need a start and finish
where F is + and - on either side of the loop. Graphing F might help.


step-10
checkme-F(start)*F(finish+step)

while(checkme0){
initialstep-initialstep*2
checkme-F(start)*F(finish+step)
}
answer-uniroot(F,low=start,up=finish+step,maxiter=100)$root



Enjoy
Joe Liddle
[EMAIL PROTECTED]
---BeginMessage---
Send R-help mailing list submissions to
r-help@stat.math.ethz.ch

To subscribe or unsubscribe via the World Wide Web, visit
https://stat.ethz.ch/mailman/listinfo/r-help
or, via email, send a message with subject or body 'help' to
[EMAIL PROTECTED]

You can reach the person managing the list at
[EMAIL PROTECTED]

When replying, please edit your Subject line so it is more specific
than Re: Contents of R-help digest...


Today's Topics:

   1. Re: Extracting from a matrix w/o for-loop (Adrian DUSA)
   2. Re: fancier plotting (jim holtman)
   3. Re: (no subject) (jim holtman)
   4. Re: memory problems when combining randomForests
  (Eleni Rapsomaniki)
   5. Re: maximum likelihood (Ben Bolker)
   6. Re: (no subject) (Douglas Bates)
   7. Re: negative binomial lmer (Douglas Bates)
   8. Re: random effects with lmer() and lme(), three random
  factors (Douglas Bates)
   9. Colours in Lattice (Lorenzo Isella)
  10. placing rectangle behind plot (Gabor Grothendieck)
  11. uniroot (nurza m)
  12. Reading multiple txt files into one data frame (Kartik Pappu)
  13. Re: Reading multiple txt files into one data frame (jim holtman)
  14. Log color scale (Kartik Pappu)
  15. Re: placing rectangle behind plot (Sebastian P. Luque)
  16. Re: placing rectangle behind plot (Gabor Grothendieck)
  17. Re: nested repeated measures in R (Spencer Graves)
  18. Question about data used to fit the mixed model
  (Nantachai Kantanantha)
  19. Re: Log color scale ([EMAIL PROTECTED])


--

Message: 1
Date: Sat, 29 Jul 2006 13:50:01 +0300
From: Adrian DUSA [EMAIL PROTECTED]
Subject: Re: [R] Extracting from a matrix w/o for-loop
To: Horace Tso [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch, Carlo Giovanni Camarda
[EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Content-Type: text/plain;  charset=iso-8859-1

Hi,

On Friday 28 July 2006 20:21, Horace Tso wrote:
 Unless there is another level of complexity that i didn't see here,
 wouldn't it be a simply application of sapply as follow,

 sapply( 1:dim(M2)[[1]], function(x) M1[M2[x,1], M2[x,2]] )

Andy's previous answer involving matrix indexing (M1[M2]) is simpler but just 
for the sake of it, since we're dealing with matrices it is not a case of 
sapply but of _apply_:

apply(M2, 1, function(x) M1[x[1], x[2]])

My 2c,
Adrian

-- 
Adrian DUSA
Arhiva Romana de Date Sociale
Bd. Schitu Magureanu nr.1
050025 Bucuresti sectorul 5
Romania
Tel./Fax: +40 21 3126618 \
  +40 21 3120210 / int.101



--

Message: 2
Date: Sat, 29 Jul 2006 08:14:02 -0400
From: jim holtman [EMAIL PROTECTED]
Subject: Re: [R] fancier plotting
To: Fred J. [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Message-ID:
[EMAIL PROTECTED]
Content-Type: text/plain

Wasn't exactly sure what you wanted to do.  Is this close?

mypch - c(a=19, b=19, c=19, d=22) #point type
mycol - c(a='green', b='red', c='black', d='blue') #color
mydf - data.frame(x=c('a','b', 'b','c','d'), y=c(2, 4, 8, 6, 2))
plot(mydf$y, type='p', pch=mypch[mydf$x], col=mycol[mydf$x])


On 7/29/06, Fred J. [EMAIL PROTECTED] wrote:

 Hi

 thank you for talking the time to help me with this.

 I have a sequence of numbers in a file and an equal sequence of various
 character, say(a b c d) each occurs more than once. I need to plot the
 numbers so that numbers corresponding to a in the other sequence would have
 green dots, those corresponding to b a red dot, nothing on c and blue square
 for d. i.e

 2 a  show a green dot
 4 b  show a red dot
 8 b  show a red dot
 6 c  show default colour
 2 d  show blue square

 I have the code below which plots the data but I have no clue how to
 inject the extra fancies.

 
 ###
 # ploting #
 ###
 library(tkrplot)

 #just the turning points
 L - length(I0); #points to plot

 tt - tktoplevel()
 left - tclVar(1)
 oldleft - tclVar(1)
 right - tclVar(L)
 cury - tclVar(' ')
 curx - NA
 tmpusr - numeric(4)
 tmpplt - numeric(4)

 f1 - function(){
 lleft - as.numeric(tclvalue(left))
 rright - as.numeric(tclvalue(right))
 x - seq(lleft,rright,by=1)
 par(bg='black', fg='green', col='white', col.axis='white',
  col.lab='magenta', col.main='blue', col.sub='cyan')
 plot(x,I0[x], type='s')

 ##   par(new=TRUE)
 ##   plot(x,I1[x], type='s', col='yellow',axes=F)

 par(new=TRUE)
 plot(x,I2[x], type='s', col='cyan',axes=F)

 axis(4)
 tmpusr 

Re: [R] main= bquote(paste(Results for , beta, 3, ==.(b1)))) doesn't work.

2006-07-31 Thread Stuart Leask
b1-3
plot(1,1, main= bquote(paste(Results for , beta==.(b1

seems to work.

Stuart
Dr Stuart J Leask DM MRCPsych MA BChir
Senior Lecturer and Honorary Consultant in Clinical Psychiatry
University Dept of Psychiatry, Duncan Macmillan House
Porchester Road. Nottingham. NG3 6AA.
http://www.nottingham.ac.uk/psychiatry/staff/s_leask.html

- Original Message - 
From: Marco Boks [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Sunday, July 30, 2006 10:15 PM
Subject: [R] main= bquote(paste(Results for , beta, 3,==.(b1 doesn't 
work.


 Hi,

 I need to plot the beta as the symbol, followed by the index 3 as the 
 title of a graph.

 This code works main= bquote(paste(Results for , beta ==.(b1))

 but I also need the index 3.
 I tried (simplified):

plot(x,y, main= bquote(paste(Results for , beta, 3, ==.(b1

 and a few other versions, but I can not get it to run properly.

 Any help would be greatly appreciated,

 Thanks

 Marco
 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code. 


This message has been checked for viruses but the contents of an attachment
may still contain software viruses, which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] DOE in R

2006-07-31 Thread Petr Pikal
See

?aov
?lm

something like
 
lm(result~yourfactor1+yourfactor2+yourfactor3+yourfactor5, 
data=yourdataframe)
or
aov(result~yourfactor1+yourfactor2+yourfactor3+yourfactor5, 
data=yourdataframe)

but the exact structure of lm or aov construction depends on what you 
want to test.

HTH
Petr


On 28 Jul 2006 at 21:43, [EMAIL PROTECTED] wrote:

Date sent:  Fri, 28 Jul 2006 21:43:38 -0700
From:   [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject:[R] DOE in R
Send reply to:  [EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]

 Hi.
 
 I'm a student in a graduate program at Simon Fraser University in
 Canada.
 
 I am trying to run a simple screening experiment with some simulated
 data.
 
 I simply want to do an ANOVA of an experiemnt with 5 factors (4 have 2
 levels, the last has 3 levels) and 48 runs (ie, full factorial).
 
 The thing is that I have multiple observations for each level
 combination (run).
 
 So,
 
 1) How do I do the anova based on the setup above?
 
 and 
 
 2) More importantly, because of convergence issues for my simulations,
 I will likely have an unequal number of observations for the 48 runs.
 How can I do this?
 
 Seems like a straightforward enough situation.
 
 I am trying to avaoid writing my own C code to do the analysis since I
 am working under some pretty tight time constraints.
 
 ANy help would be appreciated.
 
 Thanks very much.
 
 Dean Vrecko
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.

Petr Pikal
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with allp ossible combination.

2006-07-31 Thread Arun Kumar Saha
Dear R Users,

Suppose I have a dataset like this:

  a   b

39700   485.00
39300   485.00
39100   480.00
38800   487.00
38800   492.00
39300   507.00
39500   493.00
39400   494.00
39500   494.00
39100   494.00
39200   490.00

Now I want get a-b for all possible combinations of a and b. Using two 'for'
loop it is easy to calculate. But problem arises when row length of the data
set is large eg. 1000 or more. Then R takes lot of time to do that. Can
anyone please tell me whether there is any R-function to do such kind of job
quickly?

Thanks and regards,

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with allp ossible combination.

2006-07-31 Thread Ido M. Tamir

Now I want get a-b for all possible combinations of a and b.

outer(a,b,-)

hth
ido

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with allp ossible combination.

2006-07-31 Thread Jacques VESLOT
diffs - do.call(expand.grid, dt)
diffs$delta - rowSums(expand.grid(dt$a, -dt$b))
---
Jacques VESLOT

CNRS UMR 8090
I.B.L (2ème étage)
1 rue du Professeur Calmette
B.P. 245
59019 Lille Cedex

Tel : 33 (0)3.20.87.10.44
Fax : 33 (0)3.20.87.10.31

http://www-good.ibl.fr
---


Arun Kumar Saha a écrit :
 Dear R Users,
 
 Suppose I have a dataset like this:
 
   a   b
 
 39700   485.00
 39300   485.00
 39100   480.00
 38800   487.00
 38800   492.00
 39300   507.00
 39500   493.00
 39400   494.00
 39500   494.00
 39100   494.00
 39200   490.00
 
 Now I want get a-b for all possible combinations of a and b. Using two 'for'
 loop it is easy to calculate. But problem arises when row length of the data
 set is large eg. 1000 or more. Then R takes lot of time to do that. Can
 anyone please tell me whether there is any R-function to do such kind of job
 quickly?
 
 Thanks and regards,
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Great R documentation

2006-07-31 Thread hadley wickham
Dear all,

I'm trying to improve the documentation I provide my R packages, and
to that end I'd like to find out what you think is great R
documentation.  I'm particularly interested in function documentation,
but great vignettes, websites or book are also of interest.

What is your favourite bit of R documentation, and why?

Thanks,

Hadley

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] math symbols and text with mtext()

2006-07-31 Thread Juan Lewinger
Dear R users,

Two questions:

1) Is there a way to simplify the mtext() line below ?

beta=c(1,-1)
m=5
plot(1)
mtext( bquote(paste( beta == .(paste( (, paste(beta, collapse=, ), ) )) 
)), outer=TRUE,line=-3)

2) How do I get the embedded carriage return \n below to work, i.e for the 
text that follows it to appear on the next line?

beta=c(1,-1)
m=5
plot(1)
mtext( bquote(paste( beta == .(paste( (, paste(beta, collapse=, ), )  )), 
 \n Expected # of true positives = , .(m))), 
outer=TRUE, line=-3)

Thanks in advance! 

Juan Pablo Lewinger, Ph.D. 
Department of Preventive Medicine 
Keck School of Medicine 
University of Southern California

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Possible to subscribe to RNews?

2006-07-31 Thread Peter Dalgaard
Rainer M Krug [EMAIL PROTECTED] writes:

 Hi
 
 is it possible to subscribe to RNews that it get's emailed as soon as it
 is released or is there an announcement list for new issues?

It is announced on R-announce, along with a few other selected items.
See, e.g.

http://tolstoy.newcastle.edu.au/R/announce/06/index.html


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] math symbols and text with mtext()

2006-07-31 Thread Peter Dalgaard
Juan Lewinger [EMAIL PROTECTED] writes:

 Dear R users,
 
 Two questions:
 
 1) Is there a way to simplify the mtext() line below ?
 
 beta=c(1,-1)
 m=5
 plot(1)
 mtext( bquote(paste( beta == .(paste( (, paste(beta, collapse=, ), ) )) 
 )), outer=TRUE,line=-3)
 

The outer paste() is superfluous, but apart from that: Not really.
You're specifying a nonstandard formatting of beta, (1, -1) so you
also need to give all details. If you can live with a slightly
different format, try

mtext( bquote(beta == .(deparse(beta) ) ), outer=TRUE,line=-3)

and in fact,  

mtext(bquote( beta == .(substring(deparse(beta), 2)) ), outer=TRUE,line=-3)

will strip off the leading c and give you what you want, but only
slightly less cryptically.


 2) How do I get the embedded carriage return \n below to work, i.e for the 
 text that follows it to appear on the next line?
 
 beta=c(1,-1)
 m=5
 plot(1)
 mtext( bquote(paste( beta == .(paste( (, paste(beta, collapse=, ), )  
 )), 
  \n Expected # of true positives = , .(m))), 
 outer=TRUE, line=-3)
 
 Thanks in advance! 
 
 Juan Pablo Lewinger, Ph.D. 
 Department of Preventive Medicine 
 Keck School of Medicine 
 University of Southern California
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Great R documentation

2006-07-31 Thread Karl Ove Hufthammer
hadley wickham skreiv:

 I'm trying to improve the documentation I provide my R packages, and
 to that end I'd like to find out what you think is great R
 documentation.  I'm particularly interested in function documentation,
 but great vignettes, websites or book are also of interest.
 
 What is your favourite bit of R documentation, and why?

I find that a graphic is worth *at least* a thousand words. I learn very
much from looking at examples of the graphical output of functions, and
it’s often much easier to look through ‘example(function)’ for a output
that looks similar to what I need, and to tweak it, than to read the
documentation to find out how to create the needed graphic (if it’s
possible at all).

And it’s fun too!

Example:
demo(graphics)
and
library(lattice)
example(xyplot)

These beautiful and interesting graphics.

My advice will therefore be to document every function with plenty of
interesting and useful and different (trivial variants on a graphic is not
interesting) and *pretty* examples.

And do not start the examples section with a very advanced example, with
many parameters and based on many transformations of a data set. For
example, do not write:

... 10 impossible-to-understand lines for generating or transforming
the data set ...
fancyPlot(x,y,data=foo,lw=3,rty=2,bw=full,qrs=partial,method=bayes,
  nw=bar,clp=list(open.edge=TRUE,col=1,doubleMar=list(type=tr)),
  compute=c(o,p,lower,upper),cex=1.2,xlim=range(x)*1.3)

Instead, start with:

fancyPlot(anscombe)

or

x=rnorm(100)
fancyPlot(x)

Then gradually make the examples more advanced or complete.

And do document/comment the examples. Say what’s going on, what the graphic
(or table, or textual output) shows and why it’s interesting.

One more thing: The ‘lattice’ package also has a nice introduction:

?Lattice

I believe all packages should have such a introduction, to give an overview
of the package, what it’s about and some examples of use.

One last advice: If you have a vignette or a demo, do tell in the
‘Description’ of ‘library(help=package)’. It’s *very* easy to miss
otherwise (and many people don’t know that demos or even vignettes exist).

-- 
Karl Ove Hufthammer
E-mail and Jabber: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Algebraic operation on the missing values

2006-07-31 Thread Joanna Procelewska
Hi all,

I have a large set of descriptors, which are stored as the vectors, each one
containing about 450 elements. Now I have to perform some algebraical
operations on this set to eliminate the redundant ones. The problem is, that
not all vales in the vectors are known. 
Are there any norm defined how should I process such vectors? Simple example:
having two vectors:
 
a  b
3  4
2  null
3  6
 
I can imagine that a+b is [7 null 9]', but what about scalar product? Is it
null or have it a value? 
I don't want to replace missing values with the concrete ones, but they
significantly complicate my computations. 

Does anyone know whether there are any ways to solve this problem?
Please share some experience. Really appreciate the help!

Sincerely,

Joanna

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory problems when combining randomForests

2006-07-31 Thread Eleni Rapsomaniki
Hello

I've just realised attachments are not allowed, so the data for the example in
my previous message is:

pos.df=read.table(http://www.savefile.com/projects3.php?fid=6240314pid=847249key=119090;,
header=T)

neg.df=read.table(http://fs07.savefile.com/download.php?pid=847249fid=9829834key=362779;,
header=T)

And my last two questions (promise!): 
The first is related to the order of columns (ie. explanatory variables). I get
different order of importance for my variables depending on their order in the
training data. Is there a parameter I could fiddle with (e.g. ntree) to get a
more stable importance order?

And finally, since interactions are not implemented, is there another method I
could use in R to find dependencies among categorical variables? (lm doesn't
accept categorical variables).

Many thanks
Eleni Rapsomaniki
Birkbeck College, UK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] LIMMA: makeContrasts() function

2006-07-31 Thread Jay Shin
Hello,
I am trying to run a makeConstrasts() function to compute eBayes t- 
test with the following design list (six hybridizations and three  
samples -in duplicates):
A   BC
1  1   00
2 1   00
3 0   10
4 0   10
50   01
60   01

I would like to compare sample B-C, so when I run:
 contrasts.matrix-makeContrasts(B-C, levels=design)
it works great!

However, I have B and C stored in as 'member' (e.g. member[1] = B,  
member[2] = C).
So, when I try to run:
member1-member[1]
member2-member[2]
 contrasts.matrix-makeContrasts(member1-member2, levels=design)

  I get a following error message:
Error in eval(expr, envir, enclos) : object member1 not found

Can you please help me how I can directly compare the content of the  
'member' in makeContrasts() function?
Thank you,
Jay


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Algebraic operation on the missing values

2006-07-31 Thread Petr Pikal
Hi

see
?complete.cases and/or ?is.na for evaluating non missing entries. 

However in any operation in which you use NA value, result shall be 
NA as you do not know what actually is NA.

HTH
Petr





On 31 Jul 2006 at 11:44, Joanna Procelewska wrote:

Date sent:  Mon, 31 Jul 2006 11:44:25 +0200 (CEST)
From:   Joanna Procelewska [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject:[R] Algebraic operation on the missing values

 Hi all,
 
 I have a large set of descriptors, which are stored as the vectors,
 each one containing about 450 elements. Now I have to perform some
 algebraical operations on this set to eliminate the redundant ones.
 The problem is, that not all vales in the vectors are known. Are there
 any norm defined how should I process such vectors? Simple example:
 having two vectors:
 
 a  b
 3  4
 2  null
 3  6
 
 I can imagine that a+b is [7 null 9]', but what about scalar product?
 Is it null or have it a value? I don't want to replace missing values
 with the concrete ones, but they significantly complicate my
 computations. 
 
 Does anyone know whether there are any ways to solve this problem?
 Please share some experience. Really appreciate the help!
 
 Sincerely,
 
 Joanna
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.

Petr Pikal
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Please HELP: Problem with BUILD command

2006-07-31 Thread Zajd, John
Greetings,
 
I am unable to successfully BUILD due to a file that apears to be too
long.
 
I know it it due to the length of a particular R program because if I
remove a line, even a comment line,
from the file it then successfully builds. However, if I add the line
back in, the build fails.
 
The file is only 14Kb long.
 
Any help or suggestions are greatly appreciated.
 
Thank you for your time.
John Zajd
Constella Group
Raleigh, NC
919 313-7746

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Great R documentation

2006-07-31 Thread Michael Dewey
At 10:09 31/07/2006, hadley wickham wrote:
Dear all,

I'm trying to improve the documentation I provide my R packages, and
to that end I'd like to find out what you think is great R
documentation.  I'm particularly interested in function documentation,

Hadley, I do not think any bit of function documentation, if you mean 
the manual page type documents, is ever that enlightening unless you 
already know what the function does. For me the useful documents are 
the more extended narrative documents.

What I find helpful is:
start with what is the aim of this document
tell me what I am assumed to know first (and ideally tell me where to 
look if I do not)
start with an example using the defaults
tell me what the output means in terms of the scientific problem
tell me what action I might take when it gives me a warning (if that 
is predictable)
tell me about other packages with the same aim and why I should use this one
tell me where to go for more information

but great vignettes, websites or book are also of interest.

What is your favourite bit of R documentation, and why?

Thanks,

Hadley



Michael Dewey
http://www.aghmed.fsnet.co.uk

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glmmNQ

2006-07-31 Thread Maria Salomé Esteves Cabral
Hi!
 
Can anyone let me know where is the function glmmNQ? It's said that it is in 
the MASS library but I can not find it.
 
Thanks
 
Salomé

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Please HELP: Problem with BUILD command

2006-07-31 Thread Prof Brian Ripley
As we have said to you before, we need a reproducible example, and it is 
very likely that you are speculating.

There is no 'BUILD' command in R:  do you mean 'R CMD build'?  If so, 
that does not parse any `R program'.

On Mon, 31 Jul 2006, Zajd, John wrote:

 Greetings,
  
 I am unable to successfully BUILD due to a file that apears to be too
 long.
  
 I know it it due to the length of a particular R program because if I
 remove a line, even a comment line,
 from the file it then successfully builds. However, if I add the line
 back in, the build fails.
  
 The file is only 14Kb long.
  
 Any help or suggestions are greatly appreciated.
  
 Thank you for your time.
 John Zajd
 Constella Group
 Raleigh, NC
 919 313-7746
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] glmmNQ

2006-07-31 Thread Prof Brian Ripley
On Mon, 31 Jul 2006, Maria Salomé Esteves Cabral wrote:

 Hi!
  Can anyone let me know where is the function glmmNQ? It's said that it 
 is in the MASS library but I can not find it.

Where is it said to be in the MASS *package*?  Not in MASS the book, for 
sure.  The function of that name used in MASS the book has never been made 
available for use with R.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Please HELP: Problem with BUILD command

2006-07-31 Thread Uwe Ligges
Zajd, John wrote:
 Greetings,
  
 I am unable to successfully BUILD due to a file that apears to be too
 long.
  
 I know it it due to the length of a particular R program because if I
 remove a line, even a comment line,
 from the file it then successfully builds. However, if I add the line
 back in, the build fails.
  
 The file is only 14Kb long.


Strange. Can you make this file available, please?

Uwe Ligges



 Any help or suggestions are greatly appreciated.
  
 Thank you for your time.
 John Zajd
 Constella Group
 Raleigh, NC
 919 313-7746
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Three questions about a model for possibly periodic data with varying amplitude

2006-07-31 Thread Andrew Robinson
Hi dear R community,

I have up to 12 measures of a protein for each of 6 patients, taken
every two or three days.  The pattern of the protein looks periodic,
but the height of the peaks is highly variable.  It's something like
this:

patient - data.frame(
day = c(1, 3, 5, 8, 10, 12, 15, 17, 19, 22, 24, 26),
protein =  c(5, 3, 10, 7, 2, 8, 25, 12, 7, 20, 10, 5)
)
plot(patient$day, patient$protein, type=b) 

My goal is two-fold: firstly, I need to test for periodicity, and
secondly, I need to try to predict the temporal location of future
peaks.  Of course, the peaks might be occurring on unmeasured days.

I have been looking at this model:

wave.form - 
  deriv3( ~ sin(2*pi*((day-offset)/period + 0.25)) * amplitude + mean,
 c(period, offset, amplitude, mean),
 function(day, period, offset, amplitude, mean){})

curve(wave.form(x, period=7, offset=2, mean=5, amplitude=4), 
   from=1, to=30)

So, for my purposes, the mean and the amplitude seem to be irrelevant;
I want to estimate the location and the offset.  To get going I've
been using the following approach:

require(nlme)

wave.1 - gnls(protein ~ wave.form(day, period, offset, amplitude, mean),
   start=list(period=7, offset=0, amplitude=10, mean=0),
   weights=varPower(), data=patient)

 ... which seems to fit the wave form pretty nicely.  

Question 1) I'm wondering, however, if anyone can suggest any
improvements to my model or fitting procedure, or general
approach.

Generalizing to a non-linear mixed effects model is proving difficult.
For example,

#

set.seed(1234)

patients - expand.grid(patient.no = 1:6,
day = patient$day)

patient.params - data.frame(patient.no = 1:6,
 period = c(5,6,7,8,7,6),
 offset = c(1,2,3,1,2,3),
 amplitude = c(10,6,10,6,10,6),
 mean = c(22,14,22,14,22,14))

patients - merge(patients, patient.params)

patients$protein - with(patients, 
 wave.form(day, period, offset, amplitude, mean)) +
 rnorm(n=dim(patients)[1], mean=5, sd=2)

patients - groupedData(protein ~ day | patient.no, data=patients)

protein.nlme - nlme(protein ~ 
 wave.form(day, period, offset, amplitude, mean),
 fixed = period + offset + mean + amplitude ~ 1,
 random = period + offset ~ 1 | patient.no,
 start = c(period=2*pi, offset=0, mean=10,
   amplitude=5),
 data = patients)

random.effects(protein.nlme) 

#

I get the following values for the BLUPS:

  periodoffset
2  0 -5.738510e-09
4  0 -6.426104e-08
6  0  6.847430e-09
1  0  6.275570e-07
5  0 -1.486590e-06
3  0  9.221848e-07

It seems clear to me that these BLUPS are quite unsuitable.  

Question 2) Can anyone suggest how I might improve my use of nlme?
Other than using more data :) 

Question 3) Finally, I'm also wondering what would be a suitable test for
periodicity for these data. I'd like to test the null
hypothesis that the data are not periodic.

All suggestions, discussion, welcome!

Best wishes

Andrew
-- 
Andrew Robinson  
Department of Mathematics and StatisticsTel: +61-3-8344-9763
University of Melbourne, VIC 3010 Australia Fax: +61-3-8344-4599
Email: [EMAIL PROTECTED] http://www.ms.unimelb.edu.au

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Algebraic operation on the missing values

2006-07-31 Thread Joanna Procelewska
Thanks for the answer.

The problem is I have to perform a forward selection on the set and in every
step construct an orthonormal base for the subspace spanned on the selected
vectors. This means that I can use only the full vectors for the constructing
a base, or? 

Joanna

--- Petr Pikal [EMAIL PROTECTED] schrieb:

 Hi
 
 see
 ?complete.cases and/or ?is.na for evaluating non missing entries. 
 
 However in any operation in which you use NA value, result shall be 
 NA as you do not know what actually is NA.
 
 HTH
 Petr
 
  Hi all,
  
  I have a large set of descriptors, which are stored as the vectors,
  each one containing about 450 elements. Now I have to perform some
  algebraical operations on this set to eliminate the redundant ones.
  The problem is, that not all vales in the vectors are known. Are there
  any norm defined how should I process such vectors? Simple example:
  having two vectors:
  
  a  b
  3  4
  2  null
  3  6
  
  I can imagine that a+b is [7 null 9]', but what about scalar product?
  Is it null or have it a value? I don't want to replace missing values
  with the concrete ones, but they significantly complicate my
  computations. 
  
  Does anyone know whether there are any ways to solve this problem?
  Please share some experience. Really appreciate the help!
  
  Sincerely,
  
  Joanna
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html and provide commented,
  minimal, self-contained, reproducible code.
 
 Petr Pikal
 [EMAIL PROTECTED]
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Random Effects Model with Interacting Covariates

2006-07-31 Thread Dov Stekel
Hi

I have been asked by a colleague to perform a statistical analysis 
which uses random effects - but I am struggling to get this to work 
with nlme in R. Help would be very much appreciated!

Essentially, the data consists of:

10 patients. Each patient has been given three different treatments (on 
three separate days). 15 measurements (continuous variable) have been 
taken from each patient both before and after each of the treatments.  
So the data looks like:

Patient WhenTreat   Measurement
a   before  A   10.3
a   before  A   11.2
...
a   after   A   12.4
...
a   before  B   11.6
...
a   after   B   ...

and the same for treatment C, patients, b,c,d, etc.

My colleague would like to test to see if the treatments are different 
from each other. i.e., is the change (before to after) due to the 
treatments different between the treatments. It would seem to me like a 
random effects model in which we are interested in the significance of 
the interaction terms Treat:When, with repeated measures in the 
patients (who are random effects, but crossed with the covariates). 
Unfortunately, the groupedData formula only lets me put a single 
covariate on the LHS - nothing as complicated as this!

I could, of course, advise her to simply combine all 30 data points for 
each treatment in each patient into a single number (representing 
difference between before and after), but is there a way to use all the 
data in an LME?

Thanks!


Dov



**

Dr Dov Stekel
Lecturer in Bioinformatics
School of Biosciences
University of Birmingham
Birmingham B15 2TT
Tel: +44 121 414 4209
Email: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] standardized residuals (random effects) using nlme and ranef

2006-07-31 Thread Dirk Enzmann
As suggested I try another post.

First I give a reproducible example. The example data set has been 
provided by I. Plewis (1997), Statistics in Education. London: Arnold (I 
include the residuals obtained by MLWin):

# -

library(nlme)

# Example data from Plewis

BaseURL=http://www.ottersbek.de/;
ds32 = read.fwf(paste(BaseURL,'ds32.dat',sep=''),
   widths=c(9,10,10,10,10,10),
   col.names=c('class','pupil','zcurric',
   'curric','zmath1','math2'),
   colClasses=c('factor','factor','numeric',
'numeric','numeric','numeric'))

# Random slopes model (p. 44):
model.4b = lme(math2~zcurric,random=~zcurric|class,method='ML',data=ds32)

# Random effects (intercept and slope residuals) of level 1 (class):
RandEff = ranef(model.4b,level=1)

# Results of MLWin (c300 = intercept residuals of level class,
#   c301 = slope residuals of level class,
#   c304 = standardized intercept residuals,
#   c305 = standardized slope residuals):
MLWinRes = read.fwf(paste(BaseURL,'resid_ex32.txt',sep=''),
 widths=c(15,15,15,15),
 col.names=c('c300','c301','c304','c305'),
 colClasses=c('numeric','numeric','numeric','numeric'))

# They are identical to the results of MLWin (c300, c301):
round(cbind(RandEff,MLWinRes[,1:2]),5)

# However, using option standard the results differ considerably:
round(cbind(ranef(model.4b,level=1,standard=T),MLWinRes[,3:4]),5)

# -

 From the RSiteSearch(MLWin) there are two hits that seem to be 
relevant: One of Douglas Bates

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/53097.html

where he explains why he regards getting standard errors of the 
estimates of variance components as not being important. I think (but 
I'm not sure) that this implies that I should not use standardized 
residuals, as well.

Secondly a post of Roel de Jong

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/62280.html

However, I can't figure out how I could make use of his suggestion to 
obtain the standardized residuals I am looking for.

A part of the answer has been given in an answer to my previous post by 
Harold Doran. He showed that the so called standardized residuals 
obtained by ranef() using the option standard=T are the residuals 
devided by the standard deviation of the random effects, not divided by 
the corresponding standard error as stated in the ranef() help - if I 
understood him correctly.

In the multilevel list he also showed how to create a caterpillar plot 
using lmer() and the errbar() function of Hmisc (I modified his example 
to adapt it to my data):

# 

detach(package:nlme)
library(Matrix)
library(Hmisc)

# Fit a sample model:
fm1 = lmer(math2 ~ zcurric + (zcurric|class), ds32, method='ML', 
control=list(gradient=FALSE, niterEM=0))

.Call(mer_gradComp, fm1, PACKAGE = Matrix)

# extract random effects:
fm1.blup = ranef(fm1)$class

#extract variances and multiple by scale factor:
fm1.post.var = [EMAIL PROTECTED] * attr(VarCorr(fm1),sc)**2

# posterior variance of slope on fm1:
fm1.post.slo = sqrt(fm1.post.var[2,2,])

# Slopes:
x = fm1.blup[,2]+outer(fm1.post.slo, c(-1.96,0,1.96))

# This code creates a catepillar plot using the errbar function:
x - x[order(x[,2]) , ]
print(errbar(1:dim(x)[1], x[,2],x[,3],x[,1],ylab='Slopes', xlab='Classes'))
abline(h=0)

# 

Is it correct that I can obtain a respecive plot for the intercepts in a 
similar way?

# 

# posterior variance of intercept on fm1.post.int:
fm1.post.int = sqrt(fm1.post.var[1,1,])

# Intercepts:
x = fm1.blup[,1]+outer(fm1.post.int, c(-1.96,0,1.96))

# This code creates a catepillar plot using the errbar function
x - x[order(x[,2]) , ]
print(errbar(1:dim(x)[1], x[,2],x[,3],x[,1],ylab='Intercepts', 
xlab='Classes'))
abline(h=0)

# 

To sum up, I can't figure out how MLWin calculates the standardized 
residuals. But I understand that this is not a question for the R list. 
Nevertheless, it would help if someone could point me to some arguments 
why not to use them and stick to the results obtainable by ranef().

Spencer Graves wrote:

   Have you tried RSiteSearch(MLWin)?  I just got 29 hits.  I 
 wonder if any one of these might relate to your question?
 
   If you would like more help on this issue from this listserve, 
 please submit another post, preferably illustrating your question with 
 the simplest possible self-contained example that illustrates your 
 question, perhaps like the following:
 
   fm1.16 - lme(distance~age, data=Orthodont[1:16,],
random=~age|Subject)
 
   Hope this helps.
   Spencer Graves
 p.s.  PLEASE do read the posting guide 

[R] Monospaced fonts in legends

2006-07-31 Thread Erich Neuwirth
Is there a way of using monospaced fonts in legends in a plot,
but still using standard proportionally spaced fonts for all the titles?


-- 
Erich Neuwirth, University of Vienna
Faculty of Computer Science
Computer Supported Didactics Working Group
Visit our SunSITE at http://sunsite.univie.ac.at
Phone: +43-1-4277-39464 Fax: +43-1-4277-39459

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] boxcox transformation

2006-07-31 Thread Greg Snow
boxcox from MASS and bct from TeachingDemos do different things.  The boxcox 
function does not return the transformed y values, it returns log-likelihood 
values for various values of lambda, these values can be used to decide which 
value of lambda to use (generally it is used by giving a sequence of lambda 
values then looking at the plot (see the plotit argument)).  

It is generally not a good idea to just take the lambda with the highest 
log-likelihood, rather look for values of lambda within the confidence interval 
that make scientific sense.  

Once you have decided on a value of lambda to use then you can use the bct 
function from TeachingDemos (or other ways) to compute the transformed y values.

Hope this helps,


-Original Message-
From: [EMAIL PROTECTED] on behalf of Torsten Mathies
Sent: Sat 7/29/2006 12:52 AM
To: r-help@stat.math.ethz.ch
Subject: [R] boxcox transformation
 
I've got a vector of data (hours to drive from a to b) y. 
 
After a qqplot I know, that they don't fit the normal probability. 
 
I would like to transform these data with the boxcox transformation
(MASS), that they fit the model.
 
When I try
 
ybx-boxcox(y~1,0) 
qqnorm(ybx)
 
the plot is different from
 
library (TeachingDemos)
ybct-bct(y,0) //
qqnorm(ybct)
 
How can I transform y to fit with the normal probability model?
 
 
Yours
Torsten

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] user defined covariance structure

2006-07-31 Thread Jonathan Smith
I am writing as I am still having trouble trying to define my own covariance 
matrix.  My code is displayed below.  I am defining the covariance matrix in 
the form of an AR1 process so it can be easily checked if working correctly. 
  Another question I have is if it is possible to define the matrix without 
giving p a specific value and leaving it in as a coefficient.  For the 
reason that it can be estimated when model is run.  I figure that is the way 
AR1 works in GLS.
Thank you
Jon Smith
I would appreciate any help s I am stuck here and need to figure this out 
prior to continuing my research.

function ()
{
library(nlme)

tim-c(1,2,3,4,1,2,3,4,1,2,3,4,1,2,3,4)
peep-c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,4,4)
y-c(11.78,9.53,11.03,9.89,10.80,8.74,10.25,10.69,5.60,7.27,6.81,4.56,7.01,5.64,6.30,8.31)
#This y data was created from and AR1 model with correlation coefficient 
equaling 0.7.

timMat-matrix(c(tim),ncol=1,nrow=16)
peepMat-matrix(c(peep),ncol=1,nrow=16)
yMat-matrix(c(y),ncol=1,nrow=16)
dataframe-data.frame(yMat,timMat,peepMat)
p=0.7
g=1

tester2-corSymm(value = c(p^(1),p^(2),p^(3),p^(1),p^(2),p^(1)),form = ~ 
timMat|peepMat)
tester2-Initialize(tester2, data = dataframe)
testMat2-corMatrix(tester2)
print(testMat2)
# this appears to be working correctly

#smanGls-gls(yMat~timMat,data = dataframe, corr = corAR1(form = 
~timMat|peepMat))
#   works perfectly

smanGls-gls(yMat~timMat,data = dataframe, corr = corSymm(tester2))

arsum-summary(smanGls)
print(arsum)

#this is what message I get when I try to use the covarince matrix I 
defined.
#Error in gls(yMat ~ timMat, data = dataframe, corr = corSymm(tester2)) :
  #  false convergence (8)

}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Great R documentation

2006-07-31 Thread John Kane

--- hadley wickham [EMAIL PROTECTED] wrote:

 Dear all,
 
 I'm trying to improve the documentation I provide my

I am a new user and at moment I am finding An
Introduction to S and the Hmisc and Design Libraries”
by Carlos Alzola and Frank E. Harrell is very helpful
as it explains some simple data manipulations that
other documentation seems to assume one will know. 

My opinion is that much of the documentation is very
good but very terse and at least in the help files
sometimes a bit too clever.   There seldom seems to be
enough  explaination as to why something does
something which has often left me able to do exactly
what I want to do but having a difficult time
generalizing.


Of course, if I could track down whoever at my local
university has taken out all the R books I might be
much better off. :)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Monospaced fonts in legends

2006-07-31 Thread Prof Brian Ripley
On Mon, 31 Jul 2006, Erich Neuwirth wrote:

 Is there a way of using monospaced fonts in legends in a plot,
 but still using standard proportionally spaced fonts for all the titles?

Yes, just use the family= argument as appropriate, e.g. par(family=mono) 
before calling legend.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] placing rectangle behind plot

2006-07-31 Thread Greg Snow
One thing you could try (probably as last resort, if someone comes up with a 
better idea, use that) is to plot to an xfig device, then use xfig (or jfig) to 
move the rectangle to the back, then convert it to whatever final graphics 
format you want.


-Original Message-
From: [EMAIL PROTECTED] on behalf of Gabor Grothendieck
Sent: Sat 7/29/2006 3:20 PM
To: RHelp
Subject: [R] placing rectangle behind plot
 
I am trying to create a lattice plot and would like to later, i.e. after
the plot is drawn, add a grey rectangle behind a portion of it.
The following works except that the rectrangle is on top of and
obscures a portion of the chart.  I also tried adding col = transparent
to the gpar list but that did not help -- I am on windows and
perhaps the windows device does not support transparency?
At any rate, how can I place the rectangle behind the plotted
points without drawing the rectangle first?

library(lattice)
library(grid)
trellis.unfocus()
x - 1:10
xyplot(x ~ x | gl(2,1), layout = 1:2)
trellis.focus(panel, 1, 1)
grid.rect(w = .5, gp = gpar(fill = light grey))
trellis.unfocus()

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Log color scale

2006-07-31 Thread Thomas Lumley

On Sun, 30 Jul 2006, Uwe Ligges wrote:


[EMAIL PROTECTED] wrote:


Kartik Pappu a écrit :



However I need to plot my data in a log transformed color scale.  Is
this possible?  I will be happy to explain further, but basically I
need to do this because there are large variations in the max and min
values of my raw data and I am trying to highlight the differences in
the values at the lower end of my raw data (while still displaying the
values at the high end of the spectrum for comparison) so if figured
the best way to represent this on a (RGB) color scale is to do it with
a log transform.  I do not want to use too many colors because that
make the figure too busy.
Any suggestions on how to achieve this.



Don't use too much time to search a good palette, build it yourself.


It is worth looking into the packages RColorBrewer and colorspace.
Some people thought about colors in graphics and wrote some code that
might result in more appropriate colors and palettes than those that are
quickly hacked...



colorRampPalette() will interpolate between colors (eg those from one of 
the ColorBrewer palettes) and has a bias= argument to make the colors 
closer together at one end of the scale.


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Power of a single sample binomial test

2006-07-31 Thread Greg Snow
It looks like the pwr.p.test function from the pwr package would do what you 
want.


-Original Message-
From: [EMAIL PROTECTED] on behalf of Chris Evans
Sent: Sun 7/30/2006 12:53 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Power of a single sample binomial test
 
The only references to this I can find searching the archives are to a
student who asked in relation to his course work on a stats course.
Promise I'm not doing that!

I have a situation in which we want to test proportions against an
expected proportion, binom.test() is great.  I'd like to do some post
hoc power tests (the x and n were beyond our control in the survey as
all we could set was an overall n.max where n  n.max, n is between  1
and 44).

I would love to work out our power to have detected a proportions
different from the expected (.307).  I've run two-tailed binomial tests
as we were interested in both high and low.  We can not unreasonably
confine to the directional prediction of observed x/n  .307, say .15
if that makes the maths easier.  I can't see functions in R that will do
this for me.  The only book I seem to have to hand that addresses this is:
Kraemer, H. C.  Thiemann, S. (1988) How many subjects?  Statistical
power analysis in research. Newbury Park California, Sage Publications, Inc.
which I appreciate is ageing but I assume still correct.  The problem I
have is that I can use R to get Kraemer's upper case delta (p.77) and
look up in their Master table but I'd love a more flexible function
that would  say solve for power where p1, n, p0 and alpha are given.  I
think I ought to be able to work out how their master table was
calculated and work from that but I'm finding the mathematics a bit
opaque for my ageing brain.  Their model is clearly one-tailed.  I'm not
sure how one works a two-tailed power.

A search around for web calculators etc. turns up all manner of things,
some probably good, some dead etc.  I'd hugely appreciate if someone
here could share anything they may have in R or point me to R solutions
I may have missed.

TIA,

Chris

-- 
Chris Evans [EMAIL PROTECTED]
Professor of Psychotherapy, Nottingham University;
Consultant Psychiatrist in Psychotherapy, Rampton Hospital;
Research Programmes Director, Nottinghamshire NHS Trust;
Hon. SL Institute of Psychiatry, Hon. Con., Tavistock  Portman Trust
**If I am writing from one of those roles, it will be clear. Otherwise**

**my views are my own and not representative of those institutions**

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] standardized residuals (random effects) using nlme and ranef

2006-07-31 Thread Doran, Harold
  To sum up, I can't figure out how MLWin calculates the 
 standardized residuals. But I understand that this is not a 
 question for the R list. 
 Nevertheless, it would help if someone could point me to some 
 arguments why not to use them and stick to the results 
 obtainable by ranef().


Hi Dirk:

Well, it is interesting that mlWin and lmer generate the same exact
random effects but different results for the standardized random
effects. Now, my prior post showed exactly how lme calculates the
standardized random effects, so this is now totally transparent.

What I would recommend you do is this

1) Calculate the unstandardized random effects in mlWin
2) Calculate the standardized random effects in mlWin
3) Divide the mlWin unstandarized random effects by the standarized
random effects. This will show what denominator is used to standardize
the random effects.

Basically, just replicate what I did in my prior email using the mlWin
data.

Clearly, since R and mlWin generate the exact same unstandardized random
effects, but different standardized results, the difference lies in what
denominator is used. In R, we know what denominator is used and it is
the standard deviation of the variance component for the random effect.

If you do the steps above, you will solve for the denominator used in
mlWins calculation. This will solve your problem.

Harold

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory problems when combining randomForests

2006-07-31 Thread Weiwei Shi
Hi, Andy:

What's the Jerry Friedman's ISLE? I googled it and did not find the paper on
it. Could you give me a link, please?

Thanks,

Weiwei

On 7/31/06, Eleni Rapsomaniki [EMAIL PROTECTED] wrote:

 Hello

 I've just realised attachments are not allowed, so the data for the
 example in
 my previous message is:

 pos.df=read.table(
 http://www.savefile.com/projects3.php?fid=6240314pid=847249key=119090;,
 header=T)

 neg.df=read.table(
 http://fs07.savefile.com/download.php?pid=847249fid=9829834key=362779;,
 header=T)

 And my last two questions (promise!):
 The first is related to the order of columns (ie. explanatory variables).
 I get
 different order of importance for my variables depending on their order in
 the
 training data. Is there a parameter I could fiddle with (e.g. ntree) to
 get a
 more stable importance order?

 And finally, since interactions are not implemented, is there another
 method I
 could use in R to find dependencies among categorical variables? (lm
 doesn't
 accept categorical variables).

 Many thanks
 Eleni Rapsomaniki
 Birkbeck College, UK

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Weiwei Shi, Ph.D

Did you always know?
No, I did not. But I believed...
---Matrix III

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with documentation of a data set

2006-07-31 Thread Gattuso, Jean-Pierre
Hi:

I am trying to add a data set in my package but get the following error 
when the package is checked:

Undocumented data sets:
seacarb_test
All user-level objects in a package should have documentation entries.
See chapter 'Writing R documentation files' in manual 'Writing R
Extensions'.

However, I do have a file seacarb_test.Rd in the man directory. 
Reading the manual 'Writing R Extensions' did not help.

What am I doing wrong?

Thanks,
jp

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory problems when combining randomForests

2006-07-31 Thread Eleni Rapsomaniki
Hi Andy,

  I get different order of importance for my variables depending on their
order in the training data.

Perhaps answering my own question, the change in importance rankings could be
attributed to the fact that before passing my data to randomForest I impute the
missing values randomly (using the combined distributions of pos+neg), so the
data seen by RF is slightly different. Then combining this with the fact that
RF chooses data randomly it makes sense to see different rankings.

In a previous thread regarding simplifying variables:
http://thread.gmane.org/gmane.comp.lang.r.general/6989/focus=6993

you say:
The basic problem is that when you select important variables by RF and then
re-run RF with those variables, the OOB error rate become biased downward.
As you iterate more times, the overfitting becomes more and more severe
(in the sense that, the OOB error rate will keep decreasing while error rate
on an independent test set will be flat or increases) 

But if every time you remove a variable you pass some test data (ie data not
used to train the model) and base the performance of the new, reduced model on
the error rate on the confusion matrix for the test data, then this
overfitting should not be an issue, right?  (unless of course you were
referring to unsupervised learning).

Best regards
Eleni Rapsomaniki
Birkbeck College, UK

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] standardized residuals (random effects) using nlme and ranef

2006-07-31 Thread Douglas Bates
I have a cut-and-paste error in this message that I sent a few minutes
ago. I show the evaluation of rr as
 rr - ranef(model.4b, postVar = TRUE)
when it was
 rr - ranef(fm1, postVar = TRUE)

I also omitted the evaluation of rr1, which is

rr1 - rr$class


On 7/31/06, Douglas Bates [EMAIL PROTECTED] wrote:
 Thank you for providing the reproducible example and the explanation
 of what you are seeking to calculate.

 On 7/31/06, Dirk Enzmann [EMAIL PROTECTED] wrote:
  As suggested I try another post.
 
  First I give a reproducible example. The example data set has been
  provided by I. Plewis (1997), Statistics in Education. London: Arnold (I
  include the residuals obtained by MLWin):
 
  # -
 
  library(nlme)
 
  # Example data from Plewis
 
  BaseURL=http://www.ottersbek.de/;
  ds32 = read.fwf(paste(BaseURL,'ds32.dat',sep=''),
 widths=c(9,10,10,10,10,10),
 col.names=c('class','pupil','zcurric',
 'curric','zmath1','math2'),
 colClasses=c('factor','factor','numeric',
  'numeric','numeric','numeric'))
 
  # Random slopes model (p. 44):
  model.4b = lme(math2~zcurric,random=~zcurric|class,method='ML',data=ds32)
 
  # Random effects (intercept and slope residuals) of level 1 (class):
  RandEff = ranef(model.4b,level=1)
 
  # Results of MLWin (c300 = intercept residuals of level class,
  #   c301 = slope residuals of level class,
  #   c304 = standardized intercept residuals,
  #   c305 = standardized slope residuals):
  MLWinRes = read.fwf(paste(BaseURL,'resid_ex32.txt',sep=''),
   widths=c(15,15,15,15),
   col.names=c('c300','c301','c304','c305'),
   colClasses=c('numeric','numeric','numeric','numeric'))
 
  # They are identical to the results of MLWin (c300, c301):
  round(cbind(RandEff,MLWinRes[,1:2]),5)
 
  # However, using option standard the results differ considerably:
  round(cbind(ranef(model.4b,level=1,standard=T),MLWinRes[,3:4]),5)

 Your summation below of the cause of this inconsistency is correct.
 As Harold explains the standardized random effects returned by ranef
 in the nlme package are the estimates (actually the predictors) of
 the random effects divided by the estimated standard deviation of
 those random effects, not by the estimated standard error as stated in
 the documentation.  I will correct the documentation.

 That is, standardized random effects in the nlme package are
 produced by dividing all the intercept random effects by the same
 number (2.9723) and all the slope random effects by 2.0014.
  rr1 - ranef(model.4b)
  rr2 - ranef(model.4b, standard = TRUE)
  rr1/rr2
   (Intercept)  zcurric
 82.972373 2.001491
122.972373 2.001491
172.972373 2.001491
222.972373 2.001491
232.972373 2.001491
282.972373 2.001491
292.972373 2.001491
302.972373 2.001491
312.972373 2.001491
322.972373 2.001491
332.972373 2.001491
342.972373 2.001491
352.972373 2.001491
362.972373 2.001491
372.972373 2.001491
382.972373 2.001491
392.972373 2.001491
412.972373 2.001491
422.972373 2.001491
432.972373 2.001491
442.972373 2.001491
452.972373 2.001491
462.972373 2.001491
472.972373 2.001491

 Another way of defining a standardization would be to use what Harold
 Doran and others call the posterior variance-covariance of the
 random effects.  On reflection I think I would prefer the term
 conditional variance but I use posterior below.

 Strictly speaking the random effects are not parameters in the model -
 they are unobserved random variables.  Conditional on the values of
 the parameters in the model and on the observed data, the random
 effects are normally distributed with a mean and variance-covariance
 that can be calculated.

 Influenced by Harold I added an optional argument postVar to the
 ranef extractor function for lmer models (but apparently I failed to
 document it - I will correct that).  If you fit your model with lmer,
 as you do below, you can standardize the  random effects according to
 the conditional variances by

  fm1 - lmer(math2 ~ zcurric + (zcurric|class), ds32, method='ML')
  rr - ranef(model.4b, postVar = TRUE)
  str(rr)  # shows the structure of the object rr
 Formal class 'ranef.lmer' [package Matrix] with 1 slots
   ..@ .Data:List of 1
   .. ..$ :`data.frame': 24 obs. of  2 variables:
   .. .. ..$ (Intercept): num [1:24]  0.860 -1.962 -1.105  4.631 -0.781 ...
   .. .. ..$ zcurric: num [1:24]  0.2748 -1.3194  0.0841  0.5958  0.8080 
 ...
   .. .. ..- attr(*, postVar)= num [1:2, 1:2, 1:24]  

Re: [R] standardized residuals (random effects) using nlme and ranef

2006-07-31 Thread Douglas Bates
Thank you for providing the reproducible example and the explanation
of what you are seeking to calculate.

On 7/31/06, Dirk Enzmann [EMAIL PROTECTED] wrote:
 As suggested I try another post.

 First I give a reproducible example. The example data set has been
 provided by I. Plewis (1997), Statistics in Education. London: Arnold (I
 include the residuals obtained by MLWin):

 # -

 library(nlme)

 # Example data from Plewis

 BaseURL=http://www.ottersbek.de/;
 ds32 = read.fwf(paste(BaseURL,'ds32.dat',sep=''),
widths=c(9,10,10,10,10,10),
col.names=c('class','pupil','zcurric',
'curric','zmath1','math2'),
colClasses=c('factor','factor','numeric',
 'numeric','numeric','numeric'))

 # Random slopes model (p. 44):
 model.4b = lme(math2~zcurric,random=~zcurric|class,method='ML',data=ds32)

 # Random effects (intercept and slope residuals) of level 1 (class):
 RandEff = ranef(model.4b,level=1)

 # Results of MLWin (c300 = intercept residuals of level class,
 #   c301 = slope residuals of level class,
 #   c304 = standardized intercept residuals,
 #   c305 = standardized slope residuals):
 MLWinRes = read.fwf(paste(BaseURL,'resid_ex32.txt',sep=''),
  widths=c(15,15,15,15),
  col.names=c('c300','c301','c304','c305'),
  colClasses=c('numeric','numeric','numeric','numeric'))

 # They are identical to the results of MLWin (c300, c301):
 round(cbind(RandEff,MLWinRes[,1:2]),5)

 # However, using option standard the results differ considerably:
 round(cbind(ranef(model.4b,level=1,standard=T),MLWinRes[,3:4]),5)

Your summation below of the cause of this inconsistency is correct.
As Harold explains the standardized random effects returned by ranef
in the nlme package are the estimates (actually the predictors) of
the random effects divided by the estimated standard deviation of
those random effects, not by the estimated standard error as stated in
the documentation.  I will correct the documentation.

That is, standardized random effects in the nlme package are
produced by dividing all the intercept random effects by the same
number (2.9723) and all the slope random effects by 2.0014.
 rr1 - ranef(model.4b)
 rr2 - ranef(model.4b, standard = TRUE)
 rr1/rr2
  (Intercept)  zcurric
82.972373 2.001491
   122.972373 2.001491
   172.972373 2.001491
   222.972373 2.001491
   232.972373 2.001491
   282.972373 2.001491
   292.972373 2.001491
   302.972373 2.001491
   312.972373 2.001491
   322.972373 2.001491
   332.972373 2.001491
   342.972373 2.001491
   352.972373 2.001491
   362.972373 2.001491
   372.972373 2.001491
   382.972373 2.001491
   392.972373 2.001491
   412.972373 2.001491
   422.972373 2.001491
   432.972373 2.001491
   442.972373 2.001491
   452.972373 2.001491
   462.972373 2.001491
   472.972373 2.001491

Another way of defining a standardization would be to use what Harold
Doran and others call the posterior variance-covariance of the
random effects.  On reflection I think I would prefer the term
conditional variance but I use posterior below.

Strictly speaking the random effects are not parameters in the model -
they are unobserved random variables.  Conditional on the values of
the parameters in the model and on the observed data, the random
effects are normally distributed with a mean and variance-covariance
that can be calculated.

Influenced by Harold I added an optional argument postVar to the
ranef extractor function for lmer models (but apparently I failed to
document it - I will correct that).  If you fit your model with lmer,
as you do below, you can standardize the  random effects according to
the conditional variances by

 fm1 - lmer(math2 ~ zcurric + (zcurric|class), ds32, method='ML')
 rr - ranef(model.4b, postVar = TRUE)
 str(rr)  # shows the structure of the object rr
Formal class 'ranef.lmer' [package Matrix] with 1 slots
  ..@ .Data:List of 1
  .. ..$ :`data.frame': 24 obs. of  2 variables:
  .. .. ..$ (Intercept): num [1:24]  0.860 -1.962 -1.105  4.631 -0.781 ...
  .. .. ..$ zcurric: num [1:24]  0.2748 -1.3194  0.0841  0.5958  0.8080 ...
  .. .. ..- attr(*, postVar)= num [1:2, 1:2, 1:24]  2.562 -0.585
-0.585  1.837  3.027 ...

The two sets of conditional standard deviations are

 sqrt(attr(rr1, postVar)[1,1,])
 [1] 1.600589 1.739768 1.466261 1.542172 1.942629 1.506112 1.892815 1.665414
 [9] 1.377917 1.574244 1.755679 2.039292 1.887651 1.451916 1.732030 1.659786
[17] 1.987397 1.795273 1.542049 1.784441 1.629663 1.753223 1.585520 1.470161
 sqrt(attr(rr1, postVar)[2,2,])
 [1] 1.355537 1.630075 

Re: [R] memory problems when combining randomForests [Broadcast]

2006-07-31 Thread Liaw, Andy
It's the 5th paper on his web page.
http://www-stat.stanford.edu/~jhf/ftp/isle.pdf
http://www-stat.stanford.edu/~jhf/ftp/isle.pdf 
 
Cheers,
Andy


  _  

From: Weiwei Shi [mailto:[EMAIL PROTECTED] 
Sent: Monday, July 31, 2006 11:38 AM
To: Eleni Rapsomaniki
Cc: Liaw, Andy; r-help@stat.math.ethz.ch
Subject: Re: [R] memory problems when combining randomForests [Broadcast]


Hi, Andy:

What's the Jerry Friedman's ISLE? I googled it and did not find the paper on
it. Could you give me a link, please?

Thanks,

Weiwei


On 7/31/06, Eleni Rapsomaniki [EMAIL PROTECTED]
mailto:[EMAIL PROTECTED]  wrote: 

Hello

I've just realised attachments are not allowed, so the data for the example
in
my previous message is:

pos.df=read.table(
http://www.savefile.com/projects3.php?fid=6240314pid=847249key=119090
http://www.savefile.com/projects3.php?fid=6240314pid=847249key=119090;,
header=T)

neg.df=read.table(
http://fs07.savefile.com/download.php?pid=847249fid=9829834key=362779
http://fs07.savefile.com/download.php?pid=847249fid=9829834key=362779;,
header=T)

And my last two questions (promise!):
The first is related to the order of columns (ie. explanatory variables). I
get 
different order of importance for my variables depending on their order in
the
training data. Is there a parameter I could fiddle with (e.g. ntree) to get
a
more stable importance order?

And finally, since interactions are not implemented, is there another method
I 
could use in R to find dependencies among categorical variables? (lm doesn't
accept categorical variables).

Many thanks
Eleni Rapsomaniki
Birkbeck College, UK

__ 
R-help@stat.math.ethz.ch mailto:R-help@stat.math.ethz.ch  mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
http://www.R-project.org/posting-guide.html 
and provide commented, minimal, self-contained, reproducible code.





-- 
Weiwei Shi, Ph.D

Did you always know?
No, I did not. But I believed...
---Matrix III



--

--
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Functions ,Optim, Dataframe

2006-07-31 Thread Michael Papenfus
I think I need to clarify a little further on my original question.

I have the following two rows of data:
mydat-data.frame(d1=c(3,5),d2=c(6,10),p1=c(.55,.05),p2=c(.85,.35))
 mydat
  d1 d2 p1 p2
1 3 6 0.55 0.85
2 5 10 0.05 0.35

I need to optimize the following function using  optim for each row in mydat
fr-function(x) {
u-x[1]
v-x[2]
sqrt(sum((plnorm(c(d1,d2,u,v)-c(p1,p2))^2))
}
x0-c(1,1)# starting values for two unknown parameters
y-optim(x0,fr)

In my defined function fr, (d1 d2 p1 p2) are known values which I need 
to read in from my dataframe and u  v are the TWO unknown parameters.  
I want to solve this equation for each row of my dataframe.

I can get this to work when I manually plug in the known values (d1 d2 
p1 p2).  However, I would like to apply this to each row in my dataframe 
where the known values are automatically passed to my function which 
then is sent to optim which solves for the two unknown parameters for 
each row in the dataframe.

thanks again,
mike

-- 

[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sweave error in example code

2006-07-31 Thread LL
Hi.. I am running R version 2.3.1 on a Windows XP machine with the latest 
Miktex 2.5 installed. I get no errors from R when running the Sweave example, 

testfile - system.file(Sweave, Sweave-test-1.Rnw, package = utils)

However, when I tex the resulting .tex file (after installing a4.sty) I get the 
error below. 

This is pdfeTeX, Version 3.141592-1.30.6-2.2 (MiKTeX 2.5 RC 1)
entering extended mode
(Sweave-test-1.tex
LaTeX2e 2005/12/01
Babel v3.8g and hyphenation patterns for english, dumylang, nohyphenation, ge
rman, ngerman, french, loaded.
(C:\Program Files\MiKTeX 2.5\tex\latex\base\article.cls
Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
(C:\Program Files\MiKTeX 2.5\tex\latex\base\size10.clo))
(C:\Program Files\MiKTeX 2.5\tex\latex\ltxmisc\a4wide.sty
(C:\Program Files\MiKTeX 2.5\tex\latex\ntgclass\a4.sty))
! Missing \endcsname inserted.
to be read again 
   \protect 
l.11 \begin
   {document}
? 
[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Effects Model with Interacting Covariates

2006-07-31 Thread Douglas Bates
On 7/31/06, Dov Stekel [EMAIL PROTECTED] wrote:
 Hi

 I have been asked by a colleague to perform a statistical analysis
 which uses random effects - but I am struggling to get this to work
 with nlme in R. Help would be very much appreciated!

 Essentially, the data consists of:

 10 patients. Each patient has been given three different treatments (on
 three separate days). 15 measurements (continuous variable) have been
 taken from each patient both before and after each of the treatments.
 So the data looks like:

 Patient WhenTreat   Measurement
 a   before  A   10.3
 a   before  A   11.2
 ...
 a   after   A   12.4
 ...
 a   before  B   11.6
 ...
 a   after   B   ...

 and the same for treatment C, patients, b,c,d, etc.

 My colleague would like to test to see if the treatments are different
 from each other. i.e., is the change (before to after) due to the
 treatments different between the treatments. It would seem to me like a
 random effects model in which we are interested in the significance of
 the interaction terms Treat:When, with repeated measures in the
 patients (who are random effects, but crossed with the covariates).
 Unfortunately, the groupedData formula only lets me put a single
 covariate on the LHS - nothing as complicated as this!

I'm not sure I understand what the LHS of a formula for a groupedData
object has to do with your question.

You will need to specify the model that you wish to fit by lme and,
for that, you will need to decide which terms should be fixed effects
and which random effects.  Do you think that the patients contribute
only an additive shift in the response or do you think that the
patients may have different initial values and different levels of
change in the Before/After responses?

It seems that you could begin by fitting

fm1 - lme(Measurement ~ When*Treat, random = ~ 1 | Patient, data = ...)

and

fm2 - lme(Measurement ~ When*Treat, random = ~ 1|Patient/When, data = ...)

There are many other variations that you could consider but we can
only guess at because we don't know enough of the context of the data.
 For example, it is possible that it would be appropriate to eliminate
a main effect for Treat because the Treatment cannot be expected to
influence the measurement before the Treatment is applied.  The
fixed-effects term would then be specified as

fm3 - lme(Measurement ~ When + When:Treat, random = ...)


 I could, of course, advise her to simply combine all 30 data points for
 each treatment in each patient into a single number (representing
 difference between before and after), but is there a way to use all the
 data in an LME?

 Thanks!


 Dov



 **

 Dr Dov Stekel
 Lecturer in Bioinformatics
 School of Biosciences
 University of Birmingham
 Birmingham B15 2TT
 Tel: +44 121 414 4209
 Email: [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Functions ,Optim, Dataframe

2006-07-31 Thread Tony Plate
Supply your additional arguments to optim() and they will get passed to 
your function:

  mydat-data.frame(d1=c(3,5),d2=c(6,10),p1=c(.55,.05),p2=c(.85,.35))
 
  fr-function(x, d) {
+ # d is a vector of d1, d2, p1  p2
+ u - x[1]
+ v - x[2]
+ d1 - d[1]
+ d2 - d[2]
+ p1 - d[3]
+ p2 - d[4]
+ sqrt(sum((plnorm(c(d1,d2,u,v)-c(p1,p2))^2)))
+ }
  x0 - c(1,1)# starting values for two unknown parameters
  y1 - optim(x0,fr,d=unlist(mydat[1,]))
  y2 - optim(x0,fr,d=unlist(mydat[2,]))
  y1$par
[1] 0.462500 0.828125
  y2$par
[1] -1.0937500  0.2828125
  yall - apply(mydat, 1, function(d) optim(x0,fr,d=d))
  yall[[1]]$par
[1] 0.462500 0.828125
  yall[[2]]$par
[1] -1.0937500  0.2828125
 

One thing you must be careful of is that none of the arguments to your 
function match or partially match the named arguments of optim(), which are:
  names(formals(optim))
[1] par fn  gr  method  lower   upper   control
[8] hessian ...
 

For example, if your function has an argument 'he=', you will not be 
able to pass it, because if you say optim(x0, fr, he=3), the 'he' will 
match the 'hessian=' argument of optim(), and it will not be interpreted 
as being a '...' argument.

-- Tony Plate

Michael Papenfus wrote:
 I think I need to clarify a little further on my original question.
 
 I have the following two rows of data:
 mydat-data.frame(d1=c(3,5),d2=c(6,10),p1=c(.55,.05),p2=c(.85,.35))
  mydat
   d1 d2 p1 p2
 1 3 6 0.55 0.85
 2 5 10 0.05 0.35
 
 I need to optimize the following function using  optim for each row in mydat
 fr-function(x) {
 u-x[1]
 v-x[2]
 sqrt(sum((plnorm(c(d1,d2,u,v)-c(p1,p2))^2))
 }
 x0-c(1,1)# starting values for two unknown parameters
 y-optim(x0,fr)
 
 In my defined function fr, (d1 d2 p1 p2) are known values which I need 
 to read in from my dataframe and u  v are the TWO unknown parameters.  
 I want to solve this equation for each row of my dataframe.
 
 I can get this to work when I manually plug in the known values (d1 d2 
 p1 p2).  However, I would like to apply this to each row in my dataframe 
 where the known values are automatically passed to my function which 
 then is sent to optim which solves for the two unknown parameters for 
 each row in the dataframe.
 
 thanks again,
 mike


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave error in example code

2006-07-31 Thread Sundar Dorai-Raj


LL wrote:
 Hi.. I am running R version 2.3.1 on a Windows XP machine with the latest 
 Miktex 2.5 installed. I get no errors from R when running the Sweave example, 
 
 testfile - system.file(Sweave, Sweave-test-1.Rnw, package = utils)
 
 However, when I tex the resulting .tex file (after installing a4.sty) I get 
 the error below. 
 
 This is pdfeTeX, Version 3.141592-1.30.6-2.2 (MiKTeX 2.5 RC 1)
 entering extended mode
 (Sweave-test-1.tex
 LaTeX2e 2005/12/01
 Babel v3.8g and hyphenation patterns for english, dumylang, nohyphenation, 
 ge
 rman, ngerman, french, loaded.
 (C:\Program Files\MiKTeX 2.5\tex\latex\base\article.cls
 Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
 (C:\Program Files\MiKTeX 2.5\tex\latex\base\size10.clo))
 (C:\Program Files\MiKTeX 2.5\tex\latex\ltxmisc\a4wide.sty
 (C:\Program Files\MiKTeX 2.5\tex\latex\ntgclass\a4.sty))
 ! Missing \endcsname inserted.
 to be read again 
\protect 
 l.11 \begin
{document}
 ? 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


This works for me. However, when I ran this, MiKTeX prompted me to 
install the ntgclass package, which I did. Everything ran smoothly after 
that. I'm using R-2.3.1 with MiKTeX 2.4 in WinXP Pro.

HTH,

--sundar

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Effects Model with Interacting Covariates

2006-07-31 Thread Dov Stekel
Douglas

That's very helpful! It's just a syntax error in my use of lme (I find 
the documentation hard to figure!). I'm actually also using the formula

lme(Measurement~Treatment/When etc)

as this gives the right contrasts to look at the interactions between 
each of the treatments and before/after. I'm still working on a model 
formula that will give me a single p-value for 'is the difference 
between before and after different for different treatments'.

And this all feels much happier than not using a random effects model 
and simply using patient as a blocking variable (i.e. Measurement ~ 
Treat/When + Patient) which seems unsatisfactory for independence 
reasons. (I'm not really a statistician - just the most stats-savvy 
person in my department!)

Thanks,

Dov


On 31 Jul 2006, at 18:38, Douglas Bates wrote:

 On 7/31/06, Dov Stekel [EMAIL PROTECTED] wrote:
 Hi

 I have been asked by a colleague to perform a statistical analysis
 which uses random effects - but I am struggling to get this to work
 with nlme in R. Help would be very much appreciated!

 Essentially, the data consists of:

 10 patients. Each patient has been given three different treatments 
 (on
 three separate days). 15 measurements (continuous variable) have been
 taken from each patient both before and after each of the treatments.
 So the data looks like:

 Patient WhenTreat   Measurement
 a   before  A   10.3
 a   before  A   11.2
 ...
 a   after   A   12.4
 ...
 a   before  B   11.6
 ...
 a   after   B   ...

 and the same for treatment C, patients, b,c,d, etc.

 My colleague would like to test to see if the treatments are different
 from each other. i.e., is the change (before to after) due to the
 treatments different between the treatments. It would seem to me like 
 a
 random effects model in which we are interested in the significance of
 the interaction terms Treat:When, with repeated measures in the
 patients (who are random effects, but crossed with the covariates).
 Unfortunately, the groupedData formula only lets me put a single
 covariate on the LHS - nothing as complicated as this!

 I'm not sure I understand what the LHS of a formula for a groupedData
 object has to do with your question.

 You will need to specify the model that you wish to fit by lme and,
 for that, you will need to decide which terms should be fixed effects
 and which random effects.  Do you think that the patients contribute
 only an additive shift in the response or do you think that the
 patients may have different initial values and different levels of
 change in the Before/After responses?

 It seems that you could begin by fitting

 fm1 - lme(Measurement ~ When*Treat, random = ~ 1 | Patient, data = 
 ...)

 and

 fm2 - lme(Measurement ~ When*Treat, random = ~ 1|Patient/When, data = 
 ...)

 There are many other variations that you could consider but we can
 only guess at because we don't know enough of the context of the data.
 For example, it is possible that it would be appropriate to eliminate
 a main effect for Treat because the Treatment cannot be expected to
 influence the measurement before the Treatment is applied.  The
 fixed-effects term would then be specified as

 fm3 - lme(Measurement ~ When + When:Treat, random = ...)


 I could, of course, advise her to simply combine all 30 data points 
 for
 each treatment in each patient into a single number (representing
 difference between before and after), but is there a way to use all 
 the
 data in an LME?

 Thanks!


 Dov



 **

 Dr Dov Stekel
 Lecturer in Bioinformatics
 School of Biosciences
 University of Birmingham
 Birmingham B15 2TT
 Tel: +44 121 414 4209
 Email: [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



**

Dr Dov Stekel
Lecturer in Bioinformatics
School of Biosciences
University of Birmingham
Birmingham B15 2TT
Tel: +44 121 414 4209
Email: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Random Effects Model with Interacting Covariates

2006-07-31 Thread Dov Stekel

Ooh,

 lme(Measurement~Treatment/When etc)


and

  lm(Measurement ~ Treat/When + Patient)

give exactly the same results! How interesting!


Dov


 which seems unsatisfactory for independence
 reasons. (I'm not really a statistician - just the most stats-savvy
 person in my department!)

 Thanks,

 Dov


 On 31 Jul 2006, at 18:38, Douglas Bates wrote:

 On 7/31/06, Dov Stekel [EMAIL PROTECTED] wrote:
 Hi

 I have been asked by a colleague to perform a statistical analysis
 which uses random effects - but I am struggling to get this to work
 with nlme in R. Help would be very much appreciated!

 Essentially, the data consists of:

 10 patients. Each patient has been given three different treatments
 (on
 three separate days). 15 measurements (continuous variable) have been
 taken from each patient both before and after each of the treatments.
 So the data looks like:

 Patient WhenTreat   Measurement
 a   before  A   10.3
 a   before  A   11.2
 ...
 a   after   A   12.4
 ...
 a   before  B   11.6
 ...
 a   after   B   ...

 and the same for treatment C, patients, b,c,d, etc.

 My colleague would like to test to see if the treatments are 
 different
 from each other. i.e., is the change (before to after) due to the
 treatments different between the treatments. It would seem to me like
 a
 random effects model in which we are interested in the significance 
 of
 the interaction terms Treat:When, with repeated measures in the
 patients (who are random effects, but crossed with the covariates).
 Unfortunately, the groupedData formula only lets me put a single
 covariate on the LHS - nothing as complicated as this!

 I'm not sure I understand what the LHS of a formula for a groupedData
 object has to do with your question.

 You will need to specify the model that you wish to fit by lme and,
 for that, you will need to decide which terms should be fixed effects
 and which random effects.  Do you think that the patients contribute
 only an additive shift in the response or do you think that the
 patients may have different initial values and different levels of
 change in the Before/After responses?

 It seems that you could begin by fitting

 fm1 - lme(Measurement ~ When*Treat, random = ~ 1 | Patient, data =
 ...)

 and

 fm2 - lme(Measurement ~ When*Treat, random = ~ 1|Patient/When, data =
 ...)

 There are many other variations that you could consider but we can
 only guess at because we don't know enough of the context of the data.
 For example, it is possible that it would be appropriate to eliminate
 a main effect for Treat because the Treatment cannot be expected to
 influence the measurement before the Treatment is applied.  The
 fixed-effects term would then be specified as

 fm3 - lme(Measurement ~ When + When:Treat, random = ...)


 I could, of course, advise her to simply combine all 30 data points
 for
 each treatment in each patient into a single number (representing
 difference between before and after), but is there a way to use all
 the
 data in an LME?

 Thanks!


 Dov



 **

 Dr Dov Stekel
 Lecturer in Bioinformatics
 School of Biosciences
 University of Birmingham
 Birmingham B15 2TT
 Tel: +44 121 414 4209
 Email: [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 **

 Dr Dov Stekel
 Lecturer in Bioinformatics
 School of Biosciences
 University of Birmingham
 Birmingham B15 2TT
 Tel: +44 121 414 4209
 Email: [EMAIL PROTECTED]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


**

Dr Dov Stekel
Lecturer in Bioinformatics
School of Biosciences
University of Birmingham
Birmingham B15 2TT
Tel: +44 121 414 4209
Email: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory problems when combining randomForests

2006-07-31 Thread Weiwei Shi
Found it from another paper:
importance sample learning ensemble (ISLE)
which originates from Friedman and Popescu (2003).


On 7/31/06, Weiwei Shi [EMAIL PROTECTED] wrote:

 Hi, Andy:

 What's the Jerry Friedman's ISLE? I googled it and did not find the paper
 on it. Could you give me a link, please?

 Thanks,

 Weiwei


 On 7/31/06, Eleni Rapsomaniki [EMAIL PROTECTED] wrote:
 
  Hello
 
  I've just realised attachments are not allowed, so the data for the
  example in
  my previous message is:
 
  pos.df=read.table(http://www.savefile.com/projects3.php?fid=6240314pid=847249key=119090
  ,
  header=T)
 
  neg.df=read.table(http://fs07.savefile.com/download.php?pid=847249fid=9829834key=362779
  ,
  header=T)
 
  And my last two questions (promise!):
  The first is related to the order of columns (ie. explanatory
  variables). I get
  different order of importance for my variables depending on their order
  in the
  training data. Is there a parameter I could fiddle with (e.g. ntree) to
  get a
  more stable importance order?
 
  And finally, since interactions are not implemented, is there another
  method I
  could use in R to find dependencies among categorical variables? (lm
  doesn't
  accept categorical variables).
 
  Many thanks
  Eleni Rapsomaniki
  Birkbeck College, UK
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 



 --
 Weiwei Shi, Ph.D

 Did you always know?
 No, I did not. But I believed...
 ---Matrix III




-- 
Weiwei Shi, Ph.D

Did you always know?
No, I did not. But I believed...
---Matrix III

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Functions ,Optim, Dataframe

2006-07-31 Thread Tony Plate
I added an example of passing additional arguments through optim() to 
the objective and gradient functions to the Discussion section of the 
Wiki-fied R documentation.  See it at 
http://wiki.r-project.org/rwiki/doku.php?id=rdoc:stats:optim

-- Tony Plate

PS.  I had to add purge=true to the end of the URL, i.e., 
http://wiki.r-project.org/rwiki/doku.php?id=rdoc:stats:optimpurge=true 
in order to see the original documentation the first time -- it's 
something to do with bad cache entries for the page.

Michael Papenfus wrote:
 I think I need to clarify a little further on my original question.
 
 I have the following two rows of data:
 mydat-data.frame(d1=c(3,5),d2=c(6,10),p1=c(.55,.05),p2=c(.85,.35))
  mydat
   d1 d2 p1 p2
 1 3 6 0.55 0.85
 2 5 10 0.05 0.35
 
 I need to optimize the following function using  optim for each row in mydat
 fr-function(x) {
 u-x[1]
 v-x[2]
 sqrt(sum((plnorm(c(d1,d2,u,v)-c(p1,p2))^2))
 }
 x0-c(1,1)# starting values for two unknown parameters
 y-optim(x0,fr)
 
 In my defined function fr, (d1 d2 p1 p2) are known values which I need 
 to read in from my dataframe and u  v are the TWO unknown parameters.  
 I want to solve this equation for each row of my dataframe.
 
 I can get this to work when I manually plug in the known values (d1 d2 
 p1 p2).  However, I would like to apply this to each row in my dataframe 
 where the known values are automatically passed to my function which 
 then is sent to optim which solves for the two unknown parameters for 
 each row in the dataframe.
 
 thanks again,
 mike


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read.spss 'error reading system-file header'

2006-07-31 Thread Finn Sandø
When I try to import an spss sav file with read.spss() I am getting the 
following error
'Error in read.spss(X:\\.sav) : error reading system-file header' 
and the import process is aborted.
I have tried in v. 2.3.0 and 2.3.1
The sav-file loads without problems in spss v14 I have tried saving in 
older spss v7 but are getting the same result.
The read.spss() has other errors (the 'Unrecognized record type 7, 
subtype 7 encountered in system file') but it does not seem to have any 
impact.
This leads me to thinking that the spss.read() slowly is growing out of 
date which would be sad.
So question 1:
Does anyone know if these problems are going to be solved? I know the 
read.spss() function is build on the PSPP project so maybe it takes 
someone with c-knowledge to do something about it.
If someone is going to work on the problem I will be happy to help by 
testing and providing problematic test-files.
Question 2
Is there some way to import spss-sav files in this case other than save 
in a non-spss format?
Regards
FS

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] random effects with lmer() and lme(), three random factors

2006-07-31 Thread Xianqun \(Wilson\) Wang
Dr. Bates,

Thanks for the notes! It helps. So now I see consistent resluts from
both lme and lmer. Since I have several response variables to look at, I
will reduce the model separately. 

Speaking of the model reduction, it is clear in this example that the
trivial variance of Operator:Run could be ignored. For a
non-trivial-variance multivariate model, I wonder if there is any
function (like step function for lm and glm) would work with lme or lmer
and allow me to do the AIC-based model selection.   


Wilson
  

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Douglas
Bates
Sent: Saturday, July 29, 2006 7:51 AM
To: Xianqun (Wilson) Wang
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] random effects with lmer() and lme(), three random
factors

On 7/28/06, Xianqun (Wilson) Wang [EMAIL PROTECTED] wrote:
 Hi, all,

 I have a question about random effects model. I am dealing with a 
 three-factor experiment dataset. The response variable y is modeled 
 against three factors: Samples, Operators, and Runs. The experimental 
 design is as follow:

 4 samples were randomly chosen from a large pool of test samples. Each

 of the 4 samples was analyzed by 4 operators, randomly selected from a

 group of operators. Each operator independently analyzed same samples 
 over 5 runs (runs nested in operator). I would like to know the 
 following things:

 (1) the standard deviation within each run;

 (2) the standard deviation between runs;

 (3) the standard deviation within operator

 (4) the standard deviation between operator.

 With this data, I assumed the three factors are all random effects. So

 the model I am looking for is

 Model:  y  = Samples(random) + Operator(random) + Operator:Run(random)

 +
 Error(Operator) + Error(Operator:Run)  + Residuals

 I am using lme function in nlme package. Here is the R code I have

 1.   using lme:

 First I created a grouped data using

 gx - groupedData(y ~ 1 | Sample, data=x)

 gx$dummy - factor(rep(1,nrow(gx)))

 then I run the lme

 fm- lme(y ~ 1, data=gx,
 random=list(dummy=pdBlocked(list(pdIdent(~Sample-1),
 pdIdent(~Operator-1),
 pdIdent(~Operator:Run-1)

 finally, I use VarCorr to extract the variance components

vc - VarCorr(fm)
  Variance   StdDev
 Operator:Run 1.595713e-10:20   1.263215e-05:20
 Sample   5.035235e+00: 4   2.243933e+00: 4
 Operator 5.483145e-04: 4   2.341612e-02: 4
 Residuals8.543601e-02: 1   2.922944e-01: 1

 2.  Using lmer in Matrix package:

 fm - lmer(y ~ (1 | Sample) + (1 | Operator) +
(1|Operator:Run), data=x)

That model specification can now be written as

fm - lmer(y ~ (1|Sample) + (1|Operator/Run), x)

  summary(fm)

 Linear mixed-effects model fit by REML
 Formula: H.I.Index ~ (1 | Sample.Name) + (1 | Operator) + (1 |
 Operator:Run)
   Data: x
   AIC  BIClogLik MLdeviance REMLdeviance
  96.73522 109.0108 -44.36761   90.80064 88.73522
 Random effects:
  Groups   NameVariance   Std.Dev.
  Operator:Run (Intercept) 4.2718e-11 6.5359e-06
  Operator (Intercept) 5.4821e-04 2.3414e-02
  Sample   (Intercept) 5.0352e+00 2.2439e+00
  Residual 8.5436e-02 2.9229e-01
 number of obs: 159, groups: Operator:Run, 20; Operator, 4; 
 Sample.Name,
 4

 Fixed effects:
  Estimate Std. Error  t value
 (Intercept) 0.0020818  1.1222683 0.001855

 There is a difference between lmer and lme is for the factor 
 Operator:Run.

It's just a matter of round-off.  It is possible for the ML or REML
estimates of a variance component to be zero, as is the case here, but
the current computational methods do not allow the value zero because
this will cause some of the matrix decompositions to fail.  In lmer we
use a constrained optimization with the relative variance (variance of a
random effect divided by the residual variance) constrained to be
greater than or equal to 5e-10, which is exactly the value you have
here.

I'll add code to the model fitting routine to warn the user when
convergence to the boundary value occurs.  I haven't done that in the
past because it is not always easy to explain what is occurring.
For a model with variance components only, like yours, convergence on
the boundary means that an estimated variance component is zero.  In the
case of bivariate or multivariate random effects the variance-covariance
matrix can be singular without either of the variances being zero.

The bottom line for you is that the estimated variance for Operator:Run
is zero and you should reduce the model to y ~ (1|Sample)
+ (1|Operator)


I cannot find where the problem is. Could anyone point me
 out if my model specification is correct for the problem I am dealing 
 with. I am pretty new user to lme and lmer. Thanks for your help!





 Wilson Wang






 [[alternative 

Re: [R] Sweave error in example code

2006-07-31 Thread Prof Brian Ripley
On Mon, 31 Jul 2006, Sundar Dorai-Raj wrote:

 
 
 LL wrote:
  Hi.. I am running R version 2.3.1 on a Windows XP machine with the latest 
  Miktex 2.5 installed. I get no errors from R when running the Sweave 
  example, 
  
  testfile - system.file(Sweave, Sweave-test-1.Rnw, package = utils)
  
  However, when I tex the resulting .tex file (after installing a4.sty) I get 
  the error below. 
  
  This is pdfeTeX, Version 3.141592-1.30.6-2.2 (MiKTeX 2.5 RC 1)
  entering extended mode
  (Sweave-test-1.tex
  LaTeX2e 2005/12/01
  Babel v3.8g and hyphenation patterns for english, dumylang, 
  nohyphenation, ge
  rman, ngerman, french, loaded.
  (C:\Program Files\MiKTeX 2.5\tex\latex\base\article.cls
  Document Class: article 2005/09/16 v1.4f Standard LaTeX document class
  (C:\Program Files\MiKTeX 2.5\tex\latex\base\size10.clo))
  (C:\Program Files\MiKTeX 2.5\tex\latex\ltxmisc\a4wide.sty
  (C:\Program Files\MiKTeX 2.5\tex\latex\ntgclass\a4.sty))
  ! Missing \endcsname inserted.
  to be read again 
 \protect 
  l.11 \begin
 {document}
  ? 
  [[alternative HTML version deleted]]
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 
 This works for me. However, when I ran this, MiKTeX prompted me to 
 install the ntgclass package, which I did. Everything ran smoothly after 
 that. I'm using R-2.3.1 with MiKTeX 2.4 in WinXP Pro.

But he is using MiKTeX 2.5: looks like a problem with MiKTeX, as the latex 
error is in the initial processing.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RCurl

2006-07-31 Thread Rajarshi Guha
Hi, does anybody know where I might the RCurl package - the omegahat.org
server seems to be down

Thanks,

---
Rajarshi Guha [EMAIL PROTECTED]
GPG Fingerprint: 0CCA 8EE2 2EEB 25E2 AB04 06F7 1BB9 E634 9B87 56EE
---
Every little picofarad has a nanohenry all its own.
-- Don Vonada

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] resampling mean distances

2006-07-31 Thread Jose Andres
  Hi all,

I am trying to generate a distribution for the mean euclidean  
distance between a group of n elements in a given surface (the  
elements are randomly picked).  Fo doing so I've written the  
following code:

sampling- function(x,size) {

x- x[sample(1:nrow(x),size),]

mat- matrix(c(x$V6,x$V7,x$V8), ncol=3)

mean.dist- mean(dist(mat,euclidean))


}

x is the file where the data are stored
size is the size of the group
mat generates a simple matrix. V6, V7, and V8 are the 3D (x,y,z)  
coordinates of the group elements .
mean.dist  calculates the mean pairwise distance between the objects  
of the group.

Everything works fine but I want  to repeat this many times (e.g.  
1) and  store the mean.dist values in a new variable so I can   
generate the distribution of mean pairwise distances of a group of  
size n in my surface.

Is there any easy way to do this? I'd really appreciate all your  
comments.

Thanks in advance,

/Jose










On Jul 31, 2006, at 15:35, Prof Brian Ripley wrote:

 On Mon, 31 Jul 2006, Sundar Dorai-Raj wrote:



 LL wrote:
 Hi.. I am running R version 2.3.1 on a Windows XP machine with  
 the latest Miktex 2.5 installed. I get no errors from R when  
 running the Sweave example,

 testfile - system.file(Sweave, Sweave-test-1.Rnw, package =  
 utils)

 However, when I tex the resulting .tex file (after installing  
 a4.sty) I get the error below.

 This is pdfeTeX, Version 3.141592-1.30.6-2.2 (MiKTeX 2.5 RC 1)
 entering extended mode
 (Sweave-test-1.tex
 LaTeX2e 2005/12/01
 Babel v3.8g and hyphenation patterns for english, dumylang,  
 nohyphenation, ge
 rman, ngerman, french, loaded.
 (C:\Program Files\MiKTeX 2.5\tex\latex\base\article.cls
 Document Class: article 2005/09/16 v1.4f Standard LaTeX document  
 class
 (C:\Program Files\MiKTeX 2.5\tex\latex\base\size10.clo))
 (C:\Program Files\MiKTeX 2.5\tex\latex\ltxmisc\a4wide.sty
 (C:\Program Files\MiKTeX 2.5\tex\latex\ntgclass\a4.sty))
 ! Missing \endcsname inserted.
 to be read again
\protect
 l.11 \begin
{document}
 ?
 [[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


 This works for me. However, when I ran this, MiKTeX prompted me to
 install the ntgclass package, which I did. Everything ran smoothly  
 after
 that. I'm using R-2.3.1 with MiKTeX 2.4 in WinXP Pro.

 But he is using MiKTeX 2.5: looks like a problem with MiKTeX, as  
 the latex
 error is in the initial processing.

 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] na.rm problem

2006-07-31 Thread sonal
hi,

i am a new member.

i am using R in finding correlation between two variables of unequal length.

when i use 

cor(x,y,na.rm=T,use=complete)

where x has observations from 1928 to 2006  y has observations from 1950 to
2006. I used na.rm=T to use the complete observations.  So missing values
should be handled by casewise deletion. But it gives me error

Error in cov(close, close1, na.rm = T, use = complete) : 
unused argument(s) (na.rm ...)


Please help me with this as I am new to R.

Thanks,
Sonal

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fw: na.rm problem

2006-07-31 Thread sonal
hi,

i am a new member.

i am using R in finding correlation between two variables of unequal length.

when i use

cor(x,y,na.rm=T,use=complete)

where x has observations from 1928 to 2006  y has observations from 1950 to
2006. I used na.rm=T to use the complete observations.  So missing values
should be handled by casewise deletion. But it gives me error

Error in cor(close, close1, na.rm = T, use = complete) : 
unused argument(s) (na.rm ...)

Please help me with this as I am new to R.

Thanks,
Sonal

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RCurl

2006-07-31 Thread Seth Falcon
Rajarshi Guha [EMAIL PROTECTED] writes:

 Hi, does anybody know where I might the RCurl package - the omegahat.org
 server seems to be down

The Bioconductor project hosts a mirror of a subset of Omegahat
packages (RCurl is included).  You can find the listing here:

http://www.bioconductor.org/packages/release/omegahat/

There is a browsable HTML package listing at the above URL which is
also a valid CRAN-style package repository.

So, for example, you should be able to do:

install.packages(RCurl,
 repos=http://www.bioconductor.org/packages/release/omegahat/;,
 dependencies=TRUE)


+ seth

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How does biplot.princomp scale its axes?

2006-07-31 Thread Patrick Connolly
I'm attempting to modify how biplot draws its red vectors (among other
things).  This is how I've started:


Biplot - function(xx, comps = c(1, 2), cex = c(.6, .4))
{
  ## Purpose: Makes a biplot with princomp() object to not show arrows
  ## --
  ## Arguments: xx is an object made using princomp()
  ## --
scores - xx$scores[, paste(Comp, comps, sep = .)]
loadings - xx$loadings[, paste(Comp, comps, sep = .)]
plot(range(scores), range(scores), xlab = , ylab = , xaxt = n,
   yaxt = n, pch =  )
text(scores[,1], scores[,2], rownames(scores), cex = cex[1])
axis(2)
axis(1)
}

I can make part of a biplot using that function with the USArrests data:
Biplot(princomp(USArrests, cor = TRUE), c(1,2), cex = c(.6, .4))

Compare that with what we get using biplot.princomp:
biplot(princomp(USArrests, cor = TRUE), c(1,2), cex = c(.6, .4))

It seems to me that the y-values are the same in both plots, but some
sort of scaling on the x-axis is happening.  Something similar seems
to happen with the loadings as well.  

I notice in the documentation for biplot, mention is made of ... many
variations on biplots.  Would I be doing something inexcusable if I
ignored the differences I've noticed here?

TIA

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___Patrick Connolly   
 {~._.~} Great minds discuss ideas
 _( Y )_Middle minds discuss events 
(:_~*~_:)Small minds discuss people  
 (_)-(_)   . Anon
  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] if function and apply

2006-07-31 Thread John Kane
Runninn R.2.3.1  Windows XP

I have a dataset just imported from SPSS.  It has any
number of 99's as missing data and it looks like the
next dataset will have custom missing codes. I have
abouat 120 variables and an N of 2000.

I think thatI would like to apply a function to the
data.frame (or to a matrix of the data if needed) to
recode all the 99's to NA.  I thought that I could
adapt an example from the list using apply but with
no success.  

Is there a decent source of examples of how to write
an if statement on the web? I'm missing something
simple.
Here is an example of what I have been trying.

##  
cat - c( 3,5,6,8)
dog - c(3,5,3,6)
rat - c (5, NA, 4, 9)
mat - (cbind(cat,dog, rat))
Df - data.frame(cbind(cat, dog, rat)

#  define function
fn - function (x a) {
if (x==a)return  (b) else x
}

apply(mat, c(1,2), fn, 99, NA)
#

Thanks

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] if function and apply

2006-07-31 Thread jim holtman
You test matrix did not have any 99 in it, so I replaced all '3's with NAs

 mat
 cat dog rat
[1,]   3   3   5
[2,]   5   5  NA
[3,]   6   3   4
[4,]   8   6   9
 mat[mat == 3] - NA
 mat
 cat dog rat
[1,]  NA  NA   5
[2,]   5   5  NA
[3,]   6  NA   4
[4,]   8   6   9



You can not do value == NA; you have to use 'is.na(value)'


On 7/31/06, John Kane [EMAIL PROTECTED] wrote:

 Runninn R.2.3.1  Windows XP

 I have a dataset just imported from SPSS.  It has any
 number of 99's as missing data and it looks like the
 next dataset will have custom missing codes. I have
 abouat 120 variables and an N of 2000.

 I think thatI would like to apply a function to the
 data.frame (or to a matrix of the data if needed) to
 recode all the 99's to NA.  I thought that I could
 adapt an example from the list using apply but with
 no success.

 Is there a decent source of examples of how to write
 an if statement on the web? I'm missing something
 simple.
 Here is an example of what I have been trying.

 ##
 cat - c( 3,5,6,8)
 dog - c(3,5,3,6)
 rat - c (5, NA, 4, 9)
 mat - (cbind(cat,dog, rat))
 Df - data.frame(cbind(cat, dog, rat)

 #  define function
 fn - function (x a) {
 if (x==a)return  (b) else x
 }

 apply(mat, c(1,2), fn, 99, NA)
 #

 Thanks

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How does biplot.princomp scale its axes?

2006-07-31 Thread Gabor Grothendieck
Its easiest to just check the source.  biplot is a generic which calls
biplot.princomp which calls biplot.default which in turn calls plot so
try this and examine the source:

stats:::biplot.default


On 7/31/06, Patrick Connolly [EMAIL PROTECTED] wrote:
 I'm attempting to modify how biplot draws its red vectors (among other
 things).  This is how I've started:


 Biplot - function(xx, comps = c(1, 2), cex = c(.6, .4))
 {
  ## Purpose: Makes a biplot with princomp() object to not show arrows
  ## --
  ## Arguments: xx is an object made using princomp()
  ## --
 scores - xx$scores[, paste(Comp, comps, sep = .)]
 loadings - xx$loadings[, paste(Comp, comps, sep = .)]
 plot(range(scores), range(scores), xlab = , ylab = , xaxt = n,
   yaxt = n, pch =  )
 text(scores[,1], scores[,2], rownames(scores), cex = cex[1])
 axis(2)
 axis(1)
 }

 I can make part of a biplot using that function with the USArrests data:
 Biplot(princomp(USArrests, cor = TRUE), c(1,2), cex = c(.6, .4))

 Compare that with what we get using biplot.princomp:
 biplot(princomp(USArrests, cor = TRUE), c(1,2), cex = c(.6, .4))

 It seems to me that the y-values are the same in both plots, but some
 sort of scaling on the x-axis is happening.  Something similar seems
 to happen with the loadings as well.

 I notice in the documentation for biplot, mention is made of ... many
 variations on biplots.  Would I be doing something inexcusable if I
 ignored the differences I've noticed here?

 TIA

 --
 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___Patrick Connolly
  {~._.~} Great minds discuss ideas
  _( Y )_Middle minds discuss events
 (:_~*~_:)Small minds discuss people
  (_)-(_)   . Anon

 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about data used to fit the mixed model

2006-07-31 Thread Douglas Bates
On 7/29/06, Nantachai Kantanantha [EMAIL PROTECTED] wrote:
 Hi everyone,

 I would like to ask a question regarding to the data used to fit the mixed
 model.

 I wonder that, for the response variable data used to fit the mixed model
 (either via spm or lme), we must have several observations per subject
 (i.e. Yij,  i = 1,..,M,  j = 1,.., ni) or it can be just one observation per
 subject (i.e. Yi,  i = 1,...,M). Since we have to specify the groups for
 random effect components, if we have only one observation per subject, then
 each group will have only one observation.

As Harold Doran mentioned in his earlier reply, if you only have one
observation in each group you can't estimate the parameters in a mixed
model because the random effect for a group is completely confounded
with the per-observation noise term for the observation.  The model
would be of the form

X\beta + Z b + \epsilon

for which you would estimate the variance of the components of b and
the variance of the components of \epsilon.  However, with only one
observation per group the number of components in b and in \epsilon
would be the same and, by suitably reordering the observations, the
matrix Z could be made to be an identity matrix.  Thus the model
reduces to

 X\beta + (b + \epsilon)

and the elements of b are confounded with those of \epsilon.

A different version of this question is to ask whether some of the
groups can have only a single observation while others have more that
one observation.  The answer to that is a qualified yes.

An example of data with different numbers of observations per group is
the star data that Harold mentioned.  The student identifier in this
data set is named id.  If we table the number of observations per
student then table that result we get a table of the number of
students with 1, 2, 3 or 4 observations.

 data(star, package = 'mlmRev')
 table(table(star$id))

   1234
4314 2455 1744 3085
 length(unique(star$id))
[1] 11598
 4314/11598
[1] 0.3719607

This shows that more than a third of the students have data from only
a single year.

It is possible to include such students in a mixed model with a random
effect for student.  It is even possible to include such students in a
mixed model with a random intercept and a random slope with respect to
time for student.  However, such students contribute very little
information to the model fit and the estimates (actually
predictors) of the random effects for such students are artificially
small because they are confounded with the per-observation noise term.

So while it can be attractive when designing an experimental or
planning a observational study to have many groups and few
observations per group, such experiments or studies provide very
sparse information.  Using a mixed model on such data doesn't
magically add information to the data.  Mixed models are statistical
models, not magic.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fred Mosteller and the star data from package mlmRev

2006-07-31 Thread Douglas Bates
In writing about the star data from package mlmRev I was reminded of
a comment in the New York Times obituary,  C. Frederick Mosteller, a
Pioneer of Statistics, Dies at 89, that appeared on July 27. In part
it stated

In the 1980's, he was instrumental in persuading Tennessee to conduct
a controlled study on the effect of classroom size. The study showed
convincingly that smaller classes significantly helped children from
poorer minority families.

I believe this is referring to the study that generated the star data
in package mlmRev.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] questions regarding spline functions

2006-07-31 Thread Dylan Beaudette
Greetings,

A couple general questions regarding the use of splines to interpolate depth 
profile data.

Here is an example of a set of depths, with associated attributes for a given 
soil profile, along  with a function for calculating midpoints from a set of 
soil horizon boundaries:

#calculate midpoints:
mid - function(x) {
for( i in 1:length(x)) {
 if( i  1) {
   a[i] = (x[i] - x[i-1]) / 2 + x[i-1] 
  } 
 } 
#reurn the results
a[which(!is.na(a))]
}

#horizon depth bounds
z - c(0,2,18,24,68,160,170,192,200)

#horizon midpoints, associated with horizon attribute
x - mid(z)

#clay pct
y - c(0,1,2,2,4,7,6,1)

#plot them
plot(y ~ x, xlab=Depth, ylab=Percent Clay, type=s)
points(y ~ x, cex=0.5, pch=16)

These point pairs usually represent a trend with depth, which I would like to 
model with splines - or some similar approach, as they have been found to 
work better than other methods such as a fitted polynomial.

Using the B Spline function from the 'splines' package, it is possible to fit 
a model of some property with depth based on the bs() function:

#natual, B-Splines
library(splines)

#fit a b-spline model:
fm - lm(y ~ bs(x, df=5) )

I am able to predict a soil property with depth, at unsampled locations with 
this model with:

new_x -  seq(range(x)[1], range(x)[2], len = 200)

#predict attribute at unsampled depths:
new_y - predict(fm, data.frame(x=new_x) )

#plot the predicted attribute at the unsampled depths
lines(new_x, new_y, col='red')

This tends to work fairly well (see attached), but I am wondering if I can use 
the 'knots' parameter in the bs() function for incorporation of horizon 
boundary information into the creation of the spline. Moreover, it would be 
nice if the spline-based model 'fm' would predict a set of values with 
similar mean and range as the original data points: i.e

#summary of clay values from original data:
summary(y)
   Min. 1st Qu.  MedianMean 3rd Qu.Max. 
  0.000   1.000   2.000   2.875   4.500   7.00

#see above
summary(new_y)
Min.  1st Qu.   Median Mean  3rd Qu. Max. 
-0.05786  2.09500  3.13200  3.62800  5.17100  7.08700


This is based on an article I read :
http://www.sciencedirect.com/science?_ob=ArticleURL_udi=B6V67-3WWRDYY-3_user=4421_handle=V-WA-A-W-AU-MsSAYWW-UUA-U-AACZEDZWBC-AACVCCDUBC-BZAYUEWB-AU-U_fmt=summary_coverDate=08%2F31%2F1999_rdoc=3_orig=browse_srch=%23toc%235807%231999%23999089998%23108393!_cdi=5807view=c_acct=C59598_version=1_urlVersion=0_userid=4421md5=488f1e114d8d64265ff65506e9587e71

where the author talks about a so-called 'equal-area quadratic smoothing 
spline' approach to describing a soil property depth function. Unfortunately 
the author did not provide sample code

Any thoughts / input would be greatly appreciated!

Cheers,

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341


spline1.png
Description: PNG image
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] question about prediction etc. in Ridge regression (MASS library)

2006-07-31 Thread jz7
Dear all,

I am trying to apply Ridge regression to my dataset, and then I would like
to predict the Y responses using the Ridge model (of certain lambda) for
new data point. The only Ridge regression functions I found is in MASS
library. However, there are very few functions available: lm.ridge(),
plot(), and select(). I didn't see any option to predict the Y response.

Does anyone know what else functions I could use to make prediction (using
Ridge model) or how I should write my own code to do the prediction? Also,
is there any way to calculate R^2 (or q^2) or the LOO-CV for Ridge model?

Really appreciate your kind help!

Sincerely,
Jeny

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] standard dev in glmmPQL

2006-07-31 Thread Maria Salomé Esteves Cabral
Hi!
 
Can anyone let me know how can I get the stdDev of the random intercept from 
the output  of glmmPQL?
 
 
Thanks
 
Salomé

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rgb and col2rgb color conversion/modification/shading

2006-07-31 Thread ccarey
I want to get a lighter shade of a color...I have a lot of colored objects and
want each one printed as a foreground against a slightly lighter background.

I thought I could try something like changing the alpha channel by first
converting it to rgb. 

But prior to trying that, I'm stuck with how to get the color after converting
using col2rgb() to be interpreted again as a color, rather than a simple
vector?

Anyone have any help/ or alternative suggestion...

Thanks, -c
--
TRYING WITH A SINGLE COLOR:

mycol-red

 col2rgb(mycol)
  [,1]
red255
green0
blue 0

 rgb(col2rgb(mycol),maxColorValue=255)

Error in rgb(col2rgb(red)) : argument green is missing, with no default

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fitting models in a loop

2006-07-31 Thread Murray Jorgensen
If I want to display a few polynomial regression fits I can do something 
like

for (i in 1:6) {
mod - lm(y ~ poly(x,i))
print(summary(mod))
}

Suppose that I don't want to over-write the fitted model objects, 
though. How do I create a list of blank fitted model objects for later 
use in a loop?

Murray Jorgensen
-- 
Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]Fax 7 838 4155
Phone  +64 7 838 4773 wkHome +64 7 825 0441Mobile 021 1395 862

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confirmation Request (3355406281)

2006-07-31 Thread CNI-COPYRIGHT administration
This is an automated message from the
  [EMAIL PROTECTED] mailing list manager

Somebody (probably you) have requested the subscribe(digest) operation
  for your r-help@stat.math.ethz.ch address

If you want to confirm this operation,
  use the Reply command in your mailer.

Check that the Subject of the reply message contains
  the confirmation ID: 3355406281,
  the reply is directed to [EMAIL PROTECTED],
  and the 'From' address of your reply is r-help@stat.math.ethz.ch.

If you do not want to confirm the requested operation, simply do nothing

All requests about this mailing list
  should be sent to [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fitting models in a loop

2006-07-31 Thread Bill.Venables
Murray,

Here is a general paradigm I tend to use for such problems.  It extends
to fairly general model sequences, including different responses, c

First a couple of tiny, tricky but useful functions:

subst - function(Command, ...) do.call(substitute, list(Command,
list(...)))

abut - function(...)  ## jam things tightly together
  do.call(paste, c(lapply(list(...), as.character), sep = )) 

Name - function(...) as.name(do.call(abut, list(...)))

Now the gist.

fitCommand - quote({
MODELi - lm(y ~ poly(x, degree = i), theData)
print(summary(MODELi))
})
for(i in 1:6) {
thisCommand - subst(fitCommand, MODELi = Name(model_, i), i =
i)
print(thisCommand)  ## only as a check
eval(thisCommand)
}

At this point you should have the results and

objects(pat = ^model_)

should list the fitted model objects, all of which can be updated,
summarised, plotted, c, because the information on their construction
is all embedded in the call.

Bill.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Murray Jorgensen
Sent: Tuesday, 1 August 2006 2:09 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Fitting models in a loop

If I want to display a few polynomial regression fits I can do something

like

for (i in 1:6) {
mod - lm(y ~ poly(x,i))
print(summary(mod))
}

Suppose that I don't want to over-write the fitted model objects, 
though. How do I create a list of blank fitted model objects for later 
use in a loop?

Murray Jorgensen
-- 
Dr Murray Jorgensen  http://www.stats.waikato.ac.nz/Staff/maj.html
Department of Statistics, University of Waikato, Hamilton, New Zealand
Email: [EMAIL PROTECTED]Fax 7 838 4155
Phone  +64 7 838 4773 wkHome +64 7 825 0441Mobile 021 1395 862

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.