Re: [R] slow computation of mixed ANOVA using aov

2005-03-19 Thread Peter Dalgaard
Steven Lacey [EMAIL PROTECTED] writes:

 Dear R-help list,
  
 I am trying to do a mixed ANOVA on a 8960 x 5 dataframe. I have 3 factors
 for which I want to test all main effects and interactions : f1 (40 levels),
 f2 (7 levels), and f3 (4 levels). I also have a subject factor, subject, and
 a dependent variable, dv. 
  
 Some more information about the factors:
 f2 is a between-subject factor. That is, for each level of f2 there are 8
 nested levels of the subject factor. For example, levels 1-8 of subject are
 nested in level 1 of f2. Levels 9-16 of subject are nested in level 2 of f2.
 In other words, the subjects that participated in any level of f2 are
 different from the subjects that participated in any other level of f2. 
  
 In contrast, f1 and f3 are within-subject factors. That is, for any one of
 the 56 subjects, we have a 160 medians corresponding to each condition from
 a complete crossing of factors f1 and f2. While it is true that we do have
 replicate observations for any subject in each of these conditions, we take
 the median of those values and operate as if there is only a single
 observation for each subject in each of the 160 within-subject conditions. 
  
 Below is code that will generate dataframe d, which is just like the one I
 am working with:
  
 f1-gl(40,1,8960,ordered=T)
 f2-gl(7,1280,8960,ordered=T)
 f3-gl(4,40,8960,ordered=T)
 subject-gl(56,160,8960,ordered=T)
 dv-rnorm(8960,mean=500,sd=50)
 d - data.frame(f1,f2,f3,f4,dv)
  
 To run the mixed ANOVA I use the following call (modeled after J. Baron):
 aov(dv~f1*f2*f3+Error(subject/(f1*f2)),data=d)

[You mean subject/(f1*f3), right? f2 is a coarsening of subject]

 WARNING: Exert caution when running the aov command. I have run the exact
 same command on Windows and Unix machines (each with 1GB of RAM; allocated
 up to 3 or 4GB of memory for the analysis ) and it has taken many, many
 hours to finish. That said, this is not a new problem posted on the R-help
 list. There are several posts where analysts have run into similar problems.
 My general impression of these posts, and correct me if I am wrong, is that
 because aov is a wrapper around lm, the extra time is required to build and
 manipulate a design matrix (via qr decomposition) that is 8960 x several
 thousand columns large! Is that so? It seems fitting because if I call aov
 with only a single factor, then it returns in a few seconds. 

Yes, this is basically correct. The main issue is the calculation of
the projection onto the terms in the Error model, which is done using
the generic lm algorithm even though the design is typicaly orthogonal
so that there are much faster ways to get to the result. 

To paraphrase: if you have double determinations and an error term at
the level of each pair, the algorithm fits a model with N/2 parameters
in order to calculate the mean and difference within each pair. For
large designs, this becomes an issue...

This is in some sense a tradeoff of generality for speed, but the
results for non-orthogonal designs are typically very hard to
interpret.

The topic does come up off and on, and we have been considering the
option of adding an algorithm where it is assumed that the Error model
consists of mutually orthogonal and balanced terms (in the sense that
all factor levels appear equally frequently). Just a simple matter of
programming... 

For the near term, there are a couple of things that you can do:

- avoid having an error term that is equivalent to the full data set.
  In your case, subject:f1:f3 is such a term, and subject/(f1+f3) is
  actually equivalent (the second order interaction term becomes the
  residual stratum). This at least saves you from inverting an NxN
  matrix. 

- use a version of R compiled with a fast BLAS, on a fast computer
  with a lot of RAM... (A ~2K square matrix inversion will take a
  while, but hours sounds a bit excessive). 

- (not too sure of this, but R 2.1.0 will have some new code for
  multivariate lm, with intra-individual designs, and
  tests under the sphericity assumptions; it is possible that
  your models can be reformulated in that framework. You'd have to
  set it up as a model with 160 responses on 56 subjects, though, and
  the code wasn't really designed with that sort of intrasubject
  dimensionality in mind.)
  
 In order to test if the computational slowness was something unique to R, I
 ran the same analysis, including all 3 factors, in SPSS. To my surprise SPSS
 returned almost instantaneously. I do not know much about the algorithm in
 SPSS, but I suspect it may be calculating condition means and sums of
 squares rather than generating a design matrix. Does that sound plausible?
  
 At this point I am a dedicated R user. However, I do the kind of analysis
 described above quite often. It is important that my statistical package be
 able to handle it more efficiently than what I have been able to get R to do
 at this point. Am I doing anything obviously wrong? Is there a 

[R] How I calculate nCr with R ? (Como calculo nCr con R? )

2005-03-19 Thread Mario Morales
En español  (In Spanish)
Necesito calcular la en numeros de combinaciones de n cosas
tomando k al tiempo.
Como hago eso en R ???
Yo escribí mi propia función pero pienso que  de esa forma no es
fácil para mis estudiantes .
He estado buscando en la ayuda y no he encontrado información
sobre una función que calcule eso directamente. Podrían ayudarme
In English (en Inglés )
I need calculate the combination of n things taking k at time.
How do I do that in R ?
I wrote my own function but in this form isn't easy for my
students.
Can you help me ?
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] slow computation of mixed ANOVA using aov

2005-03-19 Thread Peter Dalgaard
Peter Dalgaard [EMAIL PROTECTED] writes:

 - use a version of R compiled with a fast BLAS, on a fast computer
   with a lot of RAM... (A ~2K square matrix inversion will take a
   while, but hours sounds a bit excessive). 

To wit:

  system.time(aov(dv~f1*f2*f3+Error(subject/(f1+f3)),data=d))
[1] 582.46   9.31 619.01   0.00   0.00

i.e. about 10 minutes on an Opteron 240 (dual, but it only seemed to
use one cpu for this task) with 1GB.


-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How I calculate nCr with R ? (Como calculo nCr con R? )

2005-03-19 Thread Uwe Ligges
Mario Morales wrote:
En español  (In Spanish)
Necesito calcular la en numeros de combinaciones de n cosas
tomando k al tiempo.
Como hago eso en R ???
Yo escribí mi propia función pero pienso que  de esa forma no es
fácil para mis estudiantes .
He estado buscando en la ayuda y no he encontrado información
sobre una función que calcule eso directamente. Podrían ayudarme
In English (en Inglés )
I need calculate the combination of n things taking k at time.
Just the number of combinations: ?choose
For listing all of them (you could have found those packages yourself, 
BTW), e.g:
- combinations() in package gtools in bundle gregmisc
- combn() in package combinat

Uwe Ligges

How do I do that in R ?
I wrote my own function but in this form isn't easy for my
students.
Can you help me ?
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How I calculate nCr with R ? (Como calculo nCr con R? )

2005-03-19 Thread Jean Eid
do you mean n choose k which is a built in function  see
?choose




On Sat, 19 Mar 2005, Mario Morales wrote:

 En español  (In Spanish)

 Necesito calcular la en numeros de combinaciones de n cosas
 tomando k al tiempo.

 Como hago eso en R ???

 Yo escribí mi propia función pero pienso que  de esa forma no es
 fácil para mis estudiantes .

 He estado buscando en la ayuda y no he encontrado información
 sobre una función que calcule eso directamente. Podrían ayudarme


 In English (en Inglés )

 I need calculate the combination of n things taking k at time.


 How do I do that in R ?

 I wrote my own function but in this form isn't easy for my
 students.

 Can you help me ?

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] the number of cluster

2005-03-19 Thread Uwe Ligges
XP Sun wrote:

 hi, all,
 
   how to decide the number of cluster before you use kmeans and hclust? 
   thank you in advance!

Depends on your criterion. Best idea is always to use the brain and
think about how many clusters are sensible for the particular
task/problem/data.
For hclust, you can also look at the dendrogram's height and distances
of dissimilarities in order to cut.

Uwe Ligges


 best
 -xpsun
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How to use a R package with C code

2005-03-19 Thread Sameul M Mwalili
 C function name not in load table means that youe DLL/SO file is not loaded. 
You should use 
 .First.lib - function(lib, pkg)
{
library.dynam(your_pkg_name, pkg, lib)
}

Regads,
Samuel.

Duncan Murdoch [EMAIL PROTECTED] wrote:
On Wed, 9 Mar 2005 18:54:36 -0500, [EMAIL PROTECTED] wrote :


Hello, everybody,

I created a R package which includes C code. But I load this package, and carry
out the R function in it. It shows C function is not in load table as follows.
Would you tell me what is the problem? Where do I make mistake?

You aren't giving enough information for anyone to know that. You
need to tell us exactly what you did to create your package, and what
operating system you're on.

Duncan Murdoch

Maggie



[Previously saved workspace restored]

 library(var)

Attaching package 'var':


 The following object(s) are masked _by_ .GlobalEnv :

 b

 wxt0124()
Error in .C(wxt1221) : C function name not in load table

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


-


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] How I calculate nCr with R ? (Como calculo nCr con R? )

2005-03-19 Thread Joseph Beyene
Try 

choose(k,r)

 choose(10,2)
[1] 45


JB



 -Original Message-
 From: Mario Morales [mailto:[EMAIL PROTECTED]
 Sent: March 19, 2005 5:37 AM
 To: r-help@stat.math.ethz.ch
 Subject: [R] How I calculate nCr with R ? (Como calculo nCr con R? )
 
 En español  (In Spanish)
 
 Necesito calcular la en numeros de combinaciones de n cosas
 tomando k al tiempo.
 
 Como hago eso en R ???
 
 Yo escribí mi propia función pero pienso que  de esa forma no es
 fácil para mis estudiantes .
 
 He estado buscando en la ayuda y no he encontrado información
 sobre una función que calcule eso directamente. Podrían ayudarme
 
 
 In English (en Inglés )
 
 I need calculate the combination of n things taking k at time.
 
 
 How do I do that in R ?
 
 I wrote my own function but in this form isn't easy for my
 students.
 
 Can you help me ?


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] problem with legend

2005-03-19 Thread Bernardo Rangel Tura
I have problem with legend command. Please look this script:
dcbv.fm
Time Series:
Start = 1980
End = 2002
Frequency = 1
 [1] 2994.023 3388.414 3111.762 2990.967 3077.438 3058.274 3049.934 2974.130
 [9] 2889.659 2801.790 2631.391 2661.700 2312.526 2518.968 2567.044 2443.952
[17] 2117.638 2042.461 2025.816 1939.560 1640.775 1583.609 1659.912
 dcbv.ms
Time Series:
Start = 1980
End = 2002
Frequency = 1
 [1] 3700.239 4076.438 3856.495 3680.345 3871.887 3789.770 3831.173 3768.876
 [9] 3585.572 3754.374 3372.859 3419.667 3185.194 3319.215 3445.845 3265.214
[17] 2773.961 2661.904 2669.835 2569.190 2187.719 2217.756 2196.378
plot(dcbv.ms,ylim=c(min(dcbv.fm),max(dcbv.ms)))
lines(dcbv.fm,col=2)
legend(1984,2500,c(DCVB-MS,DCBV-FM),col=c(1,2),cex=.6,fill=T)
At end of script the legend of plot have only one color: black. I think the 
legend will have two colors: black and red.

Where I make mistake?
version
 _
platform i386-pc-mingw32
arch i386
os   mingw32
system   i386, mingw32
status
major2
minor0.1
year 2004
month11
day  15
language R
Thanks in advance
Bernardo Rangel Tura, MD, MSc
National Institute of Cardiology Laranjeiras
Rio de Janeiro Brazil 

--
No virus found in this outgoing message.
Checked by AVG Anti-Virus.
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How I calculate nCr with R ? (Como calculo nCr con R? )

2005-03-19 Thread Michael Dewey
At 10:37 19/03/05, Mario Morales wrote:
En español  (In Spanish)
Necesito calcular la en numeros de combinaciones de n cosas
tomando k al tiempo.
In English we usually read this as N choose r and with that clue you might 
go ?choose

Incidentally your English is fine although I see the logic in giving us both.
Como hago eso en R ???
Yo escribí mi propia función pero pienso que  de esa forma no es
fácil para mis estudiantes .
He estado buscando en la ayuda y no he encontrado información
sobre una función que calcule eso directamente. Podrían ayudarme
In English (en Inglés )
I need calculate the combination of n things taking k at time.
How do I do that in R ?
I wrote my own function but in this form isn't easy for my
students.
Can you help me ?

Michael Dewey
[EMAIL PROTECTED]
http://www.aghmed.fsnet.co.uk/home.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] simple problem, but not for me

2005-03-19 Thread alexbri
Hello, I'm new in R and I want to do one thing that is very easy in excel, 
however, I cant do it in R.

Suppose we have the data frame:

 

data- data.frame(A=c(a1,a2,a3,a4,a5))

 

I need to obtain another column in the same data frame (lets say 
B=c(b1,b2,b3,b4,b5) in the following way:

 

b1=a1/(a1+a2+a3+a4+a5)

b2=a2/(a2+a3+a4+a5)

b3=a3/(a3+a4+a5)

b4=a4/(a4+a5)

b5=a5/a5

 

a1..a5 and b1...b5 are always numeric values

 

(this is just an example, what I really want is apply this kind of formula to a 
much larger data frame)

 

I think this is easy for those who are used to work in R (and I suppose there 
are many different ways), but I can make it in this moment, because I cannot 
leave behind, the excel thinking way.

I hope you understand my problem and please, help me soon. 

 

Alexandre

 

[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Control of vertical spacing in Lattice key text?

2005-03-19 Thread Deepayan Sarkar
On Tuesday 15 March 2005 17:18, Patrick Connolly wrote:
 I find the key and legend functions in Lattice very useful.  Trouble
 is, now I can see what else I'd like to be able to do with them.

 If I put a title on a key, it appears too close to the key itself,
 and if there's a line break in the title (which often happens), the
 leading between the lines is too much.

 What I can do is print to a postscript file, then find the lines in
 the postscript file where my text appears and fiddle with the
 postscript code to move the text up or down as I wish.  For single
 plots that's OK, but I'd prefer to be able to do it within R
 especially when I wish to do dozens of them.

 Is there something in the documentation I overlooked?

Nope. The space used for the title is currently hard-coded (to be 1.2 
times the height of the title string). It should be easy enough to add 
a new option in draw.key, provided I can think up a good name 
(suggestions welcome).

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] simple problem, but not for me

2005-03-19 Thread Uwe Ligges
alexbri wrote:
Hello, I'm new in R and I want to do one thing that is very easy in excel, 
however, I cant do it in R.
Suppose we have the data frame:
 

data- data.frame(A=c(a1,a2,a3,a4,a5))
 

I need to obtain another column in the same data frame (lets say 
B=c(b1,b2,b3,b4,b5) in the following way:
 

b1=a1/(a1+a2+a3+a4+a5)
b2=a2/(a2+a3+a4+a5)
b3=a3/(a3+a4+a5)
b4=a4/(a4+a5)
b5=a5/a5

  temp - rev(data$A)
  data$B - rev(temp / cumsum(temp))
Uwe Ligges


 

a1..a5 and b1...b5 are always numeric values
 

(this is just an example, what I really want is apply this kind of formula to a 
much larger data frame)
 

I think this is easy for those who are used to work in R (and I suppose there are many 
different ways), but I can make it in this moment, because I cannot leave behind, the 
excel thinking way.
I hope you understand my problem and please, help me soon. 

 

Alexandre
 

[EMAIL PROTECTED]
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] problem with legend

2005-03-19 Thread Uwe Ligges
Bernardo Rangel Tura wrote:
I have problem with legend command. Please look this script:
 dcbv.fm
Time Series:
Start = 1980
End = 2002
Frequency = 1
 [1] 2994.023 3388.414 3111.762 2990.967 3077.438 3058.274 3049.934 
2974.130
 [9] 2889.659 2801.790 2631.391 2661.700 2312.526 2518.968 2567.044 
2443.952
[17] 2117.638 2042.461 2025.816 1939.560 1640.775 1583.609 1659.912

  dcbv.ms
Time Series:
Start = 1980
End = 2002
Frequency = 1
 [1] 3700.239 4076.438 3856.495 3680.345 3871.887 3789.770 3831.173 
3768.876
 [9] 3585.572 3754.374 3372.859 3419.667 3185.194 3319.215 3445.845 
3265.214
[17] 2773.961 2661.904 2669.835 2569.190 2187.719 2217.756 2196.378

 plot(dcbv.ms,ylim=c(min(dcbv.fm),max(dcbv.ms)))
 lines(dcbv.fm,col=2)
 legend(1984,2500,c(DCVB-MS,DCBV-FM),col=c(1,2),cex=.6,fill=T)

So you want filles boxes? Then you should specify the color in the fill 
argument:

 legend(1984, 2500, c(DCVB-MS, DCBV-FM), cex=.6, fill=1:2)
or do you want some lines?
 legend(1984, 2500, c(DCVB-MS, DCBV-FM), cex=.6, col=1:2, lwd=2)
Uwe Ligges

At end of script the legend of plot have only one color: black. I think 
the legend will have two colors: black and red.

Where I make mistake?
version
 _
platform i386-pc-mingw32
arch i386
os   mingw32
system   i386, mingw32
status
major2
minor0.1
year 2004
month11
day  15
language R
Thanks in advance
Bernardo Rangel Tura, MD, MSc
National Institute of Cardiology Laranjeiras
Rio de Janeiro Brazil
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] simple problem, but not for me

2005-03-19 Thread Rolf Turner

[EMAIL PROTECTED] wrote:

 Hello, I'm new in R and I want to do one thing that is very easy in
 excel, however, I cant do it in R.

Well, if you've deadened your brain by using Excel,
no wonder.

 Suppose we have the data frame:
 
 data- data.frame(A=c(a1,a2,a3,a4,a5))

Oh, for Pete's sake!  This makes ``data'' (NOT a good name
for an object!) into a data frame with a single column named
``A''.  That column will be a factor with 5 entries (an 5
levels) with these levels being (the character strings)
a1,a2,a3,a4, and a5.

Nothing to do with what you actually want.
 
 I need to obtain another column in the same data frame (lets say
 B=c(b1,b2,b3,b4,b5) in the following way:

This would make B a ***vector*** equal to the concatenation
of b1, ..., b5.

Perhaps you mean:

B - cbind(b1,b2,b3,b4,b5)

 b1=a1/(a1+a2+a3+a4+a5)
 
 b2=a2/(a2+a3+a4+a5)
 
 b3=a3/(a3+a4+a5)
 
 b4=a4/(a4+a5)
 
 b5=a5/a5
 
 a1..a5 and b1...b5 are always numeric values
 
 (this is just an example, what I really want is apply this kind of
 formula to a much larger data frame)
 
You are very confused.  Your notation and your use of
the function c() are all wrong.

If you are going to use R, get the basic syntax straight.

You probably should be using matrices rather than data frames
given that the entries are all numeric.

Be that as it may, if A is a (numeric) matrix then

B - A/t(apply(A,1,function(x){rev(cumsum(x))}))

will give what you appear to want.

cheers,

Rolf Turner
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] slow computation of mixed ANOVA using aov

2005-03-19 Thread Peter Dalgaard
Peter Dalgaard [EMAIL PROTECTED] writes:

 - (not too sure of this, but R 2.1.0 will have some new code for
   multivariate lm, with intra-individual designs, and
   tests under the sphericity assumptions; it is possible that
   your models can be reformulated in that framework. You'd have to
   set it up as a model with 160 responses on 56 subjects, though, and
   the code wasn't really designed with that sort of intrasubject
   dimensionality in mind.)

OK, I've tried this now and it does seem to work, and much faster than
the aov() approach. I had to fix a buglet in the code (eigenvalues
coming up with small imaginary parts due to numerics), so currently
available versions won't quite work, but it should be in the alpha
releases that are supposed to start Monday.

Here's how it works:

### orig setup, edited so as to actually work...
 f1-gl(40,1,8960,ordered=T)
 f2-gl(7,1280,8960,ordered=T)
 f3-gl(4,40,8960,ordered=T)
 subject-gl(56,160,8960,ordered=T)
 dv-rnorm(8960,mean=500,sd=50)
 d - data.frame(f1,f2,f3,subject,dv)

### Regroup to 56x160 matrix response 
 f2a - f2[seq(1,8801,160)]
 idata - d[1:160,] # intrasubj. design, actually need only f1,f3 from this
 Y - matrix(dv,56,byrow=T)

### now fit models with effect of f2a, constant, and empty
 fit - lm(Y~f2a)
 fit2 - lm(Y~1)
 fit3 - lm(Y~0)

## The main trick is to do anova on within-subject effects. First look
## at the interactions, alias residuals from an additive model ~f1+f3.
## The reduction Model 1- Model 2 corresponds to testing the f1:f2:f3
## interaction in aov (tests whether the f1:f3 interaction depends on
## f2a) and Model 2 - Model 3 is the test for the f1:f3 interaction
## being zero.

## I'm not quite sure what to make of the correction terms when the
## estimated SSD matrix is singular (it has to be, the dimension is
## 117, but the df is only 49). Probably the G-G epsilon is bogus, but
## the H-F one seems to fare rather well.

 anova(fit,fit2,fit3,test=Spherical,X=~f1+f3,idata=idata)
Analysis of Variance Table

Model 1: Y ~ f2a
Model 2: Y ~ 1
Model 3: Y ~ 0

Contrasts orthogonal to
~f1 + f3

Greenhouse-Geisser epsilon: 0.2903
Huynh-Feldt epsilon:0.9639

  Res.Df   Df Gen.var.  F num Df den Df  Pr(F)  G-G Pr  H-F Pr
1 49   0.0
2 556  0.0 0.9036702   5733 0.96034 0.82268 0.95748
3 561  0.0 1.0460117   5733 0.35003 0.39624 0.35206

## Next, we can look at the f1 effects. This is basically the
## difference between two projections, on ~f1+f3 and ~f3
## This gives us the tests for f2:f1 and f1

 anova(fit,fit2,fit3,test=Spherical,M=~f1+f3,X=~f3,idata=idata)
Analysis of Variance Table

Model 1: Y ~ f2a
Model 2: Y ~ 1
Model 3: Y ~ 0

Contrasts orthogonal to
~f3


Contrasts spanned by
~f1 + f3

Greenhouse-Geisser epsilon: 0.5598
Huynh-Feldt epsilon:1.0284

  Res.Df   Df Gen.var.  F num Df den Df Pr(F) G-G Pr H-F Pr
1 49315.74
2 556   344.98 0.9883234   1911   0.54   0.52   0.54
3 561   347.15 0.8393 39   1911   0.75   0.68   0.75

## Same thing with f3

 anova(fit,fit2,fit3,test=Spherical,M=~f1+f3,X=~f1,idata=idata)
Analysis of Variance Table

Model 1: Y ~ f2a
Model 2: Y ~ 1
Model 3: Y ~ 0

Contrasts orthogonal to
~f1


Contrasts spanned by
~f1 + f3

Greenhouse-Geisser epsilon: 0.9482
Huynh-Feldt epsilon:1.0128

  Res.Df  Df Gen.var.  F num Df den Df Pr(F) G-G Pr H-F Pr
1 49   35.171
2 55   6   34.679 0.9010 18147  0.578  0.574  0.578
3 56   1   34.658 1.0039  3147  0.393  0.390  0.393

## Actually, because of the complete design, we can ignore f1 and get
## the same analysis:

 anova(fit,fit2,fit3,test=Spherical,M=~f3,X=~1,idata=idata)
Analysis of Variance Table

Model 1: Y ~ f2a
Model 2: Y ~ 1
Model 3: Y ~ 0

Contrasts orthogonal to
~1


Contrasts spanned by
~f3

Greenhouse-Geisser epsilon: 0.9482
Huynh-Feldt epsilon:1.0128

  Res.Df  Df Gen.var.  F num Df den Df Pr(F) G-G Pr H-F Pr
1 49   35.171
2 55   6   34.679 0.9010 18147  0.578  0.574  0.578
3 56   1   34.658 1.0039  3147  0.393  0.390  0.393

## Finally we get the main effect of f2a as follows. Notice that the
## Model 2 - Model 3 reduction is the test for zero overall mean,
## which you might (and aov does) want to omit.  

 anova(fit,fit2,fit3,test=Spherical,M=~1,X=~0,idata=idata)
Analysis of Variance Table

Model 1: Y ~ f2a
Model 2: Y ~ 1
Model 3: Y ~ 0

Contrasts orthogonal to
~0


Contrasts spanned by
~1

Greenhouse-Geisser epsilon: 1
Huynh-Feldt epsilon:1

  Res.Df Df Gen.var.  F num Df den Df Pr(F) G-G Pr
  H-F Pr
1 49  11
2 55  6   12 1.5010e+00  6 49  1.976e-01  1.976e-01
  1.976e-01
3 56  1   249956 1.2699e+06  1 49 8.346e-110 8.346e-110
  8.346e-110

## Finally aov() results for comparison:

 system.time(aa - aov(dv~f1*f2*f3+Error(subject/(f1+f3)),data=d))
[1] 526.50   9.27 561.29   0.00   0.00
 summary(aa)

Error: 

Re: [R] simple problem, but not for me

2005-03-19 Thread Spencer Graves
The suggestions by Uwe and Rolf use some of the subtler features of 
R.  A simpler (to me) if more tedious approach is provided by the 
following: 

Data - data.frame(a1=1:4, a2=5:8, a3=9:12)
Data$b1 - Data$a1/(Data$a1+Data$a2+Data$a3)
Data$b2 - Data$a2/(Data$a2+Data$a3)
 Data
 a1 a2 a3 b1b2
1  1  5  9 0.0667 0.3571429
2  2  6 10 0. 0.375
3  3  7 11 0.14285714 0.389
4  4  8 12 0.1667 0.400
 The above can be written in fewer characters using with: 

Data$b1 - with(Data, a1/(a1+a2+a3))
Data$b2 - with(Data, a2/(a2+a3))
 If you want something that will work with an arbitrary number of 
columns, consider the following: 

Data - data.frame(a1=1:4, a2=5:8, a3=9:12)
dat2 - Data
k - length(dat2)
for(i in (k-1):1)
 dat2[[i]] - (dat2[[i]]+dat2[[i+1]])
Dat2 - cbind(Data, Data/dat2)
names(Dat2)[k+(1:k)] - paste(b, 1:k, sep=)
 Data
 a1 a2 a3
1  1  5  9
2  2  6 10
3  3  7 11
4  4  8 12
 dat2
 a1 a2 a3
1 15 14  9
2 18 16 10
3 21 18 11
4 24 20 12
 Dat2
 a1 a2 a3 b1b2 b3
1  1  5  9 0.0667 0.3571429  1
2  2  6 10 0. 0.375  1
3  3  7 11 0.14285714 0.389  1
4  4  8 12 0.1667 0.400  1
 If you want to do this more than once, you can wrap the above code 
in a function;  see Writing your own functions in An Introduction to 
R, available, e.g., vial help.start(). 

 hope this helps. 
 spencer graves
p.s.  The posting guide http://www.R-project.org/posting-guide.html; 
also seems quite valuable.  In an discussion on (and off) this list 
earlier this week, several people reported they had found solutions to 
their own problems in the process of preparing a question to post to 
this list.  And even if you don't find a solution, the result will more 
likely elicit useful replies. 

Rolf Turner wrote:
[EMAIL PROTECTED] wrote:
 

Hello, I'm new in R and I want to do one thing that is very easy in
excel, however, I cant do it in R.
   

Well, if you've deadened your brain by using Excel,
no wonder.
 

Suppose we have the data frame:
data- data.frame(A=c(a1,a2,a3,a4,a5))
   

Oh, for Pete's sake!  This makes ``data'' (NOT a good name
for an object!) into a data frame with a single column named
``A''.  That column will be a factor with 5 entries (an 5
levels) with these levels being (the character strings)
a1,a2,a3,a4, and a5.

Nothing to do with what you actually want.
 

I need to obtain another column in the same data frame (lets say
B=c(b1,b2,b3,b4,b5) in the following way:
   

This would make B a ***vector*** equal to the concatenation
of b1, ..., b5.
Perhaps you mean:
B - cbind(b1,b2,b3,b4,b5)
 

b1=a1/(a1+a2+a3+a4+a5)
b2=a2/(a2+a3+a4+a5)
b3=a3/(a3+a4+a5)
b4=a4/(a4+a5)
b5=a5/a5
a1..a5 and b1...b5 are always numeric values
(this is just an example, what I really want is apply this kind of
formula to a much larger data frame)
   

You are very confused.  Your notation and your use of
the function c() are all wrong.
If you are going to use R, get the basic syntax straight.
You probably should be using matrices rather than data frames
given that the entries are all numeric.
Be that as it may, if A is a (numeric) matrix then
B - A/t(apply(A,1,function(x){rev(cumsum(x))}))
will give what you appear to want.
cheers,
Rolf Turner
[EMAIL PROTECTED]
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


RE: [R] Re: Repeated Measures, groupedData and lme

2005-03-19 Thread Ignacio Colonna
Emma,
I am not an expert, but I have been trying to fit similar models.
Adding to Keith's reply to your question, I can suggest what I concluded was
the most reasonable model for my case. Based on Keith's Model1, you might
also want to allow for a correlation among years within each experimental
unit (I am assuming the experiment was conducted at the exact same location
over the 3 years). 
Say you want to impose an autoregressive, order 1 structure (you can
change this to any other structure you may consider appropriate) 
To do this at the block*treatment unit (the smallest experimental
unit size in your experiment), you can add to keith's code:

correlation=corAR1(form=~1|block/treatment)

thus the entire code would be
Model1-lme(mg~treatment + year + treatment:year, random=~1|block,
correlation=corAR1(form=~1|block/treatment),data=magnesium)

This results in a model with a certain covariance among all exp.units within
the same block, plus another covariance between pairs of years within the
same exp.unit, and this covariance decreases as the difference in time
increases.

You can see graphically the structure of this covariance by doing:
rho-0.3
ar1-corAR1(value=rho,form=~1|block/treatment)
ar1-Initialize(ar1,data=yourdata)
m1-corMatrix(ar1)
plot(m1$1/name of first treatment[,1])

Now, I really hope someone more knowledgeable is checking this out there and
will point out whether this is incorrect, as I have used it for my analysis
assuming was the correct approach. 

Ignacio


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Keith Wong
Sent: Friday, March 18, 2005 5:41 PM
To: r-help@stat.math.ethz.ch
Subject: [R] Re: Repeated Measures, groupedData and lme

Hello,

I'm an R-newbie, but I've been learning to use lme for repeated measures
experiments as well.

If I understand correctly: 
  Outcome variable: Mg (Kg/ha)
  Subject/grouping variable: block

  Condition/treatment: treatment (19 levels)
  Repeated factor: time (3 levels: 99, 02, 04)


I think if you specify the model formula in the lme call, then the formula
structure specified in the groupedData object is ignored.

One suggestion for the model:

Model1-lme(mg~treatment + year + treatment:year, random=~1|block,
data=magnesium)

If the question of interest is the treatment:year interaction

Or
Model2 - lme(mg~treatment, random=~1|block, data=magnesium)


Hope this helps ... and hope the experts chime in at this point to give more
guidance.

Keith


--quoting original post---
Hello

I am trying to fit a REML to some soil mineral data which has been
collected over the time period 1999 - 2004. I want to know if the 19
different treatments imposed, differ in terms of their soil mineral
content. A tree model of the data has shown differences between the
treatments can be attributed to the Magnesium, Potassium and organic
matter content of the soil, with Magnesium being the primary separating
variable.

I am looking at soil mineral data were collected : 99, 02, 04. 

In the experiment, there are 19 different treatments (treatmentcontrol,
treatment6TFYM, treatment 12TFYM etc),  which are replicated in 3
blocks.

For the magnesium soil data, I have created the following groupedData
object: 

magnesium-groupedData(Mg~year|treatment, inner=~block) 
Where mg=magnesium Kg/ha

As it is a repeated measures I was going to use an lme.  I have looked
at Pinherio and Bates : Mixed-Effects models in S and S-plus and I am
getting slightly confused.  In order to fit the lme, should I specify
the data to use in the model as the grouped structure model?

If so is the following command correct:

Model1-lme(mg~treatment, random=block|year, data=magnesium)? 

I am slightly worried that it isn't, because in model summary, instead
of listing the 19 different treatments in the fixed effects section, it
writes intercept (as normal), then treatment^1, treatment^2 etc.

However if I don't specify the groupedData object in the model, then in
the fixed effects section, it names the treatments (i.e. intercept,
treatmentcontrol, treatment6TFYM.

Should I be fitting the model using the whole data set rather than the
groupedData object?


Thank you very much for your help


Emma Pilgrim

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] newbie question about beta distribution

2005-03-19 Thread faisal99
hi everyone,
I'm still a newbie in statistics,

I have a question about beta distribution, that is,

On the ref/tutorials I've found on the net, why beta distribution always
have value p(x) more than 1?
As I know, any probability density function always have value not more
than 1?

is there any one who can explain to me, I'm not statistics people, but I
need to code that needing some of this distribution function.

thx before

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] newbie question about beta distribution

2005-03-19 Thread Roger D. Peng
A probability density must integrate to 1.  The specific values of the density 
can be either more or less than 1.

-roger
[EMAIL PROTECTED] wrote:
hi everyone,
I'm still a newbie in statistics,
I have a question about beta distribution, that is,
On the ref/tutorials I've found on the net, why beta distribution always
have value p(x) more than 1?
As I know, any probability density function always have value not more
than 1?
is there any one who can explain to me, I'm not statistics people, but I
need to code that needing some of this distribution function.
thx before
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html