Re: [R] Factor levels

2007-08-28 Thread Peter Alspach
Sébastien

Does the following work for you?

seb <- read.table(file='clipboard', header=T)
seb$c
 [1] w k r s k p l u z s j j x r d j x w q f
Levels: d f j k l p q r s u w x z
seb$c <- factor(seb$c, levels=unique(seb$c))
seb$c
 [1] w k r s k p l u z s j j x r d j x w q f
Levels: w k r s p l u z j x d q f
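
To do this for every factor column without hard-coding anything (your
script will not know the column names in advance), something like the
following (untested) should work:

seb[] <- lapply(seb, function(x)
    if (is.factor(x)) factor(x, levels=unique(as.character(x))) else x)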

Peter Alspach
 

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Sébastien
> Sent: Wednesday, 29 August 2007 9:00 a.m.
> To: Gabor Grothendieck
> Cc: R-help
> Subject: Re: [R] Factor levels
> 
> Ok, I cannot send to you one of my dataset since they are 
> confidential. 
> But I can produce a dummy "mini" dataset to illustrate my question. 
> Let's say I have a csv file with 3 columns and 20 rows which 
> content is reproduced by the following line.
> 
>  > mydata<-data.frame(a=1:20, b=sample(100:200,20,replace=T),
>      c=sample(letters[1:26], 20, replace = T))
>  > mydata
> a   b c
> 1   1 176 w
> 2   2 141 k
> 3   3 172 r
> 4   4 182 s
> 5   5 123 k
> 6   6 153 p
> 7   7 176 l
> 8   8 170 u
> 9   9 140 z
> 10 10 194 s
> 11 11 164 j
> 12 12 100 j
> 13 13 127 x
> 14 14 137 r
> 15 15 198 d
> 16 16 173 j
> 17 17 113 x
> 18 18 144 w
> 19 19 198 q
> 20 20 122 f
> 
> If I had to read the csv file, I would use something like: 
> mydata<-data.frame(read.table(file="c:/test.csv",header=T))
> 
> Now, if you look at mydata$c, the levels are alphabetically ordered.
>  > mydata$c
>  [1] w k r s k p l u z s j j x r d j x w q f
> Levels: d f j k l p q r s u w x z
> 
> What I am trying to do is to reorder the levels as to have 
> them in the order they appear in the table, ie
> Levels: w k r s p l u z j x d q f
> 
> Again, keep in mind that my script should be used on datasets 
> which content are unknown to me. In my example, I have used 
> letters for mydata$c, but my code may have to handle factors 
> of numeric or character values (I need to transform specific 
> columns of my dataset into factors for plotting purposes). My 
> goal is to let the code scan the content of each factor of my 
> data.frame during or after the read.table step and reorder 
> their levels automatically without having to ask the user to 
> hard-code the level order.
> 
> In a way, my problem is more related to the way the factor 
> levels are ordered than to the read.table function, although 
> I guess there is a link...
> 
> Gabor Grothendieck a écrit :
> > Its not clear from your description what you want

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] simple coding question

2007-07-30 Thread Peter Alspach

Kirsten

One way to do this:

kirsten <- c(123, 1234, 12345)
100*as.numeric(paste(substring(kirsten, 1, 3), substring(kirsten, 4, 5),
sep='.'))
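
The same padding can also be done on the text representation with
formatC(); an untested sketch:

kirsten <- c(123, 1234, 12345)
padded <- gsub(" ", "0", formatC(kirsten, width=5, flag="-"))
padded                                   # "12300" "12340" "12345"
sprintf("%.2f", as.numeric(padded)/100)  # "123.00" "123.40" "123.45"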

HTH ....

Peter Alspach
  

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Kirsten Beyer
> Sent: Tuesday, 31 July 2007 9:31 a.m.
> To: r-help@stat.math.ethz.ch
> Subject: [R] simple coding question
> 
> I have a list of ICD9 (disease) codes with various formats - 3 digit,
> 4 digit, 5 digit.  The first three digits of these codes are 
> what I am most interested in.  I would like to either add 
> zeros to the 3 and 4 digit codes to make them 5 digit codes 
> or add decimal points to put them all in the format ###.##.  
> I did not see a function that allows me to do this in the 
> formatting command.  This seems simple - can someone help?
> 
> Thanks,
> K.Beyer
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Aggregate to find majority level of a factor

2007-05-31 Thread Peter Alspach

Jon

One way:  assuming your data.frame is 'jon'

aggregate(jon[,2], list(jon[,1]), function(x)
levels(x)[which.max(table(x))])
  Group.1 x
1   Plot1   big
2   Plot2 small
3   Plot3 small 
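
For anyone wanting to reproduce this, the example data can be set up along
these lines (the column names are just illustrative):

jon <- data.frame(plot=rep(paste("Plot", 1:3, sep=""), each=3),
                  size=c("big", "big", "small", "big", "small", "small",
                         "small", "small", "small"))
jon$size <- factor(jon$size)
aggregate(jon[,2], list(jon[,1]), function(x)
    levels(x)[which.max(table(x))])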

HTH ....

Peter Alspach

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of 
> Thompson, Jonathan
> Sent: Friday, 1 June 2007 7:26 a.m.
> To: r-help@stat.math.ethz.ch
> Subject: [R] Aggregate to find majority level of a factor
> 
> I want to use the aggregate function to summarize data by a 
> factor (my field plots), but I want the summary to be the 
> majority level of another factor.
> 
>  
> For example, given the dataframe:
> 
> Plot1 big
> Plot1 big
> Plot1 small
> Plot2 big
> Plot2 small
> Plot2 small
> Plot3 small
> Plot3 small
> Plot3 small
> 
> 
> My desired result would be:
> Plot1 big
> Plot2 small
> Plot3 small
> 
> 
> I can't seem to find a scalar function that will give me the 
> majority level. 
> 
> Thanks in advance,
> 
> Jonathan Thompson
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A particular shuffling on a vector

2007-04-19 Thread Peter Alspach

Emmanuel

One option which appears to work:

emmanuel <- c(1,1,1,2,2,3,3,3)
#emmanuel <- c(1,1,2,3,4,5,6,6,6,6,6,6,6,6,7,8)
runs <- rle(emmanuel)[[1]]
shuffle <- sample(1:length(runs))
newEmm <- rep(emmanuel[cumsum(runs)[shuffle]], runs[shuffle])
startPos <- sample(1:length(emmanuel), 1)
if (startPos==1) newEmm else
newEmm[c(startPos:length(newEmm),1:(startPos-1))]
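
To draw many shuffled versions it may help to wrap this up and use
replicate(); an untested sketch (the function name is just illustrative):

shuffleRuns <- function(x) {
    runs <- rle(x)[[1]]
    shuffle <- sample(1:length(runs))
    new <- rep(x[cumsum(runs)][shuffle], runs[shuffle])
    startPos <- sample(1:length(x), 1)
    if (startPos==1) new else new[c(startPos:length(new), 1:(startPos-1))]
}
replicate(5, shuffleRuns(c(1,1,1,2,2,3,3,3)))  # one shuffled vector per column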

Peter Alspach

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Emmanuel Levy
> Sent: Friday, 20 April 2007 1:03 p.m.
> To: r-help@stat.math.ethz.ch
> Subject: [R] A particular shuffling on a vector
> 
> Hello,
> 
> I was wondering if anyone can think of a straightforward way (without
> loops) to do the following shuffling:
> 
> Let's imagine a vector:
> c(1,1,1,2,2,3,3,3)
> 
> I would like to derive shuffled vectors __where the same 
> digits are never separated__, although they can be at both 
> ends (periodicity).
> 
> So the following shuffled vectors are possible:
> 
> c(2,2,1,1,1,3,3,3)
> c(2,1,1,1,3,3,3,2)
> c(3,3,3,1,1,1,2,2)
> c(3,1,1,1,2,2,3,3)
> etc ...
> 
> I should mention that there can be any number of different 
> numbers, and any number of repetition of each number.
> 
> So the vectors I have to deal with could look like
> c(1,1,2,3,4,5,6,6,6,6,6,6,6,6,7,8) for example
> 
> Since I have to derive many shuffled versions for each 
> vector, I am looking for an efficient way of doing it. Can 
> you think of a way without nested loops?
> 
> Many thanks for your help,
> 
> Best,
> 
> Emmanuel
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Moving plot labels

2007-04-04 Thread Peter Alspach

Dean

Try:

plot(c(-0.25, 18), c(0, max(patient10)), type="n",
     ylab="SD of POST estimator", xlab="")
title(xlab='Scans\n(a)', line=3)

Peter Alspach

> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
> Sent: Thursday, 5 April 2007 12:21 p.m.
> To: r-help@stat.math.ethz.ch
> Subject: [R] Moving plot labels
> 
> Hi.
> 
> I'm sure this is a simple problem, yet I can't seem to get 
> simple help for it.
> 
> I am simply trying to move my xlab in plot().
> 
> I am currently using the following commands:
> 
> plot(c(-0.25,18), c(0, max(patient10)), type="n",
>      ylab="SD of POST estimator", xlab="Scans \n (a)")
> 
> But when the plot prints, the xlab is printed over top the 
> xaxis. I tried adjusting mar(), but this just works with the 
> margins outside the plotting area. What is a simple fix? Is 
> there a single command I can add that adjusts the size of the 
> plotting area within the area controlled by mar()?
> 
> Thanks very much.
> 
> Dean
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hwo can i get a vector that...

2007-03-07 Thread Peter Alspach

Check out which.max 
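
For instance (untested), with a 100 x 3 matrix m:

m <- matrix(rnorm(300), ncol=3)
apply(m, 1, which.max)   # column containing each row's maximum
max.col(m)               # similar, with ties broken at random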

Peter Alspach


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of bunny 
> , lautloscrew.com
> Sent: Thursday, 8 March 2007 11:20 a.m.
> To: R-help@stat.math.ethz.ch
> Subject: Re: [R] hwo can i get a vector that...
> 
> 
> 
> 
> > dear all,
> >
> > how can i get a vector that shows the number of the column 
> of matrix 
> > that contains the maximum of the row ??
> > can't believe I need a loop for this...
> >
> > i have a  100 x 3 matrix and want to get a 100 x 1 vector 
> with values 
> > 1,2,3 .
> >
> > there must be a simple solution. i just cannot find it. i think am 
> > searching on the wrong end.
> >
> > thx for help in advance.
> >
> > m.
> 
> 
> EDIT: ok,  i know the following by now :)
> 
> apply(for18[,-1], 1, max, na.rm=T)
> 
> but this doesn't get me the number of the column - which is 
> what i need... 
>   [[alternative HTML version deleted]]
> 
> 
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] If you had just one book on R to buy...

2007-02-25 Thread Peter Alspach

Julien

This is quite a common question.  Within R, try

RSiteSearch('one good book for R') 

and follow the threads ......

Peter Alspach


> -Original Message-
> From: [EMAIL PROTECTED] 
> [mailto:[EMAIL PROTECTED] On Behalf Of Julien Barnier
> Sent: Monday, 26 February 2007 8:51 a.m.
> To: r-help@stat.math.ethz.ch
> Subject: [R] If you had just one book on R to buy...
> 
> Hi,
> 
> I am starting a new job as a study analyst for a social 
> science research unit. I would really like to use R as my 
> main tool for data manipulation and analysis. So I'd like to 
> ask you, if you had just one book on R to buy (or to keep), 
> which one would it be ? I already bought the Handbook of 
> Statistical Analysis Using R, but I'd like to have something 
> more complete, both on the statistical point of view and on R usage.
> 
> I thought that "Modern applied statistics with S-Plus" would 
> be a good choice, but maybe some of you could have 
> interesting suggestions ?
> 
> Thanks in advance,
> 
> --
> Julien
> 
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify selected substances across individuals

2007-01-21 Thread Peter Alspach

Jenny

>   Below is an example data, which contains three id-numbers.
> For each id there are three substances in each of the three
> blocks. Some substances are repeated twice. The substances
> are the same for all ids. The id number 3 is actually a
> control so all responses (y) that are equal or greater than 4
> are supposed to be removed from this id number. This I can do
> easily in R but what I need help with is I want to have those
> substances that are removed from id number 3 also removed
> from other ids as well. I could do an algorithm like : for id
> in 1:2, if substance = c("abc","dgf") then delete but if the
> substances to be removed have long strings and are more than
> 2 (for example 20 substances) then it would take long time to
> list the substances manually. Can you guys please show me a
> clever way to do what I described above ?

Use subsetting to identify the substances, and then !(... %in% ...) to
remove records with these substances:

yourData[!(yourData$substance %in% yourData[yourData$id==3 &
yourData$y>=4, 'substance']),]

The above is untested and will need modification if yourData$id or
yourData$y contains missing values.
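
For example, with a small made-up data set:

yourData <- data.frame(id=rep(1:3, each=3),
                       substance=rep(c("abc", "dgf", "xyz"), 3),
                       y=c(1, 2, 3, 2, 1, 2, 5, 1, 4))
yourData[!(yourData$substance %in%
           yourData[yourData$id==3 & yourData$y>=4, 'substance']), ]

Here "abc" and "xyz" score 4 or more for id 3, so every record for those
substances is dropped and only the "dgf" rows remain.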
   
Peter Alspach


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] making a grid of points

2006-12-07 Thread Peter Alspach

Ross

I think you want

?expand.grid

BTW, help.search('grid') finds this.
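
For example, to evaluate a function at each point of a 2-D grid (the
function here is just a made-up illustration):

grid <- expand.grid(x=seq(0, 1, by=0.1), y=seq(0, 2, by=0.5))
grid$z <- apply(grid, 1, function(p) sum(p^2))   # i.e. f(x, y) = x^2 + y^2
head(grid)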

Cheers ......

Peter Alspach


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Ross Boylan
> Sent: Friday, 8 December 2006 8:03 a.m.
> To: r-help
> Subject: [R] making a grid of points
>
> I'd like to evaluate a function at each point on a 2 or 3-D
> grid.  Is there some function that already does this, or
> generates the grid of points?
>
> My search has led me to the grid and lattice packages, and I
> found a reference to the sp package (e.g., SpatialGrid) for
> this.  There are things in there that might be relevant, but
> at first blush many of them are embedded in other concepts
> (grobs, shingles, rugs) and don't obviously solve the problem.
>
> I know this is not a hard thing to program, but I suspect
> someone has already done it.  Any pointers?
>
> Thanks.
> --
> Ross Boylan  wk:  (415) 514-8146
> 185 Berry St #5700   [EMAIL PROTECTED]
> Dept of Epidemiology and Biostatistics   fax: (415) 514-8150
> University of California, San Francisco
> San Francisco, CA 94107-1739 hm:  (415) 550-1062
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] getting a title in a plot during an lapply

2006-11-16 Thread Peter Alspach

Mark

> In my code below tempa and tempb are numeric vectors that I
> combined into a dataframe along with the deciles of tempa.
> I have an lapply statement that goes through the dataframe
> and does ten plots according to the appropriate decile.
> The code is below and it works fine. There are no bugs so I
> figure there was no need to include structure statements on the data.
> Also, I don't want to use coplot because the axes get sort of
> messed up because of outliers.
>
> tempa.deciles <- cut(tempa, breaks=quantile(tempa, probs=seq(0, 1, 0.1)),
>                      include.lowest=TRUE)
> df<-data.frame(tempa,tempb,tempa.deciles)
> levels(df$tempa.deciles)<-paste("Decile ",1:10,sep="")
> lapply(split(df, df$tempa.deciles), function(x) {
> print(cor(x$tempa, x$tempb)); plot(x$tempa, x$tempb) })
> 
> I would like to include a title on each plot so that I can
> recognize which plot goes with which decile but I don't know
> how to do that basically because I can't figure out what
> lapply is looping over by number. I definitely looked around
> in various R books but I was not successful. Thanks for your help.

Try

plot(x$tempa, x$tempb, main=x$tempa.deciles[1])
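
A self-contained version with made-up data, for the record:

tempa <- rnorm(200); tempb <- tempa + rnorm(200)
tempa.deciles <- cut(tempa, breaks=quantile(tempa, probs=seq(0, 1, 0.1)),
                     include.lowest=TRUE)
df <- data.frame(tempa, tempb, tempa.deciles)
levels(df$tempa.deciles) <- paste("Decile ", 1:10, sep="")
lapply(split(df, df$tempa.deciles), function(x)
    plot(x$tempa, x$tempb, main=as.character(x$tempa.deciles[1])))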

Peter Alspach


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Creating a table

2006-11-14 Thread Peter Alspach

Michael

One solution:

df<-data.frame(loc=c("A","B","A","A","A"),
   year=c(1970,1970,1970,1976,1980))
df[,3] <- cut(df$year, c(1969.5,1974.5,1979.5,1984.5),
c('1970-74','1975-79','1980-85'))
with(df, addmargins(table(loc, V3)))


Peter Alspach

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Michael Graber
> Sent: Wednesday, 15 November 2006 9:21 a.m.
> To: R-Mailingliste
> Subject: [R] Creating a table
>
> Dear R List,
>
> I am a new to R,  so my question may be easy to answer for you:
>
> I have a dataframe, for example:
>
> df<-data.frame(loc=c("A","B","A","A","A"),
> year=as.numeric(c("1970","1970","1970","1976","1980")))
>
> and I want to create the following table without using loops:
>
>   1970-74 ; 1975-79 ; 1980-85; rowsum
> A 2   1   1  4
> B 1   00  1
> colsum   31   15
>
> so that the frequencies of df$loc are shown in the table for
> different time intervals.
>
> Thanks in advance for any hint,
>
> Michael Graber
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] summary linear regression

2006-11-09 Thread Peter Alspach

Gorka

See the message from Brian Ripley, which is the first item from

RSiteSearch('R-squared')

Peter Alspach


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Gorka Merino
> Sent: Friday, 10 November 2006 4:54 a.m.
> To: r-help@stat.math.ethz.ch
> Subject: [R] summary linear regression
>
>
> >>Good morning,
> >>
> >>I am a PhD student in Barcelona working with R. My question
> is about
> >>the summary of linear regressions.
> >>The output of that summary gives some statistical parameters of the
> >>regression. One of them is the R-squared. In the help menu
> i have read
> >>that the manner to calculate the R-squared is
> >>
> >>R^2 = 1 - Sum(R[i]^2) / Sum((y[i]- y*)^2), and  when the linear
> >>regression is forced to start at the point (0,0)  y* is 0.
> >>
> >>My question is if the R-squared obtained when the linear regression
> >>crosses the origin has the same meaning as when it does with an
> >>intercept. I'm faced to a theoretical model which brings
> some values
> >>for a variable that compared to the observed ones gives
> some R*squared
> >>of
> >>0.32 when the regression has an intercept and 0.7 when it
> is forced to
> >>cross the origin.
> >>To sum up, it may be stated that the theoretical methodolgy
> followed
> >>explains the 70% of  of the observed variance?
> >>
> >>Thank you very much.
> >>
> >>Gorka Merino.
>
> Gorka Merino
> Institut de Ciències del Mar, CMIMA-CSIC Psg. Marítim de la
> Barceloneta 37-49 08003-BARCELONA (Spain)
>
> Tel.: (34) 932 30 95 48
> e-mail: [EMAIL PROTECTED]
>
> CMIMA:
> Tel.: (34) 932 30 95 00
> Fax:  (34) 932 30 95 55
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] barplot question

2006-10-17 Thread Peter Alspach

Mark

> i'm doing a bar plot and there are 16 column variables. is
> there a way to make the variable names go down instead of
> across when you do the barplot ?
> because the names are so long, the barplot just shows 3 names
> and leaves the rest out. if i could rotate the names 90
> degrees, it would probably fit a lot more.

Is this the sort of thing you mean:

temp <- barplot(rnorm(16, 3))
text(temp, rep(-0.2, 16), paste('trt', 1:16), srt=90, adj=1)
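
Alternatively, if the names are passed to barplot() itself, las=2 rotates
them; untested:

heights <- rnorm(16, 3)
barplot(heights, names.arg=paste('treatment', 1:16), las=2, cex.names=0.8)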

Peter Alspach

> or maybe i can use space to make the horizontal width longer
> ? I looked up ?barplot but i'm not sure. when 1st and 2nd are
> on the bottom, things look fine but i'm not as interesed in
> those 2 barplots. 
> 
> i didn't use any special options. i just did
> 
> barplot(probsignmatrix)
> barplot(t(probsignmatrix))
> 
> barplot(probsignmatrix,beside=T)
> barplot(t(probsignmatrix),beside=T)
> 
> 
> 
> i put probsignmatrix below in case someone wants to see what
> i mean because it may not be clear. i don't expect anyone to
> type it in but rounding would still show what i mean.
> thanks a lot.
> 
> 
>  pcount pmpppcount pmmppcount pmmmpcount pcount
> mcount pppmmcount ppmmmcount ppmppcount ppmmpcount
> pppmpcount ppmpmcount pmpmpcount pmpmmcount pmmpmcount
> pmppmcount 1st 0.03477157 0.02842640 0.03157360 0.03365482
> 0.04010152 0.03553299
> 0.03989848 0.04182741 0.02817259 0.03203046 0.02781726 0.02218274
> 0.01771574 0.02289340 0.02583756 0.02390863 2nd 0.04648895
> 0.02901495 0.03092490 0.03064044 0.04108420 0.03998700
> 0.03958062 0.04059655 0.03039662 0.03027471 0.02901495 0.02170026
> 0.01601105 0.02287874 0.02165962 0.02267555
> 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] confusing about contrasts concept [long]

2006-08-16 Thread Peter Alspach

Tian

It appears the attachment might not have worked so I'll embed Bill's
message at the end.

Peter Alspach


> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Peter Alspach
> Sent: Thursday, 17 August 2006 8:02 a.m.
> To: T Mu; R-Help
> Subject: Re: [R] confusing about contrasts concept
>
>
> Tian
>
> Bill Venables wrote an excellent explanation to the S list
> back in 1997.
> I saved it as a pdf file and attach it herewith ...
>
> Peter Alspach
>  
>
> > -Original Message-
> > From: [EMAIL PROTECTED]
>
> > [mailto:[EMAIL PROTECTED] On Behalf Of T Mu
> > Sent: Thursday, 17 August 2006 3:23 a.m.
> > To: R-Help
> > Subject: [R] confusing about contrasts concept
> >
>
> > Hi all,
> >
>
> > Where can I find a thorough explanation of Contrasts and
>
> > Contrasts Matrices?
> > I read some help but still confused.
> >
>
> > Thank you,
> > Tian


Date sent:  Sat, 29 Mar 1997 17:23:31 +1030
From:   Bill Venables <[EMAIL PROTECTED]>
To: Wei Qian <[EMAIL PROTECTED]>
Copies to:  [EMAIL PROTECTED]
Subject:Re: test contrasts [long]

Wei Qian writes:

I am new to Splus, so please don't be offended if my question is too
naive.

We're all friends here, Wei. It's not that kind of group...mostly.

Does anyone know how to test contrasts in Splus? To be specific, suppose
I have a one-way layout experiment with 3 treatments, and I want to test
the contrasts of treatment 1 vs. treatment 2, treatment 1 vs. treatment
3, etc. In SAS, I could use the following:

 proc glm;
   class trt;
   model y = trt;
   contrast "1 vs. 2" trt 1 -1 0;
   contrast "2 vs. 3" trt 0 1 -1;
 run;

 One way I can think of is that to construct my design matrix and obtain
the contrast sum of squares through a series of matrix operations, but
is there any easy way to do it or any built-in function in Splus can do
it?

The answer is 'yes' but hardly anyone seems to know how to do it. The
explanation in the 'white book' for example, seems a little incomplete
to me and not quite adequate to settle the case you raise. (The
explanation in the yellow book is also inadequate, but I hope come July
we will have fixed that.) Since this is one of the most frequent
questions people ask me in direct email, too, let me try (again) to sort
it out in some detail.

A formula such as y ~ f, where f is a factor in principle generates a
single classification model in the form

        y_{ij} = mu + phi_i + e_{ij}

Write the design matrix in the form X = [1 Xf], where, assuming f has p
levels, Xf is the n x p dummy variable (ie binary) matrix corresponding
to the phi_i's. So in matrix terms the model is written as

        y = 1 mu + Xf phi + e

(a) If you remove the intercept term, using y ~ f -1, then Xf is the
design matrix you get and the coefficients correspond to the class
means;
(b) If the intercept term is left in, then the design matrix X does
not have full column rank, so an adjustment has to be made to Xf to make
it so.

The redundancy comes about because the columns of Xf add to the
1-vector, that is Xf 1_p = 1_n.  So let C be any p x (p-1) matrix such that [1 C] is
nonsingular. It can easily be seen that Xc = [1 (Xf C)] will have full
column rank and that fitting the model using this model matrix is
equivalent to the original redundantly specified model. The matrix C is
called the *coding matrix* for the factor f.

The linear model is actually fitted in the form

        y = 1 mu + (Xf C) alpha + e

where alpha has (p-1) components, of course. In order to make sense of
the alpha's we need to relate them back to the phi's.

For any such C there will be a vector, v, such that v'C = 0' (using ' for
transpose). (In fact v spans the orthogonal complement to the column
space of C). Clearly phi and alpha are related by

        C alpha = phi

but since v'C = 0', it follows that an identification constraint
applies, namely v'phi = 0. By multiplying both sides by (C'C)^{-1} C',
it also follows that

        alpha = (C'C)^{-1} C' phi

which provides an interpretation for the alpha's in terms of the
(constrained) phi's. For example take the Helmert contrasts.

> contr.helmert(4)
  [,1] [,2] [,3]
1   -1   -1   -1
2    1   -1   -1
3    0    2   -1
4    0    0    3

The constraint vector is clearly v= (1,1,1,1), since the columns add to
zero. In this case the columns are also mutually orthogonal, so the
matrix (C'C)^{-1} C' (the generalized inverse of C) has a similar form
apart from a few row divisors:

> fractions(ginverse(contr.helmert(4)))
      [,1]  [,2]  [,3] [,4]
[1,]  -1/2   1/2     0    0
[2,]  -1/6  -1/6   1/3    0
[3,] -1/12 -1/12 -1/12  1/4
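
In current R the same quantities can be reproduced with MASS (a sketch:
ginv() plays the role of the S-PLUS ginverse() used above, and phi is just
an arbitrary set of effects satisfying the constraint):

library(MASS)
C <- contr.helmert(4)
fractions(ginv(C))                  # (C'C)^{-1} C', as above
phi <- c(-3, -1, 1, 3)              # any phi with sum(phi) == 0
alpha <- drop(ginv(C) %*% phi)
all.equal(drop(C %*% alpha), phi)   # C alpha recovers phi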

Re: [R] confusing about contrasts concept

2006-08-16 Thread Peter Alspach

Tian

Bill Venables wrote an excellent explanation to the S list back in 1997.
I saved it as a pdf file and attach it herewith ...

Peter Alspach
  

> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of T Mu
> Sent: Thursday, 17 August 2006 3:23 a.m.
> To: R-Help
> Subject: [R] confusing about contrasts concept
>
> Hi all,
>
> Where can I find a thorough explanation of Contrasts and
> Contrasts Matrices?
> I read some help but still confused.
>
> Thank you,
> Tian
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to choose columns in data.frame by parts of columns' names?

2006-05-30 Thread Peter Alspach

Wei-wei

> I have a data.frame which has names as following.
> [1] "XG1"  "YG1"  "XEST" "YEST"
> [2] "XNOEMP1"  "XNOEMP2" "YNOEMP1"  "YNOEMP2"
> [3] "XBUS10"   "XBUS10A" "XBUS10B"  "XBUS10C"
> [4] "YBUS10"   "YBUS10A"  "YBUS10B"  "YBUS10C"
> [5] "XOWNBUS"  "XSELFEST"  "YOWNBUS"  "YSELFEST"
>
> Those columns have names beginning with "X" or "Y". Each "X"
> is paired by a "Y", e.g. "XG1" and "YG1", but they are not in
> the order of "X Y X Y ...". I want to combine "X*" and "Y*" like this:
>
> data.new[,"G1"]  <- (data.old[,"XG1"] + endata.use[,"YG1"])/2
>
> How to choose columns by parts of names? For example, I can pick out
> XG1 and YG1 because they have the common part "G1".

Not entirely sure what you mean, but one approach might be to re-order
the columns so that each X column sits next to its matching Y column.

yourNames
 [1] "XG1"  "YG1"  "XEST" "YEST" "XNOEMP1"  "XNOEMP2"
 [7] "YNOEMP1"  "YNOEMP2"  "XBUS10"   "XBUS10A"  "XBUS10B"  "XBUS10C"
[13] "YBUS10"   "YBUS10A"  "YBUS10B"  "YBUS10C"  "XOWNBUS"  "XSELFEST"
[19] "YOWNBUS"  "YSELFEST"
yourNames[order(substring(yourNames,2), substring(yourNames, 1,1))]
 [1] "XBUS10"   "YBUS10"   "XBUS10A"  "YBUS10A"  "XBUS10B"  "YBUS10B"
 [7] "XBUS10C"  "YBUS10C"  "XEST" "YEST" "XG1"  "YG1"
[13] "XNOEMP1"  "YNOEMP1"  "XNOEMP2"  "YNOEMP2"  "XOWNBUS"  "YOWNBUS"
[19] "XSELFEST" "YSELFEST"

gives an idea of what I mean ...
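
Going one step further, the X/Y pairs could then be averaged without naming
them one by one; an untested sketch (it assumes every X column has a
matching Y column and that all of them are numeric):

stems <- unique(substring(names(data.old), 2))    # "G1", "EST", ...
data.new <- sapply(stems, function(s)
    (data.old[[paste("X", s, sep="")]] +
     data.old[[paste("Y", s, sep="")]])/2)
data.new <- as.data.frame(data.new)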

Peter Alspach




__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] factors and mca

2006-05-02 Thread Peter Alspach

Carlos

?mca states that mca works on a dataframe.  As you've written it

is.data.frame(de) returns FALSE

Try

de <- data.frame(d,e)  instead of  de <- factor(c(d,e))

HTH

Peter Alspach



> -Original Message-
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Carlos
> Mauricio Cardeal Mendes
> Sent: Wednesday, 3 May 2006 8:23 a.m.
> To: 'R-Help help'
> Subject: [R] factors and mca
>
> Hi ! Wonder if I have this code below:
>
> a <- c(1, 2, 3, 2, 1)
> b <- c(3, 2, 3, 1, 1)
> d <-as.factor(a)
> e <-as.factor(b)
> table(d,e)
> is.factor(a)
> is.factor(b)
> is.factor(d)
> is.factor(e)
> de <- factor(c(d,e))
> is.factor(de)
> require(MASS)
> mca(de)
>
> I'd like to perform a correspondence analysis, but I can't
> understand why the message error after mca.
>
> Aren't those (d and e) factors as is.factors shows ?
>
>  > Erro em mca(de) : all variables must be factors
>
> I'm running R 2.3.0 under windows XP
>
>
> Thank you !
> Mauricio
>
> __
> R-help@stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
>


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Special characters: plus/minus - a method that works

2006-03-20 Thread Peter Alspach

Gabor Grothendieck wrote:

> Another way that works on my Windows XP system is to use "\261" .

Please note Windows escape codes are not portable and thus not recommended, as 
Martin Maechler pointed out in a response to a suggestion of mine:

On Tue, 14 Jun 2005, Martin Maechler wrote:


>>>>>> "Peter" == Peter Alspach 
>>>>>> on Tue, 14 Jun 2005 14:11:47 +1200 writes:
>
> Peter> Ben
>
> Peter> Others have pointed out plotmath. However, for some
> Peter> superscripts (including 2) it may be easier to use
> Peter> the appropriate escape sequence (at least in Windows):
>
> Peter> ylab = 'BA (m\262/ha)'
>
> but please refrain from doing that way.
> You should write R code that is portable, and ``plotmath''
> has been designed to be portable. Windows escape codes are not,
> and may not even work in future (or current?) versions of
> Windows with `unusual' locale settings {fonts, etc}.
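
For the record, the portable plotmath route to a plus/minus sign is along
these lines:

plot(1:10, xlab = expression(mean %+-% sd))
text(5, 5, expression(10 %+-% 0.3))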


Peter Alspach


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] superscript in figures - basic question

2005-06-13 Thread Peter Alspach

Ben

Others have pointed out plotmath.  However, for some superscripts (including 2) 
it may be easier to use the appropriate escape sequence (at least in Windows):

ylab = 'BA (m\262/ha)'
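
The portable plotmath equivalent would be something along the lines of:

plot(1:10, ylab = expression(BA~(m^2/ha)))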

Cheers ....

Peter Alspach


>>> "Benjamin M. Osborne" <[EMAIL PROTECTED]> 14/06/05 13:42:53 >>>
Although I see similar, but more complex, questions addressed in the help
archive, I'm having trouble adding superscripted text to the y-axis labels of
some figures, and I can't find anything in the R documentation on this.

I have:

ylab="BA (m2/ha)"

but I want the "2" to be superscripted.
Thanks in advance for the help, or for pointing out the appropriate help file.

-Ben Osborne

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] simple predict question

2005-05-31 Thread Peter Alspach

Alternatively, one can convert dat to a data frame first (?lm states that data 
is "an optional data frame containing the variables in the model."):

dat<-matrix(c(0,0,10,20),2,byrow=T)
dat <- as.data.frame(dat)
lm1<-lm(V2~V1, dat)
predict(lm1, data.frame(V1=seq(0,10,1)))
 1  2  3  4  5  6  7  8  9 10 11 
 0  2  4  6  8 10 12 14 16 18 20 

Cheers .

Peter Alspach


>>> Gabor Grothendieck <[EMAIL PROTECTED]> 01/06/05 09:00:15 >>>
On 5/31/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> Excuse the simple question...
> I'm not sure what I'm doing wrong with predict, but let me use this example:
> Suppose I do:
> dat<-matrix(c(0,0,10,20),2,byrow=T)
> lm1<-lm(dat[,2]~dat[,1])
> 
> Suppose I want to generate the linearly-interpolated y-values between the
> point (0,0) and (0,20) at every unit interval.
> I thought I just do:
> predict(lm1, data.frame(seq(0,10,1))) to get 0,2,4,6...,18,20, but instead
> I just get:
> 12
> 0  20
> 

I think the names are confusing it.  Try:

x <- dat[,1]; y <- dat[,2]
lm1 <- lm(y ~ x)
predict(lm1, data.frame(x = 1:10))

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Adding up intervals by activity

2005-02-23 Thread Peter Alspach

Lorin

You could use rle():  Say your Activity and Interval data is in lorin, then:

tmp.rle <- rle(as.vector(lorin[,1]))[[1]]
tapply(lorin[,2], rep(1:length(tmp.rle), tmp.rle), sum)
 1  2  3 
16  8  6 
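
For example, with the data from the original message typed in directly
(untested):

lorin <- data.frame(Activity=c("A","A","A","B","B","A","A"),
                    Interval=c(5, 7, 4, 5, 3, 4, 2))
tmp.rle <- rle(as.vector(lorin[,1]))[[1]]                  # run lengths 3, 2, 2
tapply(lorin[,2], rep(1:length(tmp.rle), tmp.rle), sum)    # 16, 8, 6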

HTH

Peter Alspach

>>> Lorin Hochstein <[EMAIL PROTECTED]> 24/02/05 16:18:43 >>>
Hello all,

I have a data frame with the following columns: timestamp, activity, 
interval.

Each row contains information about an activity that occured over a time 
interval. "Timestamp" is when the interval ended, "Activity" is a factor 
that represents the type of activity, and "Interval" is a double which 
represents an interval in time. The data frame is sorted by timestamp. 
For example:

TimestampActivity  Interval
2004-02-12 08:59:08  A 5
2004-02-12 09:23:03  A 7
2004-02-12 09:56:04  A 4
2004-02-12 10:39:00  B 5
2004-02-12 11:23:06  B 3
2004-02-12 12:56:01  A 4
2004-02-12 12:59:01  A 2
...

I would like to know how much time was spent on an activity before the 
activity changed (to compute time, I just care about the lengths of the 
intervals, not the values of the timestamps). For example, given the 
table above, I would like to get an output that looks like:


Activity  TotalTime
A 16
B 8
A 6

Is this possible to do in R without resorting to a for loop or recursion 
(i.e. what's the R-ish way to do this?) I know I can use the "diff" 
function to identify which rows represent a change in activity, but is 
there anything I can do with that?

Lorin

--
Lorin Hochstein
Graduate Research Assistant
Experimental Software Engineering Group
Computer Science Dept., University of Maryland
email:[EMAIL PROTECTED]  tel:301-405-2721

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Positive log-likelihood in lme

2005-02-16 Thread Peter Alspach

Kia ora

I'm using lme (from the nlme package) with data similar to the Orthodont dataset 
and am getting positive log-likelihoods (>100).  This seems unusual and I 
wondered if someone could offer a possible explanation.

I can supply a sample dataset if requested, but I feel almost certain that this 
question has been asked and answered recently.  However, I can find no trace of 
it in the mail archives (although I have spent several hours reading lots of 
other interesting things :-)).

Thanks .....

Peter Alspach



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] dropping rows

2004-12-01 Thread Peter Alspach

Tobias

I remember finding Patrick Burns' "S Poetry" (see http://www.burns-stat.com/ ) 
worth reading - and it covers this sort of thing nicely.

Peter Alspach


>>> Tobias Muhlhofer <[EMAIL PROTECTED]> 02/12/04 13:57:47 >>>
Thanks.

The problem is that there is extremely little on dataframes or matrices 
in "An Intro to R", which I did read and I frankly don't know where else 
to go.

Once I know a function like subset() exists, I can then read the help 
files on it and that's fine, but I would never dream this function up 
myself...

As for indexing, I DID read "An Introduction to R" and I did NOT catch 
the part where it says you can use any variable in the dataframe to 
index it, nor would I have thought of it by myself. From that 
documentation, I only learned about using row-labels to index things...

But I am definitely thankful for the quick help given to me by people on 
this list, and so I guess being RTFM'ed is a small price to pay for 
figuring out how to solve the problem I need to solve.

Toby


Jeff Laake wrote:
> Here's an example:
> 
> earlydata=data[data$year<1960,]
> 
> Lookup help and read manuals on manipulating dataframes.
> 
> 
> Tobias Muhlhofer wrote:
> 
>> Hi!
>>
>> Sorry for asking a trivial questions, but I can't seem to figure this 
>> out.
>>
>> I have a dataframe called master containing 30-odd variables.
>>
>> In this dataframe, I have observations across these 30 variables from 
>> 1930 to 2003 (I've made a "year" variable). How can I drop all rows 
>> for which the year is less than 1960? I'm assuming something with 
>> ifelse() but I can't quite figure it out.
>>
>> I would appreciate a suggestion of some syntax.
>>
>> Thanks!
>> Toby
>>
>> __
>> [EMAIL PROTECTED] mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help 
>> PLEASE do read the posting guide! 
>> http://www.R-project.org/posting-guide.html 
> 
> 
> 
> 
> 

-- 
**
When Thomas Edison invented the light bulb he tried over 2000
experiments before he got it to work. A young reporter asked
him how it felt to have failed so many times. He said
"I never failed once. I invented the light bulb.
It just happened to be a 2000-step process."

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] unbalanced design

2004-12-01 Thread Peter Alspach

Damián

I asked a similar question a few months ago (3 August 2004):

> temp.aov <- aov(S~rep+trt1*trt2*trt3, data=dummy.data)
> model.tables(temp.aov, type='mean', se=T)
>
> Returns the means, but states "Design is unbalanced - use se.contrasts
> for se's" which is a little surprising since the design is balanced. 

To which Prof Ripley replied: "If you used the default treatment contrasts, it
is not.  Try Helmert contrasts with aov()."

If I recall correctly, following Prof Ripley's suggestion led aov() to accept 
the design was balanced, but model.tables() still did not (but that could have 
been my error).  However, se.contrast() worked.

Cheers 

Peter Alspach



>>> Damián Cirelli <[EMAIL PROTECTED]> 02/12/04 09:50:13 >>>
Hi all,
I'm new to R and have the following problem:
I have a 2 factor design (a has 2 levels, b has 3 levels). I have an
object kidney.aov which is an aov(y ~ a*b), and when I ask for
model.tables(kidney.avo, se=T) I get the following message along with
the table of effects:

Design is unbalanced - use se.contrast() for se's

but the design is NOT unbalanced... each factor level combination has the
same n

I' d appreciate any help.
Thanks.

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html



__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] as.list.matrix

2004-10-28 Thread Peter Alspach

Kjetil

Isn't a data.frame as special type of list, and thus one could use
as.data.frame?

Peter Alspach


>>> Kjetil Brinchmann Halvorsen <[EMAIL PROTECTED]> 29/10/04
14:16:03 >>>
I found the need of converting a matrix into a list of its columns
(for use with do.call), and was surprised there was no method
as.list.matrix, it could easily be a part of as.list default

I wrote

my.as.list.matrix <- function(mat) {
   if(!is.matrix(mat))stop("Argument must be a matrix")
   n <- NCOL(mat)
   res <- vector(mode="list", length=n)
   for (i in seq(along=res))
 res[[i]] <- mat[,i]
   res
}

but still think there must be some in-built function doing this!

By the way, my use is

 with(VRS,do.call("scatter.smooth",
c(my.as.list.matrix(na.omit(cbind(Tmed, 
resid(VRS.mod1,type="response",
xlab="", ylab="")))

there shoud be an easier way for such a daily task?

Kjetil

-- 

Kjetil Halvorsen.

Peace is the most effective weapon of mass construction.
   --  Mahdi Elmandjra

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html


__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] ifelse() question

2004-10-28 Thread Peter Alspach

Francisco

Did you try changing the factors to character, with as.character?

Also you don't really need ifelse() for this.  Something like the
following (untested) should do it:

dat[,4] <- as.character(dat[,4])
dat[,5] <- as.character(dat[,5])
dat[dat[,4]=='POR',4] <- dat[dat[,4]=='POR',5]
dat[,4] <- as.factor(dat[,4])
dat[,5] <- as.factor(dat[,5])


Peter Alspach

>>> "F Z" <[EMAIL PROTECTED]> 29/10/04 12:48:54 >>>
Hi

I have a data.frame with dim = 18638 (rows) 6 (cols)

names(dat)
[1] "id"  "long""lat" "species" "type""size"

Variable "species" and "type" are factors.  Species has 5 levels "BOV"
"CAP" 
"CER" "OVI" "POR"
Variable "type" has 11 levels "BRD" "CL" ... "OTHER"

I would like to replace the values on species by the values on types
only if 
species is == "POR"
I tried:

x<-ifelse(dat$species %in% "POR",dat$type,dat$species)
dat[,4]<-x
but levels(x)
[1] "1"  "2"  "3"  "4"  "5"  "6"  "8"  "9"  "10" "11" "12"

So x changes the factor names by numbers.  I can not use factor() to
recover 
the names since the resulting factors in x are a mixture of factors
from 
species and type.

I also tried

x<-gsub(pattern = "POR",replacement= factor(dat$type),dat$species) 
with 
same behavior.

Apparently I did not have my granola bar today so I can't find a
solution!  
Any help is greatly appreciated

Thanks!

Francisco

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help 
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html


__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Standard errors from glm

2004-08-03 Thread Peter Alspach

Dear Prof Ripley

Thanks for your reply and clarification.  However:

1.  Regarding model.tables() returning "Design is unbalanced".  Setting contrasts to 
Helmert does indeed make the design balanced, but model.tables() still returns "Design 
is unbalanced":

> options()$contrasts
unordered   ordered 
"contr.treatment"  "contr.poly" 
> aov(S~rep+trt1*trt2*trt3, data=dummy.data)
Call:
...
Residual standard error: 14.59899 
Estimated effects may be unbalanced
> options(contrasts=c("contr.helmert", "contr.treatment"))
> aov(S~rep+trt1*trt2*trt3, data=dummy.data)
Call:
...
Residual standard error: 14.59899 
Estimated effects are balanced
> model.tables(aov(S~rep+trt1*trt2*trt3, data=dummy.data), se=T)
Design is unbalanced - use se.contrasts for se's
Tables of effects
...

However, this is a relatively minor issue, and covered in ?model.tables which clearly 
states that "The implementation is incomplete, and only the simpler cases have been 
tested thoroughly."

2.  You point out that "In either case you can predict something you want to estimate 
and use
predict(, se=TRUE)."  Doesn't this give the standard error of the predicted value, 
rather than the mean for, say, trt1 level 0?  For example:
> predict(temp.lm, newdata=data.frame(rep='1', trt1='0', trt2='1', trt3='0'), se=T)
$fit
[1] 32

$se.fit
[1] 10.53591

$df
[1] 23

$residual.scale
[1] 14.59899

Whereas from the analysis of variance table we can get the standard error of the mean 
for trt1 as being sqrt(anova(temp.lm)[9,3]/12) = 4.214365.  It is the equivalent of 
this latter value that I'm after in the glm() case.


>>> Prof Brian Ripley <[EMAIL PROTECTED]> 03/08/04 18:10:56 >>>
On Tue, 3 Aug 2004, Peter Alspach wrote:

[Lines wrapped for legibility.]

> I'm having a little difficulty getting the correct standard errors from
> a glm.object (R 1.9.0 under Windows XP 5.1).  predict() will gives
> standard errors of the predicted values, but I am wanting the standard
> errors of the mean.
> 
> To clarify:
> 
> Assume I have a 4x3x2 factorial with 2 complete replications (i.e. 48
> observations, I've appended a dummy set of data at the end of this
> message).  Call the treatments trt1 (4 levels), trt2 (3 levels) and trt3
> (2 levels) and the replications rep - all are factors.  The observed
> data is S.  Then:
> 
> temp.aov <- aov(S~rep+trt1*trt2*trt3, data=dummy.data)
> model.tables(temp.aov, type='mean', se=T)
> 
> Returns the means, but states "Design is unbalanced - use se.contrasts
> for se's" which is a little surprising since the design is balanced.  

If you used the default treatment contrasts, it is not.  Try Helmert 
contrasts with aov().

> Nevertheless, se.contrast gives what I'd expect:
> 
> se.contrast(temp.aov, list(trt1==0, trt1==1), data=dummy.data)
> [1] 5.960012
> 
> i.e. standard error of mean is 5.960012/sqrt(2) = 4.214, which is the
> sqrt(anova(temp.aov)[9,3]/12) as expected.  Similarly for interactions,
> e.g.:
> 
> se.contrast(temp.aov, list(trt1==0 & trt2==0, trt1==1 & trt2==1), 
> data=dummy.data)/sqrt(2)
> [1]  7.299494
> 
> How do I get the equivalent of these standard errors if I have used
> lm(), and by extension glm()?  I think I should be able to get these
> using predict(..., type='terms', se=T) or coef(summary()) but can't
> quite see how.

In either case you can predict something you want to estimate and use
predict(, se=TRUE).

-- 
Brian D. Ripley,  [EMAIL PROTECTED] 
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/ 
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Standard errors from glm

2004-08-02 Thread Peter Alspach

Roger

Thanks - I have tried that but it doesn't give the standard errors I required.

Using my example:

coef(summary(temp.lm)) gives:

  Estimate Std. Error t value Pr(>|t|)
(Intercept)   44.5  10.535912  4.22364968 0.0003224616
rep2  -1.0   4.214365 -0.23728369 0.8145375583
trt11-13.0  14.598987 -0.89047272 0.3824325293
trt12  3.0  14.598987  0.20549370 0.8389943428
trt13-17.0  14.598987 -1.16446432 0.2561726432
.. etc ...

that is, standard error for the (intercept) is 10.54, rep 4.21, main effects 14.60, 
2-way interactions 20.65 and 3-way interaction 29.20.  These can also be obtained via
sqrt(diag(vcov(temp.lm))).

It is not clear to me how to go from these estimates to those from the aov() call.  I 
have tried pre- and post- multiplying vcov() by the design matrix but this gives the 
same standard errors as predict(temp.lm, se=T); i.e. those of the predicted values.

Regards ....

Peter Alspach


>>> "Roger D. Peng" <[EMAIL PROTECTED]> 03/08/04 12:22:44 >>>
Try summary(glm.object)$coefficients.

-roger

Peter Alspach wrote:
> Kia ora list members:
> 
> I'm having a little difficulty getting the correct standard
> errors from a glm.object (R 1.9.0 under Windows XP 5.1).
> predict() will gives standard errors of the predicted values,
> but I am wanting the standard errors of the mean.
> 
> To clarify:
> 
> Assume I have a 4x3x2 factorial with 2 complete replications
> (i.e. 48 observations, I've appended a dummy set of data at
> the end of this message).  Call the treatments trt1 (4
> levels), trt2 (3 levels) and trt3 (2 levels) and the
> replications rep - all are factors.  The observed data is S.
> Then:
> 
> temp.aov <- aov(S~rep+trt1*trt2*trt3, data=dummy.data) 
> model.tables(temp.aov, type='mean', se=T)
> 
> Returns the means, but states "Design is unbalanced - use
> se.contrasts for se's" which is a little surprising since the
> design is balanced.  Nevertheless, se.contrast gives what I'd
> expect:
> 
> se.contrast(temp.aov, list(trt1==0, trt1==1), data=dummy.data)
>  [1] 5.960012
> 
> i.e. standard error of mean is 5.960012/sqrt(2) = 4.214, which
> is the sqrt(anova(temp.aov)[9,3]/12) as expected.  Similarly
> for interactions, e.g.:
> 
> se.contrast(temp.aov, list(trt1==0 & trt2==0, trt1==1 &
> trt2==1), data=dummy.data)/sqrt(2) [1]  7.299494
> 
> How do I get the equivalent of these standard errors if I have
> used lm(), and by extension glm()?  I think I should be able
> to get these using predict(..., type='terms', se=T) or
> coef(summary()) but can't quite see how.
> 
> predict(lm(S~rep+trt1*trt2*trt3, data=dummy.data), type='terms', se=T)
> or
> predict(glm(cbind(S, 100-S)~rep+trt1*trt2*trt3, data=dummy.data,
> family='binomial'), type='terms', se=T)
> or, as in my case,
> predict(glm(cbind(S, 100-S)~rep+trt1*trt2*trt3, data=dummy.data,
> family='quasibinomial'), type='terms', se=T)
> 
> 
> Thanks 
> 
> Peter Alspach HortResearch
> 
> dummy.data
> trt1    trt2    trt3    rep     S
> 0       0       0       1       33
> 0       0       0       2       55
> 0       0       1       1       18
> 0       0       1       2       12
> 0       1       0       1       47
> 0       1       0       2       16
> 0       1       1       1       22
> 0       1       1       2       33
> 0       2       0       1       22
> 0       2       0       2       18
> 0       2       1       1       60
> 0       2       1       2       40
> 1       0       0       1       38
> 1       0       0       2       24
> 1       0       1       18
> 1       0       1       2       14
> 1       1       0       1       69
> 1       1       0       2       42
> 1       1       1       1       42
> 1       1       1       2       44
> 1       2       0       1       48
> 1       2       0       2       26
> 1       2       1       1       46
> 1       2       1       2       33
> 2       0       0       1       48
> 2       0       0       2       46
> 2       0       1       1       24
> 2       0       1       28
> 2       1       0       1       69
> 2       1       0       2       33
> 2       1       1       1       22
> 2       1       1       2       33
> 2       2       0       1       33
> 2       2       0       2       18
> 2       2       1       1       26
> 2       2       1       2       42
> 3       0       0       1       12
> 3       0       0       2       42
> 3       0       1       1       16
> 3       0       1       2       22
> 3       1       0       1       14
> 3       1       0       2       60
> 3       1       1       1       40
> 3       1       1       2       55
> 3       2       0       1       47
> 3       2       0       2       38
> 3       2       1       1       18
> 3       2       1       2       44

[R] Standard errors from glm

2004-08-02 Thread Peter Alspach

Kia ora list members:

I'm having a little difficulty getting the correct standard errors from a glm.object 
(R 1.9.0 under Windows XP 5.1).  predict() will give standard errors of the predicted 
values, but I am wanting the standard errors of the mean.

To clarify:

Assume I have a 4x3x2 factorial with 2 complete replications (i.e. 48 observations, 
I've appended a dummy set of data at the end of this message).  Call the treatments 
trt1 (4 levels), trt2 (3 levels) and trt3 (2 levels) and the replications rep - all 
are factors.  The observed data is S.  Then:

temp.aov <- aov(S~rep+trt1*trt2*trt3, data=dummy.data)
model.tables(temp.aov, type='mean', se=T)

Returns the means, but states "Design is unbalanced - use se.contrasts for se's" which 
is a little surprising since the design is balanced.  Nevertheless, se.contrast gives 
what I'd expect:

se.contrast(temp.aov, list(trt1==0, trt1==1), data=dummy.data)
[1] 5.960012

i.e. standard error of mean is 5.960012/sqrt(2) = 4.214, which is the 
sqrt(anova(temp.aov)[9,3]/12) as expected.  Similarly for interactions, e.g.:

se.contrast(temp.aov, list(trt1==0 & trt2==0, trt1==1 & trt2==1), 
data=dummy.data)/sqrt(2)
[1]  7.299494

How do I get the equivalent of these standard errors if I have used lm(), and by 
extension glm()?  I think I should be able to get these using predict(..., 
type='terms', se=T) or coef(summary()) but can't quite see how.

predict(lm(S~rep+trt1*trt2*trt3, data=dummy.data), type='terms', se=T)
or
predict(glm(cbind(S, 100-S)~rep+trt1*trt2*trt3, data=dummy.data, family='binomial'), 
type='terms', se=T)
or, as in my case,
predict(glm(cbind(S, 100-S)~rep+trt1*trt2*trt3, data=dummy.data, 
family='quasibinomial'), type='terms', se=T)


Thanks 

Peter Alspach
HortResearch

dummy.data
trt1trt2trt3rep S
0   0   0   1   33
0   0   0   2   55
0   0   1   1   18
0   0   1   2   12
0   1   0   1   47
0   1   0   2   16
0   1   1   1   22
0   1   1   2   33
0   2   0   1   22
0   2   0   2   18
0   2   1   1   60
0   2   1   2   40
1   0   0   1   38
1   0   0   2   24
1   0   1   18
1   0   1   2   14
1   1   0   1   69
1   1   0   2   42
1   1   1   1   42
1   1   1   2   44
1   2   0   1   48
1   2   0   2   26
1   2   1   1   46
1   2   1   2   33
2   0   0   1   48
2   0   0   2   46
2   0   1   1   24
2   0   1   28
2   1   0   1   69
2   1   0   2   33
2   1   1   1   22
2   1   1   2   33
2   2   0   1   33
2   2   0   2   18
2   2   1   1   26
2   2   1   2   42
3   0   0   1   12
3   0   0   2   42
3   0   1   1   16
3   0   1   2   22
3   1   0   1   14
3   1   0   2   60
3   1   1   1   40
3   1   1   2   55
3   2   0   1   47
3   2   0   2   38
3   2   1   1   18
3   2   1   2   44



__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Scoping Rules: Summary

2003-10-09 Thread Peter Alspach
Thanks to Andy Liaw, Roger Peng, Thomas Lumley, Brian Ripley and Peter
Dalgaard, all of whom addressed my questions or threads arising from
them.  The full messages were posted to the list so this is a brief
summary:

Andy Liaw explained the difference between lexical and dynamic scoping
and the rationale behind the choice of lexical scoping for R.  Roger
Peng showed how to modify fnB.  Brian Ripley suggested that it is
generally better to pass functions an object rather than just the name,
and warned of the dangers of using get() on the result of
deparse(substitute()).
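
For the record, a sketch of the two approaches (not necessarily the exact
code that was posted): either make fnB() look in the caller's frame rather
than the global environment, or carry the vector itself along and avoid
get() altogether (fnA2/fnB2 are just illustrative names).

fnB <- function(AA)
{
  tmp.x <- get(AA$first, envir=parent.frame())
  tmp.x^2
}

fnA2 <- function(my.x)
  list(first=deparse(substitute(my.x)), second=sqrt(my.x), value=my.x)
fnB2 <- function(AA) AA$value^2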

Thanks all

Peter Alspach


Original question below:

I'm using R1.7.1 (Windows 2000) and having difficulty with scoping. 
I've studied the FAQ and on-line manuals and think I have identified
the
source of my difficulty, but cannot work out the solution.

For the purposes of illustration.  I have three functions as defined
below:

fnA <- function(my.x)
{
  list(first=as.character(substitute(my.x)), second=sqrt(my.x))
}

fnB <- function(AA)
{
  tmp.x <- get(AA$first)
  tmp.x^2
}

fnC <- function()
{
  x <- 1:2
  y <- fnA(x)
  z <- fnB(y)
  c(x,y,z)
}

fnA() has a vector as an argument and returns the name of the vector
and the square root of its elements in a list.  fnB() takes the result
of fnA() as its argument, gets the appropriate vector and computes the
square of its elements.  These work fine when called at the command
line.

fnC() defines a local vector x and calls fnA() which operates on this
vector.  Then fnB() is called, but it operates on a global vector x in
GlobalEnv (or returns an error if x doesn't exist there) - but I want
it
to operate on the local vector.

I think this is related to the enclosing environment of all three
functions being GlobalEnv (since they were created at the command
line),
but the parent environment of fnB() being different when invoked from
within fnC().

My questions:

1  Have I correctly understood the issue ?
2  How do I make fnB() operate on the local vector rather than the
global one ?
3  And, as an aside, I have used as.character(substitute(my.x)) to
pass
the name - but deparse(substitute(my.x)) also works.  Is there any
reason to prefer one over the other?

Thank you ...




__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help


[R] Scoping rules

2003-10-08 Thread Peter Alspach
Dear List members:

I'm using R1.7.1 (Windows 2000) and having difficulty with scoping. 
I've studied the FAQ and on-line manuals and think I have identified
the
source of my difficulty, but cannot work out the solution.

For the purposes of illustration.  I have three functions as defined
below:

fnA <- function(my.x)
{
  list(first=as.character(substitute(my.x)), second=sqrt(my.x))
}

fnB <- function(AA)
{
  tmp.x <- get(AA$first)
  tmp.x^2
}

fnC <- function()
{
  x <- 1:2
  y <- fnA(x)
  z <- fnB(y)
  c(x,y,z)
}

fnA() has a vector as an argument and returns the name of the vector
and the square root of its elements in a list.  fnB() takes the result
of fnA() as its argument, gets the appropriate vector and computes the
square of its elements.  These work fine when called at the command
line.

fnC() defines a local vector x and calls fnA() which operates on this
vector.  Then fnB() is called, but it operates on a global vector x in
GlobalEnv (or returns an error if x doesn't exist there) - but I want
it
to operate on the local vector.

I think this is related to the enclosing environment of all three
functions being GlobalEnv (since they were created at the command
line),
but the parent environment of fnB() being different when invoked from
within fnC().

My questions:

1  Have I correctly understood the issue ?
2  How do I make fnB() operate on the local vector rather than the
global one ?
3  And, as an aside, I have used as.character(substitute(my.x)) to
pass
the name - but deparse(substitute(my.x)) also works.  Is there any
reason to prefer one over the other?

Thank you ...


Peter Alspach




__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help