### Re: [R] A coding question involving variable assignments in ifelse()

I agree entirely with Gabor. My advice would be to just ignore the people who think differently -- however, if you want those particular folks to respond, you'll have to play by their rules. (and if you don't play by their rules, you'll just have to ignore the consequences -- this _IS_ the internet, after all). On Friday 27 April 2007, Gabor Grothendieck wrote: I don't think there is any requirement to identify yourself in any way nor should their be. Many people on the list are in academia and in those cases they probably want their name in lights but others may wish to have a lower profile and its common to use an alias on the net for privacy. On 4/27/07, xpRt.wannabe [EMAIL PROTECTED] wrote: Is this an ad hominem comment or a comment of brevity? Unless my eyes are playing tricks on me, I can't seem to find any language in the Posting Guide on what is considered a reasonable vs. unreasonable request from an anonymous poster. Kindly point me to it if it exists. In any case, thanks for your time and suggestion. On 4/26/07, Duncan Murdoch [EMAIL PROTECTED] wrote: On 4/26/2007 5:21 PM, xpRt.wannabe wrote: I made a few slight modifications to the original model in an effort to see the inner workings of the code: deductible - 1 coverage.limit - 2 insurance.threshold - deductible + coverage.limit snip set.seed(123) loss - abs(rnorm(rpois(1, 5), 1, 3)) n - length(loss) accept - runif(n) 0.8 payout - runif(n) 0.999 sum(ifelse(accept payout, ifelse(loss insurance.threshold, loss - coverage.limit, pmin(loss, deductible)), 0)) [1] 6.188817 snip To tease out the data as well as to see the effect of 'accept payout', I did the following: loss [1] 3.401663 4.570620 4.068667 4.718488 accept [1] TRUE FALSE TRUE TRUE # The second loss claim is NOT accepted by the insurance company. payout [1] TRUE TRUE TRUE TRUE accept payout [1] TRUE FALSE TRUE TRUE # The second entry is FALSE because of the second entry in 'accept.' Based on the inner ifelse() expression, the original loss numbers become : 1.401663, 2.570620, 2.068667, 2.718488, respectively (which is fine and what I wanted). Because the second entry in 'accept payout' is FALSE, the second altered loss number (2.570620) becomes 0, making sum(...) equal 6.188817. Unfortunately this is _not_ what I want, and I apologize for not being clear in the first place. What I want is: for any FALSE entry, the original loss number is unaltered, as opposed to become 0. So in the example above, the four numbers that should have been added are: 1.401663, 4.570620, 2.068667, 2.718488, yielding 10.759438 instead of 6.188817. Any further suggestions would be greatly appreciated. I'm sorry, but from an anonymous poster that's not a reasonable request. Just work it out yourself. Duncan Murdoch On 4/26/07, Duncan Murdoch [EMAIL PROTECTED] wrote: On 4/26/2007 2:31 PM, xpRt.wannabe wrote: Just to be sure, is what I have below the right intepretation of your suggestion: Yes, that's what I suggested. Duncan Murdoch deductible - 15 coverage.limit - 75 insurance.threshold - deductible + coverage.limit tmpf - function() { loss - rlnorm(rpois(1, 3), 2, 5) n - length(loss) accept - runif(n) 0.8 payout - runif(n) 0.999 sum(ifelse(accept payout, ifelse(loss insurance.threshold, loss - coverage.limit, pmin(loss, deductible)), 0)) } net - replicate(100, tmpf()) On 4/26/07, Duncan Murdoch [EMAIL PROTECTED] wrote: On 4/26/2007 12:48 PM, xpRt.wannabe wrote: Dear List, Below is a simple, standard loss model that takes into account the terms of an insurance policy: deductible - 15 coverage.limit - 75 insurance.threshold - deductible + coverage.limit tmpf - function() { loss - rlnorm(rpois(1, 3), 2, 5) sum(ifelse(loss insurance.threshold, loss - coverage.limit, pmin(loss, deductible))) } net - replicate(100, tmpf()) Now, I would like to enhance the model by incorporating the following two probabilities: 1. Probability of claim being accepted by the insurance company, say, 0.8 2. Probability of payout by the insurance company, say, 0.999 Could anyone suggest how one might do this? A general way to generate events with probability p is runif(n) p. So I'd add n - length(loss) accept - runif(n) 0.8 payout - runif(n) 0.999 and then require accept payout before any payment at all, e.g. sum(ifelse(accept payout, [ your old ifelse expression ], 0)) There are a lot of implicit independence assumptions here; they may not be very realistic. Duncan

### [R] Converting list of data frame to data frame

Greetings, This might be something very simple but a nice solution eludes me!! I have a function that I call within sapply that generates data frame in each call. Now when sapply returns me back the result - it's in the form of a list of data frames. so in order to extract the information into a single data frame I have to loop thru the following code: for(i=1:n) { my.df = rbind(my.df,list.from.sapply[,i]); } Is there anyway to code it better? Thanks in advance. AP [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] Hmisc curve label size cex

R-Masters, I need to produce high resolution line plots and place labels on the curves. It seems that cex must be high relative to the other cex values in order to produce sufficiently large legible tick labels at high resolutions. But high cex values cause the curve labels to become gigantic when using Hmisc. I've struggled and searched the archives, but cannot find a way of controlling the sizes of the curve labels in this situation. These commands produce the problem on my PC using XP: png(trial.png, width=3000, height=2400, res = 600, pointsize=12 ) par(ann=F, font.main=1, font.lab=1, font.axis=1, cex=5, cex.main=1, cex.lab=1, cex.axis=1, lwd=12, las=1, mar=c(4, 4, 2, 2) ) x = seq(-2.5, 2.5, length=100) labcurve( list( One= list( x,sin(x)), Two= list( x,cos(x)), Three=list( x,(x*x)), Four= list( x,exp(x)) ), keys=c('1','2','3','4'), keyloc=none, pl=TRUE ) dev.off() Thanks for your time. -- Brian O'Connor Ontario, Canada __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Hmisc curve label size cex

Brian O'Connor wrote: R-Masters, I need to produce high resolution line plots and place labels on the curves. It seems that cex must be high relative to the other cex values in order to produce sufficiently large legible tick labels at high resolutions. But high cex values cause the curve labels to become gigantic when using Hmisc. I've struggled and searched the archives, but cannot find a way of controlling the sizes of the curve labels in this situation. These commands produce the problem on my PC using XP: png(trial.png, width=3000, height=2400, res = 600, pointsize=12 ) par(ann=F, font.main=1, font.lab=1, font.axis=1, cex=5, cex.main=1, cex.lab=1, cex.axis=1, lwd=12, las=1, mar=c(4, 4, 2, 2) ) x = seq(-2.5, 2.5, length=100) labcurve( list( One= list( x,sin(x)), Two= list( x,cos(x)), Three=list( x,(x*x)), Four= list( x,exp(x)) ), keys=c('1','2','3','4'), keyloc=none, pl=TRUE ) dev.off() Thanks for your time. cex.main .lab .axis etc. are relative so yo need for your case to specify something like cex.axis=1/5 Not sure why you are using keys of 1-4 when you've already given nice labels. I tried labcurve( list( One= list( x,sin(x)), Two= list( x,cos(x)), Three=list( x,(x*x)), Four= list( x,exp(x)) ), pl=TRUE ) and got some nice output after reducing cex.* Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] normalizing affy data caused an error

Hi all, I tried to do normalization of affymetrix data with bioconductor on a Linux server. When I read in the cel files all seemed ok. But the next step caused an error. With Win XP all works fine. Did anyone experience similar problems? Thanks, Thomas PI - ReadAffy() PI AffyBatch object size of arrays=712x712 features (14 kb) cdf=ATH1-121501 (??? affyids) number of samples=6 number of genes=506944 annotation=ath1121501 notes= Warning messages: 1: Line starting 'TITLEError/TITLE ...' is malformed! 2: Line starting 'BODY ...' is malformed! 3: Line starting 'H1Error/H1 ...' is malformed! 4: missing cdf environment! in show(AffyBatch) sessionInfo() R version 2.5.0 (2007-04-23) x86_64-unknown-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] tools stats graphics grDevices utils datasets [7] methods base other attached packages: affy affyio Biobase 1.14.0 1.4.0 1.14.0 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] pacf

Hi, I wanted to understand exactly how acf and pacf works, so I tried to calculate ac and pac manually. For ac, I used the standard acf formula: acf(k) = sum(X(t)-Xbar)(X(t-k)-Xbar))/sum(X(t)-Xbar)^2. But for pac, I could not figure out how to calculate it by hand. I understand that in both R and EVIEWS, it is done using the Durbin-Levinson algorithm by the computer. However, I don't understand exactly how the algorithm works just by looking at the algorithm. Does anyone know if there is a short cut to calculate pac by hand (or in a spreadsheet), or is it too complex of a procedure that a computer is absolutely necessary? It seems that there should be a natural relationship between ac and pac so that once ac is calculated, pac can be easily calculated based on ac. Thanks, -- Tom [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Write text in the

Matt == Matthew Neilson [EMAIL PROTECTED] on Fri, 27 Apr 2007 15:54:20 +0100 writes: Matt Hey Felix, Matt So basically what you want is a figure containing a block of four plots, with a main title for the figure? If that's the case then something like this should work: Matt # BEGIN CODE # Matt par(oma=c(0,0,1,0), mfrow=c(2,2)) Matt for(i in 1:4){ Matt plot(NA,xlim=range(0,10),ylim=range(-5,5)) Matt title(paste(Plot ,i,sep=)) Matt } Matt par(mfrow=c(1,1), oma=c(0,0,1,0)) Matt mtext(Main Title, 3, outer = TRUE, cex = par(cex.main)) Matt # END CODE # Matt You can play about with the margins, but I think that's the general idea. Is this what you're after? Yes, and since this is so often desired, with our sfsmisc package, you can simply say sfsmisc::mult.fig(4, main = Main Title) for(i in 1:4){ plot(NA,xlim=range(0,10),ylim=range(-5,5)) title(paste(Plot ,i,sep=)) } If you're a good R-citizen, you will want to be able to reset the graphics parameters, which would extend the above to op - sfsmisc::mult.fig(4, main = Main Title) $ old.par for(i in 1:4){ plot(NA,xlim=range(0,10),ylim=range(-5,5)) title(paste(Plot ,i,sep=)) } par(op) -- Martin Maechler, ETH Zurich Matt On Fri Apr 27 15:34 , Felix Wave [EMAIL PROTECTED] sent: Hello, I started a graphic device: par(oma = c(2,0,0,0), mfrow=c(2,2) ) in the cols and rows are three images. Now I want to write a text in the device region, it's the main title of the graphic device. But when I use mtext() I can only write in the figure region of my four plots. Has anybody an idea? Thanks a lot. Felix My R Code: -- par(oma = c(2,0,0,0), mfrow=c(2,2) ) mtext(Main title, side = 3, line = 0) image(zDIV) image(zMEDIAN ) image(zMEAN) image(zSD) dev.off() graphic: --- |MAIN TITLE device region --- | figure region| figure region| | --|| | | ||| || | | ||| || | | ||| || | | ||| || | -- | | | --- | figure region| figure region| | --|| | | ||| || | | ||| || | | ||| || | | ||| || | -- | | | __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Converting list of data frame to data frame

On 4/28/07, Ajit Pawar [EMAIL PROTECTED] wrote: Greetings, This might be something very simple but a nice solution eludes me!! I have a function that I call within sapply that generates data frame in each call. Now when sapply returns me back the result - it's in the form of a list of data frames. so in order to extract the information into a single data frame I have to loop thru the following code: for(i=1:n) { my.df = rbind(my.df,list.from.sapply[,i]); } Is there anyway to code it better? do.call(rbind, my.df.list.from.sapply) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] The confidence level of p-value of ks.boot

Hello! I need to compare 2 datasets whether they come from the same distribution. I use function ks.boot{Matching}. And what is the confidence level of the p-value, returned by ks.boot function? The code is: set=read.table(http://stella.sai.msu.ru:8080/~gala/data/testsets.csv;, header=T,sep=',') set1=set[!is.na(set$set1),'set1'] set2=set[!is.na(set$set2),'set2'] library(Matching) ks.b=ks.boot(set1,set2,1000) ks.b Thank you! __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] A coding question involving variable assignments in ifelse()

On 4/28/2007 6:20 AM, AJ Rossini wrote: I agree entirely with Gabor. My advice would be to just ignore the people who think differently That's fairly bad advice, in that many of the people who actually provide helpful advice are old-fashioned, and like to know who they're providing it to. If xpRt.wannabe had followed your advice a few days ago, s/he would have seen no help at all. Or maybe you meant to say, ignore their wishes, and not ignore their help? -- however, if you want those particular folks to respond, you'll have to play by their rules. (and if you don't play by their rules, you'll just have to ignore the consequences -- this _IS_ the internet, after all). And if you want anyone else to respond, you may just be out of luck. Duncan Murdoch On Friday 27 April 2007, Gabor Grothendieck wrote: I don't think there is any requirement to identify yourself in any way nor should their be. Many people on the list are in academia and in those cases they probably want their name in lights but others may wish to have a lower profile and its common to use an alias on the net for privacy. On 4/27/07, xpRt.wannabe [EMAIL PROTECTED] wrote: Is this an ad hominem comment or a comment of brevity? Unless my eyes are playing tricks on me, I can't seem to find any language in the Posting Guide on what is considered a reasonable vs. unreasonable request from an anonymous poster. Kindly point me to it if it exists. In any case, thanks for your time and suggestion. On 4/26/07, Duncan Murdoch [EMAIL PROTECTED] wrote: On 4/26/2007 5:21 PM, xpRt.wannabe wrote: I made a few slight modifications to the original model in an effort to see the inner workings of the code: deductible - 1 coverage.limit - 2 insurance.threshold - deductible + coverage.limit snip set.seed(123) loss - abs(rnorm(rpois(1, 5), 1, 3)) n - length(loss) accept - runif(n) 0.8 payout - runif(n) 0.999 sum(ifelse(accept payout, ifelse(loss insurance.threshold, loss - coverage.limit, pmin(loss, deductible)), 0)) [1] 6.188817 snip To tease out the data as well as to see the effect of 'accept payout', I did the following: loss [1] 3.401663 4.570620 4.068667 4.718488 accept [1] TRUE FALSE TRUE TRUE # The second loss claim is NOT accepted by the insurance company. payout [1] TRUE TRUE TRUE TRUE accept payout [1] TRUE FALSE TRUE TRUE # The second entry is FALSE because of the second entry in 'accept.' Based on the inner ifelse() expression, the original loss numbers become : 1.401663, 2.570620, 2.068667, 2.718488, respectively (which is fine and what I wanted). Because the second entry in 'accept payout' is FALSE, the second altered loss number (2.570620) becomes 0, making sum(...) equal 6.188817. Unfortunately this is _not_ what I want, and I apologize for not being clear in the first place. What I want is: for any FALSE entry, the original loss number is unaltered, as opposed to become 0. So in the example above, the four numbers that should have been added are: 1.401663, 4.570620, 2.068667, 2.718488, yielding 10.759438 instead of 6.188817. Any further suggestions would be greatly appreciated. I'm sorry, but from an anonymous poster that's not a reasonable request. Just work it out yourself. Duncan Murdoch On 4/26/07, Duncan Murdoch [EMAIL PROTECTED] wrote: On 4/26/2007 2:31 PM, xpRt.wannabe wrote: Just to be sure, is what I have below the right intepretation of your suggestion: Yes, that's what I suggested. Duncan Murdoch deductible - 15 coverage.limit - 75 insurance.threshold - deductible + coverage.limit tmpf - function() { loss - rlnorm(rpois(1, 3), 2, 5) n - length(loss) accept - runif(n) 0.8 payout - runif(n) 0.999 sum(ifelse(accept payout, ifelse(loss insurance.threshold, loss - coverage.limit, pmin(loss, deductible)), 0)) } net - replicate(100, tmpf()) On 4/26/07, Duncan Murdoch [EMAIL PROTECTED] wrote: On 4/26/2007 12:48 PM, xpRt.wannabe wrote: Dear List, Below is a simple, standard loss model that takes into account the terms of an insurance policy: deductible - 15 coverage.limit - 75 insurance.threshold - deductible + coverage.limit tmpf - function() { loss - rlnorm(rpois(1, 3), 2, 5) sum(ifelse(loss insurance.threshold, loss - coverage.limit, pmin(loss, deductible))) } net - replicate(100, tmpf()) Now, I would like to enhance the model by incorporating the following two probabilities: 1. Probability of claim being accepted by the insurance company, say, 0.8 2. Probability of payout by the insurance company, say, 0.999 Could anyone suggest how one might do this? A general way to generate events with probability p is runif(n) p. So I'd add n - length(loss) accept - runif(n) 0.8 payout - runif(n) 0.999 and then require accept payout before any payment at all, e.g. sum(ifelse(accept payout, [ your old ifelse expression ], 0)) There are a lot of implicit

### [R] RWinEdt and Windows Vista

Hi, I have a new computer with Windows Vista and I am trying to use RWinEdt, which I have always used. I am using R version 2.5. The installation of the RWinEdt library is funny. First, it didn't install at all. Then, I uninstalled/reinstalled both R and WinEdt, downloaded the package again from the CRAN repositary, got some error messages, but RWinEdt initialized. I closed R and WinEdt, launched R again, type library(RWinEdt), and got several dialogue boxes (again), asking if I wanted to creat shortcuts, etc. Then, RWinEdt didn't work anymore. I'm sorry for the messy email, but I've done so many installations/uninstallations/re installations, that I am alos confused. I guess the ultimate questions is: are there any known issued between R, RWinEdt and Windows Vista? I appreciate any help. Thanks, Dimitri __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] Perpendicular symbol in plotmath?

Hey, Does anyone know of an equivalent to the LaTeX \perp (perpendicular) symbol for adding to R plots? Parallel is easy enough (||), but I haven't been able to find a way of adding perpendicular. The plotmath documentation doesn't mention how to do it, so I'm inclined to think that it doesn't exist - but surely there must be some way of achieving the desired result, right? Any help will be much appreciated, -Matt __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] Using factors in R

Dear R super-users, I am quite new in using R and I am not managing to edit factors. Lest suppose that one has the following data: Factor A Factor B Factor C Claims Factor A has 3 factors (1,2 and 3). To simplify the glm model I only want to have 2 factor (let's say 1 and 3). I should I do this? Thank you in advance. Kind regards, Pedro Sobral [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Perpendicular symbol in plotmath?

Its available in the Hershey fonts: plot(0, 0, type = n) text(0, 0, A \\pp B, vfont = c(serif, plain)) On 4/28/07, Matthew Neilson [EMAIL PROTECTED] wrote: Hey, Does anyone know of an equivalent to the LaTeX \perp (perpendicular) symbol for adding to R plots? Parallel is easy enough (||), but I haven't been able to find a way of adding perpendicular. The plotmath documentation doesn't mention how to do it, so I'm inclined to think that it doesn't exist - but surely there must be some way of achieving the desired result, right? Any help will be much appreciated, -Matt __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] RWinEdt and Windows Vista

Dimitri: Several of us early Vista users have encountered difficulties that ultimately were related to Vista's treating only the Administrator as having permission to make some changes. Even if you are the only user, you still do not have administrative privileges by default. (I think this is a good thing since it diminishes accidental or malicious changes). With R the problem surfaces when trying to install packages. The problem is easily mitigated by running R as the administrator and installing the packages or updates. (Subsequent R sessions can be run by double clicking on the icon, as usual.) To do that, right-click on the program's icon and choose Run as administrator. Perhaps this will solve your RWinEdt problems as well. Best wishes! Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFax: 614-455-3265 http://www.StatisticalEngineering.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dimitri Szerman Sent: Saturday, April 28, 2007 11:51 AM To: R-Help Subject: [R] RWinEdt and Windows Vista Hi, I have a new computer with Windows Vista and I am trying to use RWinEdt, which I have always used. I am using R version 2.5. The installation of the RWinEdt library is funny. First, it didn't install at all. Then, I uninstalled/reinstalled both R and WinEdt, downloaded the package again from the CRAN repositary, got some error messages, but RWinEdt initialized. I closed R and WinEdt, launched R again, type library(RWinEdt), and got several dialogue boxes (again), asking if I wanted to creat shortcuts, etc. Then, RWinEdt didn't work anymore. I'm sorry for the messy email, but I've done so many installations/uninstallations/re installations, that I am alos confused. I guess the ultimate questions is: are there any known issued between R, RWinEdt and Windows Vista? I appreciate any help. Thanks, Dimitri __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] RWinEdt and Windows Vista

Charles Annis, P.E. wrote: Dimitri: Several of us early Vista users have encountered difficulties that ultimately were related to Vista's treating only the Administrator as having permission to make some changes. Even if you are the only user, you still do not have administrative privileges by default. (I think this is a good thing since it diminishes accidental or malicious changes). With R the problem surfaces when trying to install packages. The problem is easily mitigated by running R as the administrator and installing the packages or updates. (Subsequent R sessions can be run by double clicking on the icon, as usual.) To do that, right-click on the program's icon and choose Run as administrator. Perhaps this will solve your RWinEdt problems as well. Indeed, I think you will have to install the package and load the package the first time with administrator privileges. Please report whether it works or not, since I do not have a Vista machine available to try it out myself. Thanks. Uwe Ligges (as the package maintainer of RWinEdt) Best wishes! Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFax: 614-455-3265 http://www.StatisticalEngineering.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dimitri Szerman Sent: Saturday, April 28, 2007 11:51 AM To: R-Help Subject: [R] RWinEdt and Windows Vista Hi, I have a new computer with Windows Vista and I am trying to use RWinEdt, which I have always used. I am using R version 2.5. The installation of the RWinEdt library is funny. First, it didn't install at all. Then, I uninstalled/reinstalled both R and WinEdt, downloaded the package again from the CRAN repositary, got some error messages, but RWinEdt initialized. I closed R and WinEdt, launched R again, type library(RWinEdt), and got several dialogue boxes (again), asking if I wanted to creat shortcuts, etc. Then, RWinEdt didn't work anymore. I'm sorry for the messy email, but I've done so many installations/uninstallations/re installations, that I am alos confused. I guess the ultimate questions is: are there any known issued between R, RWinEdt and Windows Vista? I appreciate any help. Thanks, Dimitri __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] normalizing affy data caused an error

Thomas Funke wrote: Hi all, I tried to do normalization of affymetrix data with bioconductor on a Linux server. When I read in the cel files all seemed ok. But the next step caused an error. With Win XP all works fine. Did anyone experience similar problems? Thanks, Thomas There is the Bioconductor mailing list for questions related to BioConductor. People on that list are more likely to be of help on related questions. Best, Uwe Ligges PI - ReadAffy() PI AffyBatch object size of arrays=712x712 features (14 kb) cdf=ATH1-121501 (??? affyids) number of samples=6 number of genes=506944 annotation=ath1121501 notes= Warning messages: 1: Line starting 'TITLEError/TITLE ...' is malformed! 2: Line starting 'BODY ...' is malformed! 3: Line starting 'H1Error/H1 ...' is malformed! 4: missing cdf environment! in show(AffyBatch) sessionInfo() R version 2.5.0 (2007-04-23) x86_64-unknown-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] tools stats graphics grDevices utils datasets [7] methods base other attached packages: affy affyio Biobase 1.14.0 1.4.0 1.14.0 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Drawing Tangent

Arun == Arun Kumar Saha [EMAIL PROTECTED] on Thu, 26 Apr 2007 23:44:03 +0530 writes: Arun Dear all R-users, Arun I would like to draw a tangent of a given function for a particular (given) Arun point. However the straight line representing it should not cut any axis, it Arun should be a small line. Can anyone tell me how to do this? You will eventually call segments() to draw that short line. par(usr) {and maybe e.g. par(pin)} will probably be relevant when determining how long the line segment(s) should be. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Using factors in R

Pedro Sobral wrote: Dear R super-users, I am quite new in using R and I am not managing to edit factors. Lest suppose that one has the following data: Factor A Factor B Factor C Claims Factor A has 3 factors (1,2 and 3). To simplify the glm model I only want to have 2 factor (let's say 1 and 3). You mean factor A has 3 *levels*, I think. The question is how you want to remove level 2: exclude the observations or join those in 2 with those of another level (hence making them observations with level 1 or 3)? BTW: A more worked out example would be helpful to provide example code that solves your problem - as the posting guide suggests. Uwe Ligges I should I do this? Thank you in advance. Kind regards, Pedro Sobral [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] Calculating Variance-covariance matrix for a multivariate normal distribution.

Dear all R users, I wanted to calculated a sample Variance covariance matrix of a five-variate normal distribution. However I stuck to calculate each element of that matrix. My question is should I calculate ordinary variance and covariances, taking pairwise variables? or I should take partial covariance between any two variables, keeping other fixed. In my decent opinion is I should go for the second option? Your help will be highly appreciated. Thanks and regards, stat - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] error returned by make check in R-2.5.0

Professor Ripley, Thank you for your comments. On 4/28/07, Prof Brian Ripley [EMAIL PROTECTED] wrote: How big is abs(lam - lam2[i])/Meps ? Here is the result on my system: abs(lam - lam2[i])/Meps [1] 60 18 17 0 12 which is quite surprising to me that the maximum value exactly equals the cutoff, so perhaps this is not such a big problem. I changed the value to 61 in the reg-tests-1.R file, and then make check runs with no errors. This could be one of those cases where your system (which CPU, what compilers?) is unusually inaccurate, but Linux systems do not usually differ much in their accuracies on the same CPU. It is somewhat disconcerting that my system is unusually inaccurate. I'm running Intel Xeon 3.20GHz CPUs and I've been using gcc 4.1.1 and ifort to compile R (these are what the compile script choose by default on my system). Perhaps it is a bad idea to mix the GNU and Intel compilers? I've checked several systems, as see maximum values of around 40. R 2.5.0 has an updated LAPACK so it will be different from 2.4.1. My hunch is that this is a compiler optimization bug, so you could see what happens at lower optimization levels. I'm not sure how to do this, but from reading R-admin.html, my guess is that setting the CFLAGS for ./configure will change the optimization level for LAPACK compilation, so I used ./configure CFLAGS=-g -O And after make, I find the exact same values as before. So I thought I'd try to set the Fortran compiler to a GNU compiler, to be consistent with gcc. When I set F77=g77, make check finishes fine, and running the same test abs(lam - lam2[i])/Meps [1] 32 8 17 1 2 which makes me more comfortable. Also, if I use the g95 compiler (./configure F77=g95), the eigen test gives abs(lam - lam2[i])/Meps [1] 48.0 10.0 10.5 9.0 6.0 So, I suppose I will use the g77 compiler. Eric Thompson Tufts University Civil Environmental Engineering Graduate Student On Fri, 27 Apr 2007, Eric Thompson wrote: Today I tried to install the R-2.5.0 (currently running R-2.4.1) on Mandriva Linux. The ./configure and make commands seem to run fine, but make check gives the following messages: running regression tests make[3]: Entering directory `/home/ethomp04/R-2.5.0/tests' running code in 'reg-tests-1.R' ...make[3]: *** [reg-tests-1.Rout] Error 1 make[3]: Leaving directory `/home/ethomp04/R-2.5.0/tests' make[2]: *** [test-Reg] Error 2 make[2]: Leaving directory `/home/ethomp04/R-2.5.0/tests' make[1]: *** [test-all-basics] Error 1 make[1]: Leaving directory `/home/ethomp04/R-2.5.0/tests' make: *** [check] Error 2 Regarding make check, the R-admin.html page says Failures are not necessarily problems as they might be caused by missing functionality... So, looking at the reg-tests-1.Rout.fail file, I see that the error occurs here: ## eigen Meps - .Machine$double.eps set.seed(321, kind = default) # force a particular seed m - matrix(round(rnorm(25),3), 5,5) sm - m + t(m) #- symmetric matrix em - eigen(sm); V - em$vect print(lam - em$values) # ordered DEcreasingly [1] 5.1738946 3.1585064 0.6849974 -1.6299494 -2.5074489 stopifnot( + abs(sm %*% V - V %*% diag(lam)) 60*Meps, + abs(sm - V %*% diag(lam) %*% t(V)) 60*Meps) ##--- Symmetric = FALSE: -- different to above : --- em - eigen(sm, symmetric = FALSE); V2 - em$vect print(lam2 - em$values) # ordered decreasingly in ABSolute value ! [1] 5.1738946 3.1585064 -2.5074489 -1.6299494 0.6849974 print(i - rev(order(lam2))) [1] 1 2 5 4 3 stopifnot(abs(lam - lam2[i]) 60 * Meps) Error: abs(lam - lam2[i]) 60 * Meps is not all TRUE Execution halted Interestingly, running these same tests on R-2.4.1 on the same system does not give an error: Meps - .Machine$double.eps set.seed(321, kind = default) # force a particular seed m - matrix(round(rnorm(25),3), 5,5) sm - m + t(m) #- symmetric matrix em - eigen(sm); V - em$vect print(lam - em$values) # ordered DEcreasingly [1] 5.17389456321 3.15850637323 0.68499738238 -1.62994940108 -2.50744891774 stopifnot( + abs(sm %*% V - V %*% diag(lam)) 60*Meps, + abs(sm - V %*% diag(lam) %*% t(V)) 60*Meps) ##--- Symmetric = FALSE: -- different to above : --- em - eigen(sm, symmetric = FALSE); V2 - em$vect print(lam2 - em$values) # ordered decreasingly in ABSolute value ! [1] 5.17389456321 3.15850637323 -2.50744891774 -1.62994940108 0.68499738238 print(i - rev(order(lam2))) [1] 1 2 5 4 3 stopifnot(abs(lam - lam2[i]) 60 * Meps) abs(lam - lam2[i]) 60 * Meps [1] TRUE TRUE TRUE TRUE TRUE I'm not sure what to do next, or how serious of a problem this might be. I would appreciate any suggestions or advice. I thought maybe this was due to something about how my system is setup, but since I don't get the error in R-2.4.1, that seems to imply to me that there is something different in R-2.5.0 that

### Re: [R] error returned by make check in R-2.5.0

On Sat, 28 Apr 2007, Eric Thompson wrote: Professor Ripley, Thank you for your comments. On 4/28/07, Prof Brian Ripley [EMAIL PROTECTED] wrote: How big is abs(lam - lam2[i])/Meps ? Here is the result on my system: abs(lam - lam2[i])/Meps [1] 60 18 17 0 12 which is quite surprising to me that the maximum value exactly equals the cutoff, so perhaps this is not such a big problem. I changed the value to 61 in the reg-tests-1.R file, and then make check runs with no errors. This could be one of those cases where your system (which CPU, what compilers?) is unusually inaccurate, but Linux systems do not usually differ much in their accuracies on the same CPU. It is somewhat disconcerting that my system is unusually inaccurate. I'm running Intel Xeon 3.20GHz CPUs and I've been using gcc 4.1.1 and ifort to compile R (these are what the compile script choose by default on my system). Perhaps it is a bad idea to mix the GNU and Intel compilers? Ah, ifort is less accurate than gfortran quite often, because it cares more about speed. It all depends on the version: on my Opteron x86_64 system and icc/ifort 9.1 the test passes, whereas on a i686 (older Xeons) with icc/ifort 9.0 quite a few tests fail unless optimization is turned down. (I guess those Xeons could be either architecture.) We will raise the limit a bit, as this seems a little too strigent. [...] -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] weight

Natalie O'Toole napsal(a): Does anyone know why it is giving me this error? Any help would be greatly appreciated!! Thanks, Nat myfile-(c:/test2.txt) mysubset-myfile mysubset$Y_Q02 -mysubset$DVSELF -NULL mysubset2-mysubset mysubset2$Y_Q10B -mysubset2$GP2_07 -NULL myVariableNames-c(PUMFID=rnorm(10),PROV=rnorm(10),REGION=rnorm(10),GRADE=rnorm(10),Y_Q10A=rnorm(10),WTPP=rnorm(10)) df-mysubset2[, 2:5] * mysubset2[, 6] HERE it has to stop with the error message you provided, not below. The code might run for a few more lines, but the problem is here. Your mysubset2, btw. we have no idea what that might be, aparently does not have 2 dimensions. Besides, your code is very dirty and it is indeed very easy to make a mistake in such code. You really should start reading some introductory manual as someone suggested before. You are not likely to recieve more (different) answers to questions of this kind. Petr myVariableWidths-c(5,2,1,2,1,12.4) df-read.fwf( file=myfile, width=myVariableWidths, col.names=myVariableNames, row.names=PUMFID, fill=TRUE, strip.white=TRUE) happyguys-subset(df, PROV==48 GRADE == 7 Y_Q10A 9) print(happyguys) where it is bolded, i'm getting the following error: Error in mysubset2[, 2:5] : incorrect number of dimensions snip -- Petr Klasterecky Dept. of Probability and Statistics Charles University in Prague Czech Republic __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Randomising matrices

Nick Cutler s0455078 at sms.ed.ac.uk writes: I would like to be able to randomise presence-absence (i.e. binary) matrices whilst keeping both the row and column totals constant. Is there a function in R that would allow me to do this? I'm working with vegetation presence-absence matrices based on field observations. The matrices are formatted to have sites as rows and species as columns. The presence of a species on a site is indicated with a 1 (absence is obviously indicated with a 0). I would like to randomise the matrices many times in order to construct null models. However, I cannot identify a function in R to do this, and the programming looks tricky for someone of my limited skills. Can anybody help me out? Nick, For a 1001 x 1001 matrix, this method takes less than 2 seconds on my 2 year old Windows PC. ronetab( marg1, marg2 ) returns a table of 0's and 1's according to the marginal contraints. ck.ronetab( marg1, marg2 ) checks that all the constraints were honored. msample - function(x,marg) { ## Purpose: sample at most one each from each cell of a margin ## -- ## Arguments: x - number to sample, marg - a vector of integers ## -- ## Author: Charles C. Berry, Date: 28 Apr 2007, 08:17 ## GPL 2.0 or better if (!(x=sum(marg) all(marg=0))) browser() wm - which(marg!=0) if (length(wm)==1) { wm } else { sample( seq(along=marg), x, prob=marg ) } } ronetab - function(m1,m2,debug=F) { ## Purpose: sample from a table with fixed margins and {0,1} cell values ## -- ## Arguments: m1, m2 - two margins ## -- ## Author: Charles C. Berry, Date: 28 Apr 2007, 08:21 ## GPL 2.0 or better stopifnot( sum(m1)==sum(m2)|| max(m1)length(m2) || max(m2)length(m1) ) i.list - j.list - list() k - 0 while( sum(m1)0 ){ k - k+1 if ( sum(m1!=0) sum(m2!=0) ){ i - which.max( m1) j - msample( m1[i], m2 ) i.list[[ k ]] - rep( i, m1[i] ) j.list[[ k ]] - j m1[i] - 0 m2[ j ] - m2[ j ] - 1 } else { j - which.max( m2 ) i - msample( m2[j], m1 ) i.list[[ k ]] - i j.list[[ k ]] - rep( j, m2[j] ) m2[j] - 0 m1[ i ] - m1[ i ] - 1 } } res - array(0, c(length(m1), length(m2) ) ) res[ cbind( unlist(i.list), unlist(j.list) ) ] - 1 res } ck.ronetab - function(m1,m2){ tab - ronetab(m1,m2) m1.ck - all(m1==rowSums(tab)) m2.ck - all(m2==colSums(tab)) cell.ck - all( tab %in% 0:1 ) res - m1.ck m2.ck cell.ck if (!res) attr(res,tab) - tab res } I'll warn you that I have not worked through what looks to be a tedious proof that this randomly samples matrices under the constraints. The heuristics seem right, and a few simulation spot checks look reasonable. If you do not want to trust it, you can still use it to generate a starting value for an MCMC run. HTH, Chuck __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Hmisc curve label size cex

Thanks for your reply Frank. I realize that the cex values are relative. The problem is that for high resolutions, cex has to be high (e.g., around 5) for the tick and curve labels to be legible. The curve and tick labels were tiny when I used cex=1 and set the other cex values to 1/5. But when cex is high, then the Hmisc curve labels become too large. I don't doubt that I'm doing something wrong, I just can't figure out what. And I want the integer curve labels because I will have many lines and multiple plots, and need to keep the labels brief. Thanks. Brian O'Connor wrote: R-Masters, I need to produce high resolution line plots and place labels on the curves. It seems that cex must be high relative to the other cex values in order to produce sufficiently large legible tick labels at high resolutions. But high cex values cause the curve labels to become gigantic when using Hmisc. I've struggled and searched the archives, but cannot find a way of controlling the sizes of the curve labels in this situation. These commands produce the problem on my PC using XP: png(trial.png, width=3000, height=2400, res = 600, pointsize=12 ) par(ann=F, font.main=1, font.lab=1, font.axis=1, cex=5, cex.main=1, cex.lab=1, cex.axis=1, lwd=12, las=1, mar=c(4, 4, 2, 2) ) x = seq(-2.5, 2.5, length=100) labcurve( list( One= list( x,sin(x)), Two= list( x,cos(x)), Three=list( x,(x*x)), Four= list( x,exp(x)) ), keys=c('1','2','3','4'), keyloc=none, pl=TRUE ) dev.off() Thanks for your time. cex.main .lab .axis etc. are relative so yo need for your case to specify something like cex.axis=1/5 Not sure why you are using keys of 1-4 when you've already given nice labels. I tried labcurve( list( One= list( x,sin(x)), Two= list( x,cos(x)), Three=list( x,(x*x)), Four= list( x,exp(x)) ), pl=TRUE ) and got some nice output after reducing cex.* Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University -- -- -- Brian O'Connor Department of Psychology Lakehead University 955 Oliver Road Thunder Bay, Ontario, Canada P7B 5E1 (807) 343-8322 Fax: (807) 346-7734 e-mail: [EMAIL PROTECTED] alternate e-mail: [EMAIL PROTECTED] http://flash.lakeheadu.ca/~boconno2/boconnor.html __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] pacf

tom == tom soyer [EMAIL PROTECTED] on Sat, 28 Apr 2007 08:15:39 -0500 writes: tom I wanted to understand exactly how acf and pacf works, tom so I tried to calculate ac and pac manually. For ac, I tom used the standard acf formula: acf(k) = tom sum(X(t)-Xbar)(X(t-k)-Xbar))/sum(X(t)-Xbar)^2. But for tom pac, I could not figure out how to calculate it by tom hand. I understand that in both R and EVIEWS, it is tom done using the Durbin-Levinson algorithm by the tom computer. However, I don't understand exactly how the tom algorithm works just by looking at the algorithm. Does tom anyone know if there is a short cut to calculate pac by tom hand (or in a spreadsheet), or is it too complex of a tom procedure that a computer is absolutely necessary? It tom seems that there should be a natural relationship tom between ac and pac so that once ac is calculated, pac tom can be easily calculated based on ac. easily, yes, by the Durbin-Levinson algorithm ;-) (is this a homework problem?) Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] digest readability/ thunderbird email client/ R-help

Hi Sam, SamMc == Sam McClatchie [EMAIL PROTECTED] on Fri, 27 Apr 2007 11:21:56 -0700 writes: SamMc System:Linux kernel 2.6.15 Ubuntu dapper ... SamMc Has anyone figured out how to make the R-help digest SamMc more easily readable in the Thunderbird mail client? SamMc I think the digest used to be more readable than is SamMc currently the case with all of the messages as SamMc attachments. SamMc I know the work around is to get individual messages SamMc rather than the digest, use another mail client, or SamMc just look at the archives on the web... {and there are at least two alternatives to the standard archives, notably Gmane. BTW: Just found an URL that should list all R lists carried by them : http://dir.gmane.org/search.php?match=.r. It had been pointed out more than once that Thunderbird unfortunately is not adhering to the standard(s) of such digests-- too bad it still is not. One alternative is to get the digest as plain digest ((which BTW another standard-complying digest format, that e.g. emacs Rmail or VM can easily deal with)) which will most probably just appear as one long e-mail in Thunderbird, with table of contents of all the subject lines, but nothing clickable Regards, Martin Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Calculating Variance-covariance matrix for a multivariate normal distribution.

stat stat wrote: Dear all R users, I wanted to calculated a sample Variance covariance matrix of a five-variate normal distribution. However I stuck to calculate each element of that matrix. My question is should I calculate ordinary variance and covariances, taking pairwise variables? or I should take partial covariance between any two variables, keeping other fixed. In my decent opinion is I should go for the second option? The definition is the former, though... Your help will be highly appreciated Thanks and regards, stat - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Calculating Variance-covariance matrix for a multivariate normal distribution.

Dear stat, Interesting claim to a name! In any case, var(X) where X is the data matrix with n rows of 5-variables should do the trick. Btw, please read the posting guide: your question is legitimate, hiding your identity (stat stat) is not. Best wishes, Ranjan On Sat, 28 Apr 2007 16:36:55 +0100 (BST) stat stat [EMAIL PROTECTED] wrote: Dear all R users, I wanted to calculated a sample Variance covariance matrix of a five-variate normal distribution. However I stuck to calculate each element of that matrix. My question is should I calculate ordinary variance and covariances, taking pairwise variables? or I should take partial covariance between any two variables, keeping other fixed. In my decent opinion is I should go for the second option? Your help will be highly appreciated. Thanks and regards, stat - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] omit y=zero line in histogram

Reply to self: set border=NA, stupid. Paul Artes wrote: Dear all, hist ( ) plots a horizontal line at y=0 when the respective bin is empty. I can deal with this by modifying the hist object before plotting it (x$density[x$density == 0] - NA), but I'm sure I've seen a more elegant way. Perhaps this was in truehist (MASS). I have looked but can't find it. Does anyone know? Best wishes Paul -- View this message in context: http://www.nabble.com/omit-y%3Dzero-line-in-histogram-tf3643983.html#a10236720 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] Comparing MCMClogit, glm and BRUGS

Hello, I have two related questions, one about MCMClogit and the other about BRUGS: Given the data on nausea due to diuretic and nsaid below: nsaid diureticyes no 0 0 185 6527 0 1 53 1444 1 0 42 1293 1 1 25 253 A logistic regression in glm gives: Coefficients: Estimate Std. Error z value Pr(|z|) (Intercept) -3.563350.07456 -47.794 2e-16 *** nsaid0.136300.17361 0.785 0.43242 diuretic 0.258470.15849 1.631 0.10293 I(nsaid * diuretic) 0.854070.30603 2.791 0.00526 ** But in BRUGS: model { for(i in 1:N) { yes[i] ~ dbin(p[i],no[i]) logit(p[i]) - beta0+beta1*nsaid[i]+beta2*diuretic[i]+beta3*(nsaid[i]*diuretic[i]) } beta0 ~ dnorm(0,0.05) beta1 ~ dnorm(0,0.05) beta2 ~ dnorm(0,0.05) beta3 ~ dnorm(0,0.05) } results: samplesStats(*) mean sd MC_error val2.5pc median val97.5pc start sample beta0 -3.5370 0.07481 0.001134 -3.68800 -3.5370 -3.3910 1001 1 beta1 0.1332 0.17540 0.003035 -0.21610 0.13540.4663 1001 1 beta2 0.2591 0.15710 0.002757 -0.05212 0.26080.5610 1001 1 beta3 0.9142 0.30900 0.005573 0.30840 0.91761.5150 1001 1 The interaction term beta3 (0.9142) is a little different from the one of glm, why? Using the MCMClogit (same burnin and iterations as above) from the MCMC package gives a closer estimate to glm Mean SD Naive SE Time-series SE (Intercept)-3.5612 0.07678 0.0007678 0.003306 nsaid 0.1356 0.17240 0.0017240 0.007093 diuretic0.2453 0.16045 0.0016045 0.005340 nsaid:diuretic 0.8558 0.30756 0.0030756 0.011460 But the data cannot be entered in a summary like they are above (yes and no counts), instead they have to be entered as such: nsaid diuretic nausea 00 1 00 1 00 1 00 1 10 1 etc... more 9800 rows! Is there a way to use summary data (yes, no) with MCMClogit? THANKS -- View this message in context: http://www.nabble.com/Comparing-MCMClogit%2C-glm-and-BRUGS-tf3663589.html#a10236835 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] RWinEdt and Windows Vista

Thanks Charles, it seems to be working fine now, But I had to uninstall/reinstall both R and WinEdt. 2007/4/28, Uwe Ligges [EMAIL PROTECTED]: Charles Annis, P.E. wrote: Dimitri: Several of us early Vista users have encountered difficulties that ultimately were related to Vista's treating only the Administrator as having permission to make some changes. Even if you are the only user, you still do not have administrative privileges by default. (I think this is a good thing since it diminishes accidental or malicious changes). With R the problem surfaces when trying to install packages. The problem is easily mitigated by running R as the administrator and installing the packages or updates. (Subsequent R sessions can be run by double clicking on the icon, as usual.) To do that, right-click on the program's icon and choose Run as administrator. Perhaps this will solve your RWinEdt problems as well. Indeed, I think you will have to install the package and load the package the first time with administrator privileges. Please report whether it works or not, since I do not have a Vista machine available to try it out myself. Thanks. Uwe Ligges (as the package maintainer of RWinEdt) Best wishes! Charles Annis, P.E. [EMAIL PROTECTED] phone: 561-352-9699 eFax: 614-455-3265 http://www.StatisticalEngineering.com -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Dimitri Szerman Sent: Saturday, April 28, 2007 11:51 AM To: R-Help Subject: [R] RWinEdt and Windows Vista Hi, I have a new computer with Windows Vista and I am trying to use RWinEdt, which I have always used. I am using R version 2.5. The installation of the RWinEdt library is funny. First, it didn't install at all. Then, I uninstalled/reinstalled both R and WinEdt, downloaded the package again from the CRAN repositary, got some error messages, but RWinEdt initialized. I closed R and WinEdt, launched R again, type library(RWinEdt), and got several dialogue boxes (again), asking if I wanted to creat shortcuts, etc. Then, RWinEdt didn't work anymore. I'm sorry for the messy email, but I've done so many installations/uninstallations/re installations, that I am alos confused. I guess the ultimate questions is: are there any known issued between R, RWinEdt and Windows Vista? I appreciate any help. Thanks, Dimitri __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] Retraction WAS: Re: Randomising matrices

Sorry folks, With some further checking, it turns out that this sampling scheme does not conform to the relevant null. :-( Chuck On Sat, 28 Apr 2007, Charles C. Berry wrote: Nick Cutler s0455078 at sms.ed.ac.uk writes: I would like to be able to randomise presence-absence (i.e. binary) matrices whilst keeping both the row and column totals constant. Is there a function in R that would allow me to do this? I'm working with vegetation presence-absence matrices based on field observations. The matrices are formatted to have sites as rows and species as columns. The presence of a species on a site is indicated with a 1 (absence is obviously indicated with a 0). I would like to randomise the matrices many times in order to construct null models. However, I cannot identify a function in R to do this, and the programming looks tricky for someone of my limited skills. Can anybody help me out? Nick, For a 1001 x 1001 matrix, this method takes less than 2 seconds on my 2 year old Windows PC. ronetab( marg1, marg2 ) returns a table of 0's and 1's according to the marginal contraints. ck.ronetab( marg1, marg2 ) checks that all the constraints were honored. msample - function(x,marg) { ## Purpose: sample at most one each from each cell of a margin ## -- ## Arguments: x - number to sample, marg - a vector of integers ## -- ## Author: Charles C. Berry, Date: 28 Apr 2007, 08:17 ## GPL 2.0 or better if (!(x=sum(marg) all(marg=0))) browser() wm - which(marg!=0) if (length(wm)==1) { wm } else { sample( seq(along=marg), x, prob=marg ) } } ronetab - function(m1,m2,debug=F) { ## Purpose: sample from a table with fixed margins and {0,1} cell values ## -- ## Arguments: m1, m2 - two margins ## -- ## Author: Charles C. Berry, Date: 28 Apr 2007, 08:21 ## GPL 2.0 or better stopifnot( sum(m1)==sum(m2)|| max(m1)length(m2) || max(m2)length(m1) ) i.list - j.list - list() k - 0 while( sum(m1)0 ){ k - k+1 if ( sum(m1!=0) sum(m2!=0) ){ i - which.max( m1) j - msample( m1[i], m2 ) i.list[[ k ]] - rep( i, m1[i] ) j.list[[ k ]] - j m1[i] - 0 m2[ j ] - m2[ j ] - 1 } else { j - which.max( m2 ) i - msample( m2[j], m1 ) i.list[[ k ]] - i j.list[[ k ]] - rep( j, m2[j] ) m2[j] - 0 m1[ i ] - m1[ i ] - 1 } } res - array(0, c(length(m1), length(m2) ) ) res[ cbind( unlist(i.list), unlist(j.list) ) ] - 1 res } ck.ronetab - function(m1,m2){ tab - ronetab(m1,m2) m1.ck - all(m1==rowSums(tab)) m2.ck - all(m2==colSums(tab)) cell.ck - all( tab %in% 0:1 ) res - m1.ck m2.ck cell.ck if (!res) attr(res,tab) - tab res } I'll warn you that I have not worked through what looks to be a tedious proof that this randomly samples matrices under the constraints. The heuristics seem right, and a few simulation spot checks look reasonable. If you do not want to trust it, you can still use it to generate a starting value for an MCMC run. HTH, Chuck __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Charles C. Berry(858) 534-2098 Dept of Family/Preventive Medicine E mailto:[EMAIL PROTECTED] UC San Diego http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] weight

Hi, I'm getting an error message: Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator In addition: Warning message: Incompatible methods (Ops.data.frame, Ops.factor) for * here is my code: ##reading in the file happyguys-read.table(c:/test4.dat, header=TRUE, row.names=1) ##subset the file based on Select If test-subset(happyguys, PROV==48 GRADE == 7 Y_Q10A 9) ##sorting the file mydata-test mydataSorted-mydata[ order(mydata$Y_Q10A), ] print(mydataSorted) ##assigning a different name to file happyguys-mydataSorted ##trying to weight my data data.frame-happyguys df-data.frame df1-df[, 1:4] * df[, 5] ##getting error message here?? Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator In addition: Warning message: Incompatible methods (Ops.data.frame, Ops.factor) for * Does anyone know what this error message means? I've been reviewing R code all day getting more familiar with it Thanks, Nat This communication is intended for the use of the recipient to which it is addressed, and may contain confidential, personal, and or privileged information. Please contact the sender immediately if you are not the intended recipient of this communication, and do not copy, distribute, or take action relying on it. Any communication received in error, or subsequent reply, should be deleted or destroyed. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Perpendicular symbol in plotmath?

Thanks for your response, Gabor. That works quite nicely. The documentation states that it is not possible to mix and match Hershey fonts with plotmath symbols. My *ideal* scenario would be to write the perpendicular symbol as a subscript (specifically, I would like to have \epsilon_{\perp} as an axis label). I have searched the help archive, and it turned up the following post from 2002: http://tinyurl.com/2m8n9c which explains a way of faking subscripts when using the Hershey fonts, though it does have several drawbacks. Have things moved on in the last five years, or is this still the best known solution? Many thanks for your help, -Matt On Sat Apr 28 17:35 , 'Gabor Grothendieck' [EMAIL PROTECTED] sent: Its available in the Hershey fonts: plot(0, 0, type = n) text(0, 0, A \\pp B, vfont = c(serif, plain)) On 4/28/07, Matthew Neilson [EMAIL PROTECTED] wrote: Hey, Does anyone know of an equivalent to the LaTeX \perp (perpendicular) symbol for adding to R plots? Parallel is easy enough (||), but I haven't been able to find a way of adding perpendicular. The plotmath documentation doesn't mention how to do it, so I'm inclined to think that it doesn't exist - but surely there must be some way of achieving the desired result, right? Any help will be much appreciated, -Matt __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] weight

IIRC you have a yes/no smoking variable scored 1/2 ? It is possibly being read in as a factor not as an integer. try class(df$smoking.variable) to see . --- Natalie O'Toole [EMAIL PROTECTED] wrote: Hi, I'm getting an error message: Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator In addition: Warning message: Incompatible methods (Ops.data.frame, Ops.factor) for * here is my code: ##reading in the file happyguys-read.table(c:/test4.dat, header=TRUE, row.names=1) ##subset the file based on Select If test-subset(happyguys, PROV==48 GRADE == 7 Y_Q10A 9) ##sorting the file mydata-test mydataSorted-mydata[ order(mydata$Y_Q10A), ] print(mydataSorted) ##assigning a different name to file happyguys-mydataSorted ##trying to weight my data data.frame-happyguys df-data.frame df1-df[, 1:4] * df[, 5] ##getting error message here?? Error in df[, 1:4] * df[, 5] : non-numeric argument to binary operator In addition: Warning message: Incompatible methods (Ops.data.frame, Ops.factor) for * Does anyone know what this error message means? I've been reviewing R code all day getting more familiar with it Thanks, Nat This communication is intended for the use of the recipient to which it is addressed, and may contain confidential, personal, and or privileged information. Please contact the sender immediately if you are not the intended recipient of this communication, and do not copy, distribute, or take action relying on it. Any communication received in error, or subsequent reply, should be deleted or destroyed. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] Perpendicular symbol in plotmath?

I don't think you can mix plotmath and Hershey but you could do this: plot(1, xlab = quote(epsilon)) z - list(x = 1.00823, y = 0.4475955) text(z$x, z$y, \\pp, vfont = c(serif, plain), xpd = TRUE, cex = .7) You can set z via: z - locator() On 4/28/07, Matthew Neilson [EMAIL PROTECTED] wrote: Thanks for your response, Gabor. That works quite nicely. The documentation states that it is not possible to mix and match Hershey fonts with plotmath symbols. My *ideal* scenario would be to write the perpendicular symbol as a subscript (specifically, I would like to have \epsilon_{\perp} as an axis label). I have searched the help archive, and it turned up the following post from 2002: http://tinyurl.com/2m8n9c which explains a way of faking subscripts when using the Hershey fonts, though it does have several drawbacks. Have things moved on in the last five years, or is this still the best known solution? Many thanks for your help, -Matt On Sat Apr 28 17:35 , 'Gabor Grothendieck' [EMAIL PROTECTED] sent: Its available in the Hershey fonts: plot(0, 0, type = n) text(0, 0, A \\pp B, vfont = c(serif, plain)) On 4/28/07, Matthew Neilson [EMAIL PROTECTED] wrote: Hey, Does anyone know of an equivalent to the LaTeX \perp (perpendicular) symbol for adding to R plots? Parallel is easy enough (||), but I haven't been able to find a way of adding perpendicular. The plotmath documentation doesn't mention how to do it, so I'm inclined to think that it doesn't exist - but surely there must be some way of achieving the desired result, right? Any help will be much appreciated, -Matt __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] randomForest gives different results for formula call v. x, y methods. Why?

Just out of curiosity, I took the default iris example in the RF helpfile... but seeing the admonition against using the formula interface for large data sets, I wanted to play around a bit to see how the various options affected the output. Found something interesting I couldn't find documentation for... Just like the example... set.seed(12) # to be sure I have reproducibility form.rf-randomForest(Species ~ ., data=iris) form.rf Call: randomForest(formula = Species ~ ., data = iris) Type of random forest: classification Number of trees: 500 No. of variables tried at each split: 2 OOB estimate of error rate: 4.67% Confusion matrix: setosa versicolor virginica class.error setosa 50 0 00.00 versicolor 0 47 30.06 virginica 0 4460.08 long.rf-randomForest(x=iris[,1:4],y=iris[,5]) long.rf Call: randomForest(x = iris[, 1:4], y = iris[, 5]) Type of random forest: classification Number of trees: 500 No. of variables tried at each split: 2 OOB estimate of error rate: 4% Confusion matrix: setosa versicolor virginica class.error setosa 50 0 00.00 versicolor 0 47 30.06 virginica 0 3470.06 (Now, if I had non-contiguous columns for predictors, I'd have to call it this way) long2.rf-randomForest(x=iris[,c(1:4)],y=iris[,5]) long2.rf Call: randomForest(x = iris[, c(1:4)], y = iris[, 5]) Type of random forest: classification Number of trees: 500 No. of variables tried at each split: 2 OOB estimate of error rate: 5.33% Confusion matrix: setosa versicolor virginica class.error setosa 50 0 00.00 versicolor 0 47 30.06 virginica 0 5450.10 Any idea why these two should give different results? I can only figure that the seed, even though it's set, somehow gets altered by the use of a formula long3.rf-randomForest(x=iris[,c(1,2,3,4)],y=iris[,5]) long3.rf Call: randomForest(x = iris[, c(1, 2, 3, 4)], y = iris[, 5]) Type of random forest: classification Number of trees: 500 No. of variables tried at each split: 2 OOB estimate of error rate: 4.67% Confusion matrix: setosa versicolor virginica class.error setosa 50 0 00.00 versicolor 0 47 30.06 virginica 0 4460.08 Either that or I'm calling it wrong in the long example, or else there's a bug. Not a life threatening situation, but I am curious as to the mechanics of this. I use that sort of column identification all the time and it seems to work OK, but here I get different results (form.rf v. long.rf or long2.rf) or not (form.rf v. long3.rf) depending how I call the function. Any insights? -- --- David L. Van Brunt, Ph.D. mailto:[EMAIL PROTECTED] If Tyranny and Oppression come to this land, it will be in the guise of fighting a foreign enemy. --James Madison [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### [R] how to code the censor variable for survfit

Dear r-helpers, This is my first time to run survival analysis. Currently, I have a data set which contains two variables, the variable of time to event (or time to censoring) and the variable of censor indicator. For the indicator variable, it was coded as 0 and 1. 0 represents right censor, 1 means event of interest. Now I try to use survfit in the package of survival. I wrote the following code: rptsurv - survfit(surv(time,censor)~1,data=x) Before I run the code, I am concerned with my 0/1 coding to the censor indicator because I did not see any argument in the syntax of survfit, which may tell the program that value 1 means event. I checked the documentations and R-help archive, but ended in vain. Would you please kindly tell me how survfit treats censor variables? In 0/1 coding, is it the default that 1 means event and 0 means right censor? What if the censor was coded as 2 or 3 instead of 0 or 1? I means how the survfit knows the difference. In SAS, if a lifetest procedure (similar to survfit) is performed, there is an argument specifying which value in the censor variable is treated as event. I know I could just compare the results from R and from SAS to see the difference. However, I really want to know exactly how survfit deals with this problem. Thank you very much in advance. sincerely, Jiang Lu University of Pittsburgh __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] how to code the censor variable for survfit

The Surv object contains the information on the type of censoring. Look at ?Surv for an explanation of how censored events are represented. -Christos -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lu, Jiang Sent: Saturday, April 28, 2007 11:10 PM To: r-help@stat.math.ethz.ch Subject: [R] how to code the censor variable for survfit Dear r-helpers, This is my first time to run survival analysis. Currently, I have a data set which contains two variables, the variable of time to event (or time to censoring) and the variable of censor indicator. For the indicator variable, it was coded as 0 and 1. 0 represents right censor, 1 means event of interest. Now I try to use survfit in the package of survival. I wrote the following code: rptsurv - survfit(surv(time,censor)~1,data=x) Before I run the code, I am concerned with my 0/1 coding to the censor indicator because I did not see any argument in the syntax of survfit, which may tell the program that value 1 means event. I checked the documentations and R-help archive, but ended in vain. Would you please kindly tell me how survfit treats censor variables? In 0/1 coding, is it the default that 1 means event and 0 means right censor? What if the censor was coded as 2 or 3 instead of 0 or 1? I means how the survfit knows the difference. In SAS, if a lifetest procedure (similar to survfit) is performed, there is an argument specifying which value in the censor variable is treated as event. I know I could just compare the results from R and from SAS to see the difference. However, I really want to know exactly how survfit deals with this problem. Thank you very much in advance. sincerely, Jiang Lu University of Pittsburgh __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] how to code the censor variable for survfit

And be careful - R is case-sensitive. You have surv(...) instead of Surv(...) in your code, that will probably give an error. The coding is as you have it - 1=failure, 0=censored. Petr Christos Hatzis napsal(a): The Surv object contains the information on the type of censoring. Look at ?Surv for an explanation of how censored events are represented. -Christos -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Lu, Jiang Sent: Saturday, April 28, 2007 11:10 PM To: r-help@stat.math.ethz.ch Subject: [R] how to code the censor variable for survfit Dear r-helpers, This is my first time to run survival analysis. Currently, I have a data set which contains two variables, the variable of time to event (or time to censoring) and the variable of censor indicator. For the indicator variable, it was coded as 0 and 1. 0 represents right censor, 1 means event of interest. Now I try to use survfit in the package of survival. I wrote the following code: rptsurv - survfit(surv(time,censor)~1,data=x) Before I run the code, I am concerned with my 0/1 coding to the censor indicator because I did not see any argument in the syntax of survfit, which may tell the program that value 1 means event. I checked the documentations and R-help archive, but ended in vain. Would you please kindly tell me how survfit treats censor variables? In 0/1 coding, is it the default that 1 means event and 0 means right censor? What if the censor was coded as 2 or 3 instead of 0 or 1? I means how the survfit knows the difference. In SAS, if a lifetest procedure (similar to survfit) is performed, there is an argument specifying which value in the censor variable is treated as event. I know I could just compare the results from R and from SAS to see the difference. However, I really want to know exactly how survfit deals with this problem. Thank you very much in advance. sincerely, Jiang Lu University of Pittsburgh __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Petr Klasterecky Dept. of Probability and Statistics Charles University in Prague Czech Republic __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.