### [R] calculating dissimilarities in R

Dear All,

I've got a statistical question on calculating dissimilarities in R. I want to calculate the different types of dissimilarities on the flower dataset found in the package cluster. flower is a data frame with 18 observations on 8 variables. Variables 1 and 2 are binary, variable 3 is asymmetric binary, variable 4 is nominal, variables 5 and 6 are ordered, and variables 7 and 8 are interval scaled.

Commands to load the dataset in R:

    library(cluster)
    data(flower)
    flower

What are the different types of dissimilarities that can be calculated on such a dataset? Do I need to group the types of variables first, i.e. put all binary variables together and then run the calculation? Should I use dissimilarity indices such as Jaccard, or should a classification function such as daisy be used?

Many thanks,
Elvina Payet (MSc)
University of La Reunion

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
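
One way to approach the question above, sketched under the assumption that a single mixed-type (Gower-style) dissimilarity is acceptable: daisy() in package cluster handles binary, nominal, ordered and interval-scaled columns in one call, so the variables do not have to be grouped by type first. The `type = list(asymm = 3)` argument marks column 3 as asymmetric binary, as described in the post; the object name `d` is only illustrative.

```r
library(cluster)   # recommended package, ships with R
data(flower)

# daisy() chooses the per-column treatment from each column's class
# (factor, ordered factor, numeric) and combines them Gower-style.
# Column 3 is declared asymmetric binary, as in the post.
d <- daisy(flower, type = list(asymm = 3))

summary(d)
round(as.matrix(d)[1:4, 1:4], 3)   # top-left corner of the 18 x 18 matrix
```

An index such as Jaccard applies only to the binary columns, so it would be computed on that subset alone rather than on the whole data frame.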

### Re: [R] Sort problem with merge (again)

On Mon, 25 Sep 2006, Bruce LaZerte wrote:

>     # R version 2.3.1 (2006-06-01) Debian Linux testing
>     # Is the following behaviour a bug, feature or just a lack of
>     # understanding on my part? I see that this was discussed here
>     # last March with no apparent resolution.

Reference?

It is the third alternative. A factor is sorted by its codes: consider

    > x <- factor(1:3, levels=as.character(3:1))
    > x
    [1] 1 2 3
    Levels: 3 2 1
    > sort(x)
    [1] 3 2 1
    Levels: 3 2 1

and that is what is happening here: for your example the levels of df$Date are

    > levels(df$Date)
    [1] "1970-04-04" "1970-08-11" "1970-10-18" "1970-06-04" "1970-08-18"

so the result is sorted correctly. If you want to sort a character column in lexicographic order, don't make it into a factor. Similarly for a date column: use class "Date".

>     d <- as.factor(c("1970-04-04","1970-08-11","1970-10-18"))
>     x <- c(9,10,11)
>     ch <- data.frame(Date=d,X=x)
>     d <- as.factor(c("1970-06-04","1970-08-11","1970-08-18"))
>     y <- c(109,110,111)
>     sp <- data.frame(Date=d,Y=y)
>     df <- merge(ch,sp,all=TRUE,by="Date")
>     # the rows with dates missing all ch vars are tacked on the end.
>     # the rows with dates missing all sp vars are sorted in with
>     # the row with a date with vars from both ch and sp
>     # is.ordered(df$Date) returns FALSE
>     # The rows of df are not sorted as they should be as sort=TRUE
>     # is the default. Adding sort=TRUE does nothing.
>     # So try this:
>     # dd <- df[order(df$Date),]
>     # But that doesn't work.
>     # Nor does sort(df$Date)
>     # But sort(as.vector(df$Date)) does work.
>     # As does order(as.vector(df$Date)), so this works:
>     dd <- df[order(as.vector(df$Date)),]
>     # ?

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
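
The advice above (use class "Date" rather than a factor) can be sketched as follows. The data are the same as in the quoted example; only the column class changes. The names `sorted_labels` and `dd` are illustrative.

```r
# A factor sorts by its integer codes (the order of its levels), not by label.
x <- factor(1:3, levels = as.character(3:1))
sorted_labels <- as.character(sort(x))
print(sorted_labels)

# Same data as in the post, but the key column has class "Date".
ch <- data.frame(Date = as.Date(c("1970-04-04", "1970-08-11", "1970-10-18")),
                 X = c(9, 10, 11))
sp <- data.frame(Date = as.Date(c("1970-06-04", "1970-08-11", "1970-08-18")),
                 Y = c(109, 110, 111))
df <- merge(ch, sp, all = TRUE, by = "Date")

# Ordering on a Date column is chronological; no as.vector() workaround needed.
dd <- df[order(df$Date), ]
print(dd)
```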

### [R] warning message in nlm

Dear R-users,

I am trying to find the MLEs for a log-likelihood function (loglikcs39) and tried using both optim and nlm.

    fredcs39 <- function(b1,b2,x){return(exp(b1+b2*x))}
    loglikcs39 <- function(theta,len){
      sum(mcs39[1:len]*fredcs39(theta[1],theta[2],c(8:(7+len))) -
          pcs39[1:len] * log(fredcs39(theta[1],theta[2],c(8:(7+len)))))
    }
    theta.start <- c(0.1,0.1)

1. The output from using optim is as follows:

    > optcs39 <- optim(theta.start,loglikcs39,len=120,method="BFGS")
    > optcs39
    $par
    [1] -1.27795226 -0.03626846

    $value
    [1] 7470.551

    $counts
    function gradient
         133       23

    $convergence
    [1] 0

    $message
    NULL

2. The output from using nlm is as follows:

    > outcs39 <- nlm(loglikcs39,theta.start,len=120)
    Warning messages:
    1: NA/Inf replaced by maximum positive value
    2: NA/Inf replaced by maximum positive value
    3: NA/Inf replaced by maximum positive value
    4: NA/Inf replaced by maximum positive value
    5: NA/Inf replaced by maximum positive value
    6: NA/Inf replaced by maximum positive value
    7: NA/Inf replaced by maximum positive value
    > outcs39
    $minimum
    [1] 7470.551

    $estimate
    [1] -1.27817854 -0.03626027

    $gradient
    [1] -8.933577e-06 -1.460512e-04

    $code
    [1] 1

    $iterations
    [1] 40

As you can see, the values obtained from the two functions are very similar. What puzzles me is the warning message that I got from nlm. Could anyone please shed some light on how this warning message comes about and whether it is a cause for concern?

Many thanks in advance for any advice!

singyee

### [R] lapply, plot and additional arguments

Dear all,

Hopefully somebody will know the answer. I have some list

    x <- data.frame(a = 1:9, beta = exp(-4:4), logic = rep(c(TRUE,FALSE), c(5,4)))
    x.l <- split(x, x$logic)
    plot(x.l$a, x.l$beta)

and I want to plot lines colour-coded according to the logic variable:

    lapply(x.l, function(x, ...) lines(x$a, x$beta, col=1:2))
    lapply(x.l, function(x,...) lines(x$a,x$beta), col=1:2)
    lapply(x.l, function(x,...) lines(x$a,x$beta, ...), col=1:2)

Well, lapply seems to ignore my best attempts to persuade it to use a different colour for each part of the x.l list. Does anybody know how to code different colours when using lapply for such plotting? At present I use a loop, but maybe lapply could do it too.

Best regards.
Petr

Petr Pikal
[EMAIL PROTECTED]
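
A possible way round this, sketched with base R only: extra arguments given to lapply are passed unchanged to every call, so each group receives the whole vector 1:2 rather than its own colour. mapply walks over the list and a colour vector in parallel instead. The colour names in `cols` are arbitrary choices, not from the post, and the plot is set up from the unsplit data frame.

```r
x <- data.frame(a = 1:9, beta = exp(-4:4),
                logic = rep(c(TRUE, FALSE), c(5, 4)))
x.l <- split(x, x$logic)

# One colour per list element (arbitrary, illustrative colours).
cols <- c("red", "blue")
names(cols) <- names(x.l)

# Empty plot spanning the full data range, then one line per group.
plot(x$a, x$beta, type = "n")
invisible(mapply(function(d, col) lines(d$a, d$beta, col = col),
                 x.l, cols))
```

The same effect is available with lapply by iterating over `seq_along(x.l)` and indexing both the list and the colour vector inside the function.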

### [R] printing a variable name in a for loop

Hello,

How do you print a variable name in a for loop? I'm trying to construct a csv file that looks like this:

    Hello, variable1, value_of_variable1, World,
    Hello, variable2, value_of_variable2, World,
    Hello, variable3, value_of_variable3, World,

Using this:

    for (variable in list(variable1, variable2, variable3)){
      cat("Hello,", ???variable???, variable, ", World,")
    }

This works fine if I'm trying to print the VALUE of variable, but I want to print the NAME of variable as well.

Thanks,
Suzi

### [R] [R-pkgs] the IPSUR package

Dear useRs,

We are pleased to announce the preliminary release of the IPSUR package. The primary audience was originally envisioned to be upper-division undergraduate mathematics/statistics/engineering majors, but other useRs may find this material useful.

In a nutshell, this package slightly modifies and adds selected functionality to the R Commander by John Fox. The changes were meant to customize Rcmdr for our Statistics classes, populated for the most part by the audience above. Some clever functions written by John Verzani were translated to IPSUR from UsingR.

Downloads for the package (while the CRAN submission is pending) are at

http://www.cc.ysu.edu/~gjkerns/IPSUR/package/index.htm

Check out the Features page to see what the package offers.

http://www.cc.ysu.edu/~gjkerns/IPSUR/package/features.htm

Full credit must be given to John Fox, together with his diverse team of dedicated contributors. Indeed, without all of their countless hours of effort the IPSUR package would not be possible. Kudos to them for providing excellent software to the R community.

Cheers,
Jay

***
G. Jay Kerns, Ph.D.
Department of Mathematics & Statistics
Youngstown State University
Youngstown, OH 44555-0002 USA
Office: 1035 Cushwa Hall
Phone: (330) 941-3310 Office (voice mail)
-3302 Department
-3170 FAX
E-mail: [EMAIL PROTECTED]
http://www.cc.ysu.edu/~gjkerns/

___
R-packages mailing list
R-packages@stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/r-packages

### [R] Different results in agnes and hclust

Hello to everybody,

I have a question regarding the results obtained from the hclust and agnes functions using the Ward algorithm, because they seem to differ from each other. I also ran a cluster analysis using the Ward algorithm in Matlab and obtained the same results as from agnes. I'm using the pvclust package to confirm the clustering results, and it uses the hclust function internally. Therefore I'm not too sure what to do with the results. This problem doesn't appear when using the average algorithm.

Regards,
Robert Rein
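
A small sketch for checking this kind of discrepancy, under an assumption that goes beyond the post: current versions of R (3.1.0 and later) offer two Ward variants in hclust, "ward.D" (the behaviour of the old "ward" option) and "ward.D2", and the hclust help page states that agnes(*, method = "ward") corresponds to "ward.D2". The data below are made up purely for illustration.

```r
library(cluster)   # agnes()

set.seed(42)
X <- matrix(rnorm(40), ncol = 2)   # 20 made-up points in the plane
d <- dist(X)

ag  <- agnes(d, method = "ward")
hc2 <- hclust(d, method = "ward.D2")  # Ward's criterion on unsquared distances
hc1 <- hclust(d, method = "ward.D")   # the older "ward" behaviour

# Compare the 3-cluster partitions; cutree() labels clusters by order of
# first appearance, so identical partitions give identical label vectors.
same_as_agnes <- identical(unname(cutree(as.hclust(ag), k = 3)),
                           unname(cutree(hc2, k = 3)))
print(same_as_agnes)

# Cross-tabulate the two hclust variants to see where they disagree, if at all.
print(table(ward.D = cutree(hc1, 3), ward.D2 = cutree(hc2, 3)))
```

With the older "ward" option, passing squared distances to hclust is the usual way to reproduce Ward's criterion; whether pvclust exposes that choice is not something this sketch addresses.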

### [R] rpart

Dear r-help-list:

If I use the rpart method like

    cfit <- rpart(y~.,data=data,...)

what kind of tree is stored in cfit? Is it right that this tree is not pruned at all, that it is the full tree? If so, it's up to me to choose a subtree by using the printcp method.

In the technical report from Atkinson and Therneau, "An Introduction to recursive partitioning using the rpart routines" from 2000, one can see the following table on page 15:

         CP nsplit relerror xerror  xstd
    1 0.105      0 1.0      1.      0.108
    2 0.056      3 0.68519  1.1852  0.111
    3 0.028      4 0.62963  1.0556  0.109
    4 0.574      6 0.57407  1.0556  0.109
    5 0.100      7 0.6      1.0556  0.109

Some lines below it says "We see that the best tree has 5 terminal nodes (4 splits)". Why is that, if the xerror is lowest for the tree consisting only of the root?

Thank you very much for your help

Henri

### [R] About the display of matrix

For a matrix A, I don't want to display the zero elements in it. How can I do that?
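
One possible approach, sketched with base R only: print a copy of the matrix in which exact zeros are replaced by NA, and ask print() to show NA as an empty string via its `na.print` argument. The helper name `show_nonzero` and the example matrix are made up for illustration. Note that genuine NA entries would be blanked as well.

```r
show_nonzero <- function(A) {
  B <- A
  B[!is.na(B) & B == 0] <- NA   # blank out exact zeros only
  print(B, na.print = "")       # NA cells are printed as empty strings
  invisible(A)                  # the original matrix is left untouched
}

A <- matrix(c(1, 0, 0, 2, 0, 3), nrow = 2)
show_nonzero(A)
```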

### [R] Vuong test implementation in R

Dear All,

I would like to know if the Vuong test (Vuong; Econometrica, 1989) to compare two non-nested regression models has been implemented in R.

Thanks in advance for your assistance,
mirko

### Re: [R] Creating Movies with R

J.R. Lockwood wrote:

> An alternative that I've used a few times is the jpeg() function to create the sequence of images, and then converting these to an mpeg movie using mencoder, distributed with mplayer. This works on both Windows and Linux. I have a pretty self-contained example file written up that I can send to anyone who is interested. Oddly, the most challenging part was creating a sequence of file names that would be correctly ordered. For this I use:
>
>     lex <- function(N){
>       ## produce vector of N lexicographically ordered strings
>       ndig <- nchar(N)
>       substr(formatC((1:N)/10^ndig,digits=ndig,format="f"),3,1000)
>     }

Hi,

Or you could have asked the `filename` argument of `jpeg` to do the job for you, i.e. filename = "something%04d", as documented in ?jpeg:

    jpeg(filename = "something%04d.jpg", onefile = FALSE)
    for(i in 1:10){
      plot(i)
    }

Cheers,

Romain

PS: For those who have ideas for movies, I once started a website, "R Movies Gallery", as a little sister of the R Graph(ics) Gallery. You may want to send me code to populate the website, or populate the wiki with such examples. The idea is not to produce pretty science-fiction-type movies with R, but to use R's abilities to create useful animations that could highlight statistical concepts such as the LCT. Plus, this is a place where grid could show its full power.

> On Fri, 22 Sep 2006, Jeffrey Horner wrote:
>
> > Date: Fri, 22 Sep 2006 13:46:52 -0500
> > From: Jeffrey Horner [EMAIL PROTECTED]
> > To: Lorenzo Isella [EMAIL PROTECTED], r-help@stat.math.ethz.ch
> > Subject: Re: [R] Creating Movies with R
> >
> > If you run R on Linux, then you can run the ImageMagick command called convert. I place this in an R function to use a sequence of PNG plots as movie frames:
> >
> >     make.mov.plotcol3d <- function(){
> >       unlink("plotcol3d.mpg")
> >       system("convert -delay 10 plotcol3d*.png plotcol3d.mpg")
> >     }
> >
> > Examples can be seen here:
> >
> > http://biostat.mc.vanderbilt.edu/JrhRgbColorSpace
> >
> > Look for the 'Download Movie' links.
> >
> > Cheers,
> > Jeff
> >
> > Lorenzo Isella wrote:
> > > Dear All,
> > >
> > > I'd like to know if it is possible to create animations with R. To be specific, I attach the code I am using for my research to plot some analytical results in 3D using the lattice package. It is not necessary to go through the code. Simply, it plots some 3D density profiles at two different times selected by the user. I wonder if it is possible to use the data generated for different times to create something like an .avi file. Here is the script:
> > >
> > >     rm(list=ls())
> > >     library(lattice)
> > >
> > >     # I start defining the analytical functions needed to get the density as a function of time
> > >
> > >     expect_position <- function(t,lam1,lam2,pos_ini,vel_ini)
> > >     {1/(lam1-lam2)*(lam1*exp(lam2*t)-lam2*exp(lam1*t))*pos_ini+
> > >     1/(lam1-lam2)*(exp(lam1*t)-exp(lam2*t))*vel_ini
> > >     }
> > >
> > >     sigma_pos <- function(t,q,lam1,lam2)
> > >     {
> > >     q/(lam1-lam2)^2*(
> > >     (exp(2*lam1*t)-1)/(2*lam1)-2/(lam1+lam2)*(exp(lam1*t+lam2*t)-1) +
> > >     (exp(2*lam2*t)-1)/(2*lam2)
> > >     )
> > >     }
> > >
> > >     rho_x <- function(x,expect_position,sigma_pos)
> > >     {
> > >     1/sqrt(2*pi*sigma_pos)*exp(-1/2*(x-expect_position)^2/sigma_pos)
> > >     }
> > >
> > >     # Now the physical parameters
> > >     tau <- 0.1
> > >     beta <- 1/tau
> > >     St <- tau ### since I am in dimensionless units and tau is already in units of 1/|alpha|
> > >     D=2e-2
> > >     q <- 2*beta^2*D
> > >
> > >     ### Now the grid in space and time
> > >     time <- 5 # time extent
> > >     tsteps <- 501 # time steps
> > >     newtime <- seq(0,time,len=tsteps)
> > >
> > >     # Now the things specific for the dynamics along x
> > >     lam1 <- -beta/2*(1+sqrt(1+4*St))
> > >     lam2 <- -beta/2*(1-sqrt(1+4*St))
> > >     xmin <- -0.5
> > >     xmax <- 0.5
> > >     x0 <- 0.1
> > >     vx0 <- x0
> > >     nx <- 101 ## grid intervals along x
> > >     newx <- seq(xmin,xmax,len=nx) # grid along x
> > >
> > >     # M1 <- do.call(g, c(list(x = newx), mypar))
> > >     mypar <- c(q,lam1,lam2)
> > >     sig_xx <- do.call(sigma_pos,c(list(t=newtime),mypar))
> > >     mypar <- c(lam1,lam2,x0,vx0)
> > >     exp_x <- do.call(expect_position,c(list(t=newtime),mypar))
> > >
> > >     #rho_x<-function(x,expect_position,sigma_pos)
> > >     #NB: at t=0, the density blows up, since I have a delta as the initial state!
> > >     # At any t>0, instead, the result is finite.
> > >     #for this reason I now redefine time by getting rid of the instant t=0 to work out
> > >     # the density
> > >
> > >     rho_x_t <- matrix(ncol=nx,nrow=tsteps-1)
> > >     for (i in 2:tsteps)
> > >     {mypar <- c(exp_x[i],sig_xx[i])
> > >     myrho_x <- do.call(rho_x,c(list(x=newx),mypar))
> > >     rho_x_t[ i-1, ] <- myrho_x
> > >     }
> > >
> > >     ### Now I also define a scaled density
> > >     rho_x_t_scaled <- matrix(ncol=nx,nrow=tsteps-1)
> > >     for (i in 2:tsteps)
> > >     {mypar <- c(exp_x[i],sig_xx[i])
> > >     myrho_x <- do.call(rho_x,c(list(x=newx),mypar))
> > >     rho_x_t_scaled[ i-1, ] <- myrho_x/max(myrho_x)
> > >     }
> > >
> > >     ###Now I deal with the dynamics along y
> > >     lam1 <- -beta/2*(1+sqrt(1-4*St))
> > >     lam2 <- -beta/2*(1-sqrt(1-4*St))
> > >     ymin <- 0
> > >     ymax <- 1
> > >     y0 <- ymax
> > >     vy0 <- -y0
> > >     mypar <- c(q,lam1,lam2)
> > >     sig_yy <- do.call(sigma_pos,c(list(t=newtime),mypar))
> > >     mypar <- c(lam1,lam2,y0,vy0)
> > >     exp_y <- do.call(expect_position,c(list(t=newtime),mypar))
> > >
> > >     # now I introduce the function giving the density along y: this has to include the BC of zero
> > >     # density at wall
> > >     rho_y <- function(y,expect_position,sigma_pos)
> > >     {
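
The frame-numbering point raised in this thread can be sketched as follows. Zero-padded names keep the frames in numeric order when sorted as text, and the bitmap devices can generate such names themselves from a `%04d`-style pattern. The `frame` prefix, the temporary directory and the twelve dummy plots are illustrative assumptions; the device part runs only where R reports JPEG support.

```r
# Zero-padded frame names sort lexicographically in numeric order.
frames <- sprintf("frame%04d.jpg", 1:12)   # "frame" is a made-up prefix
print(head(frames, 3))

# Letting the device do the numbering: one file is written per page.
if (capabilities("jpeg")) {
  out <- file.path(tempdir(), "frame%04d.jpg")
  jpeg(filename = out)
  for (i in 1:12) plot(i, main = paste("frame", i))
  dev.off()
  print(list.files(tempdir(), pattern = "^frame[0-9]{4}[.]jpg$"))
}
```

The resulting files can then be handed to an external encoder such as the mencoder or ImageMagick commands mentioned above.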

### [R] package usage statistics.

Dear useRs,

Is it possible to get R package usage statistics? That is, does R contain any tools to estimate which packages were used and how often?

I am going to change my workplace temporarily and am packing the data and processing scripts on my computer in order to continue my projects. While working at my current workplace I have periodically installed new R packages, investigated them, and either used them in my work or not, depending on their functionality. Now I am thinking about writing an R script which will automatically download and install everything I need from the R repository. So I need a list of the packages I have used in R.

The first solution that comes to mind is to scan all disks for R scripts and .Rhistory files, extract the calls to library from them, and save the names of the loaded packages. I would appreciate other suggestions.

---
Best regards,
Vladimir mailto:[EMAIL PROTECTED]
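
The scanning idea described above could be sketched like this. The helper name `extract_packages` and the regular expression are assumptions for illustration; the pattern only catches plain `library(pkg)` and `require(pkg)` calls with a literal package name, quoted or unquoted.

```r
# Pull package names out of library()/require() calls in lines of R code.
extract_packages <- function(lines) {
  pat <- "(library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)"
  hits <- unlist(regmatches(lines, gregexpr(pat, lines, perl = TRUE)))
  unique(sub(pat, "\\2", hits, perl = TRUE))   # keep only the package name
}

# Small made-up example standing in for the contents of a script file.
demo_lines <- c("library(cluster)",
                "x <- 1; require(\"MASS\")",
                "library(cluster)  # loaded twice",
                "plot(1)")
pkgs <- extract_packages(demo_lines)
print(pkgs)

# On real files one might collect the lines first, for example:
# files <- list.files(".", pattern = "[.][Rr]$|^[.]Rhistory$",
#                     recursive = TRUE, all.files = TRUE, full.names = TRUE)
# pkgs  <- extract_packages(unlist(lapply(files, readLines, warn = FALSE)))
```

The resulting character vector can be passed to install.packages() on the new machine. Comparing it with `rownames(installed.packages())` on the old one shows which installed packages never appear in any script.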

### Re: [R] venn diagram with more than three vectors

On Tue, 2006-09-26 at 10:02 +0200, Oosting, J. (PATH) wrote:

> I am not aware of existing functions to draw venn diagrams with more than 3 sets, but you could have a look at http://en.wikipedia.org/wiki/Venn_diagram to see how these can be constructed.
>
> Jan Oosting

Package vegan has a function (varpart) and a plot method that will draw venn diagrams with up to 4 sets. It works on results from redundancy analyses, but you could probably adapt it to your needs.

HTH

G

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pan Zheng
> Sent: Tuesday 26 September 2006 2:09
> To: r-help@stat.math.ethz.ch
> Subject: [R] venn diagram with more than three vectors
>
> Hi,
>
> I am using the venn diagram function in AMDA to plot the venn diagram. But it seems this function can only plot 3 or fewer vectors. Is there a way to plot the venn diagram with more than 3 vectors?
>
> Please help. Thanks.
>
> Z

--
*Note new Address and Fax and Telephone numbers from 10th April 2006*

    Gavin Simpson                 [t] +44 (0)20 7679 0522
    ECRC                          [f] +44 (0)20 7679 0565
    UCL Department of Geography
    Pearson Building              [e] gavin.simpsonATNOSPAMucl.ac.uk
    Gower Street
    London, UK                    [w] http://www.ucl.ac.uk/~ucfagls/cv/
    WC1E 6BT                      [w] http://www.ucl.ac.uk/~ucfagls/

### Re: [R] rpart

On Mon, 25 Sep 2006, [EMAIL PROTECTED] wrote:

> Dear r-help-list:
>
> If I use the rpart method like
>
>     cfit <- rpart(y~.,data=data,...)
>
> what kind of tree is stored in cfit? Is it right that this tree is not pruned at all, that it is the full tree?

It is an rpart object. This contains both the tree and the instructions for pruning it at all values of cp: note that cp is also used in deciding how large a tree to grow.

> If so, it's up to me to choose a subtree by using the printcp method.

Or the plotcp method.

> In the technical report from Atkinson and Therneau, "An Introduction to recursive partitioning using the rpart routines" from 2000, one can see the following table on page 15:
>
>          CP nsplit relerror xerror  xstd
>     1 0.105      0 1.0      1.      0.108
>     2 0.056      3 0.68519  1.1852  0.111
>     3 0.028      4 0.62963  1.0556  0.109
>     4 0.574      6 0.57407  1.0556  0.109
>     5 0.100      7 0.6      1.0556  0.109
>
> Some lines below it says "We see that the best tree has 5 terminal nodes (4 splits)". Why is that, if the xerror is lowest for the tree consisting only of the root?

There are *two* reports with that name: this seems to be from minitech.ps. The choice is explained in the rest of that paragraph (the 1-SE rule was used). My guess is that the authors excluded the root as not being a tree, but only they can answer that.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595
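
The 1-SE rule mentioned above can be sketched on the kyphosis data that ship with rpart (not the data behind the quoted table). The rule as commonly stated: take the smallest tree whose cross-validated error is within one standard error of the minimum. The `cp = 0.001` setting, the seed and the object names are illustrative choices, and this sketch does not exclude the root as the report's authors may have done.

```r
library(rpart)   # recommended package, ships with R

set.seed(1)      # cross-validation folds are random
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, cp = 0.001)

cpt <- fit$cptable                # columns: CP, nsplit, rel error, xerror, xstd
printcp(fit)

best   <- which.min(cpt[, "xerror"])
thresh <- cpt[best, "xerror"] + cpt[best, "xstd"]   # minimum + 1 SE
pick   <- min(which(cpt[, "xerror"] <= thresh))     # smallest tree under the bar

pruned <- prune(fit, cp = cpt[pick, "CP"])
print(pruned)
```

plotcp(fit) draws the same information, with a horizontal line at the 1-SE threshold.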

### [R] Accessing C- source code of R

Dear list,

I'm looking for the source code of parts of R (e.g. spline). Does anyone know where I can access it?

Gunther

### Re: [R] venn diagram with more than three vectors

I am not aware of existing functions to draw venn diagrams with more than 3 sets, but you could have a look at http://en.wikipedia.org/wiki/Venn_diagram to see how these can be constructed.

Jan Oosting

> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Pan Zheng
> Sent: Tuesday 26 September 2006 2:09
> To: r-help@stat.math.ethz.ch
> Subject: [R] venn diagram with more than three vectors
>
> Hi,
>
> I am using the venn diagram function in AMDA to plot the venn diagram. But it seems this function can only plot 3 or fewer vectors. Is there a way to plot the venn diagram with more than 3 vectors?
>
> Please help. Thanks.
>
> Z

### Re: [R] glmmPQL in 2.3.1

On Mon, 25 Sep 2006, Justin Rhodes wrote:

> Dear R-help,
>
> I recently tried implementing glmmPQL in 2.3.1,

I thought *I* had implemented it: are you talking about my function in package MASS or your own implementation?

> and I discovered a few differences as compared to 2.2.1.

You appear to be talking about contributed packages (MASS, and glmmPQL also depends on nlme) without giving their version numbers.

> I am fitting a regression with fixed and random effects with Gamma error structure. First, 2.3.1 gives different estimates than 2.2.1, and 2.3.1 takes more iterations to converge.

We have no idea, given the lack of a reproducible example. glmmPQL does give the same answers as before for the book examples for which it is support software. This may well be due to an underlying change in nlme.

> Second, when I try using the anova function it says "'anova' is not available for PQL fits". Why? Any help would be greatly appreciated.

Because anova implies you are using an optimization criterion, such as least squares or maximum likelihood, and so there is something like a deviance to partition. It was not used in the book which glmmPQL supports, but it seems some people were using glmmPQL without reference to that book, so I made a number of their misuses explicit errors. This *is* in the NEWS and WHATS.NEWS files for MASS and VR:

    - There are anova() and logLik() methods for class "glmmPQL" to stop misuse.

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

### Re: [R] printing a variable name in a for loop

This would do it:

    > v1 <- 5
    > v2 <- 6
    > v3 <- 7
    > vns <- paste("v",1:3,sep="")
    > for (i in 1:length(vns)) cat("Hello", vns[i], get(vns[i]), "World\n", sep=",")
    Hello,v1,5,World
    Hello,v2,6,World
    Hello,v3,7,World

On 24/09/06, Suzi Fei [EMAIL PROTECTED] wrote:

> Hello,
>
> How do you print a variable name in a for loop? I'm trying to construct a csv file that looks like this:
>
>     Hello, variable1, value_of_variable1, World,
>     Hello, variable2, value_of_variable2, World,
>     Hello, variable3, value_of_variable3, World,
>
> Using this:
>
>     for (variable in list(variable1, variable2, variable3)){
>       cat("Hello,", ???variable???, variable, ", World,")
>     }
>
> This works fine if I'm trying to print the VALUE of variable, but I want to print the NAME of variable as well.
>
> Thanks,
> Suzi

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
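
A variant of the get() approach above, sketched under the assumption that the values can be collected in a named list up front: the names then travel with the values, so no lookup by name is needed. The variable names and values are the made-up ones from this thread.

```r
# Keep names and values together in one named list.
vars <- list(variable1 = 5, variable2 = 6, variable3 = 7)

# Build one csv line per element; paste() is vectorised over names and values.
csv_lines <- paste("Hello", names(vars), unlist(vars), "World", sep = ",")
cat(csv_lines, sep = "\n")

# The same lines could be written to a file with writeLines(csv_lines, "out.csv").
```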

### [R] Statistical data and Map-package

Dear helpeRs,

I'm working with the maps package and came upon a problem which I couldn't solve. I hope one of you can. If not, this can be seen as a suggestion for new versions of the package.

I'm trying to create a map of some European countries, filled with colors corresponding to some values. Let's say I have the following countries and I assign the following colors (fictional):

    country2001 <- c("Austria", "Belgium", "Switzerland", "Czechoslovakia", "Germany", "Denmark", "Spain", "Finland", "France", "UK", "Greece", "Hungary", "Ireland", "Israel", "Italy", "Luxembourg", "Netherlands", "Norway", "Poland", "Portugal", "Sweden", "Slovenia")
    color2001 <- c("green", "yellow", "red", "red", "red", "red", "red", "red", "green", "red", "red", "red", "red", "red", "red", "red", "red", "blue", "red", "red", "red", "orange")

I then let the colors and the values correspond using 'match.map', like this:

    match <- match.map("world", country2001)
    color <- color2001[match]

And finally I plot the map. It works perfectly fine.

    map(database="world", fill=TRUE, col=color)

But as I mentioned, I want to create a map of Europe. So I use xlim and ylim to let some parts of the world fall off the map. The syntax becomes:

    map(database="world", fill=TRUE, col=color, xlim=c(-25,70), ylim=c(35,71))

Now a problem arises. The regions on the map are colored by the vector 'color', which therefore needs to correspond to the order in which the polygons are drawn. Since part of the full world map isn't drawn this time, the color vector no longer corresponds, and the wrong countries are colored.

Does anybody know of a way to solve this?

Thanks very much in advance,

Rense Nieuwenhuis

### Re: [R] printing a variable name in a for loop

Suzi Fei wrote:

> Hello,
>
> How do you print a variable name in a for loop? I'm trying to construct a csv file that looks like this:
>
>     Hello, variable1, value_of_variable1, World,
>     Hello, variable2, value_of_variable2, World,
>     Hello, variable3, value_of_variable3, World,
>
> Using this:
>
>     for (variable in list(variable1, variable2, variable3)){
>       cat("Hello,", ???variable???, variable, ", World,")
>     }
>
> This works fine if I'm trying to print the VALUE of variable, but I want to print the NAME of variable as well.

This is a teetering heap of assumptions, but is this what you wanted?

    Suzi <- 1
    HiYa <- function(x) {
      cat("Hello",deparse(substitute(x)),x,"World\n",sep=", ")
    }
    HiYa(Suzi)

Jim

### Re: [R] Statistical data and Map-package

On Tue, 26 Sep 2006, Rense Nieuwenhuis wrote:

> Dear helpeRs,
>
> I'm working with the maps package and came upon a problem which I couldn't solve. I hope one of you can. If not, this can be seen as a suggestion for new versions of the package.
>
> I'm trying to create a map of some European countries, filled with colors corresponding to some values. Let's say I have the following countries and I assign the following colors (fictional):
>
>     country2001 <- c("Austria", "Belgium", "Switzerland", "Czechoslovakia", "Germany", "Denmark", "Spain", "Finland", "France", "UK", "Greece", "Hungary", "Ireland", "Israel", "Italy", "Luxembourg", "Netherlands", "Norway", "Poland", "Portugal", "Sweden", "Slovenia")
>     color2001 <- c("green", "yellow", "red", "red", "red", "red", "red", "red", "green", "red", "red", "red", "red", "red", "red", "red", "red", "blue", "red", "red", "red", "orange")
>
> I then let the colors and the values correspond using 'match.map', like this:
>
>     match <- match.map("world", country2001)
>     color <- color2001[match]
>
> And finally I plot the map. It works perfectly fine.
>
>     map(database="world", fill=TRUE, col=color)
>
> But as I mentioned, I want to create a map of Europe. So I use xlim and ylim to let some parts of the world fall off the map. The syntax becomes:
>
>     map(database="world", fill=TRUE, col=color, xlim=c(-25,70), ylim=c(35,71))
>
> Now a problem arises. The regions on the map are colored by the vector 'color', which therefore needs to correspond to the order in which the polygons are drawn. Since part of the full world map isn't drawn this time, the color vector no longer corresponds, and the wrong countries are colored.
>
> Does anybody know of a way to solve this?

Within the maps package:

    europe <- map(database="world", fill=TRUE, plot=FALSE, xlim=c(-25,70), ylim=c(35,71))
    match <- match.map(europe, country2001)
    color <- color2001[match]
    map(database="world", fill=TRUE, col=color, xlim=c(-25,70), ylim=c(35,71))

but I'm afraid the world database precedes the dissolution of the Soviet Union, Czechoslovakia, and Yugoslavia, and doesn't code Sicily or Sardinia in Italy, so the result is perhaps not yet what you need:

    europe$names[grep("Sicily", europe$names)] <- "Italy:Sicily"
    europe$names[grep("Sardinia", europe$names)] <- "Italy:Sardinia"
    match <- match.map(europe, country2001)
    color <- color2001[match]
    map(database="world", fill=TRUE, col=color, xlim=c(-25,70), ylim=c(35,71))

deals with Italy, but you won't get Slovenia.

There was a discussion about this on the R-sig-geo list in March this year starting here:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/78303.html

or equivalently:

http://article.gmane.org/gmane.comp.lang.r.geo/299

> Thanks very much in advance,
>
> Rense Nieuwenhuis

--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway.
voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

### Re: [R] Need help with boxplots

To prevent confusion you might want to use a red dot rather than a line:

    points(1:2, c(mean(a), mean(b)), col = "red")

and perhaps label it, since it's non-standard:

    text(1:2, c(mean(a), mean(b)), "Mean", pos = 4)

On 9/26/06, laba diena [EMAIL PROTECTED] wrote:

> How to add a mean line in the boxplot keeping the median line? For example in this:
>
>     set.seed(1)
>     a <- rnorm(10)
>     b <- rnorm(10)
>     boxplot(a, b)

### [R] Need help with boxplots

How to add a mean line in the boxplot keeping the median line? For example in this:

    set.seed(1)
    a <- rnorm(10)
    b <- rnorm(10)
    boxplot(a, b)

### [R] Vectorise a for loop?

Hi R guru coders

I wrote a bit of code to add a new column onto a topTable data frame, that is, a list of genes processed using the limma package. I used a for loop but I kept feeling there was a better way using a more vector-oriented approach. I looked at several commands such as apply, by etc. but could not find a good way to do it. I have this feeling there is a command or technique eluding me. (Is there an expr:value1?value2 construction in R?)

Can anybody suggest an elegant solution?

Details:

So, the topTable looks like this:

    topa1[1:5,c(1,2,3,4)]
    ID Name GB_accession M
    11195 245828 SIGKEC9 AX135029 -7.670197
    10966107FHL1 B14446 -5.089926
    6287 25744 M90LL137340 -4.531744
    777 2288 VSNL1 LF039555 -4.035472
    11310 272294 M98LL031650 3.866422

I want to add a fold column so it will look like this:

    topa1[1:5,c(1,2,3,4,10)]
    ID Name GB_accession M fold
    11195 245828 SIGKEC9 AX135029 -7.670197 203.68521
    10966107FHL1 B14446 -5.089926 34.05810
    6287 25744 M90LL137340 -4.531744 23.13082
    777 2288 VSNL1 LF039555 -4.035472 16.39828
    11310 272294 M98LL031650 3.866422 14.58508

The fold value is calculated from the M column, which is a log2 value. The calculation is different depending on whether the M value is negative or positive. That is, if the gene is down-regulated the reciprocal value has to be used to calculate a fold value.

Here is my clunky, not vectorised code:

    # Function to add a fold column to the toptable
    ttfold <- function(tt) {
      fold <- NULL
      for (i in 1:length(tt$M)) {
        if (tt$M[i] < 0) {
          fold[i] <- 1/(2^tt$M[i])
        } else {
          fold[i] <- 2^tt$M[i]
        }
      }
      tt <- cbind(tt, fold)
    }

    # Add fold column to top tables
    topa1 <- ttfold(topa1)

Regards

J

-- John Seers, Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA. tel +44 (0)1603 251497, fax +44 (0)1603 507723, e-mail [EMAIL PROTECTED]. e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: http://www.ifr.ac.uk/ http://www.foodandhealthnetwork.com/

### Re: [R] package usage statistics. (UPDATE)

Here is the perl script with some comments

    #!/bin/perl -w
    use File::Find;  # we use the standard Perl module.
                     # its procedure will scan the directory tree and put all package names to the hash
                     # along with counting the number of their loadings.

    %pkgs=(base=>-1,  # won't print packages installed by default
           datasets=>-1, grDevices=>-1, graphics=>-1, grid=>-1,
           methods=>-1, splines=>-1, stats=>-1, stats4=>-1,
           tcltk=>-1, tools=>-1, utils=>-1, MASS=>-1 );

    sub wanted {  # this subroutine is used by the File::Find procedure
                  # it adds package names to the hash above
      return if($_!~/\.[Rr]$/ && $_!~/\.[Rr]history$/);  # do nothing if this file doesn't contain R commands
      open IN, "<".$File::Find::name or die("cannot open file $!");
      while(<IN>){
        if(/library\((.*)\)/){  # looking for library(...) calls
          $pkgname=$1;
          next if(! -d "C:\\Program Files\\R\\library\\$pkgname");  # don't do anything if the package directory doesn't exist
                                                                    # simple protection against typos
          if(exists $pkgs{$pkgname}) {
            $pkgs{$pkgname}=$pkgs{$pkgname}+1;  # here we assume that basic packages are not loaded with library()
          }else{
            $pkgs{$pkgname}=1;
          }
        }
      }
      close(IN);
    }

    sub getdepends {  # this subroutine resolves the package dependencies
      $pkgname=$_[0]; # its argument is a package name. It finds the packages the current one depends on
                      # and adds them to the hash above
      open IN, "C:\\Program Files\\R\\library\\$pkgname\\DESCRIPTION" or return;
      #do {print ("cannot open file C:\\Program Files\\R\\library\\$pkgname\\DESCRIPTION\n $!");
      while(<IN>){
        if($_=~/^Imports: (.*)/ || $_=~/^Depends: (.*)/) {
          @deplist=split(/,/,$1);
          for(@deplist) {
            next if(/R \(.*\)/);  # exclude dependencies on R version
            s/\s//g;
            if(/(.*)\(.*\)/) {
              $pkgname=$1;
            }else{
              $pkgname=$_;
            }
            if(exists $pkgs{$pkgname}) {
              $pkgs{$pkgname}=$pkgs{$pkgname}+1 if($pkgs{$pkgname}>0);  # don't add basic packages
            }else{
              $pkgs{$pkgname}=1;
            }
          }
        }
      }
      close(IN);
    }

    # now the main loop. hope, it is self-describing
    print "Searching for R commands...";
    find({ wanted => \&wanted, no_chdir => 1 }, '.');
    print "done!\n";

    print "Now resolving dependencies...";
    for $p (keys %pkgs) {
      #print "$p\n";
      getdepends($p);
    }
    print "done!\n";

    open OUT, ">install.pkgs.r" or die("cannot create file install.pkgs.r");
    print OUT "install.packages(\n";
    foreach(keys %pkgs){
      print OUT "$_,\n" if($pkgs{$_}>0);
    }
    print OUT "ask=FALSE)\n";
    close(OUT);
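The scanning step of the script above (finding `library(...)` calls in `.R` and `.Rhistory` files and counting them) can also be sketched in R itself. This is a rough sketch and not a port of the Perl: the function name `count_library_calls` is made up here, it does not check that a package directory exists, and it does not resolve `Depends:`/`Imports:` entries.

```r
# Count library(...) calls in R scripts and history files below `dir`.
count_library_calls <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.([Rr]|[Rr]history)$",
                      recursive = TRUE, full.names = TRUE, all.files = TRUE)
  lines <- as.character(unlist(lapply(files, readLines, warn = FALSE)))
  # Keep the first library(...) call on each line that has one.
  hits <- regmatches(lines, regexpr("library\\([^)]*\\)", lines))
  # Strip the wrapper, quotes and spaces, then drop any extra arguments.
  pkgs <- gsub("^library\\(|\\)$|[\"' ]", "", hits)
  pkgs <- sub(",.*$", "", pkgs)
  sort(table(pkgs), decreasing = TRUE)
}

# Demo on a throwaway directory holding one made-up script.
demo_dir <- tempfile("scripts")
dir.create(demo_dir)
writeLines(c("library(MASS)", "x <- 1", "library(\"cluster\")", "library(MASS)"),
           file.path(demo_dir, "analysis.R"))
usage <- count_library_calls(demo_dir)
print(usage)
```

The result is a named table of counts, so `names(usage)` gives the packages a script collection actually loads.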

### Re: [R] Voung test implementation in R

Yes, the pscl package contains that function.

    library(pscl)
    ?vuong

Description: Compares two models fit to the same data that do not nest via Vuong's non-nested test.

Usage:

    vuong(m1, m2, digits = getOption("digits"))

On 9/26/06, mirko sanpietrucci [EMAIL PROTECTED] wrote:

> Dear All,
>
> I would like to know if the Vuong test (Vuong; Econometrica, 1989) to compare two non-nested regression models has been implemented in R.
>
> Thanks in advance for your assistance,
> mirko

-- 黄荣贵, Department of Sociology, Fudan University

### Re: [R] putting stuff into bins...

Federico Calboli [EMAIL PROTECTED] writes:

> Hi All,
>
> I have a vector of data, a vector of bin breakpoints and I want to put my data in the bins and then extract fanciful information like the mean value of each bin.
>
> I know I can write my own function, but I would have thought that R should have somewhere a function that took as arguments something like (data, breaks, what to do with the data in the bins). I surely could not find it trawling the R-help archives though.
>
> If such a function exists I'd be grateful to anyone pointing it out to me.

cut, split+lapply, aggregate, by

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark. Ph: (+45) 35327918; FAX: (+45) 35327907; ([EMAIL PROTECTED])
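The one-line answer names four building blocks. A small sketch of how they fit together, on made-up data and breakpoints (the values and variable names are illustrative only): `cut()` turns the breakpoints into a factor, and each of the other three functions then applies a summary per factor level.

```r
set.seed(42)
x      <- runif(200, 0, 10)     # made-up data
breaks <- seq(0, 10, by = 2)    # made-up bin breakpoints
bin    <- cut(x, breaks)        # factor saying which bin each value falls in

# split + lapply/sapply: a list of per-bin vectors, then a summary of each.
m_split <- sapply(split(x, bin), mean)

# aggregate: same result as a data frame with columns `bin` and `x`.
m_aggr <- aggregate(x, list(bin = bin), mean)

# by: works on a data frame, handing each bin's rows to the function.
df   <- data.frame(x = x, bin = bin)
m_by <- by(df, df$bin, function(d) mean(d$x))

print(m_split)
```

All three hold the same per-bin means; they differ only in the shape of the returned object (named vector, data frame, `by` object).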

### [R] putting stuff into bins...

Hi All,

I have a vector of data, a vector of bin breakpoints and I want to put my data in the bins and then extract fanciful information like the mean value of each bin.

I know I can write my own function, but I would have thought that R should have somewhere a function that took as arguments something like (data, breaks, what to do with the data in the bins). I surely could not find it trawling the R-help archives though.

If such a function exists I'd be grateful to anyone pointing it out to me.

Cheers,

Fede

-- Federico C. F. Calboli, Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, Norfolk Place, London W2 1PG. Tel +44 (0)20 7594 1602, Fax (+44) 020 7594 3193. f.calboli [.a.t] imperial.ac.uk, f.calboli [.a.t] gmail.com

### Re: [R] putting stuff into bins...

Federico Calboli wrote:

> Hi All,
>
> I have a vector of data, a vector of bin breakpoints and I want to put my data in the bins and then extract fanciful information like the mean value of each bin.
>
> I know I can write my own function, but I would have thought that R should have somewhere a function that took as arguments something like (data, breaks, what to do with the data in the bins). I surely could not find it trawling the R-help archives though.
>
> If such a function exists I'd be grateful to anyone pointing it out to me.
>
> Cheers,
> Fede

The following should be of help:

    bd384 <- c(2.968, 2.097, 1.611, 3.038, 7.921, 5.476, 9.858, 1.397, 0.155, 1.301, 9.054,
               1.958, 4.058, 3.918, 2.019, 3.689, 3.081, 4.229, 4.669, 2.274, 1.971, 10.379,
               3.391, 2.093, 6.053, 4.196, 2.788, 4.511, 7.3, 5.856, 0.86, 2.093, 0.703,
               1.182, 4.114, 2.075, 2.834, 3.698, 6.48, 2.36, 5.249, 5.1, 4.131, 0.02,
               1.071, 4.455, 3.676, 2.666, 5.457, 1.046, 1.908, 3.064, 5.392, 8.393, 0.916,
               9.665, 5.564, 3.599, 2.723, 2.87, 1.582, 5.453, 4.091, 3.716, 6.156, 2.039)
    cut(bd384, 0:11)
    split(bd384, cut(bd384, 0:11))
    sapply(split(bd384, cut(bd384, 0:11)), mean)

D.Trenkler

-- Dietrich Trenkler, c/o Universitaet Osnabrueck, Rolandstr. 8; D-49069 Osnabrueck, Germany. email: [EMAIL PROTECTED]

### Re: [R] rpart

On Tue, 26 Sep 2006, [EMAIL PROTECTED] wrote:

> -------- Original Message --------
> Date: Tue, 26 Sep 2006 09:56:53 +0100 (BST)
> From: Prof Brian Ripley [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Subject: Re: [R] rpart
>
> > On Mon, 25 Sep 2006, [EMAIL PROTECTED] wrote:
> >
> > > Dear r-help-list:
> > >
> > > If I use the rpart method like cfit <- rpart(y~., data=data, ...), what kind of tree is stored in cfit? Is it right that this tree is not pruned at all, that it is the full tree?
> >
> > It is an rpart object. This contains both the tree and the instructions for pruning it at all values of cp: note that cp is also used in deciding how large a tree to grow.
>
> Ok, I have to explain my problem in a little more detail; I'm sorry for being so vague. I used the method in the following way:
>
>     cfit <- rpart(y~., method="class", minsplit=1, cp=0)
>
> I got a tree with a lot of terminal nodes that contained more than 100 observations. This made me believe that the tree was already pruned. On the other hand, the printcp method showed subtrees that were better. This made me believe that the tree hadn't been pruned before. So, are the trees a little bit pruned?

Yes, as you asked for cp=0. Look up what that does in ?rpart.control.

> > > If so, it's up to me to choose a subtree by using the printcp method.
> >
> > Or the plotcp method.
> >
> > > In the technical report from Atkinson and Therneau, "An Introduction to recursive partitioning using the rpart routines" from 2000, one can see the following table on page 15:
> > >
> > >       CP    nsplit rel error xerror xstd
> > >     1 0.105 0      1.0       1.     0.108
> > >     2 0.056 3      0.68519   1.1852 0.111
> > >     3 0.028 4      0.62963   1.0556 0.109
> > >     4 0.574 6      0.57407   1.0556 0.109
> > >     5 0.100 7      0.6       1.0556 0.109
> > >
> > > Some lines below it says "We see that the best tree has 5 terminal nodes (4 splits)". Why that if the xerror is the lowest for the tree only consisting of the root?
> >
> > There are *two* reports with that name: this seems to be from minitech.ps. The choice is explained in the rest of that para (the 1-SE rule was used). My guess is that the authors excluded the root as not being a tree, but only they can answer that.
>
> Are both reports from 2000? But you're right, I'm talking about the one from minitech.ps. The 1-SE rule only explains why they didn't choose the tree with 6 or 7 splits, but not why they didn't choose the tree without a split. The exclusion of the root as not being a tree was my first explanation, too. But if the tree only consisting of the root is still better than any other tree, why would I choose a tree with 4 splits then?
>
> Henri

-- Brian D. Ripley, [EMAIL PROTECTED]. Professor of Applied Statistics, University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK. http://www.stats.ox.ac.uk/~ripley/ Tel: +44 1865 272861 (self), +44 1865 272866 (PA); Fax: +44 1865 272595
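The 1-SE rule mentioned in this exchange can be read straight off the `cptable` component of a fitted rpart object. A minimal sketch, assuming the `kyphosis` data that ships with rpart rather than the poster's data; the variable names (`best`, `threshold`, `chosen`) are made up here, and the cross-validated columns depend on the random seed.

```r
library(rpart)

set.seed(1)  # xerror and xstd come from cross-validation, so fix the seed
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(minsplit = 2, cp = 0))

cpt <- fit$cptable  # columns: CP, nsplit, rel error, xerror, xstd
print(cpt)

# 1-SE rule: take the smallest tree whose xerror is within one standard
# error of the minimum xerror.
best      <- which.min(cpt[, "xerror"])
threshold <- cpt[best, "xerror"] + cpt[best, "xstd"]
chosen    <- min(which(cpt[, "xerror"] <= threshold))

pruned <- prune(fit, cp = cpt[chosen, "CP"])
cat("rule picks row", chosen, "with", cpt[chosen, "nsplit"], "splits\n")
```

Nothing in this rule as coded excludes row 1, the root-only tree, so on some data it will pick zero splits; that is the case the question in the thread is about.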

### Re: [R] package usage statistics. (UPDATE)

On Tue, 26 Sep 2006, Vladimir Eremeev wrote:

> Here is the perl script with some comments

??

    t1 <- installed.packages()
    t2 <- is.na(t1[,"Priority"])
    t3 <- names(t2)[t2]
    t4 <- sapply(t3, function(n) file.info(system.file("R", package=n)[1])$atime)
    class(t4) <- "POSIXct"
    sort(t4)

though the R directory may not be the best place to look for the atime? (This isn't the same, but you get the idea, there is a well-known fortune ...)

Roger

> [...]

-- Roger Bivand, Economic Geography Section, Department of Economics, Norwegian School of Economics and Business Administration, Helleveien 30, N-5045 Bergen, Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43; e-mail: [EMAIL PROTECTED]

### Re: [R] putting stuff into bins...

I don't know about such a function, but

    tapply(data, cut(data, breaks), what to do)

should give you what you need.

HIH

Ciao,
Stefano

On Tue, Sep 26, 2006 at 12:44:35PM +0100, Federico Calboli wrote:

> Hi All,
>
> I have a vector of data, a vector of bin breakpoints and I want to put my data in the bins and then extract fanciful information like the mean value of each bin.
>
> I know I can write my own function, but I would have thought that R should have somewhere a function that took as arguments something like (data, breaks, what to do with the data in the bins). I surely could not find it trawling the R-help archives though.
>
> If such a function exists I'd be grateful to anyone pointing it out to me.
>
> Cheers,
> Fede
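The `tapply` suggestion can be wrapped into the three-argument function the original question asks for. A minimal sketch; the name `bin_apply` and the sample values are made up for illustration.

```r
# Apply FUN to the values of `data` that fall into each bin defined by `breaks`.
bin_apply <- function(data, breaks, FUN, ...) {
  tapply(data, cut(data, breaks), FUN, ...)
}

x   <- c(0.5, 1.5, 1.7, 2.2, 2.9, 3.1)  # made-up data
res <- bin_apply(x, 0:4, mean)           # bins (0,1], (1,2], (2,3], (3,4]
print(res)
```

Any summary can be passed as `FUN` (`length`, `median`, `sd`, ...). Bins that receive no data come back as `NA`, and values outside the outermost breakpoints are dropped, because `cut()` assigns them `NA`.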

### [R] about the determinant of a symmetric compound matrix

Dear R users,

even if this question is not related to an issue about R, probably some of you will be able to help me.

I have a square matrix of dimension k by k with alpha on the diagonal and beta everywhere else. This symmetric matrix is called a symmetric compound matrix and has the form a(I + cJ), where

- I is the k by k identity matrix
- J is the k by k matrix of all ones
- a = alpha - beta
- c = beta/a

I need to evaluate the determinant of this matrix. Is there any algebraic formula for that?

Thank you for your help,

Stefano

### Re: [R] rpart

-------- Original Message --------
Date: Tue, 26 Sep 2006 09:56:53 +0100 (BST)
From: Prof Brian Ripley [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [R] rpart

> On Mon, 25 Sep 2006, [EMAIL PROTECTED] wrote:
>
> > Dear r-help-list:
> >
> > If I use the rpart method like cfit <- rpart(y~., data=data, ...), what kind of tree is stored in cfit? Is it right that this tree is not pruned at all, that it is the full tree?
>
> It is an rpart object. This contains both the tree and the instructions for pruning it at all values of cp: note that cp is also used in deciding how large a tree to grow.

Ok, I have to explain my problem in a little more detail; I'm sorry for being so vague. I used the method in the following way:

    cfit <- rpart(y~., method="class", minsplit=1, cp=0)

I got a tree with a lot of terminal nodes that contained more than 100 observations. This made me believe that the tree was already pruned. On the other hand, the printcp method showed subtrees that were better. This made me believe that the tree hadn't been pruned before. So, are the trees a little bit pruned?

> > If so, it's up to me to choose a subtree by using the printcp method.
>
> Or the plotcp method.
>
> > In the technical report from Atkinson and Therneau, "An Introduction to recursive partitioning using the rpart routines" from 2000, one can see the following table on page 15:
> >
> >       CP    nsplit rel error xerror xstd
> >     1 0.105 0      1.0       1.     0.108
> >     2 0.056 3      0.68519   1.1852 0.111
> >     3 0.028 4      0.62963   1.0556 0.109
> >     4 0.574 6      0.57407   1.0556 0.109
> >     5 0.100 7      0.6       1.0556 0.109
> >
> > Some lines below it says "We see that the best tree has 5 terminal nodes (4 splits)". Why that if the xerror is the lowest for the tree only consisting of the root?
>
> There are *two* reports with that name: this seems to be from minitech.ps. The choice is explained in the rest of that para (the 1-SE rule was used). My guess is that the authors excluded the root as not being a tree, but only they can answer that.

Are both reports from 2000? But you're right, I'm talking about the one from minitech.ps. The 1-SE rule only explains why they didn't choose the tree with 6 or 7 splits, but not why they didn't choose the tree without a split. The exclusion of the root as not being a tree was my first explanation, too. But if the tree only consisting of the root is still better than any other tree, why would I choose a tree with 4 splits then?

Henri

### Re: [R] putting stuff into bins...

This would work. The point is to make a factor from the breakpoints using cut, then use this to calculate the statistics on the binned data.

    x <- rnorm(500)
    f <- cut(x, 10)
    aggregate(x, list(f), mean)
               Group.1          x
    1    (-2.71,-2.09] -2.3668991
    2    (-2.09,-1.46] -1.7332011
    3   (-1.46,-0.834] -1.1156487
    4  (-0.834,-0.208] -0.5117649
    5   (-0.208,0.418]  0.1277991
    6     (0.418,1.04]  0.7092500
    7      (1.04,1.67]  1.2859184
    8       (1.67,2.3]  1.9347327
    9       (2.3,2.92]  2.5518835
    10     (2.92,3.55]  3.2873698

On 26/09/06, Federico Calboli [EMAIL PROTECTED] wrote:

> Hi All,
>
> I have a vector of data, a vector of bin breakpoints and I want to put my data in the bins and then extract fanciful information like the mean value of each bin.
>
> I know I can write my own function, but I would have thought that R should have somewhere a function that took as arguments something like (data, breaks, what to do with the data in the bins). I surely could not find it trawling the R-help archives though.
>
> If such a function exists I'd be grateful to anyone pointing it out to me.
>
> Cheers,
> Fede

-- David Barron, Said Business School, University of Oxford, Park End Street, Oxford OX1 1HP

### [R] crostab qry - Too many crosstab column headers

Hello

I guess not, but is there a way to reduce or split up an MS Access crosstab query that results in more than 256 columns, using sqlGetResults in RODBC, e.g. to produce several data frames of 256 columns each (that is, without changing the query itself)?

    [1] [RODBC] ERROR: Could not SQLExecDirect
    [2] S1001 -1040 [Microsoft][ODBC Microsoft Access Driver] Too many crosstab column headers (2270).

R 2.3, WinXP with MS Access 2002

Best Regards

Anders
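One workaround is to split the 2270 column headings into chunks and run one crosstab per chunk. This does alter the query, which the post hoped to avoid. The sketch below only builds the query strings in base R: the table and field names (`tbl`, `id`, `hdr`, `val`) and the heading values are hypothetical, the chunk size of 250 assumes Access's per-query field limit is 255, and it relies on Access SQL accepting an `IN (...)` list after `PIVOT`.

```r
# Hypothetical column-heading values; in practice these would come from
# something like SELECT DISTINCT hdr FROM tbl.
headers <- sprintf("H%04d", 1:2270)

# Split the headings into chunks small enough for one crosstab each.
chunk_size <- 250
chunks <- split(headers, ceiling(seq_along(headers) / chunk_size))

# Build one crosstab query per chunk, restricting PIVOT to that chunk.
queries <- vapply(chunks, function(h) {
  paste0("TRANSFORM Sum(val) SELECT id FROM tbl GROUP BY id PIVOT hdr IN (",
         paste0("'", h, "'", collapse = ", "), ")")
}, character(1))

length(queries)
```

Each string could then be sent with RODBC's `sqlQuery(channel, query)` and the resulting data frames merged on the row-heading column.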

### Re: [R] Vectorise a for loop?

    tt$fold <- ifelse(tt$M < 0, 1/(2^tt$M), 2^tt$M)

-- Jacques VESLOT, CNRS UMR 8090, I.B.L (2ème étage), 1 rue du Professeur Calmette, B.P. 245, 59019 Lille Cedex. Tel: 33 (0)3.20.87.10.44, Fax: 33 (0)3.20.87.10.31, http://www-good.ibl.fr

john seers (IFR) wrote:

> Hi R guru coders
>
> I wrote a bit of code to add a new column onto a topTable data frame, that is, a list of genes processed using the limma package. I used a for loop but I kept feeling there was a better way using a more vector-oriented approach. I looked at several commands such as apply, by etc. but could not find a good way to do it. I have this feeling there is a command or technique eluding me. (Is there an expr:value1?value2 construction in R?)
>
> Can anybody suggest an elegant solution?
>
> [...]
>
> The fold value is calculated from the M column, which is a log2 value. The calculation is different depending on whether the M value is negative or positive. That is, if the gene is down-regulated the reciprocal value has to be used to calculate a fold value.
>
> [...]
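A short check of the vectorised `ifelse` form against the five M values quoted in the thread. The second variant, `2^abs(M)`, is not from the thread; it is an equivalent shortcut that follows from 1/2^M = 2^(-M) when M is negative.

```r
# M values (log2 ratios) copied from the table in the original post.
M <- c(-7.670197, -5.089926, -4.531744, -4.035472, 3.866422)

# Vectorised form from the reply: reciprocal for negative M, plain power otherwise.
fold_ifelse <- ifelse(M < 0, 1 / (2^M), 2^M)

# Equivalent shortcut: taking the reciprocal of 2^M just flips the sign of M.
fold_abs <- 2^abs(M)

stopifnot(isTRUE(all.equal(fold_ifelse, fold_abs)))
print(round(fold_abs, 3))
```

Either form can be assigned directly as a new column, as in `tt$fold <- 2^abs(tt$M)`, with no loop and no `cbind`.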

### Re: [R] rpart

-------- Original Message --------
Date: Tue, 26 Sep 2006 12:54:22 +0100 (BST)
From: Prof Brian Ripley [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Subject: Re: [R] rpart

> On Tue, 26 Sep 2006, [EMAIL PROTECTED] wrote:
>
> > [...]
> >
> > Ok, I have to explain my problem in a little more detail; I'm sorry for being so vague. I used the method in the following way:
> >
> >     cfit <- rpart(y~., method="class", minsplit=1, cp=0)
> >
> > I got a tree with a lot of terminal nodes that contained more than 100 observations. This made me believe that the tree was already pruned. On the other hand, the printcp method showed subtrees that were better. This made me believe that the tree hadn't been pruned before. So, are the trees a little bit pruned?
>
> Yes, as you asked for cp=0. Look up what that does in ?rpart.control.

I thought I would get a full tree by choosing cp=0, and it was one. The nodes with more than 100 observations were not split further because there was no sequence of splits which made the class label change for any subset. (A bad explanation, but you probably know what I mean.) I realized that when I chose cp=-1.

Thank you very much for your help!

> > [...]

Henri

### [R] Need help with boxplots

How to add a mean *line* in the boxplot keeping the median line? Maybe it is possible to do using the *segments* function? For example in this:

    set.seed(1)
    a <- rnorm(10)
    b <- rnorm(10)
    boxplot(a, b)
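A sketch of the `segments` idea raised in the question. It assumes the boxes sit at x = 1 and x = 2 with `boxplot`'s default box width (`boxwex = 0.8`, so a half-width of 0.4); the dashed red style is an arbitrary choice to keep the mean visually distinct from the median.

```r
set.seed(1)
a <- rnorm(10)
b <- rnorm(10)
boxplot(a, b)

means <- c(mean(a), mean(b))
half  <- 0.4  # half of the default box width; adjust if boxwex is changed

# One horizontal segment per box, spanning the box from left to right edge.
segments(x0 = 1:2 - half, y0 = means,
         x1 = 1:2 + half, y1 = means,
         col = "red", lty = 2, lwd = 2)
```

If the boxes are drawn with a different `boxwex` or at custom positions via `at`, `half` and `1:2` need to change accordingly.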

### Re: [R] Vectorise a for loop?

Hi Jacques

Yes, that looks a whole lot better. That ifelse is exactly what I was searching for.

Merci.

J

-- John Seers, Institute of Food Research, Norwich Research Park, Colney, Norwich NR4 7UA. tel +44 (0)1603 251497, fax +44 (0)1603 507723, e-mail [EMAIL PROTECTED]. e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ Web sites: www.ifr.ac.uk www.foodandhealthnetwork.com

-----Original Message-----
From: Jacques VESLOT [mailto:[EMAIL PROTECTED]
Sent: 26 September 2006 14:02
To: john seers (IFR)
Cc: R-help
Subject: Re: [R] Vectorise a for loop?

>     tt$fold <- ifelse(tt$M < 0, 1/(2^tt$M), 2^tt$M)
>
> -- Jacques VESLOT, CNRS UMR 8090, I.B.L (2ème étage), 1 rue du Professeur Calmette, B.P. 245, 59019 Lille Cedex. Tel: 33 (0)3.20.87.10.44, Fax: 33 (0)3.20.87.10.31, http://www-good.ibl.fr

### [R] creation of new variables

Hello All,

I have 8 variables named a, b, c, d, e, f, g, h. I need to create four variables from these 8 variables in R. The new variables are ab, cd, ef, gh. Can anyone please help me?

Thanks,
Pratap

### Re: [R] about the determinant of a symmetric compound matrix

Stefano Sofia [EMAIL PROTECTED] writes:

> Dear R users, even if this question is not related to an issue about R, probably some of you will be able to help me. I have a square matrix of dimension k by k with alpha on the diagonal and beta everywhere else. This symmetric matrix is called a symmetric compound matrix and has the form a(I + cJ), where
>
> I is the k by k identity matrix,
> J is the k by k matrix of all ones,
> a = alpha - beta,
> c = beta/a.
>
> I need to evaluate the determinant of this matrix. Is there any algebraic formula for that?

Yes. Unusually, this is not from the famous Rao p.33, but from p.32... [1]:

    det(A + XX') = det(A) (1 + X'A^{-1}X)    provided det(A) != 0

Now put A = I and X = sqrt(c) times a vector of ones and get det(I + cJ) = 1 + ck. Multiply by a^k for the general case. Quick sanity check:

    m <- matrix(.1, 7, 7)
    diag(m) <- .9
    det(m)
    [1] 0.393216
    .8^7 * (1 + .1/.8 * 7)
    [1] 0.393216

Alternatively, you can do it via eigenvalues: the off-diagonal part (beta*J) corresponds to a single direction along the unit vector c(1,1,...,1)/sqrt(7), with eigenvalue k*beta. The diagonal part corresponds to adding (alpha - beta)*I, which has total sphericity, so you can arrange that one eigenvector of it points in the same direction. The eigenvalues are then (alpha - beta) with multiplicity k-1 and (alpha - beta + k*beta) once, and you end up with

    (alpha - beta)^(k-1) * (alpha - beta + k*beta)

    (.9-.1)^6 * ((.9-.1) + 7*.1)
    [1] 0.393216

(Getting this right on the first try is almost impossible...)

[1] CR Rao, Linear Statistical Inference and Its Applications, 2nd ed. Wiley 1973.

-- Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen, Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark. Ph: (+45) 35327918, FAX: (+45) 35327907, ([EMAIL PROTECTED])
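The closed form above can be wrapped in a small function and compared with `det()` for other values of alpha, beta and k. This is only a sketch; the helper names `cs_det` and `cs_matrix` are invented here and are not part of any package.

```r
# Closed-form determinant of a k x k compound symmetric matrix
# (alpha on the diagonal, beta everywhere else).
cs_det <- function(alpha, beta, k) {
  (alpha - beta)^(k - 1) * (alpha - beta + k * beta)
}

# Build the matrix explicitly so the formula can be checked numerically.
cs_matrix <- function(alpha, beta, k) {
  m <- matrix(beta, k, k)
  diag(m) <- alpha
  m
}

m <- cs_matrix(0.9, 0.1, 7)
det(m)                                    # numerical determinant
cs_det(0.9, 0.1, 7)                       # closed form
all.equal(det(m), cs_det(0.9, 0.1, 7))    # agree up to rounding
```

The closed form is also cheaper and more stable than `det()` for large k, since no factorisation of the matrix is needed.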

### [R] treatment effect at specific time point within mixed effects model

All,

The code below is for a pseudo dataset of repeated measures on patients where there is also a treatment factor called drug. Time is treated as categorical.

What code is necessary to test for a treatment effect at a single time point, e.g., time = 3? Does the answer matter if the design is a crossover design, i.e., each patient received drug and placebo? Finally, what would be a good response to someone who suggests doing a simple t-test (paired in the crossover case) instead of the test above within a mixed model?

thanks!
dave

    library(nlme)  # provides groupedData() and lme()
    z = rnorm(24, mean = 0, sd = 1)
    time = rep(1:6, 4)
    Patient = rep(1:4, each = 6)
    drug = factor(rep(c("I", "P"), each = 6, times = 2))  ## P = placebo, I = Ibuprofen
    dat.new = data.frame(time, drug, z, Patient)
    data.grp = groupedData(z ~ time | Patient, data = dat.new)
    fm1 = lme(z ~ factor(time) + drug + factor(time):drug, data = data.grp, random = list(Patient = ~ 1))

### Re: [R] package usage statistics. (UPDATE)

Dear Roger,

Tuesday, September 26, 2006, 4:16:38 PM, you wrote:

RB> On Tue, 26 Sep 2006, Vladimir Eremeev wrote:
RB>> Here is the perl script with some comments
RB> ??

Sorry, I forgot to mention: this script is designed to run from the root of the working directory tree. It scans all R session histories and scripts and analyzes them. This is performed through the call

    find({ wanted => \&wanted, no_chdir => 1 }, '.');

The second parameter to find is a list of directories. This allows one, for example, to build a histogram of package usage.

--- Best regards, Vladimir mailto:[EMAIL PROTECTED]

### [R] October R/Splus course in Washington DC, San Francisco, Seattle *** R/Splus Fundamentals and Programming Techniques

XLSolutions Corporation (www.xlsolutions-corp.com) is proud to announce our 2-day October 2006 R/S-plus Fundamentals and Programming Techniques : www.xlsolutions-corp.com/Rfund.htm *** Washington DC / October 12-13, 2006 *** Seattle Wa / October 19-20 *** San Francisco / October 26-27 Reserve your seat now at the early bird rates! Payment due AFTER the class Course Description: This two-day beginner to intermediate R/S-plus course focuses on a broad spectrum of topics, from reading raw data to a comparison of R and S. We will learn the essentials of data manipulation, graphical visualization and R/S-plus programming. We will explore statistical data analysis tools,including graphics with data sets. How to enhance your plots, build your own packages (librairies) and connect via ODBC,etc. We will perform some statistical modeling and fit linear regression models. Participants are encouraged to bring data for interactive sessions With the following outline: - An Overview of R and S - Data Manipulation and Graphics - Using Lattice Graphics - A Comparison of R and S-Plus - How can R Complement SAS? - Writing Functions - Avoiding Loops - Vectorization - Statistical Modeling - Project Management - Techniques for Effective use of R and S - Enhancing Plots - Using High-level Plotting Functions - Building and Distributing Packages (libraries) - Connecting; ODBC, Rweb, Orca via sockets and via Rjava Email us for group discounts. Email Sue Turner: [EMAIL PROTECTED] Phone: 206-686-1578 Visit us: www.xlsolutions-corp.com/training.htm Please let us know if you and your colleagues are interested in this classto take advantage of group discount. Register now to secure your seat! Interested in R/Splus Advanced course? email us. Cheers, Elvis Miller, PhD Manager Training. 
XLSolutions Corporation 206 686 1578 www.xlsolutions-corp.com [EMAIL PROTECTED]

### Re: [R] printing a variable name in a for loop

Example:

    lst <- list(variable1 = variable1, variable2 = variable2, variable3 = variable3)
    for (kk in seq(along = lst)) {
      name <- names(lst)[kk]
      value <- lst[[kk]]
      cat("Hello,", name, value, ", World,")
    }

Note that the list has to be named, otherwise names(lst) is NULL.

/Henrik

On 9/26/06, Jim Lemon [EMAIL PROTECTED] wrote:

> Suzi Fei wrote:
>
>> Hello, How do you print a variable name in a for loop? I'm trying to construct a csv file that looks like this:
>>
>>     Hello, variable1, value_of_variable1, World,
>>     Hello, variable2, value_of_variable2, World,
>>     Hello, variable3, value_of_variable3, World,
>>
>> Using this:
>>
>>     for (variable in list(variable1, variable2, variable3)){
>>       cat("Hello,", ???variable???, variable, ", World,")
>>     }
>>
>> This works fine if I'm trying to print the VALUE of variable, but I want to print the NAME of variable as well.
>
> This is a teetering heap of assumptions, but is this what you wanted?
>
>     Suzi <- 1
>     HiYa <- function(x) {
>       cat("Hello", deparse(substitute(x)), x, "World\n", sep = ", ")
>     }
>     HiYa(Suzi)
>
> Jim
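A sketch of the named-list approach that builds the CSV-style lines as a character vector before writing them. The helper name `hello_lines` and the example values are invented for illustration; the output layout follows the one asked for in the original question.

```r
# Build one "Hello, <name>, <value>, World," line per element of a named list.
hello_lines <- function(lst) {
  stopifnot(!is.null(names(lst)))  # the names carry the variable names
  paste0("Hello, ", names(lst), ", ",
         vapply(lst, format, character(1)), ", World,")
}

variable1 <- 1
variable2 <- 2.5
variable3 <- 10

lst <- list(variable1 = variable1, variable2 = variable2, variable3 = variable3)
writeLines(hello_lines(lst))  # pass con = a file name to write to disk instead
```

Keeping the values in a named list from the start is usually simpler than recovering names with `deparse(substitute())`, which only works on the expression passed directly to a function.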

### Re: [R] putting stuff into bins...

Probably a combination of cut() and tapply() could be of help in this case, e.g.,

    x <- rnorm(100)
    tapply(x, cut(x, -4:4), mean)

Best,
Dimitris

Dimitris Rizopoulos, Ph.D. Student, Biostatistical Centre, School of Public Health, Catholic University of Leuven. Address: Kapucijnenvoer 35, Leuven, Belgium. Tel: +32/(0)16/336899, Fax: +32/(0)16/337015. Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: Federico Calboli [EMAIL PROTECTED]
To: r-help r-help@stat.math.ethz.ch
Sent: Tuesday, September 26, 2006 1:44 PM
Subject: [R] putting stuff into bins...

Hi All,

I have a vector of data and a vector of bin breakpoints, and I want to put my data in the bins and then extract fanciful information like the mean value of each bin. I know I can write my own function, but I would have thought that R should have somewhere a function that takes as arguments something like (data, breaks, what to do with the data in the bins). I surely could not find it trawling the R-help archives though. If such a function exists I'd be grateful to anyone pointing it out to me.

Cheers,
Fede

-- Federico C. F. Calboli, Department of Epidemiology and Public Health, Imperial College, St Mary's Campus, Norfolk Place, London W2 1PG. Tel +44 (0)20 7594 1602, Fax (+44) 020 7594 3193. f.calboli [.a.t] imperial.ac.uk, f.calboli [.a.t] gmail.com

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
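A slightly expanded sketch of the cut() plus tapply() idea, keeping the bin factor so that several per-bin summaries can be computed from it. The seed and the break points are arbitrary choices for illustration.

```r
set.seed(42)
x <- rnorm(100)
breaks <- -4:4

# cut() turns each value into a factor level naming its interval, e.g. "(-1,0]".
bins <- cut(x, breaks)

bin_means  <- tapply(x, bins, mean)  # NA for bins that received no data
bin_counts <- table(bins)            # how many values fell into each bin

bin_means
bin_counts
```

Values outside the range of `breaks` become `NA` in `bins` and are silently dropped from the summaries, so it is worth checking `sum(is.na(bins))` on real data.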

### Re: [R] Accessing C- source code of R

Gunther Höning wrote:

> Dear list, I'm looking for the source code of parts of R (e.g. spline). Does anyone know where I can access it?

I plan to write a corresponding R Help Desk article on "Accessing the source". A draft is available from:

http://www.statistik.uni-dortmund.de/~ligges/R_Help_Desk_preview.pdf

Can you please tell me if this description is sufficient?

Thanks,
Uwe Ligges

### [R] venn diagram with more than three vectors

Hi,

I am using the venn diagram function in AMDA to plot a Venn diagram, but it seems this function can only plot 3 or fewer vectors. Is there a way to plot a Venn diagram with more than 3 vectors? Please help.

Thanks.
Z

### Re: [R] About the display of matrix

>>>>> "SQW" == S Q WEN [EMAIL PROTECTED] on Mon, 25 Sep 2006 23:12:10 -0700 writes:

SQW> For a matrix A, I don't want to display the zero
SQW> elements in it. How to do that?

(Using a fairly recent version of R.) Either use as.table() and use the print() method for "table" explicitly, or, if you are really working with sparse matrices, use the 'Matrix' package:

    set.seed(1); m <- matrix(rpois(80, lambda=.8), 8,10); m
         [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
    [1,]    0    1    1    0    1    2    1    0    1     0
    [2,]    0    0    4    0    0    1    1    1    0     0
    [3,]    1    0    0    0    2    1    1    1    1     1
    [4,]    2    0    1    0    1    1    2    0    1     2
    [5,]    0    1    2    2    1    1    0    2    0     2
    [6,]    2    0    0    0    0    1    0    0    2     0
    [7,]    2    1    1    1    1    0    0    1    0     1
    [8,]    1    1    0    1    0    1    0    0    2     3

    print(as.table(m), zero = ".")
      A B C D E F G H I J
    A . 1 1 . 1 2 1 . 1 .
    B . . 4 . . 1 1 1 . .
    C 1 . . . 2 1 1 1 1 1
    D 2 . 1 . 1 1 2 . 1 2
    E . 1 2 2 1 1 . 2 . 2
    F 2 . . . . 1 . . 2 .
    G 2 1 1 1 1 . . 1 . 1
    H 1 1 . 1 . 1 . . 2 3

    library(Matrix)
    Loading required package: lattice
    M <- Matrix(m, sparse = TRUE)
    M
    8 x 10 sparse Matrix of class "dgCMatrix"
    [1,] . 1 1 . 1 2 1 . 1 .
    [2,] . . 4 . . 1 1 1 . .
    [3,] 1 . . . 2 1 1 1 1 1
    [4,] 2 . 1 . 1 1 2 . 1 2
    [5,] . 1 2 2 1 1 . 2 . 2
    [6,] 2 . . . . 1 . . 2 .
    [7,] 2 1 1 1 1 . . 1 . 1
    [8,] 1 1 . 1 . 1 . . 2 3

--- Martin Maechler, ETH Zurich
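A base-R-only sketch of the first suggestion. The argument is spelled out in full here as `zero.print` (the call `zero = "."` in the message relies on partial argument matching), and `capture.output()` is used only to show that the printed text can also be kept as strings.

```r
set.seed(1)
m <- matrix(rpois(80, lambda = 0.8), 8, 10)

# print.table() has a zero.print argument: zeros are shown as the given string.
print(as.table(m), zero.print = ".")

# The same printout captured as a character vector, e.g. for a report.
shown <- capture.output(print(as.table(m), zero.print = "."))
length(shown)  # one header line plus one line per matrix row
```

This only changes how the matrix is displayed; `m` itself still stores the zeros. For genuinely sparse data the 'Matrix' package route also saves memory.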

### Re: [R] Need help with boxplots

The problem with a line, I think, would be that the width of the boxes can vary depending on the number of boxes in the plot, etc. No doubt it could be done, but you'd probably have to look into the bxp function to see how the widths are calculated.

On 26/09/06, laba diena [EMAIL PROTECTED] wrote:

> How to add a mean *line* in the boxplot keeping the median line? Maybe it is possible to do using the *segments* function? For example in this:
>
>     set.seed(1)
>     a <- rnorm(10)
>     b <- rnorm(10)
>     boxplot(a, b)

-- David Barron, Said Business School, University of Oxford, Park End Street, Oxford OX1 1HP

### Re: [R] calculating dissimilarities in R

Hi Elvina,

>>>>> "Elvina" == Elvina Payet [EMAIL PROTECTED] on Tue, 26 Sep 2006 05:48:01 GMT writes:

### Re: [R] Statistical data and Map-package

You may also want to look at the maptools (and sp) packages; they can read in and plot shapefiles from external sources. Some sources of maps that maptools can plot include:

http://www.vdstech.com/map_data.htm
http://openmap.bbn.com/data/shape/timezone/
http://arcdata.esri.com/data_downloader/DataDownloader?part=10200stack= back

Hope this helps,

-- Gregory (Greg) L. Snow Ph.D., Statistical Data Center, Intermountain Healthcare, [EMAIL PROTECTED], (801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Rense Nieuwenhuis
Sent: Tuesday, September 26, 2006 2:21 AM
To: r-help@stat.math.ethz.ch
Subject: [R] Statistical data and Map-package

Dear helpeRs,

I'm working with the map package and came upon a problem which I couldn't solve. I hope one of you can. If not, this can be seen as a suggestion for new versions of the package.

I'm trying to create a map of some European countries, filled with colors corresponding to some values. Let's say I have the following countries and I assign the following colors (fictional):

    country2001 <- c("Austria", "Belgium", "Switzerland", "Czechoslovakia", "Germany", "Denmark", "Spain", "Finland", "France", "UK", "Greece", "Hungary", "Ireland", "Israel", "Italy", "Luxembourg", "Netherlands", "Norway", "Poland", "Portugal", "Sweden", "Slovenia")
    color2001 <- c("green", "yellow", "red", "red", "red", "red", "red", "red", "green", "red", "red", "red", "red", "red", "red", "red", "red", "blue", "red", "red", "red", "orange")

I then let the colors and the values correspond using 'match.map', like this:

    match <- match.map("world", country2001)
    color <- color2001[match]

And finally I plot the map. It works perfectly fine.

    map(database = "world", fill = TRUE, col = color)

But as I mentioned, I want to create a map of Europe. So, I use xlim and ylim to let some parts of the world fall off the map. The syntax becomes like this:

    map(database = "world", fill = TRUE, col = color, xlim = c(-25, 70), ylim = c(35, 71))

Now a problem arises. The regions on the map are colored by the vector 'color', which therefore needs to correspond to the order in which the polygons are drawn. Since part of the full world map isn't drawn this time, the color vector doesn't correspond anymore. This results in the coloring of the wrong countries.

Does anybody know of a way to solve this?

Thanks very much in advance,
Rense Nieuwenhuis

### Re: [R] Need help with boxplots

And if you really do want a line segment try this:

    M <- c(mean(a), mean(b))
    segments(1:2 - 0.4, M, 1:2 + 0.4, M, col = "red")

On 9/26/06, Gabor Grothendieck [EMAIL PROTECTED] wrote:

> To prevent confusion you might want to use a red dot rather than a line:
>
>     points(1:2, c(mean(a), mean(b)), col = "red")
>
> and perhaps label it since it's non-standard:
>
>     text(1:2, c(mean(a), mean(b)), "Mean", pos = 4)
>
> On 9/26/06, laba diena [EMAIL PROTECTED] wrote:
>
>> How to add a mean line in the boxplot keeping the median line? For example in this:
>>
>>     set.seed(1)
>>     a <- rnorm(10)
>>     b <- rnorm(10)
>>     boxplot(a, b)
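Putting the pieces of this thread together, a complete sketch might look like the following. The half-width of 0.4 is an assumption that matches boxplot()'s default box width with default settings; it would need adjusting if `boxwex`, `width` or `varwidth` are changed.

```r
set.seed(1)
a <- rnorm(10)
b <- rnorm(10)

bp <- boxplot(a, b)           # boxes are drawn at x = 1 and x = 2
M <- c(mean(a), mean(b))

half <- 0.4                   # assumed half-width of a default box
segments(1:2 - half, M, 1:2 + half, M, col = "red", lty = 2)

# Label the extra line, since a mean line is non-standard in a boxplot.
text(1:2 + half, M, "Mean", pos = 4, col = "red", xpd = NA)
```

A dashed red line keeps the mean visually distinct from the solid median line drawn by boxplot() itself.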

### Re: [R] lapply, plot and additional arguments

Maybe something like this could help:

    x <- data.frame(a = 1:9, beta = exp(-4:4), logic = rep(c(TRUE, FALSE), c(5, 4)))
    x.l <- split(x, x$logic)
    plot(x$a, x$beta)
    mapply(function(x, y) lines(x$a, x$beta, col = y), x.l, 1:2)

Best,
Dimitris

Dimitris Rizopoulos, Ph.D. Student, Biostatistical Centre, School of Public Health, Catholic University of Leuven. Address: Kapucijnenvoer 35, Leuven, Belgium. Tel: +32/(0)16/336899, Fax: +32/(0)16/337015. Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: Petr Pikal [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Tuesday, September 26, 2006 5:40 PM
Subject: [R] lapply, plot and additional arguments

Dear all,

Hopefully somebody will know the answer. I have some list

    x <- data.frame(a = 1:9, beta = exp(-4:4), logic = rep(c(TRUE,FALSE), c(5,4)))
    x.l <- split(x, x$logic)
    plot(x.l$a, x.l$beta)

and I want to plot lines color coded according to the logic variable:

    lapply(x.l, function(x, ...) lines(x$a, x$beta, col=1:2))
    lapply(x.l, function(x,...) lines(x$a,x$beta), col=1:2)
    lapply(x.l, function(x,...) lines(x$a,x$beta, ...), col=1:2)

Well, lapply seems to ignore my best attempts to persuade it to use different colours for each part of the x.l list. Does anybody know how to code different colours when using lapply for such plotting? At present I use a loop, but maybe lapply could do it too.

Best regards,
Petr

Petr Pikal [EMAIL PROTECTED]

Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm
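The lapply() attempts pass the whole vector `col = 1:2` to every call, so each group receives the same colour specification; mapply() instead walks over the groups and the colours in parallel. A self-contained sketch, with the colour names chosen arbitrarily:

```r
x <- data.frame(a = 1:9, beta = exp(-4:4),
                logic = rep(c(TRUE, FALSE), c(5, 4)))
x.l <- split(x, x$logic)        # list with elements "FALSE" and "TRUE"
cols <- c("red", "blue")        # one colour per group (arbitrary choice)

plot(x$a, x$beta, type = "n")   # set up the axes, draw nothing yet

# mapply() pairs the i-th data frame with the i-th colour.
invisible(mapply(function(d, col) lines(d$a, d$beta, col = col), x.l, cols))

# The same effect with lapply(), by iterating over indices instead of elements.
invisible(lapply(seq_along(x.l), function(i)
  lines(x.l[[i]]$a, x.l[[i]]$beta, col = cols[i])))
```

Both variants draw the same lines; wrapping them in `invisible()` only suppresses the list of NULLs that lines() returns.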

### [R] set off error messages

Hello there!

I'm creating a loop for(i in 1:n){...} within which I build an nls model at each iteration. For some values of i, the algorithm in the nls function doesn't converge or cannot find a solution; an error message is produced and my loop is interrupted. The errors don't really matter to me, as all the other values might still be useful, so I want to ignore them: the models for which no solution is found should just return NA values, so that I get a value for every i. How can I turn off the error message and return NA values instead?

Thanks in advance for your help...

Fabian

### Re: [R] set off error messages

On Tue, 26 Sep 2006, Mollet, Fabian wrote:

> Hello there! I'm creating a loop for(i in 1:n){...} within which I build an nls model at each iteration. For some values of i, the algorithm in the nls function doesn't converge or cannot find a solution; an error message is produced and my loop is interrupted. The errors don't really matter to me, as all the other values might still be useful, so I want to ignore them: the models for which no solution is found should just return NA values, so that I get a value for every i. How can I turn off the error message and return NA values instead?

This is a FAQ (7.32)

-thomas

### [R] Extention of Pie Chart in R (was Re: Adding percentage to Pie Charts)

Jim Lemon jim at bitwrit.com.au writes: I admit to interpreting this pretty loosely, but I would like to know what people think of a fan plot. Hi all, I tried the fan.plots that Jim has been very nice to provide. It made me think if there was something like, clock.plots in R? Something like the following, anything that comes close? The idea an extention in yet another way of Pie Charts, extending the fan.plots provided by Jim. * A value will be depicted on a clock.plot using 1 or 2 hands of an analog clock on a circle calibrated from 0 to 100 (same as 0). * For values between 0 and 99 use the position of only one hand of the clock (needle). * For values of 100, use the second hand (needle), and move it to 1. * Some way to identify needles, and two two overlapping needles. * Use color coding or line-types to differentiate variables. This is basically a clock calibrated on a scale of 100, rather than 60. It can visually depict values between 1 and 1. Do we have something like this R? Anupam. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.

### Re: [R] set off error messages

Try ?try

On 26/09/06, Mollet, Fabian [EMAIL PROTECTED] wrote:

> Hello there! I'm creating a loop for(i in 1:n){...} within which I build an nls model at each iteration. For some values of i, the algorithm in the nls function doesn't converge or cannot find a solution; an error message is produced and my loop is interrupted. The errors don't really matter to me, as all the other values might still be useful, so I want to ignore them: the models for which no solution is found should just return NA values, so that I get a value for every i. How can I turn off the error message and return NA values instead?
>
> Thanks in advance for your help...
>
> Fabian

-- David Barron, Said Business School, University of Oxford, Park End Street, Oxford OX1 1HP
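A sketch of what the `?try` hint amounts to in a loop, written here with tryCatch(). The helper names `na_on_error` and `fit_b`, the exponential model, the simulated data and the deliberately absurd starting value are all invented for illustration; they are not from the original question.

```r
# Call f(...) and return its value, or NA if it signals an error.
na_on_error <- function(f, ...) {
  tryCatch(f(...), error = function(e) NA)
}

# Fit y ~ a * exp(b * x) and return the estimate of b.
fit_b <- function(d, start_b) {
  fit <- nls(y ~ a * exp(b * x), data = d, start = list(a = 1, b = start_b))
  unname(coef(fit)["b"])
}

set.seed(1)
d <- data.frame(x = 1:10)
d$y <- 2 * exp(0.3 * d$x) + rnorm(10, sd = 0.5)

starts <- c(0.2, 1000, 0.3)   # the middle value is meant to make nls() fail
est <- numeric(length(starts))
for (i in seq_along(starts)) {
  est[i] <- na_on_error(fit_b, d, starts[i])  # the loop keeps going on errors
}
est
```

The same pattern with `try()` would be `res <- try(fit_b(d, s), silent = TRUE)` followed by a check such as `if (inherits(res, "try-error")) NA else res`.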

### Re: [R] Not all functions work in RSPerl package?

Hi, Prof Duncan I am sorry to report to a wrong place. But I am lucky to meet you by chance, right? Thanks first ^^ 1. The variable y1 is an array get from Perl, each element is from a database (the type should be numeric). Here is the code for that. $query = qq{ select exonCount, count(hsEnsGene) as geneCount from countTop1000ks group by exonCount order by exonCount; }; $sql=$orthologDB-prepare($query); $sql-execute()or die Could not execute '$query' ...; my @x1; my @y1; while(my($exonCount, $geneCount) = $sql-fetchrow_array()) { push(@x1, $exonCount); push(@y1, $geneCount); } Here is the result if I print out the value in @y1: print y1---,join( ,@y1), ---end\n; % y1---101 44 33 26 8 15 18 13 3 5 4 2 1 4 1 1---end But when I call R::callWithNames(barplot, {'',[EMAIL PROTECTED], 'main', 'Barplot the Gene number per exon with top1000 low Ks', 'xlab', Exon(low ks) number in the gene,'ylab', 'Numbers of gene'}); It always says non-numeric argument: % Error in -0.01 * height : non-numeric argument to binary operator If I asign the value to another array, like my @x=(101, 44, 33, 26, 8, 15, 18, 13, 3, 5, 4, 2, 1, 4, 1, 1); R::callWithNames(barplot, {'',[EMAIL PROTECTED], 'main', 'Barplot the Gene number per exon with top1000 low Ks', 'xlab', Exon(low ks) number in the gene,'ylab', 'Numbers of gene'}); then it works. I don't know why and what the difference is. I also thought whether it is because of the different data type between Perl and R, because in Perl, 3 and 3 could be same sometime. So I call R::callWithNames(as.numeric,{'',[EMAIL PROTECTED]); before I call R::callWithNames(barplot, {'',[EMAIL PROTECTED]); Same error! Same case if I change to use the R::boxplot([EMAIL PROTECTED]) as you said. I am not sure I explain clear this time. Looking forwards to your response! Regards, -Xianjun On Thu, 2006-09-21 at 07:33 -0700, Duncan Temple Lang wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi Xianjun [Important: Please don't send mail about an R package to r-bugs. 
That is for reporting bugs in R itself. Add on packages are different and it is only a coincidence that I am one of the R-core developers and package author. In general, all bug reports about a package should be sent to the author and questions should go to the author and the r-devel or r-help list as appropriate.] Is the problem you report a bug? Well not necessarily in RSPerl, but in your code. Unfortunately, you haven't told me what the variable y1 contains so it is hard to figure out what is going into the computations. A couple of things: a) Your example is calling boxplot in the first call and barplot in the second. b) in the first example, you are passing @y1 and in the second you are passing [EMAIL PROTECTED] I would guess that [EMAIL PROTECTED] is more appropriate and you might try that in the first case. c) the first case doesn't have any named arguments (just '') so why use callWithNames. Just R::boxplot([EMAIL PROTECTED]) You are calling the R functions, but you are getting an error during the invocation. The error message is coming from R. So the problem is that you are passing inputs to the functions that it cannot handle. This can happen directly in R and so also in RSPerl. My guess is that you don't have the correct type of data in @y1 or that you are not passing it in the call as a reference. Xianjun Dong wrote: Hi, It looks that not all function in R could be implemented by RSPerl. For example, when I call R::callWithNames(boxplot, {'',@y1}); or R::barplot([EMAIL PROTECTED]); There would be error: Error in -0.01 * height : non-numeric argument to binary operator Caught error in R::call() The same happened when calling barplot, but it's ok to call plot. Is it a bug? - -- Duncan Temple Lang[EMAIL PROTECTED] Department of Statistics work: (530) 752-4782 4210 Mathematical Sciences Building fax: (530) 752-7099 One Shields Ave. 
University of California at Davis Davis, CA 95616, USA

### [R] read.xport: Writing and reading dataframe to disk directly

Hi All,

Is there a way of writing directly to a disk file the dataframe, or list of dataframes, that results from the read.xport function? This function converts SAS export files to R dataframes. I would like to convert a SAS transport file to R, but the resulting R dataframes do not fit in the memory of my computer. Is there a way to write the output of this function to disk, perhaps using some pipe or connection facility? Something like:

    library(foreign)  # provides lookup.xport() and read.xport()
    filexpt.lst <- lookup.xport("file.xpt")
    # works very well and returns a list with all kinds of information about
    # variable names, formats, labels, etc.
    save(filexpt.df <- read.xport("file.xpt"), file = "filexpt.Rdata")
    # from what I can tell, this will not work.

? Is there a way to use a pipe or connection to write filexpt.df to disk as it is being created?

? Is there a way to use a connection to an R dataframe on disk, so I can get subsets (rows or columns) from the dataframe on disk, without having to read it into memory?

I will be thankful for your help and suggestions.

Anupam.

### [R] New project: littler for GNU R

What?

littler - Provides hash-bang (#!) capability for R (www.r-project.org)

Why?

GNU R, a language and environment for statistical computing and graphics, provides a wonderful system for 'programming with data' as well as interactive exploratory analysis, often involving graphs.

Sometimes, however, simple scripts are desired. While GNU R can be used in batch mode, and while so-called 'here' documents can be crafted, a long-standing need for a scripting front-end has often been expressed by the R Community.

littler (pronounced 'little R' and written 'r') aims to fill this need.

It can be used directly on the command-line just like, say, bc(1):

    $ echo 'cat(pi^2,"\n")' | r
    9.869604

Equivalently, commands that are to be evaluated can be given on the command-line:

    $ r -e 'cat(pi^2, "\n")'
    9.869604

But unlike bc(1), GNU R has a vast number of statistical functions. For example, we can quickly compute a summary() and show a stem-and-leaf plot for file sizes in a given directory via

    $ ls -l /boot | awk '!/^total/ {print $5}' | \
         r -e 'fsizes <- as.integer(readLines()); print(summary(fsizes)); stem(fsizes)'
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
         13     512  110100  486900  768400 4735000
    Loading required package: grDevices

      The decimal point is 6 digit(s) to the right of the |

      0 | 002223
      0 | 5557778899
      1 | 112233
      1 | 5
      2 |
      2 |
      3 |
      3 |
      4 |
      4 | 7

And, last but not least, this (somewhat unwieldy) expression can be stored in a helper script:

    $ cat examples/fsizes.r
    #!/usr/bin/env r

    fsizes <- as.integer(readLines())
    print(summary(fsizes))
    stem(fsizes)

(where calling /usr/bin/env is a trick from Python which allows one to forget whether r is installed in /usr/bin/r, /usr/local/bin/r, ~/bin/r, ...)

A few examples are provided in the source directories examples/ and tests/.

Where?

littler can either be downloaded from

    http://biostat.mc.vanderbilt.edu/LittleR

accessed by anonymous SVN:

    $ svn co http://littler.googlecode.com/svn/trunk/ littler

or (soon!) be gotten from Debian mirrors via

    $ apt-get install littler

littler is known to build and run on Linux and OS X.

Who?

Copyright (C) 2006 Jeffrey Horner and Dirk Eddelbuettel

littler is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

Comments are welcome, as are suggestions, bug fixes, or patches.

- Jeffrey Horner [EMAIL PROTECTED]
- Dirk Eddelbuettel [EMAIL PROTECTED]
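
The body of the fsizes.r helper does not depend on littler itself, so it can be tried in an ordinary R session first. The following is only a sketch: the `sizes` vector is made up and stands in for the lines that `readLines()` would receive from the `ls -l | awk` pipeline.

```r
# Hypothetical input: the strings readLines() would return,
# one file size (in bytes) per line of standard input.
sizes <- c("13", "512", "110100", "768400", "4735000")

fsizes <- as.integer(sizes)   # same conversion as in fsizes.r
print(summary(fsizes))        # minimum, quartiles, mean, maximum
stem(fsizes)                  # stem-and-leaf plot on the console
```

Once the three statements behave as expected interactively, the same lines should be usable unchanged as the script body, with `readLines()` taking the place of the hard-coded vector.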

### Re: [R] New project: littler for GNU R

Any plans for Windows?

On 9/26/06, Jeffrey Horner [EMAIL PROTECTED] wrote:

> littler is known to build and run on Linux and OS X.

### Re: [R] dotplot, dropping unused levels of 'y'

Deepayan Sarkar wrote:

> On 9/15/06, Benjamin Tyner [EMAIL PROTECTED] wrote:
>
> > In dotplot, what's the best way to suppress the unused levels of 'y' on a per-panel basis? This is useful for the case that 'y' is a factor taking perhaps thousands of levels, but for a given panel, only a handful of these levels ever present.
>
> It's a bit problematic. Basically, you can use relation="free"/"sliced", but y behaves as as.numeric(y) would. So, if the small subset in each panel are always more or less contiguous (in terms of the levels being close to each other) then you would be fine. Otherwise you would not. In that case, you can still write your own prepanel and panel functions, e.g.:
>
>     library(lattice)
>
>     y <- factor(sample(1:100), levels = 1:100)
>     x <- 1:100
>     a <- gl(9, 1, 100)
>
>     dotplot(y ~ x | a)
>
>     p <-
>         dotplot(y ~ x | a,
>                 scales = list(y = list(relation = "free", rot = 0)),
>                 prepanel = function(x, y, ...) {
>                     yy <- y[, drop = TRUE]
>                     list(ylim = levels(yy),
>                          yat = sort(unique(as.numeric(yy))))
>                 },
>                 panel = function(x, y, ...) {
>                     yy <- y[, drop = TRUE]
>                     panel.dotplot(x, yy, ...)
>                 })
>
> Hope that gives you what you want.
>
> Deepayan

I've been trying to extend this to allow groups, but am running into a bit of trouble. For example, the following doesn't quite work (some of the unused factor levels are suppressed per panel, but not all):

    set.seed(47905)
    temp3 <- data.frame(s_port = factor(rpois(100, 10)),
                        POSIXtime = structure(1:100, class = c("POSIXt", "POSIXct")),
                        l_ipn = factor(rpois(100, 10)),
                        duration = runif(100),
                        locality = sample(1:4, replace = TRUE, size = 100),
                        l_role = sample(c(-1, 1), replace = TRUE, size = 100))

    plot <- dotplot(s_port ~ POSIXtime | l_ipn,
                    data = temp3,
                    layout = c(1, 1),
                    pch = "|",
                    col = 1:8,
                    duration = temp3$duration,
                    auto.key = list(col = 1:8, points = FALSE),
                    groups = locality * l_role,
                    prepanel = function(x, y, ...) {
                        yy <- y[, drop = TRUE]
                        list(ylim = levels(yy),
                             yat = sort(unique(as.numeric(yy))))
                    },
                    panel = panel.superpose,
                    panel.groups = function(x, y, subscripts, duration, col, ...) {
                        yy <- y[, drop = TRUE]
                        yy.n <- as.numeric(yy)
                        panel.abline(h = yy.n, col = "lightgray")
                        panel.xyplot(x = x, y = yy.n, subscripts = subscripts, col = col, ...)
                        panel.segments(x, yy.n, x + duration[subscripts], yy.n, col = col)
                    },
                    scales = list(y = list(relation = "free"), x = list(rot = 45)),
                    xlab = "time",
                    ylab = "source port")

Thanks,
Ben

### Re: [R] New project: littler for GNU R

On 26 September 2006 at 13:14, Gabor Grothendieck wrote:
| Any plans for Windows?

Someone with deeper knowledge of the Windows build process would need to help us. Interested?

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
  -- Thomas A. Edison

### Re: [R] creation of new variables

You may not have told us quite enough to be able to help you.

It may be worth your while investing some time in describing the problem you are trying to solve a little more comprehensively.

The posting guide http://www.R-project.org/posting-guide.html can be useful in helping you frame a question that stands a better chance of receiving help.

On 9/26/06, nalluri pratap [EMAIL PROTECTED] wrote:

> Hello All,
>
> I have 8 variables named a b c d e f g h. I need to create four variables from these 8 variables in R. The new variables are ab, cd, ef, gh.
>
> Can anyone please help me?
>
> Thanks,
> Pratap

--
Regards,
Mike Nielsen

### Re: [R] New project: littler for GNU R

Wow, looks neat.

OS X users will be unhappy with your naming choice as the default filesystem there is not case-sensitive :-(

IOW, r and R do the same thing. I would expect it to otherwise work on OS X so a change of some sort might be worthwhile.

+ seth

### [R] How to Pack a matrix

Hello,

Suppose I have a matrix a where

    a=
          sp1 sp2 sp3 sp4 sp5 sp6
    site1   1   0   1   1   0   1
    site2   1   0   1   1   0   1
    site3   1   1   1   1   1   1
    site4   0   1   1   1   0   1
    site5   0   0   1   0   0   1
    site6   0   0   1   0   1   0

And I want to pack that matrix so that the upper left corner contains most of the ones and the bottom right corner contains most of the zeros, so that matrix b is

    b=
          sp3 sp6 sp4 sp1 sp2 sp5
    site1   1   1   1   1   0   0
    site2   1   1   1   1   0   0
    site3   1   1   1   1   1   1
    site4   1   1   1   0   1   0
    site5   1   1   0   0   0   0
    site6   1   0   0   0   0   1

Can any of you help me with some code to accomplish this? I have tried different forms of order and can't seem to figure it out. Basically I want to order the matrix by both the rows and columns.

Thank you for your help.

Cam

Cameron Guenther, Ph.D.
Associate Research Scientist
FWC/FWRI, Marine Fisheries Research
100 8th Avenue S.E.
St. Petersburg, FL 33701
(727)896-8626 Ext. 4305
[EMAIL PROTECTED]

### Re: [R] New project: littler for GNU R

Seth Falcon wrote:

> Wow, looks neat.
>
> OS X users will be unhappy with your naming choice as the default filesystem there is not case-sensitive :-(
>
> IOW, r and R do the same thing. I would expect it to otherwise work on OS X so a change of some sort might be worthwhile.

(I'm always amazed at how I can miss the simplest details. I probably knew at some point that OS X shipped with a case-insensitive file system, which you can turn off somehow, but forgot. Thank goodness for peer review.)

littler will install into /usr/local/bin by default, so I don't think there's a clash with the Mac binary provided by CRAN, right?

Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner

### Re: [R] dotplot, dropping unused levels of 'y'

On 9/26/06, Benjamin Tyner [EMAIL PROTECTED] wrote:

> I've been trying to extend this to allow groups, but am running into a bit of trouble. For example, the following doesn't quite work (some of the unused factor levels are suppressed per panel, but not all):

I don't think panel = panel.superpose is enough. Try

    panel = function(x, y, ...) {
        yy <- y[, drop = TRUE]
        yy.n <- as.numeric(yy)
        panel.superpose(x, yy.n, ...)
    },
    panel.groups = function(x, y, subscripts, duration, col, ...) {
        panel.abline(h = y, col = "lightgray")
        panel.xyplot(x, y, col = col, ...)
        panel.segments(x, y, x + duration[subscripts], y, col = col)
    },

-Deepayan
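
The prepanel and panel functions in this thread both rely on `y[, drop = TRUE]` to discard levels that do not occur in the current panel. A minimal sketch of that step in isolation, using a made-up factor of port numbers rather than the data from the thread:

```r
# A factor with four declared levels, of which this "panel" sees only two.
y <- factor(c("443", "80"), levels = c("22", "80", "443", "8080"))

yy <- y[, drop = TRUE]   # drops the unused levels "22" and "8080"
levels(yy)               # remaining levels, in their original level order
as.numeric(yy)           # integer codes relative to the reduced level set
```

Because the codes are renumbered against the reduced level set, `as.numeric(yy)` gives contiguous positions within a panel, which is what makes `ylim = levels(yy)` and `yat = sort(unique(as.numeric(yy)))` line up.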

### Re: [R] How to Pack a matrix

It looks like your example only reorders the columns but your discussion refers to ordering rows too. I have only addressed the columns part but it is hopefully clear how to extend this or use other objective functions.

We generate every permutation of the columns and define an objective function f which is smaller for more desirable column permutations and then use brute force to find the minimizer:

    library(combinat)

    mat <- structure(c(1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1,
        1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0),
        .Dim = c(6, 6), .Dimnames = list(c("site1", "site2", "site3",
        "site4", "site5", "site6"), c("sp1", "sp2", "sp3", "sp4", "sp5", "sp6")))

    f <- function(p) sum(mat[,p] * (row(mat) + col(mat)))
    perms <- permn(ncol(mat))
    mat[,perms[[which.min(sapply(perms, f))]]]

On 9/26/06, Guenther, Cameron [EMAIL PROTECTED] wrote:

> And I want to pack that matrix so that the upper left corner contains most of the ones and the bottom right corner contains most of the zeros [...] Basically I want to order the matrix by both the rows and columns.
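
The brute-force search above visits all 720 column permutations, which grows quickly with the number of columns. As a cheaper heuristic (not the method proposed in this thread, and not guaranteed to minimize the same objective), rows and columns can simply be sorted by their totals. A sketch on the matrix from the question:

```r
# The 6 x 6 presence/absence matrix from the question, filled column by column.
a <- matrix(c(1, 1, 1, 0, 0, 0,
              0, 0, 1, 1, 0, 0,
              1, 1, 1, 1, 1, 1,
              1, 1, 1, 1, 0, 0,
              0, 0, 1, 0, 0, 1,
              1, 1, 1, 1, 1, 0),
            nrow = 6,
            dimnames = list(paste0("site", 1:6), paste0("sp", 1:6)))

# Fullest rows first, fullest columns first; ties keep their original order.
row_order <- order(-rowSums(a))
col_order <- order(-colSums(a))

b <- a[row_order, col_order]
b
```

On this particular matrix the resulting column order is sp3, sp6, sp4, sp1, sp2, sp5, the same as in the poster's matrix b; the rows are additionally reordered so that site3 comes first.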

### [R] colClasses: supressed 'NA'

Hi,

The colClasses argument seems to be suppressing 'NA' values. How do I fix this? The R script and the first 5 lines of output are below. File test2.dat has blanks that are read as NA when I do not use 'colClasses', but as blanks when I use 'colClasses'.

    temp.df <- read.fwf("test2.dat", widths=c(10,1,1,1,1,2,2,3,3,1),
        col.names=c("psu","losewt","maintain","fewcal","phyact","age","income","weight",
                    "wtdesire","gender"),
        colClasses=c("factor","factor","factor","factor","factor","numeric","factor",
                     "numeric","numeric","factor"),
        nrows=27, comment.char="")

    temp.df
    psu losewt maintain fewcal phyact age income weight wtdesire gender
    1 2003009323 2252 05220 220 1
    2 2003005181 21 2 2 58 08165 145 2
    3 2003015942 21 4 1 76 05142 130 2
    4 2003011406 21 3 1 43 03110 110 2
    5 2003006786 1 4 1 49 06178 145 2

Why am I not getting missing values when I use 'colClasses'?

### [R] Building R for Windows with ATLAS

I think this is not an R-devel question. Sorry to all if I'm wrong, please let me know.

I managed to build R successfully with the default BLAS, but when I change MkRules to use ATLAS BLAS and set the path to C:/cygwin/home/Administrador/ATLAS/lib/WinNT_ATHLONSSE2 I get the following error message (I'm posting only the final part; there was a lot of compilation before this):

    cp R.dll ../../bin/
    Building ../../bin/Rblas.dll
    gcc -shared -s -o ../../bin/Rblas.dll blas00.o dllversion.o Rblas.def \
       -L../../bin -lR -LC:/WinNT_ATHLONSSE2 -lf77blas -latlas
    C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0xb): undefined reference to `s_wsfe'
    C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x27): undefined reference to `do_fio'
    C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x43): undefined reference to `do_fio'
    C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x48): undefined reference to `e_wsfe'
    C:/WinNT_ATHLONSSE2/libf77blas.a(xerbla.o):xerbla.f:(.text+0x5c): undefined reference to `s_stop'
    collect2: ld returned 1 exit status
    make[2]: *** [../../bin/Rblas.dll] Error 1
    make[1]: *** [rbuild] Error 2
    make: *** [all] Error 2

The ATLAS BLAS was built using Cygwin. AFTER building ATLAS BLAS I changed the Path variable, putting C:\Rtools\tools\bin;C:\MinGW\bin before everything else. To build R I followed "R Installation and Administration" and Duncan Murdoch's guide at http://www.murdoch-sutherland.com/Rtools/, including the version of MinGW.

At the ATLAS web page (http://math-atlas.sourceforge.net/errata.html) I found the following:

> Q: I'm linking with C, and getting missing symbols (such as w_wsfe, do_fio, w_esfe or s_stop).
>
> R: These kinds of symbols are Fortran library calls. The problem is that the C linker does not automatically find the Fortran libraries. The most common fix is to either link using your fortran linker, or to rewrite your code so that Fortran routines are not called. If you know where they are, you can also choose to link in the Fortran libraries explicitly

Well, I can understand that there is a huge probability that this is my problem. Unfortunately I know nothing of C or Fortran. Even if I knew that I had these Fortran libraries, I wouldn't know how to link them. I tried to look at the MinGW web page but found nothing.

Any help would be most welcome, please.

Giuseppe Antonaci

Sorry for English errors and lack of knowledge. I hope I made myself understandable.

### Re: [R] creation of new variables

Depends on what these variables are. Are these vectors? If so, a simple a*b etc. should work. If they are columns of a data frame DF, then DF$a*DF$b. If these variables are part of a function, then a*b should also work.

On 9/26/06, nalluri pratap [EMAIL PROTECTED] wrote:

> I have 8 variables named a b c d e f g h. I need to create four variables from these 8 variables in R. The new variables are ab, cd, ef, gh.

--
Ritwik Sinha
Graduate Student
Epidemiology and Biostatistics
Case Western Reserve University
http://darwin.cwru.edu/~rsinha
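
For the data-frame case, the suggestion can be written out as below. This is only a sketch under an assumption the original question never confirms, namely that "ab" means the element-wise product of a and b; the data frame `DF` and its values are made up for illustration.

```r
# Hypothetical data frame holding the eight variables from the question.
DF <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12,
                 e = 1:3, f = 2:4, g = 3:5, h = 4:6)

# Assumption: each new variable is the product of one pair of columns.
DF$ab <- DF$a * DF$b
DF$cd <- DF$c * DF$d
DF$ef <- DF$e * DF$f
DF$gh <- DF$g * DF$h

DF[, c("ab", "cd", "ef", "gh")]
```

If the intended combination is something else (a sum, a pasted label, an interaction), only the right-hand sides change; the pattern of assigning to a new `DF$...` column stays the same.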

### Re: [R] colClasses: supressed 'NA'

Because by default blank fields aren't considered to be missing in factors, but they are in integer vectors.

    f1 <- factor(c(1, 2, "", 3, 4))
    f1
    [1] 1 2   3 4
    Levels:  1 2 3 4

I think you can fix this by specifying na.strings=c("NA","")

On 26/09/06, Anupam Tyagi [EMAIL PROTECTED] wrote:

> The colClasses seem to be suppressing 'NA' values. How do I fix this? [...] File test2.dat has blanks that are read as NA when I do not use 'colClasses', but as blanks when I use 'colClasses'.

--
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP
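
The behaviour described here can be checked without the original test2.dat. The sketch below uses made-up data, and it reads a small comma-separated string with `read.csv()` rather than a fixed-width file with `read.fwf()`, so it illustrates the role of `na.strings` but does not by itself confirm the fix for fields that consist of spaces.

```r
x <- c("1", "2", "", "3", "4")

f_blank <- factor(x)       # "" is kept as an ordinary level
levels(f_blank)
sum(is.na(f_blank))        # no missing values

x_na <- x
x_na[x_na == ""] <- NA     # recode blanks before building the factor
f_na <- factor(x_na)
levels(f_na)
sum(is.na(f_na))           # one missing value

# The same idea at read time, on a small made-up CSV.
csv <- "id,grp\n1,a\n2,\n3,b"
d <- read.csv(text = csv, colClasses = c("numeric", "factor"),
              na.strings = c("NA", ""))
is.na(d$grp)
```

In the fixed-width case it may also be necessary to account for fields that contain blanks rather than empty strings; that part is left untested here.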

### Re: [R] New project: littler for GNU R

On 9/26/2006 1:04 PM, Jeffrey Horner wrote:

> littler (pronounced 'little R' and written 'r') aims to fill this need.
>
> It can be used directly on the command-line just like, say, bc(1):
>
>     $ echo 'cat(pi^2,"\n")' | r
>     9.869604

Is there a technical reason that this couldn't work by modifying the script that invokes R? That would avoid the r/R clash on MacOSX and Windows. In Windows R is R.exe, not a script, so some adjustment would be needed there, but that shouldn't be difficult.

Duncan Murdoch

### Re: [R] colClasses: supressed 'NA'

Anupam Tyagi wrote:

> Hi, The colClasses seem to be suppressing 'NA' values. How do I fix this? R script and first 5 lines of output is below. File test2.dat has blanks that are read as NA when I do not use 'colClasses', but as blanks when I use 'colClasses'.

Well, you say it should be a factor, hence the blank is taken as a level. Otherwise you have to specify na.strings = "".

Uwe Ligges

### Re: [R] New project: littler for GNU R

On 26 September 2006 at 15:48, Duncan Murdoch wrote:
| On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
| > It can be used directly on the command-line just like, say, bc(1):
| >
| >     $ echo 'cat(pi^2,"\n")' | r
| >     9.869604
|
| Is there a technical reason that this couldn't work by modifying the
| script that invokes R? That would avoid the r/R clash on MacOSX and
| Windows. In Windows R is R.exe, not a script, so some adjustment would
| be needed there, but that shouldn't be difficult.

Quite possible. We would surely encourage it. We'd be happy to retire littler to the dustbin when `the real R' can do this too.

Until then, littler appears to serve one of us rather well (as R still can't do shebang-style scripts), and may hence be of interest to others too.

Regards, Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
  -- Thomas A. Edison

### Re: [R] New project: littler for GNU R

On 9/26/06, Seth Falcon [EMAIL PROTECTED] wrote:

> Wow, looks neat.
>
> OS X users will be unhappy with your naming choice as the default filesystem there is not case-sensitive :-(
>
> IOW, r and R do the same thing. I would expect it to otherwise work on OS X so a change of some sort might be worthwhile.

Installing as 'littler' on OS X might be a reasonable solution. Then again, adapting /usr/bin/R to have a python-style -c switch might be the best long-term solution for R 2.5+.

Chris, waiting for apt-get install littler to work :-)

### Re: [R] New project: littler for GNU R

I like this plan and have now played with the concept.

I did the following on Windows in cygwin. It would also work in Unix, and I think could be tickled to work on the standard MS cmd line in Windows. It would certainly work on Windows with a Windows-native port of the basic unix utilities.

    echo 'options(echo=FALSE);cat(pi^2,"\n")' | Rterm --no-save

This produces an output file, that normally shows up in the *shell* buffer, but could be redirected. The obvious place to redirect it to is awk with a script to filter out everything above the echo of the options() line.

The only change to R needed to remove the need for an awk script is to suppress the display of the copyright message and startup information. I suppose that could be done with a new --suppress-startup-info argument to Rterm.

The other optimizations that Jeffrey and Dirk have, such as suppressing the loading of many of the standard packages, would also need to be done.

Very good work and concept.

Rich

### Re: [R] New project: littler for GNU R

The way it should work IMHO is that one can write any of these (in analogy to awk/perl/etc.):

    R -f myprog.R mydata.dat
    R -f myprog.R < mydata.dat
    cat mydata.dat | R -f myprog.R   # or analogously on Windows
    R -e "...some.R.code..." mydata.dat
    R -e "...some.R.code..." < mydata.dat

and there should be a simple way for myprog.R to read the input data that does not require that it know whether it was specified on the command line or redirected.

On 9/26/06, Richard M. Heiberger [EMAIL PROTECTED] wrote:

> I like this plan and have now played with the concept.
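
One way a script could hide the difference between a file named on the command line and redirected input is sketched below. This is an illustration of the wish expressed above, not an existing R or littler feature: `read_input` is a hypothetical helper name, and it assumes that any trailing command-line argument is the data file.

```r
# Hypothetical helper: read lines from the file named in the first trailing
# argument if there is one, otherwise from the process's standard input.
read_input <- function(args = commandArgs(trailingOnly = TRUE)) {
  con <- if (length(args) >= 1) file(args[1]) else file("stdin")
  on.exit(close(con))   # release the connection whichever branch was taken
  readLines(con)
}

# Usage with a temporary file standing in for mydata.dat.
tmp <- tempfile(fileext = ".dat")
writeLines(c("10", "20", "30"), tmp)
x <- as.numeric(read_input(tmp))
mean(x)
```

The `args` parameter defaults to the real command line but can be supplied explicitly, which keeps the helper easy to try out in an interactive session.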

### [R] 5 binary_class models vs one 5-class model

Hi,

I apologize that this question is not very R-related, but I believe many people using R have expertise in, or are interested in, the answer to the following question.

I am having a problem in classification. In bioinformatics studies we always end up with a limited number of samples, and some algorithms cannot handle problems with more than 2 classes. Setting those limitations aside for the time being, I have a general question: suppose I have a classification problem that involves 5 classes. Is there a general research comparison of which approach is more accurate: building 5 binary-class models or building one 5-class model (supposing the cost (penalty) is the same when accuracy is estimated)?

An extended, more practical question: in bioinformatics, if you do not have many samples but face such a problem, which approach would you take?

thanks,

--
Weiwei Shi, Ph.D
Research Scientist
GeneGO, Inc.

Did you always know? No, I did not. But I believed... ---Matrix III

### Re: [R] New project: littler for GNU R

On Tue, 26 Sep 2006, Richard M. Heiberger wrote:

> [...]
>
> The only change to R needed to remove the need for an awk script is to suppress the display of the copyright message and startup information. I suppose that could be done with a new --suppress-startup-info argument to Rterm.

It is called --slave.

> The other optimizations that Jeffrey and Dirk have, such as suppressing the loading of many of the standard packages, would also need to be done.

    Rterm --slave R_DEFAULT_PACKAGES=NULL

and variables is already widely used in the R build process.

> Very good work and concept.
>
> Rich

--
Brian D. Ripley, [EMAIL PROTECTED]
Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel: +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK Fax: +44 1865 272595

### Re: [R] New project: littler for GNU R

On 9/26/06, Richard M. Heiberger [EMAIL PROTECTED] wrote:

> [...]
>
>     echo 'options(echo=FALSE);cat(pi^2,"\n")' | Rterm --no-save
>
> This produces an output file, that normally shows up in the *shell* buffer, but could be redirected. The obvious place to redirect it to is awk with a script to filter out everything above the echo of the options() line.

It seems to me that a big difference between this and littler is how stdin is treated. How would you implement the fsizer.r example using this concept?

> The only change to R needed to remove the need for an awk script is to suppress the display of the copyright message and startup information. I suppose that could be done with a new --suppress-startup-info argument to Rterm.

I typically use --vanilla --slave (which I assume would work on Windows too).

> The other optimizations that Jeffrey and Dirk have, such as suppressing the loading of many of the standard packages, would also need to be done.
>
> Very good work and concept.
>
> Rich

### Re: [R] colClasses: supressed 'NA'

Uwe Ligges ligges at statistik.uni-dortmund.de writes:

> Well, you say it should be a factor, hence is taken as a level. And why not a level.

Thanks for drawing my attention to it. It is a common mistake that easily escapes attention. Thanks a lot.

Anupam.

### [R] Lattice strip labels for two factors

Dear All:

In the following code, which I modified from a previous question, in addition to showing the fact1 level names (y, b, r) in the strips, I also want to have a color bar to indicate the state of every panel (in this example, y corresponds to 1, and b and r correspond to 0). Does anyone have a quick solution?

Thanks

    df <- expand.grid(fact1=c("y","b","r"), fact2=c"far","por","lis","set"), year=1991:2000, value= NA)
    df[,"value"] <- sample(1:50, 120, replace=TRUE)
    df$state <- 0
    df$state[df$fact1=="y"] <- 1
    require(lattice)
    xyplot( value ~ year | fact1, data=df, type="b", subset= fact2=="far",
            strip = strip.custom(bg=gray.colors(1,0.95), factor.levels=c("yellow", "black", "red")),
            layout=c(1,3))

### Re: [R] Lattice strip labels for two factors

On 9/26/06, Joe Moore [EMAIL PROTECTED] wrote:

> Dear All: In the following code which I modified from previous question,

Perhaps you should also have checked if it runs after the modification.

> in addition to show the fact1 level names (y, b, r) in strips, I also want to have a color bar to indicate the state of every panel (in this example, y correspods to 1, and b, r correspond to 0). Does anyone have a quick solution?

No, but this might give you a hint (you need to write a suitable panel function):

    xyplot(value ~ year | fact1:factor(state), data=df, type="b", subset= fact2=="far", layout=c(1,3))

Deepayan

> Thanks
>
> [...]
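One possible way to get a per-panel state indicator, sketched here as an assumption rather than as the approach Deepayan had in mind, is to color each strip's background by the state of its fact1 level. The sketch writes out the `c(` call for fact2 in full so that the data frame can be built, uses `ifelse()` for the state column, and relies on the `which.panel` argument that lattice passes to strip functions. The names `state_by_level` and `strip_cols`, and the two colors, are illustrative choices.

```r
library(lattice)

set.seed(1)
df <- expand.grid(fact1 = c("y", "b", "r"),
                  fact2 = c("far", "por", "lis", "set"),
                  year = 1991:2000, value = NA)
df$value <- sample(1:50, 120, replace = TRUE)
df$state <- ifelse(df$fact1 == "y", 1, 0)

# One state per level of fact1, in factor-level order (y, b, r)
state_by_level <- as.vector(tapply(df$state, df$fact1, max))
strip_cols <- ifelse(state_by_level == 1, "gold", "grey80")

# which.panel holds the level index of the conditioning variable for the
# panel being drawn, so it can select that panel's strip background color.
p <- xyplot(value ~ year | fact1, data = df, type = "b",
            subset = fact2 == "far",
            strip = function(..., which.panel, bg) {
              strip.default(..., which.panel = which.panel,
                            bg = strip_cols[which.panel])
            },
            layout = c(1, 3))
```

Printing `p` draws the three panels, with the strip for level y in one color and the strips for b and r in the other.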

### [R] Bug in formals-

I think this is new since a previous version of R:

    > h <- function(x, trantab) trantab[x]
    > w <- 6:4
    > names(w) <- c('cat','dog','giraffe')
    > w
        cat     dog giraffe
          6       5       4
    > formals(h) <- list(x=numeric(0), trantab=w)
    > h
    function (x = numeric(0), trantab = c(6, 5, 4))
    trantab[x]

You can see that the names have been dropped from trantab's default values. I don't see a workaround but it seems to need fixing.

    Version 2.3.1 (2006-06-01)
    i486-pc-linux-gnu

    attached base packages:
    [1] grid      methods   stats     graphics  grDevices utils
    [7] datasets  base

    other attached packages:
    lattice acepack   Hmisc
    0.13-10 1.3-2.2  3.0-12

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University

### Re: [R] Bug in formals-

On 9/26/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:

> I think this is new since a previous version of R:
>
> [...]
>
> You can see that the names have been dropped from trantab's default values.

Are you sure? I get

    > formals(h)
    $x
    numeric(0)

    $trantab
        cat     dog giraffe
          6       5       4

    > h(1)
    cat
      6

    R version 2.4.0 beta (2006-09-21 r39463)
    x86_64-unknown-linux-gnu

-Deepayan

> [...]

### Re: [R] Bug in formals-

This seems to be related to using c to define transtab. If we use list in place of c then it displays ok:

    > h <- function(x, trantab) transtab[x]
    > formals(h) <- list(x = numeric(0), transtab = c(cat = 6, dog = 5))
    > print(h) # bad display
    function (x = numeric(0), transtab = c(6, 5))
    transtab[x]
    > h("cat") # runs ok
    cat
      6
    > formals(h) <- list(x = numeric(0), transtab = list(cat = 6, dog = 5))
    > print(h) # now display is ok
    function (x = numeric(0), transtab = list(cat = 6, dog = 5))
    transtab[x]
    > h("cat") # runs ok
    $cat
    [1] 6

On 9/26/06, Frank E Harrell Jr [EMAIL PROTECTED] wrote:

> Deepayan Sarkar wrote:
>
>> [...]
>> Are you sure? I get [...]
>
> Deepayan - You are correct. h('cat') is 6 as intended. I just looked at the function definition - the names attribute doesn't show for some reason. I was expecting function(..., trantab=c(cat=6, ..).
>
> Thanks
>
> Frank
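As a small check of the conclusion reached in this thread, the stored default can be inspected directly instead of relying on the printed function source. This is a sketch using the names from the original report; `default_names` and `result` are illustrative variables, and how the function prints may differ between R versions.

```r
h <- function(x, trantab) trantab[x]
w <- 6:4
names(w) <- c("cat", "dog", "giraffe")
formals(h) <- list(x = numeric(0), trantab = w)

# Look at the default value itself, not at the deparsed function text.
default_names <- names(formals(h)$trantab)
print(default_names)

# Indexing by name works because the names attribute is still present.
result <- h("cat")
print(result)
```

If the names had really been dropped, `default_names` would be `NULL` and `h("cat")` would return `NA`.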

### Re: [R] New project: littler for GNU R

Duncan Murdoch wrote:

> On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
>
> [...]
>
>> It can be used directly on the command-line just like, say, bc(1):
>>
>>     $ echo 'cat(pi^2,"\n")' | r
>>     9.869604
>
> Is there a technical reason that this couldn't work by modifying the script that invokes R? That would avoid the r/R clash on MacOSX and Windows. In Windows R is R.exe, not a script, so some adjustment would be needed there, but that shouldn't be difficult.

In fact, it does work:

    $ echo 'cat(pi^2,"\n")' | R --vanilla --slave
    9.869604

but what's more compelling is the ability to utilize the UNIX hash-bang mechanism.

Jeff

--
http://biostat.mc.vanderbilt.edu/JeffreyHorner

### Re: [R] New project: littler for GNU R

Seth Falcon wrote:

> Jeffrey Horner [EMAIL PROTECTED] writes:
>
> [...]
>
>> littler will install into /usr/local/bin by default, so I don't think there's a clash with the Mac binary provided by CRAN, right?
>
> It depends what you mean by clash :-) If both are on the PATH, then you get the first one, I suspect, when running either 'R' or 'r'. I haven't tested this bit yet, but on my OS X laptop I can invoke a new R session using either 'R' or 'r' (using an R built from source, not the R GUI app thingie).

Good point, but the executable path can be named absolutely in hash-bang scripts. Relative paths work as well with the use of '/usr/bin/env program' as is described in the littler announcement, but then you don't get to pass arguments to 'program', just to the hash-bang script.

> So IMO, a different name or an integration into the R script in some way would be a big improvement.

But I'd like to know why there's an R script in the first place. Why not just an executable as on windows?

> 'r' is cute, but going down the road of tools with the same name except for caps leads to confusion (for me). For example, R CMD build/INSTALL still catches me up after a number of years.

That's a different problem than case-sensitivity. The word 'build' must have had a different semantic than INSTALL, and I'm not sure why one was all caps and the other isn't.

Jeff

--
http://biostat.mc.vanderbilt.edu/JeffreyHorner

### Re: [R] New project: littler for GNU R

The real problem is that one wants to pipe the data in, not the R source. The idea is that one successively transforms the data in successive elements of the pipeline. For example, one might want to write cut, grep, etc. in R rather than in C. This has been on my year-end wishlist for some time.

On 9/26/06, Duncan Murdoch [EMAIL PROTECTED] wrote:

> On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
>
>> What ?
>> ======
>>
>> littler - Provides hash-bang (#!) capability for R (www.r-project.org)
>>
>> Why ?
>> =====
>>
>> GNU R, a language and environment for statistical computing and graphics, provides a wonderful system for 'programming with data' as well as interactive exploratory analysis, often involving graphs.
>>
>> Sometimes, however, simple scripts are desired. While GNU R can be used in batch mode, and while so-called 'here' documents can be crafted, a long-standing need for a scripting front-end has often been expressed by the R Community.
>>
>> littler (pronounced 'little R' and written 'r') aims to fill this need.
>>
>> It can be used directly on the command-line just like, say, bc(1):
>>
>>     $ echo 'cat(pi^2,"\n")' | r
>>     9.869604
>
> Is there a technical reason that this couldn't work by modifying the script that invokes R? That would avoid the r/R clash on MacOSX and Windows. In Windows R is R.exe, not a script, so some adjustment would be needed there, but that shouldn't be difficult.
>
> Duncan Murdoch
>
>> Equivalently, commands that are to be evaluated can be given on the command-line
>>
>>     $ r -e 'cat(pi^2, "\n")'
>>     9.869604
>>
>> But unlike bc(1), GNU R has a vast number of statistical functions. For example, we can quickly compute a summary() and show a stem-and-leaf plot for file sizes in a given directory via
>>
>>     $ ls -l /boot | awk '!/^total/ {print $5}' | \
>>           r -e 'fsizes <- as.integer(readLines());
>>                 print(summary(fsizes)); stem(fsizes)'
>>        Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>>          13     512  110100  486900  768400 4735000
>>     Loading required package: grDevices
>>
>>       The decimal point is 6 digit(s) to the right of the |
>>
>>       0 | 002223
>>       0 | 5557778899
>>       1 | 112233
>>       1 | 5
>>       2 |
>>       2 |
>>       3 |
>>       3 |
>>       4 |
>>       4 | 7
>>
>> And, last but not least, this (somewhat unwieldy) expression can be stored in a helper script:
>>
>>     $ cat examples/fsizes.r
>>     #!/usr/bin/env r
>>
>>     fsizes <- as.integer(readLines())
>>     print(summary(fsizes))
>>     stem(fsizes)
>>
>> (where calling /usr/bin/env is a trick from Python which allows one to forget whether r is installed in /usr/bin/r, /usr/local/bin/r, ~/bin/r, ...)
>>
>> A few examples are provided in the source directories examples/ and tests/.
>>
>> Where ?
>> =======
>>
>> littler can either be downloaded from
>>
>>     http://biostat.mc.vanderbilt.edu/LittleR
>>
>> accessed by anonymous SVN:
>>
>>     $ svn co http://littler.googlecode.com/svn/trunk/ littler
>>
>> or (soon !) be gotten from Debian mirrors via
>>
>>     $ apt-get install littler
>>
>> littler is known to build and run on Linux and OS X.
>>
>> Who ?
>> =====
>>
>> Copyright (C) 2006 Jeffrey Horner and Dirk Eddelbuettel
>>
>> littler is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
>>
>> This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
>>
>> You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
>>
>> Comments are welcome, as are suggestions, bug fixes, or patches.
>>
>> - Jeffrey Horner [EMAIL PROTECTED]
>> - Dirk Eddelbuettel [EMAIL PROTECTED]

### [R] matrix with additional upper, botton, left and right cells

Dear R Gurus,

I have a matrix with dim(1000x1000) and I need to create a second matrix with dim(1002x1002) and insert my first matrix at position col=2, line=2. Please see the example below:

    0050055050
    555000
    5000505005
    5005000500
    000555

and I need

    300500550503
    35550003
    350005050053
    350050005003
    30005553

Thanks a lot,

miltinho

### Re: [R] New project: littler for GNU R

On 26 September 2006 at 22:17, Gabor Grothendieck wrote:

    | The real problem is that one wants to pipe the data in, not the
    | R source. The idea is that one successively transforms the
    | data in successive elements of the pipeline.

But that is what our filesize example does:

    | On 9/26/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
    | On 9/26/2006 1:04 PM, Jeffrey Horner wrote:
    [...]
    | But unlike bc(1), GNU R has a vast number of statistical
    | functions. For example, we can quickly compute a summary() and show
    | a stem-and-leaf plot for file sizes in a given directory via
    |
    |    $ ls -l /boot | awk '!/^total/ {print $5}' | \
    |          r -e 'fsizes <- as.integer(readLines());
    |                print(summary(fsizes)); stem(fsizes)'
    |       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    |         13     512  110100  486900  768400 4735000
    |    Loading required package: grDevices
    |
    |      The decimal point is 6 digit(s) to the right of the |
    |
    |      0 | 002223
    |      0 | 5557778899
    |      1 | 112233
    |      1 | 5
    |      2 |
    |      2 |
    |      3 |
    |      3 |
    |      4 |
    |      4 | 7

Data to be processed on stdin, command via -e 'some long expression'.

To make it simpler, here is a somewhat useless example of r piping into r (which I've indented for readability):

    $ r -e 'set.seed(42); sapply(rnorm(5),function(x) cat(x,"\n"))' | \
          r -e 'cat(sum(abs(as.numeric(readLines()))), "\n")'
    3.335916

Isn't that something where, to quote you, one wants to pipe the data in, not the R source?

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
  -- Thomas A. Edison

### Re: [R] matrix with additional upper, botton, left and right cells

How about something like this:

    > x <- matrix(1:100,10)
    > x.1 <- array(-3, dim=c(12,12))
    > x.1[2:11, 2:11] <- x
    > x.1
          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
     [1,]   -3   -3   -3   -3   -3   -3   -3   -3   -3    -3    -3    -3
     [2,]   -3    1   11   21   31   41   51   61   71    81    91    -3
     [3,]   -3    2   12   22   32   42   52   62   72    82    92    -3
     [4,]   -3    3   13   23   33   43   53   63   73    83    93    -3
     [5,]   -3    4   14   24   34   44   54   64   74    84    94    -3
     [6,]   -3    5   15   25   35   45   55   65   75    85    95    -3
     [7,]   -3    6   16   26   36   46   56   66   76    86    96    -3
     [8,]   -3    7   17   27   37   47   57   67   77    87    97    -3
     [9,]   -3    8   18   28   38   48   58   68   78    88    98    -3
    [10,]   -3    9   19   29   39   49   59   69   79    89    99    -3
    [11,]   -3   10   20   30   40   50   60   70   80    90   100    -3
    [12,]   -3   -3   -3   -3   -3   -3   -3   -3   -3    -3    -3    -3

On 9/26/06, Milton Cezar [EMAIL PROTECTED] wrote:

> [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
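The same idea can be wrapped in a small function that works for any matrix size, such as the 1000x1000 case in the question. This is a sketch only; `pad_matrix` and its `value` argument are hypothetical names, not functions from base R.

```r
# Hypothetical helper: surround matrix m with a one-cell border of `value`.
pad_matrix <- function(m, value = 0) {
  # Start from a matrix two rows and two columns larger, filled with the border value
  out <- matrix(value, nrow = nrow(m) + 2, ncol = ncol(m) + 2)
  # Drop the original matrix into the interior, starting at row 2, column 2
  out[2:(nrow(m) + 1), 2:(ncol(m) + 1)] <- m
  out
}

m <- matrix(c(0, 5, 5, 0, 0, 5), nrow = 2)
padded <- pad_matrix(m, value = 3)
print(padded)
```

Because the border is created by the initial fill, only one assignment into the interior is needed, however large the input matrix is.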

### [R] histogram colors in lattice

I have code that constructs a plot using the lattice package that looks something like the following toy example:

    library(lattice)
    Start <- factor(rbinom(100,1,.5))
    Answer <- 2 - rbinom(100,1,.7)
    histogram(~Answer | Start, breaks=c(1, 1.4 ,1.6,2),
              scales=list(x=list(at=c(1.2,1.8),labels=c("Yes","No"))),
              xlab="",ylab="")

I would like to have different colors for the bars in the left and right panel (say red and green) but I can't find a way to do this. Can anyone give me any advice on how to achieve this?

Thanks,

Jamie Jarabek

### Re: [R] New project: littler for GNU R

I think this is quoted out of context. I was referring to Duncan's post which shows an example of piping R code.

On 9/26/06, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

> [...]

### Re: [R] histogram colors in lattice

Try this:

    library(lattice)
    set.seed(1) ## added for reproducibility
    Start <- factor(rbinom(100,1,.5))
    Answer <- 2 - rbinom(100,1,.7)
    histogram(~Answer | Start, breaks=c(1, 1.4 ,1.6,2),
              scales=list(x=list(at=c(1.2,1.8),labels=c("Yes","No"))),
              panel = function(x, ..., panel.number, col) { ## added this panel function
                  panel.histogram(x, ..., col = panel.number+1)
              },
              xlab="",ylab="")

On 9/26/06, Jamie Jarabek [EMAIL PROTECTED] wrote:

> [...]
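In more recent lattice releases the panel index is obtained by calling `panel.number()` inside the panel function rather than through a `panel.number` argument. A sketch of the same idea under that assumption follows; the `panel_cols` vector is an illustrative name, holding the red and green requested in the question.

```r
library(lattice)

set.seed(1)
Start  <- factor(rbinom(100, 1, 0.5))
Answer <- 2 - rbinom(100, 1, 0.7)

# One fill color per panel, in panel order
panel_cols <- c("red", "green")

p <- histogram(~ Answer | Start,
               breaks = c(1, 1.4, 1.6, 2),
               scales = list(x = list(at = c(1.2, 1.8),
                                      labels = c("Yes", "No"))),
               panel = function(x, ..., col) {
                 # panel.number() returns the index of the panel being drawn
                 panel.histogram(x, ..., col = panel_cols[panel.number()])
               },
               xlab = "", ylab = "")
```

Printing `p` draws the two panels; naming the colors explicitly avoids depending on the current palette, which `panel.number+1` does.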