[R] Tracing the top border of a histogram
I want to represent a histogram by the line along its top border, *without* kernel smoothing (to show several histograms in the same plot). This works, but is there simpler recommended way? x - rnorm(1000) tmp - hist(x, border=white) for (i in 1:(length(tmp$breaks)-1)){ segments(x0=tmp$breaks[i], x1=tmp$breaks[i+1], y0=tmp$counts[i]) segments(x0=tmp$breaks[i+1], y0=tmp$counts[i], y1=tmp$counts[i+1]) } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Tracing the top border of a histogram
Very nice - thank you, I didn't know about type='s'. On Thu, Aug 1, 2013 at 3:43 PM, Greg Snow 538...@gmail.com wrote: lines( tmp$breaks, c(tmp$counts,tail(tmp$counts,1)), type='s', col='#00ff0077', lwd=5 ) On Thu, Aug 1, 2013 at 1:30 PM, Levi Waldron lwaldron.resea...@gmail.comwrote: I want to represent a histogram by the line along its top border, *without* kernel smoothing (to show several histograms in the same plot). This works, but is there simpler recommended way? x - rnorm(1000) tmp - hist(x, border=white) for (i in 1:(length(tmp$breaks)-1)){ segments(x0=tmp$breaks[i], x1=tmp$breaks[i+1], y0=tmp$counts[i]) segments(x0=tmp$breaks[i+1], y0=tmp$counts[i], y1=tmp$counts[i+1]) } [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Regularized Regressions
Perhaps I am wrong, but I think there are only a few packages supporting Elastic Net, and none of them also perform Best Subsets. On Wed, Apr 17, 2013 at 8:46 AM, Christos Giannoulis cgiann...@gmail.comwrote: Merhaba, Hello to you too Mehmet (Yasu ki sena) Thank you for your email and especially for sharing this package. I appreciate it. However, my feeling is that this package does not have the third component of Best Subsets (pls correct me if I am wrong). It uses only a combination of Ridge and Lasso. If you happen to know any other packages that uses all of them I would greatly welcome and appreciate if you were so kind and share it. I tried to search the cran lists but I am not sure I can found something like that. That's why I was asking the R-community Thank you again for your prompt response! Cheers Christos On Wed, Apr 17, 2013 at 8:16 AM, Suzen, Mehmet msu...@gmail.com wrote: Yasu, Try Elastic nets: http://cran.r-project.org/web/packages/pensim/index.html There some other packages supporting elastic nets: Just search the CRAN Cheers, Mehmet On 17 April 2013 13:19, Christos Giannoulis cgiann...@gmail.com wrote: Hi all, I would greatly appreciate if someone was so kind and share with us a package or method that uses a regularized regression approach that balances a regression model performance and model complexity. That said I would be most grateful is there is an R-package that combines Ridge (sum of squares coefficients), Lasso: Sum of absolute coefficients and Best Subsets: Number of coefficients as methods of regularized regression. Sincerely, Christos Giannoulis [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] text command - how to get a white background to cover grid lines
For posterity, I found the TeachingDemos::shadowtext option most agreeable for this problem: * legend puts a large box around the text which did not seem possible to shrink, and does not accept vector x, y arguments * plotrix::boxed.labels did not work with pos=4 (this moved the text, but not the box) * TeachingDemos::shadowtext puts an opaque shadow around the text, rather than a box, which obscures a minimum of background while still making the labels readable. col=black, bg=white options produced this effect (default options are opposite this). On Mon, Feb 6, 2012 at 10:56 PM, 538...@gmail.com wrote: An alternative is the shadowtext function in the TeachingDemos package. On Mon, Feb 6, 2012 at 1:19 AM, Jim Lemon j...@bitwrit.com.au wrote: On 02/06/2012 08:23 AM, Henry wrote: New to R - rookie question. I'm a mechanical engineer and enjoying using R to make high quality graphs. I've searched. I want to put text notation on graph plot areas and have the text background box white to cover over the grid lines. my command so far text(15,5200,Air Flow,cex=.8,col=blue, background=white) # this doesn't work... I've tried bg=white, background color=white and a number of other attempts. The text is getting placed on the chart where I want it. Hi Henry, have a look at the boxed.labels function in the plotrix package. Jim __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Gregory (Greg) L. Snow Ph.D. 538...@gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] highlighting clusters in a heatmap
I would like to draw horizontal or vertical lines on a heatmap to highlight the clusters at some specified cut depth of the dendrogram. As a hacked example, the following code would work if I could set the coordinates of the top and bottom of the false color image correctly (ymin and ymax), but the correct values seem to depend on the output device and its size. I realize that heatmaps use a 2x2 layout which makes the coordinate system non-obvious, but the result seems very difficult to customize. I would appreciate any suggestions for manual or pre-made solutions. Example: set.seed(2) x - matrix(rnorm(1000),ncol=10) #obviously no real clusters here... row.hclust - hclust(dist(x)) row.dendro - as.dendrogram(row.hclust) heatmap(x, Rowv=row.dendro) row.cut - cutree(row.hclust,3)[row.hclust$order] cutpoints - which(row.cut[-1]!=row.cut[-length(row.cut)]) ymin - par(usr)[3]#in general incorrect ymax - par(usr)[4] #in general incorrect for (i in cutpoints){ thisy - ymin + (ymax-ymin)*(i-1)/nrow(x) abline(h=thisy,lw=3) } __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] difference between the meaning of MARGIN in sweep() and apply()
For example, subtracting 1:2 from the rows of a two-column matrix: t(apply(matrix(1:6,ncol=2),MARGIN=1,function(y) y - 1:2)) [,1] [,2] [1,]02 [2,]13 [3,]24 sweep(matrix(1:6,ncol=2),MARGIN=2,1:2,FUN=-) [,1] [,2] [1,]02 [2,]13 [3,]24 Is there a logic to this difference, or is it just a quirk of the history of these functions? I found one discussion on this topic, but without what I thought was a very conclusive end: http://finzi.psych.upenn.edu/Rhelp08/2009-March/191930.html -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Simple question about error on CSV import
By default for read.table, comment.char = # so the first line was being treated as a comment line, and when you specified row.names=#, read.table couldn't find that column. On Tue, Sep 1, 2009 at 12:07 PM, esawdust lan...@360vl.com wrote: esawdust wrote: Here's the contents of a simple test2.csv CSV file: #,Status,Project 5842,New,Test snortalerts = read.table( /Users/lcox/Documents/test2.csv, header=TRUE, sep=,, row.names=#) Error in data[[rowvar]] : attempt to select less than one element Landon Figured out the answer, though it wasn't obvious (to me anyway). The symbol # used as the first column label was the problem. I changed that to be id and changed the read.table to be: snortalerts - read.table( /Users/lcox/Documents/test2.csv, header=TRUE, sep=,, row.names=id) and it worked fine. -- View this message in context: http://www.nabble.com/Simple-question-about-error-on-CSV-import-tp25242899p25243159.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Two way joining vs heatmap
Hi Schalk, the heatmap function does not implement two-way joining as far as I know. It clusters rows and columns independently. However if you find or program a method that implements two-way joining, you could use the row and column ordering it returns in your heatmap using the Rowv and Colv arguments to heatmap. -levi On Mon, Aug 31, 2009 at 2:25 AM, Schalk Heunis schalk.heu...@enerweb.co.zawrote: Hi STATISTICA has a function called Two-way joining (see http://www.statsoft.com/TEXTBOOK/stcluan.html#twotwo) and the reference material states that this is based on the method as published by Hartigan (found this paper: http://www.jstor.org/pss/2284710 through wikipedia). What is the relationship (if any) between the heatmap function in R and this technique? Is there an alternative function to use? Thanks for the help! Schalk Heunis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] setting par(srt) according to plot aspect ratio
On Fri, Aug 28, 2009 at 1:48 AM, Prof Brian Ripley rip...@stats.ox.ac.ukwrote: Note that the aspect ratio changes when you resize the plot but the angle of the plotted text will not. So the only safe route is to set 'asp' and use that setting to select the angle. That is true with screen output, although I can run the code block again after re-sizing to correct the text angle along with the aspect ratio. And for my purpose which is writing to file, it isn't an issue, for example the following plots which need different aspect ratios all work correctly: png(example%d.png) makeplot() #as per example code par(mfrow=c(2,1)) makeplot() makeplot() par(mfrow=c(1,2)) makeplot() makeplot() dev.off() -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] setting par(srt) according to plot aspect ratio
How can I look up the aspect ratio of a plot, so I can use that to correctly adjust the angle of text which is supposed to be parallel to a line in the plot? The following example code works for a 1:1 aspect ratio, but puts the text at the wrong angle if the plot region is short and wide or tall and narrow. I can't find a par() component containing the plot aspect ratio. It will be for png() or postscript() output, if that matters. f - function(x) x g - function(x) 2*x (f_angle - atan(1)*180/pi) (g_angle - atan(2)*180/pi) xpos - 0.2 plot(f) plot(g,add=TRUE) par(srt=f_angle) text(xpos,f(xpos),label=y=x,pos=3) par(srt=g_angle) text(xpos,g(xpos),label=y=2x,pos=3) -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] setting par(srt) according to plot aspect ratio
I frequently use R's help facilities and I know about the asp argument to plot, but this doesn't answer my question. I would like to allow the aspect to be determined automatically but *query* the aspect ratio for future use. I suppose one work-around would be to use the data ranges and plot region dimensions to estimate an appropriate value for asp, but this more complicated than I was hoping for. Thanks, Levi On Thu, Aug 27, 2009 at 6:19 PM, Bert Gunter gunter.ber...@gene.com wrote: Use R's help facilities, please. help.search(aspect ratio) gets you to ?plot.window which then gets you to ?plot (actually plot.default() ) Bert Gunter Genentech Nonclinical Biostatisics -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Levi Waldron Sent: Thursday, August 27, 2009 1:54 PM To: r-help@r-project.org Subject: [R] setting par(srt) according to plot aspect ratio How can I look up the aspect ratio of a plot, so I can use that to correctly adjust the angle of text which is supposed to be parallel to a line in the plot? The following example code works for a 1:1 aspect ratio, but puts the text at the wrong angle if the plot region is short and wide or tall and narrow. I can't find a par() component containing the plot aspect ratio. It will be for png() or postscript() output, if that matters. f - function(x) x g - function(x) 2*x (f_angle - atan(1)*180/pi) (g_angle - atan(2)*180/pi) xpos - 0.2 plot(f) plot(g,add=TRUE) par(srt=f_angle) text(xpos,f(xpos),label=y=x,pos=3) par(srt=g_angle) text(xpos,g(xpos),label=y=2x,pos=3) -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] setting par(srt) according to plot aspect ratio
For posterity's sake, here is the solution I figured out. Putting the following lines after the plot(f) command seems to set the angle correctly: myasp - (par(fin)[2]-par(mai)[1]-par(mai)[3])/(par(fin)[1]-par(mai)[2]-par(mai)[4]) (f_angle - atan(myasp)*180/pi) (g_angle - atan(2*myasp)*180/pi) -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] multiple annotations on a heatmap
Can someone recommend a more sophisticated way to annotate heatmaps than the ColSideColors argument of heatmap and heatmap.2? In particular, I would like to be able to annotate columns with more than one piece of information, like in Figure 1 of the article at http://www3.interscience.wiley.com/cgi-bin/fulltext/117905619/HTMLSTART / doi:10.1002/hep.22256. Some example data and a heatmap: set.seed(1) y - matrix(rnorm(100),nrow=20,ncol=5) colnames(x) - LETTERS[1:5] rownames(x) - paste(r,1:20,sep=) set.seed(1) annotation - matrix(sample(c(+,-),15,replace=TRUE),ncol=5) colnames(annotation) - colnames(x) rownames(annotation) - paste(annotation,1:3) heatmap(x,Rowv=NA, ColSideColors=sapply(annotation[1,],function(x) switch(x,+=red,-=blue))) This heatmap annotates the columns by the first of the three annotations with a colored bar along the top of the heatmap, but ideally I would like to put all three annotations on the heatmap by putting three rows of +/- values between the top of the heatmap and the dendrogram, or even three colored bars. Specific or general suggestions would be welcome. Thank you, Levi -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology IBM Life Sciences Discovery Centre TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory space
Perhaps: gc() On Wed, Mar 18, 2009 at 8:26 AM, Abelian abelian1...@gmail.com wrote: Dear all when the program is runing, we can realize that the memory size will be asked more and more.. Therefore, we could meet the significant problem, such as out off memory size. However, even if i rm() some variables that i will not use it anymore, the memory size still not enough. By the way, i found that if i turn off the R-gui, the program can work again. For example, i generate a 800*45000 matrix four times. however, when i generate the fifth, the message tell me that the memory size is not enough. So, i guess that maybe there are some way that can release the memory such that the memory size is ths same with the situation of starting. Does anyone be able to provide the function to clean the memory ? Thanks a lot. Sincerely __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Difficulty Replacing a Row of a Data Frame
Your first example that works would just repeat the atomic vector to the length of the row. To replace a row with another data frame, you could use rbind: avGain - rbind(avGain[-j,],b) Or if the positioning is important, 2) avGain - rbind(avGain[1:(j-1),],b,avGain[(j+1):nrow(avGain),]) I didn't pay close attention to how you formed b, but this of course will only work if it has the same number of columns as avGain. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] breaking ties in order() based on many vectors
The order() function allows you to specify multiple vectors, which are used successively to break ties. If I want to use many vectors to break ties (say, 25 or more), that are columns of a matrix or elements of a list, does anyone know a shortcut to do this without passing 25 arguments to order()? -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology IBM Life Sciences Discovery Centre TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] breaking ties in order() based on many vectors
On Tue, Mar 17, 2009 at 12:56 PM, Patrick Burns pbu...@pburns.seanet.comwrote: Use 'do.call' as discussed in 'The R Inferno' and elsewhere. Perfect, thanks. -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology IBM Life Sciences Discovery Centre TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Difficulty Replacing a Row of a Data Frame
On Tue, Mar 17, 2009 at 12:54 PM, Bill Cunliffe b...@elevationllc.netwrote: Thanks for your help. Part of the problem was that I thought I was dealing with a data frame when it was actually an array. I forced all to be matrices for consistency. However, I see that avGain - rbind(avGain[1:(j-1),],b,avGain[(j+1):nrow(avGain),]) would be a great solution as positioning is important. This rbind solution will work, and if b has more than one row is possibly the best way to do it. But it will be very slow if you are doing it iteratively, compared to the direct replacement it sounds like you've figured out. cheers, Levi -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology IBM Life Sciences Discovery Centre TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] getting all pairwise combinations of elements in a character string
I'm able to do this as follows, but am wondering if anyone knows a simpler way which still avoids explicit loops? (mystring - letters[1:5]) [1] a b c d e unlist(sapply(mystring[-length(mystring)], + function(x) paste(x,mystring[(grep(x,mystring)+1):length(mystring)],sep=))) a1 a2 a3 a4 b1 b2 b3 c1 c2d ab ac ad ae bc bd be cd ce de -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology IBM Life Sciences Discovery Centre TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] getting all pairwise combinations of elements in a character string
I like both of these solutions much better - thank you! -Levi -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology IBM Life Sciences Discovery Centre TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looping multiple output values to dataframe
Stropharia wrote: # START R-CODE--- filenames - Sys.glob(/Users/Desktop/Test/*.csv) # get names of files to process # use * to get all variables - data.frame(1:length(filenames)) # preallocate assuming multiple values from each file # creates a dataframe with the same length of rows as the number of .csv files to process for (i in seq_along(filenames)){ input - read.csv(filenames[i], header=TRUE, na.strings=NA) data.frame(input) attach(input) result.A - x[2]*y[1] result.B - y[2]-x[1] result.C - x[3]+y[1] results - c(result.A, result.B, result.C) # concatenate result vectors variables[i] - results } variables - as.data.frame(t(as.matrix(variables))) # turn result vectors into a matrix, then transpose it and output as a data frame # add column and row names c.names - c(ResultA, ResultB, ResultC) # set names for result vectors colnames(variables) - c.names rownames(variables) - filenames # export to csv file write.csv(variables, file=/Users/Desktop/Test.csv) # END R-CODE--- I think something like this should work better: docalc - function(thisfile){ input - read.csv(filenames[i], header=TRUE, na.strings=NA) attach(input) result.A - x[2]*y[1] result.B - y[2]-x[1] result.C - x[3]+y[1] results - c(result.A, result.B, result.C) # concatenate result vectors names(results) - c(ResultA, ResultB, ResultC) return(results) } variables - sapply(filenames,docalc) -- Levi Waldron post-doctoral fellow Jurisica Lab, Ontario Cancer Institute Division of Signaling Biology IBM Life Sciences Discovery Centre TMDT 9-304D 101 College Street Toronto, Ontario M5G 1L7 (416)581-7453 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cumulative sum of within levels of a dataframe
This one should be easy but it's giving me a hard time mostly because tapply puts the results in a list. I want to calculate the cumulative sum of a variable in a dataframe, but with the accumulation only within each level of a factor. For a very simple example, take: df - data.frame(x=c(rep(1,5),rep(2,5),rep(3,5)),fac=gl(3,5,labels=letters[1:3])) df x fac 1 1 a 2 1 a 3 1 a 4 1 a 5 1 a 6 2 b 7 2 b 8 2 b 9 2 b 10 2 b 11 3 c 12 3 c 13 3 c 14 3 c 15 3 c I'd like to create another column in the dataframe so it looks like this, and make sure that the cumulative sums still match the right levels of the factor. I've included a willdo column that's just a cumulative sum, and an ideal column that's the cumulative sum minus the current value - the column headings are self explanatory. answer x fac willdo ideal 1 1 a 1 0 2 1 a 2 1 3 1 a 3 2 4 1 a 4 3 5 1 a 5 4 6 2 b 2 0 7 2 b 4 2 8 2 b 6 4 9 2 b 8 6 10 2 b 10 8 11 3 c 3 0 12 3 c 6 3 13 3 c 9 6 14 3 c 12 9 15 3 c 1512 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cumulative sum of within levels of a dataframe
Perfect, thanks all. Unlist and ave will both be very handy functions to know. -levi -- Levi Waldron, PhD Treated Wood Research Group Faculty of Forestry, University of Toronto 33 Willcocks Street, Toronto ON, M5S 3B3 CANADA Tel 647-866-5384 Fax 416-978-3834 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] choosing an appropriate linear model
Perhaps this was too big a question, so I'll ask something shorter: I have fit a linear model, and want to use its prediction intervals to calculate the sum of many individual predictions. 1) Some of the lower prediction intervals are negative, which is non-sensical. Should I just set all negative predictions to zero, or is there another way to require non-negative predictions only? 2) I am interested in the sum of many predictions based on the lm. How can I calculate the 95% prediction interval for the sum? Should I calculate a root mean square of the individual errors, or use a bootstrap method, or something else? ps. the data is attached to the end of this email. On Thu, Jun 5, 2008 at 6:25 PM, Levi Waldron [EMAIL PROTECTED] wrote: I am trying to model the observed leaching of wood preservative chemicals from treated wood during an outdoor experiment where leaching is caused by rainfall events. For each rainfall event, the amount of rainfall was recorded as well as the amount of preservative chemical leached. A number of climatic variables were measured, but the most important is the amount of rainfall. I have tried a simple linear model, with zero intercept because zero rainfall cannot cause any leaching (leachdata dataframe is attached to this email). The diagnostics show clearly non-normally distributed residuals with a simple linear regression, and I am trying to figure out what to do about it (see attached diagnostics.png). This dataset contains measurements from 57 rainfall events on three replicate samples, for a total of 171 measurements. Part of the problem is that physically, the leaching values can only be positive, so for the smaller rainfall amounts the residuals are all positive. If I allow an intercept then it is significantly positive, possibly since the researcher wouldn't have collected measurements for very small rain events, but in terms of the model it doesn't make sense physically to have a positive intercept, particularly since lab experiments have shown that a certain amount of rain exposure is required to wet the wood before leaching begins. I can get more normally distributed residuals by log-transforming the response, or using the optimal box-cox transformation of lambda = 0.21, which produces nicer-looking residuals but unsatisfactory prediction which is the main goal of the model (also attached). Any advice on how to create a better predictive model? I presume it has something to do with glm, especially since I have repeated rainfalls on replicate samples, but any advice on the approach to take would be much appreciated. The code I used to produce the attached plots is included below. leach.lm - lm(leachate~rainmm-1,data=leachdata) png(dianostics.png,height=1200,width=700) par(mfrow=c(3,2)) plot(leachate~rainmm,data=leachdata,main=Data and fitted line) abline(leach.lm) plot(predict(leach.lm)~leachdata$leachate,main=predicted vs. observed leaching amount,xlim=c(0,12),ylim=c(0,12),xlab=observed leaching,ylab=predicted leaching) abline(a=0,b=1) plot(leach.lm) dev.off() library(MASS) boxcox(leach.lm,plotit=T,lambda=seq(0,0.4,by=0.01)) boxtran - function(y,lambda,inverse=F){ if(inverse) return((lambda*y+1)^(1/lambda)) else return((y^lambda-1)/lambda) } png(boxcox-dianostics.png,height=1200,width=700) par(mfrow=c(3,2)) logleach.lm - lm(boxtran(leachate,0.21)~rainmm-1,data=leachdata) plot(leachate~rainmm,data=leachdata,main=Data and fitted line) x - leachdata$rainmm y - boxtran(predict(logleach.lm),0.21,T) xy - cbind(x,y)[order(x),] lines(xy) plot(y~leachdata$leachate,xlim=c(0,12),ylim=c(0,12),main=predicted vs. observed leaching amount,xlab=observed leaching,ylab=predicted leaching) abline(a=0,b=1) plot(logleach.lm) dev.off() `leachdata` - structure(list(rainmm = c(19.68, 36.168, 18.632, 2.74, 0.822, 9.864, 7.124, 29.592, 4.384, 11.508, 1.37, 3.288, 9.042, 2.74, 18.906, 4.932, 0.274, 3.836, 1.918, 4.384, 16.714, 5.754, 12.604, 2.466, 13.014, 2.74, 14.796, 5.754, 4.93, 5.21, 0.548, 1.644, 3.014, 6.028, 18.358, 1.918, 3.014, 18.358, 0.274, 1.918, 54.2, 43.4, 11.2, 1.6, 3.8, 70.2, 0.2, 24.4, 25.8, 13, 7.124, 10.96, 7.672, 3.562, 3.288, 6.02, 17.54, 19.68, 36.168, 18.632, 2.74, 0.822, 9.864, 7.124, 29.592, 4.384, 11.508, 1.37, 3.288, 9.042, 2.74, 18.906, 4.932, 0.274, 3.836, 1.918, 4.384, 16.714, 5.754, 12.604, 2.466, 13.014, 2.74, 14.796, 5.754, 4.93, 5.21, 0.548, 1.644, 3.014, 6.028, 18.358, 1.918, 3.014, 18.358, 0.274, 1.918, 54.2, 43.4, 11.2, 1.6, 3.8, 70.2, 0.2, 24.4, 25.8, 13, 7.124, 10.96, 7.672, 3.562, 3.288, 6.02, 17.54, 19.68, 36.168, 18.632, 2.74, 0.822, 9.864, 7.124, 29.592, 4.384, 11.508, 1.37, 3.288, 9.042, 2.74, 18.906, 4.932, 0.274, 3.836, 1.918, 4.384, 16.714, 5.754, 12.604, 2.466, 13.014, 2.74, 14.796, 5.754, 4.93, 5.21, 0.548, 1.644, 3.014, 6.028, 18.358, 1.918, 3.014, 18.358, 0.274, 1.918, 54.2, 43.4, 11.2, 1.6, 3.8, 70.2, 0.2, 24.4, 25.8, 13, 7.124, 10.96, 7.672
[R] lw in legend also changes thickness of characters in the legend??
Here's a simple example: x - 1:5 plot(x,x^2) lines(x,x^2) points(x,x,cex=2) lines(x,x,lw=3) legend(topleft,legend=c(y=x^2,y=x),pch=1,pt.cex=1:2,lw=c(1,3)) The thickness of the circles in the legend changes with lw. If you change the lw argument in legend to lw=c(1,1) then the thickness of the circles goes back to normal. How can I make the above legend so that both the lines and the plotting characters match what is in the graph? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] customization of pairwise comparison plots
I am wondering how to customize a pairwise comparisons plot of a factorial ANOVA, without doing a lot of manual manipulation of a TukeyHSD object. The customizations I'd like are: 1. The aov used log-transformed response data, but I'd like to plot the intervals on their original, untransformed scales 2. Plot all the main and interaction effects together, rather than in a separate window for each. 3. Omit certain comparisons, specifically interactions without some common level. ie plot A1:B1 - A1:B2, but not A1:B2 - A3:B4. For example (I know not a valid one since the factor wool isn't significant, and it doesn't need to be log-transformed): data(warpbreaks) summary(fm1 - aov(log(breaks) ~ wool*tension, data = warpbreaks)) Df Sum Sq Mean Sq F value Pr(F) wool 1 0.3125 0.3125 2.2344 0.141511 tension 2 2.1762 1.0881 7.7792 0.001185 ** wool:tension 2 0.9131 0.4566 3.2642 0.046863 * Residuals48 6.7138 0.1399 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 The general features of plot(fm1) are good but I'd like to plot all the comparisons together in one plot, on the original scale, skipping the comparisons B:M-A:L, B:H-A:L, A:M-B:L, A:H-B:L, B:H-A:M, A:H-B:M while keeping the rest of the interaction comparisons. Advice on any of these, if not all, would be much appreciated. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] restricting pairwise comparisons of interaction effects
I'm interested in restricting the pairwise comparisons of interaction effects in a multi-way factorial ANOVA, because I find comparisons of interactions between all different variables different to interpret. For example (supposing a p0.10 cutoff just to be able to use this example): summary(fm1 - aov(breaks ~ wool*tension, data = warpbreaks)) Df Sum Sq Mean Sq F valuePr(F) wool 1 450.7 450.7 3.7653 0.0582130 . tension 2 2034.3 1017.1 8.4980 0.0006926 *** wool:tension 2 1002.8 501.4 4.1891 0.0210442 * Residuals48 5745.1 119.7 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 TukeyHSD(fm1,conf.level=0.90) Tukey multiple comparisons of means 90% family-wise confidence level Fit: aov(formula = breaks ~ wool * tension, data = warpbreaks) $wool diff lwruprp adj B-A -5.78 -10.77183 -0.7837284 0.058213 $tension diff lwr upr p adj M-L -10.00 -17.66710 -2.332900 0.0228554 H-L -14.72 -22.38932 -7.055122 0.0005595 H-M -4.72 -12.38932 2.944878 0.4049442 $`wool:tension` difflwrupr p adj B:L-A:L -16.333 -30.112566 -2.554101 0.0302143 A:M-A:L -20.556 -34.334788 -6.776323 0.0029580 B:M-A:L -15.778 -29.557010 -1.998545 0.0398172 A:H-A:L -20.000 -33.779233 -6.220767 0.0040955 B:H-A:L -25.778 -39.557010 -11.998545 0.0001136 A:M-B:L -4.222 -18.001455 9.557010 0.9626541 B:M-B:L 0.556 -13.223677 14.334788 0.978 A:H-B:L -3.667 -17.445899 10.112566 0.9797123 B:H-B:L -9.444 -23.223677 4.334788 0.4560950 B:M-A:M 4.778 -9.001455 18.557010 0.9377205 A:H-A:M 0.556 -13.223677 14.334788 0.978 B:H-A:M -5.222 -19.001455 8.557010 0.9114780 A:H-B:M -4.222 -18.001455 9.557010 0.9626541 B:H-B:M -10.000 -23.779233 3.779233 0.3918767 B:H-A:H -5.778 -19.557010 8.001455 0.8705572 It would seem to make sense (and please correct me if I'm wrong) to restrict the pairwise comparisons of wool:tension to terms like B:L-A:L, and A:M-A:L, and not calculate or try to interpret differences like B:M-A:L. How can I accomplish this (note that there are actually 5 factors in the experiment I'm analyzing)? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] adding device size-independent y=0 line to a lattice plot
Using the following lattice plot as an example, I would like to add horizontal lines where y=0: library(lattice) library(grid) fac - gl(4,12) x - letters[rep(1:3,16)] y - runif(48,min=0.0) dotplot(y~x|fac) I've tried it with grid.lines using npc and native units, which works fine unless I change the size of the output device - then the lines are in the wrong place. Is there a way to do this that is independent of the output device size? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spreading out a numeric vector
*Thank you* Greg, this function works perfectly! I had imagined that the ideal solution would iteratively modify the vector to fix new violations of mindiff created by each subsequent spreading of tight clusters, but couldn't figure how to do it. A small note, the vector x must be sorted before using this function, which is a reasonable requirement. Thanks also to Jim Lemon for pointing out the very useful plotrix package - the spread.labels function didn't work so well for this application because the timelines looked strange with the labels spread evenly, but I like a number of the functions provided by the package. -levi On Mon, Mar 24, 2008 at 12:43 PM, Greg Snow [EMAIL PROTECTED] wrote: Levi, Here is one possible function: spread - function(x, mindiff) { df - x[-1] - x[-length(x)] i - 1 while (any(df mindiff)) { x[c(df mindiff, FALSE)] - x[c(df mindiff, FALSE)] - mindiff/10 x[c(FALSE, df mindiff)] - x[c(FALSE, df mindiff)] + mindiff/10 df - x[-1] - x[-length(x)] i - i + 1 if (i 100) { break } } x } I have tried experimenting with using optim to minimize a function of the distances between new points and old points penealized for being to close, but it sometimes gave me some weird results, the above has worked fairly well for me (the above is based on some of the code inside the triplot function in the TeachingDemos package). Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spreading out a numeric vector
Would this function be worth submitting to some existing cran library? I'd be willing to document it and add error-checking if there is somewhere that it belongs (with credit to Greg of course, unless Greg wants to submit it himself). I am using it in conjunction with a function I wrote which draws tics on the timeline and draws a line from the tic to the label when they aren't aligned. I have tried experimenting with using optim to minimize a function of the distances between new points and old points penealized for being to close, but it sometimes gave me some weird results, the above has worked fairly well for me (the above is based on some of the code inside the triplot function in the TeachingDemos package). Hope this helps, -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] spreading out a numeric vector
I am creating a timeline plot, but running into a problem in positions where the values to plot are too close together to print the text without overlap. The simplest way I can think of to solve this (although there may be other ways?) is to create a new numeric vector whose values are as close as possible to the original vector, but spread out to a given minimum difference. For example, take the following vector: x - c(1,2,seq(2.1,2.3,0.1),3,4) x [1] 1.0 2.0 2.1 2.2 2.3 3.0 4.0 However, suppose I don't want to plot any of these points with less than 0.5 between them. The problem could be solved by a function that behaved something like this: x2 - spread(x,mindiff=0.5) x2 [1] 0.5 1.0 1.5 2.0 2.5 3.0 4.0 Or for a minimum difference of 0.2, x2 - spread(x,mindiff=0.2) x2 [1] 1.0 1.8 2.0 2.2 2.4 3.0 4.0 Thus, if there is a cluster of close values, spreading the values may require changing values which previously weren't too close to their nearest neighbors. Then I could use segments() to draw lines from the timeline to where the shifted text is printed, labeling tics on the timeline accurately but keeping the text from overlapping. Any ideas on how to program this or how to do it using built-in functions would be greatly appreciated. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] spreading out a numeric vector
On Sat, Mar 22, 2008 at 9:58 PM, jim holtman [EMAIL PROTECTED] wrote: Is this close to what you want? spread - function(x, mindiff=0.5){ + unique(as.integer(x / mindiff) * mindiff) + } x - c(1,2,seq(2.1,2.3,0.1),3,4) x [1] 1.0 2.0 2.1 2.2 2.3 3.0 4.0 spread(x) [1] 1 2 3 4 spread(x,.2) [1] 1.0 2.0 2.2 3.0 4.0 There are a couple problems with this solution: 1. The main problem is that it causes closely spaced numbers to vanish, rather than spread out 2. The vectors I'm actually working with are large integers, for example: x [1] 342951 1491880 5447637 6335019 7368604 8960148 10635802 16907621 [9] 18071422 18747461 19379960 21767641 21850851 23143537 26693431 35298190 with mindiff = 50. I could scale these numbers to make your solution work, except for problem #1. Interesting idea though. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] differentiating a numeric vector
What functions exist for differentiating a numeric vector (in my case spectral data)? That is, experimental data without an analytical function. ie, x - seq(1,10,0.1) y=x^3+rnorm(length(x),sd=0.01) #although the real function would be nothing simple like x^3... derivy - I know I could just use diff(y) but it would be nice to estimate derivatives at the endpoints, calculate higher-order derivatives, and maybe have some smoothing options ie by differentiating a spline or something like that. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] summarizing replicates with multiple treatments
I have a dataframe with several different treatment variables, and would like to calculate the mean and standard deviation of the replicates for each day and treatment variable. It seems like it should be easy, but I've only managed to do it for one treatment at a time using subset and tapply. Here is an example dataset: `exampledata` - structure(list(day = c(1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L), treat = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 2L, 2L ), .Label = c(a, b), class = factor), replicate = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), height = c(1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6), weight = c(11.1, 11.2, 11.3, 11.4, 11.5, 11.6, 12.1, 12.2, 12.3, 12.4, 12.5, 12.6, 13.1, 13.2, 13.3, 13.4, 13.5, 13.6)), .Names = c(day, treat, replicate, height, weight), class = data.frame, row.names = c(NA, -18L)) exampledata day treat replicate height weight 11 a 11.1 11.1 21 a 21.2 11.2 31 a 31.3 11.3 41 b 11.4 11.4 51 b 21.5 11.5 61 b 31.6 11.6 72 a 12.1 12.1 82 a 22.2 12.2 92 a 32.3 12.3 10 2 b 12.4 12.4 11 2 b 22.5 12.5 12 2 b 32.6 12.6 13 3 a 13.1 13.1 14 3 a 23.2 13.2 15 3 a 33.3 13.3 16 3 b 13.4 13.4 17 3 b 23.5 13.5 18 3 b 33.6 13.6 I would like to combine the replicates and get a dataframe like: day treat height.mean height.sd weight.mean weight.sd 1 a 1.2 0.1 11.20.1 1 b 1.5 0.1 11.50.1 2 a 2.2 0.1 12.20.1 2 b 2.5 0.1 12.50.1 3 a 3.2 0.1 13.20.1 3 b 3.5 0.1 13.50.1 or two dataframes, one with means and the other with standard deviations. Thus far I have been doing it a piece at a time, like below (extra verbose since tapply doesn't accept the data= argument!), but would like to do it for all the measurement columns and all the treatments in one go. Thanks! tapply(exampledata[exampledata$treat==a,]$height,exampledata[exampledata$treat==a,]$day,mean) 1 2 3 1.2 2.2 3.2 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] summarizing replicates with multiple treatments
On Tue, Mar 4, 2008 at 5:32 PM, Gabor Grothendieck [EMAIL PROTECTED] wrote: Try this: library(doBy) summaryBy(. ~ day + treat, exampledata, FUN = c(mean, sd)) Outstanding, so much better. Thanks. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.