Re: [R] Fastest Way to Divide Elements of Row With Its RowSum
At 2:40 PM +0900 9/17/09, Gundala Viswanath wrote:
> I have a data frame (dat). What I want to do is, for each row,
> divide each element of the row by the sum of that row.
>
> The number of rows can be large (1 million).
> Is there a faster way than doing it this way?
>
> datnorm;
> for (rw in 1:length(dat)) {
>     tmp <- dat[rw,]/sum(dat[rw,])
>     datnorm <- rbind(datnorm, tmp);
> }
>
> - G.V.

datnorm <- dat/rowSums(dat)

This will be faster if dat is a matrix rather than a data.frame.

Bill

--
William Revelle              http://personality-project.org/revelle.html
Professor                    http://personality-project.org/personality.html
Department of Psychology     http://www.wcas.northwestern.edu/psych/
Northwestern University      http://www.northwestern.edu/
Use R for psychology         http://personality-project.org/r
It is 5 minutes to midnight  http://www.thebulletin.org

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
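As a small illustration of the one-line answer, the sketch below normalizes the rows of a toy data frame. The data and the names m, datnorm and datnorm2 are made up for the example; the sweep() variant is an equivalent alternative that is not part of the original reply.

```r
# Toy data (made up for illustration): 4 rows, 3 positive columns
set.seed(1)
dat <- data.frame(a = runif(4), b = runif(4), c = runif(4))

# Convert once to a numeric matrix, as suggested, then divide by the row sums.
# The row-sum vector is recycled down the columns, so element [i, j] is
# divided by the sum of row i.
m <- as.matrix(dat)
datnorm <- m / rowSums(m)

# Equivalent formulation that spells out the row-wise intent
datnorm2 <- sweep(m, 1, rowSums(m), "/")

# Every row of the result should now sum to 1
rowSums(datnorm)
```

Both forms avoid the loop and the repeated rbind(), which copies the growing result on every iteration.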
[R] boxplot
Hi,

I am not able to plot normalized data (normalized with rma) using boxplot, and I don't know why. The object holding the normalized data belongs to the ExpressionSet class. The call fails with:

Error in x[!xna] : object of type 'S4' is not subsettable
In addition: Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'S4'
2: In is.na(x) : is.na() applied to non-(list or vector) of type 'S4'

How can I plot it? Should I use another function?

Sukhbir Singh Rattan
[R] inline error message
Hi all,

I installed the package inline into my default R environment (c:\Programme\R..), then downloaded Rtools from http://www.murdoch-sutherland.com/Rtools/ and installed it to c:\Rtools. The PATH variable is set correctly (with respect to order), but I still get this error message from R:

Error in compileCode(f, code, language, verbose) :
  Compilation ERROR, function(s)/method(s) not created!

Where is my mistake?

Thank you!
[R] Filling Empty Column with String in read.table
I have a data file that looks like this.

__DATA__
D7KAR5Z02F447V 176 G 0.22
D7KAR5Z02J3WLG 94 A 1.0529
D7KAR5Z02F4K6L 198 a 0.13
D7KAR5Z02J4SYO 67 C 0.9528
D7KAR5Z02J4SYO 83 C 1.0129
D7KAR5Z02J4SYO 97 T 0.13
D7KAR5Z02J4SYO 166 A 0.9427

I want the rows where the last column has no entry to be filled with "-". Is there a way to do it with read.table? I can't find any option for this in read.table. The call below only fills the empty 5th column with blanks.

dat <- read.table("myfile.txt", na.strings="-", header=FALSE, fill=T);

G.V.
Re: [R] rJava .jinit() : Cannot create Java virtual machine (-1)
On 09/17/2009 07:30 AM, _ wrote:
> Hi all,
>
> when using .jinit() I get the message
>
> .jinit() : Cannot create Java virtual machine (-1).

You probably also need to set JAVA_HOME to that location, and have the bin directory (the one that contains java.exe) in your PATH as well.

Usually, support for rJava goes through this mailing list:
http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel

Romain

> I set the CLASSPATH variable to the JRE and to the JDK, but nothing works.
>
> > Sys.getenv("CLASSPATH")
>                        CLASSPATH
> "C:\\Programme\\Java\\jre6\\bin"
>
> > Sys.getenv("CLASSPATH")
>                          CLASSPATH
> "C:\\Programme\\Java\\jdk1.6.0_13"
>
> Java is correctly installed, because other Java applications are running.
>
> It would be nice if someone could help me. Thanks!

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/yw8E : New R package : sos
|- http://tr.im/y8y0 : search the graph gallery from R
`- http://tr.im/y8wY : new R package : ant
[R] latex code in R - convert to pdf
Hi,

Is it possible to convert LaTeX code to PDF in R (like a LaTeX program would do it)? Is there a package that comes with this capability?

My problem is that I want to generate tables automatically, and I can't use a LaTeX editor on that computer ...

Besides LaTeX, are there good ways to generate tables in R?

Thanks for any suggestions!
Re: [R] How to do rotation for polygon?
Hi everyone,

It really helped; it is almost what I wanted. Thanks a lot.

On Sat, Sep 12, 2009 at 4:30 AM, Greg Snow greg.s...@imail.org wrote:
> Does this do what you want?
>
> library(TeachingDemos)
>
> ms.pent <- function(ang=0, ...) {
>     theta <- seq(ang, length.out=6, by=2*pi/5)
>     cbind( cumsum(cos(theta)/2), cumsum(sin(theta)/2) )
> }
>
> par(xpd=NA)
> my.symbols( rep(1,5), rep(1,5), ms.pent,
>             ang=seq(0, by=2*pi/5, length.out=5), add=FALSE, col=2:6 )
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
> From: Hemavathi Ramulu [mailto:hema.ram...@gmail.com]
> Sent: Thursday, September 10, 2009 2:44 AM
> To: Greg Snow
> Cc: r-help@r-project.org
> Subject: Re: [R] How to do rotation for polygon?
>
>> Hi everyone,
>> I still couldn't get the diagram I mentioned before. I tried Greg's and
>> Milton's suggestions, but they are confusing. I hope someone can help me.
>>
>> Thanks in advance.
>> Regards, Hema.
>>
>> On Thu, Sep 3, 2009 at 11:39 PM, Greg Snow greg.s...@imail.org wrote:
>>> The my.symbols and ms.polygon functions in the TeachingDemos package
>>> may help.
>>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org On Behalf Of Hemavathi Ramulu
>>> Sent: Wednesday, September 02, 2009 11:05 PM
>>> To: r-help@r-project.org
>>> Subject: [R] How to do rotation for polygon?
>>>
>>>> Hi everyone,
>>>> I have code for a repeating pentagon, as below:
>>>>
>>>> plot(0:11, type="n")
>>>> for (i in 1:10) polygon(rep(c(4,5,7,8,6)), i*c(.5,.3,.3,.5,.7), border=2)
>>>>
>>>> The pentagons increase vertically. Now I want to know how to rotate
>>>> the pentagon, so that I get a pattern like a flower: basically, a
>>>> pentagon repeated around a circle.
>>>>
>>>> Thanks a lot for helping me to solve this problem.

--
Hemavathi Ramulu
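For readers without the TeachingDemos package, the same "flower" idea can be sketched in base graphics by rotating the vertex coordinates with a 2 x 2 rotation matrix. This is an alternative to the my.symbols() approach in the thread, not a copy of it; the helper rotate() and the pentagon coordinates in pent are hypothetical and chosen only for illustration.

```r
# Rotate an n x 2 matrix of (x, y) points counter-clockwise about the origin.
# rotate() is a hypothetical helper written for this sketch.
rotate <- function(xy, angle) {
  R <- matrix(c(cos(angle), sin(angle), -sin(angle), cos(angle)), nrow = 2)
  xy %*% t(R)
}

# A regular pentagon of radius 0.5 centred at (1, 0): one "petal"
a <- seq(0, 2 * pi, length.out = 6)[-6]
pent <- cbind(1 + 0.5 * cos(a), 0.5 * sin(a))

# Draw the petal five times, turning it by 72 degrees each time
plot(0, 0, type = "n", xlim = c(-2, 2), ylim = c(-2, 2),
     asp = 1, xlab = "", ylab = "")
for (i in 0:4) polygon(rotate(pent, i * 2 * pi / 5), border = i + 2)
```

Setting asp = 1 keeps the axes on the same scale, so the rotated copies are not visually distorted.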
[R] R functions with array arguments
Dear R users,

I'm trying to implement a self-defined function with multiple arguments, one of which is an array, but I find that the result is a single value instead of an array. This is the example I'm working on:

# Define integration limit vector
Mabslim <- c(-17.95, -16.65, -17.27, -17.62, -16.76, -17.07, -17.02)

# Define function
schech <- function(x,alpha,xstar) (10^(0.4*(alpha+1)*(xstar-x))*exp(-10^(0.4*(xstar-x))))

# Define new function integrating the previous one
my_gamma <- function(alpha,xstar,xlim,xmax) integrate(schech,xmax,xlim,alpha,xstar)$value

> my_gamma(-1,-21,Mabslim,-27)
[1] 2.487746

Note that 'Mabslim' is used as the upper integration limit within the function. I tried to use sapply(), but it looks to me as if this can be used only if the array is the first argument of the function.

I'm a beginner with R; apologies if I'm just using the wrong approach. Any suggestion on how to solve this?

Thanks,
Maurizio
Re: [R] latex code in R - convert to pdf
> is it possible to convert latex code to pdf in R (like a latex-program
> would do it)? Is there a package that comes with this capabilities?
>
> My problem is that I want to generate tables automatically - and I
> can't use a latex editor at that computer ...
>
> Besides latex ... are there good ways to generate tables in R?

Have a look at Sweave and xtable - I think that's what you want.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
[R] Turning points in a series
Good morning once more.

My problem of yesterday has been addressed. Having learned a few tricks from that, I wish to ask a related question.

My data is cosmic ray data consisting of dates and counts. When I plot counts versus dates, the resulting signal shows a number of maximum and minimum points. These minimum points (turning points) are of interest to me. Reading the dates and counts off the plot is difficult, as I am dealing with a large data set.

I have been looking at the turnpoints function in the pastecs package, but have not been able to figure out the appropriate commands to find the minima/maxima (turning points), or pits/peaks, in a series.

My data has the form shown below, where y stands for year, m for month, d for day, followed by the count. Is there a way I could find these minima together with the dates on which they occurred?

I would be indebted to anyone who can show me the way out of this problem.

Thank you.
Best regards
Ogbos

 y  m  d  count
93 02 07 3974.6
93 02 08 3976.7
93 02 09 3955.2
93 02 10 3955.0
93 02 11 3971.8
93 02 12 3972.8
93 02 13 3961.0
93 02 14 3972.8
93 02 15 4008.0
93 02 16 4004.2
93 02 17 3981.2
93 02 18 3996.8
93 02 19 4028.2
93 02 20 4029.5
93 02 21 3953.4
93 02 22 3857.3
93 02 23 3848.3
93 02 24 3869.8
93 02 25 3898.1
93 02 26 3920.5
93 02 27 3936.7
93 02 28 3931.9
Re: [R] latex code in R - convert to pdf
Martin Batholdy wrote:
> is it possible to convert latex code to pdf in R (like a latex-program
> would do it)? Is there a package that comes with this capabilities?

Unfortunately you're out of luck if you're seeking a direct path from LaTeX code generated in R to PDF without passing through a LaTeX compiler. Re-implementing the pdfTeX compiler in R would be a monumental undertaking, as the TeX macro-expansion language is pretty hairy; low-level TeX can make the worst Perl screen-vomit look tame by comparison. And that's just TeX: throw in a pile of LaTeX macro packages and the difficulty shoots up another order of magnitude or two.

Martin Batholdy wrote:
> My problem is that I want to generate tables automatically - and I
> can't use a latex editor at that computer ...
>
> Besides latex ... are there good ways to generate tables in R?

There's the xtable package, which can perform some automagical formatting of R objects to LaTeX code. If you don't have access to LaTeX at a workstation, you could try using xtable's HTML output mode, then use Word or OpenOffice to open the HTML file and copy the table.

There are also a few utilities out there that can perform a conversion from TeX to HTML; they might be worth Googling if xtable's HTML output isn't working for you.

Good luck!

-Charlie

-----
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
[R] geoR, variofit
Hello All!

I calculate a variogram using the function variog (package geoR); afterwards I use variofit to fit a spherical model (see code below). Now I have just changed the units of the variable (in this case from MPa to kPa, a factor of 1000). If I do so, I get a different fit and therefore different ranges etc. Why? The semi-variance is of course 6 orders of magnitude higher, but the values are the same. So from my point of view it should be the same fit. Maybe there is a reason for this that I do not get. Hopefully someone can reply to this question.

Thanks,
Sascha Bellaire

Parts of the code follow; just multiplying wl1$PSI by 1000 changed the fit.

# Calculate the extent of the current sampling design
locations <- cbind(wl1$X, wl1$Y)
extent <- max(dist(locations))
max.dist <- extent/2

data1 <- cbind(wl1$X, wl1$Y, wl1$PSI*1000)
geodata1 <- as.geodata(data1, coords.col = 1:2, data.col = 3)

# Calculate the sample variogram
vario_var1 <- variog(geodata1, uvec = nlags, trend = trend, option = "bin",
                     estimator.type = "modulus", max.dist = max.dist,
                     pairs.min = 10., direction = "omnidirectional",
                     messages = TRUE)

# Fit a model
variofit_var1 <- variofit(vario_var1, cov.model = variotype,
                          fix.nugget = FALSE, max.dist = vario_var1$max.dist,
                          messages = TRUE)

___
Sascha Bellaire
WSL Institute for Snow and Avalanche Research SLF
Formation of Alpine Natural Hazards
Flüelastrasse 11
7260 Davos Dorf, Switzerland
Tel.: +41 81 4170 292
Fax: +41 81 4170 110
www.slf.ch
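The scaling claim in the question (a factor of 1000 in the data gives a factor of 10^6 in the semi-variance) can be checked without geoR. The sketch below uses made-up values and a deliberately crude, hypothetical helper semivar() that pools all pairs and ignores distance bins. It only illustrates the scaling; it does not explain why variofit returns a different fit.

```r
# Made-up measurements standing in for wl1$PSI (assumption for illustration)
set.seed(42)
z <- rnorm(50)

# Crude pooled semi-variance: half the mean squared difference over all pairs.
# dist() on a plain vector returns the absolute pairwise differences.
semivar <- function(v) mean(dist(v)^2) / 2

# Rescaling the data by 1000 rescales the semi-variance by 1000^2
ratio <- semivar(1000 * z) / semivar(z)
ratio
```

Because the semi-variance scales quadratically, any tolerance, starting value or nugget expressed in absolute units would have to be rescaled by the same factor for two fits to be comparable.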
[R] Quadradic constraint in optimization of linear program
Can someone suggest which package is required for the optimization of a linear program with a quadratic constraint?

E.g.

max x1+x2+x3
subject to
x1+x2+x3=10
x3^2 = 0
Re: [R] latex code in R - convert to pdf
Hi,

for basic tables (e.g. displaying a data.frame without fancy formatting), you could try the textplot() function from the gplots package, or this rough function for Grid graphics:

source("http://gridextra.googlecode.com/svn/trunk/R/tableGrob.r")
# install.packages("gridextra", repos="http://R-Forge.R-project.org")

tc = textConnection("
      carat VeryLongWordIndeed color clarity depth
14513  1.35              Ideal     J     VS2  61.4
28685  0.30               Good     G    VVS1  64.0
50368  0.75              Ideal     F     SI2  59.2")
d = read.table(tc, head=T)
close(tc)

grid.newpage()
grid.table(d)

HTH,

baptiste
Re: [R] latex code in R - convert to pdf
On Thu, Sep 17, 2009 at 10:08:57AM +0200, Philipp Pagel wrote:
> > is it possible to convert latex code to pdf in R (like a latex-program
> > would do it)? Is there a package that comes with this capabilities?
> >
> > My problem is that I want to generate tables automatically - and I
> > can't use a latex editor at that computer ...
> >
> > Besides latex ... are there good ways to generate tables in R?
>
> Have a look at Sweave and xtable - I think that's what you want.

Charlie's post made me aware that by "latex editor" you may mean that there is no LaTeX installation on your machine. In that case Sweave and xtable will obviously be of little use.

If you have OpenOffice on that computer, the package odfWeave may be the solution. If OpenOffice is not available either, maybe the package HTMLUtils would be another option (I haven't used it so far, so I may be wrong here).

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
Re: [R] latex code in R - convert to pdf
Hello,

You could try:

- odfWeave: to generate an ODF (OpenOffice...) report
- R2HTML and hwriter: to generate an HTML report
- ascii: to generate an asciidoc (http://www.methods.co.nz/asciidoc/) report and then convert it to HTML, XML, PDF and more.

2009/9/17 Philipp Pagel p.pa...@wzw.tum.de
> [...]
> If you have Openoffice on that computer package odfWeave may be the
> solution. If openoffice is not available, either, maybe package
> HTMLUtils would be another option (I haven't used it so far, so I may
> be wrong here).
Re: [R] R functions with array arguments
Try this,

sapply(Mabslim, my_gamma, alpha=-1, xstar = -21, xmax = -27)

or wrap it with ?Vectorize,

vmy_gamma = Vectorize(my_gamma, vectorize.args = "xlim")
vmy_gamma(alpha=-1, xstar = -21, xlim = Mabslim, xmax = -27)

HTH,

baptiste

2009/9/17 Maurizio Paolillo paoli...@na.infn.it:
> [...]
> Note that 'Mabslim' is used as an upper integration limit within the
> function. I tried to use sapply() but it looks to me as if this can be
> used only if the array is the first argument of the function.
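The two suggestions can be put together into a self-contained sketch. The integrand below is the one from the original question with its closing parentheses restored, which is an assumption about the poster's intent; the names res_sapply and res_vec are introduced only for this example. Naming the other arguments in the sapply() call is what lets each element of Mabslim fall through to xlim.

```r
# Upper integration limits from the question
Mabslim <- c(-17.95, -16.65, -17.27, -17.62, -16.76, -17.07, -17.02)

# Integrand, reconstructed from the thread (closing parentheses assumed)
schech <- function(x, alpha, xstar) {
  10^(0.4 * (alpha + 1) * (xstar - x)) * exp(-10^(0.4 * (xstar - x)))
}

# One integral per call: a single lower and a single upper limit
my_gamma <- function(alpha, xstar, xlim, xmax) {
  integrate(schech, lower = xmax, upper = xlim,
            alpha = alpha, xstar = xstar)$value
}

# Option 1: sapply() passes each element of Mabslim as the first argument
# that is not already matched by name, which here is xlim
res_sapply <- sapply(Mabslim, my_gamma, alpha = -1, xstar = -21, xmax = -27)

# Option 2: Vectorize() builds a wrapper that loops over xlim
vmy_gamma <- Vectorize(my_gamma, vectorize.args = "xlim")
res_vec <- vmy_gamma(alpha = -1, xstar = -21, xlim = Mabslim, xmax = -27)

res_sapply
```

Both results have one value per element of Mabslim, instead of the single number returned when the whole vector is passed as the upper limit.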
Re: [R] Turning points in a series
On 17-Sep-09 08:10:47, ogbos okike wrote:
> [...]
> I have been looking at turnpoints function in pastecs library but have
> not been able to figure out the appropriate commands that one can use
> to find the minima/maxima (turning points) or pits/peaks in a series.
> [...]
> Is there a way I could find these minima together with the dates they
> occurred?

The following simple function TP() (for "Turning Point") locates the positions i where x[i] is greater than both of its immediate neighbours (local maximum) or less than both of its neighbours (local minimum).

  TP <- function(x){
    L <- length(x)
    which( ((x[1:(L-2)] < x[2:(L-1)]) & (x[2:(L-1)] > x[3:L]))
          |((x[1:(L-2)] > x[2:(L-1)]) & (x[2:(L-1)] < x[3:L])) ) + 1
  }

Applied to your series "count" above:

  TP(count)
  # [1]  2  4  6  7  9 11 14 17 21

If you assign these values to an index:

  ix <- TP(count)
  rbind(d[ix],count[ix])
  # [1,]    8.0   10   12.0   13   15   17.0   20.0   23.0   27.0
  # [2,] 3976.7 3955 3972.8 3961 4008 3981.2 4029.5 3848.3 3936.7

Of course, this is only a very simplistic view of "turning point", and will pick out everything which is a local minimum or maximum.

The above function can be extended (in a fairly obvious way) to identify each position i where x[i] is greater than its neighbours out to 2 on either side, or less than these neighbours; or more generally out to k on either side.

A lot depends on how you want to interpret "turning point". With your "count" series, it might be that you were only interested in identifying the relatively extreme turning points, such as i=4 (maybe), i=9 (maybe), i=14, i=17, i=21 (maybe).

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 17-Sep-09                                       Time: 09:47:53
------------------------------ XFMail ------------------------------
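The extension to k neighbours on either side, mentioned above as "fairly obvious", could look like the sketch below. TPk() is a hypothetical name and only one possible reading of the idea: a position counts as a turning point when its value is strictly greater than, or strictly less than, all k neighbours on each side. The counts are the 22 values from the original question.

```r
# Counts from the question (7 to 28 February 1993)
count <- c(3974.6, 3976.7, 3955.2, 3955.0, 3971.8, 3972.8, 3961.0, 3972.8,
           4008.0, 4004.2, 3981.2, 3996.8, 4028.2, 4029.5, 3953.4, 3857.3,
           3848.3, 3869.8, 3898.1, 3920.5, 3936.7, 3931.9)

# Turning points with respect to k neighbours on each side.
# Positions closer than k to either end are never reported.
TPk <- function(x, k = 1) {
  n <- length(x)
  if (n < 2 * k + 1) return(integer(0))
  idx <- (k + 1):(n - k)
  is_tp <- vapply(idx, function(i) {
    nb <- x[c((i - k):(i - 1), (i + 1):(i + k))]  # the 2k neighbours of x[i]
    all(x[i] > nb) || all(x[i] < nb)              # strict peak or strict pit
  }, logical(1))
  idx[is_tp]
}

TPk(count, k = 1)   # same positions as TP(count): 2 4 6 7 9 11 14 17 21
TPk(count, k = 2)   # fewer, more pronounced points: 4 7 9 11 14 17
```

With k = 2, position 6 drops out because its value ties with position 8, and position 21 drops out because it has only one neighbour to its right.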
Re: [R] How to extract data.frame columns using regex?
I think this is what you want:

> df <- data.frame(x1=1:11,x2=2:12,x3=3:13,y=4:14)
> grep('^x',names(df))
[1] 1 2 3

The returned indexes refer to the column positions, so you could do:

> names(df)[grep('^x',names(df))]
[1] "x1" "x2" "x3"

or

> df[,grep('^x',names(df))]
   x1 x2 x3
1   1  2  3
2   2  3  4
3   3  4  5
4   4  5  6
5   5  6  7
6   6  7  8
7   7  8  9
8   8  9 10
9   9 10 11
10 10 11 12
11 11 12 13

HTH
Schalk Heunis

On Thu, Sep 17, 2009 at 5:03 AM, Peng Yu pengyu...@gmail.com wrote:
> Hi,
>
> data.frame(x1=1:11,x2=2:12,x3=3:13,y=4:14)
>
> I want to extract all the columns with the name 'x?'. Is there a
> general way to do this in R?
>
> Regards,
> Peng
[R] package documentation of S4 methods
Hi,

I'm new to this mailing list and to R programming, so if the question is stupid I apologize. I have to create a package which includes an S4 class called 'BList'. For objects of this class I implemented a method 'show', which displays the first 15 data lines of the object. I further implemented a method 'showall', which displays all the data in the object. For the showall method I first defined a generic:

setGeneric("showall", function(object){
    out <- standardGeneric("showall")
})

and then defined the method using

setMethod("showall", signature(object="BList"), function(object){ ... })

My problem: I don't know how to document the showall method for the package. If I just add the method to the documentation file 'BList-class.Rd' as an alias using:

\alias{showall,BList-method}

and then add it to the list of methods:

\section{Methods}{
  \describe{
    \item{show}{\code{signature(object = "BList")}: Short output}
    \item{showall}{\code{signature(object = "BList")}: Long output}
  }
}

I get the following warning (when I check the package for installation):

* checking for missing documentation entries ... WARNING
Undocumented code objects:
  showall
All user-level objects in a package should have documentation entries.

I then used promptMethods() to build a separate documentation file 'showall-methods.Rd'. If I then run the check I get the warning:

* checking Rd files ... WARNING
Rd files with duplicated alias 'showall,BList-method':
  BList-class.Rd showall-methods.Rd

If I then remove the alias line '\alias{showall,BList-method}' from either 'BList-class.Rd' or 'showall-methods.Rd', I get the first warning again (Undocumented code objects...).

My question: how do you document methods of S4 classes in a package? It works fine for the method 'show', but it does not work for 'showall', for which I had to define the generic function first. What can I do to fix this, or is it better to just define 'showall' as a function and not as a method?

Thanks a lot,
Elton G.
[R] JGR install and run question
Hello all,

I tried to install the GUI interface JGR for R yesterday on my Windows Vista machine running R 2.9.2. It did install and run, but whenever I ran the package manager it crashed. It also did not appear to display all the available packages. For instance, Hmisc and bayesm were not in the list.

Since I am new to R, and more so to JGR, I'd just like to know if JGR is still supported and should run, or if I'm wasting my time with it. It had the nice feature of color coding the programs so that errors were easier to spot, like in SAS.

Is anyone else successfully running JGR on R 2.9.2 with Vista?

Thanks in advance for any comments.

--
Best regards,

David Young
Marketing and Statistical Consultant
Madrid, Spain
+34 913 540 381
http://www.linkedin.com/in/europedavidyoung
Re: [R] How to extract data.frame columns using regex?
>>>>> "SH" == Schalk Heunis schalk.heu...@enerweb.co.za
>>>>>     on Thu, 17 Sep 2009 11:15:16 +0200 writes:

    SH> I think this is what you want:
    SH> > df <- data.frame(x1=1:11,x2=2:12,x3=3:13,y=4:14)
    SH> > grep('^x',names(df))
    SH> [1] 1 2 3

    SH> The returned indexes refer to the column positions, so you could do:
    SH> > names(df)[grep('^x',names(df))]
    SH> [1] "x1" "x2" "x3"

yes, or slightly more elegant and efficient

  > grep('^x',names(df), value = TRUE)
  [1] "x1" "x2" "x3"

    SH> or
    SH> > df[,grep('^x',names(df))]
    SH> [...]
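Putting the variants from this thread side by side, the sketch below extracts the matching names and the matching columns. The grepl() form with drop = FALSE is an addition not mentioned in the thread: it keeps the result a data frame even when only one column matches. The names x_names, x_cols and y_only are illustrative.

```r
df <- data.frame(x1 = 1:11, x2 = 2:12, x3 = 3:13, y = 4:14)

# Names of the columns that start with "x"
x_names <- grep("^x", names(df), value = TRUE)

# The columns themselves; a logical index plus drop = FALSE always
# returns a data frame, whatever the number of matches
x_cols <- df[, grepl("^x", names(df)), drop = FALSE]

# Single-match case: still a data frame with one column
y_only <- df[, grepl("^y", names(df)), drop = FALSE]

x_names
head(x_cols, 3)
```

Without drop = FALSE, a pattern that matches exactly one column would return a plain vector, which can break code that expects a data frame.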
Re: [R] Turning points in a series
Hello,

I don't see what's wrong with turnpoints() from pastecs. It is easy to use, and it provides additional information for each turning point: the probability of occurrence against the null hypothesis that the series is purely random, and the number of bits of information associated with the point according to Kendall's information theory. See ?turnpoints.

Rewriting the function is a nice exercise, but a small explanation of how to use turnpoints() is much easier. So, nobody is able to tell that simply using:

    library(pastecs)
    turnpoints(dat$count)

does the job? Am I the only one interested in the extra information provided by turnpoints()? Here is a more extensive example that also shows how to get the dates or counts associated with pits/peaks:

    txt <- "y m d count
    93 02 07 3974.6
    93 02 08 3976.7
    93 02 09 3955.2
    93 02 10 3955.0
    93 02 11 3971.8
    93 02 12 3972.8
    93 02 13 3961.0
    93 02 14 3972.8
    93 02 15 4008.0
    93 02 16 4004.2
    93 02 17 3981.2
    93 02 18 3996.8
    93 02 19 4028.2
    93 02 20 4029.5
    93 02 21 3953.4
    93 02 22 3857.3
    93 02 23 3848.3
    93 02 24 3869.8
    93 02 25 3898.1
    93 02 26 3920.5
    93 02 27 3936.7
    93 02 28 3931.9"
    con <- textConnection(txt)
    dat <- read.table(con, header = TRUE)
    close(con)
    dat$date <- as.Date(paste(dat$y, dat$m, dat$d), format = "%y %m %d")

    library(pastecs)
    tp <- turnpoints(dat$count)
    tp
    summary(tp)

    # Indicate which turnpoints are significant (see ?turnpoints)
    plot(tp, level = 0.05)

    # Another plot
    plot(dat$count, type = "l")
    lines(tp)

    # Get counts for all turnpoints
    allcounts <- dat$count[extract(tp, no.tp = FALSE, peak = TRUE, pit = TRUE)]

    # Get dates for all turnpoints
    alldates <- dat$date[extract(tp, no.tp = FALSE, peak = TRUE, pit = TRUE)]
    alldates

    # Get dates for informative turnpoints (5%) only (see ?turnpoints)
    alldates[tp$proba < 0.05]

    # Get dates for peaks only
    dat$date[extract(tp, no.tp = FALSE, peak = TRUE, pit = FALSE)]

    # Etc...

Best,

Philippe Grosjean

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Belgium

(Ted Harding) wrote:
> On 17-Sep-09 08:10:47, ogbos okike wrote:
>> Good morning once more. My problem of yesterday has been addressed. Having learned a few tricks from that, I wish to ask another question in connection with that. My data is cosmic ray data consisting of dates and counts. When I plot a graph of counts versus dates, the resultant signal shows a number of maximum and minimum points. These minimum points (turning points) are of interest to me. Reading these dates and counts off from the plot is difficult as I am dealing with a large data set. I have been looking at the turnpoints function in the pastecs library but have not been able to figure out the appropriate commands that one can use to find the minima/maxima (turning points) or pits/peaks in a series. My data is of the form shown below, where y stands for year, m month, d day and finally count. Is there a way I could find these minima together with the dates they occurred? I would be indebted to those of you who will show me the way out of this problem.
>> Thank you.
>> Best regards
>> Ogbos
>>
>>     y  m  d  count
>>     93 02 07 3974.6
>>     93 02 08 3976.7
>>     93 02 09 3955.2
>>     93 02 10 3955.0
>>     93 02 11 3971.8
>>     93 02 12 3972.8
>>     93 02 13 3961.0
>>     93 02 14 3972.8
>>     93 02 15 4008.0
>>     93 02 16 4004.2
>>     93 02 17 3981.2
>>     93 02 18 3996.8
>>     93 02 19 4028.2
>>     93 02 20 4029.5
>>     93 02 21 3953.4
>>     93 02 22 3857.3
>>     93 02 23 3848.3
>>     93 02 24 3869.8
>>     93 02 25 3898.1
>>     93 02 26 3920.5
>>     93 02 27 3936.7
>>     93 02 28 3931.9
>
> The following simple function TP() (for Turning Point) locates the positions i where x[i] is greater than both of its immediate neighbours (local maximum) or less than both of its neighbours (local minimum).
>
>     TP <- function(x){
>       L <- length(x)
>       which( ((x[1:(L-2)]<x[2:(L-1)])&(x[2:(L-1)]>x[3:L]))
>             |((x[1:(L-2)]>x[2:(L-1)])&(x[2:(L-1)]<x[3:L])) ) + 1
>     }
>
> Applied to your series count above:
>
>     TP(count)
>     # [1]  2  4  6  7  9 11 14 17 21
>
> If you assign these values to an index:
>
>     ix <- TP(count)
>     rbind(d[ix],count[ix])
>     # [1,]    8.0   10   12.0   13   15   17.0   20.0   23.0   27.0
>     # [2,] 3976.7 3955 3972.8 3961 4008 3981.2 4029.5 3848.3 3936.7
>
> Of course, this is only a very simplistic view of "turning point", and it will pick out everything which is a local minimum or maximum. The above function can be extended (in a fairly obvious way) to identify each position i where x[i] is greater than its neighbours out to 2 on either side, or less than these neighbours; or more generally out to k on either side.
>
> A lot depends on how you want to interpret "turning point". With your count series, it might be that you were only interested in identifying the relatively extreme turning points, such as i=4 (maybe), i=9 (maybe), i=14, i=17, i=21 (maybe).
>
> Hoping this helps,
> Ted.
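The extension "out to k on either side" mentioned in the quoted reply could look roughly like the sketch below. The function name TPk and its argument k are hypothetical (not from the thread or from pastecs), and strict inequalities are assumed, so ties with a neighbour do not count as turning points.

```r
# Hypothetical helper: positions i where x[i] is strictly greater than (or
# strictly less than) all of its neighbours out to k on either side.
TPk <- function(x, k = 1) {
  n <- length(x)
  if (n < 2 * k + 1) return(integer(0))
  idx <- (k + 1):(n - k)               # positions with k neighbours on both sides
  is_tp <- vapply(idx, function(i) {
    nb <- x[c((i - k):(i - 1), (i + 1):(i + k))]
    all(x[i] > nb) || all(x[i] < nb)
  }, logical(1))
  idx[is_tp]
}

# Counts from the thread
count <- c(3974.6, 3976.7, 3955.2, 3955.0, 3971.8, 3972.8, 3961.0, 3972.8,
           4008.0, 4004.2, 3981.2, 3996.8, 4028.2, 4029.5, 3953.4, 3857.3,
           3848.3, 3869.8, 3898.1, 3920.5, 3936.7, 3931.9)

tp1 <- TPk(count, k = 1)   # immediate neighbours only
tp2 <- TPk(count, k = 2)   # two neighbours on each side
print(tp1)
print(tp2)
```

With k = 1 this should select the same positions as the simple TP() function; larger k keeps only the more pronounced pits and peaks, and positions closer than k to either end are never reported.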
Re: [R] How to extract data.frame columns using regex?
On 09/17/2009 12:04 PM, Martin Maechler wrote:

> "SH" == Schalk Heunis schalk.heu...@enerweb.co.za on Thu, 17 Sep 2009 11:15:16 +0200 writes:
>
> SH> I think this is what you want:
> SH>
> SH>     df <- data.frame(x1=1:11, x2=2:12, x3=3:13, y=4:14)
> SH>     grep('^x', names(df))
> SH>     [1] 1 2 3
> SH>
> SH> The returned indexes refer to the column positions, so you could do:
> SH>
> SH>     names(df)[grep('^x', names(df))]
> SH>     [1] "x1" "x2" "x3"
>
> yes, or slightly more elegant and efficient
>
>     grep('^x', names(df), value = TRUE)
>     [1] "x1" "x2" "x3"

or, if you have the operators package:

    require( operators )
    names(df) %~|% "^x"
    [1] "x1" "x2" "x3"

... note that it has been declared confusing on this mailing list previously: http://article.gmane.org/gmane.comp.lang.r.general/154749

> SH> or
> SH>
> SH>     df[, grep('^x', names(df))]
> SH>        x1 x2 x3
> SH>     1   1  2  3
> SH>     2   2  3  4
> SH>     3   3  4  5
> SH>     4   4  5  6
> SH>     5   5  6  7
> SH>     6   6  7  8
> SH>     7   7  8  9
> SH>     8   8  9 10
> SH>     9   9 10 11
> SH>     10 10 11 12
> SH>     11 11 12 13
> SH>
> SH> HTH
> SH> Schalk Heunis
>
> SH> On Thu, Sep 17, 2009 at 5:03 AM, Peng Yu pengyu...@gmail.com wrote:
> SH>> Hi,
> SH>>
> SH>>     data.frame(x1=1:11, x2=2:12, x3=3:13, y=4:14)
> SH>>
> SH>> I want to extract all the columns with the name 'x?'. Is there a general way to do this in R?
> SH>>
> SH>> Regards, Peng

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/yw8E : New R package : sos
|- http://tr.im/y8y0 : search the graph gallery from R
`- http://tr.im/y8wY : new R package : ant
[R] Problems with the commands FUNCTION and DERIV to build a polynomial
Hi all,

I need to automate a process in order to prepare a big loop in the future, but I have a problem with the function() command.

First I fit a model with lm:

    model1 <- lm(data2[,2] ~ data2[,1] + I(data2[,1]^2) + I(data2[,1]^3) + I(data2[,1]^4))

I extract the coefficients to build the polynomial:

    coef <- as.matrix(model1$coefficients)

In the next step I need to define the polynomial in order to differentiate it. If I write the coefficients manually (typing the numbers by hand), the deriv command works fine:

    bb <- deriv(~ 2847.22015 - 463.06063*x + 25.43829*x^2 - 0.17896*x^3, namevec="x", function.arg=40)

I would like to automate this step by extracting the coefficients from the linear model and inserting them into the polynomial, rather than writing them by hand. But if I build the polynomial with function(x), referring to the coef values, the numeric values are not substituted; the function does not pick up the coefficients from the linear model as numbers:

    fun <- function(x) coef[1]+coef[2]*x+coef[3]*x^2+coef[4]*x^3
    fun
    function(x) coef[1]+coef[2]*x+coef[3]*x^2+coef[4]*x^3

How can I avoid writing the values of the coefficients by hand? I need to do this many, many times, which is why I need to automate the process and then build a loop to repeat it with different outputs of a linear model.

Can somebody help me?

--
Noela
Grupo de Recursos Marinos y Pesquerías
Universidad de A Coruña
Re: [R] JGR install and run question
I've had similar problems. See: http://article.gmane.org/gmane.comp.lang.r.rosuda.devel/747 Although a bit out of date, there is a page on GUIs for R at these two links: http://www.sciviews.org/_rgui/ http://wiki.r-project.org/rwiki/doku.php?id=guis:guis On Thu, Sep 17, 2009 at 5:58 AM, David Young dyo...@telefonica.net wrote: Hello all, I tried to install the GUI interface JGR for R yesterday on my Windows Vista machine running R 2.9.2. It did install and run but whenever I ran the package manager it crashed. It also did not appear to display all the available packages. For instance Hmisc and bayesm were not in the list. Since I am new to R, and more so to JGR, I'd just like to know if JGR is still supported and should run or if I'm wasting my time with it. It had the nice feature of color coding the programs so that errors were easier to spot, like in SAS. Is anyone else successfully running JGR on R 2.9.2 with Vista? Thanks in advance for any comments. -- Best regards, David Young Marketing and Statistical Consultant Madrid, Spain +34 913 540 381 http://www.linkedin.com/in/europedavidyoung __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] RCurl and Google Scholar's EndNote references
Hi!

I've performed a Google Scholar search using a query, let's say "Frank Harrell", and parsed the links to the EndNote references from the resulting HTML code. Now I'd like to download all the references automatically. For this, I have tried to use RCurl, but I can't seem to get it working: I always get error code 403 Forbidden from the web server.

Initially I tried to do this without using cookies:

    library(RCurl)
    getURL("http://scholar.google.fi/scholar.enw?q=info:U6Gfb4QPVFMJ:scholar.google.com/&output=citation&hl=fi&oe=ASCII&ct=citation&cd=0")

or

    getURLContent("http://scholar.google.fi/scholar.enw?q=info:U6Gfb4QPVFMJ:scholar.google.com/&output=citation&hl=fi&oe=ASCII&ct=citation&cd=0")
    Error: Forbidden

and then with cookies:

    getURL("http://scholar.google.fi/scholar.enw?q=info:U6Gfb4QPVFMJ:scholar.google.com/&output=citation&hl=fi&oe=ASCII&ct=citation&cd=0", .opts=list(cookiejar="cookiejar.txt"))

But they both consistently fail the same way. What am I doing wrong?

    > sessionInfo()
    R version 2.9.0 (2009-04-17)
    i386-pc-mingw32

    locale:
    LC_COLLATE=Finnish_Finland.1252;LC_CTYPE=Finnish_Finland.1252;LC_MONETARY=Finnish_Finland.1252;LC_NUMERIC=C;LC_TIME=Finnish_Finland.1252

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

    other attached packages:
    [1] RCurl_0.98-1 bitops_1.0-4.1

Thanks!
Jarno
Re: [R] Filling Empty Column with String in read.table
Try this after reading the data file:

    dat[is.na(dat[,5]), 5] <- '-'

On Thu, Sep 17, 2009 at 3:33 AM, Gundala Viswanath gunda...@gmail.com wrote:
> I have a data file that looks like this.
>
>     __DATA__
>     D7KAR5Z02F447V 176 G 0.22
>     D7KAR5Z02J3WLG 94 A 1.05 29
>     D7KAR5Z02F4K6L 198 a 0.13
>     D7KAR5Z02J4SYO 67 C 0.95 28
>     D7KAR5Z02J4SYO 83 C 1.01 29
>     D7KAR5Z02J4SYO 97 T 0.13
>     D7KAR5Z02J4SYO 166 A 0.94 27
>
> I want the rows where the last column has no entry to be filled with "-". Is there a way to do it with read.table? I can't find any option to do it with read.table. This only fills the empty 5th column with blanks:
>
>     dat <- read.table("myfile.txt", na.strings="-", header=FALSE, fill=T);
>
> G.V.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
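A minimal, self-contained sketch of the suggested fix, reading the sample rows from an inline string instead of a file. The use of read.table(text = ...) and the variable name txt are assumptions made here so the example runs without "myfile.txt"; the replacement line itself is the one proposed in the reply.

```r
# Sample rows from the question; some rows lack the 5th field
txt <- "D7KAR5Z02F447V 176 G 0.22
D7KAR5Z02J3WLG 94 A 1.05 29
D7KAR5Z02F4K6L 198 a 0.13
D7KAR5Z02J4SYO 67 C 0.95 28
D7KAR5Z02J4SYO 83 C 1.01 29
D7KAR5Z02J4SYO 97 T 0.13
D7KAR5Z02J4SYO 166 A 0.94 27"

# fill = TRUE pads short rows, so the missing 5th field becomes NA
dat <- read.table(text = txt, header = FALSE, fill = TRUE)

# Replace the NAs in column 5 with "-"; the column becomes character
dat[is.na(dat[, 5]), 5] <- "-"
print(dat)
```

Note that after the replacement the fifth column holds strings rather than numbers, so this is probably best done as a last step before writing the table out.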
Re: [R] Quadradic constraint in optimization of linear program
The 2nd constraint holds trivially and the 1st constraint implies that the maximum is 10. The solution is not unique and any values of the variables satisfying the 1st constraint are optimum. No software needed. On Thu, Sep 17, 2009 at 4:19 AM, pragathichi pragathichitr...@tcs.com wrote: Can someone suggest which package is required for optimization of linear program with quadradic constraint. Eg max x1+x2+x3 subject to x1+x2+x3=10 x3^2 = 0 -- View this message in context: http://www.nabble.com/Quadradic-constraint-in-optimization-of-linear-program-tp25487057p25487057.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] What is the time complexity of 'match()'?
Hi, Suppose 'x' is a vector of length n and 'y' is a vector of length m, I am wondering what the time complexity of 'match(x,y)' is. Is it n times m? Regards, Peng __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] heatmap.2() problems with re-ordering of rows and columns
I have a file of the following form:

    -11 -10  -9  -8
    -10  -9  -8  NA
     -9  -7  NA  NA
     -8  NA  NA  NA

So basically an NxN matrix of log scores. I want to get a heatmap of these log scores, but I'm having a problem. I'm using the following code:

    library(gplots)
    data = read.table("filein.txt", header=FALSE)
    mat = as.matrix(data)
    heatmap.2(mat, dendrogram=c("none"))

But in the picture, it rearranges all my rows and columns. I want the y axis to be labeled from [10,-10] and the x axis to be the same [-10,10], so that the bottom left cell is -10,-10 and the top right cell is 10,10 -- which is the way the matrix is laid out. Why is it rearranging my cells?

--
View this message in context: http://www.nabble.com/heatmap.2%28%29-problems-with-re-ordering-of-rows-and-columns-tp25490249p25490249.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] poisson lognormal regression
> Dear all,
> I want to write the log-likelihood function of a Poisson-lognormal regression directly and then estimate the parameters. The response (Y) is a count and I have two explanatory variables (x1: nominal, x2: continuous). Which package in R can be used? How can I input the data, write the log-likelihood function, and then optimize it?
> regards

I suggest you define a function with the process model and the likelihood and then use nlm to minimize the negative log-likelihood. See the examples at the end of the nlm help.

HTH
Ruben
[R] Error message in Design library
This was working a few weeks ago, but perhaps the package has been updated since then. model.1 - lrm(response ~ p_value, data=c_abl_oncogene_1_RTK) When I run the following command . . . . prediction.1 - predict(model.1, type=c(fitted)) I get the following error message. . . . Error in predictDesign(object, ..., type = lp, se.fit = FALSE) : could not find function Varcov It seems like a required function of predict may be missing in the Design package (although I doubt Professor Harrell would have overlooked this). Perhaps its my own stupidity with something. Any ideas what could be causing this? I'm sorry I didn't post a reproducable example, but it a lot of code. Any help would be appreciated. Best regards, Patrick R version 2.9.2 (2009-08-24) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] grid datasets tcltk grDevices splines graphics utils stats methods base other attached packages: [1] Design_2.2-0ROCR_1.0-2 gplots_2.7.1caTools_1.9 bitops_1.0-4.1 gdata_2.6.1 gtools_2.6.1svSocket_0.9-43 svMisc_0.9.48 TinnR_1.0.3 R2HTML_1.59-1 [12] Hmisc_3.7-0 MASS_7.2-48 survival_2.35-7 loaded via a namespace (and not attached): [1] cluster_1.12.0 lattice_0.17-25 tools_2.9.2 This email message, including any attachments, is for th...{{dropped:9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] latex code in R - convert to pdf
On Sep 17, 2009, at 4:31 AM, Philipp Pagel wrote: On Thu, Sep 17, 2009 at 10:08:57AM +0200, Philipp Pagel wrote: is it possible to convert latex code to pdf in R (like a latex-program would do it)? Is there a package that comes with this capabilities? My problem is that I want to generate tables automatically - and I can't use a latex editor at that computer ... Besides latex ... are there good ways to generate tables in R? Have a look at Sweave and xtable - I think that's what you want. Charlies post made me aware that by latex editor you may mean that there is no LaTeX installation on your machine. In that case Sweave and xtable will obviously be of little use. If you have Openoffice on that computer package odfWeave may be the solution. If openoffice is not available, either, maybe package HTMLUtils would be another option (I haven't used it so far, so I may be wrong here). The Sweave help page has a set of examples followed by a line that is commented out. If one runs the examples and then this line: tools::texi2dvi(Sweave-test-1.tex, pdf=TRUE) One gets a) an error message but also 4 files two of which are the expected pdf files and two if which are eps file type. I'm not sure what the error message is telling me but it is not correct to my eyes that it could be called failure: Error in tools::texi2dvi(Sweave-test-1.tex, pdf = TRUE) : Running 'texi2dvi' on 'Sweave-test-1.tex' failed. David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to colour the tip labels in a phylogenetic tree
Hi,

Using ape, I have constructed an object of class phylo using the method 'nj' (let's call the object 'tree_ja'). I also have a given subset of the tips of 'tree_ja' in a vector (let's call the vector 'subspecies'). What I want to do is plot the nj tree - plot(tree_ja) - but have the species in the vector 'subspecies' shown in red at the tips of the tree.

The closest I've come is this. Given that 'tree_ja$tip.label' provides the following:

    [1] "1_T1"  "2_T1"  "3_T1"  "4_T1"  "5_T1"  "6_T1"
    [7] "7_T1"  "8_T1"  "9_T1"  "10_T1" "11_T1" "12_T1"

and that my 'subspecies' vector is:

    subspecies <- c("1_T1", "2_T1", "3_T1", "4_T1", "6_T1")

which can also be written as:

    subspecies <- c(tree_ja$tip.label[1:4], tree_ja$tip.label[6])

I can construct a method which gives me the following statement:

    plot(tree_ja, tip.col = c('red', 'red', 'red', 'red', 'black', 'red', 'black', 'black', 'black', 'black', 'black', 'black'))

But this doesn't work (at least not on my full dataset, which has 118 tips - reduced to 12 here for brevity), and I'm SURE there must be a better way of doing it. Could anyone help me with this?

Many thanks,
Graham

--
Dr. Graham Etherington
Post-doctoral Bioinformatician, Department of Computational and Systems Biology
John Innes Centre
Norwich Research Park
Colney
Norwich NR4 7UH
UK
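One way the colour vector could be built without typing it by hand is sketched below. This is an illustration rather than an answer from the thread: tip_labels stands in for tree_ja$tip.label, the random tree from ape::rtree() is only a placeholder for the real nj tree, and the plotting part runs only if the ape package is installed.

```r
# Stand-in for tree_ja$tip.label (12 tips, as in the reduced example)
tip_labels <- paste0(1:12, "_T1")
subspecies <- tip_labels[c(1:4, 6)]

# One colour per tip, in the same order as the tip labels
tip_cols <- ifelse(tip_labels %in% subspecies, "red", "black")
print(tip_cols)

# Plot only if ape is available; the random tree is a placeholder for tree_ja
if (requireNamespace("ape", quietly = TRUE)) {
  tree <- ape::rtree(length(tip_labels), tip.label = tip_labels)
  # Recompute the colours from the tree's own label order before plotting
  cols <- ifelse(tree$tip.label %in% subspecies, "red", "black")
  ape::plot.phylo(tree, tip.color = cols)
}
```

The key point of the sketch is that the colours are derived from the tree's own tip.label vector, so they stay aligned with the tips however many there are.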
Re: [R] inline error message
_ wrote: Hi all, I installed the library inline to my default R-enviroment (c:\Programme\R.. ) downloaded and installed RTools from http://www.murdoch-sutherland.com/Rtools/ to c:\Rtools. Path variable is set right (with respect to order) but I get still the error message from my R Error in compileCode(f, code, language, verbose) : Compilation ERROR, function(s)/method(s) not created! Where is my failure ? If you tell us what library (I guess package?) you mean and how you traied to install it and what command gave the error message (including a complete logfile perhaps), we may be able to help. Uwe Ligges Thank you ! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Error message in Design library
On Sep 17, 2009, at 8:58 AM, Richardson, Patrick wrote: This was working a few weeks ago, but perhaps the package has been updated since then. model.1 - lrm(response ~ p_value, data=c_abl_oncogene_1_RTK) When I run the following command . . . . prediction.1 - predict(model.1, type=c(fitted)) I get the following error message. . . . Error in predictDesign(object, ..., type = lp, se.fit = FALSE) : could not find function Varcov It seems like a required function of predict may be missing in the Design package (although I doubt Professor Harrell would have overlooked this). Perhaps its my own stupidity with something. Any ideas what could be causing this? I'm sorry I didn't post a reproducable example, but it a lot of code. No, Prof Harrell did not overlook this. Instead he rewrote the whole package which is now rms. The missing Varcov function was made available in an earlier posting to r-help just a couple of days ago. https://stat.ethz.ch/pipermail/r-help/2009-September/211306.html -- David Any help would be appreciated. Best regards, Patrick R version 2.9.2 (2009-08-24) i386-pc-mingw32 locale: LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States. 1252;LC_MONETARY=English_United States. 1252;LC_NUMERIC=C;LC_TIME=English_United States.1252 attached base packages: [1] grid datasets tcltk grDevices splines graphics utils stats methods base other attached packages: [1] Design_2.2-0ROCR_1.0-2 gplots_2.7.1caTools_1.9 bitops_1.0-4.1 gdata_2.6.1 gtools_2.6.1svSocket_0.9-43 svMisc_0.9.48 TinnR_1.0.3 R2HTML_1.59-1 [12] Hmisc_3.7-0 MASS_7.2-48 survival_2.35-7 loaded via a namespace (and not attached): [1] cluster_1.12.0 lattice_0.17-25 tools_2.9.2 This email message, including any attachments, is for th...{{dropped: 9}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
David Winsemius, MD Heritage Laboratories West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] heatmap.2() problems with re-ordering of rows and columns
Hi bioinformatics_guy,

I think you are looking for the image function:

    image(mat)

The heatmap.2 function does hierarchical clustering on rows and columns and then orders the rows and columns according to the results of the clustering. image() simply plots the matrix.

HTH
Schalk Heunis

On Thu, Sep 17, 2009 at 2:23 PM, bioinformatics_guy wwwhite...@gmail.com wrote:
> I have a file of the following form:
>
>     -11 -10  -9  -8
>     -10  -9  -8  NA
>      -9  -7  NA  NA
>      -8  NA  NA  NA
>
> So basically an NxN matrix of log scores. I want to get a heatmap of these log scores, but I'm having a problem. I'm using the following code:
>
>     library(gplots)
>     data = read.table("filein.txt", header=FALSE)
>     mat = as.matrix(data)
>     heatmap.2(mat, dendrogram=c("none"))
>
> But in the picture, it rearranges all my rows and columns. I want the y axis to be labeled from [10,-10] and the x axis to be the same [-10,10], so that the bottom left cell is -10,-10 and the top right cell is 10,10 -- which is the way the matrix is laid out. Why is it rearranging my cells?
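A small base-graphics sketch of the image() suggestion, with custom axis labels. The 4x4 matrix is the sample from the question; the label values in labs and the name oriented are made up for illustration. It also assumes the plot should show the matrix as printed (row 1 at the top), which requires reversing the rows and transposing, because image() draws z[i, j] at position (x[i], y[j]) with y increasing upwards.

```r
# Sample matrix from the question (NA cells are left blank by image())
mat <- matrix(c(-11, -10, -9, -8,
                -10,  -9, -8, NA,
                 -9,  -7, NA, NA,
                 -8,  NA, NA, NA), nrow = 4, byrow = TRUE)

# Reverse the rows and transpose so the plot matches the printed layout
oriented <- t(mat[nrow(mat):1, ])

# Hypothetical axis labels, ascending left to right and bottom to top
labs <- c(-10, -5, 5, 10)

image(seq_len(nrow(oriented)), seq_len(ncol(oriented)), oriented,
      axes = FALSE, xlab = "", ylab = "")
axis(1, at = seq_len(nrow(oriented)), labels = labs)  # x axis
axis(2, at = seq_len(ncol(oriented)), labels = labs)  # y axis
box()
```

Unlike heatmap.2(), this performs no clustering, so the rows and columns stay in the order given.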
Re: [R] What is the time complexity of 'match()'?
try running it on a grid within system.time() And the answer will be revealed. Tal On Thu, Sep 17, 2009 at 3:15 PM, Peng Yu pengyu...@gmail.com wrote: Hi, Suppose 'x' is a vector of length n and 'y' is a vector of length m, I am wondering what the time complexity of 'match(x,y)' is. Is it n times m? Regards, Peng __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- -- My contact information: Tal Galili Phone number: 972-50-3373767 FaceBook: Tal Galili My Blogs: http://www.r-statistics.com/ http://www.talgalili.com http://www.biostatistics.co.il [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] What is the time complexity of 'match()'?
On Thu, 17 Sep 2009, Peng Yu wrote: Hi, Suppose 'x' is a vector of length n and 'y' is a vector of length m, I am wondering what the time complexity of 'match(x,y)' is. Is it n times m? match() hashes the second argument then does hash lookups for the first argument (the source is in src/main/unique.c), so its time complexity will be close to m+n. Even just sorting the second argument and doing binary search would allow (m+n)log(m) complexity -- it really would take a brute force and ignorance approach to give mn time complexity, and match() is sometimes used for quite large m and n. -thomas Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.eduUniversity of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
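The two replies can be combined into a rough empirical check: time match() over a grid of sizes and see whether the elapsed time grows roughly linearly in m + n, as the hashing explanation suggests. The helper name time_match, the grid of sizes and the random integer data are arbitrary choices for this sketch, and timings will vary by machine, so only the trend is meaningful.

```r
set.seed(1)

# Hypothetical helper: elapsed seconds for one match(x, y) call
# with length(x) == n and length(y) == m
time_match <- function(n, m) {
  x <- sample.int(10 * m, n, replace = TRUE)
  y <- sample.int(10 * m, m)
  system.time(match(x, y))[["elapsed"]]
}

sizes <- c(1e4, 1e5, 1e6)
timings <- data.frame(
  n = sizes,
  m = sizes,
  elapsed = vapply(sizes, function(s) time_match(s, s), numeric(1))
)
print(timings)

# Reminder of what match() returns: positions of x in y, NA where absent
res <- match(c(3, 1, 9), c(1, 3, 5))
print(res)
```

If the cost were proportional to n times m, each tenfold increase in both sizes would multiply the time by about 100; with hashing one would expect a factor nearer 10.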
Re: [R] heatmap.2() problems with re-ordering of rows and columns
Schalk,

That's a great function! The only question is, is it as flexible as heatmap.2? I figured out how to keep it from rearranging the rows and columns, but I can't figure out how to label the rows and columns. What I like about heatmap.2 is that it gives a grid and a histogram of the heatmap, which is nice. I'm trying

    library(gplots)
    data = read.table("filein.txt", header=FALSE)
    mat = as.matrix(data)
    heatmap.2(mat, dendrogram=c("none"), trace=c("none"), Rowv=F, Colv=F)

which works but labels my columns V1 through V21 and rows 1-21. I'd like them to be different, and the man page for heatmap.2 states:

    # Row/Column Labeling
    margins = c(5, 5), ColSideColors, RowSideColors,
    cexRow = 0.2 + 1/log10(nr), cexCol = 0.2 + 1/log10(nc),
    labRow = NULL, labCol = NULL,

So I'm adding cexRow=30-3(nr) (as I want it to decrement by 3 for each row), but R spits back an error saying nr is not recognized. I was looking at other help pages but couldn't find out how to label the axes the way I wanted to.

Schalk Heunis wrote:
> Hi bioinformatics_guy,
>
> I think you are looking for the image function:
>
>     image(mat)
>
> The heatmap.2 function does hierarchical clustering on rows and columns and then orders the rows and columns according to the results of the clustering. image() simply plots the matrix.
>
> HTH
> Schalk Heunis
>
> On Thu, Sep 17, 2009 at 2:23 PM, bioinformatics_guy wwwhite...@gmail.com wrote:
>> I have a file of the following form:
>>
>>     -11 -10  -9  -8
>>     -10  -9  -8  NA
>>      -9  -7  NA  NA
>>      -8  NA  NA  NA
>>
>> So basically an NxN matrix of log scores. I want to get a heatmap of these log scores, but I'm having a problem. I'm using the following code:
>>
>>     library(gplots)
>>     data = read.table("filein.txt", header=FALSE)
>>     mat = as.matrix(data)
>>     heatmap.2(mat, dendrogram=c("none"))
>>
>> But in the picture, it rearranges all my rows and columns. I want the y axis to be labeled from [10,-10] and the x axis to be the same [-10,10], so that the bottom left cell is -10,-10 and the top right cell is 10,10 -- which is the way the matrix is laid out. Why is it rearranging my cells?

--
View this message in context: http://www.nabble.com/heatmap.2%28%29-problems-with-re-ordering-of-rows-and-columns-tp25490249p25491683.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Problems with the commands FUNCTION and DERIV to build a polynomial
On Thu, 17 Sep 2009, Noela Sánchez wrote:

Hi all, I need to automate a process in order to prepare a big loop in the future, but I have a problem with the *function* command. First I fit a model with lm:

  model1 <- lm(data2[,2] ~ data2[,1] + I(data2[,1]^2) + I(data2[,1]^3) + I(data2[,1]^4))

I extract the coefficients to build the polynomial:

  coef <- as.matrix(model1$coefficients)

In the next step I need to define the polynomial in order to differentiate it. If I write the coefficients manually (typing the numbers by hand), the deriv command works fine:

  bb <- deriv(~ 2847.22015 - 463.06063*x + 25.43829*x^2 - 0.17896*x^3, namevec="x", function.arg=40)

I would like to automate this step by extracting the coefficients from the linear model and inserting them into the polynomial, rather than writing them by hand. But if I build the polynomial with function(x), referring to the *coef* values, the numeric values are not substituted; the function does not pick up the coefficients from the linear model:

  fun <- function(x) coef[1]+coef[2]*x+coef[3]*x^2+coef[4]*x^3
  fun
  function(x) coef[1]+coef[2]*x+coef[3]*x^2+coef[4]*x^3

How can I avoid writing the values of the coefficients by hand?

As is often the case, there are two parts to this answer: how to do what you asked, and how to do what you want.

You can use bquote() to get the value of coef[1] into the expression, so

  deriv(bquote(.(coef[1])+.(coef[2])*x+.(coef[3])*x^2+.(coef[4])*x^3), "x", function.arg="x")

will give the derivative. Bill Venables would tell you to use Horner's Rule and write the polynomial as

  coef[1]+x*(coef[2]+x*(coef[3]+x*coef[4]))

to get better speed and numerical stability, but the same trick still works.

However, you really don't need deriv() to differentiate a polynomial, so you can use ordinary lexical scope rather than substitution, for a much tidier answer:

  derivative <- function(coef){
    function(x) coef[2]+x*(2*coef[3]+x*3*coef[4])
  }

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlum...@u.washington.edu
University of Washington, Seattle
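A minimal sketch of the two approaches described above, for anyone who wants to check that they agree. The coefficient vector cf is a hypothetical stand-in for the values extracted from the fitted model (it reuses the numbers typed by hand in the question), and the names dfun, poly_call and dsym are illustrative only.

```r
# Hypothetical cubic coefficients (intercept first), standing in for coef(model1)
cf <- c(2847.22015, -463.06063, 25.43829, -0.17896)

# Lexical-scope approach: the returned function remembers 'coef'
derivative <- function(coef) {
  function(x) coef[2] + x * (2 * coef[3] + x * 3 * coef[4])
}
dfun <- derivative(cf)

# Substitution approach: bquote() splices the numeric values into the call,
# and deriv() then differentiates the resulting expression symbolically
poly_call <- bquote(.(cf[1]) + .(cf[2]) * x + .(cf[3]) * x^2 + .(cf[4]) * x^3)
dsym <- deriv(poly_call, "x", function.arg = "x")

# Compare the two derivatives at a few points
x0 <- c(0, 10, 40)
grad <- attr(dsym(x0), "gradient")[, "x"]
print(cbind(x = x0, closure = dfun(x0), deriv = grad))
```

Inside a loop over many fitted models, the closure version only needs a new coefficient vector each time, which is probably the simpler thing to automate.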
Re: [R] Fastest Way to Divide Elements of Row With Its RowSum
On Thu, 17 Sep 2009, William Revelle wrote:

At 2:40 PM +0900 9/17/09, Gundala Viswanath wrote:

I have a data frame (dat). What I want to do is, for each row, divide each element by the sum of its row. The number of rows can be large (1 million). Is there a faster way than doing it this way?

  datnorm;
  for (rw in 1:length(dat)) {
    tmp <- dat[rw,]/sum(dat[rw,])
    datnorm <- rbind(datnorm, tmp);
  }

- G.V.

  datnorm <- dat/rowSums(dat)

this will be faster if dat is a matrix rather than a data.frame.

Even if it's a data frame and he needs a data frame answer, it might be faster to do

  mat <- as.matrix(dat)
  matnorm <- mat/rowSums(mat)
  datnorm <- as.data.frame(matnorm)

The other advantage, apart from speed, of doing it with dat/rowSums(dat) rather than the loop is that he gets the right answer. The loop goes from 1 to the number of columns if dat is a data frame, and from 1 to the number of entries if dat is a matrix, not from 1 to the number of rows.

-thomas

Thomas Lumley
Assoc. Professor, Biostatistics
tlum...@u.washington.edu
University of Washington, Seattle
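A small sketch of the matrix route and of the loop-bound point made above. The data frame dat here is made-up example data (5 rows, 4 columns), not the poster's.

```r
set.seed(1)
# Made-up example data: 5 rows, 4 columns
dat <- as.data.frame(matrix(runif(20, 1, 10), nrow = 5))

# Convert once, normalise with recycling down the columns, convert back
mat <- as.matrix(dat)
matnorm <- mat / rowSums(mat)
datnorm <- as.data.frame(matnorm)

# Every row of the result should now sum to 1
print(rowSums(datnorm))

# Why 1:length(dat) is the wrong loop bound
n_cols_df  <- length(dat)   # number of columns of a data frame
n_cells    <- length(mat)   # number of entries of a matrix
n_rows     <- nrow(dat)     # what the loop actually needs
print(c(length_df = n_cols_df, length_mat = n_cells, nrow = n_rows))
```

The division works row-wise because the vector of row sums has one entry per row and is recycled in column-major order, so element [i, j] is always divided by the sum of row i.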
[R] How to generate a matrix where each row (or column) is the same vector?
Hi, I can use the following code to generate a matrix, each column of which is 'x'. But I have to specify '5' twice in the second command. I am wondering if there is a better way to do it.

  x=1:10
  matrix(rep(x,5),nc=5)
  t(matrix(rep(x,5),nc=5))

Regards,
Peng
Re: [R] How to generate a matrix where each row (or column) is the same vector?
On 09/17/2009 04:02 PM, Peng Yu wrote:

Hi, I can use the following code to generate a matrix, each column of which is 'x'. But I have to specify '5' twice in the second command. I am wondering if there is a better way to do it.

  x=1:10
  matrix(rep(x,5),nc=5)
  t(matrix(rep(x,5),nc=5))

Regards,
Peng

This works for me:

  do.call( cbind, rep( list( x ), 5 ) )
  do.call( rbind, rep( list( x ), 5 ) )

Romain

--
Romain Francois
Professional R Enthusiast
http://romainfrancois.blog.free.fr
Re: [R] How to generate a matrix where each row (or column) is the same vector?
On 09/17/2009 04:02 PM, Peng Yu wrote:

Hi, I can use the following code to generate a matrix, each column of which is 'x'. But I have to specify '5' twice in the second command. I am wondering if there is a better way to do it.

  x=1:10
  matrix(rep(x,5),nc=5)
  t(matrix(rep(x,5),nc=5))

Regards,
Peng

Or this:

  matrix(rep(x,5),nr=length(x))

Romain

--
Romain Francois
Professional R Enthusiast
http://romainfrancois.blog.free.fr
Re: [R] How to generate a matrix where each row (or column) is the same vector?
Or:

  replicate(5, 1:10)

On Thu, Sep 17, 2009 at 11:02 AM, Peng Yu <pengyu...@gmail.com> wrote:

Hi, I can use the following code to generate a matrix, each column of which is 'x'. But I have to specify '5' twice in the second command. I am wondering if there is a better way to do it.

  x=1:10
  matrix(rep(x,5),nc=5)
  t(matrix(rep(x,5),nc=5))

Regards,
Peng

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O
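For comparison, a short sketch that puts the suggestions from this thread side by side and checks that they build the same matrix. The last two variants (cols4 and rows2) are an extra option not mentioned in the thread; they rely on matrix() recycling its data argument, which it does silently when the matrix size is a multiple of the vector length.

```r
x <- 1:10
n <- 5

# Columns equal to x
cols1 <- matrix(rep(x, n), ncol = n)           # original approach
cols2 <- do.call(cbind, rep(list(x), n))       # do.call/cbind
cols3 <- replicate(n, x)                       # replicate
cols4 <- matrix(x, nrow = length(x), ncol = n) # recycling (not from the thread)

# Rows equal to x
rows1 <- do.call(rbind, rep(list(x), n))
rows2 <- matrix(x, nrow = n, ncol = length(x), byrow = TRUE)

print(dim(cols3))
print(all(cols1 == cols2), all(cols1 == cols3))
```

In the recycling variants the repeat count appears only once, which addresses the complaint about having to type '5' twice.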
[R] Why S4 method is not visible from another package?
Dear All,

maybe this is something obvious; I seem to be incapable of understanding how S4 works. In package 'A' I defined a summary method for my class:

  setMethod("summary", signature(object="ListHyperGResult"),
            function(object, pvalue=pvalueCutoff(object), categorySize=NULL) {
              whatever
            })

ListHyperGResult has a subclass, GOListHyperGResult:

  setClass("GOListHyperGResult",
           representation=representation(conditional="logical"),
           contains="ListHyperGResult",
           prototype=prototype(testname="GO"))

The summary method is exported in the NAMESPACE:

  exportMethods("summary")

Package 'B' depends on package 'A'; this is stated in the 'DESCRIPTION' file. If I call 'summary' on a 'GOListHyperGResult' in package B, then the default summary method is called instead of the correct one, despite the fact that I have

  Browse[1]> showMethods("summary")
  Function: summary (package base)
  object="AnnDbBimap"
  object="ANY"
  object="Bimap"
  object="DBIObject"
  object="HyperGResultBase"
  object="KEGGHyperGResult"
  object="LinearMResultBase"
  object="ListHyperGResult"
  object="PFAMHyperGResult"
  object="SQLiteConnection"
  object="SQLiteDriver"
  object="SQLiteResult"

  Browse[1]> class(gos[[1]])
  [1] "GOListHyperGResult"

But I still get:

  Browse[1]> is(gos[[1]], "ListHyperGResult")
  [1] TRUE
  Browse[1]> summary(gos[[1]])
  Length              Class               Mode
       1 GOListHyperGResult                 S4

What am I doing wrong?

  > sessionInfo()
  R version 2.9.0 (2009-04-17)
  x86_64-redhat-linux-gnu

  locale:
  LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

  attached base packages:
  [1] stats graphics grDevices utils datasets methods base

  other attached packages:
   [1] hgu95av2.db_2.2.12  ALL_1.4.4           ExpressionView_0.1
   [4] caTools_1.9         bitops_1.0-4.1      KEGG.db_2.2.5
   [7] GO.db_2.2.5         RSQLite_0.7-1       DBI_0.2-4
  [10] eisa_0.1            genefilter_1.24.2   Category_2.10.0
  [13] AnnotationDbi_1.6.0 Biobase_2.4.1       isa2_0.1

  loaded via a namespace (and not attached):
  [1] annotate_1.22.0 graph_1.22.2    GSEABase_1.6.0  RBGL_1.20.0
  [5] splines_2.9.0   survival_2.35-4 tools_2.9.0     XML_2.6-0
  [9] xtable_1.5-5

Thanks,
Gabor

--
Gabor Csardi <gabor.csa...@unil.ch> UNIL DGM
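For reference, a minimal single-session sketch of the behaviour the poster expects: a summary method defined for a parent class being found for an object of a subclass. The class names ParentResult and ChildResult are hypothetical stand-ins, and the sketch does not reproduce the two-package setup, so it illustrates the intended dispatch rather than diagnosing the problem.

```r
library(methods)

# Hypothetical stand-ins for ListHyperGResult and GOListHyperGResult
setClass("ParentResult", representation(testname = "character"))
setClass("ChildResult",
         representation(conditional = "logical"),
         contains = "ParentResult")

# Defining the method turns base::summary into an S4 generic in this session
setMethod("summary", signature(object = "ParentResult"),
          function(object, ...) {
            paste("ParentResult summary for", object@testname)
          })

obj <- new("ChildResult", testname = "GO", conditional = FALSE)
out <- summary(obj)   # dispatches to the inherited ParentResult method
print(is(obj, "ParentResult"))
print(out)
```

If the default summary output appears instead, as in the post, then the summary being called is presumably not the S4 generic that carries these methods; which generic is visible from inside package 'B' would be the thing to check.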
Re: [R] heatmap.2() problems with re-ordering of rows and columns
Try placing the column names into labCol and the row names into labRow, e.g.

  heatmap.2(mat, dendrogram=c("none"), Rowv=F, Colv=F, labRow=seq(-7.5,7.5,by=5), labCol=seq(-3,3,by=2))

Schalk Heunis

On Thu, Sep 17, 2009 at 3:53 PM, bioinformatics_guy <wwwhite...@gmail.com> wrote:

Schalk, that's a great function! The only question is, is it as flexible as heatmap.2? I figured out how to stop it from rearranging the rows and columns, but I can't figure out how to label the rows and columns. What I like about heatmap.2 is that it gives a grid and a histogram of the heatmap, which is nice. I'm trying

  library(gplots)
  data=read.table("filein.txt",header=FALSE)
  mat=as.matrix(data)
  heatmap.2(mat,dendrogram=c("none"),trace=c("none"),Rowv=F,Colv=F)

which works, but labels my columns V1 through V21 and my rows 1-21. I'd like them to be different, and the man page for heatmap.2 states:

  # Row/Column Labeling
  margins = c(5, 5),
  ColSideColors,
  RowSideColors,
  cexRow = 0.2 + 1/log10(nr),
  cexCol = 0.2 + 1/log10(nc),
  labRow = NULL,
  labCol = NULL,

So I'm adding cexRow=30-3(nr) (as I want it to decrement by 3 for each row), but R spits back an error saying nr is not recognized. I was looking at other help pages but couldn't find out how to label the axes the way I wanted to.

Schalk Heunis wrote:

Hi bioinformatics_guy

I think you are looking for the image function:

  image(mat)

The heatmap.2 function does hierarchical clustering on rows and columns and then orders the rows and columns according to the results of the clustering. image simply plots the matrix.

HTH
Schalk Heunis

On Thu, Sep 17, 2009 at 2:23 PM, bioinformatics_guy <wwwhite...@gmail.com> wrote:

I have a file of the following form

  -11 -10  -9  -8
  -10  -9  -8  NA
   -9  -7  NA  NA
   -8  NA  NA  NA

So basically an NxN matrix of log scores. I want to get a heatmap of these log scores, but I'm having a problem. I'm using the following code

  library(gplots)
  data=read.table("filein.txt",header=FALSE)
  mat=as.matrix(data)
  heatmap.2(mat,dendrogram=c("none"))

But in the picture, it rearranges all my rows and columns. I want the y axis to be labeled from [10,-10] and the x axis to be the same, [-10,10], so that the bottom left cell is -10,-10 and the top right cell is 10,10, which is the way the matrix is laid out. Why is it rearranging my cells?
Re: [R] comparing random forests and classification trees
Greetings tree and forest coders-

I'm interested in comparing random forests and regression tree / bagging tree models. I'd like to propose a basis for doing this, get feedback, and document this here. I kept it in this thread since that makes sense.

In this case I think it's appropriate to compare the R^2 values as one basic measure. I'm actually going to compare mean error (ME), mean absolute error (MAE), and root mean squared error (RMSE) as well. This means that I need estimates from each approach so that I can form residuals.

**As I see it, the important details are in how to set up the models so that I have comparable estimates, particularly in how the trees/forests are trained and evaluated.**

For regression/bagging trees, the typical approach for my application is 100 runs of 10-fold CV. In each run all the values are estimated in an out-of-the-bag sense; each fold is estimated while it is withheld from fitting, so the fit is not inflated. The estimates are then averaged over the 100 runs at each point to get an average simulation, and this is used to calculate residuals and the measures mentioned above. Somewhat more specifically, the steps are: I fit a model, I prune it via inspection, I loop 100 times on xpred.rpart(model, xval=10, cp=cp at bottom of cptable from pruned fit) to generate the 100 runs (bagging is thus performed while holding the cp criterion fixed?), I average these pointwise, and I calculate the desired stats/quantities for comparison to other models.

For randomForests, I would want to fit the model in a similar way, i.e. 100 runs of 10-fold CV. I think the 10-fold part is clear; the 100 runs, maybe less so. To get 10-fold OOB estimates, I set replace=FALSE, sampsize=.9*nrow(x). Then I get a randomForest with $predicted being the average OOB estimates over all trees for which each point was OOB. I would assume that each tree is constructed with a different 10-fold partitioning of the data set. Thus the number of runs is really more like the number of trees constructed. If I wanted to be really thorough, I could fit 100 random forests, get the $predicted for each, and then average these pointwise. But that seems like overkill; isn't the lesson of plot.randomForest that as the number of trees goes up, the error converges to some limit (from what I've seen)?

Thus, my primary concern is the amount of data used for training and cross-validating the model in an out-of-bag sense: can I meaningfully compare 10-fold OOB estimates using xpred.rpart to a random forest fit using 90% of the data as sampsize?

Of secondary concern is the number of bagging trees versus the number of trees in the random forest. As long as the average estimate error is nearing some limit with the number of bagging trees I'm using, I think this is all that matters. So this is more of a methodological difference to be retained, similar to differences in pruning under bagging and random forests, though I should probably specify the node sizes to be similar for each.

Am I overlooking anything of grave consequence? Any and all thoughts are welcome. If you are aware of any comparisons of rpart and randomForests in the literature for any field (for regression) of which I am ignorant, I would appreciate the tip. I have read over "Newer Classification and Regression Tree Techniques: Bagging and Random Forests for Ecological Prediction" by Prasad, Iverson, and Liaw. I may have missed it, but I did not see discussion of maintaining consistency in the way the models were trained, though it is a very nice paper overall and contained many interesting approaches and points.

Thanks in advance,
James
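The comparison step described in the post above (average the held-out predictions pointwise over the runs, form residuals, compute ME, MAE, RMSE and R^2) can be sketched in base R as follows. This is only an illustration under stated assumptions: cv_metrics is a hypothetical helper, pred_runs is assumed to be an n-by-runs matrix of held-out predictions from either method, residuals are taken as observed minus predicted, and the tiny data set is made up.

```r
# Hypothetical helper: pred_runs has one row per observation and one column
# per run (e.g. one column per call to xpred.rpart, or per forest)
cv_metrics <- function(obs, pred_runs) {
  pred <- rowMeans(pred_runs)   # pointwise average over runs
  res  <- obs - pred            # residual convention: observed - predicted
  c(ME   = mean(res),
    MAE  = mean(abs(res)),
    RMSE = sqrt(mean(res^2)),
    R2   = 1 - sum(res^2) / sum((obs - mean(obs))^2))
}

# Made-up example: 4 observations, 2 runs
obs <- c(1, 2, 3, 4)
pred_runs <- cbind(run1 = c(2, 2, 4, 4),
                   run2 = c(1, 3, 3, 5))
m <- cv_metrics(obs, pred_runs)
print(m)
```

Feeding both methods through the same function at least guarantees that the four measures are computed identically, which leaves the training setup as the only difference being compared.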
[R] What does model.matrix() return?
Hi, I don't understand the meaning of the following lines returned by model.matrix(). Can somebody help me understand them? What can they be used for?

  attr(,"assign")
  [1] 0 1 2 2
  attr(,"contrasts")
  attr(,"contrasts")$A
  [1] "contr.treatment"
  attr(,"contrasts")$B
  [1] "contr.treatment"

Regards,
Peng

  > a=2
  > b=3
  > n=4
  > A = rep(sapply(1:a,function(x){rep(x,n)}),b)
  > B = as.vector(sapply(sapply(1:b, function(x){rep(x,n)}), function(x){rep(x,a)}))
  > Y = A + B + rnorm(a*b*n)
  > fr = data.frame(Y=Y,A=as.factor(A),B=as.factor(B))
  > afit=aov(Y ~ A + B,fr)
  > model.matrix(afit)
     (Intercept) A2 B2 B3
  1            1  0  0  0
  2            1  0  0  0
  3            1  0  0  0
  4            1  0  0  0
  5            1  1  0  0
  6            1  1  0  0
  7            1  1  0  0
  8            1  1  0  0
  9            1  0  1  0
  10           1  0  1  0
  11           1  0  1  0
  12           1  0  1  0
  13           1  1  1  0
  14           1  1  1  0
  15           1  1  1  0
  16           1  1  1  0
  17           1  0  0  1
  18           1  0  0  1
  19           1  0  0  1
  20           1  0  0  1
  21           1  1  0  1
  22           1  1  0  1
  23           1  1  0  1
  24           1  1  0  1
  attr(,"assign")
  [1] 0 1 2 2
  attr(,"contrasts")
  attr(,"contrasts")$A
  [1] "contr.treatment"
  attr(,"contrasts")$B
  [1] "contr.treatment"
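As far as I understand the two attributes: "assign" records, for each column of the design matrix, which term of the formula produced it (0 for the intercept, then 1, 2, ... in formula order), and "contrasts" records which contrast coding was used for each factor. The sketch below rebuilds an equivalent design with rep() instead of the nested sapply() calls and maps columns back to terms; the names mm, assign_idx and col_terms are illustrative only.

```r
# Same 2 x 3 layout with 4 replicates as in the question
fr <- data.frame(
  A = factor(rep(rep(1:2, each = 4), times = 3)),
  B = factor(rep(1:3, each = 8))
)
set.seed(42)
fr$Y <- as.integer(fr$A) + as.integer(fr$B) + rnorm(nrow(fr))

mm <- model.matrix(Y ~ A + B, data = fr)

# One entry per column: 0 = intercept, 1 = first term (A), 2 = second term (B)
assign_idx <- attr(mm, "assign")

# Map each column name to the term it belongs to
term_names <- c("(Intercept)", attr(terms(Y ~ A + B), "term.labels"))
col_terms  <- setNames(term_names[assign_idx + 1], colnames(mm))
print(col_terms)

# The contrast coding used for each factor
print(attr(mm, "contrasts"))
```

One use of "assign" is to group coefficients by term, for instance to pick out all columns belonging to B (here B2 and B3) when testing that factor as a whole.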
Re: [R] comparing random forests and classification trees
The validate.rpart function in the rms package will handle the rpart part of this. It makes sure that the tree is re-built from scratch for each re-sample. It estimates MSE and Somers' Dxy (twice (ROC area -.5)).

Frank

--
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
[R] How to do a PCA with ordered variables?
Hi, I want to do a pca using a set of variables which are ordered. I have used the dudi.mix method from the ade4 package, but when I do the $index it shows me that R has considered my variables as quantitative. What should I do?
[R] SVM
Hello,

I have 12 samples, and each sample has 1000 observations, i.e. I have a matrix X with 1000 rows and 12 columns!

  m <- svm(t(X))
  p <- predict(m)

Can anyone tell me how to use svmtrain() in R?

Many thanks,
Samuel
[R] hint required code in Tinn-R
Dear Tinn-R users, I've a basic question: I don't understand how to configure Tinn-R to hint about the required elements in a certain code, e.g. when typing mean( a box would pop-up just above the code informing that (x, trim = 0, na.rm = FALSE, ...) is the required info which must be given... Anyone who could help me with finding this function? Best Regards, Jonas
[R] How to separate a function by 2 probabilities
Good Morning,

I have a function to generate a matrix; I show part of it here:

  g[j,i] <- if (gen[j,i]==0) al1[i,1]+al1[i,1] else ...

However, I would like this function to be applied with probability P, and another function (another formula to generate the g matrix) with probability 1-P. That is, if P is .7, I would like the matrix g to be generated according to the formula above 70% of the time (for random i and j), and 30% of the time with a different formula, which I did not write out.

Could anyone help me?

Thank you very much
Márcio
[R] Problems with the commands FUNCTION and DERIV to build a polynomial
Hi all, I need to automate a process in order to prepare a big loop in the future, but I have a problem with the *function* command. First I fit a model with lm:

  model1 <- lm(data2[,2] ~ data2[,1] + I(data2[,1]^2) + I(data2[,1]^3) + I(data2[,1]^4))

I extract the coefficients to build the polynomial:

  coef <- as.matrix(model1$coefficients)

In the next step I need to define the polynomial in order to differentiate it. If I write the coefficients manually (typing the numbers by hand), the deriv command works fine:

  bb <- deriv(~ 2847.22015 - 463.06063*x + 25.43829*x^2 - 0.17896*x^3, namevec="x", function.arg=40)

I would like to automate this step by extracting the coefficients from the linear model and inserting them into the polynomial, rather than writing them by hand. But if I build the polynomial with function(x), referring to the *coef* values, the numeric values are not substituted; the function does not pick up the coefficients from the linear model:

  fun <- function(x) coef[1]+coef[2]*x+coef[3]*x^2+coef[4]*x^3
  fun
  function(x) coef[1]+coef[2]*x+coef[3]*x^2+coef[4]*x^3

How can I avoid writing the values of the coefficients by hand? I need to do this many, many times, which is why I need to automate the process and then build a loop to repeat it with different outputs of a linear model. Can somebody help me?

--
Noela
Grupo de Recursos Marinos y Pesquerías
Universidad de A Coruña
[R] Simple as.Date question dealing with a timezone offset
I've been trying to understand the as.Date functionality, and I have a date and time stamp field that looks like this:

  Tue Sep 15 09:22:09 -0600 2009

and I need to turn it into an R Date object for analysis. Simple date conversions I have down, no problem:

  > adate = c("7/30/1959")
  > as.Date(adate,"%m/%d/%Y")
  [1] "1959-07-30"

But when it comes to the type of date/time string format I have above, I can't figure out a format string that will work. The timezone offset is one part that causes problems. Building up to a working format string for the full time stamp string, I can make it as far as:

  > adate = c("Tue Sep 15 09:22:09 -0600 2009")
  > as.Date(adate,format="%a %b %d %H:%M:%S")
  [1] "2009-09-15"

(apparently the year defaults to the current year when it's not specified in the format string). Because the year comes after the timezone offset, I have to deal with the timezone offset in the format string. But when I get to the timezone offset value, I can't use %z or %Z because those are output only:

  > as.Date(adate,format="%a %b %d %H:%M:%S %z %Y")
  [1] NA

I'm close, but can't incorporate the timezone offset field in the date/time stamp string. What am I missing? I suppose one workaround is to split the date/time string into its component parts and reassemble it into a string as.Date can deal with, but that seems to defeat one of the purposes of as.Date's format capability.

Any advice for how to translate a "Tue Sep 15 09:22:09 -0600 2009" into an R Date object?

Landon
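Two possible ways around this, sketched under some assumptions: the session uses an English locale (so that "Tue" and "Sep" are recognised by %a and %b), and the R version in use accepts %z on input in strptime(), which current versions document as supported but the version used in the post apparently did not. The second option is the split-and-reassemble workaround mentioned above, done with a regular expression.

```r
stamp <- "Tue Sep 15 09:22:09 -0600 2009"

# Option 1: let strptime() read the numeric offset with %z
# (assumes a version of R that accepts %z on input)
lt <- strptime(stamp, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")
d1 <- as.Date(lt)   # date of the instant expressed in UTC

# Option 2: drop the "-0600" token first, then parse what is left
no_offset <- sub(" [+-][0-9]{4} ", " ", stamp)
d2 <- as.Date(no_offset, format = "%a %b %d %H:%M:%S %Y")  # wall-clock date

print(format(lt, "%Y-%m-%d %H:%M:%S", tz = "UTC"))
print(c(format(d1), format(d2)))
```

The two options answer slightly different questions: option 1 gives the calendar date in UTC, option 2 the calendar date as written in the stamp, and these can differ for times close to midnight.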
Re: [R] SVM
Hi,

On Sep 17, 2009, at 7:39 AM, Samuel Okoye wrote:

Hello, I have 12 sample each sample has got 1000 observation, i.e I have a matrix X with 1000 rows and 12 columns!

  m <- svm(t(X))
  p <- predict(m)

Can anyone tell me how to use svmtrain() in R!

I guess you're using the svm in the e1071 package? What's svmtrain?

The call to svm trains the svm. The call to predict uses it on new data, but you need to give it new data to predict on. You have:

  p <- predict(m)

What exactly do you want your model to do? Predict on what?

Please see the code in the Examples section of ?svm; it's pretty straightforward. Let us know what problems you're having understanding those examples and we can try to offer some insight.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact
[R] stableFit
A quick question about stableFit() in the fBasics package. Is it possible to constrain the gamma and delta parameters and only estimate the alpha and beta parameters? I tried:

  ##
  set.seed(1953)
  r = rstable(n = 1000, alpha = 1.9, beta = 0.3)
  stableFit(r, gamma=1, delta=0, type=c("q", "mle"), doplot=TRUE, trace=TRUE)
  ##

but that seems to estimate gamma and delta as well.

Thanks
[R] r-inferno.pdf with detailed table of contents and bookmarks
Hi, I can't find an r-inferno.pdf that has a detailed table of contents and bookmarks. If it is possible, can somebody help generate one and post it online?

Regards,
Peng
Re: [R] How to separate a function by 2 probabilities
Marcio

Define two functions, e.g.

  f1 <- function(i,j) i+j
  f2 <- function(i,j) i-j

then call them based on the probability, e.g. 0.7:

  f <- if(runif(1) < 0.7) f1 else f2
  f(1,1)

or, more compactly:

  (if(runif(1) < 0.7) f1 else f2)(1,1)

HTH
Schalk Heunis

On Thu, Sep 17, 2009 at 5:10 PM, Marcio Resende <mresende...@yahoo.com.br> wrote:

Good Morning,

I have a function to generate a matrix; I show part of it here:

  g[j,i] <- if (gen[j,i]==0) al1[i,1]+al1[i,1] else ...

However, I would like this function to be applied with probability P, and another function (another formula to generate the g matrix) with probability 1-P. That is, if P is .7, I would like the matrix g to be generated according to the formula above 70% of the time (for random i and j), and 30% of the time with a different formula, which I did not write out.

Could anyone help me?

Thank you very much
Márcio
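A quick sanity check of the suggestion above, as a sketch: the helper pick() is a hypothetical wrapper around the if/else, and f1 and f2 are the same toy functions, chosen so that their results at (1, 1) differ and the split can be counted.

```r
f1 <- function(i, j) i + j
f2 <- function(i, j) i - j

# Hypothetical wrapper: returns f1 with probability p, otherwise f2
pick <- function(p) if (runif(1) < p) f1 else f2

set.seed(1)
# f1(1, 1) is 2 and f2(1, 1) is 0, so the share of 2s estimates p
draws <- replicate(10000, pick(0.7)(1, 1))
share_f1 <- mean(draws == 2)
print(share_f1)
```

Calling pick() once per cell inside the loop over i and j gives an independent choice for every element of g.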
[R] lpSolve constraints don't seem to have an effect
Dear R users, I would like to optimize a linear approximation of a quadratic function using lpSolve. My code runs without any error or warning message but the constraints that I set don't seem to work properly. Nevertheless, I am certain that my code is somewhere wrong. I would like to solve the following problem: max 2x-x^2+y subject to 2x^2 + 3y^2 = 6 2= x,y = 0 I would split the [0,2] interval into 8 equal parts by defining k=9 points (0 included) -lambdas in the following code- and observe the value of the objective function and the constraint on these points and approximate it by the following program: (Sorry for the Latex type equations) max Sum_{j=1}^{2}Sum_{k=0}^{8}f_{kj}lambda_{kj} subject to: Sum_{j=1}^{2}Sum_{k=0}^{8}g_{kj}lambda_{kj}+x2 = 6 Sum_{k=0}^{8}lambda_{k1} = 1 Sum_{k=0}^{8}lambda_{k2} = 1 lambda_{k1} = 0 lambda_{k2} = 0 x2 = 0 library(lpSolve) # Objective function and constraint f1 - function(x) 2*x-x^2 f2 - function(y) y g1 - function(x) 2*x^2 g2 - function(y) 3*y^2 # Setting objective function and constraint values G1 - g1(seq(0,2,by=0.25)) G2 - g2(seq(0,2,by=0.25)) F1 - f1(seq(0,2,by=0.25)) F2 - f2(seq(0,2,by=0.25)) con - c(G1,G2,1) # slack variable included obj - c(F1,F2) # Defining the constraints lambdasx2 - diag(1,ncol=19,nrow=19) lambda1x2 - c(rep(1,times=9),rep(0,times=9),0) lambda2x2 - c(rep(0,times=9),rep(1,times=9),0) lambdamatrix - rbind(lambda1x2,lambda2x2,lambdasx2) conlamb - rbind(con,lambdamatrix) # Defining the right-hand side rightside - c(6,1,1,rep(0,times=19)) # Defining the directions direc - c(=,=,=,rep(=,times=19)) # LP lp(max,obj,conlamb,direc,rightside)$solution # But I receive [1] 0.00 1.00 5.75 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 # [12] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 # However, in the constraints the sum of the first 8 values should be 1 and the second 8 values should be # 1 Thank you very much in advance, Adam -- View this message in context: 
http://www.nabble.com/lpSolve-constraints-don%27t-seem-to-have-an-effect-tp25491959p25491959.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
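One thing that may be worth checking in the code as posted is whether the objective vector and the constraint matrix describe the same number of variables. The sketch below rebuilds both in base R and compares their sizes; the names obj_posted and obj_fixed are made up for illustration, and giving the slack variable x2 an objective coefficient of 0 is an assumption about what was intended, not a confirmed diagnosis.

```r
# Grid of 9 points on [0, 2], as in the post
grid <- seq(0, 2, by = 0.25)
f1 <- function(x) 2 * x - x^2
f2 <- function(y) y
g1 <- function(x) 2 * x^2
g2 <- function(y) 3 * y^2

obj_posted <- c(f1(grid), f2(grid))          # 18 coefficients, as posted
con <- c(g1(grid), g2(grid), 1)              # 19 coefficients (slack included)
conlamb <- rbind(con,
                 c(rep(1, 9), rep(0, 9), 0), # lambdas for x sum to 1
                 c(rep(0, 9), rep(1, 9), 0), # lambdas for y sum to 1
                 diag(1, 19))                # non-negativity rows

# The mismatch: 18 objective coefficients vs. 19 constraint columns
length(obj_posted) == ncol(conlamb)          # FALSE

# Assumption: the slack x2 contributes nothing to the objective
obj_fixed <- c(obj_posted, 0)
length(obj_fixed) == ncol(conlamb)           # TRUE

# With lpSolve installed, the call would then look like this (not run here):
# direc <- c("=", "=", "=", rep(">=", 19))
# rhs   <- c(6, 1, 1, rep(0, 19))
# lpSolve::lp("max", obj_fixed, conlamb, direc, rhs)$solution
```

As far as I recall, lp() already treats every variable as non-negative, so the 19 identity rows are probably redundant, though harmless.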
Re: [R] How to separate a function by 2 probabilities
Assuming storage is not a problem, first generate two matrices, one by each method; call these A and B. Then, if dim(A) = dim(B) = c(m,n) and k = m*n:

z <- rbinom(k, 1, .7)
result <- A*z + B*(1-z)

Bert Gunter
Genentech Nonclinical Biostatistics

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Marcio Resende
Sent: Thursday, September 17, 2009 8:11 AM
To: r-help@r-project.org
Subject: [R] How to separate a function by 2 probabilities

Good morning, I have a function to generate a matrix; part of it is shown here:

g[j,i] <- if (gen[j,i]==0) al1[i,1]+al1[i,1] else ...

However, I would like this formula to be applied with probability P, and another formula (a different way to generate the g matrix) with probability 1-P. That is, if P is .7, I would like the matrix g to be generated according to the formula above 70% of the time (for random i and j) and according to a different formula, which I did not write here, 30% of the time. Could anyone help me? Thank you very much
Márcio

-- View this message in context: http://www.nabble.com/How-to-separate-a-function-by-2-probabilities-tp25491943p25491943.html Sent from the R help mailing list archive at Nabble.com.
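A minimal sketch of this per-cell mixing, with constant matrices standing in for the two formulas (the dimensions, the seed and the names result2 and p are invented for illustration):

```r
set.seed(42)                      # only so the sketch is reproducible
m <- 4; n <- 5; p <- 0.7          # hypothetical dimensions and probability
A <- matrix(1, m, n)              # stand-in for the matrix from formula 1
B <- matrix(2, m, n)              # stand-in for the matrix from formula 2

# One Bernoulli(p) draw per cell: 1 selects A, 0 selects B
z <- matrix(rbinom(m * n, 1, p), m, n)
result <- A * z + B * (1 - z)

# Equivalent selection without arithmetic
result2 <- ifelse(z == 1, A, B)
```

The ifelse() form may be preferable when A or B can contain NA or Inf, since multiplying those by zero does not give zero.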
[R] QQ plotting of various distributions...
Hello! I am trying this question again: I would like to test a few distributional assumptions for some behavioral response data. There are several theories about the true distribution of such data: normal, lognormal, gamma, ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian), etc. The best way would be via QQ plots, to show students the differences. The first two are trivial:

qqnorm(dat$X)
qqnorm(log(dat$X))

Then things get hairier. I am not sure how to make plots for the rest. I tried gamma with:

qqmath(~ X, data=dat, distribution=function(X) qgamma(X, shape, scale))

which should be the same as:

plot(qgamma(ppoints(dat$X), shape, scale), sort(dat$X))

I got the shape and scale parameters via the mhsmm package, which has gammafit() for estimating them. Am I on the right track? Does anyone know how to plot the rest: ex-Gaussian (exponential-Gaussian), Wald (inverse Gaussian)?

Thanks, PM

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
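One detail in the gamma attempt may matter: the third positional argument of qgamma() is rate, not scale, so qgamma(p, shape, scale) would treat the fitted scale as a rate. A small sketch with simulated data standing in for dat$X (the parameter values and the names theo and theo_rate are made up):

```r
set.seed(1)
shape <- 2; scale <- 3                            # hypothetical fitted parameters
x <- rgamma(200, shape = shape, scale = scale)    # stand-in for dat$X

p <- ppoints(length(x))                           # plotting positions
theo <- qgamma(p, shape = shape, scale = scale)   # name 'scale' explicitly

# Positional call: the third argument is matched to 'rate'
theo_rate <- qgamma(p, shape, scale)

# Draw the QQ plot only in an interactive session
if (interactive()) {
  plot(theo, sort(x), xlab = "Gamma quantiles", ylab = "Sample quantiles")
  abline(0, 1)
}
```

The same pattern should carry over to any distribution for which a quantile function is available: evaluate it at ppoints(n) and plot against the sorted sample.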
Re: [R] How to do a PCA with ordered variables?
P. Branco wrote: I have used the dudi.mix method from the ade4 package, but when I do the $index it shows me that R has considered my variables as quantitative. What should I do? You should make sure that they are encoded as ordered factors, which has nothing to do with ade4's dudi.mix(). ## ?ordered Mark. P.Branco wrote: Hi, I want to do a pca using a set of variables which are ordered. I have used the dudi.mix method from the ade4 package, but when I do the $index it shows me that R has considered my variables as quantitative. What should I do? -- View this message in context: http://www.nabble.com/How-to-do-a-PCA-with-ordered-variables--tp25491950p25491969.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
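For reference, a small base R sketch of the re-encoding Mark points to via ?ordered. The column name and level labels are invented; whether dudi.mix then reports the column as ordered should be checked against its own documentation.

```r
# Hypothetical column stored as plain numbers
df <- data.frame(score = c(1, 3, 2, 3, 1))
is.ordered(df$score)                       # FALSE: treated as quantitative

# Re-encode as an ordered factor with an explicit level order
df$score <- factor(df$score, levels = 1:3,
                   labels = c("low", "mid", "high"), ordered = TRUE)
is.ordered(df$score)                       # TRUE
df$score[1] < df$score[2]                  # TRUE: "low" < "high"
```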
[R] Odp: boxplot
Hi

I do not know rma, but according to its help page boxplot requires as input a formula, a list (only a list of numeric vectors), a data frame or a numeric vector. I am not sure whether your object is one of these. If not, you need to convert it to an object that boxplot accepts.

Regards
Petr

r-help-boun...@r-project.org wrote on 17.09.2009 08:29:43:

Hi, I am not able to plot normalized data (normalization by rma) using boxplot. I don't know why. Basically, the object (formed of normalized data) belongs to the ExpressionSet class. It shows this error:

Error in x[!xna] : object of type 'S4' is not subsettable
In addition: Warning messages:
1: In is.na(x) : is.na() applied to non-(list or vector) of type 'S4'
2: In is.na(x) : is.na() applied to non-(list or vector) of type 'S4'

Now, how do I plot it? Should I use another function?

By Sukhbir Singh Rattan
[R] Dealing with heterogeneity with varComb weights
Hi, I am trying to add multiple variance structures such as the first example below:

vf1 <- varComb(varIdent(form = ~1|Sex), varPower())

However, my code below will not work. Can anybody please advise me?

VFcomb <- varComb(varExp(form=~depcptwithextybf),varFixed(form=~FebNAO))

Also, if you have two variables with the same weights function, would you write that as:

VFcomb <- varComb(varExp(form=~depcptwithextybf),varExp(form=~FebNAO))

thanks
Rebecca

-- View this message in context: http://www.nabble.com/Dealing-with-heterogeneity-with-varComb-weights-tp25491971p25491971.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] Why S4 method is not visible from another package?
Gábor Csárdi wrote:

Dear All, maybe this is something obvious, I seem to be incapable of understanding how S4 works. So, in package 'A' I defined a summary method for my class:

setMethod("summary", signature(object="ListHyperGResult"), function(object, pvalue=pvalueCutoff(object), categorySize=NULL) { whatever })

ListHyperGResult has a subclass, GOListHyperGResult:

setClass("GOListHyperGResult", representation=representation(conditional="logical"), contains="ListHyperGResult", prototype=prototype(testname="GO"))

The summary method is exported in the NAMESPACE:

exportMethods(summary)

Package 'B' depends on package 'A', this is stated in the 'DESCRIPTION' file. If I call 'summary' on a 'GOListHyperGResult' in

Hi Gabor

It is not S4 alone, but S4 + name spaces that are giving you problems. You probably want Imports: A in the DESCRIPTION file rather than Depends, and importFrom(A, summary) in the NAMESPACE. As it stands, inside the B name space, you find base::summary, whereas you've defined a method on summary that has been promoted to a generic in one of the packages that A imports (probably AnnotationDbi). This is a little bit of a guess; at some level it might seem more appropriate to use Imports: AnnotationDbi and importFrom(AnnotationDbi, summary) (or wherever the generic for summary that you are trying to use is created).

Martin

package B, then the default summary method is called instead of the correct one, despite that I have

Browse[1]> showMethods(summary)
Function: summary (package base)
object="AnnDbBimap"
object="ANY"
object="Bimap"
object="DBIObject"
object="HyperGResultBase"
object="KEGGHyperGResult"
object="LinearMResultBase"
object="ListHyperGResult"
object="PFAMHyperGResult"
object="SQLiteConnection"
object="SQLiteDriver"
object="SQLiteResult"

Browse[1]> class(gos[[1]])
[1] "GOListHyperGResult"

But I still get:

Browse[1]> is(gos[[1]], "ListHyperGResult")
[1] TRUE
Browse[1]> summary(gos[[1]])
Length Class Mode
1 GOListHyperGResult S4

What am I doing wrong?
sessionInfo() R version 2.9.0 (2009-04-17) x86_64-redhat-linux-gnu locale: LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] hgu95av2.db_2.2.12 ALL_1.4.4 ExpressionView_0.1 [4] caTools_1.9 bitops_1.0-4.1 KEGG.db_2.2.5 [7] GO.db_2.2.5 RSQLite_0.7-1 DBI_0.2-4 [10] eisa_0.1genefilter_1.24.2 Category_2.10.0 [13] AnnotationDbi_1.6.0 Biobase_2.4.1 isa2_0.1 loaded via a namespace (and not attached): [1] annotate_1.22.0 graph_1.22.2GSEABase_1.6.0 RBGL_1.20.0 [5] splines_2.9.0 survival_2.35-4 tools_2.9.0 XML_2.6-0 [9] xtable_1.5-5 Thanks, Gabor __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odp: What is the best way to get a subset of a data.frame?
Hi

r-help-boun...@r-project.org wrote on 17.09.2009 04:14:24:

Hi, I want to construct a data.frame 'y' by using x$x and x$y. I think that there might be better ways to do it (because, for example, we can use a_matrix[3:5,] to extract certain rows, where 'a_matrix' is a matrix). Can somebody let me know what the best way is?

a=data.frame(x=1:10,y=rep("abc",10),z=rep("xyz",10))
a
    x   y   z
1   1 abc xyz
2   2 abc xyz
3   3 abc xyz
4   4 abc xyz
5   5 abc xyz
6   6 abc xyz
7   7 abc xyz
8   8 abc xyz
9   9 abc xyz
10 10 abc xyz

b=data.frame(x=a$x, y=a$y)
b

what is wrong on b[1:2,]

Regards
Petr

    x   y
1   1 abc
2   2 abc
3   3 abc
4   4 abc
5   5 abc
6   6 abc
7   7 abc
8   8 abc
9   9 abc
10 10 abc

Regards, Peng
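For the column selection Peng describes, a few base R alternatives can be sketched as follows; the names b1 to b4 are invented, and which of these counts as "best" is a matter of taste.

```r
a <- data.frame(x = 1:10, y = rep("abc", 10), z = rep("xyz", 10))

b1 <- data.frame(x = a$x, y = a$y)     # construction from the post
b2 <- a[, c("x", "y")]                 # select columns by name
b3 <- a[1:2]                           # select the first two columns by position
b4 <- subset(a, select = c(x, y))      # same selection via subset()
```

By analogy with a_matrix[3:5,], rows are selected with a[3:5, ] and columns with a[, 1:2].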
[R] Help with date specification
Hi everyone, I have daily data (x) for 10 years, from 04-01-1995 to 03-31-2005. I was able to get the yearly sums for the ten years using aggregate(x, years, sum). But this gave me the yearly sums for 1995 (Apr-Dec), 1996 (Jan-Dec), ..., 2005 (Jan-Mar). I want to get the aggregates for Apr 1995 to Mar 1996, Apr 1996 to Mar 1997, and so on. Your help will be highly appreciated. Thanks in advance

-- Acharya, Subodh
Re: [R] Which one shall I use? '=' or '<-'?
I wrote a post on the Revolutions blog a little while back discussing the difference between the two (and the history of the <- operator). You can find it on blog.revolution-computing.com at: http://bit.ly/3YWw3R

# David Smith

On Wed, Sep 16, 2009 at 7:58 PM, Peng Yu pengyu...@gmail.com wrote:

Hi, I was told to use <- instead of = in the mailing list. I am wondering what the difference between them is?

Regards, Peng

-- David M Smith da...@revolution-computing.com Director of Community, REvolution Computing www.revolution-computing.com Tel: +1 (206) 577-4778 x3203 (San Francisco, USA) Check out our upcoming events schedule at www.revolution-computing.com/events
Re: [R] Help with date specification
Subodh

Assuming the data is ordered by date, you can define

fin.years = (0:(10*12-1)) %/% 12

then use aggregate:

aggregate(x, list(fin.years), sum)

HTH
Schalk Heunis

On Thu, Sep 17, 2009 at 6:11 PM, Subodh Acharya shoeb...@gmail.com wrote:

Hi everyone, I have daily data (x) for 10 years, from 04-01-1995 to 03-31-2005. I was able to get the yearly sums for the ten years using aggregate(x, years, sum). But this gave me the yearly sums for 1995 (Apr-Dec), 1996 (Jan-Dec), ..., 2005 (Jan-Mar). I want to get the aggregates for Apr 1995 to Mar 1996, Apr 1996 to Mar 1997, and so on. Your help will be highly appreciated. Thanks in advance

-- Acharya, Subodh
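Note that the index (0:(10*12-1)) %/% 12 has 120 entries, i.e. one per month. For daily data, the April-to-March year can instead be derived from the dates themselves. A base R sketch, assuming the values sit alongside a Date vector (the dates and x below are simulated stand-ins):

```r
# Hypothetical daily series covering the posted date range
dates <- seq(as.Date("1995-04-01"), as.Date("2005-03-31"), by = "day")
x <- rep(1, length(dates))                 # stand-in values: one unit per day

# Label each day by the calendar year in which its Apr-Mar year starts
yr <- as.integer(format(dates, "%Y"))
mo <- as.integer(format(dates, "%m"))
fin.year <- ifelse(mo >= 4, yr, yr - 1L)

totals <- tapply(x, fin.year, sum)
totals
```

With one unit per day, each total is simply the number of days in that April-to-March year, which makes the grouping easy to verify by eye.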
Re: [R] r-inferno.pdf with detailed table of contents and bookmarks
Hey, check this link: http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

Greetings
Víctor

From: r-help-boun...@r-project.org on behalf of Peng Yu
Sent: Thu 17/09/2009 10:43
To: r-h...@stat.math.ethz.ch
Subject: [R] r-inferno.pdf with detailed table of contents and bookmarks

Hi, I can't find an r-inferno.pdf that has a detailed table of contents and bookmarks. If it is possible, can somebody help generate one and post it online?

Regards, Peng
Re: [R] Dealing with heterogeneity with varComb weights
RS27 wrote:

Hi, I am trying to add multiple variance structures such as the first example below:

vf1 <- varComb(varIdent(form = ~1|Sex), varPower())

However, my code below will not work. Can anybody please advise me?

VFcomb <- varComb(varExp(form=~depcptwithextybf),varFixed(form=~FebNAO))

varFixed won't work if FebNAO has values equal to 0. In fact, I wouldn't use varFixed at all. A bit more info on the error message would be handy...

Alain

Also, if you have two variables with the same weights function, would you write that as:

VFcomb <- varComb(varExp(form=~depcptwithextybf),varExp(form=~FebNAO))

thanks
Rebecca

- Dr. Alain F. Zuur
First author of:
1. Analysing Ecological Data (2007). Zuur, AF, Ieno, EN and Smith, GM. Springer. 680 p.
2. Mixed effects models and extensions in ecology with R. (2009). Zuur, AF, Ieno, EN, Walker, N, Saveliev, AA, and Smith, GM. Springer.
3. A Beginner's Guide to R (2009). Zuur, AF, Ieno, EN, Meesters, EHWG. Springer
Statistical consultancy, courses, data analysis and software
Highland Statistics Ltd. 6 Laverock road UK - AB41 6FN Newburgh
Email: highs...@highstat.com URL: www.highstat.com

-- View this message in context: http://www.nabble.com/Dealing-with-heterogeneity-with-varComb-weights-tp25491971p25491989.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] How to colour the tip labels in a phylogenetic tree
Graham Etherington wrote:

Hi, Using ape, I have constructed an object of class phylo, using the method 'nj' (let's call the object 'tree_ja'). I also have a given subset of 'tree_ja' in a vector (let's call the vector 'subspecies'). What I want to do is construct an nj tree, plot(tree_ja), but have the species in the vector 'subspecies' shown in red at the tips of the tree. The closest I've come is this: given that 'tree_ja$tip.label' provides the following:

[1] "1_T1" "2_T1" "3_T1" "4_T1" "5_T1" "6_T1"
[7] "7_T1" "8_T1" "9_T1" "10_T1" "11_T1" "12_T1"

and that my 'subspecies' vector is:

subspecies <- c("1_T1", "2_T1", "3_T1", "4_T1", "6_T1")

which can also be written as:

subspecies <- c(tree_ja$tip.label[1:4], tree_ja$tip.label[5])

I can construct a method which gives me the following statement:

plot(tree_ja, tip.col = c('red', 'red', 'red', 'red', 'black', 'red', 'black', 'black', 'black', 'black', 'black', 'black'))

But this doesn't work (at least not on my full dataset, which has 118 tips, reduced to 12 here for brevity) and I'm SURE there must be a better way of doing it. Could anyone help me with this?

Many thanks, Graham

-- Dr. Graham Etherington Post-doctoral Bioinformatician, Department of Computational and Systems Biology John Innes Centre Norwich Research Park Colney Norwich NR4 7UH UK

In general the r-sig-phylo group is better for this kind of question, and it would be better to give us a reproducible example, but here's an example that (I think) does what you want:

library(ape)
set.seed(1001)
z = rcoal(10)
z

Phylogenetic tree with 10 tips and 9 internal nodes.
Tip labels: t1, t7, t6, t10, t3, t9, ...
Rooted; includes branch lengths.

names(z)
[1] "edge" "edge.length" "tip.label" "Nnode"

ss <- z$tip.label[c(1,3,5,7)]
?plot.phylo
plot(z, tip.color=ifelse(z$tip.label %in% ss, "red", "black"))

-- View this message in context: http://www.nabble.com/How-to-colour-the-tip-labels-in-a-phylogenetic-tree-tp25490805p25491995.html Sent from the R help mailing list archive at Nabble.com.
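The colour vector in that answer can be built and inspected without ape at all, using the tip labels from the original question. In this sketch tip.label is a plain character vector standing in for tree_ja$tip.label:

```r
tip.label  <- paste0(1:12, "_T1")                        # labels as in the post
subspecies <- c("1_T1", "2_T1", "3_T1", "4_T1", "6_T1")

# One colour per tip, in tip.label order
tip.col <- ifelse(tip.label %in% subspecies, "red", "black")
tip.col

# With ape loaded and a real tree: plot(tree_ja, tip.color = tip.col)
```

Because the colours are matched by label rather than typed out by position, the same two lines should scale to the 118-tip tree.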
Re: [R] Help with date specification
Try either of these, which convert to year and quarter and then shift the quarter by one. The resulting series is labeled by the year in which the series dates start. Use as.integer(...) + 1 if you prefer to label by ending year.

library(zoo)

# test data
DF <- data.frame(date = Sys.Date() + 1:1000, value = 1:1000)

# 1. Using a data frame --
# uses as.yearqtr from zoo but no zoo objects
tapply(DF$value, as.integer(as.yearqtr(DF$date) - 0.25), sum)

# 2. or converting the time series to zoo and then aggregating
z <- zoo(DF$value, as.Date(DF$date))
aggregate(z, as.integer(as.yearqtr(time(z)) - 0.25), sum)

In future, please provide some of your data using dput (see ?dput), and your code if relevant, to make it easier to answer.

On Thu, Sep 17, 2009 at 12:11 PM, Subodh Acharya shoeb...@gmail.com wrote:

Hi everyone, I have daily data (x) for 10 years, from 04-01-1995 to 03-31-2005. I was able to get the yearly sums for the ten years using aggregate(x, years, sum). But this gave me the yearly sums for 1995 (Apr-Dec), 1996 (Jan-Dec), ..., 2005 (Jan-Mar). I want to get the aggregates for Apr 1995 to Mar 1996, Apr 1996 to Mar 1997, and so on. Your help will be highly appreciated. Thanks in advance

-- Acharya, Subodh
Re: [R] Help with date specification
On Thu, Sep 17, 2009 at 9:11 AM, Subodh Acharya shoeb...@gmail.com wrote:

Hi everyone, I have daily data (x) for 10 years, from 04-01-1995 to 03-31-2005. I was able to get the yearly sums for the ten years using aggregate(x, years, sum). But this gave me the yearly sums for 1995 (Apr-Dec), 1996 (Jan-Dec), ..., 2005 (Jan-Mar). I want to get the aggregates for Apr 1995 to Mar 1996, Apr 1996 to Mar 1997, and so on. Your help will be highly appreciated. Thanks in advance

-- Acharya, Subodh

subset(x, date >= Apr-96 & date <= Mar) then do the sum?

- Mark
Re: [R] Fastest Way to Divide Elements of Row With Its RowSum
-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Thomas Lumley
Sent: Thursday, September 17, 2009 6:59 AM
To: William Revelle
Cc: r-h...@stat.math.ethz.ch
Subject: Re: [R] Fastest Way to Divide Elements of Row With Its RowSum

On Thu, 17 Sep 2009, William Revelle wrote:

At 2:40 PM +0900 9/17/09, Gundala Viswanath wrote:

I have a data frame (dat). What I want to do is, for each row, divide each element by the sum of its row. The number of rows can be large (> 1 million). Is there a faster way than doing it this way?

datnorm;
for (rw in 1:length(dat)) {
  tmp <- dat[rw,]/sum(dat[rw,])
  datnorm <- rbind(datnorm, tmp);
}

- G.V.

datnorm <- dat/rowSums(dat)

this will be faster if dat is a matrix rather than a data.frame.

Even if it's a data frame and he needs a data frame answer it might be faster to do

mat <- as.matrix(dat)
matnorm <- mat/rowSums(mat)
datnorm <- as.data.frame(matnorm)

If the data.frame has many more rows than columns and the number of rows is large (e.g., dimensions 10^6 x 20) you may find that you run out of space converting it to a matrix. You can use much less space by looping over the columns, both to compute the row sums and to do the division. E.g., the following should require only 1 (maybe 2) column's worth of scratch space:

f2 <- function(x){
  stopifnot(is.data.frame(x), ncol(x) >= 1)
  rowsum <- x[[1]]
  if (ncol(x) > 1) for (i in 2:ncol(x)) rowsum <- rowsum + x[[i]]
  for (i in 1:ncol(x)) x[[i]] <- x[[i]] / rowsum
  x
}

For a 10^6 by 20 all-numeric data.frame this runs in 13 seconds on my machine, but things like x/rowSums(x) run out of memory. When working with data.frames it generally pays to think a column at a time instead of a row at a time.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

The other advantage, apart from speed, of doing it with dat/rowSums(dat) rather than the loop is that he gets the right answer. The loop goes from 1 to the number of columns if dat is a data frame and from 1 to the number of entries if dat is a matrix, not from 1 to the number of rows.

-thomas

Thomas Lumley Assoc. Professor, Biostatistics tlum...@u.washington.edu University of Washington, Seattle
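A small check of the points made in this thread, on a toy data frame with more rows than columns (the data and the names norm1, norm2 and rs are invented for illustration):

```r
dat <- data.frame(a = c(1, 2, 3, 4), b = c(1, 2, 6, 4), c = c(2, 4, 3, 8))

# Vectorised: each row divided by its row sum
norm1 <- dat / rowSums(dat)

# Column-at-a-time version, in the spirit of f2
norm2 <- dat
rs <- Reduce(`+`, dat)                 # adding the columns gives the row sums
for (j in seq_along(norm2)) norm2[[j]] <- norm2[[j]] / rs

# Why the posted loop misbehaves: length() of a data frame counts columns
length(dat)   # 3
nrow(dat)     # 4
```

Both versions give rows that sum to 1; the column loop simply avoids building a full matrix copy along the way.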
Re: [R] r-inferno.pdf with detailed table of contents and bookmarks
That's a reasonable request that I have planned for whenever I revise it. However, I'm not going to be doing that for some time yet (unit of time is somewhere in the months to years range). If someone is keen to do that, I can make the LyX file available to them. Patrick Burns patr...@burns-stat.com +44 (0)20 8525 0696 http://www.burns-stat.com (home of The R Inferno and A Guide for the Unwilling S User) Peng Yu wrote: Hi, I don't find a r-inferno.pdf that has detailed table of contents and bookmarks. If it is possible, can somebody help generated one and post it on line? Regards, Peng __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Grouped Logistic (Or conditional Logistic.)
Hi, I'm not sure of the correct nomenclature or function for what I'm trying to do. I'm interested in fitting a logistic regression on a binary dependent variable (True, False). There are a few ways to do this easily in R; both SVM and GLM work. The part that I want to add is group-wise awareness, so that the algorithm computes the coefficients to maximize the likelihood of a True label per group.

A toy explanation is probably best. I've been looking at horse racing models as a fun field to learn about statistics and R. So, for this example, let's assume the following:

100 horses in our stable
10 horses per race
75 races this season (some horses race more than once)

The independent variables are things about a horse (average speed, number of past wins, etc.). The dependent variable is (Win, Lose), represented by (1, 0).

As mentioned above, an SVM or GLM will quickly estimate coefficients and the probability of a win. I'd like to take it further and estimate the probability of a win within each race. I'm NOT interested in the group label as a final part of the model. I don't want a separate set of coefficients for each group. I just want the iterative algorithm to work toward maximizing the likelihood PER GROUP as an average.

I looked extensively through rseek.org for things like grouped logistic and nested logistic. I couldn't find anything that does this. I'm probably naming it wrong. I assume that a MANUAL iteration would be to:

1) Pick a coefficient
2) Calculate the resulting probability for each horse
3) Measure the strength of the result for each race (sum them together or average them?)
4) Adjust the coefficient and repeat

Surely there must be some standard function in a library that will do this. Can any of the stat gurus here offer some suggestions? Thanks!
-- Noah __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
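The manual iteration described above can be sketched in base R as a per-race (conditional) logit, where within each race the win probabilities are exp(beta * x) normalised to sum to 1. Everything below is simulated: the covariate speed, the value true_beta and the seed are assumptions made for illustration, and a single coefficient is used to keep the sketch short.

```r
set.seed(7)
n_races <- 75; n_per <- 10
race  <- rep(seq_len(n_races), each = n_per)
speed <- rnorm(n_races * n_per)             # hypothetical horse covariate
true_beta <- 1.5                            # assumed effect used to simulate

# Simulate exactly one winner per race with within-race softmax probabilities
win <- integer(length(race))
for (r in seq_len(n_races)) {
  idx <- which(race == r)
  p <- exp(true_beta * speed[idx])
  p <- p / sum(p)
  win[sample(idx, 1, prob = p)] <- 1L
}

# Negative log-likelihood: each race contributes log P(observed winner)
negloglik <- function(beta) {
  eta <- beta * speed
  log_denominator <- tapply(eta, race, function(e) log(sum(exp(e))))
  -(sum(eta[win == 1]) - sum(log_denominator))
}

# Steps 1 to 4 of the manual loop, delegated to a one-dimensional optimiser
fit <- optimize(negloglik, interval = c(-10, 10))
fit$minimum                                 # estimate of beta
```

If I am not mistaken, this is the likelihood that clogit() in the survival package maximises when each race is declared a stratum, e.g. clogit(win ~ speed + strata(race)), so that function may be the standard tool being asked for.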
[R] Hi
Can I run a lack-of-fit test using R? How do I do that? Thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
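One classical version of this test, sketched in base R under the assumption that the data contain replicated x values: compare the straight-line fit with a model that has one mean per distinct x. The data below are simulated and the names fit_line, fit_means and lof are made up.

```r
set.seed(3)
x <- rep(1:5, each = 4)                    # replicated x values are required
y <- 2 + 0.5 * x + rnorm(length(x))        # hypothetical data, truly linear

fit_line  <- lm(y ~ x)                     # straight-line model
fit_means <- lm(y ~ factor(x))             # one mean per distinct x (pure error)

lof <- anova(fit_line, fit_means)          # F test for lack of fit
lof
```

A small p-value in the second row would suggest that the straight line does not describe the group means adequately.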
Re: [R] ggplot, ribbon not showing up properly
Hi Thierry,

I tried the code suggested below, but it didn't fully work. The ribbon showed up correctly for the first and last days, but the days in between appeared to be ignored. I tried other ways of feeding geom_ribbon the summary stats, but they didn't work either. Thanks for trying to help,
Sock

On Tue, 15 Sep 2009 11:17 +0200, ONKELINX, Thierry thierry.onkel...@inbo.be wrote:

Dear Sock, I'm wondering whether the mean_sdl function is returning what you are expecting. I would calculate the statistics outside ggplot.

RibbonData <- ddply(dat.less, "Day", function(x){ mean(x$Y) + c(ymin = -1, ymax = 1) * sd(x$Y) })
p + stat_summary(data=dat.less, aes(group=1), geom="crossbar", fun.data="mean_sdl", mult=1) + geom_ribbon(data = RibbonData, aes(group = 1, ymin = ymin, ymax = ymax), fill=alpha("blue", 1/10))

HTH, Thierry

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
Gaverstraat 4 9500 Geraardsbergen Belgium
tel. + 32 54/436 185 thierry.onkel...@inbo.be www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Sock Cheng
Sent: Monday 14 September 2009 20:58
To: r-help@r-project.org
Subject: [R] ggplot, ribbon not showing up properly

Hi, I'm trying to plot a longitudinal data set using ggplot, adding some summary info (e.g. mean, 1 sd bounds) using geom="ribbon". The summary info is based on a subset of the original data (e.g. less an outlier). But I'm having trouble getting the ribbons to show up correctly. It's probably something obvious that I'm missing as a novice at ggplot2, and any help is much appreciated! Here's a simple example. I tried several things:

- if I use geom="crossbar", everything is ok
- if Day is set as rep(c(1,2,3,8,9), each=8), then everything is ok, which makes me wonder if the problem has to do with the ordering of Day? Day is supposed to be numeric.

Thanks! Sock

### Example data. Ran using R version 2.9.2, ggplot2 version 0.8.3 ###
set.seed(13)
Day <- rep(c(1, 2, 3, 8, 20), each=8) # The plot is ok if Day <- rep(c(1,2,3,8,9), each=8)
ID <- rep(LETTERS[1:8], 5)
Y <- rnorm(length(Day), 100, 5)
dat <- data.frame(Day=Day, ID=ID, Y=Y)

# outlier
dat$Y[dat$ID=="A" & dat$Day==8] <- 150
dat.less <- dat[!(dat$ID=="A" & dat$Day==8),]

# Longitudinal data plot. Obs for each subject is connected by a line over time
p <- ggplot(dat, aes(x=Day, y=Y, group=ID)) + scale_x_continuous(breaks=sort(unique(dat$Day))) + geom_line(colour=alpha("blue", 5/10))

# Adding mean, 1 sd bounds using crossbar geom is ok. But the same info using ribbon geom doesn't work.
p + stat_summary(data=dat.less, aes(group=1), geom="crossbar", fun.data="mean_sdl", mult=1) + stat_summary(data=dat.less, aes(group=1), geom="ribbon", fun.data="mean_sdl", mult=1, fill=alpha("blue", 1/10))
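Thierry's suggestion to compute the ribbon limits outside ggplot can also be checked in base R alone. The sketch below rebuilds the example data and computes mean ± 1 sd per Day with tapply(); the object names m and s are invented, and the plotting step itself is left out because the ggplot2 interface has changed since version 0.8.3.

```r
set.seed(13)
Day <- rep(c(1, 2, 3, 8, 20), each = 8)
ID  <- rep(LETTERS[1:8], 5)
dat <- data.frame(Day = Day, ID = ID, Y = rnorm(length(Day), 100, 5))

# Drop the outlying observation, as in the post
dat.less <- dat[!(dat$ID == "A" & dat$Day == 8), ]

# Mean and standard deviation per day, computed outside ggplot
m <- tapply(dat.less$Y, dat.less$Day, mean)
s <- tapply(dat.less$Y, dat.less$Day, sd)
RibbonData <- data.frame(Day  = as.numeric(names(m)),
                         ymin = as.vector(m - s),
                         ymax = as.vector(m + s))
RibbonData
```

Inspecting RibbonData directly shows whether every day, including 3 and 8, has sensible limits before anything is handed to geom_ribbon.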
[R] define a new family (and a new link function) for gam in gam package
Dear R list,

is it possible to define a new family (and a new link function) for gam in the gam package? How? I read the help for gam, family, gam.model and make.link, but I did not find a solution.

Regards
-- Corrado Topi
Area 18, Department of Biology
University of York, York, YO10 5YW, UK
Phone: + 44 (0) 1904 328645, E-mail: ct...@york.ac.uk
Re: [R] inline error message
On 9/17/2009 9:16 AM, Uwe Ligges wrote:
> _ wrote:
>> Hi all,
>>
>> I installed the library inline to my default R environment (c:\Programme\R..), then downloaded and installed RTools from http://www.murdoch-sutherland.com/Rtools/ to c:\Rtools. The Path variable is set right (with respect to order), but I still get this error message from R:
>>
>> Error in compileCode(f, code, language, verbose) :
>>   Compilation ERROR, function(s)/method(s) not created!
>>
>> Where is my failure?
>
> If you tell us what library (I guess package?) you mean, how you tried to install it, and what command gave the error message (including a complete logfile perhaps), we may be able to help.

I think the package is inline, but the code that led to the error (or a simplified example) would help a lot in diagnosing the error. I know I wouldn't try without that.

Duncan Murdoch
Re: [R] How to do a PCA with ordered variables?
If you are really serious about your variables being ordinal, you should analyze them using polychoric correlations. See the polycor package by John Fox.

On Thu, Sep 17, 2009 at 10:52 AM, Mark Difford mark_diff...@yahoo.co.uk wrote:
>> I have used the dudi.mix method from the ade4 package, but when I do the $index it shows me that R has considered my variables as quantitative. What should I do?
>
> You should make sure that they are encoded as ordered factors, which has nothing to do with ade4's dudi.mix().
>
> ## ?ordered
>
> Mark.
>
> P.Branco wrote:
>> Hi, I want to do a pca using a set of variables which are ordered. I have used the dudi.mix method from the ade4 package, but when I do the $index it shows me that R has considered my variables as quantitative. What should I do?

-- Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: I use this email account for mailing lists only.
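Mark's advice to encode the variables as ordered factors can be sketched in base R as below. The data frame `survey` and its column `satisfaction` are made up for illustration; whether dudi.mix() then treats the column as ordinal is not checked here.

```r
# Hypothetical data: ordinal responses stored as plain character strings
survey <- data.frame(
  satisfaction = c("low", "high", "medium", "low", "high"),
  stringsAsFactors = FALSE
)

# Declare the level order explicitly and mark the factor as ordered
survey$satisfaction <- factor(survey$satisfaction,
                              levels = c("low", "medium", "high"),
                              ordered = TRUE)

is.ordered(survey$satisfaction)   # TRUE once the column is an ordered factor
as.integer(survey$satisfaction)   # underlying rank codes: low = 1, medium = 2, high = 3
```

A column read in as numeric codes (1, 2, 3, ...) stays quantitative until it is converted this way, which would be consistent with what P.Branco saw in $index.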
Re: [R] R functions with array arguments
Re: [R] Grouped Logistic (Or conditional Logistic.)
On 17-Sep-09 17:28:16, Noah Silverman wrote:
> Hi,
>
> I'm not sure of the correct nomenclature or function for what I'm trying to do. I'm interested in calculating a logistic regression on a binary dependent variable (True, False). There are a few ways to easily do this in R. Both SVM and GLM work easily.
>
> The part that I want to add is group-wise awareness, so that the algorithm computes the coefficients to maximize the likelihood of a True label per group.
>
> A toy explanation is probably best. I've been looking at horse racing models as a fun field to learn about statistics and R. So, for this example, let's assume the following:
>   100 horses in our stable
>   10 horses per race
>   75 races this season (some horses race more than once)
>
> The independent variables are things about a horse (average speed, number of past wins, etc.). The dependent variable is (Win, Lose), represented by (1, 0).
>
> As mentioned above, an SVM or GLM will quickly work to estimate coefficients and the probability of a Win. I'd like to take it further and estimate the probability of a win, but look at it per race.
>
> I'm NOT interested in the group label as a final part of the model. I don't want a separate set of coefficients for each group. I just want the iterative algorithm to work toward maximizing the likelihood PER GROUP as an average.
>
> I looked extensively through rseek.org for things like grouped logistic and nested logistic. I couldn't seem to find anything to do this. I'm probably naming it wrong.
>
> I assume that a MANUAL iteration concept would be to:
>   1) Pick a coefficient
>   2) Calculate the resulting probability for each horse
>   3) Measure the strength of the result for each race (sum them together or average them?)
>   4) Adjust the coefficient and repeat
>
> Surely there must be some standard function in a library that will do this. Can any of the stat gurus here offer some suggestions?
>
> Thanks!
> -- Noah

In the context of your fun example, you have a fundamental problem in that (if I've understood your statement of it correctly) you will have more than one of your horses in the same race (apparently 10). Therefore, one of them winning excludes any of the others winning in that same race, so their results are not independent of each other.

Also, at least in real life, the probability that a given horse will win in a particular race depends not only on the covariates per horse (such as your average speed, number of past wins, etc.), and indeed on the condition of the race-course at the time, but also (and usually strongly) on the characteristics of the other horses in the same race.

So a simple logistic model of the kind you seem to be proposing would certainly not be realistic! I would be happier thinking about your problem in the context of a different kind of example ...

Ted.

E-Mail: (Ted Harding) ted.hard...@manchester.ac.uk
Fax-to-email: +44 (0)870 094 0861
Date: 17-Sep-09 Time: 19:06:27
-- XFMail --
[R] dyn.load search path?
Sorry if this is somewhere in the fine manuals but I've been unable to locate it.

Does dyn.load use a search path or does it just look in the current directory for non-fully-qualified filenames? If there is a search path, what is it?

Thanks for your help
Re: [R] Simple as.Date question dealing with a timezone offset
On Sep 17, 2009, at 11:25 AM, esawdust wrote:
> I've been trying to understand the as.Date functionality and I have a date and time stamp field that looks like this:
>
>   "Tue Sep 15 09:22:09 -0600 2009"
>
> and I need to turn it into an R Date object for analysis.
>
> Simple date conversions I have down, no problem:
>
>   adate = c("7/30/1959")
>   as.Date(adate, "%m/%d/%Y")
>   [1] "1959-07-30"
>
> But when it comes to the type of date/time string format I have above, I can't figure out a format string that will work. The timezone offset is one part that causes problems. Building up to a working format string for the full time stamp string, I can make it as far as:
>
>   adate = c("Tue Sep 15 09:22:09 -0600 2009")
>   as.Date(adate, format="%a %b %d %H:%M:%S")
>   [1] "2009-09-15"
>
> (apparently the year defaults to the current year when it's not specified in the format string).
>
> Because the year comes after the timezone offset, I have to deal with the timezone offset in the format string. But when I get to the timezone offset value I can't use %z or %Z because those are output only:
>
>   as.Date(adate, format="%a %b %d %H:%M:%S %z %Y")
>   [1] NA

You are confusing R Date objects with the date-time classes. I don't think Date class objects even allow TZ offsets.

  ?DateTimeClasses

  as.POSIXct(as.Date(adate, "%m/%d/%Y"), origin="1960-01-01", tz="GMT")
  [1] "1959-07-29 20:00:00 EDT"

Notice that my TZ (GMT -4) was used, so it was still the prior day in New England.

> I'm close, but can't incorporate the timezone offset field in the date/time stamp string. What am I missing?
>
> I suppose one workaround is to split the date/time string into its component parts and reassemble it into a string as.Date can deal with, but that seems to defeat one of the purposes of as.Date's format capability.
>
> Any advice for how to translate a "Tue Sep 15 09:22:09 -0600 2009" into an R Date object?
>
> Landon

-- David Winsemius, MD
Heritage Laboratories
West Hartford, CT
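One way to follow the pointer to the date-time classes, sketched on the assumption that every stamp has exactly the six space-separated fields shown in the question: rearrange the fields so that no locale-dependent names are left, parse the result into a POSIXct with `%z`, then convert to Date. The helper name `parse_stamp` is made up, and it handles one string at a time. `?strptime` documents `%z` as accepted on input.

```r
# parse_stamp() is a hypothetical helper, not part of base R
parse_stamp <- function(stamp) {
  # fields: weekday, month abbreviation, day, time, UTC offset, year
  parts <- strsplit(stamp, " ", fixed = TRUE)[[1]]
  # rebuild as "YYYY-MM-DD HH:MM:SS +hhmm"; month.abb holds the English abbreviations
  iso <- sprintf("%s-%02d-%s %s %s",
                 parts[6], match(parts[2], month.abb), parts[3], parts[4], parts[5])
  as.POSIXct(iso, format = "%Y-%m-%d %H:%M:%S %z", tz = "UTC")
}

ts <- parse_stamp("Tue Sep 15 09:22:09 -0600 2009")
format(ts, "%Y-%m-%d %H:%M:%S", tz = "UTC")  # the instant expressed in UTC
as.Date(ts, tz = "UTC")                      # calendar date of that instant in UTC
```

The date depends on the time zone used for the conversion, which is the point of the EDT example above: a stamp late in the evening at -0600 already falls on the next day in UTC.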
Re: [R] Grouped Logistic (Or conditional Logistic.)
Ted,

Thanks for the reply.

For the example, I'm not looking to predict THE winner, but to find the best probabilities of winning.

It would seem that the process of iterating through possible coefficients would be the same as in a standard GLM; the evaluation part, as you work through them, would have to be adjusted to look per group.

I would call this something like grouped maximum likelihood if I got to make up the name.

-N
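The "pick a coefficient, score each race, adjust, repeat" loop can be sketched with a general-purpose optimizer. This is only an illustration under a strong assumption that is not stated in the thread: within a race, each horse's win probability is exp(beta * x) divided by the sum of exp(beta * x) over the horses in that race, so the probabilities in a race sum to 1. The data are simulated, and `speed`, `true_beta` and `negloglik` are made-up names.

```r
set.seed(1)
n_races  <- 75
per_race <- 10
race  <- rep(seq_len(n_races), each = per_race)
speed <- rnorm(n_races * per_race)   # hypothetical single covariate per horse
true_beta <- 1.5                     # value used only to simulate winners

# Simulate exactly one winner per race from the within-race probabilities
win <- unlist(lapply(split(speed, race), function(s) {
  p <- exp(true_beta * s) / sum(exp(true_beta * s))
  as.integer(seq_along(s) == sample.int(length(s), 1, prob = p))
}))

# Negative log-likelihood: for each race, log probability of the observed winner
negloglik <- function(beta) {
  eta <- beta * speed
  log_denominator <- tapply(eta, race, function(e) log(sum(exp(e))))
  -(sum(eta[win == 1]) - sum(log_denominator))
}

fit <- optimize(negloglik, interval = c(-10, 10))
beta_hat <- fit$minimum

# Fitted win probabilities, normalised within each race
p_hat <- ave(exp(beta_hat * speed), race, FUN = function(e) e / sum(e))
```

With several covariates, `optim()` over a coefficient vector would replace `optimize()`. This sketch ties the horses in a race together through the shared denominator, but it does not address Ted's other concern about horses appearing in more than one race.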
Re: [R] Grouped Logistic (Or conditional Logistic.)
On Sep 17, 2009, at 2:06 PM, (Ted Harding) wrote:
> In the context of your fun example, you have a fundamental problem in that (if I've understood your statement of it correctly) you will have more than one of your horses in the same race (apparently 10). Therefore, one of them winning excludes any of the others winning in that same race, so their results are not independent of each other.
>
> Also, at least in real life, the probability that a given horse will win in a particular race depends not only on the covariates per horse (such as your average speed, number of past wins, etc.), and indeed on the condition of the race-course at the time, but also (and usually strongly) on the characteristics of the other horses in the same race.
>
> So a simple logistic model of the kind you seem to be proposing would certainly not be realistic! I would be happier thinking about your problem in the context of a different kind of example ...

Ted;

Would your set of concerns be addressed if the OP switched to a proportional odds logistic regression framework? Harrell discusses such in his RMS text.

-- David Winsemius, MD
Heritage Laboratories
West Hartford, CT
Re: [R] dyn.load search path?
Steve Jaffe wrote:
> Sorry if this is somewhere in the fine manuals but I've been unable to locate it.
>
> Does dyn.load use a search path or does it just look in the current directory for non-fully-qualified filenames? If there is a search path, what is it?
>
> Thanks for your help

I believe dyn.load() will attempt to load a library from whatever path you point it at, like so:

  dyn.load('/path/to/dynamic/library')

If your dynamic library references code in other dynamic libraries, I would imagine environment variables such as LD_LIBRARY_PATH would come into play-- but I'm not an expert on such voodoo.

If you would like a shortcut to loading libraries in certain places that R knows about, such as the R package library, check out providing arguments to dyn.load() using system.file() or loading package libraries using library.dynam().

Hope that helps!

-Charlie

- Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
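A minimal sketch of the "point it at a full path" approach, which sidesteps the question of whether a bare filename is searched for anywhere. The library name `mylib` is hypothetical; `.Platform$dynlib.ext` supplies the platform's extension (for example ".so" or ".dll").

```r
# "mylib" is a made-up name used only for illustration
lib_file <- paste0("mylib", .Platform$dynlib.ext)
lib_path <- file.path(getwd(), lib_file)   # spell out the directory explicitly

if (file.exists(lib_path)) {
  dyn.load(lib_path)                       # load by full path
} else {
  message("shared library not found: ", lib_path)
}
```

For a library shipped inside an installed package, `system.file()` can build the directory part of the path instead of `getwd()`.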
[R] How write the same number of elements in the first line as the rest of the file with write.table()?
Hi,

The first line has fewer elements than the rest of 'rownames_colnames.write.table.xls'. I am wondering if there is a way to print an additional '\t' at the beginning of the first line.

$ Rscript write.table.R
x=matrix(1:20,nc=2)
rownames(x)=letters[1:10]
colnames(x)=letters[1:2]
write.table(x,"rownames_colnames.write.table.xls",sep='\t')

$ cat rownames_colnames.write.table.xls
a b
a 1 11
b 2 12
c 3 13
d 4 14
e 5 15
f 6 16
g 7 17
h 8 18
i 9 19
j 10 20

Regards,
Peng
Re: [R] How to separate a function by 2 probabilities
Marcio Resende wrote:
> Good morning,
>
> I have a function to generate a matrix; here is part of it:
>
>   g[j,i] <- if (gen[j,i]==0) al1[i,1]+al1[i,1] else ...
>
> However, I would like this function to be applied with probability P and another function (another formula to generate the g matrix) with probability 1-P.
>
> That is, if P is .7, I would like the matrix g to be generated according to the formula above 70% of the time (for random i and j) and according to a different formula, which I did not write, 30% of the time.

Something along the lines of

  if (runif(1) < 0.7) {first formula} else {second formula}

You can probably make this much more efficient by appropriate vectorization, but you didn't show enough of what you're doing to make specific recommendations.

cheers
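A vectorized version of the same idea might look like the sketch below. Because the actual formulas were not posted, `formula_a` and `formula_b` are placeholder matrices standing in for the two results; only the random choice per cell is the point here.

```r
set.seed(42)
n_row <- 4
n_col <- 5
P <- 0.7

# Placeholders for the results of the two (unposted) formulas
formula_a <- matrix(1, n_row, n_col)
formula_b <- matrix(2, n_row, n_col)

# One uniform draw per cell: TRUE with probability P
use_a <- matrix(runif(n_row * n_col) < P, n_row, n_col)

# Take the first formula where use_a is TRUE, the second elsewhere
g <- ifelse(use_a, formula_a, formula_b)
```

Each cell is chosen independently, so in any one matrix the share of cells using the first formula is only approximately P.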
Re: [R] How write the same number of elements in the first line as the rest of the file with write.table()?
On Sep 17, 2009, at 3:23 PM, Peng Yu wrote:
> Hi,
>
> The first line has less elements than the rest of 'rownames_colnames.write.table.xls'. I am wondering if there is a way to print an additional '\t' at the beginning of the first line.
>
> $ Rscript write.table.R
> x=matrix(1:20,nc=2)
> rownames(x)=letters[1:10]
> colnames(x)=letters[1:2]
> write.table(x,"rownames_colnames.write.table.xls",sep='\t')

If you replace your script file with this it gives the needed padding:

#write.table.R
x=matrix(1:20,nc=2)
rownames(x)=letters[1:10]
colnames(x)=letters[1:2]
sink("rownames_colnames.write.table.xls")
cat("\t")
write.table(x,sep='\t')
sink()

> $ cat rownames_colnames.write.table.xls
> a b
> a 1 11
> b 2 12
> c 3 13
> d 4 14
> e 5 15
> f 6 16
> g 7 17
> h 8 18
> i 9 19
> j 10 20

-- David Winsemius, MD
Heritage Laboratories
West Hartford, CT
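An alternative that avoids sink(): ?write.table documents that col.names = NA together with row.names = TRUE writes a blank column name for the row-name column. A small sketch, writing to a temporary file (the `out` path is only for illustration) and with quote = FALSE so the header is easy to read:

```r
x <- matrix(1:20, ncol = 2)
rownames(x) <- letters[1:10]
colnames(x) <- letters[1:2]

out <- tempfile(fileext = ".txt")   # illustrative output path
# col.names = NA adds an empty header field above the row names
write.table(x, out, sep = "\t", quote = FALSE, col.names = NA)

readLines(out, n = 2)
```

The header line then starts with a tab, so every line has the same number of fields.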
[R] generating unordered combinations
Hi,

I am trying to generate all unordered combinations of a set of numbers / characters, and I can only find a (very) clumsy way of doing this using expand.grid. For example, all unordered combinations of the numbers 0, 1, 2 are:

0, 0, 0
0, 0, 1
0, 0, 2
0, 1, 1
0, 1, 2
0, 2, 2
1, 1, 1
1, 1, 2
1, 2, 2
2, 2, 2

(I have not included, for example, 1, 0, 0, since it is equivalent to 0, 0, 1.)

I have found a way to generate this data.frame using expand.grid as follows:

g <- expand.grid(c(0,1,2), c(0,1,2), c(0,1,2))
for(i in 1:nrow(g)) {
  g[i,] <- sort(as.character(g[i,]))
}
o <- order(g$Var1, g$Var2, g$Var3)
unique(g[o,])

This is obviously quite clumsy and hard to generalise to a greater number of characters, so I'm keen to find any other solutions. Can anyone suggest a better (more general, quicker) method?

Cheers
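One possible approach, offered as a sketch rather than as an answer from the thread: combinations with repetition of k items from n values correspond one-to-one to ordinary k-subsets of 1..(n + k - 1), so base R's combn() can enumerate them directly. The function name `multisets` is made up.

```r
# multisets() is a hypothetical helper built on base R's combn()
multisets <- function(v, k) {
  n <- length(v)
  # Each strictly increasing index set idx from 1..(n + k - 1) maps to the
  # non-decreasing positions idx - (0:(k - 1)) within v
  m <- combn(n + k - 1, k, function(idx) v[idx - seq_len(k) + 1])
  matrix(m, ncol = k, byrow = TRUE)   # one combination per row
}

res <- multisets(c(0, 1, 2), 3)
nrow(res)   # choose(3 + 3 - 1, 3) = 10 rows
res
```

No sorting or de-duplication is needed, and the same call works for character vectors and for larger k.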
Re: [R] What does model.matrix() return?
On Sep 17, 2009, at 11:13 AM, Peng Yu wrote:
> Hi,
>
> I don't understand the meaning of the following lines returned by model.matrix(). Can somebody help me understand it? What can they be used for?
>
> attr(,"assign")
> [1] 0 1 2 2
> attr(,"contrasts")
> attr(,"contrasts")$A
> [1] "contr.treatment"
>
> attr(,"contrasts")$B
> [1] "contr.treatment"

?contrasts

---direct quote---
contr.treatment contrasts each level with the baseline level (specified by base): the baseline level is omitted. Note that this does not produce ‘contrasts’ as defined in the standard theory for linear models as they are not orthogonal to the intercept.
---end quote---

Also read through (again?): ?aov

-- David.

> Regards,
> Peng
>
> a=2
> b=3
> n=4
> A = rep(sapply(1:a,function(x){rep(x,n)}),b)
> B = as.vector(sapply(sapply(1:b, function(x){rep(x,n)}), function(x){rep(x,a)}))
> Y = A + B + rnorm(a*b*n)
> fr = data.frame(Y=Y,A=as.factor(A),B=as.factor(B))
> afit=aov(Y ~ A + B,fr)
> model.matrix(afit)
>    (Intercept) A2 B2 B3
> 1            1  0  0  0
> 2            1  0  0  0
> 3            1  0  0  0
> 4            1  0  0  0
> 5            1  1  0  0
> 6            1  1  0  0
> 7            1  1  0  0
> 8            1  1  0  0
> 9            1  0  1  0
> 10           1  0  1  0
> 11           1  0  1  0
> 12           1  0  1  0
> 13           1  1  1  0
> 14           1  1  1  0
> 15           1  1  1  0
> 16           1  1  1  0
> 17           1  0  0  1
> 18           1  0  0  1
> 19           1  0  0  1
> 20           1  0  0  1
> 21           1  1  0  1
> 22           1  1  0  1
> 23           1  1  0  1
> 24           1  1  0  1
> attr(,"assign")
> [1] 0 1 2 2
> attr(,"contrasts")
> attr(,"contrasts")$A
> [1] "contr.treatment"
>
> attr(,"contrasts")$B
> [1] "contr.treatment"

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
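To make the "assign" attribute concrete, here is a small sketch on a balanced design with the same factor structure (two levels of A, three of B). The data frame `d` is made up and contains one row per cell.

```r
# One row per combination of a 2-level and a 3-level factor
d <- expand.grid(A = factor(1:2), B = factor(1:3))
mm <- model.matrix(~ A + B, data = d)

colnames(mm)         # "(Intercept)" "A2" "B2" "B3"
attr(mm, "assign")   # 0 1 2 2: which model term each column belongs to

# Map each non-intercept column back to its term label
term_labels <- attr(terms(~ A + B), "term.labels")
term_labels[attr(mm, "assign")[-1]]   # "A" "B" "B"
```

So 0 marks the intercept, 1 the single dummy column for A, and 2 the two dummy columns for B; the "contrasts" attribute records the coding (here treatment contrasts) used to build those dummy columns.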
[R] referring to a row number and to a row condition, and to columns simultaneously
Hello, dear R-ers!

I have a data frame:

x <- data.frame(a=c(4,2,4,1,3,4), b=c(1,3,4,1,5,0), c=c(NA,2,5,3,4,NA), d=rep(NA,6), e=rep(NA,6))
x

When x$a==1, I would like to replace NAs in columns d and e with 8 and 9, respectively. When x$a != 1, I would like to replace NAs in columns d and e with 101 and 102, respectively.

However, I only want to do it for rows 2:5, while ignoring what's happening in rows 1 and 6.

Here is what I've come up with:

x
for(i in 2:5){
  x[i & x[[1]]==1, 4:5] <- c(8,9)
  x[i & x[[1]]!=1, 4:5] <- c(101,102)
}
x

However, something is wrong here. First, rows 1 and 6 are not ignored. Second, the order of 101 and 102 changes. I, however, always want to see 101 in column d and 102 in column e.

Any advice? Thanks a lot!

-- Dimitri Liakhovitski
Ninah.com
dimitri.liakhovit...@ninah.com
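A possible explanation and fix, assuming the loop combined `i` with the comparison using `&`: any non-zero `i` counts as TRUE, so the condition covers every row where a == 1 (or a != 1), including rows 1 and 6. Assigning a length-2 vector to several rows of two columns also fills column by column, which shuffles 101 and 102. Restricting the rows first and filling each column separately avoids both problems:

```r
x <- data.frame(a = c(4, 2, 4, 1, 3, 4), b = c(1, 3, 4, 1, 5, 0),
                c = c(NA, 2, 5, 3, 4, NA), d = rep(NA, 6), e = rep(NA, 6))

rows   <- 2:5                    # only these rows may change
is_one <- x$a[rows] == 1         # condition evaluated on those rows only

x$d[rows] <- ifelse(is_one, 8, 101)   # column d: 8 where a == 1, else 101
x$e[rows] <- ifelse(is_one, 9, 102)   # column e: 9 where a == 1, else 102
x
```

Rows 1 and 6 keep their NAs because they are never indexed, and no loop is needed.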