Re: [R] poisson fit for histogram
Hi, see: http://cran.r-project.org/doc/contrib/Ricci-distributions-en.pdf
Regards, Vito

Thomas Isenbarger (isen at plantpath.wisc.edu) wrote:
> I haven't been an R lister for a bit, but I hope to enlist someone's help here. I think this is a simple question, so I hope the answer is not much trouble. Can you please respond directly to this email address, in addition to the list (if responding to the list is warranted)?
> I have a histogram and I want to see whether the data fit a Poisson distribution. How do I do this? It would be preferable to do it without having to install any (or many) packages. I use R version 1.12 (1622) on OS X.
> Thank you very much,
> Tom Isenbarger
> [EMAIL PROTECTED]
> 608.265.0850

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
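Since the poster asked for a base-R approach, a minimal sketch of one way to check a Poisson fit is below. The `counts` vector is a made-up stand-in for the poster's data (one count per observation), not anything from the thread.

```r
# Hypothetical stand-in for the poster's data: one count per observation.
counts <- c(2, 1, 3, 0, 2, 4, 1, 2, 3, 1, 0, 2)

lambda.hat <- mean(counts)  # maximum-likelihood estimate of the Poisson mean

# Observed vs. expected frequencies for each count value
obs  <- as.vector(table(factor(counts, levels = 0:max(counts))))
expd <- length(counts) * dpois(0:max(counts), lambda.hat)

# Chi-squared goodness-of-fit statistic (pool bins first if expected
# counts are small; df = number of bins - 1 - 1 estimated parameter)
chisq <- sum((obs - expd)^2 / expd)
```

Comparing `chisq` to the appropriate chi-squared quantile, or simply overplotting `expd` on the histogram, gives a rough visual check without installing any packages.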
Re: [R] Clustered standard errors in a panel
Have you considered lmer in library(lme4)? If you are interested in this, you may want to check the article by Doug Bates in the latest R News (www.r-project.org -> Documentation: Newsletter).
spencer graves

Thomas Davidoff wrote:
> I want to do the following: glm(y ~ x1 + x2 + ...) within a panel. Hence y, x1, and x2 all vary at the individual level. However, there is likely correlation of these variables within an individual, so the standard errors need adjustment. I do not want to estimate fixed effects, but I do want to cluster the standard errors at the individual level. Is there an automated way to do this? Nothing in the cluster documentation makes it clear that there is.
> (An alternative is to do this by hand. In that case, I would need to be able to calculate weighted sums of x1, x2, ... at the individual level. I can do this one variable at a time [with lapply, split, and unsplit], but I would love to be able to do so over the whole matrix of x's. Of course, doing it by hand is less easy than an automated solution, if one exists.)
> Thomas Davidoff
> Assistant Professor, Haas School of Business, UC Berkeley
> http://faculty.haas.berkeley.edu/davidoff

-- 
Spencer Graves, PhD, Senior Development Engineer
PDF Solutions, Inc., San Jose, CA
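Since the poster mentions doing it by hand: below is a minimal base-R sketch of cluster-robust (sandwich) standard errors for a linear model. The data are simulated, `id` is a hypothetical cluster identifier, and no small-sample degrees-of-freedom correction is applied.

```r
set.seed(1)
id <- rep(1:20, each = 5)          # 20 individuals, 5 observations each
x1 <- rnorm(100); x2 <- rnorm(100)
cl.eff <- rnorm(20)                # induces within-individual correlation
y <- 1 + 2 * x1 - x2 + cl.eff[id] + rnorm(100)

fit <- lm(y ~ x1 + x2)
X <- model.matrix(fit)
u <- residuals(fit)

Xu    <- rowsum(X * u, id)         # sum of x_i * u_i within each cluster
bread <- solve(crossprod(X))       # (X'X)^{-1}
meat  <- crossprod(Xu)             # sum over clusters of score outer products
V     <- bread %*% meat %*% bread  # clustered covariance matrix
sqrt(diag(V))                      # clustered standard errors
```

This is the "weighted sums at the individual level over the whole matrix of x's" step the poster describes: `rowsum()` does it in one call instead of looping over variables with split/unsplit.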
Re: [R] family
You want to aim to write a family for R, not find the equivalents of S constructs -- they are different, so exact equivalents do not exist. In particular, an R family has several components which an S family does not. There are lots of example families for you to follow (e.g. see ?family and the negative binomial families in the MASS package). Others have found reading the sources sufficient to write new families.

On Thu, 21 Jul 2005, Dr L. Y Hin wrote:
> I am in the process of migrating an S programme library to R, and it involves the family entity. I have checked ?family, but it does not give much detail about its components. I would be very grateful if anyone could point me towards sources/ways to read up on this area, with an aim to find the equivalents of the following in S:
>   family()$inverse
>   family()$deriv
>   family()$variance
>   family()$deviance

-- 
Brian D. Ripley, Professor of Applied Statistics, University of Oxford
http://www.stats.ox.ac.uk/~ripley/
[R] again, a question between R and C++
Dear R Users,
I want to make a call from R into C++. My inputs are List1, List2, List3, and IntegerID. The number of elements of the lists, and their types, depend on IntegerID. Typical elements of a given list can be vectors, doubles, and even other lists. I also want to return a list (whose nature will depend, possibly, on IntegerID).
What I want to do is pass these 4 inputs to C++ and then use a factory pattern (keyed on IntegerID) that performs different calculations on the lists depending on the IntegerID (of course, I could also do this with a simple switch statement). I have been reading the documentation, especially the parts regarding .Call and .External, and it seems that my algorithm could be implemented, but the examples I have seen so far are such that what occupies the place of my lists are just vectors (as in the convolve4 example).
Is there an example where I could see how, instead of a vector, a set of lists (with an unknown number of arguments, as well as unknown types) is used as input? I guess the ideal would be that, in the equivalent of the convolve4 function, my args would be variant-type lists, and then, after the factory pattern is called and the correct class is registered (via IntegerID), this variant type is decomposed into the individual types that compose the list (i.e., vectors, doubles, ...). Of course, in the factory there should be as many decomposing algorithms as IntegerIDs, each creating a particular decomposition. Also, how should returning a list (whose nature will depend, possibly, on IntegerID) be handled?
Thank you in advance
Jordi
Re: [R] cut in R
Steve Su <[EMAIL PROTECTED]> writes:
> Dear All,
> I wonder whether it is still valid to use the following R code for cut. All I have done is change:
>   if (is.na(breaks) | breaks < 2)
> to:
>   if (is.na(breaks) | breaks < 1)
> so that it covers an interval of 1? It seems okay for my purposes, but I am not sure why R specifically does not allow breaks < 2 to happen.
> Steve.

What do you need it for? It gives you a factor with only one group, so I suppose the idea is that this is more likely to be due to a programming error.

(However, I spot a bit of a bug in that we don't set include.lowest=TRUE when using breaks as a number:

x <- round(rnorm(20), 2)
x
 [1]  0.66 -2.22 -0.70 -1.68  0.38 -0.23 -0.43 -0.72  0.30 -0.22 -1.36  0.60
[13]  0.44 -0.40 -0.61  1.08 -0.41 -0.02 -1.41 -0.49
cut(x, breaks = 3)
 [1] (-0.0189,1.08]  (-2.22,-1.12]   (-1.12,-0.0189] (-2.22,-1.12]
 [5] (-0.0189,1.08]  (-1.12,-0.0189] (-1.12,-0.0189] (-1.12,-0.0189]
 [9] (-0.0189,1.08]  (-1.12,-0.0189] (-2.22,-1.12]   (-0.0189,1.08]
[13] (-0.0189,1.08]  (-1.12,-0.0189] (-1.12,-0.0189] (-0.0189,1.08]
[17] (-1.12,-0.0189] (-1.12,-0.0189] (-2.22,-1.12]   (-1.12,-0.0189]
Levels: (-2.22,-1.12] (-1.12,-0.0189] (-0.0189,1.08]

Notice how -2.22 appears to be inside the interval (-2.22,-1.12].)

-- 
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
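The include.lowest behaviour Peter describes is easy to see with explicit breaks; this is just an illustrative sketch, not code from the thread. When breaks is a single number, cut() pads the range slightly, so the minimum lands inside the lowest interval even though the label looks half-open.

```r
x <- c(1, 2, 3, 4, 5)

# With explicit breaks and the default include.lowest = FALSE,
# the minimum falls outside every (a, b] interval and becomes NA:
table(cut(x, breaks = c(1, 3, 5)), useNA = "ifany")

# With include.lowest = TRUE the lowest interval is closed on the left,
# so the minimum is kept:
table(cut(x, breaks = c(1, 3, 5), include.lowest = TRUE))
```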
[R] dpill in KernSmooth package
Hi, just a quick question: does dpill compute the bandwidth or the half-bandwidth? The help says bandwidth, but in the literature there is often confusion between the bandwidth and the half-bandwidth.
thanks,
Giacomo
[R] Question about 'text' (add lm summary to a plot)
I would like to annotate my plot with a little box containing the slope, intercept and R^2 of an lm on the data. I would like it to look like...

+-----------------------------+
| Slope     :   3.45 +-  0.34 |
| Intercept : -10.43 +-  1.42 |
| R^2       :   0.78          |
+-----------------------------+

However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-.

Cheers, Dan.

dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS)
abline(coef(dat.lm), lty = 2, lwd = 1.5)
dat.lm.sum <- summary(dat.lm)
dat.lm.sum
attributes(dat.lm.sum)
my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2))
my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2))
my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2))
my.text.1
my.text.2
my.text.3
## Add legend
text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1)
[R] bubble.plot() - standardize size of unit circle
Hello,
I wrote a wrapper for symbols() that produces a bivariate bubble plot, for use when plot(x,y) hides multiple occurrences of the same x,y combination (e.g. if x,y are integers). Circle area ~ counts per bin, and circle size is controlled by 'scale'. Question: how can I automatically make the smallest circle the same size as a standard plot character, rather than having to approximate it using 'scale'?

# Function:
bubble.plot <- function(x, y, scale = 0.1, xlab = substitute(x), ylab = substitute(y), ...) {
    z <- table(x, y)
    xx <- rep(as.numeric(rownames(z)), ncol(z))
    yy <- sort(rep(as.numeric(colnames(z)), nrow(z)))
    id <- which(z != 0)
    symbols(xx[id], yy[id], inches = F, circles = sqrt(z[id]) * scale,
            xlab = xlab, ylab = ylab, ...)
}
# Example:
x <- rpois(100, 3)
y <- x + rpois(100, 2)
bubble.plot(x, y)
Re: [R] Is it possible to create highly customized report in *.xls format by using R/S+?
Thank you all for the replies. It is very eye-opening for me. I probably need something like RDCOMClient. I've tried it last night. Very nice package!!!

On 7/20/05, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> Here is an example where R is the client and Excel is the server, so that R is issuing commands to Excel. This example uses the RDCOMClient package from www.omegahat.org:
>
> library(RDCOMClient)
> xl <- COMCreate("Excel.Application")   # starts up Excel
> xl[["Visible"]] <- TRUE                # Excel becomes visible
> wkbk <- xl$Workbooks()$Add()           # new workbook
> # set some cells
> sh <- xl$ActiveSheet()
> x12 <- sh$Cells(1, 2)
> x12[["Value"]] <- 123
> x22 <- sh$Cells(2, 2)
> x22[["Value"]] <- 100
> x31 <- sh$Cells(3, 1)
> x31[["Value"]] <- "Total"
> B3R <- sh$Range("B3")
> B3R[["Formula"]] <- "=Sum(R1C2:R2C2)"
> B3R[["NumberFormat"]] <- "_($* #,##0.00_)"
> B3RF <- B3R$Font()
> B3RF[["Bold"]] <- TRUE
> # save and exit
> wkbk$SaveAs("\\test.xls")
> xl$Quit()
>
> Code using the rcom package (the second link is the mailing list): http://sunsite.univie.ac.at/rcom/download/ http://mailman.csd.univie.ac.at/pipermail/rcom-l/ would be nearly identical once the upcoming version of rcom comes out. rcom and omegahat both provide the possibility of having Excel as the client and R as the server; however, in that setup every user would have to have R running, whereas in the above setup only you do.
>
> On 7/20/05, Wensui Liu <[EMAIL PROTECTED]> wrote:
>> I appreciate your reply and understand your point completely. But at times we can't change the rules; the only choice is to follow them. Most deliverables in my work are in Excel format.
>> On 7/20/05, Greg Snow <[EMAIL PROTECTED]> wrote:
>>> See: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html and http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf
>>> Greg Snow, Ph.D., Statistical Data Center, LDS Hospital, Intermountain Health Care
>>>> Wensui Liu <[EMAIL PROTECTED]> 07/19/05 03:22PM:
>>>> I remember that in one slide of Prof. Ripley's presentation overheads, he said the most popular data analysis software is Excel. So is there any resource or tutorial on this topic? Thank you so much!

-- 
WenSui Liu, MS MA
Senior Decision Support Analyst
Division of Health Policy and Clinical Effectiveness
Cincinnati Children's Hospital Medical Center
Re: [R] again, a question between R and C++
Jordi,
The place to ask this question is probably the r-devel list; it's a little too heavy for r-help. This is fairly easy to do using the .Call interface. Have a look at lapply2 in the Writing R Extensions manual, http://cran.r-project.org/doc/manuals/R-exts.html#Evaluating-R-expressions-from-C or just follow this short example. Write a function in C++ as follows:

SEXP myFunc(SEXP list1, SEXP list2, SEXP list3, SEXP list4, SEXP intID_SEXP)
{
    // obtain the list length as follows:
    int list1_len = length(list1);

    // to access your integer (I assume it's a scalar, not a vector)
    // you need to grab the first element of this integer vector
    int intID = INTEGER(intID_SEXP)[0];

    // you will want to add some checks to make sure the arguments
    // are of the right type
    ...
    SEXP ans = (whatever);
    return ans;
}

You can call it in R as follows:

.Call("myFunc", list1, list2, list3, list4, intID)
Re: [R] The steps of building library in R 2.1.1
Ivy_Li wrote:
> Dear All,
> With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch, Gabor Grothendieck, Henrik Bengtsson, Uwe Ligges.

You are welcome. The following is intended for the records in the archive, in order to protect readers.

> Without your help, I would work less efficiently. I noticed that some other friends were puzzled by the method of building a library. Now I have organized a document about it, hoping it can help more friends.
> 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools

Do you mean http://www.murdoch-sutherland.com/Rtools/ ?

> 2. Download rw2011.exe; install the newest version of R.
> 3. Download tools.zip; unpack it into c:\cygwin

Not required to call it cygwin - also a bit misleading...

> 4. Download ActivePerl-5.6.1.633-MSWin32-x86.msi; install Active Perl in c:\Perl

Why in C:\Perl ?

> 5. Download MinGW-3.1.0-1.exe; install the mingw32 port of gcc in c:\mingwin

Why in c:\mingwin ?

> 6. Then go to Control Panel -> System -> Advanced -> Environment Variables -> Path -> Variable Value; add c:\cygwin;c:\mingwin\bin

The PATH variable already contains a couple of paths; add the two given above in front of all the others, separated by ";".

> Why do we add them at the beginning of the path? Because we want the folder that contains the tools to be at the beginning, so that you eliminate the possibility of finding a different program of the same name first in a folder that comes before the one where the tools are stored.

OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here.

> 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging.
> I type in R:
>   f <- function(x, y) x + y
>   g <- function(x, y) x - y
>   d <- data.frame(a = 1, b = 2)
>   e <- rnorm(1000)
>   package.skeleton(list = c("f", "g", "d", "e"), name = "example")
> Then modify the 'DESCRIPTION':
>   Package: example
>   Version: 1.0-1
>   Date: 2005-07-09
>   Title: My first function
>   Author: Ivy [EMAIL PROTECTED]
>   Maintainer: Ivy [EMAIL PROTECTED]
>   Description: simple sum and subtract
>   License: GPL version 2 or later
>   Depends: R (>= 1.9), stats, graphics, utils
> You can refer to the web page http://cran.r-project.org/src/contrib/Descriptions/ for a larger source of examples. And you can read the 'Creating R packages' part of 'Writing R Extensions'; it introduces some useful things for your reference.

This is described in Writing R Extensions and is not related to the setup of your system in 1-6.

> 8. Download hhc.exe, the Microsoft help compiler, from somewhere, and save it somewhere in your path. I downloaded 'htmlhelp.exe', ran the setup, and saved hhc.exe into 'C:\cygwin\bin', because this path has been written into my PATH variable value. However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file.

This is described in the R Administration and Installation manual, and I do not see why we should put the help compiler with the other tools.

> 9. In the DOS environment, go to D:\ and type the following code:

There is no DOS environment in Windows NT based operating systems.

>   cd \Program Files\R\rw2010
>   bin\R CMD INSTALL "/Program Files/R/rw2011/example"

I do not see why anybody would want to contaminate the binary installation of R with development source packages. I'd rather use a separate directory. I think reading the two manuals mentioned should be sufficient. You have not added relevant information. By adding irrelevant information and omitting some relevant information, I guess we got something that is misleading if the reader does NOT read the manuals as well.
Best,
Uwe Ligges

> Firstly, because I installed the new version of R in D:\Program Files\, I should first go into the D drive. Secondly, because I used the package.skeleton() function to build the 'example' package in the path D:\Program Files\R\rw2011\, I must tell R the path where the 'example' package is saved. That is why I wrote the code like that. If your path is different from mine, you should modify part of this code.
> 10. Finally, the package is successfully built:
>   -- Making package example
>     adding build stamp to DESCRIPTION
>     installing R files
>     installing data files
>     installing man source files
>     installing indices
>     not zipping data
>     installing help
>     Building/Updating help pages for package 'example'
[R] Problem with read.table()
Dear all,
I have encountered a strange problem with read.table(). When I try to read a tab-delimited file, I get an error message about line 260 not having 14 elements (see below). Using count.fields() suggests that a number of lines have length not equal to 14, but not line 260. Looking at the actual file, however, I cannot see anything wrong with any of these lines. They all seem to have length 14, there are no double tabs etc., and the file reads correctly in other programs. Does anyone have any suggestions as to what this might stem from? I have placed a copy of the file at http://dss.ucsd.edu/~kgledits/archigos_v.1.9.asc
regards, Kristian Skrede Gleditsch

archigos1.9 <- read.table("c:/work/work12/archigos/archigos_v.1.9.asc",
+     sep = "\t", header = TRUE, as.is = TRUE, row.names = NULL)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
        line 260 did not have 14 elements
a <- count.fields("c:/work/work12/archigos/archigos_v.1.9.asc", sep = "\t")
a <- data.frame(c(1:length(a)), a)
a[a[, 2] != 14, ]
     c.1.length.a..  a
150             150 10
313             313 10
424             424 10
1189           1189  5
1510           1510 10
1514           1514 10
1590           1590  5
1600           1600 10
1612           1612 10
1618           1618 10
1619           1619 10
1709           1709 10
1722           1722 10
1981           1981 10
1985           1985 10
2112           2112 10
2178           2178 10
2208           2208 10
2224           2224 10
2530           2530  5
2536           2536  5
2573           2573  5
2928           2928  5

-- 
Kristian Skrede Gleditsch
Department of Political Science, UCSD (on leave, University of Essex, 2005-6)
http://weber.ucsd.edu/~kgledits/
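A common cause of exactly this symptom is an unmatched quote character (e.g. an apostrophe in a name): scan() then swallows several lines into one field, so the line number in the error no longer matches what count.fields() reports with default quoting. The small self-contained sketch below (not the poster's file) shows the effect and the usual fix, quote = "".

```r
# Build a tiny tab-delimited file with a stray apostrophe.
tf <- tempfile()
writeLines(c("a\tb\tc",
             "1\tO'Brien\t2",   # apostrophe opens a quoted string...
             "3\tx\t4",
             "5\ty'\t6"), tf)   # ...which only closes two lines later

# With default quoting, the records between the apostrophes get merged:
count.fields(tf, sep = "\t")

# Disabling quoting makes every line a clean 3-field record:
count.fields(tf, sep = "\t", quote = "")
dat <- read.table(tf, sep = "\t", header = TRUE, quote = "")
nrow(dat)   # 3
```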
[R] R graphics
Hi
I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However, I get a lot of white space between the graphs. Does anyone know how I can reduce this?
Thanks
Sam
Re: [R] Is it possible to create highly customized report in *.xls format by using R/S+?
So your conclusion is that the only choice is to make mistakes and get in trouble. (That's what Excel excels at.) Two options I haven't seen mentioned are:
1. Create your deliverables in HTML format and change the extension from .htm to .xls; Excel will import them automatically. The way the file looks in Excel is determined by CSS settings (I've seen this happen) and, I presume, HTML tags.
2. For the real spreadsheet thing, switch to OpenOffice.org. Their format is XML compressed with ZIP, which you can easily work with since the format specifications are not proprietary. See http://xml.openoffice.org/ for details.

-----Original Message-----
From: Wensui Liu
Sent: Wednesday, July 20, 2005 10:56 AM
To: Greg Snow
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Is it possible to create highly customized report in *.xls format by using R/S+?

> I appreciate your reply and understand your point completely. But at times we can't change the rules; the only choice is to follow them. Most deliverables in my work are in Excel format.
> On 7/20/05, Greg Snow <[EMAIL PROTECTED]> wrote:
>> See: http://www.burns-stat.com/pages/Tutor/spreadsheet_addiction.html and http://www.stat.uiowa.edu/~jcryer/JSMTalk2001.pdf
>> Greg Snow, Ph.D., Statistical Data Center, LDS Hospital, Intermountain Health Care
>>> Wensui Liu <[EMAIL PROTECTED]> 07/19/05 03:22PM:
>>> I remember that in one slide of Prof. Ripley's presentation overheads, he said the most popular data analysis software is Excel. So is there any resource or tutorial on this topic? Thank you so much!
Re: [R] Question about 'text' (add lm summary to a plot)
Dear Dan
I can only help you with your third problem, expression and paste. You can use:

plot(1:5, 1:5, type = "n")
text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4)
text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4)
text(2, 3.6, expression(paste(R^2, ": ", 0.78, sep = "")), pos = 4)

I do not have an elegant solution for the alignment.
Regards,
Christoph Buser

-- 
Christoph Buser, Seminar fuer Statistik, ETH Zurich
http://stat.ethz.ch/~buser/
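For the alignment part, one option (a sketch, not from the thread) is to format the numbers with sprintf() fixed-width specifiers and draw the labels in a monospace font, so columns and decimal points line up; the trade-off is giving up the plotmath R^2 and %+-% symbols. The numbers below are the ones from Dan's example.

```r
plot(1:5, 1:5, type = "n")
# "%-9s" left-pads the label to 9 chars; "%7.2f" right-aligns the numbers.
lab <- sprintf("%-9s: %7.2f +- %5.2f",
               c("Slope", "Intercept"), c(3.45, -10.43), c(0.34, 1.42))
lab <- c(lab, sprintf("%-9s: %7.2f", "R^2", 0.78))
text(2, c(4, 3.8, 3.6), lab, pos = 4, family = "mono")
```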
[R] RandomForest question
Hello,
I'm trying to find out the optimal number of variables tried at each split (the mtry parameter) for a randomForest classification. The classification is binary, and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame?
thanks for your help + kind regards,
Arne
[R] heatmap color distribution
Hi all,
I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on the individual subsets. Does anyone know how to do this?
Thanks in advance,
Jake
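One approach to the color part (a sketch with simulated data, not from the thread): compute a single set of breaks from the full matrix and pass it, together with a matching palette, to each subset's heatmap() call; heatmap() forwards 'col' and 'breaks' on to image(). Setting scale = "none" keeps all subsets on the shared scale.

```r
set.seed(42)
full <- matrix(rnorm(400), 40, 10)     # stand-in for the full expression matrix

# One shared set of breaks covering the whole dataset's range
brks <- seq(min(full), max(full), length.out = 33)
cols <- heat.colors(length(brks) - 1)  # one color per bin, reused everywhere

heatmap(full[1:20, ],  col = cols, breaks = brks, scale = "none")
heatmap(full[21:40, ], col = cols, breaks = brks, scale = "none")
```

Making the *clustering* comparable is a separate step: compute the dendrograms on the full matrix and pass the relevant pieces via Rowv/Colv, rather than letting each call recluster its subset.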
Re: [R] Chemoinformatic people
Just with R, or via another tool integrating R, such as Pipeline Pilot?
best,
-tony

On 7/20/05, Frédéric Ooms <[EMAIL PROTECTED]> wrote:
> Dear colleague,
> Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If so, I was wondering whether we could exchange tips and tricks about the use of R in this area?
> Best regards
> Fred Ooms

-- 
best, -tony
A.J. Rossini
"Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes" (AJR, 4Jan05).
Re: [R] R graphics
Sam Baxter wrote:
> Hi
> I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However, I get a lot of white space between the graphs. Does anyone know how I can reduce this?
> Thanks
> Sam

?layout as an alternative to par(mfrow) might be helpful.
Anyway, the margins are too large: see ?par and reduce the value of mar, for instance.
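Following Uwe's ?par hint, a small sketch of shrinking the per-panel margins for a 4x4 grid; 'mar' is given in lines as c(bottom, left, top, right), and 'oma' adds a shared outer margin so axis labels still fit.

```r
op <- par(mfrow = c(4, 4),
          mar = c(2, 2, 1, 0.5),   # tight per-panel margins
          oma = c(2, 2, 0, 0))     # shared outer margin for labels
for (i in 1:16)
    plot(rnorm(10), main = paste("panel", i), cex.main = 0.8)
par(op)   # restore the previous settings
```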
Re: [R] The steps of building library in R 2.1.1
I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I will lower efficiency. I noticed that some other friends were puzzled by the method of building library. Now, I organize a document about it. Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. 
Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths; add the two given above in front of all the others, separated by ;. Why do we add them at the beginning of the path? Because we want the folder that contains the tools to come first, which eliminates the possibility of finding a different program of the same name in a folder that comes before the one where the tools are stored. OK, this (1-6) is all described in the R Installation and Administration manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It automates some of the setup for a new source package: it creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. I type in R:
f <- function(x, y) x + y
g <- function(x, y) x - y
d <- data.frame(a = 1, b = 2)
e <- rnorm(1000)
package.skeleton(list = c("f", "g", "d", "e"), name = "example")
Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ It is a large source of examples. And you can read the 'Creating R packages' part of 'Writing R Extensions'. It introduces some useful things for your reference. This is described in Writing R Extensions and is not related to the setup of your system in 1-6. 8. Download hhc.exe, the Microsoft HTML Help compiler, from somewhere, and save it somewhere in your path. I downloaded 'htmlhelp.exe', ran its setup, and saved hhc.exe into 'C:\cygwin\bin' because this path has been written into my PATH variable value.
However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file. This is described in the R Installation and Administration manual, and I do not see why we should put the HTML Help compiler with the other tools. 9. In the DOS environment, go into D:\ and type the following code: There is no DOS environment in Windows NT based operating systems. cd \Program Files\R\rw2010 bin\R CMD INSTALL /Program Files/R/rw2011/example I do not see why anybody would like to contaminate the binary installation of R with some development source packages; I'd rather use a separate directory. I think reading the two mentioned manuals should be sufficient. You have not added relevant information. By adding irrelevant information and
Re: [R] R graphics
Sam Baxter wrote: Hi I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However I get a lot of white space between each graph. Does anyone know how I can reduce this? Thanks Sam Two options: 1. play around with the `mar' parameter in ?par. 2. (Preferred) Use the lattice package. See, for example:
library(lattice)
trellis.device(theme = col.whitebg())
z <- expand.grid(x = 1:10, y = 1:10, g = LETTERS[1:16])
xyplot(y ~ x | g, z)
HTH, --sundar
Re: [R] Chemoinformatic people
I am looking for both. Fred -Original Message- From: A.J. Rossini [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 3:36 PM To: Frédéric Ooms Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Chemoinformatic people Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED]
Re: [R] bubble.plot() - standardize size of unit circle
Thanks - 'sizeplot' didn't come up in any of my searches. Dan --- Jim Lemon [EMAIL PROTECTED] wrote: Dan Bebber wrote: Hello, I wrote a wrapper for symbols() that produces a bivariate bubble plot, for use when plot(x,y) hides multiple occurrences of the same x,y combination (e.g. if x,y are integers). Circle area ~ counts per bin, and circle size is controlled by 'scale'. Question: how can I automatically make the smallest circle the same size as a standard plot character, rather than having to approximate it using 'scale'? Ben Bolker's sizeplot in the plotrix package does this using the standard plotting symbol 1. Jim
Re: [R] Problem with read.table()
On Thu, 21 Jul 2005, Kristian Skrede Gleditsch wrote: Dear all, I have encountered a strange problem with read.table(). Most `strange problems' are user error, so please try not to blame your tools. When I try to read a tab delimited file I get an error message for line 260 not being equal to 14 (see below). Yes, but not line 260 in that file, but line 260 as read by scan(). Think about quotes ... it works for me with quote="", and the quote around line 150 is causing you to get some very large fields with embedded newlines and tabs. BTW, there is an 'R Data Import/Export' manual which goes step-by-step through the assumptions you make when using read.table with various options. Do read it now. Using count.fields() suggests that a number of lines have length not equal to 14, but not 260. Looking at the actual file, however, I cannot see anything wrong with any lines. They all seem to have length 14, there are no double tabs etc., and the file reads correctly in other programs. Does anyone have any suggestions as to what this might stem from? I have placed a copy of the file at http://dss.ucsd.edu/~kgledits/archigos_v.1.9.asc regards, Kristian Skrede Gleditsch
archigos1.9 <- read.table("c:/work/work12/archigos/archigos_v.1.9.asc", sep="\t", header=TRUE, as.is=TRUE, row.names=NULL)
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 260 did not have 14 elements
a <- count.fields("c:/work/work12/archigos/archigos_v.1.9.asc", sep="\t")
a <- data.frame(c(1:length(a)), a)
a[a[,2] != 14,]
c.1.length.a..
a 150 150 10 313 313 10 424 424 10 1189 1189 5 1510 1510 10 1514 1514 10 1590 1590 5 1600 1600 10 1612 1612 10 1618 1618 10 1619 1619 10 1709 1709 10 1722 1722 10 1981 1981 10 1985 1985 10 2112 2112 10 2178 2178 10 2208 2208 10 2224 2224 10 2530 2530 5 2536 2536 5 2573 2573 5 2928 2928 5 -- Kristian Skrede Gleditsch Department of Political Science, UCSD (On leave, University of Essex, 2005-6) Tel: +44 1206 872499, Fax: +44 1206 873234 Email: [EMAIL PROTECTED] or [EMAIL PROTECTED] http://weber.ucsd.edu/~kgledits/ -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK Tel: +44 1865 272861 (self), +44 1865 272866 (PA) Fax: +44 1865 272595
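Brian's quote= point can be demonstrated on a tiny made-up fragment (current R; the data below are illustrative, not from the Archigos file):

```r
# By default both " and ' act as quote characters in read.table(), so a
# lone apostrophe starts a "string" that swallows the following tabs and
# newlines, merging fields across records. quote = "" switches quote
# processing off entirely.
txt <- "id\tname\tval\n1\tO'Brien\t3\n2\tSmith\t6\n"
d <- read.table(textConnection(txt), sep = "\t", header = TRUE,
                quote = "", stringsAsFactors = FALSE)
d$name   # the apostrophe survives and all three columns stay intact
```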
Re: [R] Clustered standard errors in a panel
No, he wants to fit a glm() and get the right standard errors. For linear models the best way to do this is to model some random effects, but doing so in glm changes the meanings of the parameters. To estimate the same parameters you want to use the sandwich standard errors variously attributed to Huber and White. The Design package has robcov() to do this, and there is also code at http://faculty.washington.edu/tlumley/data2001/sandwich.R -thomas On Wed, 20 Jul 2005, Spencer Graves wrote: Have you considered lmer in library(lme4)? If you are interested in this, you may want to check the article by Doug Bates in the latest R news, www.r-project.org - Documentation: Newsletter. spencer graves Thomas Davidoff wrote: I want to do the following: glm(y ~ x1 + x2 +...) within a panel. Hence y, x1, and x2 all vary at the individual level. However, there is likely correlation of these variables within an individual, so standard errors need adjustment. I do not want to estimate fixed effects, but do want to cluster standard errors at the individual level. Is there an automated way to do this? Nothing in the cluster documentation makes it clear that there is. (An alternative is to do this by hand. In that case, I would need to be able to calculate weighted sums of x1 and x2... at the individual level. I can do this at the variable level [with lapply,split and unsplit], but would love to be able to do so over the matrix of x's. Of course, doing by hand is less easy than an automated solution if it exists.) Thomas Davidoff Assistant Professor Haas School of Business UC Berkeley Berkeley, CA 94720 phone: (510) 643-1425 fax:(510) 643-7357 [EMAIL PROTECTED] http://faculty.haas.berkeley.edu/davidoff [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Spencer Graves, PhD Senior Development Engineer PDF Solutions, Inc. 
333 West San Carlos Street, Suite 700, San Jose, CA 95110, USA [EMAIL PROTECTED] www.pdf.com http://www.pdf.com Tel: 408-938-4420 Fax: 408-280-7915 Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle
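Thomas's sandwich estimator is simple enough to write out in base R. The sketch below (simulated data, hypothetical variable names) computes cluster-robust Huber/White standard errors for a canonical-link glm by hand, rather than via Design's robcov() or the linked code:

```r
# Cluster-robust ("sandwich") standard errors for a canonical-link glm:
# bread = inverse Fisher information (vcov of the fit), meat = outer
# product of the per-cluster summed score contributions.
set.seed(1)
id <- rep(1:50, each = 4)                 # 50 individuals, 4 obs each
x1 <- rnorm(200); x2 <- rnorm(200)
y  <- rbinom(200, 1, plogis(0.5 * x1 - x2))
fit <- glm(y ~ x1 + x2, family = binomial)

X  <- model.matrix(fit)
u  <- (y - fitted(fit)) * X               # score contributions (canonical link)
ug <- rowsum(u, id)                       # sum the scores within each cluster
B  <- vcov(fit)                           # the "bread"
V  <- B %*% crossprod(ug) %*% B           # sandwich variance matrix
cluster.se <- sqrt(diag(V))
cbind(naive = sqrt(diag(vcov(fit))), clustered = cluster.se)
```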
[R] customized link function in R
Hello! I am trying to run my S+ code in R (version 2.1.0). I've created a customized link function, namely my.binomial, where a parameter theta has to be given. I'm taking theta to be .05. Unfortunately, R is giving an error (I had used MASS in adjusting the S+ code to R). I would very much appreciate it if you could help me find the correction needed in the code. The S+ code I'm trying to adjust to R is:
g <- glm(y ~ diffwhale + tdelta:diffwhale + tdelta2:diffwhale, data = ind, family = my.binomial(theta = .05))
Thanks, Isin
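The thread never shows my.binomial() itself, but in current R a custom link can be supplied to binomial() as a "link-glm" object. The sketch below is a template, not Isin's function: every name is hypothetical, and for illustration each piece is just the ordinary logit, where a real theta-dependent link would substitute its own formulas:

```r
# A "link-glm" object must supply the link function, its inverse, the
# derivative d(mu)/d(eta), and a validity check for eta. Here each piece
# is the standard logit, written out explicitly as a template.
my.link <- structure(list(
  linkfun  = function(mu)  qlogis(mu),    # eta = g(mu)
  linkinv  = function(eta) plogis(eta),   # mu  = g^{-1}(eta)
  mu.eta   = function(eta) dlogis(eta),   # d(mu)/d(eta)
  valideta = function(eta) TRUE,
  name     = "my.link"
), class = "link-glm")

set.seed(2)
x <- rnorm(100)
y <- rbinom(100, 1, plogis(x))
fit <- glm(y ~ x, family = binomial(link = my.link))
coef(fit)
```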
Re: [R] Chemoinformatic people
I am just curious why I always want to have a position like that but never find one. Am I lazy, or just unlucky in the job hunt? weiwei On 7/21/05, Frédéric Ooms [EMAIL PROTECTED] wrote: I am looking for both. Fred -Original Message- From: A.J. Rossini [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 3:36 PM To: Frédéric Ooms Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Chemoinformatic people Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED] -- Weiwei Shi, Ph.D Did you always know? No, I did not. But I believed... ---Matrix III
Re: [R] heatmap color distribution
You can use the breaks argument in image to do this. (You don't specify a function you're using, but other heatmap functions probably have a similar parameter.) Look across all your data, figure out the ranges you want to have different colors, and specify the appropriate break points in each call to image. Then you're using the same color set in each one. You run the risk, of course, that some of your images will have a very narrow color range, which might obscure interesting features. But nothing stops you from making more than one plot. Hope this helps. Regards, Matt Wiener -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jacob Michaelson Sent: Thursday, July 21, 2005 9:26 AM To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, Jake
[R] Rprof fails in combination with RMySQL
Dear R community, I tried to optimize my R code by using Rprof. In my R code I'm using MySQL database connections intensively. After a bunch of queries R fails with the following error message: Error in .Call("RS_MySQL_newConnection", drvId, con.params, groups, PACKAGE = .MySQLPkgName) : RS-DBI driver: (could not connect [EMAIL PROTECTED] on dbname myDB Without the R profiler this code has run very stably for weeks. Do you have any ideas or suggestions? I tried the following R versions: (1) platform i386-pc-solaris2.8, version 1.9.1 (2004-06-21); (2) platform sparc-sun-solaris2.8, version 2.1.1 (2005-06-20); (3) platform sparc-sun-solaris2.8, version 1.9.1 (2004-06-21). Thank you in advance and kind regards, Lutz Thieme AMD Saxony/ Product Engineering AMD Saxony Limited Liability Company Co. KG phone: + 49-351-277-4269 M/S E22-PE, Wilschdorfer Landstr. 101 fax: + 49-351-277-9-4269 D-01109 Dresden, Germany
Re: [R] Problem with read.table()
I don't really understand it, but the problem seems to come down to the presence of apostrophes (single right quotes, ') in the text strings. The first of these occurs in line 149 (not counting the header line). If one tries to scan just that line, one gets a vector of length 10: fields 10 to 14 are read as a single field. Upon deleting the apostrophe, I got a vector of length 14 (OMMM!) The help on scan() talks about a quote argument and indicates that if sep is not the newline character, then quote defaults to the pair of characters ' and " (both single and double quotes). It remarks that you can include quotes inside strings by doubling them. I did a global substitution, changing ' to '' throughout, and the read.table() worked (i.e. didn't complain and yielded up a data frame of dimension 2935 x 14). But no apostrophes appeared in the fields in the resulting data frame. The help seems to indicate that you can get around the problem by specifying quote = some character which doesn't appear in the file. (This also saves having to do a global edit.) I tried quote="#" and it seemed to work in this instance. And the apostrophes ***did*** appear in the strings in the data frame. I don't grok why the complaint shows up at line 260 rather than immediately at line 149, but it's a start.
cheers, Rolf Turner [EMAIL PROTECTED]
Re: [R] Chemoinformatic people
I don't. There is an address and an email at Novartis in the ASA directory: ID 068970, Name Anthony J. Rossini, Company Novartis Pharma AG, Address Biostatistics WSJ-27.1.012, City/State/Zip CH-4002 Basel, Country Switzerland, Phone (206) 543-2005, Email [EMAIL PROTECTED] luke On Thu, 21 Jul 2005, A.J. Rossini wrote: Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- Luke Tierney, Chair, Statistics and Actuarial Science, Ralph E. Wareham Professor of Mathematical Sciences, University of Iowa, Department of Statistics and Actuarial Science, 241 Schaeffer Hall, Iowa City, IA 52242. Phone: 319-335-3386, Fax: 319-335-3017, email: [EMAIL PROTECTED], WWW: http://www.stat.uiowa.edu
Re: [R] Chemoinformatic people
I know of a good number of companies who use R via Pipeline Pilot (and have looked into it a bit recently), but not R by itself. One of the big "I wish" items that I've got is seamless handling of large data. Some of the RDBMSs will do it, but not quite seamlessly. S-PLUS 7.0 does it for a limited class, but in a painful (very non-seamless) manner. This would be required to use R in this context, at least for what I've seen. best, -tony On 7/21/05, Frédéric Ooms [EMAIL PROTECTED] wrote: I am looking for both. Fred -Original Message- From: A.J. Rossini [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 3:36 PM To: Frédéric Ooms Cc: r-help@stat.math.ethz.ch Subject: Re: [R] Chemoinformatic people Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we couldn't exchange tips and tricks about the use of R in this area? Best regards Fred Ooms -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED]
Re: [R] RandomForest question
[EMAIL PROTECTED] wrote: Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? If some of the variables are factors, dummy variables are generated and you get a larger number of variables later in the process. Uwe Ligges thanks for your help + kind regards, Arne
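Uwe's point can be illustrated in base R with model.matrix() (toy data, not Arne's):

```r
# A factor with k levels expands to k-1 dummy columns (plus intercept),
# so the effective number of predictors can exceed ncol(data).
d <- data.frame(f1 = factor(rep(letters[1:4], 5)),   # 4 levels -> 3 dummies
                f2 = factor(rep(LETTERS[1:2], 10)),  # 2 levels -> 1 dummy
                x  = rnorm(20))                      # numeric -> 1 column
ncol(d)                      # 3 data-frame columns
ncol(model.matrix(~ ., d))   # 6 model-matrix columns (incl. intercept)
```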
Re: [R] heatmap color distribution
Thanks for the reply. As I understand it, breaks only controls the binning. The problem I'm having is that each subset heatmap has slightly different min and max log2 intensities. I'd like the colors to be based on the overall (complete set) max and min, not the subsets' max and min -- I could be wrong, but I don't think breaks will help me there. And you're right - this might obscure some of the trends/features, but we'll also plot the default heatmaps. Also (I should have specified) I'm using heatmap.2. Thanks, Jake On Jul 21, 2005, at 8:09 AM, Wiener, Matthew wrote: You can use the breaks argument in image to do this. (You don't specify a function you're using, but other heatmap functions probably have a similar parameter.) Look across all your data, figure out the ranges you want to have different colors, and specify the appropriate break points in each call to image. Then you're using the same color set in each one. You run the risk, of course, that some of your images will have a very narrow color range, which might obscure interesting features. But nothing stops you from making more than one plot. Hope this helps. Regards, Matt Wiener -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jacob Michaelson Sent: Thursday, July 21, 2005 9:26 AM To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, Jake
[R] debian vcd package
[Apologies if you have already read this message sent from another email address] Hi R-Help, I have been using R on Linux (Debian) for the past month. The usual way I install packages is through apt. Recently, a new package, vcd, became available on CRAN. I tried installing it today and found that Debian does not seem to provide a package for it. I also found that many other packages were unavailable. Does anyone have any recommended sites where a full list is available? If none exist, what would be the best way to move ahead in installing, say, the vcd package? I am still a novice in using Debian, so please forgive me if some of my questions seem trivial to experienced users. Peter Peter Ho, PhD. Escola Superior de Tecnologia e Gestao. Instituto Politecnico de Viana do Castelo. Avenida do Atlantico - Apartado 574. 4901-908 Viana do Castelo. Portugal. Tel: +351-258-819700 Ext. 1252 Email: [EMAIL PROTECTED]
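A hedged sketch of the usual workaround (the repository URL and paths below are illustrative): install from CRAN source inside R, keeping such packages in a private library so they don't collide with apt-managed ones.

```r
# Not run here: fetches and compiles the package from CRAN (on Debian
# you typically need the r-base-dev package for compilation).
# install.packages("vcd", repos = "http://cran.r-project.org")

# A private library directory keeps source-installed packages separate
# from those managed by apt ('lib' below is an illustrative path):
lib <- file.path(tempdir(), "Rlibs")
dir.create(lib, showWarnings = FALSE)
.libPaths(c(lib, .libPaths()))   # R now searches 'lib' first
.libPaths()
```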
Re: [R] heatmap color distribution
Breaks affects the binning into colors. Try this. Assume that temp is one of your data sets. Its values are restricted to 0.25 - 0.75, and we'll assume that the full data set goes from 0 to 1.
temp <- matrix(runif(60, 0.25, 0.75), nc = 6)
breaks <- seq(from = 0, to = 1, length = 11)
image(temp, col = heat.colors(10))                  # full range of color
image(temp, col = heat.colors(10), breaks = breaks) # muted colors
The second image is told about all the colors, and about the full range of data through breaks, and only uses the colors in the middle. Is that what you mean? HTH, Matt -Original Message- From: Jake Michaelson [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 10:45 AM To: Wiener, Matthew Cc: R-help@stat.math.ethz.ch Subject: Re: [R] heatmap color distribution Thanks for the reply. As I understand it, breaks only controls the binning. The problem I'm having is that each subset heatmap has slightly different min and max log2 intensities. I'd like the colors to be based on the overall (complete set) max and min, not the subsets' max and min -- I could be wrong, but I don't think breaks will help me there. And you're right - this might obscure some of the trends/features, but we'll also plot the default heatmaps. Also (I should have specified) I'm using heatmap.2. Thanks, Jake On Jul 21, 2005, at 8:09 AM, Wiener, Matthew wrote: You can use the breaks argument in image to do this. (You don't specify a function you're using, but other heatmap functions probably have a similar parameter.) Look across all your data, figure out the ranges you want to have different colors, and specify the appropriate break points in each call to image. Then you're using the same color set in each one. You run the risk, of course, that some of your images will have a very narrow color range, which might obscure interesting features. But nothing stops you from making more than one plot. Hope this helps.
Regards, Matt Wiener -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jacob Michaelson Sent: Thursday, July 21, 2005 9:26 AM To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, Jake
Re: [R] Rprof fails in combination with RMySQL
I think you're barking up the wrong tree. Optimize the MySQL code separately from optimizing the R code. A very nice reference about the former is http://highperformancemysql.com/. Also, if possible, do everything in MySQL. hth, b. -Original Message- From: Thieme, Lutz [mailto:[EMAIL PROTECTED] Sent: Thursday, July 21, 2005 10:11 AM To: Rhelp (E-mail) Subject: [R] Rprof fails in combination with RMySQL
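For reference, the basic Rprof workflow in base R (shown on a dummy workload, since Lutz's MySQL code isn't available) looks like this:

```r
# Profile a block of code: Rprof(file) starts sampling, Rprof(NULL)
# stops it, and summaryRprof() tabulates where the time went.
tmp <- tempfile()
Rprof(tmp)
for (i in 1:30) d <- svd(matrix(rnorm(400 * 400), 400))  # dummy workload
Rprof(NULL)
head(summaryRprof(tmp)$by.self)
```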
Re: [R] The steps of building library in R 2.1.1
An article like that would be really great. On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down on a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like Creating my first R package under Windows? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! Uwe Ligges On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I will lower efficiency. I noticed that some other friends were puzzled by the method of building library. Now, I organize a document about it. Hoping it can help more friends. 1. 
Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel -> System -> Advanced -> Environment Variables -> Path -> Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths, add the two given above in front of all others, separated by ;. Why do we add them at the beginning of the path? Because we want the folder that contains the tools to be at the beginning, so that you eliminate the possibility of finding a different program of the same name first in a folder that comes prior to the one where the tools are stored. OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. I type in R:

f <- function(x, y) x + y
g <- function(x, y) x - y
d <- data.frame(a = 1, b = 2)
e <- rnorm(1000)
package.skeleton(list = c("f", "g", "d", "e"), name = "example")

Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ It is a large source of examples. 
And you can read the part on 'Creating R packages' in 'Writing R Extensions'. It introduces some useful things for your reference. This is described in 'Writing R Extensions' and is not related to the setup of your system in 1-6. 8. Download hhc.exe, the Microsoft HTML Help compiler, from somewhere, and save it somewhere in your path. I downloaded an 'htmlhelp.exe', ran the setup, and saved hhc.exe into 'C:\cygwin\bin' because this path has already been written into my PATH variable value. However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file.
[R] principal component analysis in affy
Hi, I have been using the prcomp function to perform PCA on my example microarray data (stored in metric text files), which looks like this:

     1a      1b      1c  1d  1e  1f  ...  4r  4s  4t
g1   1.2705  1.2766  ...                      2.0298
g2   0.1631  0.7067
g3   0.2212  1.0439
...
g99  1.3657  ...                              2.3736

i.e. a matrix of 63 columns and 99 rows, where the columns represent chips and the rows represent genes. Now, the biplot function biplot(prcomp(pcadata, scale = TRUE), cex = c(0.75, 0.75)) gives me a plot with one vector per gene. However, I actually need to get one vector per chip instead of one vector per gene. I have been told that there is a function in the affy package that does what I am looking for, i.e. gives one vector per chip. Can someone please tell me what the function is called, and how I can get hold of the code (since I believe affy only works on CEL files)? I have downloaded the affy R code from Terry Speed's website already, but I don't know where (if at all) the code to perform PCA is. Thank you everyone! Sincerely, Mugdha Wagle Hartwell Center for Bioinformatics and Biotechnology, St. Jude Children's Research Hospital, Memphis TN 38105 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
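Independently of affy, base R can already give one point per chip: prcomp() treats rows as observations, so transposing the genes-by-chips matrix makes the chips the observations. A sketch with simulated data standing in for the poster's file:

```r
set.seed(1)
## Stand-in for the 99-genes x 63-chips matrix described above
pcadata <- matrix(rnorm(99 * 63), nrow = 99, ncol = 63,
                  dimnames = list(paste("g", 1:99, sep = ""),
                                  paste("chip", 1:63, sep = "")))
## Transpose so that each chip (column) becomes one observation (row)
pc <- prcomp(t(pcadata), scale. = TRUE)
plot(pc$x[, 1], pc$x[, 2], xlab = "PC1", ylab = "PC2")  # one point per chip
```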
Re: [R] RandomForest question
Hi, I found the following lines from Leo's randomForest documentation, and I am not sure if it can be applied here, but just tried to help: mtry0 = the number of variables to split on at each node. Default is the square root of mdim. ATTENTION! DO NOT USE THE DEFAULT VALUES OF MTRY0 IF YOU WANT TO OPTIMIZE THE PERFORMANCE OF RANDOM FORESTS. TRY DIFFERENT VALUES - GROW 20-30 TREES, AND SELECT THE VALUE OF MTRY THAT GIVES THE SMALLEST OOB ERROR RATE. mdim is the number of predictors. HTH, weiwei On 7/21/05, Liaw, Andy [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? It's not. The code for randomForest.default() has: ## Make sure mtry is in reasonable range. mtry <- max(1, min(p, round(mtry))) so it silently sets mtry to the number of predictors if it's too large. As an example: library(randomForest) randomForest 4.5-12 Type rfNews() to see new features/changes/bug fixes. iris.rf = randomForest(Species ~ ., iris, mtry=10) iris.rf$mtry [1] 4 I should probably add a warning in such cases... Andy thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Weiwei Shi, Ph.D Did you always know? No, I did not. But I believed... 
---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
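Breiman's advice quoted above can be followed directly with the R package: grow a modest number of trees for each candidate mtry and compare OOB error rates. A sketch on the bundled iris data (the poster's 575-case data frame would take its place):

```r
library(randomForest)
set.seed(42)
## OOB error for each candidate mtry, using small forests as Breiman suggests
oob.err <- sapply(1:4, function(m) {
  rf <- randomForest(Species ~ ., data = iris, mtry = m, ntree = 30)
  rf$err.rate[rf$ntree, "OOB"]   # OOB error after the last tree
})
oob.err
which.min(oob.err)               # candidate mtry with the smallest OOB error
```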
Re: [R] The steps of building library in R 2.1.1
On 7/21/2005 9:43 AM, Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). I agree with some of this, but I don't see much interest in fixing it. For example, getting rid of Perl would be a lot of work. When the Perl scripts were written, R was not capable of doing what they do. I think it is capable now, but there's still a huge amount of translation work to do. Who will do that? Who will test that they did it right? At the end, will it actually have been worth all the trouble? Installing Perl is not all that hard. 2. there is too much material to absorb just to create a package. The manuals are insufficient. The first sentence here is basically a repetition of the process is too complex. I think the second sentence is incorrect. Could you please point out what necessary steps are missing? A step-by-step simplification is very much needed. Exactly this has been in the Installation and Administration manual since I put it there in February for the 2.1.0 release. It's at the beginning of the appendix on the Windows toolset, with multiple references pointing people there. It's followed by detailed descriptions of each of the steps. If you think it could be further improved, please submit improvements. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. As far as I can tell, those all predate the release of 2.1.0. I think your complaints are out of date. 
Duncan Murdoch On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I will lower efficiency. I noticed that some other friends were puzzled by the method of building library. Now, I organize a document about it. Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Balue; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths, add the two given above in front of all others, separated by ;. Why we add them in the beginning of the path? Because we want the folder that contains the tools to be at the beginning so that you eliminate the possibility of finding a different program of the same name first in a folder that comes prior to the one where the tools are stored. OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions anddata to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. 
I type in R: f <- function(x, y) x + y g <- function(x, y) x - y d <- data.frame(a = 1, b = 2) e <- rnorm(1000) package.skeleton(list = c("f", "g", "d", "e"), name = "example") Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ It is a large source of examples. And you can read the part on 'Creating R packages' in 'Writing R Extensions'. It introduces some useful things for your reference. This is described in
Re: [R] The steps of building library in R 2.1.1
On 7/21/2005 10:29 AM, Uwe Ligges wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down on a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like Creating my first R package under Windows? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! That sounds great. Could you also take notes as you go about specific problems in the writeup in the R-admin manual, so it can be improved for the next release? Another thing you could do which would be valuable: get a student or someone else who is reasonably computer literate, but unfamiliar with R details, to do this while you sit watching and recording their mistakes. Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] heatmap color distribution
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Behalf Of Jacob Michaelson Sent: 21 July 2005 12:26 To: r-help@stat.math.ethz.ch Subject: [R] heatmap color distribution Hi all, I've got a set of gene expression data, and I'm plotting several heatmaps for subsets of the whole set. I'd like the heatmaps to have the same color distribution, so that comparisons may be made (roughly) across heatmaps; this would require that the color distribution and distance functions be based on the entire dataset, rather than on individual subsets. Does anyone know how to do this? Thanks in advance, For each heatmap, in image() set the zlim argument to c(zmin,zmax) where zmin and zmax are the minimum and maximum observed across the entire data set. Also, for each heatmap set col=heat.colors(n) to the same n for all heatmaps. I do that with image.kriging in geoR. Hope it works for you. Ruben __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
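In code, Ruben's suggestion amounts to computing the z-range once over the full matrix and passing it, together with a fixed palette, to every image() call. A minimal sketch with made-up data in place of the expression matrix:

```r
full <- matrix(rnorm(400), 20, 20)       # stand-in for the whole dataset
zr   <- range(full)                      # shared color limits for all heatmaps
pal  <- heat.colors(64)                  # same palette length for all heatmaps
sub1 <- full[1:10, ]
sub2 <- full[11:20, ]
image(t(sub1), zlim = zr, col = pal)     # both subsets now use the
image(t(sub2), zlim = zr, col = pal)     # same value-to-color mapping
```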
[R] output of variance estimate of random effect from a gamma frailty model using Coxph in R
Hi, I have a question about the output for the variance of the random effect from a gamma frailty model using coxph in R. Is it the variance of the frailties themselves or the variance of the log frailties? Thanks. Guanghui __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
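For concreteness, a minimal gamma-frailty fit with the survival package looks like the sketch below; the "Variance of random effect" line in the printed output is the quantity the question is about (the example uses the bundled lung data, not the poster's):

```r
library(survival)
## Cox model with a frailty term per institution (gamma is the default distribution)
fit <- coxph(Surv(time, status) ~ age + frailty(inst), data = lung)
print(fit)   # output includes a "Variance of random effect" line
```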
[R] Concatenate 2 functions
hi all I need to concatenate 2 functions into one like temp <- 1:1000 for(i=0; i<1000; i++) { func <- func function(beta) dweibull(temp[i],beta,eta) } Any idea on this? thks guillaume. // Webmail Oreka : http://www.oreka.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] R:plot and dots
On Thu, 2005-07-21 at 16:18 +0200, Clark Allan wrote: hi all a very simple question. i have plot(x,y) but i would like to add in on the plot the observation number associated with each point. how can this be done? / allan If you mean the unique observation number associated with each x,y pair, you can use: text(x, y, labels = ObsNumberVector, pos = 3) after the plot(x, y) call: df <- data.frame(x = rnorm(10), y = rnorm(10), ID = 1:10) with(df, plot(x, y)) with(df, text(x, y, labels = ID, pos = 3)) See ?text for more information. Note that I used pos = 3 which places the label above the data point. There are other positioning parameters available, which are noted in the help file. Note also that you might have to adjust the plot axis limits depending upon where you place the text and your extreme points. If you mean the frequency of each x,y pair (if there is more than one observation per x,y pair), you might want to review Deepayan's recent post here: https://stat.ethz.ch/pipermail/r-help/2005-July/074042.html HTH, Marc Schwartz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] reorder bug in heatmap.2?
I want to plot a heatmap without reordering the columns. This works fine in heatmap: heatmap(meanX[selected,], col=cm.colors(256), Colv=NA) But in heatmap.2 I get: heatmap.2(meanX[selected,], col=cm.colors(256), Colv=NA) Error in if (!is.logical(Colv) || Colv) ddc <- reorder(ddc, Colv) : missing value where TRUE/FALSE needed (Note that instructions for the use of Colv and Rowv are identical in both heatmap and heatmap.2 documentation) Is there another way to not reorder columns in heatmap.2? Thanks in advance, Jake __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
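One workaround worth trying (an assumption, untested against the poster's gplots version): the error comes from NA failing the is.logical(Colv) test, so pass the logical FALSE instead and suppress the column dendrogram explicitly:

```r
library(gplots)
m <- matrix(rnorm(100), 10, 10)       # stand-in for meanX[selected, ]
heatmap.2(m, col = cm.colors(256),
          Colv = FALSE,               # logical, so is.logical(Colv) is TRUE
          dendrogram = "row")         # keep only the row dendrogram
```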
Re: [R] debian vcd package
On Thu, Jul 21, 2005 at 03:55:29PM +0100, Peter Ho wrote: [Apologies if you have already read this message sent from another email address] Hi R-Help, I have been using R in Linux (Debian) for the past month. The usual way I install packages is through apt. Recently, a new packages vcd became available on CRAN. I tried installing it today and found that Debian does not seem to support this package. I also found that many other packages were unavailable. Does anyone have any recommended sites where a full list is available? If none exist, what would be the best way to move ahead in installing say the vcd package. I am still a novice in using Debian and so please forgive me if some of my questions may seem trivial for experienced users. Unfortunately, the term package means different things in the context of R and of Debian. A Debian package is what you install using tools like apt etc. The traditional way of installing an R package on Linux is to * have R installed from source, or install the r-base-dev Debian package * download the package archive (e.g. http://www.stats.bris.ac.uk/R/src/contrib/vcd_0.9-0.tar.gz ) * run the R CMD INSTALL command on it, e.g. R CMD INSTALL vcd_0.9-0.tar.gz This requires having a number of development Debian packages installed, such as gcc, g77 etc (installing r-base-dev will automatically resolve such dependencies). Best regards, Jan -- +- Jan T. Kim ---+ |*NEW*email: [EMAIL PROTECTED] | |*NEW*WWW: http://www.cmp.uea.ac.uk/people/jtk | *-= hierarchical systems are for files, not for humans =-* __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
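As an alternative to running R CMD INSTALL by hand, the same download-build-install cycle can be driven from inside an R session (this still requires r-base-dev and the compilers mentioned above):

```r
## Fetch the vcd source package from CRAN and build/install it locally
install.packages("vcd", repos = "http://cran.r-project.org")
```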
Re: [R] The steps of building library in R 2.1.1
Duncan Murdoch wrote: On 7/21/2005 10:29 AM, Uwe Ligges wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. Its no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down on a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like Creating my first R package under Windows? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! That sounds great. Could you also take notes as you go about specific problems in the writeup in the R-admin manual, so it can be improved for the next release? Of course. Another thing you could do which would be valuable: get a student or someone else who is reasonably computer literate, but unfamiliar with R details, to do this while you sit watching and recording their mistakes. Good idea. Uwe Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Concatenate 2 functions
[EMAIL PROTECTED] wrote: hi all I need to concatenate 2 functions into one like temp <- 1:1000 for(i=0; i<1000; i++) { func <- func function(beta) dweibull(temp[i],beta,eta) } Please read An Introduction to R. Please read the posting guide. What do you expect to be in func? This is completely unclear to me. Uwe Ligges Any idea on this? thks guillaume. // Webmail Oreka : http://www.oreka.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
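One guess at the intent (an assumption, since the original code is ambiguous): build a single function of beta that combines the Weibull density over all values in temp, e.g. a log-likelihood, rather than concatenating 1000 closures. dweibull is vectorised over its first argument, so no loop is needed:

```r
temp <- 1:1000
eta  <- 2    # assumed value for the scale parameter
## One function of beta that sums the Weibull log-densities over all of temp
func <- function(beta) sum(dweibull(temp, shape = beta, scale = eta, log = TRUE))
func(1.5)    # evaluate at a trial value of beta
```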
Re: [R] R:plot and dots
Clark Allan wrote: hi all a very simple question. i have plot(x,y) but i would like to add in on the plot the observation number associated with each point. how can this be done? See ?text Uwe Ligges / allan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] About object of class mle returned by user defined functions
Hi, There is something I don't get with objects of class mle returned by a function I wrote. More precisely, it's about the behaviour of the methods confint and profile applied to these objects. I've written a short function (see below) whose arguments are: 1) A univariate sample (arising from a gamma, log-normal or whatever). 2) A character string standing for one of the R densities, e.g., "gamma", "lnorm", etc. That's the density the user wants to fit to the data. 3) A named list with initial values for the density parameters; that will be passed to optim via mle. 4) The method to be used by optim via mle. That can be changed by the code if parameter boundaries are also supplied. 5) The lowest allowed values for the parameters. 6) The largest allowed values. The big thing this short function does is writing on the fly the corresponding log-likelihood function before calling mle. The object of class mle returned by the call to mle is itself returned by the function. Here is the code:

newFit <- function(isi,                ## The data set
    isi.density = "gamma",             ## The name of the density used as model
    initial.para = list(shape = (mean(isi)/sd(isi))^2,
                        scale = sd(isi)^2 / mean(isi)),  ## Initial parameters passed to optim
    optim.method = "BFGS",             ## optim method
    optim.lower = numeric(length(initial.para)) + 0.1,
    optim.upper = numeric(length(initial.para)) + Inf,
    ...) {
  require(stats4)
  ## Create a string with the log likelihood definition
  minusLogLikelihood.txt <- paste("function(",
      paste(names(initial.para), collapse = ", "),
      ") {",
      "isi <- eval(", deparse(substitute(isi)), ", envir = .GlobalEnv);",
      "-sum(", paste("d", isi.density, sep = ""), "(isi, ",
      paste(names(initial.para), collapse = ", "),
      ", log = TRUE) ) }")
  ## Define logLikelihood function
  minusLogLikelihood <- eval(parse(text = minusLogLikelihood.txt))
  environment(minusLogLikelihood) <- .GlobalEnv
  if (all(is.infinite(c(optim.lower, optim.upper)))) {
    getFit <- mle(minusLogLikelihood,
                  start = initial.para,
                  method = optim.method,
                  ...)
  } else {
    getFit <- mle(minusLogLikelihood,
                  start = initial.para,
                  method = "L-BFGS-B",
                  lower = optim.lower,
                  upper = optim.upper,
                  ...)
  } ## End of conditional on all(is.infinite(c(optim.lower, optim.upper)))
  getFit
}

It seems to work fine on examples like:

isi1 <- rgamma(100, shape = 2, scale = 1)
fit1 <- newFit(isi1) ## fitting here with the correct density (initial parameters are obtained by the method of moments)
coef(fit1)
    shape     scale
1.8210477 0.9514774
vcov(fit1)
           shape      scale
shape 0.05650600 0.02952371
scale 0.02952371 0.02039714
logLik(fit1)
'log Lik.' -155.9232 (df=2)

If we compare with a direct call to mle:

llgamma <- function(sh, sc) -sum(dgamma(isi1, shape = sh, scale = sc, log = TRUE))
fitA <- mle(llgamma,
            start = list(sh = (mean(isi1)/sd(isi1))^2,
                         sc = sd(isi1)^2 / mean(isi1)),
            lower = c(0.0001, 0.0001),
            method = "L-BFGS-B")
coef(fitA)
      sh       sc
1.821042 1.051001
vcov(fitA)
           sh          sc
sh  0.05650526 -0.03261146
sc -0.03261146  0.02488714
logLik(fitA)
'log Lik.' -155.9232 (df=2)

I get almost the same estimated parameter values and the same log-likelihood, but not the same vcov matrix. A call to profile or confint on fit1 does not work, e.g.:

confint(fit1)
Profiling...
Error in approx(sp$y, sp$x, xout = cutoff) :
  need at least two non-NA values to interpolate
In addition: Warning message:
collapsing to unique 'x' values in: approx(sp$y, sp$x, xout = cutoff)

Although calling the log-likelihood function defined in fit1 (fit1@minuslogl) with argument values different from the MLE does return something sensible:

fit1@minuslogl(coef(fit1)[1], coef(fit1)[2])
[1] 155.9232
fit1@minuslogl(coef(fit1)[1] + 0.01, coef(fit1)[2] + 0.01)
[1] 155.9263

There is obviously something I'm missing here, since I thought for a while that the problem was with the environment attached to the function minusLogLikelihood when calling eval; but the lines above make me think that is not the case... Any help and/or ideas warmly welcomed. Thanks, Christophe. 
-- A Master Carpenter has many tools and is expert with most of them. If you only know how to use a hammer, every problem starts to look like a nail. Stay away from that trap. Richard B Johnson. -- Christophe Pouzat
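For what it's worth, a quick cross-check of the gamma fit in newFit is MASS::fitdistr, which wraps the optimisation and returns standard errors directly. A sketch (note fitdistr parameterises the gamma as shape/rate, so rate = 1/scale):

```r
library(MASS)
set.seed(1)
isi1 <- rgamma(100, shape = 2, scale = 1)  # simulated data, as in the post
fd <- fitdistr(isi1, "gamma")              # ML fit of a gamma density
fd$estimate                                # shape and rate (rate = 1/scale)
fd$sd                                      # standard errors, for comparison with vcov(fit1)
logLik(fd)                                 # for comparison with logLik(fit1)
```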
[R] R:plot and dots
hi all a very simple question. i have plot(x,y) but i would like to add in on the plot the observation number associated with each point. how can this be done? / allan__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
On 7/21/05, Duncan Murdoch [EMAIL PROTECTED] wrote: On 7/21/2005 9:43 AM, Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). I agree with some of this, but I don't see much interest in fixing it. For example, getting rid of Perl would be a lot of work. When the Perl scripts were written, R was not capable of doing what they do. I think it is capable now, but there's still a huge amount of translation work to do. Who will do that? Who will test that they did it right? At the end, will it actually have been worth all the trouble? Installing Perl is not all that hard. Each step may not be hard, but the totality of them all means it's pretty complex for most people. I don't know who will do it, or whether anyone even will, but a first step is identifying that it needs to be done. Since the key to expanding R is to expand its library, to me, making it simple to create and install packages ought to be a very high priority for the core group, regardless of the difficulty in achieving this. If no one is interested in doing it, then it will remain a limitation of R that commercial or other free systems can use to gain advantage over R. 2. there is too much material to absorb just to create a package. The manuals are insufficient. The first sentence here is basically a repetition of "the process is too complex". I think the second sentence is incorrect. Could you please point out what necessary steps are missing? It's not that anything is missing that I am aware of. It's that there is so much detail one is overwhelmed. 
It's not completely the fault of the description since, as point #1 mentions, the process itself is a key part of the problem. A step-by-step simplification is very much needed. Exactly this has been in the Installation and Administration manual since I put it there in February for the 2.1.0 release. It's at the beginning of the appendix on the Windows toolset, with multiple references pointing people there. It's followed by detailed descriptions of each of the steps. If you think it could be further improved, please submit improvements. That is easy to say but, in fact, if anyone does this they are not met with a receptive atmosphere. The excellent post describing the process that started this out (even if there are some small errors) is just one example. It's no coincidence that there are a number of such descriptions on the net (google for 'making creating R package') since I would guess that just about everyone has significant problems in creating their first package on Windows. As far as I can tell, those all predate the release of 2.1.0. I think your complaints are out of date. I am sure the situation is getting better, but I did look at the manuals again before posting, and do think that a step-by-step article such as that in Ivy Li's post, the various documents on the net findable by Google as I mentioned, and the proposed article by Uwe are really needed in addition to the manuals. The manuals can then be used to get additional detail. Duncan Murdoch On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I would have been much less efficient. I noticed that some other friends were puzzled by the method of building a library. Now, I have organized a document about it. 
Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths; add the two given above in front of all others, separated by ;. Why do we add them at the beginning of the path?
[R] opening RDB files
Hi all, I've recently upgraded to R version 2.1.1, and when trying to inspect the contents of many packages in the library (for instance library\MASS\R) I've realized that WordPad or Notepad won't open them, since they have *.RDB and *.RDX extensions which these editors cannot recognize. However, libraries in previous versions of R did not have these extensions, and I could inspect the contents of each package without any trouble. I've been searching for this thread but did not find it. Thank you! Emili __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] opening RDB files
Emili Tortosa-Ausina wrote: Hi all, I've recently upgraded to R version 2.1.1, and when trying to inspect the contents of many packages in the library (for instance library\MASS\R) I've realized that WordPad or Notepad won't open them, since they have *.RDB and *.RDX extensions which these editors cannot recognize. However, libraries in previous versions of R did not have these extensions, and I could inspect the contents of each package without any trouble. I've been searching for this thread but did not find it. Well, these are the lazy loading databases which were introduced in R 1.9.0, AFAIR. There is a corresponding article in R News. Just download the source package in order to look at the code. Uwe Ligges Thank you! Emili __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] normal reference intervals
I am interested in calculating age-specific normal reference intervals, using non-parametric methods - or ideally something called the LMS method (which, as I understand it, uses cubic splines fitted to the data). Are there any packages in R that you think might help me? Any other advice gratefully received. Many thanks. Best wishes, David. - Dr. David Crabb School of Biomedical and Natural Sciences, Nottingham Trent University, Clifton Campus, Nottingham. NG11 8NS Tel: 0115 848 3275 Fax: 0115 848 6690 This email is intended solely for the addressee. It may contain private and confidential information. If you are not the intended addressee, please take no action based on it nor show a copy to anyone. In this case, please reply to this email to highlight the error. Opinions and information in this email that do not relate to the official business of Nottingham Trent University shall be understood as neither given nor endorsed by the University. Nottingham Trent University has taken steps to ensure that this email and any attachments are virus-free, but we do advise that the recipient should check that the email and its attachments are actually virus free. This is in keeping with good computing practice. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
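Pending a dedicated LMS implementation, a crude non-parametric starting point is the empirical 2.5th and 97.5th percentiles within age bands. A minimal sketch on simulated data (all variable names and the simulated relationship are illustrative, not from any package):

```r
set.seed(1)
age   <- runif(500, 20, 80)                      # simulated ages
value <- rnorm(500, mean = 5 + 0.05 * age)       # analyte rising with age
band  <- cut(age, breaks = seq(20, 80, by = 10)) # 10-year age bands
## empirical 95% reference limits per band
ref <- t(sapply(split(value, band), quantile, probs = c(0.025, 0.975)))
ref
```

With enough subjects per band this gives rough age-specific limits; the LMS method smooths the age dependence instead of discretising it.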
Re: [R] opening RDB files
Emili Tortosa-Ausina wrote: Hi all, I've recently upgraded to R version 2.1.1, and when trying to inspect the contents of many packages in the library (for instance library\MASS\R) I've realized that WordPad or Notepad won't open them, since they have *.RDB and *.RDX extensions which these editors cannot recognize. However, libraries in previous versions of R did not have these extensions, and I could inspect the contents of each package without any trouble. The *.rdb etc. are the new compact package file formats introduced around R v2.0.0; these are binary files that won't make much sense to look at. It was only in versions before this that you could inspect the R code by looking at the file pkg/R/pkg.R in a text file viewer/editor. To look at the code now, you have to either download the source of the package you're interested in (look for the *.tar.gz files), or you can always do it from within R, e.g. print(read.table). If the function you want to look at gives UseMethod and so on, you're looking at a generic function, e.g. print(print): function (x, ...) UseMethod("print") environment: namespace:base then you want to track down the method for your specific object. To find all implementations of print, use methods(), e.g. methods(print): [1] print.acf* print.anova [3] print.aov* print.aovlist* [5] print.ar* print.Arima* [7] print.arima0* print.AsIs [9] print.Bibtex* print.by snip/snip [123] print.vignette* print.xgettext* [125] print.xngettext* print.xtabs* Then do, say, print(print.by) and you'll see the code. All methods with an asterisk are namespace-protected methods. To get these you have to use getAnywhere(), e.g. print(getAnywhere(print.acf)). Why the new file format? It is used for packages that utilize lazy loading, which more and more packages now use (packages without lazy loading can still be inspected the old way). Thanks to lazy loading, packages now load more or less instantaneously. 
They are also more memory efficient, because not all code is loaded at once. Cheers Henrik Bengtsson I've been searching for this thread but did not find it. Thank you! Emili __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
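Henrik's recipe condenses to three calls; a quick sketch of the same steps, using print methods as in his example:

```r
print(read.table)            # ordinary exported function: just print it
methods(print)               # list all print methods; '*' = namespace-protected
getAnywhere("print.acf")     # retrieve a namespace-protected method's code
```

The same pattern works for any generic: methods() to find the candidates, then print() or getAnywhere() to see the source.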
Re: [R] colnames
Hi Adai, Your diagnosis is absolutely right; class(r1) returned data.frame and your suggested solution worked perfectly. Your assumption is also right; both x and y are positive. If I want to compare the performance of my old function with yours, are there some functions in R I could use to get the elapsed time etc.? Many thanks indeed. Regards, Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 23:38 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames What does class(r1) give you? If it is data.frame, then try exp( diff( log( as.matrix( df ) ) ) ) BTW, I made the assumption that both x and y are positive values only. Regards, Adai On Tue, 2005-07-19 at 16:30 +0100, Gilbert Wu wrote: Hi Adai, When I tried the optimized routine, I got the following error message: r1 899188 902232 901714 28176U 15322M 20050713 7.595 10.97 17.96999 5.1925 11.44 20050714 7.605 10.94 18.00999 5.2500 11.50 20050715 7.480 10.99 17.64999 5.2500 11.33 20050718 7.415 11.05 17.64000 5.2250 11.27 exp(diff(log(r1))) - 1 Error in r[i1] - r[-length(r):-(length(r) - lag + 1)] : non-numeric argument to binary operator Any idea? Many thanks. Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 12:20 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames First, your problem could be boiled down to the following example. See how the colnames of the two outputs vary. df <- cbind.data.frame( "100" = 1:2, "200" = 3:4 ) df/df X100 X200 1 1 1 2 1 1 m <- as.matrix( df ) # coerce to matrix class m/m 100 200 1 1 1 2 1 1 It appears that whenever R has to create a new dataframe automatically, it tries to get nice colnames. See help(data.frame). I am not exactly sure why this behaviour is different when creating a matrix. But I do not think this is a major problem for most people. If you coerce your input to matrix, the problem goes away. 
Next, note the following points: a) mat[ 1:3, 1:ncol(mat) ] is equivalent to simply mat[ 1:3, ]. b) mat[ 2:nrow(mat), ] is equivalent to simply mat[ -1, ]. See help(subset) for more information. Using the points above, we can simplify your function as p.RIs2Returns <- function (mat){ mat <- as.matrix(mat) x <- mat[ -nrow(mat), ] y <- mat[ -1, ] return( y/x - 1 ) } If your data contains only numerical data, it is probably a good idea to work with matrices, as matrix operations are faster. Finally, we can shorten your function. You can use the diff function (which works column-wise if the input is a matrix) if you know that y/x = exp(log(y/x)) = exp( log(y) - log(x) ), which could be coded in R as exp( diff( log(r1) ) ), and then subtract 1 from the above to get your returns. Regards, Adai On Tue, 2005-07-19 at 09:17 +0100, Gilbert Wu wrote: Hi Adai, Many thanks for the examples. I work for a financial institution. We are exploring R as a tool to implement our portfolio optimization strategies. Hence, R is still a new language to us. The script I wrote tried to make a returns matrix from the daily return indices extracted from a SQL database. Please find below the output that produces the 'X' prefix in the colnames. The reason to preserve the column names is that they are stock identifiers which are to be used by other sub-systems rather than R. I would welcome any suggestion to improve the script. 
Regards, Gilbert p.RIs2Returns <- + function (RIm) + { + x <- RIm[1:(nrow(RIm)-1), 1:ncol(RIm)] + y <- RIm[2:nrow(RIm), 1:ncol(RIm)] + RReturns <- (y/x - 1) + RReturns + } channel <- odbcConnect("ourSQLDB") result <- sqlQuery(channel, paste("select * from equityRIs;")) odbcClose(channel) result stockid sdate dbPrice 1 899188 20050713 7.59500 2 899188 20050714 7.60500 3 899188 20050715 7.48000 4 899188 20050718 7.41500 5 902232 20050713 10.97000 6 902232 20050714 10.94000 7 902232 20050715 10.99000 8 902232 20050718 11.05000 9 901714 20050713 17.96999 10 901714 20050714 18.00999 11 901714 20050715 17.64999 12 901714 20050718 17.64000 13 28176U 20050713 5.19250 14 28176U 20050714 5.25000 15 28176U 20050715 5.25000 16 28176U 20050718 5.22500 17 15322M 20050713 11.44000 18 15322M 20050714 11.5 19 15322M 20050715 11.33000 20 15322M 20050718 11.27000 r1 <- reshape(result, timevar="stockid", idvar="sdate", direction="wide") r1 sdate dbPrice.899188 dbPrice.902232 dbPrice.901714 dbPrice.28176U dbPrice.15322M 1 20050713 7.595 10.97 17.96999 5.1925 11.44 2 20050714 7.605 10.94 18.00999 5.2500 11.50 3 20050715 7.480
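Adai's log trick relies on y/x = exp(log(y) - log(x)) for positive prices; on a small matrix the direct ratio and the diff-of-logs route agree (a sketch with made-up prices):

```r
m <- matrix(c(7.595, 7.605, 7.480,
              10.97, 10.94, 10.99), ncol = 2)   # two price series
direct  <- m[-1, ] / m[-nrow(m), ] - 1          # y/x - 1, row by row
vialogs <- exp(diff(log(m))) - 1                # diff() works column-wise
all.equal(direct, vialogs)                      # TRUE
```

This also illustrates why Gilbert's call failed on r1: diff() needs a numeric matrix, so the data frame must first go through as.matrix().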
Re: [R] RandomForest question
See the tuneRF() function in the package for an implementation of the strategy recommended by Breiman and Cutler. BTW, randomForest is only for the R package. See Breiman's web page for notice on trademarks. Andy From: Weiwei Shi Hi, I found the following lines from Leo's randomForest, and I am not sure if it can be applied here, but just tried to help: mtry0 = the number of variables to split on at each node. Default is the square root of mdim. ATTENTION! DO NOT USE THE DEFAULT VALUES OF MTRY0 IF YOU WANT TO OPTIMIZE THE PERFORMANCE OF RANDOM FORESTS. TRY DIFFERENT VALUES - GROW 20-30 TREES, AND SELECT THE VALUE OF MTRY THAT GIVES THE SMALLEST OOB ERROR RATE. mdim is the number of predictors. HTH, weiwei On 7/21/05, Liaw, Andy [EMAIL PROTECTED] wrote: From: [EMAIL PROTECTED] Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? It's not. The code for randomForest.default() has: ## Make sure mtry is in reasonable range. mtry <- max(1, min(p, round(mtry))) so it silently sets mtry to the number of predictors if it's too large. As an example: library(randomForest) randomForest 4.5-12 Type rfNews() to see new features/changes/bug fixes. iris.rf = randomForest(Species ~ ., iris, mtry=10) iris.rf$mtry [1] 4 I should probably add a warning in such cases... Andy thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html -- Weiwei Shi, Ph.D Did you always know? No, I did not. But I believed... ---Matrix III __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] RandomForest question
From: [EMAIL PROTECTED] Hello, I'm trying to find out the optimal number of splits (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? It's not. The code for randomForest.default() has: ## Make sure mtry is in reasonable range. mtry <- max(1, min(p, round(mtry))) so it silently sets mtry to the number of predictors if it's too large. As an example: library(randomForest) randomForest 4.5-12 Type rfNews() to see new features/changes/bug fixes. iris.rf = randomForest(Species ~ ., iris, mtry=10) iris.rf$mtry [1] 4 I should probably add a warning in such cases... Andy thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
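The clamping Andy quotes is easy to reproduce in isolation; here p plays the role of the number of predictors (a sketch of the logic only, not the package's actual code path):

```r
# replicate the range check from randomForest.default()
clamp_mtry <- function(mtry, p) max(1, min(p, round(mtry)))
clamp_mtry(80, 32)   # 32: silently reduced to the number of predictors
clamp_mtry(0, 32)    # 1:  never allowed below one variable per split
```

So Arne's mtry=80 run was in fact an mtry=32 run, i.e. bagging over all predictors.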
Re: [R] R graphics
Sam Baxter [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] I am trying to set up 16 graphs on one graphics page in R. I have used the mfrow=c(4,4) command. However I get a lot of white space between each graph. Does anyone know how I can reduce this? The default par()$mar is c(5,4,4,2) + 0.1 and can be reduced. For example: par(mfrow=c(4,4), mar=c(3,3,0,0)) for (i in 1:16) { plot(0:10) } efg -- Earl F. Glynn Bioinformatics Department Stowers Institute for Medical Research __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
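If the shrunken margins still leave the axis annotation too far out, par("mgp") pulls the axis titles and tick labels closer in as well; a small sketch building on efg's example (margin and mgp values are just one reasonable choice):

```r
par(mfrow = c(4, 4),                # 4 x 4 grid of plots
    mar   = c(2.5, 2.5, 0.5, 0.5), # trim each plot's margins
    mgp   = c(1.5, 0.5, 0))        # move titles and labels toward the axis
for (i in 1:16) plot(0:10, xlab = "x", ylab = "y")
```

par("oma") can additionally reserve a single outer margin for a shared title instead of per-panel space.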
Re: [R] colnames
See help(system.time). On Thu, 2005-07-21 at 17:56 +0100, Gilbert Wu wrote: Hi Adai, Your diagnosis is absolutely right; class(r1) returned data.frame and your suggested solution worked perfectly. Your assumption is also right; both x and y are positive. If I want to compare the performance of my old function with yours, are there some functions in R I could use to get the elapsed time etc.? Many thanks indeed. Regards, Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 23:38 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames What does class(r1) give you? If it is data.frame, then try exp( diff( log( as.matrix( df ) ) ) ) BTW, I made the assumption that both x and y are positive values only. Regards, Adai On Tue, 2005-07-19 at 16:30 +0100, Gilbert Wu wrote: Hi Adai, When I tried the optimized routine, I got the following error message: r1 899188 902232 901714 28176U 15322M 20050713 7.595 10.97 17.96999 5.1925 11.44 20050714 7.605 10.94 18.00999 5.2500 11.50 20050715 7.480 10.99 17.64999 5.2500 11.33 20050718 7.415 11.05 17.64000 5.2250 11.27 exp(diff(log(r1))) - 1 Error in r[i1] - r[-length(r):-(length(r) - lag + 1)] : non-numeric argument to binary operator Any idea? Many thanks. Gilbert -Original Message- From: Adaikalavan Ramasamy [mailto:[EMAIL PROTECTED] Sent: 19 July 2005 12:20 To: Gilbert Wu Cc: r-help@stat.math.ethz.ch Subject: RE: [R] colnames First, your problem could be boiled down to the following example. See how the colnames of the two outputs vary. df <- cbind.data.frame( "100" = 1:2, "200" = 3:4 ) df/df X100 X200 1 1 1 2 1 1 m <- as.matrix( df ) # coerce to matrix class m/m 100 200 1 1 1 2 1 1 It appears that whenever R has to create a new dataframe automatically, it tries to get nice colnames. See help(data.frame). I am not exactly sure why this behaviour is different when creating a matrix. But I do not think this is a major problem for most people. 
If you coerce your input to matrix, the problem goes away. Next, note the following points: a) mat[ 1:3, 1:ncol(mat) ] is equivalent to simply mat[ 1:3, ]. b) mat[ 2:nrow(mat), ] is equivalent to simply mat[ -1, ]. See help(subset) for more information. Using the points above, we can simplify your function as p.RIs2Returns <- function (mat){ mat <- as.matrix(mat) x <- mat[ -nrow(mat), ] y <- mat[ -1, ] return( y/x - 1 ) } If your data contains only numerical data, it is probably a good idea to work with matrices, as matrix operations are faster. Finally, we can shorten your function. You can use the diff function (which works column-wise if the input is a matrix) if you know that y/x = exp(log(y/x)) = exp( log(y) - log(x) ), which could be coded in R as exp( diff( log(r1) ) ), and then subtract 1 from the above to get your returns. Regards, Adai On Tue, 2005-07-19 at 09:17 +0100, Gilbert Wu wrote: Hi Adai, Many thanks for the examples. I work for a financial institution. We are exploring R as a tool to implement our portfolio optimization strategies. Hence, R is still a new language to us. The script I wrote tried to make a returns matrix from the daily return indices extracted from a SQL database. Please find below the output that produces the 'X' prefix in the colnames. The reason to preserve the column names is that they are stock identifiers which are to be used by other sub-systems rather than R. I would welcome any suggestion to improve the script. 
Regards, Gilbert p.RIs2Returns <- + function (RIm) + { + x <- RIm[1:(nrow(RIm)-1), 1:ncol(RIm)] + y <- RIm[2:nrow(RIm), 1:ncol(RIm)] + RReturns <- (y/x - 1) + RReturns + } channel <- odbcConnect("ourSQLDB") result <- sqlQuery(channel, paste("select * from equityRIs;")) odbcClose(channel) result stockid sdate dbPrice 1 899188 20050713 7.59500 2 899188 20050714 7.60500 3 899188 20050715 7.48000 4 899188 20050718 7.41500 5 902232 20050713 10.97000 6 902232 20050714 10.94000 7 902232 20050715 10.99000 8 902232 20050718 11.05000 9 901714 20050713 17.96999 10 901714 20050714 18.00999 11 901714 20050715 17.64999 12 901714 20050718 17.64000 13 28176U 20050713 5.19250 14 28176U 20050714 5.25000 15 28176U 20050715 5.25000 16 28176U 20050718 5.22500 17 15322M 20050713 11.44000 18 15322M 20050714 11.5 19 15322M 20050715 11.33000 20 15322M 20050718 11.27000 r1 <- reshape(result, timevar="stockid", idvar="sdate", direction="wide") r1 sdate dbPrice.899188 dbPrice.902232
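For Gilbert's timing question, system.time() wraps any expression and reports user, system and elapsed seconds; a sketch comparing the two returns computations on fake positive data:

```r
x <- matrix(runif(1e5, min = 1, max = 2), ncol = 10)   # fake price matrix
t.old <- system.time(r.old <- x[-1, ] / x[-nrow(x), ] - 1)
t.new <- system.time(r.new <- exp(diff(log(x))) - 1)
t.old["elapsed"]; t.new["elapsed"]   # compare wall-clock time
all.equal(r.old, r.new)              # same answer either way
```

For very fast expressions, wrap them in a loop (or use replicate()) so the measured times are large enough to compare meaningfully.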
Re: [R] Problem with read.table()
Thanks to all who responded to my earlier message. The problem lies in the fact that apostrophes (i.e., ') in some of the text fields are read as quotes. The file can be read without problems by setting the quote= argument in read.table(). Incidentally, read.delim() also works, even without setting quote= explicitly. best regards, Kristian Skrede Gleditsch Department of Political Science, UCSD (On leave, University of Essex, 2005-6) Tel: +44 1206 872499, Fax: +44 1206 873234 Email: [EMAIL PROTECTED] or [EMAIL PROTECTED] http://weber.ucsd.edu/~kgledits/ Kristian Skrede Gleditsch wrote: Dear all, I have encountered a strange problem with read.table(). When I try to read a tab-delimited file, I get an error message for line 260 not having length equal to 14 (see below). Using count.fields() suggests that a number of lines have length not equal to 14, but not line 260. Looking at the actual file, however, I cannot see anything wrong with any lines. They all seem to have length 14, there are no double tabs etc., and the file reads correctly in other programs. Does anyone have any suggestions as to what this might stem from? I have placed a copy of the file at http://dss.ucsd.edu/~kgledits/archigos_v.1.9.asc regards, Kristian Skrede Gleditsch archigos1.9 <- read.table("c:/work/work12/archigos/archigos_v.1.9.asc", + sep="\t", header=T, as.is=T, row.names=NULL) Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 260 did not have 14 elements a <- count.fields("c:/work/work12/archigos/archigos_v.1.9.asc", sep="\t") a <- data.frame(c(1:length(a)), a) a[a[,2]!=14,] c.1.length.a.. a 150 150 10 313 313 10 424 424 10 1189 1189 5 1510 1510 10 1514 1514 10 1590 1590 5 1600 1600 10 1612 1612 10 1618 1618 10 1619 1619 10 1709 1709 10 1722 1722 10 1981 1981 10 1985 1985 10 2112 2112 10 2178 2178 10 2208 2208 10 2224 2224 10 2530 2530 5 2536 2536 5 2573 2573 5 2928 2928 5 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
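The apostrophe problem is reproducible in a few lines: read.table()'s default quote includes the single quote, so a field like O'Brien opens a quoted region that swallows the tab separators after it, while read.delim()'s default quote is the double quote only, which is why it worked untouched. Disabling quoting entirely is the usual fix (a sketch on a throwaway file):

```r
tf <- tempfile()
writeLines(c("id\tname\tscore",
             "1\tO'Brien\t10",
             "2\tSmith\t12"), tf)
d <- read.table(tf, sep = "\t", header = TRUE, quote = "")  # quoting off
nrow(d)   # 2 rows, as expected
```

With the default quote setting the same call would complain about the wrong number of elements, just like Kristian's line-260 error.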
Re: [R] help with a xyplot legend
On 7/20/05, Ronaldo Reis-Jr. [EMAIL PROTECTED] wrote: Hi, I try to put a legend in an xyplot graphic. xyplot(y~x|g, ylim=c(0,80), xlim=c(10,40), as.table=T, layout=c(2,3), ylab="Número de machos capturados", xlab=expression(paste("Temperatura (", degree, "C)")), key=list(corner=c(0,0), x=0, y=0, text=list(legenda), lines=list(col=cor, lwd=espessura, lty=linha), columns=7, between=0.5, between.columns=0.5, cex=0.8)) The problem is that the legend is very close to the xlab. I tried changing corner=c(0,0), x=0, y=0 to corner=c(0,0), x=0, y=1, but that way the legend doesn't appear. How do I make the bottom area of the plot bigger, to put the legend a bit more separated from the xlabel? Where exactly do you want the legend? If it's outside and below the plot, you should try space="bottom" instead of x, y, corner, etc. Otherwise, with space="inside", no attempt will be made to save space for the legend. Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
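Deepayan's space="bottom" suggestion in a self-contained form (data, group names and colours invented for illustration):

```r
library(lattice)
d <- data.frame(x = rep(1:10, 2), y = c(1:10, 10:1),
                g = rep(c("up", "down"), each = 10))
xyplot(y ~ x, groups = g, data = d, type = "l",
       key = list(space = "bottom",              # reserve room below the plot
                  text  = list(c("up", "down")),
                  lines = list(col = c("black", "grey40"), lwd = 2),
                  columns = 2))
```

With space= set, lattice allocates a strip of the device for the key, so it can no longer collide with the xlab.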
Re: [R] The steps of building library in R 2.1.1
On 7/21/2005 12:03 PM, Gabor Grothendieck wrote: 2. there is too much material to absorb just to create a package. The manuals are insufficient. The first sentence here is basically a repetition of "the process is too complex". I think the second sentence is incorrect. Could you please point out what necessary steps are missing? It's not that anything is missing that I am aware of. It's that there is so much detail one is overwhelmed. It's not completely the fault of the description, since, as point #1 mentions, the process itself is a key part of the problem. And what solution do you propose to this problem? Saying "this ought to be a high priority for the core group" is not a solution. Tell us where the resources will come from to do this. A step-by-step simplification is very much needed. Exactly this has been in the Installation and Administration manual since I put it there in February for the 2.1.0 release. It's at the beginning of the appendix on the Windows toolset, with multiple references pointing people there. It's followed by detailed descriptions of each of the steps. If you think it could be further improved, please submit improvements. That is easy to say but, in fact, if anyone does this they are not met with a receptive atmosphere. The excellent post describing the process that started this out (even if there are some small errors) is just one example. I don't think that post was written with the intention of putting it into the manual. It would still take a fair bit of work to do that: 1. deciding where it fits and what to replace, 2. correcting the errors, 3. writing it in texinfo format. I'd be happy to talk with someone who volunteers to do that. (I'd suggest the volunteer should do number 1 first, so as not to waste a lot of time on versions that don't fit.) Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] problem running R with perl Statistics::R
[R] Using Perl Statistics::R module, not running the
Re: [R] Chemoinformatic people
Sure, but Luke, I DO NOT currently use R at work... (now, that's not to say I won't be using it in a few months, but currently...). best, -tony On 7/21/05, Luke Tierney [EMAIL PROTECTED] wrote: I don't. There is an address and email at Novartis in the ASA directory ID 068970 Name Anthony J. Rossini Company Novartis Pharma AG Address Biostatistics WSJ-27.1.012 City State Zip CH-4002 Basel Country Switzerland Phone (206) 543-2005 Email [EMAIL PROTECTED] luke On Thu, 21 Jul 2005, A.J. Rossini wrote: Just with R, or via another tool integrating R, such as Pipeline Pilot? best, -tony On 7/20/05, Frédéric Ooms [EMAIL PROTECTED] wrote: Dear colleague, Just an e-mail to ask whether there are people working in the field of chemoinformatics who are using R in their work. If yes, I was wondering if we could exchange tips and tricks about the use of R in this area? Best regards Fred Ooms [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html -- Luke Tierney Chair, Statistics and Actuarial Science Ralph E. Wareham Professor of Mathematical Sciences University of Iowa Phone: 319-335-3386 Department of Statistics and Actuarial Science Fax: 319-335-3017 241 Schaeffer Hall email: [EMAIL PROTECTED] Iowa City, IA 52242 WWW: http://www.stat.uiowa.edu -- best, -tony Commit early, commit often, and commit in a repository from which we can easily roll-back your mistakes (AJR, 4Jan05). A.J. Rossini [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with the package installation procedure [it should be as easy as downloading a package], remove the necessity to set or modify any environment variables, including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. It's no coincidence that there are a number of such descriptions on the net (google for 'making creating R package'), since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down at a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like "Creating my first R package under Windows"? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! Uwe Ligges On 7/21/05, Uwe Ligges [EMAIL PROTECTED] wrote: Ivy_Li wrote: Dear All, With the warm support of every R expert, I have built my R library successfully. Especially thanks: Duncan Murdoch Gabor Grothendieck Henrik Bengtsson Uwe Ligges You are welcome. The following is intended for the records in the archive in order to protect readers. Without your help, I would have been much less efficient. I noticed that some other friends were puzzled by the method of building a library. Now, I have organized a document about it. Hoping it can help more friends. 1. Read the webpage http://www.stats.ox.ac.uk/pub/Rtools Do you mean http://www.murdoch-sutherland.com/Rtools/ ? 2. 
Download the rw2011.exe; Install the newest version of R 3. Download the tools.zip; Unpack it into c:\cygwin Not required to call it cygwin - also a bit misleading... 4. Download the ActivePerl-5.6.1.633-MSWin32-x86.msi; Install Active Perl in c:\Perl Why in C:\Perl ? 5. Download the MinGW-3.1.0-1.exe; Install the mingw32 port of gcc in c:\mingwin Why in c:\mingwin ? 6. Then go to Control Panel - System - Advanced - Environment Variables - Path - Variable Value; add c:\cygwin;c:\mingwin\bin The PATH variable already contains a couple of paths; add the two given above in front of all others, separated by ;. Why do we add them at the beginning of the path? Because we want the folder that contains the tools to be at the beginning, so that you eliminate the possibility of finding a different program of the same name first in a folder that comes prior to the one where the tools are stored. OK, this (1-6) is all described in the R Administration and Installation manual, hence I do not see why we have to repeat it here. 7. I use the package.skeleton() function to make a draft package. It will automate some of the setup for a new source package. It creates directories, saves functions and data to appropriate places, and creates skeleton help files and 'README' files describing further steps in packaging. I type in R: f <- function(x,y) x+y g <- function(x,y) x-y d <- data.frame(a=1, b=2) e <- rnorm(1000) package.skeleton(list=c("f","g","d","e"), name="example") Then modify the 'DESCRIPTION': Package: example Version: 1.0-1 Date: 2005-07-09 Title: My first function Author: Ivy [EMAIL PROTECTED] Maintainer: Ivy [EMAIL PROTECTED] Description: simple sum and subtract License: GPL version 2 or later Depends: R (>= 1.9), stats, graphics, utils You can refer to the web page: http://cran.r-project.org/src/contrib/Descriptions/ There you will find a large source of examples. And you can read the 'Creating R Packages' part of 'Writing R Extensions'. It introduces some useful things for your reference. 
This is described in Writing R Extensions and is not related to the setup of your system in 1-6. 8. Download hhc.exe, the Microsoft help compiler, from somewhere. And save it somewhere in your path. I downloaded an 'htmlhelp.exe' and installed it. I saved the hhc.exe into 'C:\cygwin\bin' because this path has been written in my PATH variable value. However, if you decide not to use the Help Compiler (hhc), then you need to modify the MkRules file in RHOME/src/gnuwin32 to tell it not to try to build that kind of help file This is described in the R Administration and Installation manual, and I do not see why we should put the html help compiler with the other tools. 9. In the DOS environment. Into the D:\ Type the following code: There is no DOS
Re: [R] Question about 'text' (add lm summary to a plot)
On Thu, 21 Jul 2005, Christoph Buser wrote: Dear Dan I can only help you with your third problem, expression and paste. You can use: plot(1:5, 1:5, type = "n") text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4) text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4) text(2, 3.6, expression(paste(R^2, ": ", 0.78, sep = "")), pos = 4) Cheers for this. I was trying to get it to work, but the problem is that I need to replace the values above with variables, from the following code... dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) dat.lm.sum <- summary(dat.lm) my.slope.1 <- round(dat.lm.sum$coefficients[2],2) my.slope.2 <- round(dat.lm.sum$coefficients[4],2) my.inter.1 <- round(dat.lm.sum$coefficients[1],2) my.inter.2 <- round(dat.lm.sum$coefficients[3],2) my.Rsqua.1 <- round(dat.lm.sum$r.squared,2) Anything I try results in either the words 'paste("Slope:", my.slope.1, %+-% my.slope.2, sep="")' being written to the plot, or just 'my.slope.1+-my.slope2' (where the +- is correctly written). I want to script it up and write all three lines to the plot with 'sep="\n"', rather than deciding three different heights. I do not have an elegant solution for the alignment. Thanks very much for what you gave; it's a good start for me to figure out how I am supposed to be telling R what to do! Any way to just get fixed-width fonts with text? (for the alignment problem) Cheers, Dan. Regards, Christoph Buser -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C13 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-44-632-4673 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Dan Bolser writes: I would like to annotate my plot with a little box containing the slope, intercept and R^2 of a lm on the data. I would like it to look like... ++ | Slope : 3.45 +- 0.34 | | Intercept : -10.43 +- 1.42 | | R^2 : 0.78 | ++ However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). 
Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-. Cheers, Dan. dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) abline(coef(dat.lm), lty = 2, lwd = 1.5) dat.lm.sum <- summary(dat.lm) dat.lm.sum attributes(dat.lm.sum) my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2)) my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2)) my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2)) my.text.1 my.text.2 my.text.3 ## Add legend text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
Just a few thoughts... Good documentation helps everybody - the beginners and the experts (fewer beginner questions if there is thorough and accessible documentation). I fully appreciate that this is a volunteer effort - I'm just trying to pin down some places where we have documentation issues. Docs can be in a number of different forms - reference, examples, carefully and thoroughly explained. I personally find it difficult to understand reference-type material until I have seen a worked example, and some of the reference material is a little light on the examples for me, and others like me who thrive on examples. The Linux-HOWTO collection is a good example of step-by-step documentation... If you're an expert, then you can read it fast and skip... otherwise you read every line. Since R comes as both a computer package and a statistics thingy... the questions on this list come in three forms - those that are very 'package', e.g. "how do I reduce the space between two graphs", to the statistics questions, "how reliable is the coefficient of determination in the presence of outliers" (which shouldn't really be asked here), and then the "how do I do 'statistic X' in R" questions - "how do I calculate a confidence interval around the coefficient of determination in R?". The standard documentation got me so far in learning about R - I got a copy of MASS, S Programming, Introductory Statistics with R, and Michael Crawley's new book Statistics: An Introduction using R - along with every online book I could find on CRAN and elsewhere. Unfortunately for me, my experience leaves me in between the beginner books and the more advanced texts like MASS, David, Schervish etc... 
The learning curve is steep - but then, like many people, I'd like to be able to do sophisticated modelling with deep understanding and no effort :-) I feel there is a bit of a hole in the middle of the documentation which could be attacked from both sides - the introduction element is starting to be covered - it's the next step up from that. And yes, before you ask, I would like to help - but my statistics knowledge is very poor! Should this conversation go to r-devel? Thanks for listening, Sean On 21/07/05, Uwe Ligges [EMAIL PROTECTED] wrote: Duncan Murdoch wrote: On 7/21/2005 10:29 AM, Uwe Ligges wrote: Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with the package installation procedure [it should be as easy as downloading a package], remove the necessity to set or modify any environment variables, including the path variables). 2. there is too much material to absorb just to create a package. The manuals are insufficient. A step-by-step simplification is very much needed. It's no coincidence that there are a number of such descriptions on the net (google for 'making creating R package'), since I would guess that just about everyone has significant problems in creating their first package on Windows. OK, if people really think this is required, I will sit down at a clean Windows XP machine, do the setup, and write it down for the next R Help Desk in R News -- something like "Creating my first R package under Windows"? If anybody else is willing to contribute and can write something up in a manner that is *not* confusing or misleading (none of the other material spread over the web satisfies this requirement, AFAICS), she/he is invited to contribute, of course. BTW, everybody else is invited to submit proposals for R Help Desk!!! That sounds great. 
Could you also take notes as you go about specific problems in the writeup in the R-admin manual, so it can be improved for the next release? Of course. Another thing you could do which would be valuable: get a student or someone else who is reasonably computer literate, but unfamiliar with R details, to do this while you sit watching and recording their mistakes. Good idea. Uwe Duncan Murdoch __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] unable to call R t-test from Java
In my last post, I left off version info: Java: jdk1.5.0_03 R: 2.1.1 SJava: 0.68 OS: SunOS 5.8 Trying either the eval or the call method to execute a t-test results in a core dump. e.g. eval method: /* produces a core */ System.err.println("eval a t.test"); Object value = e.eval("t.test(c(1,2,3), c(4,5,6))"); if (value != null) interp.show(value); e.g. call method: Object[] funArgs = new Object[2]; double[] d0 = { 1.1, 2.2, 3.3 }; double[] d1 = { 9.9, 8.8, 7.7 }; funArgs[0] = d0; funArgs[1] = d1; System.err.println("\r\n Calling t.test and passing a java array"); Object value = e.call("t.test", funArgs); if (value != null) { interp.show(value); System.err.println("\r\n"); } Thanks, Laura -Original Message- From: O'Brien, Laura Sent: Wednesday, July 20, 2005 10:42 AM To: 'r-help@stat.math.ethz.ch' Subject: unable to call R t-test from Java Hello, My colleague and I would like to write Java code that invokes R to do a simple t-test. I've included my sample Java code below. I tried various alternatives and am unable to pass a vector to the t.test method. In my investigation, I tried to call other R methods that take vectors and also ran into various degrees of failure. Any insight you can provide or other web references you can point me to would be appreciated. Thank you, Laura O'Brien Application Architect --- code --- package org.omegahat.R.Java.Examples; import org.omegahat.R.Java.ROmegahatInterpreter; import org.omegahat.R.Java.REvaluator; public class JavaRCall2 { /** * want to see if I can eval a t.test command like what I would run at the * R command line */ static public void runTTestByEval_cores(REvaluator e, ROmegahatInterpreter interp) { /* produces a core */ System.err.println("eval a t.test"); Object value = e.eval("t.test(c(1,2,3), c(4,5,6))"); if (value != null) interp.show(value); } /** * want to see if I can eval anything that takes a vector, e.g. 
mean, * like what I would run at the R command line */ static public void runMeanByEval_works(REvaluator e, ROmegahatInterpreter interp) { System.err.println("\r\n evaluating string mean command"); Object value = e.eval("mean(c(1,2,3))"); if (value != null) { interp.show(value); System.err.println("\r\n"); } } /** * if I pass mean an org.omegahat.Environment.DataStructures.numeric, what do I get? NaN */ static public void runMeanByNumericList_nan(REvaluator e, ROmegahatInterpreter interp) { Object[] funArgs = new Object[1]; // "given argument is not numeric or logical" org.omegahat.Environment.DataStructures.numeric rList1 = new org.omegahat.Environment.DataStructures.numeric(3); double[] dList = new double[3]; dList[0] = (double) 1.1; dList[1] = (double) 2.2; dList[2] = (double) 3.3; rList1.setData(dList, true); System.err.println(rList1.toString()); funArgs[0] = rList1; System.err.println("\r\n Calling mean and passing an omegahat vector"); Object value = e.call("mean", funArgs); if (value != null) { interp.show(value); System.err.println("\r\n"); } } /** * let's run some tests on the vector passed in and see what R thinks I'm handing it * * it returns * is.numeric: false * mode: list * length: 2 */ public static void runTestsOnOmegahatNumeric(REvaluator e, ROmegahatInterpreter interp) { Object[] funArgs = new Object[1]; // "given argument is not numeric or logical" org.omegahat.Environment.DataStructures.numeric rList1 = new org.omegahat.Environment.DataStructures.numeric(3); double[] dList = new double[3]; dList[0] = (double) 1.1; dList[1] = (double) 2.2; dList[2] = (double) 3.3; rList1.setData(dList, true); System.err.println(rList1.toString()); funArgs[0] = rList1; System.err.println("\r\n Calling is.numeric and passing an omegahat vector"); Object value = e.call("is.numeric", funArgs); if (value != null) { interp.show(value); System.err.println("\r\n"); } // mode is list System.err.println("\r\n Calling mode and passing an omegahat vector"); value = e.call("mode", funArgs); if (value != 
null) { interp.show(value); System.err.println("\r\n"); }
Re: [R] output of variance estimate of random effect from a gamma frailty model using Coxph in R
On Thu, 21 Jul 2005 [EMAIL PROTECTED] wrote: Hi, I have a question about the output for the variance of the random effect from a gamma frailty model using coxph in R. Is it the variance of the frailties themselves or the variance of the log frailties? Thanks. For a Gamma frailty model it is the variance of the Gamma distribution, so the variance of the frailties. For Gaussian frailty it will be the log frailties, though. -thomas __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
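Where this distinction shows up in practice can be sketched with the kidney example data that ships with the survival package (the covariates used here are just those of that dataset, not from the original question):

```r
library(survival)

# Gamma frailty: the reported variance of the random effect is the
# variance of the frailties themselves
fit.gamma <- coxph(Surv(time, status) ~ age + sex +
                     frailty(id, distribution = "gamma"),
                   data = kidney)
print(fit.gamma)

# Gaussian frailty: here the reported variance refers instead to the
# log frailties (the random effects on the log-hazard scale)
fit.gauss <- coxph(Surv(time, status) ~ age + sex +
                     frailty(id, distribution = "gaussian"),
                   data = kidney)
print(fit.gauss)
```

Both fits print a "Variance of random effect" line; the point above is that the same label means different things for the two distributions.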
Re: [R] debian vcd package
I just installed it on a Debian 2.6.8.1 box with the following R version: platform i386-pc-linux-gnu, arch i386, os linux-gnu, system i386/linux-gnu, major 2, minor 0.1, year 2004, month 11, day 15, language R (i.e. R 2.0.1). By the way, are you using apt-get install vcd? If so, why? Just use install.packages("vcd") from within R. HTH, Jean On Thu, 21 Jul 2005, Peter Ho wrote: [Apologies if you have already read this message sent from another email address] Hi R-Help, I have been using R on Linux (Debian) for the past month. The usual way I install packages is through apt. Recently, a new package, vcd, became available on CRAN. I tried installing it today and found that Debian does not seem to support this package. I also found that many other packages were unavailable. Does anyone have any recommended sites where a full list is available? If none exist, what would be the best way to move ahead in installing, say, the vcd package? I am still a novice in using Debian, so please forgive me if some of my questions seem trivial to experienced users. Peter Peter Ho, PhD. Escola Superior de Tecnologia e Gestao. Instituto Politecnico de Viana do Castelo. Avenida do Atlantico - Apartado 574. 4901-908 Viana do Castelo. Portugal. Tel: +351-258-819700 Ext. 1252 Email: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] tav files
Dear colleagues: I am using SpotFinder 2.2.3 and the .tav files generated there have 20 columns. When I exclude the last 3 columns, the .tav file cannot be recognized by the aroma package on the R platform. What do I have to do to generate .tav files with only 17 columns? Thank you in advance. Carlos Dept. of Immunology University of São Paulo -- Open WebMail Project (http://openwebmail.org) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
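If trimming the columns is acceptable, one hedged workaround is to do it in R before handing the file to aroma. The data frame below is a simulated stand-in for a 20-column tab-delimited .tav file; in practice you would replace it with read.delim("yourfile.tav"):

```r
# Hypothetical stand-in for a 20-column .tav file (tab-delimited text)
tav <- as.data.frame(matrix(rnorm(5 * 20), nrow = 5))
names(tav) <- paste("col", 1:20, sep = "")

# Keep only the first 17 columns and write them back out tab-delimited
tav17 <- tav[, 1:17]
write.table(tav17, file = tempfile(fileext = ".tav"),
            sep = "\t", quote = FALSE, row.names = FALSE)
ncol(tav17)
```

Whether aroma then accepts the trimmed file depends on which 17 columns it expects, which the .tav format documentation should specify.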
[R] Invitation to New York, Spain, and Italy; c/ba
Dear potential Speaker: On behalf of the organizing committee, I would like to extend a cordial invitation for you to submit a paper to the IPSI Transactions journal, or to attend one of the upcoming IPSI BgD multidisciplinary, interdisciplinary, and transdisciplinary conferences. The first one will take place in New York City, NY, USA: IPS-USA-2006 NEW YORK Hotel Beacon (arrival: 5 January 06 / departure: 8 January 06) New Deadlines: 1 August 05 (abstract) 1 October 05 (full paper) The second one will take place in Marbella, Spain: IPSI-2006 SPAIN Hotel Puente Romano (arrival: 10 February 06 / departure: 13 February 06) Deadlines: 1 September 05 (abstract) 1 November 05 (full paper) The third one will take place in Amalfi, Italy: IPSI-2006 ITALY Hotel Santa Caterina (arrival: 23 March 06 / departure: 26 March 06) Deadlines: 1 October 05 (abstract) 1 December 05 (full paper) All IPSI BgD conferences are non-profit. They bring together the elite of the world science; so far, we have had seven Nobel Laureates speaking at the opening ceremonies. The conferences always take place in some of the most attractive places of the world. All those who come to IPSI conferences once, always love to come back (because of the unique professional quality and the extremely creative atmosphere); lists of past participants are on the web, as well as details of future conferences. These conferences are in line with the newest recommendations of the US National Science Foundation and of the EU research sponsoring agencies, to stress multidisciplinary, interdisciplinary, and transdisciplinary research (M+I+T++ research). The speakers and activities at the conferences truly support this type of scientific interaction. 
Among the main topics of these conferencs are: E-education and E-business with Special Emphasis on Semantic Web and Web Datamining Other topics of interest include, but are not limited to: * Internet * Computer Science and Engineering * Mobile Communications/Computing for Science and Business * Management and Business Administration * Education * e-Medicine * e-Oriented Bio Engineering/Science and Molecular Engineering/Science * Environmental Protection * e-Economy * e-Law * Technology Based Art and Art to Inspire Technology Developments * Internet Psychology If you would like more information on either conference, please reply to this e-mail message. If you plan to submit an abstract and paper, please let us know immediately for planning purposes. Remember that you can submit your paper also to the IPSI Transactions journal. Sincerely Yours, Prof. V. Milutinovic, Chairman, IPSI BgD Conferences * * * CONTROLLING OUR E-MAILS TO YOU * * * If you would like to continue to be informed about future IPSI BgD conferences, please reply to this e-mail message with a subject line of SUBSCRIBE. If you would like to be removed from our mailing list, please reply to this e-mail message with a subject line of REMOVE. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
Use bquote instead of expression, e.g.: trees.lm <- lm(Volume ~ Girth, trees) trees.sm <- summary(trees.lm) trees.co <- round(trees.sm$coefficients, 2) trees.rsq <- round(trees.sm$r.squared, 2) plot(Volume ~ Girth, trees) text(10, 60, bquote("Intercept : " * .(trees.co[1,1]) %+-% .(trees.co[1,2])), pos = 4) text(10, 57, bquote("Slope : " * .(trees.co[2,1]) %+-% .(trees.co[2,2])), pos = 4) text(10, 54, bquote(R^2 * " : " * .(trees.rsq)), pos = 4) On 7/21/05, Dan Bolser [EMAIL PROTECTED] wrote: On Thu, 21 Jul 2005, Christoph Buser wrote: Dear Dan, I can only help you with your third problem, expression and paste. You can use: plot(1:5, 1:5, type = "n") text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4) text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4) text(2, 3.6, expression(paste(R^2, " : ", 0.78, sep = "")), pos = 4) Cheers for this. I was trying to get it to work, but the problem is that I need to replace the values above with variables, from the following code... dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) dat.lm.sum <- summary(dat.lm) my.slope.1 <- round(dat.lm.sum$coefficients[2], 2) my.slope.2 <- round(dat.lm.sum$coefficients[4], 2) my.inter.1 <- round(dat.lm.sum$coefficients[1], 2) my.inter.2 <- round(dat.lm.sum$coefficients[3], 2) my.Rsqua.1 <- round(dat.lm.sum$r.squared, 2) Anything I try results in either the literal text 'paste("Slope : ", my.slope.1, %+-% my.slope.2, sep = "")' being written to the plot, or just 'my.slope.1 +- my.slope.2' (where the +- is correctly rendered). I want to script it up and write all three lines to the plot with 'sep = "\n"', rather than deciding three different heights. I do not have an elegant solution for the alignment. Thanks very much for what you gave, it's a good start for me to figure out how I am supposed to be telling R what to do! Any way to just get fixed-width fonts with text? (for the alignment problem) Cheers, Dan. Regards, Christoph Buser -- Christoph Buser [EMAIL PROTECTED] Seminar fuer Statistik, LEO C13 ETH (Federal Inst. Technology) 8092 Zurich SWITZERLAND phone: x-41-44-632-4673 fax: 632-1228 http://stat.ethz.ch/~buser/ -- Dan Bolser writes: I would like to annotate my plot with a little box containing the slope, intercept and R^2 of an lm on the data. I would like it to look like... ++ | Slope : 3.45 +- 0.34 | | Intercept : -10.43 +- 1.42 | | R^2 : 0.78 | ++ However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-. Cheers, Dan. dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) abline(coef(dat.lm), lty = 2, lwd = 1.5) dat.lm.sum <- summary(dat.lm) dat.lm.sum attributes(dat.lm.sum) my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2)) my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2)) my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2)) my.text.1 my.text.2 my.text.3 ## Add legend text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Question about 'text' (add lm summary to a plot)
[Note: the initial posts have been re-arranged to attempt to maintain the flow from top to bottom] Dan Bolser writes: I would like to annotate my plot with a little box containing the slope, intercept and R^2 of an lm on the data. I would like it to look like... ++ | Slope : 3.45 +- 0.34 | | Intercept : -10.43 +- 1.42 | | R^2 : 0.78 | ++ However I can't make anything this neat, and I can't find out how to combine this with symbols for R^2 / +- (plus minus). Below is my best attempt (which is frankly quite poor). Can anyone improve on it? Specifically: aligned text and numbers, aligned decimal places, a symbol for R^2 in the text (expression(R^2) seems to fail with 'paste'), and +-. Cheers, Dan. dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) abline(coef(dat.lm), lty = 2, lwd = 1.5) dat.lm.sum <- summary(dat.lm) dat.lm.sum attributes(dat.lm.sum) my.text.1 <- paste("Slope : ", round(dat.lm.sum$coefficients[2], 2), "+/-", round(dat.lm.sum$coefficients[4], 2)) my.text.2 <- paste("Intercept : ", round(dat.lm.sum$coefficients[1], 2), "+/-", round(dat.lm.sum$coefficients[3], 2)) my.text.3 <- paste("R^2 : ", round(dat.lm.sum$r.squared, 2)) my.text.1 my.text.2 my.text.3 ## Add legend text(x = 3, y = 300, paste(my.text.1, my.text.2, my.text.3, sep = "\n"), adj = c(0, 0), cex = 1) On Thu, 21 Jul 2005, Christoph Buser wrote: Dear Dan, I can only help you with your third problem, expression and paste. You can use: plot(1:5, 1:5, type = "n") text(2, 4, expression(paste("Slope : ", 3.45 %+-% 0.34, sep = "")), pos = 4) text(2, 3.8, expression(paste("Intercept : ", -10.43 %+-% 1.42)), pos = 4) text(2, 3.6, expression(paste(R^2, " : ", 0.78, sep = "")), pos = 4) I do not have an elegant solution for the alignment. On Thu, 2005-07-21 at 19:55 +0100, Dan Bolser wrote: Cheers for this. I was trying to get it to work, but the problem is that I need to replace the values above with variables, from the following code... 
dat.lm <- lm(dat$AVG_CH_PAIRS ~ dat$CHAINS) dat.lm.sum <- summary(dat.lm) my.slope.1 <- round(dat.lm.sum$coefficients[2], 2) my.slope.2 <- round(dat.lm.sum$coefficients[4], 2) my.inter.1 <- round(dat.lm.sum$coefficients[1], 2) my.inter.2 <- round(dat.lm.sum$coefficients[3], 2) my.Rsqua.1 <- round(dat.lm.sum$r.squared, 2) Anything I try results in either the literal text 'paste("Slope : ", my.slope.1, %+-% my.slope.2, sep = "")' being written to the plot, or just 'my.slope.1 +- my.slope.2' (where the +- is correctly rendered). I want to script it up and write all three lines to the plot with 'sep = "\n"', rather than deciding three different heights. Thanks very much for what you gave, it's a good start for me to figure out how I am supposed to be telling R what to do! Any way to just get fixed-width fonts with text? (for the alignment problem) Dan, Here is one approach. It may not be the best, but it gets the job done. You can certainly take this and encapsulate it in a function to automate the text/box placement and to pass values as arguments. A couple of quick concepts: 1. As far as I know, plotmath cannot do multiple lines, so each line in your box needs to be done separately. 2. The horizontal alignment is a bit problematic when using expression() or bquote(), since I don't believe that multiple spaces are honored as such after parsing. Thus I break up each component (label, ":", and values) into separate text() calls. The labels are left-justified. 3. The alignment for the numeric values is done with right justification. So, as long as you use a consistent number of decimals in the value outputs (2 here), you should be OK. This means you might need to use formatC() or sprintf() to control the numeric output values on either side of the +/- sign. 4. In the variable replacement, note the use of substitute() and the list of x and y arguments as replacement values in the expressions. 
# Set your values my.slope.1 <- 3.45 my.slope.2 <- 0.34 my.inter.1 <- -10.43 my.inter.2 <- 1.42 my.Rsqua <- 0.78 # Create the initial plot as per Christoph's post plot(1:5, 1:5, type = "n") ## Do the Slope text(1, 4.5, "Slope", pos = 4) text(2, 4.5, ":") text(3, 4.5, substitute(x %+-% y, list(x = my.slope.1, y = my.slope.2)), pos = 2) ## Do the Intercept text(1, 4.25, "Intercept", pos = 4) text(2, 4.25, ":") text(3, 4.25, substitute(x %+-% y, list(x = my.inter.1, y = my.inter.2)), pos = 2) ## Do the R^2 text(1, 4, expression(R^2), pos = 4) text(2, 4, ":") text(3, 4, my.Rsqua, pos = 2)
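The three text() calls per row can be wrapped in a small helper so every line of the box is laid out the same way. This is only a sketch following the label / colon / right-justified value approach described above; the function name add.stat.line is mine, not from the thread:

```r
# Helper: one row of the stats box, using the layout described above
add.stat.line <- function(y, label, est, err) {
  text(1, y, label, pos = 4)    # left-justified label
  text(2, y, ":")               # centred colon
  text(3, y, substitute(x %+-% s, list(x = est, s = err)),
       pos = 2)                 # right-justified value
}

plot(1:5, 1:5, type = "n")
add.stat.line(4.50, "Slope", 3.45, 0.34)
add.stat.line(4.25, "Intercept", -10.43, 1.42)
# R^2 has no +- part, so it is placed directly
text(1, 4.00, expression(R^2), pos = 4)
text(2, 4.00, ":")
text(3, 4.00, 0.78, pos = 2)
```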
Re: [R] RandomForest question
Uwe Ligges [EMAIL PROTECTED] wrote: Hello, I'm trying to find out the optimal number of splits (the mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors with up to 4 levels each, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables, the best classification performance is reached when choosing mtry=80. How is it possible that more variables can be used than there are columns in the data frame? If some of the variables are factors, dummy variables are generated and you get a larger number of variables in the later process. No, unless the OP is using the formula interface with a version of the package from two years or so ago. We got the first formula interface by copying and modifying the one for svm() in e1071, and forgot the fact that SVM needs that for dealing with factors, but trees do not (especially not given how the underlying RF code handles them). This was corrected long ago. Cheers, Andy Uwe Ligges thanks for your help + kind regards, Arne [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
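For choosing mtry empirically, the randomForest package's own tuneRF() helper does a doubling search over candidate values. A minimal sketch on simulated data of the same shape as the poster's (the data here are purely illustrative, so the errors mean nothing):

```r
library(randomForest)

set.seed(42)
# Simulated stand-in: binary response, 32 explanatory variables, 575 cases
x <- data.frame(matrix(rnorm(575 * 32), nrow = 575))
y <- factor(rbinom(575, 1, 0.5))

# Search over mtry, doubling (stepFactor = 2) while the OOB error improves
tuned <- tuneRF(x, y, mtryStart = 6, stepFactor = 2,
                improve = 0.01, trace = FALSE, plot = FALSE)
tuned   # matrix of mtry values and their OOB error estimates
```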
[R] A Question About Inverse Gamma
Hi R users, I am having a little problem finding the solution to this problem in R: 1. I need to generate a normal distribution of sample size 30, mean = 50, sd = 5. 2. From the statistics obtained in step 1, I need to generate the Inverse Gamma distribution. Your views and help will be appreciated. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] A Question About Inverse Gamma
[EMAIL PROTECTED] wrote: Hi R users, I am having a little problem finding the solution to this problem in R: 1. I need to generate a normal distribution of sample size 30, mean = 50, sd = 5. 2. From the statistics obtained in step 1, I need to generate the Inverse Gamma distribution. Your views and help will be appreciated. I found rinvgamma in the MCMCpack package. Perhaps that's what you need. Did you read the posting guide? A help.search("normal") would have helped you with item 1. --sundar __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
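A minimal sketch of the two steps, assuming the MCMCpack package for rinvgamma(). Note that how the normal-sample statistics should map onto the inverse gamma's shape and scale depends on the intended model, so the mapping below is only a placeholder:

```r
library(MCMCpack)   # provides rinvgamma(n, shape, scale)

set.seed(1)
# Step 1: a normal sample of size 30 with mean 50 and sd 5
x <- rnorm(30, mean = 50, sd = 5)
m <- mean(x)
v <- var(x)

# Step 2: draw from an inverse gamma; using the sample statistics as
# shape/scale here is purely illustrative, not a recommended mapping
ig <- rinvgamma(1000, shape = m, scale = v)
summary(ig)
```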
Re: [R] R:plot and dots
Here is an example: x <- rnorm(20) y <- rnorm(20) plot(x, y, type = "n") text(x, y, labels = as.character(1:20)) Also look into help(identify) if you want to point out specific points. Regards, Adai On Thu, 2005-07-21 at 16:18 +0200, Clark Allan wrote: hi all, a very simple question. i have plot(x,y) but i would like to add on the plot the observation number associated with each point. how can this be done? / allan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] The steps of building library in R 2.1.1
On 22 Jul 2005 00:01:18 +0200, Peter Dalgaard [EMAIL PROTECTED] wrote: Duncan Murdoch [EMAIL PROTECTED] writes: On 7/21/2005 9:43 AM, Gabor Grothendieck wrote: I think you have been using R too long. Something like this is very much needed. There are two problems: 1. the process itself is too complex (need to get rid of perl, integrate package development tools with package installation procedure [it should be as easy as downloading a package], remove necessity to set or modify any environment variables including the path variables). I agree with some of this, but I don't see much interest in fixing it. For example, getting rid of Perl would be a lot of work. When the Perl scripts were written, R was not capable of doing what they do. I think it is capable now, but there's still a huge amount of translation work to do. Who will do that? Who will test that they did it right? At the end, will it actually have been worth all the trouble? Installing Perl is not all that hard. Another point of view is that the issue is that we cannot ship a full set of build tools with R on Windows, the main obstacle being that Active Perl has redistribution restrictions. Although the ultimate answer is to get rid of perl entirely, in the same vein as your discussion, perhaps R could simply provide standalone executables for each perl program currently used by R (using perlcc to produce them). I believe there is a free alternative to Microsoft's Help Compiler too but I just googled for it and was unable to locate it. By placing all these items (and the UNIXish tools) in the \R\rw...\bin directory and using the registry as in Rcmd.bat and Rgui.bat found in batchfiles: http://cran.r-project.org/contrib/extra/batchfiles/ modifying the path and environment variables by the user might be eliminated. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] principal component analysis in affy
1) Please learn to wrap your emails to 72 characters per line. See http://expita.com/nomime.html 2) You might have better luck with the BioConductor folks. Their mailing list is https://stat.ethz.ch/mailman/listinfo/bioconductor 3) The affy package has many functions, including some algorithms for preprocessing CEL files. 4) I do not understand why you need the affy package if you do not have CEL files? I presume that you only have a subset of the final data. 5) I do not understand your question, but you might want to simply transpose the matrix before doing a prcomp() on it. See help(t). Regards, Adai On Thu, 2005-07-21 at 10:09 -0500, Wagle, Mugdha wrote: Hi, I have been using the prcomp function to perform PCA on my example microarray data (stored in metric text files), which looks like this: columns 1a 1b 1c 1d 1e 1f ... 4r 4s 4t, rows g1 1.2705 1.2766 ... 2.0298, g2 0.1631 0.7067 ..., g3 0.2212 1.0439 ..., down to g99 1.3657 ... 2.3736 -- i.e. a matrix of 63 columns and 99 rows, where the columns represent chips and the rows represent genes. Now, the biplot function biplot(prcomp(pcadata, scale = TRUE), cex = c(0.75, 0.75)) gives me a plot with one vector per gene. However, I actually need to get one vector per chip instead of one vector per gene. I have been told that there is a function in the affy package that does what I am looking for, i.e. gives one vector per chip. Can someone please tell me what the function is called, and how I can get hold of the code (since I believe affy only works on CEL files)? I have downloaded the affy R code from Terry Speed's website already, but I don't know where (if at all) the code to perform PCA is. Thank you everyone! Sincerely, Mugdha Wagle Hartwell Center for Bioinformatics and Biotechnology, St. Jude Children's Research Hospital, Memphis TN 38105 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! 
http://www.R-project.org/posting-guide.html
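Point 5 of Adai's reply can be illustrated with base R alone, no affy needed: transpose the matrix so the chips become the observations, and the biplot then shows one vector per chip. The 99 x 63 matrix below is simulated as a stand-in for the poster's data:

```r
set.seed(7)
# Stand-in for the 99-gene by 63-chip expression matrix
pcadata <- matrix(rnorm(99 * 63), nrow = 99, ncol = 63,
                  dimnames = list(paste("g", 1:99, sep = ""),
                                  paste("chip", 1:63, sep = "")))

# Transpose so chips are rows (observations): the biplot then shows
# one point/vector per chip instead of per gene
pca <- prcomp(t(pcadata), scale. = TRUE)
biplot(pca, cex = c(0.75, 0.75))
```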
[R] vectorising ifelse()
Hi All, is there any chance of vectorising the two ifelse() statements in the following code: for(i in gp){ new[i,1] <- ifelse(srow[i] > 0, new[srow[i], zippo[i]], sample(1:100, 1, prob = Y1, replace = TRUE)) new[i,2] <- ifelse(drow[i] > 0, new[drow[i], zappo[i]], sample(1:100, 1, prob = Y1, replace = TRUE)) } where I am forced to check whether the values of drow and srow are > 0 for each line... In practical terms, I am attributing haplotypes to a pedigree, so I have to give the haplotypes to the parents before I give them to the offspring. The vectors *zippo* and *zappo* are the chances of getting one or the other hap from the sire and dam respectively. *gp* is the vector of non-ancestral animals. *new* is a two-column matrix where the haps are stored. Cheers, Federico -- Federico C. F. Calboli Department of Epidemiology and Public Health Imperial College, St Mary's Campus Norfolk Place, London W2 1PG Tel +44 (0)20 7594 1602 Fax (+44) 020 7594 3193 f.calboli [.a.t] imperial.ac.uk f.calboli [.a.t] gmail.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
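Within one generation the per-row check can be vectorised away with logical and matrix indexing; because offspring must be filled after their parents, this sketch only vectorises the loop body, and the helper name fill.haps is mine (the other object names are as in the post):

```r
# Sketch: vectorised haplotype assignment for the animals in gp,
# assuming the parents of those animals already carry haplotypes
fill.haps <- function(new, gp, srow, drow, zippo, zappo, Y1) {
  has.sire <- srow[gp] > 0
  has.dam  <- drow[gp] > 0

  # inherited haplotypes via (row, column) matrix indexing
  new[gp[has.sire], 1] <- new[cbind(srow[gp][has.sire], zippo[gp][has.sire])]
  new[gp[has.dam],  2] <- new[cbind(drow[gp][has.dam],  zappo[gp][has.dam])]

  # founders draw from the population haplotype frequencies Y1
  new[gp[!has.sire], 1] <- sample(1:100, sum(!has.sire), prob = Y1, replace = TRUE)
  new[gp[!has.dam],  2] <- sample(1:100, sum(!has.dam),  prob = Y1, replace = TRUE)
  new
}

# Tiny demo: rows 1-2 are parents, row 3 their offspring
new <- matrix(0L, 3, 2)
new[1, ] <- c(10, 20); new[2, ] <- c(30, 40)
srow <- c(0, 0, 1); drow <- c(0, 0, 2)
zippo <- c(1, 1, 2); zappo <- c(1, 1, 1)
Y1 <- rep(1/100, 100)
new <- fill.haps(new, gp = 3, srow, drow, zippo, zappo, Y1)
new[3, ]   # inherits new[1, 2] from the sire and new[2, 1] from the dam
```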
[R] Re Randomization test for interaction effect
Dear Pedro,

How to test for an interaction (or, even, how to pose the question of an interaction) in randomization-based inference is not at all obvious. In the permutation-test context, reliance has been placed on the exchangeability of (estimated) residuals under an additive, homoscedastic model. Where estimated, the residuals are not exactly exchangeable. A reference you might find useful is Pesarin, F. (2001) Multivariate Permutation Tests. Wiley: Chichester, UK. His method of synchronized permutations may be applied to test for interactions under some limited circumstances.

Pedro de Barros writes, in part:

Dear All, I am trying to build a randomization test for interaction. The problem is as follows: I have a set of stations where the occurrence and biomass of each species being investigated was recorded. [snip] I would really appreciate any pointer to a solution of this problem. I believe it is not complicated (and probably quite obvious) but the solution keeps out of reach, even though I have been searching for over a week. Thanks, Pedro

** Cliff Lunneborg, Professor Emeritus, Statistics & Psychology, University of Washington, Seattle [EMAIL PROTECTED]

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
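One common recipe along the lines Cliff mentions (permuting the estimated residuals of the additive model, in the spirit of ter Braak; this is my illustration, not Pesarin's synchronized-permutations method) can be sketched with simulated two-factor data:

```r
set.seed(7)
A <- gl(2, 20)                  # factor with 2 levels, 40 obs
B <- gl(2, 10, length = 40)
y <- rnorm(40) + 0.5 * (A == "2") * (B == "2")   # built-in interaction

# Observed F statistic for the interaction term
obs.F <- anova(lm(y ~ A * B))["A:B", "F value"]

# Residuals from the additive model are treated as approximately
# exchangeable under H0: no interaction (only approximately, since
# they are estimated).
fit0 <- lm(y ~ A + B)
perm.F <- replicate(999, {
  y.star <- fitted(fit0) + sample(resid(fit0))
  anova(lm(y.star ~ A * B))["A:B", "F value"]
})
p.value <- mean(c(obs.F, perm.F) >= obs.F)
```

The p-value is the proportion of permuted F statistics (including the observed one) at least as large as the observed F.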
[R] help: pls package
Hello, I have a data set with 15 variables (the first one is the response) and 1200 observations. Now I use the pls package to do the plsr with cross-validation as below:

  trainSet <- as.data.frame(scale(trainSet, center = TRUE, scale = TRUE))
  trainSet.plsr <- mvr(formula, ncomp = 14, data = trainSet,
                       method = "kernelpls", CV = TRUE, validation = "LOO",
                       model = TRUE, x = TRUE, y = TRUE)

After that I wish to obtain the value of se, the estimated standard errors of the cross-validation estimates, which is mentioned in the documentation for MSEP but not implemented yet, so I wrote the program below to calculate it myself. The results seem wrong, and I wonder which step is at fault:

  y <- trainSet.plsr$y
  p <- as.data.frame(trainSet.plsr$validation$pred)
  msep_element <- matrix(NA, nrow = nrow(p), ncol = length(p))
  i <- 1
  while (i <= length(p)) {
    msep_element[, i] <- (p[[i]] - y)^2
    i <- i + 1
  }
  msep <- colMeans(msep_element)
  msep_sd <- apply(msep_element, 2, sd)

Then I compare msep with trainSet.plsr$validation$MSEP, and they are the same, but the values of msep_sd seem much larger than I expected. Is msep_sd the same as se? If not, how do I calculate the se of the cross-validation?

Thank you, Shengzhe

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
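A plausible explanation for the surprise (my reading, not from the pls documentation): msep_sd above is the standard deviation of the per-observation squared errors, whereas the standard error of the MSEP estimate, being a mean over n observations, would divide that by sqrt(n). A self-contained sketch, with simulated squared prediction errors standing in for (pred - y)^2:

```r
set.seed(1)
n     <- 1200   # observations, as in the post
ncomp <- 14     # numbers of components, as in the post
sq.err <- matrix(rchisq(n * ncomp, df = 1),  # stand-in for (pred - y)^2
                 nrow = n, ncol = ncomp)

msep    <- colMeans(sq.err)                  # one MSEP per ncomp
msep.sd <- apply(sq.err, 2, sd)              # what the poster computed
msep.se <- msep.sd / sqrt(n)                 # standard error of each MSEP
```

With n = 1200, msep.se is about 35 times smaller than msep.sd, consistent with msep_sd looking "much larger than expected".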
Re: [R] :)
hello

- Original Message -
From: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, July 21, 2005 3:04 PM
Subject: :)

why don't you call me??

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] find confounder in covariates
Hi, I was wondering if there is a way, or a function in R, to find confounders. For instance:

  a  <- sample(1:3, size = 10, replace = TRUE)
  X1 <- factor(c('A', 'B', 'C')[a])
  X2 <- factor(c('Aa', 'Bb', 'Cc')[a])
  Xmat <- data.frame(X1, X2, rnorm(10), rnorm(10))
  dimnames(Xmat)[[2]] <- c('z1', 'z2', 'z3', 'y')

Now, z2 is just an alias of z1. There can also be collinearity, where one variable is a linear combination of the others. If you run lm on it:

  f <- lm(y ~ ., data = Xmat)
  summary(f)

  Call:
  lm(formula = y ~ ., data = Xmat)

  Residuals:
      Min      1Q  Median      3Q     Max
  -1.2853 -0.3708 -0.1224  0.4617  1.2821

  Coefficients: (2 not defined because of singularities)
              Estimate Std. Error t value Pr(>|t|)
  (Intercept)  0.82141    0.44583   1.842   0.1150
  z1B         -1.34167    0.65176  -2.059   0.0852 .
  z1C          0.80891    1.07639   0.751   0.4808
  z2Bb              NA         NA      NA       NA
  z2Cc              NA         NA      NA       NA
  z3           0.04231    0.23397   0.181   0.8625
  ---
  Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

  Residual standard error: 0.971 on 6 degrees of freedom
  Multiple R-Squared: 0.5086, Adjusted R-squared: 0.2629
  F-statistic: 2.07 on 3 and 6 DF, p-value: 0.2057

In this case, I can look at the data and figure out which variable is confounded with which. But if we have many categorical covariates (not necessarily with the same number of levels), it is almost impossible to check by hand. Any help would be greatly appreciated. Thanks, Young.

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
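One base-R way to automate this (not mentioned in the post, but a standard tool) is alias(), which reports which coefficients are linear combinations of others in a fitted lm:

```r
set.seed(1)
a  <- sample(1:3, size = 10, replace = TRUE)
X1 <- factor(c("A", "B", "C")[a])
X2 <- factor(c("Aa", "Bb", "Cc")[a])   # deliberately an alias of X1
Xmat <- data.frame(z1 = X1, z2 = X2, z3 = rnorm(10), y = rnorm(10))

f  <- lm(y ~ ., data = Xmat)
al <- alias(f)$Complete   # one row per aliased coefficient
rownames(al)              # here: the z2 dummies, aliased with z1
```

The row names of the Complete component identify the aliased coefficients, and each row expresses that coefficient in terms of the non-aliased ones, which scales to many categorical covariates without inspecting the data by eye.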
Re: [R] A Question About Inverse Gamma
On Thu, 21 Jul 2005, Sundar Dorai-Raj wrote:

[EMAIL PROTECTED] wrote:

Hi R users, I am having a little problem finding the solution to this problem in R:
1. I need to generate a normal distribution of sample size 30, mean = 50, sd = 5.
2. From the statistics obtained in step 1, I need to generate the Inverse Gamma distribution.
Your views and help will be appreciated.

I found rinvgamma in the MCMCpack package. Perhaps that's what you need.

I think there is a problem with rinvgamma:

  rinvgamma
  function (n, shape, scale = 1)
  {
      return(1/rgamma(n, shape, scale))
  }
  <environment: namespace:MCMCpack>

I know it is not necessarily authoritative, but look at Wikipedia: http://en.wikipedia.org/wiki/Inverse-gamma_distribution

It seems the one line of the function should be:

  return(1/rgamma(n, shape, 1/scale))

Or you could of course throw caution to the winds and write your own rinvgamma using rgamma :-)

David Scott

_
David Scott
Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland, NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email: [EMAIL PROTECTED]
Graduate Officer, Department of Statistics

__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
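Following David's suggestion to write your own, here is a minimal sketch. It uses the convention that if X ~ Gamma(shape, rate = b), then 1/X ~ Inverse-Gamma(shape, scale = b); note that rgamma's third positional argument is the rate, so naming the argument explicitly avoids exactly the parameterization ambiguity discussed above:

```r
rinvgamma2 <- function(n, shape, scale = 1) {
  # If X ~ Gamma(shape, rate = scale), then 1/X ~ Inv-Gamma(shape, scale)
  1 / rgamma(n, shape = shape, rate = scale)
}

set.seed(123)
x <- rinvgamma2(1e5, shape = 3, scale = 2)
mean(x)   # theoretical mean is scale / (shape - 1) = 1, for shape > 1
```

Comparing the sample mean against the known theoretical mean is a quick sanity check on whichever parameterization one settles on.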