[R] regular expression help
Dear all,

as usual I am again lost in the virtues of regular expressions.

I have a character vector named vzor:

[365] 61A 62C/27 65A/27 66C/29 69A/29 70C/31 73A/31 74C/33 77A/33 81A/35 82C/37 85A/37 86C/39
[378] 89A/39 90C/41 93A/41 94C/43 97A/43 98C/45 101A/45 102C/47 105A/47 106C/49 109A/49 110C/51 113A/51

and I want only the letters from it. I tried

gsub("[[:alpha:]]", "\\1", vzor)
Error in gsub("[[:alpha:]]", "\\1", vzor) : invalid backreference 1 in regular expression

gsub("[:alpha:]", "\\1", vzor)

gives me the same vector.

There is probably a very simple solution which I overlooked, and the examples in the help page did not help me find it.

Thank you
Best regards
Petr Pikal
[EMAIL PROTECTED]

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] regular expression help
gsub("^.*([[:alpha:]]).*$", "\\1", vzor)

Prof. Philippe Grosjean
Numerical Ecology of Aquatic Systems
Mons-Hainaut University, Belgium

Petr PIKAL wrote:
> Dear all
> as usual I am again lost in virtues of regular expressions.
> I have such character vector named vzor:
> [ snip ]
> and I want only letters from it. I tried
>
> gsub("[[:alpha:]]", "\\1", vzor)
> Error in gsub("[[:alpha:]]", "\\1", vzor) : invalid backreference 1 in regular expression
> [ snip ]
Re: [R] R-OSX error: 'memory not mapped'
On Wed, 18 Apr 2007, Atte Tenkanen wrote:

> I often get a following error with R
>
>  *** caught segfault ***
> address 0x78807e00, cause 'memory not mapped'
>
> Possible actions:
> 1: abort (with core dump)
> 2: normal R exit
> 3: exit R without saving workspace
> 4: exit R saving workspace
> Selection:
>
> The system is OSX 4.9 and R-version 2.4.1.
> Is there something to do?

Does this involve your own compiled code?

If so, run R under a debugger (e.g. R -d gdb), where you will get more information. (You may also be able to get a core dump from option 1 and look at that in a debugger, but what happens is OS-dependent, including system settings in the OS.)

If not, it is likely to be a MacOS-specific problem, so please send a reproducible example to the R-sig-mac list.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] Runing R in a bash script
On Wed, 18 Apr 2007, Ulrik Stervbo wrote:

> As I had problems installing the Cairo package, I went for Henrik's solution -
> and it works almost perfectly. I would like to have been able to generate
> transparent png.

You cannot do transparency via postscript. I would suggest using pdf() and converting the output of that, which often works even better (and does have full transparency support).

> Thanks for the help
> Ulrik
>
> On 18/04/07, Henrik Bengtsson [EMAIL PROTECTED] wrote:
>> Or see png2() in R.utils, which imitates png() but uses bitmap(),
>> which in turn uses postscript-to-png via ghostscript. BTW, personally
>> I think PNGs generated via bitmap() look way better than the ones
>> generated via png().

As there are two separate versions of png() for different OSes, comments like that are very system-dependent. Other postings suggest this is Windows, and if png() is giving poor results there it suggests a problem with the way Windows' GDI is configured (which depends on the graphics card).

And of course, PNGs don't 'look' at all: they are rendered by some other tool, and quite often the perceived problem with R graphical output is in fact with the rendering tool.

>> /Henrik
>>
>> On 4/17/07, Jeffrey Horner [EMAIL PROTECTED] wrote:
>>> Ulrik Stervbo wrote:
>>>> Hello!
>>>>
>>>> I am having issues trying to plot to a png (or jpg) when the R code in a
>>>> bash script is executed from cron. I can generate a pdf file, but when I
>>>> try to write to a png, the file is created, but nothing is written. If I
>>>> execute the bash script from my console, everything works fine. Any ideas?
>>>>
>>>> In my cron I have SHELL=/bin/bash - otherwise /bin/sh is used - and the
>>>> following entry, so example is executed every minute:
>>>> * * * * * [path]/example.sh
>>>>
>>>> I am running R version 2.4.1 (2006-12-18)
>>>>
>>>> Here's a minimal example - two files, one R script ('example.r') and one
>>>> bash script ('example.sh')
>>>>
>>>> example.r
>>>> # Example R-script
>>>> x <- c(1:10)
>>>> y <- x^2
>>>> png(file="example2.png")
>>>> #pdf(file="example2.pdf")
>>>> plot(x,y)
>>>> graphics.off()
>>>>
>>>> example.sh
>>>> #!/bin/bash
>>>> #
>>>> # Hello world is written to echotext every time cron executes this script
>>>> echo "Hello world" > echotext
>>>> # This works, but not when executed from cron
>>>> n=`R --save < example.r`
>>>> # using exec as in `exec R --save < example.r` doesn't work with cron either
>>>> # This also works, but nothing is written to the png when executed from cron
>>>> R --save <<RSCRIPT
>>>> x <- c(1:10)
>>>> y <- x^2
>>>> png(file="example2.png")
>>>> #pdf(file="example2.pdf")
>>>> plot(x,y)
>>>> graphics.off()
>>>> #dev.off() doesn't work at all when executed from cron
>>>> RSCRIPT
>>>
>>> The png() device requires an X server for the image rendering. You might
>>> be able to get away with exporting the DISPLAY environment variable
>>>
>>> export DISPLAY=:0.0 # try and connect to X server on display 0.0
>>>
>>> within your script, but it will only work if the script is executed by
>>> the same user as is running the X server, *and* the X server is running
>>> at the time the script is executed.
>>>
>>> There are a handful of packages that will create a png without the
>>> presence of an X server, and I'm partial to Cairo (since I've done some
>>> work on it). You can install the latest version like this:
>>>
>>> install.packages('Cairo',,'http://rforge.net/',type='source')
>>>
>>> Cairo can also output nice pdf's with embedded fonts... useful if you
>>> want to embed high-quality OpenType or TrueType fonts.
>>>
>>> Best,
>>> Jeff
>>> --
>>> http://biostat.mc.vanderbilt.edu/JeffreyHorner

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
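The pdf() route suggested in this thread can be sketched as follows. This is only a minimal sketch, assuming the script runs non-interactively (for example from cron); the output path is a placeholder, and the Ghostscript command in the trailing comment is an assumption about one possible conversion tool, not something stated in the thread.

```r
# pdf() does not need an X server, so it also works when run from cron.
out <- file.path(tempdir(), "example2.pdf")   # hypothetical output location

x <- 1:10
y <- x^2

pdf(file = out, bg = "transparent")   # transparent background is the pdf() default
plot(x, y)
dev.off()

# Conversion to PNG happens outside R, e.g. with Ghostscript (assumed to be
# installed; 'pngalpha' is its PNG device with an alpha channel):
#   gs -sDEVICE=pngalpha -r96 -o example2.png example2.pdf
```

Closing the device with dev.off() before the script exits matters here: the file is only complete once the device has been closed.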
Re: [R] value of complexity parameter in ridge regression
> What is the optimum range to look for a value of lambda while doing ridge
> regression? Can/should lambda be greater than 1?

I think it's data dependent, but lambda can certainly be greater than one. For many ridge regression problems you can choose lambda `objectively' by generalized cross validation (GCV). Package `mgcv' provides a routine `magic' for doing this (although it doesn't use the most efficient method if you only have one lambda/ridge penalty). Of course this won't be appropriate if the ridge penalty is only being used to stabilize the fit and you want the minimum lambda that e.g. makes X'X + \lambda I positive definite.

> I have conflicting (or what appears conflicting to me) sources that use
> lambda >= 0, without any upper limit, but that makes the search space
> infinite.. right ?? So, perhaps my question is: is there an upper limit to
> lambda.

I don't think so.

> Does the value of lambda convey something about my data ?

Depends on the details of the model. For some models ridge penalties can be viewed as inverses of random effect covariance matrices, in which case lambda is related to the random effect variance (see e.g. section 6.2.6 of the 2006 book referenced in ?gam from the mgcv package).

best,
Simon

> Thanks a lot,
> Sikander

-- 
Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
+44 1225 386603  www.maths.bath.ac.uk/~sw283
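To make the GCV idea above concrete, here is a rough base-R sketch. It is not the `magic' routine from mgcv; the data are simulated, the names (gcv, lambdas, best) are made up for illustration, and the GCV score is the usual n * RSS / (n - tr(H))^2. Searching on a logarithmic grid is one common way around the "infinite search space" worry, and the grid deliberately includes values far above 1.

```r
set.seed(1)                                   # simulated data, purely illustrative
n <- 50
p <- 5
X <- scale(matrix(rnorm(n * p), n, p))        # centred and scaled predictors
y <- drop(X %*% c(2, -1, 0, 0, 1) + rnorm(n))
y <- y - mean(y)                              # centred response, so no intercept

gcv <- function(lambda) {
  A   <- crossprod(X) + lambda * diag(p)      # X'X + lambda * I
  H   <- X %*% solve(A, t(X))                 # hat matrix of the ridge fit
  rss <- sum((y - H %*% y)^2)                 # residual sum of squares
  n * rss / (n - sum(diag(H)))^2              # GCV score
}

lambdas <- 10^seq(-3, 3, length.out = 61)     # log-spaced grid from 0.001 to 1000
scores  <- sapply(lambdas, gcv)
best    <- lambdas[which.min(scores)]         # lambda with the smallest GCV score
best
```

If the minimum lands at either end of the grid, that is a hint to widen the grid rather than a sign that lambda has a natural upper limit.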
[R] proxy settings
Hi all,

I used to connect to the internet via a proxy. Before updating packages I wrote in R

Sys.putenv("http_proxy"="http://proxy3.redegov.sp.gov.br:80/")

However, the way the connection is made has changed. For example, in the browser the proxy is no longer indicated, and I have to give a username and a password to get access to the internet.

I read the FAQ and the help for download.file but I was not able to do updates. I tried

Sys.putenv("http_proxy_user"="ask")
Sys.putenv("http_proxy_user"="http://:")

I use R under both Windows XP and Ubuntu Linux.

Thanks for any help.

Antonio Olinto
[R] Help me about MADE4
Hi,

Please help me: how can I install made4 (a microarray analysis tool) in R on Linux?

Thanking you.

-- 
Nitish Kumar Mishra
Junior Research Fellow
BIC, IMTECH, Chandigarh, India
E-Mail Address: [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: [R] GREP - Choosing values between two borders
Another way you can do it, if the data has the pattern shown in your sample, is to select all the lines that start with a numeric:

input <- "FILE-CONTENT ##
EXAM NUM:2
-
EXAM #1
ASTIG:-2.4D
AXIS:4.8
START OF HEIGHT DATA
0 0.0 0.
0 0.1 0.00055643
9 4.9 1.67278117
9 5.0 1.74873257
10 0.0 0.
10 0.1 0.00075557
99 5.3 1.94719490
END OF HEIGHT DATA
X POS:-0.299mm
Y POS:0.442mm
Z POS:-0.290mm
-
EXAM #2
ASTIG:-2.4D
AXIS:4.8
START OF HEIGHT DATA
0 0.0 0.
0 0.1 0.00055643
9 4.9 1.67278117
9 5.0 1.74873257
10 0.0 0.
10 0.1 0.00075557
99 5.3 1.94719490
END OF HEIGHT DATA
X POS:-0.299mm
Y POS:0.442mm
Z POS:-0.290mm
"
x <- readLines(textConnection(input))
# keep only the lines whose first non-blank character is a digit
x <- x[grep("^\\s*\\d", x, perl=TRUE)]
x.in <- scan(textConnection(x), what=0)
# Read 42 items
x.in <- matrix(x.in, ncol=3, byrow=TRUE)
x.in
#       [,1] [,2]       [,3]
#  [1,]    0  0.0 0.00000000
#  [2,]    0  0.1 0.00055643
#  [3,]    9  4.9 1.67278117
#  [4,]    9  5.0 1.74873257
#  [5,]   10  0.0 0.00000000
#  [6,]   10  0.1 0.00075557
#  [7,]   99  5.3 1.94719490
#  [8,]    0  0.0 0.00000000
#  [9,]    0  0.1 0.00055643
# [10,]    9  4.9 1.67278117
# [11,]    9  5.0 1.74873257
# [12,]   10  0.0 0.00000000
# [13,]   10  0.1 0.00075557
# [14,]   99  5.3 1.94719490

On 4/17/07, Felix Wave [EMAIL PROTECTED] wrote:
> Hello,
> I import data from a file with readLines(), but I need only a part of
> all the measurements in this file. These are between two borders, START
> and END.
>
> Can you tell me the syntax of grep() to choose values between two borders?
>
> My R code was not successful, and I can't find anything in the help.
>
> Thanks a lot.
> Felix
>
> # R-CODE ###
> file <- "file-content"
> Measure <- grep("[START-END]", file)
> #Measure <- grep("[START|END]", file)
>
> FILE-CONTENT ##
> [ snip ]

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
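For the literal question (taking what lies between the two borders), one option is to let grep() return the positions of the border lines and then index between them. The sketch below is an illustration under assumptions: it uses a shortened, made-up version of the file, the column names are invented, and it assumes every START line has a matching END line with at least one data row in between.

```r
# Shortened stand-in for the lines returned by readLines()
lines <- c("EXAM #1", "ASTIG:-2.4D", "START OF HEIGHT DATA",
           "0 0.0 0.0", "0 0.1 0.00055643", "END OF HEIGHT DATA",
           "X POS:-0.299mm", "EXAM #2", "START OF HEIGHT DATA",
           "9 4.9 1.67278117", "END OF HEIGHT DATA")

# grep() returns the indices of the matching lines
start <- grep("^START OF HEIGHT DATA", lines)
end   <- grep("^END OF HEIGHT DATA", lines)
stopifnot(length(start) == length(end), all(start < end))

# One data frame per exam: the lines strictly between each START/END pair.
# The column names are hypothetical.
blocks <- Map(function(s, e) {
  read.table(text = lines[(s + 1):(e - 1)], col.names = c("a", "b", "height"))
}, start, end)

length(blocks)     # number of exams found
blocks[[1]]
```

Compared with selecting every line that starts with a digit, this keeps the exams separate, at the price of relying on the marker lines being spelled exactly as shown.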
Re: [R] proxy settings
On Tue, 17 Apr 2007, Antonio Olinto wrote:

> Hi all,
>
> I used to connect internet via a proxy. Before update packages I wrote in R
> Sys.putenv("http_proxy"="http://proxy3.redegov.sp.gov.br:80/")
>
> Nevertheless the way the connection is done has changed. For example, in the
> browser the proxy is not indicated and I have to give an username and a
> password to have access to internet.

This is nothing to do with a browser: see below.

> I read the FAQ and the help for download.file but I was not able to do
> updates. I tried
>
> Sys.putenv("http_proxy_user"="ask")
> Sys.putenv("http_proxy_user"="http://:")

It does say

  Setting Proxies:

     This applies to the internal code only.

     ...

     These environment variables must be set before the download code
     is first used: they cannot be altered later by calling 'Sys.setenv'.

You can debug the download.file() session via option() 'internet.info', which should show you what is happening.

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
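The point about ordering can be sketched like this: set the proxy variable first, in a fresh session, and only then call anything that downloads. This is a minimal sketch, assuming a current R where Sys.setenv() replaces the older Sys.putenv(); the proxy host is a placeholder, and how credentials are supplied depends on the download method in use, so check ?download.file for your version.

```r
# Run this at the start of a fresh session, before any download code is used.
# "proxy.example.org" is a placeholder, not a real proxy host.
Sys.setenv(http_proxy = "http://proxy.example.org:80/")

# Ask for more diagnostics from the download code
# (lower values print more; see 'internet.info' in ?options).
options(internet.info = 0)

# Confirm what the session will actually see.
Sys.getenv("http_proxy")
```

Putting the variable in the environment before R starts (for instance in the shell, or in an Renviron file) avoids the ordering problem altogether.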
Re: [R] Tcltk
Sorry, but this works under all the circumstances I tried on my Vista system, so there is nothing I can do to debug it.

On Tue, 17 Apr 2007, Prof Brian Ripley wrote:

> I suspect tcl's own version of 'access', but can you please confirm that this
> still happens under 'Run as Administrator', assuming 'C:\Program' is a system
> area in Swedish Windows Vista?
>
> I will be able to take a closer look, but not before 2.5.0 (which is in code
> freeze and I have limited access to a Vista machine).
>
> On Tue, 17 Apr 2007, Sofia Wikström wrote:
>
>> I have problems with Tcl/Tk in R 2.4.1, when running it on Windows Vista
>> (see error message below).
>>
>> Regards, Sofia
>>
>> library(tcltk)
>> Loading Tcl/Tk interface ... Error in fun(...) : Can't find a usable
>> init.tcl in the following directories:
>>     {C:\Program\R\R-2.4.1/Tcl/lib/tcl8.4} {C:\Program\R\R-2.4.1/Tcl/lib/tcl8.4}
>>     C:/Program/R/R-2.4.1/Tcl/lib/tcl8.4 C:/Program/R/R-2.4.1/Tcl/lib/tcl8.4
>>
>> This probably means that Tcl wasn't installed properly.
>> Error: .onLoad failed in 'loadNamespace' for 'tcltk'
>> Error: package/namespace load failed for 'tcltk'
>>
>> _
>> Sofia Wikström, PhD
>> AquaBiota Water Research
>> Svante Arrhenius väg 21A, SE-104 05 Stockholm, Sweden
>> Phone: (+46) 8 16 10 07
>> [EMAIL PROTECTED]

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
Re: [R] R-OSX error: 'memory not mapped'
> On Wed, 18 Apr 2007, Atte Tenkanen wrote:
>
>> I often get a following error with R
>>
>>  *** caught segfault ***
>> address 0x78807e00, cause 'memory not mapped'
>> [ snip ]
>> The system is OSX 4.9 and R-version 2.4.1.
>> Is there something to do?
>
> Does this involve your own compiled code?

No. I have installed the R 2.4.1 .dmg package.

> If so, run R under a debugger (e.g. R -d gdb) when you will get more
> information. (You may also be able to get a core dump from option 1 and
> look at that in a debugger, but what happens is OS-dependent, including
> system settings in the OS.)
>
> If not, it is likely to be a MacOS-specific problem, so please send a
> reproducible example to the R-sig-mac list.

I am working with R on Linux now, but the next time I use the OSX version I'll do that. Sometimes the console is full of red error texts. Perhaps I can copy and send them?

Atte T.
Re: [R] regular expression help
A backreference is contained in parentheses and there are no parentheses in your regular expression, hence the error message. It's probably easiest just to remove all non-letters:

x <- "45x53yy66"
gsub("[^[:alpha:]]", "", x) # "xyy"

On 4/18/07, Petr PIKAL [EMAIL PROTECTED] wrote:
> Dear all
> as usual I am again lost in virtues of regular expressions.
> I have such character vector named vzor:
> [ snip ]
> and I want only letters from it. I tried
>
> gsub("[[:alpha:]]", "\\1", vzor)
> Error in gsub("[[:alpha:]]", "\\1", vzor) : invalid backreference 1 in regular expression
>
> gsub("[:alpha:]", "\\1", vzor)
> gives me the same vector
> [ snip ]
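Applied to a few of the values from the original question, the two ideas in this thread (deleting non-letters, and capturing the letters in a parenthesised group) look as follows. The variable names only_letters and via_backref are made up for this sketch.

```r
vzor <- c("61A", "62C/27", "65A/27", "101A/45", "110C/51")  # a few values from the question

# Delete everything that is not a letter.
only_letters <- gsub("[^[:alpha:]]", "", vzor)
only_letters      # "A" "C" "A" "A" "C"

# Same result with a backreference: the parentheses define group 1,
# which is what "\\1" in the replacement refers to.
via_backref <- sub("^[^[:alpha:]]*([[:alpha:]]+).*$", "\\1", vzor)
via_backref       # "A" "C" "A" "A" "C"
```

The first form is the more forgiving one: it still returns all letters when an element contains several separate runs of them, whereas the second keeps only the first run.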
Re: [R] Fractals with R
Now problems with the plot command...

I am trying to plot a Julia set, and the algorithm works if the points are drawn during the loop. But if I first save the values to a matrix and then plot them all at once, the picture is distorted. What's wrong with the plot command? I think the PointsMatrix is ok?

-Atte

C = -0.7-0.4i   # Complex parameter, connected to coordinate of the Mandelbrot set in a complex plane.
Limits = c(-2,2)
z = 0+0i
MaxIter = 60
cl = colours()
Step = seq(Limits[1], Limits[2], by=0.01)
PointsMatrix = array(0, dim=c(length(Step)*length(Step), 3))
a1 = 0
for (x in Step) {
  for (y in Step) {
    z1 = x+y*1i
    n = 0
    z = z1
    while (n < MaxIter && abs(z) < 2) {
      z = z^2+C
      n = n+1
    }
    if (abs(z) < 2) colour = 1 else colour = n*10
    #points(z1, pch=".", col=cl[colour]) # This works!
    # But this doesn't!
    a1 = a1+1
    PointsMatrix[a1,] = c(x,y,colour)
  }
}
#???
plot(PointsMatrix[,1], PointsMatrix[,2], xlim=Limits, ylim=Limits, col=cl[PointsMatrix[,3]], pch=".")

##

Atte Tenkanen wrote:
> Hi,
> That is of counter for web page. Do you get some pop-up windows?
> Atte
>
>> Hi Atte,
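One possible cause, offered as a reading of the posted code rather than something confirmed in the thread: a starting point with abs(z) >= 2 never enters the while loop, so n stays 0 and colour = n*10 is 0. R silently drops zero indices, so cl[PointsMatrix[,3]] ends up shorter than the number of points and plot() recycles the colours out of step with the coordinates. The small demonstration below uses made-up index values; colour_fixed is a hypothetical name.

```r
cl <- colours()

colour <- c(0, 10, 20)        # 0 is what n*10 gives when the loop body never runs
length(cl[colour])            # 2, not 3: the zero index is dropped silently

# One possible fix: shift the index so that it can never be zero.
colour_fixed <- colour + 1
length(cl[colour_fixed])      # 3, one colour per point again
```

In the posted loop the equivalent change would be colour = n*10 + 1, which stays within the 657 names returned by colours() for MaxIter = 60.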
Re: [R] regular expression help
Thank you all for your working solutions

Petr Pikal
[EMAIL PROTECTED]

[EMAIL PROTECTED] wrote on 18.04.2007 13:26:36:

> A backreference is contained in parentheses and there are no
> parentheses in your regular expression, hence the error message.
> It's probably easiest just to remove all non-letters:
>
> x <- "45x53yy66"
> gsub("[^[:alpha:]]", "", x) # "xyy"
>
> On 4/18/07, Petr PIKAL [EMAIL PROTECTED] wrote:
>> [ snip ]
Re: [R] help comparing two median with R
Thomas Lumley wrote:
> On Tue, 17 Apr 2007, Frank E Harrell Jr wrote:
>
>> The points that Thomas and Brian have made are certainly correct, if
>> one is truly interested in testing for differences in medians or
>> means. But the Wilcoxon test provides a valid test of x > y more
>> generally. The test is consonant with the Hodges-Lehmann estimator:
>> the median of all possible differences between an X and a Y.
>
> Yes, but there is no ordering of distributions (taken one at a time)
> that agrees with the Wilcoxon two-sample test, only orderings of pairs
> of distributions.
>
> The Wilcoxon test provides a test of x>y if it is known a priori that
> the two distributions are stochastically ordered, but not under weaker
> assumptions. Otherwise you can get x>y>z>x. This is in contrast to the
> t-test, which orders distributions (by their mean) whether or not they
> are stochastically ordered.
>
> Now, it is not unreasonable to say that the problems are unlikely to
> occur very often and aren't worth worrying too much about. It does
> imply that it cannot possibly be true that there is any summary of a
> single distribution that the Wilcoxon test tests for (and the same is
> true for other two-sample rank tests, eg the logrank test).
>
> I know Frank knows this, because I gave a talk on it at Vanderbilt, but
> most people don't know it. (I thought for a long time that the Wilcoxon
> rank-sum test was a test for the median pairwise mean, which is
> actually the R-estimator corresponding to the *one*-sample Wilcoxon
> test).
>
> -thomas

Thanks for your note Thomas. I do feel that the problems you have rightly listed occur infrequently and that often I only care about two groups. Rank tests generally are good at relatives, not absolutes. We have an efficient test (Wilcoxon) for relative shift but for estimating an absolute one-sample quantity (e.g., median) the nonparametric estimator is not very efficient. Ironically there is an exact nonparametric confidence interval for the median (unrelated to Wilcoxon) but none exists for the mean.

Cheers,
Frank

-- 
Frank E Harrell Jr
Professor and Chair, School of Medicine
Department of Biostatistics, Vanderbilt University
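The x>y>z>x situation mentioned in the quoted text can be illustrated with a classic set of non-transitive dice. The three distributions below are hypothetical and chosen only for the illustration; p_greater is a made-up helper that computes P(X > Y) for independent draws, which is the quantity the rank-sum statistic estimates.

```r
# Three hypothetical discrete distributions (non-transitive dice).
A <- c(2, 2, 4, 4, 9, 9)
B <- c(1, 1, 6, 6, 8, 8)
C <- c(3, 3, 5, 5, 7, 7)

# Proportion of all (x, y) pairs with x > y, i.e. P(X > Y).
p_greater <- function(x, y) mean(outer(x, y, ">"))

probs <- c(A_beats_B = p_greater(A, B),
           B_beats_C = p_greater(B, C),
           C_beats_A = p_greater(C, A))
probs   # each equals 20/36, so A "beats" B, B "beats" C, and C "beats" A
```

None of the three pairs is stochastically ordered, which is exactly the condition under which a pairwise ranking of distributions can go around in a circle.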
[R] Two sample t.test, order of comparions
Dear group members,

I want to compare response variables (logAUC) of two groups (treatment Test, Reference) of a subset (period == 1) in dataframe resp (below):

   sequence subject period treatment  AUC   logAUC
1        RT       1      1 Reference 44.1 3.786460
2        RT       1      2      Test 39.1 3.666122
3        RT       2      1 Reference 33.6 3.514526
4        RT       2      2      Test 23.8 3.169686
5        RT       3      1 Reference 45.5 3.817712
6        RT       3      2      Test 40.8 3.708682
7        TR       4      1      Test 19.5 2.970414
8        TR       4      2 Reference 21.1 3.049273
9        TR       5      1      Test 67.2 4.207673
10       TR       5      2 Reference 51.5 3.941582
11       TR       6      1      Test 25.7 3.246491
12       TR       6      2 Reference 30.1 3.404525
13       RT       7      1 Reference 35.3 3.563883
14       RT       7      2      Test 26.7 3.284664
15       RT       8      1 Reference 26.0 3.258097
16       RT       8      2      Test 36.5 3.597312
17       RT       9      1 Reference 38.2 3.642836
18       RT       9      2      Test 57.8 4.056989
19       TR      10      1      Test 33.6 3.514526
20       TR      10      2 Reference 32.5 3.481240
21       TR      11      1      Test 25.1 3.222868
22       TR      11      2 Reference 36.8 3.605498
23       TR      12      1      Test 44.1 3.786460
24       TR      12      2 Reference 42.9 3.758872
25       RT      13      1 Reference 25.6 3.242592
26       RT      13      2      Test 20.1 3.000720
27       RT      14      1 Reference 58.0 4.060443
28       RT      14      2      Test 45.3 3.813307
29       RT      15      1 Reference 47.2 3.854394
30       RT      15      2      Test 51.8 3.947390
31       TR      16      1      Test 16.5 2.803360
32       TR      16      2 Reference 21.4 3.063391
33       TR      17      1      Test 47.3 3.856510
34       TR      17      2 Reference 39.4 3.673766
35       TR      18      1      Test 22.6 3.117950
36       TR      18      2 Reference 17.3 2.850707
37       RT      19      1 Reference 17.5 2.862201
38       RT      19      2      Test 30.1 3.404525
39       RT      20      1 Reference 51.7 3.945458
40       RT      20      2      Test 36.0 3.583519
41       RT      21      1 Reference 24.5 3.198673
42       RT      21      2      Test 18.2 2.901422
43       TR      22      1      Test 36.3 3.591818
44       TR      22      2 Reference 27.2 3.303217
45       TR      23      1      Test 29.4 3.380995
46       TR      23      2 Reference 39.6 3.678829
47       TR      24      1      Test 18.3 2.906901
48       TR      24      2 Reference 20.7 3.030134

The formula method of t.test

result <- t.test(logAUC ~ treatment, data = resp, subset = (period == 1), var.equal = FALSE, conf.level = 0.90)
result

gives

        Welch Two Sample t-test

data:  logAUC by treatment
t = 1.1123, df = 21.431, p-value = 0.2783
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -0.0973465  0.4542311
sample estimates:
mean in group Reference      mean in group Test
               3.562273                3.383831

Now I'm interested in the confidence interval of Test - Reference rather than Reference - Test, which is what t.test gives.

Do you know a more elegant way than the clumsy one I have tried?

as.numeric(exp(result$estimate[2]-result$estimate[1]))
as.numeric(exp(-result$conf.int[2]))
as.numeric(exp(-result$conf.int[1]))

Best regards,
Helmut

-- 
Ing. Helmut Schütz
BEBAC - Consultancy Services for Bioequivalence and Bioavailability Studies
Neubaugasse 36/11
1070 Vienna, Austria
tel/fax +43 1 2311746
e-mail [EMAIL PROTECTED]
web http://bebac.at
forum http://forum.bebac.at
Re: [R] Two sample t.test, order of comparions
On Apr 18, 2007, at 8:46 AM, Helmut Schütz wrote:

> Dear group members,
>
> I want to compare response variables (logAUC) of two groups (treatment
> Test, Reference) of a subset (period == 1) in dataframe resp (below):

[ snip ]

> The formula method of t.test
>
> result <- t.test(logAUC ~ treatment, data = resp, subset = (period ==
> 1), var.equal = FALSE, conf.level = 0.90)
> result
>
> gives
>
> Welch Two Sample t-test
>
> data: logAUC by treatment
> t = 1.1123, df = 21.431, p-value = 0.2783
> alternative hypothesis: true difference in means is not equal to 0
> 90 percent confidence interval:
>  -0.0973465 0.4542311
> sample estimates:
> mean in group Reference      mean in group Test
>                3.562273                3.383831
>
> Now I'm interested rather in the confidence interval of Test -
> Reference rather than Reference - Test which is given by t.test
>
> Do you know a more elegant way than the clumsy one I have tried?
>
> as.numeric(exp(result$estimate[2]-result$estimate[1]))
> as.numeric(exp(-result$conf.int[2]))
> as.numeric(exp(-result$conf.int[1]))

First off, those three could probably be simplified slightly as:

as.numeric(exp(diff(result$estimate)))
as.numeric(rev(exp(-result$conf.int)))

The simplest solution, I think, is to first specify that resp$treatment should have its levels ordered the way you like them:

resp$treatment <- ordered(resp$treatment, levels=rev(levels(resp$treatment)))

Then t.test will show things in the order you want them.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
Re: [R] Two sample t.test, order of comparions
On 4/18/07, Helmut Schütz [EMAIL PROTECTED] wrote:
> Dear group members,
>
> I want to compare response variables (logAUC) of two groups (treatment
> Test, Reference) of a subset (period == 1) in dataframe resp (below):
>
> [ snip ]
>
> Now I'm interested rather in the confidence interval of Test -
> Reference rather than Reference - Test which is given by t.test

You could change the order of the levels of the treatment factor. See ?relevel

> Do you know a more elegant way than the clumsy one I have tried?
>
> as.numeric(exp(result$estimate[2]-result$estimate[1]))
> as.numeric(exp(-result$conf.int[2]))
> as.numeric(exp(-result$conf.int[1]))
>
> Best regards,
> Helmut
[R] Changing axis label text size in plots?
Dear list

I'm plotting ( boxplot() and plot() ) some data for a publication. The editor would like the text labels on the plots in a larger font. I'm doing something like this:

snip
jpeg( filename = "D:/Martin/Work/CleanPath/RAF1%03d.jpg", width = 1000, height = 600, pointsize = 12, quality = 100, bg = "white", res = 96, restoreConsole = TRUE )
boxplot( dafExpo[,c(1,2,3,4,5,6,7,8,9,10,11,12)], main = "Total NOx Exposure per trip" )
numAxeMax = max(dafExpo$grn07, dafExpo$grn09, dafExpo$grn14, dafExpo$grn16)
plot( dafExpo$grn07, dafExpo$grn09, xlim=c(0,numAxeMax), ylim=c(0,numAxeMax), pch=1, xlab="Green at 7 AM : NOx [(µg/m3)*hour]", ylab="Green at 9 AM : NOx [(µg/m3)*hour]" )
abline(0,1)
snip

dafExpo is a dataframe and the above code works just fine.

My question: How do I change the size of the text seen on the plot, the axis labels in particular?
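A minimal sketch of one way to enlarge plot text, using the standard graphical parameters cex.lab (axis titles), cex.axis (tick labels) and cex.main (main title). The data below are synthetic stand-ins for dafExpo$grn07 and dafExpo$grn09, the scaling factors are arbitrary choices, and a temporary PDF is used only so that the example runs without the poster's Windows path.

```r
# Synthetic stand-ins for the poster's columns (assumption, not real data)
set.seed(1)
grn07 <- runif(30, 0, 100)
grn09 <- grn07 + rnorm(30, sd = 10)
numAxeMax <- max(grn07, grn09)

out <- tempfile(fileext = ".pdf")   # hypothetical output file
pdf(out)

# cex.* values are multipliers relative to the device's base text size
op <- par(cex.lab = 1.5, cex.axis = 1.3, cex.main = 1.6)
plot(grn07, grn09,
     xlim = c(0, numAxeMax), ylim = c(0, numAxeMax), pch = 1,
     xlab = "Green at 7 AM : NOx [(µg/m3)*hour]",
     ylab = "Green at 9 AM : NOx [(µg/m3)*hour]",
     main = "Total NOx Exposure per trip")
abline(0, 1)
par(op)      # restore the previous settings

dev.off()
```

The same cex.lab, cex.axis and cex.main arguments can also be passed directly to plot() or boxplot(); raising pointsize in the jpeg() call scales all text at once.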
Re: [R] Two sample t.test, order of comparisons
take a look at ?relevel()

Best,
Dimitris

Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm

----- Original Message -----
From: Helmut Schütz [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Sent: Wednesday, April 18, 2007 2:46 PM
Subject: [R] Two sample t.test, order of comparisons

Now I'm interested rather in the confidence interval of Test - Reference rather than Reference - Test which is given by t.test

Do you know a more elegant way than the clumsy one I have tried?

as.numeric(exp(result$estimate[2]-result$estimate[1]))
as.numeric(exp(-result$conf.int[2]))
as.numeric(exp(-result$conf.int[1]))

Best regards,
Helmut
[R] R-2.4.1 for MacOS X - languageR, acepack, Hmisc
I updated R to the latest version, 2.4.1, and unfortunately I cannot load languageR any longer. In R-2.4.1, languageR requires acepack, but Hmisc doesn't work when acepack is loaded.

library(languageR)
Loading required package: Design
Loading required package: Hmisc
Loading required package: acepack
Error in dyn.load(x, as.logical(local), as.logical(now)) :
  unable to load shared library '/Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so':
  dlopen(/Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so, 6): Library not loaded: /usr/local/gcc4.0/i686-apple-darwin8/lib/libgcc_s.1.0.dylib
  Referenced from: /Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so
  Reason: image not found
Error: package 'Hmisc' could not be loaded

Apparently the Hmisc.so cannot be loaded, but it is actually there:

source("/Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so")
Error in parse(file, n = -1, NULL, "?") : syntax error at 1:

Did anybody else encounter the same problem? If so, I would be very grateful to anybody who could tell me how to solve it.

Thanks,
Lara Tagliapietra
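A small sketch of how the two files named in the error message could be checked from R. source() parses a file as R code, so calling it on a compiled .so says nothing about whether the library can be loaded; file.exists() only tests for presence. Both paths are copied from the error output above, and the reading that the missing piece is the libgcc dylib rather than Hmisc.so is an interpretation of that output, not a confirmed diagnosis.

```r
# Paths taken verbatim from the dlopen() error message
paths <- c(
  hmisc  = "/Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so",
  libgcc = "/usr/local/gcc4.0/i686-apple-darwin8/lib/libgcc_s.1.0.dylib"
)

# TRUE/FALSE per file; on a machine without these files both are FALSE
found <- file.exists(paths)
print(data.frame(file = names(paths), found = found))
```

If Hmisc.so is found but the libgcc file is not, the "Library not loaded" line in the message would point at the dependency rather than at the package itself.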
Re: [R] Tcltk
Prof Brian Ripley wrote:

Sorry, but this works under all the circumstances I tried on my Vista system, so there is nothing I can do to debug it.

You (i.e. Sofia) could do some investigation yourself. It may prove informative if you search for init.tcl and check whether it is readable (for you as ordinary user). It is supposed to be a plain text file, so notepad/wordpad can read it.

-p

On Tue, 17 Apr 2007, Prof Brian Ripley wrote:

I suspect tcl's own version of 'access', but can you please confirm that this still happens under 'Run as Administrator', assuming 'C:\Program' is a system area in Swedish Windows Vista? I will be able to take a closer look, but not before 2.5.0 (which is in code freeze and I have limited access to a Vista machine).

On Tue, 17 Apr 2007, Sofia Wikström wrote:

I have problems with Tcl/Tk in R 2.4.1, when running it on Windows Vista (see error message below).

Regards, Sofia

library(tcltk)
Loading Tcl/Tk interface ... Error in fun(...) : Can't find a usable init.tcl in the following directories:
    {C:\Program\R\R-2.4.1/Tcl/lib/tcl8.4} {C:\Program\R\R-2.4.1/Tcl/lib/tcl8.4} C:/Program/R/R-2.4.1/Tcl/lib/tcl8.4 C:/Program/R/R-2.4.1/Tcl/lib/tcl8.4
This probably means that Tcl wasn't installed properly.
Error: .onLoad failed in 'loadNamespace' for 'tcltk'
Error: package/namespace load failed for 'tcltk'

_
Sofia Wikström, PhD
AquaBiota Water Research
Svante Arrhenius väg 21A, SE-104 05 Stockholm, Sweden
Phone: (+46) 8 16 10 07
[EMAIL PROTECTED]
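The check suggested above (find init.tcl and see whether it is readable) can also be sketched from within R. This assumes the Tcl files live under a "Tcl" directory inside R's home directory, as the paths in the error message suggest for this Windows install; on other layouts the search simply returns nothing.

```r
# Assumed location, based on the C:/Program/R/R-2.4.1/Tcl/... paths in the error
tcl_dir <- file.path(R.home(), "Tcl")

# Look for init.tcl anywhere below that directory
hits <- list.files(tcl_dir, pattern = "^init\\.tcl$",
                   recursive = TRUE, full.names = TRUE)

# file.access(..., mode = 4) returns 0 when the file is readable by the current user
readable <- if (length(hits) > 0) file.access(hits, mode = 4) == 0 else logical(0)

print(data.frame(file = hits, readable = readable))
```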
Re: [R] Two sample t.test, order of comparisons
Dear Charilaos!

Charilaos Skiadas wrote:

Do you know a more elegant way than the clumsy one I have tried?

as.numeric(exp(result$estimate[2]-result$estimate[1]))
as.numeric(exp(-result$conf.int[2]))
as.numeric(exp(-result$conf.int[1]))

First off, those three could probably be simplified slightly as:

as.numeric(exp(-diff(result$estimate)))
as.numeric(exp(-result$conf.int))

The simplest solution I think is to specify that resp$treatment should have the levels ordered in the way you like them using this first:

resp$treatment <- ordered(resp$treatment, levels=rev(levels(resp$treatment)))

Then the t.test will show things in the order you want them.

I applied relevel() as suggested by Douglas and Dimitris:

relevel(resp$treatment, ref = "Reference")
result <- t.test(logAUC ~ treatment, data = resp, subset = (period == 1), var.equal = FALSE, conf.level = 0.90)
result

yielding

        Welch Two Sample t-test

data:  logAUC by treatment
t = 1.1123, df = 21.431, p-value = 0.2783
alternative hypothesis: true difference in means is not equal to 0
90 percent confidence interval:
 -0.0973465  0.4542311
sample estimates:
mean in group Reference      mean in group Test
               3.562273                3.383831

So right now the confidence interval in the log-domain is of the correct order. Your first suggestion is working (sign changed due to reversed level):

as.numeric(exp(diff(result$estimate)))
[1] 0.8365723

But still I have to apply

as.numeric(exp(-result$conf.int[2]))
[1] 0.634936
as.numeric(exp(-result$conf.int[1]))
[1] 1.102242

because

as.numeric(exp(-result$conf.int))
[1] 1.102242 0.634936

In order to get the correct CI in the untransformed domain I had to sort the list:

sort(as.numeric(exp(-result$conf.int)))
[1] 0.634936 1.102242

Best regards,
Helmut

--
Ing. Helmut Schütz
BEBAC - Consultancy Services for Bioequivalence and Bioavailability Studies
Neubaugasse 36/11
1070 Vienna, Austria
tel/fax +43 1 2311746
e-mail [EMAIL PROTECTED]
web http://bebac.at
forum http://forum.bebac.at
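A note on the relevel() call quoted above, offered as a sketch rather than as the poster's method: relevel() returns a new factor and does not modify resp in place, so without an assignment the test is unchanged, which would explain why the output is identical to the first run. Since t.test() reports the first level minus the second, making "Test" the reference level gives Test - Reference directly. The block below rebuilds only the period 1 rows from the posted data; the data frame name resp1 is an illustrative choice.

```r
# Period 1 logAUC values copied from the posted data
ref  <- c(3.786460, 3.514526, 3.817712, 3.563883, 3.258097, 3.642836,
          3.242592, 4.060443, 3.854394, 2.862201, 3.945458, 3.198673)
test <- c(2.970414, 4.207673, 3.246491, 3.514526, 3.222868, 3.786460,
          2.803360, 3.856510, 3.117950, 3.591818, 3.380995, 2.906901)

resp1 <- data.frame(
  treatment = factor(rep(c("Reference", "Test"), each = 12)),
  logAUC    = c(ref, test)
)

# Assign the result: "Test" becomes the first level
resp1$treatment <- relevel(resp1$treatment, ref = "Test")

result <- t.test(logAUC ~ treatment, data = resp1,
                 var.equal = FALSE, conf.level = 0.90)

# Estimates are now ordered Test, Reference; conf.int is for Test - Reference
ratio <- exp(unname(result$estimate[1] - result$estimate[2]))
ci    <- exp(as.numeric(result$conf.int))

ratio   # point estimate of Test/Reference
ci      # lower and upper limit, already in increasing order
```

With the levels in this order no sign flip or sort() is needed before back-transforming.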
Re: [R] Running R in a bash script
Prof Brian Ripley wrote:

On Wed, 18 Apr 2007, Ulrik Stervbo wrote:

As I had problems installing the Cairo package, I went for Henrik's solution - and it works almost perfectly. I would like to have been able to generate transparent png.

You cannot do transparency via postscript. I would suggest using pdf() and converting the output of that, which often works even better (and does have full transparency support).

Cairo also has full transparency support, both for png and pdf. Ulrik, I presume the problem you ran into was the absence of the cairo library dependency:

configure: error: Cannot find cairo.h! Please install cairo (http://www.cairographics.org/) and/or set CAIRO_CFLAGS/LIBS correspondingly.

which has its own library dependencies as well. Most all Linux distributions (I'm presuming you're running Linux) have ways to install these dependencies easily: apt-get, rpm, etc. Otherwise, if you had other problems installing Cairo, please let me know and I can fix them.

Jeff

Thanks for the help
Ulrik

On 18/04/07, Henrik Bengtsson [EMAIL PROTECTED] wrote:

Or see png2() in R.utils, which imitates png() but uses bitmap(), which in turn uses postscript-to-png via ghostscript. BTW, personally I think PNGs generated via bitmap() look way better than the ones generated via png().

As there are two separate versions of png() for different OSes, comments like that are very system-dependent. Other postings suggest this is Windows, and if png() is giving poor results there it suggests a problem with the way Windows' GDI is configured (which depends on the graphics card). And of course, PNGs don't 'look' at all: they are rendered by some other tool, and quite often the perceived problem with R graphical output is in fact with the rendering tool.

/Henrik

On 4/17/07, Jeffrey Horner [EMAIL PROTECTED] wrote:

Ulrik Stervbo wrote:

Hello!

I am having issues trying to plot to a png (or jpg) when the R-code in a bash script is executed from cron. I can generate a pdf file, but when I try to write to a png, the file is created, but nothing is written. If I execute the bash script from my console, everything works fine. Any ideas?

In my cron I have SHELL=/bin/bash - otherwise /bin/sh is used - and the following entry, so example is executed every minute:

* * * * * [path]/example.sh

I am running R version 2.4.1 (2006-12-18)

Here's a minimal example - two files, one R-script ('example.r') and one bash-script ('example.sh')

example.r

# Example R-script
x <- c(1:10)
y <- x^2
png(file="example2.png")
#pdf(file="example2.pdf")
plot(x,y)
graphics.off()

example.sh

#!/bin/bash
#
# Hello world is written to echotext every time cron executes this script
echo "Hello world" > echotext
# This works, but not when executed from cron
n=`R --save < example.r`
# using exec as in `exec R --save < example.r` doesn't work with cron either
# This also works, but nothing is written to the png when executed from cron
R --save <<RSCRIPT
x <- c(1:10)
y <- x^2
png(file="example2.png")
#pdf(file="example2.pdf")
plot(x,y)
graphics.off()
#dev.off() doesn't work at all when executed from cron
RSCRIPT

The png() device requires an X server for the image rendering. You might be able to get away with exporting the DISPLAY environment variable

export DISPLAY=:0.0 # try and connect to X server on display 0.0

within your script, but it will only work if the script is executed by the same user as is running the X server, *and* the X server is running at the time the script is executed.

There are a handful of packages that will create a png without the presence of an X server, and I'm partial to Cairo (since I've done some work on it). You can install the latest version like this:

install.packages("Cairo",,'http://rforge.net/',type='source')

Cairo can also output nice pdf's with embedded fonts... useful if you want to embed high-quality OpenType or TrueType fonts.

Best,
Jeff
--
http://biostat.mc.vanderbilt.edu/JeffreyHorner
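Since the png() device may fail to open when no X server is reachable (as under cron in the report above), a script can guard the call and fall back to pdf(), which the original poster says does work there. This is only a sketch of that fallback idea, not something proposed in the thread; the file names in tempdir() are placeholders.

```r
x <- 1:10
y <- x^2

png_file <- file.path(tempdir(), "example2.png")   # placeholder paths
pdf_file <- file.path(tempdir(), "example2.pdf")

# Try to open the png device; if that raises an error, use pdf() instead
opened_png <- tryCatch({ png(png_file); TRUE }, error = function(e) FALSE)
if (!opened_png) pdf(pdf_file)

plot(x, y)
dev.off()

out <- if (opened_png) png_file else pdf_file
cat("Plot written to", out, "\n")
```

In a cron job the cat() line ends up in cron's mail or log output, which makes it easy to see which device was actually used.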
[R] division of decimal number
Dear R-Experts,

how can I divide the number 0.285 by 2? I need a function.

Result: 0.285 / 2 = 0.1425

Thanks,
Corinna
Re: [R] division of decimal number
Schmitt, Corinna wrote:

Dear R-Experts, how can I divide the number 0.285 with 2. I need a function. Result: 0.285 / 2 = 0.1425

Just get the / operator:

divide = get("/")
divide(0.285,2)
[1] 0.1425

Is that what you want?

Barry
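For completeness, a short sketch of the same idea written as an ordinary named function. In R the operator itself is a function and can be called with backticks; the name divide and the extra input values are illustrative only.

```r
# An explicit wrapper around the division operator
divide <- function(x, by) x / by

divide(0.285, 2)      # 0.1425
`/`(0.285, 2)         # same call, using the operator as a function

# Works element-wise on vectors too
halves <- divide(c(0.285, 1, 3), 2)
halves
```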
Re: [R] Tcltk
Thanks for the help! I found out that it was a problem with the search path for R. Reinstalling R seems to have solved the problem. Sorry for bothering you before having tested that.

Sofia

-----Original message-----
From: Peter Dalgaard [mailto:[EMAIL PROTECTED]
Sent: 18 April 2007 16:11
To: Prof Brian Ripley
Cc: Sofia Wikström; r-help@stat.math.ethz.ch
Subject: Re: [R] Tcltk
Re: [R] division of decimal number
Well, I think

half.of.0.285 <- function() { 0.1425 }

would do the trick.

Gabor

On Wed, Apr 18, 2007 at 04:42:49PM +0200, Schmitt, Corinna wrote:

Dear R-Experts, how can I divide the number 0.285 with 2. I need a function. Result: 0.285 / 2 = 0.1425 Thanks, Corinna

--
Csardi Gabor [EMAIL PROTECTED]
MTA RMKI, ELTE TTK
[R] importing excel-file
Dear R-experts,

It is quite a stupid question, but please help me. I am very confused. I am able to import normal txt and mat files into R but unable to import an .xls file. I do not understand the online help. Can anyone please send me the corresponding command lines?

The .xls file is attached. In my file we use commas for the decimal format (example: 0,712); changes might be needed.

Thanks,
Corinna
[R] ESS function highlighting
Hi,

is there a way of telling Emacs + ESS to show words that are already a function in R (such as 'length') in a different colour/font?

Best,
Federico

--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG
Tel +44 (0)20 75941602 Fax +44 (0)20 75943193
f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com
[R] Get the: standard deviation and mean value of 3D-measurements
Hello,

I have got a numeric matrix with three columns holding a few measurements. The x and y coordinates are increasing numbers; the z coordinate is the measured value. The matrix holds ~6000 sorted values per measurement, multiplied by the number of measurements. An example is at the end.

My aim is to get the mean value and standard deviation of the z coordinate for each position across the measurements. The problem is that I don't always have the same number of x and y values: in the example, (29 4.8 xxx) is missing in measurement two.

Has anybody an idea what I can do? I thought I could split the measurements and copy them side by side in the matrix. But then I can't work with mean because I don't have the same number of values in every measurement (~6000).

Thanks a lot.
Felix

PS: Thank you for the help in the past!

(x, y polar coordinates) of two measurements
--
x    y    z
29   4.5  1.505713
29   4.6  1.580402
29   4.7  1.656875
29   4.8  1.735054
30   0    0
30   0.1  0.00096108
30   0.2  0.00323831
...
29   4.5  1.495148
29   4.6  1.568961
29   4.7  1.644467
30   0    0
30   0.1  0.00093699
30   0.2  0.00319411
30   0.3  0.00676619
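One possible approach, sketched on the excerpt shown above: stack the measurements in a single data frame and group by the (x, y) pair, so that positions missing from one measurement simply contribute fewer values. The object names m1, m2 and stacked are illustrative; the rows are the excerpt from the post, not the full ~6000-row data.

```r
# The two excerpts from the post
m1 <- data.frame(x = c(29, 29, 29, 29, 30, 30, 30),
                 y = c(4.5, 4.6, 4.7, 4.8, 0, 0.1, 0.2),
                 z = c(1.505713, 1.580402, 1.656875, 1.735054,
                       0, 0.00096108, 0.00323831))
m2 <- data.frame(x = c(29, 29, 29, 30, 30, 30, 30),
                 y = c(4.5, 4.6, 4.7, 0, 0.1, 0.2, 0.3),
                 z = c(1.495148, 1.568961, 1.644467,
                       0, 0.00093699, 0.00319411, 0.00676619))

stacked <- rbind(m1, m2)

# Mean, standard deviation and count of z per (x, y) position
res <- do.call(data.frame,
               aggregate(z ~ x + y, data = stacked,
                         FUN = function(v) c(mean = mean(v), sd = sd(v), n = length(v))))
res <- res[order(res$x, res$y), ]
res
```

Positions measured only once get n = 1 and an NA standard deviation. If the coordinates in the real data are computed rather than typed, rounding them (for example with round(y, 1)) before grouping avoids near-identical values ending up in separate groups.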
Re: [R] help comparing two median with R
Has anyone proposed using a bootstrap for Pedro's problem? What about taking a bootstrap sample from x, a bootstrap sample from y, take the difference in the medians for these two bootstrap samples, repeat the process 1,000 times and calculate the 95th percentiles of the 1,000 computed differences? You would get a CI on the difference between the medians for these two groups, with which you could determine whether the difference was greater/less than zero. Too crude?

Regards,
-Cody

From: Frank E Harrell Jr [EMAIL PROTECTED]
Sent by: [EMAIL PROTECTED]
Date: 04/18/2007 05:02 AM
To: Thomas Lumley [EMAIL PROTECTED]
cc: r-help@stat.math.ethz.ch
Subject: Re: [R] help comparing two median with R

Thomas Lumley wrote:

On Tue, 17 Apr 2007, Frank E Harrell Jr wrote:

The points that Thomas and Brian have made are certainly correct, if one is truly interested in testing for differences in medians or means. But the Wilcoxon test provides a valid test of x > y more generally. The test is consonant with the Hodges-Lehmann estimator: the median of all possible differences between an X and a Y.

Yes, but there is no ordering of distributions (taken one at a time) that agrees with the Wilcoxon two-sample test, only orderings of pairs of distributions. The Wilcoxon test provides a test of x>y if it is known a priori that the two distributions are stochastically ordered, but not under weaker assumptions. Otherwise you can get x>y>z>x. This is in contrast to the t-test, which orders distributions (by their mean) whether or not they are stochastically ordered.

Now, it is not unreasonable to say that the problems are unlikely to occur very often and aren't worth worrying too much about. It does imply that it cannot possibly be true that there is any summary of a single distribution that the Wilcoxon test tests for (and the same is true for other two-sample rank tests, eg the logrank test).

I know Frank knows this, because I gave a talk on it at Vanderbilt, but most people don't know it. (I thought for a long time that the Wilcoxon rank-sum test was a test for the median pairwise mean, which is actually the R-estimator corresponding to the *one*-sample Wilcoxon test).

-thomas

Thanks for your note Thomas. I do feel that the problems you have rightly listed occur infrequently and that often I only care about two groups. Rank tests generally are good at relatives, not absolutes. We have an efficient test (Wilcoxon) for relative shift but for estimating an absolute one-sample quantity (e.g., median) the nonparametric estimator is not very efficient. Ironically there is an exact nonparametric confidence interval for the median (unrelated to Wilcoxon) but none exists for the mean.

Cheers,
Frank

--
Frank E Harrell Jr
Professor and Chair
School of Medicine
Department of Biostatistics
Vanderbilt University
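The bootstrap idea raised at the top of this message can be sketched in a few lines of base R. This is a plain percentile bootstrap on synthetic data; x, y, the seed and B = 1000 are illustrative assumptions. For a two-sided 95% interval the 2.5th and 97.5th percentiles of the bootstrap differences are used rather than a single 95th percentile.

```r
set.seed(1)
x <- rexp(50)          # synthetic group 1 (not data from the thread)
y <- rexp(60) + 0.3    # synthetic group 2

B <- 1000
# Resample each group with replacement and record the difference in medians
d <- replicate(B, median(sample(x, replace = TRUE)) -
                  median(sample(y, replace = TRUE)))

# Percentile interval for median(x) - median(y)
ci <- quantile(d, c(0.025, 0.975))
ci
```

Whether the interval covers zero then gives a rough answer to the question posed; the discussion quoted above explains why this addresses a difference in medians and not the hypothesis tested by the Wilcoxon test.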
[R] Problem with ?curve
Dear all R gurus,

I have following syntax:

y = c(1:10)
chippy <- function(x) {
  y[5] = x
  sin(cos(t(y)%*%y)*exp(-t(y)%*%y/2))
}
curve(chippy, 1, 20, n=200)

But I am getting error while executing:

Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ
In addition: Warning message:
number of items to replace is not a multiple of replacement length

Can anyone tell me how I can recover?

Thanks
Ron
[R] Specifying ANCOVA models in R
Hi all,

I am trying to fit an ANOVA model in R using the aov/lm commands. I have a set of observational (i.e. no fixed experimental effects) data, in which I have identified high and low clusters of the response variable. The design is unbalanced, with 773 high cluster observations, and 523 low cluster observations.

I would like to test a set of 7 correlates to see if there are significant differences in their means between the clusters: That is I have one fixed effect with 2 levels, and a bunch of 7 continuous predictors. I believe the correct model specification is an ANCOVA design(?)

I can fit this model in MINITAB using, say:

glm response = cluster;
covariate predictor1 predictor2 ... predictor7.

In R, if I specify the model using

cluster <- ordered(cluster, levels=c("Low","High"))
Model <- lm(predictor~response1+response2+ ... response7+cluster)

I can replicate the results from MINITAB, getting identical P and t values when I do summary(lm(Model)), but the F values are all different (huge) when I do summary(aov(Model)) for all correlates. The F value for the fixed effect is correct. The P values for summary(aov(Model)) are all highly significant too.

I would like to fit the model in R, both for consistency with my other analysis, and because I use R on my home machine, and have to venture into the university labs to use MINITAB.

Many thanks
Luke Spadavecchia
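One likely explanation for the pattern described above, offered as a sketch and not as a diagnosis of the poster's data: summary(aov()) and anova() report sequential (Type I) sums of squares, where each term is adjusted only for the terms before it, while the t-tests in summary(lm()) are marginal, adjusted for all other terms. With correlated covariates the two differ for every term except the last one in the formula. drop1() gives marginal F tests, and for single-degree-of-freedom terms F equals t squared. The data below are synthetic and the variable names are illustrative.

```r
set.seed(42)
n <- 200
cluster <- factor(sample(c("Low", "High"), n, replace = TRUE),
                  levels = c("Low", "High"))
p1 <- rnorm(n)
p2 <- 0.8 * p1 + rnorm(n)                  # deliberately correlated with p1
resp <- 1 + 0.5 * p1 + 0.3 * p2 + 0.7 * (cluster == "High") + rnorm(n)

fit <- lm(resp ~ p1 + p2 + cluster)

tvals <- summary(fit)$coefficients[-1, "t value"]   # marginal t tests
seqF  <- anova(fit)[1:3, "F value"]                  # sequential F (as in aov)
margF <- drop1(fit, test = "F")[-1, "F value"]       # marginal F

round(cbind(t.squared = tvals^2, sequential.F = seqF, marginal.F = margF), 3)
```

In the printed table the marginal F values match t squared for every term, while the sequential F agrees only for cluster, the last term, which is consistent with the report that only the fixed-effect F value matched.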
Re: [R] Problem with ?curve
Your chippy function is not vectorized. See ?curve and try:

curve(Vectorize(chippy)(x), 1, 20, n=200)

On 4/18/07, Ron Michael [EMAIL PROTECTED] wrote:

Dear all R gurus,

I have following syntax:

y = c(1:10)
chippy <- function(x) {
  y[5] = x
  sin(cos(t(y)%*%y)*exp(-t(y)%*%y/2))
}
curve(chippy, 1, 20, n=200)

But I am getting error while executing:

Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ
In addition: Warning message:
number of items to replace is not a multiple of replacement length

Can anyone tell me how I can recover?

Thanks
Ron
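A self-contained sketch of why the original call fails and how the Vectorize() fix behaves. curve() passes the whole vector of 200 x values to the function in one call, so y[5] = x tries to put 200 numbers into one slot. The version below replaces t(y) %*% y with the equivalent sum(y * y) so that a plain number is returned; y0 and vchippy are illustrative names.

```r
y0 <- 1:10

# Scalar version: expects a single x
chippy <- function(x) {
  y <- y0
  y[5] <- x
  s <- sum(y * y)              # same value as t(y) %*% y, but a plain number
  sin(cos(s) * exp(-s / 2))
}

# Vectorize() wraps the function so it is called once per element of x
vchippy <- Vectorize(chippy)

xs   <- seq(1, 20, length.out = 200)
vals <- vchippy(xs)
length(vals)                   # one value per x

curve(vchippy(x), 1, 20, n = 200)
```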
Re: [R] help comparing two median with R
[EMAIL PROTECTED] wrote:

Has anyone proposed using a bootstrap for Pedro's problem? What about taking a bootstrap sample from x, a bootstrap sample from y, take the difference in the medians for these two bootstrap samples, repeat the process 1,000 times and calculate the 95th percentiles of the 1,000 computed differences? You would get a CI on the difference between the medians for these two groups, with which you could determine whether the difference was greater/less than zero. Too crude?

Regards,
-Cody

As hinted at by Brian Ripley, the following code will approximate that. It gets the nonparametric confidence interval for the median and solves for the variance that would give the same confidence interval width if normality of the median held.

g <- function(y) {
  y <- sort(y[!is.na(y)])
  n <- length(y)
  if(n < 4) return(c(median=median(y),q1=NA,q3=NA,variance=NA))
  qu <- quantile(y, c(.5,.25,.75))
  names(qu) <- NULL
  r <- pmin(qbinom(c(.025,.975), n, .5) + 1, n)  ## Exact 0.95 C.L.
  w <- y[r[2]] - y[r[1]]                         ## Width of C.L.
  var.med <- ((w/1.96)^2)/4                      ## Approximate variance of median
  c(median=qu[1], q1=qu[2], q3=qu[3], variance=var.med)
}

Run g separately by group, add the two variances, and take the square root to approximate the standard error of the difference in medians and get a confidence interval.

Frank
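The recipe attached to the function g quoted in this message (run it per group, add the variances, take the square root) might be applied as in the following sketch. g is repeated here so that the block runs on its own, with the assignment arrows and the n < 4 comparison written out. The two samples are synthetic and the 1.96 multiplier assumes an approximately normal difference in medians.

```r
g <- function(y) {
  y <- sort(y[!is.na(y)])
  n <- length(y)
  if (n < 4) return(c(median = median(y), q1 = NA, q3 = NA, variance = NA))
  qu <- quantile(y, c(.5, .25, .75))
  names(qu) <- NULL
  r <- pmin(qbinom(c(.025, .975), n, .5) + 1, n)  # order-statistic ranks for a 0.95 interval
  w <- y[r[2]] - y[r[1]]                          # width of that interval
  var.med <- ((w / 1.96)^2) / 4                   # variance giving the same width under normality
  c(median = qu[1], q1 = qu[2], q3 = qu[3], variance = var.med)
}

set.seed(7)
x <- rnorm(40, mean = 10, sd = 2)   # synthetic group 1
y <- rnorm(35, mean = 11, sd = 2)   # synthetic group 2

gx <- g(x)
gy <- g(y)

d  <- unname(gx["median"] - gy["median"])              # difference in medians
se <- unname(sqrt(gx["variance"] + gy["variance"]))    # approximate standard error
ci <- d + c(-1, 1) * 1.96 * se                         # approximate 95% interval
c(difference = d, lower = ci[1], upper = ci[2])
```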
Re: [R] importing excel-file
Corinna Schmitt wrote:
> It is a quite stupid question but please help me. I am very confused. I am able to import normal txt and mat-files to R but unable to import .xls-file

I've tried two ways to import Excel files, but neither of them seems perfect.

Method 1: This method uses library RODBC. The way to import Excel files is this:

    library(RODBC)
    channel <- odbcConnectExcel("myfile.xls")
    tables <- sqlTables(channel)        # list the names of the spreadsheets
    name1 <- tables[1, "TABLE_NAME"]    # get the name of the 1st spreadsheet
    plan1 <- sqlFetch(channel, name1)   # this _should_ work, but it doesn't
    # The reason is that somehow the names of the sheets are altered
    plan1 <- sqlFetch(channel, "sheet name")   # this works
    # but you must type the exact name of the sheet
    # the next line works, no matter what is name1 (taken from tables)
    plan1 <- sqlQuery(channel, sprintf("select * from [%s]", name1))
    odbcClose(channel)                  # close it

This is not perfect. Some (most?) of the numerical fields in the spreadsheet are translated to NA and become meaningless.

Method 2: This method uses library xlsReadWrite. You must know the index of the spreadsheet that you want to load:

    plan6 <- read.xls(filename, sheet = 6, colClasses = "double")

This works in most cases.

> I do not understand the online help. Can please anyone send me the corresponding command lines?

    help(help) # :-)

> The .xls-file is attached.

No, it's not.

> In my file we use commas for the decimal format (example: 0,712), changes might be needed.

I *think* this is an internal flag. If the numbers are numbers, then this should be no problem. An Excel spreadsheet in any language is portable to other languages, even when the evil geniuses of M$ decided to localize function names so that, in Portuguese, we have SENO instead of SIN and RAIZ instead of SQRT.

Alberto Monteiro
Re: [R] importing excel-file
2007/4/18, Schmitt, Corinna [EMAIL PROTECTED]:
> It is a quite stupid question but please help me. I am very confused. I am able to import normal txt and mat-files to R but unable to import .xls-file

Searching for Excel on e.g. http://www.r-project.org/search.html, http://tolstoy.newcastle.edu.au/R/about.html or http://wiki.r-project.org/rwiki/doku.php gives:

- RODBC package
- xlsReadWrite package
- gdata package
- rexcelpoi package
- ActiveX (RDCOMClient package, search for examples in the mailing list)
- read.table command to read .csv files

I'd take xlsReadWrite (but I am biased); RODBC is also good. ActiveX if you have lower-level know-how. read.table if working with .csv files is OK.

> I do not understand the online help. Can please anyone send me the corresponding command lines?

    library(xlsReadWrite)
    dat <- read.xls( filename )

Details in ?read.xls

> The .xls-file is attached.

Binary files will be dropped from the list.

> In my file we use commas for the decimal format (example: 0,712), changes might be needed.

Don't know if this is relevant. Sorry.

--
Regards, Hans-Peter
Re: [R] importing excel-file
There is also a read.xls command in package gdata; it seems that it uses a Perl script called 'xls2csv'. I have no idea how good this is, never tried it.

Btw, xlsReadWrite is Windows-only, so you can use it only if you use Windows.

Gabor

ps. Corinna, to be honest, I've no idea what kind of online help you've read, there is plenty. Next time try to be more specific please.

On Wed, Apr 18, 2007 at 03:07:51PM -0200, Alberto Monteiro wrote:
> Corinna Schmitt wrote:
>> It is a quite stupid question but please help me. I am very confused. I am able to import normal txt and mat-files to R but unable to import .xls-file
>
> I've tried two ways to import Excel files, but neither of them seems perfect.
> [...]

--
Csardi Gabor [EMAIL PROTECTED] MTA RMKI, ELTE TTK
[R] Error in geweke.diag function of coda package
Hi R users,

Does anybody know the reason for the following error after running

    geweke.diag(MCMC.sampled, frac1=0.1, frac2=0.5)

    Error in glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart, :
      inner loop 1; cannot correct step size
    In addition: Warning messages:
    1: algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
    2: algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
    3: algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
    4: algorithm did not converge in: glm.fit(x = X, y = Y, weights = weights, start = start, etastart = etastart,
    5: step size truncated due to divergence

Thanks for any help.

Gilberto Matos.
Re: [R] importing excel-file
I use gdata and it works quite well for me. It's as easy as

    install.packages("gdata")
    library(gdata)
    data = read.xls("mydata.xls", sheet=1)

[read.xls() can take other arguments]

It requires concurrent installation of Perl, but installing Perl is also simple. For Windows, you can get it here: http://www.activestate.com/Products/ActivePerl/

--- Gabor Csardi [EMAIL PROTECTED] wrote:
> There is also a read.xls command in package gdata; it seems that it uses a Perl script called 'xls2csv'. I have no idea how good this is, never tried it.
>
> Btw, xlsReadWrite is Windows-only, so you can use it only if you use Windows.
> [...]
Re: [R] importing excel-file
To avoid complications, save your file as comma separated and use one of the instructions for reading delimited files. If you are using a comma as a decimal point you are probably using ; as a separator. If this is so, use read.csv2. Please see the help files for read.table.

Best Regards
John

On 18/04/07, Schmitt, Corinna [EMAIL PROTECTED] wrote:
> Dear R-experts,
>
> It is a quite stupid question but please help me. I am very confused. I am able to import normal txt and mat-files to R but unable to import .xls-file
>
> I do not understand the online help. Can please anyone send me the corresponding command lines?
>
> The .xls-file is attached.
>
> In my file we use commas for the decimal format (example: 0,712), changes might be needed.
>
> Thanks, Corinna

--
John C Frain
Trinity College Dublin
Dublin 2
Ireland
www.tcd.ie/Economics/staff/frainj/home.html
mailto:[EMAIL PROTECTED]
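The read.csv2 advice above can be tried out without Excel. The following is a small sketch that writes a made-up semicolon-separated file with comma decimals to a temporary location (the file contents and column names are invented for the example) and reads it back; read.csv2 defaults to sep = ";" and dec = ",".

```r
## Write a tiny file in the "continental" CSV style, then read it back.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;value",
             "1;0,712",
             "2;1,5"), tmp)

dat <- read.csv2(tmp)   # sep = ";" and dec = "," are the defaults
str(dat)                # 'value' should come back as numeric

unlink(tmp)
```

With a plain read.csv the same file would arrive as a single text column, which is usually the first symptom of a separator or decimal mismatch.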
Re: [R] Data Manipulation using R
...is this what you're looking for?

    donedat <- subset(data, ID < 6000 | ID >= 7000)
    findat <- donedat[-unique(rapply(donedat, function(x) which(x < 0))), , drop=FALSE]

The second line looks through each column and finds the indices of negative values: rapply() returns all of them as a vector; unique() removes duplicated elements, and with negative indexing you remove these rows from donedat.

--- Anup Nandialath [EMAIL PROTECTED] wrote:
> Dear Friends,
>
> I have a data set with around 220,000 rows and 17 columns. One of the columns is an id variable which is grouped from 1000 through 9000. I need to perform the following operations.
>
> 1) Remove all the observations with id's between 6000 and 6999
>
> I tried using this method.
>
>     remdat1 <- subset(data, ID<6000)
>     remdat2 <- subset(data, ID>=7000)
>     donedat <- rbind(remdat1, remdat2)
>
> I checked the last and first entry and found that it did not have ID values 6000. Therefore I think that this might be correct, but is this the most efficient way of doing this?
>
> 2) I need to remove observations within columns 3, 4, 6 and 8 when they are negative. For instance if the number in column 3 is -4, then I need to delete the entire observation.
>
> Can somebody help me with this too.
>
> Thanks and Regards
> Anup
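As a hedged alternative to the two lines suggested in this reply, here is a sketch that checks only columns 3, 4, 6 and 8 (as the original question asks) and uses logical rather than negative indexing. The toy data frame and its column names are invented for illustration. One reason to prefer logical indexing: if no row contains a negative value, `x[-integer(0), ]` selects zero rows rather than all of them.

```r
## Toy data: 8 columns, ID in column 1 (all values are made up).
dat <- data.frame(
  ID = c(1000, 5999, 6000, 6999, 7000, 9000),
  V2 = 1:6,
  V3 = c(1, -4, 2, 3, 4, 5),
  V4 = c(1, 2, 3, 4, 5, 6),
  V5 = rep(-1, 6),              # negative, but column 5 is not checked
  V6 = c(1, 2, 3, 4, -5, 6),
  V7 = 1:6,
  V8 = c(0, 1, 2, 3, 4, 5)
)

keep_id    <- dat$ID < 6000 | dat$ID >= 7000      # drop IDs 6000..6999
check_cols <- c(3, 4, 6, 8)
has_neg    <- rowSums(dat[, check_cols] < 0) > 0  # any negative in those columns

result <- dat[keep_id & !has_neg, , drop = FALSE]
print(result)
```

If the checked columns can contain NA, `rowSums(..., na.rm = TRUE)` would be needed; whether NA rows should be kept is a decision the sketch does not make.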
Re: [R] importing excel-file
Gabor Csardi wrote:
> There is also a read.xls command in package gdata; it seems that it uses a Perl script called 'xls2csv'. I have no idea how good this is, never tried it.
>
> Btw, xlsReadWrite is Windows-only, so you can use it only if you use Windows.

Ok, but who would be insane enough to use Excel in Linux or Mac? :-)

Alberto Monteiro
Re: [R] importing excel-file
John C Frain wrote:
> To avoid complications, save your file as comma separated and use one of the instructions for reading delimited files. If you are using a comma as a decimal point you are probably using ; as a separator. If this is so, use read.csv2. Please see the help files for read.table.

I think the problem is that we _can't_ alter or write the Excel file, or it would be impractical to do it (say, this xls file is generated by someone else once a day, and we must use it).

I usually save the Excel file to plain text, and then read it, but this is not the best solution when the file keeps changing.

Alberto Monteiro
Re: [R] importing excel-file
Hello Everybody,

Install the package:

    install.packages("xlsReadWrite")

Load it:

    library(xlsReadWrite)
    testfile = read.xls("TesFile.xls")

Have Fun!

Kind Regards,
Soare Marcian-Alin

PS: If it doesn't work, then also install the package xtable, but it should work without installing it!

2007/4/18, John C Frain [EMAIL PROTECTED]:
> To avoid complications, save your file as comma separated and use one of the instructions for reading delimited files. If you are using a comma as a decimal point you are probably using ; as a separator. If this is so, use read.csv2. Please see the help files for read.table.
> [...]
Re: [R] importing excel-file
On Wed, Apr 18, 2007 at 03:51:35PM -0200, Alberto Monteiro wrote:
> Gabor Csardi wrote:
>> There is also a read.xls command in package gdata; it seems that it uses a Perl script called 'xls2csv'. I have no idea how good this is, never tried it.
>>
>> Btw, xlsReadWrite is Windows-only, so you can use it only if you use Windows.
>
> Ok, but who would be insane enough to use Excel in Linux or Mac? :-)

Personally I don't use Excel on anything, but I receive data in Excel files occasionally. I opened them in openoffice.org and saved them as csv, but just learned that there are quicker solutions, directly from R.

So this is a useful piece of information. Perhaps not _very_ useful. But you can't deny it's information. :)

Gabor

> Alberto Monteiro

--
Csardi Gabor [EMAIL PROTECTED] MTA RMKI, ELTE TTK
Re: [R] help comparing two median with R
For testing, the permutation test may be preferred to the bootstrap (though the bootstrap could be used for a confidence interval).

I remember in grad school doing a project on comparing the efficiency of a permutation test on medians compared to the Mann-Whitney test, but I don't remember the specifics of when each was better. Do any of the other participants in this discussion have any ideas on how the permutation tests compare to what else has been discussed?

The Mann-Whitney test is actually a special case of the permutation test, but using the median permutation test is more intuitive to my mind.

The permutation test is actually testing the null hypothesis that the 2 distributions are identical, but makes no assumptions about normality, skewness, shift hypotheses, etc. The efficiency of the test statistic used would, though, depend somewhat on the nature of the alternatives of interest (imagine 2 distributions with the same mean, but different medians, or same median, but different mean; a permutation test comparing means or medians would differ in the 2 cases).

I'll have to try some simulations looking at a permutation test on Efron's dice.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of [EMAIL PROTECTED]
Sent: Wednesday, April 18, 2007 10:06 AM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] help comparing two median with R

> Has anyone proposed using a bootstrap for Pedro's problem?
>
> What about taking a bootstrap sample from x, a bootstrap sample from y, taking the difference in the medians for these two bootstrap samples, repeating the process 1,000 times and calculating the 95th percentiles of the 1,000 computed differences? You would get a CI on the difference between the medians for these two groups, with which you could determine whether the difference was greater/less than zero. Too crude?
>
> Regards, -Cody
> [...]
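The median permutation test mentioned in this message can be sketched as follows. This is a minimal illustration on simulated data, with the sample sizes, the shift of 0.8 and the 2,000 random permutations all chosen arbitrarily for the example; as the message notes, the null hypothesis being tested is that the two distributions are identical.

```r
## Two-sample permutation test using the difference in medians as statistic.
set.seed(2)
x <- rnorm(30)
y <- rnorm(35, mean = 0.8)

obs    <- median(x) - median(y)
pooled <- c(x, y)
n_x    <- length(x)

perm <- replicate(2000, {
  idx <- sample(length(pooled), n_x)          # random relabelling of groups
  median(pooled[idx]) - median(pooled[-idx])
})

## Two-sided Monte Carlo p-value
p_value <- mean(abs(perm) >= abs(obs))
print(p_value)
```

Swapping `median` for `mean` in both places gives the corresponding permutation test on means, which makes it easy to compare the two statistics on the same data.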
Re: [R] help comparing two median with R
On Wed, 18 Apr 2007, Greg Snow wrote:
> For testing, the permutation test may be preferred to the bootstrap (though the bootstrap could be used for a confidence interval).
>
> I remember in grad school doing a project on comparing the efficiency of a permutation test on medians compared to the Mann-Whitney test, but I don't remember the specifics of when each was better. Do any of the other participants in this discussion have any ideas on how the permutation tests compare to what else has been discussed?
>
> The Mann-Whitney test is actually a special case of the permutation test, but using the median permutation test is more intuitive to my mind.
>
> The permutation test is actually testing the null hypothesis that the 2 distributions are identical, but makes no assumptions about normality, skewness, shift hypotheses, etc. The efficiency of the test statistic used would, though, depend somewhat on the nature of the alternatives of interest (imagine 2 distributions with the same mean, but different medians, or same median, but different mean; a permutation test comparing means or medians would differ in the 2 cases).

I think the point is that one does not want to assume the two distributions are identical: the null hypothesis is that they have the same median but possibly different shapes (including spread). You can set up bootstrap tests (see Davison & Hinkley, for example).

Cody Hamilton's CI is too crude: again see D&H or MASS for less crude alternatives. However, bootstrapping a median has its own peculiarities: see the example in MASS and references there, including to Sheather's book.

> I'll have to try some simulations looking at a permutation test on Efron's dice.
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> [EMAIL PROTECTED]
> (801) 408-8111
> [...]
[R] Thick stripes in the barplot() function
Dear all,

Sorry to bother you, but I didn't find the solution in the R-help archive.

I would like to change the thickness of the stripes inside the bars of a barplot. Is there any way to do that when using the barplot() function and the density option? Or is there any other function?

Thanks very much for your help.

Best regards.

Gael.

Gael Millot
UMR 7147 et Universite Pierre et Marie Curie
Equipe Recombinaison et instabilite genetique
Pav Trouillet Rossignol 5eme etage
Institut Curie
26 rue d'Ulm
75248 Paris Cedex 05
FRANCE
tel : 1 33 42 34 66 34
fax : 1 33 42 34 66 44
Email : [EMAIL PROTECTED]
Re: [R] Fractals with R
Here is a workable version for the Julia set. I also put it on http://fractalswithr.blogspot.com/

Atte

Now problems with the plot command... I try to plot the Julia set and the algorithm works if the points are drawn during the loop. But if I want to save the values to the matrix first and then plot them all at once afterwards, the picture is distorted. What's wrong with the plot command? I think the PointsMatrix is ok?

-Atte

    C=-0.7-0.4i   # Complex parameter, connected to coordinate of the Mandelbrot set in a complex plane.
    Limits=c(-2,2)
    z=0+0i
    MaxIter=60
    cl=colours()
    Step=seq(Limits[1],Limits[2],by=0.01)
    PointsMatrix=array(0,dim=c(length(Step)*length(Step),3))
    a1=0
    for(x in Step)
    {
      for(y in Step)
      {
        z1=x+y*1i
        n=0
        z=z1
        while(n<MaxIter && abs(z)<2)
        {
          z=z^2+C
          n=n+1
        }
        if(abs(z)<2) colour=1 else colour=n*10
        #points(z1, pch=".", col=cl[colour]) # This works!
        # But this doesn't!
        a1=a1+1
        PointsMatrix[a1,]=c(x,y,colour)
      }
    }
    #???
    plot(PointsMatrix[,1], PointsMatrix[,2], xlim=Limits, ylim=Limits, col=cl[PointsMatrix[,3]], pch=".")

Atte Tenkanen wrote:
> Hi,
> That is of counter for web page. Do you get some pop-up windows?
> Atte
>> Hi Atte,
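One likely cause of the distorted picture in the stored-matrix version, offered here as an assumption rather than a confirmed diagnosis: any starting point with abs(z1) >= 2 never enters the while loop, so n stays 0 and colour becomes 0. Indexing a vector with 0 silently drops that element, so `cl[PointsMatrix[,3]]` can come out shorter than the number of points, and plot() then recycles the colours out of step with the coordinates. The sketch below shows the effect on a tiny index vector; `idx` is a made-up stand-in for the third column of PointsMatrix.

```r
cl  <- colours()
idx <- c(0, 10, 0, 20)        # stand-in for PointsMatrix[, 3]

bad_cols  <- cl[idx]          # zeros are dropped: shorter than idx
good_cols <- cl[pmax(idx, 1)] # map 0 to a valid index so lengths agree

length(bad_cols)
length(good_cols)
```

In the original loop the same repair could be made where the colour is assigned, for instance by never storing 0 in the third column.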
Re: [R] R-2.4.1 for MacOS X - languageR, acepack, Hmisc
Hello,

Same problem on a MacBook Pro (Intel) with RODBC, nortest and gplots.

Best regards

On 18 Apr 2007, at 21:09, Weiwei Shi wrote:
> Same problem here. Last time I had a similar one when I did library(MASS); I solved that by re-installing R 2.4.1. However, this time it does not work.
>
> On 4/18/07, Lara Tagliapietra [EMAIL PROTECTED] wrote:
>> I updated R to the latest 2.4.1 version and unfortunately I cannot load languageR any longer. In R-2.4.1, languageR requires acepack, but Hmisc doesn't work when acepack is loaded.
>>
>>     library(languageR)
>>     Loading required package: Design
>>     Loading required package: Hmisc
>>     Loading required package: acepack
>>     Error in dyn.load(x, as.logical(local), as.logical(now)) :
>>       unable to load shared library '/Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so':
>>       dlopen(/Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so, 6): Library not loaded: /usr/local/gcc4.0/i686-apple-darwin8/lib/libgcc_s.1.0.dylib
>>       Referenced from: /Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so
>>       Reason: image not found
>>     Error: package 'Hmisc' could not be loaded
>>
>> Apparently the Hmisc.so cannot be loaded, but it is actually there:
>>
>>     source("/Library/Frameworks/R.framework/Versions/2.4/Resources/library/Hmisc/libs/i386/Hmisc.so")
>>     Error in parse(file, n = -1, NULL, "?") : syntax error at 1: Œ
>>
>> Did anybody else encounter the same problem? And, if so, I would be very grateful to anybody who could tell me how to solve this problem.
>>
>> Thanks,
>> Lara Tagliapietra
>
> --
> Weiwei Shi, Ph.D
> Research Scientist
> GeneGO, Inc.
>
> "Did you always know?" "No, I did not. But I believed..." ---Matrix III
[R] Memory increase in R
Dear All:

Please help me to increase the memory in R.

I am trying to make a Euclidean distance matrix. The number of rows in the data is 500,000. Therefore, the dimension of the Euclidean distance matrix is 500,000*500,000.

When I run the data in R, R cannot make the distance matrix because of a memory allocation problem.

In order to increase memory, I read the FAQ and followed the instructions below:

> You may also set the amount of available memory manually. Close R, then right-click on your R program icon (the icon on your desktop or in your programs directory). Select "Properties", and then select the "Shortcut" tab. Look for the "Target" field and after the closing quotes around the location of the R executable, add --max-mem-size=500M

It does not work. I have tried other computers and it also does not work. When I add --max-mem-size=3Gb in the Target field, there is an error like the one below:

    The name C:\Documents and Settings\Hong Su An\My Documents\R\R-2.4.1\bin\Rgui.exe--max-mem-size=1024M specified in the Target box is not valid

I use R 2.4.1 on Windows XP with SP2 and 4Gb RAM.

Have a nice day.

Hong Su An.
[EMAIL PROTECTED]
Re: [R] Memory increase in R
You would need 2TB (2,000,000,000,000 bytes) to store a single copy of your data. You probably need to rescale your problem. Even if you had the memory, the computation would take a very long time.

On 4/18/07, Hong Su An [EMAIL PROTECTED] wrote:
> Dear All:
>
> Please help me to increase the memory in R.
>
> I am trying to make a Euclidean distance matrix. The number of rows in the data is 500,000. Therefore, the dimension of the Euclidean distance matrix is 500,000*500,000.
>
> When I run the data in R, R cannot make the distance matrix because of a memory allocation problem.
> [...]

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
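The 2TB figure follows from simple arithmetic, sketched below under the assumption that each distance is stored as an 8-byte double. The variable names are made up for the example. An object returned by dist() stores only the lower triangle, n(n-1)/2 values, which halves the requirement but still leaves it around 1TB.

```r
n <- 500000            # number of rows in the data
bytes_per_double <- 8

full_bytes  <- n^2 * bytes_per_double              # full n x n matrix
lower_bytes <- n * (n - 1) / 2 * bytes_per_double  # lower triangle, as in dist()

full_bytes / 1e12      # size in TB (decimal)
lower_bytes / 1e12
```

Either way the result is far beyond 4Gb of RAM, so no --max-mem-size setting can make the full matrix fit.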
[R] Gentleman and Ihaka , 2000 paper question
In their paper, "Lexical Scope and Statistical Computing", the authors (Gentleman and Ihaka) go to great lengths explaining why R's use of lexical scoping creates advantages when doing statistical computations.

If anyone has or is familiar with this paper, could they provide the main program code for how the newton function would be called in their example on page 500 of the paper?

The authors are extremely clear in their writing and the paper is quite an eye-opener for me, but it seems like lfun somehow needs to be initialized so that it grabs the environment of Rmklike. I'm not sure how one would go about doing this, so I am wondering what the main program that calls newton would be if there was one. Thanks.

Mark
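The general mechanism behind the question can be illustrated without the paper's code. The sketch below is not taken from Gentleman and Ihaka; every name in it (`make_loglik`, `newton_max`, the data) is hypothetical, and the Newton step uses simple numerical derivatives as an assumption. It shows that a function returned by a generator keeps the generator's environment automatically, so the caller needs no explicit initialization step before handing it to an optimizer.

```r
## Generator: returns a log-likelihood that remembers the data x.
make_loglik <- function(x) {
  n  <- length(x)
  sx <- sum(x)
  function(rate) n * log(rate) - rate * sx   # exponential log-likelihood
}

## A bare-bones 1-D Newton maximizer using numerical derivatives.
newton_max <- function(f, start, tol = 1e-8, h = 1e-4, maxit = 100) {
  theta <- start
  for (i in seq_len(maxit)) {
    g  <- (f(theta + h) - f(theta - h)) / (2 * h)              # first derivative
    g2 <- (f(theta + h) - 2 * f(theta) + f(theta - h)) / h^2   # second derivative
    step  <- g / g2
    theta <- theta - step
    if (abs(step) < tol) break
  }
  theta
}

x  <- c(0.5, 1.2, 0.3, 2.0, 0.8)
ll <- make_loglik(x)             # ll carries x, n and sx in its environment
rate_hat <- newton_max(ll, start = 1)

ls(environment(ll))              # the variables captured by the closure
rate_hat
length(x) / sum(x)               # closed-form MLE for comparison
```

The "main program" is then just the last few lines: create the closure from the data, and pass it to the optimizer.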
Re: [R] extracting intercept from ppr fit
Hi Vadim,

Estimates of the alpha_0 terms in MASS are the $yb component of the object returned by ppr(). As I understand it, the original PPR algorithm assumes the response variable(s) are centered, so the 'alpha_0' term in MASS is just the mean of the response if the user does not center the responses. Your actual 'a' term will not appear in the output of ppr.

> n <- 1000
> data <- data.frame(x = rnorm(n), y = rnorm(n))
> a <- 10
> data$z <- evalq(a + atan(x + y) + rnorm(n), data)
> data.ppr <- ppr(z ~ x + y, data = data, nterms = 1)
> ## how to extract a = 10 from data.ppr?
> data.ppr$yb
[1] 9.973964
> a <- 210
> data$z <- evalq(a + atan(x + y) + rnorm(n), data)
> data.ppr <- ppr(z ~ x + y, data = data, nterms = 1)
> data.ppr$yb
[1] 209.9773

HTH
Steven McKinney
Statistician, Molecular Oncology and Breast Cancer Program, British Columbia Cancer Research Centre
email: [EMAIL PROTECTED] tel: 604-675-8000 x7561
BCCRC Molecular Oncology, 675 West 10th Ave, Floor 4, Vancouver B.C. V5Z 1L3 Canada

-Original Message-
From: [EMAIL PROTECTED] on behalf of Vadim Ogranovich
Sent: Tue 4/17/2007 1:06 PM
To: r-help@stat.math.ethz.ch
Subject: Re: [R] extracting intercept from ppr fit

Sorry for triple-posting: I seem to have a problem w/ my mail client.

Hi,
Is there a way, documented or not, to extract the intercept term (the alpha_0 in the MASS book) from a ppr() (Projection Pursuit Regression) fit?
Thanks, Vadim

## Example:
n <- 1000
data <- data.frame(x = rnorm(n), y = rnorm(n))
a <- 10
data$z <- evalq(a + atan(x + y) + rnorm(n), data)
data.ppr <- ppr(z ~ x + y, data = data, nterms = 1)
## how to extract a = 10 from data.ppr?
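The point made in the reply, that $yb is the mean of the response rather than the simulated constant 'a', can be checked directly. The following is a sketch only: the seed and the name 'dat' are arbitrary choices made here, and equal case weights are assumed (the default), so the weighted mean reduces to the plain mean.

```r
# Compare the yb component of a ppr() fit with the mean of the response.
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 1000
a <- 10
dat <- data.frame(x = rnorm(n), y = rnorm(n))
dat$z <- a + atan(dat$x + dat$y) + rnorm(n)

fit <- ppr(z ~ x + y, data = dat, nterms = 1)

fit$yb        # "intercept" reported by ppr
mean(dat$z)   # mean of the response; should agree with fit$yb
```

Because atan(x + y) is symmetric about zero in this simulation, the mean of z happens to sit close to a, which is why $yb looked like an estimate of a in the transcript above.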
[R] Error loading libraries in MAC
Hi,

I just installed the gmodels package and the installation was successful, but when I tried to load the library I got an error (see below). Interestingly, yesterday I wrote to the maintainer of the RSQLite package because I got the same error. Does somebody know what is going on?

thanks, Mayte

> library(gmodels)
Error in dyn.load(x, as.logical(local), as.logical(now)) :
  unable to load shared library '/Library/Frameworks/R.framework/Versions/2.4/Resources/library/gtools/libs/i386/gtools.so':
  dlopen(/Library/Frameworks/R.framework/Versions/2.4/Resources/library/gtools/libs/i386/gtools.so, 6): Library not loaded: /usr/local/gcc4.0/i686-apple-darwin8/lib/libgcc_s.1.0.dylib
  Referenced from: /Library/Frameworks/R.framework/Versions/2.4/Resources/library/gtools/libs/i386/gtools.so
  Reason: image not found
Error: package/namespace load failed for 'gmodels'

> R.Version()
$platform
[1] "i386-apple-darwin8.8.1"
$arch
[1] "i386"
$os
[1] "darwin8.8.1"
$system
[1] "i386, darwin8.8.1"
$status
[1] ""
$major
[1] "2"
$minor
[1] "4.1"
$year
[1] "2006"
$month
[1] "12"
$day
[1] "18"
$`svn rev`
[1] "40228"
$language
[1] "R"
$version.string
[1] "R version 2.4.1 (2006-12-18)"
[R] How do I print a string without the initial [1]?
If I print a string I get an initial [1]:

> xx = "a"
> xx
[1] "a"

How do I get it to print just a with no [1]? I tried looking this up, but I don't know what the initial [1] is called.

Steve
Re: [R] How do I print a string without the initial [1]?
> x = "a\n"
> cat(x)
a

On Apr 18, 2007, at 5:31 PM, steve wrote:
> If I print a string I get an initial [1]:
> xx = "a"
> xx
> [1] "a"
> How do I get it to print just a
[R] Fwd: importing excel-file
-- Forwarded message --
From: John C Frain [EMAIL PROTECTED]
Date: 18-Apr-2007 22:35
Subject: Re: [R] importing excel-file
To: Alberto Monteiro [EMAIL PROTECTED]

One additional suggestion would be to use gretl. Gretl will read Excel files, with an option to ignore the first few rows and columns, and then the data can be exported to an R session started from within Gretl. This will also work in both Windows and Linux.

I find it hard to imagine an Excel file that cannot be read in Open Office or Gnumeric or even Excel and then output to a delimited file in a matter of seconds. In my previous employment I conducted several campaigns against the use of Excel formats in favour of a delimited file which can be opened in almost any program. I think that we would all be better off if this practice was more widespread.

John Frain

On 18/04/07, Alberto Monteiro [EMAIL PROTECTED] wrote:
> John C Frain wrote:
>> To avoid complications, save your file as comma separated and use one of the instructions for reading delimited files. If you are using a comma as a decimal point you are probably using ; as a separator. If this is so use read.csv2. Please see the help files for read.table.
> I think the problem is that we _can't_ alter or write the Excel file, or it would be impractical to do it (say, this xls file is generated by someone else once a day, and we must use it). I usually save the Excel file to plain text, and then read it - but this is not the best solution when the file keeps changing.
> Alberto Monteiro

-- John C Frain Trinity College Dublin Dublin 2 Ireland www.tcd.ie/Economics/staff/frainj/home.html mailto:[EMAIL PROTECTED]
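The read.csv2 suggestion quoted above can be illustrated with a small in-memory example. This is a sketch with made-up data: the column names and values are invented, and the text is passed through the text argument instead of a file so that the example is self-contained.

```r
# read.csv2() assumes ';' as field separator and ',' as decimal mark.
csv_text <- "id;height\n1;1,75\n2;1,80\n"   # invented sample data

dat <- read.csv2(text = csv_text)

str(dat)         # 'height' is read as numeric, not as character
dat$height * 2   # arithmetic works because the decimal comma was understood
```

With a real export one would pass the file name instead, for example read.csv2("mydata.csv"), where the file name is of course hypothetical.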
Re: [R] How do I print a string without the initial [1]?
Hello Steve,

You can print strings in R with the function cat().

\n . new line
\t . tabulator

Try:

name <- c("Steve")
age = 22
cat("\tHello my name is", name, "and I am", age, "years old.\n")

Have Fun!
Kind Regards, Soare Marcian-Alin

2007/4/18, steve [EMAIL PROTECTED]:
> If I print a string I get an initial [1]:
> xx = "a"
> xx
> [1] "a"
> How do I get it to print just a with no [1]? I tried looking this up, but I don't know what the initial [1] is called.
> Steve
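For reference, the [1] is the index of the first element shown on that output line, which print() adds for every vector. A short sketch of the usual alternatives follows; which one is preferable depends on whether the quotes or only the index should go away.

```r
xx <- "a"

print(xx)                    # [1] "a"   index and quotes
print(xx, quote = FALSE)     # [1] a     index kept, quotes dropped
cat(xx, "\n", sep = "")      # a         no index, no quotes
writeLines(xx)               # a         same, adds the newline itself
```

cat() does not add a newline on its own, which is why "\n" is passed explicitly in the replies above.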
[R] Conditional power, predictive power
is there no package/function in R to calculate the conditional power or the bayesian predictive power for trials with binary endpoints? Thanks -- View this message in context: http://www.nabble.com/Conditional-power%2C-predictive-power-tf3603396.html#a10066991 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Problems in programming a simple likelihood
As part of carrying out a complicated maximum likelihood estimation, I am trying to learn to program likelihoods in R. I started with a simple probit model but am unable to get the code to work. Any help or suggestions are most welcome. I give my code below:

mlogl <- function(mu, y, X) {
  n <- nrow(X)
  zeta <- X %*% mu
  llik <- 0
  for (i in 1:n) {
    if (y[i] == 1)
      llik <- llik + log(pnorm(zeta[i,], mean = 0, sd = 1))
    else
      llik <- llik + log(1 - pnorm(zeta[i,], mean = 0, sd = 1))
  }
  return(-llik)
}

women <- read.table("~/R/Examples/Women13.txt", header = TRUE)  # DATA

# THE DATA SET CAN BE ACCESSED HERE
# women <- read.table("http://wps.aw.com/wps/media/objects/2228/2281678/Data_Sets/ASCII/Women13.txt", header = TRUE)
# I HAVE CHANGED THE NAMES OF THE VARIABLES
# J is changed to work
# M is changed to mar
# S is changed to school

attach(women)

# THE VARIABLES OF USE ARE
# work: binary dependent variable
# mar: whether married or not
# school: years of schooling

mu.start <- c(3, -1.5, 10)
data <- cbind(1, mar, school)
out <- nlm(mlogl, mu.start, y = work, X = data)
cat("Results", "\n")
out$estimate
detach(women)

When I try to run the code, this is what I get:

> source("probit.R")
Results
Warning messages:
1: NA/Inf replaced by maximum positive value
2: NA/Inf replaced by maximum positive value
3: NA/Inf replaced by maximum positive value
4: NA/Inf replaced by maximum positive value

Thanks in advance.
Deepankar
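One likely source of the NA/Inf warnings, offered here as an assumption and not a diagnosis of the actual data, is that the starting value of 10 on years of schooling makes the linear predictor very large, so that 1 - pnorm(zeta) underflows to zero and its log becomes -Inf. The sketch below shows the effect for a single made-up value of zeta and a numerically safer way to get the same quantity.

```r
zeta <- 40    # made-up linear predictor, e.g. a large coefficient times schooling

naive  <- log(1 - pnorm(zeta))                           # pnorm(40) is exactly 1 in doubles
stable <- pnorm(zeta, lower.tail = FALSE, log.p = TRUE)  # log of the upper tail, computed directly

naive    # -Inf
stable   # a finite (large negative) number
```

Separately, note that a bare expression such as out$estimate is not auto-printed when a file is run through source() with the default settings, so print(out$estimate) is needed to see the estimates; the warnings by themselves do not mean that nlm() failed.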
Re: [R] Gentleman and Ihaka , 2000 paper question
not sure just what you want, but here are some snippets

newton <- function(lfun, est, tol = 1e-7, niter = 500) {
  cscore <- lfun$score(est)
  if (abs(cscore) < tol) return(est)
  for (i in 1:niter) {
    new <- est - cscore / lfun$d2(est)
    cscore <- lfun$score(new)
    if (abs(cscore) < tol) return(new)
    est <- new
  }
  stop("exceeded allowed number of iterations")
}

with the likelihood function

Rmklike <- function(data) {
  n <- length(data)
  sumx <- sum(data)
  lfun <- function(mu) n * log(mu) - mu * sumx
  score <- function(mu) n / mu - sumx
  d2 <- function(mu) -n / mu^2
  list(lfun = lfun, score = score, d2 = d2)
}

Leeds, Mark (IED) wrote:
> In their paper, Lexical Scope and Statistical Computing, the authors (Gentleman and Ihaka) go to great length explaining why R's use of lexical scoping creates advantages when doing statistical computations. If anyone has or is familiar with this paper, could they provide the main program code for how the newton function would be called in their example on page 500 of the paper. The authors are extremely clear in their writing and the paper is quite an eye opener for me but it seems like lfun somehow needs to be initialized so that it grabs the environment of Rmklike.

Rmklike is called to create the likelihood function, and since that function is defined in the body of Rmklike, it has the evaluation environment by default. And any return value will have the correct environment.

> I'm not sure how one would go about doing this so I am wondering what the main program that calls newton would be if there was one. Thanks.

So,

data = rexp(10, rate = .3)
lf = Rmklike(data)
newton(lf, .1)

seems to do the trick (and surprisingly still works, as written some unbelievably long time ago)

Robert

> Mark
> This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}

-- Robert Gentleman, PhD Program in Computational Biology Division of Public Health Sciences Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N, M2-B876 PO Box 19024 Seattle, Washington 98109-1024 206-667-7700 [EMAIL PROTECTED]
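Putting the pieces of this reply together gives a complete script. The sketch below repeats the two functions so that it runs by itself; the seed, the variable names 'x', 'lf' and 'est', and the comparison against the closed-form estimate 1/mean(x) for an exponential rate are additions made here for checking, not part of the paper's example.

```r
# Closure-based likelihood: the returned functions remember n and sumx
# because they were created inside Rmklike's evaluation environment.
Rmklike <- function(data) {
  n <- length(data)
  sumx <- sum(data)
  lfun  <- function(mu) n * log(mu) - mu * sumx   # exponential log-likelihood
  score <- function(mu) n / mu - sumx             # first derivative
  d2    <- function(mu) -n / mu^2                 # second derivative
  list(lfun = lfun, score = score, d2 = d2)
}

# Newton iteration on the score function.
newton <- function(lfun, est, tol = 1e-7, niter = 500) {
  cscore <- lfun$score(est)
  if (abs(cscore) < tol) return(est)
  for (i in 1:niter) {
    new <- est - cscore / lfun$d2(est)
    cscore <- lfun$score(new)
    if (abs(cscore) < tol) return(new)
    est <- new
  }
  stop("exceeded allowed number of iterations")
}

set.seed(123)                 # arbitrary seed
x   <- rexp(10, rate = 0.3)
lf  <- Rmklike(x)             # no extra initialization needed
est <- newton(lf, 0.1)

est                           # Newton estimate of the rate
1 / mean(x)                   # closed-form MLE, for comparison
ls(environment(lf$score))     # the closure's environment holds n, sumx, ...
```

The last line makes the lexical scoping visible: score() finds n and sumx in the environment created by the call to Rmklike(), not in the workspace.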
Re: [R] importing excel-file
On Apr 18, 2007, at 1:58PM , Gabor Csardi wrote: On Wed, Apr 18, 2007 at 03:51:35PM -0200, Alberto Monteiro wrote: Gabor Csardi wrote: Ok, but who would be insane enough to use Excel in Linux or Mac? :-) The original reason for read.xls() was to allow a web application running computations through R to handle data in the format the _user_ preferred. Like it or not, most scientists store their experimental data in MS-Excel spreadsheets, and one needs to be able to handle them It turned out to be a lot easier to write a function for R to read in XLS data than to train our users to convert their data to .csv format. -G (author of the gdata package) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Matrix or grid conversion of spatial data
Dear Happy R-users experts,

I am in need of advice. While working with spatial data (x y coordinates of seed locations) I have come across the problem that I need to convert my point data into a matrix or grid system. I then need to count how often a point falls into a certain position in the matrix or grid. I have searched all day online and asked colleagues, but nothing works. Sadly my R box of tricks has run out.

My (point) data looks like this;

x    y
2.3  4.5
3.4  0.2

and continues for another million records.

Now my question: is there any function that is able to count how often a point falls into a grid cell based on the x and y location? So I need to discretize the spatial locations to a regular grid and then count how often a point occurs.

Many thanks for your thoughts on this problem.
Marco Visser
Re: [R] Error loading libraries in MAC
I have received a number of reports of problems with recent universal Mac packages from CRAN when used with R 2.4.1. Has something in the build script changed?

-G

On Apr 18, 2007, at 4:49PM , Mayte Suarez-Farinas wrote:
> Hi I just installed the gmodels package and the installation was successful, but when I tried to load the library I got an error (see below). Interestingly, yesterday I wrote to the maintainer of the RSQLite package because I got the same error. Does somebody know what is going on?
> thanks, Mayte
> library(gmodels)
> Error in dyn.load(x, as.logical(local), as.logical(now)) : unable to load shared library '/Library/Frameworks/R.framework/Versions/2.4/Resources/library/gtools/libs/i386/gtools.so':
> dlopen(/Library/Frameworks/R.framework/Versions/2.4/Resources/library/gtools/libs/i386/gtools.so, 6): Library not loaded: /usr/local/gcc4.0/i686-apple-darwin8/lib/libgcc_s.1.0.dylib
> Referenced from: /Library/Frameworks/R.framework/Versions/2.4/Resources/library/gtools/libs/i386/gtools.so
> Reason: image not found
> Error: package/namespace load failed for 'gmodels'
> R.Version() $platform [1] i386-apple-darwin8.8.1 $arch [1] i386 $os [1] darwin8.8.1 $system [1] i386, darwin8.8.1 $status [1] $major [1] 2 $minor [1] 4.1 $year [1] 2006 $month [1] 12 $day [1] 18 $`svn rev` [1] 40228 $language [1] R $version.string [1] R version 2.4.1 (2006-12-18)
[R] Computing an ordering on subsets of a data frame
If I have a data frame X that looks like this:

A B
- -
1 2
1 3
1 4
2 3
2 1
2 1
3 2
3 1
3 3

and I want to make another column which has the rank of B computed separately for each value of A. I.e. something like:

A B C
- - -
1 2 1
1 3 2
1 4 3
2 3 3
2 1 1
2 1 2
3 2 2
3 1 1
3 3 3

by(X, X[,1], function(x) { rank(x[,1], ties.method="random") } )

almost seems to work, but the data is not in a frame, and I can't figure out how to merge it back into X properly.

Thanks, Lukas
Re: [R] importing excel-file
The issue Greg mentions, that most scientists store their experimental data in MS-Excel spreadsheets is the motivation for one of the sessions at the Interface 2007 conference http://sbm.temple.edu/interface07/index.html in Philadelphia, May 23-26, 2007 Erich and Thomas designed the RExcel interface. Naras uses RExcel. Robert uses the interface between S-Plus and Excel. Accessible Interfaces to Advanced Statistics Software Richard Heiberger, organizer Erich Neuwirth, Thomas Baier, An Office-Software and Menu-Driven Interface for Advanced Statistics in the Biological Sciences Narasimhan Balasubramanian, Disseminating Statistical Methodology and Results via R and Excel: Two Examples Robert Gagnon, Analysis and Visualization of Microarray Gene Expression Data Using Excel, SAS, and SPlus __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Matrix or grid conversion of spatial data
On Wed, 18 Apr 2007, Marco Visser wrote:
> Dear Happy R-users experts, I am in need of advice. While working with spatial data (x y coordinates of seed locations) I have come across the problem that I need to convert my point data into a matrix or grid system. I then need to count how often a point falls into a certain position in the matrix or grid. I have searched all day online and asked colleagues, but nothing works. Sadly my R box of tricks has run out.
> My (point) data looks like this;
> x    y
> 2.3  4.5
> 3.4  0.2
> and continues for another million records.
> Now my question: is there any function that is able to count how often a point falls into a grid cell based on the x and y location? So I need to discretize the spatial locations to a regular grid and then count how often a point occurs.

see ?table and ?cut

Maybe something like

x.breakpoints <- <sensible breakpoints for x>
y.breakpoints <- <sensible breakpoints for y>
my.grid <- table( cut( x, x.breakpoints ), cut( y, y.breakpoints ) )

see also ?xtabs and ?quantile

> Many thanks for your thoughts on this problem.
> Marco Visser

Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED] UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901
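The table()/cut() suggestion can be turned into a runnable sketch once some breakpoints are chosen. Everything concrete below is an assumption made for illustration: the simulated coordinates, the 0 to 10 extent and the unit cell size are invented, and with real data the breakpoints would come from the study area.

```r
# Count points per cell of a regular grid with cut() and table().
set.seed(42)                                   # arbitrary seed
pts <- data.frame(x = runif(1000, 0, 10),      # simulated seed locations
                  y = runif(1000, 0, 10))

x.breakpoints <- seq(0, 10, by = 1)            # assumed cell edges in x
y.breakpoints <- seq(0, 10, by = 1)            # assumed cell edges in y

my.grid <- table(cut(pts$x, x.breakpoints, include.lowest = TRUE),
                 cut(pts$y, y.breakpoints, include.lowest = TRUE))

dim(my.grid)       # 10 x 10 cells
my.grid[1:3, 1:3]  # counts in a corner of the grid
sum(my.grid)       # every point falls in exactly one cell
```

include.lowest = TRUE keeps points lying exactly on the lowest breakpoint; points outside the breakpoints become NA in cut() and are dropped by table(), so it is worth comparing sum(my.grid) with the number of rows.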
[R] erratic behavior of match()?
Consider the code:

x <- seq(0,1,0.2)
y <- seq(0,1,0.01)
cbind(match(y,x),y)

which, surprisingly, doesn't show a match at 0.6! (It gives correct matches at 0, 0.2, 0.4, 0.8 and 1, though.) In addition, x[4]==y[61] yields FALSE (but x[5]==y[81], the one for 0.8, yields TRUE).

Is this a consequence of machine error or something else? Could this be overcome? (It works correctly when integers are used in the sequences, as well as in many other circumstances.)

Thank you, Bernhard
Re: [R] erratic behavior of match()?
On 4/18/07, Bernhard Klingenberg [EMAIL PROTECTED] wrote:
> Consider the code:
> x <- seq(0,1,0.2)
> y <- seq(0,1,0.01)
> cbind(match(y,x),y)
> which, surprisingly, doesn't show a match at 0.6! (It gives correct matches at 0, 0.2, 0.4, 0.8 and 1, though.) In addition, x[4]==y[61] yields FALSE (but x[5]==y[81], the one for 0.8, yields TRUE).
> Is this a consequence of machine error or something else? Could this be overcome? (It works correctly when integers are used in the sequences, as well as in many other circumstances.)

See the R FAQ - question 7.31. It's a basic property of floating point arithmetic.
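To make the FAQ 7.31 point concrete: 3 * 0.2 and 60 * 0.01 are both "0.6" on paper but round to neighbouring doubles. The sketch below shows the mismatch and two common workarounds; scaling to integers before matching is only one possible approach and assumes the values are known to lie on a 0.01 grid.

```r
x <- seq(0, 1, 0.2)
y <- seq(0, 1, 0.01)

x[4] == y[61]                  # FALSE: the two doubles differ in the last bit
print(c(x[4], y[61]), digits = 17)

isTRUE(all.equal(x[4], y[61])) # TRUE: comparison with a tolerance

# Match on integer-valued keys instead of raw doubles
# (assumes both vectors are multiples of 0.01).
m <- match(round(y * 100), round(x * 100))
y[!is.na(m)]                   # 0, 0.2, 0.4, 0.6, 0.8, 1 are all found now
```

For single comparisons all.equal() wrapped in isTRUE() is usually enough; for match() or %in% the values have to be brought to an exact representation first, as done with the rounded keys here.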
Re: [R] Problems in programming a simple likelihood
Deepankar,

Some general advice from a non-expert:

- Write your likelihoods without a for loop. This is important because the likelihood is evaluated multiple times in the maximization process and you don't want to be looping looping looping ...
- Always try multiple starting values.
- Sometimes it helps to try different optimization functions (e.g., optim).
- Make sure your likelihood is correct. Check it against existing software if possible.
- If necessary, simplify your model, to a single parameter even, and build it up from there.
- Generate data under the model and see if your estimates are getting close to the truth.

Good luck,
Stephen
Rochester, Minn. USA

On 4/18/07, Deepankar Basu [EMAIL PROTECTED] wrote:
> As part of carrying out a complicated maximum likelihood estimation, I am trying to learn to program likelihoods in R. I started with a simple probit model but am unable to get the code to work. Any help or suggestions are most welcome. I give my code below:
> mlogl <- function(mu, y, X) {
>   n <- nrow(X)
>   zeta <- X %*% mu
>   llik <- 0
>   for (i in 1:n) {
>     if (y[i] == 1)
>       llik <- llik + log(pnorm(zeta[i,], mean = 0, sd = 1))
>     else
>       llik <- llik + log(1 - pnorm(zeta[i,], mean = 0, sd = 1))
>   }
>   return(-llik)
> }
> women <- read.table("~/R/Examples/Women13.txt", header = TRUE)  # DATA
> # THE DATA SET CAN BE ACCESSED HERE
> # women <- read.table("http://wps.aw.com/wps/media/objects/2228/2281678/Data_Sets/ASCII/Women13.txt", header = TRUE)
> # I HAVE CHANGED THE NAMES OF THE VARIABLES
> # J is changed to work
> # M is changed to mar
> # S is changed to school
> attach(women)
> # THE VARIABLES OF USE ARE
> # work: binary dependent variable
> # mar: whether married or not
> # school: years of schooling
> mu.start <- c(3, -1.5, 10)
> data <- cbind(1, mar, school)
> out <- nlm(mlogl, mu.start, y = work, X = data)
> cat("Results", "\n")
> out$estimate
> detach(women)
> When I try to run the code, this is what I get:
> source("probit.R")
> Results
> Warning messages:
> 1: NA/Inf replaced by maximum positive value
> 2: NA/Inf replaced by maximum positive value
> 3: NA/Inf replaced by maximum positive value
> 4: NA/Inf replaced by maximum positive value
> Thanks in advance.
> Deepankar
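Several of the points above (no for loop, data generated under the model, a check against existing software) can be combined in one short sketch. It does not use the Women13 data: the simulated design, the "true" coefficients, the seed and the names negll, fit and ref are all invented here, and optim() with BFGS is just one reasonable choice of optimizer.

```r
# Vectorized probit negative log-likelihood, checked against glm().
set.seed(2007)                                   # arbitrary seed
n <- 500
X <- cbind(1, rbinom(n, 1, 0.5), rnorm(n))       # intercept, dummy, covariate
beta_true <- c(0.5, -1, 0.8)                     # invented coefficients
y <- rbinom(n, 1, pnorm(drop(X %*% beta_true)))  # data generated under the model

negll <- function(mu, y, X) {
  zeta <- drop(X %*% mu)
  q <- 2 * y - 1                        # +1 when y == 1, -1 when y == 0
  # P(y = 1) = pnorm(zeta) and P(y = 0) = pnorm(-zeta), so both cases are
  # pnorm(q * zeta); log.p = TRUE avoids log(0) for extreme zeta.
  -sum(pnorm(q * zeta, log.p = TRUE))
}

fit <- optim(c(0, 0, 0), negll, y = y, X = X, method = "BFGS")
ref <- glm(y ~ X - 1, family = binomial(link = "probit"))

round(cbind(optim = fit$par, glm = unname(coef(ref)), truth = beta_true), 3)
```

Starting from zeros keeps the initial linear predictor moderate; with log.p = TRUE the function also stays finite at starting values as extreme as c(3, -1.5, 10).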
Re: [R] Computing an ordering on subsets of a data frame
Does this do what you want?

> x <- "A B
+ 1 2
+ 1 3
+ 1 4
+ 2 3
+ 2 1
+ 2 1
+ 3 2
+ 3 1
+ 3 3"
> x <- read.table(textConnection(x), header=TRUE)
> x$C <- ave(x$B, x$A, FUN=rank)
> x
  A B   C
1 1 2 1.0
2 1 3 2.0
3 1 4 3.0
4 2 3 3.0
5 2 1 1.5
6 2 1 1.5
7 3 2 2.0
8 3 1 1.0
9 3 3 3.0

On 4/18/07, Lukas Biewald [EMAIL PROTECTED] wrote:
> If I have a data frame X that looks like this:
> A B
> - -
> 1 2
> 1 3
> 1 4
> 2 3
> 2 1
> 2 1
> 3 2
> 3 1
> 3 3
> and I want to make another column which has the rank of B computed separately for each value of A. I.e. something like:
> A B C
> - - -
> 1 2 1
> 1 3 2
> 1 4 3
> 2 3 3
> 2 1 1
> 2 1 2
> 3 2 2
> 3 1 1
> 3 3 3
> by(X, X[,1], function(x) { rank(x[,1], ties.method="random") } )
> almost seems to work, but the data is not in a frame, and I can't figure out how to merge it back into X properly.
> Thanks, Lukas

-- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve?
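The ave() answer gives tied values the average rank (1.5 and 1.5), whereas the question asked for distinct ranks within each group. A small variation, sketched here with ties.method = "first" so that the result is deterministic (the original question used "random"), reproduces the C column from the question.

```r
x <- data.frame(A = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                B = c(2, 3, 4, 3, 1, 1, 2, 1, 3))

# Rank B within each level of A; ties are broken by order of appearance.
x$C <- ave(x$B, x$A, FUN = function(b) rank(b, ties.method = "first"))

x
```

Because ave() returns a vector aligned with the rows of x, no merge step is needed, which was the difficulty with the by() attempt.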