[R] Method dispatch in functions?
Hi,

Could someone point me in the right direction for documentation on the following question? Let's say I have two objects, a and b, of classes A and B respectively. Now let's say I write a function foo that does something similar to objects of type A and B. Basically I want to overload the function, in C++ talk: if I give foo an object of type A, something (and this is my question) dispatches the call to, say, foo.A, and if I give foo an object of type B, something dispatches the call to, say, foo.B. I want to write foo.A and foo.B. How do I perform the method dispatch? From what I understand there are two ways in R: S3 and S4. What is the simple S3 way?

Thanks!
Jack.

[[alternative HTML version deleted]]

R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
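A minimal sketch of the S3 route asked about above. The names foo, foo.A and foo.B come from the question; the foo.default fallback and the return strings are assumptions added for illustration. The generic calls UseMethod(), which inspects class(x) and hands the call to the matching foo.<class> function.

```r
# The generic: UseMethod() dispatches on the class attribute of x.
foo <- function(x, ...) UseMethod("foo")

# One method per class; the bodies here are placeholders.
foo.A <- function(x, ...) "handled by foo.A"
foo.B <- function(x, ...) "handled by foo.B"

# Optional fallback for any class without its own method (an assumption,
# not something the question asked for).
foo.default <- function(x, ...) stop("no foo method for class ", class(x)[1])

# Two toy objects; an S3 class is just a class attribute.
a <- structure(list(), class = "A")
b <- structure(list(), class = "B")

foo(a)  # dispatched to foo.A
foo(b)  # dispatched to foo.B
```

The documentation to start from is ?UseMethod and ?methods.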
Re: [R] Source Code for zoo?
Berwin,

Thanks for your message. It looks like I did download the wrong file!

Jack.

Berwin A Turlach [EMAIL PROTECTED] wrote:
> That you are downloading binary distributions of packages instead of their source distribution? :-)
>
> Cheers,
> Berwin
[R] Days of the week?
Hi WizaRds,

What is the standard way to get the day of the week from a date such as as.Date("2006-12-01")? It looks like fCalendar has some functions, but this requires a change in the R locale to GMT. Is there another way?

Thanks!
Jack.
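For what it's worth, base R can answer this without fCalendar. The sketch below uses only weekdays(), format() and as.POSIXlt(), all from base; the variable name d is an assumption for illustration.

```r
d <- as.Date("2006-12-01")

# Day name; the spelling depends on the current locale.
weekdays(d)        # "Friday" in an English locale
format(d, "%A")    # same information via a format string

# Locale-independent number: 0 = Sunday, 1 = Monday, ..., 6 = Saturday.
as.POSIXlt(d)$wday
```

The numeric form is the safer one to branch on in code, since the names change with the locale.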
Re: [R] Can anyone read a S-PLUS .dmp file for me?
Anyone?

John McHenry [EMAIL PROTECTED] wrote:
> Hi WizaRds,
>
> I tried reading the S-PLUS file
>
>     ftp://ftp.research.att.com/dist/bayes-meta/hblm.dmp
>
> into R using
>
>     data.restore("hblm.dmp")
>
> but I got an error:
>
>     Error in attributes(value) <- thelist[-match(c(".Data", ".Dim", ".Dimnames", :
>         row names must be 'character' or 'integer', not 'double'
>     In addition: Warning message:
>     NAs introduced by coercion
>
> Does anyone know how to read this type of S-PLUS file into R? I am not familiar with it. On http://cran.r-project.org/doc/manuals/R-data.html it is suggested that it is usually more reliable to dump the object(s) in S-PLUS and source the dumpfile in R. See also http://tolstoy.newcastle.edu.au/R/help/05/12/18209.html
>
> I don't know how this file was created. Could someone with S-PLUS access please see if they can read it?
>
> Thanks!
> Jack.
[R] Can anyone read a S-PLUS .dmp file for me?
Hi WizaRds,

I tried reading the S-PLUS file

    ftp://ftp.research.att.com/dist/bayes-meta/hblm.dmp

into R using

    data.restore("hblm.dmp")

but I got an error:

    Error in attributes(value) <- thelist[-match(c(".Data", ".Dim", ".Dimnames", :
        row names must be 'character' or 'integer', not 'double'
    In addition: Warning message:
    NAs introduced by coercion

Does anyone know how to read this type of S-PLUS file into R? I am not familiar with it. On http://cran.r-project.org/doc/manuals/R-data.html it is suggested that it is usually more reliable to dump the object(s) in S-PLUS and source the dumpfile in R. See also http://tolstoy.newcastle.edu.au/R/help/05/12/18209.html

I don't know how this file was created. Could someone with S-PLUS access please see if they can read it?

Thanks!
Jack.
[R] Is there a better way than x[1:length(x)-1] ?
Hi WizaRds,

In MATLAB you can do

    x=1:10

and then specify

    x(2:end)

to get

    2 3 4 5 6 7 8 9 10

or whatever (note that in MATLAB the parenthetic index notation is used, not brackets as in R). The point is that 'end' allows you to refer to the final index point of the array.

Obviously there isn't much gain in syntax when the variable name is x, but when it's something like

    hereIsABigVariableName(j:end-i)

it makes things a lot more readable than

    hereIsABigVariableName(j:length(hereIsABigVariableName)-i)

In R I could do:

    n <- length(hereIsABigVariableName)
    hereIsABigVariableName[j:(n-i)]

(the parentheses are needed because in R the colon binds more tightly than the minus), but I'd like to use something like 'end', if it exists. Am I missing something obvious in R that does what 'end' does in MATLAB?

Thanks,
Jack.
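R has no 'end' keyword, but head() and tail() with a negative count cover most of the same ground. A small sketch, assuming a toy vector x and made-up offsets j and i; it also shows the operator-precedence trap in the subject line, where 1:length(x)-1 parses as (1:length(x)) - 1.

```r
x <- 1:10

tail(x, -1)   # drop the first element, like MATLAB x(2:end)
head(x, -1)   # drop the last element, like MATLAB x(1:end-1)
x[-1]         # negative indices also drop elements

# Precedence: ':' binds more tightly than binary minus in R.
3:10 - 2      # same as (3:10) - 2, i.e. 1, 2, ..., 8
3:(10 - 2)    # 3, 4, ..., 8

# MATLAB x(j:end-i), written without repeating the variable name.
# Assumes j >= 2, because tail(x, 0) returns an empty vector.
j <- 3
i <- 2
tail(head(x, -i), -(j - 1))
```

When j may be 1, the explicit form x[j:(length(x) - i)] is the safer choice.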
Re: [R] Tcltk package
Hi Adrian,

Thanks for the tip. I re-installed and everything seems to work just fine.

Thanks,
Jack.

Adrian DUSA [EMAIL PROTECTED] wrote:
> On Tuesday 01 August 2006 19:24, John McHenry wrote:
>> [...]
>> Yes, I built R myself. I couldn't find a debian package for R 2.3.1. The latest available is 2.2.1.
>
> Oh, but there is... right on CRAN. For Dapper just add this line to your sources.list:
>
>     deb http://cran.R-project.org/bin/linux/ubuntu/ dapper/
>
> This repository has lots of other packages compiled for Ubuntu, feel free to take a look.
>
> HTH,
> Adrian
>
> --
> Adrian DUSA
> Romanian Social Data Archive
> 1, Schitu Magureanu Bd
> 050025 Bucharest sector 5
> Romania
> Tel./Fax: +40 21 3126618 \ +40 21 3120210 / int.101
[R] Tcltk package
Hi WizaRds,

I ran into trouble trying to install the debug package, which requires TCL/TK support. It seems like the tcltk package is not installed on my system. From

    http://tolstoy.newcastle.edu.au/R/help/05/07/7993.html

it seems that tcltk is bundled with the base R distribution. I'm running R under linux:

    > version
             _
    platform i686-pc-linux-gnu
    arch     i686
    os       linux-gnu
    system   i686, linux-gnu
    status
    major    2
    minor    3.1
    year     2006
    month    06
    day      01
    svn rev  38247
    language R
    version.string Version 2.3.1 (2006-06-01)

tcl8.4 and tk8.4 are both installed. The messages I get when I try to install the debug package are:

    > install.packages("debug")
    trying URL 'http://cran.us.r-project.org/src/contrib/debug_1.1.0.tar.gz'
    Content type 'application/x-tar' length 26492 bytes
    opened URL
    ==
    downloaded 25Kb

    * Installing *source* package 'debug' ...
    ** R
    ** inst
    ** save image
    Loading required package: mvbutils
    MVBUTILS: no tasks vector found in ROOT
    Loading required package: tcltk
    Error in firstlib(which.lib.loc, package) :
        Tcl/Tk support is not available on this system
    Error: package 'tcltk' could not be loaded
    Execution halted
    ERROR: execution of package source for 'debug' failed
    ** Removing '/usr/local/lib/R/library/debug'

    The downloaded packages are in
        /tmp/RtmpEocXcC/downloaded_packages
    Warning message:
    installation of package 'debug' had non-zero exit status in: install.packages("debug")

Where am I going wrong?

Thanks,
Jack.
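A quick way to tell whether a given R build has Tcl/Tk compiled in, before trying to install anything that depends on tcltk, is capabilities(). This is only a diagnostic sketch; the variable name has_tcltk and the messages are assumptions for illustration.

```r
# capabilities() reports the optional features this R binary was built with.
has_tcltk <- capabilities("tcltk")

if (has_tcltk) {
  message("Tcl/Tk support is compiled in; library(tcltk) should load.")
} else {
  # Having the Tcl/Tk runtime installed is not enough for a source build:
  # configure also needs the development headers to be present.
  message("This R build has no Tcl/Tk support.")
}
```

If the result is FALSE on a self-built R, the usual remedy is to install the Tcl/Tk development files and re-run configure and make.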
Re: [R] Tcltk package
Hi Peter,

Peter Dalgaard [EMAIL PROTECTED] wrote:
> Did you build R yourself, and can you do library(tcltk) on R's command line?

Yes, I built R myself. I couldn't find a debian package for R 2.3.1. The latest available is 2.2.1.

    > library(tcltk)
    Error in firstlib(which.lib.loc, package) :
        Tcl/Tk support is not available on this system
    Error in library(tcltk) : .First.lib failed for 'tcltk'

> You may well be missing the -devel packages for tcl and tk.

I didn't get any warnings when I used configure. Do I need to explicitly configure for tcl and tk?

> And, BTW, which Linux distribution is this? i686-pc-linux-gnu is not sufficient.

Ubuntu 6.06.

Thanks,
Jack.

Peter Dalgaard [EMAIL PROTECTED] wrote:
> John McHenry writes:
>> Hi WizaRds,
>>
>> I ran into trouble trying to install the debug package, which requires TCL/TK support. It seems like the tcltk package is not installed on my system.
>> [...]
>> Where am I going wrong?
>
> Did you build R yourself, and can you do library(tcltk) on R's command line?
>
> You may well be missing the -devel packages for tcl and tk.
>
> And, BTW, which Linux distribution is this? i686-pc-linux-gnu is not sufficient.
>
> --
> Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
> Øster Farimagsgade 5, Entr.B, PO Box 2099, 1014 Cph. K, Denmark
> ([EMAIL PROTECTED]) Ph: (+45) 35327918, FAX: (+45) 35327907
Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?
Hi Hadley,

Thanks for your suggestion. The description of ggplot states:

    Description: ... It combines the advantages of both base and lattice graphics ... and you can still build up a plot step by step from multiple data sources

So I thought I'd try to enhance the plot by adding in the means from each quarter (this is snagged directly from ESS):

    qplot(Quarter, Consumption, data=data, type=c("point","line"), id=data$Year)
    ( mean.per.quarter <- with(data, tapply(Consumption, Quarter, mean)) )
    points(mean.per.quarter, pch="+", cex=2.0)

    > qplot(Quarter, Consumption, data=data, type=c("point","line"), id=data$Year)
    > ( mean.per.quarter <- with(data, tapply(Consumption, Quarter, mean)) )
        1     2     3     4
    888.2 709.2 616.4 832.8
    > points(mean.per.quarter, pch="+", cex=2.0)
    Error in plot.xy(xy.coords(x, y), type = type, ...) :
        plot.new has not been called yet

Now I'm green behind the ears when it comes to R, so I'm guessing that there is some major conflict between base graphics and lattice graphics, which I thought ggplot avoided, given the library help blurb.

I'm assuming that there must be a way to add points / lines to lattice / ggplot graphics (in the latter case it seems to be via ggpoint, or some such)? But is there a way that allows me to add via:

    points(mean.per.quarter, pch="+", cex=2.0)

and similar, or do I have to learn the lingo for lattice / ggplot?

Thanks,
Jack.

hadley wickham [EMAIL PROTECTED] wrote:
>> And if lattice is ok then try this:
>>
>>     library(lattice)
>>     xyplot(Consumption ~ Quarter, group = Year, data, type = "o")
>
> Or you can use ggplot:
>
>     install.packages("ggplot")
>     library(ggplot)
>     qplot(Quarter, Consumption, data=data, type=c("point","line"), id=data$Year)
>
> Unfortunately this has uncovered a couple of small bugs for me to fix (no automatic legend, and have to specify the data frame explicitly).
>
> The slightly more verbose example below shows you what it should look like.
>
>     data$Year <- factor(data$Year)
>     p <- ggplot(data, aes=list(x=Quarter, y=Consumption, id=Year, colour=Year))
>     ggline(ggpoint(p), size=2)
>
> Regards,
> Hadley
Re: [R] Overplotting: plot() invocation looks ugly ... suggestions?
Gabor,

Your suggestion:

    library(lattice)
    xyplot(Consumption ~ Quarter, group = Year, data, type = "o")

is very elegant indeed.

Thanks,
Jack.

Gabor Grothendieck [EMAIL PROTECTED] wrote:
> And if lattice is ok then try this:
>
>     library(lattice)
>     xyplot(Consumption ~ Quarter, group = Year, data, type = "o")
>
> On 7/24/06, Gabor Grothendieck wrote:
>> Try:
>>
>>     matplot(levels(data$Quarter), matrix(data$Consumption, 4), type = "o")
>>
>> On 7/24/06, John McHenry wrote:
>>> Hi WizaRds,
>>>
>>> I'd like to overplot UK fuel consumption per quarter over the course of five years. Sounds simple enough?
>>> [...]
>>> Thanks,
>>> Jack.
[R] Overplotting: plot() invocation looks ugly ... suggestions?
Hi WizaRds,

I'd like to overplot UK fuel consumption per quarter over the course of five years. Sounds simple enough? Unless I'm missing something, the following seems very involved for what I'm trying to do. Any suggestions on simplifications?

The way I did it is awkward mainly because of the first call to plot ... but isn't this necessary, especially to set limits for the plot? The second call to plot(), in conjunction with by(), seems to be natural enough, and, IMHO, seems to be readable and succinct.

    data <- read.table(textConnection("Year Quarter Consumption
    1965 1 874
    1965 2 679
    1965 3 616
    1965 4 816
    1966 1 866
    1966 2 700
    1966 3 603
    1966 4 814
    1967 1 843
    1967 2 719
    1967 3 594
    1967 4 819
    1968 1 906
    1968 2 703
    1968 3 634
    1968 4 844
    1969 1 952
    1969 2 745
    1969 3 635
    1969 4 871"), header=TRUE)
    data$Quarter <- as.factor(data$Quarter)
    #
    # what follows is only marginally less involved than using a for loop
    # (the culprit is, in part, the need to make the first, type="n", call to plot()):
    windows(width=12, height=6)
    with(data, plot(levels(Quarter), Consumption[Year==Year[1]],
        ylim=c(min(Consumption), max(Consumption)), type="n"))
    with(data, by(Consumption, Year, function(x) lines(levels(Quarter), x, type="o")))

Thanks,
Jack.
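One base-graphics way to avoid the separate type="n" call is to reshape the series into a quarter-by-year matrix and hand it to matplot(), which sets the axis limits itself. The sketch below rebuilds the data from the post; the matrix name cons and the axis labels are assumptions for illustration. Because it stays in base graphics, points() can then add the per-quarter means.

```r
data <- data.frame(
  Year        = rep(1965:1969, each = 4),
  Quarter     = rep(1:4, times = 5),
  Consumption = c(874, 679, 616, 816,
                  866, 700, 603, 814,
                  843, 719, 594, 819,
                  906, 703, 634, 844,
                  952, 745, 635, 871)
)

# One column per year, one row per quarter (the data are ordered by year).
cons <- matrix(data$Consumption, nrow = 4,
               dimnames = list(as.character(1:4), as.character(1965:1969)))

# matplot() draws every column as its own line and chooses ylim from all of them.
matplot(1:4, cons, type = "o", pch = 1, lty = 1,
        xlab = "Quarter", ylab = "Consumption")

# Overlay the mean for each quarter on the same base-graphics plot.
mean.per.quarter <- rowMeans(cons)
points(1:4, mean.per.quarter, pch = "+", cex = 2)
```

Mixing points() into a lattice or ggplot figure is a different matter, since those packages draw through grid rather than base graphics.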
[R] Code for Screenshots graphics (following on from ease-of-use issues on www.r-project.org)
Does anyone know where the code for the graphics on http://www.r-project.org/screenshots/screenshots.html lives?
Re: [R] Incrementing a counter in lapply
Thanks, Gabor & Thomas.

Apologies, but I used an example that obfuscated the question that I wanted to ask. I really wanted to know how to have extra arguments in functions that would allow, per the example code, for something like a counter to be incremented. Thomas's suggestion of using mapply (reproduced below with corrections) is probably closest.

Jack.

PS Here's the corrected code:

    d <- data.frame(read.table(textConnection("
    Y X D
    85 30 0
    95 40 1
    90 40 1
    75 20 0
    100 60 1
    90 40 0
    90 50 0
    90 30 1
    100 60 1
    85 30 1
    "), header=TRUE))
    windows(); plot(Y ~ X, d, type="n")
    colors <- c("blue", "green")
    junk <- mapply(
        function(z, color) with(z, lines(X, predict(lm(Y~X)), col=color)),
        with(d, split(d, D)),
        color=colors
    )

Thomas Lumley [EMAIL PROTECTED] wrote:
> You can't get lapply to increment i, but you can use mapply and write your function with two arguments.
>
>     mapply( function(z,colour) with(z, lines(X, predict(lm(Y~X), col=colour)), with(d, split(d,D)), colors)
>
> -thomas

Gabor Grothendieck [EMAIL PROTECTED] wrote:
> Try this:
>
>     plot(Y ~ X, d, type = "n")
>     f <- function(i) abline(lm(Y ~ X, d, subset = D == i), col = colors[i+1])
>     junk <- lapply(unique(d$D), f)
>
> On 3/13/06, John McHenry wrote:
>> Hi All,
>>
>> I'm looking for some hints on idiomatic R usage using 'lapply' or similar.
>> [...]
>> Thanks,
>> Jack.
[R] Incrementing a counter in lapply
Hi All,

I'm looking for some hints on idiomatic R usage using 'lapply' or similar. What follows is a simple example from which to generalize my question...

    # Suppose, in this simple example, I want to plot a number of different lines in different colors;
    # I define the colors I wish to use and I plot them in a loop:
    d <- data.frame(read.table(textConnection("
    Y X D
    85 30 0
    95 40 1
    90 40 1
    75 20 0
    100 60 1
    90 40 0
    90 50 0
    90 30 1
    100 60 1
    85 30 1
    "), header=TRUE))
    # graph the relation of Y to X when
    #   i) D==0
    #  ii) D==1
    with( d, plot(X, Y, type="n") )
    component <- with( d, split(d, D) )
    colors <- c("blue", "green")
    for (i in 1:length(component))
        with( component[[i]], lines(X, predict(lm(Y ~ X)), col=colors[i]) )
    #
    # ... seems easy enough
    #
    # [Q.]: How to do the same as the above but using 'lapply'?
    # ... i.e. something along the lines of:
    with( d, plot(X, Y, type="n") )
    colors <- c("blue", "green")
    # how do I get lapply to increment i?
    lapply( with(d, split(d, D)), function(z) with(z, lines(X, predict(lm(Y ~ X)), col=colors[i])) )

Thanks,
Jack.
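One common idiom for "a counter inside lapply" is to iterate over the indices rather than the elements, so the function receives i directly. A sketch under that assumption, using the data from the post; the name fits and the choice of abline() over lines(predict(...)) are illustrative, not taken from the thread.

```r
d <- data.frame(
  Y = c(85, 95, 90, 75, 100, 90, 90, 90, 100, 85),
  X = c(30, 40, 40, 20,  60, 40, 50, 30,  60, 30),
  D = c( 0,  1,  1,  0,   1,  0,  0,  1,   1,  1)
)

component <- split(d, d$D)
colors <- c("blue", "green")

plot(Y ~ X, d, type = "n")

# Loop over positions, not elements: i indexes both the data and the colors.
fits <- lapply(seq_along(component), function(i) {
  fit <- lm(Y ~ X, data = component[[i]])
  abline(fit, col = colors[i])   # draw the fitted line for group i
  coef(fit)                      # return something useful instead of NULL
})
```

The mapply() route, which walks the list of groups and the vector of colors in parallel, does the same job without an explicit index.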
Re: [R] Command-line editing history
Unless, of course, like me, you run vim for religious reasons ;)

Seriously, though, it seems that this functionality is not available in the R Console, correct?

I have tried many GUI front-ends and found that there are both advantages and, unfortunately, disadvantages to their use, certainly in the way that I normally work. Hence the best solution, for me at least, would be to have added functionality in the R Console.

Liaw, Andy [EMAIL PROTECTED] wrote:
> Unless I'm mistaken, all those features (and more) are available if you run R within ESS/(X)Emacs.
>
> Andy
>
> From: John McHenry
>> Hi all,
>>
>> Are there any plans to add more functionality to command-line editing and history editing on the command line?
>> [...]
>> Thanks,
>> Jack.
Re: [R] Command-line editing history
Oops, should have included:

    > version
             _
    platform i386-pc-mingw32
    arch     i386
    os       mingw32
    system   i386, mingw32
    status
    major    2
    minor    2.1
    year     2005
    month    12
    day      20
    svn rev  36812
    language R

So, no, I'm on the Windoze port. So I guess my question changes to: does anyone know if there are plans to add, or whether it is possible to add, GNU Readline functionality in the Windoze port?

I especially like the fact that vi key bindings are available ;)

Thanks,
Jack.

Jeffrey Horner [EMAIL PROTECTED] wrote:
> John McHenry wrote:
>> Hi all,
>>
>> Are there any plans to add more functionality to command-line editing and history editing on the command line?
>
> Presuming you're running R from a Unix console (I'm unsure of the windows port, maybe?), it is sufficient, insofar as how well you like the GNU readline library and if it's been compiled into R:
>
> http://cnswww.cns.cwru.edu/php/chet/readline/rluserman.html
>
> I can even use VI style key bindings to work with historical commands: typing K recalls previous commands, J goes forward through the commands. I can even search through the commands. Auto completion of function names doesn't work but file names do.
>
> One point that was a bit of work for me to set up was automatically saving history. In order to do this, you must first set the environment variable R_HISTFILE to the location of your saved history file. Then, at the end of your R session, you can run:
>
>     savehistory(Sys.getenv("R_HISTFILE"))
>
> Or better yet, put the following in your .Rprofile:
>
>     .Last <- function() savehistory(Sys.getenv("R_HISTFILE"))
>
>> In MATLAB (I know, comparisons are odious ...), you can type "p" and up-arrow on the command line and scroll through the recently entered commands beginning with p.
>> [...]
>> Thanks,
>> Jack.
>
> --
> Jeffrey Horner, Computer Systems Analyst
> School of Medicine, Department of Biostatistics, Vanderbilt University
> 615-322-8606
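The history-saving hook described in the quoted reply could be written a little more defensively, as sketched below. The fallback file name ".Rhistory", the interactive() guard and the try() wrapper are assumptions added here; savehistory() and Sys.getenv() are the same base functions the reply uses.

```r
# Intended for ~/.Rprofile: R calls .Last() when a session is quit normally.
.Last <- function() {
  # History only exists in interactive sessions, so skip batch runs.
  if (interactive()) {
    # Use R_HISTFILE if it is set, otherwise fall back to ".Rhistory".
    histfile <- Sys.getenv("R_HISTFILE", unset = ".Rhistory")
    # Some consoles keep no history; do not let that abort the shutdown.
    try(savehistory(histfile), silent = TRUE)
  }
}
```

Whether savehistory() does anything depends on the console in use, so this is worth testing on the platform in question.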
[R] Command-line editing history
Hi all,

Are there any plans to add more functionality to command-line editing and history editing on the command line?

In MATLAB (I know, comparisons are odious ...), you can type "p" and up-arrow on the command line and scroll through the recently entered commands beginning with p. This is a very useful feature and something that I believe is not replicated in R. Please correct me if I'm wrong; currently I use history(Inf) in R, search for what I want, and cut and paste if I find what I'm looking for.

Also in MATLAB, tab completion is available for directory listings and also for function name completion. Again, I'm unaware of how to do this in R. The added MATLAB functionality makes finding files easy on the command line and it also saves the fingers on long function names.

Thanks,
Jack.
[R] Elegant way to express residual calculation in R?
Hi All,

I am illustrating a simple, two-way ANOVA using the following data and I'm having difficulty in expressing the predicted values succinctly in R.

    X <- data.frame(read.table(textConnection("
    Machine.1 Machine.2 Machine.3
    53 61 51
    47 55 51
    46 52 49
    50 58 54
    49 54 50
    "), header=TRUE))
    rownames(X) <- paste("Operator.", 1:nrow(X), sep="")
    print(X)

    # I'd like to know if there is a more elegant way to calculate the residuals
    # than the following, which seems to be rather a kludge. If you care to read
    # the code you'll see what I mean.
    machine.adjustment  <- colMeans(X) - mean(mean(X))  # length(machine.adjustment)==3
    operator.adjustment <- rowMeans(X) - mean(mean(X))  # length(operator.adjustment)==5
    X.predicted <- numeric(0)
    for (j in 1:ncol(X))
    {
        new.col <- mean(mean(X)) + operator.adjustment + machine.adjustment[j]
        X.predicted <- cbind(X.predicted, new.col)
    }
    print(X.predicted)
    X.residual <- X - X.predicted
    SS.E <- sum( X.residual^2 )

It seems like there ought to be some way of doing that a little bit cleaner ...

Thanks,
Jack.
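One possibly cleaner route is outer(), which builds the whole table of operator-plus-machine adjustments in a single call. The sketch below assumes the data are held as a numeric matrix rather than a data frame (in current versions of R, mean() of a data frame returns NA with a warning, so mean(mean(X)) no longer gives the grand mean); the name grand is an illustrative addition.

```r
X <- matrix(c(53, 61, 51,
              47, 55, 51,
              46, 52, 49,
              50, 58, 54,
              49, 54, 50),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste("Operator", 1:5, sep = "."),
                            paste("Machine", 1:3, sep = ".")))

grand <- mean(X)                             # grand mean of all 15 cells
operator.adjustment <- rowMeans(X) - grand   # one per operator (length 5)
machine.adjustment  <- colMeans(X) - grand   # one per machine (length 3)

# outer(..., "+") forms every operator + machine combination as a 5 x 3 matrix.
X.predicted <- grand + outer(operator.adjustment, machine.adjustment, "+")

X.residual <- X - X.predicted
SS.E <- sum(X.residual^2)
```

As a check on the additive fit, every row and every column of X.residual should sum to zero up to rounding error.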
[R] array of lists? is this the best way to do it?
[Q.] How to create an array of lists, or structures, the most elegant way?

There have been questions in the past but none too recently... I want to know if the following looks OK to you guys or if there is a better way to create an array of lists:

    # PREAMBLE ... JUST TO GET THINGS GOING
    makeList <- function(data, anythingElse)
    {
        rval <- list( data = data, anythingElse = anythingElse )
        class(rval) <- "myListOfArbitraryThings"
        return(rval)
    }
    # make up some arbitrary data
    payload <- list( as.matrix(cbind(1,1:3)), 10:15,
                     data.frame(cbind(x=1, y=1:10), fac=sample(LETTERS[1:3], 10, repl=TRUE)) )

    # HERE'S THE ARRAY-CONSTRUCTION PART THAT I WANT CRITIQUED:
    n <- 3                  # number of lists in the array of lists
    v <- vector("list", n)  # <--- IS THIS THE BEST WAY TO CREATE AN ARRAY OF LISTS?
    # fill the array with essentially arbitrary stuff:
    for (i in 1:n) v[[i]] <- makeList(payload[[i]], i)

Thanks,
Jack.
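Pre-allocating with vector("list", n) and filling in a loop is a perfectly reasonable pattern. An alternative that skips the explicit allocation is to let Map() build the list, as in this sketch; the simplified payload (no random factor column) and the use of structure() are assumptions for illustration.

```r
makeList <- function(data, anythingElse) {
  structure(list(data = data, anythingElse = anythingElse),
            class = "myListOfArbitraryThings")
}

# Arbitrary payload, as in the post but without the random column.
payload <- list(cbind(1, 1:3), 10:15, data.frame(x = 1, y = 1:10))

# Map() calls makeList(payload[[i]], i) for each i and returns a list.
v <- Map(makeList, payload, seq_along(payload))

length(v)       # one element per payload item
class(v[[2]])   # each element carries the S3 class
v[[2]]$data     # access by position, then by name
```

Either way the result is an ordinary R list whose elements are themselves lists, which is about as close as R gets to an array of structures.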
[R] reading in data with variable length
I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record.

Here's a (simplified) example:

    $ cat foo.csv
    Name,Start Month,Data
    Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
    Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114

The records consist of rows with some set comma-separated fields (e.g. the "Name" & "Start Month" fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.

Now I can use e.g.

    fileName="foo.csv"
    ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)

which does the job nicely:

       V1 V2      V3     V4     V5      V6      V7     V8    V9    V10    V11    V12    V13     V14     V15    V16     V17
    1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA     NA     NA     NA     NA      NA      NA     NA      NA
    2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114

but the problem is that with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.

(I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes.)

I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.

Ideas?

Thanks,
Jack.
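A sketch of one way to get the ta[[i]]$data structure asked for above: read the raw lines, split them all with a single vectorised strsplit() call, and convert each record. The temporary file stands in for foo.csv, and the field names name and startMonth are assumptions for illustration; no timing claim is made for 1GB inputs.

```r
# Write the example from the post to a temporary file so the sketch is self-contained.
csv <- tempfile(fileext = ".csv")
writeLines(c(
  "Name,Start Month,Data",
  "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
  "Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114"
), csv)

txt <- readLines(csv)[-1]                    # drop the header line

# strsplit() is vectorised: one call splits every record.
# fixed = TRUE avoids regular-expression matching on the separator.
fields <- strsplit(txt, ",", fixed = TRUE)

# Each record becomes a small list; the data part keeps its own length, no NA padding.
ta <- lapply(fields, function(f) {
  list(name       = f[1],
       startMonth = as.integer(f[2]),
       data       = as.numeric(f[-(1:2)]))
})

ta[[1]]$data
length(ta[[2]]$data)
```

For files too large to hold as text, reading in chunks with readLines(con, n = ...) on an open connection follows the same pattern.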
Re: [R] reading in data with variable length
I should have mentioned that I already tried the readLines() approach:

    ta <- readLines("foo.csv")
    ptm <- proc.time()
    f <- character(length(ta))
    for (k in 2:length(ta)) {
        f[k-1] <- (strsplit(ta[k], ",")[[1]])[3]
    }   # <- PARSING EACH LINE AT THIS LEVEL IS WHERE THE REAL INEFFICIENCY IS
    (proc.time() - ptm)[3]
    [1] 102.75

on a 62M file, so I'm guessing that on my 1GB files this will be about

    (102.75*(1000/61))/60
    [1] 28.07377

minutes...which is way, way too long.

I'm new to R but I'm kind of surprised that this problem isn't well known (couldn't find anything after a long hunt).

As I mentioned, MATLAB does it using textread, which makes a call to its dll dataread. The data are read using something like:

    [name, startMonth, data]=textread(fileName,'%s%n%[^\n]', 'delimiter',',', 'bufsize', 100, 'headerlines',1);

which is kind of fscanf-like. data in the above is then a cell array with each cell being the variable-length data.

Liaw, Andy [EMAIL PROTECTED] wrote:

> Use file() connection in conjunction with readLines() and strsplit() should do it. I would try to count the number of lines in the file first, and create a list with that many components, then fill it in.
>
> I believe the array of cells in Matlab is sort of equivalent to a list in R, but that's beyond my knowledge of Matlab...
>
> Andy
>
> From: John McHenry
>
> > I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record.
> >
> > Here's a (simplified) example:
> >
> >     $ cat foo.csv
> >     Name,Start Month,Data
> >     Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
> >     Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114
> >
> > The records consist of rows with some set comma-separated fields (e.g. the Name & Start Month fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.
> >
> > Now I can use e.g.
> >
> >     fileName="foo.csv"
> >     ta<-read.csv(fileName, header=F, skip=1, sep=",", dec=".", fill=T)
> >
> > which does the job nicely:
> >
> >        V1 V2      V3     V4     V5      V6      V7     V8    V9
> >     1 Foo 10 -0.5615 2.3065 0.1589 -0.3649  1.5955     NA    NA
> >     2 Bar 21  0.0880 0.5733 0.0081  2.0253 -0.7602 0.7765 0.281
> >          V10    V11    V12    V13     V14     V15    V16     V17
> >     1     NA     NA     NA     NA      NA      NA     NA      NA
> >     2 1.8546 0.2696 0.3316 0.1565 -0.4847 -0.1325 0.0454 -1.2114
> >
> > but the problem is with files on the order of 1GB this either crunches for ever or runs out of memory trying ... plus having all those NAs isn't too pretty to look at.
> >
> > (I have a MATLAB version that can read this stuff into an array of cells in about 3 minutes).
> >
> > I really want a fast way to read the data part into a list; that way I can access data in the array of lists containing the records by doing something like ta[[i]]$data.
> >
> > Ideas?
> >
> > Thanks,
> >
> > Jack.
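The loop in the message above calls strsplit() once per line. A minimal sketch of the same parsing done with one vectorized strsplit() call is given here. It assumes the foo.csv layout quoted above (one header line, then Name, Start Month and a variable-length run of numbers). The function name parse_records and the component names name, startMonth and data are made up for illustration, and nothing is claimed about timings on 1GB files.

```r
# Hypothetical helper: parse lines of the form Name,StartMonth,x1,x2,...
# into a list of records, each holding a variable-length numeric `data` vector.
parse_records <- function(lines) {
  # Split every line in a single vectorized call instead of once per loop pass.
  parts <- strsplit(lines, ",", fixed = TRUE)
  lapply(parts, function(p) {
    list(name       = p[1],
         startMonth = as.integer(p[2]),
         data       = as.numeric(p[-(1:2)]))
  })
}

# Sample input in the foo.csv layout; readLines() also accepts a file name.
con <- textConnection(c(
  "Name,Start Month,Data",
  "Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955",
  "Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765"
))
lines <- readLines(con)
close(con)

ta <- parse_records(lines[-1])   # drop the header line
ta[[1]]$data                     # numeric vector with 5 values
length(ta[[2]]$data)             # 6
```

Each record keeps only as many values as its line contained, so no NA padding is stored, and ta[[i]]$data gives the access pattern asked for in the original question.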
Re: [R] reading in data with variable length
Thanks for the awk scripts, Ted.

There are reasons (read political!) why R needs to be able to read the files in directly. But, sure, I agree, why not just awk the durned thing.

Just to be clear: it's not so much that the NAs are unsightly as that the storage they require in RAM is too much. With 1GB files it's easy to rapidly run out of space.

[EMAIL PROTECTED] wrote:

> On 06-Dec-05 John McHenry wrote:
> > I have very large csv files (up to 1GB each of ASCII text). I'd like to be able to read them directly in to R. The problem I am having is with the variable length of the data in each record.
> >
> > Here's a (simplified) example:
> >
> >     $ cat foo.csv
> >     Name,Start Month,Data
> >     Foo,10,-0.5615,2.3065,0.1589,-0.3649,1.5955
> >     Bar,21,0.0880,0.5733,0.0081,2.0253,-0.7602,0.7765,0.2810,1.8546,0.2696,0.3316,0.1565,-0.4847,-0.1325,0.0454,-1.2114
> >
> > The records consist of rows with some set comma-separated fields (e.g. the Name & Start Month fields in the above) and then the data follow as a variable-length list of comma-separated values until a new line is encountered.
>
> While you may well get a good R solution from the experts, in such a situation (as in so many) I would be tempted to pre-process the file with 'awk' (installed by default on Unix/Linux systems, available also for Windows).
>
> The following will give you a CSV file with a constant number of fields per line. While this does not eliminate the NAs which you apparently find unsightly, it should be a fast and clean way of doing the basic job, since it is a line-by-line operation in two passes, so there should be no question of choking the system (unless you run out of HD space as a result of creating the second file).
>
> Two passes, on the lines of
>
> Pass 1:
>
>     cat foo.csv | awk '
>       BEGIN{FS=","; n=0}
>       {m=NF; if(m>n){n=m}}
>       END{print n} '
>
> which gives you the maximum number of fields in any line. Suppose (for example) that this number is 37. Then
>
> Pass 2:
>
>     cat foo.csv | awk -v maxF=37 '
>       BEGIN{FS=","; OFS=","}
>       # padding statement reconstructed from the output shown below;
>       # the archived text is cut off after "if(NF"
>       {if(NF<maxF){$maxF=""}}
>       {print $0} ' > newfoo.csv
>
> Tiny example:
>
> 1) See foo.csv
>
>     cat foo.csv
>     1
>     1,2
>     1,2,3
>     1,2,3,4
>     1,2
>
> 2) Pass 1:
>
>     cat foo.csv | awk '
>       BEGIN{FS=","; n=0}
>       {m=NF; if(m>n){n=m}}
>       END{print n} '
>     4
>
> 3) So we need 4 fields per line. With maxF=4, Pass 2:
>
>     cat foo.csv | awk -v maxF=4 '
>       BEGIN{FS=","; OFS=","}
>       # same reconstructed padding statement as in Pass 2 above
>       {if(NF<maxF){$maxF=""}}
>       {print $0} ' > newfoo.csv
>
> 4) See newfoo.csv
>
>     cat newfoo.csv
>     1,,,
>     1,2,,
>     1,2,3,
>     1,2,3,4
>     1,2,,
>
> So you now have a CSV file with a constant number of fields per line. This doesn't make it into lists, though.
>
> Hoping this helps,
> Ted.
>
> E-Mail: (Ted Harding)
> Fax-to-email: +44 (0)870 094 0861
> Date: 06-Dec-05 Time: 18:08:54
> -- XFMail --
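For the case where R has to read the file directly, the two awk passes have a rough counterpart inside R: count.fields() reports the number of fields on each line (pass 1), and giving read.csv() that many col.names together with fill = TRUE pads the short rows (pass 2). The following is only a sketch on the tiny example from Ted's message, written to a temporary file; the variable names are arbitrary. It still stores the NA padding in memory, so it does not address the RAM concern.

```r
# Write the tiny example to a temporary file.
f <- tempfile(fileext = ".csv")
writeLines(c("1", "1,2", "1,2,3", "1,2,3,4", "1,2"), f)

# Pass 1: maximum number of comma-separated fields on any line.
n <- max(count.fields(f, sep = ","))

# Pass 2: explicit column names make read.csv() pad every row to n fields.
padded <- read.csv(f, header = FALSE, fill = TRUE,
                   col.names = paste0("V", seq_len(n)))

n             # 4
dim(padded)   # 5 rows, 4 columns
unlink(f)
```

Supplying col.names matters on real files because read.csv() otherwise decides the number of columns from the first five lines only, so a longer record further down would not fit the columns chosen.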
Re: [R] reading in data with variable length
Everything has slowed down with #1 and #3 by about 50%.

Can't do #2 & #4:

    ta.num <- lapply(ta0, scan, sep = ",")
    Error in file(file, "r") : unable to open connection

scan seems to want a file or a connection ...

Gabor Grothendieck [EMAIL PROTECTED] wrote:

> Could you time these and see how each of these do:
>
>     # 1
>     ta.split <- strsplit(ta, split = ",")
>     ta.num <- lapply(ta.split, function(x) as.numeric(x[-(1:2)]))
>
>     # 2
>     ta0 <- sub("^[^,]*,[^,]*,", "", ta)
>     ta.num <- lapply(ta0, scan, sep = ",")
>
>     # 3 - loop version of #1
>     n <- length(ta)
>     ta.split <- strsplit(ta, split = ",")
>     ta.num <- vector("list", n)
>     for(i in 1:n) ta.num[[i]] <- as.numeric(ta.split[[i]][-(1:2)])
>
>     # 4 - loop version of #2
>     n <- length(ta)
>     ta0 <- sub("^[^,]*,[^,]*,", "", ta)
>     ta.num <- vector("list", n)
>     for(i in 1:n) ta.num[[i]] <- scan(ta0[[i]], sep = ",")
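On the scan() error reported above: scan() treats a character first argument as a file name, which is why #2 and #4 fail with "unable to open connection". A sketch of two ways around that follows. Option A assumes a version of R in which scan() has a text argument; Option B uses textConnection() instead. The sample strings and the helper name scan_string are made up for illustration, and no timing is claimed for either option.

```r
# Data parts of two records, as they would look once the name and
# start-month fields have been stripped (sample values, shortened).
ta0 <- c("-0.5615,2.3065,0.1589", "0.0880,0.5733")

# Option A: hand each string to scan() through its `text` argument.
ta.num <- lapply(ta0, function(s) scan(text = s, sep = ",", quiet = TRUE))

# Option B: wrap each string in a textConnection and close it afterwards.
scan_string <- function(s) {
  con <- textConnection(s)
  on.exit(close(con))
  scan(con, sep = ",", quiet = TRUE)
}
ta.num2 <- lapply(ta0, scan_string)

identical(ta.num, ta.num2)   # TRUE
sapply(ta.num, length)       # 3 2
```

Both options open one connection per record, so on very large inputs the single strsplit() call of #1 may well remain the cheaper route.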