Re: [R] Column mapping
On 28.07.2010 07:36, jd6688 wrote:

> DF1:
> name
> OTHER
> ABCO
> KKKO
> QQQO
> DDDO
> PPPO
>
> DF2:
> name
> ABC
> KKK
> DDD
>
> If the names in DF1 map to the names in DF2, then add the mapped name to DF1 as a separate column, for instance mappedColumn.

What do you mean by "mapped": (1) how often a name in DF1 is present, or (2) whether a name of DF1 exists in DF2?

For (2) you could use

DF1$existsinDF2 <- DF1$name %in% DF2$name

For (1) use either aggregate or table (type ?aggregate or ?table at the command prompt for help).

Stefan

R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
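The two readings of "mapped" can be illustrated with a small sketch (toy data frames of my own, not the poster's exact ones):

```r
# Toy data frames to illustrate the two readings of "mapped"
DF1 <- data.frame(name = c("ABC", "KKK", "ABC", "QQQ"), stringsAsFactors = FALSE)
DF2 <- data.frame(name = c("ABC", "KKK", "DDD"), stringsAsFactors = FALSE)

# Reading (2) -- membership flag: does each DF1 name exist in DF2?
DF1$existsinDF2 <- DF1$name %in% DF2$name
DF1$existsinDF2        # TRUE TRUE TRUE FALSE

# Reading (1) -- counts: how often each name occurs in DF1
counts <- table(DF1$name)
counts[["ABC"]]        # 2
```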
[R] ddply error
ddply(z, .(groupId, location), function(d)
    with(d, c(startLoc = Pos[1], endLoc = Pos[length(Pos)],
              peakValue = max(sumoo), other = map[1])))

startLoc = Pos[1], endLoc = Pos[length(Pos)] and peakValue = max(sumoo) are numeric values, while other = map[1] is a character value. As a result, c(startLoc = Pos[1], endLoc = Pos[length(Pos)], peakValue = max(sumoo), other = map[1]) converts all values to character.

My question: how can I separate the numeric and character values in the statement above?

Thanks,
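The coercion the poster is seeing is standard c() behaviour; a minimal sketch (toy values, and a data.frame-based return value as one common fix, not necessarily the poster's):

```r
# c() coerces mixed types to their common type -- here, character
mixed <- c(startLoc = 1, peakValue = 5.5, other = "A")
class(mixed)                    # "character": the numbers were coerced too

# Returning a one-row data.frame instead keeps each column's own type;
# ddply can assemble such rows into a single typed data frame
row <- data.frame(startLoc = 1, peakValue = 5.5, other = "A",
                  stringsAsFactors = FALSE)
sapply(row, class)
```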
Re: [R] hatching possibility in Panel.Polygon
Hi,

library(gridExtra)
example(patternGrob)

provides some patterns to fill a rectangular area using Grid graphics. It could in theory be used in lattice. I wouldn't use it either, but I can imagine how it might be useful on very special occasions.

Best,
baptiste

On 28 July 2010 06:11, HC hca...@yahoo.co.in wrote:

> Thank you for your follow-up on this matter. I did think about the partially transparent colour option and will certainly use it and see how it works out. For presentations and colour prints, partial transparency is surely going to work and may even look nicer. But for black-and-white printing for journal articles or for making photocopies, hatching seems more effective, and that was the reason for my exploring it. There may be some valid reason to be disappointed if such an option is not made available. However, such an option would only add more power to trellis graphics, which are already very useful, attractive and efficient.
> Regards,
> HC
Re: [R] xYplot error
Hi Frank,

Thanks for the suggestion. Using numericScale() does work for Dotplot, but there are still a few issues:

1. My factor names are Plot A, PF, MSF, and YSF, so numericScale turns them into 3, 2, 1, 4 and the x-axis is plotted 1, 2, 3, 4. Is there any way I can retain the same order on the graph?

2. I can't get the error bars displayed even after using method="bars"; I only get the mean, lower and upper bounds of the data as points. This is the line I used:

Dotplot(cbind(mort, mort + stand, mort - stand) ~ numericScale(site) | type,
        data = mort, method = "bands")

Thanks for your help.
KM

On Jul 27, 9:58 pm, Frank Harrell f.harr...@vanderbilt.edu wrote:

> If the x-axis variable is really a factor, xYplot will not handle it. You probably need a dot chart instead (see Hmisc's Dotplot). Note that it is unlikely that the confidence intervals are really symmetric.
> Frank

On Tue, 27 Jul 2010, Kang Min wrote:

> Hi, I'm trying to plot a graph with error bars using xYplot in the Hmisc package. My data look like this:
>
> mort        stand       site   type
> 0.042512776 0.017854525 Plot A ST
> 0.010459803 0.005573305 PF     ST
> 0.005188321 0.006842107 MSF    ST
> 0.004276068 0.011592129 YSF    ST
> 0.044586495 0.035225266 Plot A LD
> 0.038810662 0.037355408 PF     LD
> 0.027567430 0.020523820 MSF    LD
> 0.024698872 0.020320976 YSF    LD
>
> Having read previous posts on xYplot being unable to plot the x-axis as factors, I used numericScale, but I still get this error:
>
> Error in label.default(xv, units = TRUE, plot = TRUE, default = as.character(xvname), :
>   the default string cannot be of length greater then one
>
> I used:
>
> xYplot(cbind(mort, mort + stand, mort - stand) ~ numericScale(site) | type,
>        method = "bars")
>
> Am I missing something or doing something wrong? Thanks.
> KM
Re: [R] Optimization problem with nonlinear constraint
Dear Ravi,

As I've already written to you, the problem indeed is to find a solution to the transcendental equation y = x * T^(x-1), given y and T; the optimization problem below was only a workaround. John C. Nash has been so kind as to help me with it. In case anyone faces a similar problem in the future, the solution looks as follows:

func1 <- function(y, x, T) {
    out <- x * T^(x - 1) - y
    return(out)
}

# Assign the known values to y and T:
y <- 3
T <- 123

root <- uniroot(func1, c(-10, 100), y = y, T = T)
print(root)

Somewhat simpler than I thought. Thanks again!

Uli

On 26.07.2010 17:44, Ravi Varadhan wrote:

> Hi Uli,
> I am not sure if this is the problem that you really want to solve. The answer is the solution to the equation y = x * T^(x-1), provided a solution exists. There is no optimization involved here. What is the real problem that you are trying to solve?
> If you want to solve a more meaningful constrained optimization problem, you may want to try the alabama package which I just put on CRAN. It can optimize smooth nonlinear functions subject to linear and nonlinear equality and inequality constraints.
> http://cran.r-project.org/web/packages/alabama/index.html
> Let me know if you run into any problems using it.
> Best, Ravi.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Uli Kleinwechter
Sent: Monday, July 26, 2010 10:16 AM
To: r-help@r-project.org
Subject: [R] Optimization problem with nonlinear constraint

Dear all,

I'm looking for a way to solve a simple optimization problem with a nonlinear constraint. An example would be

max x  s.t.  y = x * T^(x-1)

where y and T are known values. optim() and constrOptim() only allow for box or linear constraints, so I did not succeed here. I also found hints to donlp2, but this does not seem to be available anymore.
Any hints are welcome,
Uli
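The uniroot() approach in the reply above can be checked end to end; this self-contained sketch solves the same equation and verifies the root by substitution (using Tval rather than T as a variable name, in the spirit of the T/TRUE warning elsewhere on the list):

```r
# Solve y = x * Tval^(x - 1) for x, given y and Tval, with uniroot().
# The function is zero exactly at the solution; uniroot() needs an
# interval whose endpoints give opposite signs.
f <- function(x, y, Tval) x * Tval^(x - 1) - y

y <- 3
Tval <- 123
root <- uniroot(f, c(-10, 100), y = y, Tval = Tval)$root

# Substituting the root back should recover y up to uniroot's tolerance
abs(root * Tval^(root - 1) - y)
```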
Re: [R] T and F (was: Optimization problem with nonlinear constraint)
Some care is in order: using 'T' as a variable name is quite dangerous in R, since it is an alias for 'TRUE'. Rules to live by:

* Avoid using 'T' and 'F' as object names.
* Use 'TRUE' and 'FALSE', not 'T' and 'F'.

If you follow these, then you won't be tripped up, and you won't trip other people either.

On 28/07/2010 08:31, Uli Kleinwechter wrote:

> Dear Ravi,
> As I've already written to you, the problem indeed is to find a solution to the transcendental equation y = x * T^(x-1), given y and T; the optimization problem below was only a workaround. John C. Nash has been so kind as to help me with it. In case anyone faces a similar problem in the future, the solution looks as follows:
>
> func1 <- function(y, x, T) {
>     out <- x * T^(x - 1) - y
>     return(out)
> }
>
> # Assign the known values to y and T:
> y <- 3
> T <- 123
>
> root <- uniroot(func1, c(-10, 100), y = y, T = T)
> print(root)
>
> Somewhat simpler than I thought. Thanks again!
> Uli
>
> On 26.07.2010 17:44, Ravi Varadhan wrote:
>
>> Hi Uli,
>> I am not sure if this is the problem that you really want to solve. The answer is the solution to the equation y = x * T^(x-1), provided a solution exists. There is no optimization involved here. What is the real problem that you are trying to solve?
>> If you want to solve a more meaningful constrained optimization problem, you may want to try the alabama package which I just put on CRAN. It can optimize smooth nonlinear functions subject to linear and nonlinear equality and inequality constraints.
>> http://cran.r-project.org/web/packages/alabama/index.html
>> Let me know if you run into any problems using it.
>> Best, Ravi.

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Uli Kleinwechter
Sent: Monday, July 26, 2010 10:16 AM
To: r-help@r-project.org
Subject: [R] Optimization problem with nonlinear constraint

Dear all,

I'm looking for a way to solve a simple optimization problem with a nonlinear constraint. An example would be

max x  s.t.  y = x * T^(x-1)

where y and T are known values.
optim() and constrOptim() only allow for box or linear constraints, so I did not succeed here. I also found hints to donlp2, but this does not seem to be available anymore.

Any hints are welcome,
Uli

--
Patrick Burns
pbu...@pburns.seanet.com
http://www.burns-stat.com
(home of 'Some hints for the R beginner' and 'The R Inferno')
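A minimal demonstration of the hazard described above (safe to run; it restores T afterwards):

```r
# T and F are ordinary variables that merely default to TRUE/FALSE;
# TRUE and FALSE are reserved words and cannot be reassigned.
stopifnot(isTRUE(T))   # in a fresh workspace, T is TRUE

T <- 123               # perfectly legal -- and silently breaks code using T
stopifnot(!isTRUE(T))

rm(T)                  # removing the masking variable restores base's T
stopifnot(isTRUE(T))
```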
Re: [R] Using jpeg() without X
On Tue, 27 Jul 2010, mic wrote:

> When I tried this, I got the error below. Can somebody help me with this? Are there any alternatives or workarounds? I'm having a hard time convincing our admin to install the X11 library and headers, since they are not included in the default OS installation.

You could install R from an RPM: you only need the X11 headers to build R. (It is possible, if tricky, to do that in a user account, but you might persuade the sysadmin to do so.)

You could use the bitmap() device, if gs is installed.

You could use some of the third-party alternatives (packages Cairo, GDD ...) *but* you almost certainly don't have the -devel RPMs they depend on either. (I don't think the -devel RPMs needed for jpeg are in 'the default OS installation', but it depends on which default.)

You could install a copy of X11 (and, preferably, cairographics) from the sources in your own space.

Finally, you could talk to the admin's line manager about his/her employee's obstructive attitude.

> Thanks in advance :)
>
> > jpeg("test.jpg")
> Error in jpeg("test.jpg") : X11 is not available
>
> > sessionInfo()
> R version 2.11.1 (2010-05-31)
> i686-pc-linux-gnu
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> > capabilities()
>     jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
>    FALSE    FALSE    FALSE    FALSE    FALSE    FALSE     TRUE     TRUE
>   libxml     fifo   cledit    iconv      NLS  profmem    cairo
>     TRUE     TRUE     TRUE     TRUE     TRUE    FALSE    FALSE
>
> Seems that libjpeg is available on our server:
>
> [r...@localhost R-2.11.1]# locate libjpeg
> /usr/lib/libjpeg.so
> /usr/lib/libjpeg.so.62
> /usr/lib/libjpeg.so.62.0.0

Interesting: the first is in the libjpeg-devel RPM, so the sysadmin has installed some unnecessary software already.

> I'm using Fedora 12 and compiled the newest version of R.
> Here are the steps I took before I ran that command:
>
> ./configure --with-x=no --with-tcltk=no
>
> Here's the message after the command:
>
> R is now configured for i686-pc-linux-gnu
>
>   Source directory:          .
>   Installation directory:    /usr/local
>
>   C compiler:                gcc -std=gnu99  -g -O2
>   Fortran 77 compiler:       gfortran  -g -O2
>
>   C++ compiler:              g++  -g -O2
>   Fortran 90/95 compiler:    gfortran -g -O2
>   Obj-C compiler:
>
>   Interfaces supported:
>   External libraries:        readline
>   Additional capabilities:   JPEG, NLS
>   Options enabled:           shared BLAS, R profiling, Java
>
>   Recommended packages:      yes
>
> ...
> make
> make install
> R
> > jpeg("test.jpg")
> Error in jpeg("test.jpg") : X11 is not available

--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[R] Time-dependent covariates in survreg function
Dear all,

I'm asking this question again as I didn't get a reply last time:

I'm doing a survival analysis with time-dependent covariates. Until now, I have used a simple Cox model for this, specifically the coxph function from the survival package. Now, I would like to try out an accelerated failure time model with a parametric specification, as implemented for example in the survreg function.

Two questions: First, can survreg handle time-dependent covariates? The description for this function does not make reference to them. And second, in case survreg cannot deal with time-dependent covariates, is there a similar function in some other package that can?

Thanks very much,

Michael

Michael Haenlein
Associate Professor of Marketing
ESCP Europe
Paris, France
[R] Understanding how R works
Hello everyone,

I am more than new to R. Today I started reading about the grf function included in the geoR package. According to the manual (vignette?):

"grf() generates (unconditional) simulations of Gaussian random fields for given covariance parameters. geoR2RF converts the model specification used by geoR to the corresponding one in RandomFields."

I would like to use this function to fill a raster layer with values. According to the manual, grf() returns a list:

Value

grf returns a list with the components:

coords        an n x 2 matrix with the coordinates of the simulated data.
data          a vector (if nsim = 1) or a matrix with the simulated values. For the latter each column corresponds to one simulation.
cov.model     a string with the name of the correlation function.
nugget        the value of the nugget parameter.
cov.pars      a vector with the values of sigma^2 and phi, respectively.
kappa         value of the parameter kappa.
lambda        value of the Box-Cox transformation parameter lambda.
aniso.pars    a vector with values of the anisotropy parameters, if provided in the function call.
method        a string with the name of the simulation method used.
sim.dim       a string "1d" or "2d" indicating the spatial dimension of the simulation.
.Random.seed  the random seed by the time the function was called.
messages      messages produced by the function describing the simulation.
call          the function call.

geoR2grf returns a list with the components:

model   RandomFields name of the correlation model
param   RandomFields parameter vector

And this is exactly the point I do not get. In the example section of the same function I can find the following:

sim1 <- grf(100, cov.pars = c(1, .25))
# a display of simulated locations and values
points(sim1)

How does the function points() understand what exactly to read from the list?
I would like to thank you in advance for your replies.

Best Regards,
Alex
Re: [R] Time-dependent covariates in survreg function
Dear Michael,

AFAIK survreg() cannot handle time-dependent covariates. In particular, things get more complicated under the accelerated failure time framework when it comes to the handling of time-dependent covariates.

Moreover, it is important (for both Cox and AFT models) to determine what type of time-dependent covariate you have in your problem. Is it exogenous (aka external) or endogenous (aka internal)? For more info have a look at [1, Section 6.3]. For exogenous time-dependent covariates you can still use the time-dependent version of the Cox model. However, for endogenous ones you might have considerable bias if you do so.

For endogenous time-dependent covariates an alternative is to use package JM (http://www.jstatsoft.org/v35/i09, http://CRAN.R-project.org/package=JM, http://rwiki.sciviews.org/doku.php?id=packages:cran:jm), which has both relative risk and AFT models available.

I hope it helps.

Best,
Dimitris

[1] Kalbfleisch, J. and Prentice, R. (2002). The Statistical Analysis of Failure Time Data, 2nd edition. John Wiley & Sons, New York.

On 7/28/2010 10:06 AM, Michael Haenlein wrote:

> Dear all,
> I'm asking this question again as I didn't get a reply last time:
> I'm doing a survival analysis with time-dependent covariates. Until now, I have used a simple Cox model for this, specifically the coxph function from the survival package. Now, I would like to try out an accelerated failure time model with a parametric specification, as implemented for example in the survreg function.
> Two questions: First, can survreg handle time-dependent covariates? The description for this function does not make reference to them. And second, in case survreg cannot deal with time-dependent covariates, is there a similar function in some other package that can?
> Thanks very much,
> Michael
>
> Michael Haenlein
> Associate Professor of Marketing
> ESCP Europe
> Paris, France

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Re: [R] How to deal with more than 6GB dataset using R?
Matthew,

You might want to look at the function read.table.ffdf in the ff package, which can read large csv files in chunks and store the result in a binary format on disk that can be quickly accessed from R. ff allows you to access complete columns (returned as a vector or array) or subsets of the data identified by row positions (and column selection, returned as a data.frame).

As Jim pointed out: it all depends on what you are doing with the data. If you want to access subsets not by row position but rather by search conditions, you are better off with an indexed database.

Please let me know if you write a fast read.fwf.ffdf -- we would be happy to include it in the ff package.

Jens
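The chunked-reading idea behind read.table.ffdf can be sketched in base R (the function name, file handling and the row-counting `fun` are my own illustration, not part of ff):

```r
# Process a large CSV in fixed-size chunks instead of loading it whole.
# `fun` is applied to each chunk; its results are summed (here: row counts).
process_in_chunks <- function(file, chunk_size = 10000L, fun = nrow) {
  con <- file(file, open = "r")
  on.exit(close(con))
  # the first chunk also consumes the header line and fixes column names
  chunk <- read.csv(con, nrows = chunk_size)
  cols <- names(chunk)
  total <- 0
  while (!is.null(chunk) && nrow(chunk) > 0) {
    total <- total + fun(chunk)
    # later reads continue from where the open connection left off
    chunk <- tryCatch(
      read.csv(con, nrows = chunk_size, header = FALSE, col.names = cols),
      error = function(e) NULL)  # read.csv errors when the input is exhausted
  }
  total
}
```

With ff itself the chunks are kept, stored in an on-disk ffdf object, rather than discarded after `fun` runs.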
Re: [R] sqldf 0.3-5 package or tcltk problem
On Wed, Jul 28, 2010 at 1:21 AM, erickso...@aol.com wrote:

> This is my first post. I am running Mac OS X version 10.6.3 and R 2.11.0 GUI 1.33 64-bit. This may or may not be related to sqldf, but I experienced this problem while attempting to use an sqldf query. The same code runs with no problem on my Windows machine. Here is what happens:
>
> r <- sqldf("select ... ")
> Loading required package: tcltk
> Loading Tcl/Tk interface ...
>
> Then it never loads. I have X11 open. I have all the latest versions of all the necessary packages for sqldf 0.3-5:
>
> DBI 0.2-5
> RSQLite 0.9-1
> RSQLite.extfuns 0.0.1
> gsubfn 0.5-3
> proto 0.3-8
> chron 2.3-35
>
> Although it gives warning messages for these:
>
> package 'sqldf' was built under R version 2.11.1
> package 'RSQLite' was built under R version 2.11.1
> package 'RSQLite.extfuns' was built under R version 2.11.1
> package 'gsubfn' was built under R version 2.11.1
>
> What can I do to load the Tcl/Tk interface?

Some things to try:
- upgrade to R 2.11.1
- try this alone: library(tcltk)
[R] Re: Understanding how R works
Hi,

r-help-boun...@r-project.org wrote on 28.07.2010 10:08:37:

> How does the function points() understand what exactly to read from the list?
You can track R functions sometimes by typing the name of the function, or by looking whether there are methods for some object, e.g.:

> methods(points)
[1] points.default  points.formula* points.table*

   Non-visible functions are asterisked

> points.default
function (x, y = NULL, type = "p", ...)
plot.xy(xy.coords(x, y), type = type, ...)
<environment: namespace:graphics>

> xy.coords
function (x, y = NULL, xlab = NULL, ylab = NULL, log = NULL, recycle = FALSE)
{
    if (is.null(y)) {
        ylab <- xlab
        if (is.language(x)) {
            if (inherits(x, "formula") && length(x) == 3) {
                ylab <- deparse(x[[2L]])
                xlab <- deparse(x[[3L]])
                y <- eval(x[[2L]], environment(x), parent.frame())
                x <- eval(x[[3L]], environment(x), parent.frame())
            }
            else stop("invalid first argument")
        }
        else if (inherits(x, "ts")) {
            y <- if (is.matrix(x)) x[, 1] else x
            x <- stats::time(x)
            xlab <- "Time"
        }
        else if (is.complex(x)) {
            y <- Im(x)
            x <- Re(x)
            xlab <- paste("Re(", ylab, ")", sep = "")
            ylab <- paste("Im(", ylab, ")", sep = "")
        }
    ...

So you could check whether there is a points method for "grf" objects.

Regards,
Petr
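The mechanism being described here -- points() dispatching on the class of its argument -- can be shown with a toy class (all names below are made up for illustration, not taken from geoR):

```r
# points() is an S3 generic: points(sim1) looks up a method named
# points.<class>, using class(sim1). A toy class to demonstrate:
toy <- list(coords = cbind(x = 1:3, y = 4:6), data = c(10, 20, 30))
class(toy) <- "toyfield"

# The method decides which list components to use -- just as a points
# method for "grf" objects could read $coords and $data from the list.
points.toyfield <- function(x, ...) {
  cat("would plot", nrow(x$coords), "locations\n")
}

points(toy)   # dispatches to points.toyfield
```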
Re: [R] Installing the Zoo *package*
It's a package, not a library ... I've changed the subject line where you had "library".

A library in R is a directory in which packages are stored, and which you see by typing 'library()'; or a file with a collection of *compiled* C, Fortran, C++, etc. functions (sometimes called a DLL, notably on Windows), which in R is loaded by dyn.load() or library.dynam() -- typically implicitly when a package containing compiled code is loaded.

As you declare yourself an R newbie, one of the early steps will be to use precise language. Hence, do watch your language :-)

Martin Maechler, ETH Zurich
(yes, I'm vacationing and having some extra time .. :-)
Re: [R] [Rd] R CMD build wiped my computer
Hi Martin,

I think this is the most likely reason, given that the name in the DESCRIPTION file does NOT have a version number. Even so, it is very easy to misname a file and then delete it/change its name (as I've done here), and I hope current versions of R would not cause this problem. Perhaps Fedora should not use ~ as its backup-file suffix?

Cheers,
Jarrod

On 28 Jul 2010, at 11:41, Martin Maechler wrote:

> Jarrod Hadfield j.hadfi...@ed.ac.uk on Tue, 27 Jul 2010 21:37:09 +0100 writes:
>
>> Hi, I ran R (version 2.9.0) CMD build under root in Fedora (9). When it tried to remove junk files it removed EVERYTHING in my local account! (See below). Can anyone tell me what happened?
>
> The culprit may lie here:
>
>   * removing junk files
>   unlink MCMCglmm_2.05/R/residuals.MCMCglmm.R
>   ~
>
> where it seems that someone (you?) has added a newline to the filename, so that instead of 'residuals.MCMCglmm.R~' you got 'residuals.MCMCglmm.R
> ~', and the unlink / rm command interpreted '~' as your home directory. But I can hardly believe it. This explanation seems a bit doubtful to me...
>
>> ... and even more importantly, can I restore what was lost?
>
> Well, you just get it from the backup. You do daily backups, do you?
>
> Regards,
> Martin Maechler, ETH Zurich

>> Panickingly,
>> Jarrod
>>
>> [jar...@localhost AManal]$ R CMD build MCMCglmm_2.05
>> * checking for file 'MCMCglmm_2.05/DESCRIPTION' ... OK
>> * preparing 'MCMCglmm_2.05':
>> * checking DESCRIPTION meta-information ... OK
>> * cleaning src
>> * installing the package to re-build vignettes
>> * Installing *source* package 'MCMCglmm' ...
>> ** libs
>> g++ -m64 -I/usr/include/R -I/usr/local/include -fpic -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -c MCMCglmm.cc -o MCMCglmm.o
>> MCMCglmm.cc: In function 'void MCMCglmm(double*, double*, double*, int*, int*, int*, int*, int*, int*, double*, int*, int*, double*, int*, int*, double*, double*, int*, int*, int*, int*, int*, int*, int*, int*, int*, int*, double*, double*, double*, int*, int*, int*, int*, double*, double*, double*, int*, double*, bool*, double*, double*, int*, int*, int*, int*, int*, double*, int*, int*, int*, double*, double*, double*, int*, int*, double*, int*, int*, int*, int*, double*, double*, double*, double*)':
>> MCMCglmm.cc:228: warning: 'pvLS' may be used uninitialized in this function
>> MCMCglmm.cc:227: warning: 'pvLL' may be used uninitialized in this function
>> MCMCglmm.cc:225: warning: 'Lrv' may be used uninitialized in this function
>> [some thirty further "may be used uninitialized in this function" warnings for pmuL, pvL, LambdaX, bv_tmp, bv, A, dimG, AlphainvS, AlphainvL, alphalocation_tmp, alphalocation, alphazstar, alphapred, alphaastar, muAlpha, Alphainv, tXalpha, Xalpha, linky_orig, tYKrinvYS, LambdaS, tYKrinvYL, LambdaLU, tYKrinvY, tYKrinv, ILY, tY, Y, I and alphaS; the log breaks off here in the archive]
[R] message box
Hi,

I need some help figuring out how to make a pop-up message box appear with error messages when running a script using Rterm -- possibly with the ability to either continue or abort the script.

Windows XP, R 2.10.1.

Thanks,
M
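A hedged sketch of one way to do this: winDialog() in the utils package shows a native message box on Windows builds of R; the helper name and the non-Windows console fallback are my own inventions.

```r
# Pop up a message box from a script run under Rterm on Windows and
# ask whether to continue; fall back to the console elsewhere.
ask_continue <- function(msg) {
  if (.Platform$OS.type == "windows") {
    # type = "yesno" shows Yes/No buttons and returns "YES" or "NO"
    utils::winDialog(type = "yesno",
                     message = paste(msg, "-- continue?")) == "YES"
  } else {
    ans <- readline(paste(msg, "-- continue? (y/n) "))
    tolower(substring(ans, 1, 1)) == "y"
  }
}

# Typical use around a risky step (risky_step is a placeholder):
# tryCatch(risky_step(), error = function(e) {
#   if (!ask_continue(conditionMessage(e))) quit(save = "no", status = 1)
# })
```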
[R] print(list) ?show all values?
Hello!

I'm playing around with rJava... Whenever I print(list), it only shows me the first values, and then "(900534 more values follow)"... How can I get around this? What am I doing wrong? I want all values to be printed.
[R] Introductory statistics and introduction to R
Hi, I have a bright, diligent second-year graduate student who wants to learn statistics and R and will, in effect, be taking a tutorial from me on these subjects. (If you've seen some of my questions on this list, please don't laugh.) As an undergrad he majored in philosophy, so this will be his first foray into computer programming and statistics. I'm thinking of having him use Introductory Statistics with R by Peter Dalgaard, but I'm unable to tell if the book requires calculus. I don't think this student knows calculus, so this would be a deal breaker. Can someone tell me if my student can get through this book starting out with just knowledge of algebra? Also, do you have other suggestions for texts, manuals, web sites, etc. that would introduce statistics and R simultaneously? Thanks, Marsh Feldman
[R] Data frame modification
Hi, I am trying to modify a data frame D with columns x and y so that if a value in x == 0, it is replaced with the last non-zero value in x. I.e. a for loop over i: if (D$x[i] == 0) D$x[i] <- D$x[i-1]. The data frame is quite large, ~43000 rows, and this operation is taking a large amount of time. Can someone please suggest what might be the reason? Thanks. Regards, Siddharth. Sent on my BlackBerry® from Vodafone
[R] How to point a column of dataframe by a character
Hello, here is a dilemma I have had for a long time but couldn't figure out. I have a vector Y and a data frame named data, which contains all the Xs. I am trying to be more efficient in fitting a simple linear regression with each X. Firstly:

for (i in 1:(dim(data)[2])) {
  model <- lm(Y ~ data[, i])
  # this is not what I want, since the name of the coefficient will be data[, i]
  # I need the coefficient name to be the name of each variable, for instance:
  # Coefficients:
  # (Intercept)  data[, 1]
  #     24.2780    -0.3381
}

Second try! I first create a vector of characters (Xs) that contains the possible names of X. This vector is exactly the same as the colnames of the data:

Xs <- c("a", "b", "c")
for (i in 1:length(Xs)) {
  model <- lm(Y ~ data[, Xs[i]])
  # Again, not what I want:
  # Coefficients:
  # (Intercept)  data[, Xs[1]]
  #     24.2780        -0.3381
}

Thus, how can I solve this dilemma? I am looking for a function that maps the name of a variable to the values of that variable. That is, if I first attach the data, I can type a to pull out the values of data[, "a"] (a numeric vector) directly; but using Xs[1] gives me only the character "a". Is there a function such that some_function(Xs[1]) gives me the values of data[, "a"]? Any help is appreciated. Tony
[R] finding the next highest number in an array
Hi, I have a sorted array (in ascending order) and I want to find the subscript of the number in the array which is the next highest number to a given number. For example, if I am given 67 and have the vector x = c(23,36,45,62,79,103,109), how do I get the subscript 5 from x (to get 79, the next highest to 67) without using a for loop? Thx -- 'Raghu'
Re: [R] re-sampling of large scale data
On Jul 27, 2010, at 6:44 PM, jd6688 wrote: I am trying to do the following to accomplish the tasks; can anybody simplify the solution? Thanks.

for (i in 1:1) {
  d <- apply(s, 2, sample)
  pos_neg_tem <- t(apply(d, 1, doit))
  if (i > 1) {
    pos_neg_pool <- rbind(pos_neg_pool, pos_neg_tem)
  } else {
    pos_neg_pool <- pos_neg_tem
  }
}

A bit of efficiency advice: incremental creation of objects is generally a major source of slowness. Consider creating pos_neg_pool before the loop and then filling it in within the loop. It would also let you remove that if{}else{} statement. -- David Winsemius, MD, West Hartford, CT
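A minimal sketch of the preallocation David suggests. The matrix s and the doit() function below are hypothetical stand-ins (the original poster's are not shown); the point is only that pos_neg_pool is sized once, before the loop, and rows are filled in place instead of growing via rbind():

```r
set.seed(42)
s <- matrix(rnorm(20), nrow = 5)                           # stand-in data (assumption)
doit <- function(r) c(pos = sum(r > 0), neg = sum(r < 0))  # stand-in for doit (assumption)
nrep <- 3                                                  # number of resampling rounds

# allocate the result once, before the loop
pos_neg_pool <- matrix(NA_real_, nrow = nrep * nrow(s), ncol = 2)

for (i in 1:nrep) {
  d <- apply(s, 2, sample)                          # permute each column
  rows <- ((i - 1) * nrow(s) + 1):(i * nrow(s))     # this round's slice
  pos_neg_pool[rows, ] <- t(apply(d, 1, doit))      # fill in place, no rbind()
}
```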
Re: [R] finding the next highest number in an array
a couple of the many possible ways are:

x <- c(23, 36, 45, 62, 79, 103, 109)
thr <- 67
x[x > thr][1]
head(x[x > thr], 1)

I hope it helps. Best, Dimitris

On 7/28/2010 12:12 PM, Raghu wrote: Hi, I have a sorted array (in ascending order) and I want to find the subscript of the number in the array which is the next highest number to a given number. For example, if I am given 67 and have the vector x = c(23,36,45,62,79,103,109), how do I get the subscript 5 from x (to get 79, the next highest to 67) without using a for loop? Thx

-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands. Tel: +31/(0)10/7043478, Fax: +31/(0)10/7043014
[R] memory problem for scatterplot using ggplot
Dear all, I have a memory problem in making a scatter plot of my 17.5 million-pair dataset. My intention is to use the ggplot package and its bin2d. Please find the attached script for more details. Could somebody please give me any clues or tips to solve my problem? Just for additional information: I'm running my R script on a 32-bit machine, Ubuntu 9.10, hardware: AMD Athlon Dual Core Processor 5200B, memory: 1.7GB. Many thanks in advance. Kind Regards, -- Ir. Edwin H. Sutanudjaja, Dept. of Physical Geography, Faculty of Geosciences, Utrecht University
[R] Help with scatterplots in R
Hi, When I plot a scatter plot, R automatically only gives the years on the x-axis. How can I make R also show the months on the x-axis? Thank you very much!
Re: [R] easy debugging
"ycw" == ying chen wang gracedrop.w...@gmail.com on Mon, 26 Jul 2010 17:10:24 -0400 writes:

ycw> Yes, thanks. Just found out the solution. Thanks for the help.
ycw> Just started R. Not familiar with its environment. G

ycw> On Mon, Jul 26, 2010 at 5:08 PM, jim holtman jholt...@gmail.com wrote: Is this what you want:

equated <- c(111.0, 112.06, 112.9, 113.8, 115.0, 116.2, 117.0, 118.0, 120.5, 120.5, 120.5)
equated[equated > 120] <- 120
equated
[1] 111.00 112.06 112.90 113.80 115.00 116.20 117.00 118.00 120.00 120.00 120.00

Note that for the particular question, pmin(120, equated) is more efficient than any of the other versions you've mentioned. Martin Maechler, ETH Zurich

You should read up on 'indexing' in the R Intro paper. On Mon, Jul 26, 2010 at 1:26 PM, ying_chen wang gracedrop.w...@gmail.com wrote: I am new to R and used to use FORTRAN; R is so different from FORTRAN. The following code would work in FORTRAN. I am trying to put an upper limit at 120: if the score is above 120, it is assigned 120; otherwise, keep the original value.

version 1:

equated <- 11
result <- 11
equated <- c(111.0, 112.06, 112.9, 113.8, 115.0, 116.2, 117.0, 118.0, 120.5, 120.5, 120.5)
for (i in 1:11) {
  if (equated[i] > 120) result[i] <- 120
  if (equated[i] <= 120) result[i] <- equated[i]
}
result

version 2:

if (equated > 120) result <- 120
if (equated <= 120) result <- equated

If any of you can help, I would appreciate that. G

-- Jim Holtman, Cincinnati, OH, +1 513 646 9390. What is the problem that you are trying to solve?
[R] strange error : isS4(x) in gamm function (mgcv package). Variable in data-frame not recognized???
Dear all, I run a gamm with the following call:

result <- try(gamm(values ~ s(VM) + s(RH) + s(TT) + s(PP) + RF + weekend + s(day) + s(julday),
                   correlation = corCAR1(form = ~ day | month), data = tmp))

with mgcv version 1.6.2. No stress about the data; the error is not data-related. I get:

Error in isS4(x) : object 'VM' not found

How so? I did define the data frame to be used, and the data frame contains a variable VM:

str(tmp)
'data.frame': 4014 obs. of 12 variables:
 $ values : num 73.45 105.45 74.45 41.45 -4.55 ...
 $ dates  : Class 'Date' num [1:4014] 9862 9863 9864 9865 9866 ...
 $ year   : num -5.65 -5.65 -5.65 -5.65 -5.65 ...
 $ day    : num -178 -177 -176 -175 -174 ...
 $ month  : Factor w/ 156 levels "1996-April","1996-August",..: 17 17 17 17 17 17 17 17 17 17 ...
 $ julday : num -2241 -2240 -2239 -2238 -2237 ...
 $ weekend: num -0.289 -0.289 -0.289 0.711 0.711 ...
 $ VM     : num 0.139 -1.451 0.349 0.839 -0.611 ...
 $ RH     : num 55.2 61.4 59.8 64.1 60.7 ...
 $ TT     : num -23.4 -23.6 -19.5 -16.1 -15.3 ...
 $ PP     : num 6.17 4.27 -4.93 -9.23 -2.63 ...
 $ RF     : Ord.factor w/ 3 levels None2.5mm..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, means)= num

Any idea what I'm missing here? Cheers, Joris

-- Joris Meys, Statistical consultant, Ghent University, Faculty of Bioscience Engineering, Department of Applied mathematics, biometrics and process control. tel: +32 9 264 59 87, joris.m...@ugent.be. Disclaimer: http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] re-sampling of large scale data
On Jul 28, 2010, at 12:09 AM, jd6688 wrote:

d <- apply(s, 2, sample, size = 1*nrow(s), replace = TRUE)

Why does the code above return the following error?

Error: cannot allocate vector of size 218.8 Mb

Possibilities: Your workspace is full of other junk? Your workspace used to be full of other junk and its memory is too fragmented to find a contiguous chunk of memory? Your computer is full of other junk? You have not read the R-FAQ (or the RW-FAQ) items on the topic of memory usage on whatever operating system you are working with.

-- David Winsemius, MD, West Hartford, CT
[R] kde on Torus
Hello, I have 2D-data on a torus, i.e. the points are scattered within [0, 2pi) and are supposed to be periodic with period 2pi. Is there a way in R to do a kernel density estimation for such data? I found this article: http://www.dmqte.unich.it/personal/dimarzio/density46.pdf but a) I don't fully understand the article (my knowledge of statistics is poor), b) I did not understand which equation represents the kernel(s), and c) I do not know R well enough to tell whether I can use kde2d or npudens with an arbitrary kernel. My simple-minded attempt was to extend the data to [-2pi, 4pi) and then use kde2d, but I am wondering a) how accurate this is and b) whether there is a way to do it properly on a torus. Thanks a lot, Tim

-- Tim Gruene, Institut fuer anorganische Chemie, Tammannstr. 4, D-37077 Goettingen. GPG Key ID = A46BEE1A
Re: [R] message box
Martin du Saire asked: I need some help figuring out how to make a pop-up message box appear with error messages when running a script using Rterm. Windows XP, R 2.10.1.

Have you tried the examples compiled by James Wettenhall (wettenh...@wehi.edu.au)? http://bioinf.wehi.edu.au/~wettenhall/RTclTkExamples/mb.html (the full set is at http://bioinf.wehi.edu.au/%7Ewettenhall/RTclTkExamples/RTclTkExamples.zip). Bruno.
Re: [R] finding the next highest number in an array
On Jul 28, 2010, at 7:28 AM, Dimitris Rizopoulos wrote: a couple of the many possible ways are: x <- c(23,36,45,62,79,103,109); thr <- 67; x[x > thr][1]; head(x[x > thr], 1)

Since he wanted the subscript rather than the number, wouldn't it be:

which(x > 67)[1]
head(which(x > 67), 1)

-- David.

I hope it helps. Best, Dimitris. On 7/28/2010 12:12 PM, Raghu wrote: Hi, I have a sorted array (in ascending order) and I want to find the subscript of the number in the array which is the next highest number to a given number. For example, if I am given 67 and have the vector x = c(23,36,45,62,79,103,109), how do I get the subscript 5 from x (to get 79, the next highest to 67) without using a for loop? Thx

-- Dimitris Rizopoulos, Assistant Professor, Department of Biostatistics, Erasmus University Medical Center. Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands. Tel: +31/(0)10/7043478, Fax: +31/(0)10/7043014

David Winsemius, MD, West Hartford, CT
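Since x is already sorted ascending, base R's findInterval() gives the same subscript without building an intermediate logical vector: it counts how many elements are <= the given value, so adding one yields the index of the next strictly larger element (assuming the given number is below max(x), otherwise the index runs off the end):

```r
x <- c(23, 36, 45, 62, 79, 103, 109)

# findInterval(67, x) counts the elements of x that are <= 67 (here 4),
# so the next strictly larger element sits at that position + 1
idx <- findInterval(67, x) + 1
idx      # 5
x[idx]   # 79
```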
Re: [R] strange error : isS4(x) in gamm function (mgcv package). Variable in data-frame not recognized???
Dear all, it gets even weirder. After restarting R, the code I used works just fine. The call is generated in a function that I debugged using browser(). The problem is solved, but I have no clue whatsoever how that error came about. It must have something to do with namespaces, but the origin is dark. I tried to regenerate the error, but didn't succeed. Does somebody have an idea as to where I should look for a cause? Cheers, Joris

On Wed, Jul 28, 2010 at 1:16 PM, Joris Meys jorism...@gmail.com wrote: Dear all, I run a gamm with the following call: result <- try(gamm(values ~ s(VM) + s(RH) + s(TT) + s(PP) + RF + weekend + s(day) + s(julday), correlation = corCAR1(form = ~ day | month), data = tmp)) with mgcv version 1.6.2. No stress about the data; the error is not data-related. I get: Error in isS4(x) : object 'VM' not found. How so? I did define the data frame to be used, and the data frame contains a variable VM (str(tmp) output as in the previous message). Any idea what I'm missing here? Cheers, Joris

-- Joris Meys, Statistical consultant, Ghent University, Faculty of Bioscience Engineering, Department of Applied mathematics, biometrics and process control. tel: +32 9 264 59 87, joris.m...@ugent.be. Disclaimer: http://helpdesk.ugent.be/e-maildisclaimer.php
[R] Re: Data frame modification
Hi, r-help-boun...@r-project.org wrote on 28.07.2010 11:30:48: Hi, I am trying to modify a data frame D with columns x and y so that if a value in x == 0, it is replaced with the last non-zero value in x. I.e. a for loop over i: if (D$x[i] == 0) D$x[i] <- D$x[i-1]. The data frame is quite large, ~43000 rows. This operation is taking a large amount of time. Can someone please suggest what might be the reason?

Bad programming practice? I would suggest using the zoo package and the na.locf function after changing all zero values to NA. Regards, Petr
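For completeness, the zero-to-last-non-zero fill Petr describes can also be done vectorized in base R, without the zoo dependency. A minimal sketch (it assumes the fill is wanted from the first non-zero value; a leading zero would become NA):

```r
x <- c(5, 0, 0, 7, 0, 3)   # toy version of D$x (assumption)

# cumsum(x != 0) gives, at each position, how many non-zero values have
# been seen so far, i.e. an index into the vector of non-zero values
idx <- cumsum(x != 0)
filled <- c(NA, x[x != 0])[idx + 1]   # the leading NA guards a leading zero

filled   # 5 5 5 7 7 3
```

The same idea is what na.locf() does internally for the NA case, but this avoids loading a package and avoids the 43000-iteration loop entirely.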
Re: [R] Help with scatterplots in R
I think that we need an example of what you are doing before anyone can really answer that question. At the moment we don't even know how you are plotting the scatterplot.

--- On Wed, 7/28/10, Sarah Chisholm sarah.chisholm...@ucl.ac.uk wrote: Subject: [R] Help with scatterplots in R. Hi, When I plot a scatter plot, R automatically only gives the years on the x-axis. How can I make R also show the months on the x-axis? Thank you very much!
Re: [R] kde on Torus
On Jul 28, 2010, at 7:55 AM, Tim Gruene wrote: Hello, I have 2D-data on a torus, i.e. the points are scattered within [0, 2pi) and are supposed to be periodic with period 2pi. Is there a way in R to do a kernel density estimation for such data? [the rest of the question is quoted in full above]

You could take a look at the density function within the package circular.

-- David Winsemius, MD, West Hartford, CT
Re: [R] How to point a column of dataframe by a character
Hi Tony, I am sure there are other ways, but I would create formula objects and then pass them to lm(). Here's an example:

mydata <- data.frame(Y = 1:10, X1 = 11:20, X2 = 21:30)
my.names <- names(mydata)[-1]
for (i in my.names) {
  my.formula <- formula(paste("Y ~ ", i, sep = ""))
  my.lm <- lm(my.formula, data = mydata)
  print(summary(my.lm))
}

HTH, Josh

On Wed, Jul 28, 2010 at 2:35 AM, Tony lul...@gmail.com wrote: Hello, here is a dilemma I have had for a long time but couldn't figure out. I have a vector Y and a data frame named data, which contains all the Xs. I am trying to be more efficient in fitting a simple linear regression with each X. Firstly, for (i in 1:(dim(data)[2])) { model <- lm(Y ~ data[, i]) } is not what I want, since the name of the coefficient will be data[, i] rather than the name of each variable. Second try: I create a vector of characters, Xs <- c("a", "b", "c"), matching the colnames of the data, and loop over lm(Y ~ data[, Xs[i]]); again the coefficient is named data[, Xs[1]]. Is there a function such that some_function(Xs[1]) gives me the values of data[, "a"]? Any help is appreciated. Tony

-- Joshua Wiley, Ph.D. Student, Health Psychology, University of California, Los Angeles, http://www.joshuawiley.com/
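Base R's reformulate() builds the same per-predictor formulas without paste(), and the fitted coefficients then carry the variable's own name rather than data[, i], which was Tony's original complaint:

```r
mydata <- data.frame(Y = 1:10, X1 = 11:20, X2 = 21:30)

# one regression per predictor; reformulate("X1", response = "Y")
# returns the formula Y ~ X1
fits <- lapply(names(mydata)[-1], function(v)
  lm(reformulate(v, response = "Y"), data = mydata))
names(fits) <- names(mydata)[-1]

names(coef(fits$X1))   # "(Intercept)" "X1"  -- not "data[, i]"
```

Keeping the fits in a named list also avoids overwriting model on each loop iteration, which the original for loop did.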
Re: [R] [Rd] R CMD build wiped my computer
Jarrod, Noting your exchange with Martin, Martin brings up a point that certainly I missed, which is that somehow the tilde ('~') character got into the chain of events. As Martin noted, on Linuxen/Unixen (including OSX), the tilde, when used in the context of file name globbing, refers to your home directory. Thus, a command such as: ls ~ will list the files in your home directory. Similarly: rm ~ will remove the files there as well. If the -rf argument is added, then the deletion becomes recursive through that directory tree, which appears to be the case here. I am unclear, as Martin appears to be, as to the steps that caused this to happen. That may yet be related in some fashion to Duncan's hypothesis. That being said, the use of the tilde character as a suffix to denote that a file is a backup version, is not limited to Fedora or Linux, for that matter. It is quite common for many text editors (eg. Emacs) to use this. As a result, it is also common for many applications to ignore files that have a tilde suffix. Based upon your follow up posts to the original thread, it would seem that you do not have any backups. The default ext3 file system that is used on modern Linuxen, by design, makes it a bit more difficult to recover deleted files. This is due to the unlinking of file metadata at the file system data structure level, as opposed to simply marking the file as deleted in the directory structures, as happens on Windows. There is a utility called ext3undel (http://projects.izzysoft.de/trac/ext3undel), which is a wrapper of sorts to other undelete utilities such as PhotoRec and foremost. I have not used it/them, so cannot speak from personal experience. Thus it would be a good idea to engage in some reviews of the documentation and perhaps other online resources before proceeding. The other consideration is the Catch-22 of not copying anything new to your existing HD, for fear of overwriting the lost files with new data. 
So you would need to consider an approach of downloading these utilities via another computer and then running them on the computer in question from other media, such as a CD/DVD or USB HD. A more expensive option would be to use a professional data recovery service, where you would have to consider the cost of recovery versus your lost time. One option would be Kroll OnTrack UK (http://www.ontrackdatarecovery.co.uk/). I happen to live about a quarter mile from their world HQ here in a suburb of Minneapolis. I have not used them myself, but others that I know have, with good success. Again, this comes at a potentially substantial monetary cost. The key is that if you have any hope to recover the deleted files, you not copy anything new onto the hard drive in the mean time. Doing so will decrease the possibility of file recovery to near 0. As Duncan noted, there is great empathy with your situation. We have all gone through this at one time or another. In my case, it was perhaps 20+ years ago, but as a result, I am quite anal retentive about having backups, which I have done for some time on my systems, hourly. HTH, Marc Schwartz On Jul 28, 2010, at 5:55 AM, Jarrod Hadfield wrote: Hi Martin, I think this is the most likely reason given that the name in the DESCRIPTION file does NOT have a version number. Even so, it is very easy to misname a file and then delete it/change its name (as I've done here) and I hope current versions of R would not cause this problem. Perhaps Fedora should not use ~ as its back up file suffixes? Cheers, Jarrod On 28 Jul 2010, at 11:41, Martin Maechler wrote: Jarrod Hadfield j.hadfi...@ed.ac.uk on Tue, 27 Jul 2010 21:37:09 +0100 writes: Hi, I ran R (version 2.9.0) CMD build under root in Fedora (9). When it tried to remove junk files it removed EVERYTHING in my local account! (See below). 
Can anyone tell me what happened? The culprit may lie here:

* removing junk files
unlink MCMCglmm_2.05/R/ residuals.MCMCglmm.R ~

where it seems that someone (you?) added a newline in the filename, so that instead of 'residuals.MCMCglmm.R~' you got 'residuals.MCMCglmm.R' and '~', and the unlink/rm command interpreted '~' as your home directory. But I can hardly believe it; this explanation seems a bit doubtful to me. ... and even more importantly, can I restore what was lost? Well, you just get it from the backup. You do daily backups, don't you? Regards, Martin Maechler, ETH Zurich
[R] Help with specifiying random effects in lmer - psychology experiment
Hi all, I'm a psychologist moving from ANOVA to lmer for analysis of response time (and error) data from a reaction time experiment, and have a question about specifying the structure of random effects in the model. Many of the examples I am reading about involve split-plot designs where effects are nested within each other. This is not the case here: I am using a typical repeated-measures design where each subject does everything. 30 subjects judge the laterality of a hand shown on a computer screen, completing 18 trials in all combinations of the following experimental factors: Angle, 8 levels (hand shown in 8 different orientations); Laterality, 2 levels (both right and left hands shown); Condition, 3 levels (participant holds own hands in posture a, b or c).
With repeated measures ANOVA the error structure is specified as follows:

aov(percentErrors ~ angle*condition*laterality + Error(subject/(angle*condition*laterality)), data = errorData)
aov(meanRT ~ angle*condition*laterality + Error(subject/(angle*condition*laterality)), data = RTdata)

where the response variables, percentErrors and meanRT, are the percentage of errors (out of 18 trials) and the mean reaction time over 18 trials. I need to move to lmer, as variance is not constant across Angle and the error data are bounded (0,1), so I should use the binomial family. My first pass (looking at all main effects and possible interactions) is:

lmer(cbind(numErrors, numCorrect) ~ angle*condition*laterality + (angle*condition*laterality | subject), family = binomial, data = errorData)

where numErrors + numCorrect = 18 for each subject Angle-Laterality-Condition combination, and

lmer(meanRT ~ angle*condition*laterality + (angle*condition*laterality | subject), data = RTdata)

I am unsure if this is correct. Help is welcome, thanks - Nuala

Nuala Brady, School of Psychology, University College Dublin, Belfield, D4, IRELAND, +353 (0)1 716 8247, nuala.br...@ucd.ie
Re: [R] kde on Torus
Dear David, thanks for the hint. As far as I understand, the circular kernels apply only to one-dimensional data, don't they? Tim On Wed, Jul 28, 2010 at 08:57:40AM -0400, David Winsemius wrote: On Jul 28, 2010, at 7:55 AM, Tim Gruene wrote: Hello, I have 2D data on a torus, i.e. they are scattered within [0:2pi) and are supposed to be periodic with period 2pi. Is there a way in R to do a kernel density estimation for such data? I found this article http://www.dmqte.unich.it/personal/dimarzio/density46.pdf but a) I don't fully understand the article (my knowledge of statistics is poor) b) I did not understand which equation represents the kernel(s) c) I do not know R well enough to understand whether I can use kde2d or npudens with an arbitrary kernel. My simple-minded attempt was to extend the data to [-2pi:4pi) and then use kde2d, but I am wondering a) how accurate this is b) whether there is a way to do it properly on a torus. You could take a look at the density function within the package, circular. -- David Winsemius, MD West Hartford, CT -- -- Tim Gruene Institut fuer anorganische Chemie Tammannstr. 4 D-37077 Goettingen GPG Key ID = A46BEE1A signature.asc Description: Digital signature __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
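The "simple-minded" extension the poster describes can be written down concretely (a sketch; torus_kde is a hypothetical helper, not from the thread). Because every point is replicated at all nine shifts of +/- 2*pi in each coordinate, the density returned by kde2d integrates to roughly 1/9 over the [0, 2*pi) window, so scale z by 9 if normalisation matters:

```r
## Tile the angular data so kde2d sees the periodicity near the edges,
## then evaluate only on the original [0, 2*pi) x [0, 2*pi) window.
## th1, th2 are the two angular coordinates of the torus data.
library(MASS)

torus_kde <- function(th1, th2, n = 64) {
  shifts <- c(-2 * pi, 0, 2 * pi)
  grid <- expand.grid(s1 = shifts, s2 = shifts)   # all 9 shift pairs
  x <- unlist(lapply(grid$s1, function(s) th1 + s))
  y <- unlist(lapply(grid$s2, function(s) th2 + s))
  est <- kde2d(x, y, n = n, lims = c(0, 2 * pi, 0, 2 * pi))
  est$z <- est$z * 9   # approximate renormalisation for the 9 copies
  est
}
```

This remains an approximation: the bandwidth is still chosen for Euclidean data, which is exactly the issue the circular-kernel literature addresses.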
Re: [R] xYplot error
Frank E Harrell Jr, Professor and Chairman, School of Medicine Department of Biostatistics, Vanderbilt University On Wed, 28 Jul 2010, Kang Min wrote: Hi Frank, Thanks for the suggestion. Using numericScale() does work for Dotplot, but there are still a few issues: 1. My factor names are Plot A, PF, MSF, and YSF, so numericScale turns that into 3, 2, 1, 4 and the x-axis is plotted 1, 2, 3, 4. Is there any way I can retain the same order on the graph? Not sure why you are using numericScale. You can use the original factor variable. If you need to re-order its levels for plotting, use reorder.factor. 2. I can't get the error bars displayed even after using method=bars, only the mean, lower and upper bounds of the data as points. This is the line I used: Dotplot(cbind(mort, mort + stand, mort - stand) ~ numericScale(site) | type, data = mort, method=bands) That looks OK but I can't test it right now. Please continue to have a look, and if you still don't see the problem provide a tiny reproducible example with self-contained data I can access. Frank Thanks for your help. KM On Jul 27, 9:58 pm, Frank Harrell f.harr...@vanderbilt.edu wrote: If the x-axis variable is really a factor, xYplot will not handle it. You probably need a dot chart instead (see Hmisc's Dotplot). Note that it is unlikely that the confidence intervals are really symmetric. Frank On Tue, 27 Jul 2010, Kang Min wrote: Hi, I'm trying to plot a graph with error bars using xYplot in the Hmisc package. My data looks like this:
mort         stand        site    type
0.042512776  0.017854525  Plot A  ST
0.010459803  0.005573305  PF      ST
0.005188321  0.006842107  MSF     ST
0.004276068  0.011592129  YSF     ST
0.044586495  0.035225266  Plot A  LD
0.038810662  0.037355408  PF      LD
0.027567430  0.020523820  MSF     LD
0.024698872  0.020320976  YSF     LD
Having read previous posts on xYplot being unable to plot the x-axis as factors, I used numericScale, but I still get this error.
Error in label.default(xv, units = TRUE, plot = TRUE, default = as.character(xvname), : the default string cannot be of length greater then one I used: xYplot(cbind(mort, mort + stand, mort - stand) ~ numericScale(site) | type, method=bars) Am I missing something or doing something wrong? Thanks. KM __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
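Frank's suggestion might look like the following untested sketch. It assumes Hmisc's Cbind (capital C), which is how the xYplot/Dotplot family is told which columns are the point estimate and the error-bar limits, and it keeps site as a factor with an explicitly fixed level order instead of numericScale:

```r
## Sketch: fix the factor level order, then let Dotplot draw the
## intervals via Cbind(estimate, lower, upper). Column names follow
## the poster's data frame, which is also (confusingly) called mort.
library(Hmisc)

mort$site <- factor(mort$site, levels = c("Plot A", "PF", "MSF", "YSF"))
Dotplot(Cbind(mort, mort - stand, mort + stand) ~ site | type,
        data = mort)
```

Using the factor directly also answers issue 1: the axis labels come from the levels in the order given, with no need to map them to 1:4 by hand.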
Re: [R] [Rd] R CMD build wiped my computer
I doubt if it will be helpful for Jarrod, but when I hear stories like this, I remember reading about a similar event and a remarkable partial recovery here: http://www.justpasha.org/folk/rm.html -- Kevin On Wed, Jul 28, 2010 at 8:04 AM, Marc Schwartz marc_schwa...@me.com wrote: Jarrod, Noting your exchange with Martin, Martin brings up a point that certainly I missed, which is that somehow the tilde ('~') character got into the chain of events. As Martin noted, on Linuxen/Unixen (including OSX), the tilde, when used in the context of file name globbing, refers to your home directory. Thus, a command such as: ls ~ will list the files in your home directory. Similarly: rm ~ will remove the files there as well. If the -rf argument is added, then the deletion becomes recursive through that directory tree, which appears to be the case here. I am unclear, as Martin appears to be, as to the steps that caused this to happen. That may yet be related in some fashion to Duncan's hypothesis. That being said, the use of the tilde character as a suffix to denote that a file is a backup version is not limited to Fedora or Linux, for that matter. It is quite common for many text editors (e.g. Emacs) to use this. As a result, it is also common for many applications to ignore files that have a tilde suffix. Based upon your follow-up posts to the original thread, it would seem that you do not have any backups. The default ext3 file system that is used on modern Linuxen, by design, makes it a bit more difficult to recover deleted files. This is due to the unlinking of file metadata at the file system data structure level, as opposed to simply marking the file as deleted in the directory structures, as happens on Windows. There is a utility called ext3undel (http://projects.izzysoft.de/trac/ext3undel), which is a wrapper of sorts to other undelete utilities such as PhotoRec and foremost.
I have not used it/them, so cannot speak from personal experience. Thus it would be a good idea to engage in some reviews of the documentation and perhaps other online resources before proceeding. The other consideration is the Catch-22 of not copying anything new to your existing HD, for fear of overwriting the lost files with new data. So you would need to consider an approach of downloading these utilities via another computer and then running them on the computer in question from other media, such as a CD/DVD or USB HD. A more expensive option would be to use a professional data recovery service, where you would have to consider the cost of recovery versus your lost time. One option would be Kroll OnTrack UK ( http://www.ontrackdatarecovery.co.uk/). I happen to live about a quarter mile from their world HQ here in a suburb of Minneapolis. I have not used them myself, but others that I know have, with good success. Again, this comes at a potentially substantial monetary cost. The key is that if you have any hope to recover the deleted files, you not copy anything new onto the hard drive in the mean time. Doing so will decrease the possibility of file recovery to near 0. As Duncan noted, there is great empathy with your situation. We have all gone through this at one time or another. In my case, it was perhaps 20+ years ago, but as a result, I am quite anal retentive about having backups, which I have done for some time on my systems, hourly. HTH, Marc Schwartz On Jul 28, 2010, at 5:55 AM, Jarrod Hadfield wrote: Hi Martin, I think this is the most likely reason given that the name in the DESCRIPTION file does NOT have a version number. Even so, it is very easy to misname a file and then delete it/change its name (as I've done here) and I hope current versions of R would not cause this problem. Perhaps Fedora should not use ~ as its back up file suffixes? 
Cheers, Jarrod On 28 Jul 2010, at 11:41, Martin Maechler wrote: Jarrod Hadfield j.hadfi...@ed.ac.uk on Tue, 27 Jul 2010 21:37:09 +0100 writes: Hi, I ran R (version 2.9.0) CMD build under root in Fedora (9). When it tried to remove junk files it removed EVERYTHING in my local account! (See below). Can anyone tell me what happened? The culprit may lie here: * removing junk files unlink MCMCglmm_2.05/R/ residuals.MCMCglmm.R ~ where it seems that someone (you?) has added a newline in the filename, so instead of 'residuals.MCMCglmm.R~' you got 'residuals.MCMCglmm.R ~' and the unlink / rm command interpreted '~' as your home directory. But I can hardly believe it. This explanation seems a bit doubtful to me.. ... and even more importantly, whether I can restore what was lost. well, you just get it from the backup. You do daily backups, do you? Regards, Martin Maechler, ETH Zurich __ R-help@r-project.org mailing list
[R] Fwd: How to point a column of dataframe by a character
(Forgot to copy the list.) Begin forwarded message: From: David Winsemius dwinsem...@comcast.net Date: July 28, 2010 7:44:38 AM EDT To: Tony lul...@gmail.com Subject: Re: [R] How to point a column of dataframe by a character On Jul 28, 2010, at 5:35 AM, Tony wrote: Hello, Here is a dilemma I have been having for a long time, but I couldn't figure it out. I have a vector Y and a data frame named data, which contains all the Xs. I tried to be more efficient in fitting a simple linear regression with each X. Firstly:
for (i in 1:(dim(data)[2])) {
  model <- lm(Y ~ data[,i])
You could try instead: model1 <- lm(Y ~ . , data = data[ , i, drop = FALSE]) I added the drop=FALSE to prevent a single column from being converted to a nameless vector.
> testdf <- data.frame(testcol = letters[1:10], stringsAsFactors = FALSE)
> testdf[, 1]
 [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
> testdf[, 1, drop = FALSE]
   testcol
1        a
2        b
3        c
4        d
5        e
6        f
7        g
8        h
9        i
10       j
BTW, data as the name for an object is a bad idea. As seen in the line above, it means your brain needs to do extra work to keep straight the fact that data is now the name for two things, the object and the parameter. It could get even more complicated if you used the data function. Notice that I even numbered the model; I thought the name model was too non-specific. David.
  # this is not what I want since the name of the coefficient will be data[,i]
  # I need the coefficient name to be the name of each variable, for instance:
  # Coefficients:
  # (Intercept)    data[, 1]
  #     24.2780      -0.3381
}
Second try! I first create a vector of characters (Xs) that contains the possible names of X. This vector is exactly the same as the colnames of X.
# my Xs
Xs <- c("a", "b", "c")
for (i in 1:length(Xs)) {
  model <- lm(Y ~ data[, Xs[i]])
  # Again, not what I want
  # Coefficients:
  # (Intercept)    data[, Xs[1]]
  #     24.2780          -0.3381
}
Thus, how can I solve this dilemma? I think about trying to find a function that can map the name of a variable to the values of that variable. That is, I first attach the data.
I can type a to pull out the values of data[,"a"] (a numeric vector) directly. However, using Xs[1] will give me only the character "a". Thus, is there any function that allows me to pull the values of data[,"a"], e.g. some_function(Xs[1]) gives me the values of data[,"a"]? Any help is appreciated. Tony David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
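One direct answer to the last question is that data[[Xs[1]]] extracts a column by its character name; and reformulate() builds the model formula from the name, so the fitted coefficient keeps the variable's label. A sketch with made-up data (columns a, b, c mirror the poster's Xs; the variable names here are illustrative, not from the thread):

```r
## Sketch: loop a simple regression over named columns, building each
## formula from the column name so coefficients are labelled properly.
set.seed(1)
dat <- data.frame(a = rnorm(20), b = rnorm(20), c = rnorm(20))
Y   <- rnorm(20)

models <- lapply(names(dat), function(nm) {
  f <- reformulate(nm, response = "Y")   # builds Y ~ a, Y ~ b, Y ~ c
  lm(f, data = cbind(Y = Y, dat))
})
names(models) <- names(dat)

coef(models[["a"]])   # coefficient names: (Intercept) and a
dat[["a"]]            # column extraction by character, i.e. some_function(Xs[1])
```

Note the data frame is deliberately called dat rather than data, following David's advice about shadowing the data() function.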
Re: [R] Using jpeg() without X
Something similar happens with png(), I think, if one doesn't compile against certain X headers (for instance on Mac). Is there any resource which describes exactly which graphics devices would work without *any* special options on any OS? I've only been able to find pdf() so far (the second poster said that bitmap() should work with ghostscript, but let's say one doesn't have that or permission to install it), but then again I also have a very poor understanding of how the whole graphics device thing works. I have to say it seems odd to me that one would need X or Quartz to create graphics, though I accept that it is; it would be really helpful to know why this is, though. Thanks! __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Help with specifiying random effects in lmer - psychology experiment
Apologies, not sure why that was so garbled first time! Posting again more simply: I want to use lmer for analysis of response time (and error) data from a reaction time experiment, and have a question about specifying the structure of random effects in the model. I am using a repeated measures design where 30 subjects judge the laterality of a hand shown on a computer screen, completing 18 trials in all combinations of the following experimental factors: Angle, 8 levels (hand shown in 8 orientations); Laterality, 2 levels (both right and left hands shown); Condition, 3 levels (participant holds own hands in posture a, b or c). With repeated measures ANOVA the error structure is specified as follows:
aov(percentErrors ~ angle*condition*laterality + Error(subject/(angle*condition*laterality)), data=errorData)
aov(meanRT ~ angle*condition*laterality + Error(subject/(angle*condition*laterality)), data=RTdata)
where the response variables, percentErrors and meanRT, are the percentage of errors (out of 18 trials) and the mean reaction time over 18 trials. I need to move to lmer as variance is not constant across Angle, and the error data are bounded (0,1) so I should use the binomial family. My first pass (looking at all main effects and possible interactions) is:
lmer(cbind(numErrors,numCorrect) ~ angle*condition*laterality + (angle*condition*laterality|subject), family=binomial, data=errorData)
where numErrors + numCorrect = 18 for each subject Angle-Laterality-Condition combination, and
lmer(meanRT ~ angle*condition*laterality + (angle*condition*laterality|subject), data=RTdata)
I am unsure if this is correct?
Help is welcome, thanks - Nuala - Original Message - From: nuala brady nuala.br...@ucd.ie Date: Wednesday, July 28, 2010 2:09 pm Subject: [R] Help with specifiying random effects in lmer - psychology experiment To: r-help@r-project.org Hi all, I'm a psychologist moving from ANOVA to lmer for analysis of response time (and error data) from a reaction time experiment, and have a question about specifying the structure of random effects in the model.
Re: [R] Odp: Data frame modification
Hi, why do you insist on loops? R is not C. If you want to use loops, use C or a similar programming language. It is almost always better to apply the whole-object approach. Kind and clever people have already programmed it (sometimes in C).
x <- rnorm(20)
x[c(10,12,13,17)] <- NA
x
 [1] -1.12423790  0.80641765 -1.02686262  0.71894420 -0.76157153 -0.09612362
 [7]  0.36681907  0.11164870 -1.06308689          NA -1.32903523          NA
[13]          NA  0.43308928 -0.16599726 -1.85594816          NA  0.02117957
[19] -0.58170838  1.45417843
library(zoo)
na.locf(x)
 [1] -1.12423790  0.80641765 -1.02686262  0.71894420 -0.76157153 -0.09612362
 [7]  0.36681907  0.11164870 -1.06308689 -1.06308689 -1.32903523 -1.32903523
[13] -1.32903523  0.43308928 -0.16599726 -1.85594816 -1.85594816  0.02117957
[19] -0.58170838  1.45417843
This will almost always be quicker than a for cycle with a condition checked at each step. There was an article in R News, and P. Burns' R Inferno is also worth a look if you are interested in loop performance. If you want to see where the time is spent, use Rprof. Regards Petr siddharth.gar...@gmail.com wrote on 28.07.2010 15:20:11: Thanks for the reply Petr. I have solved this problem using sapply, but what I am trying to understand here is why this code is slow. One possible reason could be that when I use the assignment operator, i.e. D$x[i]=D$x[i-1], it actually makes a new copy of D$x with the modified value. Another reason could be that indexed lookups might not be very fast in R. Regards Siddharth --Original Message-- From: Petr PIKAL To: siddharth.gar...@gmail.com Cc: r-help@r-project.org Subject: Odp: [R] Data frame modification Sent: Jul 28, 2010 6:15 PM Hi r-help-boun...@r-project.org wrote on 28.07.2010 11:30:48: Hi I am trying to modify a data frame D with columns x and y in such a way that if a value in x==0 then it should replace that value with the last non-zero value in x. I.e. a for loop over i { if(D$x[i]==0) D$x[i]=D$x[i-1] } The data frame is quite large, ~43000 rows. This operation is taking a large amount of time.
Can someone please suggest what might be the reason? Bad programming practice? I would suggest using the zoo package and the na.locf function after changing all zero values to NA. Regards Petr Thanks Regards Siddharth Sent on my BlackBerry® from Vodafone __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Sent on my BlackBerry® from Vodafone __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
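The same last-observation-carried-forward idea works in base R too; a sketch for the original zero-replacement question (x stands in for the poster's D$x, locf_zero is a hypothetical helper, and leading zeros are not handled, since they have no preceding non-zero value):

```r
## Whole-object version of the loop: replace zeros by the last
## preceding non-zero value, without iterating row by row.
locf_zero <- function(x) {
  x[x == 0] <- NA
  idx <- cumsum(!is.na(x))      # index of the last non-NA seen so far
  x[!is.na(x)][idx]             # look that value up for every position
}

x <- c(5, 0, 0, 3, 0, 7, 0, 0)
locf_zero(x)   # 5 5 5 3 3 7 7 7
```

This is the same trick na.locf uses conceptually: one vectorised pass instead of 43000 condition checks.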
Re: [R] Glm
Ana, It's not really clear what you are trying to do. From your description, you aren't estimating anything. The code
a <- c(1, 0.6, 0.8)
x <- rnorm(2)
y <- crossprod(c(1, x), a)
generates 'y'; but nothing is estimated. 'y' is going to have the same variance regardless of how many observations you generate, since the variance is coming from x1 and x2, not 'a'. Your question sounds a little like a homework problem. If it's not, please explain what you're trying to do. --Gray 2010/7/27 Ana De Barros belindadebar...@gmail.com: No, but thanks anyway... On 27/07/10 16:38, Gray Calhoun gray.calh...@gmail.com wrote: Hi Ana, Does the predict function do what you want? Type in ?predict.lm --Gray On 7/27/10, Ana De Barros belindadebar...@gmail.com wrote: Hi, Is there any way to estimate a DEPENDENT variable through a GLM/LM model? Suppose I have the linear model: y=a0+a1*x1+a2*x2 (a0=1, a1=0.6, a2=0.8, x1~N(1,1), x2~N(0,1)). The alphas and the auxiliary variables are given and I have to estimate y. The point is, if I estimate it, let's say algebraically, I get high variances that do not decrease as sample size increases... Is there any other way to do this?... It is not compulsory to use these alphas but my Y is unknown... Any ideas?... Thanks, Ana [[alternative HTML version deleted]] -- Gray Calhoun Assistant Professor of Economics, Iowa State University http://www.econ.iastate.edu/~gcalhoun/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
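Gray's point is easy to check by simulation; a sketch using Ana's coefficients (gen_y is a hypothetical helper mirroring her setup, not code from the thread):

```r
## The variance of y comes from x1 ~ N(1,1) and x2 ~ N(0,1), so
## generating more observations does not shrink it: it stays near
## 0.6^2 + 0.8^2 = 1 at any sample size.
set.seed(42)
a <- c(1, 0.6, 0.8)

gen_y <- function(n) {
  x1 <- rnorm(n, mean = 1)
  x2 <- rnorm(n)
  a[1] + a[2] * x1 + a[3] * x2
}

var(gen_y(100))     # roughly 1
var(gen_y(10000))   # still roughly 1
```

Only an *estimate of a parameter* (e.g. a mean or a regression coefficient) gets more precise with n; the spread of y itself is a property of the model.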
[R] randomisation for matrix
Hi to all, I am looking for a randomisation procedure for a single matrix, including the possibility to set the number of randomisations and the number of rows and columns. Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
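"Randomisation" can mean several things, so here is one hedged reading: drawing N independent shuffles of the cells of a matrix of chosen dimensions (randomise is a hypothetical helper, not an existing function; stats::r2dtable is the base tool if the row and column totals must be preserved instead):

```r
## Sketch: N random permutations of the cells of matrix m.
randomise <- function(m, N) {
  replicate(N,
            matrix(sample(m), nrow = nrow(m), ncol = ncol(m)),
            simplify = FALSE)
}

m <- matrix(1:12, nrow = 3)   # choose the row/column counts here
perms <- randomise(m, 5)      # list of 5 shuffled 3 x 4 matrices
```

For null-model work on count matrices where margins matter, r2dtable(n, r, c) generates random tables with fixed row sums r and column sums c.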
Re: [R] How get colnames and rownames in Rcpp method?
Hi all, how do I get colnames and rownames in an Rcpp method? Attached file: RGui.exe capture. My work environment: R version: 2.11.1, OS: WinXP Pro SP3. Thanks and best regards. Young-Ju, Park from Korea __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Tinn-R related problem
ForestStats wrote: Hello, I too am having this problem. Some two minutes ago all was well, then all of a sudden I cannot backspace or delete or use arrows etc. There is a special Tinn-R forum: http://sourceforge.net/projects/tinn-r/forums Kind Regards Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Installing the Zoo Library.
There are no problems getting zoo to install correctly on my Windows 7 box, but I can't get it happening on Linux (Ubuntu). Here is the latest issue again...
install.packages("zoo_1.6-4.tar.gz", repos = NULL)
Warning in install.packages("zoo_1.6-4.tar.gz", repos = NULL) : argument 'lib' is missing: using /usr/local/lib/R/site-library
* Installing *source* package 'zoo' ...
** R
** demo
** inst
** preparing package for lazy loading
Error in parse(file, n, text, prompt) : syntax error at
666: }
667: if(max(my.table(indexes)) > 1L
Error: unable to load R code in package 'zoo'
Execution halted
ERROR: lazy loading failed for package 'zoo'
** Removing '/usr/local/lib/R/site-library/zoo'
Warning message: installation of package 'zoo_1.6-4.tar.gz' had non-zero exit status in: install.packages("zoo_1.6-4.tar.gz", repos = NULL)
I also tried...
install.packages("zoo")
Warning in install.packages("zoo") : argument 'lib' is missing: using /usr/local/lib/R/site-library
Warning in download.packages(unique(pkgs), destdir = tmpd, available = available, : no package 'zoo' at the repositories
-- View this message in context: http://r.789695.n4.nabble.com/Installing-the-Zoo-Library-tp2303604p2304907.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] fisher's posthock test or fisher's combination test
I was looking for:
fisher.comb <- function(pvalues) {
  ## Fisher's method: -2 * sum(log(p)) for k independent p-values is
  ## chi-squared with 2k (not k) degrees of freedom.
  df <- 2 * length(pvalues)
  ch2 <- -2 * sum(log(pvalues))
  pchisq(ch2, df = df, lower.tail = FALSE)
}
http://en.wikipedia.org/wiki/Fisher%27s_method for the second part: the calculation of the p-value from the result of Fisher's combination test. Kind regards Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] error: arguments imply differing number
mydata <- read.table(textConnection(
"Id    cat1  location  item_values  p-values  sequence
 a111  1     3002737   100          0.01      1
 a112  1     3017821   102          0.05      2
 a113  2     3027730   103          0.02      3
 a114  2     3036220   104          0.04      4
 a115  1     3053984   105          0.03      5
 a118  1     3090500   106          0.02      8
 a119  1     3103304   107          0.03      9
 a120  2     3090500   106          0.02      10
 a121  2     3103304   107          0.03      11
"), header = TRUE)
closeAllConnections()
first <- function(x) c(TRUE, diff(x) != 1)
last  <- function(x) c(diff(x) != 1, TRUE)
mydata$start <- first(mydata$sequence)
mydata$end   <- last(mydata$sequence)
mydata$runNumber <- cumsum(first(mydata$sequence))
# load library
library(plyr)
ddply(mydata[, -1], .(runNumber, cat1), function(x) {max(x$item_values)})
my.summary <- function(x) {
  start.loc <- x$location[which(x$start == TRUE)]
  end.loc   <- x$location[which(x$end == TRUE)]
  peak      <- max(x$item_values)
  output <- data.frame(
    start_of_the_location = start.loc,
    end_of_the_location   = end.loc,
    peak_value            = peak)
  return(output)
}
ddply(mydata[, -1], .(runNumber, cat1), my.summary)
Why did ddply return the following error?
Error in data.frame(start_of_the_location = start.loc, end_of_the_location = end.loc, : arguments imply differing number of rows: 0, 1
mydata[,-1]
  cat1 location item_values p.values sequence start   end runNumber
1    1  3002737         100     0.01        1  TRUE FALSE         1
2    1  3017821         102     0.05        2 FALSE FALSE         1
3    2  3027730         103     0.02        3 FALSE FALSE         1
4    2  3036220         104     0.04        4 FALSE FALSE         1
5    1  3053984         105     0.03        5 FALSE  TRUE         1
6    1  3090500         106     0.02        8  TRUE FALSE         2
7    1  3103304         107     0.03        9 FALSE FALSE         2
8    2  3090500         106     0.02       10 FALSE FALSE         2
9    2  3103304         107     0.03       11 FALSE  TRUE         2
-- View this message in context: http://r.789695.n4.nabble.com/error-arguments-imply-differing-number-tp2305014p2305014.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
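The error has a concrete cause visible in the printed data: start and end were flagged per *run* over the whole frame, but ddply here splits by (runNumber, cat1). The group runNumber 1 / cat1 2 (rows 3-4) contains no row with start == TRUE, so start.loc has length 0 while peak has length 1, and data.frame() cannot recycle 0 rows against 1. One fix is to compute each group's boundaries from the group itself (my.summary2 is a hypothetical rewrite, a sketch of one way out):

```r
## Take the group's own first and last locations instead of relying on
## the globally computed start/end flags, which may both be FALSE
## everywhere within a (runNumber, cat1) group.
my.summary2 <- function(x) {
  data.frame(start_of_the_location = x$location[1],
             end_of_the_location   = x$location[nrow(x)],
             peak_value            = max(x$item_values))
}
ddply(mydata[, -1], .(runNumber, cat1), my.summary2)
```

If the global start/end flags are really what is wanted, the alternative is to split by runNumber alone, since the flags were defined per run.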
[R] [R-pkgs] IPSUR-1.0 is on CRAN, plus update to RcmdrPlugin.IPSUR
IPSUR-1.0 is making its way through CRAN. It is a snapshot of the development version of the following textbook: Title: Introduction to Probability and Statistics using R, First Edition ISBN: 978-0-557-24979-4 Publisher: me The book is targeted for an undergraduate course in probability and statistics. The prerequisites are a couple semesters of calculus and a little bit of linear algebra. I have used various drafts of this document to supplement my lectures for over four years. IPSUR is FREE, in the GNU sense of the word. A pdf copy (plus the LaTeX source) is on the R-Forge Project Page: http://ipsur.r-forge.r-project.org/ Alternatively, a person can do the following at the command prompt:
install.packages("IPSUR")
library(IPSUR)
read(IPSUR)
There are still many important topics, examples, and (especially) exercises missing. I will add them as time and two toddlers permit. If you would like to preview the daily-built, bleeding-edge latest version you can get it with
install.packages("IPSUR", repos = "http://R-Forge.R-project.org")
Please route IPSUR-specific emails to the respective R-Forge mailing lists: Questions or problems: ipsur-h...@lists.r-forge.r-project.org Mistakes, suggestions: ipsur-de...@lists.r-forge.r-project.org RcmdrPlugin.IPSUR is a plugin for the R Commander to accompany IPSUR. The update to version 0.1-7 is an important one, because it squashes a long-standing incompatibility problem with other plugins. The naming scheme for menus is also updated to be consistent with the other plugins in particular, and naming conventions in general. Cheers, Jay P.S. If you are thinking to print parts of IPSUR yourself then I recommend the publisher-quality PDF linked from the Downloads section of the R-Forge Project Page. G. Jay Kerns, Ph.D.
Associate Professor Department of Mathematics Statistics Youngstown State University Youngstown, OH 44555-0002 USA http://people.ysu.edu/~gkerns/ ___ R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] T and F (was: Optimization problem with nonlinear constraint)
Hi Patrick, Actually, I feel just the opposite, i.e. it is not a good idea to use 'T' for 'TRUE'. I have been snared by this trap many a time in my early days with S-Plus and R. It is good practice to use unabbreviated values. Best, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Patrick Burns Sent: Wednesday, July 28, 2010 3:54 AM To: r-help@r-project.org; u.kleinwech...@uni-hohenheim.de Subject: Re: [R] T and F (was: Optimization problem with nonlinear constraint) Some care is in order: using 'T' as a variable name is quite dangerous in R since it is an alias for 'TRUE'. Rules to live by: * Avoid using 'T' and 'F' as object names. * Use 'TRUE' and 'FALSE', not 'T' and 'F'. If you follow these, then you won't be tripped up, and you won't trip other people either. On 28/07/2010 08:31, Uli Kleinwechter wrote: Dear Ravi, As I've already written to you, the problem indeed is to find a solution to the transcendental equation y = x * T^(x-1), given y and T; the optimization problem below was only a workaround. John C. Nash has been so kind as to help me here. In case anyone faces a similar problem in the future, the solution looks as follows:
func1 <- function(y, x, T) {
  out <- x * T^(x-1) - y
  return(out)
}
# Assign the known values to y and T:
y <- 3
T <- 123
root <- uniroot(func1, c(-10, 100), y = y, T = T)
print(root)
Somewhat simpler than I thought. Thanks again! Uli On 26.07.2010 17:44, Ravi Varadhan wrote: Hi Uli, I am not sure if this is the problem that you really want to solve. The answer is the solution to the equation y = x * T^(x-1), provided a solution exists. There is no optimization involved here. What is the real problem that you are trying to solve? If you want to solve a more meaningful constrained optimization problem, you may want to try the alabama package which I just put on CRAN. It can optimize smooth nonlinear functions subject to linear and nonlinear equality and inequality constraints.
http://cran.r-project.org/web/packages/alabama/index.html Let me know if you run into any problems using it. Best, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Uli Kleinwechter Sent: Monday, July 26, 2010 10:16 AM To: r-help@r-project.org Subject: [R] Optimization problem with nonlinear constraint Dear all, I'm looking for a way to solve a simple optimization problem with a nonlinear constraint. An example would be max x s.t. y = x * T ^(x-1) where y and T are known values. optim() and constrOptim() do only allow for box or linear constraints, so I did not succedd here. I also found hints to donlp2 but this does not seem to be available anymore. Any hints are welcome, Uli __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Patrick Burns pbu...@pburns.seanet.com http://www.burns-stat.com (home of 'Some hints for the R beginner' and 'The R Inferno') __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory problem for scatterplot using ggplot
It was my understanding that R wasn't really the best thing for absolutely huge datasets. 17.5 million points would probably fall under the category of absolutely huge. I'm on a little netbook right now (atom/R32) and it failed, but I'll try it on my macbookPro/R64 later and see if it's able to handle the size better. For more information, my error is the following: Error: cannot allocate vector of size 66.8 Mb R(6725,0xa016e500) malloc: *** mmap(size=7640) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug R(6725,0xa016e500) malloc: *** mmap(size=7640) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug sessionInfo() R version 2.11.1 (2010-05-31) i386-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] sp_0.9-65 mapproj_1.1-8.2 maps_2.1-4 mgcv_1.6-2 ggplot2_0.8.8 [6] reshape_0.8.3 plyr_1.0.2 proto_0.3-8 loaded via a namespace (and not attached): [1] digest_0.4.2 lattice_0.18-8 Matrix_0.999375-39 nlme_3.1-96 [5] tools_2.11.1 On Wed, Jul 28, 2010 at 11:13, Edwin Husni Sutanudjaja hsutanudjajacchm...@yahoo.com wrote: Dear all, I have a memory problem in making a scatter plot of my 17.5 million-pair datasets. My intention to use the ggplot package and use the bin2d. Please find the attached script for more details. Could somebody please give me any clues or tips to solve my problem?? please ... Just for additional information: I'm running my R script on my 32-bit machine: Ubuntu 9.10, hardware: AMD Athlon Dual Core Processor 5200B, memory: 1.7GB. Many thanks in advance. Kind Regards, -- Ir. Edwin H. Sutanudjaja Dept. of Physical Geography, Faculty of Geosciences, Utrecht University -- You received this message because you are subscribed to the ggplot2 mailing list. 
Please provide a reproducible example: http://gist.github.com/270442 To post: email ggpl...@googlegroups.com To unsubscribe: email ggplot2+unsubscr...@googlegroups.com More options: http://groups.google.com/group/ggplot2
Re: [R] How to point a column of dataframe by a character
On Wed, Jul 28, 2010 at 8:59 AM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi Tony, I am sure there are other ways, but I would create formula objects and then pass them to lm(). Here's an example:

mydata <- data.frame(Y = 1:10, X1 = 11:20, X2 = 21:30)
my.names <- names(mydata)[-1]
for(i in my.names) {
  my.formula <- formula(paste("Y ~ ", i, sep = ""))
  my.lm <- lm(my.formula, data = mydata)
  print(summary(my.lm))
}

You might want to also replace the my.lm <- line above with: my.lm <- do.call(lm, list(my.formula, data = quote(mydata))) so that the Call: line in the output comes out fully expanded.
[R] Beginner stucked with raster + geoR package.
Hello everyone. I am trying to build up understanding in R by developing some simple scenarios. I would like to explain what I am trying to do and what I have done so far. I would like to put a Gaussian field (for a given covariance) inside a RasterLayer (raster package) using the grf function (geoR package). 1. First I created a RasterLayer object: r <- raster() # Default values are ok 2. Then I set some values to test how setValues worked: r <- setValues(r, 1:ncell(r)/100) # Every cell of the RasterLayer takes as data its cell number/100. 3. Tested that step 2 works with getValuesBlock(r, 1, nrow(r), 1, ncol(r)). 4. Then I tried to generate a Gaussian random field for given covariance parameters using grf: temp <- grf(1, nx = nrow(r), ny = ncol(r), cov.pars = c(1, .25)) # Note: the first input parameter in grf is the number of locations. At first I tried nrow(r)*ncol(r), but I got the output that this would produce 15.5 Gb of data :(. I just wanted to create an x*y array of this data, so I then passed nrow(r) and ncol(r) to the nx and ny arguments. Now this seems ok to me. 5. The final step is to set the values created in step 4 into the RasterLayer object created in step 1: r <- setValues(r, temp) # Unfortunately this ends with the message: Error in setValues(r, temp) : values must be a vector and at this point I am completely stuck :( I would be grateful for any help getting unstuck. Best Regards, Alex.
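For what it's worth, the error arises because grf() returns a whole object (a list holding the simulated values, the coordinates, the call, etc.), while setValues() expects a plain numeric vector of length ncell(r). Assuming the simulated values live in the $data component of the grf object (as geoR documents), a sketch of step 5 would be:

```r
library(raster)
library(geoR)

r <- raster(nrows = 100, ncols = 100)   # modest grid so grf() stays tractable

# simulate one realisation on an ncol(r) x nrow(r) grid
temp <- grf(ncell(r), nx = ncol(r), ny = nrow(r), cov.pars = c(1, 0.25))

# pass the numeric vector, not the whole grf list, to setValues()
r <- setValues(r, temp$data)
```

Depending on how grf orders its simulation grid (row-major vs column-major), the values may need reordering before they line up with the raster cells, so plot(r) and check.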
Re: [R] [Rd] R CMD build wiped my computer
Hi Marc, Thanks for the info on recovery - most of it can be pieced together from backups, but a quick, cheap and easy method of recovery would have been nicer. My main concern is that this could happen again and that the bug is not limited to R 2.9. I would think that an accidental carriage return at the end of a file name (even a temporary one) would be a reasonably common phenomenon (I'm surprised I hadn't done it before). Cheers, Jarrod On 28 Jul 2010, at 14:04, Marc Schwartz wrote: Jarrod, Noting your exchange with Martin, Martin brings up a point that certainly I missed, which is that somehow the tilde ('~') character got into the chain of events. As Martin noted, on Linuxen/Unixen (including OSX), the tilde, when used in the context of file name globbing, refers to your home directory. Thus, a command such as: ls ~ will list the files in your home directory. Similarly: rm ~ will remove the files there as well. If the -rf argument is added, then the deletion becomes recursive through that directory tree, which appears to be the case here. I am unclear, as Martin appears to be, as to the steps that caused this to happen. That may yet be related in some fashion to Duncan's hypothesis. That being said, the use of the tilde character as a suffix to denote that a file is a backup version is not limited to Fedora or Linux, for that matter. It is quite common for many text editors (eg. Emacs) to use this. As a result, it is also common for many applications to ignore files that have a tilde suffix. Based upon your follow up posts to the original thread, it would seem that you do not have any backups. The default ext3 file system that is used on modern Linuxen, by design, makes it a bit more difficult to recover deleted files. This is due to the unlinking of file metadata at the file system data structure level, as opposed to simply marking the file as deleted in the directory structures, as happens on Windows.
There is a utility called ext3undel (http://projects.izzysoft.de/trac/ext3undel ), which is a wrapper of sorts to other undelete utilities such as PhotoRec and foremost. I have not used it/them, so cannot speak from personal experience. Thus it would be a good idea to engage in some reviews of the documentation and perhaps other online resources before proceeding. The other consideration is the Catch-22 of not copying anything new to your existing HD, for fear of overwriting the lost files with new data. So you would need to consider an approach of downloading these utilities via another computer and then running them on the computer in question from other media, such as a CD/DVD or USB HD. A more expensive option would be to use a professional data recovery service, where you would have to consider the cost of recovery versus your lost time. One option would be Kroll OnTrack UK (http://www.ontrackdatarecovery.co.uk/ ). I happen to live about a quarter mile from their world HQ here in a suburb of Minneapolis. I have not used them myself, but others that I know have, with good success. Again, this comes at a potentially substantial monetary cost. The key is that, if you have any hope of recovering the deleted files, you must not copy anything new onto the hard drive in the meantime. Doing so will decrease the possibility of file recovery to near 0. As Duncan noted, there is great empathy with your situation. We have all gone through this at one time or another. In my case, it was perhaps 20+ years ago, but as a result, I am quite anal retentive about having backups, which I have done for some time on my systems, hourly. HTH, Marc Schwartz On Jul 28, 2010, at 5:55 AM, Jarrod Hadfield wrote: Hi Martin, I think this is the most likely reason, given that the name in the DESCRIPTION file does NOT have a version number. Even so, it is very easy to misname a file and then delete it/change its name (as I've done here), and I hope current versions of R would not cause this problem.
Perhaps Fedora should not use ~ as its backup file suffix? Cheers, Jarrod On 28 Jul 2010, at 11:41, Martin Maechler wrote: Jarrod Hadfield j.hadfi...@ed.ac.uk on Tue, 27 Jul 2010 21:37:09 +0100 writes: Hi, I ran R (version 2.9.0) CMD build under root in Fedora (9). When it tried to remove junk files it removed EVERYTHING in my local account! (See below). Can anyone tell me what happened? The culprit may lie here: * removing junk files unlink MCMCglmm_2.05/R/ residuals.MCMCglmm.R ~ where it seems that someone (you?) has added a newline in the filename, so instead of 'residuals.MCMCglmm.R~' you got 'residuals.MCMCglmm.R ~' and the unlink / rm command interpreted '~' as your home directory. But I can hardly believe it; this explanation seems a bit doubtful to me... ... and, even more importantly, if I can restore what was lost. well, you just get it from the backup. You do
Re: [R] randomisation for matrix
Hi Knut, I think you're going to have to be more specific. The code matrix(rnorm(25), 5, 5) generates a random 5 by 5 matrix. If you need specific distributions, search through the help files using help.search or RSiteSearch. --Gray On Wed, Jul 28, 2010 at 7:51 AM, Knut Krueger r...@krueger-family.de wrote: Hi to all, I am looking for a randomisation procedure for a single matrix, including the possibility to set the number of randomisations and the number of rows and columns. Knut -- Gray Calhoun Assistant Professor of Economics, Iowa State University http://www.econ.iastate.edu/~gcalhoun/
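If by "randomisation" Knut means permuting the entries of an existing matrix (a common null-model technique) rather than drawing new random values, a sketch might look like this (my interpretation of the request, not from the thread):

```r
set.seed(1)
m <- matrix(1:20, nrow = 4, ncol = 5)

# one randomisation: shuffle all entries, keeping the dimensions
randomise <- function(m) matrix(sample(m), nrow = nrow(m), ncol = ncol(m))

# n.rand independent randomisations, returned as a list
n.rand <- 100
perms <- replicate(n.rand, randomise(m), simplify = FALSE)

# each permutation preserves the multiset of values
all(sort(perms[[1]]) == sort(m))   # TRUE
```

To permute within rows or columns instead (preserving marginal structure), replace randomise() with, e.g., t(apply(m, 1, sample)).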
Re: [R] memory problem for scatterplot using ggplot
On 07/28/2010 06:13 AM, Edwin Husni Sutanudjaja wrote: Dear all, I have a memory problem in making a scatter plot of my 17.5 million-pair datasets. My intention is to use the ggplot package and its bin2d. Please find the attached script for more details. Could somebody please give me any clues or tips to solve my problem?? please ... Just for additional information: I'm running my R script on my 32-bit machine: Ubuntu 9.10, hardware: AMD Athlon Dual Core Processor 5200B, memory: 1.7GB. Many thanks in advance. Kind Regards, You should try to get access to a fairly robust 64-bit machine, say in the range of >= 8 GiB real memory, and see what you can do. No chance on a 32-bit machine. No chance on a 64-bit machine without sufficient real memory (you will be doomed to die by swap). Does your institution have a virtualization lab with the ability to allocate machines with large memory footprints? There is always Amazon EC2. You could experiment with sizing before buying that new workstation you've had your eye on. Alternatively, you might take much smaller samples of your data and massively decrease the size of the working set. I assume this is not what you want though. Mark
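If a 64-bit machine is not available, a pragmatic workaround is to plot a random subsample, or to bin the points first so that only counts are drawn. A sketch, assuming the 17.5 million pairs sit in a data frame df with columns x and y (names are hypothetical):

```r
# assumes 'df' is the large data frame with numeric columns x and y
set.seed(42)
idx <- sample(nrow(df), 1e5)          # keep 100k of the 17.5M points
plot(df$x[idx], df$y[idx], pch = ".")

# alternatively, bin first so only the counts are plotted
library(hexbin)
plot(hexbin(df$x, df$y, xbins = 100))
```

With 17.5 million points, a scatterplot is heavily overplotted anyway, so a 100k subsample or a hexbin plot is usually visually indistinguishable from the full plot at a small fraction of the memory.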
Re: [R] How get colnames and rownames in Rcpp method?
Hi, This is not the appropriate list for questions about Rcpp. See the Rcpp-devel mailing list, but first please think about a reproducible example. Romain On 28/07/10 14:11, 나여나 wrote: Hi all, How do I get colnames and rownames in an Rcpp method? attached file: RGui.exe capture my work environment: R version: 2.11.1 OS: WinXP Pro sp3 Thanks and best regards. Young-Ju, Park from Korea -- Romain Francois Professional R Enthusiast +33(0) 6 28 91 30 30 http://romainfrancois.blog.free.fr |- http://bit.ly/aryfrk : useR! 2010 |- http://bit.ly/bc8jNi : Rcpp 0.8.4 `- http://bit.ly/dz0RlX : bibtex 0.2-1
Re: [R] Optimization problem with nonlinear constraint
Uli Kleinwechter u.kleinwechter at uni-hohenheim.de writes: Dear Ravi, As I've already written to you, the problem indeed is to find a solution to the transcendental equation y = x * T^(x-1), given y and T; the optimization problem below was only a workaround. I don't think optimization is the right approach for simply inverting a simple function. The inverse of the function x -> x * e^x is the Lambert W function. So the solution in your case is: W(log(T)*y*T) / log(T) # hope I transformed it correctly. Now, how to compute Lambert's W? Well, look into the 'gsl' package and, alas, there is the function lambert_W0. Your example:

y <- 3
T <- 123
library(gsl)
lambert_W0(log(T)*y*T)/log(T)
# [1] 1.191830

Regards, Hans Werner John C. Nash has been so kind as to help me here. In case anyone faces a similar problem in the future, the solution looks as follows:

func1 <- function(y, x, T){
  out <- x*T^(x-1) - y
  return(out)
}
# Assign the known values to y and T:
y <- 3
T <- 123
root <- uniroot(func1, c(-10, 100), y = y, T = T)
print(root)

Somewhat simpler than I thought. Thanks again! Uli
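A quick cross-check (my own addition) that Hans Werner's closed form agrees with the uniroot() workaround from earlier in the thread:

```r
library(gsl)  # provides lambert_W0

y <- 3
T <- 123

# closed form via the Lambert W function
x.lambert <- lambert_W0(log(T) * y * T) / log(T)

# numerical root of f(x) = x * T^(x-1) - y
f <- function(x) x * T^(x - 1) - y
x.root <- uniroot(f, c(-10, 100))$root

c(x.lambert, x.root)               # both approx. 1.191830
abs(x.lambert - x.root) < 1e-4     # TRUE (up to uniroot's tolerance)
```

The derivation: with u = x*log(T), the equation y = x*T^(x-1) becomes u*e^u = y*T*log(T), so u = W(y*T*log(T)) and x = u/log(T), matching Hans Werner's expression.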
Re: [R] [Rd] R CMD build wiped my computer
On 28/07/2010 10:01 AM, Jarrod Hadfield wrote: Hi Marc, Thanks for the info on recovery - most of it can pieced together from backups but a quick, cheap and easy method of recovery would have been nicer. My main concern is that this could happen again and that the bug is not limited to R 2.9. I would think that an accidental carriage return at the end of a file name (even a temporary one) would be a reasonably common phenomenon (I'm surprised I hadn't done it before). If you can put together a recipe to reproduce the problem (or a less extreme version of R deleting files it shouldn't), we'll certainly fix it. But so far all we've got are guesses about what might have gone wrong, and I don't think anyone has been able to reproduce the problem on current R. Duncan Murdoch Cheers, Jarrod On 28 Jul 2010, at 14:04, Marc Schwartz wrote: Jarrod, Noting your exchange with Martin, Martin brings up a point that certainly I missed, which is that somehow the tilde ('~') character got into the chain of events. As Martin noted, on Linuxen/Unixen (including OSX), the tilde, when used in the context of file name globbing, refers to your home directory. Thus, a command such as: ls ~ will list the files in your home directory. Similarly: rm ~ will remove the files there as well. If the -rf argument is added, then the deletion becomes recursive through that directory tree, which appears to be the case here. I am unclear, as Martin appears to be, as to the steps that caused this to happen. That may yet be related in some fashion to Duncan's hypothesis. That being said, the use of the tilde character as a suffix to denote that a file is a backup version, is not limited to Fedora or Linux, for that matter. It is quite common for many text editors (eg. Emacs) to use this. As a result, it is also common for many applications to ignore files that have a tilde suffix. Based upon your follow up posts to the original thread, it would seem that you do not have any backups. 
The default ext3 file system that is used on modern Linuxen, by design, makes it a bit more difficult to recover deleted files. This is due to the unlinking of file metadata at the file system data structure level, as opposed to simply marking the file as deleted in the directory structures, as happens on Windows. There is a utility called ext3undel (http://projects.izzysoft.de/trac/ext3undel ), which is a wrapper of sorts to other undelete utilities such as PhotoRec and foremost. I have not used it/them, so cannot speak from personal experience. Thus it would be a good idea to engage in some reviews of the documentation and perhaps other online resources before proceeding. The other consideration is the Catch-22 of not copying anything new to your existing HD, for fear of overwriting the lost files with new data. So you would need to consider an approach of downloading these utilities via another computer and then running them on the computer in question from other media, such as a CD/DVD or USB HD. A more expensive option would be to use a professional data recovery service, where you would have to consider the cost of recovery versus your lost time. One option would be Kroll OnTrack UK (http://www.ontrackdatarecovery.co.uk/ ). I happen to live about a quarter mile from their world HQ here in a suburb of Minneapolis. I have not used them myself, but others that I know have, with good success. Again, this comes at a potentially substantial monetary cost. The key is that if you have any hope to recover the deleted files, you not copy anything new onto the hard drive in the mean time. Doing so will decrease the possibility of file recovery to near 0. As Duncan noted, there is great empathy with your situation. We have all gone through this at one time or another. In my case, it was perhaps 20+ years ago, but as a result, I am quite anal retentive about having backups, which I have done for some time on my systems, hourly. 
HTH, Marc Schwartz On Jul 28, 2010, at 5:55 AM, Jarrod Hadfield wrote: Hi Martin, I think this is the most likely reason given that the name in the DESCRIPTION file does NOT have a version number. Even so, it is very easy to misname a file and then delete it/change its name (as I've done here) and I hope current versions of R would not cause this problem. Perhaps Fedora should not use ~ as its back up file suffixes? Cheers, Jarrod On 28 Jul 2010, at 11:41, Martin Maechler wrote: Jarrod Hadfield j.hadfi...@ed.ac.uk on Tue, 27 Jul 2010 21:37:09 +0100 writes: Hi, I ran R (version 2.9.0) CMD build under root in Fedora (9). When it tried to remove junk files it removed EVERYTHING in my local account! (See below). Can anyone tell me what happened, the culprit may lay here: * removing junk files unlink MCMCglmm_2.05/R/
Re: [R] Optimization problem with nonlinear constraint
Very nice, Hans! I didn't know of the existence of the Lambert W function (a.k.a. Omega function) before. Nor did I know that it occurs in the solution of exponential decay with delay: dy/dt = a * y(t - 1). Apparently it is more than 250 years old! Thanks, Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hans W Borchers Sent: Wednesday, July 28, 2010 11:11 AM To: r-h...@stat.math.ethz.ch Subject: Re: [R] Optimization problem with nonlinear constraint Uli Kleinwechter u.kleinwechter at uni-hohenheim.de writes: Dear Ravi, As I've already written to you, the problem indeed is to find a solution to the transcendental equation y = x * T^(x-1), given y and T; the optimization problem below was only a workaround. I don't think optimization is the right approach for simply inverting a simple function. The inverse of the function x -> x * e^x is the Lambert W function. So the solution in your case is: W(log(T)*y*T) / log(T) # hope I transformed it correctly. Now, how to compute Lambert's W? Well, look into the 'gsl' package and, alas, there is the function lambert_W0. Your example: y <- 3; T <- 123; library(gsl); lambert_W0(log(T)*y*T)/log(T) # [1] 1.191830 Regards, Hans Werner John C. Nash has been so kind as to help me here. In case anyone faces a similar problem in the future, the solution looks as follows: func1 <- function(y,x,T){ out <- x*T^(x-1)-y; return(out) } # Assign the known values to y and T: y <- 3; T <- 123; root <- uniroot(func1, c(-10,100), y=y, T=T); print(root) Somewhat simpler than I thought. Thanks again! Uli
Re: [R] Optimization problem with nonlinear constraint
For those interested in the esoterica of special functions, here is a nice reference on the Lambert W function: http://www.cs.uwaterloo.ca/research/tr/1993/03/W.pdf Ravi. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Hans W Borchers Sent: Wednesday, July 28, 2010 11:11 AM To: r-h...@stat.math.ethz.ch Subject: Re: [R] Optimization problem with nonlinear constraint Uli Kleinwechter u.kleinwechter at uni-hohenheim.de writes: Dear Ravi, As I've already written to you, the problem indeed is to find a solution to the transcendental equation y = x * T^(x-1), given y and T; the optimization problem below was only a workaround. I don't think optimization is the right approach for simply inverting a simple function. The inverse of the function x -> x * e^x is the Lambert W function. So the solution in your case is: W(log(T)*y*T) / log(T) # hope I transformed it correctly. Now, how to compute Lambert's W? Well, look into the 'gsl' package and, alas, there is the function lambert_W0. Your example: y <- 3; T <- 123; library(gsl); lambert_W0(log(T)*y*T)/log(T) # [1] 1.191830 Regards, Hans Werner John C. Nash has been so kind as to help me here. In case anyone faces a similar problem in the future, the solution looks as follows: func1 <- function(y,x,T){ out <- x*T^(x-1)-y; return(out) } # Assign the known values to y and T: y <- 3; T <- 123; root <- uniroot(func1, c(-10,100), y=y, T=T); print(root) Somewhat simpler than I thought. Thanks again! Uli
Re: [R] error: arguments imply differing number
Hi, Thanks for including code and data so that we could reproduce what you're doing. Your problem is that you tell ddply to split the dataset by runNumber and cat1, which results in 4 groups. ddply then applies my.summary() to these four groups. One of these groups (cat1 = 1 and runNumber = 1) has both start.loc and end.loc, as it contains rows with start=TRUE and end=TRUE. This group will work fine. The other three groups, however, are broken. The group with cat1 = 2 and runNumber = 1 has neither start.loc nor end.loc, while the two groups with runNumber = 2 each have only one of the two. The error disappears if you split the dataset only by runNumber, as then each group has both start.loc and end.loc. If you want to apply my.summary() to each of these four groups, you're going to have to fix the earlier code that assigns the start and end variables. Jonathan On Wed, Jul 28, 2010 at 7:59 AM, jd6688 jdsignat...@gmail.com wrote:

mydata <- read.table(textConnection("
Id cat1 location item_values p-values sequence
a111 1 3002737 100 0.01 1
a112 1 3017821 102 0.05 2
a113 2 3027730 103 0.02 3
a114 2 3036220 104 0.04 4
a115 1 3053984 105 0.03 5
a118 1 3090500 106 0.02 8
a119 1 3103304 107 0.03 9
a120 2 3090500 106 0.02 10
a121 2 3103304 107 0.03 11
"), header = TRUE)
closeAllConnections()

first <- function(x) c(TRUE, diff(x) != 1)
last <- function(x) c(diff(x) != 1, TRUE)
mydata$start <- first(mydata$sequence)
mydata$end <- last(mydata$sequence)
mydata$runNumber <- cumsum(first(mydata$sequence))

# load library
library(plyr)
ddply(mydata[, -1], .(runNumber, cat1), function(x) {max(x$item_values)})

my.summary <- function(x) {
  start.loc <- x$location[which(x$start == TRUE)]
  end.loc <- x$location[which(x$end == TRUE)]
  peak <- max(x$item_values)
  output <- data.frame(
    start_of_the_location = start.loc,
    end_of_the_location = end.loc,
    peak_value = peak)
  return(output)
}
ddply(mydata[, -1], .(runNumber, cat1), my.summary)

Why did ddply return the following error? Error in
data.frame(start_of_the_location = start.loc, end_of_the_location = end.loc, : arguments imply differing number of rows: 0, 1

mydata[,-1]
  cat1 location item_values p.values sequence start   end runNumber
1    1  3002737         100     0.01        1  TRUE FALSE         1
2    1  3017821         102     0.05        2 FALSE FALSE         1
3    2  3027730         103     0.02        3 FALSE FALSE         1
4    2  3036220         104     0.04        4 FALSE FALSE         1
5    1  3053984         105     0.03        5 FALSE  TRUE         1
6    1  3090500         106     0.02        8  TRUE FALSE         2
7    1  3103304         107     0.03        9 FALSE FALSE         2
8    2  3090500         106     0.02       10 FALSE FALSE         2
9    2  3103304         107     0.03       11 FALSE  TRUE         2

-- View this message in context: http://r.789695.n4.nabble.com/error-arguments-imply-differing-number-tp2305014p2305014.html Sent from the R help mailing list archive at Nabble.com.
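If the goal really is one summary row per (runNumber, cat1) group, one option (my own suggestion, not from the thread) is to make my.summary() fall back to NA when a group lacks a start or end row, so that data.frame() always receives length-1 arguments:

```r
my.summary <- function(x) {
  start.loc <- x$location[x$start]
  end.loc   <- x$location[x$end]
  data.frame(
    # use NA when the group has no start/end marker instead of a
    # zero-length vector, which is what triggered the original error
    start_of_the_location = if (length(start.loc) > 0) start.loc[1] else NA,
    end_of_the_location   = if (length(end.loc) > 0)   end.loc[1]   else NA,
    peak_value            = max(x$item_values))
}

ddply(mydata[, -1], .(runNumber, cat1), my.summary)
```

Whether NA is the right answer for those groups depends on what start/end are supposed to mean per group, which is Jonathan's point about fixing the assignment of those variables upstream.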
[R] columns mapping
DF1
name OTHER
ABC  O
KKK  O
QQQ  O
DDD  O
PPP  O

DF2
name
ABC
KKK
DDD

If a name in df1 resides in df2, then add the mapped name to df1 as a separate column, for instance mappedColumn. The output should be:

DF1
name OTHER mappedColumn
ABC  O     ABC
KKK  O     KKK
QQQ  O
DDD  O     DDD
PPP  O

I have been trying for a while and still haven't figured it out; would you please help if you could. Thanks -- View this message in context: http://r.789695.n4.nabble.com/columns-mapping-tp2305213p2305213.html Sent from the R help mailing list archive at Nabble.com.
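Following the %in% approach suggested in the earlier column-mapping thread, a sketch of one possible solution (untested against the poster's real data):

```r
DF1 <- data.frame(name = c("ABC", "KKK", "QQQ", "DDD", "PPP"),
                  OTHER = "O", stringsAsFactors = FALSE)
DF2 <- data.frame(name = c("ABC", "KKK", "DDD"), stringsAsFactors = FALSE)

# add the mapped name where it exists in DF2, NA otherwise
DF1$mappedColumn <- ifelse(DF1$name %in% DF2$name, DF1$name, NA)
DF1
#   name OTHER mappedColumn
# 1  ABC     O          ABC
# 2  KKK     O          KKK
# 3  QQQ     O         <NA>
# 4  DDD     O          DDD
# 5  PPP     O         <NA>
```

If DF2 carried extra columns to bring along (a genuine lookup rather than a flag), merge(DF1, DF2, by = "name", all.x = TRUE) would be the more general tool.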
[R] Axes=F and plotting dual y axes
Howdy. Been running into a bit of trouble with plotting. It seems that axes=F is not working. Whenever I plot (either a data frame or an xts/zoo series) and set axes=F along with xlab/ylab="", I still get the default axes printed in my chart. Consider this:

# Create some sample data, both 50 units of blah
series2 = c(1:50)
series1 = rep(25:74)
testdf1 = as.data.frame(series1)
testdf1$series2 = series2

As a note, I converted my original xts/zoo dataset into a data frame thinking it could be weirdness on the part of that. I just did this here to have something reproducible, since it's not feasible to put my entire original dataset in the email.

plot(testdf1[,1], main="Woo", col=rich12equal, xlab="", ylab="", axes=F, type='l')

It still displays the x and y axes. What I'm trying to do: par(new=T) plot(testdf1[,2], ...) and from there I can manually label the axes, title, etc. Any idea what's going on? I'm using R 2.10.1 on Ubuntu (I probably need to upgrade via debian packages soon). Regards, CJ ps: I am trying to accomplish this, as seen here: http://blog.earlh.com/index.php/2009/07/multiple-y-axes-in-r-plots-part-9-in-a-series/ -- but axes=F is giving me trouble.
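For reference, the usual base-graphics recipe for two y axes, once axes = FALSE behaves as expected (a sketch using sample data like the above; axis side 2 is the left margin, side 4 the right):

```r
series1 <- 25:74
series2 <- 1:50

par(mar = c(5, 4, 4, 4))                 # leave room for the right-hand axis
plot(series1, type = "l", axes = FALSE,
     xlab = "", ylab = "", main = "Woo")
axis(2)                                  # left axis for series1

par(new = TRUE)                          # overlay the second series
plot(series2, type = "l", col = "red", axes = FALSE, xlab = "", ylab = "")
axis(4)                                  # right axis for series2
axis(1)                                  # shared x axis
box()
```

Note that plot methods for some classes (xts/zoo among them) may not honour axes = FALSE the way plot.default does, which could explain the behaviour described here; plotting the underlying numeric vector, as in this sketch, sidesteps that.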
Re: [R] strange error : isS4(x) in gamm function (mgcv package). Variable in data-frame not recognized???
Follow up: I finally succeeded in more or less reproducing the error. The origin lay in the fact that I accidentally loaded a function while being in browser mode for debugging that function, so something went very wrong with the namespaces. Teaches me right... Cheers Joris On Wed, Jul 28, 2010 at 2:27 PM, Joris Meys jorism...@gmail.com wrote: Dear all, it gets even weirder. After restarting R, the code I used works just fine. The call is generated in a function that I debugged using browser(). The problem is solved, but I have no clue whatsoever how that error came about. It must have something to do with namespaces, but the origin is obscure. I tried to regenerate the error but didn't succeed. Does somebody have an idea where I should look for a cause? Cheers Joris On Wed, Jul 28, 2010 at 1:16 PM, Joris Meys jorism...@gmail.com wrote: Dear all, I run a gamm with the following call:

result <- try(gamm(values ~ s(VM) + s(RH) + s(TT) + s(PP) + RF + weekend + s(day) + s(julday),
                   correlation = corCAR1(form = ~ day|month), data = tmp))

with mgcv version 1.6.2. No stress about the data, the error is not data-related. I get: Error in isS4(x) : object 'VM' not found Why so? I did define the dataframe to be used, and the dataframe contains a variable VM: str(tmp) 'data.frame': 4014 obs. of 12 variables: $ values : num 73.45 105.45 74.45 41.45 -4.55 ... $ dates :Class 'Date' num [1:4014] 9862 9863 9864 9865 9866 ... $ year : num -5.65 -5.65 -5.65 -5.65 -5.65 ... $ day : num -178 -177 -176 -175 -174 ... $ month : Factor w/ 156 levels 1996-April,1996-August,..: 17 17 17 17 17 17 17 17 17 17 ... $ julday : num -2241 -2240 -2239 -2238 -2237 ... $ weekend: num -0.289 -0.289 -0.289 0.711 0.711 ... $ VM : num 0.139 -1.451 0.349 0.839 -0.611 ... $ RH : num 55.2 61.4 59.8 64.1 60.7 ... $ TT : num -23.4 -23.6 -19.5 -16.1 -15.3 ... $ PP : num 6.17 4.27 -4.93 -9.23 -2.63 ... $ RF : Ord.factor w/ 3 levels None2.5mm..: 1 1 1 1 1 1 1 1 1 1 ... - attr(*, means)= num Any idea what I'm missing here?
Cheers Joris -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be --- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
Re: [R] UseR! 2010 - my impressions
Ravi, you are so right, we had the opportunity to attend an excellent and extremely well organized conference. Congratulations to Kate and the whole team at NIST!!! Uwe P.S.: Thanks for starting the thread - it is too easy to forget these well-deserved thanks when we are back in normal life... On 24.07.2010 01:50, Ravi Varadhan wrote: Dear UseRs!, Everything about UseR! 2010 was terrific! I really mean everything - the tutorials, invited talks, kaleidoscope sessions, focus sessions, breakfast, snacks, lunch, conference dinner, shuttle services, and the participants. The organization was fabulous. NIST were gracious hosts, and provided top notch facilities. The rousing speech by Antonio Possolo, who is the chief of the Statistical Engineering Division at NIST, set the tempo for the entire conference. Excellent invited lectures by Luke Tierney, Frank Harrell, Mark Handcock, Diethelm Wurtz, Uwe Ligges, and Fritz Leisch. All the sessions that I attended had many interesting ideas and useful contributions. During the whole time that I was there, I could not help but get the feeling that I was a part of something great. Before I end, let me add a few words about a special person. This conference would not have been as great as it was without the tireless efforts of Kate Mullen. The great thing about Kate is that she did so much without ever hogging the limelight. Thank you, Kate, and thank you, NIST! I cannot wait for UseR! 2011! Best, Ravi. Ravi Varadhan, Ph.D. Assistant Professor, Division of Geriatric Medicine and Gerontology School of Medicine Johns Hopkins University Ph. (410) 502-2619 email: rvarad...@jhmi.edu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] columns mapping
Hi, please try ?merge DF2$mappedColumn <- DF2$name merge(DF1, DF2, all.x = TRUE, sort = FALSE) - An R learner. -- View this message in context: http://r.789695.n4.nabble.com/columns-mapping-tp2305213p2305250.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
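A self-contained sketch of the suggested approach, with made-up data frames standing in for the poster's DF1 and DF2 (the names here are chosen to match exactly, since merge joins on equal values):

```r
# Hypothetical stand-ins for the poster's DF1 and DF2
DF1 <- data.frame(name = c("ABC", "KKK", "QQQ", "DDD", "PPP"))
DF2 <- data.frame(name = c("ABC", "KKK", "DDD"))

# Copy the name into a new column so the match survives the merge
DF2$mappedColumn <- DF2$name

# all.x = TRUE keeps every row of DF1; rows with no match get NA
res <- merge(DF1, DF2, all.x = TRUE, sort = FALSE)
```

With this data, `res` has all five DF1 rows and `mappedColumn` is filled for the three names that also occur in DF2.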
Re: [R] Axes=F and plotting dual y axes
Johnson, Cedrick W. cedrick at cedrickjohnson.com writes: Howdy. Been running into a bit of trouble with plotting. Seems that axes=F is not working. Whenever I plot (either a dataframe or xts/zoo series) and I set axes=F along with xlab/ylab= I still get the default axes printed in my chart. A quick guess: what happens if you try axes=FALSE instead? (i.e., have you assigned 'F' a value in your workspace?) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
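The masking pitfall Ben hints at is easy to reproduce; a minimal sketch, using a hypothetical assignment to F:

```r
F <- "Ford"                       # e.g. a stock ticker shadowing the alias for FALSE
shadowed <- identical(F, FALSE)   # FALSE: F is now the string "Ford"
# plot(..., axes = F) would now misbehave, while axes = FALSE always works
rm(F)                             # remove the shadowing binding
restored <- identical(F, FALSE)   # TRUE: F again resolves to base's FALSE
```

This is why writing out TRUE/FALSE is generally safer than T/F: the full names are reserved words and cannot be reassigned.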
Re: [R] problem with building package on CRAN
Package should be online now (or within 24 hours at least). Apologies for the noise, and please just ask me directly in case you get my messages - no need to flood R-help even more. Best wishes, Uwe On 26.07.2010 23:25, Romain Francois wrote: Hello, (ccing Rcpp-devel too because this is relevant) This comes up every now and again on packages that are completely unrelated to Rcpp. We don't know yet why, or what to do to fix the issue. I believe (but this might not be the case) that this is due to packages that do use Rcpp and fail to follow our guidelines and documentation and emails about using LinkingTo to pull in header files. See http://lists.r-forge.r-project.org/pipermail/rcpp-devel/2010-July/000898.html I'm really sorry you are caught in the middle of this; there is probably nothing wrong with your package, it just happens sort of randomly. I believe the chances of this happening would be lower if package developers (of packages using Rcpp) would be so kind as to follow our guidelines, but we can only offer to write the guidelines, we cannot force them to read or apply them. Romain On 26/07/10 22:43, William Revelle wrote: Dear friends, I have just gotten a strange error message back from Uwe saying that the most recent version of psych failed to pass R CMD check for Windows. The error message was less than helpful, in that it seems to have failed when trying to include the Rcpp library, which I do not directly call. (see below) * using log directory 'd:/Rcompile/CRANpkg/local/2.11/psych.Rcheck' * using R version 2.11.1 (2010-05-31) * using session charset: ISO8859-1 * checking for file 'psych/DESCRIPTION' ... OK * this is package 'psych' version '1.0-90' * checking package name space information ... OK * checking package dependencies ... OK * checking if this is a source package ... OK * checking whether package 'psych' can be installed ... ERROR Installation failed. 
The installation logfile: -Id:/Rcompile/CRANpkg/lib/2.11/Rcpp/include I do have several suggested packages (polycor, GPArotation, MASS, graph, Rgraphviz, mvtnorm, Rcsdp), but none of these are actually required. My examples all ask if the suggested packages are available and then do not call them if they are not. Any suggestions on what to do would be appreciated. Thanks. Bill __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory problem for scatterplot using ggplot
On Jul 28, 2010, at 9:53 AM, Brandon Hurr wrote: It was my understanding that R wasn't really the best thing for absolutely huge datasets. 17.5 million points would probably fall under the category of absolutely huge. I'm on a little netbook right now (atom/R32) and it failed, but I'll try it on my macbookPro/R64 later and see if it's able to handle the size better. With 24GB on a Mac with 64 bit R, I routinely work with objects that are, let's see... 3,969,086,272, ... after commas 4GB in size (about 4.5 million records with about 100 columns). Thank you Simon and all the others doing R core and Mac development. As far as I am concerned 64bit R _IS_ the best thing. -- David. For more information, my error is the following: Error: cannot allocate vector of size 66.8 Mb R(6725,0xa016e500) malloc: *** mmap(size=7640) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug R(6725,0xa016e500) malloc: *** mmap(size=7640) failed (error code=12) *** error: can't allocate region *** set a breakpoint in malloc_error_break to debug sessionInfo() R version 2.11.1 (2010-05-31) i386-apple-darwin9.8.0 locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8 attached base packages: [1] grid stats graphics grDevices utils datasets methods base other attached packages: [1] sp_0.9-65 mapproj_1.1-8.2 maps_2.1-4 mgcv_1.6-2 ggplot2_0.8.8 [6] reshape_0.8.3 plyr_1.0.2 proto_0.3-8 loaded via a namespace (and not attached): [1] digest_0.4.2 lattice_0.18-8 Matrix_0.999375-39 nlme_3.1-96 [5] tools_2.11.1 On Wed, Jul 28, 2010 at 11:13, Edwin Husni Sutanudjaja hsutanudjajacchm...@yahoo.com wrote: Dear all, I have a memory problem in making a scatter plot of my 17.5 million- pair datasets. My intention to use the ggplot package and use the bin2d. Please find the attached script for more details. Could somebody please give me any clues or tips to solve my problem?? please ... 
Just for additional information: I'm running my R script on my 32-bit machine: Ubuntu 9.10, hardware: AMD Athlon Dual Core Processor 5200B, memory: 1.7GB. Many thanks in advance. Kind Regards, -- Ir. Edwin H. Sutanudjaja Dept. of Physical Geography, Faculty of Geosciences, Utrecht University -- You received this message because you are subscribed to the ggplot2 mailing list. Please provide a reproducible example: http://gist.github.com/270442 To post: email ggpl...@googlegroups.com To unsubscribe: email ggplot2+unsubscr...@googlegroups.comggplot2%2bunsubscr...@googlegroups.com More options: http://groups.google.com/group/ggplot2 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius, MD West Hartford, CT __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Axes=F and plotting dual y axes
Cedrick I used this script to produce a simple chart with no axes: series2 = c(1:50) series1 = rep(25:74) plot(series1, series2, main = "Woo", col = "red", xlab = "", ylab = "", axes = F, type = "l") Works fine on Windows XP using R 2.11.1. -- View this message in context: http://r.789695.n4.nabble.com/Axes-F-and-plotting-dual-y-axes-tp2305230p2305290.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Introductory statistics and introduction to R
Dear Marsh, On Tue, Jul 27, 2010 at 10:46 AM, Marsh Feldman marshfeld...@cox.net wrote: Hi, I have a bright, diligent second-year graduate student who wants to learn statistics and R and will, in effect, be taking a tutorial from me on these subjects. (If you've seen some of my questions on this list, please don't laugh.) As an undergrad he majored in philosophy, so this will be his first foray into computer programming and statistics. I'm thinking of having him use Introductory Statistics with R by Peter Dalgaard, but I'm unable to tell if the book requires calculus. I don't think this student knows calculus, so this would be a deal breaker. Can someone tell me if my student can get through this book starting out with just knowledge of algebra? Short answer: Yes. The long answer is also Yes. (Not really, it depends on what you mean by 'get through'.) Also, do you have other suggestions for texts, manuals, web sites, etc. that would introduce statistics and R simultaneously? Have you seen this? http://rwiki.sciviews.org/doku.php?id=books:intrstat Good luck, Jay *** G. Jay Kerns, Ph.D. Associate Professor Department of Mathematics Statistics Youngstown State University Youngstown, OH 44555-0002 USA Office: 1035 Cushwa Hall Phone: (330) 941-3310 Office (voice mail) -3302 Department -3170 FAX VoIP: gjke...@ekiga.net E-mail: gke...@ysu.edu http://people.ysu.edu/~gkerns/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] finding the next highest number in an array
... just a note: you don't have to first sort the vector to do this: x <- sample(1:7) x [1] 3 5 7 6 2 4 1 which(x == min(x[x > 4])) [1] 2 Bert Gunter Genentech Nonclinical Biostatistics On Wed, Jul 28, 2010 at 3:12 AM, Raghu r.raghura...@gmail.com wrote: Hi I have a sorted array (in ascending order) and I want to find the subscript of the number in the array which is the next highest number to a given number. For example, if I have 67 as a given number and if I have a vector x=c(23,36,45,62,79,103,109), then how do I get the subscript 5 from x (to get 79, which is the next highest to 67) without using a for loop? Thx -- 'Raghu' __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
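Since the poster's vector is already sorted, base R's findInterval gives the same subscript in one vectorized call; a sketch with the vector from the question (for a target equal to an existing element, the +1 still yields the next strictly larger one):

```r
x <- c(23, 36, 45, 62, 79, 103, 109)  # sorted ascending

# findInterval(67, x) counts the elements <= 67, so +1 is the
# subscript of the next highest value
idx <- findInterval(67, x) + 1
nxt <- x[idx]                         # 79
```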
Re: [R] Axes=F and plotting dual y axes
That worked. Stupid me forgot that I had the stock ticker 'F' assigned in my workspace. Well.. guess I'll hit myself with a 2x4 now.. Thanks for your help guys.. -c On 07/28/2010 12:37 PM, Ben Bolker wrote: Johnson, Cedrick W.cedrickat cedrickjohnson.com writes: Howdy. Been running into a bit of trouble with plotting. Seems that axes=F is not working. Whenever I plot (either a dataframe or xts/zoo series) and I set axes=F along with xlab/ylab= I still get the default axes printed in my chart. A quick guess: what happens if you try axes=FALSE instead? (i.e., have you assigned 'F' a value in your workspace?) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] R CMD build wiped my computer
On Wed, 28 Jul 2010, Jarrod Hadfield wrote: My main concern is that this could happen again and that the bug is not limited to R 2.9. I would think that an accidental carriage return at the end of a file name (even a temporary one) would be a reasonably common phenomenon (I'm surprised I hadn't done it before). This is a well known attribute of buildsystems generally. Inadvertent errors can cause local damage, and so the recommendation to build in a 'chroot' or in a different userid. Ditto the initial speculation as to you building as root (another common beginner mistake) Malicious code can 'crawl out of a chroot' as well and wander through a system, or evade checking tools so as to be able to later emerge once in production, but this thread then goes afield from R -- Russ herrold __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Introductory statistics and introduction to R
Hi Marsh, I taught an intro to R course and have posted all the materials up on the web: http://psych-swiki.colorado.edu:8080/LearnR. Most learning in R comes from doing, not reading, and that's how I structured my course. All the lectures/HWs can be done individually, and the keys are there to check how at least I solved the problems. A good intro R book would definitely be of help as well. Best of luck, Matt On Wed, Jul 28, 2010 at 11:16 AM, G. Jay Kerns gjke...@gmail.com wrote: Dear Marsh, On Tue, Jul 27, 2010 at 10:46 AM, Marsh Feldman marshfeld...@cox.net wrote: Hi, I have a bright, diligent second-year graduate student who wants to learn statistics and R and will, in effect, be taking a tutorial from me on these subjects. (If you've seen some of my questions on this list, please don't laugh.) As an undergrad he majored in philosophy, so this will be his first foray into computer programming and statistics. I'm thinking of having him use Introductory Statistics with R by Peter Dalgaard, but I'm unable to tell if the book requires calculus. I don't think this student knows calculus, so this would be a deal breaker. Can someone tell me if my student can get through this book starting out with just knowledge of algebra? Short answer: Yes. The long answer is also Yes. (Not really, it depends on what you mean by 'get through'.) Also, do you have other suggestions for texts, manuals, web sites, etc. that would introduce statistics and R simultaneously? Have you seen this? http://rwiki.sciviews.org/doku.php?id=books:intrstat Good luck, Jay *** G. Jay Kerns, Ph.D. 
Associate Professor Department of Mathematics Statistics Youngstown State University Youngstown, OH 44555-0002 USA Office: 1035 Cushwa Hall Phone: (330) 941-3310 Office (voice mail) -3302 Department -3170 FAX VoIP: gjke...@ekiga.net E-mail: gke...@ysu.edu http://people.ysu.edu/~gkerns/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Matthew C Keller Asst. Professor of Psychology University of Colorado at Boulder www.matthewckeller.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] hatching posibility in Panel.Polygon
Tufte discusses hash lines in his book The Visual Display of Quantitative Information and does a better job of it than I can. The short version is that the hashing can actually produce optical illusion effects that distort the information. (and often don't copy well either). Printing and copying are good things to think about in generating graphs, but I think that using shades of gray, or no fill with labels, or other options are still much preferable to hashing. Consider making multiple simple graphs rather than one all inclusive complicated graph. -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare greg.s...@imail.org 801.408.8111 -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r- project.org] On Behalf Of HC Sent: Tuesday, July 27, 2010 10:11 PM To: r-help@r-project.org Subject: Re: [R] hatching posibility in Panel.Polygon Thank you for your follow up on this matter. I did think about the partial transparent color option and will certainly use it and see how it works out. For presentations and color prints, the partial transparency is surely going to work and may even look nicer. But for black and white printing for journal articles or making zerox copies, hatching seems more effective. And that was the reason of my exploring it. There may be some valid reason to be disappointed if such option is made available. However, such an option will only add more power to the trellis graphics that already very useful, attractive and efficient. Regards. HC -- View this message in context: http://r.789695.n4.nabble.com/hatching- posibility-in-Panel-Polygon-tp2301863p2304414.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] columns mapping
Thank you so much. It worked as expected. You have been a great help. -- View this message in context: http://r.789695.n4.nabble.com/columns-mapping-tp2305213p2305401.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Introductory statistics and introduction to R
On Tue, Jul 27, 2010 at 10:46 AM, Marsh Feldman marshfeld...@cox.net wrote: Hi, I have a bright, diligent second-year graduate student who wants to learn statistics and R and will, in effect, be taking a tutorial from me on these subjects. (If you've seen some of my questions on this list, please don't laugh.) As an undergrad he majored in philosophy, so this will be his first foray into computer programming and statistics. I'm thinking of having him use Introductory Statistics with R by Peter Dalgaard, but I'm unable to tell if the book requires calculus. I don't think this student knows calculus, so this would be a deal breaker. Can someone tell me if my student can get through this book starting out with just knowledge of algebra? Also, do you have other suggestions for texts, manuals, web sites, etc. that would introduce statistics and R simultaneously? You could give him this list of free online documents: http://cran.r-project.org/other-docs.html and have him try a few and pick the one he likes best. The one by Owen, for example, is quite good and he could start with that. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Out-of-sample predictions with boosting model
Hi UseRs - I am new to R, and could use some help making out-of-sample predictions using a boosting model (the mboost command). The issue is complicated by the fact that I have panel data (time by country), and am estimating the model separately for each country. FYI, this is monthly data and I have 1986m1 - 2009m12 for 9 countries. To give you a flavor of what I am doing, here is a simple example to show how I make in-sample predictions: # data has following columns: country year month y x1 x2 x3 dat = read.csv("data.csv") # Create function that estimates model, produces in-sample predictions bbox = function(df) { blackbox = mboost(y ~ x1 + x2 + x3, data = df) predict(blackbox) } # Use lapply to estimate by country bycountry = lapply(split(dat, dat$country), bbox) So that in the end I have an object bycountry that contains the in-sample predictions of the model, estimated for each country separately. What I would like to do is take this model and estimate it for each country using some initial data. I.e., estimate Australia with 1986m1-2003m12, make a prediction about 2004m1, roll the data forward. Estimate AUS with 1986m2-2004m1, predict 2004m2, etc. for all data points. Now do the same for Canada, Denmark, etc. So I guess my problem is twofold. 1) How to make these out-of-sample predictions, by country, when my data has not been declared as time-series? (I do not think that mboost can handle time-series data... x1, x2 and x3 have been lagged appropriately). 2) How to save the one-step-ahead predictions into a vector? Any thoughts would be greatly appreciated. Many thanks! -Travis __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
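The rolling-window scheme described needs no time-series class at all, only row indexing. A minimal sketch for one country, using simulated data and lm as an untested stand-in for mboost (swap the model call as needed):

```r
set.seed(1)
n <- 120                               # pretend: 120 months for one country
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- 1 + 2 * dat$x1 - dat$x2 + rnorm(n, sd = 0.1)

start <- 100                           # first out-of-sample month
preds <- numeric(n - start)            # answer to question 2: pre-allocate a vector
for (i in seq_along(preds)) {
  train <- dat[i:(start + i - 1), ]            # rolling window, constant width
  fit   <- lm(y ~ x1 + x2, data = train)       # stand-in for mboost(...)
  preds[i] <- predict(fit, newdata = dat[start + i, ])  # one-step-ahead forecast
}
```

Wrapping this loop in a function and applying it with lapply(split(dat, dat$country), ...) would repeat the exercise per country, just as in the in-sample example.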
[R] how to code it??
Hi I have say a large vector of 3500 digits. Initially the digits are 0s and 1s. I need to check for a rule to change some of the 0s to -1s in this vector. But once I change a 0 to -1, then I need to start applying the rule to change the next 0 only after I see the next 1 in the vector. Say for example x = c(0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1). I need to traverse from the 9th element to the last (because the first occurrence of 1 is at 8). Let us assume that according to our rule we change the 13th element (only 0s can be changed) to -1. Now we need to go to the next occurrence of 1 (which is 15) and begin the rule application from the 16th till the end of the vector, and once we have replaced a 0 with a -1, start again from the next 1. How do we code this? I 'feel' recursion is the best possible solution but I am not a programmer and will await experts' views. If this is not a typical R-forum question then my advance apologies. Many thx -- 'Raghu' __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
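Recursion is not needed here: a single pass with an "armed" flag that is switched off after each replacement and switched back on at the next 1 captures the traversal. The actual replacement rule is not given in the post, so this sketch replaces the first eligible 0, purely as a placeholder:

```r
x <- c(0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1)

armed <- FALSE                 # may the rule fire at this position?
for (i in seq_along(x)) {
  if (x[i] == 1) {
    armed <- TRUE              # (re-)arm after every 1
  } else if (armed && x[i] == 0) {
    x[i] <- -1                 # placeholder rule: change the first eligible 0
    armed <- FALSE             # stay disarmed until the next 1
  }
}
# With this placeholder rule, positions 9, 13 and 16 become -1
```

Replacing the `x[i] == 0` test with the poster's real condition gives the full solution; the flag logic is unchanged.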
Re: [R] Random Forest - Strata
Max, Thanks. Yes, what you said is exactly what I am looking for, i.e. the first tree fits using data from sites A and B, then predicts on C (and so on). Does that mean if I: 1. pass this list as index into trainControl tmpSiteList [[1]] [1] 1 2 3 4 5 6 7 [[2]] [1] 1 2 3 8 9 10 [[3]] [1] 4 5 6 7 8 9 10 AND 2. use other methods in the trainControl() then I would get the RF to be built and tested in the above way? I had tried other methods in the trainControl (had tried boot, cv), but it seems in the final built RF, the rf.obj$finalModel$inbag still does not match those in the index... My understanding of rf.obj$finalModel$inbag is that it should show which rows of the sample had gone into the training of that particular tree, which in essence should match the index argument that we had passed into trainControl. Maybe my understanding of what rf.obj$finalModel$inbag shows is wrong? I have not looked into the estimates yet; what I am after is just to make sure that in each tree iteration the training sites' data goes into the training, and the hold-out site's data is used for testing in that tree iteration. Welcome any thoughts/ideas. Again, I really appreciate your patience and help on this. Regards, Coll -- View this message in context: http://r.789695.n4.nabble.com/Random-Forest-Strata-tp2295731p2305269.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Odd crash with tcl/tk
Hi, Recently, I've been trying to use packages in R that require loading the Tcl/Tk interface. However, I get a strange result and a crash that I haven't been able to find discussion about on these boards (or any others). When I enter library(tcltk), it reads "Loading Tcl/Tk interface ...", but then never says "done" or displays some sort of error message. Looks like this: x11() library(tcltk) Loading Tcl/Tk interface ... Now you can type additional commands in, at your peril! For example, if I type in the text "library", nothing happens, but "library(" causes R to freeze up irreparably, with executing: try(gsub('\\s+','',paste(capture.output(print(args(library,collapse=)),silent=TRUE) displayed at the bottom. When this happens, there's nothing you can do but restart R because it's completely frozen. I'm running R version 2.11.1 Patched (2010-07-27 r52627) [R.app GUI 1.35 (5603) i386-apple-darwin9.8.0] with XQuartz 2.3.5 (xorg-server 1.4.2-apple53) on a Mac (Snow Leopard). Thanks for any help/suggestions in advance, Andrew -- View this message in context: http://r.789695.n4.nabble.com/Odd-crash-with-tcl-tk-tp2305032p2305032.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to generate a random data from a empirical distribition
hi, Frank: how can we make sure the randomly sampled data follow the same distribution as the original dataset? I assume each data point has the same probability of being selected in a simple random sampling scheme. thanks -- View this message in context: http://r.789695.n4.nabble.com/how-to-generate-a-random-data-from-a-empirical-distribition-tp2302716p2305275.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
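The resampling being discussed is a one-liner in base R; a sketch with simulated data standing in for the poster's dataset:

```r
set.seed(42)
orig <- rexp(1000)                         # the "empirical" dataset

# Sampling with replacement draws from the empirical distribution:
# each original observation has probability 1/1000 on every draw
boot <- sample(orig, 500, replace = TRUE)
```

By construction every resampled value is one of the original observations, which is exactly the sense in which the draws follow the empirical distribution.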
Re: [R] randomisation for matrix
Gray Calhoun wrote: Hi Knut, I think you're going to have to be more specific. The code matrix(rnorm(25), 5, 5) I found the answer: there is a generator seed field in Ucinet, which is why it is possible to set the random generator's starting point by hand there. I assume the generator starting point in rnorm is otherwise set from something like a time/date combination, so matrix(rnorm(x), y, z) is suitable for me. Kind regards Knut __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
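For reference, R also lets you set the generator's starting point by hand, with set.seed; a short sketch showing that a fixed seed makes the random matrix reproducible:

```r
set.seed(123)                    # fix the generator's starting point
m1 <- matrix(rnorm(25), 5, 5)

set.seed(123)                    # same seed again ...
m2 <- matrix(rnorm(25), 5, 5)    # ... same matrix
```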
[R] Help Creating a Stacked Bar Chart with Color Coding
I have a data set of the following form:

Johnson 4
Smith 4
Smith 2
Smith 3
Garcia 1
Garcia 4
Rodriguez 2
Adams 2
Adams 3
Adams 4
Turner 4
Turner 3

And I'd like to create a stacked bar chart that has scores on the x-axis and the bars broken up by judge's name. Whenever I try to do this, though (barplot(table(judges, scores))), it always creates more groups than actual judges. Would anyone know how to graph this and color code the names appropriately? -- View this message in context: http://r.789695.n4.nabble.com/Help-Creating-a-Stacked-Bar-Chart-with-Color-Coding-tp2305487p2305487.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
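One common cause of "more groups than judges" is stray whitespace in the imported names, so that "Smith" and "Smith " count as distinct levels. That is only a guess at the cause here, but it is cheap to check and fix; a sketch with deliberately messy hypothetical input:

```r
judges <- c("Smith", "Smith ", " Smith", "Garcia")   # hypothetical messy input
scores <- c(4, 2, 3, 1)

# Strip leading/trailing whitespace so the spurious variants collapse
judges <- gsub("^[[:space:]]+|[[:space:]]+$", "", judges)

tab <- table(judges, scores)    # now 2 distinct judges, not 4
# barplot(tab, col = rainbow(nrow(tab)), legend.text = rownames(tab))
# draws the stacked, color-coded chart with scores on the x-axis
```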
[R] error in f(x,...)
Dear all, I tried once to create one variable called bip such that: bip <- cip + (1/f(cip))*fi(f,cip) And this was working. But now, doing the same thing I did before, the software shows me the following message: Error in f(x, ...) : unused argument(s) (subdivision = 2000) I have the variable cip, and the variable bip should be created such that: Fn <- ecdf(cip) f <- function(x) {(1 - Fn(x))^4} fi <- function(f,x) { res <- numeric(length(x)) for (i in 1:length(x)) {res[i] <- integrate(f, x[i], 2.967, subdivision=2000)$value} res } bip <- cip + (1/f(cip))*fi(f,cip) Is there anything wrong? How can I solve this problem? Thanks in advance! NGS __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
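The error message itself points at the likely culprit: integrate spells the argument subdivisions (plural). Any argument integrate does not recognize, such as subdivision = 2000, is passed through ... to the integrand f, which only accepts x, hence "unused argument(s)". A minimal check with a simple stand-in integrand (not the poster's ecdf-based one):

```r
f <- function(x) (1 - pmin(x, 1))^4   # stand-in integrand, vectorized in x

# integrate(f, 0, 2, subdivision = 2000)    # errors: unused argument(s)
ok <- integrate(f, 0, 2, subdivisions = 2000)$value
# integral of (1-x)^4 on [0,1] plus 0 on [1,2] = 1/5
```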
Re: [R] Odd crash with tcl/tk
On Wed, Jul 28, 2010 at 10:06 AM, AndrewPage savejar...@yahoo.com wrote: Hi, Recently, I've been trying to use packages in R that require loading the Tcl/Tk interface. However, I get a strange result and a crash that I haven't been able to find discussion about on these boards (or any others). When I enter library(tcltk), it reads "Loading Tcl/Tk interface ...", but then never says "done" or displays some sort of error message. Looks like this: x11() library(tcltk) Loading Tcl/Tk interface ... Now you can type additional commands in, at your peril! For example, if I type in the text "library", nothing happens, but "library(" causes R to freeze up irreparably, with executing: try(gsub('\\s+','',paste(capture.output(print(args(library,collapse=)),silent=TRUE) displayed at the bottom. When this happens, there's nothing you can do but restart R because it's completely frozen. I'm running R version 2.11.1 Patched (2010-07-27 r52627) [R.app GUI 1.35 (5603) i386-apple-darwin9.8.0] with XQuartz 2.3.5 (xorg-server 1.4.2-apple53) on a Mac (Snow Leopard). Thanks for any help/suggestions in advance, Andrew One thought is that if this is to use gsubfn or sqldf (which uses gsubfn), then you can get them to not use the tcltk code but use R code instead by either of these two means: 1. issue the command: options(gsubfn.engine = "R") before issuing your library(sqldf) or library(gsubfn) command. You can put the options command in your .Rprofile if you like, and then you will have it in every session. or 2. use a build of R that has no tcltk in it. In that case it will recognize that and switch to using R. I believe one such build exists for the Mac. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] how to generate a random data from a empirical distribition
This is true by definition. Read about the bootstrap, which may give you some good background information. Frank E Harrell Jr, Professor and Chairman, Department of Biostatistics, Vanderbilt University School of Medicine

On Wed, 28 Jul 2010, xin wei wrote: hi, Frank: how can we make sure the randomly sampled data follow the same distribution as the original dataset? I assume each data point has the same probability of being selected in a simple random sampling scheme. thanks -- View this message in context: http://r.789695.n4.nabble.com/how-to-generate-a-random-data-from-a-empirical-distribition-tp2302716p2305275.html Sent from the R help mailing list archive at Nabble.com.
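Frank's "true by definition" can be seen directly in code: drawing with replacement from the observed data *is* drawing from the empirical distribution, because the ECDF places probability 1/n on each observation. A minimal sketch with simulated data standing in for the original dataset:

```r
set.seed(42)
x <- rnorm(100)                      # the "original dataset"

# Sampling with replacement from x is exactly sampling from its ECDF:
# each x[i] is drawn with probability 1/100 on every draw.
boot <- sample(x, size = 1000, replace = TRUE)

# Every bootstrap draw is one of the original points, so the bootstrap
# sample cannot follow any distribution other than the empirical one.
all(boot %in% x)   # TRUE
```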
[R] install packages in Windows Vista
Dear R experts, I would really appreciate your suggestions on installing a package on Windows Vista. I am unable to install a package on a Windows-Vista-based computer, in spite of running R as an administrator. The package xpose 4.2.1 is available from https://sourceforge.net/projects/xpose/files/Xpose4/Xpose_4.2.1/xpose4_4.2.1_win32.zip/download Following is the error message when I try to install the package from the Rgui command line. I get similar errors when I try to install the package from the Rgui menu (Packages | Install packages from local zip files), but with the default lib="C:/Users/santosh/Documents/R/win-library/2.11".

install.packages("C:/Users/santosh/Downloads/xpose4_4.2.1_win32.zip", repos=NULL, lib="C:/Program Files/R/R-2.11.1/library")
Error in gzfile(file, "r") : cannot open the connection
In addition: Warning message:
In gzfile(file, "r") : cannot open compressed file 'xpose4_4.2.1_win32/DESCRIPTION', probable reason 'No such file or directory'

However, I do see the folder in which the library-related folders are installed. Thanks again, Santosh
Re: [R] question about SVM in e1071
Pau, Sorry for getting back to you about this again. I am getting confused about your interpretation of 3). It is obvious from your code that increasing C results in a *smaller* number of SVs, which seems to contradict your interpretation: *Increasing the value of C (...) forces the creation of a more accurate model. A more accurate model is achieved by adding more SVs.* In addition, I have learned that the number of SVs increases as C decreases because there are many bounded SVs (those whose alpha = C; remember 0 < alpha <= C); SVs with alpha smaller than C are called free SVs. Here is another question: is the complexity of the boundary determined by the total number of SVs (bounded SVs + free SVs) or by the free SVs only? Thanks a bunch, -Jack

On Thu, Jul 15, 2010 at 4:17 AM, Pau Carrio Gaspar paucar...@gmail.com wrote: Hi Jack, on 1) and 2): they are telling you the same thing. I recommend you read the first sections of the article; it is very well written and clear. There you will read about duality. On 3): I interpret the scatter plot as follows: *Increasing the value of C (...) forces the creation of a more accurate model.* A more accurate model is achieved by adding more SVs (until we get a convex hull of the data). Hope it helps. Regards, Pau

2010/7/14 Jack Luo jluo.rh...@gmail.com: Pau, Thanks a lot for your email, I found it very helpful. Please see below for my reply, thanks. -Jack

On Wed, Jul 14, 2010 at 10:36 AM, Pau Carrio Gaspar paucar...@gmail.com wrote: Hello Jack, 1) Why did you think that larger C is more prone to overfitting than smaller C? *There is a statement at the link http://www.dtreg.com/svm.htm: To allow some flexibility in separating the categories, SVM models have a cost parameter, C, that controls the trade-off between allowing training errors and forcing rigid margins. It creates a soft margin that permits some misclassifications.
Increasing the value of C increases the cost of misclassifying points and forces the creation of a more accurate model that may not generalize well. My understanding is that this means larger C may not generalize well (i.e., is prone to overfitting).*

2) If you look at the formulation of the quadratic programming problem, you will see that C controls the error of the cutting plane (and overfitting). Therefore, for high C you allow the cutting plane to cut the set worse, so the SVM needs fewer points to build it. A proper explanation is in Kristin P. Bennett and Colin Campbell, Support Vector Machines: Hype or Hallelujah?, SIGKDD Explorations, 2(2), 2000, 1-13. http://www.idi.ntnu.no/emner/it3704/lectures/papers/Bennett_2000_Support.pdf *Could you be more specific about this? I don't quite understand.*

3) You might find these plots useful:

library(e1071)
m1 <- matrix(c(0,0,0,1,1,2, 1,2,3,2,3,3, 0,1,2,3,0,1,2,3, 1,2,3,2,3,3, 0,0,0,1,1,2, 4,4,4,4, 0,1,2,3,
               1,1,1,1,1,1, -1,-1,-1,-1,-1,-1, 1,1,1,1,1,1, -1,-1), ncol = 3)
Y <- m1[, 3]
X <- m1[, 1:2]
df <- data.frame(X, Y)
par(mfcol = c(4, 2))
for (cost in c(1e-3, 1e-2, 1e-1, 1e0, 1e+1, 1e+2, 1e+3)) {
  # cost <- 1
  model.svm <- svm(Y ~ ., data = df, type = "C-classification", kernel = "linear", cost = cost, scale = FALSE)
  # print(model.svm$SV)
  plot(x = 0, ylim = c(0, 5), xlim = c(0, 3), main = paste("cost:", cost, "#SV:", nrow(model.svm$SV)))
  points(m1[m1[, 3] > 0, 1], m1[m1[, 3] > 0, 2], pch = 3, col = "green")
  points(m1[m1[, 3] < 0, 1], m1[m1[, 3] < 0, 2], pch = 4, col = "blue")
  points(model.svm$SV[, 1], model.svm$SV[, 2], pch = 18, col = "red")
}

*Thanks a lot for the code, I really appreciate it. I've run it, but I am not sure how I should interpret the scatter plots, although it is obvious that the number of SVs decreases as cost increases.* Regards, Pau

2010/7/14 Jack Luo jluo.rh...@gmail.com: Hi, I have a question about the parameter C (cost) in the svm function in e1071. I thought larger C was more prone to overfitting than smaller C, and hence would lead to more support vectors.
However, using the Wisconsin breast cancer example at the link http://planatscher.net/svmtut/svmtut.html I found that the largest cost has the fewest support vectors, which is contrary to what I thought. Please see the scripts below: am I misunderstanding something here? Thanks a bunch, -Jack

model1 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 0.01)
model2 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 1)
model3 <- svm(databctrain, classesbctrain, kernel = "linear", cost = 100)
model1
Call: svm.default(x = databctrain, y = classesbctrain, kernel = "linear", cost = 0.01)
Parameters: SVM-Type: C-classification SVM-Kernel: linear cost: 0.01 gamma: 0.111 Number of Support Vectors: 99
model2
Call: svm.default(x = databctrain, y = classesbctrain, kernel = "linear", cost = 1)
Parameters: SVM-Type:
Re: [R] how to code it??
If I take your meaning correctly, you want something like this:

x <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1)
easy <- function(x) {
  state <- 0
  for (i in 1:length(x)) {
    if (x[i] == 0) x[i] <- state
    state <- 0
    if (x[i] == 1) state <- -1
  }
  x
}
easy(x)
[1] 0 0 0 0 0 0 0 1 -1 0 1 1 -1 0 1 -1 0 0 1

-Matt

On Wed, 2010-07-28 at 14:10 -0400, Raghu wrote: Hi, I have a large vector of, say, 3500 digits. Initially the digits are 0s and 1s. I need to check a rule for changing some of the 0s to -1s in this vector. But once I change a 0 to -1, I may only start applying the rule to the next 0 after I see the next 1 in the vector. Say for example x = (0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1). I need to traverse from the 9th element to the last (because the first occurrence of 1 is at 8). Let us assume that according to our rule we change the 13th element (only 0s can be changed) to -1. Now we need to go to the next occurrence of 1 (which is 15) and begin applying the rule from the 16th element to the end of the vector, and once a 0 is replaced by a -1, start again from the next 1. How do we code this? I 'feel' recursion is the best possible solution, but I am not a programmer and will await the experts' views. If this is not a typical R-forum question, my advance apologies. Many thanks

-- Matthew S. Shotwell Graduate Student Division of Biostatistics and Epidemiology Medical University of South Carolina
Re: [R] Documenting different OO-aproaches in R as a package?
s == schuster m...@friedrich-schuster.de on Tue, 27 Jul 2010 22:17:09 +0200 writes:

s Hello, I see some people, including myself, confused by the different object-oriented approaches in R (S3, S4, OOP, R.oo, etc.). s Would it be OK to collect examples and solutions for the different OO packages in one package and add a vignette for documentation? (assuming I find time for this task) s I mean, in this case the package would not add data or functionality to R or serve as a companion package for a book; it would (only) add documentation to R. Is this OK?

Hmm, we discussed this issue within R Core in a team meeting many years ago. At the time, those present agreed that we should emphasize S3 (for small and legacy applications) and S4, and concentrate on these rather than fostering even more alternatives. Everything else has been contributed by R users who were not happy with S4 ... at the time, at least ... Note that in the last several R releases, S3/S4 interoperability has been greatly improved. Martin Maechler, ETH Zurich

s Friedrich Schuster Dompfaffenweg 6 69123 Heidelberg
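For readers landing on this thread, the two approaches R Core emphasizes differ mainly in how classes and dispatch are declared. A minimal side-by-side sketch (the `area`/`area4` generics and `Circle` class are invented for illustration, not taken from any package):

```r
library(methods)  # S4 support; loaded by default in most sessions

## S3: informal -- a class is just an attribute, and dispatch works
## by the naming convention generic.classname
area <- function(shape) UseMethod("area")
area.circle <- function(shape) pi * shape$r^2
c3 <- structure(list(r = 2), class = "circle")
area(c3)                      # dispatches to area.circle

## S4: formal -- classes, generics, and methods are registered explicitly
setClass("Circle", slots = c(r = "numeric"))
setGeneric("area4", function(shape) standardGeneric("area4"))
setMethod("area4", "Circle", function(shape) pi * shape@r^2)
area4(new("Circle", r = 2))   # same result via formal dispatch
```

S4 adds validity checking and multiple dispatch at the price of more ceremony, which is roughly the trade-off the thread is about.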
Re: [R] how to code it??
Have you tried diff(c(0, x))?

On Wed, Jul 28, 2010 at 3:10 PM, Raghu r.raghura...@gmail.com wrote: Hi, I have a large vector of, say, 3500 digits. Initially the digits are 0s and 1s. I need to check a rule for changing some of the 0s to -1s in this vector. But once I change a 0 to -1, I may only start applying the rule to the next 0 after I see the next 1 in the vector. Say for example x = (0,0,0,0,0,0,0,1,0,0,1,1,0,0,1,0,0,0,1). I need to traverse from the 9th element to the last (because the first occurrence of 1 is at 8). Let us assume that according to our rule we change the 13th element (only 0s can be changed) to -1. Now we need to go to the next occurrence of 1 (which is 15) and begin applying the rule from the 16th element to the end of the vector, and once a 0 is replaced by a -1, start again from the next 1. How do we code this? I 'feel' recursion is the best possible solution, but I am not a programmer and will await the experts' views. If this is not a typical R-forum question, my advance apologies. Many thanks -- 'Raghu'

-- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O
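To see what the `diff(c(0, x))` suggestion does with the poster's example: prepending a 0 and differencing marks each 0-to-1 step with +1 and each 1-to-0 step with -1, so the first 0 after every run of 1s becomes -1. Note it is not identical to the loop-based `easy()` shown earlier in the thread: inside a run of consecutive 1s, `diff` yields 0 where the loop keeps the 1, so whether it answers the question depends on how repeated 1s should be treated.

```r
x <- c(0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 1)

# Prepend 0 so the result has the same length as x;
# +1 marks a 0->1 transition, -1 marks a 1->0 transition.
d <- diff(c(0, x))
d
# [1] 0 0 0 0 0 0 0 1 -1 0 1 0 -1 0 1 -1 0 0 1
```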
Re: [R] [Rd] R CMD build wiped my computer
Ubuntu also uses ~ as a backup-file suffix, but Ubuntu has a trash can where deleted files end up, so it would be easy to restore them. I would be surprised if Fedora didn't also have a trash can.

-Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Duncan Murdoch Sent: Wednesday, July 28, 2010 11:12 AM To: Jarrod Hadfield Cc: r-help@r-project.org; Marc Schwartz Subject: Re: [R] [Rd] R CMD build wiped my computer

On 28/07/2010 10:01 AM, Jarrod Hadfield wrote: Hi Marc, Thanks for the info on recovery - most of it can be pieced together from backups, but a quick, cheap and easy method of recovery would have been nicer. My main concern is that this could happen again and that the bug is not limited to R 2.9. I would think that an accidental carriage return at the end of a file name (even a temporary one) is a reasonably common occurrence (I'm surprised I hadn't done it before).

If you can put together a recipe to reproduce the problem (or a less extreme version of R deleting files it shouldn't), we'll certainly fix it. But so far all we have are guesses about what might have gone wrong, and I don't think anyone has been able to reproduce the problem on current R. Duncan Murdoch

Cheers, Jarrod

On 28 Jul 2010, at 14:04, Marc Schwartz wrote: Jarrod, Noting your exchange with Martin, Martin brings up a point that I certainly missed, which is that somehow the tilde ('~') character got into the chain of events. As Martin noted, on Linuxen/Unixen (including OS X), the tilde, when used in the context of file-name globbing, refers to your home directory. Thus, a command such as

ls ~

will list the files in your home directory. Similarly,

rm ~/*

will remove the files there as well; if the -rf argument is added, the deletion becomes recursive through that directory tree, which appears to be the case here. I am unclear, as Martin appears to be, as to the steps that caused this to happen.
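The danger Marc describes comes from shell tilde expansion: an unquoted `~` at the start of a word is expanded to the home directory by the shell before the command ever runs. A quick, safe way to see this (no files are touched):

```shell
# Unquoted tilde expands to the home directory...
echo ~

# ...while quoting suppresses the expansion entirely.
echo "~"
```

This is why a stray `~` token sneaking into a generated `rm -rf` command is catastrophic: by the time `rm` sees its arguments, the tilde has already been rewritten as the full home-directory path.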
That may yet be related in some fashion to Duncan's hypothesis. That being said, the use of the tilde character as a suffix to denote that a file is a backup version is not limited to Fedora, or to Linux for that matter. It is quite common for many text editors (e.g., Emacs) to use this, and as a result it is also common for applications to ignore files that have a tilde suffix. Based upon your follow-up posts to the original thread, it would seem that you do not have any backups. The default ext3 file system used on modern Linuxen, by design, makes it a bit more difficult to recover deleted files. This is due to the unlinking of file metadata at the file-system data-structure level, as opposed to simply marking the file as deleted in the directory structures, as happens on Windows. There is a utility called ext3undel (http://projects.izzysoft.de/trac/ext3undel), which is a wrapper of sorts around other undelete utilities such as PhotoRec and foremost. I have not used them, so cannot speak from personal experience; it would be a good idea to review the documentation and perhaps other online resources before proceeding. The other consideration is the Catch-22 of not copying anything new to your existing HD, for fear of overwriting the lost files with new data. So you would need to consider downloading these utilities via another computer and then running them on the computer in question from other media, such as a CD/DVD or USB HD. A more expensive option would be to use a professional data-recovery service, where you would have to weigh the cost of recovery against your lost time. One option would be Kroll OnTrack UK (http://www.ontrackdatarecovery.co.uk/); I happen to live about a quarter mile from their world HQ here in a suburb of Minneapolis. I have not used them myself, but others that I know have, with good success. Again, this comes at a potentially substantial monetary cost.
The key point is that if you have any hope of recovering the deleted files, you must not copy anything new onto the hard drive in the meantime. Doing so will reduce the possibility of file recovery to near zero. As Duncan noted, there is great empathy with your situation; we have all gone through this at one time or another. In my case it was perhaps 20+ years ago, but as a result I am quite anal-retentive about having backups, which I have done for some time on my systems, hourly. HTH, Marc Schwartz

On Jul 28, 2010, at 5:55 AM, Jarrod Hadfield wrote: Hi Martin, I think this is the most likely reason, given that the name in the DESCRIPTION file does NOT have a version number. Even so, it is very easy to misname a file and then delete it/change its name (as I've done here), and I hope current versions of R would not