[R] ggplot2: scale_y_log10() with geom_histogram
Dear ggplot2 users, is there an easy/elegant way to suppress zero-count bars in histograms with a logarithmic y axis? One (made up) example would be

    qplot(exp(rnorm(1000))) + geom_histogram(colour = "cornsilk", fill = "darkblue") + scale_x_sqrt() + scale_y_log10()

Thanks! Markus

[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
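One possible approach, sketched here rather than taken from the thread, is to bin the data yourself, drop the empty bins, and draw the remaining bins as rectangles. The names `h` and `bins` and the lower bar edge of 0.5 are illustrative choices, and the plotting part only runs if ggplot2 is installed.

```r
set.seed(1)
x <- exp(rnorm(1000))

# Bin the data manually so that empty bins can be removed before plotting
h <- hist(x, breaks = 30, plot = FALSE)
bins <- data.frame(xmin = head(h$breaks, -1),
                   xmax = tail(h$breaks, -1),
                   count = h$counts)
bins <- bins[bins$count > 0, ]   # keep only bins that contain observations

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # ymin = 0.5 is an arbitrary positive baseline; log10(0) is not finite
  p <- ggplot(bins) +
    geom_rect(aes(xmin = xmin, xmax = xmax, ymin = 0.5, ymax = count),
              colour = "cornsilk", fill = "darkblue") +
    scale_x_sqrt() +
    scale_y_log10()
}
```

Printing `p` draws the histogram; bins with zero counts simply have no rectangle.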
[R] barplot, different color for shading lines and bar
Dear all, might there be a modified barplot function out there which allows the user to specify a fill color for the bars and independent parameters for the overlaid shading lines? Currently, when I specify density and col, the fill color for the bars is white. Thanks! Markus
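A common workaround that needs no modified function is to draw the bars twice: once filled, and once more with shading lines on top via `add = TRUE`. The following is a minimal sketch with made-up data; the colours and the `heights` vector are placeholders.

```r
heights <- c(a = 3, b = 5, c = 2)

# First pass: solid fill colour
mids_fill <- barplot(heights, col = "lightblue", border = "black")

# Second pass: shading lines only, drawn over the existing bars
mids_shade <- barplot(heights, col = "darkblue", density = 15, angle = 45,
                      add = TRUE, axes = FALSE, axisnames = FALSE)
```

Both calls return the same bar midpoints, so the shading lines land exactly on the filled bars.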
[R] RWeka, java.lang.NullPointerException
Dear all, I have trained a J48 classifier in RWeka but when I try to predict on new data I get the following exception:

    fit <- J48(...)
    yNew <- predict(fit, x, type = "probability");
    Error in .jcall("RWekaInterfaces", "[D", "distributionForInstances", .jcast(classifier, : java.lang.NullPointerException

What could be the cause of this? I am using R version 2.10.0 on a Linux server. Thanks, Markus
[R] package gbm, predict.gbm with offset
Dear all, the help file for predict.gbm states: "The predictions from gbm do not include the offset term. The user may add the value of the offset to the predicted value if desired." I am just not sure how exactly, especially for a Poisson model, where I believe the offset is multiplicative? For example:

    library(MASS)
    fit1 <- glm(Claims ~ District + Group + Age + offset(log(Holders)), data = Insurance, family = poisson)
    head(predict(fit1, data = Insurance, type = "response"))
    # predict.glm includes the offset:
    head(predict(fit1, newdata = Insurance, type = "response"))
    #         1         2         3         4         5         6
    #  31.86358  35.27587  28.18080 158.87829  53.97772  84.16012

    library(gbm)
    fit2 <- gbm(Claims ~ District + Group + Age + offset(log(Holders)), data = Insurance, distribution = "poisson", n.trees = 600)
    head(predict(fit2, newdata = Insurance, type = "response", n.trees = 600))
    # [1] 0.1378249 0.1378249 0.1314991 0.1284441 0.1389563 0.1389563
    # Warning message:
    # In predict.gbm(fit2, newdata = Insurance, type = response, n.trees = 600) :
    #   predict.gbm does not add the offset to the predicted values.

Would the answer be simple multiplication such as:

    head(predict(fit2, newdata = Insurance, type = "response", n.trees = 600) * Insurance[, "Holders"])
    [1]  27.15151  36.38577  32.34878 215.78607  39.46359  74.48058

Any help would be immensely useful. Thx, Markus
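The algebra behind the multiplication can be checked with glm alone. With a log link, the mean is exp(eta + log(Holders)) = Holders * exp(eta), so an offset on the link scale becomes a factor on the response scale. The sketch below only uses glm and the Insurance data from MASS; applying the same reasoning to gbm assumes that its Poisson distribution also works on the log scale, as the help text quoted above suggests.

```r
library(MASS)  # provides the Insurance data set

fit <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
           data = Insurance, family = poisson)

# Linear predictor including the offset, then with the offset removed
eta_with <- predict(fit, newdata = Insurance, type = "link")
eta_without <- eta_with - log(Insurance$Holders)

# Adding the offset on the link scale equals multiplying on the response scale
mu_rebuilt <- exp(eta_without) * Insurance$Holders
mu_direct <- predict(fit, newdata = Insurance, type = "response")
```

Here `mu_rebuilt` and `mu_direct` agree, which is the same operation as multiplying offset-free response-scale predictions by `Holders`.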
[R] package gbm C++ code as separate module
Dear all, I would like to separate the gbm C++ code from any R dependencies, so that it could be compiled into a standalone module. I am wondering if anyone has already done this and could provide me with some pointers/help? Thanks! Markus
[R] modifying axis labels in lattice panels
Dear all, I am struggling to modify the axis labels/ticks in a panel provided to xyplot. To begin with, I do not know the equivalent of the xaxt="n" directive for panels that would set the stage for no default x axis being drawn. My goal is to draw ticks and custom formatted labels at certain hours of the week. When I execute the code below, I get an error message in the plot window that suggests a problem with some args parameter. A second problem concerns the shaded rectangles I try to draw. Clearly, the range returned by par('usr') does not correspond to the true y ranges. Any help would be greatly appreciated, Thanks, Markus

PS I am using R version 2.10.0 on MACOS and the lattice package version 0.18-3 (latest)

    library(lattice);
    #multivariate time series, one row for each hour of the week:
    Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") + (0:167)*3600, Comp.1= sin(2*pi*7*(0:167)/168), Comp.2 = cos(2*pi*7*(0:167)/168));
    #to pass this to xyplot, first need to reshape:
    Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction="long", timevar = "PC");
    #get descriptive shingle labels
    Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC",1:2));

    xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel, scales = list(x=list(at = Hours24-4*3600, labels=as.character(format(Hours24-4*3600,"%H")))));

    WeekhourPanel <- function(x,y,...){
      r <- range(x);
      #print(r)
      Hours8 <- seq(r[1], r[2], by=8*3600);
      Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
      #axis.POSIXct(1, at= Hours8, format="%H");
      panel.xyplot(x,y, type="l", ...);
      panel.grid(0,3);
      panel.abline(v= Hours24-4*3600, lty=2, col = rgb(0,0,1,0.5));
      panel.abline(v=Hours24+6*3600, lty=2, col = rgb(0,1,0,0.5));
      bb <- par('usr')
      y0 <- bb[3];
      for (i in seq(r[1], r[2], by=48*3600)) panel.rect(xleft=i, ybottom=y0, xright=i+24*3600-1, ytop=bb[4], col = rgb(0.75,0.75,0.75,0.3), border = NA);
      panel.axis(1, at= Hours24-4*3600, labels=as.character(format(Hours24-4*3600,"%H")));
      #panel.axis(1, at= Hours24+6*3600, labels=format(x,"%H"));
      #panel.axis(3, at= Hours24, labels=format(x,"%a"), line = -1, tick = FALSE);
    }
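A simplified sketch of how this could be set up follows; it is an assumption-laden illustration, not the resolution reached on the list. It relies on two lattice features: tick positions and labels passed through `scales` (with `scales = list(x = list(draw = FALSE))` as the rough counterpart of `xaxt = "n"`), and `current.panel.limits()` in place of `par("usr")`, which describes base graphics rather than the lattice panel. The names `ticks`, `lims` and `starts`, the fixed UTC time zone and the 8 am tick choice are illustrative. Note also that the posted call spells the argument `pane1l`; as written, that would not be matched to the `panel` argument of xyplot().

```r
library(lattice)

# Hourly time stamps for one week (time zone fixed to UTC for reproducibility)
t0 <- as.POSIXct("2010-09-06 00:00:00", tz = "UTC")
Xlong <- data.frame(
  time = rep(t0 + (0:167) * 3600, 2),
  PC   = factor(rep(c("PC 1", "PC 2"), each = 168)),
  Comp = c(sin(2 * pi * 7 * (0:167) / 168), cos(2 * pi * 7 * (0:167) / 168))
)

# Tick positions are defined where xyplot() is called, not inside the panel
ticks <- seq(t0 + 8 * 3600, by = 24 * 3600, length.out = 7)

WeekhourPanel <- function(x, y, ...) {
  x <- as.numeric(x)                      # work in seconds since the epoch
  lims <- current.panel.limits()          # actual x and y ranges of this panel
  starts <- seq(min(x), max(x), by = 48 * 3600)
  # Shade every other day over the full height of the panel
  panel.rect(xleft = starts, xright = starts + 24 * 3600 - 1,
             ybottom = lims$ylim[1], ytop = lims$ylim[2],
             col = rgb(0.75, 0.75, 0.75, 0.3), border = NA)
  panel.abline(v = as.numeric(ticks), lty = 2, col = rgb(0, 0, 1, 0.5))
  panel.xyplot(x, y, type = "l", ...)
}

p <- xyplot(Comp ~ time | PC, data = Xlong, panel = WeekhourPanel,
            scales = list(x = list(at = ticks,
                                   labels = format(ticks, "%a %H:%M"))))
```

Printing `p` draws one panel per level of `PC`, each with the custom ticks and shaded days.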
Re: [R] modifying axis labels in lattice panels
Thanks a lot for this incredibly helpful and thorough reply. I had actually meant to cut out the scales part before sending the email, very sorry about the confusion, so I was actually executing just

    xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel)

The scales part was a later attempt to control the axis directly, which I eventually abandoned (partly because I actually wanted the HOUR variables to be local to the panel function), and yes, in this simplified version I asked for labels only at 8am, which formats to "08". My intention was to add more hours and weekly labels once I figure out this simple axis first. I had hoped to define a panel function that draws only one PC at a time since I envision that grouping variable to have many levels (two were just an example). Might you know how to disable the axis drawing in panel.xyplot? Thanks! Markus

On Fri, Sep 10, 2010 at 12:45 PM, Dennis Murphy djmu...@gmail.com wrote:

> Hi:
>
> On Fri, Sep 10, 2010 at 7:16 AM, Markus Loecher markus.loec...@gmail.com wrote:
>
>> Dear all, I am struggling to modify the axis labels/ticks in a panel provided to xyplot. To begin with, I do not know the equivalent of the xaxt="n" directive for panels that would set the stage for no default x axis being drawn. My goal is to draw ticks and custom formatted labels at certain hours of the week. When I execute the code below, I get an error message in the plot window that suggests a problem with some args parameter. A second problem concerns the shaded rectangles I try to draw. Clearly, the range returned by par('usr') does not correspond to the true y ranges. Any help would be greatly appreciated, Thanks, Markus
>>
>> PS I am using R version 2.10.0 on MACOS and the lattice package version 0.18-3 (latest)
>>
>>     library(lattice);
>>     #multivariate time series, one row for each hour of the week:
>>     Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") + (0:167)*3600, Comp.1= sin(2*pi*7*(0:167)/168), Comp.2 = cos(2*pi*7*(0:167)/168));
>>     #to pass this to xyplot, first need to reshape:
>>     Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction="long", timevar = "PC");
>>     #get descriptive shingle labels
>>     Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC",1:2));
>
> A less mentally taxing approach :)
>
>     library(reshape)
>     xlong <- melt(Xwide, id = 'time')
>     names(xlong)[2:3] <- c('PC', 'Comp')
>
>>     xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel, scales = list(x=list(at = Hours24-4*3600, labels=as.character(format(Hours24-4*3600,"%H")))));
>
> When attempting to run this, I got
>
>     Error in xyplot.formula(Comp ~ time | PC, data = Xlong, pane1l = WeekhourPanel, : object 'Hours24' not found
>
> Attempting to pull Hours24 out of the function didn't work...
>
>     Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
>     Error in seq(r[1] + 12 * 3600, r[2], by = 24 * 3600) : object 'r' not found
>
> One problem is that to use Hours24 in scales, it has to be defined in the calling environment of xyplot() - in other words, it has to be defined outside the panel function and outside of xyplot() if your present code is to have any hope of working. I think I got that part figured out: in the console, type
>
>     r <- range(Xwide$time)
>     Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
>
> I at least get a plot now by running your xyplot() function with the panel function, but all the labels are 08 on the x-axis. Here's why:
>
>     format(Hours24-4*3600, "%H")
>     [1] "08" "08" "08" "08" "08" "08" "08"
>
> This comes from the labels = part of your panel function. I got the same plot with this code (apart from adding the lines):
>
>     xyplot(Comp ~ time | PC ,data = Xlong, type = 'l', scales = list(x = list(at = Hours24-4*3600, labels=as.character(format(Hours24-4*3600,"%H")))))
>
> which indicates that something in your panel function is awry. I'd suggest starting out simply. Put both plots on the same panel using PC as a grouping variable in the long data frame. It will automatically use different colors for groups, but you can control the line color with the col.lines = argument; e.g., col.lines = c('red', 'blue'). Next, I'd work on getting the axis ticks and labels the way you want with scales. It also appears that you want to set a custom grid - my suggestion would be to do that last, after you've controlled the axis ticks and labels. Once you have that figured out, you have the kernel of your panel function.
>
> In most applications I've seen in lattice, the idea is to keep the panel function as simple as possible and pass the 'global' stuff to the function call. There's something broken in your panel function, but it's a run-time error rather than a compile-time error, so you can either debug it or try simplifying the problem (and the panel function) as much as possible.
>
> HTH, Dennis
>
>>     WeekhourPanel <- function(x,y,...){ r <- range(x); #print(r) Hours8 <- seq(r[1], r[2], by=8*3600); Hours24 <- seq
[R] no predict function in lme4 ?
Dear mixed effects modelers, I seem unable to find a predict method for mer objects in the package lme4. Am I not seeing the forest for the trees? Any pointer would be very helpful. Thanks, Markus
[R] expression(), mixed symbols and evaluated objects
Is it possible to mix symbols and evaluated objects inside the expression() function? The following example shows what I am trying to achieve:

    for (m in 1:3) {
      plot(1:10); #just a place holder for the real plots
      title(expression(y = m * lambda));
    }

I want to actually evaluate the variable m but keep lambda as a symbol in the title. I tried to wrap an eval() around various subparts of the expression but to no avail. Going further, I ideally would like to mix text into the expression as well. Any help would be appreciated. Thanks, Markus
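One standard way to do this is `bquote()`, which leaves the expression unevaluated except for the parts wrapped in `.()`. The sketch below uses a hypothetical helper `make_label()`; `==` is the plotmath operator for a displayed equals sign, and `~` joins text to the formula with a space.

```r
# Build a plotmath label in which m is evaluated and lambda stays a symbol
make_label <- function(m) bquote(y == .(m) * lambda)

# The same idea with literal text mixed in
make_text_label <- function(m) bquote("Slope:" ~ y == .(m) * lambda)

for (m in 1:3) {
  plot(1:10)              # just a place holder for the real plots
  title(make_label(m))
}
```

`substitute(y == m * lambda, list(m = m))` would be an alternative with the same effect.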
Re: [R] MASS package not on CRAN ?
In fact, I must have a broken installation of R then (though I have not noticed any other problems so far). The library MASS is neither pre-installed nor can I explicitly install it (though the internet connection is up and functional). Thanks for all the help! Markus

2010/3/9 Uwe Ligges lig...@statistik.tu-dortmund.de

> On 09.03.2010 15:39, Ista Zahn wrote:
>
>> MASS is a recommended package, so is probably already installed on your machine. Try
>
> And if installation fails, it is either your internet connection that does not download the file in its original form or you have a broken installation of R (which would also be indicated if MASS is not already installed given you installed a released version of R).
>
> Best, Uwe Ligges
>
>>     library(MASS)
>>
>> -Ista
>>
>> On Tue, Mar 9, 2010 at 9:32 AM, Markus Loecher markus.loec...@gmail.com wrote:
>>
>>> The MASS package is listed on the CRAN web site ( http://cran.r-project.org/web/packages/MASS/index.html) but I am unable to install it via install.packages(). The error is that the package is unavailable. When I manually download the source tar ball and try to install it on a Linux machine, installation fails because it is not a valid package. Do I need to search different repositories? Thanks, Markus
[R] R package pdf files
Dear all, the examples in the pdf files that are automatically built from the examples in package help files are poorly formatted; they frequently do not wrap to the next line and are cut off. While there is an easy workaround by looking at the examples in the corresponding help files, I do wonder if there is a way to ensure proper line wrapping when creating a package. Thanks, Markus
[R] MASS package not on CRAN ?
The MASS package is listed on the CRAN web site ( http://cran.r-project.org/web/packages/MASS/index.html) but I am unable to install it via install.packages(). The error is that the package is unavailable. When I manually download the source tar ball and try to install it on a Linux machine, installation fails because it is not a valid package. Do I need to search different repositories? Thanks, Markus
[R] grid.image(), pckg grid
While I am very happy with and awed by the grid package and its basic plotting primitives such as grid.points, grid.lines, etc, I was wondering whether the equivalent of a grid.image() function exists? Any pointer would be helpful. Thanks! Markus
[R] possible memory leak in predict.gbm(), package gbm ?
Dear gbm users, when running predict.gbm() on a large dataset (150,000 rows, 300 columns, 500 trees), I notice that the memory used by R grows beyond reasonable limits. My 14GB of RAM are often not sufficient. I am interpreting this as a memory leak, since there should be no reason to expand memory needs once the data are loaded and passed to predict.gbm()? Running R version 2.9.2 on Linux, gbm package 1.6-3. Thanks, Markus
[R] package:snow, timeOut for makeSOCKcluster()
Dear snow users, is there any way to specify a max time after which makeSOCKcluster() stops trying to create socket connections and gives up/returns? In my current setup (MAC OSX 10.5.8, R version 2.9) I have to force quit R if the host specified in makeSOCKcluster() either does not exist or does not respond. On Linux, I can at least manually interrupt the function via Ctrl-C. Any help would be greatly appreciated, Thanks, Markus
[R] as.POSIXct(as.Date()) independent of timezone
Dear R users, I am struggling a bit with converting dates to full POSIX timestamps. In particular, I would like to somehow force the timezone to be local, i.e. the output of as.POSIXct(as.Date("2008-07-01")) should always be equal to "2008-07-01 00:00:00". Is that achievable? I tried to set the origin and the timezone, neither of which seems to make a difference. On my Mac Book Pro (R version 2.9.1), which is set to the Eastern US time zone, I obtain the shifted result:

    as.POSIXct(as.Date("2008-07-01"))
    [1] "2008-06-30 20:00:00 EDT"

And e.g.

    as.POSIXct(Sys.Date())
    [1] "2009-09-17 20:00:00 EDT"
    Sys.time()
    [1] "2009-09-18 10:10:48 EDT"

Any help would make life simpler for me. Thanks, Markus
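The shift arises because a Date is interpreted as midnight UTC when it is converted to POSIXct. One way around it, shown here as a sketch with a hypothetical helper `date_to_midnight()`, is to go through the character representation, which is then parsed as midnight in the requested (by default the local) time zone.

```r
# Convert a Date to midnight in a given time zone ("" means the local zone)
date_to_midnight <- function(d, tz = "") {
  as.POSIXct(format(d), tz = tz)
}

d <- as.Date("2008-07-01")
local_midnight <- date_to_midnight(d)
ny_midnight <- date_to_midnight(d, tz = "America/New_York")
utc_midnight <- date_to_midnight(d, tz = "UTC")
```

`format(ny_midnight, "%Y-%m-%d %H:%M:%S")` then shows the same calendar date with a time of 00:00:00, independent of the zone the machine is set to.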
[R] read/write connections
Dear fellow R users, I would very much like to see an example of a read/write connection (open = "r+") for e.g. pipe() or any other R connection. I have a standalone program which accepts input from stdin, performs some processing and returns the results on stdout. Is it possible at all to open a connection to that program, write to it (i.e. to stdin of that process) and read back the results? As a silly example, imagine the following use of the Unix function head:

    zz <- pipe("head", open = "r+");
    cat(rnorm(10), file = zz);
    Error in cat(rnorm(10), file = zz) : cannot write to this connection

While I am not surprised that this does not work, I would love to know a solution to this general problem. Thanks, Markus
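If a one-shot exchange is enough (send all input, then collect all output), `system2()` can feed a character vector to the program's stdin and capture its stdout. This is a sketch for a Unix-like system where `head` is on the path; it is not a true bidirectional connection, and an interactive back-and-forth with a long-running process would need a different mechanism.

```r
# Send ten lines to the stdin of `head -n 3` and read back its stdout
input_lines <- as.character(1:10)
out <- system2("head", args = c("-n", "3"),
               input = input_lines, stdout = TRUE)
```

`out` is a character vector with one element per line written by the external program.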
[R] read.table, row.names arg
Dear R users, I had somehow expected that read.table() would treat the column specified by the row.names argument as of class character. That seems to be the only sensible class allowed for a column containing row names. However, that does not seem to be the case, as the following example shows:

    x <- cbind.data.frame(ID = c("010007787048271871", "1007109516820319", "10094843652996959", "010145176274075487"), X1 = 1:4, X2 = 4:1);
    write.table(x, "tmp.txt", quote = FALSE, row.names = FALSE);
    y <- read.table("tmp.txt", header = TRUE, row.names = 1)
    y
                      X1 X2
    10007787048271872  1  4
    1007109516820319   2  3
    10094843652996960  3  2
    10145176274075488  4  1
    x
                      ID X1 X2
    1 010007787048271871  1  4
    2   1007109516820319  2  3
    3  10094843652996959  3  2
    4 010145176274075487  1  1

The first column was not read in as a string, which mangled the IDs. I could use colClasses explicitly, but then I would need to know the number and classes of the remaining columns in advance. Is this a bug or expected behavior? Any advice would be most helpful. Thanks, Markus
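In current versions of R, `colClasses` may be a named vector, in which case only the named columns are fixed and the rest are still converted automatically. The sketch below assumes the file has a header containing the column name `ID`; it writes to a temporary file rather than `tmp.txt`.

```r
ids <- c("010007787048271871", "1007109516820319",
         "10094843652996959", "010145176274075487")
x <- data.frame(ID = ids, X1 = 1:4, X2 = 4:1, stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".txt")
write.table(x, tmp, quote = FALSE, row.names = FALSE)

# Only the ID column is forced to character; X1 and X2 are type-converted as usual
y <- read.table(tmp, header = TRUE, row.names = 1,
                colClasses = c(ID = "character"))
```

The row names of `y` then keep their leading zeros and all digits.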
[R] quick square root axes
Dear R users, while I enjoy the built-in log argument to the plot() function, I wish it were as easy to create more general custom transformed axes such as sqrt(), logit, etc. For example, instead of

    plot(x = exp(rnorm(10)), y = (1:10)^4, log = "xy")

something along the lines of

    plot(x = exp(rnorm(10)), y = (1:10)^4, trans = list(x = log, y = sqrt))

to encode the desired transformation. This involves just transforming the xy values and creating nice tick marks at the appropriate positions. Before trying to write my own function, I wanted to see if that functionality already exists in another package? Thanks! Markus
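The recipe described above (transform the values, then place ticks at transformed positions but label them on the original scale) can be sketched in base graphics as follows. `plot_trans()` is a hypothetical helper, not an existing function, and it only transforms the y axis to keep the example short.

```r
# Plot y on a transformed scale while labelling the axis in original units
plot_trans <- function(x, y, ytrans = sqrt, ...) {
  plot(x, ytrans(y), yaxt = "n", ylab = "y (transformed axis)", ...)
  ticks <- pretty(y)                               # nice values on the original scale
  pos <- suppressWarnings(ytrans(ticks))           # where they sit after transformation
  keep <- is.finite(pos)                           # drop ticks outside the domain
  axis(2, at = pos[keep], labels = ticks[keep])
  invisible(ticks[keep])
}

ticks_used <- plot_trans(1:10, (1:10)^4, ytrans = sqrt)
```

The same pattern with `axis(1, ...)` and `xaxt = "n"` would handle the x axis.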
[R] can install.packages() copy utility files to the public_html directory ?
Dear fellow R-users, I am about to publish an HTML utility package to CRAN that expands on the R2HTML package and includes a few goodies such as sorted tables, easy automation of framed HTML reporting, etc. However, some of the resulting dynamic HTML pages need to access JavaScript code that should sit in a specific subdirectory of public_html. My more general question is hence, (i) how do I include the directory containing the JavaScript code in my R package and (ii) is it possible to copy this directory to the user's public_html path during installation? Thanks! Markus
[R] source code for prompt()
Dear R community, pardon my ignorance but how would you get the source code for non-visible functions? For example, I would like to see and modify the source code for the prompt() function. Thanks! Markus
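For functions that are not exported from a namespace, `getAnywhere()` and the `:::` operator are the usual tools. As a sketch: `prompt()` itself is a visible generic in utils, while its methods such as `prompt.default` are hidden.

```r
# List the methods of the generic; non-exported ones are flagged in the output
prompt_methods <- methods(prompt)

# Two ways to reach a non-exported method
found <- getAnywhere("prompt.default")
src <- utils:::prompt.default

# A copy can be edited with fix() or edit(); the copy lives in the workspace
my_prompt <- src
```

Printing `src` at the console shows the R source of the method.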
[R] package knnFinder, kd-trees
Dear R users, thanks to Samuel for making the package knnFinder available to the public. I was wondering if there is an easy way to only build and store the kd-tree in a first step and perform NN queries from then on? It seems that nn() does both simultaneously. Thanks! Markus
[R] compressing data without writing output to file
This might seem like a strange question but is there any way to compress an R object (such as a matrix) and know its resulting size in bytes? Clearly, I could implement this in the following way (if x is my matrix):

    zz <- gzfile(fname, "w");
    write.table(x, zz);
    close(zz);
    file.info(fname)[, "size"];

However, I need to do this for hundreds of thousands of objects and the overhead in terms of disk access due to the actual file creation is prohibitive. I guess I would like a modified object.size() function that returns the size of the compressed (e.g. gzip) version of the object. Thanks! Markus
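In-memory compression is available through `memCompress()`, which works on raw vectors such as those returned by `serialize(x, NULL)`. The helper `compressed_size()` below is a hypothetical sketch; note that it measures the compressed binary serialization, which will generally differ from the size of a gzipped `write.table()` text file.

```r
# Size in bytes of the gzip-compressed serialization of an R object
compressed_size <- function(obj, type = "gzip") {
  raw_bytes <- serialize(obj, connection = NULL)   # object as a raw vector
  length(memCompress(raw_bytes, type = type))      # compress in memory, count bytes
}

x <- matrix(0, nrow = 100, ncol = 100)
size_raw <- length(serialize(x, connection = NULL))
size_gz <- compressed_size(x)
```

No file is created at any point, so the function can be applied to many objects in a loop.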
[R] updating contents of a package
Dear fellow R users, I read through the Writing R Extensions document and am now able to create my own packages/libraries, which so far are just well documented collections of my own R functions. I use package.skeleton() and the tools package to build these packages. However, it is not clear to me how to modify and update a package after its initial creation. How do you elegantly update e.g. the old help file when an argument has been added to a function? How do you keep most of the existing package structure when implementing incremental changes? Any help would be very useful, Thanks, Markus
[R] beanplot, Error in shapiro.test(x)
Dear all, I am trying to create beanplots from a dataset for which boxplot works fine. (MACOS, R 2.8.1 GUI 1.27 Tiger build 32-bit (5301)) I am getting the following error message:

    Error in shapiro.test(x) : sample size must be between 3 and 5000

I am not even sure why the shapiro.test is being used, but is there any workaround? Thanks! Markus
[R] interrupting R
Dear fellow R users, is there a generic way to gracefully interrupt an R function without terminating the entire session? I am mainly interested in this answer for Linux and MacOS. I found neither Esc nor Ctrl-C to work; it seems that R does not check for signals periodically? Also, an entirely unrelated question: I have been looking unsuccessfully for the R sources for the examples given in Simon Wood's book on Generalized Additive Models. I had hoped they would be part of the mgcv package but they are not. Has anyone had any luck with this? Thanks, Markus
Re: [R] interrupting R
Thank you for the quick reply. It seems that Ctrl-C interrupts pure R functions (i.e. R scripts that do not call external compiled libraries) but when I run functions that in turn call external C code (such as gam() in the package mgcv), the Ctrl-C does not appear to propagate that deeply, if I may use such loose language. The Stop icon in the R.app on MAC OS is similarly unresponsive when e.g. gam() is performing some extensive computations. (I am just using gam() as an example, nothing special about it, I think.) You are right, I should ask the author about the source code, I just did not want to add more requests to his InBox. Thanks again, Markus

On Fri, Jan 2, 2009 at 8:56 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

> On Fri, 2 Jan 2009, Markus Loecher wrote:
>
>> Dear fellow R users, is there a generic way to gracefully interrupt an R function without terminating the entire session? I am mainly interested in this answer for Linux and MacOS. I found neither Esc nor Ctrl-C to work; it seems that R does not check for signals periodically?
>
> Well, Ctrl-C works for me. Rather than check for signals, R installs a signal handler and gets the OS to do the work. On Mac OS it is unclear if you mean R or R.app. R.app has a Stop sign icon, amongst other ways.
>
>> Also, an entirely unrelated question: I have been looking unsuccessfully for the R sources for the examples given in Simon Wood's book on Generalized Additive Models. I had hoped they would be part of the mgcv package but they are not. Has anyone had any luck with this?
>
> Why not ask him directly?
>
>> Thanks, Markus
>
> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595
[R] apply() just loops ?
Dear R users, I have been diligently using the apply() family in order to avoid explicit for loops and speed up computation. However, when I finally inspected the source code for apply, it appears that the core computation is a simple loop as well. What am I missing? Why the often-repeated advice to use apply() instead of loops, and why the empirical speedups actually observed on many tasks? Thanks in advance for demystifying, Markus
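A small comparison may help frame the question. In the sketch below an explicit loop, `apply()` and the vectorized `rowSums()` all compute the same row sums; `apply()` does loop at the R level, so any large speedup usually comes from replacing the loop with a function implemented in compiled code, not from `apply()` itself. Timings are deliberately not shown, since they depend on the machine and R version.

```r
set.seed(42)
m <- matrix(rnorm(2000), nrow = 200)

# Explicit loop with a preallocated result vector
loop_sums <- numeric(nrow(m))
for (i in seq_len(nrow(m))) loop_sums[i] <- sum(m[i, ])

# apply(): more compact, but still one R-level call of sum() per row
apply_sums <- apply(m, 1, sum)

# Vectorized alternative: the iteration happens in compiled code
vec_sums <- rowSums(m)
```

Wrapping each variant in `system.time()` on a larger matrix shows the difference on a given machine.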
[R] findInterval(), binary search, log(N) complexity
Dear R users, the help for findInterval(x,vec) suggests a logarithmic dependence on N (=length(vec)), which would imply a binary search type algorithm. However, when I test this hypothesis in the following manner:

    set.seed(-3645); l <- vector(); N.seq <- c(5000, 50, 100, 1000, 5000); k <- 1
    for (N in N.seq){
      tmp <- sort(round(stats::rt(N, df=2), 2));
      l[k] <- system.time(it3 <- findInterval(-1, tmp))[2]; k <- k + 1;
    }
    plot(N.seq, l, type="b", xlab="length(vec)", ylab="CPU time");

the resulting plot suggests a linear relationship. I must be missing something here? Thanks! Markus
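One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that findInterval() first verifies that vec is sorted and free of NAs, which is a linear scan, so a single lookup is dominated by that check. Note also that element [2] of system.time() is the system CPU time, not the user or elapsed time. For reference, the logarithmic search itself can be sketched as below; `binary_find_interval()` is a hypothetical helper that assumes a sorted, NA-free vec and a single x.

```r
# Return the largest i with vec[i] <= x (0 if x < vec[1]); vec must be sorted
binary_find_interval <- function(x, vec) {
  lo <- 0L                      # conceptually vec[0] = -Inf
  hi <- length(vec) + 1L        # conceptually vec[n + 1] = +Inf
  while (hi - lo > 1L) {        # interval is halved in every iteration
    mid <- (lo + hi) %/% 2L
    if (vec[mid] <= x) lo <- mid else hi <- mid
  }
  lo
}

set.seed(1)
vec <- sort(round(rt(1000, df = 2), 2))
xs <- c(-100, -1, 0, 0.5, 3, 100)
manual <- sapply(xs, binary_find_interval, vec = vec)
builtin <- findInterval(xs, vec)
```

The loop runs about log2(length(vec)) times per lookup, and `manual` agrees with `builtin`.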
[R] gam negative.binomial
Dear list members, while I appreciate the possibility of dealing with overdispersion in count data by specifying the family argument to be either quasipoisson() or negative.binomial(), this estimates just one overdispersion parameter for the entire data set. In my applications I often would like the estimate for overdispersion to depend on the covariates in the same manner as the mean. For example,

    #either library(mgcv) or library(gam):
    x <- seq(0,1,length = 100)*2*pi
    mu <- 4 + 2*sin(x)
    size <- 4 + 2*cos(x)
    data <- cbind.data.frame(x = rep(x,10), y = rnbinom(10*100, mu=rep(mu,10), size=rep(size,10)))
    x.gam <- gam(y~s(x), data=data, family=quasipoisson())
    plot(x.gam)
    summary(x.gam)

How would I get a smooth estimate of the overdispersion? Thanks, Markus
[R] mboost partial contribution plots
Having just read the nice review article on boosting in the latest Statistical Science, I would love to reproduce some of the plots in that article, but it is not clear to me how to create the partial contribution plots for the Poisson regression. Does anyone have example code for this? (The vignette does not offer it, I think.) Thanks! Markus
[R] estimate of overdispersion with glm.nb
Dear R users, I am trying to fully understand the difference between estimating overdispersion with glm.nb() from MASS and with glm(..., family = quasipoisson). It seems that (i) the coefficient estimates are different and (ii) the summary() method for glm.nb suggests that the overdispersion is taken to be one: "Dispersion parameter for Negative Binomial(0.9695) family taken to be 1", which is not what I expected. The code I used is pasted below:

x <- rep(seq(0, 23, by = 1), 50)
s <- rep(seq(1, 2, length = 50 * 24), 1)
tmp <- cbind.data.frame(y = rnbinom(length(x), mu = 8 * (sin(2 * pi * x / 24) + 2), size = 1), x = x, s = s)
tmp.glm.qp <- glm(y ~ factor(x) - 1, data = tmp, family = quasipoisson, offset = log(s))
tmp.glm.nb <- glm.nb(y ~ factor(x) - 1 + offset(log(s)), data = tmp)

On a more advanced topic, I was also hoping to compare a model with a global estimate of overdispersion against one that allows overdispersion to be estimated separately for each level of the factor x. Can I achieve that in glm, or do I need to employ a mixed effects model? Thanks! Markus
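The two models parameterise the extra variance differently: quasi-Poisson assumes Var(y) = phi * mu and reports phi as the dispersion, whereas the negative binomial assumes Var(y) = mu + mu^2 / theta, so the overdispersion sits in theta and the GLM dispersion is fixed at 1. The sketch below, a minimal illustration using base R only, shows how the quasi-Poisson dispersion is obtained from Pearson residuals, and computes a per-level analogue. The per-level estimate phi_by_level is an ad hoc diagnostic introduced here, not something glm() reports.

```r
set.seed(1)
x <- rep(0:23, 50)
s <- seq(1, 2, length.out = length(x))
y <- rnbinom(length(x), mu = 8 * (sin(2 * pi * x / 24) + 2), size = 1)
d <- data.frame(y = y, x = factor(x), s = s)

fit_qp <- glm(y ~ x - 1 + offset(log(s)), data = d, family = quasipoisson)

# Global dispersion: Pearson chi-square divided by the residual degrees of freedom
pearson    <- residuals(fit_qp, type = "pearson")
phi_global <- sum(pearson^2) / df.residual(fit_qp)

# This matches the value reported by summary()
stopifnot(isTRUE(all.equal(phi_global, summary(fit_qp)$dispersion)))

# Ad hoc per-level analogue: one mean parameter per level, so n_level - 1 df each
n_level      <- as.vector(table(d$x))
phi_by_level <- as.vector(tapply(pearson^2, d$x, sum)) / (n_level - 1)

print(phi_global)
print(round(phi_by_level, 2))
```

Because the simulated variance is mu + mu^2 (size = 1), the per-level values should tend to be larger where the mean is larger, which a single global phi cannot represent.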
[R] pnbinom.c qnorm.c
Dear R users, I was wondering where I could get the C source code that computes pnbinom() and qnorm()? (I would use R in batch mode, but I find the startup time prohibitive, unless there is a way to speed it up.) I searched the Web and the code is clearly part of the R distribution; I just don't know how to extract it. Thanking you! Markus Loecher Princeton, NJ
[R] Confidence intervals for PCA scores/eigenvalues
Dear all, I have read various descriptions of employing resampling techniques, such as the bootstrap, to estimate the uncertainties of the eigenvectors computed by PCA. When I try
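For readers unfamiliar with the resampling approach mentioned above, here is a minimal sketch of a nonparametric bootstrap for PCA eigenvalues using base R's prcomp(). The data set (iris), the number of replicates B, and the use of percentile intervals are all assumptions made for illustration; the original message is cut off before describing what was actually tried.

```r
set.seed(1)
X <- as.matrix(iris[, 1:4])   # example data, chosen only for illustration
B <- 200                      # number of bootstrap replicates (arbitrary)

# Eigenvalues of the correlation matrix on the original data
eig_hat <- prcomp(X, center = TRUE, scale. = TRUE)$sdev^2

# Resample rows with replacement and recompute the eigenvalues each time;
# the result is a matrix with one row per component and one column per replicate
boot_eig <- replicate(B, {
  idx <- sample(nrow(X), replace = TRUE)
  prcomp(X[idx, ], center = TRUE, scale. = TRUE)$sdev^2
})

# Percentile 95% intervals, one column per principal component
ci <- apply(boot_eig, 1, quantile, probs = c(0.025, 0.975))
print(rbind(estimate = eig_hat, ci))
```

The same loop can collect eigenvectors (the rotation matrix), but their signs are arbitrary, so each bootstrap eigenvector would need to be aligned with the original one before summarising.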
[R] Too many open files
Dear all, did the problem that was posted in 2006 (see below) ever get fully resolved? I am encountering the exact same issue: I have executed get.hist.quote() in a loop, and now R not only refuses to establish any further connections to Yahoo but, worse, will not open any files either. For example, I cannot even save my current workspace for that reason. I tried closeAllConnections() as well as showConnections(), but I do not see any open connections. Also, does get.hist.quote() not close its connection internally once it is done? Any help would be immensely useful as I am quite stuck. Thanks! Markus

Re: [R] Too many open files
From: Seth Falcon sfalcon_at_fhcrc.org Date: Sat 27 May 2006 - 09:21:36 EST

Omar Lakkis [EMAIL PROTECTED] writes: This may be more of an OS question ... I have this call

r = get.hist.quote(symbol, start = format(start, "%Y-%m-%d"), end = format(end, "%Y-%m-%d"))

which does a url request in a loop and my program runs out of file handlers after few hundred rotations. The error message is: 'Too many open files'. Other than increasing the file handlers assigned to my process, is there a way to cleanly release and reuse these connections?

Inside your loop you need to close the connection object created by url().

for (i in 1:500) {
  con <- url(urls[i])
  ## ... stuff here ...
  close(con)
}

R only allows you to have a fixed number of open connections at one time, and they do not get closed automatically when they go out of scope. These commands may help make clear how things work...

showConnections()
     description class mode text isopen can read can write
f = url("http://www.r-project.org", open="r")
showConnections()

From: Gabor Grothendieck ggrothendieck_at_gmail.com Date: Sat 27 May 2006 - 09:47:20 EST

Try closeAllConnections()
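The advice quoted above (close every connection you open) can be made robust against errors by registering the close with on.exit(). The sketch below is an illustration under stated assumptions: it uses a temporary file instead of a URL so that it runs without network access, and read_first_line() is a hypothetical helper, not part of get.hist.quote() or any package.

```r
# Hypothetical helper: open a connection, read from it, and guarantee it is
# closed again even if reading fails part-way through.
read_first_line <- function(path) {
  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)   # runs when the function exits, error or not
  readLines(con, n = 1L)
}

# Stand-in for the remote resource: a small temporary file
tmp <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), tmp)

n_before <- nrow(showConnections())
for (i in 1:500) {
  first <- read_first_line(tmp)
}
n_after <- nrow(showConnections())

# No connections are left open after 500 iterations
stopifnot(n_after == n_before, identical(first, "first line"))
unlink(tmp)
```

The same pattern should apply to url() connections: replace file(path, open = "r") with url(address, open = "r") inside the helper.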