[Rd] S4 Objects [Sec=Unclassified]
I am new to R programming but have dived into a medium-sized modelling software development project. Having come from a Java OO background, I have a couple of questions about S4 objects.

Is there a way to make S4 slots (and methods) private and hence force the use of accessor methods?

Is there a straightforward way to implement pass-by-reference for method parameters? I am currently returning and overwriting updated objects, which is clunky and costly, and would like a more efficient way of doing this.

Can anyone point me to some useful texts on S4 programming apart from the following:

Chambers - Software for Data Analysis: Programming with R
Venables - S Programming

Thanks heaps
Troy

___
Australian Antarctic Division - Commonwealth of Australia
IMPORTANT: This transmission is intended for the addressee only. If you are not the intended recipient, you are notified that use or dissemination of this communication is strictly prohibited by Commonwealth law. If you have received this transmission in error, please notify the sender immediately by e-mail or by telephoning +61 3 6232 3209 and DELETE the message. Visit our web site at http://www.antarctica.gov.au/
___

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] reference counting bug: overwriting for loop 'seq' variable
William Dunlap wrote:

It looks like the 'seq' variable to 'for' can be altered from within the loop, leading to incorrect answers. E.g., in the following I'd expect 'sum' to be 1+2=3, but R 2.10.0 (svn 48686) gives 44.5.

    x = c(1,2); sum = 0; for (i in x) { x[i+1] = i + 42.5; sum = sum + i }; sum
    [1] 44.5

or, with debugging cat()s,

    x = c(1,2); sum = 0; for (i in x) { cat("before, i=", i, "\n"); x[i+1] = i + 42.5; cat("after, i=", i, "\n"); sum = sum + i }; sum
    before, i= 1
    after, i= 1
    before, i= 43.5
    after, i= 43.5
    [1] 44.5

If I force the for's 'seq' to be a copy of x by adding 0 to it, then I do get the expected answer.

    x = c(1,2); sum = 0; for (i in x+0) { x[i+1] = i + 42.5; sum = sum + i }; sum
    [1] 3

It looks like an error in reference counting.

indeed; seems like you've hit the issue of when r triggers data duplication and when it doesn't, discussed some time ago in the context of names() etc. consider:

    x = 1:2
    for (i in x) x[i+1] = i-1
    x
    # 1 0 1

    y = c(1, 2)
    for (i in y) y[i+1] = i-1
    y
    # -1 0

vQ
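The expectation in the report can be written down as a small check. This is only a sketch, assuming an R build in which `for` iterates over the value its sequence had when the loop started; the names `total`, `total2` and `snapshot` are illustrative and not from the thread. The second loop mirrors the `x+0` workaround by iterating over an explicit copy.

```r
# Assumption: an R version where `for` keeps the sequence as it was on loop entry.
x <- c(1, 2)
total <- 0
for (i in x) {
  x[i + 1] <- i + 42.5   # modifies x; should not change the values iterated over
  total <- total + i
}

# Defensive variant: iterate over an explicit snapshot of the values.
y <- c(1, 2)
snapshot <- y + 0        # arithmetic produces a fresh vector, as in the x+0 workaround
total2 <- 0
for (i in snapshot) {
  y[i + 1] <- i + 42.5
  total2 <- total2 + i
}
```

Iterating over a snapshot costs one extra copy of the sequence, which is usually negligible next to the loop body.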
Re: [Rd] formal argument envir matched by multiple actual arguments
In fact reg.finalizer() looks like a dangerous feature. If the finalizer itself triggers (implicitly or explicitly) garbage collection, then bad things happen. In the following example, garbage collection is triggered explicitly (using R-2.9.0):

    setClass("B", representation(bb="environment"))

    newB <- function()
    {
        ans <- new("B", bb=new.env())
        reg.finalizer(ans@bb,
                      function(e)
                      {
                          gc()
                          cat("cleaning", class(ans), "object...\n")
                      }
        )
        return(ans)
    }

    for (i in 1:500) {cat(i, "\n"); b1 <- newB()}
    1
    2
    3
    4
    5
    6
    ...
    13
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    14
    ...
    169
    170
    171
    Error: not a weak reference
    Error: not a weak reference
    [repeat the above line thousands of times]
    ...
    Error: not a weak reference
    Error: not a weak reference
    cleaning B object...
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    [repeat the above line thousands of times]
    ...
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    Error: SET_VECTOR_ELT() can only be applied to a 'list', not a 'integer'
    172
    ...
    246
    247
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...
    cleaning B object...

     *** caught segfault ***
    address 0x41, cause 'memory not mapped'

    Traceback:
     1: gc()
     2: function (e) {    gc()    cat("cleaning", class(ans), "object...\n")}(<environment>)

    Possible actions:
    1: abort (with core dump, if enabled)
    2: normal R exit
    3: exit R without saving workspace
    4: exit R saving workspace
    Selection: 2
    Save workspace image? [y/n/c]: n
    Segmentation fault

So apparently, if the finalizer triggers garbage collection, then we can end up with a corrupted session. Then anything can happen, from the strange 'formal argument envir matched by multiple actual arguments' error I reported in the previous post, to a segfault. In the worst case, nothing apparently happens but the output produced by the code is wrong.

Maybe garbage collection requests should be ignored during the execution of the finalizer? (and more generally during garbage collection itself)

Cheers,
H.
[Rd] The default position of plot title
Dear R-developers, It seems to me that the position of title is usually at the bottom of a plot in sociological and political science books and articles. I wonder if the same convention applies in other disciplines. If yes, is it reasonable to change the default position of main title of plot function? -- HUANG Ronggui, Wincent PhD Candidate Dept of Public and Social Administration City University of Hong Kong Home page: http://asrr.r-forge.r-project.org/rghuang.html __ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] formal argument envir matched by multiple actual arguments
Hi.

2009/6/1 Hervé Pagès hpa...@fhcrc.org:

Hi list,

This looks similar to the problem reported here

    https://stat.ethz.ch/pipermail/r-devel/2006-April/037199.html

by Henrik Bengtsson a long time ago. It is very sporadic and non-reproducible. Henrik, do you remember if your code was using reg.finalizer()? I tend to suspect it but I'm not sure.

Yes. This was/is observed with objects extending the Object class of R.oo, and the constructor of Object uses reg.finalizer() [which then calls finalize() that can be overloaded]. The fact that the garbage collector is involved could explain why this bug(?) is hard to reproduce.

It's been a while since I saw this problem (and we do instantiate way more Object:s these days). Looking at my source code comments and the post you refer to, I suspect that I managed to circumvent the issue by the following trick (looking at my code, I have several of those statements):

    envir2 <- envir
    get(name, envir=envir2)

Also, on March 6, 2008 I reported to R-devel on a related problem with '%in%':

    http://tolstoy.newcastle.edu.au/R/e4/devel/08/03/0708.html

That one I circumvent by now only using is.element(a,b) instead of a %in% b.

Maybe this gives you further clues.

/Henrik

BTW. You need to be careful when you register a finalizer that uses code in a package, which may have been detached. This may cause an error in the finalizer which can give further side effects. See here:

    http://tolstoy.newcastle.edu.au/R/e2/devel/07/08/4251.html

I've been hunting this bug for months but today, and with the help of other Bioconductor users, I was able to isolate it and to write some code that seems to almost reproduce it (i.e. not systematically but most of the time).

(Just to put some context to the code below: it's a simplified version of some more complex code that we use in Bioconductor to manage memory caching of some big objects stored on disk. The idea is that objects of class A can be named. All A objects with the same name form a group. The code below implements a simple mechanism to trigger some action when a group is completely removed from memory, i.e. when the last object in a group is garbage collected.)

    setClassUnion("environmentORNULL", c("environment", "NULL"))

    setClass("A",
        representation(
            aa="integer",
            groupname="character",
            groupanchor="environmentORNULL"
        )
    )

    .A.group.sizes <- new.env(hash=TRUE, parent=emptyenv())

    .inc.A.group.size <- function(groupname)
    {
        group.size <- 1L
        if (exists(groupname, envir=.A.group.sizes, inherits=FALSE))
            group.size <- group.size +
                          get(groupname, envir=.A.group.sizes, inherits=FALSE)
        assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
    }

    .dec.A.group.size <- function(groupname)
    {
        group.size <- get(groupname, envir=.A.group.sizes, inherits=FALSE) - 1L
        assign(groupname, group.size, envir=.A.group.sizes, inherits=FALSE)
        return(group.size)
    }

    newA <- function(groupname="")
    {
        a <- new("A", groupname=groupname)
        if (!identical(groupname, "")) {
            .inc.A.group.size(groupname)
            groupanchor <- new.env(parent=emptyenv())
            reg.finalizer(groupanchor,
                function(e)
                {
                    group.size <- .dec.A.group.size(groupname)
                    if (group.size == 0L) {
                        cat("no more object of group", groupname, "in memory\n")
                        # take some action
                    }
                }
            )
            a@groupanchor <- groupanchor
        }
        return(a)
    }

The following commands seem to trigger the problem:

    for (i in 1:2000) {a1 <- newA("group1")}
    as.list(.A.group.sizes)
    gc()
    as.list(.A.group.sizes)
    for (i in 1:2000) {a2 <- newA("group2")}
    Error in assign(".Method", method, envir = envir) :
      formal argument "envir" matched by multiple actual arguments

If it doesn't, then adding more rounds should finally do it:

    gc()
    for (i in 1:2000) {a3 <- newA("group3")}
    gc()
    for (i in 1:2000) {a4 <- newA("group4")}
    etc...

Thanks in advance for any help with this!

H.

    sessionInfo()
    R version 2.9.0 (2009-04-17)
    x86_64-unknown-linux-gnu

    locale:
    LC_CTYPE=en_CA.UTF-8;LC_NUMERIC=C;LC_TIME=en_CA.UTF-8;LC_COLLATE=en_CA.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_CA.UTF-8;LC_PAPER=en_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_CA.UTF-8;LC_IDENTIFICATION=C

    attached base packages:
    [1] stats graphics grDevices utils datasets methods base

--
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpa...@fhcrc.org
Phone: (206) 667-5791
Fax: (206) 667-1319
[Rd] Bug in so_strsplit (PR#13742)
Full_Name: Wacek Kusnierczyk
Version: 2.10.0 r48689
OS: Ubuntu 8.04 Linux 32b
Submission from: (NULL) (129.241.199.78)

src/main/character.c:435-438 (do_strsplit) contains the following code:

    for (i = 0; i < tlen; i++)
        if (getCharCE(STRING_ELT(tok, 0)) == CE_UTF8) use_UTF8 = TRUE;
    for (i = 0; i < len; i++)
        if (getCharCE(STRING_ELT(x, 0)) == CE_UTF8) use_UTF8 = TRUE;

both loops iterate over loop-invariant expressions and statements. either the loops are redundant, or the fixed index '0' is copied over from some other place and should be replaced with 'i'.

the bug can be fixed with

    for (i = 0; i < tlen; i++)
        if (getCharCE(STRING_ELT(tok, i)) == CE_UTF8) {
            use_UTF8 = TRUE;
            break; }
    for (i = 0; i < len; i++)
        if (getCharCE(STRING_ELT(x, i)) == CE_UTF8) {
            use_UTF8 = TRUE;
            break; }

or with

    #define CHECK_CE(CHARACTER, LENGTH, USEUTF8) \
        for (i = 0; i < (LENGTH); i++) \
            if (getCharCE(STRING_ELT((CHARACTER), i)) == CE_UTF8) { \
                (USEUTF8) = TRUE; \
                break; }
    CHECK_CE(tok, tlen, use_UTF8)
    CHECK_CE(x, len, use_UTF8)
Re: [Rd] formal argument envir matched by multiple actual arguments
Nice case - I think you're onto something.

/Henrik
[Rd] cryptic error message from R CMD check
Dear R developers,

I've run into a very cryptic error message from R CMD check while working on a new package. This is the relevant output:

    [fiz...@~/Rmap]:R CMD check Rmap
    * checking for working pdflatex ... OK
    * using log directory '/home/fizban/Rmap/Rmap.Rcheck'
    * using R version 2.9.0 (2009-04-17)
    * using session charset: UTF-8
    * checking for file 'Rmap/DESCRIPTION' ... OK
    * checking extension type ... Package
    * this is package 'Rmap' version '0.1'
    * checking package dependencies ... OK
    * checking if this is a source package ... OK
    * checking for .dll and .exe files ... OK
    * checking whether package 'Rmap' can be installed ... ERROR
    Installation failed.
    See '/home/fizban/Rmap/Rmap.Rcheck/00install.out' for details.

    [fiz...@~/Rmap]:cat /home/fizban/Rmap/Rmap.Rcheck/00install.out
    * Installing *source* package ‘Rmap’ ...
    ** R
    ** preparing package for lazy loading
    ** help
    *** installing help indices
    Error in `[.data.frame`(M, , 4) : undefined columns selected
    * Removing ‘/home/fizban/Rmap/Rmap.Rcheck/Rmap’

R CMD build + INSTALL fails in the same way:

    [fiz...@~/Rmap]:R CMD build Rmap
    * checking for file 'Rmap/DESCRIPTION' ... OK
    * preparing 'Rmap':
    * checking DESCRIPTION meta-information ... OK
    * removing junk files
    * checking for LF line-endings in source and make files
    * checking for empty or unneeded directories
    WARNING: directory 'Rmap/man' is empty
    * building 'Rmap_0.1.tar.gz'

    [fiz...@~/Rmap]:sudo R CMD INSTALL Rmap_0.1.tar.gz
    * Installing to library ‘/usr/local/lib/R/site-library’
    * Installing *source* package ‘Rmap’ ...
    ** R
    ** preparing package for lazy loading
    ** help
    *** installing help indices
    Error in `[.data.frame`(M, , 4) : undefined columns selected
    * Removing ‘/usr/local/lib/R/site-library/Rmap’

The error is easily reproducible, as it's caused by the lack of Rd documentation in the man directory; adding even one Rd file solves the problem. It's clearly a corner case (yes, I'm lazy and should have written the documentation a long time ago), but if you think it's worth the time it would be better to have a clearer error message from R CMD check.

P.S.: this is on an updated Debian Sid:

    [fiz...@~/Rmap]:dpkg -l | grep "ii  r-" | grep -v cran
    ii  r-base         2.9.0-4  GNU R statistical computation and graphics system
    ii  r-base-core    2.9.0-4  GNU R core of statistical computation and graphics system
    ii  r-recommended  2.9.0-4  GNU R collection of recommended packages [metapackage]

Thanks for your time,
Marco Scutari

--
Marco Scutari, Ph.D. Student
Department of Statistical Sciences
University of Padova, Italy

"Facts don't care if you feel good about them." Slashdot, 25/10/07
Re: [Rd] S4 Objects [Sec=Unclassified]
Hi Troy --

Troy Robertson wrote:

I am new to R programming but have dived into a medium sized modelling software development project. Having come from a Java OO background I have a couple of questions about S4 objects.

Is there a way to make S4 slots (and methods) private and hence force the use of accessor methods?

No, except by convention (e.g., 'don't directly access slots name-mangled in this way'; 'non-package code must never directly access slots').

Is there a straight-forward way to implement pass-by-reference for method parameters? I am currently returning and overwriting updated objects which is clunky and costly and would like a more efficient way of doing this.

No, copy-on-change is the most common semantic in R; using an 'environment' provides some flexibility, but use with S4 introduces twists. See this concurrent thread:

    https://stat.ethz.ch/pipermail/r-help/2009-June/200038.html

(My 2 cents:) embracing rather than avoiding the paradigm might lead to different designs, e.g., 'column-oriented' (an S4 instance representing an entire table) rather than 'row-oriented' (an S4 instance for each row) data structures.

Can anyone point me to some useful texts on S4 programming apart from the following:

Chambers - Software for Data Analysis: Programming with R
Venables - S Programming

Gentleman, R Programming for Bioinformatics.

Hope that helps.

Martin

Thanks heaps
Troy
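The two suggestions in this reply (accessors by convention, an environment for shared state) can be combined in a few lines. What follows is only a sketch: the class `Counter`, its slot `state`, and the functions `newCounter`, `count` and `increment` are hypothetical names invented for illustration. One of the "twists" is visible at the end: assigning the object to a second name does not copy the environment, so both names see the same state.

```r
library(methods)

# Hypothetical class: all mutable state lives in an environment slot.
setClass("Counter", representation(state = "environment"))

# Constructor: give every instance its own fresh environment.
newCounter <- function() {
  state <- new.env(parent = emptyenv())
  assign("n", 0L, envir = state)
  new("Counter", state = state)
}

# Accessor: callers are expected (by convention only) to use this, not x@state.
setGeneric("count", function(x) standardGeneric("count"))
setMethod("count", "Counter", function(x) get("n", envir = x@state))

# Mutator: updates the environment in place, so no object needs to be returned
# and reassigned by the caller.
setGeneric("increment", function(x) standardGeneric("increment"))
setMethod("increment", "Counter", function(x) {
  assign("n", get("n", envir = x@state) + 1L, envir = x@state)
  invisible(x)
})

cnt <- newCounter()
increment(cnt)
increment(cnt)
alias <- cnt        # copies the S4 object, but both copies share one environment
increment(alias)    # therefore also visible through cnt
```

Nothing here stops a caller from writing `cnt@state` directly; privacy remains a convention, as the reply says.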
[Rd] warning: some HTML links ...
Hi Everyone,

I am getting a warning when I build a package on Windows. The warning is "some HTML links may not be found". Through some searching I found this probably has to do with the Microsoft HTML Help Workshop, but I have installed it (a few times) from the different locations in the manual and have the path as listed. When I install the package in my R session all the help files (which I assume is what the warning is about) work fine, as do the functions. Has anybody else had this problem and found a solution?

Dan

Daniel B. Wright
Psychology
Florida International University
11200 S.W. 8th Street
Miami, FL 33199, USA
http://www.fiu.edu/~dwright/
dwri...@fiu.edu
Re: [Rd] The default position of plot title
ronggui-2 wrote:

Dear R-developers, It seems to me that the position of title is usually at the bottom of a plot in sociological and political science books and articles. I wonder if the same convention applies in other disciplines. If yes, is it reasonable to change the default position of main title of plot function?

I'm not an R developer, but:

* in my field (biology/ecology), it's at the top.
* it doesn't seem that hard to put the titles where you want with mtext()
* changing this would have major backward-compatibility/surprise issues for all the other users who expect the title to be at the top ...

Ben Bolker

--
View this message in context: http://www.nabble.com/The-default-position-of-plot-title-tp23828342p23831792.html
Sent from the R devel mailing list archive at Nabble.com.
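The mtext() route can be sketched as follows. The wrapper `plot_with_bottom_title` and its `title` argument are hypothetical names, and `line = 4` assumes the default bottom margin of about five lines, so the title sits just below the x-axis label.

```r
# Hypothetical helper: draw a plot without a main title, then put the title
# in the bottom margin instead.
plot_with_bottom_title <- function(x, y = NULL, title = "", ...) {
  plot(x, y, main = "", ...)                  # suppress the default top title
  mtext(title, side = 1, line = 4, font = 2)  # side = 1 is the bottom margin
  invisible(NULL)
}

# Usage (opens the default graphics device):
# plot_with_bottom_title(1:10, (1:10)^2, title = "Figure 1: a quadratic")
```

With a larger or smaller bottom margin (see `par("mar")`), the `line` value would need adjusting.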
Re: [Rd] Recommendations for a quick UI.
Alex,

Kevin R. Coombes wrote:

The following idea only partially answers your question.

I have successfully written a GUI using the tcl/tk package that ships with standard R. It is then possible (in Windows) to create a shortcut icon that runs the following command:

    C:\R\R-2.8.1\bin\R.exe --vanilla -e "library(SuperCurveGUI);sc();"

Note two features:
[1] the first part of the -e switch loads the library containing the GUI
[2] the second part (after the semicolon) launches the GUI

If you make a normal shortcut this way, a batch window will open showing the ongoing R session, which is not quite what you want. However, if you adjust the shortcut to "Run: Minimized", then (most) users will never see the batch window, and will only see the GUI.

The reasons that this only partially answers your question are
[1] It is Windows-specific
[2] I do not know how to set up the shortcut automatically upon installation.

Depending on how deep you want to dig into programming (aside from R), you could use any COM client (on Windows), e.g., Visual Basic or C#, using statconnDCOM (just download and install the package rcom), or use Java (RServe).

Thomas
[Rd] qpois documentation (PR#13743)
Full_Name: Jerry W. Lewis
Version: 2.9.0
OS: Windows XP Professional
Submission from: (NULL) (166.186.168.103)

Quantiles for discrete distributions are consistently implemented, but inconsistently documented. Help for qpois incorrectly states in the Details section that

    The quantile is left continuous: qgeom(q, prob) is the largest integer x such that P(X <= x) < q.

which disagrees with the implementation; it should read

    The quantile is defined as the smallest value x such that F(x) >= p, where F is the distribution function.

Also, this definition should be added to Help for qhyper.
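The proposed wording can be checked numerically. A minimal sketch, with a hypothetical helper `smallest_x_with_cdf_at_least` that searches for the smallest x satisfying F(x) >= p using ppois(), so that the result can be compared with qpois(); the probabilities and `lambda` are arbitrary illustrative values chosen away from the jump points of F.

```r
# Hypothetical helper: brute-force search for the smallest x with F(x) >= p.
smallest_x_with_cdf_at_least <- function(p, lambda) {
  x <- 0
  while (ppois(x, lambda) < p) x <- x + 1
  x
}

lambda <- 3
p <- c(0.1, 0.5, 0.9)

brute <- sapply(p, smallest_x_with_cdf_at_least, lambda = lambda)
builtin <- qpois(p, lambda)

# Under the "smallest x with F(x) >= p" definition the two should agree.
agree <- all(brute == builtin)
```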
Re: [Rd] setdiff bizarre (was: odd behavior out of setdiff)
On Sat, May 30, 2009 at 11:59 AM, Stavros Macrakis macra...@alum.mit.edu wrote:

Since R is object-oriented, data frame set operations should be the natural operations for their class. There are, I suppose, two natural ways: the column-wise (variable-wise) and the row-wise (observation-wise) one. The row-wise one seems more natural and more useful to me.
...
The row-wise interpretation makes sense in cases where observations with the same values for all variables can be considered redundant. That seems to me a much more useful interpretation. The union, intersection, and set difference of two sets of observations would seem to all be highly useful.

Another argument for the row-wise interpretation: the `subset` function (also part of base) works that way on data frames.

Interestingly, %in%/match appears to work neither row-wise nor column-wise:

    1 %in% data.frame(a=1:3)              # FALSE (would be true if row-wise)
    1:3 %in% data.frame(a=1:3)            # FALSE FALSE FALSE (would be true if column-wise)

but simply treats the data frame as a *character* list:

    1 %in% data.frame(a=2,b=1)            # TRUE
    '1' %in% data.frame(a=2,b=1)          # TRUE
    1 %in% data.frame(a=2:3,b=1:2)        # FALSE
    1:3 %in% data.frame(a=2:4,b=1:3)      # FALSE FALSE FALSE
    '1:3' %in% data.frame(a=2:4,b=1:3)    # TRUE

This specification is clearly documented in ?match, but I am mystified by it. Perhaps someone from R core can shed light on the rationale?

-s
Re: [Rd] reference counting bug: overwriting for loop 'seq' variable
Thanks for the report. Should be fixed in the devel and 2.9 branches.

luke

--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  l...@stat.uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
Re: [Rd] Recommendations for a quick UI.
On Mon, Jun 1, 2009 at 9:21 AM, Martin Maechler maech...@stat.math.ethz.ch wrote:

"AB" == Alex Bokov bo...@uthscsa.edu on Mon, 01 Jun 2009 00:24:58 -0500 writes:

AB> I'm trying to wrap my R package in a GUI such that when
AB> the user launches the app, they see my GUI window and
AB> never interact with the R console at all

There's a dedicated Special Interest Group mailing list for answering / discussing such questions: R-SIG-GUI

I would also be interested in the answer to this. My impression is that R-sig-gui is mostly about graphical programming environments for R rather than about building GUI applications on top of R, though of course there is some overlap.

I have recently started playing with R.rsp and it seems to provide a fairly simple solution for developing GUIs if you have some familiarity with generating Web pages dynamically (cf. ASP, JSP, etc.); R.rsp lets you build a dynamic Web page powered by R. It includes its own asynchronous Web server.

To get started:

    install.packages('R.rsp')
    library(R.rsp)
    browseRsp()

This will bring up the R.rsp documentation in a Web browser. You can then edit rsp files in .../r/win-library/2.8/R.rsp/rsp and run them.

It is even pretty straightforward to include plotting output, though the solution demonstrated in figures.rsp has a problem: either all users of the server share the same set of plot files (so one user's output will overwrite another's) or there will be an ever-growing collection of old plot files, with no mechanism for culling them. You can imagine various ways around this, but as far as I know R.rsp doesn't support them directly.

-s
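One of the "various ways around this" could be an age-based cleanup run periodically by the application. The sketch below uses only base R and is not part of R.rsp; the function name `cull_old_plots`, the `.png` pattern and the one-hour default are assumptions made for illustration.

```r
# Hypothetical helper: delete plot files older than max_age_secs in a directory.
cull_old_plots <- function(dir, max_age_secs = 3600, pattern = "\\.png$") {
  files <- list.files(dir, pattern = pattern, full.names = TRUE)
  # Age of each file in seconds, based on its modification time.
  age <- as.numeric(difftime(Sys.time(), file.info(files)$mtime, units = "secs"))
  old <- files[age > max_age_secs]
  unlink(old)
  invisible(old)   # return the removed paths, e.g. for logging
}
```

Giving each user a distinct file name, for instance via `tempfile(fileext = ".png")`, would address the overwriting half of the problem; the cleanup above addresses the growth half.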
Re: [Rd] formal argument envir matched by multiple actual arguments
On Tue, 2 Jun 2009, Henrik Bengtsson wrote:

Nice case - I think you're onto something.

/Henrik

2009/6/2 hpa...@fhcrc.org:

[...] Maybe garbage collection requests should be ignored during the execution of the finalizer? (and more generally during garbage collection itself)

Thanks for the report.

The gc proper does not (or should not) do anything that could cause allocation or trigger another gc. The gc proper only identifies objects ready for finalization; running the finalizers happens outside the gc proper, where allocation and gc calls should be safe. This looks like either a missing PROTECT call in the code for running finalizers or possibly a more subtle bug in managing the lists of objects in different states of finalization. I will look more carefully when I get a chance.

luke
Re: [Rd] setdiff bizarre
Stavros Macrakis wrote:

    '1:3' %in% data.frame(a=2:4,b=1:3) # TRUE

utterly weird. so what would x have to be so that

    x %in% data.frame('a') # TRUE

hint:

    '1' %in% data.frame(1) # TRUE

vQ
Re: [Rd] setdiff bizarre
%in% is a thin wrapper on a call to match(). match() is not a generic function (and is not documented to be one), so it treats data.frames as lists, as their underlying representation is a list of columns. match is documented to convert lists to character and to then run the character version of match on that character data. match does not bail out if the types of the x and table arguments don't match (that would be undesirable in the integer/numeric mismatch case).

Hence

    '1' %in% data.frame(1) # -> TRUE

is acting consistently with

    match(as.character(pi), c(1, pi, exp(1))) # -> 2

and

    1L %in% c(1.0, 2.0, 3.0) # -> TRUE

The related functions, duplicated() and unique(), do have row-wise data.frame methods. E.g.,

    duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
    [1] FALSE FALSE FALSE FALSE  TRUE

Perhaps match() ought to have one also. S+'s match is generic and has a data.frame method (which is row-oriented) so there we get:

    match(data.frame(x=c(1,3,5), y=letters[c(1,3,5)]), data.frame(x=1:10,y=letters[1:10]))
    [1] 1 3 5
    is.element(data.frame(x=1:10,y=letters[1:10]), data.frame(x=c(1,3,5), y=letters[c(1,3,5)]))
    [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE

I think that %in% and is.element() ought to remain calls to match() and that if you want them to work row-wise on data.frames then match should get a data.frame method.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
Re: [Rd] setdiff bizarre (was: odd behavior out of setdiff)
On Tue, Jun 2, 2009 at 4:13 PM, Stavros Macrakis macra...@alum.mit.edu wrote:
> but simply treats the data frame as a *character* list:
>
>     1 %in% data.frame(a=2,b=1) # TRUE
>     '1' %in% data.frame(a=2,b=1) # TRUE
>     1 %in% data.frame(a=2:3,b=1:2) # FALSE
>     1:3 %in% data.frame(a=2:4,b=1:3) # FALSE FALSE FALSE
>     '1:3' %in% data.frame(a=2:4,b=1:3) # TRUE

It applies as.character to the dataframe:

    > z=data.frame(a=2:4,b=1:3)
    > as.character(z)
    [1] "2:4" "1:3"

The as.character method for data frames seems to spot integer sequences (but only for int types and not num types) and show the a:b notation:

    > x=data.frame(z=as.integer(c(1,2,3,4,5)))
    > str(x)
    'data.frame': 5 obs. of 1 variable:
     $ z: int 1 2 3 4 5
    > as.character(x)
    [1] "1:5"

Obviously it doesn't do this for vectors:

    > as.character(x$z)
    [1] "1" "2" "3" "4" "5"

I suspect it's using 'deparse()' to get the character representation. This function is mentioned in ?as.character, but as.character.default disappears into the infernal .Internal and I don't have time to chase source code - it's sunny outside!

Barry
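The deparse() suspicion above can be checked from the console. The following is a minimal sketch, assuming a current R build; the names `chr` and `dep` are illustrative only, and it shows agreement with deparse() for these inputs rather than proving it for every column type.

```r
# `z` mirrors the data frame used in the message above.
z <- data.frame(a = 2:4, b = 1:3)

# A data frame is a list of columns, so coercing it to character gives
# one string per column, not one string per value.
chr <- as.character(z)

# Deparse each column separately for comparison.
dep <- vapply(z, function(col) paste(deparse(col), collapse = ""), character(1))

print(chr)
print(unname(dep))

# A double (non-integer) column is not shown in a:b form.
print(as.character(list(c(1, 2, 3))))
```

Note that results involving character columns depend on whether strings are turned into factors: since R 4.0.0 data.frame() defaults to stringsAsFactors = FALSE, so examples in this thread that rely on factor codes may behave differently on current versions.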
Re: [Rd] setdiff bizarre
William Dunlap wrote:
> %in% is a thin wrapper on a call to match(). match() is not a generic function (and is not documented to be one), so it treats data.frames as lists, as their underlying representation is a list of columns. match is documented to convert lists to character and to then run the character version of match on that character data. match does not bail out if the types of the x and table arguments don't match (that would be undesirable in the integer/numeric mismatch case).

yes, i understand that this is documented behaviour, and that it's not a bug. nevertheless, the example is odd, and hints that there's a design flaw. i also do not understand why the following should be useful and desirable:

    as.character(list('a')) # "a"
    as.character(data.frame('a')) # "1"

and hence

    'a' %in% list('a') # TRUE

while

    'a' %in% data.frame('a') # FALSE
    '1' %in% data.frame('a') # TRUE

there is a mechanistic explanation for how this works, but is there one for why this works this way?

> [...]
> I think that %in% and is.element() ought to remain calls to match() and that if you want them to work row-wise on data.frames then match should get a data.frame method.

sounds good to me. how is 'a' %in% data.frame('a') in S+?

thanks for the response.

regards,
vQ
Re: [Rd] setdiff bizarre
> > [...]
> > I think that %in% and is.element() ought to remain calls to match() and that if you want them to work row-wise on data.frames then match should get a data.frame method.
>
> sounds good to me. how is 'a' %in% data.frame('a') in S+? thanks for the response.

S+ gives:

    > 'a' %in% data.frame(letters)
    [1] TRUE
    > 'a' %in% data.frame(letters[2:26])
    [1] FALSE

but that special case, x a scalar and table a data.frame with one column, gets by more or less by accident.

    > 'a' %in% data.frame(letters, num=1:26)
    Problem in match.data.frame(x, table, nomatch, incom..: table must be a list the same length as x
    > c('a', 'b') %in% data.frame(letters)
    Problem in match.data.frame(x, table, nomatch, incom..: table must be a list the same length as x

The intent is that the x and table arguments to match be compatible data.frames.

S+'s match works differently on lists than R's does. It is set up to work on data.frame-like things: x and table must be lists of the same length and within each list, each element must have the same length. It acts like

    match(do.call(paste,x), do.call(paste,table))

but doesn't actually do the conversion to character implied in that (it hashes all the entries in each 'row' into one hash table entry, using the usual type-specific hash number computation on each entry and combining them to make the row hash number). E.g.,

    > match(list(c(3,2), c(1,7), c(4,1)), list(c(1,4,2,3),c(0,6,7,1),c(0,5,1,4)))
    [1] 4 3

(Its match.data.frame() doesn't actually call this, for historical/inertial reasons. It goes the paste() route.)

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
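The paste() route described above can be sketched in R itself. This is only an illustration under stated assumptions: `match_rows` is a hypothetical helper, not part of base R or S+, and it really does convert to character, unlike the hashing S+ is described as using.

```r
# Hypothetical helper: match rows of `x` against rows of `table` by
# pasting each row into a single character key.
match_rows <- function(x, table, nomatch = NA_integer_) {
  stopifnot(is.list(x), is.list(table), length(x) == length(table))
  # "\r" as separator makes accidental collisions between pasted fields
  # unlikely; unname() keeps column names from clashing with paste()'s
  # own argument names (sep, collapse).
  key <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))
  match(key(x), key(table), nomatch = nomatch)
}

x     <- data.frame(x = c(1, 3, 5), y = letters[c(1, 3, 5)])
table <- data.frame(x = 1:10, y = letters[1:10])

idx <- match_rows(x, table)
print(idx)

# Row-wise analogue of is.element(table, x).
hit <- !is.na(match_rows(table, x))
print(hit)

# The plain-list example from the message above.
lst <- match_rows(list(c(3, 2), c(1, 7), c(4, 1)),
                  list(c(1, 4, 2, 3), c(0, 6, 7, 1), c(0, 5, 1, 4)))
print(lst)
```

Because the keys are built with paste(), values that print identically (for example the double 1 and the integer 1L) are treated as equal, which is what the data frame example relies on.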
Re: [Rd] Recommendations for a quick UI.
Some possibilities:

The Rcmdr package is a very good example of a GUI built using Tk (it does not hide the R program, but lets you do analyses using menus and dialogs). Rcmdr also has a plug-in mechanism for writing extensions to it; depending on what you want to do, writing a simple extension to Rcmdr may be enough and a lot less work than creating your own from scratch.

There are tools (Rpad, Rserve, and others) that allow web interfaces to R, which may work for you.

There is the Rcom project, which uses R as a background tool for other programs; the most developed tool uses MS Excel as the GUI with R doing the heavy work behind the scenes. There are various examples of tools using the Excel interface available.

There is a lot of info at: http://www.sciviews.org/_rgui/

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

-----Original Message-----
From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On Behalf Of Alex Bokov
Sent: Sunday, May 31, 2009 11:25 PM
To: r-devel@r-project.org
Subject: [Rd] Recommendations for a quick UI.

Hi. This is my first post to this list; I seem to be graduating to it from the r-help list. :-)

I'm trying to wrap my R package in a GUI such that when the user launches the app, they see my GUI window and never interact with the R console at all. I don't have any fancy requirements for the GUI itself -- all it needs to do is collect input from the user and pass the input as arguments to an R function, which writes the results to a file.

I read the R Extensions Manual section about GUIs, and it seems like overkill to write the thing in a compiled language and link against R as a library when there are dozens of different interpreted cross-platform GUI toolkits out there.

Does anybody know of any functioning examples of packages (or other add-ons) with GUIs that run R silently in the background which I can study? Do they use the R CMD BATCH mechanism, or something else?

Thanks.
Re: [Rd] setdiff bizarre
Barry Rowlingson wrote:
> [...]
> I suspect it's using 'deparse()' to get the character representation. This function is mentioned in ?as.character, but as.character.default disappears into the infernal .Internal and I don't have time to chase source code - it's sunny outside!

on the side, as.character triggers do_ascharacter, which in turn calls DispatchOrEval, a function with the following beautiful comment:

    To call this an ugly hack would be to insult all existing ugly hacks at large in the world.

a fortune?

vQ
Re: [Rd] setdiff bizarre
On Tue, Jun 2, 2009 at 1:18 PM, William Dunlap wdun...@tibco.com wrote:
> %in% is a thin wrapper on a call to match().

Yes, as I mentioned in my email, all this is clearly documented in ?match.

> match() is not a generic function (and is not documented to be one), so it treats data.frames as lists, as their underlying representation is a list of columns.

Yes, I understand that this is the proximal cause of the current strange behavior. What I don't understand is why the current behavior is a good idea.

> match is documented to convert lists to character and to then run the character version of match on that character data

Yes, this peculiar behavior is documented. What I don't get is its rationale.

> match does not bail out if the types of the x and table arguments don't match (that would be undesirable in the integer/numeric mismatch case).

Why would it 'bail out'?

> The related functions, duplicated() and unique(), do have row-wise data.frame methods. E.g.,
>
>     > duplicated(data.frame(x=c(1,2,2,3,3),y=letters[c(1,1,2,2,2)]))
>     [1] FALSE FALSE FALSE FALSE TRUE
>
> Perhaps match() ought to have one also
>
> I think that %in% and is.element() ought to remain calls to match() and that if you want them to work row-wise on data.frames then match should get a data.frame method.

After all that, it sounds like we agree...!

-s
[Rd] statutes of R Foundation
Good evening all,

I realised yesterday that the R Foundation statutes document is only available in English and in German. I have tried to translate it into French: a first version is available here http://dutangc.free.fr/pub/statut%20R.pdf .

Could you please tell me what you think of my translation?

Thanks in advance

Christophe

PS: TeX sources are here http://dutangc.free.fr/pub/statut%20R.tex

--
Christophe Dutang
Ph. D. student at ISFA, Lyon, France
website: http://dutangc.free.fr
[Rd] How to generate R objects in C?
I'm in the process of coding a parser (in C) to generate R entities (vectors, lists, etc.) from a text description (different from R). The basic parser works, and now I need to tell it how to create R entities.

I need to be able to create character vectors (for unicode strings), integers, floats, unnamed lists, named lists, boolean values, and NA. With the exception of the two types of lists and the character vectors, all the other objects I need to generate are scalars, so I suppose they will correspond to 1-element vectors in R. I also need to be able to add R entities to both kinds of lists.

I've been staring at various official documents (ch 5 of Writing R Extensions, R Internals, Rinternals.h) for this kind of work for some time, but I can't find the constructors for such objects (here I'm using the term constructor loosely). I'm even further from finding the C equivalent of

    my.list[[ length(my.list) + 1 ]] <- new.thing

Can someone point me in the right direction?

Thanks!

Kynn
[Rd] Vectorize fails for function with ... arglist
Vectorize is defined to return a function that acts as if 'mapply' was called. So we have:

    > mapply(dput,1:2)            # mapply form
    1L                            # calls dput on each element of 1:2
    2L
    [1] 1 2
    > Vectorize(dput)(1:2)        # Vectorize form
    1L                            # same behavior
    2L
    [1] 1 2

Same thing with a named argument:

    > mapply(function(a)dput(a),1:2)
    1L
    2L
    [1] 1 2
    > Vectorize(function(a)dput(a))(1:2)
    1L
    2L
    [1] 1 2

But though mapply has no problem with function(...):

    > mapply(function(...)dput(list(...)),1:2)
    list(1L)
    list(2L)
    [[1]]
    [1] 1

    [[2]]
    [1] 2

    > mapply(function(...)dput(list(...)),1:2,11:12)
    list(1L, 11L)
    list(2L, 12L)
         [,1] [,2]
    [1,] 1    2
    [2,] 11   12

Vectorize fails silently in this case:

    > Vectorize(function(...)dput(list(...)))(1:2)
    list(1:2)                     # calls dput with entire vector
                                  # invisible result inherited from dput
    > Vectorize(function(...)dput(list(...)))(1:2,11:12)
    list(1:2, 11:12)

and sure enough:

    > Vectorize(function(...)list(...))
    function(...)list(...)        # returns arg unmodified!

I looked at the code, and ... args are *explicitly* rejected. I see no logical reason for this inconsistency, and the documentation doesn't require it.

-s

PS This is not an artificial example concocted to demonstrate inconsistencies. I had written the following function which wraps another function in a tryCatch:

    catcher <- function(f) function(...) tryCatch(do.call(f,list(...)),error=function(e) NA)

(The '...' argument list allows this to work with a function of any number of arguments.) But instead of catching individual errors in Vectorize(catcher(fun))(1:10,1:10), it caught them all as one big error, which was not at all the goal.
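Since mapply() itself accepts a function(...) without complaint, one way around the behaviour reported above is to skip Vectorize() and call mapply() inside the wrapper. This is a minimal sketch, not a fix to Vectorize(); `vcatcher` and `fussy` are hypothetical names introduced here for illustration.

```r
# Hypothetical variant of the `catcher` idea: apply f element-wise via
# mapply() and trap errors separately for each element.
vcatcher <- function(f) {
  function(...) {
    # The inner `...` receives one element from each argument; the
    # trailing `...` passes the caller's full vectors on to mapply().
    mapply(function(...) tryCatch(f(...), error = function(e) NA), ...)
  }
}

# Illustrative function that fails for one input only.
fussy <- function(a, b) if (a == 2L) stop("no twos") else a + b

res <- vcatcher(fussy)(1:3, 11:13)
print(res)
```

One caveat of this design: arguments named like mapply()'s own parameters (FUN, MoreArgs, SIMPLIFY, USE.NAMES) would be captured by mapply() rather than forwarded to f.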
[Rd] reference counting bug related to break and next in loops
One of our R users here just showed me the following problem while investigating the return value of a while loop. I added some information on a similar bug in for loops. I think he was using 2.9.0 but I see the same problem on today's development version of 2.10.0 (svn 48703).

Should the semantics of while and for loops be changed slightly to avoid the memory buildup that fixing this to reflect the current docs would entail? S+'s loops return nothing useful - that change was made long ago to avoid memory buildup resulting from semantics akin to R's present semantics.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

--- Forwarded (and edited) message below ---

I think I have found another reference counting bug. If you type in the following in R you get what I think is the wrong result.

    > i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; i = i + 1; y}; q
    [1] 42 42 42 42 42 42 42 42 9 10

I had expected

    [1] 42 42 42 42 42 42 42 8 9 10

which is what you get if you add 0 to y in the last statement in the while loop:

    > i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; i = i + 1; y + 0}; q
    [1] 42 42 42 42 42 42 42 8 9 10

Also,

    > i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration", i, "\n"); y}; q
    Completing iteration 2
    Completing iteration 3
    [1] 42 42 42 42 42 42 42 42 9 10

but if the last statement in the while loop is y+0L instead of y I get the expected result:

    > i = 1; y = 1:10; q = while(T) { y[i] = 42; if (i == 8) { break }; i<-i+1 ; if (i<=8&&i>3)next ; cat("Completing iteration", i, "\n"); y+0L}; q
    Completing iteration 2
    Completing iteration 3
    [1] 42 42 3 4 5 6 7 8 9 10

A background to the problem is that in R a while-loop returns the value of the last iteration. However there is an exception if an iteration is terminated by a break or a next. Then the value is the value of the previously completed iteration that did not execute a break or next. Thus in an extreme case the value of the while may be the value of the very first iteration even though it executed a million iterations.

Thus to implement that correctly one needs to keep a reference to the value of the last non-terminated iteration. It seems as if the current R implementation does that but does not increase the reference counter, which explains the odd behavior.

The for loop example is

    > z<-{ tmp<-rep(pi,10);for(i in 1:10){ tmp[i]<-i^2;if(i==9)break ; if (i<9&&i>3)next ; tmp } }
    > z
    [1] 1.00 4.00 9.00 16.00 25.00 36.00 49.00
    [8] 64.00 81.00 3.141593
    > z<-{ tmp<-rep(pi,10);for(i in 1:10){ tmp[i]<-i^2;if(i==9)break ; if (i<9&&i>3)next ; tmp+0 } }
    > z
    [1] 1.00 4.00 9.00 3.141593 3.141593 3.141593 3.141593 3.141593
    [9] 3.141593 3.141593

I can think of a couple of ways to solve this.

1. Increment the reference counter. This solves the bug but may have serious performance implications. In the while example above it needs to copy y in every iteration.

2. Change the semantics of while loops by getting rid of the exception described above. When a loop is terminated with a break the value of the loop would be NULL. Thus there is no need to keep a reference to the value of the last non-terminated iteration.

Any opinions?
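Whichever semantics the loop construct has, code that needs the value of the last completed iteration can record it explicitly instead of relying on the value of the loop. The following is a minimal sketch of that idea using the first example from the message above; `last_completed` is an illustrative name, and the approach relies only on R's ordinary copy-on-modify behaviour for assigned values.

```r
i <- 1L
y <- 1:10
last_completed <- NULL   # illustrative name; holds the snapshot

while (TRUE) {
  y[i] <- 42L
  if (i == 8L) break     # the iteration that breaks leaves no snapshot
  i <- i + 1L
  last_completed <- y    # explicit snapshot of this completed iteration
}

print(last_completed)
```

The snapshot is unaffected by the later assignment y[8] <- 42L, so it holds the value the poster expected. The current ?Control help page documents for, while and repeat as returning NULL invisibly, so on such versions an explicit variable like this is the way to get at an intermediate value.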