Re: [Rd] Request for help with UBSAN and total absence of CRAN response
... and they didn't make it through. I put the files in a gist:

https://gist.github.com/djvanderlaan/1e9beb75d2d595824efc

Jan

On 16-01-15 15:21, Jan van der Laan wrote:
> Dirk,
>
> The vagrant setup I use to test my packages with UBSAN also seems to replicate the error reported by CRAN (together with some other warnings). I have attached the files (I hope they get through the filters). I suppose you know what to do with them.
>
> Jan
>
> Dirk Eddelbuettel e...@debian.org wrote:
>> CRAN has a package of mine in upload limbo because it failed UBSAN.
>>
>> I am not entirely ignorant on the topic of sanitizers and SAN / ASAN / UBSAN; we created not one but two Docker containers with ASAN and UBSAN:
>>
>>   https://registry.hub.docker.com/u/rocker/r-devel-san/
>>   https://registry.hub.docker.com/u/rocker/r-devel-ubsan-clang/
>>
>> as well as predecessors to them in earlier Docker repos.
>>
>> Yet I fail to recreate the errors reported by CRAN:
>>
>>   http://www.stats.ox.ac.uk/pub/bdr/memtests/UBSAN-clang-trunk/RcppAnnoy/tests/runUnitTests.Rout
>>   http://www.stats.ox.ac.uk/pub/bdr/memtests/UBSAN/RcppAnnoy/tests/runUnitTests.Rout
>>
>> I asked politely (and twice) for help with the corresponding compiler configuration(s). But CRAN is of course way above communicating with mere mortals such as yours truly.
>>
>> So I have no recourse other than to spam all of you: does anybody here have a working UBSAN setup which can replicate the issue seen in the (rather small) RcppAnnoy package?
>>
>> Erik (upstream for Annoy, CC'ed) and I would be most grateful. We do not like being held hostage on an error report we cannot replicate and for which we do not receive any help (or even further communication) whatsoever.
>>
>> Dirk, about to turn into yet another frustrated CRAN user
>>
>> --
>> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] Is the tcltk failure in affylmGUI related to R bug 15957
Thanks Peter and Dan for your replies.

After learning a bit more about tcltk, environments etc., I have replaced

    Try(n <- evalq(TclVarCount <- TclVarCount + 1, .TkRoot$env))

with

    Try(n <- .TkRoot$env$TclVarCount <- .TkRoot$env$TclVarCount + 1L)

as you suggest. It now works for both R-3.1.1 and R-3.1.2+.

(My understanding is that the Try function is there to put a GUI box around the error messages.)

I shall update affylmGUI versions accordingly soon.

cheers

Keith

PS I have also changed the Depends in DESCRIPTION to Imports and added an import statement to the NAMESPACE file, which is independent of this problem. Consequently I removed the Require(tkrplot) statements, as they are no longer needed.

----- peter dalgaard pda...@gmail.com wrote:
> Seems unlikely that that particular bug is involved. I seem to recall some change related to inadvertent variable capture in .TkRoot$env (?).
>
> At any rate, we currently have
>
>     > parent.env(.TkRoot$env)
>     <environment: R_EmptyEnv>
>
> which used to be
>
>     > parent.env(.TkRoot$env)
>     <environment: R_GlobalEnv>
>
> As a result, this won't work any more, because R_EmptyEnv has no operators and functions in it:
>
>     > evalq(x <- 1, .TkRoot$env)
>     Error in eval(substitute(expr), envir, enclos) :
>       could not find function "<-"
>
> and consequently, you conk out at
>
>     Try(n <- evalq(TclVarCount <- TclVarCount + 1, .TkRoot$env))
>
> which presumably needs to be recoded in the same way as the current code in tclVar():
>
>     > tclVar
>     function (init = "")
>     {
>         n <- .TkRoot$env$TclVarCount <- .TkRoot$env$TclVarCount + 1L
>         name <- paste0("::RTcl", n)
>         l <- list(env = new.env())
>         assign(name, NULL, envir = l$env)
>         reg.finalizer(l$env, function(env) tcl("unset", ls(env)))
>         class(l) <- "tclVar"
>         tclvalue(l) <- init
>         l
>     }
>
> (The whole thing looks a bit odd: Your function clones a fair bit of tclVar, wrapping each line in Try() for no apparent reason (or?), with the apparent purpose of doing something that seems quite similar to what tclArray() already does...)
>
> -pd
>
> On 14 Jan 2015, at 06:50, Keith Satterley ke...@wehi.edu.au wrote:
>> I maintain the package affylmGUI. It works when installed on many previous versions of R. I have today tested exactly the same code under R-2.15.3, R-3.0.2, R-3.1.0, R-3.1.1, R-3.1.2 and R-devel.
>>
>> I have also tested the versions of affylmGUI downloaded by biocLite for each version of R and the same result applies.
>>
>> I have no errors under 2.15.3, 3.0.2, 3.1.0 and 3.1.1. The following error occurs under 3.1.2 and R-devel.
>>
>> I run affylmGUI and read a targets file, which then causes affylmGUI to read the specified cel files. On attempting to display the RNA targets file in a Tk window, using the RNA Targets option from the RNA Targets menu item, the following errors occur:
>>
>> Error text box 1:
>>     Error in eval(substitute(expr),enclos): could not find function "<-"
>> - pressed OK
>>
>> Following error text box:
>>     Error in paste("::RTcl", n, sep = ""): object 'n' not found
>> - pressed OK
>>
>> Following error text box:
>>     Error in assign(name, NULL, envir = l$env): object 'name' not found
>> - pressed OK
>>
>> Following error text box:
>>     Error in paste("set ", name, "(0,0) \"\"", sep = ""): object 'name' not found
>> - pressed OK
>>
>> This then results in an unfilled Tk window.
>>
>> I am testing on a Windows 7, 64 bit environment.
>>
>> My sessionInfo is:
>>
>> R version 3.1.2 (2014-10-31)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>
>> locale:
>> [1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252  LC_MONETARY=English_Australia.1252
>> [4] LC_NUMERIC=C                       LC_TIME=English_Australia.1252
>>
>> attached base packages:
>>  [1] stats4    parallel  tcltk     stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>>  [1] affylmGUI_1.40.0      AnnotationDbi_1.28.1  GenomeInfoDb_1.2.4    IRanges_2.0.1         S4Vectors_0.4.0
>>  [6] xtable_1.7-4          R2HTML_2.3.1          affyPLM_1.42.0        preprocessCore_1.28.0 gcrma_2.38.0
>> [11] tkrplot_0.0-23        affyio_1.34.0         BiocInstaller_1.16.1  affy_1.44.0           Biobase_2.26.0
>> [16] BiocGenerics_0.12.1   limma_3.22.3
>>
>> loaded via a namespace (and not attached):
>> [1] Biostrings_2.34.1 DBI_0.3.1         RSQLite_1.0.0     splines_3.1.2     XVector_0.6.0     zlibbioc_1.12.0
>>
>> I think the relevant code that is resulting in the error is generated by this function in main.R:
>>
>>     tclArrayVar <- function(){
>>         Try(n <- evalq(TclVarCount <- TclVarCount + 1, .TkRoot$env))
>>         Try(name <- paste("::RTcl", n, sep = ""))
>>         Try(l <- list(env = new.env()))
>>         Try(assign(name, NULL, envir = l$env))
>>         Try(reg.finalizer(l$env, function(env) tcl("unset", ls(env))))
>>         Try(class(l) <- "tclArrayVar")
>>         Try(.Tcl(paste("set ", name, "(0,0) \"\"", sep = "")))
>>         l  ### Investigate this line KS
>>     } #end of tclArrayVar <- function()
>>
>> This code is lines 877-886 in main.R.
>>
>> Despite the un-investigated last line in this function, it works fine in earlier versions of R as described
Re: [Rd] default min-v/nsize parameters
On Thu, Jan 15, 2015 at 3:55 PM, Michael Lawrence lawrence.mich...@gene.com wrote:
> Just wanted to start a discussion on whether R could ship with more appropriate GC parameters.

I've been doing a number of similar measurements, and have come to the same conclusion. R is currently very conservative about memory usage, and this leads to unnecessarily poor performance on certain problems. Changing the defaults to sizes that are more appropriate for modern machines can often produce a 2x speedup.

On Sat, Jan 17, 2015 at 8:39 AM, luke-tier...@uiowa.edu wrote:
> Martin Morgan discussed this a year or so ago and as I recall bumped up these values to the current defaults. I don't recall details about why we didn't go higher -- maybe Martin does.

I just checked, and it doesn't seem that any of the relevant values have been increased in the last ten years. Do you have a link to the discussion you recall so we can see why the changes weren't made?

> I suspect the main concern would be with small memory machines in student labs and less developed countries.

While a reasonable concern, I'm doubtful there are many machines for which the current numbers are optimal. The current minimum size increases for node and vector heaps are 40KB and 80KB respectively. This grows as the heap grows (min + .05 * heap), but still means that we do many more expensive garbage collections while growing than we need to. Paradoxically, the SMALL_MEMORY compile option (which is suggested for computers with up to 32MB of RAM) has slightly larger increments, at 50KB and 100KB.

I think we'd get significant benefit for most users by being less conservative about memory consumption. The exact sizes should be discussed, but with RAM costing about $10/GB it doesn't seem unreasonable to assume most machines running R have multiple GB installed, and those that don't will quite likely be running an OS that needs a custom compiled binary anyway.

I could be way off, but a 10MB start with 1MB minimum increments for SMALL_MEMORY, a 100MB start with 10MB increments for NORMAL_MEMORY, and a 1GB start with 100MB increments for LARGE_MEMORY might be a reasonable spread.

Or one could go even larger, noting that on most systems, overcommitted memory is not a problem until it is used. Until we write to it, it doesn't actually use physical RAM, just virtual address space. Or we could stay small, but make it possible to programmatically increase the granularity from within R.

For ease of reference, here are the relevant sections of code:

https://github.com/wch/r-source/blob/master/src/include/Defn.h#L217
(ripley last authored on Jan 26, 2000 / pd last authored on May 8, 1999)

    #ifndef R_NSIZE
    #define R_NSIZE 350000L
    #endif
    #ifndef R_VSIZE
    #define R_VSIZE 6291456L
    #endif

https://github.com/wch/r-source/blob/master/src/main/startup.c#L169
(ripley last authored on Jun 9, 2004)

    Rp->vsize = R_VSIZE;
    Rp->nsize = R_NSIZE;

    #define Max_Nsize 50000000      /* about 1.4Gb 32-bit, 2.8Gb 64-bit */
    #define Max_Vsize R_SIZE_T_MAX  /* unlimited */

    #define Min_Nsize 220000
    #define Min_Vsize (1*Mega)

https://github.com/wch/r-source/blob/master/src/main/memory.c#L335
(luke last authored on Nov 1, 2000)

    #ifdef SMALL_MEMORY
    /* On machines with only 32M of memory (or on a classic Mac OS port)
       it might be a good idea to use settings like these that are more
       aggressive at keeping memory usage down. */
    static double R_NGrowIncrFrac = 0.0, R_NShrinkIncrFrac = 0.2;
    static int R_NGrowIncrMin = 50000, R_NShrinkIncrMin = 0;
    static double R_VGrowIncrFrac = 0.0, R_VShrinkIncrFrac = 0.2;
    static int R_VGrowIncrMin = 100000, R_VShrinkIncrMin = 0;
    #else
    static double R_NGrowIncrFrac = 0.05, R_NShrinkIncrFrac = 0.2;
    static int R_NGrowIncrMin = 40000, R_NShrinkIncrMin = 0;
    static double R_VGrowIncrFrac = 0.05, R_VShrinkIncrFrac = 0.2;
    static int R_VGrowIncrMin = 80000, R_VShrinkIncrMin = 0;
    #endif

    static void AdjustHeapSize(R_size_t size_needed)
    {
        R_size_t R_MinNFree = (R_size_t)(orig_R_NSize * R_MinFreeFrac);
        R_size_t R_MinVFree = (R_size_t)(orig_R_VSize * R_MinFreeFrac);
        R_size_t NNeeded = R_NodesInUse + R_MinNFree;
        R_size_t VNeeded = R_SmallVallocSize + R_LargeVallocSize
            + size_needed + R_MinVFree;
        double node_occup = ((double) NNeeded) / R_NSize;
        double vect_occup = ((double) VNeeded) / R_VSize;

        if (node_occup > R_NGrowFrac) {
            R_size_t change = (R_size_t)(R_NGrowIncrMin + R_NGrowIncrFrac * R_NSize);
            if (R_MaxNSize >= R_NSize + change)
                R_NSize += change;
        }
        else if (node_occup < R_NShrinkFrac) {
            R_NSize -= (R_NShrinkIncrMin + R_NShrinkIncrFrac * R_NSize);
            if (R_NSize < NNeeded)
                R_NSize = (NNeeded < R_MaxNSize) ? NNeeded : R_MaxNSize;
            if (R_NSize < orig_R_NSize)
                R_NSize = orig_R_NSize;
        }

        if (vect_occup > 1.0 && VNeeded < R_MaxVSize)
Re: [Rd] default min-v/nsize parameters
Martin Morgan discussed this a year or so ago and as I recall bumped up these values to the current defaults. I don't recall details about why we didn't go higher -- maybe Martin does. I suspect the main concern would be with small memory machines in student labs and less developed countries. If there was a way on all platforms to identify how much memory is available that might help to set a default, though that isn't perfect since you want something different on a large memory machine for one R process than for 16 R processes.

Best,

luke

On Thu, 15 Jan 2015, Michael Lawrence wrote:
> Just wanted to start a discussion on whether R could ship with more appropriate GC parameters. Right now, loading the recommended package Matrix leads to:
>
>     > library(Matrix)
>     > gc()
>               used (Mb) gc trigger (Mb) max used (Mb)
>     Ncells 1076796 57.6    1368491 73.1  1198505 64.1
>     Vcells 1671329 12.8    2685683 20.5  1932418 14.8
>
> Results may vary, but here R needed 64MB of N cells and 15MB of V cells to load one of the most important packages. Currently, the default GC triggers are ~20MB (64 bit systems) for N cells and ~6MB of V cells.
>
> Martin Morgan found that this leads to a lot of GC overhead during package loading and at least in our tests can significantly increase the load time of complex packages.
>
> If we set the triggers at the command line beyond the reach of library(Matrix) (--min-vsize=2048M --min-nsize=45M), then we see:
>
>               used (Mb) gc trigger (Mb) max used  (Mb)
>     Ncells 1076859 57.6   47185920 2520  6260069 334.4
>     Vcells 1671431 12.8  268435456 2048  9010303  68.8
>
> So by effectively disabling the GC, we let R consume 335MB N + 70MB of V, but loading goes a lot faster.
>
> Loading Matrix with default settings:
>
>     > system.time(library(Matrix))
>        user  system elapsed
>       1.600   0.011   1.610
>
> With high GC triggers:
>
>     > system.time(library(Matrix))
>        user  system elapsed
>       0.983   0.097   1.079
>
> Given modern hardware capabilities and the need to efficiently load software for the user to be able to do something, perhaps we should bump the default settings so that the GC is fired sparingly when loading a large package.
>
> For users of Bioconductor, we see this for library(GenomicRanges):
>
>               used (Mb) gc trigger (Mb) max used  (Mb)
>     Ncells 1322124 70.7   47185920 2520 15591302 832.7
>     Vcells 1216015  9.3  268435456 2048 13992181 106.8
>
> So perhaps that user would want 900 MB of N and 100 MB of V as the trigger (corresponding to --min-vsize=100M --min-nsize=16M).
>
> Thoughts?

-- 
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu