[Rd] Choices to remove `srcref` (and its buddies) when serializing objects
Dear R devs,

I was digging into a package issue today when I realized that R's `serialize` function does not always generate the same result for equivalent objects; the result depends on how the user runs the code. For example, the following code

    serialize(with(new.env(), { function(){} }), NULL, TRUE)

generates different results when I copy-paste it into the console and when I use Ctrl+Shift+Enter to source the file in RStudio.

Looking deeper into the cause, I found that functions and language objects get source references when getOption("keep.source") is TRUE. The source reference therefore makes the functions differ, even though in most cases keeping the source does not affect how a function behaves.

While it's OK that `serialize` itself generates different results, functions such as `rlang::hash` and `digest::digest`, which depend on `serialize`, may end up reporting a change for identical inputs (a false positive).

I've checked the source code of the digest package hoping to get around this issue (for example, serialize(..., refhook = ...)). However, my workaround did not work: it seems that the markers to the objects are different even if I use `refhook` to force the srcref to be the same. I also tried `removeSource` and `rlang::zap_srcref`. Neither works directly on nested environments with multiple functions.

I wonder how hard it would be to have an option to discard source references when serializing R objects?

Currently my analyses depend heavily on the digest function to generate file caches and to automatically schedule pipelines (to update the cache) when changes are detected. The pipelines save the hashes of source code, inputs, and outputs together so other people can easily verify the calculation without accessing the original data (which could be sensitive), running hour-long analyses, or having to buy servers. All of this requires `serialize` to produce the same result regardless of how users choose to run the code.

It would be great if this feature could be included in a future version of R. Other pipeline packages such as `targets` and `drake` could also benefit from it.

Thanks,
- Dipterix

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
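The effect described in this message can be reproduced without RStudio. The following is only a minimal sketch: `parse_fun()` and `has_srcref()` are hypothetical helpers written for this illustration, and the closures are evaluated in baseenv() so that the enclosing environment is the same in both cases and only the source reference differs.

```r
# Sketch: compare serialize() output with and without source references.
# parse_fun() and has_srcref() are hypothetical helpers, not part of R.
parse_fun <- function(keep) {
  expr <- parse(text = "function(x) { x + 1 }", keep.source = keep)
  # Evaluate in baseenv() so both closures share the same enclosing environment.
  eval(expr[[1]], envir = baseenv())
}

has_srcref <- function(f) !is.null(attr(f, "srcref"))

f_src   <- parse_fun(TRUE)   # parsed with source references kept
f_nosrc <- parse_fun(FALSE)  # parsed without source references

has_srcref(f_src)    # TRUE
has_srcref(f_nosrc)  # FALSE

# The attached srcref changes the serialized bytes of an otherwise equal function.
identical(serialize(f_src, NULL), serialize(f_nosrc, NULL))  # FALSE

# utils::removeSource() strips the source reference from this one function.
f_clean <- utils::removeSource(f_src)
has_srcref(f_clean)  # FALSE
identical(serialize(f_clean, NULL), serialize(f_nosrc, NULL))
```

Note that `removeSource()` is applied here to a single function. As the message points out, it does not by itself reach functions stored inside nested environments, so this sketch does not solve the hashing problem for such objects.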
Re: [Rd] using Paraview "in-situ" with R?
Thanks for adding more explanation. As Ivan Krylov mentioned earlier, this sounds like an idea for developing an R package. The viewers and R largely operate in communities that so far have had little interaction, and both can benefit from ideas in the other.

George

> On Jan 11, 2024, at 6:30 AM, Mike Marchywka wrote:
>
> Thanks. I take it though you see "R" in this role as adding to the
> capabilities of the viewers, maybe adding some quick model fits over
> FEM results or something?
> Right now I was imagining working with freefem and rolling my own c++ code
> with supporting use of R code. Ideally I could easily overlay stuff without
> messing around with temp files. There are a lot of R things, probably
> optimizations etc, that may be nice to view as they progress
> with more than just a figure of merit.
> Right now I'm just trying to use Runge-Kutta on a simple orbit
> and the mjmdatascope output is much more useful on-the-fly
> than text or after the fact.
>
> Mike Marchywka
> 44 Crosscreek Trail
> Jasper GA 30143
> was 306 Charles Cox Drive Canton, GA 30115
> 470-758-0799
> 404-788-1216
>
> From: George Ostrouchov
> Sent: Wednesday, January 10, 2024 3:06 PM
> To: r-devel@r-project.org
> Cc: Mike Marchywka
> Subject: Re: [Rd] using Paraview "in-situ" with R?
>
> At ORNL, we worked with VisIt (a sibling of Paraview, both funded largely by
> DOE) around 2016 and made an in situ demo with R. We used packages pbdMPI (on
> CRAN) and pbdDMAT (on GitHub/RbigData), which were in part built for this
> purpose. Later also the package hola (on GitHub/RbigData) was built to
> connect with adios2, which can do buffered in situ connections with various
> codes.
>
> But the VisIt developers were not interested in R (preferring to roll their
> own), so that direction fizzled. Paraview is a competitive sibling of VisIt,
> so I don't know if they would be interested. The packages we developed are
> viable for that purpose. There is a lot in R that could benefit Paraview (or
> VisIt).
>
> George
>
>> Message: 1
>> Date: Tue, 9 Jan 2024 14:20:17 +
>> From: Mike Marchywka
>> To: R-devel
>> Subject: [Rd] using Paraview "in-situ" with R?
>> Content-Type: text/plain; charset="iso-8859-1"
>>
>> I had previously asked about R interfaces to various "other" visualization
>> tools specifically lightweights for monitoring progress of
>> various codes. I was working on this,
>>
>> https://github.com/mmarchywka/mjmdatascope
>>
>> but in the meantime found out that Paraview has an "in-situ"
>> capability for similar objectives.
>>
>> https://discourse.paraview.org/t/does-or-can-paraview-support-streaming-input/13637/9
>>
>> While R does have a lot of plotting features,
>> it seems like an excellent tool to interface to R allowing visualization
>> without a bunch of temp files or
>>
>> Is anyone aware of anyone doing this interface or reasons it's a boondoggle?
>>
>> Thanks.
>>
>> Mike Marchywka
Re: [Bioc-devel] How to push to release branch
It's not clear from your screenshot whether you checked out the release branch, made the fix, and then tried to push. You don't want to push your devel branch onto the release branch.

https://contributions.bioconductor.org/git-version-control.html#bug-fix-in-release-and-devel

-----Original Message-----
From: Bioc-devel On Behalf Of Yue Cao via Bioc-devel
Sent: Wednesday, January 10, 2024 9:35 PM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] How to push to release branch

Dear Bioconductor team,

I have received a package build error for scFeatures:
https://master.bioconductor.org/checkResults/3.18/bioc-LATEST/scFeatures/nebbiolo2-checksrc.html

It seems the package is being built from the RELEASE_3_18 branch. I have pushed everything to the devel branch, but I believe I need to push to the release branch to fix this build error. However, while I can push to the devel branch, I get errors when pushing to the release branch (as shown in the attached screenshot).

I am wondering if you have any clue about this. Thank you for your help.

Best regards,
Yue

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
Re: [R-pkg-devel] "Examples with CPU time > 2.5 times elapsed time" and other NOTEs on CRAN and rhub checks
On Thu, 11 Jan 2024 12:39:17 + D Z wrote:

> The package itself has no parallelism built-in, but Imports
> data.table. This NOTE does not surface on other platforms (eg using
> rhub or on my GitHub actions runners). My unit tests already limit
> data.table to 2 cores using setDTthreads(2), but I would like to keep
> this line out of the help files for my functions.

A breakpoint on pthread_create confirms that these are OpenMP threads created by data.table. You can wrap setDTthreads(2) in \dontshow{} to avoid visual pollution:
https://cran.r-project.org/doc/manuals/R-exts.html#index-_005cdontshow

> I receive the NOTE that my libs/ sub-directory is at 7.7Mb. Can I
> ignore this or do I need to figure out how to reduce the binary size
> of the package?

I think this is typically accepted for packages using C++.

> And last but not least, on some rhub instances (Fedora and Ubuntu
> GCC) I receive a NOTE that the package runs its examples too slowly
> (eg above 5secs). I have already tweaked the example code so
> that it runs reliably <4 secs on my development laptop

Then it should be fine.

Additionally, you may need to cast some of your Rprintf arguments to avoid format warnings on Windows:
https://win-builder.r-project.org/incoming_pretest/RITCH_0.1.23_20240110_120457/Windows/00check.log

--
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
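As an illustration of the \dontshow{} suggestion, a roxygen2 documentation block might look roughly like the sketch below. `read_example()` and its body are hypothetical stand-ins (this is not the RITCH API), and the sketch assumes data.table is listed in the package's Imports.

```r
# Sketch of a roxygen2 block; read_example() is a hypothetical function,
# not part of RITCH. Assumes data.table is listed in Imports.

#' Read a small example table
#'
#' @examples
#' \dontshow{data.table::setDTthreads(2)}
#' res <- read_example()
#' @export
read_example <- function() {
  # Stand-in body so that the sketch is self-contained.
  data.frame(id = 1:3, value = c(0.5, 1.5, 2.5))
}
```

Code inside \dontshow{} still runs when the examples are checked, but it is hidden from the rendered help page, so the thread limit applies during R CMD check without appearing in the documentation.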
[R-pkg-devel] "Examples with CPU time > 2.5 times elapsed time" and other NOTEs on CRAN and rhub checks
Hi all,

I submitted my package RITCH (https://github.com/DavZim/RITCH) to CRAN (it had been archived and I wanted to revive it) but I got a NOTE (Question 1 below). Besides this NOTE from CRAN, I got two other NOTEs from rhub (Q2 and Q3).

Q1) The CRAN NOTE (Debian only, does not surface on Windows or other platforms) reads

    * checking examples ... [7s/3s] NOTE
    Examples with CPU time > 2.5 times elapsed time
                    user system elapsed ratio
    read_functions 3.968  0.092   0.831 4.886

(https://github.com/DavZim/RITCH/blob/master/R/read_functions.R in case you need the source code; the full CRAN report can be found here: https://win-builder.r-project.org/incoming_pretest/RITCH_0.1.23_20240110_120457/Debian/00check.log)

The package itself has no parallelism built-in, but Imports data.table. This NOTE does not surface on other platforms (eg using rhub or on my GitHub actions runners). My unit tests already limit data.table to 2 cores using setDTthreads(2), but I would like to keep this line out of the help files for my functions. Is there anything that I can do, or can I ignore the result and argue for an exception using the false positive argument?

Q2) A second question: on rhub Ubuntu Linux 20.04.1 LTS, R-release, GCC (https://artifacts.r-hub.io/RITCH_0.1.22.tar.gz-d2b925faf6b24497abbfa6ff60e51d34/RITCH.Rcheck/00check.log) I receive the NOTE that my libs/ sub-directory is at 7.7Mb. Can I ignore this or do I need to figure out how to reduce the binary size of the package?

    * checking installed package size ... NOTE
      installed size is 8.6Mb
      sub-directories of 1Mb or more:
        libs 7.7Mb

My code uses Rcpp and has some classes and interdependencies between C++ functions, so a rewrite to make the binary smaller might take a lot of work. From looking around online I find that other packages are a lot bigger. Is there any low-hanging fruit that I can use to reduce the size, or should I ignore this NOTE?

Q3) And last but not least, on some rhub instances (Fedora and Ubuntu GCC) I receive a NOTE that the package runs its examples too slowly (eg above 5secs). I have already tweaked the example code so that it runs reliably <4 secs on my development laptop.

Ubuntu Linux 20.04.1 LTS, R-release, GCC (https://builder.r-hub.io/status/original/RITCH_0.1.22.tar.gz-d2b925faf6b24497abbfa6ff60e51d34)

    * checking examples ... [6s/37s] NOTE
    Examples with CPU (user + system) or elapsed time > 5s
                   user system elapsed
    read_functions 2.51  0.028   12.57

and on Fedora Linux, R-devel, clang, gfortran (https://builder.r-hub.io/status/original/RITCH_0.1.22.tar.gz-01bf475551eb4b30a722ea79ce421788)

    * checking examples ... [6s/26s] NOTE
    Examples with CPU (user + system) or elapsed time > 5s
                    user system elapsed
    read_functions 1.896  0.018   8.891

As this does not surface on the CRAN checks, I would ignore it for now and concentrate only on the CRAN checks. Is this correct or should I pay more attention to these NOTEs?

Any help/comment is appreciated.

Thank you for your time and best regards,
David
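For reference, the "ratio" column in the Q1 output is consistent with (user + system) / elapsed. A quick check in R, using only the numbers from the check log above (the variable names are made up for this sketch):

```r
# Timings copied from the Debian check log quoted in Q1.
timing <- c(user = 3.968, system = 0.092, elapsed = 0.831)

# CPU time (user + system) divided by elapsed wall-clock time.
ratio <- unname((timing["user"] + timing["system"]) / timing["elapsed"])

round(ratio, 3)  # 4.886, matching the reported ratio
ratio > 2.5      # TRUE, which is why the NOTE is raised
```

A ratio well above 1 means that more than one core was busy during the example, which fits the explanation that data.table's threads are responsible.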
Re: [Rd] using Paraview "in-situ" with R?
Thanks. I take it though you see "R" in this role as adding to the capabilities of the viewers, maybe adding some quick model fits over FEM results or something?

Right now I was imagining working with freefem and rolling my own c++ code with supporting use of R code. Ideally I could easily overlay stuff without messing around with temp files. There are a lot of R things, probably optimizations etc, that may be nice to view as they progress with more than just a figure of merit.

Right now I'm just trying to use Runge-Kutta on a simple orbit and the mjmdatascope output is much more useful on-the-fly than text or after the fact.

Mike Marchywka

From: George Ostrouchov
Sent: Wednesday, January 10, 2024 3:06 PM
To: r-devel@r-project.org
Cc: Mike Marchywka
Subject: Re: [Rd] using Paraview "in-situ" with R?

At ORNL, we worked with VisIt (a sibling of Paraview, both funded largely by DOE) around 2016 and made an in situ demo with R. We used packages pbdMPI (on CRAN) and pbdDMAT (on GitHub/RbigData), which were in part built for this purpose. Later also the package hola (on GitHub/RbigData) was built to connect with adios2, which can do buffered in situ connections with various codes.

But the VisIt developers were not interested in R (preferring to roll their own), so that direction fizzled. Paraview is a competitive sibling of VisIt, so I don't know if they would be interested. The packages we developed are viable for that purpose. There is a lot in R that could benefit Paraview (or VisIt).

George

> Message: 1
> Date: Tue, 9 Jan 2024 14:20:17 +
> From: Mike Marchywka
> To: R-devel
> Subject: [Rd] using Paraview "in-situ" with R?
> Content-Type: text/plain; charset="iso-8859-1"
>
> I had previously asked about R interfaces to various "other" visualization
> tools specifically lightweights for monitoring progress of
> various codes. I was working on this,
>
> https://github.com/mmarchywka/mjmdatascope
>
> but in the meantime found out that Paraview has an "in-situ"
> capability for similar objectives.
>
> https://discourse.paraview.org/t/does-or-can-paraview-support-streaming-input/13637/9
>
> While R does have a lot of plotting features,
> it seems like an excellent tool to interface to R allowing visualization
> without a bunch of temp files or
>
> Is anyone aware of anyone doing this interface or reasons it's a boondoggle?
>
> Thanks.
>
> Mike Marchywka