[Rd] Choices to remove `srcref` (and its buddies) when serializing objects

2024-01-11 Thread Dipterix Wang
Dear R devs,

I was digging into a package issue today when I realized that R's serialize 
function does not always generate the same result for equivalent objects, 
depending on how the user runs the code. For example, the following code

serialize(with(new.env(), { function(){} }), NULL, TRUE)

generates different results when I copy-paste it into the console than when I 
use Ctrl+Shift+Enter to source the file in RStudio. 

Looking deeper into the cause, I found that functions and language objects get 
a source reference when getOption("keep.source") is TRUE. The source reference 
makes the serialized functions differ even though, in most cases, keeping the 
function source does not affect how a function behaves.
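
A minimal sketch of the effect, assuming only base R: the same function text 
is parsed once with and once without source references, and the two 
serializations are compared. The names `f_src` and `f_nosrc` are made up for 
illustration.

```r
# Parse the same source text with and without source references.
f_src   <- eval(parse(text = "function() {}", keep.source = TRUE))
f_nosrc <- eval(parse(text = "function() {}", keep.source = FALSE))

# Only the version parsed with keep.source = TRUE carries a srcref attribute.
has_srcref <- c(src   = !is.null(attr(f_src, "srcref")),
                nosrc = !is.null(attr(f_nosrc, "srcref")))
print(has_srcref)

# The serialized bytes differ although both functions behave identically.
same_bytes <- identical(serialize(f_src, NULL), serialize(f_nosrc, NULL))
print(same_bytes)
```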

While it's OK for serialize itself to generate different results, functions 
such as `rlang::hash` and `digest::digest`, which depend on `serialize`, might 
eventually deliver false positives on the same inputs. I've checked the source 
code of the digest package hoping to get around this issue (for example with 
serialize(..., refhook = ...)). However, my workaround did not work. It seems 
that the markers to the objects are different even if I use `refhook` to force 
the srcref to be the same. I also tried `removeSource` and 
`rlang::zap_srcref`. Neither of them works directly on nested environments 
with multiple functions. 
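
A small sketch of that limitation, assuming only base R and utils; the 
environment `env` and the functions `helper` and `main` are hypothetical. 
`removeSource` cleans the function it is given, but other functions living in 
that function's closure environment keep their source references, and they 
are still reachable when the closure is serialized.

```r
# Define two functions with source references inside one environment.
env <- new.env()
eval(parse(text = c("helper <- function() 1",
                    "main <- function() helper()"),
           keep.source = TRUE),
     envir = env)

# Strip the source reference from `main` only.
main_clean <- utils::removeSource(env$main)

# `main_clean` itself no longer has a srcref ...
main_is_clean <- is.null(attr(main_clean, "srcref"))
# ... but `helper`, found through its closure environment, still does.
helper_has_srcref <- !is.null(attr(environment(main_clean)$helper, "srcref"))

print(c(main_is_clean = main_is_clean, helper_has_srcref = helper_has_srcref))
```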

I wonder how hard it would be to add an option to discard source references 
when serializing R objects? 

Currently my analyses heavily depend on digest function to generate file caches 
and automatically schedule pipelines (to update cache) when changes are 
detected. The pipelines save the hashes of source code, inputs, and outputs 
together so other people can easily verify the calculation without accessing 
the original data (which could be sensitive), or running hour-long analyses, or 
having to buy servers. All of these require `serialize` to produce the same 
results regardless of how users choose to run the code.

It would be great if this feature could be included in a future version of R. 
Other pipeline packages such as `targets` and `drake` could also benefit from 
it.

Thanks,

- Dipterix

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] using Paraview "in-situ" with R?

2024-01-11 Thread George Ostrouchov
Thanks for adding more explanation. As Ivan Krylov mentioned earlier, this 
sounds like an idea for developing an R package. The viewers and R largely 
operate in communities that so far have little interaction and both can benefit 
from ideas in the other. 

George

> On Jan 11, 2024, at 6:30 AM, Mike Marchywka  wrote:
> 
> Thanks. I take it though you see "R" in this role as adding to the 
> capabilities of 
> the viewers, maybe adding some quick model fits over FEM results or something?
> Right now I was imagining working with freefem and rolling my own c++ code
> with supporting use of R code. Ideally I could easily overlay stuff without
> messing around with temp files.  There are a lot of R things, probably
> optimizations etc, that may be nice to view as they progress
> with more than just a figure of merit. 
> Right now I'm just trying to use Runge-Kutta on a simple orbit 
> and the mjmdatascope output is much more useful on-the-fly 
> than text or after the fact.
> 
> 
>  Mike Marchywka 
> 44 Crosscreek Trail
> Jasper GA 30143
> was 306 Charles Cox Drive  Canton, GA 30115
> 470-758-0799
> 404-788-1216 
> 
> 
> 
> 
> 
> From: George Ostrouchov 
> Sent: Wednesday, January 10, 2024 3:06 PM
> To: r-devel@r-project.org
> Cc: Mike Marchywka
> Subject: Re:  [Rd] using Paraview "in-situ" with R?
> 
> At ORNL, we worked with VisIt (a sibling of Paraview, both funded largely by 
> DOE) around 2016 and made an in situ demo with R. We used packages pbdMPI (on 
> CRAN) and pbdDMAT (on GitHub/RbigData), which were in part built for this 
> purpose. Later also the package hola (on GitHub/RbigData) was built to 
> connect with adios2, which can do buffered in situ connections with various 
> codes.
> 
> But the VisIt developers were not interested in R (preferring to roll their 
> own), so that direction fizzled. Paraview is a competitive sibling of VisIt, 
> so I don’t know if they would be interested. The packages we developed are 
> viable for that purpose. There is a lot in R that could benefit Paraview (or 
> VisIt).
> 
> George
> 
>> 
>> Message: 1
>> Date: Tue, 9 Jan 2024 14:20:17 +
>> From: Mike Marchywka 
>> To: R-devel 
>> Subject: [Rd] using Paraview "in-situ" with R?
>> Message-ID:
>>  
>> 
>> 
>> Content-Type: text/plain; charset="iso-8859-1"
>> 
>> I had previously asked about R interfaces to various "other" visualization
>> tools specifically lightweights for monitoring progress of
>> various codes. I was working on this,
>> 
>> https://github.com/mmarchywka/mjmdatascope
>> 
>> but in the meantime found out that Paraview has an "in-situ"
>> capability for similar objectives.
>> 
>> https://discourse.paraview.org/t/does-or-can-paraview-support-streaming-input/13637/9
>> 
>> While R does have a lot of plotting features,
>> it seems like an excellent tool to interface to R allowing visualization 
>> without
>> a bunch of temp files or
>> 
>> Is anyone aware of anyone doing this interface or reasons it's a boondoggle?
>> 
>> Thanks.
>> 
>> 
>> 
>> Mike Marchywka
>> 44 Crosscreek Trail
>> Jasper GA 30143
>> was 306 Charles Cox Drive  Canton, GA 30115
>> 470-758-0799
>> 404-788-1216
>> 
> 



Re: [Bioc-devel] How to push to release branch

2024-01-11 Thread James W. MacDonald
It's not clear from your screenshot that you checked out the release branch, 
made the fix, and then tried to push? You don't want to push your devel branch 
onto the release branch. 

https://contributions.bioconductor.org/git-version-control.html#bug-fix-in-release-and-devel

-----Original Message-----
From: Bioc-devel  On Behalf Of Yue Cao via 
Bioc-devel
Sent: Wednesday, January 10, 2024 9:35 PM
To: bioc-devel@r-project.org
Subject: [Bioc-devel] How to push to release branch

Dear Bioconductor team,

I have received a package build error for scFeatures 
https://urldefense.com/v3/__https://master.bioconductor.org/checkResults/3.18/bioc-LATEST/scFeatures/nebbiolo2-checksrc.html__;!!K-Hz7m0Vt54!ndWcNrE9ea12ac-vnY2R3-qffBIu5QKAl21UGUFGS1vAgQtMtfA4DyJLJulx1KBBXJedA6BBhfyL3aPEiVyh8A$
 

It seems it is building on the RELEASE_3_18 branch. I have pushed everything to 
the devel branch, but I believe I would need to push on the release branch to 
fix this building error.

However, while I can push to the devel branch, I got errors when pushing to the 
release branch (as shown in the screenshot attachment).

Wondering if you have any clue on this. Thank you for your help.

Best regards,
Yue

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [R-pkg-devel] "Examples with CPU time > 2.5 times elapsed time" and other NOTEs on CRAN and rhub checks

2024-01-11 Thread Ivan Krylov via R-package-devel
On Thu, 11 Jan 2024 12:39:17 +
D Z  wrote:

> The package itself has no parallelism built-in, but Imports
> data.table. This NOTE does not surface on other platforms (eg using
> rhub or on my GitHub actions runners). My unit tests already limit
> data.table to 2 cores using setDTthreads(2), but I would like to keep
> this line out of the help files for my functions.

A breakpoint on pthread_create confirms that these are OpenMP threads
created by data.table. You can wrap setDTthreads(2) in \dontshow{} to
avoid visual pollution:
https://cran.r-project.org/doc/manuals/R-exts.html#index-_005cdontshow
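
A sketch of what that could look like, assuming the examples are written with 
roxygen2; `my_reader` is a hypothetical stand-in for one of the package's 
functions, and restoring the previous thread count afterwards is my own 
addition. The \dontshow{} lines are run when the examples are checked but are 
hidden from the rendered help page.

```r
#' Read a file (hypothetical placeholder function)
#'
#' @param path Path to the file to read.
#' @examples
#' \dontshow{old_threads <- data.table::setDTthreads(2)}
#' # ... visible example code goes here ...
#' \dontshow{data.table::setDTthreads(old_threads)}
my_reader <- function(path) {
  # Placeholder body; the real function would parse the file.
  path
}
```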

> I receive the NOTE that my libs/ sub-directory is at 7.7Mb. Can I
> ignore this or do I need to figure out how to reduce the binary size
> of the package?

I think this is typically accepted for packages using C++.

> And last but not least, on some rhub instances (Fedora and Ubuntu
> GCC) I receive a NOTE that the package runs its examples too slowly
> (eg above 5secs). I have already tweaked the example code so
> that it runs reliably <4 secs on my development laptop

Then it should be fine.

Additionally, you may need to cast some of your Rprintf arguments to
avoid format warnings on Windows:
https://win-builder.r-project.org/incoming_pretest/RITCH_0.1.23_20240110_120457/Windows/00check.log

-- 
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[R-pkg-devel] "Examples with CPU time > 2.5 times elapsed time" and other NOTEs on CRAN and rhub checks

2024-01-11 Thread D Z
Hi all,
I submitted my package RITCH (https://github.com/DavZim/RITCH) to CRAN (it had 
been archived and I want to revive it), but I got a NOTE (Question 1 below). 
Besides this NOTE from CRAN, I got two other NOTEs from rhub (Q2 and Q3).

Q1) The CRAN NOTE (Debian only, does not surface on Windows or other platforms) 
reads

* checking examples ... [7s/3s] NOTE
Examples with CPU time > 2.5 times elapsed time
                user system elapsed ratio
read_functions 3.968  0.092   0.831 4.886
(https://github.com/DavZim/RITCH/blob/master/R/read_functions.R in case you 
need the source code, the full CRAN report can be found here 
https://win-builder.r-project.org/incoming_pretest/RITCH_0.1.23_20240110_120457/Debian/00check.log)
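
For reference, the ratio column appears to be (user + system) / elapsed; 
this is an assumption on my part, but it reproduces the reported value from 
the timings above. A quick check in R (variable names are illustrative):

```r
# Timings reported by the check for read_functions, in seconds.
user_s    <- 3.968
system_s  <- 0.092
elapsed_s <- 0.831

# CPU time relative to wall-clock time; values well above 1 suggest that
# more than one thread was busy during the example.
ratio <- (user_s + system_s) / elapsed_s
print(round(ratio, 3))
```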

The package itself has no parallelism built-in, but Imports data.table. This 
NOTE does not surface on other platforms (eg using rhub or on my GitHub actions 
runners). My unit tests already limit data.table to 2 cores using 
setDTthreads(2), but I would like to keep this line out of the help files for 
my functions.
Is there anything that I can do or can I ignore the result and argue for an 
exception using the false positive argument?

Q2) A second question that I have is that on rhub Ubuntu Linux 20.04.1 LTS, 
R-release, GCC 
(https://artifacts.r-hub.io/RITCH_0.1.22.tar.gz-d2b925faf6b24497abbfa6ff60e51d34/RITCH.Rcheck/00check.log)
 I receive the NOTE that my libs/ sub-directory is at 7.7Mb. Can I ignore this 
or do I need to figure out how to reduce the binary size of the package?

* checking installed package size ... NOTE
  installed size is  8.6Mb
  sub-directories of 1Mb or more:
    libs   7.7Mb

My code uses Rcpp and has some classes and interdependencies between C++ 
functions, so a rewrite to make the binary smaller might take a lot of work. 
From looking around online, I see that other packages are a lot bigger. Is 
there any low-hanging fruit that I can use to reduce the size, or should I 
ignore this NOTE?

Q3) And last but not least, on some rhub instances (Fedora and Ubuntu GCC) I 
receive a NOTE that the package runs its examples too slowly (e.g. above 5 
secs). I have already tweaked the example code so that it runs reliably in 
<4 secs on my development laptop.

Ubuntu Linux 20.04.1 LTS, R-release, GCC 
(https://builder.r-hub.io/status/original/RITCH_0.1.22.tar.gz-d2b925faf6b24497abbfa6ff60e51d34)
* checking examples ... [6s/37s] NOTE
Examples with CPU (user + system) or elapsed time > 5s
               user system elapsed
read_functions 2.51  0.028   12.57

and on Fedora Linux, R-devel, clang, gfortran 
(https://builder.r-hub.io/status/original/RITCH_0.1.22.tar.gz-01bf475551eb4b30a722ea79ce421788)

* checking examples ... [6s/26s] NOTE
Examples with CPU (user + system) or elapsed time > 5s
                user system elapsed
read_functions 1.896  0.018   8.891

As this does not surface on the CRAN checks, I would ignore it for now and 
concentrate only on the CRAN checks. Is this correct or should I pay more 
attention to these NOTEs?

Any help/comment is appreciated.

Thank you for your time and best regards,
David




Re: [Rd] using Paraview "in-situ" with R?

2024-01-11 Thread Mike Marchywka
Thanks. I take it though you see "R" in this role as adding to the capabilities 
of 
the viewers, maybe adding some quick model fits over FEM results or something?
Right now I was imagining working with freefem and rolling my own c++ code
with supporting use of R code. Ideally I could easily overlay stuff without
messing around with temp files.  There are a lot of R things, probably
optimizations etc, that may be nice to view as they progress
with more than just a figure of merit. 
Right now I'm just trying to use Runge-Kutta on a simple orbit 
and the mjmdatascope output is much more useful on-the-fly 
than text or after the fact.


 Mike Marchywka 
44 Crosscreek Trail
Jasper GA 30143
was 306 Charles Cox Drive  Canton, GA 30115
470-758-0799
404-788-1216 





From: George Ostrouchov 
Sent: Wednesday, January 10, 2024 3:06 PM
To: r-devel@r-project.org
Cc: Mike Marchywka
Subject: Re:  [Rd] using Paraview "in-situ" with R?

At ORNL, we worked with VisIt (a sibling of Paraview, both funded largely by 
DOE) around 2016 and made an in situ demo with R. We used packages pbdMPI (on 
CRAN) and pbdDMAT (on GitHub/RbigData), which were in part built for this 
purpose. Later also the package hola (on GitHub/RbigData) was built to connect 
with adios2, which can do buffered in situ connections with various codes.

But the VisIt developers were not interested in R (preferring to roll their 
own), so that direction fizzled. Paraview is a competitive sibling of VisIt, so 
I don’t know if they would be interested. The packages we developed are viable 
for that purpose. There is a lot in R that could benefit Paraview (or VisIt).

George

>
> Message: 1
> Date: Tue, 9 Jan 2024 14:20:17 +
> From: Mike Marchywka 
> To: R-devel 
> Subject: [Rd] using Paraview "in-situ" with R?
> Message-ID:
>   
> 
>
> Content-Type: text/plain; charset="iso-8859-1"
>
> I had previously asked about R interfaces to various "other" visualization
> tools specifically lightweights for monitoring progress of
> various codes. I was working on this,
>
> https://github.com/mmarchywka/mjmdatascope
>
> but in the meantime found out that Paraview has an "in-situ"
> capability for similar objectives.
>
> https://discourse.paraview.org/t/does-or-can-paraview-support-streaming-input/13637/9
>
> While R does have a lot of plotting features,
> it seems like an excellent tool to interface to R allowing visualization 
> without
> a bunch of temp files or
>
> Is anyone aware of anyone doing this interface or reasons it's a boondoggle?
>
> Thanks.
>
>
>
>  Mike Marchywka
> 44 Crosscreek Trail
> Jasper GA 30143
> was 306 Charles Cox Drive  Canton, GA 30115
> 470-758-0799
> 404-788-1216
>

