Re: [Rd] Lazy-evaluate elements wrapped with invisible

2022-10-29 Thread Dipterix Wang
Sorry, what I intended to say is that

1. the expressions within `delayed` don't have to be executed if the result 
is not assigned, and 
2. the enclosing runtime environment that is potentially referenced by the 
objects within delayed() can be released immediately, whether or not the 
returned values are referenced (in contrast to the environment() approach). 

For a toy example,

a <- function() {
  v <- rnorm(1e8)
  return(delayed({
    list(m = mean(v), plot_data = 1:3)
  }))
}
m1 <- a()

can immediately release the function's runtime environment whether the result 
of `a()` is assigned to an object (`mean(v)` is evaluated) or not (the delayed 
expression is not evaluated, and is gc'ed); hence the object `v` is freed from 
memory.

Compared to the `delayedAssign+environment()` approach

b <- function(){
  v <- rnorm(1e8)
  delayedAssign('m', mean(v))
  plot_data <- 1:3
  return(environment())
}
m2 <- b()

`v` will be kept in memory until the returned environment itself gets deleted 
(rm(m2)). The situation might get tricky if the result of `b()` is further 
used and returned in nested pipes, since that might keep the parent frame, 
the grandparent frame, and so on alive as well.
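
To make the retention concrete, here is a minimal sketch of the 
`delayedAssign+environment()` approach using only base R. It uses a smaller 
vector than the example above so that it runs quickly, and the variable names 
`v_retained`, `v_released` and `m_still_available` are only there for 
illustration. The final `rm()` is a manual workaround, not the proposed 
`delayed` behaviour.

```r
# Same shape as b() above, with a smaller vector so the sketch runs quickly.
b <- function() {
  v <- rnorm(1e5)
  delayedAssign("m", mean(v))   # creates a promise; mean(v) is not computed yet
  plot_data <- 1:3
  environment()
}

m2 <- b()

# `v` is still reachable through the returned environment.
v_retained <- exists("v", envir = m2, inherits = FALSE)

# Accessing `m` forces the promise: mean(v) is computed once and cached.
m_value <- m2$m

# Manual workaround: once `m` has been forced, drop `v` by hand.
rm("v", envir = m2)
v_released <- !exists("v", envir = m2, inherits = FALSE)

# The cached value of `m` no longer depends on `v`.
m_still_available <- identical(m2$m, m_value)
```

The caller has to know that `v` exists and when it is safe to remove it, 
which is the bookkeeping a `delayed`-style return value would avoid.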

- D

> On Oct 29, 2022, at 11:54 AM, Bill Dunlap  wrote:
> 
> >  the `delayed` object is ready to be garbage collected if not assigned 
> > immediately.
> I am not sure what is meant here.  Any object (at the R code level) is ready 
> to be garbage collected if not given a name or is not part of an object with 
> a name.  Do you mean a 'delayed' component of a list should be considered 
> garbage if not 'immediately' extracted from a list?   Could you show a few 
> usage cases?
> 
> -Bill
> 
> On Fri, Oct 28, 2022 at 7:41 PM Dipterix Wang wrote:
> 
>> This is not quite true. The value, even when invisible, is captured by 
>> .Last.value, and 
>> 
>> > f <- function() invisible(5)
>> > f()
>> > .Last.value
>> [1] 5
> 
> 
> I understand .Last.value will capture the function returns, but that only 
> happens in the top-level... I guess?
> 
> In the following code, I think .Last.value does not capture the results of 
> f, h, k, l
> 
> g <- function() {
>   f(); h(); k(); l()
>   return()
> }
> g()
> 
> 
> Maybe I caused confusion by mentioning `invisible` function. I guess it 
> should be a new function (let’s call it `delayed`). The function does not 
> have to be limited to “printing”. For example, a digest key
> 
> 
> a <- function(key, value) {
>   map$set(key, value)
> 
>   return(delayed({
>     digest(value)
>   }))
> }
> 
> Or an async evaluation whose saved result might not be needed if it is not 
> assigned (detached); otherwise the result will be “joined” to the main process
> 
> a <- function(path) {
>   # async 
>   f <- future::future({
>     # calculate, and then write to path
>     saveRDS(…, path)
>   })
>   
>   return(delayed({
>     resolve(f) # wait until f finishes
> 
>     readRDS(path)
>   }))
> }
> 
> Although I could use wrappers such as formula, quosure, or environment to 
> achieve similar results, there are two major differences
> 
> 1. There is an extra call to get the lazy-evaluated results (if I do want to 
> resolve it)
> 2. The returned objects have to contain some sort of “environment” component. 
> They can’t just be simple objects like vectors, matrices, lists, … (also 
> you can't immediately garbage collect the enclosing environment)
> 
> From the implementation perspective, the `delayed` object is ready to be 
> garbage collected if not assigned immediately.
> 
> Best,
> - D
> 
>> 
>> This is not quite true. The value, even when invisible, is captured by 
>> .Last.value, and 
>> 
>> > f <- function() invisible(5)
>> > f()
>> > .Last.value
>> [1] 5
>> 
>> Now that doesn't actually preclude what you're suggesting (just have to wait 
>> for .Last.value to be populated by something else), but it does complicate 
>> it to the extent that I'm not sure the benefit we'd get would be worth it.
>> 
>> Also, in the case you're describing, you'd be pushing the computational cost 
>> into printing, which, imo, is not where it should live. Printing a value, 
>> generally speaking, should just print things, imo.
>> 
>> That said, if you really wanted to do this, you could approach the behavior 
>> you want, I believe (but again, I think this is a bad idea) by returning a 
>> custom class that wraps formula (or, I imagine, tidyverse style quosures) 
>> that reach back into the call frame you return them from, and evaluating 
>> them only on demand.
>> 
>> Best,
>> ~G 
>> 
>> 
>> This idea is somewhere between `delayedAssign` and eager evaluation. Maybe 
>> we could call it delayedInvisible()?
>> 
>> Best,
>> - Zhengjia
>> 
>> __
>> R-devel@r-project.org  mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel 
>> 
> 



Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-29 Thread Jan Gorecki
Thank you Gabriel,

Just for future readers. Below is a base R way to address this common
problem, as instructed by you (+stopifnot to suppress print).

Rscript -e 'stopifnot(file.copy("DESCRIPTION",
file.path(tdir<-tempdir(), "PACKAGES")));
db<-available.packages(paste0("file://", tdir));
install.packages(setdiff(tools::package_dependencies(read.dcf("DESCRIPTION",
fields="Package")[[1L]], db, which="most")[[1L]],
installed.packages(priority="high")[,"Package"]))'

It is a three-line, 310-character command; far from ideal, but it does work.
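
For readers who prefer a script over a one-liner, the same steps can be 
sketched as a small function. The name `dcf_dependencies` and its arguments 
are hypothetical (nothing like it exists in base R), and the sketch assumes a 
Unix-like absolute path so that the URL starts with "file:///". It uses a 
fresh subdirectory instead of tempdir() itself, and it only computes the 
package names; the install step is left as a comment.

```r
# Hypothetical helper (not part of base R): names of the packages listed in a
# DESCRIPTION file, excluding base and recommended packages.
dcf_dependencies <- function(desc_path = "DESCRIPTION", which = "most") {
  # Treat the DESCRIPTION file as the PACKAGES index of a one-package repo.
  repo_dir <- tempfile("dcf-repo-")
  dir.create(repo_dir)
  on.exit(unlink(repo_dir, recursive = TRUE), add = TRUE)
  stopifnot(file.copy(desc_path, file.path(repo_dir, "PACKAGES")))

  # Assumption: Unix-like absolute path, giving a "file:///..." URL.
  repo_url <- paste0("file://", normalizePath(repo_dir))
  db <- available.packages(contriburl = repo_url)

  # Look up the package's own name, then its direct dependencies.
  pkg <- read.dcf(desc_path, fields = "Package")[[1L]]
  deps <- tools::package_dependencies(pkg, db, which = which)[[1L]]

  # Drop base and recommended packages, which ship with R.
  setdiff(deps, installed.packages(priority = "high")[, "Package"])
}

# Possible use in a CI script (not run here, since it needs a repository):
# install.packages(dcf_dependencies("DESCRIPTION"))
```

As noted elsewhere in this thread, this ignores versioned dependencies.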

Best,
Jan


On Fri, Oct 28, 2022 at 10:42 PM Gabriel Becker  wrote:
>
> Hi Jan,
>
>
> On Fri, Oct 28, 2022 at 1:57 PM Jan Gorecki  wrote:
>>
>> Gabriel,
>>
>> It is the most basic CI use case. One wants to install only the
>> dependencies of the package, and run R CMD check on the package.
>
>
> Really what you're looking for though, is to install all the dependencies 
> which aren't present right? Excluding base packages is just a particular way 
> to do that under certain assumptions about the CI environment.
>
> So
>
>
> needed_pkgs <- setdiff(package_dependencies(...), 
> installed.packages()[,"Package"])
> install.packages(needed_pkgs, repos = fancyrepos)
>
>
> will do what you want without installing the package itself, if that is 
> important. This will filter out base and recommended packages (which will be 
> already installed in your CI container, since R is).
>
>
> Now this does not take into account versioned dependencies, so it's not 
> actually fully correct (whereas installing the package is), but it gets you 
> where you're trying to go. And in a clean CI container without cached package 
> installation for the deps, it's equivalent.
>
>
> Also, as an aside, if you need to get the base packages, you can do
>
> installed.packages(priority="base")[,"Package"]
>
>        base    compiler    datasets    graphics   grDevices        grid
>      "base"  "compiler"  "datasets"  "graphics" "grDevices"      "grid"
>     methods    parallel     splines       stats      stats4       tcltk
>   "methods"  "parallel"   "splines"     "stats"    "stats4"     "tcltk"
>       tools       utils
>     "tools"     "utils"
>
>
> (to get base and recommended packages use 'high' instead of 'base')
>
> No need to be reaching down into unexported functions. So if you *really* 
> only want to exclude base and recommended packages (which likely will give 
> you some protection from versioned dep issues), you can change the code above 
> to
>
> needed_pkgs <- setdiff(package_dependencies(...),
>                        installed.packages(priority = "high")[,"Package"])
> install.packages(needed_pkgs, repos = fancyrepos)
>
> Best,
> ~G
>
>>
>> On Fri, Oct 28, 2022 at 8:42 PM Gabriel Becker  wrote:
>> >
>> > Hi Jan,
>> >
>> > The reason, I suspect without speaking for R-core, is that by design you 
>> > should not be specifying package dependencies as additional packages to 
>> > install. install.packages already does this for you, as it did in the 
>> > construct of a repository code that I provided previously in the thread. 
>> > You should be *only* doing
>> >
>> > install.packages(, repos = *)
>> >
>> > Then everything happens automatically via extremely well tested very 
>> > mature code.
>> >
>> > I (still) don't understand why you'd need to pass install.packages the 
>> > vector of dependencies yourself, as that is counter to install.packages' 
>> > core design.
>> >
>> > Does that make sense?
>> >
>> > Best,
>> > ~G
>> >
>> > On Fri, Oct 28, 2022 at 12:18 PM Jan Gorecki  wrote:
>> >>
>> >> Gabriel,
>> >>
>> >> I am trying to design generic solution that could be applied to
>> >> arbitrary package. Therefore I went with the latter solution you
>> >> proposed.
>> >> If we didn't have to exclude base packages, then it's a three-liner
>> >>
>> >> file.copy("DESCRIPTION", file.path(tdir<-tempdir(), "PACKAGES"));
>> >> db<-available.packages(paste0("file://", tdir));
>> >> utils::install.packages(tools::package_dependencies("pkgname", db,
>> >> which="most")[[1L]])
>> >>
>> >> As you noticed, we still have to filter out base packages. Otherwise
>> >> it won't be a robust utility that can be used in CI. Therefore we have
>> >> to add a call to tools:::.get_standard_package_names() which is an
>> >> internal function (as of now). Not only complicating the call but also
>> >> putting the functionality outside of safe use.
>> >>
>> >> Considering the above, don't you agree that the following one-liner
>> >> could nicely address the problem? It is a problem that hundreds or
>> >> thousands of packages are now addressing in their CI scripts by using
>> >> third-party packages.
>> >>
>> >> utils::install.packages(packages.dcf("DESCRIPTION", which="most"))
>> >>
>> >> It is hard for me to understand why R members don't consider this basic
>> >> functionality to be part of base R. Possibly they just don't need it
>> >> themselves. Yet isn't it sufficient that hundreds or thousands of
>> >> packages do need this functionality?
>> >>
>> >> Best reg

Re: [Rd] Lazy-evaluate elements wrapped with invisible

2022-10-29 Thread Bill Dunlap
> the `delayed` object is ready to be garbage collected if not assigned
> immediately.

I am not sure what is meant here.  Any object (at the R code level) is
ready to be garbage collected if not given a name or is not part of an
object with a name.  Do you mean a 'delayed' component of a list should be
considered garbage if not 'immediately' extracted from a list?  Could you
show a few usage cases?

-Bill

On Fri, Oct 28, 2022 at 7:41 PM Dipterix Wang wrote:

>
> This is not quite true. The value, even when invisible, is captured by
> .Last.value, and
>
> > f <- function() invisible(5)
> > f()
> > .Last.value
> [1] 5
>
>
> I understand .Last.value will capture the function returns, but that only
> happens in the top-level... I guess?
>
> In the following code, I think .Last.value does not capture the results
> of f, h, k, l
>
> g <- function() {
>   f(); h(); k(); l()
>   return()
> }
> g()
>
>
> Maybe I caused confusion by mentioning `invisible` function. I guess it
> should be a new function (let’s call it `delayed`). The function does not
> have to be limited to “printing”. For example, a digest key
>
>
> a <- function(key, value) {
>   map$set(key, value)
>
>   return(delayed({
>     digest(value)
>   }))
> }
>
> Or an async evaluation whose saved result might not be needed if it is not
> assigned (detached); otherwise the result will be “joined” to the main process
>
> a <- function(path) {
>   # async
>   f <- future::future({
>     # calculate, and then write to path
>     saveRDS(…, path)
>   })
>
>   return(delayed({
>     resolve(f) # wait until f finishes
>
>     readRDS(path)
>   }))
> }
>
> Although I could use wrappers such as formula, quosure, or environment to
> achieve similar results, there are two major differences
>
> 1. There is an extra call to get the lazy-evaluated results (if I do want
> to resolve it)
> 2. The returned objects have to contain some sort of “environment” component.
> They can’t just be simple objects like vectors, matrices, lists, …
> (also you can't immediately garbage collect the enclosing environment)
>
> From the implementation perspective, the `delayed` object is ready to be
> garbage collected if not assigned immediately.
>
> Best,
> - D
>
>
> This is not quite true. The value, even when invisible, is captured by
> .Last.value, and
>
> > f <- function() invisible(5)
> > f()
> > .Last.value
> [1] 5
>
> Now that doesn't actually preclude what you're suggesting (just have to
> wait for .Last.value to be populated by something else), but it does
> complicate it to the extent that I'm not sure the benefit we'd get would be
> worth it.
>
> Also, in the case you're describing, you'd be pushing the computational
> cost into printing, which, imo, is not where it should live. Printing a
> value, generally speaking, should just print things, imo.
>
> That said, if you really wanted to do this, you could approach the
> behavior you want, I believe (but again, I think this is a bad idea) by
> returning a custom class that wraps formula (or, I imagine, tidyverse style
> quosures) that reach back into the call frame you return them from, and
> evaluating them only on demand.
>
> Best,
> ~G
>
>
>> This idea is somewhere between `delayedAssign` and eager evaluation.
>> Maybe we could call it delayedInvisible()?
>>
>> Best,
>> - Zhengjia
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
>
