Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-15 Thread Simon Urbanek


Jan,

I think using a single DCF as input is not very practical and would not be 
useful in the context you describe (creating self-contained repos), since that 
typically concerns a list of packages. However, splitting out the part of 
install.packages() which determines which files will be pulled from where would 
be very useful, as it would be trivial to use it to create a repository (what we 
always do in corporate environments) instead of installing the packages. I 
suspect that install.packages() is already too complex, so instead of adding a 
flag to install.packages() one could move that functionality into a separate 
function. We all do that constantly for the sites we manage, so it would 
certainly be something worthwhile.
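
A rough sketch of what the planning half of such a separate function might look like. The name plan_repo() and the toy database are assumptions made up for illustration; nothing like this exists in base R today. It only computes which packages a self-contained repository would need and leaves the actual download and indexing as commented usage.

```r
library(tools)

# Hypothetical helper (not part of base R): given root packages and a package
# database shaped like available.packages() output, return every package that
# a self-contained repository would need, excluding base packages.
plan_repo <- function(pkgs, db,
                      which = c("Depends", "Imports", "LinkingTo")) {
  deps <- package_dependencies(pkgs, db = db, which = which, recursive = TRUE)
  needed <- unique(c(pkgs, unlist(deps, use.names = FALSE)))
  # Base packages ship with R itself and are never pulled from a repository.
  base <- rownames(installed.packages(priority = "base"))
  setdiff(needed, base)
}

# Toy database standing in for available.packages(); all names are invented.
db <- cbind(
  Package   = c("app", "mid", "leaf"),
  Depends   = c("R (>= 4.0.0), methods", NA_character_, NA_character_),
  Imports   = c("mid", "leaf, stats", NA_character_),
  LinkingTo = NA_character_
)
rownames(db) <- db[, "Package"]

needed <- plan_repo("app", db)
print(needed)

# With network access one might then do something like (not run here):
# db <- available.packages()
# download.packages(plan_repo("rtables", db), destdir = "repo/src/contrib",
#                   available = db, type = "source")
# write_PACKAGES("repo/src/contrib", type = "source")
```

Keeping the planning step separate from the download step is the point of the sketch: the same list could feed either an install or a mirror.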

Cheers,
Simon


> On Oct 15, 2022, at 7:14 PM, Jan Gorecki  wrote:
> 
> Hi Gabriel,
> 
> It's a very nice usage you provided here. Maybe instead of adding a new
> function we could extend package_dependencies then, to accept a file path to
> a DCF file?
> 
> What about repos.dcf? Maybe additional repositories could be an attribute
> attached to the returned character vector.
> 
> The use case is, for given package sources, to obtain its dependencies,
> so one can use that for installing them, mirroring a CRAN subset, or whatever.
> The latter is especially important for a production environment where one
> wants to have fixed versions of packages, and mirroring the relevant subset of
> CRAN is the simplest, and IMO reliable, way to manage such an environment.
> 
> Regards
> Jan
> 
> On Fri, Oct 14, 2022, 23:34 Gabriel Becker  wrote:
> 
>> Hi Jan and Jan,
>> 
>> Can you explain a little more what exactly you want the non-recursive,
>> non-version aware dependencies from an individual package for?
>> 
>> Either way package_dependencies will do this for you* with a little
>> "aggressive convincing". It wants output from available.packages, but who
>> really cares what it wants? It's a function and we are people :)
>> 
>> > library(tools)
>> > db <- read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION")
>> > package_dependencies("rtables", db, which = intersect(c("Depends",
>> "Suggests", "Imports", "LinkingTo"), colnames(db)))
>> $rtables
>>  [1] "methods"    "magrittr"   "formatters" "dplyr"      "tibble"
>>  [6] "tidyr"      "testthat"   "xml2"       "knitr"      "rmarkdown"
>> [11] "flextable"  "officer"    "stats"      "htmltools"  "grid"
>> 
>> 
>> The only gotcha that I see immediately is that "LinkingTo" isn't always
>> there (whereas it is with real output from available.packages). If you
>> know your package doesn't have that (or that it does) at call time , this
>> becomes a one-liner:
>> 
>> package_dependencies("rtables", db =
>> read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION"), which =
>> c("Depends", "Suggests", "Imports"))
>> $rtables
>>  [1] "methods"    "magrittr"   "formatters" "dplyr"      "tibble"
>>  [6] "tidyr"      "testthat"   "xml2"       "knitr"      "rmarkdown"
>> [11] "flextable"  "officer"    "stats"      "htmltools"  "grid"
>> 
>> You can also trick it a slightly different way by giving it what it
>> actually wants
>> 
>> > tdir <- tempdir()
>> > file.copy("~/gabe/checkedout/rtables_clean/DESCRIPTION", file.path(tdir,
>> "PACKAGES"))
>> [1] TRUE
>> > avl <- available.packages(paste0("file://", tdir))
>> > library(tools)
>> > package_dependencies("rtables", avl)
>> $rtables
>> [1] "methods"    "magrittr"   "formatters" "stats"      "htmltools"
>> [6] "grid"
>> 
>> > package_dependencies("rtables", avl, which = "all")
>> $rtables
>>  [1] "methods"    "magrittr"   "formatters" "stats"      "htmltools"
>>  [6] "grid"       "dplyr"      "tibble"     "tidyr"      "testthat"
>> [11] "xml2"       "knitr"      "rmarkdown"  "flextable"  "officer"
>> 
>> So the only real benefits I see that we'd be picking up here is automatic
>> filtering by priority, and automatic extraction of the package name from
>> the DESCRIPTION file. I'm not sure either of those warrant a new exported
>> function that R-core has to maintain forever.
>> 
>> Best,
>> ~G
>> 
>> * I haven't tested this across all OSes, but I don't know of any reason it
>> wouldn't work generally.
>> 
>> On Fri, Oct 14, 2022 at 2:33 PM Jan Gorecki  wrote:
>> 
>>> Hello Jan,
>>> 
>>> Thanks for confirming about many packages reinventing this missing
>>> functionality.
>>> packages.dcf was not meant to handle versions. It just extracts names of
>>> dependencies... Yes, such a simple thing, yet missing in base R.
>>> 
>>> Versions of packages can be controlled when setting up an R pkgs repo. This
>>> is how I used to handle it: making a CRAN subset mirror of fixed-version
>>> pkgs.
>>> BTW. function for that is also included in mentioned branch. I am just not
>>> proposing it, to increase the chance of having at least this simple,
>>> missing, functionality merged.
>>> 
>>> Best
>>> Jan
>>> 
>>> On Fri, Oct 14, 2022, 15:14 Jan Netík  wrote:
>>> 
 Hello Jan,
 
 I have seen many packages that implemented dependencies 

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-15 Thread Gabriel Becker
On Fri, Oct 14, 2022 at 11:14 PM Jan Gorecki  wrote:

> Hi Gabriel,
>
> It's a very nice usage you provided here. Maybe instead of adding a new
> function we could extend package_dependencies then, to accept a file path to
> a DCF file?
>
> What about repos.dcf? Maybe additional repositories could be an attribute
> attached to the returned character vector.
>
> The use case is, for given package sources, to obtain its dependencies,
> so one can use that for installing them, mirroring a CRAN subset, or whatever.
> The latter is especially important for a production environment where one
> wants to have fixed versions of packages, and mirroring the relevant subset of
> CRAN is the simplest, and IMO reliable, way to manage such an environment.
>

Right. That's why I asked though, because this only makes sense to do
recursively (i.e. collectively). Packages cannot meaningfully be treated in
isolation in R. If you capture/mirror the non-recursive dependencies only,
your package won't be (re-)installable.

What you actually want is either a frozen slice of CRAN, or a description
of either your full package library or the full recursive subset of it
relevant to a particular package (switchr was designed to do both of these
things easily, as an aside). Neither of these is achievable by looking at
an individual DESCRIPTION file.
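
To make the recursive point concrete, here is a small sketch on an invented three-package database (the package names are made up; only package_dependencies() is real). The direct dependencies of the top package miss everything further down the chain, which is exactly what would leave a mirror non-installable.

```r
library(tools)

# Toy database shaped like available.packages() output; names are invented.
db <- cbind(
  Package   = c("top", "helper", "deep"),
  Depends   = NA_character_,
  Imports   = c("helper", "deep", NA_character_),
  LinkingTo = NA_character_
)

# Direct dependencies only: what a single DESCRIPTION file can tell you.
direct <- package_dependencies("top", db, recursive = FALSE)$top

# Full closure: what is actually needed to install "top" from a mirror.
recursive <- package_dependencies("top", db, recursive = TRUE)$top

print(direct)
print(recursive)

# Packages a non-recursive mirror would be missing.
missing_if_direct_only <- setdiff(recursive, direct)
print(missing_if_direct_only)
```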

Here's another fun trick if you don't want to just use switchr and let it
take care of it for you:

> .libPaths()
[1] "/Users/gabrielbecker/Rlib/syswide-4.1.2"
[2] "/Library/Frameworks/R.framework/Versions/4.1/Resources/library"
> write_PACKAGES(.libPaths()[1], unpacked = TRUE, validate = FALSE)
> avl <- available.packages(paste0("file://", .libPaths()[1]))
> head(avl)
              Package         Version  Priority
abind         "abind"         "1.4-5"  NA
AnnotationDbi "AnnotationDbi" "1.56.2" NA
askpass       "askpass"       "1.1"    NA
assertthat    "assertthat"    "0.2.1"  NA
backports     "backports"     "1.4.1"  NA
base64enc     "base64enc"     "0.1-3"  NA
              Depends
abind         "R (>= 1.5.0)"
AnnotationDbi "R (>= 2.7.0), methods, utils, stats4, BiocGenerics (>=\n0.29.2), Biobase (>= 1.17.0), IRanges"
askpass       NA
assertthat    NA
backports     "R (>= 3.0.0)"
base64enc     "R (>= 2.9.0)"
              Imports                                                LinkingTo
abind         "methods, utils"                                       NA
AnnotationDbi "DBI, RSQLite, S4Vectors (>= 0.9.25), stats, KEGGREST" NA
askpass       "sys (>= 2.1)"                                         NA
assertthat    "tools"                                                NA
backports     NA                                                     NA
base64enc     NA                                                     NA
              Suggests
abind         NA
AnnotationDbi "hgu95av2.db, GO.db, org.Sc.sgd.db, org.At.tair.db, RUnit,\nTxDb.Hsapiens.UCSC.hg19.knownGene, org.Hs.eg.db, reactome.db,\nAnnotationForge, graph, EnsDb.Hsapiens.v75, BiocStyle, knitr"
askpass       "testthat"
assertthat    "testthat, covr"
backports     NA
base64enc     NA
              Enhances License              License_is_FOSS
abind         NA       "LGPL (>= 2)"        NA
AnnotationDbi NA       "Artistic-2.0"       NA
askpass       NA       "MIT + file LICENSE" NA
assertthat    NA       "GPL-3"              NA
backports     NA       "GPL-2 | GPL-3"      NA
base64enc     "png"    "GPL-2 | GPL-3"      NA
              License_restricts_use OS_type Archs               MD5sum
abind         NA                    NA      NA                  NA
AnnotationDbi NA                    NA      NA                  NA
askpass       NA                    NA      "askpass.so.dSYM"   NA
assertthat    NA                    NA      NA                  NA
backports     NA                    NA      "backports.so.dSYM" NA
base64enc     NA                    NA      "base64enc.so.dSYM" NA
              NeedsCompilation File
abind         "no"             NA
AnnotationDbi "no"             NA
askpass       "yes"            NA
assertthat    "no"             NA
backports     "yes"            NA
base64enc     "yes"            NA
              Repository
abind         "file:///Users/gabrielbecker/Rlib/syswide-4.1.2"
AnnotationDbi "file:///Users/gabrielbecker/Rlib/syswide-4.1.2"
askpass       "file:///Users/gabrielbecker/Rlib/syswide-4.1.2"
assertthat    "file:///Users/gabrielbecker/Rlib/syswide-4.1.2"
backports     "file:///Users/gabrielbecker/Rlib/syswide-4.1.2"
base64enc     "file:///Users/gabrielbecker/Rlib/syswide-4.1.2"
> package_dependencies("rtables", avl, recursive = TRUE)
$rtables
 [1] "methods"    "magrittr"   "formatters" "stats"      "htmltools"
 [6] "grid"       "utils"      "digest"     "grDevices"  "base64enc"
[11] "rlang"      "fastmap"

> package_dependencies("rtables", avl, which = "all", recursive = TRUE)
$rtables
  [1] "methods"
  [2] "magrittr"
  [3] "formatters"
  [4] "stats"
  [5] "htmltools"
  [6] "grid"
  [7] "dplyr"
  [8] "tibble"

 
[653] "rjson"
[654] "rsolr"

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-15 Thread Jan Gorecki
Hi Gabriel,

It's a very nice usage you provided here. Maybe instead of adding a new
function we could extend package_dependencies then, to accept a file path to
a DCF file?

What about repos.dcf? Maybe additional repositories could be an attribute
attached to the returned character vector.

The use case is, for given package sources, to obtain its dependencies,
so one can use that for installing them, mirroring a CRAN subset, or whatever.
The latter is especially important for a production environment where one
wants to have fixed versions of packages, and mirroring the relevant subset of
CRAN is the simplest, and IMO reliable, way to manage such an environment.
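
A minimal sketch of the idea, written as a wrapper rather than a change to package_dependencies itself. The function name description_dependencies() and the "repos" attribute are assumptions for illustration only; it relies on the read.dcf() trick quoted below and on the standard Additional_repositories field of DESCRIPTION.

```r
library(tools)

# Hypothetical wrapper (name and "repos" attribute are assumptions, not an
# existing tools function): direct dependencies of one DESCRIPTION file.
description_dependencies <- function(path,
                                     which = c("Depends", "Imports", "LinkingTo")) {
  db <- read.dcf(path)
  # Only ask for fields the file really has; "LinkingTo" is often absent.
  which <- intersect(which, colnames(db))
  # Take the package name from the file instead of requiring it as input.
  pkg <- unname(db[1, "Package"])
  deps <- package_dependencies(pkg, db = db, which = which)[[pkg]]
  # Carry extra repositories along as an attribute, if the field is present.
  if ("Additional_repositories" %in% colnames(db)) {
    repos <- strsplit(db[1, "Additional_repositories"], ",")[[1]]
    attr(deps, "repos") <- trimws(repos)
  }
  deps
}

# Self-contained demo on a throwaway DESCRIPTION file with made-up contents.
desc <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: demo",
  "Version: 0.1.0",
  "Depends: R (>= 3.6.0), methods",
  "Imports: stats, utils",
  "Additional_repositories: https://example.org/repo"
), desc)

deps <- description_dependencies(desc)
print(deps)
```

Note that this still returns direct, version-unaware dependencies only, so it covers the extraction step and not the mirroring step.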

Regards
Jan

On Fri, Oct 14, 2022, 23:34 Gabriel Becker  wrote:

> Hi Jan and Jan,
>
> Can you explain a little more what exactly you want the non-recursive,
> non-version aware dependencies from an individual package for?
>
> Either way package_dependencies will do this for you* with a little
> "aggressive convincing". It wants output from available.packages, but who
> really cares what it wants? It's a function and we are people :)
>
> > library(tools)
> > db <- read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION")
> > package_dependencies("rtables", db, which = intersect(c("Depends",
> "Suggests", "Imports", "LinkingTo"), colnames(db)))
> $rtables
>  [1] "methods"    "magrittr"   "formatters" "dplyr"      "tibble"
>  [6] "tidyr"      "testthat"   "xml2"       "knitr"      "rmarkdown"
> [11] "flextable"  "officer"    "stats"      "htmltools"  "grid"
>
>
> The only gotcha that I see immediately is that "LinkingTo" isn't always
> there (whereas it is with real output from available.packages). If you
> know your package doesn't have that (or that it does) at call time , this
> becomes a one-liner:
>
> package_dependencies("rtables", db =
> read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION"), which =
> c("Depends", "Suggests", "Imports"))
> $rtables
>  [1] "methods"    "magrittr"   "formatters" "dplyr"      "tibble"
>  [6] "tidyr"      "testthat"   "xml2"       "knitr"      "rmarkdown"
> [11] "flextable"  "officer"    "stats"      "htmltools"  "grid"
>
> You can also trick it a slightly different way by giving it what it
> actually wants
>
> > tdir <- tempdir()
> > file.copy("~/gabe/checkedout/rtables_clean/DESCRIPTION", file.path(tdir,
> "PACKAGES"))
> [1] TRUE
> > avl <- available.packages(paste0("file://", tdir))
> > library(tools)
> > package_dependencies("rtables", avl)
> $rtables
> [1] "methods"    "magrittr"   "formatters" "stats"      "htmltools"
> [6] "grid"
>
> > package_dependencies("rtables", avl, which = "all")
> $rtables
>  [1] "methods"    "magrittr"   "formatters" "stats"      "htmltools"
>  [6] "grid"       "dplyr"      "tibble"     "tidyr"      "testthat"
> [11] "xml2"       "knitr"      "rmarkdown"  "flextable"  "officer"
>
> So the only real benefits I see that we'd be picking up here is automatic
> filtering by priority, and automatic extraction of the package name from
> the DESCRIPTION file. I'm not sure either of those warrant a new exported
> function that R-core has to maintain forever.
>
> Best,
> ~G
>
> * I haven't tested this across all OSes, but I don't know of any reason it
> wouldn't work generally.
>
> On Fri, Oct 14, 2022 at 2:33 PM Jan Gorecki  wrote:
>
>> Hello Jan,
>>
>> Thanks for confirming about many packages reinventing this missing
>> functionality.
>> packages.dcf was not meant to handle versions. It just extracts names of
>> dependencies... Yes, such a simple thing, yet missing in base R.
>>
>> Versions of packages can be controlled when setting up an R pkgs repo. This
>> is how I used to handle it: making a CRAN subset mirror of fixed-version
>> pkgs.
>> BTW. function for that is also included in mentioned branch. I am just not
>> proposing it, to increase the chance of having at least this simple,
>> missing, functionality merged.
>>
>> Best
>> Jan
>>
>> On Fri, Oct 14, 2022, 15:14 Jan Netík  wrote:
>>
>> > Hello Jan,
>> >
>> > I have seen many packages that implemented dependencies "extraction" on
>> > their own for internal purposes and today I was doing exactly that for
>> > mine. It's not a big deal using read.dcf on DESCRIPTION. It was sufficient
>> > for me, but I had to take care of some \n chars (the overall returned value
>> > has some rough edges, in my opinion). However, the function from the branch
>> > seems to not care about version requirements, which are crucial for me.
>> > Maybe that is something to reconsider before merging.
>> >
>> > Best,
>> > Jan
>> >
>> > On Fri, Oct 14, 2022 at 2:27, Jan Gorecki wrote:
>> >
>> >> Dear R devs,
>> >>
>> >> I would like to raise a request for a simple helper function:
>> >> a utility function to extract package dependencies from a DESCRIPTION file.
>> >>
>> >> I do think that the tools package is a better place for such fundamental
>> >> functionality than community packages.
>> >>
>> >> tools pkg seems perfect fit (having already great function
>>