Re: [R-pkg-devel] check cross-references error: Non-file package-anchored link(s)

2020-06-15 Thread David Hugh-Jones
On this note, I just got

Non-file package-anchored link(s) in documentation object
'brk_width-for-datetime.Rd':
  ‘[lubridate:%m+%]{lubridate::add_with_rollback()}’

The correct filename appears to be %m+% in the lubridate help. Can anyone
tell me the right way to format this? I would work it out myself, but the
check didn't cause problems on the r-devel systems I tested with, so I'd be
testing blind.
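
One way to find the file name without guessing is to read the alias index of the installed package. This is a minimal sketch: it assumes installed packages ship a help/aliases.rds file mapping each topic to the base name of its Rd file, and rd_file_for_topic() and rd_link() are hypothetical helper names. It is shown here with a base package so that it runs anywhere; with lubridate installed, the analogous call would be rd_file_for_topic("%m+%", "lubridate").

```r
# Look up which Rd file documents a topic in an installed package.
# Assumption: installed packages ship a help/aliases.rds file that maps
# each alias (topic) to the base name of its Rd file.
rd_file_for_topic <- function(topic, pkg) {
  path <- system.file("help", "aliases.rds", package = pkg)
  if (!nzchar(path)) stop("no alias index found for package ", pkg)
  aliases <- readRDS(path)
  unname(aliases[topic])
}

# Build a file-anchored link from the looked-up file name.
# Note: Rd-special characters such as % in the display text still need
# escaping by the caller.
rd_link <- function(topic, pkg, text = topic) {
  sprintf("\\link[%s:%s]{%s}", pkg, rd_file_for_topic(topic, pkg), text)
}

# Example with a base package: the topic %in% lives in the Rd file "match".
rd_file_for_topic("%in%", "base")
rd_link("mean", "base")
```

The point of the sketch is only that the part after the colon in a package-anchored link is the Rd file name, which can differ from the topic.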

Cheers,
David


On Mon, 15 Jun 2020 at 17:30, Duncan Murdoch wrote:

> On 15/06/2020 12:05 p.m., Martin Maechler wrote:
> >> Duncan Murdoch   on Sun, 14 Jun 2020 07:28:03 -0400 writes:
> >
> >  > I agree with almost everything you wrote, except one thing:  this isn't
> >  > newly enforced, it has been enforced since the help system began.  What
> >  > I think is new is that there are now tests for it.  Previously those
> >  > links just wouldn't work.
> >
> >  > Duncan Murdoch
> >
> > Yes, to all... including Duncan's agreement with Gábor.
> >
> > Also, Duncan M earlier did mention that he had wanted to
> > *change* the link-to-file behavior for these cases (when he
> > wrote most of the Rd2html source code) but somehow did not get it.
>
> Actually, I don't think I pushed for this change at the time (or at
> least I didn't push much).  I just wish now that I had, because I think
> it will be harder to do it now than it would have been then.
>
> Duncan
>
> >
> > And that's why we had partial workarounds (such as the dynamic server
> > still finding the links under some circumstances).
> >
> > My personal opinion was also that "we" (the R community; i.e.,
> > people providing good patches to the R sources / collaborating
> > with R core / ...) should rather work to fix the current
> > design/implementation "infelicity" than have the current checks
> > start to enforce something which is really a wart in my view,
> > and indeed, as Gábor also notes, will create R source
> > documentation that depends on implementation details of other
> > packages' documentation.
> > I don't like it either, not at all.
> >
> > Martin
> >
> >  > On 14/06/2020 6:26 a.m., Gábor Csárdi wrote:
> >  >> On Sun, Jun 14, 2020 at 10:44 AM Duncan Murdoch
> >  >>  wrote:
> >  >> [...]
> >  >>>
> >  >>> I think the argument was that static builds of the help pages would have
> >  >>> trouble resolving the links.  With the current system, you can build a
> >  >>> help page that links to a page in package foo even if package foo is not
> >  >>> installed yet, and have the link work later after you install foo.
> >  >>
> >  >> That is true, but it is also not a big problem, I think. The CRAN
> >  >> Windows R installer does indeed build static help pages by default.
> >  >> But the built-in web server that serves these works around broken
> >  >> links by treating them as help topics instead of files. As you know.
> >  >> :) So this would only be a problem if you wanted to serve the static
> >  >> help pages with another web server. (Which is not a bad use case, but
> >  >> then maybe Rd2HTML() can just resolve them as topics and avoid the
> >  >> broken links.)
> >  >>
> >  >> Btw. the problem of linking to the wrong page is even worse with
> >  >> static builds of help pages, because if a link w/o a package (e.g.
> >  >> \link{filter}) picks up the wrong package at install time, then the
> >  >> wrong link is hard-coded in the html. If you are building binary
> >  >> packages, then they will link to the wrong help pages.
> >  >>
> >  >> WRE says that specifying the package in the link is rarely needed.
> >  >> This was probably the case some time ago, especially when packages did
> >  >> not have (compulsory) namespaces. But I am not sure if it still holds.
> >  >> I would argue that it is better to specify the package you are linking
> >  >> to. But the newly enforced requirement that we need to link to files
> >  >> instead of topics makes this more error prone.
> >  >>
> >  >> Gabor
> >  >>
> >  >> [...]
> >
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>


__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Uncaught use of internal functions from other packages in R CMD check

2020-06-15 Thread Bert Gunter
Do note that ?asNamespace says:
"Not **intended** to be called directly," (emphasis added)
but not  "should never be called directly" or some such. I don't know if
this makes a difference to package checking, but it isn't clear to me that
it would.

Cheers,
Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, Jun 15, 2020 at 3:03 PM Balasubramanian Narasimhan <
na...@stanford.edu> wrote:

> At least one package on CRAN uses
>
> get("foo", envir=asNamespace("imported_package"))
>
> and passes check.
>
> Is this known?
>
> -Naras


[R-pkg-devel] Uncaught use of internal functions from other packages in R CMD check

2020-06-15 Thread Balasubramanian Narasimhan

At least one package on CRAN uses

get("foo", envir=asNamespace("imported_package"))

and passes check.

Is this known?
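
For readers unfamiliar with the pattern, the sketch below shows that get() on a namespace environment reaches unexported objects just as ":::" does. It uses plot.lm from stats purely as an example of an object that exists in a namespace without being exported; it says nothing about which of these forms R CMD check detects.

```r
# plot.lm lives in the stats namespace but is not exported
# (it is registered as an S3 method instead).
exported <- "plot.lm" %in% getNamespaceExports("stats")

# Three ways to reach the same unexported object.
via_get   <- get("plot.lm", envir = asNamespace("stats"))
via_colon <- stats:::plot.lm
via_utils <- utils::getFromNamespace("plot.lm", "stats")

exported
identical(via_get, via_colon)
identical(via_get, via_utils)
```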

-Naras



Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Tomas Kalibera

On 6/15/20 6:52 PM, Duncan Murdoch wrote:
> On 15/06/2020 12:30 p.m., Daniel Kelley wrote:
>> Duncan, thanks very much for that very helpful hint.  I got the results
>> below.  My guess is that the first column in rdx$variables is an address
>> offset, and so it seems that the lion's share of the storage is dedicated
>> to items with names starting with a decimal point.  For example, the "[["
>> item is at offset of nearly 4M.  I may try fiddling with my code in which
>> I specialize that method, to see whether I can reduce the memory
>> footprint.  From what I can gather, both linux and windows build
>> argoFloats into a package with R directory of about 2.5M size, which is a
>> lot better than what I get in macOS but still over the warning threshold
>> (I think) and therefore I worry about CRAN acceptance.
>
> The second column is the size, so actually the lion's share is
> dedicated to things that are not being shown.  They are indexed in the
> rdx$references list, and are probably going to be harder to track
> down, because they probably don't have names assigned by you.
>
> For example, in the rgl package, I see
>
> > rdx$references
> $`env::1`
> [1]  661 1037
>
> $`env::10`
> [1] 123952    221
>
> $`env::11`
> [1] 126378    224
>
> $`env::12`
> [1] 128575    226
>
> [ many more deleted ]
>
> Presumably `env::1` is an environment which might be referenced by
> several of the functions, and I'm guessing that one of yours is really
> big.  This can happen accidentally:  you have a temporary local
> variable in a function and create and save another function, or a
> formula, or some other environment-using object, and save the useless
> local variable along with it.
>
> I don't have a good suggestion for figuring out what's in the bad
> environment; maybe someone else can suggest how to read an object from
> the .rdb file using R code.  Internally R uses C code for this.


I think this can be done using lazyLoadDBfetch(key, file, compressed, 
hook). "key" is c(128575L, 226L) for "env::12" above. See 
library/base/R/lazyload.R. "file" is the .rdb file, "compressed" is 
TRUE, and "hook" for this purpose can be "function(x) NULL".
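
A minimal sketch of this suggestion, under some assumptions: fetch_rdb_object() is a hypothetical helper name, the installed files of the stats package stand in for your own package's .rdx and .rdb files, and the hook simply replaces every referenced environment by NULL, so the result is for inspection only.

```r
# Fetch one object from a package's lazy-load database by name.
# Assumptions: the installed package has R/<pkg>.rdx and R/<pkg>.rdb, and
# lazyLoadDBfetch(key, file, compressed, hook) is used as described above;
# fetch_rdb_object() is a hypothetical helper name.
fetch_rdb_object <- function(pkg, name) {
  rdx <- readRDS(system.file("R", paste0(pkg, ".rdx"), package = pkg))
  rdb <- system.file("R", paste0(pkg, ".rdb"), package = pkg)
  # Keys are c(offset, size); look in variables first, then references.
  key <- rdx$variables[[name]]
  if (is.null(key)) key <- rdx$references[[name]]
  if (is.null(key)) stop("no entry named ", name)
  # The hook replaces every referenced environment by NULL.
  lazyLoadDBfetch(key, rdb, rdx$compressed, function(x) NULL)
}

obj <- fetch_rdb_object("stats", "sd")
class(obj)
```

To look at one of the large "env::N" entries, pass that name instead of a variable name.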


Tomas



Duncan Murdoch



Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Daniel Kelley
I got the same error as Duncan, but then I saw 
https://stackoverflow.com/questions/54144239/how-to-use-saverds-refhook-parameter
and, with the modified function below, got more information.  It will take me 
a while to go through the output, but basically I am seeing code from the "oce" 
package, which makes me think I ought to remove it from the "Imports:" field of 
my DESCRIPTION, list it under "Suggests:" instead, and then grab things at 
runtime, as needed.  Maybe that will trim down the .rdb file; e.g. the output 
includes about 1000 lines that correspond to a particular oce function (or a 
set of specialized versions of a generic, actually), which accounts for a size 
of 38649 in a references$`env` entry.

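# refhook for unserialize(): called with the name of each persistent
# reference (e.g. "env::12"); assumes rdx has been read with readRDS().
unser <- function(s) {
  i <- as.numeric(s)
  return(rdx$variables[[i]])
}

# filename: the .rdb file; offset, size: a pair of values from the .rdx
readRDB <- function(filename, offset, size, type = 'gzip') {
  f <- file(filename, 'rb')
  on.exit(close(f))
  seek(f, offset + 4)
  unserialize(memDecompress(readBin(f, 'raw', size - 4), type),
              refhook = unser)
}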



Dan E. Kelley [he/him/his 314ppm]
Professor and Senator
Department of Oceanography
Dalhousie University
PO BOX 15000
Halifax, NS, Canada B3H 4R2
(902)494-1694  dan.kel...@dal.ca









On Jun 15, 2020, at 4:12 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> On 15/06/2020 1:24 p.m., Ivan Krylov wrote:
>> On Mon, 15 Jun 2020 12:52:20 -0400
>> Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>>
>>> maybe someone else can suggest how to read an object from
>>> the .rdb file using R code.  Internally R uses C code for this.
>>
>> This function seems to work for me:
>>
>> # filename: the .rdb file
>> # offset, size: the pair of values from the .rdx
>> # type: 'gzip' if $compressed is TRUE, 'bzip2' for 2, 'xz' for 3
>> readRDB <- function(filename, offset, size, type = 'gzip') {
>>   f <- file(filename, 'rb')
>>   on.exit(close(f))
>>   seek(f, offset + 4)
>>   unserialize(memDecompress(readBin(f, 'raw', size - 4), type))
>> }
>
> Thanks, though it didn't work for me.  I get
>
> Error in unserialize(memDecompress(readBin(f, "raw", size - 4), type)) :
>  no restore method available
>
> on every object I tried.  However, maybe Dan will have better luck.
>
> Duncan Murdoch




Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Duncan Murdoch

On 15/06/2020 1:24 p.m., Ivan Krylov wrote:
> On Mon, 15 Jun 2020 12:52:20 -0400
> Duncan Murdoch  wrote:
>
>> maybe someone else can suggest how to read an object from
>> the .rdb file using R code.  Internally R uses C code for this.
>
> This function seems to work for me:
>
> # filename: the .rdb file
> # offset, size: the pair of values from the .rdx
> # type: 'gzip' if $compressed is TRUE, 'bzip2' for 2, 'xz' for 3
> readRDB <- function(filename, offset, size, type = 'gzip') {
>   f <- file(filename, 'rb')
>   on.exit(close(f))
>   seek(f, offset + 4)
>   unserialize(memDecompress(readBin(f, 'raw', size - 4), type))
> }



Thanks, though it didn't work for me.  I get

Error in unserialize(memDecompress(readBin(f, "raw", size - 4), type)) :
  no restore method available

on every object I tried.  However, maybe Dan will have better luck.
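
The error can be reproduced without any .rdb file. The sketch below is an illustration under assumptions, not a description of how R builds its lazy-load database: it writes an environment as a named persistent reference (the name "env::1" is chosen only to resemble the entries in rdx$references), then shows that unserialize() fails without a refhook and succeeds with one.

```r
# Serialize a list holding an environment, writing the environment as a
# named persistent reference (similar in spirit to the "env::N" entries).
e <- new.env()
bytes <- serialize(
  list(value = 1, env = e), NULL,
  refhook = function(x) if (is.environment(x)) "env::1" else NULL
)

# Without a refhook, unserialize() cannot resolve the reference and
# signals an error; keep the message instead of stopping.
msg <- tryCatch(unserialize(bytes), error = function(err) conditionMessage(err))

# With a refhook, each reference name is passed to the hook; here the
# referenced environment is simply replaced by NULL.
obj <- unserialize(bytes, refhook = function(name) NULL)

msg
names(obj)
```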

Duncan Murdoch



Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Ivan Krylov
On Mon, 15 Jun 2020 12:52:20 -0400
Duncan Murdoch  wrote:

> maybe someone else can suggest how to read an object from 
> the .rdb file using R code.  Internally R uses C code for this.

This function seems to work for me:

# filename: the .rdb file
# offset, size: the pair of values from the .rdx
# type: 'gzip' if $compressed is TRUE, 'bzip2' for 2, 'xz' for 3
readRDB <- function(filename, offset, size, type = 'gzip') {
  f <- file(filename, 'rb')
  on.exit(close(f))
  seek(f, offset + 4)
  unserialize(memDecompress(readBin(f, 'raw', size - 4), type))
}

-- 
Best regards,
Ivan



Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Duncan Murdoch

On 15/06/2020 12:30 p.m., Daniel Kelley wrote:
> Duncan, thanks very much for that very helpful hint.  I got the results
> below.  My guess is that the first column in rdx$variables is an address
> offset, and so it seems that the lion's share of the storage is
> dedicated to items with names starting with a decimal point.  For
> example, the "[[" item is at offset of nearly 4M.  I may try fiddling
> with my code in which I specialize that method, to see whether I can
> reduce the memory footprint.  From what I can gather, both linux and
> windows build argoFloats into a package with R directory of about 2.5M
> size, which is a lot better than what I get in macOS but still over the
> warning threshold (I think) and therefore I worry about CRAN acceptance.


The second column is the size, so actually the lion's share is dedicated 
to things that are not being shown.  They are indexed in the 
rdx$references list, and are probably going to be harder to track down, 
because they probably don't have names assigned by you.


For example, in the rgl package, I see

> rdx$references
$`env::1`
[1]  661 1037

$`env::10`
[1] 123952    221

$`env::11`
[1] 126378    224

$`env::12`
[1] 128575    226

[ many more deleted ]

Presumably `env::1` is an environment which might be referenced by 
several of the functions, and I'm guessing that one of yours is really 
big.  This can happen accidentally:  you have a temporary local variable 
in a function and create and save another function, or a formula, or 
some other environment-using object, and save the useless local variable 
along with it.
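
A small sketch of how this happens, using the serialized size as a rough proxy for the space an object takes in the .rdb file. The function names are made up for the illustration.

```r
# A function created inside another function keeps that function's
# local variables alive through its enclosing environment.
make_leaky <- function() {
  big <- numeric(1e5)   # temporary, never used by the returned function
  function(x) x + 1
}

# Same result, but the temporary is removed before returning.
make_lean <- function() {
  big <- numeric(1e5)
  rm(big)
  function(x) x + 1
}

# Serialized size in bytes, as a rough proxy for space in the .rdb file.
ser_size <- function(obj) length(serialize(obj, NULL))

ser_size(make_leaky())
ser_size(make_lean())
```

The first size includes the 1e5 doubles held in the enclosing environment; the second does not.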


I don't have a good suggestion for figuring out what's in the bad 
environment; maybe someone else can suggest how to read an object from 
the .rdb file using R code.  Internally R uses C code for this.


Duncan Murdoch



Re: [R-pkg-devel] check cross-references error: Non-file package-anchored link(s)

2020-06-15 Thread Duncan Murdoch

On 15/06/2020 12:05 p.m., Martin Maechler wrote:
>> Duncan Murdoch   on Sun, 14 Jun 2020 07:28:03 -0400 writes:
>
>  > I agree with almost everything you wrote, except one thing:  this isn't
>  > newly enforced, it has been enforced since the help system began.  What
>  > I think is new is that there are now tests for it.  Previously those
>  > links just wouldn't work.
>
>  > Duncan Murdoch
>
> Yes, to all... including Duncan's agreement with Gábor.
>
> Also, Duncan M earlier did mention that he had wanted to
> *change* the link-to-file behavior for these cases (when he
> wrote most of the Rd2html source code) but somehow did not get it.

Actually, I don't think I pushed for this change at the time (or at 
least I didn't push much).  I just wish now that I had, because I think 
it will be harder to do it now than it would have been then.

Duncan

> And that's why we had partial workarounds (such as the dynamic server
> still finding the links under some circumstances).
>
> My personal opinion was also that "we" (the R community; i.e.,
> people providing good patches to the R sources / collaborating
> with R core / ...) should rather work to fix the current
> design/implementation "infelicity" than have the current checks
> start to enforce something which is really a wart in my view,
> and indeed, as Gábor also notes, will create R source
> documentation that depends on implementation details of other
> packages' documentation.
> I don't like it either, not at all.
>
> Martin

 > On 14/06/2020 6:26 a.m., Gábor Csárdi wrote:
 >> On Sun, Jun 14, 2020 at 10:44 AM Duncan Murdoch
 >>  wrote:
 >> [...]
 >>>
 >>> I think the argument was that static builds of the help pages would have
 >>> trouble resolving the links.  With the current system, you can build a
 >>> help page that links to a page in package foo even if package foo is not
 >>> installed yet, and have the link work later after you install foo.
 >>
 >> That is true, but it is also not a big problem, I think. The CRAN
 >> Windows R installer does indeed build static help pages by default.
 >> But the built-in web server that serves these works around broken
 >> links by treating them as help topics instead of files. As you know.
 >> :) So this would only be a problem if you wanted to serve the static
 >> help pages with another web server. (Which is not a bad use case, but
 >> then maybe Rd2HTML() can just resolve them as topics and avoid the
 >> broken links.)
 >>
 >> Btw. the problem of linking to the wrong page is even worse with
 >> static builds of help pages, because if a link w/o a package (e.g.
 >> \link{filter}) picks up the wrong package at install time, then the
 >> wrong link is hard-coded in the html. If you are building binary
 >> packages, then they will link to the wrong help pages.
 >>
 >> WRE says that specifying the package in the link is rarely needed.
 >> This was probably the case some time ago, especially when packages did
 >> not have (compulsory) namespaces. But I am not sure if it still holds.
 >> I would argue that it is better to specify the package you are linking
 >> to. But the newly enforced requirement that we need to link to files
 >> instead of topics makes this more error prone.
 >>
 >> Gabor
 >>
 >> [...]





Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Daniel Kelley
Duncan, thanks very much for that very helpful hint.  I got the results below.
My guess is that the first column in rdx$variables is an address offset, and so it 
seems that the lion's share of the storage is dedicated to items with names 
starting with a decimal point.  For example, the "[[" item is at offset of 
nearly 4M.  I may try fiddling with my code in which I specialize that method, 
to see whether I can reduce the memory footprint.  From what I can gather, both 
linux and windows build argoFloats into a package with R directory of about 
2.5M size, which is a lot better than what I get in macOS but still over the 
warning threshold (I think) and therefore I worry about CRAN acceptance.

R version 4.0.1 (2020-06-06) -- "See Things Now"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin17.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> rdx <- readRDS("~/Library/R/4.0/library/argoFloats/R/argoFloats.rdx")
> sizes <- sapply(rdx$variables, function(n) n[2])
> cat(str(rdx$variables))
List of 23
 $ .__C__argoFloats        : int [1:2] 0 299
 $ .__NAMESPACE__.         : int [1:2] 1431 51
 $ .__S3MethodsTable__.    : int [1:2] 1611 51
 $ .__T__[[:base           : int [1:2] 3914410 51
 $ .__T__initialize:methods: int [1:2] 4075727 53
 $ .__T__merge:base        : int [1:2] 4174252 53
 $ .__T__plot:base         : int [1:2] 6684924 53
 $ .__T__show:methods      : int [1:2] 6761290 53
 $ .__T__subset:base       : int [1:2] 7255136 53
 $ .__T__summary:base      : int [1:2] 7476028 53
 $ .packageName            : int [1:2] 7476081 54
 $ argoFloatsDebug         : int [1:2] 7476135 2035
 $ argoUseAdjusted         : int [1:2] 7478170 3736
 $ downloadWithRetries     : int [1:2] 7481906 3369
 $ geographical            : int [1:2] 7485275 40
 $ getIndex                : int [1:2] 7485315 9870
 $ getProfileFromUrl       : int [1:2] 7495185 1337
 $ getProfiles             : int [1:2] 7496522 2843
 $ merge                   : int [1:2] 7499365 455
 $ plot                    : int [1:2] 7499820 455
 $ readProfiles            : int [1:2] 7500275 6887
 $ subset                  : int [1:2] 7507162 443
 $ summary                 : int [1:2] 7507605 447
> sum(sizes)
[1] 32741
> system("ls -l ~/Library/R/4.0/library/argoFloats/R/argoFloats.rdb")
-rw-r--r--  1 kelley  staff  7508052 15 Jun 07:47 /Users/kelley/Library/R/4.0/library/argoFloats/R/argoFloats.rdb
>







On Jun 15, 2020, at 10:50 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:

> rdx <- readRDS("foo.rdx")
> sizes <- sapply(rdx$variables, function(n) n[2])




Re: [R-pkg-devel] check cross-references error: Non-file package-anchored link(s)

2020-06-15 Thread Martin Maechler
> Duncan Murdoch   on Sun, 14 Jun 2020 07:28:03 -0400 writes:

> I agree with almost everything you wrote, except one thing:  this isn't 
> newly enforced, it has been enforced since the help system began.  What 
> I think is new is that there are now tests for it.  Previously those 
> links just wouldn't work.

> Duncan Murdoch

Yes, to all... including Duncan's agreement with Gábor.

Also, Duncan M earlier did mention that he had wanted to
*change* the link-to-file behavior for these cases (when he
wrote most of the Rd2html source code) but somehow did not get it.

And that's why we had partial workarounds (such as the dynamic server
still finding the links under some circumstances).

My personal opinion was also that "we" (the R community; i.e.,
people providing good patches to the R sources / collaborating
with R core / ...) should rather work to fix the current
design/implementation "infelicity" than have the current checks
start to enforce something which is really a wart in my view,
and indeed, as Gábor also notes, will create R source
documentation that depends on implementation details of other
packages' documentation.
I don't like it either, not at all.

Martin

> On 14/06/2020 6:26 a.m., Gábor Csárdi wrote:
>> On Sun, Jun 14, 2020 at 10:44 AM Duncan Murdoch
>>  wrote:
>> [...]
>>> 
>>> I think the argument was that static builds of the help pages would have
>>> trouble resolving the links.  With the current system, you can build a
>>> help page that links to a page in package foo even if package foo is not
>>> installed yet, and have the link work later after you install foo.
>> 
>> That is true, but it is also not a big problem, I think. The CRAN
>> Windows R installer does indeed build static help pages by default.
>> But the built-in web server that serves these works around broken
>> links by treating them as help topics instead of files. As you know.
>> :) So this would only be a problem if you wanted to serve the static
>> help pages with another web server. (Which is not a bad use case, but
>> then maybe Rd2HTML() can just resolve them as topics and avoid the
>> broken links.)
>> 
>> Btw. the problem of linking to the wrong page is even worse with
>> static builds of help pages, because if a link w/o a package (e.g.
>> \link{filter}) picks up the wrong package at install time, then the
>> wrong link is hard-coded in the html. If you are building binary
>> packages, then they will link to the wrong help pages.
>> 
>> WRE says that specifying the package in the link is rarely needed.
>> This was probably the case some time ago, especially when packages did
>> not have (compulsory) namespaces. But I am not sure if it still holds.
>> I would argue that it is better to specify the package you are linking
>> to. But the newly enforced requirement that we need to link to files
>> instead of topics makes this more error prone.
>> 
>> Gabor
>> 
>> [...]



Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Ivan Krylov
On Mon, 15 Jun 2020 11:13:21 +
Daniel Kelley  wrote:

> A possible clue is that I get a large-file note on macOS, but not
> when I use rhub for test linux builds, or winbuilder for a windows
> build.  I do not have ready access to either linux or windows
> machines, to examine those builds in detail.

For what it's worth, if I build your package on Linux with R 3.6.3 and
--no-build-vignettes, R/argoFloats.rdb comes out at ~2.4M when installed
on either the same Linux machine or R-hub's macOS "10.13.6 High Sierra,
R-release, CRAN's setup" [*]. Perhaps you would be able to find a
difference between artifacts from the R-hub installation and your own.

-- 
Best regards,
Ivan

[*]
https://artifacts.r-hub.io/argoFloats_0.1.3.tar.gz-b35fd40886b0a6c6b1b173a34944df14



Re: [R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Duncan Murdoch
I can't install your package (I don't have an up to date GDAL), but 
maybe this is some help:


- Package dependencies aren't included, except possibly for static 
linking of C/Fortran/C++ code.  Those normally won't end up in an .rdb file.


- .rdb files are part of the lazy load mechanism.  You can read the 
corresponding .rdx file using readRDS(); it contains information on 
where to look in the .rdb file to find the source of an object.  For 
example, if you have foo.rdb and foo.rdx, then this will tell you what's 
big in your foo.rdb file:


rdx <- readRDS("foo.rdx")
sizes <- sapply(rdx$variables, function(n) n[2])

Now sizes will be a named vector of objects contained in the rdb.  You 
should find that sum(sizes) is similar to the size of the .rdb file, but 
probably a bit smaller, because there are some objects missed by this 
count:  the ones contained in rdx$references.
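
The comparison can be sketched as follows. rdb_summary() is a hypothetical helper, each entry of rdx$variables and rdx$references is taken to be c(offset, size) as described above, and the stats package's installed files are used only so that the example runs anywhere; substitute your own package's .rdx file.

```r
# Summarise where the bytes in a lazy-load database go.
# rdb_summary() is a hypothetical helper; each entry of rdx$variables and
# rdx$references is assumed to be c(offset, size).
rdb_summary <- function(rdx_file) {
  rdx <- readRDS(rdx_file)
  size_of <- function(entries) {
    vapply(entries, function(n) as.numeric(n[2]), numeric(1))
  }
  var_sizes <- size_of(rdx$variables)
  ref_sizes <- size_of(rdx$references)
  list(
    variables  = sum(var_sizes),
    references = sum(ref_sizes),
    largest    = head(sort(c(var_sizes, ref_sizes), decreasing = TRUE), 5)
  )
}

# Example with an installed base package.
rdx_file <- system.file("R", "stats.rdx", package = "stats")
s <- rdb_summary(rdx_file)
s$variables + s$references
file.size(sub("\\.rdx$", ".rdb", rdx_file))
```

If the references total dominates, the names in s$largest show which "env::N" entries to inspect first.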


Duncan Murdoch


On 15/06/2020 7:13 a.m., Daniel Kelley wrote:

I am working on a package (https://github.com/ArgoCanada/argoFloats) that has a 
412K source tarball (most of which is data; the R code is 176K), but that 
creates a library .rdb file of MUCH larger size, namely 7.2M.  This file causes 
a build NOTE, being over the threshold of 1M, and that concerns me in terms of 
hoped-for submission to CRAN during this summer.

My goal in writing this email is to get some advice regarding reducing the size 
of the .rdb file, if indeed this is a general problem and not an artifact of my 
(macOS) development environment.

Here's some more detail:

argoFloats depends on some other packages, and so I am wondering whether the large 
multiplier between R source and .rdb file is because the other sources are dragged in.  I 
could try moving everything to "Suggests", and use requireNamespace(), but that 
seems to go against recommendations, if I interpret Wickham and Bryan 
(https://r-pkgs.org/description.html) correctly.

A possible clue is that I get a large-file note on macOS, but not when I use 
rhub for test linux builds, or winbuilder for a windows build.  I do not have 
ready access to either linux or windows machines, to examine those builds in 
detail.

My thinking is that examination of the .rdb file might help me to learn about problems (e.g. if it holds code 
from packages I "import" from, that might motivate me to move from "import" to 
"suggest"). Unfortunately, I have not been able to discover a way to examine that file, which seems 
to be designed for internal R use.

I am attaching below my signature line the output from sessionInfo(), in case 
that helps.  The URL I reference in my second paragraph has my DESCRIPTION 
file, and I will admit that I do not fully understand its nuances.  Note that I 
use roxygen2 to build documentation and NAMESPACE.

Any advice would be greatly appreciated, and indeed I thank anyone who got to 
the bottom of this long email.

Dan E. Kelley [he/him/his 314ppm]
Department of Oceanography
Dalhousie University
Halifax, NS, Canada



R version 4.0.1 (2020-06-06)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] argoFloats_0.1.3

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6        pillar_1.4.4        compiler_4.0.1      plyr_1.8.6          class_7.3-17
 [6] tools_4.0.1         testthat_2.3.2      digest_0.6.25       bit_1.1-15.2        ncdf4_1.17
[11] oce_1.2-1           memoise_1.1.0       RSQLite_2.2.0       lifecycle_0.2.0     tibble_3.0.1
[16] gtable_0.3.0        lattice_0.20-41     gsw_1.0-6           pkgconfig_2.0.3     rlang_0.4.6
[21] DBI_1.1.0           rstudioapi_0.11     curl_4.3            e1071_1.7-3         dplyr_1.0.0
[26] stringr_1.4.0       raster_3.1-5        generics_0.0.2      vctrs_0.3.1         classInt_0.4-3
[31] bit64_0.9-7         grid_4.0.1          tidyselect_1.1.0    glue_1.4.1          sf_0.9-4
[36] R6_2.4.1            sp_1.4-2            marmap_1.0.4        adehabitatMA_0.3.14 blob_1.2.1
[41] ggplot2_3.3.1       purrr_0.3.4         reshape2_1.4.4      magrittr_1.5        units_0.6-6
[46] scales_1.1.1        codetools_0.2-16    ellipsis_0.3.1      shape_1.4.4         colorspace_1.4-1
[51] KernSmooth_2.23-17  stringi_1.4.6       munsell_0.5.0       crayon_1.3.4.9000



[R-pkg-devel] how to prevent a small package from yielding a large installed size?

2020-06-15 Thread Daniel Kelley
I am working on a package (https://github.com/ArgoCanada/argoFloats) that has a 
412K source tarball (most of which is data; the R code is 176K), but that 
creates a library .rdb file of MUCH larger size, namely 7.2M.  This file causes 
a build NOTE, being over the threshold of 1M, and that concerns me in terms of 
hoped-for submission to CRAN during this summer.

My goal in writing this email is to get some advice regarding reducing the size 
of the .rdb file, if indeed this is a general problem and not an artifact of my 
(macOS) development environment.

Here's some more detail:

argoFloats depends on some other packages, and so I am wondering whether the 
large multiplier between R source and .rdb file is because the other sources 
are dragged in.  I could try moving everything to "Suggests", and use 
requireNamespace(), but that seems to go against recommendations, if I 
interpret Wickham and Bryan (https://r-pkgs.org/description.html) correctly.
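
For reference, the "Suggests" plus requireNamespace() pattern mentioned above usually looks something like the sketch below. call_suggested() is a hypothetical helper and not part of argoFloats, and stats is used in the example only because it is always installed. Whether this change would affect the size of the .rdb file is a separate question that the sketch does not address.

```r
# Use a package listed under Suggests only if it is installed.
# call_suggested() is a hypothetical helper, not part of argoFloats.
call_suggested <- function(pkg, fun, ...) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is needed for this function; please install it.",
         call. = FALSE)
  }
  # getExportedValue() is the programmatic form of pkg::fun.
  getExportedValue(pkg, fun)(...)
}

# Example with a package that is always available.
call_suggested("stats", "median", c(1, 5, 3))
```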

A possible clue is that I get a large-file note on macOS, but not when I use 
rhub for test linux builds, or winbuilder for a windows build.  I do not have 
ready access to either linux or windows machines, to examine those builds in 
detail.

My thinking is that examination of the .rdb file might help me to learn about 
problems (e.g. if it holds code from packages I "import" from, that might 
motivate me to move from "import" to "suggest"). Unfortunately, I have not been 
able to discover a way to examine that file, which seems to be designed for 
internal R use.

I am attaching below my signature line the output from sessionInfo(), in case 
that helps.  The URL I reference in my second paragraph has my DESCRIPTION 
file, and I will admit that I do not fully understand its nuances.  Note that I 
use roxygen2 to build documentation and NAMESPACE.

Any advice would be greatly appreciated, and indeed I thank anyone who got to 
the bottom of this long email.

Dan E. Kelley [he/him/his 314ppm]
Department of Oceanography
Dalhousie University
Halifax, NS, Canada



R version 4.0.1 (2020-06-06)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] argoFloats_0.1.3

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.4.6        pillar_1.4.4        compiler_4.0.1      plyr_1.8.6          class_7.3-17
 [6] tools_4.0.1         testthat_2.3.2      digest_0.6.25       bit_1.1-15.2        ncdf4_1.17
[11] oce_1.2-1           memoise_1.1.0       RSQLite_2.2.0       lifecycle_0.2.0     tibble_3.0.1
[16] gtable_0.3.0        lattice_0.20-41     gsw_1.0-6           pkgconfig_2.0.3     rlang_0.4.6
[21] DBI_1.1.0           rstudioapi_0.11     curl_4.3            e1071_1.7-3         dplyr_1.0.0
[26] stringr_1.4.0       raster_3.1-5        generics_0.0.2      vctrs_0.3.1         classInt_0.4-3
[31] bit64_0.9-7         grid_4.0.1          tidyselect_1.1.0    glue_1.4.1          sf_0.9-4
[36] R6_2.4.1            sp_1.4-2            marmap_1.0.4        adehabitatMA_0.3.14 blob_1.2.1
[41] ggplot2_3.3.1       purrr_0.3.4         reshape2_1.4.4      magrittr_1.5        units_0.6-6
[46] scales_1.1.1        codetools_0.2-16    ellipsis_0.3.1      shape_1.4.4         colorspace_1.4-1
[51] KernSmooth_2.23-17  stringi_1.4.6       munsell_0.5.0       crayon_1.3.4.9000
