Re: [R-pkg-devel] Courtesy methods and explosive dependencies

2018-05-25 Thread Duncan Murdoch

On 25/05/2018 3:22 PM, Lenth, Russell V wrote:

>> There can't really be an "ImportGenerics", because S3 is so informal.  A
>> generic function is a function that calls UseMethod, but it can do anything
>> else as well.  So R would need some fancy code analysis to know whether it
>> was safe to import the generic but not all the dependencies of that package,
>> and that could change when the package holding the generic was updated.
>>
>> Examples of generics that do more than simply call UseMethod are rare, but
>> they exist:  as.data.frame() and sort() are a couple.
>>
>> Duncan Murdoch
>
> Right -- ImportGenerics was a dumb idea. However, something like ImportNoDeps
> makes sense as an additional category in the DESCRIPTION file. In my example,
> if such existed, I could put in the DESCRIPTION file:
>
>  ImportNoDeps: multcomp
>
> This would signal that multcomp -- but not necessarily any of its dependencies
> -- is required to install emmeans.  And in the NAMESPACE file:
>
>  importFrom(multcomp, cld)
>  S3method(cld, emmGrid)
>
> ... would import the needed generic and register my new S3 method.
>
> I think this kind of construct could significantly reduce dependencies for
> packages that extend other packages' methods.


But what happens if the multcomp maintainer modifies cld() next week 
(and makes no other change), so that it makes use of multcomp 
dependencies?  Your package would be claiming that it didn't need them, 
while multcomp would not have changed any of its declarations, just the 
code for that one function.  How would R know to generate an error in 
your package (because its claim about multcomp was now false)?  If it 
didn't, you'd end up in a situation where calls to cld() failed because 
the cld() generic no longer worked.


This is just too complicated.  If multcomp says that it needs 8 
packages, you shouldn't be able to install it without all of them.


On the other hand, I think you can do what you want as follows:

1.  Put multcomp into your Suggests list.  Users can choose to install 
it or not.


2.  Put a conditional define of the generic into one of your .R source 
files to define it for users who don't have multcomp:


# Fallback generic, defined only when multcomp cannot be loaded
if (!requireNamespace("multcomp", quietly = TRUE))
  cld <- function(object, ...) UseMethod("cld")

3.  In your NAMESPACE file, export cld, and conditionally import it for 
users who do have it:


export(cld)

if (requireNamespace("multcomp", quietly = TRUE))
  importFrom(multcomp, cld)


I haven't tested this much at all, but it appears to work on a very 
superficial test.


Duncan Murdoch

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
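
A minimal sketch of steps 2 and 3 above, written as a plain script rather than a package so it can be run without building anything. The class name demoGrid and the method body are made up for illustration, and the first branch of the if only stands in for the importFrom() directive. One caveat worth checking for a real package: top-level code in a package's R files is evaluated when the package is installed, not each time it is loaded, so the test reflects whether multcomp was available at install time.

```r
# Use multcomp's generic when it is installed; otherwise define a
# fallback generic with the same name.
if (requireNamespace("multcomp", quietly = TRUE)) {
  cld <- multcomp::cld          # stands in for importFrom(multcomp, cld)
} else {
  cld <- function(object, ...) UseMethod("cld")
}

# A courtesy method for a made-up class (hypothetical, illustration only).
cld.demoGrid <- function(object, ...) {
  paste("cld method called for class", class(object)[1])
}

x <- structure(list(), class = "demoGrid")
result <- cld(x)
print(result)
```

Either branch should dispatch to the same method, which is the property the recipe relies on.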


Re: [R-pkg-devel] Courtesy methods and explosive dependencies

2018-05-25 Thread Lenth, Russell V
> There can't really be an "ImportGenerics", because S3 is so informal.  A 
> generic function is a function that calls UseMethod, but it can do anything 
> else as well.  So R would need some fancy code analysis to know whether it 
> was safe to import the generic but not all the dependencies of that package, 
> and that could change when the package holding the generic was updated.
> 
> Examples of generics that do more than simply call UseMethod are rare, but 
> they exist:  as.data.frame() and sort() are a couple.
> 
> Duncan Murdoch

Right -- ImportGenerics was a dumb idea. However, something like ImportNoDeps 
makes sense as an additional category in the DESCRIPTION file. In my example, 
if such existed, I could put in the DESCRIPTION file:

ImportNoDeps: multcomp

This would signal that multcomp -- but not necessarily any of its dependencies 
-- is required to install emmeans.  And in the NAMESPACE file:

importFrom(multcomp, cld)
S3method(cld, emmGrid)

... would import the needed generic and register my new S3 method.

I think this kind of construct could significantly reduce dependencies for 
packages that extend other packages' methods.

Russ


Re: [R-pkg-devel] Courtesy methods and explosive dependencies

2018-05-25 Thread Duncan Murdoch

On 25/05/2018 11:38 AM, Lenth, Russell V wrote:

> I agree that most of the package dependencies in multcomp are worth having, but
> that is not the point. The point is that if a developer wants to write a method
> for a generic function offered in another non-base package, that creates false
> dependencies: packages that users are required to have, but that aren't
> actually used by the method. I certainly wasn't trying to diss multcomp; that
> was just a concrete illustration.


There can't really be an "ImportGenerics", because S3 is so informal.  A 
generic function is a function that calls UseMethod, but it can do 
anything else as well.  So R would need some fancy code analysis to know 
whether it was safe to import the generic but not all the dependencies 
of that package, and that could change when the package holding the 
generic was updated.


Examples of generics that do more than simply call UseMethod are rare, 
but they exist:  as.data.frame() and sort() are a couple.


Duncan Murdoch
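
To make the distinction concrete, here is a rough check of whether a function's body is nothing but a UseMethod() call. is_plain_generic is a made-up helper, not part of R, and it only looks at the top-level shape of the body, so it is an illustration and not the kind of code analysis described above.

```r
# Hypothetical helper: TRUE when the body of `f` is a bare UseMethod() call,
# optionally wrapped in braces that contain exactly one expression.
is_plain_generic <- function(f) {
  b <- body(f)
  if (is.call(b) && identical(b[[1]], as.name("{")) && length(b) == 2) {
    b <- b[[2]]
  }
  is.call(b) && identical(b[[1]], as.name("UseMethod"))
}

checks <- c(
  print = is_plain_generic(print),
  sort = is_plain_generic(sort),
  as.data.frame = is_plain_generic(as.data.frame)
)
print(checks)
```

On the R versions I am aware of, this reports TRUE for print and FALSE for sort and as.data.frame, in line with the examples named in the message.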



Russ

-----Original Message-----
From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
Sent: Friday, May 25, 2018 2:13 AM
To: Lenth, Russell V <russell-le...@uiowa.edu>
Cc: r-package-devel@r-project.org
Subject: Re: [R-pkg-devel] Courtesy methods and explosive dependencies


Lenth, Russell V
 on Thu, 24 May 2018 23:14:42 + writes:


 > Package developers, I am trying to severely cut down on
 > the number of dependencies of my package emmeans. It is
 > now 48, which is quite a few (but that is down from over
 > 100 in the preceding version, where I made the unwise
 > choice of including a particularly greedy package in
 > Imports). I hate to force users to install dozens of
 > packages that they don't really need or want, but it seems
 > very hard to avoid.

 > Case in point: emmeans provides additional methods for
 > 'cld' and 'glht' from the multcomp package. Accordingly, I
 > import their generics, and register my additional
 > methods. But because I import the generics, I must list
 > multcomp in Imports, and that results in the addition of
 > 16 required packages, some of which I never use. More
 > important, I believe that NONE of those 16 packages are
 > required for the correct functioning of my courtesy
 > methods. The only things I really need are the generics.

There must be a mistake here -- I think in your perception:

'multcomp' does *not* have excessive dependencies (though I'd say one too much):


tools::package_dependencies("multcomp")
$multcomp
[1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
[7] "codetools"

tools::package_dependencies("multcomp", recursive=TRUE)
$multcomp
 [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
 [7] "codetools" "methods"   "utils"     "zoo"       "Matrix"    "splines"
[13] "MASS"      "grDevices" "grid"      "lattice"





Apart from "base + recommended" packages (which *everyone* has installed), 
these are just 4 packages:

  mvtnorm
  TH.data
  sandwich
  zoo

where  mvtnorm, sandwich, and zoo  really are among the (formally undefined) 
recommended-level-2 R packages... so I do wonder if you really had needed to 
install.

The 'TH.data' { TH <==>  maintainer("multcomp") } package I think should not be in the 
strict dependencies of 'multcomp' but rather in its "Suggests" something I'd say must be 
true for all data packages:
The whole idea of data packages is that they should be needed for interesting 
help page examples, vignettes, maybe even tests, but not for package 
functionality.

In sum: I'd strongly advise to not change from keeping multcomp among your 
imports.

Martin Maechler
ETH Zurich
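
The count of four can be reproduced from the recursive listing above by removing the base and recommended packages. The sketch below hard-codes the listing from this message and the standard base and recommended package sets instead of querying CRAN, so it runs offline; the variable names are only illustrative.

```r
# Recursive dependencies of multcomp, copied from the listing in this message.
recursive_deps <- c(
  "stats", "graphics", "mvtnorm", "survival", "TH.data", "sandwich",
  "codetools", "methods", "utils", "zoo", "Matrix", "splines",
  "MASS", "grDevices", "grid", "lattice"
)

# Packages that ship with a standard R installation.
base_pkgs <- c(
  "base", "compiler", "datasets", "graphics", "grDevices", "grid",
  "methods", "parallel", "splines", "stats", "stats4", "tcltk",
  "tools", "utils"
)
recommended_pkgs <- c(
  "boot", "class", "cluster", "codetools", "foreign", "KernSmooth",
  "lattice", "MASS", "Matrix", "mgcv", "nlme", "nnet", "rpart",
  "spatial", "survival"
)

# What remains is what a user might actually have to install.
extra <- setdiff(recursive_deps, c(base_pkgs, recommended_pkgs))
print(extra)
# [1] "mvtnorm"  "TH.data"  "sandwich" "zoo"
```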

 > On the flip side, emmeans defines some generics of its own
 > that I invite other package developers to extend so that
 > emmeans supports their models. If those packages import
 > emmeans, there is an overhead of 48 dependencies; again,
 > most of these are packages that are not needed at all for
 > those packages' methods to work. I don't like being
 > responsible for that.

 > I believe this is a very common problem, not just with my
 > own packages. It's one thing to extend a base method like
 > 'print'; but extending methods in contributed packages
 > creates these dependency explosions. I have hundreds of
 > packages installed on my system - a couple dozen I care
 > about, another few dozen that seem fairly desirable, and a
 > couple hundred that I d

Re: [R-pkg-devel] Courtesy methods and explosive dependencies

2018-05-25 Thread Neal Fultz
In the estimatr package, we provided a shim to support broom without
transitively depending on the tidyverse:


tidy <- function(object, ...) {
  if (requireNamespace("broom", quietly = TRUE)) {
    broom::tidy(object, ...)
  } else {
    UseMethod("tidy")
  }
}

https://github.com/DeclareDesign/estimatr/blob/master/R/S3_tidy.R

It's not ideal but it seems to make our functions work whether or not the
end user has broom.


S4 methods (eg texreg) were more difficult, and we ended up registering the
generics in .onLoad:


.onLoad <- function(libname, pkgname) {
  if (suppressWarnings(requireNamespace("texreg", quietly = TRUE))) {
    setGeneric("extract", function(model, ...) standardGeneric("extract"),
      package = "texreg"
    )
    setMethod("extract",
      signature = className("lm_robust", pkgname),
      definition = extract.lm_robust
    )
  }
  invisible()
}

https://github.com/DeclareDesign/estimatr/blob/master/R/zzz.R


Since this is in .onLoad, it's a little buggy if the user has estimatr and
then installs texreg, but again seems to mostly work.
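
One possible way to soften the load-order problem within a running session is a load hook, so that the registration also runs if the other package is loaded after yours. This is only a sketch using base R's setHook() and packageEvent(); register_when_loaded and the counter are hypothetical names, and in real use the registration function would hold the setGeneric()/setMethod() calls shown above. Nothing happens until the texreg namespace is actually loaded.

```r
# Hypothetical helper: run `register` now if `pkg` is already loaded, and
# again whenever the namespace of `pkg` is loaded later in this session.
register_when_loaded <- function(pkg, register) {
  if (isNamespaceLoaded(pkg)) register()
  setHook(packageEvent(pkg, "onLoad"), function(...) register())
  invisible(NULL)
}

# Demonstration with a stand-in for the real registration code.
n_registrations <- 0L
register_when_loaded("texreg", function() {
  n_registrations <<- n_registrations + 1L
})

# The hook is now stored and will fire when texreg is loaded.
hooks <- getHook(packageEvent("texreg", "onLoad"))
print(length(hooks) >= 1L)
```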

We would appreciate any feedback the list may have on these design patterns.


-Neal




On Fri, May 25, 2018 at 8:38 AM, Lenth, Russell V <russell-le...@uiowa.edu>
wrote:

> I agree that most of the package dependencies in multcomp are worth
> having, but that is not the point. The point is that if a developer wants
> to write a method for a generic function offered in another non-base
> package, that creates false dependencies: packages that users are required
> to have, but that aren't actually used by the method. I certainly wasn't
> trying to diss multcomp; that was just a concrete illustration.
>
> Russ
>
> -----Original Message-----
> From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
> Sent: Friday, May 25, 2018 2:13 AM
> To: Lenth, Russell V <russell-le...@uiowa.edu>
> Cc: r-package-devel@r-project.org
> Subject: Re: [R-pkg-devel] Courtesy methods and explosive dependencies
>
> >>>>> Lenth, Russell V
> >>>>> on Thu, 24 May 2018 23:14:42 + writes:
>
> > Package developers, I am trying to severely cut down on
> > the number of dependencies of my package emmeans. It is
> > now 48, which is quite a few (but that is down from over
> > 100 in the preceding version, where I made the unwise
> > choice of including a particularly greedy package in
> > Imports). I hate to force users to install dozens of
> > packages that they don't really need or want, but it seems
> > very hard to avoid.
>
> > Case in point: emmeans provides additional methods for
> > 'cld' and 'glht' from the multcomp package. Accordingly, I
> > import their generics, and register my additional
> > methods. But because I import the generics, I must list
> > multcomp in Imports, and that results in the addition of
> > 16 required packages, some of which I never use. More
> > important, I believe that NONE of those 16 packages are
> > required for the correct functioning of my courtesy
> > methods. The only things I really need are the generics.
>
> There must be a mistake here -- I think in your perception:
>
> 'multcomp' does *not* have excessive dependencies (though I'd say one too
> much):
>
> > tools::package_dependencies("multcomp")
> $multcomp
> [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
> [7] "codetools"
>
> > tools::package_dependencies("multcomp", recursive=TRUE)
> $multcomp
>  [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
>  [7] "codetools" "methods"   "utils"     "zoo"       "Matrix"    "splines"
> [13] "MASS"      "grDevices" "grid"      "lattice"
>
> >
>
> Apart from "base + recommended" packages (which *everyone* has installed),
> these are just 4 packages:
>
>  mvtnorm
>  TH.data
>  sandwich
>  zoo
>
> where  mvtnorm, sandwich, and zoo  really are among the (formally
> undefined) recommended-level-2 R packages... so I do wonder if you really
> had needed to install.
>
> The 'TH.data' { TH <==>  maintainer("multcomp") } package I think should
> not be in the strict dependencies of 'multcomp' but rather in its
> "Suggests" something I'd say must be true for all data packages:
> The whole idea of data packages is that they should be needed

Re: [R-pkg-devel] Courtesy methods and explosive dependencies

2018-05-25 Thread Lenth, Russell V
I agree that most of the package dependencies in multcomp are worth having, but 
that is not the point. The point is that if a developer wants to write a method 
for a generic function offered in another non-base package, that creates false 
dependencies: packages that users are required to have, but that aren't 
actually used by the method. I certainly wasn't trying to diss multcomp; that 
was just a concrete illustration. 

Russ

-----Original Message-----
From: Martin Maechler [mailto:maech...@stat.math.ethz.ch] 
Sent: Friday, May 25, 2018 2:13 AM
To: Lenth, Russell V <russell-le...@uiowa.edu>
Cc: r-package-devel@r-project.org
Subject: Re: [R-pkg-devel] Courtesy methods and explosive dependencies

>>>>> Lenth, Russell V 
>>>>> on Thu, 24 May 2018 23:14:42 + writes:

> Package developers, I am trying to severely cut down on
> the number of dependencies of my package emmeans. It is
> now 48, which is quite a few (but that is down from over
> 100 in the preceding version, where I made the unwise
> choice of including a particularly greedy package in
> Imports). I hate to force users to install dozens of
> packages that they don't really need or want, but it seems
> very hard to avoid.

> Case in point: emmeans provides additional methods for
> 'cld' and 'glht' from the multcomp package. Accordingly, I
> import their generics, and register my additional
> methods. But because I import the generics, I must list
> multcomp in Imports, and that results in the addition of
> 16 required packages, some of which I never use. More
> important, I believe that NONE of those 16 packages are
> required for the correct functioning of my courtesy
> methods. The only things I really need are the generics.

There must be a mistake here -- I think in your perception:

'multcomp' does *not* have excessive dependencies (though I'd say one too much):

> tools::package_dependencies("multcomp")
$multcomp
[1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
[7] "codetools"

> tools::package_dependencies("multcomp", recursive=TRUE)
$multcomp
 [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
 [7] "codetools" "methods"   "utils"     "zoo"       "Matrix"    "splines"
[13] "MASS"      "grDevices" "grid"      "lattice"

> 

Apart from "base + recommended" packages (which *everyone* has installed), 
these are just 4 packages:

 mvtnorm
 TH.data
 sandwich
 zoo 

where  mvtnorm, sandwich, and zoo  really are among the (formally undefined) 
recommended-level-2 R packages... so I do wonder if you really had needed to 
install.

The 'TH.data' { TH <==>  maintainer("multcomp") } package I think should not be 
in the strict dependencies of 'multcomp' but rather in its "Suggests" 
something I'd say must be true for all data packages: 
The whole idea of data packages is that they should be needed for interesting 
help page examples, vignettes, maybe even tests, but not for package 
functionality.

In sum: I'd strongly advise to not change from keeping multcomp among your 
imports.

Martin Maechler
ETH Zurich

> On the flip side, emmeans defines some generics of its own
> that I invite other package developers to extend so that
> emmeans supports their models. If those packages import
> emmeans, there is an overhead of 48 dependencies; again,
> most of these are packages that are not needed at all for
> those packages' methods to work. I don't like being
> responsible for that.

> I believe this is a very common problem, not just with my
> own packages. It's one thing to extend a base method like
> 'print'; but extending methods in contributed packages
> creates these dependency explosions. I have hundreds of
> packages installed on my system - a couple dozen I care
> about, another few dozen that seem fairly desirable, and a
> couple hundred that I don't even know what they're for,
> other than that things will break if I uninstall them.

> I do know of a couple ways to reduce these dependencies in
> the case of my multcomp dependencies:

> 1. I could simply export my S3 methods for cld and glht as
> functions, rather than registering them as S3 methods.
> Then I could move multcomp to Suggests. The downside is
> that it clutters the namespace for emmeans.

> 2. I could define the generics for cld and glht in my own
> package, and export them; and move m

Re: [R-pkg-devel] Courtesy methods and explosive dependencies

2018-05-25 Thread Ben Bolker
Russ Lenth may have picked a suboptimal example (we could search
through the dependencies of emmeans for an example with more
non-(base+recommended) recursive dependencies), but the general point
definitely holds.

"(formally undefined) recommended-level-2 R packages" seems like a can
of worms (I know what would be on my list, but I wonder how much it
would overlap everyone else's).

FWIW I don't know of a better solution than #1 from the original post.

  cheers
Ben Bolker
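
For readers finding this thread later: as far as I know, R 3.6.0 and later accept a delayed form of the S3method() directive that registers a method only once the package owning the generic is loaded. Under that assumption, multcomp could sit in Suggests and the NAMESPACE file of emmeans would contain something like the fragment below. This was not available when the thread was written, so check Writing R Extensions for your R version before relying on it.

```r
# NAMESPACE fragment (assumes R >= 3.6.0): delayed registration of the
# emmGrid method for multcomp's cld generic, without importing multcomp.
S3method(multcomp::cld, emmGrid)
```

With this form the method would not be usable until multcomp is loaded, which matches the "courtesy method" intent of the original question.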


On Fri, May 25, 2018 at 3:13 AM, Martin Maechler wrote:
>> Lenth, Russell V
>> on Thu, 24 May 2018 23:14:42 + writes:
>
> > Package developers, I am trying to severely cut down on
> > the number of dependencies of my package emmeans. It is
> > now 48, which is quite a few (but that is down from over
> > 100 in the preceding version, where I made the unwise
> > choice of including a particularly greedy package in
> > Imports). I hate to force users to install dozens of
> > packages that they don't really need or want, but it seems
> > very hard to avoid.
>
> > Case in point: emmeans provides additional methods for
> > 'cld' and 'glht' from the multcomp package. Accordingly, I
> > import their generics, and register my additional
> > methods. But because I import the generics, I must list
> > multcomp in Imports, and that results in the addition of
> > 16 required packages, some of which I never use. More
> > important, I believe that NONE of those 16 packages are
> > required for the correct functioning of my courtesy
> > methods. The only things I really need are the generics.
>
> There must be a mistake here -- I think in your perception:
>
> 'multcomp' does *not* have excessive dependencies (though I'd
> say one too much):
>
>> tools::package_dependencies("multcomp")
> $multcomp
> [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
> [7] "codetools"
>
>> tools::package_dependencies("multcomp", recursive=TRUE)
> $multcomp
>  [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
>  [7] "codetools" "methods"   "utils"     "zoo"       "Matrix"    "splines"
> [13] "MASS"      "grDevices" "grid"      "lattice"
>
>>
>
> Apart from "base + recommended" packages (which *everyone* has installed),
> these are just 4 packages:
>
>  mvtnorm
>  TH.data
>  sandwich
>  zoo
>
> where  mvtnorm, sandwich, and zoo  really are among the
> (formally undefined) recommended-level-2 R packages... so I do
> wonder if you really had needed to install.
>
> The 'TH.data' { TH <==>  maintainer("multcomp") } package I
> think should not be in the strict dependencies of 'multcomp' but
> rather in its "Suggests" something I'd say must be true for
> all data packages:
> The whole idea of data packages is that they should be needed
> for interesting help page examples, vignettes, maybe even tests,
> but not for package functionality.
>
> In sum: I'd strongly advise to not change from keeping multcomp
> among your imports.
>
> Martin Maechler
> ETH Zurich
>
> > On the flip side, emmeans defines some generics of its own
> > that I invite other package developers to extend so that
> > emmeans supports their models. If those packages import
> > emmeans, there is an overhead of 48 dependencies; again,
> > most of these are packages that are not needed at all for
> > those packages' methods to work. I don't like being
> > responsible for that.
>
> > I believe this is a very common problem, not just with my
> > own packages. It's one thing to extend a base method like
> > 'print'; but extending methods in contributed packages
> > creates these dependency explosions. I have hundreds of
> > packages installed on my system - a couple dozen I care
> > about, another few dozen that seem fairly desirable, and a
> > couple hundred that I don't even know what they're for,
> > other than that things will break if I uninstall them.
>
> > I do know of a couple ways to reduce these dependencies in
> > the case of my multcomp dependencies:
>
> > 1. I could simply export my S3 methods for cld and glht as
> > functions, rather than registering them as S3 methods.
> > Then I could move multcomp to Suggests. The downside is
> > that it clutters the namespace for emmeans.
>
> > 2. I could define the generics for cld and glht in my own
> > package, and export them; and move multcomp to Suggests.
> > Again, it clutters the namespace, and creates warning
> > messages about (if not real issues with) masking.
>
> > Probably (1) is better than (2), but is it better than
> > what I do now? Is there something else that I (and
> > probably a whole lot of other people) don't know?
>
> > I wish there were an ImportGenerics or an
> > ImportWithoutDependencies or some such field possible in
> >