Re: [R-pkg-devel] Courtesy methods and explosive dependencies
On 25/05/2018 3:22 PM, Lenth, Russell V wrote:
> > There can't really be an "ImportGenerics", because S3 is so informal. A generic function is a function that calls UseMethod, but it can do anything else as well. So R would need some fancy code analysis to know whether it was safe to import the generic but not all the dependencies of that package, and that could change when the package holding the generic was updated.
> >
> > Examples of generics that do more than simply call UseMethod are rare, but they exist: as.data.frame() and sort() are a couple.
> >
> > Duncan Murdoch
>
> Right -- ImportGenerics was a dumb idea. However, something like ImportNoDeps makes sense as an additional category in the DESCRIPTION file. In my example, if such existed, I could put in the DESCRIPTION file:
>
>     ImportNoDeps: multcomp
>
> This would signal that multcomp -- but not necessarily any of its dependencies -- is required to install emmeans. And in the NAMESPACE file:
>
>     importFrom(multcomp, cld)
>     S3method(cld, emmGrid)
>
> ... would import the needed generic and register my new S3 method. I think this kind of construct could significantly reduce dependencies for packages that extend other packages' methods.

But what happens if the multcomp maintainer modifies cld() next week (and makes no other change), so that it makes use of multcomp dependencies? Your package would be claiming that it didn't need them, while multcomp would not have changed any of its declarations, just the code for that one function. How would R know to generate an error in your package (because its claim about multcomp was now false)? If it didn't, you'd end up in a situation where calls to cld() failed because the cld() generic no longer worked.

This is just too complicated. If multcomp says that it needs 8 packages, you shouldn't be able to install it without all of them.

On the other hand, I think you can do what you want as follows:

1. Put multcomp into your Suggests list. Users can choose to install it or not.

2. Put a conditional definition of the generic into one of your .R source files to define it for users who don't have multcomp:

    if (!requireNamespace("multcomp"))
        cld <- function(object, ...) UseMethod("cld")

3. In your NAMESPACE file, export cld, and conditionally import it for users who do have it:

    export(cld)
    if (requireNamespace("multcomp"))
        importFrom(multcomp, cld)

I haven't tested this much at all, but it appears to work on a very superficial test.

Duncan Murdoch

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel
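A minimal sketch of the fallback-generic idea from steps 2 and 3, written so that it runs outside a package. The class `toy_grid` and its method are hypothetical stand-ins for emmeans' emmGrid method, and `quietly = TRUE` is added here only to suppress the load message. It is an illustration under those assumptions, not a tested package setup.

```r
# Use multcomp's generic when it is installed; otherwise define a
# minimal fallback generic with the same name (assumption: a bare
# UseMethod() generic is an adequate substitute for multcomp's).
if (requireNamespace("multcomp", quietly = TRUE)) {
  cld <- multcomp::cld
} else {
  cld <- function(object, ...) UseMethod("cld")
}

# Hypothetical class and method, standing in for an emmGrid method.
cld.toy_grid <- function(object, ...) {
  paste("cld for", length(object$levels), "levels")
}

grid <- structure(list(levels = c("a", "b", "c")), class = "toy_grid")
result <- cld(grid)
print(result)
```

Either branch should dispatch to the same method, which is the property the conditional definition relies on.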
Re: [R-pkg-devel] Courtesy methods and explosive dependencies
> There can't really be an "ImportGenerics", because S3 is so informal. A generic function is a function that calls UseMethod, but it can do anything else as well. So R would need some fancy code analysis to know whether it was safe to import the generic but not all the dependencies of that package, and that could change when the package holding the generic was updated.
>
> Examples of generics that do more than simply call UseMethod are rare, but they exist: as.data.frame() and sort() are a couple.
>
> Duncan Murdoch

Right -- ImportGenerics was a dumb idea. However, something like ImportNoDeps makes sense as an additional category in the DESCRIPTION file. In my example, if such existed, I could put in the DESCRIPTION file:

    ImportNoDeps: multcomp

This would signal that multcomp -- but not necessarily any of its dependencies -- is required to install emmeans. And in the NAMESPACE file:

    importFrom(multcomp, cld)
    S3method(cld, emmGrid)

... would import the needed generic and register my new S3 method. I think this kind of construct could significantly reduce dependencies for packages that extend other packages' methods.

Russ
Re: [R-pkg-devel] Courtesy methods and explosive dependencies
On 25/05/2018 11:38 AM, Lenth, Russell V wrote:
> I agree that most of the package dependencies in multcomp are worth having, but that is not the point. The point is that if a developer wants to write a method for a generic function offered in another non-base package, that creates false dependencies: packages that users are required to have, but that aren't actually used by the method. I certainly wasn't trying to diss multcomp; that was just a concrete illustration.

There can't really be an "ImportGenerics", because S3 is so informal. A generic function is a function that calls UseMethod, but it can do anything else as well. So R would need some fancy code analysis to know whether it was safe to import the generic but not all the dependencies of that package, and that could change when the package holding the generic was updated.

Examples of generics that do more than simply call UseMethod are rare, but they exist: as.data.frame() and sort() are a couple.

Duncan Murdoch
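To illustrate the point about generics that do more than call UseMethod(), here is a small sketch. The generic `area()`, the helper `check_units()` and the class `square` are all hypothetical and not taken from any package; the helper stands in for a function that the generic's package would import from one of its own dependencies.

```r
# Hypothetical helper, standing in for a function that the generic's
# package would import from one of its own dependencies.
check_units <- function(units) {
  match.arg(units, c("m2", "cm2"))
}

# A generic that validates an argument *before* dispatching. A method
# author who imported only this generic would still need whatever
# check_units() depends on for the generic to work.
area <- function(shape, units = "m2", ...) {
  check_units(units)
  UseMethod("area")
}

# Hypothetical method for a toy class.
area.square <- function(shape, units = "m2", ...) {
  value <- shape$side^2
  if (units == "cm2") value <- value * 1e4
  value
}

sq <- structure(list(side = 2), class = "square")
area_m2 <- area(sq)
area_cm2 <- area(sq, units = "cm2")
# The invalid unit is rejected by the generic, before any method runs.
bad <- tryCatch(area(sq, units = "acres"),
                error = function(e) "rejected before dispatch")
```

Because the check lives in the generic itself, no amount of inspection of the method tells you which dependencies the call needs.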
Re: [R-pkg-devel] Courtesy methods and explosive dependencies
In the estimatr package, we provided a shim to support broom without transitively depending on the tidyverse:

    tidy <- function(object, ...) {
      if (requireNamespace("broom", quietly = TRUE)) {
        broom::tidy(object, ...)
      } else {
        UseMethod("tidy")
      }
    }

https://github.com/DeclareDesign/estimatr/blob/master/R/S3_tidy.R

It's not ideal but it seems to make our functions work whether or not the end user has broom.

S4 methods (e.g. texreg) were more difficult, and we ended up registering the generics in .onLoad:

    .onLoad <- function(libname, pkgname) {
      if (suppressWarnings(requireNamespace("texreg", quietly = TRUE))) {
        setGeneric("extract",
          function(model, ...) standardGeneric("extract"),
          package = "texreg"
        )
        setMethod("extract",
          signature = className("lm_robust", pkgname),
          definition = extract.lm_robust
        )
      }
      invisible()
    }

https://github.com/DeclareDesign/estimatr/blob/master/R/zzz.R

Since this is in .onLoad, it's a little buggy if the user has estimatr and then installs texreg, but again seems to mostly work.

We would appreciate any feedback the list may have on these design patterns.

-Neal

On Fri, May 25, 2018 at 8:38 AM, Lenth, Russell V <russell-le...@uiowa.edu> wrote:
> I agree that most of the package dependencies in multcomp are worth having, but that is not the point. The point is that if a developer wants to write a method for a generic function offered in another non-base package, that creates false dependencies: packages that users are required to have, but that aren't actually used by the method. I certainly wasn't trying to diss multcomp; that was just a concrete illustration.
>
> Russ
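A small sketch of how the S3 shim from the estimatr message behaves for a user-defined class. The shim is as posted; the class `toy_fit` and its method are hypothetical and exist only for this illustration.

```r
# The shim: defer to broom's generic when broom is installed,
# otherwise dispatch locally.
tidy <- function(object, ...) {
  if (requireNamespace("broom", quietly = TRUE)) {
    broom::tidy(object, ...)
  } else {
    UseMethod("tidy")
  }
}

# Hypothetical class and method for illustration.
tidy.toy_fit <- function(object, ...) {
  data.frame(term = names(object$coef), estimate = unname(object$coef))
}

fit <- structure(list(coef = c(a = 1, b = 2)), class = "toy_fit")
out <- tidy(fit)
print(out)
```

The method is found in both branches here because S3 dispatch also searches the environment the generic is called from; inside a package, the method would additionally need to be registered in NAMESPACE.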
Re: [R-pkg-devel] Courtesy methods and explosive dependencies
I agree that most of the package dependencies in multcomp are worth having, but that is not the point. The point is that if a developer wants to write a method for a generic function offered in another non-base package, that creates false dependencies: packages that users are required to have, but that aren't actually used by the method. I certainly wasn't trying to diss multcomp; that was just a concrete illustration.

Russ

-----Original Message-----
From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
Sent: Friday, May 25, 2018 2:13 AM
To: Lenth, Russell V <russell-le...@uiowa.edu>
Cc: r-package-devel@r-project.org
Subject: Re: [R-pkg-devel] Courtesy methods and explosive dependencies

>>>>> Lenth, Russell V
>>>>> on Thu, 24 May 2018 23:14:42 + writes:

> Package developers, I am trying to severely cut down on the number of dependencies of my package emmeans. It is now 48, which is quite a few (but that is down from over 100 in the preceding version, where I made the unwise choice of including a particularly greedy package in Imports). I hate to force users to install dozens of packages that they don't really need or want, but it seems very hard to avoid.

> Case in point: emmeans provides additional methods for 'cld' and 'glht' from the multcomp package. Accordingly, I import their generics, and register my additional methods. But because I import the generics, I must list multcomp in Imports, and that results in the addition of 16 required packages, some of which I never use. More important, I believe that NONE of those 16 packages are required for the correct functioning of my courtesy methods. The only things I really need are the generics.

There must be a mistake here -- I think in your perception: 'multcomp' does *not* have excessive dependencies (though I'd say one too many):

    > tools::package_dependencies("multcomp")
    $multcomp
    [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
    [7] "codetools"

    > tools::package_dependencies("multcomp", recursive=TRUE)
    $multcomp
     [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
     [7] "codetools" "methods"   "utils"     "zoo"       "Matrix"    "splines"
    [13] "MASS"      "grDevices" "grid"      "lattice"

Apart from "base + recommended" packages (which *everyone* has installed), these are just 4 packages:

    mvtnorm
    TH.data
    sandwich
    zoo

where mvtnorm, sandwich, and zoo really are among the (formally undefined) recommended-level-2 R packages... so I do wonder whether you really needed to install them.

The 'TH.data' { TH <==> maintainer("multcomp") } package I think should not be in the strict dependencies of 'multcomp' but rather in its "Suggests", something I'd say must be true for all data packages: The whole idea of data packages is that they should be needed for interesting help page examples, vignettes, maybe even tests, but not for package functionality.

In sum: I'd strongly advise keeping multcomp among your imports.

Martin Maechler
ETH Zurich
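The count of four packages in Martin Maechler's message can be reproduced from the recursive dependency list he quotes. This sketch hard-codes that list, along with the base and recommended package sets (an assumption for illustration, reflecting a standard R distribution of that era); on a live system one would query `tools::package_dependencies()` instead of copying its output.

```r
# Recursive dependencies of multcomp, copied from the output quoted
# in the thread (May 2018); not queried live.
recursive_deps <- c(
  "stats", "graphics", "mvtnorm", "survival", "TH.data", "sandwich",
  "codetools", "methods", "utils", "zoo", "Matrix", "splines",
  "MASS", "grDevices", "grid", "lattice"
)

# Packages shipped with a standard R installation (assumed lists).
base_pkgs <- c(
  "base", "compiler", "datasets", "graphics", "grDevices", "grid",
  "methods", "parallel", "splines", "stats", "stats4", "tcltk",
  "tools", "utils"
)
recommended_pkgs <- c(
  "boot", "class", "cluster", "codetools", "foreign", "KernSmooth",
  "lattice", "MASS", "Matrix", "mgcv", "nlme", "nnet", "rpart",
  "spatial", "survival"
)

# What remains is what a user with a stock installation must add.
extra <- setdiff(recursive_deps, c(base_pkgs, recommended_pkgs))
print(extra)
```

The same subtraction applied to a package with a heavier dependency tree would show the kind of explosion the original post describes.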
Re: [R-pkg-devel] Courtesy methods and explosive dependencies
Russ Lenth may have picked a suboptimal example (we could search through the dependencies of emmeans for an example with more non-(base+recommended) recursive dependencies), but the general point definitely holds.

"(formally undefined) recommended-level-2 R packages" seems like a can of worms (I know what would be on my list, but I wonder how much it would overlap everyone else's).

FWIW I don't know of a better solution than #1 from the original post.

cheers
Ben Bolker

On Fri, May 25, 2018 at 3:13 AM, Martin Maechler wrote:
> >>>>> Lenth, Russell V
> >>>>> on Thu, 24 May 2018 23:14:42 + writes:
>
> > Package developers, I am trying to severely cut down on the number of dependencies of my package emmeans. It is now 48, which is quite a few (but that is down from over 100 in the preceding version, where I made the unwise choice of including a particularly greedy package in Imports). I hate to force users to install dozens of packages that they don't really need or want, but it seems very hard to avoid.
>
> > Case in point: emmeans provides additional methods for 'cld' and 'glht' from the multcomp package. Accordingly, I import their generics, and register my additional methods. But because I import the generics, I must list multcomp in Imports, and that results in the addition of 16 required packages, some of which I never use. More important, I believe that NONE of those 16 packages are required for the correct functioning of my courtesy methods. The only things I really need are the generics.
>
> There must be a mistake here -- I think in your perception: 'multcomp' does *not* have excessive dependencies (though I'd say one too many):
>
> > tools::package_dependencies("multcomp")
> $multcomp
> [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
> [7] "codetools"
>
> > tools::package_dependencies("multcomp", recursive=TRUE)
> $multcomp
>  [1] "stats"     "graphics"  "mvtnorm"   "survival"  "TH.data"   "sandwich"
>  [7] "codetools" "methods"   "utils"     "zoo"       "Matrix"    "splines"
> [13] "MASS"      "grDevices" "grid"      "lattice"
>
> Apart from "base + recommended" packages (which *everyone* has installed), these are just 4 packages:
>
>     mvtnorm
>     TH.data
>     sandwich
>     zoo
>
> where mvtnorm, sandwich, and zoo really are among the (formally undefined) recommended-level-2 R packages... so I do wonder whether you really needed to install them.
>
> The 'TH.data' { TH <==> maintainer("multcomp") } package I think should not be in the strict dependencies of 'multcomp' but rather in its "Suggests", something I'd say must be true for all data packages: The whole idea of data packages is that they should be needed for interesting help page examples, vignettes, maybe even tests, but not for package functionality.
>
> In sum: I'd strongly advise keeping multcomp among your imports.
>
> Martin Maechler
> ETH Zurich
>
> > On the flip side, emmeans defines some generics of its own that I invite other package developers to extend so that emmeans supports their models. If those packages import emmeans, there is an overhead of 48 dependencies; again, most of these are packages that are not needed at all for those packages' methods to work. I don't like being responsible for that.
>
> > I believe this is a very common problem, not just with my own packages. It's one thing to extend a base method like 'print'; but extending methods in contributed packages creates these dependency explosions. I have hundreds of packages installed on my system - a couple dozen I care about, another few dozen that seem fairly desirable, and a couple hundred that I don't even know what they're for, other than that things will break if I uninstall them.
>
> > I do know of a couple ways to reduce these dependencies in the case of my multcomp dependencies:
>
> > 1. I could simply export my S3 methods for cld and glht as functions, rather than registering them as S3 methods. Then I could move multcomp to Suggests. The downside is that it clutters the namespace for emmeans.
>
> > 2. I could define the generics for cld and glht in my own package, and export them; and move multcomp to Suggests. Again, it clutters the namespace, and creates warning messages about (if not real issues with) masking.
>
> > Probably (1) is better than (2), but is it better than what I do now? Is there something else that I (and probably a whole lot of other people) don't know?
>
> > I wish there were an ImportGenerics or an ImportWithoutDependencies or some such field possible in