Re: [R-pkg-devel] Convention or standards for using header library (e.g. Eigen)

2023-06-25 Thread Uwe Ligges




On 24.06.2023 19:44, Dirk Eddelbuettel wrote:


On 24 June 2023 at 21:35, Stephen Wade wrote:
| Doesn't seem like the system package is worth it. Should the convention
| simply be to bundle the headers in the package then? What about package
| size - is there some limit to the size of included libraries/headers to
| consider for CRAN?

Here is one (drastic) example:

   $ du -csh /usr/local/lib/R/site-library/BH
   156M    /usr/local/lib/R/site-library/BH
   156M    total
   $



Of course one should always try to keep software as small as possible
and not waste space.
For binary packages, we are aware that some packages are large for the
reason Dirk explains. CRAN typically does not complain here, although
there are cases where we have to consider whether it makes sense to
distribute such large software if system libraries may be available.


The size restriction that applies for CRAN packages is the 5MB threshold 
for the source package size.
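
A rough way to check a built source tarball against that threshold before submitting might look like the sketch below. The function name `check_tarball_size` is hypothetical, and reading "5MB" as 5 * 1024 * 1024 bytes is an assumption rather than CRAN's wording.

```shell
#!/bin/sh
# Sketch only: report whether a source tarball exceeds a size limit.
# Assumption: "5MB" is interpreted as 5 * 1024 * 1024 bytes.
check_tarball_size() {
    tarball=$1
    limit_mb=${2:-5}
    # wc -c gives the byte count; tr strips the padding some wc versions add.
    size=$(wc -c < "$tarball" | tr -d ' ')
    limit=$((limit_mb * 1024 * 1024))
    if [ "$size" -gt "$limit" ]; then
        echo "too large: $size bytes (limit $limit)"
        return 1
    fi
    echo "ok: $size bytes (limit $limit)"
}
```

It could be called on the file produced by `R CMD build`, for example `check_tarball_size mypkg_0.1.0.tar.gz` (a made-up file name).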


Best,
Uwe Ligges




__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Convention or standards for using header library (e.g. Eigen)

2023-06-24 Thread Stephen Wade
Cheers Dirk and Simon for your advice, very helpful and clear.

It's certainly a complex problem, way above my experience and pay grade.

I've decided the solution for me would be to remove the dependency on Eigen
altogether, as I am only constructing and accessing sparse matrices at the
library-level. It's probably easier if I just write my own implementation
of that structure, keeping the package size in check and ensuring maximum
portability.

I hope a clearer picture emerges once the maintainers/devs have spare
capacity to address the issue - there are so many powerful header-only
libraries out there that could bring a lot to R - especially with tools
like Rcpp at hand.

Kind regards,
-Stephen.



Re: [R-pkg-devel] Convention or standards for using header library (e.g. Eigen)

2023-06-24 Thread Dirk Eddelbuettel


On 24 June 2023 at 21:35, Stephen Wade wrote:
| Doesn't seem like the system package is worth it. Should the convention
| simply be to bundle the headers in the package then? What about package
| size - is there some limit to the size of included libraries/headers to
| consider for CRAN?

Here is one (drastic) example:

  $ du -csh /usr/local/lib/R/site-library/BH
  156M    /usr/local/lib/R/site-library/BH
  156M    total
  $

Note that the package was smaller when it started (in 2013). (Also note that
the last time I checked, the largest package I know of on CRAN (not just
headers) was still about twice as large.)

Anyway: as you are starting to see, this is a somewhat complex problem.
Header packages are one approach. _Writing R Extensions_ mentions pure header
packages and name-checks my packages BH, RcppArmadillo and RcppEigen in
Section 1.1.3. I once wrote a short paper on this [1] (also a vignette [2])
where I more or less recommend header packages because compiled ones are so
much harder.  Recognise for example that a) no cross-OS way to check for
packages exists (though pkg-config comes close), b) no general package
managers exist, c) configure and cmake come close (but cmake is also an added
system requirement; and configure is a no-show on Windows) and d) even within
a given OS and release you may have very different versions. Lastly also: e)
some packages (RcppEigen is an example) have patches the system library would
not have applied (!!).

So to me a simplified view is that just as R "abstracts away" POSIX so that
we can always say e.g. 'dir.exists(path)' no matter where R runs, having a
package with headers ensures we get a consistent _and reliable_ compilation
experience from client packages. This matters.

Now, there are clearly downsides. With my Debian maintainer hat on, I have to
defend including Armadillo within RcppArmadillo because the distro has it too
(but then version skew, i.e. d) above, and ease of use and consistency etc.
dominate, so we continue to ship RcppArmadillo).  At the same time, at CRAN we
have needless duplications. For example, my RcppCCTZ package was the first to
offer the nice (Google made but not a Google 'product') CCTZ library for R
use (starting in 2015). But when I last checked a year or so ago, four other
packages now included redundant extra copies. Also happens with Eigen. Not
great.

On the other side, packages with full (included or not) libraries work too,
but it takes more effort to provide them portably, to explain to users where
to get them, to keep them current, and so on.  It is hard (or even impossible)
for R to fill in as a _general system_ package manager across all OSs and
deployments.  There is a new kid on this block [3] we are starting to use at
work, and which may help in time across the platforms that R uses. To be
seen...

So to sum up: I think header packages are great, and I maintain a few, both
large and small in size.  I would encourage you to try them. For RcppEigen,
you can just use LinkingTo: to get its headers.  Some 400+ packages rely on
it. (And it's over 1000 for Armadillo now, and over 300 for BH.)

Hth,  Dirk

[1] https://arxiv.org/abs/1911.06416
[2] https://cran.r-project.org/web/packages/Rcpp/vignettes/Rcpp-libraries.pdf
[3] https://vcpkg.io/en/

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [R-pkg-devel] Convention or standards for using header library (e.g. Eigen)

2023-06-24 Thread Stephen Wade
Doesn't seem like the system package is worth it. Should the convention
simply be to bundle the headers in the package then? What about package
size - is there some limit to the size of included libraries/headers to
consider for CRAN?



Re: [R-pkg-devel] Convention or standards for using header library (e.g. Eigen)

2023-06-23 Thread Simon Urbanek
Stephen,

If you want to give the system version a shot, I would simply look for
pkg-config, add the supplied CPPFLAGS to the package R flags if present, and
then test (regardless of pkg-config) with AC_CHECK_HEADER (see the standard
R-exts autoconf rules for packages). If that fails, use your included copy by
adding the corresponding -I flag pointing to your supplied copy. You should
not download anything, as there is no expectation that the user has any
internet access at the time of installation; if you want to provide a
fall-back, it should be in the sources of your package. That said, there is
nothing wrong with ignoring the system version, especially in this header-only
case, since you can then rely on the correct version which you tested. You can
still allow the user to provide an option to override that behavior if desired.
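
The lookup order described above could be sketched in a configure script roughly as follows. This is an illustration under assumptions, not a tested recipe: the pkg-config module name `eigen3`, the `USE_BUNDLED_EIGEN` override variable and the function name `eigen_cppflags` are all introduced here for the example, and the bundled header directory is whatever the package actually ships.

```shell
#!/bin/sh
# Sketch only: choose include flags for Eigen in a configure script.
# Assumptions (not from the thread): the pkg-config module name "eigen3",
# the USE_BUNDLED_EIGEN override variable, and the bundled header directory
# passed as the first argument.
eigen_cppflags() {
    bundled_dir=$1
    if [ "${USE_BUNDLED_EIGEN:-no}" != "yes" ] &&
       command -v pkg-config >/dev/null 2>&1 &&
       pkg-config --exists eigen3 2>/dev/null; then
        # A system copy is registered with pkg-config: use its -I flags.
        pkg-config --cflags eigen3
    else
        # Otherwise fall back to the headers shipped with the package.
        echo "-I$bundled_dir"
    fi
}
```

A configure script could substitute the printed flags into PKG_CPPFLAGS when generating src/Makevars. The AC_CHECK_HEADER compile test is left out of this sketch, so a real script would still want to confirm that the chosen headers compile.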

Cheers,
Simon





[R-pkg-devel] Convention or standards for using header library (e.g. Eigen)

2023-06-23 Thread Stephen Wade
I recently submitted a package to CRAN which downloaded Eigen via Makevars
and Makevars.win. My Makevars.ucrt was empty as I noted that Eigen3 is
installed by default (however, this doesn't ensure that a version of Eigen
compatible/tested with the package is available).

The source is currently on github:
https://github.com/stephematician/literanger

Here is the Makevars

$ more src/Makevars
# downloads eigen3 to extlibs/ and sets include location
PKG_CPPFLAGS = -I../src -I../extlibs/
.PHONY: all clean extlibs
all: extlibs $(SHLIB)
extlibs:
	"${R_HOME}/bin${R_ARCH_BIN}/Rscript" "../tools/extlibs.R"
clean:
	rm -f $(SHLIB) $(OBJECTS)

The details of `extlibs.R` are fairly mundane, it downloads a release from
gitlab and unzips it to `extlibs`.

CRAN gave me this feedback:

> Why do you download eigen here rather than using the system version of
> Eigen if available?
>
> We asked you to do that for Windows as you did in Makevars.ucrt. For
> Unix-like OS you should only fall back (if at all) to some download if
> the system Eigen is unavailable.

The problem is I'm not sure what a minimum standard for 'searching' for a
system version of Eigen looks like. I also note that packages like
RcppEigen simply bundle the Eigen headers within the package (and its
repository) which will certainly ignore any system headers.

I would like a solution that would keep CRAN happy, i.e. I need to meet
some standard for searching for the compiler flags, checking the version of
the system headers, and then falling back to downloading a release if the
system headers fail.

1.  For each platform (Unix, Windows, OS-X) what tool(s) should be invoked
to check for compiler flags for a header-only library like Eigen? e.g.
pkg-config, pkgconf? others?
2.  What is a reasonable approach for the possible package names for Eigen
(e.g. typically libeigen3-dev on Debian, and eigen3 on arch, homebrew,
others)? Is this enough?
3.  If pkg-config/pkgconf (or others) are unavailable, what is a reasonable
standard for checking if the library can be built with some reasonable
guess for the compiler flags (probably empty) - I assume I would need to
try to compile a test program (within Makevars)?
4.  Following on from 3... would a package need to check (again via a test
program) that the _system_ headers have the correct version (e.g. some
static assert on EIGEN_WORLD_VERSION), and if that fails _then_ download
the release from gitlab?
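
For the version check in point 4, a lighter alternative to compiling a test program might be to read the version macros straight from the header, as in the sketch below. It assumes the macros EIGEN_WORLD_VERSION, EIGEN_MAJOR_VERSION and EIGEN_MINOR_VERSION are defined in `Eigen/src/Core/util/Macros.h`, as they are in the Eigen 3.x releases I am aware of; the function names are hypothetical, and a compile-time check would be more robust against other layouts.

```shell
#!/bin/sh
# Sketch only: read the Eigen version from a header tree without compiling.
# Assumption: the version macros live in Eigen/src/Core/util/Macros.h.

# Print the integer value of one "#define NAME <number>" line in a header.
eigen_macro() {
    sed -n "s/^#define[[:space:]][[:space:]]*$2[[:space:]][[:space:]]*\([0-9][0-9]*\).*/\1/p" "$1" | head -n 1
}

# Print "world.major.minor" for the Eigen headers under the given directory,
# or return non-zero if the expected header is not there.
eigen_version() {
    macros="$1/Eigen/src/Core/util/Macros.h"
    [ -f "$macros" ] || return 1
    world=$(eigen_macro "$macros" EIGEN_WORLD_VERSION)
    major=$(eigen_macro "$macros" EIGEN_MAJOR_VERSION)
    minor=$(eigen_macro "$macros" EIGEN_MINOR_VERSION)
    echo "$world.$major.$minor"
}
```

The argument is the include directory that contains the `Eigen/` folder, for instance the path reported by pkg-config with the leading `-I` removed.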

Any and all advice would be appreciated.

Kind regards,
-Stephen Wade
