from:"Jan Gorecki"

Re: [R-pkg-devel] [Rd] static html vignette

2024-01-06 Thread Jan Gorecki

I may add two cents on that as we recently made this change in data.table.
Using markdown package instead of rmarkdown as a vignette engine reduced
burden caused by extra dependencies tremendously. Moreover it made package
to not even need c++ compiler, as knitr and markdown both (and their
recursive deps) are pure C with no C++.
Gains were huge. Deps installation time from 12min to 30sec. CI pipeline
compute minutes saving around 100min on a single workflow.

But there is even a better news (so be sure to upvote), that knitr may not
be required to render Rmd at all in near future. For details see
https://github.com/rstudio/markdown/issues/109

On Thu, Jan 4, 2024, 22:27 Adrian Dușa  wrote:

> On Thu, Jan 4, 2024 at 10:44 PM Uwe Ligges <
> lig...@statistik.tu-dortmund.de>
> wrote:
>
> > On 04.01.2024 21:23, Duncan Murdoch wrote:[...]
> > > Users aren't forced to install "Suggests" packages.  That's a choice
> > > they make.  The default for `install.packages()` is `dependencies =
> NA`,
> > > which says to install hard dependencies (Imports, Depends, LinkingTo).
> > > Users have to choose a non-default setting to include Suggests.
> >
> > Also note that the maintainer builds the vignette whe calling
> > R CMD build
> > CRAN checks whether the vignette can be build.
> > If a user installs a package, the already produced vignette (on the
> > maintainers machine by R CMD build) is instaled. There is no need for
> > the user to install any extra package for being able to look at the
> > vignettes.
> >
>
> I see... then I must have tested with dependencies = TRUE thinking this
> refers to hard dependencies (one more reason to read the documentation
> properly).
>
> Thank you,
> Adrian
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [Rd] eval(parse()) within mutate() returning same value for all rows

2023-12-29 Thread Jan Gorecki

Unless you are able to reproduce the problem without dplyr then you should
submit your question to dplyr issues tracker. R-devel is for issues in R,
not in a third party packages.

On Fri, Dec 29, 2023, 15:13 Mateo Obregón  wrote:

> Hi all-
>
> Looking through stackoverflow for R string combining examples, I found the
> following from 3 years ago:
>
> <
> https://stackoverflow.com/questions/63881854/how-to-format-strings-using-values-from-other-column-in-r
> >
>
> The top answer suggests to use eval(parse(sprintf())). I tried the
> suggestion
> and it did not return the expected combines strings. I thought that this
> might
> be an issue with some leftover values being reused, so I explicitly eval()
> with a new.env():
>
> > library(dplyr)
> > df <- tibble(words=c("%s plus %s equals %s"),
> args=c("1,1,2","2,2,4","3,3,6"))
> > df |> mutate(combined = eval(parse(text=sprintf("sprintf('%s', %s)",
> words,
> args)), envir=new.env()))
>
> # A tibble: 3 × 3
>   wordsargs  combined
>
> 1 %s plus %s equals %s 1,1,2 3 plus 3 equals 6
> 2 %s plus %s equals %s 2,2,4 3 plus 3 equals 6
> 3 %s plus %s equals %s 3,3,6 3 plus 3 equals 6
>
> The `combined`  is not what I was expecting, as the same last eval() is
> returned for all three rows.
>
> Am I missing something? What has changed in the past three years?
>
> Mateo.
> --
> Mateo Obregón
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] tools:: extracting pkg dependencies from DCF

2023-12-19 Thread Jan Gorecki

Hello all,

Following up on this old thread as I have recently observed, rather a
bad practice (maintaining order of installation for R packages rather
than relying on R for that), for solving a problem that R branch
tools4pkgs (mentioned in this email) addresses very well.
More details can be found in
https://github.com/dewittpe/R-install-dependencies/issues/3

Therefore I extracted functionality from base R branch and put into
standalone package, named after R branch:
https://github.com/jangorecki/tools4pkgs
Sharing for whoever would reach this email thread in future.

Best Regards,
Jan Gorecki


On Sat, Oct 29, 2022 at 6:26 PM Jan Gorecki  wrote:
>
> Thank you Gabriel,
>
> Just for future readers. Below is a base R way to address this common
> problem, as instructed by you (+stopifnot to suppress print).
>
> Rscript -e 'stopifnot(file.copy("DESCRIPTION",
> file.path(tdir<-tempdir(), "PACKAGES")));
> db<-available.packages(paste0("file://", tdir));
> install.packages(setdiff(tools::package_dependencies(read.dcf("DESCRIPTION",
> fields="Package")[[1L]], db, which="most")[[1L]],
> installed.packages(priority="high")[,"Package"]))'
>
> 3 liner, 310 chars long command, far from ideal, but does work.
>
> Best,
> Jan
>
>
> On Fri, Oct 28, 2022 at 10:42 PM Gabriel Becker  wrote:
> >
> > Hi Jan,
> >
> >
> > On Fri, Oct 28, 2022 at 1:57 PM Jan Gorecki  wrote:
> >>
> >> Gabriel,
> >>
> >> It is the most basic CI use case. One wants to install only
> >> dependencies only of the package, and run R CMD check on the package.
> >
> >
> > Really what you're looking for though, is to install all the dependencies 
> > which aren't present right? Excluding base packages is just a particular 
> > way to do that under certain assumptions about the CI environment.
> >
> > So
> >
> >
> > needed_pkgs <- setdiff(package_dependencies(...), 
> > installed.packages()[,"Package"])
> > install.packages(needed_pkgs, repos = fancyrepos)
> >
> >
> > will do what you want without installing the package itself, if that is 
> > important. This will filter out base and recommended packages (which will 
> > be already installed in your CI container, since R is).
> >
> >
> > Now this does not take into account versioned dependencies, so it's not 
> > actually fully correct (whereas installing the package is), but it gets you 
> > where you're trying to go. And in a clean CI container without cached 
> > package installation for the deps, its equivalent.
> >
> >
> > Also, as an aside, if you need to get the base packages, you can do
> >
> > installed.packages(priority="base")[,"Package"]
> >
> >basecompilerdatasetsgraphics   grDevicesgrid
> >
> >  "base"  "compiler"  "datasets"  "graphics" "grDevices"  "grid"
> >
> > methodsparallel splines   stats  stats4   tcltk
> >
> >   "methods"  "parallel"   "splines" "stats""stats4" "tcltk"
> >
> >   tools   utils
> >
> > "tools" "utils"
> >
> >
> > (to get base and recommended packages use 'high' instead of 'base')
> >
> > No need to be reaching down into unexported functions. So if you *really* 
> > only want to exclude base functions (which likely will give you some 
> > protection from versioned dep issues), you can change the code above to
> >
> > needed_pkgs <- setdiff(package_dependencies(...), 
> > installed.packages(priority = "high")[,"Package"])
> > install.packages(needed_pkgs, repos = fancyrepos)
> >
> > Best,
> > ~G
> >
> >>
> >> On Fri, Oct 28, 2022 at 8:42 PM Gabriel Becker  
> >> wrote:
> >> >
> >> > Hi Jan,
> >> >
> >> > The reason, I suspect without speaking for R-core, is that by design you 
> >> > should not be specifying package dependencies as additional packages to 
> >> > install. install.packages already does this for you, as it did in the 
> >> > construct of a repository code that I provided previously in the thread. 
> >> > You should be *only* doing
> >> >
> >> > install.packages(, repos = *)
> >> >
> >> > Then everything happens automatically via extremely well tested very 
> &

[Rd] capabilities() could report strict-barrier

2023-12-10 Thread Jan Gorecki

Hi,

I would like to propose for capabilities() function to include information
about strict-barrier (--enable-strict-barrier flag).
I can now do "grep barrier /usr/local/lib/R/etc/Makeconf" but having that
in R, in platform independent way, would be useful.

Best Regards,
Jan Gorecki

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] tune "checking installed package size" NOTE

2023-11-24 Thread Jan Gorecki

Oh, thank you. I only grep WRE.
Yes it does work

On Fri, Nov 24, 2023 at 12:20 PM Ivan Krylov  wrote:

> В Fri, 24 Nov 2023 12:15:06 +0100
> Jan Gorecki  пишет:
>
> > Recently we got _R_CHECK_CRAN_INCOMING_TARBALL_THRESHOLD_ env var to
> > tune "Size of tarball" note during R package check.
> >
> > Could we get similar one for tuning "checking installed package size"
> > note?
>
> Does _R_CHECK_PKG_SIZES_THRESHOLD_ (in megabytes) work for you? It can
> be found by reading R Internals or grepping the tools source code for
> "installed package size".
>
> --
> Best regards,
> Ivan
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] tune "checking installed package size" NOTE

2023-11-24 Thread Jan Gorecki

Hello,

Recently we got _R_CHECK_CRAN_INCOMING_TARBALL_THRESHOLD_ env var to tune
"Size of tarball" note during R package check.

Could we get similar one for tuning "checking installed package size" note?

We can then fit R CMD check more tightly to our package and track
regression in installed size, that could easily happen if we are force to
ignore the note because our package is already big enough.

Regards,
Jan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[R-pkg-devel] DESCRIPTION file Imports of base R packages

2023-10-03 Thread Jan Gorecki

Hello,

I noticed some packages define Imports in DESCRIPTION file listing base R
packages like methods, utils, etc.

My question is that if it is necessary to list those* dependencies in
DESCRIPTION file, or is it enough to have them listed in NAMESPACE file?

* by "those" I mean packages listed by:

in tools:::.get_standard_package_names()$base

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] R_orderVector1 - algo: radix, shell, or another?

2023-09-26 Thread Jan Gorecki

Thank you Ivan for all detail.

I was looking for particular algo for benchmarking purpose.

On Mon, Sep 25, 2023 at 9:26 AM Ivan Krylov  wrote:

> В Sun, 24 Sep 2023 10:38:41 +0200
> Jan Gorecki  пишет:
>
> >
> https://github.com/wch/r-source/blob/ed51d34ec195b89462a8531b9ef30b7b72e47204/src/main/sort.c#L1133
>
> > could anyone describe which one R_orderVector1 uses,
>
> The body of the sorting algorithm is defined in the sort2_with_index
> macro. This is shell sort. (I don't actually recognise sorting
> algorithms on sight, but the "sincs" array gave it away:
> <
> https://discourse.julialang.org/t/ironic-observation-about-sort-and-sortperm-speed-for-small-integers-vs-r/8715/3
> >.)
>
> > and if there is easy API to use different ones from C?
>
> No easy ones, I think. You could construct a call to order(..., method
> = 'radix') from R or bundle a sort implementation of your own.
>
> These are all undocumented implementation details. They could change in
> a new version of R (although quite a lot of it hasn't changed in 11-22
> years). Why are you looking for specific sorting algorithms? Could
> there be a better way to solve your problem?
>
> --
> Best regards,
> Ivan
>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] R_orderVector1 - algo: radix, shell, or another?

2023-09-24 Thread Jan Gorecki

Hi Jeff,

Yes I did. My question is about R_orderVector1 which is part of public R C
api.
Should I notice something relevant in the source of R's order?

Best
Jan

On Sun, Sep 24, 2023, 17:27 Jeff Newmiller  wrote:

> Have you read the output of
>
> order
>
> entered at the R console?
>
>
> On September 24, 2023 1:38:41 AM PDT, Jan Gorecki 
> wrote:
> >Dear pkg developers,
> >
> >Are there any ways to check which sorting algorithm is being used when
> >calling `order` function? Documentation at
> >https://stat.ethz.ch/R-manual/R-devel/library/base/html/sort.html
> >says it is radix for length < 2^31
> >
> >On the other hand, I am using R_orderVector1, passing in double float
> >smaller than 2^31. Short description of it states
> >"Fast version of 1-argument case of R_orderVector".
> >Should I expect R_orderVector1 follow the same algo as R's order()? If so
> >it should be radix as well.
> >
> >
> https://github.com/wch/r-source/blob/ed51d34ec195b89462a8531b9ef30b7b72e47204/src/main/sort.c#L1133
> >
> >If there is no way to check sorting algo, could anyone describe which one
> >R_orderVector1 uses, and if there is easy API to use different ones from
> C?
> >
> >Best Regards,
> >Jan Gorecki
> >
> >   [[alternative HTML version deleted]]
> >
> >__
> >R-package-devel@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> --
> Sent from my phone. Please excuse my brevity.
>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[R-pkg-devel] R_orderVector1 - algo: radix, shell, or another?

2023-09-24 Thread Jan Gorecki

Dear pkg developers,

Are there any ways to check which sorting algorithm is being used when
calling `order` function? Documentation at
https://stat.ethz.ch/R-manual/R-devel/library/base/html/sort.html
says it is radix for length < 2^31

On the other hand, I am using R_orderVector1, passing in double float
smaller than 2^31. Short description of it states
"Fast version of 1-argument case of R_orderVector".
Should I expect R_orderVector1 follow the same algo as R's order()? If so
it should be radix as well.

https://github.com/wch/r-source/blob/ed51d34ec195b89462a8531b9ef30b7b72e47204/src/main/sort.c#L1133

If there is no way to check sorting algo, could anyone describe which one
R_orderVector1 uses, and if there is easy API to use different ones from C?

Best Regards,
Jan Gorecki

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[R-pkg-devel] safely allocate SEXP and handle failure

2023-09-21 Thread Jan Gorecki

Dear pkg developers

I would like to safely allocate R object from C. By safely I mean
that, I can test if allocation succeeded or failed, and then raise
exception myself.
R_alloc and allocVector both raises exception straightaway, so I am
not able to handle failed allocation myself.
In plain C it is something like that:

int *x = malloc(nx*sizeof(int));
if (!x) {
  my_fun_to_set_exception_signal();
  free(x);
  return;
}

How can I do it for creating SEXP object?

Best Regards,
Jan Gorecki

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-29 Thread Jan Gorecki

Thank you Gabriel,

Just for future readers. Below is a base R way to address this common
problem, as instructed by you (+stopifnot to suppress print).

Rscript -e 'stopifnot(file.copy("DESCRIPTION",
file.path(tdir<-tempdir(), "PACKAGES")));
db<-available.packages(paste0("file://", tdir));
install.packages(setdiff(tools::package_dependencies(read.dcf("DESCRIPTION",
fields="Package")[[1L]], db, which="most")[[1L]],
installed.packages(priority="high")[,"Package"]))'

3 liner, 310 chars long command, far from ideal, but does work.

Best,
Jan


On Fri, Oct 28, 2022 at 10:42 PM Gabriel Becker  wrote:
>
> Hi Jan,
>
>
> On Fri, Oct 28, 2022 at 1:57 PM Jan Gorecki  wrote:
>>
>> Gabriel,
>>
>> It is the most basic CI use case. One wants to install only
>> dependencies only of the package, and run R CMD check on the package.
>
>
> Really what you're looking for though, is to install all the dependencies 
> which aren't present right? Excluding base packages is just a particular way 
> to do that under certain assumptions about the CI environment.
>
> So
>
>
> needed_pkgs <- setdiff(package_dependencies(...), 
> installed.packages()[,"Package"])
> install.packages(needed_pkgs, repos = fancyrepos)
>
>
> will do what you want without installing the package itself, if that is 
> important. This will filter out base and recommended packages (which will be 
> already installed in your CI container, since R is).
>
>
> Now this does not take into account versioned dependencies, so it's not 
> actually fully correct (whereas installing the package is), but it gets you 
> where you're trying to go. And in a clean CI container without cached package 
> installation for the deps, its equivalent.
>
>
> Also, as an aside, if you need to get the base packages, you can do
>
> installed.packages(priority="base")[,"Package"]
>
>basecompilerdatasetsgraphics   grDevicesgrid
>
>  "base"  "compiler"  "datasets"  "graphics" "grDevices"  "grid"
>
> methodsparallel splines   stats  stats4   tcltk
>
>   "methods"  "parallel"   "splines" "stats""stats4" "tcltk"
>
>   tools   utils
>
> "tools" "utils"
>
>
> (to get base and recommended packages use 'high' instead of 'base')
>
> No need to be reaching down into unexported functions. So if you *really* 
> only want to exclude base functions (which likely will give you some 
> protection from versioned dep issues), you can change the code above to
>
> needed_pkgs <- setdiff(package_dependencies(...), installed.packages(priority 
> = "high")[,"Package"])
> install.packages(needed_pkgs, repos = fancyrepos)
>
> Best,
> ~G
>
>>
>> On Fri, Oct 28, 2022 at 8:42 PM Gabriel Becker  wrote:
>> >
>> > Hi Jan,
>> >
>> > The reason, I suspect without speaking for R-core, is that by design you 
>> > should not be specifying package dependencies as additional packages to 
>> > install. install.packages already does this for you, as it did in the 
>> > construct of a repository code that I provided previously in the thread. 
>> > You should be *only* doing
>> >
>> > install.packages(, repos = *)
>> >
>> > Then everything happens automatically via extremely well tested very 
>> > mature code.
>> >
>> > I (still) don't understand why you'd need to pass install.packages the 
>> > vector of dependencies yourself, as that is counter to install.packages' 
>> > core design.
>> >
>> > Does that make sense?
>> >
>> > Best,
>> > ~G
>> >
>> > On Fri, Oct 28, 2022 at 12:18 PM Jan Gorecki  wrote:
>> >>
>> >> Gabriel,
>> >>
>> >> I am trying to design generic solution that could be applied to
>> >> arbitrary package. Therefore I went with the latter solution you
>> >> proposed.
>> >> If we wouldn't have to exclude base packages, then its a 3 liner
>> >>
>> >> file.copy("DESCRIPTION", file.path(tdir<-tempdir(), "PACKAGES"));
>> >> db<-available.packages(paste0("file://", tdir));
>> >> utils::install.packages(tools::package_dependencies("pkgname", db,
>> >> which="most")[[1L]])
>> >>
>> >> As you noticed, we sti

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-28 Thread Jan Gorecki

Gabriel,

It is the most basic CI use case. One wants to install only
dependencies only of the package, and run R CMD check on the package.

Unless you say that installing the package and then running R CMD
check on that package is considered good practice. Then yes,
functionality I am asking about is not needed. Somehow I never thought
that this could be considered a good practice just by the fact that
installation of the package could already impact environment in which
check is taking place.

Best,
Jan

On Fri, Oct 28, 2022 at 8:42 PM Gabriel Becker  wrote:
>
> Hi Jan,
>
> The reason, I suspect without speaking for R-core, is that by design you 
> should not be specifying package dependencies as additional packages to 
> install. install.packages already does this for you, as it did in the 
> construct of a repository code that I provided previously in the thread. You 
> should be *only* doing
>
> install.packages(, repos = *)
>
> Then everything happens automatically via extremely well tested very mature 
> code.
>
> I (still) don't understand why you'd need to pass install.packages the vector 
> of dependencies yourself, as that is counter to install.packages' core design.
>
> Does that make sense?
>
> Best,
> ~G
>
> On Fri, Oct 28, 2022 at 12:18 PM Jan Gorecki  wrote:
>>
>> Gabriel,
>>
>> I am trying to design generic solution that could be applied to
>> arbitrary package. Therefore I went with the latter solution you
>> proposed.
>> If we wouldn't have to exclude base packages, then its a 3 liner
>>
>> file.copy("DESCRIPTION", file.path(tdir<-tempdir(), "PACKAGES"));
>> db<-available.packages(paste0("file://", tdir));
>> utils::install.packages(tools::package_dependencies("pkgname", db,
>> which="most")[[1L]])
>>
>> As you noticed, we still have to filter out base packages. Otherwise
>> it won't be a robust utility that can be used in CI. Therefore we have
>> to add a call to tools:::.get_standard_package_names() which is an
>> internal function (as of now). Not only complicating the call but also
>> putting the functionality outside of safe use.
>>
>> Considering above, don't you agree that the following one liner could
>> nicely address the problem? The problem that hundreds/thousands of
>> packages are now addressing in their CI scripts by using a third party
>> packages.
>>
>> utils::install.packages(packages.dcf("DESCRIPTION", which="most"))
>>
>> It is hard to me to understand why R members don't consider this basic
>> functionality to be part of base R. Possibly they just don't need it
>> themselves. Yet isn't this sufficient that hundreds/thousands of
>> packages does need this functionality?
>>
>> Best regards,
>> Jan
>>
>> On Mon, Oct 17, 2022 at 8:39 AM Jan Gorecki  wrote:
>> >
>> > Gabriel and Simon
>> >
>> > I completely agree with what you are saying.
>> > The thing is that obtaining recursive deps, all/most whatever, is already 
>> > well supported in core R. What is missing is just this single 
>> > functionality I am requesting.
>> >
>> > If you will look into the branch you can see there is mirror.packages 
>> > function meant to mirror a slice of CRAN. It is doing exactly what you 
>> > described: package_dependencies; to obtain recursive deps, then download 
>> > all, etc.
>> > I would love to have this function provided by core R as well, but we need 
>> > to start somewhere.
>> >
>> > There are other use cases as well.
>> > For example CI, where one wants to install all/most dependencies and then 
>> > run R CMD check. Then we don't worry about recursive deps are they will be 
>> > resolved automatically.
>> > I don't think it's reasonable to force users to use 3rd party packages to 
>> > handle such a common and simple use case. Otherwise one has to hard code 
>> > deps in CI script. Not robust at all.
>> >
>> > packages.dcf and repos.dcf makes all that way easier, and are solid base 
>> > for building customized orchestration like mirroring slice of CRAN.
>> >
>> > Best regards
>> > Jan
>> >
>> > On Sun, Oct 16, 2022, 01:31 Simon Urbanek  
>> > wrote:
>> >>
>> >> Jan,
>> >>
>> >> I think using a single DCF as input is not very practical and would not 
>> >> be useful in the context you describe (creating self contained repos) 
>> >> since they typically concern a li

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-28 Thread Jan Gorecki

Gabriel,

I am trying to design generic solution that could be applied to
arbitrary package. Therefore I went with the latter solution you
proposed.
If we wouldn't have to exclude base packages, then its a 3 liner

file.copy("DESCRIPTION", file.path(tdir<-tempdir(), "PACKAGES"));
db<-available.packages(paste0("file://", tdir));
utils::install.packages(tools::package_dependencies("pkgname", db,
which="most")[[1L]])

As you noticed, we still have to filter out base packages. Otherwise
it won't be a robust utility that can be used in CI. Therefore we have
to add a call to tools:::.get_standard_package_names() which is an
internal function (as of now). Not only complicating the call but also
putting the functionality outside of safe use.

Considering above, don't you agree that the following one liner could
nicely address the problem? The problem that hundreds/thousands of
packages are now addressing in their CI scripts by using a third party
packages.

utils::install.packages(packages.dcf("DESCRIPTION", which="most"))

It is hard to me to understand why R members don't consider this basic
functionality to be part of base R. Possibly they just don't need it
themselves. Yet isn't this sufficient that hundreds/thousands of
packages does need this functionality?

Best regards,
Jan

On Mon, Oct 17, 2022 at 8:39 AM Jan Gorecki  wrote:
>
> Gabriel and Simon
>
> I completely agree with what you are saying.
> The thing is that obtaining recursive deps, all/most whatever, is already 
> well supported in core R. What is missing is just this single functionality I 
> am requesting.
>
> If you will look into the branch you can see there is mirror.packages 
> function meant to mirror a slice of CRAN. It is doing exactly what you 
> described: package_dependencies; to obtain recursive deps, then download all, 
> etc.
> I would love to have this function provided by core R as well, but we need to 
> start somewhere.
>
> There are other use cases as well.
> For example CI, where one wants to install all/most dependencies and then run 
> R CMD check. Then we don't worry about recursive deps are they will be 
> resolved automatically.
> I don't think it's reasonable to force users to use 3rd party packages to 
> handle such a common and simple use case. Otherwise one has to hard code deps 
> in CI script. Not robust at all.
>
> packages.dcf and repos.dcf makes all that way easier, and are solid base for 
> building customized orchestration like mirroring slice of CRAN.
>
> Best regards
> Jan
>
> On Sun, Oct 16, 2022, 01:31 Simon Urbanek  wrote:
>>
>> Jan,
>>
>> I think using a single DCF as input is not very practical and would not be 
>> useful in the context you describe (creating self contained repos) since 
>> they typically concern a list of packages, but essentially splitting out the 
>> part of install.packages() which determines which files will be pulled from 
>> where would be very useful as it would be trivial to use it to create 
>> repository (what we always do in corporate environments) instead of 
>> installing the packages. I suspect that install packages is already too 
>> complex so instead of adding a flag to install.packages one could move that 
>> functionality into a separate function - we all do that constantly for the 
>> sites we manage, so it would be certainly something worthwhile.
>>
>> Cheers,
>> Simon
>>
>>
>> > On Oct 15, 2022, at 7:14 PM, Jan Gorecki  wrote:
>> >
>> > Hi Gabriel,
>> >
>> > It's very nice usage you provided here. Maybe instead of adding new
>> > function we could extend packages_depenedncies then? To accept file path to
>> > dsc file.
>> >
>> > What about repos.dcf? Maybe additional repositories could be an attribute
>> > attached to returned character vector.
>> >
>> > The use case is to, for a given package sources, obtain its dependencies,
>> > so one can use that for installing them/mirroring CRAN subset, or whatever.
>> > The later is especially important for a production environment where one
>> > wants to have fixed version of packages, and mirroring relevant subset of
>> > CRAN is the most simple, and IMO reliable, way to manage such environment.
>> >
>> > Regards
>> > Jan
>> >
>> > On Fri, Oct 14, 2022, 23:34 Gabriel Becker  wrote:
>> >
>> >> Hi Jan and Jan,
>> >>
>> >> Can you explain a little more what exactly you want the non-recursive,
>> >> non-version aware dependencies from an individual package for?
>> >>
>> >> Either

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-17 Thread Jan Gorecki

Gabriel and Simon

I completely agree with what you are saying.
The thing is that obtaining recursive deps, all/most whatever, is already
well supported in core R. What is missing is just this single functionality
I am requesting.

If you will look into the branch you can see there is mirror.packages
function meant to mirror a slice of CRAN. It is doing exactly what you
described: package_dependencies; to obtain recursive deps, then download
all, etc.
I would love to have this function provided by core R as well, but we need
to start somewhere.

There are other use cases as well.
For example CI, where one wants to install all/most dependencies and then
run R CMD check. Then we don't worry about recursive deps are they will be
resolved automatically.
I don't think it's reasonable to force users to use 3rd party packages to
handle such a common and simple use case. Otherwise one has to hard code
deps in CI script. Not robust at all.

packages.dcf and repos.dcf makes all that way easier, and are solid base
for building customized orchestration like mirroring slice of CRAN.

Best regards
Jan

On Sun, Oct 16, 2022, 01:31 Simon Urbanek 
wrote:

> Jan,
>
> I think using a single DCF as input is not very practical and would not be
> useful in the context you describe (creating self contained repos) since
> they typically concern a list of packages, but essentially splitting out
> the part of install.packages() which determines which files will be pulled
> from where would be very useful as it would be trivial to use it to create
> repository (what we always do in corporate environments) instead of
> installing the packages. I suspect that install packages is already too
> complex so instead of adding a flag to install.packages one could move that
> functionality into a separate function - we all do that constantly for the
> sites we manage, so it would be certainly something worthwhile.
>
> Cheers,
> Simon
>
>
> > On Oct 15, 2022, at 7:14 PM, Jan Gorecki  wrote:
> >
> > Hi Gabriel,
> >
> > It's very nice usage you provided here. Maybe instead of adding new
> > function we could extend packages_depenedncies then? To accept file path
> to
> > dsc file.
> >
> > What about repos.dcf? Maybe additional repositories could be an attribute
> > attached to returned character vector.
> >
> > The use case is to, for a given package sources, obtain its dependencies,
> > so one can use that for installing them/mirroring CRAN subset, or
> whatever.
> > The later is especially important for a production environment where one
> > wants to have fixed version of packages, and mirroring relevant subset of
> > CRAN is the most simple, and IMO reliable, way to manage such
> environment.
> >
> > Regards
> > Jan
> >
> > On Fri, Oct 14, 2022, 23:34 Gabriel Becker 
> wrote:
> >
> >> Hi Jan and Jan,
> >>
> >> Can you explain a little more what exactly you want the non-recursive,
> >> non-version aware dependencies from an individual package for?
> >>
> >> Either way package_dependencies will do this for you* with a little
> >> "aggressive convincing". It wants output from available.packages, but
> who
> >> really cares what it wants? It's a function and we are people :)
> >>
> >>> library(tools)
> >>> db <- read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION")
> >>> package_dependencies("rtables", db, which = intersect(c("Depends",
> >> "Suggests", "Imports", "LinkingTo"), colnames(db)))
> >> $rtables
> >> [1] "methods""magrittr"   "formatters" "dplyr"  "tibble"
> >> [6] "tidyr"  "testthat"   "xml2"   "knitr"  "rmarkdown"
> >> [11] "flextable"  "officer""stats"  "htmltools"  "grid"
> >>
> >>
> >> The only gotcha that I see immediately is that "LinkingTo" isn't always
> >> there (whereas it is with real output from available.packages). If you
> >> know your package doesn't have that (or that it does) at call time ,
> this
> >> becomes a one-liner:
> >>
> >> package_dependencies("rtables", db =
> >> read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION"), which =
> >> c("Depends", "Suggests", "Imports"))
> >> $rtables
> >> [1] "methods""magrittr"   "formatters" "dplyr"  "tibble"
> >> [6] "tidyr&qu

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-15 Thread Jan Gorecki

Hi Gabriel,

It's very nice usage you provided here. Maybe instead of adding new
function we could extend packages_depenedncies then? To accept file path to
dsc file.

What about repos.dcf? Maybe additional repositories could be an attribute
attached to returned character vector.

The use case is to, for a given package sources, obtain its dependencies,
so one can use that for installing them/mirroring CRAN subset, or whatever.
The later is especially important for a production environment where one
wants to have fixed version of packages, and mirroring relevant subset of
CRAN is the most simple, and IMO reliable, way to manage such environment.

Regards
Jan

On Fri, Oct 14, 2022, 23:34 Gabriel Becker  wrote:

> Hi Jan and Jan,
>
> Can you explain a little more what exactly you want the non-recursive,
> non-version aware dependencies from an individual package for?
>
> Either way package_dependencies will do this for you* with a little
> "aggressive convincing". It wants output from available.packages, but who
> really cares what it wants? It's a function and we are people :)
>
> > library(tools)
> > db <- read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION")
> > package_dependencies("rtables", db, which = intersect(c("Depends",
> "Suggests", "Imports", "LinkingTo"), colnames(db)))
> $rtables
>  [1] "methods""magrittr"   "formatters" "dplyr"  "tibble"
>  [6] "tidyr"  "testthat"   "xml2"   "knitr"  "rmarkdown"
> [11] "flextable"  "officer""stats"  "htmltools"  "grid"
>
>
> The only gotcha that I see immediately is that "LinkingTo" isn't always
> there (whereas it is with real output from available.packages). If you
> know your package doesn't have that (or that it does) at call time , this
> becomes a one-liner:
>
> package_dependencies("rtables", db =
> read.dcf("~/gabe/checkedout/rtables_clean/DESCRIPTION"), which =
> c("Depends", "Suggests", "Imports"))
> $rtables
>  [1] "methods""magrittr"   "formatters" "dplyr"  "tibble"
>  [6] "tidyr"  "testthat"   "xml2"   "knitr"  "rmarkdown"
> [11] "flextable"  "officer""stats"  "htmltools"  "grid"
>
> You can also trick it a slightly different way by giving it what it
> actually wants
>
> > tdir <- tempdir()
> > file.copy("~/gabe/checkedout/rtables_clean/DESCRIPTION", file.path(tdir,
> "PACKAGES"))
> [1] TRUE
> > avl <- available.packages(paste0("file://", tdir))
> > library(tools)
> > package_dependencies("rtables", avl)
> $rtables
> [1] "methods""magrittr"   "formatters" "stats"  "htmltools"
> [6] "grid"
>
> > package_dependencies("rtables", avl, which = "all")
> $rtables
>  [1] "methods""magrittr"   "formatters" "stats"  "htmltools"
>  [6] "grid"   "dplyr"  "tibble" "tidyr"  "testthat"
> [11] "xml2"   "knitr"  "rmarkdown"  "flextable"  "officer"
>
> So the only real benefits I see that we'd be picking up here is automatic
> filtering by priority, and automatic extraction of the package name from
> the DESCRIPTION file. I'm not sure either of those warrant a new exported
> function that R-core has to maintain forever.
>
> Best,
> ~G
>
> * I haven't tested this across all OSes, but I dont' know of any reason it
> wouldn't work generally.
>
> On Fri, Oct 14, 2022 at 2:33 PM Jan Gorecki  wrote:
>
>> Hello Jan,
>>
>> Thanks for confirming about many packages reinventing this missing
>> functionality.
>> packages.dcf was not meant handle versions. It just extracts names of
>> dependencies... Yes, such a simple thing, yet missing in base R.
>>
>> Versions of packages can be controlled when setting up R pkgs repo. This
>> is
>> how I used to handle it. Making a CRAN subset mirror of fixed version
>> pkgs.
>> BTW. function for that is also included in mentioned branch. I am just not
>> proposing it, to increase the chance of having at least this simple,
>> missing, functionality merged.
>>
>> Best
>> Jan
>>
>> On Fri, Oct 14, 2022, 15:14 Jan Ne

Re: [Rd] tools:: extracting pkg dependencies from DCF

2022-10-14 Thread Jan Gorecki

Hello Jan,

Thanks for confirming about many packages reinventing this missing
functionality.
packages.dcf was not meant handle versions. It just extracts names of
dependencies... Yes, such a simple thing, yet missing in base R.

Versions of packages can be controlled when setting up R pkgs repo. This is
how I used to handle it. Making a CRAN subset mirror of fixed version pkgs.
BTW. function for that is also included in mentioned branch. I am just not
proposing it, to increase the chance of having at least this simple,
missing, functionality merged.

Best
Jan

On Fri, Oct 14, 2022, 15:14 Jan Netík  wrote:

> Hello Jan,
>
> I have seen many packages that implemented dependencies "extraction" on
> their own for internal purposes and today I was doing exactly that for
> mine. It's not a big deal using read.dcf on DESCRIPTION. It was sufficient
> for me, but I had to take care of some \n chars (the overall returned value
> has some rough edges, in my opinion). However, the function from the branch
> seems to not care about version requirements, which are crucial for me.
> Maybe that is something to reconsider before merging.
>
> Best,
> Jan
>
> pá 14. 10. 2022 v 2:27 odesílatel Jan Gorecki 
> napsal:
>
>> Dear R devs,
>>
>> I would like to raise a request for a simple helper function.
>> Utility function to extract package dependencies from DESCRIPTION file.
>>
>> I do think that tools package is better place, for such a fundamental
>> functionality, than community packages.
>>
>> tools pkg seems perfect fit (having already great function
>> write_PACKAGES).
>>
>> Functionality I am asking for is already in R svn repository since 2016,
>> in
>> a branch tools4pkgs. Function is called 'packages.dcf'.
>> Another one 'repos.dcf' would be a good functional complementary to it.
>>
>> Those two simple helper functions really makes it easier for organizations
>> to glue together usage of their own R packages repos and CRAN repo in a
>> smooth way. That could possibly help to offload CRAN from new submissions.
>>
>> gh mirror link for easy preview:
>>
>> https://github.com/wch/r-source/blob/tools4pkgs/src/library/tools/R/packages.R#L419
>>
>> Regards
>> Jan Gorecki
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] tools:: extracting pkg dependencies from DCF

2022-10-13 Thread Jan Gorecki

Dear R devs,

I would like to raise a request for a simple helper function.
Utility function to extract package dependencies from DESCRIPTION file.

I do think that tools package is better place, for such a fundamental
functionality, than community packages.

tools pkg seems perfect fit (having already great function write_PACKAGES).

Functionality I am asking for is already in R svn repository since 2016, in
a branch tools4pkgs. Function is called 'packages.dcf'.
Another one 'repos.dcf' would be a good functional complementary to it.

Those two simple helper functions really makes it easier for organizations
to glue together usage of their own R packages repos and CRAN repo in a
smooth way. That could possibly help to offload CRAN from new submissions.

gh mirror link for easy preview:
https://github.com/wch/r-source/blob/tools4pkgs/src/library/tools/R/packages.R#L419

Regards
Jan Gorecki

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] R-devel new warning: no longer be an S4 object

2021-05-10 Thread Jan Gorecki

Hi R-devs,
R 4.0.5 gives no warning. Is it expected? Searching the news for "I("
doesn't give any info. Thanks

z = I(getClass("MethodDefinition"))
Warning message:
In `class<-`(x, unique.default(c("AsIs", oldClass(x :
  Setting class(x) to multiple strings ("AsIs", "classRepresentation",
...); result will no longer be an S4 object

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] In function isum in summary.c, k should be R_xlen_t

2020-12-31 Thread Jan Gorecki

Don't know this piece well but I am guessing that you haven't found an
example because the iterator is going up to the length of a vector
anymore but only to the number of batches, which is unlikely to be
more than 2^31.

On Tue, Dec 22, 2020 at 12:30 PM Suharto Anggono Suharto Anggono via
R-devel  wrote:
>
>
> In summary.c, in function 'isum', the loop is 'ITERATE_BY_REGION' that 
> contains 'for' loop
> for (int k = 0; k < nbatch; k++)
> It is since SVN revision 73445, in released R since version 3.5.0.
> Previously, the loop is
> for (R_xlen_t i = 0; i < n; i++)
>
> Inside 'ITERATE_BY_REGION', the type of the index, 'k', should still be 
> 'R_xlen_t' as previously. If 'sx' is a regular vector (not ALTREP), data 
> pointer is taken and 'nbatch' is the length of the vector, like without 
> 'ITERATE_BY_REGION'. With 64-bit R, it is possible that the vector is a long 
> vector. In that case, correct iteration should reach index outside the range 
> of 'int'.
>
> However, I haven't found an example in 64-bit R of wrong behavior of
> sum(x)
> for 'x' with storage mode "integer" and length 2^31 or more.
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [External] Re: New pipe operator

2020-12-06 Thread Jan Gorecki

Luke,
When writing a blog post on that, could you please describe
performance implications that this new feature will carry?
AFAIU, compared to a standard way of using temporary variables, pipes
will allow to not increment REFCNT of objects being piped into.
Therefore peak memory usage could be lower in some cases.

As for brackets required on RHS, I think it makes sense to be
consistent and either require brackets for anonymous functions the
same way we require for function name, or not require brackets for
both of them.

Best,
Jan

On Sat, Dec 5, 2020 at 8:10 PM  wrote:
>
> We went back and forth on this several times. The key advantage of
> requiring parentheses is to keep things simple and consistent.  Let's
> get some experience with that. If experience shows requiring
> parentheses creates too many issues then we can add the option of
> dropping them later (with special handling of :: and :::). It's easier
> to add flexibility and complexity than to restrict it after the fact.
>
> Best,
>
> luke
>
> On Sat, 5 Dec 2020, Hugh Parsonage wrote:
>
> > I'm surprised by the aversion to
> >
> > mtcars |> nrow
> >
> > over
> >
> > mtcars |> nrow()
> >
> > and I think the decision to disallow the former should be
> > reconsidered.  The pipe operator is only going to be used when the rhs
> > is a function, so there is no ambiguity with omitting the parentheses.
> > If it's disallowed, it becomes inconsistent with other treatments like
> > sapply(mtcars, typeof) where sapply(mtcars, typeof()) would just be
> > noise.  I'm not sure why this decision was taken
> >
> > If the only issue is with the double (and triple) colon operator, then
> > ideally `mtcars |> base::head` should resolve to `base::head(mtcars)`
> > -- in other words, demote the precedence of |>
> >
> > Obviously (looking at the R-Syntax branch) this decision was
> > considered, put into place, then dropped, but I can't see why
> > precisely.
> >
> > Best,
> >
> >
> > Hugh.
> >
> >
> >
> >
> >
> >
> >
> > On Sat, 5 Dec 2020 at 04:07, Deepayan Sarkar  
> > wrote:
> >>
> >> On Fri, Dec 4, 2020 at 7:35 PM Duncan Murdoch  
> >> wrote:
> >>>
> >>> On 04/12/2020 8:13 a.m., Hiroaki Yutani wrote:
> >   Error: function '::' not supported in RHS call of a pipe
> 
>  To me, this error looks much more friendly than magrittr's error.
>  Some of them got too used to specify functions without (). This
>  is OK until they use `::`, but when they need to use it, it takes
>  hours to figure out why
> 
>  mtcars %>% base::head
>  #> Error in .::base : unused argument (head)
> 
>  won't work but
> 
>  mtcars %>% head
> 
>  works. I think this is a too harsh lesson for ordinary R users to
>  learn `::` is a function. I've been wanting for magrittr to drop the
>  support for a function name without () to avoid this confusion,
>  so I would very much welcome the new pipe operator's behavior.
>  Thank you all the developers who implemented this!
> >>>
> >>> I agree, it's an improvement on the corresponding magrittr error.
> >>>
> >>> I think the semantics of not evaluating the RHS, but treating the pipe
> >>> as purely syntactical is a good decision.
> >>>
> >>> I'm not sure I like the recommended way to pipe into a particular 
> >>> argument:
> >>>
> >>>mtcars |> subset(cyl == 4) |> \(d) lm(mpg ~ disp, data = d)
> >>>
> >>> or
> >>>
> >>>mtcars |> subset(cyl == 4) |> function(d) lm(mpg ~ disp, data = d)
> >>>
> >>> both of which are equivalent to
> >>>
> >>>mtcars |> subset(cyl == 4) |> (function(d) lm(mpg ~ disp, data = d))()
> >>>
> >>> It's tempting to suggest it should allow something like
> >>>
> >>>mtcars |> subset(cyl == 4) |> lm(mpg ~ disp, data = .)
> >>
> >> Which is really not that far off from
> >>
> >> mtcars |> subset(cyl == 4) |> \(.) lm(mpg ~ disp, data = .)
> >>
> >> once you get used to it.
> >>
> >> One consequence of the implementation is that it's not clear how
> >> multiple occurrences of the placeholder would be interpreted. With
> >> magrittr,
> >>
> >> sort(runif(10)) %>% ecdf(.)(.)
> >> ## [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
> >>
> >> This is probably what you would expect, if you expect it to work at all, 
> >> and not
> >>
> >> ecdf(sort(runif(10)))(sort(runif(10)))
> >>
> >> There would be no such ambiguity with anonymous functions
> >>
> >> sort(runif(10)) |> \(.) ecdf(.)(.)
> >>
> >> -Deepayan
> >>
> >>> which would be expanded to something equivalent to the other versions:
> >>> but that makes it quite a bit more complicated.  (Maybe _ or \. should
> >>> be used instead of ., since those are not legal variable names.)
> >>>
> >>> I don't think there should be an attempt to copy magrittr's special
> >>> casing of how . is used in determining whether to also include the
> >>> previous value as first argument.
> >>>
> >>> Duncan Murdoch
> >>>
> >>>
> 
>  Best,
>  Hiroaki Yutani
> 
>  2020年12月4日(金) 20:51 Duncan Murdoch :
>

Re: [Rd] [External] Re: .Internal(quit(...)): system call failed: Cannot allocate memory

2020-12-01 Thread Jan Gorecki

Yes, I do set outside of R, in shell:
R_MAX_VSIZE=100Gb SRC_DATANAME=G1_1e9_2e0_0_0 /usr/bin/time -v Rscript
datatable/groupby-datatable.R

I think it might be related to allocations made with malloc rather than R_alloc.
Probably malloc allocation is not capped by setting this env var.
If so, then I have to limit memory on OS/shell level. As you mentioned before.

Best

On Tue, Dec 1, 2020 at 6:54 PM  wrote:
>
> The fact that your max resident size isn't affected looks odd.  Are
> you setting the environment variable outside R? When I run
>
>  env R_MAX_VSIZE=16Gb /usr/bin/time bin/Rscript jg.R 1e9 2e0 0 0
>
> (your code in jg.R). I get a quick failure with 11785524maxresident)k
>
> Best,
>
> luke
>
> On Tue, 1 Dec 2020, Jan Gorecki wrote:
>
> > Thank you Luke,
> >
> > I tried your suggestion about R_MAX_VSIZE but I am not able to get the
> > error you are getting.
> > I tried recent R devel as I have seen you made a change to GC there.
> > My machine is 128GB, free -h reports 125GB available. I tried to set
> > 128, 125 and 100. In all cases the result is "Command terminated by
> > signal 9". Each took around 6-6.5h.
> > Details below, if it tells you anything how could I optimize it (or
> > raise an exception early) please do let me know.
> >
> > R 4.0.3
> >
> > unset R_MAX_VSIZE
> >User time (seconds): 40447.92
> >System time (seconds): 4034.37
> >Percent of CPU this job got: 201%
> >Elapsed (wall clock) time (h:mm:ss or m:ss): 6:07:59
> >Maximum resident set size (kbytes): 127261184
> >Major (requiring I/O) page faults: 72441
> >Minor (reclaiming a frame) page faults: 3315491751
> >Voluntary context switches: 381446
> >Involuntary context switches: 529554
> >File system inputs: 108339200
> >File system outputs: 120
> >
> > R-devel 2020-11-27 r79522
> >
> > unset R_MAX_VSIZE
> >User time (seconds): 40713.52
> >System time (seconds): 4039.52
> >Percent of CPU this job got: 198%
> >Elapsed (wall clock) time (h:mm:ss or m:ss): 6:15:52
> >Maximum resident set size (kbytes): 127254796
> >Major (requiring I/O) page faults: 72810
> >Minor (reclaiming a frame) page faults: 3433589848
> >Voluntary context switches: 384363
> >Involuntary context switches: 609024
> >File system inputs: 108467064
> >File system outputs: 112
> >
> > R_MAX_VSIZE=128Gb
> >User time (seconds): 40411.13
> >System time (seconds): 4227.99
> >Percent of CPU this job got: 198%
> >Elapsed (wall clock) time (h:mm:ss or m:ss): 6:14:01
> >Maximum resident set size (kbytes): 127249316
> >Major (requiring I/O) page faults: 88500
> >Minor (reclaiming a frame) page faults: 3544520527
> >Voluntary context switches: 384117
> >Involuntary context switches: 545397
> >File system inputs: 111675896
> >File system outputs: 120
> >
> > R_MAX_VSIZE=125Gb
> >User time (seconds): 40246.83
> >System time (seconds): 4042.76
> >Percent of CPU this job got: 201%
> >Elapsed (wall clock) time (h:mm:ss or m:ss): 6:06:56
> >Maximum resident set size (kbytes): 127254200
> >Major (requiring I/O) page faults: 63867
> >Minor (reclaiming a frame) page faults: 3449493803
> >Voluntary context switches: 370753
> >Involuntary context switches: 614607
> >File system inputs: 106322880
> >File system outputs: 112
> >
> > R_MAX_VSIZE=100Gb
> >User time (seconds): 41837.10
> >System time (seconds): 3979.57
> >Percent of CPU this job got: 192%
> >    Elapsed (wall clock) time (h:mm:ss or m:ss): 6:36:34
> >Maximum resident set size (kbytes): 127256940
> >Major (requiring I/O) page faults: 66829
> >Minor (reclaiming a frame) page faults: 3357778594
> >    Voluntary context switches: 391149
> >Involuntary context switches: 646410
> >File system inputs: 106605648
> >File system outputs: 120
> >
> > On Fri, Nov 27, 2020 at 10:18 PM  wrote:
> >>
> >> On Thu, 26 Nov 2020, Jan Gorecki wrote:
> >>
> >>> Thank you Luke for looking into it. Your knowledge of gc is definitely
> >>> helpful here. I put comments inline below.
> >>>
> >>> Best,
> >>>

Re: [Rd] [External] Re: .Internal(quit(...)): system call failed: Cannot allocate memory

2020-12-01 Thread Jan Gorecki

Thank you Luke,

I tried your suggestion about R_MAX_VSIZE but I am not able to get the
error you are getting.
I tried recent R devel as I have seen you made a change to GC there.
My machine is 128GB, free -h reports 125GB available. I tried to set
128, 125 and 100. In all cases the result is "Command terminated by
signal 9". Each took around 6-6.5h.
Details below, if it tells you anything how could I optimize it (or
raise an exception early) please do let me know.

R 4.0.3

unset R_MAX_VSIZE
User time (seconds): 40447.92
System time (seconds): 4034.37
Percent of CPU this job got: 201%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:07:59
Maximum resident set size (kbytes): 127261184
Major (requiring I/O) page faults: 72441
Minor (reclaiming a frame) page faults: 3315491751
Voluntary context switches: 381446
Involuntary context switches: 529554
File system inputs: 108339200
File system outputs: 120

R-devel 2020-11-27 r79522

unset R_MAX_VSIZE
User time (seconds): 40713.52
System time (seconds): 4039.52
Percent of CPU this job got: 198%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:15:52
Maximum resident set size (kbytes): 127254796
Major (requiring I/O) page faults: 72810
Minor (reclaiming a frame) page faults: 3433589848
Voluntary context switches: 384363
Involuntary context switches: 609024
File system inputs: 108467064
File system outputs: 112

R_MAX_VSIZE=128Gb
User time (seconds): 40411.13
System time (seconds): 4227.99
Percent of CPU this job got: 198%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:14:01
Maximum resident set size (kbytes): 127249316
Major (requiring I/O) page faults: 88500
Minor (reclaiming a frame) page faults: 3544520527
Voluntary context switches: 384117
Involuntary context switches: 545397
File system inputs: 111675896
File system outputs: 120

R_MAX_VSIZE=125Gb
User time (seconds): 40246.83
System time (seconds): 4042.76
Percent of CPU this job got: 201%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:06:56
Maximum resident set size (kbytes): 127254200
Major (requiring I/O) page faults: 63867
Minor (reclaiming a frame) page faults: 3449493803
Voluntary context switches: 370753
Involuntary context switches: 614607
File system inputs: 106322880
File system outputs: 112

R_MAX_VSIZE=100Gb
User time (seconds): 41837.10
System time (seconds): 3979.57
Percent of CPU this job got: 192%
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:36:34
Maximum resident set size (kbytes): 127256940
Major (requiring I/O) page faults: 66829
Minor (reclaiming a frame) page faults: 3357778594
Voluntary context switches: 391149
Involuntary context switches: 646410
File system inputs: 106605648
File system outputs: 120

On Fri, Nov 27, 2020 at 10:18 PM  wrote:
>
> On Thu, 26 Nov 2020, Jan Gorecki wrote:
>
> > Thank you Luke for looking into it. Your knowledge of gc is definitely
> > helpful here. I put comments inline below.
> >
> > Best,
> > Jan
> >
> > On Wed, Nov 25, 2020 at 10:38 PM  wrote:
> >>
> >> On Tue, 24 Nov 2020, Jan Gorecki wrote:
> >>
> >>> As for other calls to system. I avoid calling system. In the past I
> >>> had some (to get memory stats from OS), but they were failing with
> >>> exactly the same issue. So yes, if I would add call to system before
> >>> calling quit, I believe it would fail with the same error.
> >>> At the same time I think (although I am not sure) that new allocations
> >>> made in R are working fine. So R seems to reserve some memory and can
> >>> continue to operate, while external call like system will fail. Maybe
> >>> it is like this by design, don't know.
> >>
> >> Thanks for the report on quit(). We're exploring how to make the
> >> cleanup on exit more robust to low memory situations like these.
> >>
> >>>
> >>> Aside from this problem that is easy to report due to the warning
> >>> message, I think that gc() is choking at the same time. I tried to
> >>> make reproducible example for that, multiple times but couldn't, let
> >>> me try one more time.
> >>> It happens to manifest when there is 4e8+ unique characters/factors in
> >>> an R session. I am able to reproduce it using data.table and dplyr
> >>> (0.8.4 because 1.0.0+ fails even sooner), but using base R is not easy
> >>> because of the size. I de

Re: [Rd] [External] Re: .Internal(quit(...)): system call failed: Cannot allocate memory

2020-11-26 Thread Jan Gorecki

Thank you Luke for looking into it. Your knowledge of gc is definitely
helpful here. I put comments inline below.

Best,
Jan

On Wed, Nov 25, 2020 at 10:38 PM  wrote:
>
> On Tue, 24 Nov 2020, Jan Gorecki wrote:
>
> > As for other calls to system. I avoid calling system. In the past I
> > had some (to get memory stats from OS), but they were failing with
> > exactly the same issue. So yes, if I would add call to system before
> > calling quit, I believe it would fail with the same error.
> > At the same time I think (although I am not sure) that new allocations
> > made in R are working fine. So R seems to reserve some memory and can
> > continue to operate, while external call like system will fail. Maybe
> > it is like this by design, don't know.
>
> Thanks for the report on quit(). We're exploring how to make the
> cleanup on exit more robust to low memory situations like these.
>
> >
> > Aside from this problem that is easy to report due to the warning
> > message, I think that gc() is choking at the same time. I tried to
> > make reproducible example for that, multiple times but couldn't, let
> > me try one more time.
> > It happens to manifest when there is 4e8+ unique characters/factors in
> > an R session. I am able to reproduce it using data.table and dplyr
> > (0.8.4 because 1.0.0+ fails even sooner), but using base R is not easy
> > because of the size. I described briefly problem in:
> > https://github.com/h2oai/db-benchmark/issues/110
>
> Because of the design of R's character vectors, with each element
> allocated separately, R is never going to be great at handling huge
> numbers of distinct strings. But it can do an adequate job given
> enough memory to work with.
>
> When I run your GitHub issue example on a machine with around 500 Gb
> of RAM it seems to run OK; /usr/bin/time reports
>
> 2706.89user 161.89system 37:10.65elapsed 128%CPU (0avgtext+0avgdata 
> 92180796maxresident)k
> 0inputs+103450552outputs (0major+38716351minor)pagefaults 0swaps
>
> So the memory footprint is quite large. Using gc.time() it looks like
> about 1/3 of the time is in GC. Not ideal, and maybe could be improved
> on a bit, but probably not by much. The GC is basically doing an
> adequate job, given enough RAM.

Agree, 1/3 is a lot but still acceptable. So this strictly is not
something that requires intervention.
PS. I wasn't aware of gc.time(), it may be worth linking it from
SeeAlso in gc() manual.

>
> If you run this example on a system without enough RAM, or with other
> programs competing for RAM, you are likely to end up fighting with
> your OS/hardware's virtual memory system. When I try to run it on a
> 16Gb system it churns for an hour or so before getting killed, and
> /usr/bin/time reports a huge number of page faults:
>
> 312523816inputs+0outputs (24761285major+25762068minor)pagefaults 0swaps
>
> You are probably experiencing something similar.

Yes, this is exactly what I am experiencing.
The machine is a bare metal machine of 128GB mem, csv size 50GB,
data.frame size 74GB.
In my case it churns for ~3h before it gets killed with SIGINT from
the parent R process which uses 3h as a timeout for this script.
This is something I would like to be addressed because gc time is far
bigger than actual computation time. This is not really acceptable, I
would prefer to raise an exception instead.

>
> There may be opportunities for more tuning of the GC to better handle
> running this close to memory limits, but I doubt the payoff would be
> worth the effort.

If you don't have plans/time to work on that anytime soon, then I can
fill bugzilla for this problem so it won't get lost in the mailing
list.


>
> Best,
>
> luke
>
> > It would help if gcinfo() could take FALSE/TRUE/2L where 2L will print
> > even more information about gc, like how much time the each gc()
> > process took, how many objects it has to check on each level.
> >
> > Best regards,
> > Jan
> >
> >
> >
> > On Tue, Nov 24, 2020 at 1:05 PM Tomas Kalibera  
> > wrote:
> >>
> >> On 11/24/20 11:27 AM, Jan Gorecki wrote:
> >>> Thanks Bill for checking that.
> >>> It was my impression that warnings are raised from some internal
> >>> system calls made when quitting R. At that point I don't have much
> >>> control over checking the return status of those.
> >>> Your suggestion looks good to me.
> >>>
> >>> Tomas, do you think this could help? could this be implemented?
> >>
> >> I think this is a good suggestion. Deleting files on Unix was changed
> >> from system("rm") to doing that in C, and deleting the se

Re: [Rd] .Internal(quit(...)): system call failed: Cannot allocate memory

2020-11-24 Thread Jan Gorecki

As for other calls to system. I avoid calling system. In the past I
had some (to get memory stats from OS), but they were failing with
exactly the same issue. So yes, if I would add call to system before
calling quit, I believe it would fail with the same error.
At the same time I think (although I am not sure) that new allocations
made in R are working fine. So R seems to reserve some memory and can
continue to operate, while external call like system will fail. Maybe
it is like this by design, don't know.

Aside from this problem that is easy to report due to the warning
message, I think that gc() is choking at the same time. I tried to
make reproducible example for that, multiple times but couldn't, let
me try one more time.
It happens to manifest when there is 4e8+ unique characters/factors in
an R session. I am able to reproduce it using data.table and dplyr
(0.8.4 because 1.0.0+ fails even sooner), but using base R is not easy
because of the size. I described briefly problem in:
https://github.com/h2oai/db-benchmark/issues/110

It would help if gcinfo() could take FALSE/TRUE/2L where 2L will print
even more information about gc, like how much time the each gc()
process took, how many objects it has to check on each level.

Best regards,
Jan



On Tue, Nov 24, 2020 at 1:05 PM Tomas Kalibera  wrote:
>
> On 11/24/20 11:27 AM, Jan Gorecki wrote:
> > Thanks Bill for checking that.
> > It was my impression that warnings are raised from some internal
> > system calls made when quitting R. At that point I don't have much
> > control over checking the return status of those.
> > Your suggestion looks good to me.
> >
> > Tomas, do you think this could help? could this be implemented?
>
> I think this is a good suggestion. Deleting files on Unix was changed
> from system("rm") to doing that in C, and deleting the session directory
> should follow.
>
> It might also help diagnosing your problem, but I don't think it would
> solve it. If the diagnostics in R works fine and the OS was so
> hopelessly out of memory that it couldn't run any more external
> processes, then really this is not a problem of R, but of having
> exhausted the resources. And it would be a coincidence that just this
> particular call to "system" at the end of the session did not work.
> Anything else could break as well close to the end of the script. This
> seems the most likely explanation to me.
>
> Do you get this warning repeatedly, reproducibly at least in slightly
> different scripts at the very end, with this warning always from quit()?
> So that the "call" part of the warning message has .Internal(quit) like
> in the case you posted? Would adding another call to "system" before the
> call to "q()" work - with checking the return value? If it is always
> only the last call to "system" in "q()", then it is suspicious, perhaps
> an indication that some diagnostics in R is not correct. In that case, a
> reproducible example would be the key - so either if you could diagnose
> on your end what is the problem, or create a reproducible example that
> someone else can use to reproduce and debug.
>
> Best
> Tomas
>
> >
> > On Mon, Nov 23, 2020 at 7:10 PM Bill Dunlap  
> > wrote:
> >> The call to system() probably is an internal call used to delete the 
> >> session's tempdir().  This sort of failure means that a potentially large 
> >> amount of disk space is not being recovered when R is done.  Perhaps 
> >> R_CleanTempDir() could call R_unlink() instead of having a subprocess call 
> >> 'rm -rf ...'.  Then it could also issue a specific warning if it was 
> >> impossible to delete all of tempdir().  (That should be very rare.)
> >>
> >>> q("no")
> >> Breakpoint 1, R_system (command=command@entry=0x7fffa1e0 "rm -Rf 
> >> /tmp/RtmppoKPXb") at sysutils.c:311
> >> 311 {
> >> (gdb) where
> >> #0  R_system (command=command@entry=0x7fffa1e0 "rm -Rf 
> >> /tmp/RtmppoKPXb") at sysutils.c:311
> >> #1  0x557c30ec in R_CleanTempDir () at sys-std.c:1178
> >> #2  0x557c31d7 in Rstd_CleanUp (saveact=, status=0, 
> >> runLast=) at sys-std.c:1243
> >> #3  0x557c593d in R_CleanUp (saveact=saveact@entry=SA_NOSAVE, 
> >> status=status@entry=0, runLast=) at system.c:87
> >> #4  0x556cc85e in do_quit (call=, op= >> out>, args=0x57813f90, rho=) at main.c:1393
> >>
> >> -Bill
> >>
> >> On Mon, Nov 23, 2020 at 3:15 AM Tomas Kalibera  
> >> wrote:
> >>> On 11/21/20 6:51 PM, Jan Gorecki wrote:
> >>>> De

Re: [Rd] .Internal(quit(...)): system call failed: Cannot allocate memory

2020-11-24 Thread Jan Gorecki

Thanks Bill for checking that.
It was my impression that warnings are raised from some internal
system calls made when quitting R. At that point I don't have much
control over checking the return status of those.
Your suggestion looks good to me.

Tomas, do you think this could help? could this be implemented?

On Mon, Nov 23, 2020 at 7:10 PM Bill Dunlap  wrote:
>
> The call to system() probably is an internal call used to delete the 
> session's tempdir().  This sort of failure means that a potentially large 
> amount of disk space is not being recovered when R is done.  Perhaps 
> R_CleanTempDir() could call R_unlink() instead of having a subprocess call 
> 'rm -rf ...'.  Then it could also issue a specific warning if it was 
> impossible to delete all of tempdir().  (That should be very rare.)
>
> > q("no")
> Breakpoint 1, R_system (command=command@entry=0x7fffa1e0 "rm -Rf 
> /tmp/RtmppoKPXb") at sysutils.c:311
> 311 {
> (gdb) where
> #0  R_system (command=command@entry=0x7fffa1e0 "rm -Rf /tmp/RtmppoKPXb") 
> at sysutils.c:311
> #1  0x557c30ec in R_CleanTempDir () at sys-std.c:1178
> #2  0x557c31d7 in Rstd_CleanUp (saveact=, status=0, 
> runLast=) at sys-std.c:1243
> #3  0x557c593d in R_CleanUp (saveact=saveact@entry=SA_NOSAVE, 
> status=status@entry=0, runLast=) at system.c:87
> #4  0x556cc85e in do_quit (call=, op=, 
> args=0x57813f90, rho=) at main.c:1393
>
> -Bill
>
> On Mon, Nov 23, 2020 at 3:15 AM Tomas Kalibera  
> wrote:
>>
>> On 11/21/20 6:51 PM, Jan Gorecki wrote:
>> > Dear R-developers,
>> >
>> > Some of the more fat scripts (50+ GB mem used by R) that I am running,
>> > when they finish they do quit with q("no", status=0)
>> > Quite often it happens that there is an extra stderr output produced
>> > at the very end which looks like this:
>> >
>> > Warning message:
>> > In .Internal(quit(save, status, runLast)) :
>> >system call failed: Cannot allocate memory
>> >
>> > Is there any way to avoid this kind of warnings? I am using stderr
>> > output for detecting failures in scripts and this warning is a false
>> > positive of a failure.
>> >
>> > Maybe quit function could wait little bit longer trying to allocate
>> > before it raises this warning?
>>
>> If you see this warning, some call to system() or system2() or similar,
>> which executes an external program, failed to even run a shell to run
>> that external program, because there was not enough memory. You should
>> be able to find out where it happens by checking the exit status of
>> system().
>>
>> Tomas
>>
>>
>> >
>> > Best regards,
>> > Jan Gorecki
>> >
>> > __
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] .Internal(quit(...)): system call failed: Cannot allocate memory

2020-11-21 Thread Jan Gorecki

Dear R-developers,

Some of the more fat scripts (50+ GB mem used by R) that I am running,
when they finish they do quit with q("no", status=0)
Quite often it happens that there is an extra stderr output produced
at the very end which looks like this:

Warning message:
In .Internal(quit(save, status, runLast)) :
  system call failed: Cannot allocate memory

Is there any way to avoid this kind of warnings? I am using stderr
output for detecting failures in scripts and this warning is a false
positive of a failure.

Maybe quit function could wait little bit longer trying to allocate
before it raises this warning?

Best regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [R-pkg-devel] details of CRAN's r-devel-linux-x86_64-debian-clang

2020-10-27 Thread Jan Gorecki

Thanks to Kurt Hornik for confirming that following should be sufficient.
export _R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_=false
export _R_S3_METHOD_LOOKUP_REPORT_SEARCH_PATH_USES_=true

I did more tests and found out that for the issue to manifest I had to
install xts, rather than just zoo (which registers generic method),
because a particular unit test required xts to be present.
Now I am able to reproduce the problem.

On Sun, Oct 25, 2020 at 12:17 PM Jan Gorecki  wrote:
>
> Thanks Hugh,
> I must have dropped it when pasting to email. I checked again and it
> doesn't make any difference. So my question is still valid then.
>
> On Sun, Oct 25, 2020 at 11:14 AM Hugh Parsonage
>  wrote:
> >
> > _R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_
> >
> > not
> >
> >  R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_
> >
> > (i.e. you seem to have dropped the initial underscore)?
> >
> > On Sun, 25 Oct 2020 at 20:01, Jan Gorecki  wrote:
> > >
> > > Dear community,
> > >
> > > I am getting an error on CRAN on r-devel-linux-x86_64-debian-clang 
> > > machine only:
> > >
> > >   S3 method lookup found 'as.x.y' on search path
> > >
> > > I know how to fix the error. I am interested to know how to reproduce
> > > this error locally. I tried combination of
> > > R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_
> > > and
> > > _R_S3_METHOD_LOOKUP_REPORT_SEARCH_PATH_USES_
> > > but didn't manage to reproduce the error.
> > >
> > > What is the proper way to set up checks to detect these kinds of issues?
> > >
> > > Moreover, I kindly request to provide "details" for other CRAN
> > > machines in flavors page:
> > > https://cran.r-project.org/web/checks/check_flavors.html
> > > "Details" are currently provided only for 3 machines there. Having
> > > "details" for r-devel-linux-x86_64-debian-clang would probably answer
> > > my question without involving readers of this mailing list.
> > >
> > > Best Regards,
> > > Jan Gorecki
> > >
> > > __
> > > R-package-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] details of CRAN's r-devel-linux-x86_64-debian-clang

2020-10-25 Thread Jan Gorecki

Thanks Hugh,
I must have dropped it when pasting to email. I checked again and it
doesn't make any difference. So my question is still valid then.

On Sun, Oct 25, 2020 at 11:14 AM Hugh Parsonage
 wrote:
>
> _R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_
>
> not
>
>  R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_
>
> (i.e. you seem to have dropped the initial underscore)?
>
> On Sun, 25 Oct 2020 at 20:01, Jan Gorecki  wrote:
> >
> > Dear community,
> >
> > I am getting an error on CRAN on r-devel-linux-x86_64-debian-clang machine 
> > only:
> >
> >   S3 method lookup found 'as.x.y' on search path
> >
> > I know how to fix the error. I am interested to know how to reproduce
> > this error locally. I tried combination of
> > R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_
> > and
> > _R_S3_METHOD_LOOKUP_REPORT_SEARCH_PATH_USES_
> > but didn't manage to reproduce the error.
> >
> > What is the proper way to set up checks to detect these kinds of issues?
> >
> > Moreover, I kindly request to provide "details" for other CRAN
> > machines in flavors page:
> > https://cran.r-project.org/web/checks/check_flavors.html
> > "Details" are currently provided only for 3 machines there. Having
> > "details" for r-devel-linux-x86_64-debian-clang would probably answer
> > my question without involving readers of this mailing list.
> >
> > Best Regards,
> > Jan Gorecki
> >
> > __
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[R-pkg-devel] details of CRAN's r-devel-linux-x86_64-debian-clang

2020-10-25 Thread Jan Gorecki

Dear community,

I am getting an error on CRAN on r-devel-linux-x86_64-debian-clang machine only:

  S3 method lookup found 'as.x.y' on search path

I know how to fix the error. I am interested to know how to reproduce
this error locally. I tried combination of
R_S3_METHOD_LOOKUP_BASEENV_AFTER_GLOBALENV_
and
_R_S3_METHOD_LOOKUP_REPORT_SEARCH_PATH_USES_
but didn't manage to reproduce the error.

What is the proper way to set up checks to detect these kinds of issues?

Moreover, I kindly request to provide "details" for other CRAN
machines in flavors page:
https://cran.r-project.org/web/checks/check_flavors.html
"Details" are currently provided only for 3 machines there. Having
"details" for r-devel-linux-x86_64-debian-clang would probably answer
my question without involving readers of this mailing list.

Best Regards,
Jan Gorecki

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] check cross-references error: Non-file package-anchored link(s)

2020-07-02 Thread Jan Gorecki

Thank you Gabor

On Thu, Jul 2, 2020 at 10:20 AM Gábor Csárdi  wrote:
>
> You can set the _R_CHECK_XREFS_MIND_SUSPECT_ANCHORS_=true env var and
> use R-devel.
>
> Alternatively, and you don't need R-devel for this, you can run R CMD
> --html INSTALL on your package, and then look for messages that
> contain "treated as a topic", e.g.
>
> curl_fdshtml
> Rd warning: /Users/gaborcsardi/works/processx/man/curl_fds.Rd:11: file
> link ‘multi_fdset’ in package ‘curl’ does not exist and so has been
> treated as a topic
>
> Gabor
>
>
> On Thu, Jul 2, 2020 at 10:06 AM Jan Gorecki  wrote:
> >
> > Hi,
> > What is the recommended way to test for those issues locally?
> > If it is tested during cran submission, then seems reasonable to be enabled 
> > just by --as-cran switch. Is it?
> > Thanks
> >
> > On Wed 17 Jun, 2020, 12:32 AM Wayne Oldford,  wrote:
> >>
> >> Thank you!
> >>
> >> -Original Message-
> >> From: Gábor Csárdi 
> >> Date: Tuesday, June 16, 2020 at 4:32 PM
> >> To: Wayne Oldford 
> >> Cc: List r-package-devel 
> >> Subject: Re: [R-pkg-devel] check cross-references error: Non-file 
> >> package-anchored link(s)
> >>
> >> This is how to look up the filename. The first "sp" is the topic name,
> >> the second is the package name.
> >>
> >> > help("sp", "sp")[[1]]
> >> [1] "C:/Users/csard/R/win-library/4.0/sp/help/00sp"
> >>
> >> So you need to link to the "00sp.Rd" file:  \link[sp:00sp]{sp}
> >>
> >> Gabor
> >>
> >> On Tue, Jun 16, 2020 at 9:09 PM Wayne Oldford  
> >> wrote:
> >> >
> >> > Hi
> >> >
> >> > I got caught by this new test this week in trying to push an updated 
> >> release of the loon package to CRAN.
> >> >
> >> > By following this thread, I corrected my cross-references to 
> >> external packages but I got stymied by
> >> > the one I hoped to give to the  "sp" package for Spatial data
> >> >
> >> > _
> >> >
> >> > Here is the history:
> >> >
> >> > I tried
> >> >\link[sp:sp]{sp}
> >> > which failed here:
> >> > Debian: 
> >> <https://win-builder.r-project.org/incoming_pretest/loon_1.3.1_20200616_162128/Debian/00check.log>
> >> > Status: 1 WARNING
> >> >
> >> >
> >> > That was meant to correct an earlier attempt (it did for other links 
> >> to "scales" for example) where I had tried
> >> >   \link[sp]{sp}
> >> > and  failed here:
> >> > Debian: 
> >> <https://win-builder.r-project.org/incoming_pretest/loon_1.3.1_20200615_213749/Debian/00check.log>
> >> > Status: 1 WARNING
> >> >
> >> >
> >> > So to complete the possibilities as I understand them,  I just now 
> >> tried
> >> >\link{sp}
> >> > which, as might be expected, failed here:
> >> > Debian: 
> >> <https://win-builder.r-project.org/incoming_pretest/loon_1.3.1_20200616_213921/Debian/00check.log>
> >> > Status: 1 WARNING
> >> > As expected, error here was different:  "Missing  link"  as opposed 
> >> to "Non-file package-anchored link"
> >> >
> >> > _
> >> >
> >> >
> >> > I am not sure whether I have missed a subtlety in WRE or that the 
> >> peculiar circumstance
> >> > where the package, the topic, and the file name are all identical 
> >> (sp) is some weird boundary case.
> >> >
> >> > Without further advice, I think I am just going to remove the link 
> >> to "sp".
> >> > It really is just a courtesy link to the package description for 
> >> "sp".
> >> >
> >> > Thanks in advance for your thoughts.
> >> >
> >> > Wayne
> >> >
> >> >
> >> >
> >> >
> >> > -Original Message-
> >> > From: R-package-devel  on 
> >> behalf of Georgi Boshnakov 
> >> > Date

Re: [R-pkg-devel] check cross-references error: Non-file package-anchored link(s)

2020-07-02 Thread Jan Gorecki

Hi,
What is the recommended way to test for those issues locally?
If it is tested during cran submission, then seems reasonable to be enabled
just by --as-cran switch. Is it?
Thanks

On Wed 17 Jun, 2020, 12:32 AM Wayne Oldford,  wrote:

> Thank you!
>
> -Original Message-
> From: Gábor Csárdi 
> Date: Tuesday, June 16, 2020 at 4:32 PM
> To: Wayne Oldford 
> Cc: List r-package-devel 
> Subject: Re: [R-pkg-devel] check cross-references error: Non-file
> package-anchored link(s)
>
> This is how to look up the filename. The first "sp" is the topic name,
> the second is the package name.
>
> > help("sp", "sp")[[1]]
> [1] "C:/Users/csard/R/win-library/4.0/sp/help/00sp"
>
> So you need to link to the "00sp.Rd" file:  \link[sp:00sp]{sp}
>
> Gabor
>
> On Tue, Jun 16, 2020 at 9:09 PM Wayne Oldford 
> wrote:
> >
> > Hi
> >
> > I got caught by this new test this week in trying to push an updated
> release of the loon package to CRAN.
> >
> > By following this thread, I corrected my cross-references to
> external packages but I got stymied by
> > the one I hoped to give to the  "sp" package for Spatial data
> >
> > _
> >
> > Here is the history:
> >
> > I tried
> >\link[sp:sp]{sp}
> > which failed here:
> > Debian: <
> https://win-builder.r-project.org/incoming_pretest/loon_1.3.1_20200616_162128/Debian/00check.log
> >
> > Status: 1 WARNING
> >
> >
> > That was meant to correct an earlier attempt (it did for other links
> to "scales" for example) where I had tried
> >   \link[sp]{sp}
> > and  failed here:
> > Debian: <
> https://win-builder.r-project.org/incoming_pretest/loon_1.3.1_20200615_213749/Debian/00check.log
> >
> > Status: 1 WARNING
> >
> >
> > So to complete the possibilities as I understand them,  I just now
> tried
> >\link{sp}
> > which, as might be expected, failed here:
> > Debian: <
> https://win-builder.r-project.org/incoming_pretest/loon_1.3.1_20200616_213921/Debian/00check.log
> >
> > Status: 1 WARNING
> > As expected, error here was different:  "Missing  link"  as opposed
> to "Non-file package-anchored link"
> >
> > _
> >
> >
> > I am not sure whether I have missed a subtlety in WRE or that the
> peculiar circumstance
> > where the package, the topic, and the file name are all identical
> (sp) is some weird boundary case.
> >
> > Without further advice, I think I am just going to remove the link
> to "sp".
> > It really is just a courtesy link to the package description for
> "sp".
> >
> > Thanks in advance for your thoughts.
> >
> > Wayne
> >
> >
> >
> >
> > -Original Message-
> > From: R-package-devel  on
> behalf of Georgi Boshnakov 
> > Date: Tuesday, June 16, 2020 at 9:27 AM
> > To: Gábor Csárdi , Duncan Murdoch <
> murdoch.dun...@gmail.com>
> > Cc: List r-package-devel 
> > Subject: Re: [R-pkg-devel] check cross-references error: Non-file
> package-anchored link(s)
> >
> > I think that the current behaviour is documented in WRE:
> >
> > "...There are two other forms of optional argument specified as
> \link[pkg]{foo} and
> > \link[pkg:bar]{foo} to link to the package pkg, to files
> foo.html and bar.html respectively.
> > These are rarely needed, perhaps to refer to not-yet-installed
> packages (but there the HTML
> > help system will resolve the link at run time) or in the
> normally undesirable event that more
> > than one package offers help on a topic7 (in which case the
> present package has precedence so
> > this is only needed to refer to other packages). They are
> currently only used in HTML help
> > (and ignored for hyperlinks in LATEX conversions of help pages),
> and link to the file rather
> > than the topic (since there is no way to know which topics are
> in which files in an uninstalled
> > package) ...   Because they have been frequently misused, the
> HTML help system looks for topic foo in package pkg
> > if it does not find file foo.html."
> >
> > Unless I am missing something, it seems that it would be
> relatively painless to reverse the logic of the current behaviour of the
> help system,
> > i.e. to start looking first for the topic and then for a file.
> >
> > Georgi Boshnakov
> >
> > -Original Message-
> > From: R-package-devel 
> On Behalf Of Gábor Csárdi
> > Sent: 16 June 2020 13:44
> > To: Duncan Murdoch 
> > Cc: List r-package-devel 
> > Subject: Re: [R-pkg-devel] check cross-references error:
> Non-file package-anchored link(s)
> >
> > On Mon, Jun 15, 2020 at 5:30 PM Duncan Murdoch <
> murdoch.dun...@gmail.com> wrote:
> > >
> > > On 15/06/2020 12:05 p.m., Martin

Re: [Rd] Build a R call at C level

2020-06-30 Thread Jan Gorecki

It is quite known that R documentation on R C api could be improved...
Still R-package-devel mailing list should be preferred for this kind
of questions.
Not sure if that is the best way, but works.

call_to_sum <- inline::cfunction(
  language = "C",
  sig = c(x = "SEXP"), body = "

SEXP e = PROTECT(lang2(install(\"sum\"), x));
SEXP r_true = PROTECT(CONS(ScalarLogical(1), R_NilValue));
SETCDR(CDR(e), r_true);
SET_TAG(CDDR(e), install(\"na.rm\"));
Rf_PrintValue(e);
SEXP ans = PROTECT(eval(e, R_GlobalEnv));
UNPROTECT(3);
return ans;

")

call_to_sum(c(1L,NA,3L))

On Tue, Jun 30, 2020 at 10:08 AM Morgan Morgan
 wrote:
>
> Hi All,
>
> I was reading the R extension manual section 5.11 ( Evaluating R expression
> from C) and I tried to build a simple call to the sum function. Please see
> below.
>
> call_to_sum <- inline::cfunction(
>   language = "C",
>   sig = c(x = "SEXP"), body = "
>
> SEXP e = PROTECT(lang2(install(\"sum\"), x));
> SEXP ans = PROTECT(eval(e, R_GlobalEnv));
> UNPROTECT(2);
> return ans;
>
> ")
>
> call_to_sum(1:3)
>
> The above works. My question is how do I add the argument "na.rm=TRUE" at C
> level to the above call? I have tried various things based on what is in
> section 5.11 but I did not manage to get it to work.
>
> Thank you
> Best regards
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R-devel internal errors during check produce?

2020-06-30 Thread Jan Gorecki

No packages are being loaded, or even installed.
Did you try running the example on R-devel built with flags I have
provided in this email?
I checked now and it is required to use --enable-strict-barrier to
reproduce the issue.

On Tue, Jun 30, 2020 at 9:02 AM Martin Maechler
 wrote:
>
> >>>>> Kurt Hornik
> >>>>> on Tue, 30 Jun 2020 06:20:57 +0200 writes:
>
> >>>>> Jan Gorecki writes:
> >> Thank you both, You are absolutely correct that example
> >> should be minimal, so here it is.
>
> >> l = list(a=new.env(), b=new.env()) unique(l)
>
> >> Just for completeness, env_list during check that raises
> >> error
>
> >> env_list <- list(baseenv(),
> >>   as.environment("package:graphics"),
> >>   as.environment("package:stats"),
> >>   as.environment("package:utils"),
> >>   as.environment("package:methods") )
>
> >> unique(env_list)
>
> > Thanks ... but the above work fine for me.  E.g.,
>
> R> l = list(a=new.env(), b=new.env())
> R> unique(l)
> > [[1]] 
>
> > [[2]] 
>
> > Best -k
>
> Ditto here;  also your (Jan) 2nd example works fine.
>
> So, you must have loaded some (untidy) packages / code which redefine
> standard base R behavior ?
>
> Martin
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R-devel internal errors during check produce?

2020-06-29 Thread Jan Gorecki

Thank you both,
You are absolutely correct that example should be minimal, so here it is.

l = list(a=new.env(), b=new.env())
unique(l)

Just for completeness, env_list during check that raises error

env_list <- list(baseenv(),
  as.environment("package:graphics"),
  as.environment("package:stats"),
  as.environment("package:utils"),
  as.environment("package:methods")
)
unique(env_list)

Best regards,
Jan

On Mon, Jun 29, 2020 at 5:42 PM Martin Maechler
 wrote:
>
> >>>>> Kurt Hornik
> >>>>> on Mon, 29 Jun 2020 16:13:03 +0200 writes:
>
> >>>>> Jan Gorecki writes:
> >> So the unique.default is from the R tools package during
> >> checks.  I don't see those issues on CRAN checks.
>
> > I cannot reproduce this locally (and have no clues about
> > docker).  Perhaps you can try to debug this on your end?
> > And see what env_list is when the error occurs?
>
> > Best -k
>
> Indeed, if it is a bug in R (as opposed to being an assumption
> that 'data.table' makes about undocumented R internals), it
> should be reproducible with a very small dummy package instead
> of data.table. ... or actually reproducible with relatively
> simple R code calling unique() not envolving any non base package.
>
> Martin
>
>
> >> Exact environment where I am reproducing this issue is a
> >> fresh ubuntu, no R packages pre-installed docker pull
> >> registry.gitlab.com/jangorecki/dockerfiles/r-devel
> >> 
> https://gitlab.com/jangorecki/dockerfiles/-/raw/master/r-devel/Dockerfile
>
> >> On Sat, Jun 27, 2020 at 12:37 AM Jan Gorecki
> >>  wrote:
> >>>
> >>> Hi R developers,
> >>>
> >>> On R-devel (2020-06-24 r78746) I am getting those two
> >>> new exceptions during R check. I found a change which
> >>> eventually may be related
> >>> 
> https://github.com/wch/r-source/commit/69de92b9fb1b7f2a7c8d1394b8d56050881a5465
> >>> I think this may be a regression. I grep'ed package
> >>> manuals and R code for unique.default but don't see
> >>> any. Usage section of the unique method looks fine as
> >>> well. Errors look a little bit like internal errors.
> >>>
> >>> * checking Rd \usage sections ... NOTE Error in
> >>> unique.default(env_list) : LENGTH or similar applied to
> >>> environment object Calls: 
> >>> ... .get_S3_generics_as_seen_from_package -> unique ->
> >>> unique.default Execution halted The \usage entries for
> >>> S3 methods should use the \method markup and not their
> >>> full name.  * checking S3 generic/method consistency
> >>> ... WARNING Error in unique.default(env_list) : LENGTH
>     >>> or similar applied to environment object Calls:
> >>>  ... .get_S3_generics_as_seen_from_package ->
> >>> unique -> unique.default
> >>>
> >>> I don't think if it is related but I build R-devel with
> >>> extra args: --with-recommended-packages
> >>> --enable-strict-barrier --disable-long-double I check
> >>> with: --as-cran --no-manual To reproduce download
> >>> current data.table from CRAN (1.12.8) and run R check
> >>>
> >>> Best regards, Jan Gorecki
>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R-devel internal errors during check produce?

2020-06-26 Thread Jan Gorecki

So the unique.default is from the R tools package during checks.
I don't see those issues on CRAN checks.
Exact environment where I am reproducing this issue is a fresh ubuntu,
no R packages pre-installed
docker pull registry.gitlab.com/jangorecki/dockerfiles/r-devel
https://gitlab.com/jangorecki/dockerfiles/-/raw/master/r-devel/Dockerfile

On Sat, Jun 27, 2020 at 12:37 AM Jan Gorecki  wrote:
>
> Hi R developers,
>
> On R-devel (2020-06-24 r78746) I am getting those two new exceptions
> during R check. I found a change which eventually may be related
> https://github.com/wch/r-source/commit/69de92b9fb1b7f2a7c8d1394b8d56050881a5465
> I think this may be a regression. I grep'ed package manuals and R code
> for unique.default but don't see any. Usage section of the unique
> method looks fine as well. Errors look a little bit like internal
> errors.
>
> * checking Rd \usage sections ... NOTE
>  Error in unique.default(env_list) :
>LENGTH or similar applied to environment object
>  Calls:  ... .get_S3_generics_as_seen_from_package ->
> unique -> unique.default
>  Execution halted
>  The \usage entries for S3 methods should use the \method markup and not
>  their full name.
>  * checking S3 generic/method consistency ... WARNING
>  Error in unique.default(env_list) :
>LENGTH or similar applied to environment object
>  Calls:  ... .get_S3_generics_as_seen_from_package ->
> unique -> unique.default
>
> I don't think if it is related but I build R-devel with extra args:
> --with-recommended-packages --enable-strict-barrier --disable-long-double
> I check with:
> --as-cran --no-manual
> To reproduce download current data.table from CRAN (1.12.8) and run R check
>
> Best regards,
> Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] R-devel internal errors during check produce?

2020-06-26 Thread Jan Gorecki

Hi R developers,

On R-devel (2020-06-24 r78746) I am getting those two new exceptions
during R check. I found a change which eventually may be related
https://github.com/wch/r-source/commit/69de92b9fb1b7f2a7c8d1394b8d56050881a5465
I think this may be a regression. I grep'ed package manuals and R code
for unique.default but don't see any. Usage section of the unique
method looks fine as well. Errors look a little bit like internal
errors.

* checking Rd \usage sections ... NOTE
 Error in unique.default(env_list) :
   LENGTH or similar applied to environment object
 Calls:  ... .get_S3_generics_as_seen_from_package ->
unique -> unique.default
 Execution halted
 The \usage entries for S3 methods should use the \method markup and not
 their full name.
 * checking S3 generic/method consistency ... WARNING
 Error in unique.default(env_list) :
   LENGTH or similar applied to environment object
 Calls:  ... .get_S3_generics_as_seen_from_package ->
unique -> unique.default

I don't think if it is related but I build R-devel with extra args:
--with-recommended-packages --enable-strict-barrier --disable-long-double
I check with:
--as-cran --no-manual
To reproduce download current data.table from CRAN (1.12.8) and run R check

Best regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [R-pkg-devel] Rbuildignore a file type from a top level directory only

2020-06-21 Thread Jan Gorecki

Works great, thanks

On Sun, Jun 21, 2020 at 4:21 PM Hugh Parsonage  wrote:
>
> Perhaps
>
> ^[^/]+\.R$
>
> On Sun, 21 Jun 2020 at 22:31, Jan Gorecki  wrote:
> >
> > Hi R developers,
> >
> > What is the proper way to define an entry in .Rbuildignore file to
> > exclude *.R files from a top level directory of R package?
> > Using ^.*\.R$ excludes every R script, including those in sub-directories.
> >
> > Thank you,
> > Jan Gorecki
> >
> > __
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] R_MAKEVARS_USER fail to pass down to sub-makes?

2020-06-18 Thread Jan Gorecki

Thank you Jim,
That was the problem indeed. Too bad it is not mentioned in the manual.

On Wed, Jun 17, 2020 at 8:33 PM Jim Hester  wrote:
>
> I think the issue is likely that you seem to be using a relative path to the 
> R_MAKEVARS_USER file, it needs to be an absolute path as the installation is 
> run in a temporary directory not from the directory you call `R CMD INSTALL` 
> from.
> I observed similar behavior to what you describe when I had the MAKEVARS_USER 
> file as a relative path, but using an absolute path produced the expected 
> result.
>
> On Mon, Jun 1, 2020 at 8:04 AM Jan Gorecki  wrote:
>>
>> Hi package devel support,
>>
>> I am trying to use R_MAKEVARS_USER to customize build, rather than
>> .R/Makevars. It is properly displayed from config CFLAGS but during
>> package install it doesn't seem to work.
>>
>> In R-admin in "6.3.3 Customizing package compilation" there is:
>>
>> > Note that these mechanisms do not work with packages which fail to pass 
>> > settings down to sub-makes, perhaps reading etc/Makeconf in makefiles in 
>> > subdirectories.
>>
>> It seems that it applies to me. How should I debug that? to make this
>> env var respected? Note that my pkg has src/Makevars to handle openmp
>> switch nicely
>> Thank you
>>
>> system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD config CFLAGS")
>> -O3
>>
>> system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD INSTALL
>> --library=library/gcc/O3 mypkg_1.0.0.tar.gz")
>> * installing *source* package 'mypkg' ...
>> ** using staged installation
>> ** libs
>> gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG-fopenmp -fpic  -g
>> -O2 -fdebug-prefix-map=/build/r-base-V28x5H/r-base-3.6.3=.
>> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
>> -D_FORTIFY_SOURCE=2 -g  -c assign.c -o assign.o
>>
>> __
>> R-package-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[Rd] fixed url for R-release windows binaries

2020-06-18 Thread Jan Gorecki

Dear R-developers,

Could we have a fixed url to fetch current R-release windows binaries?
Like we do for R-devel
https://cloud.r-project.org/bin/windows/base/R-devel-win.exe

Something like the following would do
https://cloud.r-project.org/bin/windows/base/R-release-win.exe

Thank you,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [R-pkg-devel] Forward function call

2020-06-09 Thread Jan Gorecki

"pkg::fun" cannot be a name because it is a function call already
`::`(pkg, fun).

On Tue, Jun 9, 2020 at 8:10 AM Martin Maechler
 wrote:
>
> > Göran Broström
> > on Mon, 8 Jun 2020 23:02:30 +0200 writes:
>
> > Thanks for the responses!
> > I found the suggestion
>
> > Call[[1]] <- quote(survival::coxph)
>
> > easiest to implement. And it works.
>
> and it is what we use in R's own R source code in several
> places (and that's where/how I assume it also came to  Ben
> Bolker, lme4, etc) :
>
> A simple grep inside current R's source /src/library/ gives
>
> grep -r --color -nHF -e 'quote(stats::'
>
> ./stats/R/acf.R:28:m[[1L]] <- quote(stats::pacf)
> ./stats/R/aggregate.R:154:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/aov.R:36:lmcall[[1L]] <- quote(stats::lm)
> ./stats/R/bartlett.test.R:86:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/cor.test.R:213:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/factanal.R:73:mf[[1L]] <- quote(stats::model.frame)
> ./stats/R/friedman.test.R:92:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/ftable.R:150:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/glm.R:52:mf[[1L]] <- quote(stats::model.frame)
> ./stats/R/glm.R:863:fcall[[1L]] <- quote(stats::glm)
> ./stats/R/lm.R:34:mf[[1L]] <- quote(stats::model.frame)
> ./stats/R/lm.R:546:fcall[[1L]] <- quote(stats::model.frame)
> ./stats/R/loess.R:34:mf[[1L]] <- quote(stats::model.frame)
> ./stats/R/manova.R:22:fcall[[1L]] <- quote(stats::aov)
> ./stats/R/model.tables.R:485:fcall <- c(list(quote(stats::model.frame)), 
> args)
> ./stats/R/nls.R:570:mf[[1L]] <- quote(stats::model.frame)
> ./stats/R/ppr.R:30:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/prcomp.R:69:mf[[1L]] <- quote(stats::model.frame)
> ./stats/R/princomp.R:30:mf[[1L]] <- quote(stats::model.frame)
> ./stats/R/quade.test.R:102:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/spectrum.R:220:m[[1L]] <- quote(stats::plot.spec.coherency)
> ./stats/R/spectrum.R:226:m[[1L]] <- quote(stats::plot.spec.phase)
> ./stats/R/t.test.R:141:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/ts.R:744:m[[1L]] <- quote(stats::window)
> ./stats/R/var.test.R:97:m[[1L]] <- quote(stats::model.frame)
> ./stats/R/xtabs.R:40:m[[1L]] <- quote(stats::model.frame)
>
>
> > Best, Göran
>
> > On 2020-06-08 21:42, Ben Bolker wrote:
> >> I think quote(survival::coxph) will work in place of as.name() ?
> >>
> >> On Mon, Jun 8, 2020 at 3:12 PM Göran Broström  
> wrote:
> >>>
> >>> Hello,
> >>>
> >>> the function 'coxreg' in my package 'eha' is often just a wrapper for
> >>> 'coxph' in survival, so I have code like
> >>>
> >>> if (cox.ph){
> >>> Call <- match.call()
> >>> Call[[1]] <- as.name("coxph")
> >>> fit <- eval.parent(Call)
> >>> return(fit)
> >>> }
> >>>
> >>> which works since eha depends on survival. Now I am thinking of 
> changing
> >>> Depends to Imports, and I would like to change the code to
> >>>
> >>> if (cox.ph){
> >>> Call <- match.call()
> >>> Call[[1]] <- as.name("survival::coxph")
> >>> fit <- eval.parent(Call)
> >>> return(fit)
> >>> }
> >>>
> >>> but this doesn't work, because the function name is turned into
> >>> `survival::coxph` (with the backticks) and the evaluation fails.
> >>>
> >>> How can I solve this?
> >>>
> >>> Thanks, G,
> >>>
> >>> __
> >>> R-package-devel@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> > __
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] R_MAKEVARS_USER fail to pass down to sub-makes?

2020-06-01 Thread Jan Gorecki

What is the problem exactly? variable name is hardcoded, and variable
value is hardcoded as well.
How it is possible for the second `system` call to deliver different
makevars file than the first one?
This is the problem in question.

Yes, I haven't thought about Windows. I should have mentioned I am on Linux.

On Mon, Jun 1, 2020 at 7:20 PM Jeff Newmiller  wrote:
>
> Each call to system is independent, so it definitely is a problem.
>
> Use Sys.setenv to make changes in environment variables that can be used 
> within system calls.
>
> Bash is not involved with the system call on Windows... so your syntax for 
> setting an environment variable is not portable.
>
> On June 1, 2020 11:10:41 AM PDT, Jan Gorecki  wrote:
> >Thank you Jeff for your comments.
> >Yet they does not seem to be related.
> >a) Environment variable is created inside `system` command, so env var
> >stays valid for the command. Which is confirmed in the first call that
> >properly shows CFLAGS.
> >b) Syntax passed checkbashisms so I expect no problems due to that.
> >
> >On Mon, Jun 1, 2020 at 4:03 PM Jeff Newmiller
> > wrote:
> >>
> >> I don't know anything about the function of that environment
> >variable, but
> >>
> >> a) system() invokes a child process so environment variable changes
> >made using it will only affect the child process created by that system
> >call.
> >>
> >> b) The syntax you have used is shell-specific, so does not look
> >portable.
> >>
> >> On June 1, 2020 4:58:19 AM PDT, Jan Gorecki 
> >wrote:
> >> >Hi package devel support,
> >> >
> >> >I am trying to use R_MAKEVARS_USER to customize build, rather than
> >> >.R/Makevars. It is properly displayed from config CFLAGS but during
> >> >package install it doesn't seem to work.
> >> >
> >> >In R-admin in "6.3.3 Customizing package compilation" there is:
> >> >
> >> >> Note that these mechanisms do not work with packages which fail to
> >> >pass settings down to sub-makes, perhaps reading etc/Makeconf in
> >> >makefiles in subdirectories.
> >> >
> >> >It seems that it applies to me. How should I debug that? to make
> >this
> >> >env var respected? Note that my pkg has src/Makevars to handle
> >openmp
> >> >switch nicely
> >> >Thank you
> >> >
> >> >system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD config
> >CFLAGS")
> >> >-O3
> >> >
> >> >system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD INSTALL
> >> >--library=library/gcc/O3 mypkg_1.0.0.tar.gz")
> >> >* installing *source* package 'mypkg' ...
> >> >** using staged installation
> >> >** libs
> >> >gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG-fopenmp -fpic
> >-g
> >> >-O2 -fdebug-prefix-map=/build/r-base-V28x5H/r-base-3.6.3=.
> >> >-fstack-protector-strong -Wformat -Werror=format-security
> >-Wdate-time
> >> >-D_FORTIFY_SOURCE=2 -g  -c assign.c -o assign.o
> >> >
> >> >__
> >> >R-package-devel@r-project.org mailing list
> >> >https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >>
> >> --
> >> Sent from my phone. Please excuse my brevity.
>
> --
> Sent from my phone. Please excuse my brevity.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [R-pkg-devel] R_MAKEVARS_USER fail to pass down to sub-makes?

2020-06-01 Thread Jan Gorecki

Thank you Jeff for your comments.
Yet they does not seem to be related.
a) Environment variable is created inside `system` command, so env var
stays valid for the command. Which is confirmed in the first call that
properly shows CFLAGS.
b) Syntax passed checkbashisms so I expect no problems due to that.

On Mon, Jun 1, 2020 at 4:03 PM Jeff Newmiller  wrote:
>
> I don't know anything about the function of that environment variable, but
>
> a) system() invokes a child process so environment variable changes made 
> using it will only affect the child process created by that system call.
>
> b) The syntax you have used is shell-specific, so does not look portable.
>
> On June 1, 2020 4:58:19 AM PDT, Jan Gorecki  wrote:
> >Hi package devel support,
> >
> >I am trying to use R_MAKEVARS_USER to customize build, rather than
> >.R/Makevars. It is properly displayed from config CFLAGS but during
> >package install it doesn't seem to work.
> >
> >In R-admin in "6.3.3 Customizing package compilation" there is:
> >
> >> Note that these mechanisms do not work with packages which fail to
> >pass settings down to sub-makes, perhaps reading etc/Makeconf in
> >makefiles in subdirectories.
> >
> >It seems that it applies to me. How should I debug that? to make this
> >env var respected? Note that my pkg has src/Makevars to handle openmp
> >switch nicely
> >Thank you
> >
> >system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD config CFLAGS")
> >-O3
> >
> >system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD INSTALL
> >--library=library/gcc/O3 mypkg_1.0.0.tar.gz")
> >* installing *source* package 'mypkg' ...
> >** using staged installation
> >** libs
> >gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG-fopenmp -fpic  -g
> >-O2 -fdebug-prefix-map=/build/r-base-V28x5H/r-base-3.6.3=.
> >-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
> >-D_FORTIFY_SOURCE=2 -g  -c assign.c -o assign.o
> >
> >__
> >R-package-devel@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> --
> Sent from my phone. Please excuse my brevity.

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

[R-pkg-devel] R_MAKEVARS_USER fail to pass down to sub-makes?

2020-06-01 Thread Jan Gorecki

Hi package devel support,

I am trying to use R_MAKEVARS_USER to customize build, rather than
.R/Makevars. It is properly displayed from config CFLAGS but during
package install it doesn't seem to work.

In R-admin in "6.3.3 Customizing package compilation" there is:

> Note that these mechanisms do not work with packages which fail to pass 
> settings down to sub-makes, perhaps reading etc/Makeconf in makefiles in 
> subdirectories.

It seems that it applies to me. How should I debug that? to make this
env var respected? Note that my pkg has src/Makevars to handle openmp
switch nicely
Thank you

system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD config CFLAGS")
-O3

system("R_MAKEVARS_USER=library/gcc/O3/Makevars R CMD INSTALL
--library=library/gcc/O3 mypkg_1.0.0.tar.gz")
* installing *source* package 'mypkg' ...
** using staged installation
** libs
gcc -std=gnu99 -I"/usr/share/R/include" -DNDEBUG-fopenmp -fpic  -g
-O2 -fdebug-prefix-map=/build/r-base-V28x5H/r-base-3.6.3=.
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
-D_FORTIFY_SOURCE=2 -g  -c assign.c -o assign.o

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel

Re: [Rd] order function called on a data.frame?

2020-05-31 Thread Jan Gorecki

So maybe for now just warning/error?
Should be much smaller change then those proposed by William and Michael.

Rui,
Your example of order list does raise error, but if you remove second
argument, it won't raise error anymore.

On Mon, May 18, 2020 at 5:27 PM William Dunlap via R-devel
 wrote:
>
> do.call(order, df).  ->  do.call(order, unname(df)).
>
> While you are looking at order(), it would be nice if ';decreasing' could
> be a vector the the length of list(...) so you could ask to sort some
> columns in increasing order and some decreasing.  I thought I put this on
> bugzilla eons ago, but perhaps not.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Mon, May 18, 2020 at 8:52 AM Michael Lawrence via R-devel <
> r-devel@r-project.org> wrote:
>
> > I guess we could make it do the equivalent of do.call(order, df).
> >
> > On Mon, May 18, 2020 at 8:32 AM Rui Barradas  wrote:
> > >
> > > Hello,
> > >
> > > There is a result with lists? I am getting
> > >
> > >
> > > order(list(letters, 1:26))
> > > #Error in order(list(letters, 1:26)) :
> > > #  unimplemented type 'list' in 'orderVector1'
> > >
> > > order(data.frame(letters, 1:26))
> > > # [1] 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
> > > #[22] 48 49 50 51 52  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
> > > #[43] 17 18 19 20 21 22 23 24 25 26
> > >
> > >
> > > And I agree that order with data.frames should give a warning. The
> > > result is indeed useless:
> > >
> > > data.frame(letters, 1:26)[order(data.frame(letters, 1:26)), ]
> > >
> > >
> > > Hope this helps,
> > >
> > > Rui Barradas
> > >
> > >
> > > Às 00:19 de 18/05/20, Jan Gorecki escreveu:
> > > > Hi,
> > > > base::order main input arguments are defined as:
> > > >
> > > > a sequence of numeric, complex, character or logical vectors, all of
> > > > the same length, or a classed R object
> > > >
> > > > When passing a list or a data.frame, the resuts seems to be a bit
> > > > useless. Shouldn't that raise an error, or at least warning?
> > > >
> > > > Best Regards,
> > > > Jan Gorecki
> > > >
> > > > __
> > > > R-devel@r-project.org mailing list
> > > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > > >
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
> >
> > --
> > Michael Lawrence
> > Senior Scientist, Data Science and Statistical Computing
> > Genentech, A Member of the Roche Group
> > Office +1 (650) 225-7760
> > micha...@gene.com
> >
> > Join Genentech on LinkedIn | Twitter | Facebook | Instagram | YouTube
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] base::order breaking change in R-devel

2020-05-23 Thread Jan Gorecki

Hi R developers,
There seems to be breaking change in base::order on Windows in
R-devel. Code below yields different results on R 4.0.0 and R-devel
(2020-05-22 r78545). I haven't found any info about that change in
NEWS. Was the change intentional?

Sys.setlocale("LC_CTYPE","C")
Sys.setlocale("LC_COLLATE","C")
x1 = "fa\xE7ile"
Encoding(x1) = "latin1"
x2 = iconv(x1, "latin1", "UTF-8")
base::order(c(x2,x1,x1,x2))
Encoding(x2) = "unknown"
base::order(c(x2,x1,x1,x2))

# R 4.0.0
base::order(c(x2,x1,x1,x2))
#[1] 1 4 2 3
Encoding(x2) = "unknown"
base::order(c(x2,x1,x1,x2))
#[1] 2 3 1 4

# R-devel
base::order(c(x2,x1,x1,x2))
#[1] 1 2 3 4
Encoding(x2) = "unknown"
base::order(c(x2,x1,x1,x2))
#[1] 1 4 2 3

Best Regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] order function called on a data.frame?

2020-05-17 Thread Jan Gorecki

Hi,
base::order main input arguments are defined as:

a sequence of numeric, complex, character or logical vectors, all of
the same length, or a classed R object

When passing a list or a data.frame, the resuts seems to be a bit
useless. Shouldn't that raise an error, or at least warning?

Best Regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] update R Internals for R 4.0.0

2020-05-17 Thread Jan Gorecki

Hi R developers,

R ints: https://cran.r-project.org/doc/manuals/r-devel/R-ints.html

At the beginning there is
"this version is for the 3.x.y series."

I would like to kindly ask for update manual for 4.x.y series, which
would also need to cover entries about NAMED, that have been
superseded by REFCNT. There might be of course other places that needs
update, but this one is important feature of R 4.0.0.

Thank you,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] system(timeout=) may timeout with 0 exit code

2020-05-14 Thread Jan Gorecki

Hi R developers,

I observed that system(timeout=) may still return exit code 0, when
killing the process due to timeout.

In src/unix/sys-unix.c there is

#define KILL_SIGNAL1 SIGINT
#define KILL_SIGNAL2 SIGTERM
#define KILL_SIGNAL3 SIGKILL
#define EMERGENCY_TIMEOUT 20

After little bit of debugging I observed that total time of system
call is provided timeout value + 20s. That means EMERGENCY_TIMEOUT 20
kicked in, adding 20 seconds.

I don't have a reproducible example, but following code, and output
file below should be enough to ensure that there is a problem there
(exit code 0 despite timeout).

warn = NULL
p = proc.time()[[3L]]
tryCatch(
  ret <- system(cmd, timeout=timeout_s),
  warning = function(w) {
warn <<- w[["message"]]
  }
)
if (length(warn) && ret==0L)
  cat(sprintf("command '%s' timed out(?) but still exited with 0 code,
timeout %ss, took %ss, warning '%s'\n",
  cmd, timeout_s, proc.time()[[3L]]-p, warn),
file="timeout-exit-codes.out", append=TRUE)

And the content of timeout-exit-codes.out:

command '/bin/bash -c "./_launcher/solution.R > log.out 2> log.err"'
timed out(?) but still exited with 0 code, timeout 7200s, took
7220.005s, warning '/bin/bash -c "./_launcher/solution.R > log.out 2>
log.err"' timed out after 7200s'

Thank you,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] mclapply returns NULLs on MacOS when running GAM

2020-04-29 Thread Jan Gorecki

> PS. Simon, I think your explicit comment on mcparallel() & friends is
very helpful for many people and developers. It clearly tells
developers to never use mclapply() as the only path through their
code. I'm quite sure not everyone has been or is aware of this. Now
it's clear. Thank you.

I second that, IMO that should land somewhere in manual.

On Wed, Apr 29, 2020 at 6:40 AM Henrik Bengtsson
 wrote:
>
> On Tue, Apr 28, 2020 at 9:00 PM Shian Su  wrote:
> >
> > Thanks Simon,
> >
> > I will take note of the sensible default for core usage. I’m trying to 
> > achieve small scale parallelism, where tasks take 1-5 seconds and make 
> > fuller use of consumer hardware. Its not a HPC-worthy computation but even 
> > laptops these days come with 4 cores and I don’t see a reason to not make 
> > use of it.
> >
> > The goal for the current piece of code I’m working on is to bootstrap many 
> > smoothing fits to generate prediction intervals, this is quite easy to 
> > write using mclapply. When you say native with threads, OpenMP, etc… are 
> > you referring to at the C/C++ level? From my understanding most parallel 
> > packages in R end up calling multicore or snow deep down.
> >
> > I think one of the great advantages of mclapply is that it defaults to 
> > lapply when running on a single thread, this makes it much easier to 
> > maintain code with optional parallelism. I’m already running into trouble 
> > with the fact that PSOCK doesn’t seem to retain loaded packages in spawned 
> > processes. I would love to know if there reliable options in R that allow a 
> > similar interface to mclapply but use a different and more RStudio-stable 
> > mode of parallelisation?
>
> If you use parLapply(cl, ...) and gives the end-users the control over
> the cluster 'cl' object (e.g. via an argument), then they have the
> option to choose from the different types of clusters that cl <-
> parallel::makeCluster(...) can create, notably PSOCK, FORK and MPI
> cluster but the framework support others.
>
> The 'foreach' framework takes this separation of *what* to parallelize
> (which you decide as a developer) and *how* to parallel (which the
> end-user decides) further by so called foreach adaptors aka parallel
> backends.  With foreach, users have plently of doNnn packages to pick
> from, doMC, doParallel, doMPI, doSnow, doRedis, and doFuture.  Several
> of these parallel backends build on top of the core functions provided
> by the 'parallel' package.  So, with foreach your users can use forked
> parallel processing if they want and, or something else (selected at
> the top of their script).
>
> (Disclaimer: I'm the author) The 'future' framework tries to take this
> developer-end-user separation one step further and with a lower level
> API - future(), value(), resolved() - for which different parallel
> backends have been implemented, e.g. multicore, multisession
> ("PSOCK"), cluster (any parallel::makeCluster() cluster), callr,
> batchtools (HPC job schedulers), etc.  All these have been tested to
> conform to the Future API specs, so we know our parallel code works
> regardless of which of these backends the user picks.  Now, based on
> these basic future low-level functions, other higher level APIs have
> been implemented.  For instance, the future.apply packages provides
> futurized version of all base R apply functions, e.g. future_lapply(),
> future_vapply(), future_Map(), etc.  You can basically take you
> lapply(...) code and replace it with future_lapply(...) and things
> will just work.  So, try replacing your current mclapply() with
> future_lapply().  If you/the user uses the 'multicore' backend - set
> by plan(multicore) at top of script, you'll get basically what
> mclapply() provides.  If plan(multisession) is used, the you basically
> get what parLapply() does.  The difference is that you don't have to
> worry about globals and packages.  If you like the foreach-style of
> map-reduce, you can use futures via the doFuture backend.  If you like
> the purrr-style of map-reduce, you can use the 'furrr' package.  So,
> and I'm obviously biased, if you pick the future framework, you'll
> leave yourself and end-users with more options going forward.
>
> Clear as mud?
>
> /Henrik
>
> PS. Simon, I think your explicit comment on mcparallel() & friends is
> very helpful for many people and developers. It clearly tells
> developers to never use mclapply() as the only path through their
> code. I'm quite sure not everyone has been or is aware of this. Now
> it's clear. Thank you.
>
> >
> > Thanks,
> > Shian
> >
> > > On 29 Apr 2020, at 1:33 pm, Simon Urbanek  
> > > wrote:
> > >
> > > Do NOT use mcparallel() in packages except as a non-default option that 
> > > user can set for the reasons Henrik explained. Multicore is intended for 
> > > HPC applications that need to use many cores for computing-heavy jobs, 
> > > but it does not play well with RStudio and more importantly you don't 
> > > know the resource

Re: [Rd] Demo for linking native routines between R packages

2020-04-17 Thread Jan Gorecki

Dirk, Thank you for a comprehensive set of resources on that.

Yet, I think the proposal here make sense.
Packages you mentioned are real-life package. It would be way easier
to learn from a package that is meant to only show this single thing.
For the same reason I think it also make sense to have a "hello world
from C" package linked from WRE. All those native routines
registration, the proper way, is not really that obvious. It would be
much easier to learn from a package that doesn't have any other logic.

Best wishes,
Jan Gorecki

On Fri, Apr 17, 2020 at 3:32 PM Zhang, Jitao David via R-devel
 wrote:
>
> Dear Davis and Dirk,
>
> Thank you very much for the suggestions, which are very valuable and
> helpful.
>
> I will add references to prior examples, document my project with the clear
> step-by-step-style document of Davis's project, and come back again to the
> mailing list.
>
> Best wishes,
> David
>
> On Fri, Apr 17, 2020 at 3:40 PM Dirk Eddelbuettel  wrote:
>
> >
> > Jitao,
> >
> > Thanks for writing this up.
> >
> > You could add a section on 'prior art' and references.  The canonical
> > example
> > always was (c.f. Writing R Extensions)
> >
> >   lme4 <-> Matrix
> >
> > which was followed early by the CRAN packages
> >
> >   zoo <-> xts
> >
> > upon which I built
> >
> >   xts <-> RcppXts
> >
> > with a write-up (from 2013 !!) here:
> > https://gallery.rcpp.org/articles/accessing-xts-api/
> >
> > Via private mail, I helped then-maintainer Vincent connect expm:
> >
> >   expm <-> Matrix
> >
> > and built two packages on CRAN _for the very purpose of exporting API
> > functions to be called_ (which in both cases are from base R as R Core is
> > very careful not get tied into exporting APIs, which is both understandable
> > and a source of added difficulty for us package writers)
> >
> >   RApiDatetime
> >   RApiSerialize
> >
> > The latter one is use by my RcppRedis package, Travers' very nice qs
> > package
> > and Tim's rpg package.
> >
> > To my reading, the R Community is drifting more and more towards collective
> > amnesia where prior work is (pick any one the following)
> >
> >  - ignored altogether
> >  - reinvented by another package
> >  - shadowed by another package
> >
> > rather than extended, improved and/or cited.  That is a collective loss for
> > all of us. It would be nice if you could stear back a little and reference
> > prior related work. My apologies to other packages in this area I have not
> > listed. We really should have a common reference for this.
> >
> > Cheers, Dirk
> >
> > --
> > http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> >
>
>
> --
>
> *Dr. Jitao David Zhang | 张继涛 | A Computational Biologist in Drug Discovery*
>
> *Building 93/3.38, **Tel +41 61 688 62 51*
>
> *Roche Pharmaceutical Research and Early Development
> (pRED) | Pharmaceutical Sciences, BiOmics, BEDA (see 
> http://**go.roche.com/BEDA
> <http://go.roche.com/BEDA>**) | Roche Innovation Center Basel | F.
> Hoffmann-La-Roche AG | CH-4070 Basel | Switzerland*
> *Core working hours - No Meetings: Mo/8:30-16:00; Tu/8:30-17:00;
> We/8:30-16:00; Th/9:00-11:30*
> *Available for meetings: Mo/16:00-17:00; We/16:00-17:00**; Th/11:00-17:00;
> Fr/8:00-10:00*
>
> Confidentiality Note: This message is intended only for ...{{dropped:13}}
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Add a new environment variable switch for the 'large version' check

2020-04-16 Thread Jan Gorecki

For the same reason, handling false positive in CRAN checks, there are
other places that could be improved.
Like "size of tarball" NOTE.
If one could control this size with an environment variable. Similarly
to the proposal made by Jim. It would be useful as well.

On Thu, Apr 16, 2020 at 5:06 PM Henrik Bengtsson
 wrote:
>
> I'd second Jim's feature request - it would be useful to be able to
> disable this in CI and elsewhere.The concept of using an "unusual"
> version component such as a very large number does a nice job of
> indicating "unusual" and serves as a blocker for submitting
> work-in-progress to CRAN by mistake (hence the validation in 'R CMD
> check').
>
> Another point, which I don't think Jim made, is that this would make
> it possible to run R CMD check --as-cran on your work-in-progress and
> get all OKs.  This in turn would allow us to trigger a non-zero exit
> status also for NOTEs (not just ERRORs and WARNINGs).  Currently, the
> warning on -9000 is a false positive in this sense.  This will allow
> developers to be more conservative without risking to treat NOTEs as
> something to expect as normal.  CI services are typically configured
> to alert the developer on ERRORs and WARNINGs but, AFAIK, not on
> NOTEs.
>
> On the topic of unusual version numbers: I'd like to suggest that
> CRAN(*) makes an unusual version bump whenever they orphan a package,
> e.g. to suffix -1. CRAN already updates/modifies the package
> tarball for orphaned packages by setting 'Maintainer: ORPHANED' in the
> DESCRIPTION file. By also bumping the version of orphaned packages it
> would it stand out in sessionInfo(), which helps in troubleshooting
> and bug reports, etc.  But more importantly, the most recent stable
> CRAN release remain untouched, which I think has a value by itself for
> scientific purposes.
>
> /Henrik
>
> (*) Yes, I should email CRAN about this, but I think it's worth
> vetting it here first.
>
> On Thu, Apr 16, 2020 at 7:44 AM Dirk Eddelbuettel  wrote:
> >
> >
> > Or you use a fourth component to signal a development version as Rcpp has
> > done for years (and, IIRC, for longer than devtools et al used '9000').
> >
> > There is no functional difference between 1.2.3.1 and 1.2.3.9000. They are
> > both larger than 1.2.3 (in the package_version() sense) and signal an
> > intermediate version between 1.2.3 and 1.2.4.
> >
> > But one requires a patch. ¯\_(ツ)_/¯.
> >
> > Dirk
> >
> > --
> > http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] is.vector could handle AsIs class better

2020-03-30 Thread Jan Gorecki

Thank you Gabriel,
Agree, although I think that could be relaxed in this single case and
AsIs class could be ignored.
Best,
Jan

On Sun, Mar 29, 2020 at 7:09 PM Gabriel Becker  wrote:
>
> Jan,
>
> I believe it's because it has "a non-NULL attribute other than names" as per 
> the documentation. In this case its class of "AsIs".
>
> Best,
> ~G
>
> On Sun, Mar 29, 2020 at 6:29 AM Jan Gorecki  wrote:
>>
>> Dear R-devel,
>>
>> AsIs class seems to be well handled by `typeof` and `mode` function.
>> Those two functions are being referred when explaining `is.vector`
>> behaviour in manual. Yet `is.vector` does not seem to be handling AsIs
>> class the same way.
>>
>> is.vector(1L)
>> #[1] TRUE
>> is.vector(I(1L))
>> #[1] FALSE
>>
>> Is there any reason behind this behaviour?
>> Could we have it supported so AsIs class is ignored when `is.vector`
>> is doing its job?
>>
>> Best Regards,
>> Jan Gorecki
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] is.vector could handle AsIs class better

2020-03-29 Thread Jan Gorecki

Dear R-devel,

AsIs class seems to be well handled by `typeof` and `mode` function.
Those two functions are being referred when explaining `is.vector`
behaviour in manual. Yet `is.vector` does not seem to be handling AsIs
class the same way.

is.vector(1L)
#[1] TRUE
is.vector(I(1L))
#[1] FALSE

Is there any reason behind this behaviour?
Could we have it supported so AsIs class is ignored when `is.vector`
is doing its job?

Best Regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] substitute inconsistent output

2020-03-17 Thread Jan Gorecki

Dear R-devel,

Is there anything that we can do to make output of those call more
consistent? So the first one will return `c(1L, 2L)` rather than
`1:2`. Note that it is not related to compact integer sequence
introduced by altrep, it is reproducible on R 3.1.0 as well.

substitute(v+x, list(x=c(1L,2L)))
#v + 1:2
substitute(v+x, list(x=c(0L,2L)))
#v + c(0L, 2L)

Thank you,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] new bquote feature splice does not address a common LISP @ use case?

2020-03-17 Thread Jan Gorecki

Thank you Lionel for comprehensive explanation. I think that rotating
AST in base R is not a good way to go, it would probably complicate
the code heavily.

Best,
Jan Gorecki

On Tue, Mar 17, 2020 at 4:49 PM Lionel Henry  wrote:
>
> Hi Jan,
>
> In the lisp code you provide the operators are parsed as simple
> symbols in a pairlist. In the R snippet, they are parsed as
> left-associative binary operators of equal precedence. If you unquote
> a call in the right-hand side, you're artificially bypassing the
> left-associativity of these operators.
>
> To achieve what you're looking for in a general way, you'll need a
> more precise definition of the problem, and a solution that probably
> involves rotating the AST accordingly (see
> https://github.com/r-lib/rlang/blob/master/src/internal/expr-interp-rotate.c).
> Maybe it could be possible to formulate a definition where splicing in
> special calls like binary operators produces the same AST as the user
> would type by hand. It seems this would make splicing easier to use
> for end users, but make the metaprogramming model more complex for
> experts. This is an interesting perspective though. It also seems
> vaguely connected to the problem of splicing within model formulas.
>
> I see in your example that the new ..() operator in `bquote()` allows
> splicing calls, and seems to unquote them instead of splicing. In the
> first versions of rlang, splicing with !!! behaved just like this. We
> changed this behaviour last year and I would like to share the
> motivations behind this decision, as it might be helpful to inform the
> semantics of ..() in bquote() in R 4.0.
>
> The bottom line is that calls are now treated like scalars. This is a
> slight contortion of the syntax because calls are "language lists",
> and so they could be conceived as collections rather than scalars.
> However, R is vector-oriented rather than pairlist-oriented, and
> treating calls as scalars makes the metaprogramming model simpler.
>
> This is also how `bquote(splice = TRUE)` works. However `bquote()`
> and rlang do not treat scalars in the same way. In rlang scalars
> cannot be spliced, they must be unquoted.
>
> ```
> bquote(foo(..(function() NULL)), splice = TRUE)
> #> foo(function() NULL)
>
> bquote(foo(..(quote(bar))), splice = TRUE)
> #> foo(bar)
>
> expr(foo(!!!function() NULL))
> #> Error: Can't splice an object of type `closure` because it is not a vector.
>
> expr(foo(!!!quote(bar)))
> #> foo(bar)
> #> Warning message:
> #> Unquoting language objects with `!!!` is deprecated as of rlang 0.4.0.
> #> Please use `!!` instead.
> ```
>
> We decided to disallow splicing scalars (and thus calls) in rlang even
> though this is a legal operation in many lisps. In lisps, the splicing
> operation stands for unquoting in the CDR of a pairlist. By contrast
> the unquote operation unquotes in the CAR. For example `(1 ,@3) is
> legal in Common Lisp and stands for the cons cell (1 . 3). I think
> such semantics are not appealing in a language like R because it is
> vector-oriented rather than pairlist oriented. Pairlists are mostly an
> implicit data structure that users are not familiar with, and they are
> not even fully supported in all implementations of R (for instance
> TERR and Renjin do not allow non-NULL terminated pairlists, and while
> GNU R has vestigial print() support for these, they cause str() to crash).
>
> In general, it is much more useful to define a splice operation that
> also works for vectors:
>
> ```
> rlang::list2(1, !!!10:11, 3)
> #> [[1]]
> #> [1] 1
> #>
> #> [[2]]
> #> [1] 10
> #>
> #> [[3]]
> #> [1] 11
> #>
> #> [[4]]
> #> [1] 3
> ```
>
> Because vectors do not have any notion of CDR, the usual lisp
> interpretation of splicing scalars does not apply.
>
> One alternative to make it work is to devolve the splicing operation
> into a simple unquote operation, when supplied a scalar. This is how
> `bquote(splice = TRUE)` works. However I think this kind of
> overloading is more confusing in the long run, and makes it harder for
> users to form a correct mental model for programming with these
> operations. For this reason it seems preferable to force users to be
> explicit about the desired semantics with scalars and calls. In rlang
> they must either unquote the call, or explicitly transform it to a
> list prior to splicing:
>
> ```
> x <- quote(bar + baz)
>
> # Unquote instead of splicing
> expr(foo(!!x))
> #> foo(bar + baz)
>
> # Convert to list and then splice
> expr(add(!!!as.list(x[-1])))
> #> add(bar, baz)
> ```
>
> Unquoting could be consistent if all objects

[Rd] new bquote feature splice does not address a common LISP @ use case?

2020-03-16 Thread Jan Gorecki

Dear R-devel,

There is a new feature in R-devel, which explicitly refers to LISP @
operator for splicing.

> The backquote function bquote() has a new argument splice to enable splicing 
> a computed list of values into an expression, like ,@ in LISP's backquote.

Although the most upvoted SO question asking for exactly LISP's @
functionality in R doesn't seems to be addressed by this new feature.

Is it possible to use new splice feature to create `6 - 5 + 4`
expression rather than `6 - (5 + 4)`?

b = quote(5+4)
b
#5 + 4
c = bquote(6-.(b))
c
#6 - (5 + 4)
d = bquote(6-..(b), splice=TRUE)
d
#6 - (5 + 4)

There is corresponding LISP code provided

CL-USER>
(setf b `(5 + 4))
(5 + 4)
CL-USER>
(setf c `(6 - ,@b))
(6 - 5 + 4)
CL-USER>
(setf c-non-spliced `(6 - ,b))
(6 - (5 + 4))
CL-USER>

Thanks,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] document base::`[[.data.frame`

2020-02-04 Thread Jan Gorecki

Dear R-devel,

Looking at source of base::`[[.data.frame` we can see a mysterious
handling of subsetting rows and columns at the same time. That does
not seems to be dcoumented anywhere in `[[`. Could we extend
documentation for such data.frame use cases? Or if it is meant to be
used only internally, then maybe better to put that into own internal
function?

Best regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] A bug understanding F relative to FALSE?

2020-01-23 Thread Jan Gorecki

I agree it is not good to have those symbols used in packages, but I
found to use those quite often in my developement workflow, which is
something like:

$ R -q
> cc(F)

where `cc` is my function to rebuild application to recent state. I
use it really often. Having to type FALSE every time makes this
workflow relatively longer.
IMO it would be best to warn about T/F symbols during package check,
but not necessarily removing them globally.

On Fri, Jan 17, 2020 at 4:01 PM Joris Meys  wrote:
>
> As others have pointed out, this is expected behaviour. Let me get on that
> hill I'll die on: it is absolutely not suitable. It is way beyond time to
> remove T and F as unprotected kind-of-synonyms for TRUE and FALSE, given
> the amount of times I had to point out that:
>
> T <- t(matrix(0:3,nrow=2))
> isTRUE(T)
>
> was the reason the code didn't do what it's supposed to do. (Also don't use
> T as short for "Transpose of my matrix", but that's another hill.)
>
> As we've become more strict on the use of T and F in packages, maybe 4.0.0
> is a good milestone to finally drop this relic from the past? One can
> dream...
>
> Kind regards
> Joris
>
> On Wed, Jan 15, 2020 at 3:14 PM IAGO GINÉ VÁZQUEZ  wrote:
>
> > Hi all,
> >
> > Is the next behaviour suitable?
> >
> > identical(F,FALSE)
> >
> > ## [1] TRUE
> >
> > utils::getParseData(parse(text = "c(F,FALSE)", keep.so=rce = TRUE))
> >
> > ##line1 col1 line2 col2 id parenttoken terminal  text
> > ## 14 11 1   10 14  0 exprFALSE
> > ## 1  11 11  1  3 SYMBOL_FUNCTION_CALL TRUE c
> > ## 3  11 11  3 14 exprFALSE
> > ## 2  12 12  2 14  '(' TRUE (
> > ## 4  13 13  4  6   SYMBOL TRUE F
> > ## 6  13 13  6 14 exprFALSE
> > ## 5  14 14  5 14  ',' TRUE ,
> > ## 9  15 19  9 10NUM_CONST TRUE FALSE
> > ## 10 15 19 10 14 exprFALSE
> > ## 11 1   10 1   10 11 14  ')' TRUE )
> >
> > I would expect that token for F is the same as token for FALSE.
> >
> >
> > Thank you!
> >
> > Iago
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
> 
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] New R function is.nana = is.na & !is.nan

2020-01-01 Thread Jan Gorecki

"nana" is meant to express "NA, really NA".
Your suggestion sounds good.

On Thu 2 Jan, 2020, 3:38 AM Pages, Herve,  wrote:

> Happy New Year everybody!
>
> The name (is.nana) doesn't make much sense to me. Can you explain it?
>
> One alternative would be to add an extra argument (e.g. 'strict') to
> is.na(). FALSE by default, and ignored (with or w/o a warning) when the
> type of 'x' is not "numeric".
>
> H.
>
>
> On 12/31/19 22:16, Jan Gorecki wrote:
> > Hello R-devel,
> >
> > Best wishes in the new year. I am writing to kindly request new R
> > function so NA_real_ can be more easily detected.
> > Currently if one wants to test for NA_real_ (but not NaN) then extra
> > work has to be done: `is.na(x) & !is.nan(x)`
> > Required functionality is already at C level so to address my request
> > there is not that much to do.
> > Kevin Ushey made a nice summary of current R C api in:
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__stackoverflow.com_a_26262984_2490497=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=oWT1wDiy7pogVlJHGtdOoM3zdB45t9zZVyGYU8qcOgo=zFj3lh-N_YlNBRlDKeO-aTs0Bf2qtWLUHKlw_nh2Q4o=
> > Pasting related part below, extra row added by me is a requested feature.
> >
> >  +-+
> >  | C fun| NaN | NA | R fun
> >  +-+
> >  | ISNAN|  t  | t  | is.na
> >  | R_IsNaN  |  t  | f  | is.nan
> >  | ISNA |  f  | t  | is.na && !is.nan
> >  | R_IsNA   |  f  | t  | is.na && !is.nan
> >  +-+
> >  +-+
> >  | R fun| NaN | NA | C fun
> >  +-+
> >  | is.na|  t  | t  | ISNAN
> >  | is.nan   |  t  | f  | R_IsNaN
> >  +-+
> >  | is.nana  |  f  | t  | R_IsNA
> >  +-+
> >
> > Strictly speaking, I am asking for a new R function:
> >
> >  is.nana <- function(x) if (typeof(x)=="numeric")
> > .Primitive("is.nana") else .Primitive("is.na")
> >
> > Then probably a copy of C function `do_isnan` as `do_isnana` with a
> > minor change from `R_IsNaN` to `R_IsNA`.
> >
> > Best,
> > Jan Gorecki
> >
> > __
> > R-devel@r-project.org mailing list
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Ddevel=DwICAg=eRAMFD45gAfqt84VtBcfhQ=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA=oWT1wDiy7pogVlJHGtdOoM3zdB45t9zZVyGYU8qcOgo=tCCxZtQj30QrtAYORMODT-OnjeKiXxiF0qlZtgyj1Mc=
> >
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] New R function is.nana = is.na & !is.nan

2019-12-31 Thread Jan Gorecki

Hello R-devel,

Best wishes in the new year. I am writing to kindly request new R
function so NA_real_ can be more easily detected.
Currently if one wants to test for NA_real_ (but not NaN) then extra
work has to be done: `is.na(x) & !is.nan(x)`
Required functionality is already at C level so to address my request
there is not that much to do.
Kevin Ushey made a nice summary of current R C api in:
https://stackoverflow.com/a/26262984/2490497
Pasting related part below, extra row added by me is a requested feature.

+-+
| C fun| NaN | NA | R fun
+-+
| ISNAN|  t  | t  | is.na
| R_IsNaN  |  t  | f  | is.nan
| ISNA |  f  | t  | is.na && !is.nan
| R_IsNA   |  f  | t  | is.na && !is.nan
+-+
+-+
| R fun| NaN | NA | C fun
+-+
| is.na|  t  | t  | ISNAN
| is.nan   |  t  | f  | R_IsNaN
+-+
| is.nana  |  f  | t  | R_IsNA
+-+

Strictly speaking, I am asking for a new R function:

is.nana <- function(x) if (typeof(x)=="numeric")
.Primitive("is.nana") else .Primitive("is.na")

Then probably a copy of C function `do_isnan` as `do_isnana` with a
minor change from `R_IsNaN` to `R_IsNA`.

Best,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] head/tail breaking change

2019-12-18 Thread Jan Gorecki

Thank you Gabriel,
I agree that new behaviour makes much more sense. Just wanted to confirm
before resolving compatibility of my unit tests.
Best,
Jan

On Wed 18 Dec, 2019, 10:46 PM Gabriel Becker,  wrote:

> Jan,
>
> That is an intentional change as you can see in the documentation for
> head/tail in R-devel. Last time I discussed it with Martin, this behavior
> was desired and thus is unlikely to change unless "our" (ie his) mind does.
>
> The hope is that the new behavior is actually what people would want (note
> it already behaves this way for data.frames and for matrices, which are now
> explicitly array objects with 2 dimensions as well as classed as matrices,
> so its more consistent now, and more reasonable for the object).
>
> Best,
> ~G
>
> On Wed, Dec 18, 2019 at 2:44 AM Jan Gorecki  wrote:
>
>> Hi R-devel community,
>>
>> I am aware of changes in R-devel in head/tail methods but I was not
>> expecting that to be a breaking change.
>>
>> # R 3.6.1
>> ar = array(1:27, c(3,3,3))
>> tail(ar, 1)
>> #[1] 27
>>
>> The current output of R-devel is something that I would expect from a
>>
>> tail(ar, c(1, Inf, Inf))
>>
>> or
>>
>> tail(ar, c(1, NA, NA))
>>
>> calls.
>> Is it going to stay like this or there are plans to mitigate this
>> breaking change?
>>
>> # R-devel 2019-12-17 r77592
>> ar = array(1:27, c(3,3,3))
>> tail(ar, 1)
>> #, , 1
>> #
>> # [,1] [,2] [,3]
>> #[3,]369
>> #
>> #, , 2
>> #
>> # [,1] [,2] [,3]
>> #[3,]   12   15   18
>> #
>> #, , 3
>> #
>> # [,1] [,2] [,3]
>> #[3,]   21   24   27
>>
>> Best,
>> Jan Gorecki
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] head/tail breaking change

2019-12-18 Thread Jan Gorecki

Hi R-devel community,

I am aware of changes in R-devel in head/tail methods but I was not
expecting that to be a breaking change.

# R 3.6.1
ar = array(1:27, c(3,3,3))
tail(ar, 1)
#[1] 27

The current output of R-devel is something that I would expect from a

tail(ar, c(1, Inf, Inf))

or

tail(ar, c(1, NA, NA))

calls.
Is it going to stay like this or there are plans to mitigate this
breaking change?

# R-devel 2019-12-17 r77592
ar = array(1:27, c(3,3,3))
tail(ar, 1)
#, , 1
#
# [,1] [,2] [,3]
#[3,]369
#
#, , 2
#
# [,1] [,2] [,3]
#[3,]   12   15   18
#
#, , 3
#
# [,1] [,2] [,3]
#[3,]   21   24   27

Best,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] data.frame could handle long vectors

2019-12-01 Thread Jan Gorecki

Dear R-devel,

It seems that data.frame class could be improved to handle long
vectors better. It fails while using `data.frame(.)` and the same
using, I believe, lower overhead, `as.data.frame(list(.)). Please find
reproducible example below, tested on 2019-12-01 r77492.

id1 = sample.int(3e9, replace=TRUE)
v1 = runif(3e9)

df = data.frame(id1=id1, v1=v1)
#Error in if (mirn && nrows[i] > 0L) { :
#  missing value where TRUE/FALSE needed
#In addition: Warning message:
#In attributes(.Data) <- c(attributes(.Data), attrib) :
#  NAs introduced by coercion to integer range

df = as.data.frame(list(id1=id1, v1=v1))
#Error in if (mirn && nrows[i] > 0L) { :
#  missing value where TRUE/FALSE needed
#In addition: Warning message:
#In attributes(.Data) <- c(attributes(.Data), attrib) :
#  NAs introduced by coercion to integer range

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] class() |--> c("matrix", "arrary") -- and S3 dispatch

2019-11-25 Thread Jan Gorecki

In case if anyone needs daily R-devel there is my build scheduled on GitLab.
As of now based on Ubuntu 16.04, R built using:
--with-recommended-packages --enable-strict-barrier
--disable-long-double
Predefined Makevars for building pkgs using: -g -O2 -Wall -pedantic
-fstack-protector-strong -D_FORTIFY_SOURCE=2

$ docker run --rm -ti registry.gitlab.com/jangorecki/dockerfiles/r-devel:latest
R Under development (unstable) (2019-11-23 r77455) -- "Unsuffered Consequences"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !")
> class(m <- diag(1))
[1] "matrix" "array"

On Mon, Nov 25, 2019 at 10:31 PM Dirk Eddelbuettel  wrote:
>
>
> On 21 November 2019 at 17:57, Martin Maechler wrote:
> | (if you use a version of R-devel, with svn rev >= 77446; which
> |  you may get as a binary for Windows in about one day; everyone
> |  else needs to compile for the sources .. or wait a bit, maybe
> |  also not much longer than one day, for a docker image) :
>
> FYI: rocker/drd [1] and rocker/r-devel both have rev 77455 now (as they are
> both on weekend auto-rebuild schedule).  The former is smaller, both should
> work to test this. Quick demo below [2].
>
> Dirk
>
> [1] This comes from 'drd == daily r-devel' but we do not build it daily.
> [2] Quick demo follows
>
> edd@rob:~$ docker run --rm -ti rocker/r-devel bash
> root@a30e4a5c89ba:/# RD
>
> R Under development (unstable) (2019-11-23 r77455) -- "Unsuffered 
> Consequences"
> Copyright (C) 2019 The R Foundation for Statistical Computing
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
>   Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> > Sys.setenv("_R_CLASS_MATRIX_ARRAY_" = "BOOH !") # ==> future R behavior
> > class(m <- diag(1))
> [1] "matrix" "array"
> >
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [External] R C api for 'inherits' S3 and S4 objects

2019-11-01 Thread Jan Gorecki

Thank you all for your valuable comments.
Best,
Jan

On Fri, Nov 1, 2019 at 8:15 PM Tierney, Luke  wrote:
>
> On Fri, 1 Nov 2019, Jan Gorecki wrote:
>
> > Thank you Luke.
> > That is why I don't use Rf_inherits but INHERITS which does not
> > allocate, provided in the email body.
>
> Your definition can allocate because STING_ELT can allocate.
> getAttrib can GC in general. Currently it would not GC or allocate in
> this case, but this could change.
>
> You can't assume thread-safety for calls into the R API, or any API
> for that matter, unless they are documented to be thread-safe.
>
> You would be better off using Rf_inherits as it does not make the
> assumption that you can use pointer comparisons to check for identical
> strings.  CHARSXPs are almost always cached but they are not
> guaranteed to be, and the caching strategy might change in the future.
>
> Best,
>
> luke
>
> > I cannot do similarly for S4 classes, thus asking for some API for that.
> >
> > On Fri, Nov 1, 2019 at 5:56 PM Tierney, Luke  wrote:
> >>
> >> On Fri, 1 Nov 2019, Jan Gorecki wrote:
> >>
> >>> Dear R developers,
> >>>
> >>> Motivated by discussion about checking inheritance of S3 and S4
> >>> objects (in head matrix/array topic) I would light to shed some light
> >>> on a minor gap about that matter in R C API.
> >>> Currently we are able to check inheritance for S3 class objects from C
> >>> in a robust way (no allocation, thread safe). This is unfortunately
> >>
> >> Your premise is not correct. Rf_inherits will not GC but it can
> >> allocate and is not thread safe.
> >>
> >> Best,
> >>
> >> luke
> >>
> >>> not possible for S4 classes. I would kindly request new function in R
> >>> C api so it can be achieved for S4 classes with no risk of allocation.
> >>> For reference mentioned functions below. Thank you.
> >>> Jan Gorecki
> >>>
> >>> // S3 inheritance
> >>> bool INHERITS(SEXP x, SEXP char_) {
> >>>  SEXP klass;
> >>>  if (isString(klass = getAttrib(x, R_ClassSymbol))) {
> >>>for (int i=0; i >>>  if (STRING_ELT(klass, i) == char_) return true;
> >>>}
> >>>  }
> >>>  return false;
> >>> }
> >>> // S4 inheritance
> >>> bool Rinherits(SEXP x, SEXP char_) {
> >>>  SEXP vec = PROTECT(ScalarString(char_));
> >>>  SEXP call = PROTECT(lang3(sym_inherits, x, vec));
> >>>  bool ans = LOGICAL(eval(call, R_GlobalEnv))[0]==1;
> >>>  UNPROTECT(2);
> >>>  return ans;
> >>> }
> >>>
> >>> __
> >>> R-devel@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >>
> >> --
> >> Luke Tierney
> >> Ralph E. Wareham Professor of Mathematical Sciences
> >> University of Iowa  Phone: 319-335-3386
> >> Department of Statistics andFax:   319-335-3017
> >> Actuarial Science
> >> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> >> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [External] R C api for 'inherits' S3 and S4 objects

2019-11-01 Thread Jan Gorecki

Thank you Luke.
That is why I don't use Rf_inherits but INHERITS which does not
allocate, provided in the email body.
I cannot do similarly for S4 classes, thus asking for some API for that.

On Fri, Nov 1, 2019 at 5:56 PM Tierney, Luke  wrote:
>
> On Fri, 1 Nov 2019, Jan Gorecki wrote:
>
> > Dear R developers,
> >
> > Motivated by discussion about checking inheritance of S3 and S4
> > objects (in head matrix/array topic) I would light to shed some light
> > on a minor gap about that matter in R C API.
> > Currently we are able to check inheritance for S3 class objects from C
> > in a robust way (no allocation, thread safe). This is unfortunately
>
> Your premise is not correct. Rf_inherits will not GC but it can
> allocate and is not thread safe.
>
> Best,
>
> luke
>
> > not possible for S4 classes. I would kindly request new function in R
> > C api so it can be achieved for S4 classes with no risk of allocation.
> > For reference mentioned functions below. Thank you.
> > Jan Gorecki
> >
> > // S3 inheritance
> > bool INHERITS(SEXP x, SEXP char_) {
> >  SEXP klass;
> >  if (isString(klass = getAttrib(x, R_ClassSymbol))) {
> >for (int i=0; i >  if (STRING_ELT(klass, i) == char_) return true;
> >}
> >  }
> >  return false;
> > }
> > // S4 inheritance
> > bool Rinherits(SEXP x, SEXP char_) {
> >  SEXP vec = PROTECT(ScalarString(char_));
> >  SEXP call = PROTECT(lang3(sym_inherits, x, vec));
> >  bool ans = LOGICAL(eval(call, R_GlobalEnv))[0]==1;
> >  UNPROTECT(2);
> >  return ans;
> > }
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] R C api for 'inherits' S3 and S4 objects

2019-11-01 Thread Jan Gorecki

Dear R developers,

Motivated by discussion about checking inheritance of S3 and S4
objects (in head matrix/array topic) I would light to shed some light
on a minor gap about that matter in R C API.
Currently we are able to check inheritance for S3 class objects from C
in a robust way (no allocation, thread safe). This is unfortunately
not possible for S4 classes. I would kindly request new function in R
C api so it can be achieved for S4 classes with no risk of allocation.
For reference mentioned functions below. Thank you.
Jan Gorecki

// S3 inheritance
bool INHERITS(SEXP x, SEXP char_) {
  SEXP klass;
  if (isString(klass = getAttrib(x, R_ClassSymbol))) {
for (int i=0; ihttps://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-10-29 Thread Jan Gorecki

Gabriel,
My view is rather radical.

- head/tail should return object having same number of dimensions
- data.frame should be a special case
- matrix should be handled as 2D array

P.S. idea of accepting `n` argument as a vector of corresponding
dimensions is a brilliant one

On Wed, Oct 30, 2019 at 1:13 AM Gabriel Becker  wrote:
>
> Hi all,
>
> So I've started working on this and I ran into something that I didn't
> know, namely that for x a multi-dimensional (2+) array, head(x) and tail(x)
> ignore dimension completely, treat x as an atomic vector, and return an
> (unclassed) atomic vector:
>
> > x = array(100, c(4, 5, 5))
>
> > dim(x)
>
> [1] 4 5 5
>
> > head(x, 1)
>
> [1] 100
>
> > class(head(x))
>
> [1] "numeric"
>
>
> (For a 1d array, it does return another 1d array).
>
> When extending head/tail to understand multiple dimensions as discussed in
> this thread, then, should the behavior for 2+d arrays be explicitly
> retained, or should head and tail do the analogous thing (with a head(<2d
> array>) behaving the same as head(), which honestly is what I
> expected to already be happening)?
>
> Are people using/relying on this behavior in their code, and if so, why/for
> what?
>
> Even more generally, one way forward is to have the default methods check
> for dimensions, and use length if it is null:
>
> tail.default <- tail.data.frame <- function(x, n = 6L, ...)
> {
> if(any(n == 0))
> stop("n must be non-zero or unspecified for all dimensions")
> if(!is.null(dim(x)))
> dimsx <- dim(x)
> else
> dimsx <- length(x)
>
> ## this returns a list of vectors of indices in each
> ## dimension, regardless of length of the the n
> ## argument
> sel <- lapply(seq_along(dimsx), function(i) {
> dxi <- dimsx[i]
> ## select all indices (full dim) if not specified
> ni <- if(length(n) >= i) n[i] else dxi
> ## handle negative ns
> ni <- if (ni < 0L) max(dxi + ni, 0L) else min(ni, dxi)
> seq.int(to = dxi, length.out = ni)
> })
> args <- c(list(x), sel, drop = FALSE)
> do.call("[", args)
> }
>
>
> I think this precludes the need for a separate data.frame method at all,
> actually, though (I would think) tail.data.frame would still be defined and
> exported for backwards compatibility. (the matrix method has some extra
> bits so my current conception of it is still separate, though it might not
> NEED to be).
>
> The question then becomes, should head/tail always return something with
> the same dimensionally (number of dims) it got, or should data.frame and
> matrix be special cased in this regard, as they are now?
>
> What are people's thoughts?
> ~G
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] contrib.url and R-devel numbering 3.7 vs 4.0

2019-10-17 Thread Jan Gorecki

Dear R-devel,

Due to the numbering change of R-devel from 3.7 to 4.0 there seems to
be an issue happening in a helper function contrib.url, maybe some
others too.
When running on a Windows r77294 (2019-10-15)

contrib.url("https://cloud.r-project.org;, type="binary")
#[1] "https://cloud.r-project.org/bin/windows/contrib/4.0;

Resulting string is actually not a the proper one. As of now CRAN
still uses 3.7.
It is not clear if migration to 4.0 urls will happen or R-devel CRAN
will stay as 3.7.

Anyway I would like to request, again, to consider my suggestion from
May 2018 that makes contrib.url more flexible, allowing to handle
issue discussed in this email.
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17420

Best Regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] speed up R_IsNA, R_IsNaN for vector input

2019-09-30 Thread Jan Gorecki

Dear Tomas,

I was thinking it is because of taking

ieee_double y;

out from the loop, and re-using across iterations.
Now I checked that was not the reason of speed up.
So as you wrote, it was only due to inlining.
I am surprised the difference is so significant.
Thank you,
Jan

On Mon, Sep 30, 2019 at 10:31 AM Tomas Kalibera
 wrote:
>
> On 9/29/19 1:09 PM, Jan Gorecki wrote:
> > Dear R developers,
> >
> > I spotted that R_isNA and R_IsNaN could be improved when applied on a
> > vector where we could take out small part of their logic, run it once,
> > and then reuse inside the loop.
>
> Dear Jan,
>
> Looking at your examples, I just see you have hand-inlined
> R_IsNA/R_IsNaN, or is there anything more? In principle we could put
> R_IsNA, R_IsNAN into Rinlinedfuns to allow inlining across compilation
> modules, but we can't put all functions there - so there would have to
> be a clear case for a performance problem in some specific function in a
> different module.
>
> If you were curious there are optimized checks for non-finite values in
> vectors in array.c, which are used for matrix multiplication before
> calling to BLAS. These have to be fast and the optimization is biased
> towards the case that such values are rare and that it is ok to
> sometimes say "there may be non-finite values" even when in fact they
> are not.
>
> Best
> Tomas
>
> > I setup tiny plain-C experiment. Taking R_IsNA, R_IsNaN from R's
> > arithmetic.c, and building R_vIsNA and R_vIsNaN accordingly.
> > For double input of size 1e9 (having some NA and NaN) I observed
> > following timings:
> >
> > R_IsNA6.729s
> > R_vIsNA   4.386s
> >
> > R_IsNaN   6.874s
> > R_vIsNaN  4.479s
> >
> > ISNAN 4.392s
> >
> > It looks like R_vIsN(A|aN) are close to ISNAN (which just wraps to
> > math.h::isnan).
> > Should I follow up with a patch?
> >
> > The experiment is a single nan.c file of 127 lines (includes R C
> > funs). Large enough to not paste in the email. Here is the link:
> > https://gist.github.com/jangorecki/c140fed3a3672620c1e2af90a768d785
> >
> > Run it as:
> >
> > gcc nan.c -lm
> > ./a.out R_vIsNA 8
> > ./a.out R_IsNA 8
> > ./a.out R_vIsNaN 8
> > ./a.out R_IsNaN 8
> > ./a.out ISNAN 8
> >
> > Best regards,
> > Jan Gorecki
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] speed up R_IsNA, R_IsNaN for vector input

2019-09-29 Thread Jan Gorecki

Dear R developers,

I spotted that R_isNA and R_IsNaN could be improved when applied on a
vector where we could take out small part of their logic, run it once,
and then reuse inside the loop.
I setup tiny plain-C experiment. Taking R_IsNA, R_IsNaN from R's
arithmetic.c, and building R_vIsNA and R_vIsNaN accordingly.
For double input of size 1e9 (having some NA and NaN) I observed
following timings:

R_IsNA6.729s
R_vIsNA   4.386s

R_IsNaN   6.874s
R_vIsNaN  4.479s

ISNAN 4.392s

It looks like R_vIsN(A|aN) are close to ISNAN (which just wraps to
math.h::isnan).
Should I follow up with a patch?

The experiment is a single nan.c file of 127 lines (includes R C
funs). Large enough to not paste in the email. Here is the link:
https://gist.github.com/jangorecki/c140fed3a3672620c1e2af90a768d785

Run it as:

gcc nan.c -lm
./a.out R_vIsNA 8
./a.out R_IsNA 8
./a.out R_vIsNaN 8
./a.out R_IsNaN 8
./a.out ISNAN 8

Best regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] passing extra arguments to devtools::build

2019-09-27 Thread Jan Gorecki

Dear Michael,

I think R-devel mailing list is not a proper place to report issues
with devtools. I would try filling an issue in their github
repository.
If you are able to reproduce this issue using plain R without any
extra packages, then please provide such example here.

Regards,
Jan Gorecki

On Fri, Sep 27, 2019 at 3:18 PM Michael Friendly  wrote:
>
> This question was posed on SO :
> https://stackoverflow.com/questions/58118495/passing-extra-argumenets-to-devtoolsbuild
> but there has been no useful reply.
>
> Something seems to have changed in the |devtools|package, so that the
> following commands, that used to run now give an error I can't decipher:
>
> |>Sys.setenv(R_GSCMD="C:/Program
> Files/gs/gs9.21/bin/gswin64c.exe")>devtools::build(args
> =c('--resave-data','--compact-vignettes="gs+qpdf"'))The
> filename,directory name,or volume label syntax is incorrect. Error
> in(function(command =NULL,args =character(),error_on_status
> =TRUE,:System command error|
>
> I've tried other alternatives with other |devtools| commands, like just
> passing a single argument, but still get the same error
>
> |args ='--compact-vignettes="gs+qpdf"'devtools::check_win_devel(args=args)|
>
> I'm using devtools 2.2.0, under R 3.5.2
>
> --
> Michael Friendly Email: friendly AT yorku DOT ca
> Professor, Psychology Dept. & Chair, ASA Statistical Graphics Section
> York University  Voice: 416 736-2100 x66249
> 4700 Keele StreetWeb: http://www.datavis.ca | @datavisFriendly
> Toronto, ONT  M3J 1P3 CANADA
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [External] REprintf could be caught by tryCatch(message)

2019-09-16 Thread Jan Gorecki

Thank you Luke,
I filled https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17612

Best,
Jan

On Mon, Sep 16, 2019 at 2:15 AM Tierney, Luke  wrote:
>
> You can file it as a wishlist item in the bug trackign system. Without
> a compelling case or a complete and well tested patch or both I doubt
> it will rise to the top of anyone's priority list.
>
> Best,
>
> luke
>
> On Sun, 15 Sep 2019, Jan Gorecki wrote:
>
> > Thank you Luke for prompt reply.
> > Is it possible then to request a new function to R C API "message"
> > that would equivalent to R "message" function? Similarly as we now
> > have C "warning" and C "error" functions.
> >
> > Best,
> > Jan
> >
> > On Sun, Sep 15, 2019 at 5:25 PM Tierney, Luke  
> > wrote:
> >>
> >> On Sun, 15 Sep 2019, Jan Gorecki wrote:
> >>
> >>> Dear R-devel community,
> >>>
> >>> There appears to be an inconsistency in R C API about the exceptions
> >>> that can be raised from C code.
> >>> Mapping of R C funs to corresponding R functions is as follows.
> >>>
> >>> error-> stop
> >>> warning  -> warning
> >>> REprintf -> message
> >>
> >> This is wrong: REpintf is like cat with file = stderr(). If this claim
> >> is made somewhere in R documentation please report it a a bug.
> >>
> >>> Rprintf  -> cat
> >>>
> >>> Rprint/cat is of course not an exception, I listed it just for 
> >>> completeness.
> >>> The inconsistency I would like to report is about REprintf. It cannot
> >>> be caught by tryCatch(message). Warnings are errors are being caught
> >>> as expected.
> >>>
> >>> Is there any chance to "fix"/"improve" REprintf so tryCatch(message)
> >>> can catch it?
> >>
> >> No: this is behaving as intended.
> >>
> >> Best,
> >>
> >> luke
> >>
> >>> So in the example below catch(Cmessage()) would behave consistently to
> >>> R's catch(message("a"))?
> >>>
> >>> Regards,
> >>> Jan Gorecki
> >>>
> >>> catch = function(expr) {
> >>>  tryCatch(expr,
> >>>message=function(m) cat("caught message\n"),
> >>>warning=function(w) cat("caught warning\n"),
> >>>error=function(e) cat("caught error\n")
> >>>  )
> >>> }
> >>> library(inline)
> >>> Cstop = cfunction(c(), 'error("%s\\n","a"); return R_NilValue;')
> >>> Cwarning = cfunction(c(), 'warning("%s\\n","a"); return R_NilValue;')
> >>> Cmessage = cfunction(c(), 'REprintf("%s\\n","a"); return R_NilValue;')
> >>>
> >>> catch(stop("a"))
> >>> #caught error
> >>> catch(warning("a"))
> >>> #caught warning
> >>> catch(message("a"))
> >>> #caught message
> >>>
> >>> catch(Cstop())
> >>> #caught error
> >>> catch(Cwarning())
> >>> #caught warning
> >>> catch(Cmessage())
> >>> #a
> >>> #NULL
> >>>
> >>> __
> >>> R-devel@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >>
> >> --
> >> Luke Tierney
> >> Ralph E. Wareham Professor of Mathematical Sciences
> >> University of Iowa  Phone: 319-335-3386
> >> Department of Statistics andFax:   319-335-3017
> >> Actuarial Science
> >> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> >> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [External] REprintf could be caught by tryCatch(message)

2019-09-15 Thread Jan Gorecki

Thank you Luke for prompt reply.
Is it possible then to request a new function to R C API "message"
that would equivalent to R "message" function? Similarly as we now
have C "warning" and C "error" functions.

Best,
Jan

On Sun, Sep 15, 2019 at 5:25 PM Tierney, Luke  wrote:
>
> On Sun, 15 Sep 2019, Jan Gorecki wrote:
>
> > Dear R-devel community,
> >
> > There appears to be an inconsistency in R C API about the exceptions
> > that can be raised from C code.
> > Mapping of R C funs to corresponding R functions is as follows.
> >
> > error-> stop
> > warning  -> warning
> > REprintf -> message
>
> This is wrong: REpintf is like cat with file = stderr(). If this claim
> is made somewhere in R documentation please report it a a bug.
>
> > Rprintf  -> cat
> >
> > Rprint/cat is of course not an exception, I listed it just for completeness.
> > The inconsistency I would like to report is about REprintf. It cannot
> > be caught by tryCatch(message). Warnings are errors are being caught
> > as expected.
> >
> > Is there any chance to "fix"/"improve" REprintf so tryCatch(message)
> > can catch it?
>
> No: this is behaving as intended.
>
> Best,
>
> luke
>
> > So in the example below catch(Cmessage()) would behave consistently to
> > R's catch(message("a"))?
> >
> > Regards,
> > Jan Gorecki
> >
> > catch = function(expr) {
> >  tryCatch(expr,
> >message=function(m) cat("caught message\n"),
> >warning=function(w) cat("caught warning\n"),
> >error=function(e) cat("caught error\n")
> >  )
> > }
> > library(inline)
> > Cstop = cfunction(c(), 'error("%s\\n","a"); return R_NilValue;')
> > Cwarning = cfunction(c(), 'warning("%s\\n","a"); return R_NilValue;')
> > Cmessage = cfunction(c(), 'REprintf("%s\\n","a"); return R_NilValue;')
> >
> > catch(stop("a"))
> > #caught error
> > catch(warning("a"))
> > #caught warning
> > catch(message("a"))
> > #caught message
> >
> > catch(Cstop())
> > #caught error
> > catch(Cwarning())
> > #caught warning
> > catch(Cmessage())
> > #a
> > #NULL
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] REprintf could be caught by tryCatch(message)

2019-09-15 Thread Jan Gorecki

Dear R-devel community,

There appears to be an inconsistency in R C API about the exceptions
that can be raised from C code.
Mapping of R C funs to corresponding R functions is as follows.

error-> stop
warning  -> warning
REprintf -> message
Rprintf  -> cat

Rprint/cat is of course not an exception, I listed it just for completeness.
The inconsistency I would like to report is about REprintf. It cannot
be caught by tryCatch(message). Warnings are errors are being caught
as expected.

Is there any chance to "fix"/"improve" REprintf so tryCatch(message)
can catch it?
So in the example below catch(Cmessage()) would behave consistently to
R's catch(message("a"))?

Regards,
Jan Gorecki

catch = function(expr) {
  tryCatch(expr,
message=function(m) cat("caught message\n"),
warning=function(w) cat("caught warning\n"),
error=function(e) cat("caught error\n")
  )
}
library(inline)
Cstop = cfunction(c(), 'error("%s\\n","a"); return R_NilValue;')
Cwarning = cfunction(c(), 'warning("%s\\n","a"); return R_NilValue;')
Cmessage = cfunction(c(), 'REprintf("%s\\n","a"); return R_NilValue;')

catch(stop("a"))
#caught error
catch(warning("a"))
#caught warning
catch(message("a"))
#caught message

catch(Cstop())
#caught error
catch(Cwarning())
#caught warning
catch(Cmessage())
#a
#NULL

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Underscores in package names

2019-08-16 Thread Jan Gorecki

Thanks Abby and Martin,

In every company I worked using R - 3 in total - there was at least
one (up to ~10) processes designed (dev and implemented) to depend on
current package naming scheme, having underscore as separator of
package name and its version. From my experience I believe this is a
(very?) common practice. I also use it myself.
Arguments for having underscore in package names are simply weak.
Dot in function names is an entirely different issue caused by S3
dispatch. No need to look at other OOP languages, it is R.
Package name is not a function name.
There are no practical gains.
There is nothing wrong in having package "a.pkg" and function "a_pkg()".

Regards,
Jan Gorecki


On Fri, Aug 16, 2019 at 1:20 AM Abby Spurdle  wrote:
>
> > While
> > package names are not functions, using dots in package names
> > encourages the use of dots in functions, a dangerous practice.
>
> "dangerous"...?
> I can't understand the necessity of RStudio and Tiny-Verse affiliated
> persons to repeatedly use subjective and unscientific phrasing.
>
> Elegant, Advanced, Dangerous...
> At UseR, there was even "Advanced Use of your Favorite IDE".
>
> This is not science.
> This is marketing.
>
> There's nothing dangerous about it other than your belief that it's
> dangerous.
> I note that many functions in the stats package use dots in function names.
> Your statement implies that the stats package is badly designed, which it
> is not.
> Out of 14,800-ish packages on CRAN, very few of them are even close to the
> standard set by the stats package, in my opinion.
>
> And as noted by other people in this thread, changing naming policies could
> interfere with a lot of software "out there", which is dangerous.
>
> > Dots in
> > names is also one of the common stones cast at R as a language, as
> > dots are used for object oriented method dispatch in other common
> > languages.
>
> I don't think the goal is to copy other OOP systems.
> Furthermore, some shells use dot as the current working directory and Java
> uses dots in package namespaces.
> And then there's regular expressions...
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] R pkg install should fail for unsuccessful DLL copy on windows?

2019-05-29 Thread Jan Gorecki

Hi Toby,
AFAIK it has not been addressed in R. You can handle the problem on
your package side, see
https://github.com/Rdatatable/data.table/pull/3237
Regards,
Jan


On Thu, May 30, 2019 at 4:46 AM Toby Hocking  wrote:
>
> Hi all,
>
> I am having an issue related to installing packages on windows with
> R-3.6.0. When installing a package that is in use, I expected R to stop
> with an error. However I am getting a warning that the DLL copy was not
> successful, but the overall package installation IS successful. This is
> quite dangerous because the old DLL and the new R code could be
> incompatible.
>
> I am definitely not the first person to have this issue.
> * Matt Dowle reported
> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17478 which was never
> addressed.
> * Jim Hester reported
> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17453 which was
> apparently addressed in R-3.5.1, via
> https://github.com/wch/r-source/commit/828a04f9c428403e476620b1905a1d8ca41d0bcd
>
> But I am now having the same issue in R-3.6.0 -- is this a regression in R?
> or is there another fix that I can use?
>
> Below is the minimal R code that I used to reproduce the issue. Essentially,
> * I start R with --vanilla and set options repos=cloud and warn=2 (which I
> expect should convert warnings to errors).
> * I do library(penaltyLearning) and then install the package from source,
> which results in the
>   warnings. I expected there should be an error.
>
> th798@cmp2986 MINGW64 ~/R
> $ R --vanilla -e "options(repos='https://cloud.r-project.org',
> warn=2);library(penaltyLearning);install.packages('penaltyLearning',
> type='source');getOption('warn');sessionInfo()"
>
> R version 3.6.0 (2019-04-26) -- "Planting of a Tree"
> Copyright (C) 2019 The R Foundation for Statistical Computing
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> > options(repos='https://cloud.r-project.org',
> warn=2);library(penaltyLearning);install.packages('penaltyLearning',
> type='source');getOption('warn');sessionInfo()
> Loading required package: data.table
> Registered S3 methods overwritten by 'ggplot2':
>   method from
>   [.quosures rlang
>   c.quosures rlang
>   print.quosures rlang
> trying URL '
> https://cloud.r-project.org/src/contrib/penaltyLearning_2018.09.04.tar.gz'
> Content type 'application/x-gzip' length 2837289 bytes (2.7 MB)
> ==
> downloaded 2.7 MB
>
> * installing *source* package 'penaltyLearning' ...
> ** package 'penaltyLearning' successfully unpacked and MD5 sums checked
> ** using staged installation
> ** libs
> c:/Rtools/mingw_64/bin/g++  -std=gnu++11 -I"C:/PROGRA~1/R/R-36~1.0/include"
> -DNDEBUG  -O2 -Wall  -mtune=generic -c interface.cpp -o interface.o
> c:/Rtools/mingw_64/bin/g++  -std=gnu++11 -I"C:/PROGRA~1/R/R-36~1.0/include"
> -DNDEBUG  -O2 -Wall  -mtune=generic -c largestContinuousMinimum.cpp
> -o largestContinuousMinimum.o
> largestContinuousMinimum.cpp: In function 'int
> largestContinuousMinimum(int, double*, double*, int*)':
> largestContinuousMinimum.cpp:38:27: warning: 'start' may be used
> uninitialized in this function [-Wmaybe-uninitialized]
>index_vec[0] = start;
>^
> c:/Rtools/mingw_64/bin/g++  -std=gnu++11 -I"C:/PROGRA~1/R/R-36~1.0/include"
> -DNDEBUG  -O2 -Wall  -mtune=generic -c modelSelection.cpp -o
> modelSelection.o
> /usr/bin/sed: -e expression #1, char 1: unknown command: `C'
> c:/Rtools/mingw_64/bin/g++ -shared -s -static-libgcc -o penaltyLearning.dll
> tmp.def interface.o largestContinuousMinimum.o modelSelection.o
> -LC:/PROGRA~1/R/R-36~1.0/bin/x64 -lR
> installing to C:/Program
> Files/R/R-3.6.0/library/00LOCK-penaltyLearning/00new/penaltyLearning/libs/x64
> ** R
> ** data
> ** byte-compile and prepare package for lazy loading
> ** help
> *** installing help indices
>   converting help for package 'penaltyLearning'
> finding HTML links ... done
> GeomTallRecthtml
> IntervalRegressionCVhtml
> IntervalRegressionCVmargin  html
> IntervalRegressionInternal  html
> IntervalRegressionRegularized   html
> IntervalRegressionUnregularized html
> ROChangehtml
> change.colors   html
> change.labels   html
> changeLabel html
>

Re: [Rd] nrow(rbind(character(), character())) returns 2 (as documented but very unintuitive, IMHO)

2019-05-16 Thread Jan Gorecki

Hi Gabriel

> Personally, no I wouldn't. I would consider m==0 a degenerate case, where
there is no data, but I personally find matrices (or data.frames) with rows
but no columns a very strange concept.

This distinction between matrix and data.frames is the crux in this case.
>From the dimensional modelling point of view, matrix can have non-zero
rows and zero columns, but data.frame (assuming it maps to database
table structure) should never have non-zero rows and zero columns.
This kind of issue was raised before in our issue tracker:
https://github.com/Rdatatable/data.table/issues/2422
You should find that discussion useful.

Best,
Jan Gorecki


On Fri, May 17, 2019 at 8:11 AM Pages, Herve  wrote:
>
> On 5/16/19 17:48, Gabriel Becker wrote:
>
> Hi Herve,
>
> Inline.
>
>
>
> On Thu, May 16, 2019 at 4:45 PM Pages, Herve 
> mailto:hpa...@fredhutch.org>> wrote:
> Hi Gabe,
>
>ncol(data.frame(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
># [1] 2
>
>ncol(data.frame(aa="a", AA="A"))
># [1] 2
>
>ncol(data.frame(aa=character(0), AA=character(0)))
># [1] 2
>
>ncol(cbind(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
># [1] 2
>
>ncol(cbind(aa="a", AA="A"))
># [1] 2
>
>ncol(cbind(aa=character(0), AA=character(0)))
># [1] 2
>
>nrow(rbind(aa=c("a", "b", "c"), AA=c("A", "B", "C")))
># [1] 2
>
>nrow(rbind(aa="a", AA="A"))
># [1] 2
>
>nrow(rbind(aa=character(0), AA=character(0)))
># [1] 2
>
> Sure, but
>
>
> > nrow(rbind(aa = c("a", "b", "c"), AA = c("a", "b", "c")))
>
> [1] 2
>
> > nrow(rbind(aa = c("a", "b", "c"), AA = "a"))
>
> [1] 2
>
> > nrow(rbind(aa = c("a", "b", "c"), AA = character()))
>
> [1] 1
>
>
> Ah, I see now.
>
> But:
>
>   > data.frame(aa = c("a", "b", "c"), AA = character())
>   Error in data.frame(aa = c("a", "b", "c"), AA = character()) :
> arguments imply differing number of rows: 3, 0
>
> and
>
>   > mapply(`*`, 1:5, integer(0))
>   Error in mapply(`*`, 1:5, integer(0)) :
> zero-length inputs cannot be mixed with those of non-zero length
>
> So I would declare rbind(aa = c("a", "b", "c"), AA = character()) 
> inconsistent rather than making the case that rbind(aa = character(), AA = 
> character()) needs to change.
>
> Cheers,
>
> H.
>
>
> So even if I ultimately "lose"  this debate (which really wouldn't shock me, 
> even if R-core did agree with me there's backwards compatibility to 
> consider), you have to concede that the current behavior is more complicated 
> than the above is acknowledging.
>
> By rights of the invariance that you and Hadley are advocating,  as far as I 
> understand it, the last should give 2 rows, one of which is all NAs, rather 
> than giving only one row as it currently does (and, I assume?,  always has).
>
> So there are two different behavior patterns that could coherently (and 
> internally-consistently) be generalized to apply to the  rbind(character(), 
> character()) case, not just one. I'm making the case that the other one (that 
> length 0 vectors do not add rows because they don't contain data) would be 
> equally valid, and to N>1 people, at least equally intuitive.
>
> Best,
> ~G
>
> hmmm... not sure why ncol(cbind(aa=character(0), AA=character(0))) or
> nrow(rbind(aa=character(0), AA=character(0))) should do anything
> different from what they do.
>
> In my experience, and more generally speaking, the desire to treat
> 0-length vectors as a special case that deviates from the
> non-zero-length case has never been productive.
>
> H.
>
>
> On 5/16/19 13:17, Gabriel Becker wrote:
> > Hi all,
> >
> > Apologies if this has been asked before (a quick google didn't  find it for
> > me),and I know this is a case of behaving as documented but its so
> > unintuitive (to me at least) that I figured I'd bring it up here anyway. I
> > figure its probably going to not be changed,  but I'm happy to submit a
> > patch if this is something R-core feels can/should change.
> >
> > So I recently got bitten by the fact that
> >
> >> nrow(rbind(character(), character()))
> > [1] 2
> >
> >
> >

Re: [Rd] Runnable R packages

2019-01-31 Thread Jan Gorecki

Quoting:

"In summary, I'm convinced R would benefit from something similar to Java's
`Main-Class` header or Python's `__main__()` function. A new R CMD command
would take a package, install its dependencies, and run its "main"
function."

This kind of increase the scope of your idea. New command in R CMD to
redirect to "main" is interesting idea. On the other hand it will
impose limitation on user comparing to the way how you could do it
now: Rscript -e 'mypkg::mymain("myparam")' (or littler, it should be
shipped with R IMO).
For production system one doesn't want to just "install its
dependencies". First dependencies has to be mirrored and their version
frozen. Then testing your package on that set of dependencies. Once
successfully done then same set of packages should be used for
production deployment. For those processes you might find tools4pkgs
branch in base R useful (packages.dcf, mirror.packages functions),
unfortunately not merged:
https://github.com/wch/r-source/compare/tools4pkgs

Jan Gorecki

On Thu, Jan 31, 2019 at 9:08 PM Barry Rowlingson
 wrote:
>
> On Thu, Jan 31, 2019 at 3:14 PM David Lindelof  wrote:
>
> >
> > In summary, I'm convinced R would benefit from something similar to Java's
> > `Main-Class` header or Python's `__main__()` function. A new R CMD command
> > would take a package, install its dependencies, and run its "main"
> > function.
>
>
>
> I just created and built a very boilerplate R package called "runme". I can
> install its dependencies and run its "main" function with:
>
>  $ R CMD INSTALL runme_0.0.0.9000.tar.gz
>  $ R -e 'runme::main()'
>
> No new R CMDs needed. Now my choice of "main" is arbitrary, whereas with
> python and java and C the entrypoint is more tightly specified (__name__ ==
> "__main__" in python, int main(..) in C and so on). But I don't think
> that's much of a problem.
>
> Does that not satisfy your requirements close enough? If you want it in one
> line then:
>
> R CMD INSTALL runme_0.0.0.9000.tar.gz && R -e 'runme::main()'
>
> will do the second if the first succeeds (Unix shells).
>
> You could write a script for $RHOME/bin/RUN which would be a two-liner and
> that could mandate the use of "main" as an entry point. But good luck
> getting anything into base R.
>
> Barry
>
>
>
>
> > If we have this machinery available, we could even consider
> > reaching out to Spark (and other tech stacks) developers and make it easier
> > to develop R applications for those platforms.
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] base::mean not consistent about NA/NaN

2018-07-21 Thread Jan Gorecki

Thank you Tomas for detailed explanation. Such a nice description deserves
to be included somewhere in documentation, R-internals maybe.
Regards
Jan

On 18 Jul 2018 18:24, "Tomas Kalibera"  wrote:

Yes, the performance overhead of fixing this at R level would be too
large and it would complicate the code significantly. The result of
binary operations involving NA and NaN is hardware dependent (the
propagation of NaN payload) - on some hardware, it actually works the
way we would like - NA is returned - but on some hardware you get NaN or
sometimes NA and sometimes NaN. Also there are C compiler optimizations
re-ordering code, as mentioned in ?NaN. Then there are also external
numerical libraries that do not distinguish NA from NaN (NA is an R
concept). So I am afraid this is unfixable. The disclaimer mentioned by
Duncan is in ?NaN/?NA, which I think is ok - there are so many numerical
functions through which one might run into these problems that it would
be infeasible to document them all. Some functions in fact will preserve
NA, and we would not let NA turn into NaN unnecessarily, but the
disclaimer says it is something not to depend on.

Tomas

On 07/03/2018 11:12 AM, Jan Gorecki wrote:
> Thank you for interesting examples.
> I would find useful to document this behavior also in `?mean`, while `+`
> operator is also affected, the `sum` function is not.
> For mean, NA / NaN could be handled in loop in summary.c. I assume that
> performance penalty of fix is the reason why this inconsistency still
> exists.
> Jan
>
> On Mon, Jul 2, 2018 at 8:28 PM, Barry Rowlingson <
> b.rowling...@lancaster.ac.uk> wrote:
>
>> And for a starker example of this (documented) inconsistency,
>> arithmetic addition is not commutative:
>>
>>   > NA + NaN
>>   [1] NA
>>   > NaN + NA
>>   [1] NaN
>>
>>
>>
>> On Mon, Jul 2, 2018 at 5:32 PM, Duncan Murdoch 
>> wrote:
>>> On 02/07/2018 11:25 AM, Jan Gorecki wrote:
>>>> Hi,
>>>> base::mean is not consistent in terms of handling NA/NaN.
>>>> Mean should not depend on order of its arguments while currently it is.
>>> The result of mean() can depend on the order even with regular numbers.
>>> For example,
>>>
>>>   > x <- rep(c(1, 10^(-15)), 100)
>>>   > mean(sort(x)) - 0.5
>>> [1] 5.551115e-16
>>>   > mean(rev(sort(x))) - 0.5
>>> [1] 0
>>>
>>>
>>>>   mean(c(NA, NaN))
>>>>   #[1] NA
>>>>   mean(c(NaN, NA))
>>>>   #[1] NaN
>>>>
>>>> I created issue so in case of no replies here status of it can be
>> looked up
>>>> at:
>>>> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17441
>>> The help page for ?NaN says,
>>>
>>> "Computations involving NaN will return NaN or perhaps NA: which of
>>> those two is not guaranteed and may depend on the R platform (since
>>> compilers may re-order computations)."
>>>
>>> And ?NA says,
>>>
>>> "Numerical computations using NA will normally result in NA: a possible
>>> exception is where NaN is also involved, in which case either might
>>> result (which may depend on the R platform). "
>>>
>>> So I doubt if this inconsistency will be fixed.
>>>
>>> Duncan Murdoch
>>>
>>>> Best,
>>>> Jan
>>>>
>>>>[[alternative HTML version deleted]]
>>>>
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>   [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] base::mean not consistent about NA/NaN

2018-07-03 Thread Jan Gorecki

Thank you for interesting examples.
I would find useful to document this behavior also in `?mean`, while `+`
operator is also affected, the `sum` function is not.
For mean, NA / NaN could be handled in loop in summary.c. I assume that
performance penalty of fix is the reason why this inconsistency still
exists.
Jan

On Mon, Jul 2, 2018 at 8:28 PM, Barry Rowlingson <
b.rowling...@lancaster.ac.uk> wrote:

> And for a starker example of this (documented) inconsistency,
> arithmetic addition is not commutative:
>
>  > NA + NaN
>  [1] NA
>  > NaN + NA
>  [1] NaN
>
>
>
> On Mon, Jul 2, 2018 at 5:32 PM, Duncan Murdoch 
> wrote:
> > On 02/07/2018 11:25 AM, Jan Gorecki wrote:
> >> Hi,
> >> base::mean is not consistent in terms of handling NA/NaN.
> >> Mean should not depend on order of its arguments while currently it is.
> >
> > The result of mean() can depend on the order even with regular numbers.
> > For example,
> >
> >  > x <- rep(c(1, 10^(-15)), 100)
> >  > mean(sort(x)) - 0.5
> > [1] 5.551115e-16
> >  > mean(rev(sort(x))) - 0.5
> > [1] 0
> >
> >
> >>
> >>  mean(c(NA, NaN))
> >>  #[1] NA
> >>  mean(c(NaN, NA))
> >>  #[1] NaN
> >>
> >> I created issue so in case of no replies here status of it can be
> looked up
> >> at:
> >> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17441
> >
> > The help page for ?NaN says,
> >
> > "Computations involving NaN will return NaN or perhaps NA: which of
> > those two is not guaranteed and may depend on the R platform (since
> > compilers may re-order computations)."
> >
> > And ?NA says,
> >
> > "Numerical computations using NA will normally result in NA: a possible
> > exception is where NaN is also involved, in which case either might
> > result (which may depend on the R platform). "
> >
> > So I doubt if this inconsistency will be fixed.
> >
> > Duncan Murdoch
> >
> >>
> >> Best,
> >> Jan
> >>
> >>   [[alternative HTML version deleted]]
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] base::mean not consistent about NA/NaN

2018-07-02 Thread Jan Gorecki

Hi,
base::mean is not consistent in terms of handling NA/NaN.
Mean should not depend on order of its arguments while currently it is.

mean(c(NA, NaN))
#[1] NA
mean(c(NaN, NA))
#[1] NaN

I created issue so in case of no replies here status of it can be looked up
at:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17441

Best,
Jan

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Sys.timezone (timedatectl) unnecessarily warns loudly

2018-05-06 Thread Jan Gorecki

Dear R-devels,

timedatectl binary used by Sys.timezone does not always work reliably.
If it doesn't the warning is raised, unnecessarily because later on
Sys.timezone gets timezone successfully from /etc/timezone. This
obviously might not be true for different linux OSes, but it solves
the issue for simple dockerized Ubuntu 16.04.

Current behavior R Under development (unstable) (2018-05-04 r74695) --
"Unsuffered Consequences"

  Sys.timezone()
  #Failed to create bus connection: No such file or directory
  #[1] "Etc/UTC"
  #Warning message:
  #In system("timedatectl", intern = TRUE) :
  #  running command 'timedatectl' had status 1

There was small discussion where I initially put comment about it in:
https://github.com/wch/r-source/commit/9866ac2ad1e2f1c4565ae829ba33b5b98a08d10d#r28867164

Below patch makes timedatectl call silent, both suppressWarnings and
ignore.stderr are required to deal with R warning, and warning printed
directly to console from timedatectl.

diff --git src/library/base/R/datetime.R src/library/base/R/datetime.R
index 6b34267936..b81c049f3e 100644
--- src/library/base/R/datetime.R
+++ src/library/base/R/datetime.R
@@ -73,7 +73,7 @@ Sys.timezone <- function(location = TRUE)
 ## First try timedatectl: should work on any modern Linux
 ## as part of systemd (and probably nowhere else)
 if (nzchar(Sys.which("timedatectl"))) {
-inf <- system("timedatectl", intern = TRUE)
+inf <- suppressWarnings(system("timedatectl", intern = TRUE,
ignore.stderr=TRUE))
 ## typical format:
 ## "   Time zone: Europe/London (GMT, +0000)"
 ## "   Time zone: Europe/Vienna (CET, +0100)"

Regards,
Jan Gorecki

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] colnames for data.frame could be greatly improved

2016-12-27 Thread Jan Gorecki

Hi there,
Any update on this?
Should I create bugzilla ticket and submit patch?
Regards
Jan Gorecki

On 20 December 2016 at 01:27, Jan Gorecki <j.gore...@wit.edu.pl> wrote:
> Hello,
>
> colnames seems to be not optimized well for data.frame. It escapes
> processing for data.frame in
>
>   if (is.data.frame(x) && do.NULL)
> return(names(x))
>
> but only when do.NULL true. This makes huge difference when do.NULL
> false. Minimal edit to `colnames`:
>
> if (is.data.frame(x)) {
> nm <- names(x)
> if (do.NULL || !is.null(nm))
> return(nm)
> else
> return(paste0(prefix, seq_along(x)))
> }
>
> Script and timings:
>
> N=1e7; K=100
> set.seed(1)
> DF <- data.frame(
> id1 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
> id2 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
> id3 = sample(sprintf("id%010d",1:(N/K)), N, TRUE), # small groups (char)
> id4 = sample(K, N, TRUE),  # large groups (int)
> id5 = sample(K, N, TRUE),  # large groups (int)
> id6 = sample(N/K, N, TRUE),# small groups (int)
> v1 =  sample(5, N, TRUE),  # int in range [1,5]
> v2 =  sample(5, N, TRUE),  # int in range [1,5]
> v3 =  sample(round(runif(100,max=100),4), N, TRUE) # numeric e.g. 23.5749
> )
> cat("GB =", round(sum(gc()[,2])/1024, 3), "\n")
> #GB = 0.397
> colnames(DF) = NULL
> system.time(nm1<-colnames(DF, FALSE))
> #   user  system elapsed
> # 22.158   0.299  22.498
> print(nm1)
> #[1] "col1" "col2" "col3" "col4" "col5" "col6" "col7" "col8" "col9"
>
> ### restart R
>
> colnames <- function (x, do.NULL = TRUE, prefix = "col")
> {
> if (is.data.frame(x)) {
> nm <- names(x)
> if (do.NULL || !is.null(nm))
> return(nm)
> else
> return(paste0(prefix, seq_along(x)))
> }
> dn <- dimnames(x)
> if (!is.null(dn[[2L]]))
> dn[[2L]]
> else {
> nc <- NCOL(x)
> if (do.NULL)
> NULL
> else if (nc > 0L)
> paste0(prefix, seq_len(nc))
> else character()
> }
> }
> N=1e7; K=100
> set.seed(1)
> DF <- data.frame(
> id1 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
> id2 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
> id3 = sample(sprintf("id%010d",1:(N/K)), N, TRUE), # small groups (char)
> id4 = sample(K, N, TRUE),  # large groups (int)
> id5 = sample(K, N, TRUE),  # large groups (int)
> id6 = sample(N/K, N, TRUE),# small groups (int)
> v1 =  sample(5, N, TRUE),  # int in range [1,5]
> v2 =  sample(5, N, TRUE),  # int in range [1,5]
> v3 =  sample(round(runif(100,max=100),4), N, TRUE) # numeric e.g. 23.5749
> )
> cat("GB =", round(sum(gc()[,2])/1024, 3), "\n")
> #GB = 0.397
> colnames(DF) = NULL
> system.time(nm1<-colnames(DF, FALSE))
> #   user  system elapsed
> #  0.001   0.000   0.000
> print(nm1)
> #[1] "col1" "col2" "col3" "col4" "col5" "col6" "col7" "col8" "col9"
>
> sessionInfo()
> #R Under development (unstable) (2016-12-19 r71815)
> #Platform: x86_64-pc-linux-gnu (64-bit)
> #Running under: Debian GNU/Linux stretch/sid
> #
> #locale:
> # [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
> # [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
> # [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
> # [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
> # [9] LC_ADDRESS=C   LC_TELEPHONE=C
> #[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
> #
> #attached base packages:
> #[1] stats graphics  grDevices utils datasets  methods   base  #
> #
> #loaded via a namespace (and not attached):
> #[1] compiler_3.4.0

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] colnames for data.frame could be greatly improved

2016-12-19 Thread Jan Gorecki

Hello,

colnames seems to be not optimized well for data.frame. It escapes
processing for data.frame in

  if (is.data.frame(x) && do.NULL)
return(names(x))

but only when do.NULL true. This makes huge difference when do.NULL
false. Minimal edit to `colnames`:

if (is.data.frame(x)) {
nm <- names(x)
if (do.NULL || !is.null(nm))
return(nm)
else
return(paste0(prefix, seq_along(x)))
}

Script and timings:

N=1e7; K=100
set.seed(1)
DF <- data.frame(
id1 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
id2 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
id3 = sample(sprintf("id%010d",1:(N/K)), N, TRUE), # small groups (char)
id4 = sample(K, N, TRUE),  # large groups (int)
id5 = sample(K, N, TRUE),  # large groups (int)
id6 = sample(N/K, N, TRUE),# small groups (int)
v1 =  sample(5, N, TRUE),  # int in range [1,5]
v2 =  sample(5, N, TRUE),  # int in range [1,5]
v3 =  sample(round(runif(100,max=100),4), N, TRUE) # numeric e.g. 23.5749
)
cat("GB =", round(sum(gc()[,2])/1024, 3), "\n")
#GB = 0.397
colnames(DF) = NULL
system.time(nm1<-colnames(DF, FALSE))
#   user  system elapsed
# 22.158   0.299  22.498
print(nm1)
#[1] "col1" "col2" "col3" "col4" "col5" "col6" "col7" "col8" "col9"

### restart R

colnames <- function (x, do.NULL = TRUE, prefix = "col")
{
if (is.data.frame(x)) {
nm <- names(x)
if (do.NULL || !is.null(nm))
return(nm)
else
return(paste0(prefix, seq_along(x)))
}
dn <- dimnames(x)
if (!is.null(dn[[2L]]))
dn[[2L]]
else {
nc <- NCOL(x)
if (do.NULL)
NULL
else if (nc > 0L)
paste0(prefix, seq_len(nc))
else character()
}
}
N=1e7; K=100
set.seed(1)
DF <- data.frame(
id1 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
id2 = sample(sprintf("id%03d",1:K), N, TRUE),  # large groups (char)
id3 = sample(sprintf("id%010d",1:(N/K)), N, TRUE), # small groups (char)
id4 = sample(K, N, TRUE),  # large groups (int)
id5 = sample(K, N, TRUE),  # large groups (int)
id6 = sample(N/K, N, TRUE),# small groups (int)
v1 =  sample(5, N, TRUE),  # int in range [1,5]
v2 =  sample(5, N, TRUE),  # int in range [1,5]
v3 =  sample(round(runif(100,max=100),4), N, TRUE) # numeric e.g. 23.5749
)
cat("GB =", round(sum(gc()[,2])/1024, 3), "\n")
#GB = 0.397
colnames(DF) = NULL
system.time(nm1<-colnames(DF, FALSE))
#   user  system elapsed
#  0.001   0.000   0.000
print(nm1)
#[1] "col1" "col2" "col3" "col4" "col5" "col6" "col7" "col8" "col9"

sessionInfo()
#R Under development (unstable) (2016-12-19 r71815)
#Platform: x86_64-pc-linux-gnu (64-bit)
#Running under: Debian GNU/Linux stretch/sid
#
#locale:
# [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
# [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
# [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8
# [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
# [9] LC_ADDRESS=C   LC_TELEPHONE=C
#[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#
#attached base packages:
#[1] stats graphics  grDevices utils datasets  methods   base  #
#
#loaded via a namespace (and not attached):
#[1] compiler_3.4.0

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] new function to tools/utils package: dependencies based on DESCRIPTION file

2016-11-17 Thread Jan Gorecki

Hi Michael,
Are you willing to accept patch for this? I'm already using this and
few related functions for a while, it plays well. I could wrap it as
patch to utils, or tools?
Best,
Jan

On 16 June 2016 at 14:00, Michael Lawrence <lawrence.mich...@gene.com> wrote:
> I agree that the utils package needs some improvements related to
> this, and hope to make them eventually. This type of feedback is very
> helpful.
>
> Thanks,
> Michael
>
>
>
> On Thu, Jun 16, 2016 at 1:42 AM, Jan Górecki <j.gore...@wit.edu.pl> wrote:
>> Dear Joris,
>>
>> So it does looks like the proposed function makes a lot sense then, isn't it?
>>
>> Cheers,
>> Jan
>>
>> On 16 June 2016 at 08:37, Joris Meys <jorism...@gmail.com> wrote:
>>> Dear Jan,
>>>
>>> It is unavoidable to have OS and R dependencies for devtools. The building
>>> process for packages is both OS and R dependent, so devtools has to be too
>>> according to my understanding.
>>>
>>> Cheers
>>> Joris
>>>
>>> On 14 Jun 2016 18:56, "Jan Górecki" <j.gore...@wit.edu.pl> wrote:
>>>
>>> Hi Thierry,
>>>
>>> I'm perfectly aware of it. Any idea when devtools would be shipped as
>>> a base R package, or at least recommended package? To actually answer
>>> the problem described in my email.
>>> I have range of useful functions available tools/utils packages which
>>> are shipped together with R. They doesn't require any OS dependencies
>>> or R dependencies, unlike devtools which requires both. Installing
>>> unnecessary OS dependencies and R dependencies just for such a simple
>>> wrapper doesn't seem to be an elegant way to address it, therefore my
>>> proposal to include that simple function in tools, or utils package.
>>>
>>> Regards,
>>> Jan Gorecki
>>>
>>> On 14 June 2016 at 16:17, Thierry Onkelinx <thierry.onkel...@inbo.be> wrote:
>>>> Dear Jan,
>>>>
>>>> Similar functionality is available in devtools::dev_package_deps()
>>>>
>>>> Best regards,
>>>>
>>>> ir. Thierry Onkelinx
>>>> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
>>>> Forest
>>>> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
>>>> Kliniekstraat 25
>>>> 1070 Anderlecht
>>>> Belgium
>>>>
>>>> To call in the statistician after the experiment is done may be no more
>>>> than
>>>> asking him to perform a post-mortem examination: he may be able to say
>>>> what
>>>> the experiment died of. ~ Sir Ronald Aylmer Fisher
>>>> The plural of anecdote is not data. ~ Roger Brinner
>>>> The combination of some data and an aching desire for an answer does not
>>>> ensure that a reasonable answer can be extracted from a given body of
>>>> data.
>>>> ~ John Tukey
>>>>
>>>> 2016-06-14 16:54 GMT+02:00 Jan Górecki <j.gore...@wit.edu.pl>:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> Packages tools and utils have a lot of useful stuff for R developers.
>>>>> I find one task still not as straightforward as it could. Simply to
>>>>> extract dependencies of a package from DESCRIPTION file (before it is
>>>>> even installed to library). This would be valuable in automation of CI
>>>>> setup in a more meta-data driven way.
>>>>> The simple function below, I know it is short and simple, but having
>>>>> it to be defined in each CI workflow is a pain, it could be already
>>>>> available in tools or utils namespace.
>>>>>
>>>>> package.dependencies.dcf <- function(file = "DESCRIPTION", which =
>>>>> c("Depends","Imports","LinkingTo")) {
>>>>> stopifnot(file.exists(file), is.character(which))
>>>>> which_all <- c("Depends", "Imports", "LinkingTo", "Suggests",
>>>>> "Enhances")
>>>>> if (identical(which, "all"))
>>>>> which <- which_all
>>>>> else if (identical(which, "most"))
>>>>> which <- c("Depends", "Imports", "LinkingTo", "Suggests")
>>>>> stopifnot(which %in% which_all)
>>>>> dcf <- read.

Re: [Rd] Running package tests and not stop on first fail

2016-11-09 Thread Jan Gorecki

Sorry for late reply. I like the stop-on-error.
Thanks for merging.
Glad to be R contributor!

On 4 November 2016 at 09:42, Oliver Keyes  wrote:
> On Friday, 4 November 2016, Martin Maechler 
> wrote:
>>
>> > Dirk Eddelbuettel 
>> > on Fri, 4 Nov 2016 10:36:52 -0500 writes:
>>
>> > On 4 November 2016 at 16:24, Martin Maechler wrote: | My
>> > proposed name '--no-stop-on-error' was a quick shot; if |
>> > somebody has a more concise or better "English style"
>> > wording | (which is somewhat compatible with all the other
>> > options you see | from 'R CMD check --help'), | please
>> > speak up.
>>
>> > Why not keep it simple?  The similar feature this most
>> > resembles is 'make -k' and its help page has
>>
>> >-k, --keep-going
>>
>> >Continue as much as possible after an
>> > error.  While the target that failed, and those that
>> > depend on it, cannot be remade, the other dependencies of
>> > these targets can be processed all the same.
>>
>> Yes, that would be quite a bit simpler and nice in my view.
>> One may think it to be too vague,
>
>
> Mmn, I would agree on vagueness (and it breaks the pattern set by other
> flags of human-readability). Deep familiarity with make is probably not
> something we should ask of everyone who needs to test a package, too.
>
> I quite like stop-on-error=true (exactly the same as the previous suggestion
> but shaves off some characters by inverting the Boolean)
>
>> notably from Brian Pedersen's mentioning that the examples are
>> already continued in any case if they lead to an error.
>>
>> Other opinions?
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Running package tests and not stop on first fail

2016-11-04 Thread Jan Gorecki

Martin,
I submitted very simple patch on
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17176

Herve,
While I like your idea, I prefer to keep my patch simple, it is now
exactly what Martin mentions. I think it is a good start that can
eventually be extended later for what you are asking.

Regards,
Jan

On 3 November 2016 at 17:25, Hervé Pagès <hpa...@fredhutch.org> wrote:
>
> Hi Martin, Jan,
>
> On 11/03/2016 03:45 AM, Martin Maechler wrote:
>>>>>>>
>>>>>>> Jan Gorecki <j.gore...@wit.edu.pl>
>>>>>>> on Tue, 1 Nov 2016 22:51:28 + writes:
>>
>>
>> > Hello community/devs, Is there an option to run package
>> > tests during R CMD check and not stop on first error? I
>> > know that testing frameworks (testhat and others) can do
>> > that but asking about just R and base packages. Currently
>> > when package check runs test scripts in ./tests directory
>> > it will stop after first fail.  Do you think it could be
>> > optionally available to continue to run tests after
>> > failures?  Regards, Jan Gorecki
>>
>> I agree that this would be a useful option sometimes.
>>
>> So I would be supportive to get such an option, say,
>>
>>R CMD check --no-stop-on-error  
>
>
> A couple of years ago the behavior of 'R CMD check' was changed to
> continue checking (e.g. the examples) after many types of errors, and
> to output a summary count of errors at the end if any have occurred.
> So --no-stop-on-error could easily be interpreted as an option that
> controls this behavior (and would also suggest that the default has
> been reverted back to what it was prior to R 3.2.0), rather than an
> option that specifically controls what should happen while running
> the tests.
>
> Cheers,
> H.
>
>>
>> into R if someone provided (relatively small) patches to the R
>> sources (i.e. subversion repos at https://svn.r-project.org/R/trunk/ ).
>> The relevant source code should basically all be in
>> src/library/tools/R/testing.R
>>
>> Note that this may be complicated, also because "parallel"
>> checking is available in parts, via the TEST_MC_CORES
>> environment variable ((which is currently only quickly
>> documented in the 'R Administration ..' manual))
>>
>>
>> Martin Maechler
>> ETH Zurich
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] Running package tests and not stop on first fail

2016-11-01 Thread Jan Gorecki

Hello community/devs,
Is there an option to run package tests during R CMD check and not stop on
first error? I know that testing frameworks (testhat and others) can do
that but asking about just R and base packages. Currently when package
check runs test scripts in ./tests directory it will stop after first fail.
Do you think it could be optionally available to continue to run tests
after failures?
Regards,
Jan Gorecki

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

90 matches

Mail list logo