Re: [R-pkg-devel] puzzling removal of 'dotwhisker' from CRAN

2024-04-18 Thread Henrik Bengtsson
On Thu, Apr 18, 2024 at 6:41 AM Uwe Ligges
 wrote:
>
> On 18.04.2024 15:35, Ben Bolker wrote:
> > Yes, but ffbase was archived a long time ago (2022-04-04) and CRAN
> > has apparently just caught up to checking.  What's a little frustrating
> > (to me) is that the dependence of prediction on ffbase is *very* soft
> > ("enhances") ...
>
> No, but CRAN has still not received an update, and asked for one each
> month this year (in Jan, Feb, and March) without a response, so we
> assume prediction is unmaintained. Also, this was escalated to reverse
> depends.

Uwe, has the CRAN Team ever considered making these request-for-update
email messages from CRAN public in real time? For instance, there
could be a public "read-only" mailing list that anyone can subscribe
to, but not send/reply to. I see several advantages of such an
approach, e.g.

* maintainers of reverse dependencies would be aware of potential
problems much sooner,
* so would end-users who rely on the package (who often only notice
when a package is archived),
* the community could offer their help to the package maintainer early
on, and
* the community could help locate and notify maintainers whose email
addresses are no longer working.

I'd imagine this would help lower the workload on the CRAN Team, so a
win-win for everyone. I, for sure, would find that useful.

Thanks,

Henrik

> Best,
> Uwe Ligges
>
>
> >
> > On 2024-04-18 9:28 a.m., Thierry Onkelinx wrote:
> >> The cascade is even longer. prediction got archived because ffbase was
> >> no longer available. https://cran.r-project.org/web/packages/ffbase/
> >> 
> >>
> >> ir. Thierry Onkelinx
> >> Statisticus / Statistician
> >>
> >> Vlaamse Overheid / Government of Flanders
> >> INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE
> >> AND FOREST
> >> Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
> >> thierry.onkel...@inbo.be 
> >> Havenlaan 88 bus 73, 1000 Brussel
> >> *Postal address:* Koning Albert II-laan 15 bus 186, 1210 Brussel
> >> /Mail sent to this address is scanned and delivered to the addressee
> >> digitally. This allows the Government of Flanders to handle its files
> >> fully digitally. Items marked 'confidential' are not scanned, but are
> >> delivered to the addressee unopened./
> >> www.inbo.be 
> >>
> >> ///
> >> To call in the statistician after the experiment is done may be no
> >> more than asking him to perform a post-mortem examination: he may be
> >> able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
> >> The plural of anecdote is not data. ~ Roger Brinner
> >> The combination of some data and an aching desire for an answer does
> >> not ensure that a reasonable answer can be extracted from a given body
> >> of data. ~ John Tukey
> >> ///
> >>
> >> 
> >>
> >>
> >> On Thu, 18 Apr 2024 at 15:25, Ben Bolker wrote:
> >>
> >> Thank you! (I know about package_dependencies() but ran into
> >> precisely the same problem and didn't want to re-implement it ...)
> >>
> >> On 2024-04-18 9:11 a.m., Josiah Parry wrote:
> >>  > Well, after trying to install the package, I believe the issue is
> >>  > because margins has been archived.
> >>  >
> >>  > https://cran.r-project.org/web/packages/margins/index.html
> >> 
> >>  >
> >>  > You can check recursive dependencies using
> >>  > `tools::package_dependencies("pkg", recursive = TRUE)` which
> >> would help
> >>  > you in this case. I couldn't run it since the package is not
> >> available
> >>  > on CRAN, however.
> >>  >
> >>  >
> >>  > On Thu, Apr 18, 2024 at 9:05 AM Ben Bolker wrote:
> >>  >
> >>  > The 'dotwhisker' package was archived on CRAN: from
> >>  > https://cran.r-project.org/web/packages/dotwhisker/index.html
> >>  >
> >>  >
> >>  >   Package ‘dotwhisker’ was removed from the CRAN
> >> repository
> >>  >   ...
> >>  >
> >>  >   Archived on 2024-04-12 as requires archived package
> >> 'prediction'.
> >>  >
> >>  >

Re: [Bioc-devel] Important Bioconductor Release Deadlines

2024-04-08 Thread Henrik Bengtsson
> Henrik, if Bioconductor releases weren’t tied to R releases, how could 
> Bioconductor test them? I guess like CRAN where as long as a package says it 
> depends on R >= 2.0 then you’re free to try installing on 2.0 even though the 
> combination has never been tested? Maybe I worry too much about such testing 
> since CRAN seems OK anyways, as far as we know?

TL;DR: I think Bioconductor could say: "We have only tested Bioc
release on R version x.y. There is no guarantee it will work with
another version of R."

Here is my view:

The Bioconductor Project guarantees that the R packages that are part
of the Bioc release version and the Bioc development version are well
tested and work well together.  That is the agreed-upon objective.

Then there's the objective that Bioconductor wants to protect
end-users from shooting themselves in the foot. This is done by
preventing users from mix-and-match installing Bioc release and
development packages, which is the main role of the BiocManager
package.

Half of the year, it's easy to implement this protection, because the
Bioc release runs on R release and Bioc devel on R devel. I see that
as an obvious first approach, because R itself takes care of
everything for us, and Bioc release and devel are nicely separated on
the user's computer.  This model worked really well when R had two
releases per year.  However, ever since R moved to one release per
year while Bioc kept two, the above model cannot be used six months of
the year.  Instead, for those months, we have to manually manage our
own R package library if we would like to swiftly move between Bioc
release and Bioc devel versions.  I'm pretty sure few users and
developers actually bother with that.

I would like to see a model where you can instantaneously switch
between Bioconductor versions for a given R version when you launch R,
e.g. BiocManager::use("release"), BiocManager::use("devel"), and
BiocManager::use("3.16"). Under the hood, this would just reconfigure
.libPaths() and the R 'repos' option.

Now, imagine we had the latter in place, how would we look at the
dependency between Bioc version and R version?  I suspect it's the
technical problems that keep us stuck with the above half-broken model
rather than what we ideally want.


To address "how could Bioconductor test them?": I think this question
has two parts to it.

The first one is practical/financial. If we had sufficient compute
resources, we could of course test Bioc release and Bioc devel on R
oldrel, R release, and R devel. Those are the versions that CRAN
guarantees all CRAN packages support. If a CRAN package fails on one
of those, it will be archived on CRAN, unless you clearly state and
argue for a given R version requirement.

On the other hand, by promising to support multiple R versions, you
constrain what new R features an R package might use. So, if
Bioconductor wants to make life easy for package developers, so that
they don't have to worry about backward compatibility, then
Bioconductor might want to be more conservative and say: we only
guarantee support for R release and R devel.  Which is basically what
we have today (or at least half of the year).

To address "I guess like CRAN where as long as a package says it
depends on R >= 2.0 then you're free to try installing on 2.0 even
though the combination has never been tested?": Yes, CRAN only tests
on the R oldrel, release, and devel versions.  Anything beyond that is
"use at your own risk". I think that's perfectly fine. Some package
maintainers test against old versions (e.g. via GitHub Actions) and
maintain a "Depends: R (>= 2.0)" specification. There are also lots of
packages that are no longer updated and just keep working regardless
of new R releases; their code never changes, so if they passed CRAN's
checks on R 2.0.0 back in the day, they will still pass those checks.
I argue that going the extra mile to prevent users from running a Bioc
devel package on an older version of R, just because "Bioconductor
hasn't tested it, so we cannot guarantee it works, so we will not
allow you to install it", adds unnecessary friction for users (and
developers).  My opinion is that Bioconductor should only use the
built-in "Depends: R (>= x.y)" feature to deal with these dependencies
and rely on maintainers to keep them up-to-date. But if they don't,
it's not a big deal, because the overall expectation should be that
the Bioconductor Project only guarantees things to work for a specific
R version.

I think the biggest difference between Bioconductor and CRAN is the
decision on how to deal with broken package dependencies. CRAN takes
the freedom to kick packages out within two weeks. They don't have to
think about maintaining a stable "CRAN release" bundle for six months.
In contrast, Bioconductor promises to maintain "Bioc release" for six
months. OTOH, I don't know how this works in practice. I know broken
packages will not make it to the next Bioc release version if they're
not fixed, but 

Re: [Bioc-devel] Important Bioconductor Release Deadlines

2024-04-08 Thread Henrik Bengtsson
> ... I'm on Mac OS 12.5.

I've heard good things about 'rig' (The R Installation Manager;
https://github.com/r-lib/rig). It can install multiple R versions in
parallel on macOS, Windows, and Linux, without the different versions
conflicting with each other.  I recommend trying it out, because being
able to run different versions of R is always useful, even if you're
not a package maintainer.

/Henrik

On Thu, Apr 4, 2024 at 8:08 PM Anatoly Sorokin  wrote:
>
> Hi Henrik,
>
> thank you for the prompt reply. I'm on Mac OS 12.5.
>
> And regarding the linking of the Bioconductor version and the R version, my major 
> complaint is the timing: if R 4.4 hasn't been released yet, why not postpone this 
> dependency to the October release of Bioconductor?
>
> And it is also true that we have a lot of complaints from people working on 
> centrally maintained machines or HPC clusters, that they have no access to 
> the latest R versions. So sometimes even for demonstration purposes, we had 
> to install the package from the GitHub source, rather than from BiocManager.
>
> Cheers,
> Anatoly
>
> On Fri, Apr 5, 2024 at 11:57 AM Henrik Bengtsson  
> wrote:
>>
>> Hello,
>>
>> these days, it's quite straightforward to have multiple versions of R
>> installed in parallel without them conflicting with each other. I know
>> it works out of the box on MS Windows (just install all the versions
>> you'd like), and I know there are various tools to achieve the same on
>> macOS.  I'm on Linux, and I build R from source, so that solves it for
>> me.  What platform are you working on?  If you share that, I think
>> you'll get 1-2-3 instructions for how to install the R 4.4 pre-release
>> from users on that operating system.
>>
>> Regarding Bioc versions being tied to specific R versions: That is a
>> design decision that goes back to day one of the Bioconductor project.
>> It's a rather big thing to change.  That said, I've always been in the
>> camp that thinks we should move away from that model, for many reasons:
>> one is the friction added for developers, another is the friction added
>> for end-users, and some people may be stuck with older versions of R
>> with no control over updating.
>>
>> Hope this helps,
>>
>> Henrik
>>
>> On Thu, Apr 4, 2024 at 7:29 PM Anatoly Sorokin  wrote:
>> >
>> > Hi all,
>> >
>> > I'm sorry for the complaint, but do you really think it is wise to make the
>> > new release dependent on an R version which has not been released yet?
>> >
>> > I have a lot of R-related projects going on apart from maintaining the
>> > Bioconductor package and I'm not comfortable installing the unreleased
>> > version of R on my machine and spending time debugging it in the case of
>> > possible problems.
>> >
>> > At the same time, I have an error, possibly caused by a new version of
>> > GO.db package, which BioNAR is dependent upon and I can not fix it
>> > until the R 4.4 release on the 24th of April when I would have less than a
>> > day to fix the possible problem and fit into R CMD build and R CMD check by
>> > the Friday April 26th. Don't you think this is a rather tight time frame?
>> >
>> >
>> > Sorry once again, for the complaint.
>> >
>> > Cheers,
>> > Anatoly
>> >
>> > On Tue, Mar 26, 2024 at 11:06 PM Kern, Lori via Bioc-devel <
>> > bioc-devel@r-project.org> wrote:
>> >
> >> > > Important update:  The Bioconductor release will be May 1, following the
> >> > > release of R 4.4 on April 24.
>> > >
>> > > The Bioconductor 3.18 branch will be frozen Monday April 15th. After that
>> > > date, no changes will be permitted ever on that branch.
>> > >
> >> > > The deadline for devel Bioconductor 3.19 packages to pass R CMD build
> >> > > and R CMD check is Friday April 26th. While you will still be able to
> >> > > make commits past this date, this deadline ensures that any changes
> >> > > pushed to git.bioconductor.org are reflected in at least one build
> >> > > report before the devel branch is copied to the release 3.19 branch.
>> > >
>> > > Cheers,
>> > >
>> > >
>> > >
>> > > Lori Shepherd - Kern
>> > >
>> > > Bioconductor Core Team
>> > >
>> > > Roswell Park Comprehensive Cancer Center
>> > >
>> > > Department of Biostatistics & Bioinformatics
>> > >
>> > > Elm & Carlton Streets
>> > &g

Re: [Bioc-devel] Important Bioconductor Release Deadlines

2024-04-04 Thread Henrik Bengtsson
Hello,

these days, it's quite straightforward to have multiple versions of R
installed in parallel without them conflicting with each other. I know
it works out of the box on MS Windows (just install all the versions
you'd like), and I know there are various tools to achieve the same on
macOS.  I'm on Linux, and I build R from source, so that solves it for
me.  What platform are you working on?  If you share that, I think
you'll get 1-2-3 instructions for how to install the R 4.4 pre-release
from users on that operating system.

Regarding Bioc versions being tied to specific R versions: That is a
design decision that goes back to day one of the Bioconductor project.
It's a rather big thing to change.  That said, I've always been in the
camp that thinks we should move away from that model, for many reasons:
one is the friction added for developers, another is the friction added
for end-users, and some people may be stuck with older versions of R
with no control over updating.

Hope this helps,

Henrik

On Thu, Apr 4, 2024 at 7:29 PM Anatoly Sorokin  wrote:
>
> Hi all,
>
> I'm sorry for the complaint, but do you really think it is wise to make the
> new release dependent on an R version which has not been released yet?
>
> I have a lot of R-related projects going on apart from maintaining the
> Bioconductor package and I'm not comfortable installing the unreleased
> version of R on my machine and spending time debugging it in the case of
> possible problems.
>
> At the same time, I have an error, possibly caused by a new version of
> GO.db package, which BioNAR is dependent upon and I can not fix it
> until the R 4.4 release on the 24th of April when I would have less than a
> day to fix the possible problem and fit into R CMD build and R CMD check by
> the Friday April 26th. Don't you think this is a rather tight time frame?
>
>
> Sorry once again, for the complaint.
>
> Cheers,
> Anatoly
>
> On Tue, Mar 26, 2024 at 11:06 PM Kern, Lori via Bioc-devel <
> bioc-devel@r-project.org> wrote:
>
> > Important update:  The Bioconductor release will be May 1, following the
> > release of R 4.4 on April 24.
> >
> > The Bioconductor 3.18 branch will be frozen Monday April 15th. After that
> > date, no changes will be permitted ever on that branch.
> >
> > The deadline for devel Bioconductor 3.19 packages to pass R CMD build
> > and R CMD check is Friday April 26th. While you will still be able to
> > make commits past this date, this deadline ensures that any changes
> > pushed to git.bioconductor.org are reflected in at least one build
> > report before the devel branch is copied to the release 3.19 branch.
> >
> > Cheers,
> >
> >
> >
> > Lori Shepherd - Kern
> >
> > Bioconductor Core Team
> >
> > Roswell Park Comprehensive Cancer Center
> >
> > Department of Biostatistics & Bioinformatics
> >
> > Elm & Carlton Streets
> >
> > Buffalo, New York 14263
> >
> >
> > This email message may contain legally privileged and/or confidential
> > information.  If you are not the intended recipient(s), or the employee or
> > agent responsible for the delivery of this message to the intended
> > recipient(s), you are hereby notified that any disclosure, copying,
> > distribution, or use of this email message is prohibited.  If you have
> > received this message in error, please notify the sender immediately by
> > e-mail and delete this email message from your computer. Thank you.

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] Wish: a way to track progress of parallel operations

2024-03-25 Thread Henrik Bengtsson
Hello,

thanks for bringing this topic up; it would be excellent if we could
come up with a generic solution for this in base R.  It is one of the
most frequently asked questions and requested features in parallel
processing, but also in sequential processing. We have also seen lots
of variants on how to attack the problem of reporting on progress when
running in parallel.

As the author of Futureverse (a parallel framework), I've been exposed
to these requests, and I have thought quite a bit about how we could
solve this problem. I'll outline my opinionated view and suggestions
below:

* Target a solution that works the same regardless of whether we run
in parallel or not, i.e. the code/API should look the same regardless
of using, say, parallel::parLapply(), parallel::mclapply(), or
base::lapply(). The solution should also work as-is in other parallel
frameworks.

* Consider who owns the control of whether progress updates should be
reported or not. I believe it's best to separate what the end-user and
the developer control.  I argue the end-user should be able to decide
whether they want to "see" progress updates or not, while the
developer should focus on where to report progress, but not how and
when.

* In line with the previous comment, controlling progress reporting
via an argument (e.g. `.progress`) is not powerful enough. With such
an approach, one needs to make sure that the argument is exposed and
relayed through all nested function calls. If a package decides to
introduce such an argument, what should the default be? If they set
`.progress = TRUE`, then all of a sudden, any code/packages that
depend on this function will see progress updates.  There are endless
per-package versions of this on CRAN and Bioconductor, and they rarely
work in harmony.
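A tiny made-up sketch of that relaying problem; the function and
argument names are invented for illustration:

```r
## Made-up example: a '.progress' argument must be threaded through
## every layer by hand; one forgetful layer breaks the chain.
inner <- function(x, .progress = FALSE) {
  if (.progress) message("processing ", x)
  x^2
}

middle <- function(xs, .progress = FALSE) {
  ## relays the argument correctly ...
  vapply(xs, inner, numeric(1), .progress = .progress)
}

outer_fn <- function(xs) {
  ## ... but this layer does not expose '.progress', so the end-user
  ## has no way to enable progress reporting from the top level.
  middle(xs)
}

outer_fn(1:3)  # runs silently; progress cannot be requested
```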

* Consider accessibility as well as graphical user interfaces. This
means, don't assume progress is necessarily reported in the terminal.
I found it a good practice to never use the term "progress bar",
because that is too focused on how progress is reported.

* Let the end-user control how progress is reported, e.g. a progress
bar in the terminal, a progress bar in their favorite IDE/GUI,
OS-specific notifications, third-party notification services, auditory
output, etc.

The above objectives challenge you to take a step back and think about
what progress reporting is really about, beyond the most immediate
needs.  Based on them, I came up with the 'progressr' package
(https://progressr.futureverse.org/). FWIW, it was originally meant to
be a proof-of-concept proposal for a universal, generic solution to
this problem, but as demand grew and the prototype proved useful, I
made it official.  Here is the gist:

* Motto: "The developer is responsible for providing progress updates,
but it’s only the end user who decides if, when, and how progress
should be presented. No exceptions will be allowed."

* It relies on R's condition system to signal progress. The developer
signals progress conditions. Condition handlers, which the end-user
controls, are used to report/render these progress updates. The
support for global condition handlers, introduced in R 4.0.0, makes
this much more convenient. It is useful to think of the condition
mechanism in R as a back channel for communication that operates
separately from the rest of the "communication" stream (calling
functions with arguments and returning values).
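As a rough illustration of that back channel, here is a minimal base-R
sketch; this is not the actual progressr implementation, and the class
and function names are simplified:

```r
## Minimal sketch: progress signaled as conditions, kept separate from
## the regular return-value stream.
signal_progress <- function(i, n) {
  cond <- structure(
    class = c("myProgress", "immediateCondition", "condition"),
    list(message = sprintf("step %d of %d", i, n), call = sys.call())
  )
  signalCondition(cond)  # a no-op if nobody is listening
}

slow_task <- function(n = 3) {
  for (i in seq_len(n)) signal_progress(i, n)
  "result"  # the return value is untouched by the signaling
}

## The end-user (or a framework) decides how to render the updates:
withCallingHandlers(
  print(slow_task()),
  myProgress = function(cond) message("progress: ", conditionMessage(cond))
)
```

Because calling handlers let execution continue, the task runs to
completion while the updates are rendered along the way.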

* For parallel processing, progress conditions can be relayed back to
the parent process via back channels in a "near-live" fashion, or at
the very end when the parallel task is completed. Technically,
progress conditions inherit from 'immediateCondition', which is a
special class indicating that such conditions are allowed to be
relayed immediately and out of order. It is possible to use the
existing PSOCK socket connections to send such 'immediateCondition'
objects.

* No assumption is made about progress updates arriving in a certain
order. They are just a stream of messages saying "this much progress
was made".

* There is a progress handler API. Using this API, various types of
progress reporting can be implemented. This allows anyone to implement
progress handlers in contributed R packages.

See https://progressr.futureverse.org/ for more details.
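For concreteness, a small usage example with the progressr API as I
describe it above (the function name my_fcn is of course made up):

```r
library(progressr)

my_fcn <- function(xs) {
  p <- progressor(along = xs)   # the developer declares the steps ...
  lapply(xs, function(x) {
    p(sprintf("x = %g", x))     # ... and signals progress; no rendering here
    sqrt(x)
  })
}

## The end-user decides if and how progress is presented:
handlers("txtprogressbar")      # a terminal progress bar, for instance
with_progress(y <- my_fcn(1:5))
```

The same my_fcn() works unchanged under future.apply::future_lapply()
or other Futureverse backends; only the end-user's handler choice
changes how progress is presented.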

> I would be happy to prepare code and documentation. If there is no time now, 
> we can return to it after R-4.4 is released.

I strongly recommend not rushing this. This is an important, big
problem that goes beyond the 'parallel' package. I think it would be a
disfavor to introduce a '.progress' argument. As mentioned above, I
think a solution should work throughout the R ecosystem - all base-R
packages and beyond. I honestly think we could arrive at a solution
where base R provides a very light, yet powerful, progress API that
handles all of the above. The main task is to come up with a standard
API/protocol - then the implementation details do not matter.

/Henrik

On Mon, Mar 25, 

Re: [Rd] Ordered comparison operators on language objects will signal errors

2024-03-04 Thread Henrik Bengtsson
On Mon, Mar 4, 2024 at 8:45 AM luke-tierney--- via R-devel
 wrote:
>
> Comparison operators == and != can be used on language objects
> (i.e. call objects and symbols). The == operator in particular often
> seems to be used as a shorthand for calling identical(). The current
> implementation involves comparing deparsed calls as strings. This has
> a number of drawbacks and we would like to transition to a more robust
> and efficient implementation. As a first step, R-devel will soon be
> modified to signal an error when the ordered comparison operators <,
> <=, >, >= are used on language objects. A small number of CRAN and
> BIOC packages will fail after this change. If you want to check your
> packages or code before the change is committed you can run the
> current R-devel with the environment variable setting
>
>  _R_COMPARE_LANG_OBJECTS=eqonly

A minor comment, which may or may not matter, depending on how long
you're planning to keep that variable around. I believe all other
"internal" environment variables in the R source code that start with
_R_ also end with an underscore (_).  This name is an outlier in that
sense. So, maybe it should be named '_R_COMPARE_LANG_OBJECTS_'
instead? (I checked the source code - it's indeed without the trailing
underscore.)

/Henrik

>
> where using such a comparison now produces
>
>  > quote(x + y) > 1
>  Error in quote(x + y) > 1 :
>comparison (>) is not possible for language types
>
> Best,
>
> luke
>
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] Unusually long execution time for R.utils::gzip on r-devel-windows

2024-02-17 Thread Henrik Bengtsson
I can confirm that this has to be fixed in R.utils. The gist is that
R.utils does lots of validation of read/write permissions, and deep
down it relies on system("dir") as a fallback method. If this is done
on dirname(tempdir()), then it'll find a lot of files, e.g.

[1] " Datenträger in Laufwerk D: ist Daten"
[2] " Volumeseriennummer: 1826-A193"
[3] ""
[4] " Verzeichnis von D:\\temp"
[5] ""
[6] "17.02.2024  09:06  ."
[7] "14.02.2024  03:36 0 cc6H4Sp5"
[8] "15.02.2024  03:46 0 cc6PwKb4"
[9] "15.02.2024  16:25 0 cc6RH27v"
   [10] "16.02.2024  01:50 0 ccafzzMl"
   ...
[7] "09.02.2024  04:48  RtmpURWDbA"
[8] "14.02.2024  04:35  RtmpURWeVC"
[9] "15.02.2024  04:00  RtmpUrwhHU"
 [ reached getOption("max.print") -- omitted 17165 entries ]
Time difference of 18.67841 secs

So, yeah, wow!  I'll look into fixing this, probably by removing this
fallback approach, which is very rarely needed; it was added way back
when Sys.readlink() didn't cover all cases.
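For what it's worth, a symlink test that avoids shelling out entirely
could look something like this; a sketch only, assuming Sys.readlink()
covers the cases of interest on modern R:

```r
## Sketch: detect a symbolic link without the system("dir") fallback.
## Sys.readlink() returns "" for non-links and NA where the check is
## not supported, so both must be treated as "not a symlink" here.
is_symlink <- function(path) {
  link <- Sys.readlink(path)
  !is.na(link) && nzchar(link)
}

## Example with a regular (non-link) file:
tf <- tempfile()
file.create(tf)
is_symlink(tf)  # FALSE for a regular file
```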

/Henrik

On Fri, Feb 16, 2024 at 9:24 PM Henrik Bengtsson
 wrote:
>
> Author of R.utils here. I happen to investigate this too right now,
> because of extremely slow win-builder performance of R.rsp checks,
> which in turn depends on R.utils.
>
> It's not obvious to me why this happens on win-builder. I've noticed
> slower and slower win-builder/cran-incoming checks over the years,
> despite the code not changing. Right now, I'm investigating a piece of
> code that calls shell("dir") as a fallback to figure out if a file is
> a symbolic link or not - it could be that that takes a very long time on
> win-builder.
>
> So, stay tuned ... I'll report back when I find something out.
>
> /Henrik
>
> On Fri, Feb 16, 2024 at 4:43 PM Stefan Mayer
>  wrote:
> >
> > Dear list,
> >
> > I tried to submit an update to my R package imagefluency, but the update 
> > does not pass the incoming checks automatically. The problem is that one of 
> > the examples takes too long to execute – but only under Windows with the 
> > development version of R (R Under development (unstable) (2024-02-15 r85925 
> > ucrt)).
> >
> > I was able to pin down the problem to using R.utils::gzip(). I created a 
> > test package that illustrates the problem: https://github.com/stm/ziptest
> > The package has two functions that zip a file given a file path. When using 
> > R.utils::gzip() (function `gzipit()` in the test package), I get a NOTE 
> > when checking the package using devtools::check_win_devel()
> >
> > * checking examples ... [55s] NOTE
> > Examples with CPU (user + system) or elapsed time > 10s
> >user system elapsed
> > gzipit 6.91  47.24   54.84
> >
> > There is no issue with utils::zip() (function `zipit()` in the test 
> > package). Is this somehow a bug in R.utils::gzip(), or is there an issue 
> > with the combination of Windows and r-devel?
> >
> > Best, Stefan
> >

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Unusually long execution time for R.utils::gzip on r-devel-windows

2024-02-16 Thread Henrik Bengtsson
Author of R.utils here. I happen to investigate this too right now,
because of extremely slow win-builder performance of R.rsp checks,
which in turn depends on R.utils.

It's not obvious to me why this happens on win-builder. I've noticed
slower and slower win-builder/cran-incoming checks over the years,
despite the code not changing. Right now, I'm investigating a piece of
code that calls shell("dir") as a fallback to figure out if a file is
a symbolic link or not - it could be that that takes a very long time on
win-builder.

So, stay tuned ... I'll report back when I find something out.

/Henrik

On Fri, Feb 16, 2024 at 4:43 PM Stefan Mayer
 wrote:
>
> Dear list,
>
> I tried to submit an update to my R package imagefluency, but the update does 
> not pass the incoming checks automatically. The problem is that one of the 
> examples takes too long to execute – but only under Windows with the 
> development version of R (R Under development (unstable) (2024-02-15 r85925 
> ucrt)).
>
> I was able to pin down the problem to using R.utils::gzip(). I created a test 
> package that illustrates the problem: https://github.com/stm/ziptest
> The package has two functions that zip a file given a file path. When using 
> R.utils::gzip() (function `gzipit()` in the test package), I get a NOTE when 
> checking the package using devtools::check_win_devel()
>
> * checking examples ... [55s] NOTE
> Examples with CPU (user + system) or elapsed time > 10s
>user system elapsed
> gzipit 6.91  47.24   54.84
>
> There is no issue with utils::zip() (function `zipit()` in the test package). 
> Is this somehow a bug in R.utils::gzip(), or is there an issue with the 
> combination of Windows and r-devel?
>
> Best, Stefan
>

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] round.Date and trunc.Date not working / implemented

2024-02-08 Thread Henrik Bengtsson
Technically, there is a round() for 'Date' objects, but it doesn't
seem very useful, because it basically just falls back to the default
round() method, which only takes the 'digits' argument.

Here's an example:

> date <- Sys.Date()
> class(date)
[1] "Date"

We see that there are only two round() methods in addition to the
implicit built-in one;

> methods("round")
[1] round.Date   round.POSIXt
see '?methods' for accessing help and source code

Looking at round() for 'Date';

> round.Date
function (x, ...)
{
.Date(NextMethod(), oldClass(x))
}


we see that it defers to the next method, which here is the built-in
one. The built-in one only accepts 'digits', which does nothing for
digits >= 0.  For digits < 0, it rounds to powers of ten, e.g.

> date
[1] "2024-02-08"
> round(date, digits = 0)
[1] "2024-02-08"
> round(date, digits = 1)
[1] "2024-02-08"
> round(date, digits = 2)
[1] "2024-02-08"
> round(date, digits = -1)
[1] "2024-02-07"
> round(date, digits = -2)
[1] "2024-03-18"
> round(date, digits = -3)
[1] "2024-10-04"
> round(date, digits = -4)
[1] "2024-10-04"
> round(date, digits = -5)
[1] "1970-01-01"

So, although technically incorrect, the OP's remark is a valid one.
I'd also expect round() for Date to support 'units' similar to
timestamps, e.g.

> time <- Sys.time()
> class(time)
[1] "POSIXct" "POSIXt"
> time
[1] "2024-02-08 09:17:02 PST"
> round(time, units = "days")
[1] "2024-02-08 PST"
> round(time, units = "months")
[1] "2024-02-01 PST"
> round(time, units = "years")
[1] "2024-01-01 PST"

So, I agree with OP that one would expect:

> round(date, units = "days")
[1] "2024-02-08"
> round(date, units = "months")
[1] "2024-02-01"
> round(date, units = "years")
[1] "2024-01-01"

to also work here.
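One way to get that behavior today is to route through POSIXt, where
the 'units' logic already exists; a sketch only, assuming an R version
whose round() for POSIXt supports "months" and "years", and using UTC
so the date does not shift across time zones:

```r
## Sketch: a units-aware round() for Date, piggybacking on the POSIXt
## method via as.POSIXct().
round_date <- function(x, units = c("days", "months", "years")) {
  units <- match.arg(units)
  as.Date(round(as.POSIXct(x, tz = "UTC"), units = units), tz = "UTC")
}

round_date(as.Date("2024-02-08"), "months")  # "2024-02-01"
round_date(as.Date("2024-02-08"), "years")   # "2024-01-01"
```

A proper round.Date method in base R could do the same without the
round trip through POSIXct.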

FWIW, I don't think we want to encourage circumventing the S3 generic
and calling S3 methods directly, i.e. I don't recommend doing things
like round.POSIXt(...). Ideally, all S3 methods in R would be
non-exported, but some remain exported for legacy reasons. But I think
we should treat them as if they will become non-exported in the
future.

/Henrik

On Thu, Feb 8, 2024 at 8:18 AM Olivier Benz via R-devel
 wrote:
>
> > On 8 Feb 2024, at 15:15, Martin Maechler  wrote:
> >
> >> Jiří Moravec
> >>on Wed, 7 Feb 2024 10:23:15 +1300 writes:
> >
> >> This is my first time working with dates, so if the answer is "Duh, work
> >> with POSIXt", please ignore it.
> >
> >> Why is not `round.Date` and `trunc.Date` "implemented" for `Date`?
> >
> >> Is this because `Date` is (mostly) a virtual class setup for a better
> >> inheritance or is that something that is just missing? (like
> >> `sort.data.frame`). Would R core welcome a patch?
> >
> >> I decided to convert some dates to date using `as.Date` function, which
> >> converts to a plain `Date` class, because that felt natural.
> >
> >> But then when trying to round to closest year, I have realized that the
> >> `round` and `trunc` for `Date` do not behave as for `POSIXt`.
> >
> >> I would assume that these will have equivalent output:
> >
> >> Sys.time() |> round("years") # 2024-01-01 NZDT
> >
> >> Sys.Date() |> round("years") # Error in round.default(...): non-numeric
> >> argument to mathematical function
> >
> >
> >> Looking at the code (and reading the documentation more carefully) shows
> >> the issue, but this looks like an omission that should be patched.
> >
> >> -- Jirka
> >
> > You are wrong:  They *are* implemented,
> > both even visible since they are in the 'base' package!
> >
> > ==> they have help pages you can read 
> >
> > Here are examples:
> >
> >> trunc(Sys.Date())
> > [1] "2024-02-08"
> >> trunc(Sys.Date(), "month")
> > [1] "2024-02-01"
> >> trunc(Sys.Date(), "year")
> > [1] "2024-01-01"
> >>
> >
>
> Maybe he meant
>
> r$> Sys.time() |> round.POSIXt("years")
> [1] "2024-01-01 CET"
>
> r$> Sys.Date() |> round.POSIXt("years")
> [1] "2024-01-01 UTC"
>
> The only difference is the timezone
>
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

On Thu, Feb 8, 2024 at 9:06 AM Rui Barradas  wrote:
>
> Às 14:36 de 08/02/2024, Olivier Benz via R-devel escreveu:
> >> On 8 Feb 2024, at 15:15, Martin Maechler  
> >> wrote:
> >>
> >>> Jiří Moravec
> >>> on Wed, 7 Feb 2024 10:23:15 +1300 writes:
> >>
> >>> This is my first time working with dates, so if the answer is "Duh, work
> >>> with POSIXt", please ignore it.
> >>
> >>> Why is not `round.Date` and `trunc.Date` "implemented" for `Date`?
> >>
> >>> Is this because `Date` is (mostly) a virtual class setup for a better
> >>> inheritance or is that something that is just missing? (like
> >>> `sort.data.frame`). Would R core welcome a patch?
> >>
> >>> I decided to convert some dates to date using `as.Date` function, which
> >>> converts to a plain `Date` class, 

Re: [Rd] [EXTERNAL] Re: NOTE: multiple local function definitions for 'fun' with different formal arguments

2024-02-06 Thread Henrik Bengtsson
Here's a dummy example that I think illustrates the problem:

toto <- function() {
  if (runif(1) < 0.5)
function(a) a
  else
function(a,b) a+b
}

> fcn <- toto()
> fcn(1,2)
[1] 3
> fcn <- toto()
> fcn(1,2)
[1] 3
> fcn <- toto()
> fcn(1,2)
Error in fcn(1, 2) : unused argument (2)

How can you use the returned function, if you get different arguments?

In your example, you cannot use the returned function without knowing
'mode', or by inspecting the returned function.  So, the warning is
there to alert you to a potential bug.  Anecdotally, I'm pretty sure
this R CMD check NOTE has caught at least one such bug in one of
my/our packages.

If you want to keep the current design pattern, one approach could be
to add ... to your function definitions:

toto <- function(mode)
{
 if (mode == 1)
 fun <- function(a, b, ...) a*b
 else
 fun <- function(u, v, w) (u + v) / w
 fun
}

to make sure that toto() returns functions that accept the same
minimal number of arguments.
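
With that change, callers can always pass the larger argument set,
because any extra arguments are silently absorbed by '...' in the
first variant (a small demo):

    toto <- function(mode) {
      if (mode == 1)
        fun <- function(a, b, ...) a * b   ## ... absorbs unused extras
      else
        fun <- function(u, v, w) (u + v) / w
      fun
    }

    toto(1)(2, 3, 4)  ## third argument ignored -> 6
    toto(2)(2, 3, 4)  ## all three used -> 1.25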

/Henrik

On Tue, Feb 6, 2024 at 1:15 PM Izmirlian, Grant (NIH/NCI) [E] via
R-devel  wrote:
>
> Because functions get called and therefore, the calling sequence matters. 
> It’s just protecting you from yourself, but as someone pointed out, there’s a 
> way to silence such notes.
> G
>
>
> From: Hervé Pagès 
> Sent: Tuesday, February 6, 2024 2:40 PM
> To: Izmirlian, Grant (NIH/NCI) [E] ; Duncan Murdoch 
> ; r-devel@r-project.org
> Subject: Re: [EXTERNAL] Re: [Rd] NOTE: multiple local function definitions 
> for 'fun' with different formal arguments
>
>
> On 2/6/24 11:19, Izmirlian, Grant (NIH/NCI) [E] wrote:
> The note refers to the fact that the function named ‘fun’ appears to be 
> defined in two different ways.
>
> Sure I get that. But how is that any different from a variable being defined 
> in two different ways like in
>
> if (mode == 1)
> x <- -8
> else
> x <- 55
>
> This is such a common and perfectly fine pattern. Why would this be 
> considered a potential hazard when the variable is a function?
>
> H.
>
> From: Hervé Pagès 
> 
> Sent: Tuesday, February 6, 2024 2:17 PM
> To: Duncan Murdoch 
> ; Izmirlian, Grant 
> (NIH/NCI) [E] ; 
> r-devel@r-project.org
> Subject: [EXTERNAL] Re: [Rd] NOTE: multiple local function definitions for 
> 'fun' with different formal arguments
>
>
> Thanks. Workarounds are interesting but... what's the point of the NOTE in 
> the first place?
>
> H.
> On 2/4/24 09:07, Duncan Murdoch wrote:
> On 04/02/2024 10:55 a.m., Izmirlian, Grant (NIH/NCI) [E] via R-devel wrote:
>
>
> Well you can see that yeast is exactly weekday you have.  The way out is to 
> just not name the result
>
> I think something happened to your explanation...
>
>
>
>
> toto <- function(mode)
> {
>  ifelse(mode == 1,
>  function(a,b) a*b,
>  function(u, v, w) (u + v) / w)
> }
>
> It's a bad idea to use ifelse() when you really want if() ... else ... .  In 
> this case it works, but it doesn't always.  So the workaround should be
>
>
> toto <- function(mode)
> {
> if(mode == 1)
> function(a,b) a*b
> else
> function(u, v, w) (u + v) / w
> }
>
>
>
>
>
>
> 
> From: Grant Izmirlian 
> Date: Sun, Feb 4, 2024, 10:44 AM
> To: "Izmirlian, Grant (NIH/NCI) [E]" 
> 
> Subject: Fwd: [EXTERNAL] R-devel Digest, Vol 252, Issue 2
>
> Hi,
>
> I just ran into this 'R CMD check' NOTE for the first time:
>
> * checking R code for possible problems ... NOTE
> toto: multiple local function definitions for 'fun' with different
>formal arguments
>
> The "offending" code is something like this (simplified from the real code):
>
> toto <- function(mode)
> {
>  if (mode == 1)
>  fun <- function(a, b) a*b
>  else
>  fun <- function(u, v, w) (u + v) / w
>  fun
> }
>
> Is that NOTE really intended? Hard to see why this code would be
> considered "wrong".
>
> I know it's just a NOTE but still...
>
> I agree it's a false positive, but the issue is that you have a function 
> object in your function which can't be called unconditionally.  The 
> workaround doesn't create such an object.
>
> Recognizing that your function never tries to call fun requires global 
> inspection of toto(), and most of the checks are based on local inspection.
>
> Duncan Murdoch
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
>
> Hervé Pagès
>
>
>
> Bioconductor Core Team
>
> hpages.on.git...@gmail.com
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and are confident 
> the content is safe.
>
>
> --
>
> 

Re: [R-pkg-devel] new maintainer for CRAN package XML

2024-01-25 Thread Henrik Bengtsson
Some thoughts for what the next steps could be:

I could be wrong, but I doubt that there's someone out there that
wants to pick up 'XML' and start developing it beyond keeping the
lights on.  I also think the community agrees that 'xml2' is the
recommended package to use for XML-related needs.  If this can be
agreed upon, I think the best path forward would be to declare 'XML'
deprecated. This is something that the current maintainer (CRAN Team)
could already do, by updating the documentation and the package
Description field to:

[WARNING: The 'XML' package is deprecated. Please do not add it as a
dependency to your R package.] Many approaches for both reading and
creating XML (and HTML) documents (including DTDs), both local and
accessible via HTTP or FTP. Also offers access to an 'XPath'
"interpreter".

That would at least help stop the influx of new packages being
added that depend on XML, especially since not everyone might be
aware of the state of 'XML' and that 'xml2' exists as an alternative.

This also begs the question: Do we need a way to formally declare a
whole package being deprecated?  Is this something that could be added
to the DESCRIPTION file?  Deprecated: TRUE?  Or something richer like
"Lifecycle: deprecated"?  With such a mechanism in place, 'R CMD
check' could give a NOTE for deprecated dependencies.  That could also
be a block for *new* CRAN submissions; existing CRAN packages should be
accepted ("grandfathered in").  With a richer {deprecation, ...,
defunct} vocabulary, we could produce WARNINGs and eventually ERRORs
too.
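
As a sketch, such a declaration in a package's DESCRIPTION file could
look like the following (the 'Lifecycle' and 'Deprecation-Note' field
names are hypothetical; no such fields exist in R today):

    Package: somepkg
    Version: 1.0.0
    Lifecycle: deprecated
    Deprecation-Note: Superseded by 'otherpkg'; please do not add new
        dependencies on this package.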


/Henrik

On Thu, Jan 25, 2024 at 9:11 AM Heather Turner  wrote:
>
> Re: guidance on how to migrate from XML to xml2, these notes from Daniel Nüst 
> may be helpful: 
> https://gist.github.com/nuest/3ed3b0057713eb4f4d75d11bb62f2d66.
>
> Best wishes,
> Heather
>
> On Wed, Jan 24, 2024, at 3:38 PM, Emmanuel Blondel wrote:
> > if XML is deprecated, then what would be the choice for a package
> > maintainer? Move to xml2 probably at some point I assume
> >
> > I use XML in the R packages I've been developing. For some of them, I
> > started before CRAN started being the maintainer, and before xml2
> > inception. The thing is that XML fulfills requirements, it works and
> > fulfills needs of depending packages that made the choice to use it. For
> > this, it deserves to be maintained in CRAN, without having to enter into
> > comparison exercices with other packages that , as of today, may be
> > better to rely on (with certainly very good reasons).
> >
> > Moving to xml2 (or whatever other package), which although I could agree
> > on the principle, can be costly for packages that use extensively XML.
> > Doing so would mean that we first get the assurance that all XML
> > features are covered elsewhere, and can be migrated smoothly.
> >
> > In any case, please acknowledge that this kind of migration may take
> > time and require resources that vary (or even are missing) depending on
> > the package projects. I doubt having CRAN setting a common deadline for
> > retirement is a good way to foster an efficient maintenance of R
> > packages depending on XML. It would be good to receive guidance how to
> > migrate, while ensuring backward compatibility on our package features.
> >
> > Best
> >
> > Le 24/01/2024 à 15:59, Jeroen Ooms a écrit :
> >> On Mon, Jan 22, 2024 at 3:51 PM Uwe Ligges
> >>  wrote:
> >>> Dear package developers,
> >>>
> >>> the CRAN team (and Professor Ripley in particular) has been the defacto
> >>> maintainer of CRAN package 'XML'.
> >>> Our hope was that maintainers of packages depending on XML will migrate
> >>> to other packages for reading XML structures. This has not happened and
> >>> we still see dozens of strong dependencies on XML.
> >> How is this hope communicated? Many R users assume that XML package is
> >> in great shape and the preferable choice because it is maintained by
> >> the CRAN team and r-core members.
> >>
> >> Perhaps one could follow the precedent from the rgdal retirement, and
> >> set a deadline.
> >>
> >> One way to communicate this effectively would be by introducing a
> >> formal deprecation field in the package description. This could then
> >> be displayed on the XML CRAN html page, and when loading the package
> >> interactively. Other packages that import such a deprecated package
> >> could be given a CMD check warning.
> >>
> >> __
> >> R-package-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
> > __
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list

Re: [R-pkg-devel] Native pipe in package examples

2024-01-25 Thread Henrik Bengtsson
On Thu, Jan 25, 2024 at 9:23 AM Berwin A Turlach
 wrote:
>
> G'day Duncan,
>
> On Thu, 25 Jan 2024 11:27:50 -0500
> Duncan Murdoch  wrote:
>
> > On 25/01/2024 11:18 a.m., Henrik Bengtsson wrote:
> [...]
> > I think you're right that syntax errors in help page examples will be
> > installable, but I don't think there's a way to make them pass "R CMD
> > check" other than wrapping them in \dontrun{}, and I don't know a way
> > to do that conditional on the R version.
>
> I remember vaguely that 'S Programming' was discussing some nifty
> tricks to deal with differences between S and R, and how to write code
> that would work with either.  If memory serves correctly, those
> tricks depended on whether a macro called using_S (using_R?) was
> defined. Not sure if the same tricks could be used to distinguish
> between different versions of R.
>
> But you could always code your example (not tested :-) ) along lines
> similar to:
>
> if( with(version, all(as.numeric(c(major, minor)) >= c(4, 1))) ){
>   ## code that uses native pipe
> }else{
>   cat("You have to upgrade to R >= 4.1.0 to run this example\n")
> }

That will unfortunately not work in this case, because |> is part of
the new *syntax* that was introduced in R 4.1.0.  Older versions of R
simply don't know how to *parse* those two symbols next to each
other, e.g.

{R 4.1.0}> parse(text = "1:3 |> sum()")
expression(1:3 |> sum())

{R 4.0.5}> parse(text = "1:3 |> sum()")
Error in parse(text = "1:3 |> sum()") : <text>:1:6: unexpected '>'
1: 1:3 |>
 ^

In order for R to execute some code, it needs to be able to parse it
first. Only then can it execute it.  So, here, we're not even getting
past the parsing phase.
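
One known workaround, if one really needed the code to at least load
on R (< 4.1.0), is to defer parsing to runtime, e.g. via
eval(parse(text = ...)), so that older R never has to parse the new
syntax (a sketch; whether it's worth the obfuscation is another
matter):

    ## The string is only parsed when this line is evaluated, so the
    ## surrounding file still parses on R (< 4.1.0).
    if (getRversion() >= "4.1.0") {
      eval(parse(text = "1:3 |> sum()"))
    } else {
      sum(1:3)
    }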

/Henrik

>
>
> > I would say that a package that doesn't pass "R CMD check" without
> > errors shouldn't be trusted.
>
> Given the number of packages on CRAN and Murphy's law (or equivalents),
> I would say that there are packages that do pass "R CMD check" without
> errors but shouldn't be trusted, own packages not excluded. :)
>
> Cheers,
>
> Berwin

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Native pipe in package examples

2024-01-25 Thread Henrik Bengtsson
On Thu, Jan 25, 2024 at 8:27 AM Duncan Murdoch  wrote:
>
> On 25/01/2024 11:18 a.m., Henrik Bengtsson wrote:
> > On Thu, Jan 25, 2024 at 7:48 AM Duncan Murdoch  
> > wrote:
> >>
> >> On 25/01/2024 10:27 a.m., Josiah Parry wrote:
> >>> Hey all,
> >>>
> >>> I've encountered use of the native pipe operator in the examples for
> >>> 'httr2' e.g.
> >>>
> >>> request("http://example.com") |> req_dry_run()
> >>>
> >>>
> >>> Since r-oldrel (according to rversions::r_oldrel()) is now 4.2.3, can the
> >>> native pipe be used in example code?
> >>>
> >>> I do notice that the package httr2 requires R >= 3.6.0 which implies that
> >>> the code itself does not use the native pipe, but the examples do.
> >>
> >> I think that the package should state it requires R (>= 4.1.0), since
> >> that code won't work in earlier versions.
> >>
> >> I believe it's a syntax error before 4.1.0, but don't have a copy handy
> >> to test.
> >
> > Yes, support for the |> syntax was introduced in R 4.1.0;
> >
> > $ Rscript --vanilla -e "getRversion()" -e "1:10 |> sum()"
> > [1] ‘4.0.5’
> > Error: unexpected '>' in "1:10 |>"
> > Execution halted
> >
> > $ Rscript --vanilla -e "getRversion()" -e "1:10 |> sum()"
> > [1] ‘4.1.0’
> > [1] 55
> >
> >> That means the package won't pass R CMD check in those old
> >> versions.  If it wasn't a syntax error, just a case of using a new
> >> feature, then I think it would be fine to put in a run-time test of the
> >> R version to skip code that won't run properly.
> >
> > There's also the distinction of package code versus code in
> > documentation. If it's only example code in help pages that use the
> > native pipe, but the code in R/*.R does not, then the package will
> > still install and work with R (< 4.1.0).  The only thing that won't
> > work is when the user tries to run the code in the documented
> > examples.  I'd argue that it's okay to specify, say, R (>= 3.6.0) in
> > such an example.  It allows users with older versions to still use the
> > package, while already now migrating the documentation to use newer
> > syntax.
>
> Is there a way to do that so that R will pay attention, or do you mean
> just saying it in a comment?

As a "comment".

>
> I think you're right that syntax errors in help page examples will be
> installable, but I don't think there's a way to make them pass "R CMD
> check" other than wrapping them in \dontrun{}, and I don't know a way to
> do that conditional on the R version.

I think

$ R CMD check --no-examples --no-vignettes ...

would check everything else but examples and vignettes.

>
> I would say that a package that doesn't pass "R CMD check" without
> errors shouldn't be trusted.

Somewhat agree, but we still get some "trust" from the fact that the
package passes R CMD check --as-cran on R (>= 4.1.0).  Also, if the
maintainer documents something like "On R (< 4.1.0), the package
passes 'R CMD check --no-examples ...'; we use R (>= 4.1.0)-specific
syntax in some of the help-page examples", then there's additional
"trust" in its working there.  But, yes, there's less "trust" here.
Still, I think it's okay for maintainers to declare "R (>= 3.6.0)" to
be backward compatible.  Another way to put it: it would be extreme to
require "R (>= 4.1.0)" just because of a single "1:3 |> sum()" in some
example code.

/Henrik

PS. Personally, I'd skip the use of |> in examples to avoid these concerns.

>
> Duncan Murdoch

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Native pipe in package examples

2024-01-25 Thread Henrik Bengtsson
On Thu, Jan 25, 2024 at 7:48 AM Duncan Murdoch  wrote:
>
> On 25/01/2024 10:27 a.m., Josiah Parry wrote:
> > Hey all,
> >
> > I've encountered use of the native pipe operator in the examples for
> > 'httr2' e.g.
> >
> > request("http://example.com") |> req_dry_run()
> >
> >
> > Since r-oldrel (according to rversions::r_oldrel()) is now 4.2.3, can the
> > native pipe be used in example code?
> >
> > I do notice that the package httr2 requires R >= 3.6.0 which implies that
> > the code itself does not use the native pipe, but the examples do.
>
> I think that the package should state it requires R (>= 4.1.0), since
> that code won't work in earlier versions.
>
> I believe it's a syntax error before 4.1.0, but don't have a copy handy
> to test.

Yes, support for the |> syntax was introduced in R 4.1.0;

$ Rscript --vanilla -e "getRversion()" -e "1:10 |> sum()"
[1] ‘4.0.5’
Error: unexpected '>' in "1:10 |>"
Execution halted

$ Rscript --vanilla -e "getRversion()" -e "1:10 |> sum()"
[1] ‘4.1.0’
[1] 55

> That means the package won't pass R CMD check in those old
> versions.  If it wasn't a syntax error, just a case of using a new
> feature, then I think it would be fine to put in a run-time test of the
> R version to skip code that won't run properly.

There's also the distinction of package code versus code in
documentation. If it's only example code in help pages that use the
native pipe, but the code in R/*.R does not, then the package will
still install and work with R (< 4.1.0).  The only thing that won't
work is when the user tries to run the code in the documented
examples.  I'd argue that it's okay to specify, say, R (>= 3.6.0) in
such an example.  It allows users with older versions to still use the
package, while already now migrating the documentation to use newer
syntax.

/Henrik
>
> Duncan Murdoch
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] tools::startDynamicHelp(): Randomly prevents R from exiting (on MS Windows)

2024-01-06 Thread Henrik Bengtsson
Thank you for confirming this. I just filed PR#18650
(https://bugs.r-project.org/show_bug.cgi?id=18650).

FWIW, I've found two other issues with startDynamicHelp() prior to this:

* https://bugs.r-project.org/show_bug.cgi?id=18645
* https://bugs.r-project.org/show_bug.cgi?id=18648

/Henrik

On Sat, Jan 6, 2024 at 5:53 PM Steve Martin  wrote:
>
> Henrik,
>
> I was able to reproduce this both with Rscript and interactively using the 
> same version of R you're using (fresh install) and Windows 10.0.22621.2715. 
> It took about a dozen tries.
>
> Steve
>
>
>
>
>
>
>  Original Message ----
> On Jan 6, 2024, 12:38, Henrik Bengtsson < henrik.bengts...@gmail.com> wrote:
>
>
> ISSUE: On MS Windows, running cmd.exe, calling Rscript --vanilla -e "port <- 
> tools::startDynamicHelp(); port; port <- tools::startDynamicHelp(FALSE); 
> port" will sometimes stall R at the end, preventing it from exiting. This 
> also happens when running R in interactive mode. It seems to stem from 
> calling tools::startDynamicHelp(FALSE). Before filing a formal bug report, 
> can someone please confirm this behavior? You might have to call it multiple 
> times to hit the bug. DETAILS: Microsoft Windows [Version 10.0.19045.3803] 
> (c) Microsoft Corporation. All rights reserved. C:\Users\hb>R --version R 
> version 4.3.2 (2023-10-31 ucrt) -- "Eye Holes" Copyright (C) 2023 The R 
> Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 
> (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are 
> welcome to redistribute it under the terms of the GNU General Public License 
> versions 2 or 3. For more information about these matters see 
> https://www.gnu.org/licenses/. C:\Users\hb> Rscript --vanilla -e "port <- 
> tools::startDynamicHelp(); port; port <- tools::startDynamicHelp(FALSE); 
> port" starting httpd help server ... done [1] 18897 [1] 0 [WORKED] 
> C:\Users\hb> Rscript --vanilla -e "port <- tools::startDynamicHelp(); port; 
> port <- tools::startDynamicHelp(FALSE); port" starting httpd help server ... 
> done [1] 17840 [1] 0 [STALLED] Bugwhisperer Bengtsson 
> __ R-devel@r-project.org mailing 
> list https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] tools::startDynamicHelp(): Randomly prevents R from exiting (on MS Windows)

2024-01-06 Thread Henrik Bengtsson
ISSUE:

On MS Windows, running cmd.exe, calling

Rscript --vanilla -e "port <- tools::startDynamicHelp(); port; port <-
tools::startDynamicHelp(FALSE); port"

will sometimes stall R at the end, preventing it from exiting.  This
also happens when running R in interactive mode.  It seems to stem
from calling tools::startDynamicHelp(FALSE).

Before filing a formal bug report, can someone please confirm this
behavior? You might have to call it multiple times to hit the bug.

DETAILS:

Microsoft Windows [Version 10.0.19045.3803]
(c) Microsoft Corporation. All rights reserved.

C:\Users\hb>R --version
R version 4.3.2 (2023-10-31 ucrt) -- "Eye Holes"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.

C:\Users\hb> Rscript --vanilla -e "port <- tools::startDynamicHelp();
port; port <- tools::startDynamicHelp(FALSE); port"
starting httpd help server ... done
[1] 18897
[1] 0

[WORKED]

C:\Users\hb> Rscript --vanilla -e "port <- tools::startDynamicHelp();
port; port <- tools::startDynamicHelp(FALSE); port"
starting httpd help server ... done
[1] 17840
[1] 0

[STALLED]

Bugwhisperer Bengtsson

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] Wrong mailing list: Could the 100 byte path length limit be lifted?

2023-12-13 Thread Henrik Bengtsson
On Wed, Dec 13, 2023 at 8:06 AM McGrath, Justin M  wrote:
>
> On Windows, packages will be in "C:\Users\[User 
> Name]\Documents\R\win-library\[R version]\[Package Name]".

In R (>= 4.2.0), the default R_LIBS_USER path has moved to under
LOCALAPPDATA, e.g. "C:\Users\[User
Name]\AppData\Local\R\win-library\[R version]".  See
https://cran.r-project.org/bin/windows/base/rw-FAQ.html

FWIW, one workaround for too long paths on MS Windows is to map a long
path to a drive letter, e.g.

subst Y: "C:\VeryLongPathToo\Users\JohnDoe\AppData\Local\R"

and then work with Y: instead of C:. We had to use that in some
projects with nested data folder structures. This approach is tedious,
and might require special permissions (not sure).

/Henrik

>
> With a 150 byte limit, that leaves 70 bytes for the user name, R version and 
> package name. That seems more than sufficient. If people are downloading the 
> source files, that also leaves plenty of space regardless where they choose 
> to extract the files.
>
> 
> From: Dirk Eddelbuettel 
> Sent: Wednesday, December 13, 2023 9:13 AM
> To: Tomas Kalibera
> Cc: Dirk Eddelbuettel; McGrath, Justin M; Ben Bolker; Martin Maechler; 
> r-package-devel@r-project.org
> Subject: Re: [R-pkg-devel] Wrong mailing list: Could the 100 byte path length 
> limit be lifted?
>
>
> On 13 December 2023 at 16:02, Tomas Kalibera wrote:
> |
> | On 12/13/23 15:59, Dirk Eddelbuettel wrote:
> | > On 13 December 2023 at 15:32, Tomas Kalibera wrote:
> | > | Please don't forget about what has been correctly mentioned on this
> | > | thread already: there is essentially a 260 character limit on Windows
> | > | (see
> | > | 
> | > | https://blog.r-project.org/2023/03/07/path-length-limit-on-windows/index.html
> | > | for more). Even if the relative path length limit for a CRAN package was
> | > | no longer regarded important for tar compatibility, it would still make
> | > | sense for compatibility with Windows. It may still be a good service to
> | > | your users if you keep renaming the files to fit into that limit.
> | >
> | > So can lift the limit from 100 char to 260 char ?
> |
> | The 260 char limit is for the full path. A package would be extracted in
> | some directory, possibly also with a rather long name.
>
> Call a cutoff number.
>
> Any move from '100' to '100 + N' for any nonzero N is a win. Pick one, and
> then commit the change.  N = 50 would be a great start as arbitrary as it is.
>
> Dirk
>
> --
> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Canonical way to Rprintf R_xlen_t

2023-11-29 Thread Henrik Bengtsson
On Tue, Nov 28, 2023 at 1:21 PM Tomas Kalibera  wrote:
>
>
> On 11/28/23 21:50, Henrik Bengtsson wrote:
> > Daniel, I get those compiler warnings for '%td' on MS Windows. It works
> > fine on Linux.
>
> Please let me clarify. %td works in R on Windows in R 4.3 and R-devel,
> when using the recommended toolchain, which is Rtools43. It also worked
> with R 4.2 and Rtools42. It works since R has switched to UCRT on
> Windows. I assume you are not using a recommended toolchain and this is
> why you are getting the warning - please let me know if this is not the
> case and I will try to help.

Thank you.

I was getting those compiler warnings on %td when using the widely
used https://github.com/r-lib/actions tool chain.  It was using
gcc.exe (GCC) 12.2.0 two-three days ago.  I reran it yesterday, and it
seems to have been fixed now.  It is now reporting on gcc.exe (GCC)
12.3.0.  Not sure if that was the fix, or there was something else.

> There is a bug in GCC, still present in gcc 12 and gcc 10, due to which
> gcc displays warnings about the format even when it is supported. The
> details are complicated, but in short, it accidentally applies both
> Microsoft format and C99/GNU format checks to printf functions with UCRT
> - so you get a warning whenever the two formats disagree, which includes
> printing a 64 bit integer.  Also for %td which is not supported by
> Microsoft format. Or say %zu (size_t) or %Lf (long double). I've been
> patching GCC in Rtools42 and Rtools43 to avoid this problem, so you
> don't get the warning there. My patch has been picked up also by Msys2,
> I didn't check whether it is still there or not. Finally a new
> implementation of the patch was accepted to GCC trunk, so eventually
> this will no longer be needed. But regardless which version of GCC
> Rtools44 will use, I will make sure it will accept C99 printf formats
> without warnings.

Interesting. Thanks for this work and for pushing this upstream.

> An unpatched GCC 10 or 12 with UCRT will print a warning for %td but will 
> support it.

It sounds like '%td' is supported, and it's just that there's a false
warning.  Do you happen to know whether we can assume '%td' is
compatible with much older versions of GCC too?  My question is
basically: can I safely use '%td' with older versions of GCC, e.g. for
older versions of R, and assume it'll compile on MS Windows?  In my
case, we're trying to keep 'matrixStats' backward compatible quite far
back, and I can imagine there are other packages doing that too.

Thanks,

Henrik

>
> Best
> Tomas
>
> > FYI, https://builder.r-hub.io/ is a great, free service for testing on
> > various platforms in the cloud.  Also, if you host your package code
> > on GitHub, it's a small step to configure GitHub Actions to check your
> > packages across platforms on their servers.  It's free and fairly
> > straightforward.  There should be plenty of tutorials and examples
> > online for how to do that with R packages.  So, no need to mock around
> > with Linux containers etc.
> >
> > /Henrik
> >
> > On Tue, Nov 28, 2023 at 12:30 PM Daniel Kelley  wrote:
> >> To HB: I also maintain a package that has this problem.  I do not have 
> >> access to a linux machine (or a machine with the C++ version in question) 
> >> so I spent quite a while trying to get docker set up. That was a slow 
> >> process because I had to install R, a bunch of packages, some other 
> >> software, and so forth.  Anyway, the docker container I had used didn't 
> >> seem to have a compiler that gave these warnings.  But, by then, I saw 
> >> that the machine used by
> >>
> >> devtools::check_win_devel()
> >>
> >> was giving those warnings :-)
> >>
> >> So, now there is a way to debug these things.
> >>
> >> PS. I also tried using rhub, but it takes a long time and often results in 
> >> a PREPERROR.
> >>
> >> On Nov 28, 2023, at 3:58 PM, Henrik Bengtsson  
> >> wrote:
> >>
> >> "%td" is not supported on all platforms/compilers.  This is what I got
> >> when I added it to 'matrixStats';
> >>
> >> * using log directory 
> >> 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck'
> >> * using R Under development (unstable) (2023-11-26 r85638 ucrt)
> >> * using platform: x86_64-w64-mingw32
> >> * R was compiled by
> >> gcc.exe (GCC) 12.3.0
> >> GNU Fortran (GCC) 12.3.0
> >> * running under: Windows Server 2022 x64 (build 20348)
> >> * using session charset: UTF-8
> >> * using options '--no-manual 

Re: [R-pkg-devel] Canonical way to Rprintf R_xlen_t

2023-11-28 Thread Henrik Bengtsson
Daniel, I get those compiler warnings for '%td' on MS Windows. It works
fine on Linux.

FYI, https://builder.r-hub.io/ is a great, free service for testing on
various platforms in the cloud.  Also, if you host your package code
on GitHub, it's a small step to configure GitHub Actions to check your
packages across platforms on their servers.  It's free and fairly
straightforward.  There should be plenty of tutorials and examples
online for how to do that with R packages.  So, no need to muck around
with Linux containers etc.

/Henrik

On Tue, Nov 28, 2023 at 12:30 PM Daniel Kelley  wrote:
>
> To HB: I also maintain a package that has this problem.  I do not have access 
> to a linux machine (or a machine with the C++ version in question) so I spent 
> quite a while trying to get docker set up. That was a slow process because I 
> had to install R, a bunch of packages, some other software, and so forth.  
> Anyway, the docker container I had used didn't seem to have a compiler that 
> gave these warnings.  But, by then, I saw that the machine used by
>
> devtools::check_win_devel()
>
> was giving those warnings :-)
>
> So, now there is a way to debug these things.
>
> PS. I also tried using rhub, but it takes a long time and often results in a 
> PREPERROR.
>
> On Nov 28, 2023, at 3:58 PM, Henrik Bengtsson  
> wrote:
>
>
> "%td" is not supported on all platforms/compilers.  This is what I got
> when I added it to 'matrixStats';
>
> * using log directory 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck'
> * using R Under development (unstable) (2023-11-26 r85638 ucrt)
> * using platform: x86_64-w64-mingw32
> * R was compiled by
> gcc.exe (GCC) 12.3.0
> GNU Fortran (GCC) 12.3.0
> * running under: Windows Server 2022 x64 (build 20348)
> * using session charset: UTF-8
> * using options '--no-manual --as-cran'
> * checking for file 'matrixStats/DESCRIPTION' ... OK
> * this is package 'matrixStats' version '1.1.0-9003'
> * checking package namespace information ... OK
> * checking package dependencies ... OK
> * checking if this is a source package ... OK
> * checking if there is a namespace ... OK
> * checking for executable files ... OK
> * checking for hidden files and directories ... OK
> * checking for portable file names ... OK
> * checking serialization versions ... OK
> * checking whether package 'matrixStats' can be installed ... [22s] WARNING
> Found the following significant warnings:
> binCounts.c:25:81: warning: unknown conversion type character 't' in
> format [-Wformat=]
> binCounts.c:25:11: warning: too many arguments for format 
> [-Wformat-extra-args]
> binMeans.c:26:60: warning: unknown conversion type character 't' in
> format [-Wformat=]
> binMeans.c:26:67: warning: unknown conversion type character 't' in
> format [-Wformat=]
> ...
> See 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck/00install.out'
> for details.
> * used C compiler: 'gcc.exe (GCC) 12.2.0'
>
> It worked fine on Linux. Because of this, I resorted to the coercion
> strategy, i.e. "%lld" and (long long int)value.  FWIW, on MS Windows,
> I see 'ptrdiff_t' being 'long long int', whereas on Linux I see 'long
> int'.
>
> /Henrik
>
> On Tue, Nov 28, 2023 at 11:51 AM Ivan Krylov  wrote:
>
>
> On Wed, 29 Nov 2023 06:11:23 +1100
> Hugh Parsonage  wrote:
>
> Rprintf("%lld", (long long) xlength(x));
>
>
> This is fine. long longs are guaranteed to be at least 64 bits in size
> and are signed, just like lengths in R.
>
> Rprintf("%td", xlength(x));
>
>
> Maybe if you cast it to ptrdiff_t first. Otherwise I would expect this
> to fail on an (increasingly rare) 32-bit system where R_xlen_t is int
> (which is an implementation detail).
>
> In my opinion, ptrdiff_t is just the right type for array lengths if
> they have to be signed (which is useful for Fortran interoperability),
> so Rprintf("%td", (ptrdiff_t)xlength(x)) would be my preferred option
> for now. By definition of ptrdiff_t, you can be sure [*] that there
> won't be any vectors on your system longer than PTRDIFF_MAX.
>
> using the string macro found in Mr Kalibera's commit of r85641:
> R_PRIdXLEN_T
>
>
> I think this will be the best solution once we can afford
> having our packages depend on R >= 4.4.
>
> --
> Best regards,
> Ivan
>
> [*] https://en.cppreference.com/w/c/types/ptrdiff_t posits that there
> may exist long vectors that fit in SIZE_MAX (unsigned) elements but not
> PTRDIFF_MAX (signed) elements. If such a vector exists, subtracting two
> pointers to its insides may result in undefined behaviour. This may be
> 

Re: [R-pkg-devel] Canonical way to Rprintf R_xlen_t

2023-11-28 Thread Henrik Bengtsson
"%td" is not supported on all platforms/compilers.  This is what I got
when I added it to 'matrixStats':

* using log directory 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck'
* using R Under development (unstable) (2023-11-26 r85638 ucrt)
* using platform: x86_64-w64-mingw32
* R was compiled by
gcc.exe (GCC) 12.3.0
GNU Fortran (GCC) 12.3.0
* running under: Windows Server 2022 x64 (build 20348)
* using session charset: UTF-8
* using options '--no-manual --as-cran'
* checking for file 'matrixStats/DESCRIPTION' ... OK
* this is package 'matrixStats' version '1.1.0-9003'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking serialization versions ... OK
* checking whether package 'matrixStats' can be installed ... [22s] WARNING
Found the following significant warnings:
binCounts.c:25:81: warning: unknown conversion type character 't' in
format [-Wformat=]
binCounts.c:25:11: warning: too many arguments for format [-Wformat-extra-args]
binMeans.c:26:60: warning: unknown conversion type character 't' in
format [-Wformat=]
binMeans.c:26:67: warning: unknown conversion type character 't' in
format [-Wformat=]
...
See 'D:/a/matrixStats/matrixStats/check/matrixStats.Rcheck/00install.out'
for details.
* used C compiler: 'gcc.exe (GCC) 12.2.0'

It worked fine on Linux. Because of this, I resorted to the coercion
strategy, i.e. "%lld" and (long long int)value.  FWIW, on MS Windows,
I see 'ptrdiff_t' being 'long long int', whereas on Linux I see 'long
int'.

/Henrik

On Tue, Nov 28, 2023 at 11:51 AM Ivan Krylov  wrote:
>
> On Wed, 29 Nov 2023 06:11:23 +1100
> Hugh Parsonage  wrote:
>
> > Rprintf("%lld", (long long) xlength(x));
>
> This is fine. long longs are guaranteed to be at least 64 bits in size
> and are signed, just like lengths in R.
>
> > Rprintf("%td", xlength(x));
>
> Maybe if you cast it to ptrdiff_t first. Otherwise I would expect this
> to fail on an (increasingly rare) 32-bit system where R_xlen_t is int
> (which is an implementation detail).
>
> In my opinion, ptrdiff_t is just the right type for array lengths if
> they have to be signed (which is useful for Fortran interoperability),
> so Rprintf("%td", (ptrdiff_t)xlength(x)) would be my preferred option
> for now. By definition of ptrdiff_t, you can be sure [*] that there
> won't be any vectors on your system longer than PTRDIFF_MAX.
>
> > using the string macro found in Mr Kalibera's commit of r85641:
> > R_PRIdXLEN_T
>
> I think this will be the best solution once we can afford
> having our packages depend on R >= 4.4.
>
> --
> Best regards,
> Ivan
>
> [*] https://en.cppreference.com/w/c/types/ptrdiff_t posits that there
> may exist long vectors that fit in SIZE_MAX (unsigned) elements but not
> PTRDIFF_MAX (signed) elements. If such a vector exists, subtracting two
> pointers to its insides may result in undefined behaviour. This may be
> already possible in a 32-bit process on Linux running with a 3G
> user-space / 1G kernel-space split. The only way around the problem is
> to use unsigned types for lengths, but that would preclude Fortran
> compatibility.
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] capture error messages from loading shared objects

2023-11-28 Thread Henrik Bengtsson
Careful; tryCatch() on non-error conditions will break out of what's
evaluated, e.g.

res <- tryCatch({
  cat("1\n")
  message("2")
  cat("3\n")
  42
}, message = identity)

will output '1' but not '3', because it returns as soon as the first
message() is called.

To "record" messages (same for warnings), use withCallingHandlers()
instead, e.g.

msgs <- list()
res <- withCallingHandlers({
  cat("1\n")
  message("2")
  cat("3\n")
  42
}, message = function(m) {
  msgs <<- c(msgs, list(m))
  invokeRestart("muffleMessage")
})

This will output '1', muffle '2', output '3', and return 42, and 'msgs' holds

> msgs
[[1]]
 wrote:
>
> If you would like to save the error message instead of suppressing it, you
> can use tryCatch(message=function(e)e, ...).
>
> -BIll
>
> On Tue, Nov 28, 2023 at 3:55 AM Adrian Dusa  wrote:
>
> > Once again, Ivan, many thanks.
> > Yes, that does solve it.
> > Best wishes,
> > Adrian
> >
> > On Tue, Nov 28, 2023 at 11:28 AM Ivan Krylov 
> > wrote:
> >
> > > В Tue, 28 Nov 2023 10:46:45 +0100
> > > Adrian Dusa  пишет:
> > >
> > > > tryCatch(requireNamespace("foobar"), error = function(e) e)
> > >
> > > I think you meant loadNamespace() (which throws errors), not
> > > requireNamespace() (which internally uses tryCatch(loadNamespace(...))
> > > and may or may not print the error depending on the `quietly` argument).
> > >
> > > --
> > > Best regards,
> > > Ivan
> > >
> >
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] dim<-() changed in R-devel; no longer removing "dimnames" when doing dim(x) <- dim(x)

2023-11-01 Thread Henrik Bengtsson
> I assume it did, or you would not have noticed ?

I noticed it because I got a notice from CRAN about 'matrixStats'
starting to fail on R-devel.  It was a non-critical failure, because
it was due to how the package tests compare the results to the
corresponding base-R implementation. Basically, for legacy reasons
there was a `dim(res) <- dim` statement and anything following assumed
the "dimnames" would be gone. I've since rewritten the tests to not
make such assumptions, which resulted in code that is easier to
follow. So, there were good outcomes from this change too :)

The discussion on whether certain R expressions (e.g. dim(x) <-
dim(x)) should be no-op is interesting, but it's much bigger, and I
can see how it becomes a quite complicated discussion.

Thanks,

Henrik


On Mon, Oct 30, 2023 at 3:53 AM Martin Maechler
 wrote:
>
>
> >>>>> Henrik Bengtsson
> >>>>> on Sun, 29 Oct 2023 10:42:19 -0700 writes:
>
> > Hello,
>
> > the fix of PR18612
> > (https://bugs.r-project.org/show_bug.cgi?id=18612) in
> > r85380
> > 
> (https://github.com/wch/r-source/commit/2653cc6203fce4c48874111c75bbccac3ac4e803)
> > caused a change in `dim<-()`.  Specifically, in the past,
> > any `dim<-()` assignment would _always_ remove "dimnames"
> > and "names" attributes per help("dim"):
>
>
> > The replacement method changes the "dim" attribute
> > (provided the new value is compatible) and removes any
> > "dimnames" and "names" attributes.
>
> > In the new version, assigning the same "dim" as before
> > will no longer remove "dimnames".  I'm reporting here to
> > check whether this change was intended, or if it was an
> > unintended side effect of the bug fix.
>
> > For example, in R Under development (unstable) (2023-10-21
> > r85379), we would get:
>
> >> x <- array(1:2, dim=c(1,2), dimnames=list("A",
> >> c("a","b"))) str(dimnames(x))
> > List of 2 $ : chr "A" $ : chr [1:2] "a" "b"
>
> >> dim(x) <- dim(x) ## Removes "dimnames" no matter what
> >> str(dimnames(x))
> >  NULL
>
>
> > whereas in R Under development (unstable) (2023-10-21
> > r85380) and beyond, we now get:
>
> >> x <- array(1:2, dim=c(1,2), dimnames=list("A",
> >> c("a","b"))) str(dimnames(x))
> > List of 2 $ : chr "A" $ : chr [1:2] "a" "b"
>
> >> dim(x) <- dim(x) ## No longer removes "dimnames"
> >> str(dimnames(x))
> > List of 2 $ : chr "A" $ : chr [1:2] "a" "b"
>
> >> dim(x) <- rev(dim(x)) ## Still removes "dimnames"
> >> str(dimnames(x))
> >  NULL
>
> > /Henrik
>
> Thank you, Henrik.
>
> This is "funny" (in an unusual sense):
> indeed, the change was *in*advertent, by me (svn rev 85380).
>
> I had experimentally {i.e., only in my own private version of R-devel!}
> modified the behavior of `dim<-` somewhat
> such it does *not* unnecessarily drop dimnames,
> e.g., in your   `dim(x) <- dim(x)` case above,
> one could really argue that it's a "true loss" if x loses
> dimnames "unnecessarily" ...
>
> OTOH, I knew in the mean time that  `dim<-` has always been
> documented to drop dimnames in all cases,  and even more
> importantly, I got a strong recommendation to *not* go further
> with this idea -- not only for back compatibility reasons, but
> also for internal logical consistency.
>
> Most probably, we will just revert this inadvertent change,
> but before that ... since it has been out in the wild anyway,
> we could quickly consider if it *did* break code.
>
> I assume it did, or you would not have noticed ?
>
> Martin

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[R-pkg-devel] CRAN archival: Does CRAN un-archive some packages automatically?

2023-11-01 Thread Henrik Bengtsson
I'm asking here to spare the CRAN Team a direct message, but also
because the answer is of interest to others:

Consider a package PkgA that was archived on CRAN, because it fails
checks and errors that were not corrected in time.  At the moment when
package PkgA is archived, it will trigger automatic archiving of other
CRAN packages that have a hard dependency on it. Say, packages PkgB and
PkgC were archived automatically, because of their dependency on PkgA.

Question: If PkgA is at a later point revived on CRAN, will CRAN
unarchive PkgB and PkgC automatically? Or, should the maintainers of
PkgB and PkgC resubmit? If they have to resubmit, should they submit
identical versions and tarballs as before, or do they have to bump the
version?

/Henrik

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[Rd] dim<-() changed in R-devel; no longer removing "dimnames" when doing dim(x) <- dim(x)

2023-10-29 Thread Henrik Bengtsson
Hello,

the fix of PR18612 (https://bugs.r-project.org/show_bug.cgi?id=18612)
in r85380 
(https://github.com/wch/r-source/commit/2653cc6203fce4c48874111c75bbccac3ac4e803)
caused a change in `dim<-()`.  Specifically, in the past, any
`dim<-()` assignment would _always_ remove "dimnames" and "names"
attributes per help("dim"):

The replacement method changes the "dim" attribute (provided the
new value is compatible) and removes any "dimnames" and "names"
attributes.

In the new version, assigning the same "dim" as before will no longer
remove "dimnames".  I'm reporting here to check whether this change
was intended, or if it was an unintended side effect of the bug fix.

For example, in R Under development (unstable) (2023-10-21 r85379), we
would get:

> x <- array(1:2, dim=c(1,2), dimnames=list("A", c("a","b")))
> str(dimnames(x))
List of 2
 $ : chr "A"
 $ : chr [1:2] "a" "b"

> dim(x) <- dim(x)  ## Removes "dimnames" no matter what
> str(dimnames(x))
 NULL


whereas in R Under development (unstable) (2023-10-21 r85380) and
beyond, we now get:

> x <- array(1:2, dim=c(1,2), dimnames=list("A", c("a","b")))
> str(dimnames(x))
List of 2
 $ : chr "A"
 $ : chr [1:2] "a" "b"

> dim(x) <- dim(x)  ## No longer removes "dimnames"
> str(dimnames(x))
List of 2
 $ : chr "A"
 $ : chr [1:2] "a" "b"

> dim(x) <- rev(dim(x))  ## Still removes "dimnames"
> str(dimnames(x))
 NULL

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] Suppressing long-running vignette code in CRAN submission

2023-10-17 Thread Henrik Bengtsson
On Tue, Oct 17, 2023 at 12:45 PM John Fox  wrote:
>
> Hello Dirk,
>
> Thank you for the additional information.
>
> As you suggest, what you did to distribute pre-built PDF vignettes is
> quite similar to what R.rsp does, except that the latter also supports
> pre-built HTML vignettes, which is what I'd prefer to distribute. Since

Author of R.rsp here: It supports both static PDFs and static HTMLs, cf.

https://cran.r-project.org/web/packages/R.rsp/vignettes/R_packages-Static_PDF_and_HTML_vignettes.pdf

/Henrik

> I apparently have that working now, we'll probably go with it unless we
> hit snags when the package is sent to CRAN.
>
> While I appreciate the offer, it's probably not necessary for you to
> spend more time on this now.
>
> Thanks again,
>   John
>
> On 2023-10-17 3:19 p.m., Dirk Eddelbuettel wrote:
> >
> >
> > John,
> >
> > On 17 October 2023 at 10:02, John Fox wrote:
> > | Hello Dirk,
> > |
> > | Thank you (and Kevin and John) for addressing my questions.
> > |
> > | No one directly answered my first question, however, which was whether
> > | the approach that I suggested would work. I guess that the implication
> > | is that it won't, but it would be nice to confirm that before I try
> > | something else, specifically using R.rsp.
> >
> > I am a little remote here, both mentally and physically. What I might do 
> > here
> > in the case of your long-running vignette, and have done in about half a
> > dozen packages where I wanted 'certainty' and no surprises, is to render the
> > pdf vignette I want as I want them locally, ship them in the package as an
> > included file (sometimes from a subdirectory) and have a five-or-so line
> > Sweave .Rnw file include it. That works without hassles. Here is the Rnw I
> > use for package anytime
> >
> > -
> > \documentclass{article}
> > \usepackage{pdfpages}
> > %\VignetteIndexEntry{Introduction to anytime}
> > %\VignetteKeywords{anytime, date, datetime, conversion}
> > %\VignettePackage{anytime}
> > %\VignetteEncoding{UTF-8}
> >
> > \begin{document}
> > \includepdf[pages=-, fitpaper=true]{anytime-intro.pdf}
> > \end{document}
> > -
> >
> > That is five lines of LaTeX code slurping in the pdf (per the blog post by
> > Mark). As I understand it R.rsp does something similar at the marginal cost
> > of an added dependency.
> >
> > Now, as mentioned, you can also 'conditionally' conpute in a vignette and
> > choose if and when to use a data cache. I think that we show most of that in
> > the package described in the RJournal piece by Brooke and myself on drat for
> > data repositories. (We may be skipping the compute when the data is not
> > accessible. Loading a precomputed set is similar. I may be doing that in the
> > much older never quite finished gcbd package and its vignette.
> >
> > Hope this helps, maybe more once I am back home.
> >
> > Cheers, Dirk
> >
> > | Best,
> > |   John
> > |
> > | On 2023-10-17 4:02 a.m., Dirk Eddelbuettel wrote:
> > | >
> > | >
> > | > On 16 October 2023 at 10:42, Kevin R Coombes wrote:
> > | > | Produce a PDF file yourself, then use the "as.is" feature of the R.rsp
> > | > | package.
> > | >
> > | > For completeness, that approach also works directly with Sweave. 
> > Described in
> > | > a blog post by Mark van der Loo in 2019, and used in a number of 
> > packages
> > | > including a few of mine.
> > | >
> > | > That said, I also used the approach described by John Harrold and cached
> > | > results myself.
> > | >
> > | > Dirk
> > | >
> > | > --
> > | > dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
> > | >
> > | > __
> > | > R-package-devel@r-project.org mailing list
> > | > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> > |
> >
> > --
> > dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Question regarding listing base and recommended packages programmatically and efficiently

2023-10-14 Thread Henrik Bengtsson
On Sat, Oct 14, 2023 at 5:25 AM Martin Maechler
 wrote:
>
> > Ivan Krylov
> > on Thu, 12 Oct 2023 18:50:30 +0300 writes:
>
> > On Thu, 12 Oct 2023 11:32:24 -0400
> > Mikael Jagan  wrote:
>
> >> > mk <- file.path(R.home("share"), "make", "vars.mk")
> >> > pp <- sub("^.*= +", "", grep("^R_PKGS_RECOMMENDED",
> >> > readLines(mk), value = TRUE))
> >> > sort(strsplit(pp, " ")[[1L]])
> >> [1] "KernSmooth" "MASS"   "Matrix" "boot"   "class"
> >> [6] "cluster""codetools"  "foreign""lattice""mgcv"
> >> [11] "nlme"   "nnet"   "rpart"  "spatial" "survival"
> >>
> >> I grepped around and did not find variables in any base namespace
> >> containing the names of these packages.  It wouldn't be too hard to
> >> define such variables when R is configured/built, but maybe there are
> >> "reasons" to not do that ... ?
>
> > tools:::.get_standard_package_names does that at package installation
> > time, but it's still not public API.
>
> Within R-core, we have somewhat discussed this, and a few
> minutes ago I committed a "public API" version of the above,
> called
>standard_package_names()
>
> to R-devel (svn rev 85329), and hence probably in next year's
> April release of R.

Excellent. Will it be supported on all OSes?  Because there's
currently a source code comment saying the current implementation
might not work on MS Windows:

## we cannot assume that file.path(R.home("share"), "make", "vars.mk")
## is installed, as it is not on Windows
standard_package_names <-
.get_standard_package_names <-
local({
lines <- readLines(file.path(R.home("share"), "make", "vars.mk"))
lines <- grep("^R_PKGS_[[:upper:]]+ *=", lines, value = TRUE)
out <- strsplit(sub("^R_PKGS_[[:upper:]]+ *= *", "", lines), " +")
names(out) <-
tolower(sub("^R_PKGS_([[:upper:]]+) *=.*", "\\1", lines))
eval(substitute(function() {out}, list(out=out)), envir = topenv())
})
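
For what it's worth, once the public API lands, usage might look like the following R sketch. This assumes the function is exported as tools::standard_package_names() in R >= 4.4.0 and returns a named list, mirroring the internal tools:::.get_standard_package_names() shown above; both names and return shape are assumptions until the release is out.

```r
## Sketch, assuming R >= 4.4.0 with the exported function:
pkgs <- tools::standard_package_names()
names(pkgs)        ## e.g. "base" and "recommended"
sort(pkgs$base)    ## the base-R package names
```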

/Henrik

>
>
> > A call to installed.packages() may take a long while because it has to
> > list files in every library (some of which can be large and/or
> > network-mounted) and parse each Meta/package.rds file, but at least
> > list.files() is faster than that.
>
> The above is another issue that we've wanted to improve, as some
> of you are aware,  notably thinking about caching the result
> .. there has been work on this during the R Sprint @ Warwick a
> couple of weeks ago,
>
>  ==> https://github.com/r-devel/r-project-sprint-2023/issues/78
>
> involving smart people and promising proposals (my personal view).
>
>   > If I had to make a choice at this point, I would hard-code the list of
>   > packages, but a better option may surface once we know what Tony needs
>   > the package lists for.
>
>   > --
>   > Best regards,
>   > Ivan
>
>
> With thanks to the discussants here on R-devel,
> and best regards,
> Martin
>
> --
> Martin Maechler
> ETH Zurich  and  R Core team
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Bug report: parLapply with capture.output(type="message") produces an error

2023-10-05 Thread Henrik Bengtsson
This is actually not a bug. If we really want to identify a bug, then
it's actually a bug in your code. We'll get to that at the very end.
Either way, it's an interesting report that reveals a lot of things.

First, here's a slightly simpler version of your example:

$ Rscript --vanilla -e 'library(parallel); cl <- makeCluster(1); x <-
clusterEvalQ(cl, { capture.output(NULL, type = "message") })'
Error in unserialize(node$con) : error reading from connection
Calls:  ... doTryCatch -> recvData -> recvData.SOCKnode ->
unserialize
Execution halted

There are lots of things going on here, but before we get to the
answer, the most important take-home message here is:

 Never ever use capture.output(..., type = "message") in R.

Second comment is:

 No, really, do not do that!

Now, towards what is going on in your example. First, I don't think
help("capture.output") is too "kind" here, when it says:

'Messages sent to stderr() (including those from message, warning and
stop) are captured by type = "message". Note that this can be “unsafe”
and should only be used with care.'

To understand why you shouldn't do this, you have to know that
capture.output() uses sink() internally, and its help page says:

"Sink-ing the messages stream should be done only with great care. For
that stream file must be an already open connection, and there is no
stack of connections."

The "[When] Sink-ing the messages stream ... there is no stack of
connections" is the reason for the problem you're experiencing.
What happens is that, the background workers that you launch with
parallel::makeCluster() will use sink(..., type = "message")
internally and that's active throughout all parallel evaluation.  Now,
when you add another one of, via your capture.output(..., type =
"message"), you are stealing the "message" sink from the parallel
worker.  Our simplified example can be reproduced using only sink():s
as:

$ Rscript --vanilla -e 'library(parallel); cl <- makeCluster(1); x <-
clusterEvalQ(cl, { sink(file(nullfile(), open = "a"), type =
"message"); sink(type = "message") })'
Error in unserialize(node$con) : error reading from connection
Calls:  ... doTryCatch -> recvData -> recvData.SOCKnode ->
unserialize
Execution halted

Back to the "message" sink that parallel sets up. By default, it sinks
to the "null" file.  This is done to avoid output on parallel workers
from cluttering up the terminal.  The default is controlled by
argument 'outfile' of makeCluster(), i.e. our example does:

cl <- makeCluster(1, outfile = "/dev/null")

Now, since we're stealing the "message" sink from the worker when we
call sink(..., type = "message") on the parallel worker, any output on
the workers is no longer sent to the "null" file, but instead straight
out to the terminal. So, after stealing the sink, we're effectively
running as if we had told the parallel workers to not sink output.  We
can manually do this by:

cl <- makeCluster(1, outfile = "")

We're almost there.  If we use the latter, we will see all output from
the parallel worker(s).  Let's try that:

$ Rscript --vanilla -e 'library(parallel); cl <- makeCluster(1,
outfile = ""); x <- clusterEvalQ(cl, { })'
starting worker pid=349252 on localhost:11036 at 17:45:05.125
Error in unserialize(node$con) : error reading from connection
Calls:  ... doTryCatch -> recvData -> recvData.SOCKnode ->
unserialize
Execution halted

You see. There's a "starting worker ..." output that we now see.  But
more importantly, we now also see that "error reading from connection"
message.  So, as you see, that error message is there regardless of us
capturing or sinking the "message" output.  Instead, what it tells us
is that there is an error taking place at the very end, but we
normally don't see it.

This error is because when the main R session shuts down, the parallel
workers are still running and trying to listen to the socket
connection that they use to communicate with the main R session.  But
that is now broken, so each parallel worker will fail when it tries to
communicate.

How to fix it? Make sure to close the 'cl' cluster before exiting the
main R session, i.e.

$ Rscript --vanilla -e 'library(parallel); cl <- makeCluster(1,
outfile = ""); x <- clusterEvalQ(cl, { }); stopCluster(cl)'
starting worker pid=349703 on localhost:11011 at 17:50:20.357

The error is no longer there, because the main R session will tell the
parallel workers to shut down *before* terminating itself. This means
there are no stray parallel workers trying to reach a non-existing
main R session.

In a way, your example revealed that you forgot to call
stopCluster(cl) at the end.
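
One defensive pattern for this (my sketch, not from the original report; the helper name with_cluster() is made up) is to tie cluster shutdown to scope exit with on.exit(), so workers are stopped even if the parallel code errors out:

```r
library(parallel)

## Sketch: run parLapply() with a temporary cluster that is always
## stopped on exit, even when `fun` throws an error.
with_cluster <- function(workers, fun, xs) {
  cl <- makeCluster(workers)
  on.exit(stopCluster(cl), add = TRUE)  # always shut workers down
  parLapply(cl, xs, fun)
}

res <- with_cluster(2, function(x) x^2, 1:4)
str(res)
```

With this shape there is no window where the main R session can exit while workers are still listening on their socket connections.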

But, the real message here is: Do not mess with the "message" output in R!

I'll take the moment to rant about this: I think sink(..., type =
"message") should not be part of the public R API; it's simply
impossible to use safely, because there is no one owner controlling
it. To prevent it being used by mistake, at least it could throw an
error if 

Re: [R-pkg-devel] DESCRIPTION file Imports of base R packages

2023-10-03 Thread Henrik Bengtsson
On Tue, Oct 3, 2023 at 7:46 AM Jan Gorecki  wrote:
>
> Hello,
>
> I noticed some packages define Imports in DESCRIPTION file listing base R
> packages like methods, utils, etc.
>
> My question is that if it is necessary to list those* dependencies in
> DESCRIPTION file, or is it enough to have them listed in NAMESPACE file?

You do *not* have to declare base-R packages as dependencies in the
DESCRIPTION file.  'R CMD check' and 'R CMD check --as-cran' will not
complain about them.

I can't speak for others, but, I choose to declare all(*) my base-R
package dependencies in my DESCRIPTION files for a reason.  This makes
it explicit what packages my package relies on, including which of the
base-R packages.  For instance, it's nice to be able to tell when the
'parallel' or 'tcltk' package is used from the DESCRIPTION file (and
the online CRAN package page, if on CRAN).

(*) The exception is the 'base' package, because that is such a
fundamental package in R and is always loaded.  The other base-R
packages are theoretically optional when it comes to running R, e.g.

$ R_ENABLE_JIT=0 Rscript --vanilla --default-packages=base -e
"loadedNamespaces()"
[1] "base"
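
As an illustration of that convention, a DESCRIPTION file declaring base-R dependencies explicitly might look like this (the package name and version are placeholders):

```
Package: mypkg
Version: 0.1.0
Title: Example Package
Description: Illustration of declaring base-R packages in Imports.
Imports:
    methods,
    parallel,
    utils
```

Note that 'base' itself is deliberately absent, per the exception described above.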

/Henrik

>
> * by "those" I mean packages listed by:
>
> in tools:::.get_standard_package_names()$base
>
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Check package without suggests

2023-07-19 Thread Henrik Bengtsson
Hello,

this is *not* a new behavior on CRAN, at least on (re-)submissions to
CRAN.  The package has to pass R CMD check --as-cran with all OK. If
one of the Suggests:ed package is not installed, but one of your
examples or package tests needed it, that would be detected by the
check system.

The win-builder service will detect this
(https://win-builder.r-project.org/).
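
To try this locally, a rough shell sketch follows. It uses the _R_CHECK_FORCE_SUGGESTS_ variable mentioned elsewhere in this thread; adding _R_CHECK_DEPENDS_ONLY_ and the tarball name are my own assumptions, so treat this as a starting point rather than an exact recipe.

```shell
# Sketch only: emulate a "Suggests not installed" check locally.
export _R_CHECK_FORCE_SUGGESTS_=false   # do not require Suggests to be present
export _R_CHECK_DEPENDS_ONLY_=true      # check with only hard dependencies
echo "_R_CHECK_FORCE_SUGGESTS_=${_R_CHECK_FORCE_SUGGESTS_}"
# R CMD check --as-cran mypkg_0.1.0.tar.gz
```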

See 

for an example how to do this on GitHub Actions.

If you're on macOS, and have installed R the default way, it takes
more work to test on that platform. It works out of the box on Linux
and MS Windows.  See the '[R-SIG-Mac] CRAN installer for macOS -
directory permissions' thread started in April 2022
,
continued in May 2022
, and
June 2022 .
It was then renamed to 'System-wide site library [Was: CRAN installer
for macOS - directory permissions]' in June 2022
.

/Henrik

On Tue, Jul 18, 2023 at 8:07 PM William Gearty  wrote:
>
> Hi John,
>
> You need to set the R CMD check environment variable
> _R_CHECK_FORCE_SUGGESTS_ to FALSE/0. You should be able to do this
> with the env_vars
> argument in rhub::check(). You can also achieve this with github actions by
> customizing your yaml file (example here:
> https://github.com/willgearty/deeptime/blob/master/.github/workflows/R-CMD-check.yaml#L57
> ).
>
> Best,
> Will
>
> --
> *William Gearty*
> *Lerner-Gray Postdoctoral Research Fellow*
> Division of Paleontology
> American Museum of Natural History
> williamgearty.com
>
>
>
> On Tue, Jul 18, 2023 at 10:38 AM John Harrold 
> wrote:
>
> > Howdy Folks,
> >
> > I recently had a package start failing because I wasn't checking properly in
> > my tests to make sure my suggested packages were installed before running
> > tests. I think this is something new running on CRAN where packages are
> > tested with only the packages specified as Imports in the DESCRIPTION file
> > are installed. It took me a bit of back and forth to get all of these
> > issues worked out.  I was wondering if anyone has a good way to run R CMD
> > check with only the imports installed?  A github action, or a
> > specific platform on rhub?
> >
> > Thank you,
> >
> > John
> > :wq
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-package-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Bioc-devel] Urgent minor deployment of a release?

2023-06-19 Thread Henrik Bengtsson
On Mon, Jun 19, 2023 at 5:46 PM Vincent Carey
 wrote:
>
> Hi Adam, thanks for your note.
>
> Changes to release branch sources must be limited to bug fixes or doc
> improvement.  Any new features
> must be introduced only in the devel branch.

I know about this, but I wanted to find what the documentation exactly
says about it. I'm sure it's documented somewhere, but I just spent 10-15
minutes on the website looking and couldn't find it.  What I was looking
for specifically was "new features".  For instance, would one be
allowed to introduce a 100% backward-compatible new feature to the
Bioc release branch? For instance, assume you currently have

h <- function(x, ...) {
  sum(x)
}

in the release branch, would it be okay to add:

h <- function(x, na.rm = FALSE, ...) {
  sum(x, na.rm = na.rm)
}

?  This is how base-R itself does it in their patch releases, e.g. R
4.3.0 -> R 4.3.1.

/Henrik

> Any features to be removed
> must be indicated as
> deprecated for one release and then labeled as defunct.  See
>
> http://contributions.bioconductor.org/deprecation.html
>
> for guidance on feature removal.
>
>
> On Mon, Jun 19, 2023 at 10:11 AM Park, Adam Keebum 
> wrote:
>
> > Hi all,
> >
> > I wonder if there is any room for deploying a modification to a released
> > library(retrofit, 3.17), which was released last month.
> >
> > We are in the middle of a paper review, so the release schedule (twice
> > each year) does not perfectly fit our need.
> >
> > Or do you think we should have used "devel" for such purposes?
> >
> > Sincerely,
> > Adam.
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
>
> --
> The information in this e-mail is intended only for th...{{dropped:6}}

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] save.image Non-responsive to Interrupt

2023-05-04 Thread Henrik Bengtsson
On Thu, May 4, 2023 at 3:02 PM Serguei Sokol via R-devel
 wrote:
>
> Le 03/05/2023 à 01:25, Henrik Bengtsson a écrit :
> > Along the lines of calling R_CheckUserInterrupt() only once in a while:
> >
> >> OTOH, in the past we have had to *disable*  R_CheckUserInterrupt()
> >> in parts of R's code because it was too expensive,
> >> {see current src/main/{seq.c,unique.c}  for a series of commented-out
> >> R_CheckUserInterrupt() for such speed-loss reasons}
> > First, here are links to these two files viewable online:
> >
> >   * https://github.com/wch/r-source/blob/trunk/src/main/seq.c
> >
> >   * https://github.com/wch/r-source/blob/trunk/src/main/unique.c
> >
> > When not commented out, R_CheckUserInterrupt() would have been called
> > once every 1,000,000 iterations, per:
> >
> >/* interval at which to check interrupts */
> >#define NINTERRUPT 1000000
> >
> > and
> >
> >if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt()
> >
> > in each iteration. That '(i+1) % NINTERRUPT == 0' expression can be
> > quite expensive too.
> I vaguely remember a hack that was mentioned on this list as close to
> 0-cost. It looked something like:
>
> iint = NINTERRUPT;
> for (...) {
> if (--iint == 0) {
>R_CheckUserInterrupt();
>iint = NINTERRUPT;
>}
> }
>
> Best,
> Serguei.

> > Alternatively, one can increment a counter and reset to zero after
> > calling R_CheckUserInterrupt(); I think that's equally performant.

Yes, that's the one, e.g. Tomas K migrated some "modulo" ones in
R-devel to this one yesterday
(https://github.com/wch/r-source/commit/1ca6c6c6246629c6a98a526a2906595e5cfcd45e).

/Henrik

>
> >However, if we change the code to use NINTERRUPT
> > = 2^k where k = {1, 2, ...}, say
> >
> >#define NINTERRUPT 1048576
> >
> > the compiler would optimize the condition to use "the modulo of powers
> > of 2 can alternatively be expressed as a bitwise AND operation"
> > (Thomas Lumley, 2015-06-15).  The speedup is quite impressive, cf.
> > <https://www.jottr.org/2015/06/05/checkuserinterrupt/>.
> > Alternatively, one can increment a counter and reset to zero after
> > calling R_CheckUserInterrupt(); I think that's equally performant.
> >
> > Regarding making serialize() / unserialize() interruptible: I think it
> > can be a good idea since we work with larger objects these days.
> > However, if we implement this, we probably have to consider what
> > happens when an interrupt happens. For example, transfers between a
> > client and a server are no longer atomic at this level, which means we
> > might end up in a corrupt state. This may, for instance, happen to
> > database transactions, and PSOCK parallel worker communication.  A
> > quick fix would be to use base::suspendInterrupts(), but better would
> > of course be to handle interrupts gracefully.
> >
> > My $.02 + $0.02
> >
> > /Henrik
> >
> > On Tue, May 2, 2023 at 3:56 PM Jeroen Ooms  wrote:
> >> On Tue, May 2, 2023 at 3:29 PM Martin Maechler
> >>  wrote:
> >>>>>>>> Ivan Krylov
> >>>>>>>>  on Tue, 2 May 2023 14:59:36 +0300 writes:
> >>>  > В Sat, 29 Apr 2023 00:00:02 +
> >>>  > Dario Strbenac via R-devel  пишет:
> >>>
> >>>  >> Could save.image() be redesigned so that it promptly responds to
> >>>  >> Ctrl+C? It prevents the command line from being used for a number 
> >>> of
> >>>  >> hours if the contents of the workspace are large.
> >>>
> >>>  > This is ultimately caused by serialize() being non-interruptible. A
> >>>  > relatively simple way to hang an R session for a long-ish time 
> >>> would
> >>>  > therefore be:
> >>>
> >>>  > f <- xzfile(nullfile(), 'a+b')
> >>>  > x <- rep(0, 1e9) # approx. 8 gigabytes, adjust for your RAM size
> >>>  > serialize(x, f)
> >>>  > close(f)
> >>>
> >>>  > This means that calling R_CheckUserInterrupt() between saving
> >>>  > individual objects is not enough: R also needs to check for 
> >>> interrupts
> >>>  > while saving sufficiently long vectors.
> >>>
> >>>  > Since the serialize() infrastructure is carefully written to avoid
> >>>  > resource leaks on allocation failures, 

Re: [Rd] save.image Non-responsive to Interrupt

2023-05-02 Thread Henrik Bengtsson
Along the lines of calling R_CheckUserInterrupt() only once in a while:

> OTOH, in the past we have had to *disable*  R_CheckUserInterrupt()
> in parts of R's code because it was too expensive,
> {see current src/main/{seq.c,unique.c}  for a series of commented-out
> R_CheckUserInterrupt() for such speed-loss reasons}

First, here are links to these two files viewable online:

 * https://github.com/wch/r-source/blob/trunk/src/main/seq.c

 * https://github.com/wch/r-source/blob/trunk/src/main/unique.c

When not commented out, R_CheckUserInterrupt() would have been called
once every 1,000,000 iterations, per:

  /* interval at which to check interrupts */
  #define NINTERRUPT 1000000

and

  if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt()

in each iteration. That '(i+1) % NINTERRUPT == 0' expression can be
quite expensive too.  However, if we change the code to use NINTERRUPT
= 2^k where k = {1, 2, ...}, say

  #define NINTERRUPT 1048576

the compiler would optimize the condition to use "the modulo of powers
of 2 can alternatively be expressed as a bitwise AND operation"
(Thomas Lumley, 2015-06-15).  The speedup is quite impressive, cf.
.
Alternatively, one can increment a counter and reset to zero after
calling R_CheckUserInterrupt(); I think that's equally performant.

Regarding making serialize() / unserialize() interruptible: I think it
can be a good idea since we work with larger objects these days.
However, if we implement this, we probably have to consider what
happens when an interrupt happens. For example, transfers between a
client and a server are no longer atomic at this level, which means we
might end up in a corrupt state. This may, for instance, happen to
database transactions, and PSOCK parallel worker communication.  A
quick fix would be to use base::suspendInterrupts(), but better would
of course be to handle interrupts gracefully.

My $.02 + $0.02

/Henrik

On Tue, May 2, 2023 at 3:56 PM Jeroen Ooms  wrote:
>
> On Tue, May 2, 2023 at 3:29 PM Martin Maechler
>  wrote:
> >
> > > Ivan Krylov
> > > on Tue, 2 May 2023 14:59:36 +0300 writes:
> >
> > > В Sat, 29 Apr 2023 00:00:02 +
> > > Dario Strbenac via R-devel  пишет:
> >
> > >> Could save.image() be redesigned so that it promptly responds to
> > >> Ctrl+C? It prevents the command line from being used for a number of
> > >> hours if the contents of the workspace are large.
> >
> > > This is ultimately caused by serialize() being non-interruptible. A
> > > relatively simple way to hang an R session for a long-ish time would
> > > therefore be:
> >
> > > f <- xzfile(nullfile(), 'a+b')
> > > x <- rep(0, 1e9) # approx. 8 gigabytes, adjust for your RAM size
> > > serialize(x, f)
> > > close(f)
> >
> > > This means that calling R_CheckUserInterrupt() between saving
> > > individual objects is not enough: R also needs to check for interrupts
> > > while saving sufficiently long vectors.
> >
> > > Since the serialize() infrastructure is carefully written to avoid
> > > resource leaks on allocation failures, it looks relatively safe to
> > > liberally sprinkle R_CheckUserInterrupt() where it makes sense to do
> > > so, i.e. once per WriteItem() (which calls itself recursively and
> > > non-recursively) and once per every downstream for loop iteration.
> > > Valgrind doesn't show any new leaks if I apply the patch, interrupt
> > > serialize() and then exit. R also passes make check after the applied
> > > patch.
> >
> > > Do these changes make sense, or am I overlooking some other problem?
> >
> > Thank you, Ivan!
> >
> > They do make sense... but :
> >
> > OTOH, in the past we have had to *disable*  R_CheckUserInterrupt()
> > in parts of R's code because it was too expensive,
> > {see current src/main/{seq.c,unique.c}  for a series of commented-out
> >  R_CheckUserInterrupt() for such speed-loss reasons}
> >
> > so  adding these may need a lot of care when we simultaneously
> > want to remain  efficient for "morally valid" use of serialization...
> > where we really don't want to pay too much of a premium.
>
> Alternatively, one could consider making R throttle or debounce calls
> to R_CheckUserInterrupt such that a repeated calls within x time are
> ignored, cf: https://www.freecodecamp.org/news/javascript-debounce-example/
>
> The reasoning being that it may be difficult for (contributed) code to
> determine when/where it is appropriate to check for interrupts, given
> varying code paths and cpu speed. Maybe it makes more sense to call
> R_CheckUserInterrupt frequently wherever it is safe to do so, and let
> R decide if reasonable time has elapsed to actually run the (possibly
> expensive) ui check again.
>
> Basic example: https://github.com/r-devel/r-svn/pull/125/files
>
>
>
>
> >
> > {{ saving the whole user workspace is not "valid" in that sense
> >in my view.  I tell all my 

Re: [R-pkg-devel] CRAN check fails on NEWS

2023-04-26 Thread Henrik Bengtsson
To reproduce this, you can use tools:::.news_reader_default(), e.g.

> utils::download.file("https://raw.githubusercontent.com/gertvv/rsmaa/master/smaa/NEWS",
+                      "NEWS", quiet = TRUE)
> news <- tools:::.news_reader_default("NEWS")
Warning messages:
1: Cannot process chunk/lines:
  smaa 0.3-0
2: Cannot process chunk/lines:
  smaa 0.2-5
...
9: Cannot process chunk/lines:
  smaa 0.1

You can check for attribute 'bad' to detect if there are parsing errors, e.g.

> bad <- which(attr(news, "bad"))
> news[bad, "Version"]
[1] "0.3-0" "0.2-5" "0.2-4" "0.2-3" "0.2-2" "0.2-1" "0.2"   "0.1-1" "0.1"

I have a check_news() [0] that does this job for me (it checks both
NEWS and NEWS.md), e.g.

> check_news("NEWS")
Error: Detected 9 malformed entries in 'NEWS': 0.3-0, 0.2-5, 0.2-4,
0.2-3, 0.2-2, 0.2-1, 0.2, 0.1-1, 0.1

Hope this helps,

/Henrik

[0] 
https://github.com/HenrikBengtsson/dotfiles-for-R/blob/master/Rprofile.d/interactive%3DTRUE/check_news.R

On Wed, Apr 26, 2023 at 12:32 PM Ivan Krylov  wrote:
>
> On Wed, 26 Apr 2023 20:38:36 +0200
> Gert van Valkenhoef  wrote:
>
> > I'm hoping you can help me understand this new CRAN check failure
> > that occurs on Debian but not on Windows:
>
> Unfortunately, not all checks are enabled on all check machines.
>
> > * checking package subdirectories ... NOTE
> > Problems with news in ‘NEWS’:
> >Cannot process chunk/lines:
> >  smaa 0.3-0
> >... etc ...
>
> Your NEWS file at
>  looks like
> valid Markdown to me. In fact, I can parse it using R's NEWS.md parser
> without any warnings, so in order to solve the problem, you just need
> to rename it to NEWS.md.
>
> R's requirements for "plain text" NEWS files are documented in
> help(news) (together with requirements for NEWS.md and NEWS.Rd).
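For reference, a plain-text NEWS file that the default reader accepts looks roughly like the following. This is a hedged sketch based on the format described in help(news); the version numbers and items are illustrative.

```
Changes in version 0.3-0

  o  Added support for weighted SMAA models.
  o  Fixed a bug in the C-level sampler.

Changes in version 0.2-5

  o  Minor documentation fixes.
```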
>
> > The same package passes the same check on  Debian r-devel on r-hub
> > (see the full build log:
> > HTML
> > ,
>
> Are they running R CMD check without --as-cran? This could be the
> reason why they didn't pick up this problem sooner. Running
> rhub::check_for_cran() will run a more comprehensive set of checks on
> your package.
>
> --
> Best regards,
> Ivan
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Bioc-devel] missing packages on nebbiolo1

2023-04-11 Thread Henrik Bengtsson
Hello,

those errors are because they forgot to declare those packages as dependencies:

* eiR needs to add 'RSQLite' to Suggests:
* ChemmineOB needs to add 'RUnit' to Suggests:

/Henrik

On Tue, Apr 11, 2023 at 1:26 PM Kevin Horan  wrote:
>
>
> The packages eiR and ChemmineOB are failing to build because nebbiolo1
> is missing the packages RUnit and RSQLite. Could those be installed
> please? Thank you.
>
> https://master.bioconductor.org/checkResults/3.17/bioc-LATEST/eiR/nebbiolo1-checksrc.html
>
> https://master.bioconductor.org/checkResults/3.17/bioc-LATEST/ChemmineOB/nebbiolo1-checksrc.html
>
> Kevin
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Will package version 3.99.0 bump to 4.0.0?

2023-04-09 Thread Henrik Bengtsson
Sunday drive-by comment:  I guess that means that our affxparser
package, which currently has version 1.71.1 in Bioc devel, will reach
1.99.0 in 13-14 years from now because of the automatic version bumps
that are done twice a year. So, if we don't touch anything, it'll roll
over to 2.0 around 2037, or so. Just an early heads-up to our
end-users :p

/Henrik

On Sun, Apr 9, 2023 at 3:46 PM Kern, Lori  wrote:
>
> Yes. If you bump to 3.99.0 now in devel, when the release happens it should 
> be released at 4.0.0
>
>
>
> Get Outlook for iOS
> 
> From: Bioc-devel  on behalf of Gordon Smyth 
> via Bioc-devel 
> Sent: Saturday, April 8, 2023 3:47:00 AM
> To: bioc-devel@r-project.org 
> Cc: Andy Chen 
> Subject: [Bioc-devel] Will package version 3.99.0 bump to 4.0.0?
>
> I want edgeR to bump to a major new version 4.0.0 for the Bioconductor 3.17 
> release this month.
>
> If I set the edgeR version to 3.99.0 on the developmental repository now, 
> will that bump to version 4.0.0 for the Bioconductor 3.17 release?
>
> Thanks
> Gordon
>
> --
> Professor Gordon K Smyth
> Joint Head, Bioinformatics Division
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>
> This email message may contain legally privileged and/or confidential 
> information.  If you are not the intended recipient(s), or the employee or 
> agent responsible for the delivery of this message to the intended 
> recipient(s), you are hereby notified that any disclosure, copying, 
> distribution, or use of this email message is prohibited.  If you have 
> received this message in error, please notify the sender immediately by 
> e-mail and delete this email message from your computer. Thank you.
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Error building package

2023-04-06 Thread Henrik Bengtsson
Hello,

add 'BiocStyle' to Suggests: in your DESCRIPTION file to fix this.

For details, see


/Henrik

On Thu, Apr 6, 2023 at 8:12 AM ELENI ADAM  wrote:
>
> Dear all,
>
> I am the creator of the hummingbird package (
> https://bioconductor.org/packages/devel/bioc/html/hummingbird.html ) and
> suddenly today I noticed that the check is failing in nebbiolo1 for the
> devel version, even though I did not make any change to the package, it was
> building fine the previous days. I am not sure why this has happened?
>
> The error message occurring states:
> Error(s) in re-building vignettes:
>   ...
> --- re-building ‘hummingbird.Rmd’ using rmarkdown
> Error: processing vignette 'hummingbird.Rmd' failed with diagnostics:
> there is no package called ‘BiocStyle’
> --- failed re-building ‘hummingbird.Rmd’
>
> And this is only happening in the devel version of the package, which is
> identical to the release. And only happening in nebbiolo1.
>
> I am also worried because this happened now and the deadline for the new
> release is too close.
>
> Could you please help me?
>
> Thank you,
> Eleni Adam
> *PhD Candidate in CS*
> *Old Dominion University*
> *Norfolk, VA, USA*
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] Missing dependency errors on nebbiolo1 (Bioc 3.17)

2023-04-04 Thread Henrik Bengtsson
On Tue, Apr 4, 2023 at 2:44 AM Oleksii Nikolaienko
 wrote:
>
> Hi,
> maybe because BiocStyle is not listed as a dependency or suggested package
> in DESCRIPTION?
> More info here: https://github.com/Bioconductor/BBS/issues/248

Yes, this is why. The package needs 'BiocStyle' to build the vignette,
and therefore it needs to be declared. Some other packages forgot to
declare 'RUnit' needed for the unit tests. We'll also see packages
that rely on non-declared annotation packages. These new checks were
introduced for the Bioc devel(*) checks, and they align with how this
has already worked on CRAN for a long time.

(*) I'm not 100% sure, but I believe the reason you only see it on the
Linux server is that that's the only one that runs 'R CMD check' on
vignettes. Checking vignettes takes time, so the decision was to do
that on only one platform. Since 'BiocStyle' is only involved in
vignettes, that particular package gets flagged only
there.  Packages that forgot to add 'RUnit' to Suggests: will get an
error on all platforms.

Adding 'BiocStyle' to Suggests: in the package DESCRIPTION file solves
this problem.

/Henrik


>
> Best,
> Oleksii
>
> On Tue, 4 Apr 2023 at 11:22, Rainer Johannes 
> wrote:
>
> > Dear all,
> >
> > I'm continuing to see some strange dependency problems for packages on the
> > linux build system for Bioconductor 3.17. First time I've seen them where
> > on the build report for snapshot 2023-03-31, and they are still there for
> > snapshot date 2023-04-02 (build report from 2023-04-03). What I've seen so
> > far is that most are related to `BiocStyle` missing.
> >
> > Some examples are:
> > - ROC (missing BiocStyle)
> > - SCATE (missing BiocStyle)
> > - SC3 (missing BiocStyle)
> > - scTHI (missing BiocStyle)
> >
> > Since they are only present on the linux build system - maybe that's
> > something related to the setup on that particular server? Or the R version
> > used there does not properly get all dependencies of packages?
> >
> > cheers, jo
> >
> > ---
> > Johannes Rainer, PhD
> >
> > Eurac Research
> > Institute for Biomedicine
> > Via A.-Volta 21, I-39100 Bolzano, Italy
> >
> > email: johannes.rai...@eurac.edu
> > github: jorainer
> > mastodon: jorai...@fosstodon.org
> >
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] [External] subfolders in the R folder

2023-03-28 Thread Henrik Bengtsson
A quick drive-by comment: What if 'R CMD build' had an option
to flatten R/ subfolders when building the tarball, e.g.

R/unix/a.R
R/windows/a.R
R/a.R

becomes:

R/00__unix__a.R
R/00__windows__a.R
R/a.R

?  Maybe that would be sufficient for most use cases.  The only thing
I can imagine is that source file references (e.g. in check NOTEs)
would point to the latter and not the former.

Of course, one could write a 'build2' shell script locally that wraps
all this internally, so that one can call 'R CMD build2 mypkg', which
then creates a flattened copy of the package folder, and runs 'R CMD
build' on that. Prototyping that could be a good start to see what
such a solution will bring and what it breaks.

/Henrik

On Tue, Mar 28, 2023 at 6:24 PM Barry Rowlingson
 wrote:
>
> The "good reason" is all the tooling in R doesn't work with subfolders and
> would have to be rewritten. All the package check and build stuff. And
> that's assuming you don't want to change the basic flat package structure -
> for example to allow something like `library(foo)` to attach a package and
> `library(foo.bar)` to attach some subset of package `foo`. That would
> require more changes of core R package and namespace code.
>
> As a workaround, you could implement a hierarchical structure in your file
> *names*. That's what `ggplot2` does with its (...downloads tarball...) 192
> files in its R folder. Well mostly, there's a load of files called
> annotation- and geom- and plot- and position- and stat- etc etc. No reason
> why you can't have multiple "levels" separated with "-" as you would have
> multiple folder levels separated with "/". You can then do `ls geom-*` to
> see the `geom` "folder" and so on (on a unix shell).
>
> And then when R Core receive a patch that implements subfolders, a quick
> shell script will be able to create the hierarchy for you and drop all the
> files in the right place.
>
> One reason for the flat folder structure may be that R's packages
> themselves have no structure to the functions - compare with Python where
> modules can have subfolders and functions in subfolders can be access with
> module.subfolder.subsub.foo(x), and module subfolders can be imported etc.
> The whole module ecosystem was designed with structure in mind.
>
> I don't think there's any restriction on subfolders in the "inst" folder of
> a package so if you have scripts you can arrange them there.
>
> Given that most of my students seem to keep all their 23,420 files in one
> folder called "Stuff" I think we can manage like this for a bit longer.
>
> B
>
>
>
> On Tue, Mar 28, 2023 at 4:43 PM Antoine Fabri 
> wrote:
>
> > This email originated outside the University. Check before clicking links
> > or attachments.
> >
> > Dear R-devel,
> >
> > Packages don't allow for subfolders in R with a couple exceptions. We find
> > in "Writing R extensions" :
> >
> > > The R and man subdirectories may contain OS-specific subdirectories named
> > unix or windows.
> >
> > This is something I've seen discussed outside of the mailing list numerous
> > times, and thanks to this SO question
> >
> > https://stackoverflow.com/questions/14902199/using-source-subdirectories-within-r-packages-with-roxygen2
> > I could find a couple instances where this was discussed here as well,
> > apologies if I missed later discussions :
> >
> > * https://stat.ethz.ch/pipermail/r-devel/2009-December/056022.html
> > * https://stat.ethz.ch/pipermail/r-devel/2010-February/056513.html
> >
> > I don't see a very compelling conclusion, nor a justification for the
> > behavior, and I see that it makes some users snarky (second link is an
> > example), so let me make a case.
> >
> > This limitation is an annoyance for bigger projects where we must choose
> > between having fewer files with too many objects defined (less structure,
> > more scrolling), or to have too many scripts, often with long prefixed
> > names to emulate essentially what folders would do. In my experience this
> > creates confusion, slows down the workflow, makes onboarding or open source
> > contributions on a new project harder (where do we start ?), makes dead
> > code easier to happen, makes it harder to test the rights things etc...
> >
> > It would seem to me, but I might be naive, that it'd be a quick enough fix
> > to flatten the R folders not named "unix" or "windows"  when building the
> > package. Is there a good reason why we can't do that ?
> >
> > Thanks,
> >
> > Antoine
> >
> > PS:
> > Other SO Q:
> > https://stackoverflow.com/questions/33776643/subdirectory-in-r-package
> >
> > https://stackoverflow.com/questions/18584807/code-organisation-in-r-package-development
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> [[alternative HTML version deleted]]
>
> __
> 

[Rd] CRAN 'R Sources' page: Provide link to https://svn.r-project.org/R/

2023-03-19 Thread Henrik Bengtsson
Not sure who is the webadmin for
https://cran.r-project.org/sources.html, so posting it here:

I just noticed it's not straightforward to find the Subversion URL for
the R source code.  A natural search would be to go to
https://cran.r-project.org/, then click 'Source code' to get to
https://cran.r-project.org/sources.html.  That page does mention
"Subversion tree", but it gives no URL for it.  I'd like to suggest
adding it in parentheses, as in:

"The above archives are created automatically from the Subversion tree
(https://svn.r-project.org/R/), hence might not even compile on your
platform and can contain any number of bugs. They will probably work,
but maybe not. Use them to verify whether a bug you're tracking has
been fixed or a new feature you always wanted has already been
implemented."

Thanks,

Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] WISH: Optional mechanism preventing var <<- value from assigning non-existing variable

2023-03-19 Thread Henrik Bengtsson
I'd like to be able to prevent the <<- assignment operator ("super
assignment") from assigning to the global environment unless the
variable already exists and is not locked.  If it does not exist or is
locked, I'd like an error to be produced.  This would allow me to
evaluate expressions with this option temporarily set, to protect
against mistakes.

For example, I'd like to do something like:

$ R --vanilla
> exists("a")
[1] FALSE

> options(check.superassignment = TRUE)
> local({ a <<- 1 })
Error: object 'a' not found

> a <- 0
> local({ a <<- 1 })
> a
[1] 1

> rm("a")
> options(check.superassignment = FALSE)
> local({ a <<- 1 })
> exists("a")
[1] TRUE


BACKGROUND:

From help("<<-") we have:

"The operators <<- and ->> are normally only used in functions, and
cause a search to be made through parent environments for an existing
definition of the variable being assigned. If such a variable is found
(and its binding is not locked) then its value is redefined, otherwise
assignment takes place in the global environment."

I argue that it's unfortunate that <<- fallbacks back to assigning to
the global environment if the variable does not already exist.
Unfortunately, it has become a "go to" solution for many to use it
that way.  Sometimes it is intended, sometimes it's a mistake.  We
find it also in R packages on CRAN, even if 'R CMD check' tries to
detect when it happens (but it's limited to do so from run-time
examples and tests).

It's probably too widely used for us to permanently change to a more
strict behavior.  The proposed R option allows me, as a developer,
to evaluate an R expression with the strict behavior, especially if I
don't trust the code.

With 'check.superassignment = TRUE' set, a developer would have to
first declare the variable in the global environment for <<- to assign
there.  This would remove the fallback "If such a variable is found
(and its binding is not locked) then its value is redefined, otherwise
assignment takes place in the global environment" in the current
design.  For those who truly intends to assign to the global, could
use assign(var, value, envir = globalenv()) or globalenv()[[var]] <-
value.

'R CMD check' could temporarily set 'check.superassignment = TRUE'
during checks.  If we let environment variable
'R_CHECK_SUPERASSIGNMENT' set the default value of option
'check.superassignment' on R startup, it would be possible to check
packages optionally this way, but also to run any "non-trusted" R
script in the "strict" mode.


TEASER:

Here's an example why using <<- for assigning to the global
environment is a bad idea:

This works:

$ R --vanilla
> y <- lapply(1:3, function(x) { if (x > 2) keep <<- x; x^2 })
> keep
[1] 3


This doesn't work:

$ R --vanilla
> library(purrr)
> y <- lapply(1:3, function(x) { if (x > 2) keep <<- x; x^2 })
Error in keep <<- x : cannot change value of locked binding for 'keep'


But, if we "declare" the variable first, it works:

$ R --vanilla
> library(purrr)
> keep <- 0
> y <- lapply(1:3, function(x) { if (x > 2) keep <<- x; x^2 })
> keep
[1] 3

/Henrik

PS. Does the <<- operator have an official name? Hadley calls it
"super assignment" in 'Advanced R'
(https://adv-r.hadley.nz/environments.html), which is where I got it
from.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Request: better default R_LIBS_USER

2023-03-16 Thread Henrik Bengtsson
> Your best bet really is to govern your .libPaths from your Rprofile.site and
> Renviron.site ...

To do this for any version of R, one can add:

R_LIBS_USER=~/.local/share/R/%p-library/%v

to ~/.Renviron or the Renviron.site file. This automatically expands
to the platform and R x.y version early on when R starts up, e.g.
~/.local/share/R/x86_64-pc-linux-gnu-library/4.2.

> rather than asking a few million R users to adjust from past practice.

We're all starting out with a fresh R_LIBS_USER once a year when a new
minor version of R is released, so changing the default should be
doable without major trouble. On MS Windows, this move has already
been made: when R 4.2.0 was released, the default R_LIBS_USER location
there was similarly changed to the Local Application Data
directory, e.g.
C:\Users\alice\AppData\Local\R\win-library\4.2.

/Henrik

On Thu, Mar 16, 2023 at 3:09 PM Dirk Eddelbuettel  wrote:
>
>
> On 16 March 2023 at 13:39, Felipe Contreras wrote:
> | I see R by default installs packages in ~/R. I know I can change the
> | default directory with R_LIBS_USER, but software shouldn't be
> | polluting the home directory.
> |
> | For example both python and node install packages to ~/.local/lib,
> | ruby to ~/.local/share. They don't install to for example ~/node.
> |
> | R should do the same: it should install packages to somewhere inside
> | ~/.local by default.
>
> Use of ~/.local is a fairly recent convention (relative to the time R has
> been around, which is now decades) and one which R supports already eg in the
> (rather useful) portable config directories:
>
>> tools::R_user_dir("r2u")
>[1] "/home/edd/.local/share/R/r2u"
>>
>
> Your best bet really is to govern your .libPaths from your Rprofile.site and
> Renviron.site rather than asking a few million R users to adjust from past
> practice.
>
> Also: personal preferences differ. I think of Linux as multi-tenant and
> expect other (even system) users to have access to packages I install. So I
> am happy with this default --- which goes after all back to fruitful
> discussion in a bar in Vienna with Kurt and Fritz some 20 years ago:
>
>> .libPaths()
>[1] "/usr/local/lib/R/site-library" "/usr/lib/R/site-library"
>[3] "/usr/lib/R/library"
>>
>
> Dirk
>
> --
> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Compiling R-devel on older Linux distributions, e.g. RHEL / CentOS 7

2023-02-08 Thread Henrik Bengtsson
On Wed, Feb 8, 2023 at 12:22 PM Iñaki Ucar  wrote:
>
> On Wed, 8 Feb 2023 at 19:59, Henrik Bengtsson
>  wrote:
> >
> > I just want to add a few reasons why users are still on Red
> > Hat/CentOS 7, which I have learned from being deeply involved with
> > big academic and research high-performance compute (HPC) environments.
> > These systems are not like your regular sailing boat, but more like a
> > giant container ship; much harder to navigate, slower to react, you
> > cannot just cruise around and pop into any harbor you like, or when
> > you like. It takes much more effort, more logistics, and more
> > people to operate them. If you mess up, the damage is much bigger.
>
> I'm fully aware of, and I understand, all the technical and
> organizational reasons why there are CentOS 7 systems out there. I
> only challenge a single point (cherry-picked from your list):
>
> > * The majority of users and sysadmins prefer stability over being
> > able to run the latest tools.
>
> This is simply not true. In general, sysadmins do prefer stability,
> but users want the latest tools (otherwise, this very thread would not
> exist, QED). And the first thing is hardly compatible with the second
> one. That is, without containers, which brings us to the next point.

We might operate in different environments, but there are lots of labs
that keep the exact same pipeline for years (5-10 years), because "it
works", and because if they change anything, they might have to
re-analyze all their old data to avoid batch effects purely from
different versions of algorithms. I can agree with this strategy too,
especially if your data are huge and staging them back on the compute
environment from cold storage can be a huge task in itself.  Then
there are reasons such as users being less savvy, or bad memories from
the last time they tried this (years ago, say), when everything broke
and it took weeks or months to sort out.  I'm not trying to make fun of
anyone here - it's just that on big clusters with many users, the
skill-level spectrum varies a lot.

>
> > * Although you might want to tell everyone to just run a new version
> > via Linux containers, it's not the magic sauce for all of the above.
> > Savvy users might be able to do it, but not your average users. Also,
> > this basically puts the common sysadmin burden on the end-user, who
> > now has to keep their container stacks up-to-date and in sync.  In
> > contrast to a homogeneous environment, this strategy increases the
> > support burden on sysadmins, because they will get many more questions
> > and requests for troubleshooting on very specific setups.
>
> How is that so? Let's say a user wants the latest version of R.
> Nothing prevents a sysadmin to set up a script called "R" in the PATH
> that runs e.g. the r2u container [1] with the proper mounts. And
> that's it: the user runs "R" and receives the latest version (and even
> package installations seem to be blazing fast now!) without even
> knowing that it's running inside a container.
>
> I know, you are thinking "security", "permissions"...

I'm actually thinking maintenance and support. When you bring in Linux
containers, you basically introduce a bunch of new compute
environments in addition to your host system. So, instead of the
support team (often same as the sysadm) having to understand and
answer questions for a homogeneous environment, they now have to be
up-to-date with different versions of CentOS/Rocky, Ubuntu, Debian,
... and different container images. In R we often have a hard time
even getting users to report their sessionInfo() - now imagine their
container details.  If admins start providing one-off container
images, that becomes an added maintenance load. But, I agree, Linux
containers are great and make it possible for a lot of users to run
analyses that they otherwise would not be able to do on the host
system.

>
> $ yum install podman
>
> Drop-in replacement for docker, but rootless, daemonless. Also there's
> a similar thing called Apptainer [1], formerly Singularity, that was
> specifically designed with HPC in mind, now part of the Linux
> Foundation.
>
> [1] https://github.com/eddelbuettel/r2u
> [2] https://apptainer.org/

Yes, Singularity/Apptainer is awesome, especially since Docker is
mostly considered a no-no in HPC environments. The minimal, or even
zero use of SUID these days is great. That it runs as a regular
process, as the user themselves, with good default file mounts is also
neat.  These things get even better with newer Linux kernels, which,
by the way, is another motivation for upgrading the OS.

That said, with Apptainer and likes, the user might run into conflicts
here, similar to what we see when use

Re: [Rd] need help from someone know screen reader and R high DPI GUI

2023-02-08 Thread Henrik Bengtsson
Thanks for this work. My suggestion would be to provide those
pre-built Windows binaries to maximize the chances of getting the
feedback you need. The number of people ready to, or set up to, build
R from source, especially on MS Windows, is much smaller than the
number willing to give it a quick try.

/Henrik

On Wed, Feb 8, 2023 at 1:39 AM yu gong  wrote:
>
> hello , everyone:
>
>  I rechecked and retested the patch for the high-DPI Windows R GUI; IMO
> it mostly works. The last thing I am not sure about is the screen reader.
>
>  I downloaded the NVDA screen reader and tried it on the high-DPI R GUI. In
> the NVDA Speech Viewer, it seems it can read the R GUI normally, but since I
> knew so little about screen readers before, I couldn't confirm that NVDA
> indeed works on the high-DPI R GUI.
>
> Could anyone who knows screen readers help me confirm that the high-DPI
> patch does not break screen readers?
>
> I put the patch 
> https://github.com/armgong/misc-r-patch/blob/main/dpi-c-code.diff  .
>
> If anyone need the compiled binary on windows x64 , I can upload it on github 
> repo also.
>
> thanks,
>
> Yu Gong
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Compiling R-devel on older Linux distributions, e.g. RHEL / CentOS 7

2023-02-08 Thread Henrik Bengtsson
I just want to add a few reasons why users are still on Red Hat/CentOS
7, which I have learned from being deeply involved with big academic
and research high-performance compute (HPC) environments.
These systems are not like your regular sailing boat, but more like a
giant container ship; much harder to navigate, slower to react, you
cannot just cruise around and pop into any harbor you like, or when
you like. It takes much more effort, more logistics, and more
people to operate them. If you mess up, the damage is much bigger.

Reasons:

* Users don't have many options, but have to use what is available.

* Red Hat/CentOS is designed for long term stability and backward
compatibility. They've only done major upgrades every 3-4 years.

* Red Hat backports security fixes to old versions of common software,
which is why you see, for instance, Python 3.6 still being the
provided version, although Python declared it End-of-Life in December
2021.

* HPC environments (aka "compute clusters") often have 100s to 1000s of
users. Imagine the number of software tools and different versions
installed in such environments.

* Upgrading an HPC environment is a major disruption for users who
rely on it in their work and research, e.g. some software stacks,
pipelines, and scripts have to be reinstalled and re-coded.

* The majority of users and sysadmins prefer stability over being
able to run the latest tools.

* Over the years, stability increases the technical debt, to a point where
it is cheaper to upgrade than to stay behind.

* Even if sysadmins want to upgrade to a newer release, their hands
are often tied because of external factors.  For example, the
global parallel file system or the backup system you rely on has not
yet been validated for the next OS version you want to upgrade to.
There might also be research-critical scientific pipelines that do
not yet support the new version, which can be because the maintainers
of those tools don't have access to a new version to test on. GPU
drivers might not be available. This is also the case for commercial
tools. Sometimes IT security requirements cannot be met on the new
version, because security scanning tools are not yet up-to-date. There
can also be hardware limitations, e.g. you might even have to replace
some central server for the whole cluster to be able to upgrade.

* Although you might want to tell everyone to just run a new version
via Linux containers, it's not the magic sauce for all of the above.
Savvy users might be able to do it, but not your average users. Also,
this basically puts the common sysadmin burden on the end-user, who
now has to keep their container stacks up-to-date and in sync.  In
contrast to a homogeneous environment, this strategy increases the
support burden on sysadmins, because they will get many more questions
and requests for troubleshooting on very specific setups.

Specifically to Red Hat/CentOS 7: When sites started to think about
migrating to CentOS 8, Red Hat decided to pull the plug and change
their long-term business plan.  This itself was a disruptive event,
because any plans to do a "regular" distro upgrade had to be flushed
down the toilet.  The community waited to see what would happen and
what the options would be.  A lot of sites now plan on migrating to
Rocky 8, which (AFAIU) tries to stay true to the original CentOS
mission.  This means they are waiting for third-party hardware and
software providers to validate their products for Rocky 8, e.g.
parallel file systems, backup software, software stacks, etc.

What R Core, Gabor, and many others are doing, often silently in the
background, is making sure R works smoothly for the many R users out
there, whatever operating system and version they may be on. This is
essential to R's success, and, more importantly, to research and
science being able to move forward.  Those endless hours spent on
trying to support some OS, even obscure ones, pay off many times over,
especially for common OSes such as Red Hat and CentOS.  You spare lots
of users and sysadmins lots of pain when you put those hours in.  So,
thank you for doing all that.

/Henrik

On Wed, Feb 8, 2023 at 2:24 AM Iñaki Ucar  wrote:
>
> On Wed, 8 Feb 2023 at 07:05, Prof Brian Ripley  wrote:
> >
> > On 08/02/2023 00:13, Gábor Csárdi wrote:
> > > As preparation for the next release, I am trying to compile R devel on
> > > RHEL / CentOS 7, which is still supported by RedHat until 2024 June.
>
> True, but with a big asterisk. Full updates ended on 2020-08-06, and
> it's been in maintenance mode since then, meaning that only security
> and critical fixes are released until EOL to facilitate a transition
> to a newer version. So CentOS 7 users shouldn't expect new releases of
> software to be available.
>
> > > There are two issues.
> > >
> > > One is that the libcurl version in CentOS 7 is quite old, 7.29.0, and
> > > R devel now requires 7.32.0, since 83715 about a week ago. This
> > > requirement is 

Re: [Rd] Sys.getenv(): Error in substring(x, m + 1L) : invalid multibyte string at '' if an environment variable contains \xFF

2023-01-30 Thread Henrik Bengtsson
Thanks for the quick replies.

For me, the main issue is that the value of an environment variable
can cause errors in R.  Say that there's an environment variable that
is used as a raw data blob (as Simon mentions) and where that data is
the flavor of the day. Then, this would affect R in a random,
non-predictable way.  Unless it can be argued that environment
variables must not contain random bytes, I argue that R needs to be
robust against this.  If not, we're effectively asking everyone to use
tryCatch() whenever calling Sys.getenv().  Even so, they wouldn't
be able to query *all* environment variables, or even know which they
are, because Sys.getenv() gives an error.  So, I think the only
reasonable thing to do is to make Sys.getenv() robust against this.

One alternative is to have Sys.getenv() (1) detect for which
environment variables this happens, (2) produce an informative warning
that R cannot represent the value correctly, and (3) either drop it,
or return an adjusted value (e.g. NA_character_).  Something like:

> envs <- Sys.getenv()
Warning message:
Environment variable 'BOOM' was dropped, because its
value cannot be represented as a string in R

As you can see from my examples below, R (< 4.1.0) used to drop such
environment variables.

Simon, your comment saying it used to work made me look back at
older versions of R.  Although I didn't say so, I used R 4.2.2 in my
original report.  It looks like it was in R 4.1.0 that this started to
fail.  I get:

# R 2.15.0 - R 3.1.0
$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv()['BOOM']"

  NA
Warning message:
In strsplit(.Internal(Sys.getenv(character(), "")), "=", fixed = TRUE) :
  input string 8 is invalid in this locale

# R 3.2.0 - R 4.0.4
$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv()['BOOM']"
NA  NA
Warning message:
In strsplit(.Internal(Sys.getenv(character(), "")), "=", fixed = TRUE) :
  input string 137 is invalid in this locale

# R 4.1.0 - ...
$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv()['BOOM']"
Error in substring(x, m + 1L) : invalid multibyte string at ''
Calls: Sys.getenv -> substring
In addition: Warning message:
In regexpr("=", x, fixed = TRUE) :
  input string 137 is invalid in this locale
Execution halted

/Henrik

On Mon, Jan 30, 2023 at 4:27 PM Simon Urbanek
 wrote:
>
> Tomas,
>
> I think you're not addressing the actual issue which is a clear regression in 
> Sys.getenv() [because it used to work and still works for single env var, but 
> not a list] and the cryptic error due to that regression (caused by changes 
> in R-devel). So in either case, Sys.getenv needs fixing (i.e., this should 
> really go to the bugzilla). Its behavior is currently inconsistent.
>
> The quoted Python discussion diverges very quickly into file name 
> discussions, but I think that is not relevant here - environment variables 
> are essentially data "blobs" that have application-specific meaning. They 
> just chose to drop them because they didn't want to make a decision.
>
> I don't have a strong opinion on this, but Sys.getenv("FOO") and 
> Sys.getenv()[["FOO"]] should not yield two different results. I would argue 
> that if we want to make specific checks, we should make them conditional - 
> even if the default is to add them. Again, the error is due to the 
> implementation of Sys.getenv() breaking in R-devel, not due to any design 
> decision.
>
> Cheers,
> Simon
>
>
> > On Jan 31, 2023, at 1:04 PM, Tomas Kalibera  
> > wrote:
> >
> >
> > On 1/30/23 23:01, Henrik Bengtsson wrote:
> >> Hello.
> >>
> >> SUMMARY:
> >> $ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv()"
> >> Error in substring(x, m + 1L) : invalid multibyte string at ''
> >>
> >> $ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv('BOOM')"
> >> [1] "\xff"
> >>
> >> BACKGROUND:
> >> I launch R through a Son of Grid Engine (SGE) scheduler, where the R
> >> process is launched on a compute host via 'qrsh', which is part of SGE.
> >> Without going into details, 'mpirun' is also involved. Regardless, in
> >> this process, a 'qrsh'-specific environment variable 'QRSH_COMMAND'
> >> is set automatically.  The value of this variable comprises a string
> >> with \xff (ASCII 255) injected between the words.  This is by design
> >> of SGE [1].  Here is an example of what this environment variable may
> >> look like:
> >>
> >> QRSH_COMMAND= 
> >> orted\xff--hnp-topo-sig\xff2N:2S:32L3:128L2:128L1:128C:256H:x86_64\xff-m

[Rd] Sys.getenv(): Error in substring(x, m + 1L) : invalid multibyte string at '' if an environment variable contains \xFF

2023-01-30 Thread Henrik Bengtsson
Hello.

SUMMARY:
$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv()"
Error in substring(x, m + 1L) : invalid multibyte string at ''

$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv('BOOM')"
[1] "\xff"

BACKGROUND:
I launch R through a Son of Grid Engine (SGE) scheduler, where the R
process is launched on a compute host via 'qrsh', which is part of SGE.
Without going into details, 'mpirun' is also involved. Regardless, in
this process, a 'qrsh'-specific environment variable 'QRSH_COMMAND'
is set automatically.  The value of this variable comprises a string
with \xff (ASCII 255) injected between the words.  This is by design
of SGE [1].  Here is an example of what this environment variable may
look like:

QRSH_COMMAND= 
orted\xff--hnp-topo-sig\xff2N:2S:32L3:128L2:128L1:128C:256H:x86_64\xff-mca\xffess\xff\"env\"\xff-mca\xfforte_ess_jobid\xff\"3473342464\"\xff-mca\xfforte_ess_vpid\xff1\xff-mca\xfforte_ess_num_procs\xff\"3\"\xff-mca\xfforte_hnp_uri\xff\"3473342464.0;tcp://192.168.1.13:50847\"\xff-mca\xffplm\xff\"rsh\"\xff-mca\xfforte_tag_output\xff\"1\"\xff--tree-spawn"

where each \xff is a single byte 255=0xFF=\xFF.


ISSUE:
An environment variable with embedded 0xFF bytes in its value causes
calls to Sys.getenv() to produce an error when running R in a UTF-8
locale. Here is a minimal example on Linux:

$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv()"
Error in substring(x, m + 1L) : invalid multibyte string at ''
Calls: Sys.getenv -> substring
In addition: Warning message:
In regexpr("=", x, fixed = TRUE) :
  input string 134 is invalid in this locale
Execution halted


WORKAROUND:
The workaround is to (1) identify any environment variables with
invalid UTF-8 symbols, and (2) prune or unset those variables before
launching R, e.g. in my SGE case, launching R using:

QRSH_COMMAND= Rscript --vanilla -e "Sys.getenv()"

avoids the problem.  Having to unset/modify environment variables
because R doesn't like them seems a bit of an ad-hoc hack to me.  Also,
if you are not aware of this problem, or not a savvy R user, it can be
quite tricky to track down the above error message, especially if
Sys.getenv() is called deep down in some package dependency.


DISCUSSION/SUGGESTION/ASK:
My suggestion would be to make Sys.getenv() robust against any type of
byte values in environment variable strings.

The error occurs in Sys.getenv() from:

x <- .Internal(Sys.getenv(character(), ""))
m <- regexpr("=", x, fixed = TRUE)  ## produces a warning
n <- substring(x, 1L, m - 1L)
v <- substring(x, m + 1L)  ## produces the error

I know too little about string encodings, so I'm not sure what the
best approach would be here, but maybe falling back to parsing strings
that are invalid in the current locale using the C locale would be
reasonable?  Maybe Sys.getenv() should always use the C locale for
this. It looks like Sys.getenv(name) does this, e.g.

$ BOOM=$'\xFF' LC_ALL=en_US.UTF-8 Rscript --vanilla -e "Sys.getenv('BOOM')"
[1] "\xff"
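
One possible byte-safe variant of the failing parsing step, sketched
here only to illustrate the idea (whether byte-wise semantics are
acceptable inside Sys.getenv() itself is for others to judge):

```r
# strsplit(..., useBytes = TRUE) matches '=' byte-wise and does not
# validate the strings against the current locale; re-join the tail
# in case a value itself contains '='.
x <- c("GOOD=abc", "BOOM=\xFF")
parts <- strsplit(x, "=", fixed = TRUE, useBytes = TRUE)
nms  <- vapply(parts, `[[`, character(1L), 1L)
vals <- vapply(parts, function(p) paste(p[-1L], collapse = "="),
               character(1L))
```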

I'd appreciate any comments and suggestions. I'm happy to file a bug
report on BugZilla, if this is a bug.

Henrik

[1] 
https://github.com/gridengine/gridengine/blob/master/source/clients/qrsh/qrsh_starter.c#L462-L466

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] PeacoQC Bioc version not in line with github repo

2022-12-15 Thread Henrik Bengtsson
> Indeed, although PeacoQC version on Bioc is higher (1.8) than in github
repo (1.7.3), the latest commits have not been included.

This is explained by Bioconductor bumping versions automatically when
there's a new release cycle. So, in this case it went from something like
1.7.2 to 1.8.0 on the Bioconductor side, while your contribution in 1.7.3
is still only on the GitHub side (because it was never pushed up to
Bioconductor).
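
For reference, the maintainer-side fix usually amounts to merging the
GitHub history into the Bioconductor git repository and bumping the
version. A sketch, assuming the standard Bioconductor remote setup
(the remote name 'upstream' is a convention; the development branch
name depends on the Bioconductor setup in use - check their git
documentation):

```sh
# One-time: add the Bioconductor remote (maintainer needs push access)
git remote add upstream git@git.bioconductor.org:packages/PeacoQC
git fetch upstream
git checkout devel        # Bioconductor's development branch (name may vary)
git merge main            # bring in the GitHub commits
# bump Version: in DESCRIPTION into the current devel series, then
git push upstream devel
```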

/Henrik

On Thu, Dec 15, 2022, 04:19 Kern, Lori 
wrote:

> Besides reminding them to keep both in sync there is not much more the
> Bioc Devel can do.  It is up to the maintainers to properly maintain their
> packages and push to the appropriate places to make it available to the
> community.
>
>
> Lori Shepherd - Kern
>
> Bioconductor Core Team
>
> Roswell Park Comprehensive Cancer Center
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
>
> 
> From: Bioc-devel  on behalf of Philippe
> Hauchamps 
> Sent: Thursday, December 15, 2022 4:09 AM
> To: bioc-devel@r-project.org 
> Subject: [Bioc-devel] PeacoQC Bioc version not in line with github repo
>
> Hi dear Bioc-devel community,
>
> I am developing a package, CytoPipeline, which depends a.o. on PeacoQC
> Bioc package. A few months ago, I contributed to PeacoQC with a PR, in
> order to solve an issue :
>
>
> https://github.com/saeyslab/PeacoQC/pull/9
>
>
> https://github.com/saeyslab/PeacoQC/issues/8
>
>
> My PR was subsequently validated by the package maintainer and integrated
> into the code base (June 2022).
>
> However, I recently noticed that the latest changes made on the PeacoQC
> github repo, have not been ported to the BioC git repository.  Indeed,
> although PeacoQC version on Bioc is higher (1.8) than in github repo
> (1.7.3), the latest commits have not been included.
>
> I already posted a message linked to the PeacoQC github issue, to request
> an update of the Bioc repo.
> However, is there possibly some action that could be done from Bioc Devel
> side to gently push for this update ?
>
> Thanks,
>
> Philippe
>
>
>
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
>
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
>
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [R-pkg-devel] NOTEs - Problems with news in 'NEWS.md'

2022-12-12 Thread Henrik Bengtsson
I've got a check_news() function to test that a NEWS.md can be parsed
as expected.  See attached file (if that gets dropped in the
interwebs, it's also at
https://gist.github.com/HenrikBengtsson/2361371be4ae8681b3ee2eb361730990).
Call it in the folder where the NEWS.md file is.
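
In case the attachment does get dropped, the gist of such a check can
be sketched using only documented API: install the package into a
throwaway library and ask utils::news() whether any entries were
parsed. The function and argument names below are my own, not the
gist's:

```r
# Sketch: verify that R can parse entries out of a package's NEWS.md
check_news <- function(pkg_dir = ".") {
  lib <- tempfile("newscheck-"); dir.create(lib)
  install.packages(pkg_dir, repos = NULL, type = "source", lib = lib)
  pkg <- read.dcf(file.path(pkg_dir, "DESCRIPTION"))[1L, "Package"]
  db <- utils::news(package = pkg, lib.loc = lib)
  if (is.null(db) || nrow(db) == 0L)
    stop("No news entries found - the same complaint the CRAN check makes")
  db
}
```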

My $.02

/Henrik

On Mon, Dec 12, 2022 at 5:46 PM Max Turgeon  wrote:
>
> Hi Sanjeev,
>
> Requirements change over time, and it's the responsibility of the maintainer 
> to make sure the package keeps up with the new requirements. Things that 
> "worked" back in 2010 may no longer fit these new requirements, and such is 
> the case for the NEWS.md file.
>
> It's useful to look at examples from other packages to understand how your 
> NEWS.md file should be formatted. Here's ggplot2's, but you can find many 
> other examples on Github: 
> https://raw.githubusercontent.com/tidyverse/ggplot2/main/NEWS.md
>
> I'll point out two things they do that is relevant (according to ?news):
>
>   *   Every new version of the package has a separate level-1 header (i.e. a 
> line that starts with #). New changes appearing in that version are then 
> listed below.
>   *   Changes are sometimes grouped in categories, where each category has a 
> separate level-2 header (i.e. a line that starts with ##).
>
> In your case, you don't seem to use headers at all. If I were you, that's 
> where I would start.
>
> Good luck!
>
>
> Max Turgeon
>
> Adjunct Professor
>
> Department of Statistics
> University of Manitoba
>
> maxturgeon.ca
>
> 
> From: R-package-devel  on behalf of 
> Sanjeev Sariya 
> Sent: Monday, December 12, 2022 7:21 PM
> To: Uwe Ligges ; 
> r-package-devel@r-project.org 
> Subject: Re: [R-pkg-devel] NOTEs - Problems with news in 'NEWS.md'
>
> 
> Caution: This message was sent from outside the University of Manitoba.
> 
>
> Dear Devel and Uwe,
>
> Sorry about not providing more details.
> Please see log file links below:
>
> https://win-builder.r-project.org/incoming_pretest/GARCOM_1.2.1_20221212_174734/Windows/00check.log
>
> https://win-builder.r-project.org/incoming_pretest/GARCOM_1.2.1_20221212_174734/Debian/00check.log
>
> I do not have access to the old email address.
>
> Attached is also the news.md file. I cannot understand what is needed in
> ?news. It was fine until ~2020 Dec.
>
> github link of package:
> https://github.com/sariya/GARCOM/tree/new_test_hello
> Best,
> --
> Sanjeev M
>
>
>
>
> On Tue, Dec 13, 2022 at 5:52 AM Uwe Ligges 
> wrote:
>
> >
> >
> > On 12.12.2022 18:05, Sanjeev Sariya wrote:
> > > Hi there,
> > >
> > > I made changes to the CRAN package by adding a file to the inst/ext
> > folder.
> > > I mention about this in the NEWS.md file however, when I upload the build
> > > package .gz file to cran it fails to pass with error message as:
> > >
> > >
> > > *  Problems with news in 'NEWS.md':  No news entries found.*
> > >
> > > There are two notes. I cannot seem to find the second note.
> >
> > For others to help, you really have to say where the package is and where
> > the check log is. As a CRAN team member, I know:
> >
> > 1. Note is about the maintainer address change. Once the package passes
> > the checks, an auto generated message will go to the old maintainer
> > address. But for that to happen, the second note has to be fixed:
> >
> > 2.   Problems with news in 'NEWS.md':
> >No news entries found.
> >
> > See ?news which explains how NEWS.md files should be written.
> >
> > Best,
> > Uwe Ligges
> >
> >
> >
> > >
> > > Kindly help.
> > > --
> > > Sanjeev M
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-package-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-package-devel
> >
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
>
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel
__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Bioc-devel] Packages failing Development build due to LaTex problem in .Rnw vignette

2022-12-09 Thread Henrik Bengtsson
Yes, it was just fixed in R-devel, cf.
https://bugs.r-project.org/show_bug.cgi?id=18443.

/Henrik

On Fri, Dec 9, 2022 at 7:38 AM Lluís Revilla  wrote:
>
> Hi Rory,
>
> There was a recent change in R devel (
> https://cran.r-project.org/doc/manuals/r-devel/NEWS.html):
>  - sessionInfo() records (and by default prints) the system time zone as
> part of the locale information.
> It was reported a couple of days ago on the R-devel mailing list by Henrik
> Bengtsson that time zones with underscores cause problems, which seems to
> be your case with the America/New_York time zone.
> This will probably be fixed at the R core level.
>
> The other option would be to set the local time zone to another one (you or
> the Bioconductor core team).
>
> Best,
>
> Lluís
>
> On Fri, 9 Dec 2022 at 16:01, Rory Stark via Bioc-devel <
> bioc-devel@r-project.org> wrote:
>
> > My packages are failing in the Development build due to something going
> > wrong processing the .Rnw vignettes. Looking around I see a number of other
> > packages with similar errors.
> >
> > They mostly include this error:
> >
> > LaTeX errors:
> > ! Missing $ inserted.
> > 
> > $
> > l.4064   \item Time zone America/New_
> >  York
> > ! LaTeX Error: Command \item invalid in math mode.
> >
> >
> > although the details differ on each platform.
> >
> > Something has changed in the environment (headers?) that I can't quite
> > track down (the source documents have not changed). Is this a known issue?
> > Any suggestions on resolving it?
> >
> > Rory
> >
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] R-devel: toLatex() for sessionInfo needs to escape new 'Time zone' entry

2022-12-05 Thread Henrik Bengtsson
I've moved this to https://bugs.r-project.org/show_bug.cgi?id=18443.

/Henrik

On Wed, Nov 30, 2022 at 2:03 PM Henrik Bengtsson
 wrote:
>
> BACKGROUND:
>
> In recent versions of R-devel, sessionInfo() has a 'tzone' element, e.g.
>
> > sessionInfo()$tzone
> [1] "America/Los_Angeles"
>
>
> ISSUE:
>
> Some time zones, like the one above, have an underscore.  This
> underscore is *not* escaped by utils:::toLatex.sessionInfo, e.g.
>
> $ TZ="America/Los_Angeles" Rscript --vanilla -e "toLatex(sessionInfo())"
> \begin{itemize}\raggedright
>   \item R Under development (unstable) (2022-11-30 r83391),
> \verb|x86_64-pc-linux-gnu|
>   \item Locale: \verb|LC_CTYPE=en_US.UTF-8|, \verb|LC_NUMERIC=C|,
> \verb|LC_TIME=en_US.UTF-8|, \verb|LC_COLLATE=en_US.UTF-8|,
> \verb|LC_MONETARY=en_US.UTF-8|, \verb|LC_MESSAGES=en_US.UTF-8|,
> \verb|LC_PAPER=en_US.UTF-8|, \verb|LC_NAME=C|, \verb|LC_ADDRESS=C|,
> \verb|LC_TELEPHONE=C|, \verb|LC_MEASUREMENT=en_US.UTF-8|,
> \verb|LC_IDENTIFICATION=C|
>   \item Time zone America/Los_Angeles
>   \item Running under: \verb|Ubuntu 20.04.5 LTS|
>   \item Matrix products: default
>   \item BLAS:   \verb|/home/hb/software/R-devel/trunk/lib/R/lib/libRblas.so|
>   \item LAPACK: \verb|/home/hb/software/R-devel/trunk/lib/R/lib/libRlapack.so|
>   \item Base packages: base, datasets, graphics, grDevices, methods,
> stats, utils
>   \item Loaded via a namespace (and not attached): compiler~4.3.0
> \end{itemize}
>
> This causes LaTeX-based vignettes using toLatex(sessionInfo()) to fail
> their LaTeX compilation with an error, e.g.
>
> Error: processing vignette 'mypkg.Rnw' failed with diagnostics:
> Running 'texi2dvi' on 'mypkg.tex' failed.
> LaTeX errors:
> ! Missing $ inserted.
> 
> $
> l.684   \item Time zone America/Los_
> Angeles
> ! LaTeX Error: Command \item invalid in math mode.
>
>
> SUGGESTION:
>
> To fix this, either escape any underscores, e.g.
>
>   \item Time zone America/Los\_Angeles
>
> or use \verb as done elsewhere:
>
>   \item Time zone \verb|America/Los_Angeles|
>
> /Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] S4 Methods Documentation Convention Triggers Warnings

2022-12-01 Thread Henrik Bengtsson
> Anyways, that's hundreds of \items that I need to fix in a dozen packages. 
> Not fun!

A great opportunity to freshen up your 'sed' skills. (I think it's
possible to use 'sed' here, but I'm not 100% sure.)

/Henrik
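
For the single-line flavor of this pattern, a sed sketch might look like the following. This is an illustration only: real Rd files split these constructs across lines, so a robust script would need more care.

```shell
# Sketch: move the \code{...}: label from the body of an \item into its
# label slot, i.e. '\item{}{\code{show(x)}: ...' becomes
# '\item{\code{show(x)}:}{ ...'. Handles only the single-line case.
printf '%s\n' '\item{}{\code{show(x)}:' |
  sed 's/\\item{}{\(\\code{[^}]*}:\)/\\item{\1}{/'
```

For the multi-line case, a line-spanning tool (e.g. perl with -0) would be needed.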

On Thu, Dec 1, 2022 at 12:27 PM Hervé Pagès  wrote:
>
> Itemizing brings semantics and structure. Plus, the \preformatted
> solution doesn't look good either IMO. FWIW I mostly care about what I
> see when I open the man page in a terminal with ?. I don't care so
> much about the PDF manual, never look at it.
>
> So I'm going to switch from
>
>\item{}{\code{show(x)}:
>  ...
>}
>
> to
>
>\item{\code{show(x)}:}{
>  ...
>}
>
> as suggested by Henrik.
>
> I actually tried the latter many years ago and compared with the former
> and for some reason decided to go for the former. But now I like the
> rendering of the latter better. Don't know if it has changed or if my
> taste has changed ;-)
>
> Anyways, that's hundreds of \items that I need to fix in a dozen
> packages. Not fun! Also the thing with the exact location of the colon
> is very error prone. As Michael said, it would be nice to be able to
> achieve this with a simpler/more natural syntax.
>
> H.
>
> On 30/11/2022 10:47, Deepayan Sarkar wrote:
> > On Wed, Nov 30, 2022 at 9:51 PM Martin Morgan 
> > wrote:
> >
> >> I recently made the change Henrik suggests in the ‘devel’ but not
> >> ‘release’ branch of BiocParallel, so the manuals can be compared. Take a
> >> look at the ‘Constructor’ and ‘Accessors: Logging and results’ sections of
> >> the ‘SnowParam-class.Rd’ man page, starting on p. 53 of the PDF.
> >>
> >>
> >> https://bioconductor.org/packages/release/bioc/manuals/BiocParallel/man/BiocParallel.pdf
> >>
> >>
> >> https://bioconductor.org/packages/devel/bioc/manuals/BiocParallel/man/BiocParallel.pdf
> >>
> >> I think in \item{}{bar} the  is not wrapped, so runs off the
> >> margin in the devel ‘Constructor’ section.
> >
> > This should ideally use \preformatted{}, but R-exts says that "Due to
> > limitations in LaTeX as of this writing, this macro may not be nested
> > within other markup macros other than \dQuote and \sQuote, as errors or bad
> > formatting may result."
> >
> > Still, in this particular case, and possibly others like it, free-format
> > text (instead of itemizing) might work better:
> >
> > \section{Constructor}{
> >
> >  \preformatted{SnowParam(workers = snowWorkers(), type=c("SOCK", "MPI"),
> >tasks = 0L, stop.on.error = FALSE,
> >progressbar = FALSE, RNGseed = NULL,
> >timeout = Inf, exportglobals = TRUE,
> >exportvariables = TRUE,
> >log = FALSE, threshold = "INFO", logdir = NA_character_,
> >resultdir = NA_character_, jobname = "BPJOB",
> >manager.hostname = NA_character_,
> >manager.port = NA_integer_,
> >...)}
> >  Returns an object representing a SNOW cluster. The cluster is not
> >  created until \code{bpstart} is called. Named arguments in \code{...}
> >  are passed to \code{makeCluster}.
> >
> > }
> >
> > Even if we retain the status quo, the \item{}{\code{...}{}} version of this
> > (as in the release branch) is by no means nice-looking.
> >
> > Best,
> > -Deepayan
> >
> > The shorter items in the ‘Accessors: Logging and results’ section look
> >> almost identical, with a little extra (unnecessary) indentation under the
> >> original formatting.
> >>
> >> I changed the ‘Accessors: Back-end control’ to an itemized list, since
> >> there was no description associated with each item. This adds bullets.
> >>
> >> The commit is at
> >>
> >>
> >> https://code.bioconductor.org/browse/BiocParallel/commit/4e85b38b92f2adb68fe04ffb0476cbc47c1241a8
> >>
> >> (as well as https://github.com/Bioconductor/BiocParallel...)
> >>
> >> Martin
> >>
> >> From: Bioc-devel  on behalf of Martin
> >> Maechler 
> >> Date: Wednesday, November 30, 2022 at 6:28 AM
> >> To: Michael Lawrence (MICHAFLA) 
> >> Cc: bioc-devel@r-project.org , Kurt Hornik <
> >> kurt.hor...@wu.ac.at>
> >> Subject: Re: [Bioc-devel] S4 Methods Documentation Convention Triggers
> >> Warnings
> >>> Michael Lawrence \(MICHAFLA\) via Bioc-devel
> >>>  on Mon, 28 Nov 2022 12:11:00 -0800 writes:
> >>  > It may be worth revisiting why we arrived at this convention in the
> >> first
> >>  > place and see whether the Rd system can be enhanced somehow so that
> >> we can
> >>  > achieve the same effect with a more natural syntax.
> >>
> >>  > Michael
> >>
> >>
> >> Yes, I agree.
> >>
> >> It may be that in the distant past, Henrik's suggestion
> >> (a version of which I am using myself a lot in *.Rd -- mostly
> >>   *not* related to S4)
> >> did not work correctly, but I know it has worked for years now,
> >> as I use it "all the time".
> >> and not just I.
> >>
> >> If I grep in R's *base* package alone,
> >>
> >> inside ./R/src/library/base/man/
> >>
> >> grep --color -nH --null -e 

[Rd] R-devel: toLatex() for sessionInfo needs to escape new 'Time zone' entry

2022-11-30 Thread Henrik Bengtsson
BACKGROUND:

In recent versions of R-devel, sessionInfo() has a 'tzone' element, e.g.

> sessionInfo()$tzone
[1] "America/Los_Angeles"


ISSUE:

Some time zones, like the one above, have an underscore.  This
underscore is *not* escaped by utils:::toLatex.sessionInfo, e.g.

$ TZ="America/Los_Angeles" Rscript --vanilla -e "toLatex(sessionInfo())"
\begin{itemize}\raggedright
  \item R Under development (unstable) (2022-11-30 r83391),
\verb|x86_64-pc-linux-gnu|
  \item Locale: \verb|LC_CTYPE=en_US.UTF-8|, \verb|LC_NUMERIC=C|,
\verb|LC_TIME=en_US.UTF-8|, \verb|LC_COLLATE=en_US.UTF-8|,
\verb|LC_MONETARY=en_US.UTF-8|, \verb|LC_MESSAGES=en_US.UTF-8|,
\verb|LC_PAPER=en_US.UTF-8|, \verb|LC_NAME=C|, \verb|LC_ADDRESS=C|,
\verb|LC_TELEPHONE=C|, \verb|LC_MEASUREMENT=en_US.UTF-8|,
\verb|LC_IDENTIFICATION=C|
  \item Time zone America/Los_Angeles
  \item Running under: \verb|Ubuntu 20.04.5 LTS|
  \item Matrix products: default
  \item BLAS:   \verb|/home/hb/software/R-devel/trunk/lib/R/lib/libRblas.so|
  \item LAPACK: \verb|/home/hb/software/R-devel/trunk/lib/R/lib/libRlapack.so|
  \item Base packages: base, datasets, graphics, grDevices, methods,
stats, utils
  \item Loaded via a namespace (and not attached): compiler~4.3.0
\end{itemize}

This causes LaTeX-based vignettes using toLatex(sessionInfo()) to fail
their LaTeX compilation with an error, e.g.

Error: processing vignette 'mypkg.Rnw' failed with diagnostics:
Running 'texi2dvi' on 'mypkg.tex' failed.
LaTeX errors:
! Missing $ inserted.

$
l.684   \item Time zone America/Los_
Angeles
! LaTeX Error: Command \item invalid in math mode.


SUGGESTION:

To fix this, either escape any underscores, e.g.

  \item Time zone America/Los\_Angeles

or use \verb as done elsewhere:

  \item Time zone \verb|America/Los_Angeles|
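
As a stop-gap until R itself is patched, a vignette author could post-process the generated LaTeX and escape the underscores by hand. A naive sketch (it would also, wrongly, escape underscores inside \verb|...| fields, so it must only be applied to the 'Time zone' line):

```shell
# Sketch: escape LaTeX-active underscores in the offending line.
# Naive: do NOT apply this to lines containing \verb|...| content.
printf '%s\n' '\item Time zone America/Los_Angeles' | sed 's/_/\\_/g'
```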

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] S4 Methods Documentation Convention Triggers Warnings

2022-11-30 Thread Henrik Bengtsson
> I think in \item{}{bar} the  is not wrapped, so runs off the margin 
> in the devel ‘Constructor’ section.

Would a hack be to split up the \code{} into multiple ones, e.g.

\item{
   \code{SnowParam(workers = snowWorkers(), type=c("SOCK", "MPI"),}
 \code{tasks = 0L, stop.on.error = FALSE,}
 \code{progressbar = FALSE, RNGseed = NULL,}
 \code{timeout = Inf, exportglobals = TRUE,}
 \code{exportvariables = TRUE,}
 \code{log = FALSE, threshold = "INFO", logdir = NA_character_,}
 \code{resultdir = NA_character_, jobname = "BPJOB",}
 \code{manager.hostname = NA_character_,}
 \code{manager.port = NA_integer_,}
 \code{...)}}:}{

Does that line wrap?

/Henrik

On Wed, Nov 30, 2022 at 8:21 AM Martin Morgan  wrote:
>
> I recently made the change Henrik suggests in the ‘devel’ but not ‘release’ 
> branch of BiocParallel, so the manuals can be compared. Take a look at the 
> ‘Constructor’ and ‘Accessors: Logging and results’ sections of the 
> ‘SnowParam-class.Rd’ man page, starting on p. 53 of the PDF.
>
> https://bioconductor.org/packages/release/bioc/manuals/BiocParallel/man/BiocParallel.pdf
>
> https://bioconductor.org/packages/devel/bioc/manuals/BiocParallel/man/BiocParallel.pdf
>
> I think in \item{}{bar} the  is not wrapped, so runs off the margin 
> in the devel ‘Constructor’ section. The shorter items in the ‘Accessors: 
> Logging and results’ section look almost identical, with a little extra 
> (unnecessary) indentation under the original formatting.
>
> I changed the ‘Accessors: Back-end control’ to an itemized list, since there 
> was no description associated with each item. This adds bullets.
>
> The commit is at
>
> https://code.bioconductor.org/browse/BiocParallel/commit/4e85b38b92f2adb68fe04ffb0476cbc47c1241a8
>
> (as well as https://github.com/Bioconductor/BiocParallel...)
>
> Martin
>
> From: Bioc-devel  on behalf of Martin 
> Maechler 
> Date: Wednesday, November 30, 2022 at 6:28 AM
> To: Michael Lawrence (MICHAFLA) 
> Cc: bioc-devel@r-project.org , Kurt Hornik 
> 
> Subject: Re: [Bioc-devel] S4 Methods Documentation Convention Triggers 
> Warnings
> > Michael Lawrence \(MICHAFLA\) via Bioc-devel
> > on Mon, 28 Nov 2022 12:11:00 -0800 writes:
>
> > It may be worth revisiting why we arrived at this convention in the 
> first
> > place and see whether the Rd system can be enhanced somehow so that we 
> can
> > achieve the same effect with a more natural syntax.
>
> > Michael
>
>
> Yes, I agree.
>
> It may be that in the distant past, Henrik's suggestion
> (a version of which I am using myself a lot in *.Rd -- mostly
>  *not* related to S4)
> did not work correctly, but I know it has worked for years now,
> as I use it "all the time".
> and not just I.
>
> If I grep in R's *base* package alone,
>
> inside ./R/src/library/base/man/
>
> grep --color -nH --null -e '\\item{\\code{' *.Rd
>
> starts with
>
> agrep.Rd
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Bioc-devel] S4 Methods Documentation Convention Triggers Warnings

2022-11-27 Thread Henrik Bengtsson
What about:

  \item{\code{show(x)}}{
  Displays the first five and last five elements.
}

?

On Sat, Nov 26, 2022, 23:00 Dario Strbenac via Bioc-devel <
bioc-devel@r-project.org> wrote:

> Good day,
>
> For a long time, it has been a convention to document S4 methods in the
> format:
>
> \section{Displaying}{
>   In the code snippets below, \code{x} is a GRanges object.
>   \describe{
> \item{}{
>   \code{show(x)}:
>   Displays the first five and last five elements.
> }
>   }
> }
>
> In R Under Development, this is now a warning:
>
> * checking Rd files ... WARNING
> checkRd: (5) GRanges-class.Rd:115-165: \item in \describe must have
> non-empty label.
>
> This affects my own package as well as the core Bioconductor packages
> which I used as inspiration for designing my package documentation seven
> years ago. What should the new convention be? Or could R developers be
> convinced to get rid of this check before this prototype is released?
>
> --
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [R-pkg-devel] [UNCLASSIFIED] Remotes in description when submitting a package until a dependency is fixed

2022-11-16 Thread Henrik Bengtsson
Not sure if it's already been said, but note that we can only use
'Additional_repositories' for *optional* dependencies, which are
listed under 'Suggests' (and the less-used 'Enhances').  We cannot use
them for *hard* dependencies, which are listed under 'Depends' or
'Imports'. From the CRAN Policies
(https://cran.r-project.org/web/packages/policies.html):

"Packages on which a CRAN package depends should be available from a
mainstream repository: if any mentioned in ‘Suggests’ or ‘Enhances’
fields are not from such a repository, where to obtain them at a
repository should be specified in an ‘Additional_repositories’ field
of the DESCRIPTION file (as a comma-separated list of repository URLs)
or for other means of access, described in the ‘Description’ field."

Currently, it's only CRAN and Bioconductor that are "mainstream" repositories.

/Henrik

On Wed, Nov 16, 2022 at 3:36 AM Duncan Murdoch  wrote:
>
> On 15/11/2022 11:59 p.m., Hugh Parsonage wrote:
> > I think you've misunderstood that excerpt.  By "temporary development
> > state", it means _between_ CRAN releases; packages in a development
> > state are not suitable for CRAN, as the policy states:
> >
> >> CRAN hosts packages of publication quality and is not a development 
> >> platform.
> >
> > You'll need to stop depending on that package until it's fixed and the
> > fix is on CRAN. That said, it looks like it might be relatively
> > straightforward to disentangle yourself from the package -- just
> > rewrite the offending example?
>
> Another solution is to put a version of that package in your own drat
> repository, and use "Additional_repositories".  For example, at one
> point rgl used webshot2 before it was released, and I had
>
>Suggests:  webshot2, ...
>Additional_repositories:  https://dmurdoch.github.io/drat
>
> with a copy of webshot2 in the drat repository.
>
> The disadvantage of this approach is that you'll need to keep that
> repository up to date as the third party package evolves, and eventually
> remove the Additional_repositories: line from your DESCRIPTION, which
> requires your own package update.
>
> See https://github.com/eddelbuettel/drat for instructions on setting up
> the drat repository.
>
> Duncan Murdoch
>
> >
> > On Wed, 16 Nov 2022 at 15:35, Bernd.Gruber  
> > wrote:
> >>
> >> Hi,
> >>
> >> I have a package (dartR) that needs to be updated by CRAN (and got a time 
> >> set until a certain date). It depends on a package that is currently 
> >> showing errors in the CRAN results and therefore fails. The maintainer of 
> >> that package is busily trying to rectify the error (as can be seen be 
> >> repeated submissions in the last weeks), but was not able yet to fix it. 
> >> As we are running out of time my approach would be to have a version of 
> >> the package that fixes it and use Remotes: in the description. It runs 
> >> fine without errors.
> >>
> >> In the R-packages book I read the following:
> >>
> >> "It's important to note that you should not submit your package to CRAN in 
> >> the intermediate state, meaning with a Remotes field and with a dependency 
> >> required at a version that's not available from CRAN or Bioconductor. For 
> >> CRAN packages, this can only be a temporary development state, eventually 
> >> resolved when the dependency updates on CRAN and you can bump your minimum 
> >> version accordingly."
> >>
> >> So is it okay to submit our package with a remote statement until the 
> >> maintainer of the other package has fixed their issues?
> >>
> >> Thanks in advance,
> >> Bernd
> >>
> >>
> >> ==
> >> Dr Bernd Gruber  )/_
> >>   _.--..---"-,--c_
> >> Professor Ecological Modelling  \|..'   ._O__)_
> >> Tel: (02) 6206 3804 ,=._.+   _ \..--( /
> >> Fax: (02) 6201 2328   \\.-''_.-' \ ( \_
> >> Institute for Applied Ecology  `'''   `\__   /\
> >> Faculty of Science and Technology  ')
> >> University of Canberra   ACT 2601 AUSTRALIA
> >> Email: bernd.gru...@canberra.edu.au
> >> WWW: 
> >> bernd-gruber
> >> ==
> >>
> >>
> >> The Ngunnawal people are the Traditional Custodians of the ACT where UC's 
> >> Bruce Campus is situated and are an integral and celebrated part of UC's 
> >> culture. We also acknowledge other First Nations Peoples.
> >>
> >> Australian Government Higher Education Registered Provider (CRICOS) 

Re: [R-pkg-devel] Too many processes spawned on Windows and Debian, although only 2 cores should be used

2022-11-16 Thread Henrik Bengtsson
Hello.

As already pointed out, the current R implementation treats any
non-empty value of _R_CHECK_LIMIT_CORES_ different from "false" as a
true value, e.g. "TRUE", "true", "T", "1", but also "donald duck".
Using '--as-cran' sets _R_CHECK_LIMIT_CORES_="TRUE", if unset.  If
already set, it'll not touch it.  So, it could be that a CRAN check
server already uses, say, _R_CHECK_LIMIT_CORES_="true".  We cannot
make assumptions about that.

To make your life, and an end-user's too, easier, I suggest just using

  num_workers <- 2L

without conditioning on running on CRAN or not.

Why? There are many problems with using parallel::detectCores().

First of all, it can return NA_integer_ on some systems, so you cannot
assume it gives a valid value (== error).  It can also return 1L,
which means your 'num_workers - 1' will give zero worker (== error).
You need to account for that if you rely on detectCores().

Second, detectCores() returns number of physical CPU cores. It's
getting more and more common to run in "cgroups" constrained
environments where your R process only gets access to a fraction of
these cores.  Such constraints are in place in many shared multi-user
HPC environments, and sometimes when using Linux containers (e.g.
Docker, Apptainer, and Podman).  A notable example of this is when
using the RStudio Cloud.  So, if you use detectCores() on those
systems, you'll actually over-parallelize, which slows things down and
you risk running out of memory. For example, you might launch 64
parallel workers when you only have access to four CPU cores.  Each
core will be clogged up by 16 workers.

Third, if you default to detectCores() and a user runs your code on a
machine shared by many users, the other users will not be happy.  Note
that the user will often not know they're overusing the machine.  So,
it's a lose-lose for everyone.

Fourth, detectCores() will return *all* physical CPU cores on the
current machine. These days we have machines with 128, 196, and more
cores.  Are you sure your software will actually run faster when using
that many cores?  The benefit from parallelization tends to decrease
as you add more workers until there is no longer a speed improvement.
If you keep adding more parallel workers you're going to see a
negative effect, i.e. you're penalized for parallelizing too much.
So, be aware that when you test on 16 or 24 cores and things runs
really fast, that might not be the experience for other users, or
users in the future (who will have access to more CPU cores).

So, yes, I suggest not to use num_workers <- detectCores().  Pick a
fixed number instead, and the CRAN policy suggests using two.  You can
let the user control how many they want to use.  As a developer, it's
really, really hard (read: impossible) to know how many they want to use.

Cheers,

Henrik

PS. Note that detectCores() returns a single integer value (possibly
NA_integer_).  Because of this, there is no need to subset with
num_workers[1]. I've seen this used in code; not sure where it comes
from, but it looks like cut'n'paste behavior.
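
Putting the advice together, a defensive helper might look like the sketch below. The option name 'mypkg.workers' is hypothetical; adapt it to your package.

```r
## Sketch: default to two workers (CRAN-friendly), let the user override
## via an R option, and sanity-check the value instead of trusting
## detectCores(), which may return NA_integer_ or 1L.
num_workers <- function(default = 2L) {
  n <- getOption("mypkg.workers", default)  # hypothetical user override
  n <- suppressWarnings(as.integer(n)[1])
  if (is.na(n) || n < 1L) n <- default
  n
}
```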

On Wed, Nov 16, 2022 at 6:38 AM Riko Kelter  wrote:
>
> Hi Ivan,
>
> thanks for the info, I changed the check as you pointed out and it
> worked. R CMD build and R CMD check --as-cran run without errors or
> warnings on Linux + MacOS. However, I uploaded the package again at the
> WINBUILDER service and obtained the following weird error:
>
> * checking re-building of vignette outputs ... ERROR
> Check process probably crashed or hung up for 20 minutes ... killed
> Most likely this happened in the example checks (?),
> if not, ignore the following last lines of example output:
>
>  End of example output (where/before crash/hang up occured ?) 
>
> Strangely, there are no examples included in any .Rd file. Also, I
> checked whether a piece of code spawns new clusters. However, the
> critical lines are inside a function which is repeatedly called in the
> vignettes. The parallelized part looks as copied below. After the code
> is executed the cluster is stopped. I use registerDoSNOW(cl) because
> otherwise my progress bar does not work.
>
>
> Code:
>
> ### CHECK CORES
>
> chk <- tolower(Sys.getenv("_R_CHECK_LIMIT_CORES_", ""))
>if (nzchar(chk) && (chk != "false")){  # then limit the workers
>  num_workers <- 2L
>} else {
>  # use all cores
>  num_workers <- parallel::detectCores()
>}
>
>chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
>
>cl <- parallel::makeCluster(num_workers[1]-1) # not to overload your
> computer
>#doParallel::registerDoParallel(cl)
>doSNOW::registerDoSNOW(cl)
>
> ### SET UP PROGRESS BAR
>
> pb <- progress_bar$new(
>  format = "Iteration = :letter [:bar] :elapsed | expected time till
> finish: :eta",
>  total = nsim,# 100
>  width = 120)
>
>progress_letter <- seq(1,nsim)  # token reported in progress bar
>
># allowing progress bar to 

Re: [R-pkg-devel] Examples with CPU time is greater than elapsed time.

2022-11-05 Thread Henrik Bengtsson
I think it's because the examples use more than 250% of CPU load on
average, which suggests they run with more than two parallel workers,
the upper limit in the CRAN Policies
(https://cran.r-project.org/web/packages/policies.html):

"If running a package uses multiple threads/cores it must never use
more than two simultaneously: the check farm is a shared resource and
will typically be running many checks simultaneously. "

From the R Internals
(https://cran.r-project.org/doc/manuals/r-release/R-ints.html):

"_R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_: For checks with
timings enabled, report examples where the ratio of CPU time to
elapsed time exceeds this threshold (and the CPU time is at least one
second). This can help detect the simultaneous use of multiple CPU
cores. Default: NA."

It looks like CRAN incoming sets
_R_CHECK_EXAMPLE_TIMING_CPU_TO_ELAPSED_THRESHOLD_=2.5.

If multi-threading is involved, I guess you need to make sure to limit
it to ~2 parallel threads.
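
Assuming the extra load indeed comes from OpenMP threading in xgboost's native code (an assumption, not something the check log confirms), one blunt way to stay under the threshold is to cap the thread count in the environment before the checks run:

```shell
# Sketch: cap OpenMP threads at two, then run the checks.
# (Assumes the high CPU/elapsed ratio is caused by OpenMP threading.)
export OMP_NUM_THREADS=2
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
# next step would be: R CMD check --as-cran xgboost_*.tar.gz
```

A more targeted fix would be to pass a small thread count (e.g. nthread = 2) in the example code itself, so that users running the examples are unaffected by environment variables.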

/Henrik

On Sat, Nov 5, 2022 at 11:10 AM Jiaming Yuan  wrote:
>
> Hi all,
>
> I tried to submit an update to the xgboost package but didn't pass the
> pre-tests with the following note (solved the other one, but this one is
> a bit confusing):
>
> ```
> Flavor: r-devel-linux-x86_64-debian-gcc
> Check: examples, Result: NOTE
>Examples with CPU time > 2.5 times elapsed time
>                         user system elapsed ratio
>    cb.gblinear.history 1.454  0.017    0.49 3.002
>
> ```
>
> I can't reproduce the note on win-builder:
> https://win-builder.r-project.org/ as it's running on Windows but the
> note appears on debian tests. I'm not able to reproduce it on my local
> machine either with Ubuntu 22.04. I'm wondering what the note is trying
> to tell me and how can I resolve it with confidence.
>
>
> The full log is here:
> https://win-builder.r-project.org/incoming_pretest/xgboost_1.7.1.1_20221104_194252/Debian/00check.log
> .
>
>
> Many thanks!
>
> Jiaming
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Rscript -e EXPR fails to launch if stdin is closed

2022-10-10 Thread Henrik Bengtsson
Thank you Peter for the quick fix.  Will this make it into R-patched
to become R 4.2.2 soon?

I can confirm that the fix resolved also the original problem report
involving launching a parallel PSOCK cluster from within a 'processx'
background process
(https://stackoverflow.com/questions/73962109/why-are-the-workers-failing-to-connect-when-calling-makepsockcluster-from-an-e/73991833#73991833
and https://github.com/r-lib/callr/issues/236)


/Henrik

On Mon, Oct 10, 2022 at 5:54 AM peter dalgaard  wrote:
>
> It seems to work simply to do  "if (ifd >= 0)..." (the ifp test is fine since 
> ifp is FILE* and initialized to NULL). Will commit (to r-devel for now).
>
> -pd
>
> > On 10 Oct 2022, at 11:07 , peter dalgaard  wrote:
> >
> > He!
> >
> > Yes, that looks like a blunder.
> >
> > mkstemp() returns -1 on failure, not 0, so the test on ifd (and I suppose 
> > also the one on ifp) is wrong. And of course, once you close file 
> > descriptor 0, mkstemp() chooses the 1st available fd, i.e. 0, for its 
> > return value.
> >
> > -pd
> >
> >> On 9 Oct 2022, at 20:25 , Henrik Bengtsson  
> >> wrote:
> >>
> >> Rscript fails to launch if the standard input (stdin) is closed, e.g.
> >>
> >> $ Rscript --vanilla -e 42 0<&-
> >> Fatal error: creating temporary file for '-e' failed
> >>
> >> This appears to only happen with `-e EXPR`, e.g. it works when doing:
> >>
> >> $ echo "42" > script.R
> >> $ Rscript --vanilla script.R 0<&-
> >> [1] 42
> >>
> >> and:
> >>
> >> $ R --vanilla 0<&-
> >> R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
> >> Copyright (C) 2022 The R Foundation for Statistical Computing
> >> Platform: x86_64-pc-linux-gnu (64-bit)
> >> ...
> >>>
> >>
> >>
> >> TROUBLESHOOTING:
> >>
> >> $ strace Rscript --vanilla -e 42 0<&-
> >> execve("/home/hb/shared/software/CBI/R-4.2.1-gcc9/bin/Rscript",
> >> ["Rscript", "--vanilla", "-e", "42"], 0x7fff9f476418 /* 147 vars */) =
> >> 0
> >> brk(NULL)   = 0x5625ca9e6000
> >> arch_prctl(0x3001 /* ARCH_??? */, 0x7fff23b4d260) = -1 EINVAL (Invalid 
> >> argument)
> >> ...
> >> fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0
> >> write(1, "Fatal error: creating temporary "..., 53Fatal error:
> >> creating temporary file for '-e' failed
> >> ) = 53
> >> exit_group(2)   = ?
> >> +++ exited with 2 +++
> >>
> >> which points to src/unix/system.c:
> >>
> >> ifd = mkstemp(ifile);
> >> if (ifd > 0)
> >>   ifp = fdopen(ifd, "w+");
> >> if(!ifp) R_Suicide(_("creating temporary file for '-e' failed"));
> >>
> >>
> >> One rationale for having closed standard files (including stdin) is to
> >> avoid leaking file descriptors, cf.
> >> https://wiki.sei.cmu.edu/confluence/display/c/FIO22-C.+Close+files+before+spawning+processes
> >> and https://danwalsh.livejournal.com/53603.html.  The background for
> >> reporting on this was that `system()` fails to work in processx
> >> spawned processes, which closes the standard files by default in
> >> processx (<= 3.7.0).
> >>
> >> Best,
> >>
> >> Henrik
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> > --
> > Peter Dalgaard, Professor,
> > Center for Statistics, Copenhagen Business School
> > Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> > Phone: (+45)38153501
> > Office: A 4.23
> > Email: pd@cbs.dk  Priv: pda...@gmail.com
> >
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Rscript -e EXPR fails to launch if stdin is closed

2022-10-09 Thread Henrik Bengtsson
Rscript fails to launch if the standard input (stdin) is closed, e.g.

$ Rscript --vanilla -e 42 0<&-
Fatal error: creating temporary file for '-e' failed

This appears to only happen with `-e EXPR`, e.g. it works when doing:

$ echo "42" > script.R
$ Rscript --vanilla script.R 0<&-
[1] 42

and:

$ R --vanilla 0<&-
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
...
>


TROUBLESHOOTING:

$ strace Rscript --vanilla -e 42 0<&-
execve("/home/hb/shared/software/CBI/R-4.2.1-gcc9/bin/Rscript",
["Rscript", "--vanilla", "-e", "42"], 0x7fff9f476418 /* 147 vars */) =
0
brk(NULL)   = 0x5625ca9e6000
arch_prctl(0x3001 /* ARCH_??? */, 0x7fff23b4d260) = -1 EINVAL (Invalid argument)
...
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0
write(1, "Fatal error: creating temporary "..., 53Fatal error:
creating temporary file for '-e' failed
) = 53
exit_group(2)   = ?
+++ exited with 2 +++

which points to src/unix/system.c:

ifd = mkstemp(ifile);
if (ifd > 0)
ifp = fdopen(ifd, "w+");
if(!ifp) R_Suicide(_("creating temporary file for '-e' failed"));
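
The fix adopted shortly afterwards in R-devel (see the 2022-10-10 follow-up in this thread) amounts to accepting 0 as a valid descriptor, since mkstemp() signals failure with -1, not 0. A sketch of the change, not the literal commit:

```diff
--- src/unix/system.c
+++ src/unix/system.c
     ifd = mkstemp(ifile);
-    if (ifd > 0)
+    if (ifd >= 0)    /* 0 is a valid fd once stdin is closed; -1 means failure */
 	ifp = fdopen(ifd, "w+");
     if(!ifp) R_Suicide(_("creating temporary file for '-e' failed"));
```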


One rationale for having closed standard files (including stdin) is to
avoid leaking file descriptors, cf.
https://wiki.sei.cmu.edu/confluence/display/c/FIO22-C.+Close+files+before+spawning+processes
and https://danwalsh.livejournal.com/53603.html.  The background for
reporting this was that `system()` fails to work in processx-spawned
processes; processx (<= 3.7.0) closes the standard files by default.

Best,

Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] Interpreting BiocCheck output

2022-10-07 Thread Henrik Bengtsson
I'm throwing in another 1 cent.

I agree that utils::globalVariables() is risky; since it goes in the
root of the package code, it applies to *all* functions in the
package, which is a bit too broad of a stroke for my taste.  The way I
deal with false globals from non-standard evaluation (NSE) is to
declare them as dummy variables local to the function.  In this
case, I would use:

myfcn <- function() {
  ## To please R CMD check
  mpg <- cyl <- NULL

  mtcars |> select(mpg, cyl) |> head(3)
}


ADVANCED:

To avoid those dummy assignments from taking place in every call, one
can also do:

myfcn <- local({
  ## To please R CMD check
  mpg <- cyl <- NULL

  function() {
mtcars |> select(mpg, cyl) |> head(3)
  }
})

which is also a bit cleaner, because it keeps the original function body as-is.


REALLY ADVANCED:
If you want to be fancy, you can even protect against mistakes by using:

myfcn <- local({
  ## To please R CMD check
  mpg <- cyl <- NULL

  ## Prevent developer mistakes
  lapply(names(environment()), function(name, envir) {
    delayedAssign(
      name,
      stop(sprintf("INTERNAL ERROR: %s is not declared",
                   sQuote(name)), call. = FALSE),
      assign.env = envir
    )
  }, envir = environment())

  function() {
    mtcars |> select(mpg, cyl) |> head(3)
  }
})

The latter would throw an error if you actually end up using 'mpg'
or 'cyl' in a non-NSE way.
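
To make the guard concrete, here is a minimal sketch (mine, not from
the original message): referencing one of the dummies directly, i.e.
in a non-NSE way, forces the promise set up by delayedAssign() and
signals the error.

```r
## Minimal sketch (not part of the original message): a non-NSE use of
## the dummy 'mpg' forces the delayedAssign() promise and errors out.
f <- local({
  mpg <- NULL  ## dummy to please R CMD check
  delayedAssign("mpg",
    stop("INTERNAL ERROR: 'mpg' is not declared", call. = FALSE))
  function() mpg  ## non-NSE use of 'mpg'
})
msg <- tryCatch(f(), error = conditionMessage)
msg  ## "INTERNAL ERROR: 'mpg' is not declared"
```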

/Henrik

On Fri, Oct 7, 2022 at 12:29 PM Martin Morgan  wrote:
>
> Just my two cents, but I don’t think using `globalVariables()` is a good idea 
> in a package – it’s too easy to say that R should ignore a variable that it 
> should not.
>
> In the context of dplyr, the alternative is to `importFrom dplyr .data` or to 
> use ‘standard’ evaluation, depending on circumstance
>
>
> > mtcars |> as_tibble() |> filter(.data$mpg > 30)  # .data is imported, so 
> > known…
> # A tibble: 4 × 11
>     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
> 1  32.4     4  78.7    66  4.08  2.2   19.5     1     1     4     1
> 2  30.4     4  75.7    52  4.93  1.62  18.5     1     1     4     2
> 3  33.9     4  71.1    65  4.22  1.84  19.9     1     1     4     1
> 4  30.4     4  95.1   113  3.77  1.51  16.9     1     1     5     2
> > mtcars |> select("mpg", "cyl") |> head(3)  # `"mpg"` and `"cyl"` are 
> > character vectors, not symbols…
>               mpg cyl
> Mazda RX4     21.0   6
> Mazda RX4 Wag 21.0   6
> Datsun 710    22.8   4
>
> Martin
>
> From: Bioc-devel  on behalf of Marcel Ramos 
> 
> Date: Friday, October 7, 2022 at 3:07 PM
> To: bioc-devel@r-project.org 
> Subject: Re: [Bioc-devel] Interpreting BiocCheck output
> Hi Giulia,
>
> Thanks for sharing.
> I took a look at https://github.com/calabrialab/ISAnalytics and I'm glad
> you resolved the issue.
>
> Just a reminder, you can also use `utils::globalVariables('.')` in your
> package for functions
> that use `'.'` (and other symbols) as a variable, e.g. in `purrr::pmap`.
>
> Best regards,
>
> Marcel
>
>
> On 10/6/22 4:34 AM, Giulia Pais wrote:
> >
> > Hi, thanks for the reply. I managed to fix the first error as it was a
> > minor issue in the code, while for the second one I don’t have a
> > solution since the problem appears only locally and not on Bioconductor
> > after the build.
> >
> > Just for reference the package is ISAnalytics and the BiocCheck
> > version is the latest one.
> >
> > Thanks again,
> >
> > Giulia
> >
> > *From: *Bioc-devel  on behalf of
> > Marcel Ramos 
> > *Date: *Wednesday, October 5, 2022 at 23:48
> > *To: *bioc-devel@r-project.org 
> > *Subject: *Re: [Bioc-devel] EXTERNAL: Interpreting BiocCheck output
> >
> > Hi Giulia,
> >
> > Are you using a recent version of BiocCheck?
> >
> > If so, check the bottom of the BiocCheck::BiocCheck():
> >
> > ---
> > See the .BiocCheck folder and run
> >  browseVignettes(package = 'BiocCheck')
> > for details.
> > ---
> >
> > Can you provide more details, e.g., the repository of the package?
> >
> > Thanks.
> >
> > Best regards,
> >
> > Marcel
> >
> > On 10/4/22 4:44 AM, Giulia Pais wrote:
> > > Hello,
> > > I'm having some issues in interpreting BiocCheck outputs, maybe
> > someone can tell me how to fix the issues.
> > >
> > > I've got 2 main issues that cause the check to fail after normal
> > CRAN check has passed:
> > >
> > >1.  I get this error message
> > >
> > > * Checking if other packages can import this one...
> > >
> > >  * ERROR: Packages providing 2 object(s) used in this package
> > should be imported in the NAMESPACE file, otherwise packages importing
> > >
> > >this package may fail.
> > >
> > >
> > >
> > > However it is nowhere mentioned which packages they are and where
> > those objects are instantiated so I have no clue how to solve this one
> > >
> > >1.  Since previous version of the package, which built and passed
> > checks without issues, I've been using a custom *.Rmd file placed in
> > inst/rmd in the vignette to recycle the same chunk of code and
> 

Re: [Rd] [External] Time to drop globalenv() from searches in package code?

2022-10-05 Thread Henrik Bengtsson
Excluding the global environment, and all its parent environments from
the search path that a package sees would be great news, because it
would makes things more robust and probably detect a few more bugs out
there.  In addition to the use case that Duncan mentions, it would
also remove the ambiguity that comes from searching attached packages
for global/free variables.

Speaking of the latter, isn't it time to escalate the following to a WARNING?

* checking R code for possible problems ... NOTE
my_fcn: no visible global function definition for ‘var’
Undefined global functions or variables:
  var
Consider adding
  importFrom("stats", "var")
to your NAMESPACE file.

There are several packages on Bioconductor with such mistakes, and I
can imagine there are still some old CRAN packages too that haven't
gone through "recent" CRAN incoming checks. The problem here is that
the intended `var` can be overridden in the global environment, in a
third-party attached package, or in any other environment on the
search() path.
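
To illustrate the hazard, here is a minimal sketch (mine, not from the
original message): at the top level, an undeclared `var` inside a
function resolves through the enclosing environments and search path,
so a stray global definition shadows stats::var.

```r
## Minimal sketch (mine): 'var' is a free variable in my_fcn, so it is
## looked up through the enclosing environment / search path, and a
## definition in the global environment shadows stats::var.
my_fcn <- function(x) var(x)
var <- function(x) "not stats::var!"  ## stray global definition
my_fcn(1:10)  ## "not stats::var!" instead of the variance
```

Inside a package namespace the lookup chain is namespace, imports,
base, then the search path, which is why an explicit
importFrom("stats", "var") removes the ambiguity.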

Regarding:

> if (require("foo"))
>bar(...)
>
> with bar exported from foo. I don't know if that is already warned
> about.  Moving away from this is arguably good in principle but also
> probably fairly disruptive.

This should be covered by 'R CMD check', also without --as-cran; if we use:

hello <- function() {
  msg <- "Hello world!"
  if (require("tools")) msg <- toTitleCase(msg)
  message(msg)
}

we get:

* checking dependencies in R code ... NOTE
'library' or 'require' call to ‘tools’ in package code.
  Please use :: or requireNamespace() instead.
  See section 'Suggested packages' in the 'Writing R Extensions' manual.…
...
* checking R code for possible problems ... NOTE
hello: no visible global function definition for ‘toTitleCase’
Undefined global functions or variables:
  toTitleCase

This should be enough for a package developer to figure out that the
function should be implemented as:

hello <- function() {
  msg <- "Hello world!"
  if (requireNamespace("tools")) msg <- tools::toTitleCase(msg)
  message(msg)
}

which passes 'R CMD check' with all OK.


BTW, is there a reason why one cannot import from the 'base' package?
If we add, say, importFrom(base, mean) to a package's NAMESPACE file,
the package builds fine, but fails to install;

$ R --vanilla CMD INSTALL teeny_0.1.0.tar.gz
* installing to library ‘/home/hb/R/x86_64-pc-linux-gnu-library/4.2-CBI-gcc9’
* installing *source* package ‘teeny’ ...
** using staged installation
** R
** byte-compile and prepare package for lazy loading
Error in asNamespace(ns, base.OK = FALSE) :
  operation not allowed on base namespace
Calls:  ... namespaceImportFrom -> importIntoEnv ->
getNamespaceInfo -> asNamespace
Execution halted
ERROR: lazy loading failed for package ‘teeny’

Is this by design (e.g. more flexibility in maintaining the base-R
packages), or due to a technical limitation (e.g. the 'base' package
is not fully loaded at this point)?

/Henrik

On Sat, Sep 17, 2022 at 1:24 PM  wrote:
>
> On Sat, 17 Sep 2022, Kurt Hornik wrote:
>
> >> luke-tierney  writes:
> >
> >> On Thu, 15 Sep 2022, Duncan Murdoch wrote:
> >>> The author of this Stackoverflow question
> >>> https://stackoverflow.com/q/73722496/2554330 got confused because a typo 
> >>> in
> >>> his code didn't trigger an error in normal circumstances, but it did when 
> >>> he
> >>> ran his code in pkgdown.
> >>>
> >>> The typo was to use "x" in a test, when the local variable was named ".x".
> >>> There was no "x" defined locally or in the package or its imports, so the
> >>> search got all the way to the global environment and found one.  (The very
> >>> confusing part for this user was that it found the right variable.)
> >>>
> >>> This author had suppressed the "R CMD check" check for use of global
> >>> variables.  Obviously he shouldn't have done that, but he's working with
> >>> tidyverse NSE, and that causes so many false positives that it is somewhat
> >>> understandable he would suppress one too many.
> >>>
> >>> The pkgdown simulation of code in examples doesn't do perfect mimicry of
> >>> running it at top level; the fake global environment never makes it onto 
> >>> the
> >>> search list.  Some might call this a bug, but I'd call it the right search
> >>> strategy.
> >>>
> >>> My suggestion is that the search for variables in package code should 
> >>> never
> >>> get to globalenv().  The chain of environments should stop after handling 
> >>> the
> >>> imports.  (Probably base package functions should also be implicitly
> >>> imported, but nothing else.)
> >>>
> >
> >> This was considered and discussed when I added namespaces. Basically
> >> it would mean making the parent of the base namespace environment be
> >> the empty environment instead of the global environment. As a design
> >> this is cleaner, and it would be a one-line change in eval.c.  But
> >> there were technical reasons this was not a viable option at the time,
> >> also a few political reasons. The 

Re: [R-pkg-devel] NOTE checking for detritus in the temp directory on windows

2022-09-30 Thread Henrik Bengtsson
Hi.

> * checking for detritus in the temp directory ... NOTE
> Found the following files/directories:
> 'Rscript306c1f1adf1' 'Rscript3fb41f1adf1'

Whenever you see detritus files with 'Rscript' names like these, it's
a strong indication that parallel workers on MS Windows are involved.
More precisely, each of them comes from a separate PSOCK cluster
worker, typically launched by parallel::makeCluster().  When such
workers are launched in example code, unit tests, or vignettes, those
files are created by R itself when running on MS Windows.  If the
workers are not properly shut down (e.g. forgetting to call
parallel::stopCluster()), those files are left behind, and 'R CMD
check' will detect them.
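
A minimal sketch (mine, not from the original message) of the robust
pattern: register the shutdown with on.exit() right after creating
the cluster, so the workers are stopped even if the parallel code
errors.

```r
## Minimal sketch (mine): deterministic PSOCK-worker cleanup, so that
## 'R CMD check' finds no leftover 'Rscript*' files in the temp folder.
f <- function(n = 2L) {
  cl <- parallel::makeCluster(n)                   ## spawns worker processes
  on.exit(parallel::stopCluster(cl), add = TRUE)   ## always shut them down
  unlist(parallel::parLapply(cl, 1:4, function(i) i^2))
}
f()  ## 1 4 9 16
```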

Now, from experience, though neither fully investigated nor understood
yet, it *might* be that even if one calls parallel::stopCluster() at
the end of examples, unit tests, or vignettes, there is still a race
condition where the parallel workers are still in the process of
shutting down when 'R CMD check' checks for detritus files.  If that
is the case, the problem might be sporadic and tricky to reproduce.
This has happened to me exactly once.  I'm mentioning this only in
case you're struggling to reproduce it, but my guess is that this is
*not* the case here.  Please see below for options on how to
reproduce.

Now, I see you're using 'future' for parallelization.  Because of
this, I suspect you use plan(multisession), which launches parallel
PSOCK workers like parallel::makeCluster().  To shut down those
workers at the end, call plan(sequential).  My guess is that this is
your problem.  Before you do that, I would make sure you can reproduce
the problem first - see below.
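
A minimal sketch (mine, not from the original message) of what that
fix could look like with 'future'; plan(sequential) at the end shuts
down the multisession (PSOCK) workers before the example or vignette
finishes:

```r
## Minimal sketch (mine), assuming the 'future' package is installed:
library(future)
plan(multisession, workers = 2)  ## launches background Rscript workers
y <- value(future(6 * 7))        ## ... the parallel work ...
plan(sequential)                 ## shuts the workers down again
y
```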

> I did not manage to reproduce it with Rhub and github action.

1. Looking at your
https://github.com/comeetie/greed/blob/master/.github/workflows/R-CMD-check.yaml,
I see you're not checking with **R-devel** on MS Windows. I suggest
you add that to see if you can reproduce it there.  You could also
add an explicit `env: _R_CHECK_THINGS_IN_TEMP_DIR_: true` just in
case (but I'd guess --as-cran does this).

2. Similarly, did you make sure to test with R-devel on MS Windows on R-hub?

3. Did you try with R-devel on the win-builder service?

For your future needs (pun not intended), I would make sure you can
reproduce the problem before trying to fix it.

Hope this helps,

Henrik

On Fri, Sep 30, 2022 at 12:14 AM  wrote:
>
> Dear all,
>
> I'm getting the following Note for CRAN pre-tests (for package greed 
> https://github.com/comeetie/greed) :
>
> * checking for detritus in the temp directory ... NOTE
> Found the following files/directories:
> 'Rscript306c1f1adf1' 'Rscript3fb41f1adf1'
>
> this note only appears on windows :
>
> * using R Under development (unstable) (2022-09-26 r82921 ucrt)
> * using platform: x86_64-w64-mingw32 (64-bit)
>
> The full pretest logs are located at 
> https://win-builder.r-project.org/incoming_pretest/greed_0.6.1_20220927_163632/Windows/00check.log
>
> I did not manage to reproduce it with Rhub and github action.
>
> I've seen other questions on the mailing list about "detritus" but the root 
> causes seems to be different in my case since i did not open a navigator and 
> i did not create
> files or folders in my tests or vignettes. Any hint on how to solve this note 
> will be welcome.
>
> Thanks a lot
>
> Etienne
>
>
> Etienne Côme, @comeetie
> Chargé de Recherche
> Université Gustave Eiffel
> GRETTIA/COSYS
> Tel : 01 81 66 87 18
> Web : http://www.comeetie.fr
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Help - Shiny app on CRAN

2022-09-28 Thread Henrik Bengtsson
Hi,

it's not just you; the win-builder server is down for all of us (e.g.
https://downforeveryoneorjustme.com/win-builder.r-project.org?proto=https).
Uwe Ligges, who maintains this, mentioned (in another thread
somewhere) that there were networking issues at his university that
are being worked on.  FWIW, this is a rare event - the win-builder and
CRAN incoming services/servers have a quite solid track record.

/Henrik

On Wed, Sep 28, 2022 at 2:07 PM Jahajeeah, Havisha  wrote:
>
> Dear team,
>
> I am having trouble  submitting my package to CRAN because the link
> https://win-builder.r-project.org/ is not accessible.
>
> Please advise.
>
> Thank you
> Havisha Jahajeeah
>
> On Mon, Sep 26, 2022 at 11:17 PM Ivan Krylov  wrote:
>
> > It might be easier to help you if you show us your package by
> > publishing the source code somewhere.
> >
> > On Mon, 26 Sep 2022 22:22:48 +0400
> > "Jahajeeah, Havisha"  wrote:
> >
> > > CIvalue2: no visible global function definition for 'qt'
> > > andgm11: no visible binding for global variable 'ParticleSwarm'
> > > andgm11: no visible global function definition for 'tail'
> > > app: no visible global function definition for 'shinyApp'
> > > dbgm12: no visible binding for global variable 'ParticleSwarm'
> >
> > It sounds like your NAMESPACE file isn't properly set up to import the
> > functions you're using. For the package to work correctly, it should
> > contain lines
> >
> > importFrom(stats, qt)
> > importFrom(utils, tail)
> > importFrom(shiny, shinyApp)
> >
> > and so on for every function you use that's not in base.
> >
> > See Writing R Extensions section 1.5:
> >
> > https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Specifying-imports-and-exports
> >
> > > Objects in \usage without \alias in documentation object 'Plots':
> > >   'plots'
> >
> > A single Rd file can describe multiple functions, but they should be
> > both mentioned in the \usage{} section and as an \alias{}. Do you
> > export two different objects (functions?) named "plots" and "Plots", or
> > is one of those an error?
> >
> > > Bad \usage lines found in documentation object 'BackgroundValues':
> > >   gm11(x0), epgm11(x0), tbgm11(x0), igm11(x0), gm114(x0)
> >
> > The \usage{} section must exactly match the definition of the function
> > (but you can omit default values of the arguments if they're too large
> > and not very informative), without any other words or punctuation.
> >
> > Once your package passes the automated tests, a human volunteer will go
> > over your package to make sure that it fits the CRAN policy (not
> > providing a link because you've already read it when you submitted the
> > package), which includes having good documentation for every function
> > you export.
> >
> > See Writing R Extensions section 2 for more information on this:
> >
> > https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Writing-R-documentation-files
> >
> > I've also noticed that you're showing us an error message from the CRAN
> > pre-test infrastructure. You can get these errors (and start fixing
> > them) faster without spending time waiting for the test result by
> > running R CMD check --as-cran your_package.tar.gz on your own machine.
> >
> > --
> > Best regards,
> > Ivan
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Help - Shiny app on CRAN

2022-09-26 Thread Henrik Bengtsson
Hello,

are you aware of https://win-builder.r-project.org/? It'll allow you
to validate that your package passes all the requirements before
submitting it to CRAN.

My $.02

/Henrik

On Mon, Sep 26, 2022 at 11:23 AM Jahajeeah, Havisha
 wrote:
>
> Dear team,
>
> My second attempt at submitting the package GreymodelsPackage_1.0.tar.gz to
> CRAN.
>
> Grateful if you could please assist me with the following issues:
>
> CIvalue2: no visible global function definition for 'qt'
> andgm11: no visible binding for global variable 'ParticleSwarm'
> andgm11: no visible global function definition for 'tail'
> app: no visible global function definition for 'shinyApp'
> dbgm12: no visible binding for global variable 'ParticleSwarm'
> dbgm12: no visible global function definition fo
>
> lotegm: no visible global function definition for 'geom_point'
> plotegm: no visible global function definition for 'aes'
> plotegm: no visible binding for global variable 'x'
> plotegm: no visible binding for global variable 'y'
> plotegm: no visible global function definition for 'geom_line'
> plotegm: no visible global function definition for 'scale_color_manual'
> plotegm: no visible global function definition for 'ggplot
>
> Also,
>
> Objects in \usage without \alias in documentation object 'Multivariable':
>   'dbgm12'
>
> Objects in \usage without \alias in documentation object 'Plots':
>   'plots'
>
> Bad \usage lines found in documentation object 'BackgroundValues':
>   gm11(x0), epgm11(x0), tbgm11(x0), igm11(x0), gm114(x0)
> Bad \usage lines found in documentation object 'CombinedModels':
>   ngbm11(x0), ggvm11(x0), tfdgm11(x0)
>
>
>
> Thanks and Regards,
>
> Havisha Jahajeeah
>
>
> On Mon, Sep 26, 2022 at 7:14 PM Jahajeeah, Havisha 
> wrote:
>
> > Dear team,
> >
> > I have submitted the package GreymodelsPackage_1.0.tar.gz to CRAN and it's
> > a shiny app.
> >
> > However, I received  the following
> >
> > * using log directory 
> > 'd:/RCompile/CRANincoming/R-devel/GreymodelsPackage.Rcheck'
> > * using R Under development (unstable) (2022-09-25 r82916 ucrt)
> > * using platform: x86_64-w64-mingw32 (64-bit)
> > * using session charset: UTF-8
> > * checking for file 'GreymodelsPackage/DESCRIPTION' ... OK
> > * checking extension type ... ERROR
> > Extensions with Type 'Shiny application' cannot be checked.
> > * DONE
> > Status: 1 ERROR
> >
> > I am not sure  how to fix the problems and I would appreciate your help on 
> > how to resolve this issue.
> >
> > Thanks and regards,
> >
> > Havisha Jahajeeah
> >
> >
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Package ‘wflo’ was removed from the CRAN repository.

2022-09-24 Thread Henrik Bengtsson
Hello. It looks like things have changed since then;

$ R CMD check --as-cran wflo_1.6.tar.gz
* using log directory ‘/tmp/wflo.Rcheck’
* using R version 4.2.1 (2022-06-23)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using option ‘--as-cran’
* checking for file ‘wflo/DESCRIPTION’ ... OK
* this is package ‘wflo’ version ‘1.6’
* package encoding: UTF-8
* checking CRAN incoming feasibility ... WARNING
Maintainer: ‘Carsten Croonenbroeck ’

New submission

Package was archived on CRAN

Insufficient package version (submitted: 1.6, existing: 1.6)

CRAN repository db overrides:
  X-CRAN-Comment: Archived on 2021-12-20 as requires archived package
'emstreeR'.

Uses the superseded packages: ‘doSNOW’, ‘snow’

Found the following (possibly) invalid URLs:
  URL: 
https://www.researchgate.net/publication/2560062_Real-Time_Fluid_Dynamics_for_Games
From: man/ShowWakePenalizers.Rd
Status: 403
Message: Forbidden

The Date field is over a month old.

This build time stamp is over a month old.
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘wflo’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking for future file timestamps ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking use of S3 registration ... OK
* checking dependencies in R code ... NOTE
Namespace in Imports field not imported from: ‘rgdal’
  All declared Imports should be used.
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of ‘data’ directory ... OK
* checking data for non-ASCII characters ... OK
* checking LazyData ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking sizes of PDF files under ‘inst/doc’ ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking examples with --run-donttest ... ERROR
Running examples in ‘wflo-Ex.R’ failed
The error most likely occurred in:

> base::assign(".ptime", proc.time(), pos = "CheckExEnv")
> ### Name: Cost
> ### Title: Stub for a turbine's cost function.
> ### Aliases: Cost
> ### Keywords: Cost Profit
>
> ### ** Examples
>
> ## Returns a vector of two, c(10, 10).
> Cost(c(0.5, 0.7), c(0.2, 0.3))
[1] 1e+05 1e+05
>
> ## Replace the function by another function
> ## also called 'Cost', embedded in environment e.
> ## Also, see the vignette.
> ## No test:
> e$Cost <- function(x, y) #x, y \in R^n
+ {
+ retVal <- rep(e$FarmVars$UnitCost, min(length(x), length(y)))
+ retVal[x > 0.5] <- retVal[x > 0.5] * 2
+ return(retVal)
+ }
> set.seed(1357)
> Result <- pso::psoptim(par = runif(NumTurbines * 2), fn = Profit,
+   lower = rep(0, NumTurbines * 2), upper = rep(1, NumTurbines * 2))
Error in h(simpleError(msg, call)) :
  error in evaluating the argument 'par' in selecting a method for
function 'psoptim': object 'NumTurbines' not found
Calls:  -> runif -> .handleSimpleError -> h
Execution halted
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* checking PDF version of manual ... OK
* checking HTML version of manual ... OK
* checking for non-standard things in the check directory ... OK
* checking for detritus in the temp directory ... OK
* DONE

Status: 1 ERROR, 1 WARNING, 1 NOTE
See
  ‘/tmp/wflo.Rcheck/00check.log’
for details.

So, you'll need to fix those in order for the package to be accepted back on CRAN.

Re: [Rd] Typo in Renviron.site in R 4.2.0 on Ubuntu?

2022-05-08 Thread Henrik Bengtsson
> ... The extra apostrophe does not seem to have created an issue during all 
> those tests, or since, under either Debian or Ubuntu.

I think that is because the system library '/usr/lib/R/library' is
always appended at the end of the library path, so the non-existing
folder (the one with the extra single quote appended) makes no
difference; it's silently ignored by R if used.

This begs the question, why is the system library added to the site
library path in the first place?  I would argue that:

R_LIBS_SITE="/usr/local/lib/R/site-library/:${R_LIBS_SITE}"

is the correct way here.

/Henrik

On Sun, May 8, 2022 at 10:35 AM Dirk Eddelbuettel  wrote:
>
>
> On 8 May 2022 at 19:12, Michał Bojanowski wrote:
> | I installed R 4.2.0 on Ubuntu (from CRAN apt repository) and some
> | startup errors lead me to Renviron.site and it's last line:
> |
> | 
> R_LIBS_SITE="/usr/local/lib/R/site-library/:${R_LIBS_SITE}:/usr/lib/R/library'"
> |
> | Note the unmatched single quote just before the closing double quote.
> | That's a typo, is it not?
>
> Good catch, thank you, and fixed!
>
> I altered that file / carried the setting over from Renviron at the beginning
> of the test cycle in the first or second alpha release. The extra apostrophe
> does not seem to have created an issue during all those tests, or since,
> under either Debian or Ubuntu. So three cheers to R for robustly parsing
> configs I may have messed up ;-)
>
> (And if I may: a more focussed venue for a bug report may have been the
> r-sig-debian list for R on Debian/Ubuntu, or a bug report at bugs.debian.org,
> or an email to me. My name is at the top of the file.  No point in sending it
> to every mailbox of r-devel subscribers.)
>
> Thanks again,  Dirk
>
> --
> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] HTML5 errors in win-builder R-release check

2022-05-03 Thread Henrik Bengtsson
Looks like a mistake in R that was fixed in R-devel (rev 82308) less
than a day ago, cf.

https://github.com/wch/r-source/commit/60cc6ebd762d4199aa8e5addf08282e336138a4a

/Henrik

On Tue, May 3, 2022 at 10:54 AM J C Nash  wrote:
>
> I've been asked to "fix" some NOTEs in two of my packages.
>
> The local (Linux Mint) R CMD check --as-cran gives no errors. Nor does 
> win-builder for R-devel,
> but R-release gives many errors of the form
>
> Found the following problems:
> Rcgmin.Rd:17:1: Warning:  attribute "width" not allowed for HTML5
>
> Rcgmin.Rd:46:1: Warning:  attribute "valign" not allowed for HTML5
>
> Rcgmin.Rd:50:1: Warning:  attribute "valign" not allowed for HTML5
>
> Rcgmin.Rd:56:1: Warning:  attribute "valign" not allowed for HTML5
>
>
> In some cases the lines pointed to are beyond the end of my Rd file.
>
> Does anyone know the source, and hopefully the solution, to these?
>
> It's really difficult to fix something that does not appear to be an error
> in the systems I can dig into.
>
> John Nash
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[Rd] MS Windows: R does not escape quotes in CLI options the same way as Rterm and Rscript

2021-12-15 Thread Henrik Bengtsson
On MS Windows 10, the following works:

> Rscript --vanilla -e "\"abc\""
[1] "abc"

and also:

> Rterm --vanilla --no-echo -e "\"abc.txt\""
[1] "abc.txt"

whereas attempting the same with 'R' fails;

> R --vanilla --no-echo -e "\"abc.txt\""
Error: object 'abc' not found
Execution halted

I get this with R 4.1.2 and R Under development (unstable) (2021-12-14
r81376 ucrt).

Is this a bug?
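
In case it helps others hitting this, a possible workaround (my
untested suggestion, not from the original report): single-quote the
R string inside the double-quoted argument, so no backslash escaping
is involved in the first place:

```
:: Hypothetical workaround (untested): R string literals may also use
:: single quotes, avoiding the \" escapes that R.exe mishandles.
> R --vanilla --no-echo -e "'abc.txt'"
```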

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] capturing multiple warnings in tryCatch()

2021-12-02 Thread Henrik Bengtsson
Simon's suggestion with withCallingHandlers() is the correct way.
Also, note that if you use tryCatch() to catch warnings, you're
*interrupting* the evaluation of the expression of interest, e.g.

> res <- tryCatch({ message("hey"); warning("boom"); message("there"); 42 }, 
> warning = function(w) { message("Warning caught: ", conditionMessage(w)); 
> 3.14 })
hey
Warning caught: boom
> res
[1] 3.14

Note how it never completes your expression.
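
For contrast, a minimal sketch (mine) of the withCallingHandlers()
route: the handler runs as a *calling* handler, so the expression
continues to completion while all warnings are collected.

```r
## Minimal sketch (mine): collect every warning without interrupting
## the evaluation; the final value 42 is still returned.
W <- list()
res <- withCallingHandlers({
  warning("boom 1"); warning("boom 2"); 42
}, warning = function(w) {
  W[[length(W) + 1L]] <<- conditionMessage(w)
  invokeRestart("muffleWarning")  ## suppress the default reporting
})
res  ## 42
W    ## list("boom 1", "boom 2")
```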

/Henrik

On Thu, Dec 2, 2021 at 1:14 PM Simon Urbanek
 wrote:
>
>
> Adapted from demo(error.catching):
>
> > W=list()
> > withCallingHandlers(foo(), warning=function(w) { W <<- c(W, list(w)); 
> > invokeRestart("muffleWarning") })
> > str(W)
> List of 2
>  $ :List of 2
>   ..$ message: chr "warning 1"
>   ..$ call   : language foo()
>   ..- attr(*, "class")= chr [1:3] "simpleWarning" "warning" "condition"
>  $ :List of 2
>   ..$ message: chr "warning 2"
>   ..$ call   : language foo()
>   ..- attr(*, "class")= chr [1:3] "simpleWarning" "warning" "condition"
>
> Cheers,
> Simon
>
>
> > On Dec 3, 2021, at 10:02 AM, Fox, John  wrote:
> >
> > Dear R-devel list members,
> >
> > Is it possible to capture more than one warning message using tryCatch()? 
> > The answer may be in ?conditions, but, if it is, I can't locate it.
> >
> > For example, in the following only the first warning message is captured 
> > and reported:
> >
> >> foo <- function(){
> > +   warning("warning 1")
> > +   warning("warning 2")
> > + }
> >
> >> foo()
> > Warning messages:
> > 1: In foo() : warning 1
> > 2: In foo() : warning 2
> >
> >> bar <- function(){
> > +   tryCatch(foo(), warning=function(w) print(w))
> > + }
> >
> >> bar()
> > 
> >
> > Is there a way to capture "warning 2" as well?
> >
> > Any help would be appreciated.
> >
> > John
> >
> > --
> > John Fox, Professor Emeritus
> > McMaster University
> > Hamilton, Ontario, Canada
> > Web: http://socserv.mcmaster.ca/jfox/
> >
> >
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R-devel: as.vector(x, mode = "list") drops attributes despite documented not to

2021-12-01 Thread Henrik Bengtsson
Hi,

in R 4.1.2 we have:

> x <- structure(as.list(1:2), dim = c(1,2))
> x
     [,1] [,2]
[1,] 1    2
> as.vector(x, mode = "list")
     [,1] [,2]
[1,] 1    2

whereas in recent versions of R-devel (4.2.0) we have:

> x <- structure(as.list(1:2), dim = c(1,2))
> x
     [,1] [,2]
[1,] 1    2
> as.vector(x, mode = "list")
[[1]]
[1] 1

[[2]]
[1] 2

However, as I read ?as.vector, dropping of attributes should _not_
happen for non-atomic results such as lists.  Is the new behavior a
mistake?
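For code that must behave the same across both R versions, one workaround sketch (a hypothetical helper, not an existing function):

```r
## Hypothetical helper: coerce to mode "list" and explicitly re-apply the
## original attributes, so the result is the same whether or not
## as.vector() drops them in the running R version.
as_list_keep_attrs <- function(x) {
  y <- as.vector(x, mode = "list")
  attributes(y) <- attributes(x)
  y
}
```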

Specifically, ?as.vector says:

'as.vector', a generic, attempts to coerce its argument into a vector
of mode 'mode' (the default is to coerce to whichever vector mode is
most convenient): if the result is atomic all attributes are removed.

[...]

Details:

The atomic modes are "logical", "integer", "numeric" (synonym
"double"), "complex", "character" and "raw".

[...] On the other hand, as.vector removes all attributes including
names for results of atomic mode (but not those of mode "list" nor
"expression").

Value:

[...]

For as.vector, a vector (atomic or of type list or expression). All
attributes are removed from the result if it is of an atomic mode, but
not in general for a list result. The default method handles 24 input
types and 12 values of type: the details of most coercions are
undocumented and subject to change.

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] mapply(): Special case of USE.NAMES=TRUE with recent R-devel updates

2021-11-30 Thread Henrik Bengtsson
Hi,

in R-devel (4.2.0), we now get:

> mapply(paste, "A", character(), USE.NAMES = TRUE)
named list()

Now, in ?mapply we have:

USE.NAMES: logical; use the names of the first ... argument, or if
that is an unnamed character vector, use that vector as the names.

This basically says we should get:

> answer <- list()
> first <- "A"
> names(answer) <- first

which obviously is an error. The help is not explicit about what should
happen when the length "of the first ... argument" is zero, but the
above behavior effectively does something like:

> answer <- list()
> first <- "A"
> names(answer) <- first[seq_along(answer)]
> answer
named list()

Is there a need for the docs to be updated, or should the result be an
unnamed empty list?

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] Use set.seed inside function

2021-11-29 Thread Henrik Bengtsson
The easiest is to use withr::with_seed(), e.g.

> withr::with_seed(seed = 42L, randomcoloR::distinctColorPalette(6))
[1] "#A0E1BC" "#B8E363" "#D686BE" "#DEA97F" "#B15CD8" "#A2B9D5"
> withr::with_seed(seed = 42L, randomcoloR::distinctColorPalette(6))
[1] "#A0E1BC" "#B8E363" "#D686BE" "#DEA97F" "#B15CD8" "#A2B9D5"

It works by undoing globalenv()$.Random.seed after the random number
generator has updated.  If you want to roll your own version of this,
you need to make sure to handle the special case when there is no
pre-existing .Random.seed in globalenv().
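A bare-bones sketch of such a helper ('local_seed' is a hypothetical name; withr::with_seed() is the battle-tested version and handles more details):

```r
## Hypothetical sketch; withr::with_seed() handles more details.
## The key edge case: globalenv() may have no .Random.seed yet, in which
## case it must be removed again afterwards, not restored.
local_seed <- function(seed, expr) {
  env <- globalenv()
  had_seed <- exists(".Random.seed", envir = env, inherits = FALSE)
  old_seed <- if (had_seed) get(".Random.seed", envir = env, inherits = FALSE)
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = env)
    } else if (exists(".Random.seed", envir = env, inherits = FALSE)) {
      rm(".Random.seed", envir = env)  # restore the "no seed yet" state
    }
  }, add = TRUE)
  set.seed(seed)
  expr
}
```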

Regarding packages and functions changing the random seed via a
set.seed() [without undoing it]: this should *never* be done, because
it will wreak havoc on analyses and studies that rely on random
numbers.  My rule of thumb: only the end-user should be allowed to use
set.seed(), which should typically be done at the top of their R
scripts.

/Henrik

On Mon, Nov 29, 2021 at 1:23 PM Meng Chen  wrote:
>
> Thanks. I think it may work in theory, generating "enough" distinct colors
> is fairly easy. Then the problem will be how to find a subset of colors of
> size n, and the selected colors are still most distinguishable. I think I
> will do this with my eyes if no other methods, a tedious job.
> But at least for my curiosity, I still want to know if there are other ways
> to achieve this. I feel like 80% of people who use the distinctColorPalette
> function actually don't need the "random" feature :) Thanks.
>
> On Mon, Nov 29, 2021 at 9:39 PM James W. MacDonald  wrote:
>
> > It appears that you don't actually want random colors, but instead you
> > want the same colors each time. Why not just generate the vector of 'random
> > distinct colors' one time and save the vector of colors?
> >
> > -Original Message-
> > From: Bioc-devel  On Behalf Of Meng Chen
> > Sent: Monday, November 29, 2021 3:21 PM
> > To: bioc-devel@r-project.org
> > Subject: [Bioc-devel] Use set.seed inside function
> >
> > Dear BioC team and developers,
> >
> > I am using BiocCheck to check my package, it returns a warning:
> > " Remove set.seed usage in R code"
> >
> > I am using "set.seed" inside my functions, before calling function
> > distinctColorPalette (randomcoloR package) in order to generate
> > reproducible "random distinct colors". So what would be the best practice
> > to solve this warning? I think 1. use set.seed and don't change anything.
> > 2. use the set.seed function, but include something like below inside the
> > function *gl.seed <- .Random.seed* *on.exit(assign(".Random.seed", gl.seed,
> > envir = .GlobalEnv))* 3. use some other functions for the purpose
> >
> > Any suggestions will be appreciated. Thanks.
> > --
> > Best Regards,
> > Chen
> >
> > [[alternative HTML version deleted]]
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
>
>
> --
> Best Regards,
> Chen
>
> [[alternative HTML version deleted]]
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] How can a package be aware of whether it's on CRAN

2021-11-23 Thread Henrik Bengtsson
On Tue, Nov 23, 2021 at 12:06 PM Gábor Csárdi  wrote:
>
> On Tue, Nov 23, 2021 at 8:49 PM Henrik Bengtsson
>  wrote:
> >
> > > Is there any reliable way to let packages to know if they are on CRAN, so 
> > > they can set omp cores to 2 by default?
> >
> > Instead of testing for "on CRAN" or not, you can test for 'R CMD
> > check' running or not. 'R CMD check' sets environment variable
> > _R_CHECK_LIMIT_CORES_=TRUE. You can use that to limit your code to run
> > at most two (2) parallel threads or processes.
>
> AFAICT this is only set with --as-cran and many CRAN machines don't
> use that and I am fairly sure that some of them don't set this env var
> manually, either.

Oh my - yes & yes, especially on the second part - I totally forgot.
So, that alone is not sufficient. It's not meant to be easy, eh?

So, parallelly::availableCores() tries to account for this as well by
detecting that 'R CMD check' runs, cf.
https://github.com/HenrikBengtsson/parallelly/blob/3e403f600e7181423b9d77c739373d36b4fe34df/R/zzz.R#L42-L47

/Henrik

>
> Gabor
>
> [...]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] How can a package be aware of whether it's on CRAN

2021-11-23 Thread Henrik Bengtsson
> Is there any reliable way to let packages to know if they are on CRAN, so 
> they can set omp cores to 2 by default?

Instead of testing for "on CRAN" or not, you can test for 'R CMD
check' running or not. 'R CMD check' sets environment variable
_R_CHECK_LIMIT_CORES_=TRUE. You can use that to limit your code to run
at most two (2) parallel threads or processes.
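A minimal sketch of such a check (an assumption here: _R_CHECK_LIMIT_CORES_ may be unset, or set to values other than "TRUE", so anything non-empty and not "false" is treated as a limit; parallelly::availableCores() handles many more corner cases):

```r
## Sketch: default to 2 cores when 'R CMD check' signals a core limit.
## max_cores_default() is a hypothetical helper name.
max_cores_default <- function() {
  chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
  if (nzchar(chk) && !identical(tolower(chk), "false")) {
    2L
  } else {
    parallel::detectCores()
  }
}
```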

The parallelly::availableCores() function is aware of this and many
other settings, i.e. it'll return 2 when running via 'R CMD check'. As
the author, I obviously suggest using that function to query what
amount of CPU resources your R process is allowed to use. For more
info, see 
.

/Henrik

PS. I'm in the camp of *not* having R packages parallelize by default.
At least not until R and its community have figured out how to avoid
ending up with nested parallelization (e.g. via dependencies) by
mistake.  We would also need a standard for end-users (and the sysadms
on the machines they're running) to control the default number of CPU
cores the R session may use.  Right now we only have a few scattered
settings for separate purposes, e.g. option 'mc.cores'/env var
'MC_CORES', and option 'Ncpus', which is not enough for establishing a
de facto standard.

On Tue, Nov 23, 2021 at 11:11 AM Dipterix Wang  wrote:
>
> Dear R wizards,
>
> I recently received an email from Prof. Ripley. He pointed out that my 
> package seriously violates the CRAN policy: "using 8 threads is a serious 
> violation of the CRAN policy”. By default the number of cores my package uses 
> is determined from system CPU cores. After carefully reading all the CRAN 
> policies, now I understand that CRAN does not allow a package to use more 
> than 2 CPU cores when checking a package. I can easily change my code to let 
> my tests comply to that constraint.
>
> However, this warning worries me because my package uses OpenMP. I got 
> “caught" partially because I printed the number of cores used in the package 
> startup message, and one of my test exceeded the time limit (which leads to 
> manual inspection). However, what if I develop a package that imports on 
> those openmp-dependent packages? (For example, data.table, fst…) These 
> packages use more than 2 cores by default. If not carefully treated, it’ll be 
> very easy to exceed that limit, and it’s very hard for CRAN to detect it.
>
> Is there any reliable way to let packages to know if they are on CRAN, so 
> they can set omp cores to 2 by default?
>
> Best,
> - Dipterix
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] DOCS: Exactly when, in the signaling process, is option 'warn' applied?

2021-11-18 Thread Henrik Bengtsson
Hi,

the following question sprung out of a package settings option warn=-1
to silence warnings, but those warnings were still caught by
withCallingHandlers(..., warning), which the package author did not
anticipate. The package has been updated to use suppressWarnings()
instead, but as I see a lot of packages on CRAN [1] use
options(warn=-1) to temporarily silence warnings, I wanted to bring
this one up. Even base R itself [2] does this, e.g.
utils::assignInMyNamespace().

Exactly when is the value of 'warn' options used when calling warning("boom")?

I think the docs, including ?options, would benefit from clarifying
that. To the best of my understanding, it should also mention that
options 'warn' is meant to be used by end-users, and not in package
code where suppressWarnings() should be used.

To clarify, if we do:

> options(warn = -1)
> tryCatch(warning("boom"), warning = function(w) stop("Caught warning: ", 
> conditionMessage(w), call. = FALSE))
Error: Caught warning: boom

we see that the warning is indeed signaled.  However, in Section '8.2
warning' of the 'R Language Definition' [3], we can read:

"The function `warning` takes a single argument that is a character
string. The behaviour of a call to `warning` depends on the value of
the option `"warn"`. If `"warn"` is negative warnings are ignored.
[...]"

The way this is written, it may suggest that warnings are
ignored/silenced already when warning() is called, but the
above example shows that this is not the case.
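To make the distinction concrete, a minimal sketch (demo code only; f() and g() are throwaway names, and note that f() leaves options(warn = -1) set afterwards):

```r
## options(warn = -1) only affects top-level *reporting*; the condition is
## still signaled and visible to handlers established further out:
f <- function() { options(warn = -1); warning("boom") }
withCallingHandlers(f(), warning = function(w) message("caught: ", conditionMessage(w)))
## caught: boom

## suppressWarnings() instead establishes an (inner) calling handler that
## muffles the condition before outer handlers get to run:
g <- function() suppressWarnings(warning("boom"))
withCallingHandlers(g(), warning = function(w) message("caught: ", conditionMessage(w)))
## (nothing is caught)
```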

From the same section, we can also read:

"[...] If it is zero, they are stored and printed after the top-level
function has completed. [...]"

which may hint at the 'warn' option is applied only when a warning
condition is allowed to "bubble up" all the way to the top level.
(FWIW, this is how I always thought it worked, but it's only now that I
have looked into the docs and seen that they are ambiguous on this.)

/Henrik

[1] 
https://github.com/search?q=org%3Acran+language%3Ar+R%2F+in%3Afile%2Cpath+options+warn+%22-1%22=Code
[2] 
https://github.com/wch/r-source/blob/0a31ab2d1df247a4289efca5a235dc45b511d04a/src/library/utils/R/objects.R#L402-L405
[3] https://cran.r-project.org/doc/manuals/R-lang.html#warning

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] gettext(msgid, domain="R") doesn't work for some 'msgid':s

2021-11-05 Thread Henrik Bengtsson
I'm trying to reuse some of the translations available in base R by using:

  gettext(msgid, domain="R")

This works great for most 'msgid's, e.g.

$ LANGUAGE=de Rscript -e 'gettext("cannot get working directory", domain="R")'
[1] "kann das Arbeitsverzeichnis nicht ermitteln"

However, it does not work for all.  For instance,

$ LANGUAGE=de Rscript -e 'gettext("Execution halted\n", domain="R")'
[1] "Execution halted\n"

This despite that 'msgid' existing in:

$ grep -C 2 -F 'Execution halted\n' src/library/base/po/de.po

#: src/main/main.c:342
msgid "Execution halted\n"
msgstr "Ausführung angehalten\n"

It could be that the trailing newline causes problems, because the
same happens also for:

$ LANGUAGE=de Rscript --vanilla -e 'gettext("error during cleanup\n",
domain="R")'
[1] "error during cleanup\n"

Is this meant to work, and if so, how do I get it to work, or is it a bug?
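One way to narrow it down (testing the assumption that the trailing newline is what breaks the lookup) is to compare the two msgid variants directly:

```r
## Diagnostic sketch: compare msgid variants (run with LANGUAGE=de set).
for (msgid in c("Execution halted\n", "Execution halted")) {
  cat(sprintf("%-22s -> %s\n", deparse(msgid), gettext(msgid, domain = "R")))
}
```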

Thanks,

Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] BUG?: R CMD check with --as-cran *disables* checks for unused imports otherwise performed

2021-11-02 Thread Henrik Bengtsson
I've just posted this to BugZilla as PR18229
(https://bugs.r-project.org/show_bug.cgi?id=18229) to make sure it's
tracked.

/Henrik

On Wed, Oct 20, 2021 at 8:08 PM Jeffrey Dick  wrote:
>
> FWIW, I also encountered this issue and posted on R-pkg-devel about it, with 
> no resolution at the time (May 2020). See "Dependencies NOTE lost with 
> --as-cran" (https://stat.ethz.ch/pipermail/r-package-devel/2020q2/005467.html)
>
> On Wed, Oct 20, 2021 at 11:55 PM Henrik Bengtsson 
>  wrote:
>>
>> ISSUE:
>>
>> Using 'R CMD check' with --as-cran sets
>> _R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_=TRUE, whereas the
>> default is FALSE, which you get if you don't add --as-cran.
>> I would expect --as-cran to check more things and be more conservative
>> than without.  So, is this behavior a mistake?  Could it be a thinko
>> around the negating "IGNORE", and the behavior is meant to be vice
>> versa?
>>
>> Example:
>>
>> $ R CMD check QDNAseq_1.29.4.tar.gz
>> ...
>> * using R version 4.1.1 (2021-08-10)
>> * using platform: x86_64-pc-linux-gnu (64-bit)
>> ...
>> * checking dependencies in R code ... NOTE
>> Namespace in Imports field not imported from: ‘future’
>>   All declared Imports should be used.
>>
>> whereas, if I run with --as-cran, I don't get that NOTE;
>>
>> $ R CMD check --as-cran QDNAseq_1.29.4.tar.gz
>> ...
>> * checking dependencies in R code ... OK
>>
>>
>> TROUBLESHOOTING:
>>
>> In src/library/tools/R/check.R [1], the following is set if --as-cran is 
>> used:
>>
>>   Sys.setenv("_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_" = "TRUE")
>>
>> whereas, if not set, the default is:
>>
>> ignore_unused_imports <-
>> config_val_to_logical(Sys.getenv("_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_",
>> "FALSE"))
>>
>> [1] 
>> https://github.com/wch/r-source/blob/b50e3f755674cbb697a4a7395b766647a5cfeea2/src/library/tools/R/check.R#L6335
>> [2] 
>> https://github.com/wch/r-source/blob/b50e3f755674cbb697a4a7395b766647a5cfeea2/src/library/tools/R/QC.R#L5954-L5956
>>
>> /Henrik
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Fwd: Using existing envars in Renviron on friendly Windows

2021-11-02 Thread Henrik Bengtsson
Oh, I see, I misunderstood.  Thanks for clarifying.

One more thing, to mix-and-match environment variables and strings
with escaped characters, while mimicking how POSIX shells does it, by
using strings with double and single quotes. For example, with:

$ cat .Renviron
APPDATA='C:\Users\foobar\AppData\Roaming'
R_LIBS_USER="${APPDATA}"'\R-library'

we get:

$ Rscript --no-init-file --quiet -e 'cat(sprintf("R_LIBS_USER=[%s]\n",
Sys.getenv("R_LIBS_USER")))'
R_LIBS_USER=[C:\Users\foobar\AppData\Roaming\R-library]

and

$ source .Renviron
$ echo "R_LIBS_USER=[${R_LIBS_USER}]"
R_LIBS_USER=[C:\Users\foobar\AppData\Roaming\R-library]

/Henrik

On Sun, Oct 31, 2021 at 2:59 AM Tomas Kalibera  wrote:
>
>
> On 10/31/21 2:55 AM, Henrik Bengtsson wrote:
> >> ... If one still needed backslashes,
> >> they could then be entered in single quotes, e.g. VAR='c:\users'.
> > I don't think it matters whether you use single or double quotes -
> > both will work.  Here's a proof of concept on Linux with R 4.1.1:
> >
> > $ cat ./.Renviron
> > A=C:\users
> > B='C:\users'
> > C="C:\users"
> >
> > $ Rscript -e "Sys.getenv(c('A', 'B', 'C'))"
> >A   B   C
> >"C:users" "C:\\users" "C:\\users"
>
> Yes, but as I wrote "I think the Renviron files should be written in a
> way so that they would work the same in a POSIX shell". This is why
> single quotes. With double quotes, backslashes are interpreted
> differently from a POSIX shell.
>
> Tomas
>
>
> >
> > /Henrik
> >
> > On Wed, Oct 27, 2021 at 11:45 AM Tomas Kalibera
> >  wrote:
> >>
> >> On 10/21/21 5:18 PM, Martin Maechler wrote:
> >>>>>>>> Michał Bojanowski
> >>>>>>>>   on Wed, 20 Oct 2021 16:31:08 +0200 writes:
> >>>   > Hello Tomas,
> >>>   > Yes, that's accurate although rather terse, which is perhaps the
> >>>   > reason why I did not realize it applies to my case.
> >>>
> >>>   > How about adding something in the direction of:
> >>>
> >>>   > 1. Continuing the cited paragraph with:
> >>>   > In particular, on Windows it may be necessary to quote references 
> >>> to
> >>>   > existing environment variables, especially those containing file 
> >>> paths
> >>>   > (which include backslashes). For example: `"${WINVAR}"`.
> >>>
> >>>   > 2. Add an example (not run):
> >>>
> >>>   > # On Windows do quote references to variables containing paths, 
> >>> e.g.:
> >>>   > # If APPDATA=C:\Users\foobar\AppData\Roaming
> >>>   > # to point to a library tree inside APPDATA in .Renviron use
> >>>   > R_LIBS_USER="${APPDATA}"/R-library
> >>>
> >>>   > Incidentally the last example is on backslashes too.
> >>>
> >>>
> >>>   > What do you think?
> >>>
> >>> I agree that adding an example really helps a lot in such cases,
> >>> in my experience, notably if it's precise enough to be used +/- directly.
> >> Yes, I agree as well. I think the Renviron files should be written in a
> >> way so that they would work the same in a POSIX shell, so e.g.
> >> VAR="${VAR0}" or VAR="${VAR0}/subdir" are the recommended ways to
> >> preserve backslashes in VAR0. It is better to use forward slashes in
> >> string literals, e.g. VAR="c:/users". If one still needed backslashes,
> >> they could then be entered in single quotes, e.g. VAR='c:\users'.
> >>
> >> The currently implemented parsing of Renviron files differs in a number
> >> of details from POSIX shells, some are documented and some are not.
> >> Relying only on the documented behavior that is the same as in POSIX
> >> shells is the best choice for future compatibility.
> >>
> >> Tomas
> >>
> >>>
> >>>   > On Mon, Oct 18, 2021 at 5:02 PM Tomas Kalibera 
> >>>  wrote:
> >>>   >>
> >>>   >>
> >>>   >> On 10/15/21 6:44 PM, Michał Bojanowski wrote:
> >>>   >> > Perhaps a small update to ?.Renviron would be in order to 
> >>> mention that...
> >>>   >>
> >>>   >> Would you have a more spe

Re: [Rd] Fwd: Using existing envars in Renviron on friendly Windows

2021-10-30 Thread Henrik Bengtsson
> ... If one still needed backslashes,
> they could then be entered in single quotes, e.g. VAR='c:\users'.

I don't think it matters whether you use single or double quotes -
both will work.  Here's a proof of concept on Linux with R 4.1.1:

$ cat ./.Renviron
A=C:\users
B='C:\users'
C="C:\users"

$ Rscript -e "Sys.getenv(c('A', 'B', 'C'))"
  A   B   C
  "C:users" "C:\\users" "C:\\users"

/Henrik

On Wed, Oct 27, 2021 at 11:45 AM Tomas Kalibera
 wrote:
>
>
> On 10/21/21 5:18 PM, Martin Maechler wrote:
> >> Michał Bojanowski
> >>  on Wed, 20 Oct 2021 16:31:08 +0200 writes:
> >  > Hello Tomas,
> >  > Yes, that's accurate although rather terse, which is perhaps the
> >  > reason why I did not realize it applies to my case.
> >
> >  > How about adding something in the direction of:
> >
> >  > 1. Continuing the cited paragraph with:
> >  > In particular, on Windows it may be necessary to quote references to
> >  > existing environment variables, especially those containing file 
> > paths
> >  > (which include backslashes). For example: `"${WINVAR}"`.
> >
> >  > 2. Add an example (not run):
> >
> >  > # On Windows do quote references to variables containing paths, e.g.:
> >  > # If APPDATA=C:\Users\foobar\AppData\Roaming
> >  > # to point to a library tree inside APPDATA in .Renviron use
> >  > R_LIBS_USER="${APPDATA}"/R-library
> >
> >  > Incidentally the last example is on backslashes too.
> >
> >
> >  > What do you think?
> >
> > I agree that adding an example really helps a lot in such cases,
> > in my experience, notably if it's precise enough to be used +/- directly.
>
> Yes, I agree as well. I think the Renviron files should be written in a
> way so that they would work the same in a POSIX shell, so e.g.
> VAR="${VAR0}" or VAR="${VAR0}/subdir" are the recommended ways to
> preserve backslashes in VAR0. It is better to use forward slashes in
> string literals, e.g. VAR="c:/users". If one still needed backslashes,
> they could then be entered in single quotes, e.g. VAR='c:\users'.
>
> The currently implemented parsing of Renviron files differs in a number
> of details from POSIX shells, some are documented and some are not.
> Relying only on the documented behavior that is the same as in POSIX
> shells is the best choice for future compatibility.
>
> Tomas
>
> >
> >
> >  > On Mon, Oct 18, 2021 at 5:02 PM Tomas Kalibera 
> >  wrote:
> >  >>
> >  >>
> >  >> On 10/15/21 6:44 PM, Michał Bojanowski wrote:
> >  >> > Perhaps a small update to ?.Renviron would be in order to mention 
> > that...
> >  >>
> >  >> Would you have a more specific suggestion how to update the
> >  >> documentation? Please note that it already says
> >  >>
> >  >> "‘value’ is then processed in a similar way to a Unix shell: in
> >  >> particular the outermost level of (single or double) quotes is 
> > stripped,
> >  >> and backslashes are removed except inside quotes."
> >  >>
> >  >> Thanks,
> >  >> Tomas
> >  >>
> >  >> > On Fri, Oct 15, 2021 at 6:43 PM Michał Bojanowski 
> >  wrote:
> >  >> >> Indeed quoting works! Kevin suggested the same, but he didnt 
> > reply to the list.
> >  >> >> Thank you all!
> >  >> >> Michal
> >  >> >>
> >  >> >> On Fri, Oct 15, 2021 at 6:40 PM Ivan Krylov 
> >  wrote:
> >  >> >>> Sorry for the noise! I wasn't supposed to send my previous 
> > message.
> >  >> >>>
> >  >> >>> On Fri, 15 Oct 2021 16:44:28 +0200
> >  >> >>> Michał Bojanowski  wrote:
> >  >> >>>
> >  >>  AVAR=${APPDATA}/foo/bar
> >  >> 
> >  >>  Which is a documented way of referring to existing environment
> >  >>  variables. Now, with that in R I'm getting:
> >  >> 
> >  >>  Sys.getenv("APPDATA")# That works OK
> >  >>  [1] "C:\\Users\\mbojanowski\\AppData\\Roaming"
> >  >> 
> >  >>  so OK, but:
> >  >> 
> >  >>  Sys.getenv("AVAR")
> >  >>  [1] "C:UsersmbojanowskiAppDataRoaming/foo/bar"
> >  >> >>> Hmm, a function called by readRenviron does seem to remove 
> > backslashes,
> >  >> >>> but not if they are encountered inside quotes:
> >  >> >>>
> >  >> >>> 
> > https://github.com/r-devel/r-svn/blob/3f8b75857fb1397f9f3ceab6c75554e1a5386adc/src/main/Renviron.c#L149
> >  >> >>>
> >  >> >>> Would AVAR="${APPDATA}"/foo/bar work?
> >  >> >>>
> >  >> >>> --
> >  >> >>> Best regards,
> >  >> >>> Ivan
> >  >> > __
> >  >> > R-devel@r-project.org mailing list
> >  >> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >  > __
> >  > R-devel@r-project.org mailing list
> >  > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list

Re: [Bioc-devel] Package name

2021-10-22 Thread Henrik Bengtsson
For CRAN packages it's easy. Packages on CRAN are eternal. They may be
archived, but they are never removed, so in a sense they're always
"currently on CRAN". Archived packages may still be installed, but only
with some effort from the user. Some packages go in and out of "archived"
status depending how quick the maintainer fixes issues. Because of this, I
cannot really see how a CRAN package name can be "reused" by anyone else
without a formal handover agreement between old and new maintainers. Even
so, I think CRAN needs to approve on the "update" in order to unarchive it.

Personally, I'd argue the same should apply to Bioconductor packages.
Reusing package names for other purposes/different APIs is just asking for
troubles, e.g. when it comes to future scientists trying to reproduce
legacy results.

/Henrik

On Fri, Oct 22, 2021, 03:02 Wolfgang Huber  wrote:

> This is probably a niche concern, but  I’d find it a pity if a good
> package name (*) became unavailable forever, esp. if it refers to a
> real-world concept not owned by the authors of the original package.
> Perhaps we could allow re-using a name after a grace period (say 1 or 2
> years)?
> To be extra safe, one could also require the first version number of the
> new package be much higher than the last version of the old (dead) package.
>
> (*) One example I have in mind where we re-used the name of an extinct
> project is rhdf5.
>
> Kind regards
> Wolfgang
>
> > Il giorno 21ott2021, alle ore 13:39, Kern, Lori
>  ha scritto:
> >
> > Good point.  I'll open an issue on the github to fix.
> >
> >
> > Lori Shepherd
> >
> > Bioconductor Core Team
> >
> > Roswell Park Comprehensive Cancer Center
> >
> > Department of Biostatistics & Bioinformatics
> >
> > Elm & Carlton Streets
> >
> > Buffalo, New York 14263
> >
> > 
> > From: Bioc-devel  on behalf of
> Laurent Gatto 
> > Sent: Thursday, October 21, 2021 12:53 AM
> > To: bioc-devel@r-project.org 
> > Subject: [Bioc-devel] Package name
> >
> > The Package Guidelines for Developers and Reviewers say that:
> >
> > A package name should be descriptive and should not already exist as a
> current package (case-insensitive) in Bioconductor nor CRAN.
> >
> > The sentences says current packages - does that imply that names of
> packages that have been archived (on CRAN) or deprecated (on Bioconductor)
> are available? This is likely to lead to serious confusion.
> >
> > Laurent
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> >
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
> >
> >
> > This email message may contain legally privileged and/or confidential
> information.  If you are not the intended recipient(s), or the employee or
> agent responsible for the delivery of this message to the intended
> recipient(s), you are hereby notified that any disclosure, copying,
> distribution, or use of this email message is prohibited.  If you have
> received this message in error, please notify the sender immediately by
> e-mail and delete this email message from your computer. Thank you.
> >   [[alternative HTML version deleted]]
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

[[alternative HTML version deleted]]

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] Fwd: Using existing envars in Renviron on friendly Windows

2021-10-20 Thread Henrik Bengtsson
Two comments/suggestions:

1. What about recommending to always quote the value in Renviron
files, e.g. ABC="Hello world" and DEF="${APPDATA}/R-library"?  This
should be a practice that works on all platforms.

2. What about having readRenviron() escape strings it imports via
environment variables?  See example below.  Is there ever a use case
where someone wants/needs, or even rely on, the current behavior? (I
would even like to argue the current behavior is a design bug that
should be fixed.)  As an analogue from the shell world, Bash escapes
its input.

To illustrate the latter, with:

A=C:\\ABC
B=${A}
C="${A}"

or equivalently:

A="C:\ABC"
B=${A}
C="${A}"

we currently get:

$ Rscript -e "Sys.getenv(c('A', 'B', 'C'))"
A B C
"C:\\ABC"   "C:ABC" "C:\\ABC"

If base::readRenviron() escaped "input" environment variables, we
would get identical values for both 'B' and 'C', which I think is what
most people would expect.

To be clear, this is a problem that occurs on all platforms, but it's
more likely to be revealed on MS Windows, since paths use backslashes.
Still, you could imagine a Linux user using something like
A="Hello\nworld\n" who would also be surprised about the above
behavior when they end up with B="Hellonworldn".

/Henrik

On Wed, Oct 20, 2021 at 7:31 AM Michał Bojanowski  wrote:
>
> Hello Tomas,
>
> Yes, that's accurate although rather terse, which is perhaps the
> reason why I did not realize it applies to my case.
>
> How about adding something in the direction of:
>
> 1. Continuing the cited paragraph with:
> In particular, on Windows it may be necessary to quote references to
> existing environment variables, especially those containing file paths
> (which include backslashes). For example: `"${WINVAR}"`.
>
> 2. Add an example (not run):
>
> # On Windows do quote references to variables containing paths, e.g.:
> # If APPDATA=C:\Users\foobar\AppData\Roaming
> # to point to a library tree inside APPDATA in .Renviron use
> R_LIBS_USER="${APPDATA}"/R-library
>
> Incidentally the last example is on backslashes too.
>
> What do you think?
>
> On Mon, Oct 18, 2021 at 5:02 PM Tomas Kalibera  
> wrote:
> >
> >
> > On 10/15/21 6:44 PM, Michał Bojanowski wrote:
> > > Perhaps a small update to ?.Renviron would be in order to mention that...
> >
> > Would you have a more specific suggestion how to update the
> > documentation? Please note that it already says
> >
> > "‘value’ is then processed in a similar way to a Unix shell: in
> > particular the outermost level of (single or double) quotes is stripped,
> > and backslashes are removed except inside quotes."
> >
> > Thanks,
> > Tomas
> >
> > > On Fri, Oct 15, 2021 at 6:43 PM Michał Bojanowski  
> > > wrote:
> > >> Indeed quoting works! Kevin suggested the same, but he didnt reply to 
> > >> the list.
> > >> Thank you all!
> > >> Michal
> > >>
> > >> On Fri, Oct 15, 2021 at 6:40 PM Ivan Krylov  
> > >> wrote:
> > >>> Sorry for the noise! I wasn't supposed to send my previous message.
> > >>>
> > >>> On Fri, 15 Oct 2021 16:44:28 +0200
> > >>> Michał Bojanowski  wrote:
> > >>>
> >  AVAR=${APPDATA}/foo/bar
> > 
> >  Which is a documented way of referring to existing environment
> >  variables. Now, with that in R I'm getting:
> > 
> >  Sys.getenv("APPDATA")# That works OK
> >  [1] "C:\\Users\\mbojanowski\\AppData\\Roaming"
> > 
> >  so OK, but:
> > 
> >  Sys.getenv("AVAR")
> >  [1] "C:UsersmbojanowskiAppDataRoaming/foo/bar"
> > >>> Hmm, a function called by readRenviron does seem to remove backslashes,
> > >>> but not if they are encountered inside quotes:
> > >>>
> > >>> https://github.com/r-devel/r-svn/blob/3f8b75857fb1397f9f3ceab6c75554e1a5386adc/src/main/Renviron.c#L149
> > >>>
> > >>> Would AVAR="${APPDATA}"/foo/bar work?
> > >>>
> > >>> --
> > >>> Best regards,
> > >>> Ivan
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] BUG?: R CMD check with --as-cran *disables* checks for unused imports otherwise performed

2021-10-20 Thread Henrik Bengtsson
ISSUE:

Using 'R CMD check' with --as-cran sets
_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_=TRUE, whereas the
default is FALSE, which you get if you don't add --as-cran.
I would expect --as-cran to check more things and be more conservative
than without.  So, is this behavior a mistake?  Could it be a thinko
around the negating "IGNORE", with the behavior meant to be the other
way around?

Example:

$ R CMD check QDNAseq_1.29.4.tar.gz
...
* using R version 4.1.1 (2021-08-10)
* using platform: x86_64-pc-linux-gnu (64-bit)
...
* checking dependencies in R code ... NOTE
Namespace in Imports field not imported from: ‘future’
  All declared Imports should be used.

whereas, if I run with --as-cran, I don't get that NOTE;

$ R CMD check --as-cran QDNAseq_1.29.4.tar.gz
...
* checking dependencies in R code ... OK


TROUBLESHOOTING:

In src/library/tools/R/check.R [1], the following is set if --as-cran is used:

  Sys.setenv("_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_" = "TRUE")

whereas, if not set, the default is:

ignore_unused_imports <-
config_val_to_logical(Sys.getenv("_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_",
"FALSE"))

[1] 
https://github.com/wch/r-source/blob/b50e3f755674cbb697a4a7395b766647a5cfeea2/src/library/tools/R/check.R#L6335
[2] 
https://github.com/wch/r-source/blob/b50e3f755674cbb697a4a7395b766647a5cfeea2/src/library/tools/R/QC.R#L5954-L5956
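
In other words, the two code paths can be paraphrased as follows (a
sketch, not the verbatim source; the real code uses the internal
config_val_to_logical() helper):

```r
## With --as-cran (check.R): the variable is forced on, so unused
## imports are ignored and the NOTE is suppressed:
Sys.setenv("_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_" = "TRUE")

## Without --as-cran (QC.R): the variable defaults to "FALSE", so
## unused imports are checked and reported:
ignore_unused_imports <- as.logical(
  Sys.getenv("_R_CHECK_PACKAGES_USED_IGNORE_UNUSED_IMPORTS_", "FALSE"))
```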

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] Too many dependencies / MultiAssayExperiment + rtracklayer

2021-10-19 Thread Henrik Bengtsson
If you're willing to depend on R (>= 4.0.0), then tools::R_user_dir() can
replace the 'rappdirs' package.
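
For instance, the common rappdirs calls map onto it directly (a quick
sketch; the "netDx" package name is just an example here):

```r
## tools::R_user_dir() gives per-package, per-user directories without
## adding a dependency (available in R >= 4.0.0):
cache_dir  <- tools::R_user_dir("netDx", which = "cache")
config_dir <- tools::R_user_dir("netDx", which = "config")
data_dir   <- tools::R_user_dir("netDx", which = "data")

## The directories are not created automatically:
dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
```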

/Henrik

On Mon, Oct 18, 2021, 09:05 Shraddha Pai  wrote:

> Hi all,
> Despite moving rarely-used packages to Suggests and eliminating some (e.g.
> TCGAutils), the number of dependencies is still listed as 200 for our
> package netDx.
> https://www.bioconductor.org/packages/devel/bioc/html/netDx.html#since
> Is there anything else we can do to cut down on dependencies?
>
> Thank you,
> Shraddha
>
> On Tue, Sep 21, 2021 at 5:35 PM Shraddha Pai 
> wrote:
>
> > Hi Michael,
> > Thanks! Looks like the package trying to load 'rtracklayer' was
> > 'TCGAutils' (see graph from Zugang above, generated using pkgndep - looks
> > to be quite useful). Turns out TCGAutils really wasn't necessary for my
> > package so I just took it out and removed all associated dependencies -
> > mercifully an easier fix.
> >
> > Thanks for your help,
> > Shraddha
> >
> > On Mon, Sep 20, 2021 at 2:57 PM Michael Lawrence <
> > lawrence.mich...@gene.com> wrote:
> >
> >> Hi Shraddha,
> >>
> >> From the rtracklayer perspective, it sounds like Rsamtools is
> >> (indirectly) bringing in those system libraries. I would have expected
> >> zlibbioc to cover the zlib dependency, and perhaps bz2 and lzma
> >> support is optional. Perhaps a core member could comment on that.
> >>
> >> In the past, I've used this package
> >> https://github.com/Bioconductor/codetoolsBioC to identify missing
> >> NAMESPACE imports. In theory, you could remove the rtracklayer import
> >> and run functions in that package to identify the symbol-level
> >> dependencies. The output is a bit noisy though.
> >>
> >> Btw, using @importFrom only allows you to be selective of symbol-level
> >> dependencies, not package-level.
> >>
> >> Michael
> >>
> >> On Mon, Sep 20, 2021 at 11:37 AM Shraddha Pai  >
> >> wrote:
> >> >
> >> > Hello again,
> >> > I'm trying to simplify the dependencies for my package "netDx", make
> it
> >> > easier to install. It's currently got over 200(!) + some Unix
> libraries
> >> > that need to be installed.
> >> >
> >> > 1. I ran pkgDepMetrics() from BiocPkgTools to find less-needed pkgs,
> and
> >> > the package with the most dependencies is MultiAssayExperiment (see
> >> below
> >> > email). I'm using MAE to construct a container - is there a way to use
> >> > @importFrom calls to reduce MAE dependencies?
> >> >
> >> > 2. Another problem package is rtracklayer which requires Rhtslib,
> which
> >> > requires some unix libraries: zlib1g-dev libbz2-dev liblzma-dev. I'm
> not
> >> > sure which functionality in the package requires rtracklayer - how
> can I
> >> > tell? Is there a way to simplify / reduce these deps so the user
> doesn't
> >> > have to install all these unix packages?
> >> >
> >> > 3. Are there other "problem packages" you can see that I can remove?
> >> Let's
> >> > assume for now ggplot2 stays because people find it useful to have
> >> plotting
> >> > functions readily available.
> >> >
> >> > Thanks very much in advance,
> >> > Shraddha
> >> > ---
> >> > "ImportedAndUsed" "Exported" "Usage" "DepOverlap" "DepGainIfExcluded"
> >> > "igraph" 1 782 0.13 0.05 0
> >> > "ggplot2" 1 520 0.19 0.19 0
> >> > "pracma" 1 448 0.22 0.03 0
> >> > "plotrix" 1 160 0.62 0.03 1
> >> > "S4Vectors" 2 283 0.71 0.03 0
> >> > "grDevices" 1 112 0.89 0.01 0
> >> > "httr" 1 91 1.1 0.05 0
> >> > "scater" 1 85 1.18 0.4 0
> >> > "utils" 3 217 1.38 0.01 0
> >> > "GenomeInfoDb" 1 60 1.67 0.06 0
> >> > "stats" 12 449 2.67 0.01 0
> >> > "bigmemory" 1 35 2.86 0.03 3
> >> > "RCy3" 12 386 3.11 0.32 18
> >> > "BiocFileCache" 1 29 3.45 0.23 3
> >> > "glmnet" 1 24 4.17 0.07 2
> >> > "parallel" 2 33 6.06 0.01 0
> >> > "combinat" 1 13 7.69 0.01 1
> >> > "MultiAssayExperiment" 4 46 8.7 0.22 1
> >> > "foreach" 2 23 8.7 0.02 0
> >> > "graphics" 8 87 9.2 0.01 0
> >> > "GenomicRanges" 15 106 14.15 0.08 0
> >> > "rappdirs" 1 7 14.29 0.01 0
> >> > "reshape2" 1 6 16.67 0.05 0
> >> > "RColorBrewer" 1 4 25 0.01 0
> >> > "netSmooth" 1 3 33.33 0.82 3
> >> > "Rtsne" 1 3 33.33 0.02 0
> >> > "doParallel" 1 2 50 0.03 0
> >> > "ROCR" 2 3 66.67 0.05 4
> >> > "clusterExperiment" NA 122 NA 0.74 0
> >> > "IRanges" NA 255 NA 0.04 0
> >> >
> >> >
> >> > --
> >> >
> >> > *Shraddha Pai, PhD*
> >> > Principal Investigator, OICR
> >> > Assistant Professor, Department of Molecular Biophysics, University of
> >> > Toronto
> >> > shraddhapai.com; @spaiglass on Twitter
> >> > https://pailab.oicr.on.ca
> >> >
> >> >
> >> > *Ontario Institute for Cancer Research*
> >> > MaRS Centre, 661 University Avenue, Suite 510, Toronto, Ontario,
> Canada
> >> M5G
> >> > 0A3
> >> > *@OICR_news*  | *www.oicr.on.ca*
> >> > 
> >> >
> >> >
> >> >
> >> > *Collaborate. Translate. Change lives.*
> >> >
> >> >
> >> >
> >> > This message and any attachments may contain confidential and/or
> >> privileged
> >> > information for the sole use of the intended recipient. Any 

Re: [Bioc-devel] Strange "internal logical NA value has been modified" error

2021-10-12 Thread Henrik Bengtsson
In addition to checking with Valgrind, the ASan/UBsan and rchk
platforms on R-Hub (https://builder.r-hub.io/) can probably also be
useful;

> rhub::check(platform = "linux-x86_64-rocker-gcc-san")
> rhub::check(platform = "ubuntu-rchk")

/Henrik



On Tue, Oct 12, 2021 at 4:54 PM Martin Morgan  wrote:
>
> It is from base R
>
>   
> https://github.com/wch/r-source/blob/a984cc29b9b8d8821f8eb2a1081d9e0d1d4df56e/src/main/memory.c#L3214
>
> and likely indicates memory corruption, not necessarily in the code that 
> triggers the error (this is when the garbage collector is triggered...). 
> Probably in *your* C code :) since it's the least tested. Probably writing 
> out of bounds.
>
> This could be quite tricky to debug. I'd try to get something close to a 
> minimal reproducible example.
>
> I'd try to take devtools out of the picture, maybe running the test/testhat.R 
> script from the command line using Rscript, or worst case creating a shell 
> package that adds minimal code and can be checked with R CMD build 
> --no-build-vignettes / R CMD check.
>
> You could try inserting gc() before / after the unit test; it might make it 
> clear that the unit test isn't the problem. You could also try 
> gctorture(TRUE); this will make your code run extremely painfully slowly, 
> which puts a big premium on having a minimal reproducible example; you could 
> put this near the code chunks that are causing problems.
>
> You might have success running under valgrind, something like R -d valgrind 
> -f minimal_script.R.
>
> Hope those suggestions help!
>
> Martin
>
>
> On 10/12/21, 6:43 PM, "Bioc-devel on behalf of Pariksheet Nanda" 
>  
> wrote:
>
> Hi folks,
>
> I've been told to ask some of my more fun questions on this mailing list
> instead of Slack.  I'm climbing the ladder of submitting my first
> Bioconductor package (https://gitlab.com/coregenomics/tsshmm) and feel
> like there are gremlins that keep adding rungs to the top of the ladder.
>   The latest head scratcher from running devtools::check() is a unit
> test for a  trivial 2 line function failing with this gem of an error:
>
>
>  > test_check("tsshmm")
> ══ Failed tests
> 
> ── Error (test-tss.R:11:5): replace_unstranded splits unstranded into +
> and - ──
> Error in `tryCatchOne(expr, names, parentenv, handlers[[1L]])`: internal
> logical NA value has been modified
> Backtrace:
>   █
>1. ├─testthat::expect_equal(...) test-tss.R:11:4
>2. │ └─testthat::quasi_label(enquo(expected), expected.label, arg =
> "expected")
>3. │   └─rlang::eval_bare(expr, quo_get_env(quo))
>4. └─GenomicRanges::GRanges(c("chr:100:+", "chr:100:-"))
>5.   └─methods::as(seqnames, "GRanges")
>6. └─GenomicRanges:::asMethod(object)
>7.   └─GenomicRanges::GRanges(ans_seqnames, ans_ranges, ans_strand)
>8. └─GenomicRanges:::new_GRanges(...)
>9.   └─S4Vectors:::normarg_mcols(mcols, Class, ans_len)
>   10. └─S4Vectors::make_zero_col_DFrame(x_len)
>   11.   └─S4Vectors::new2("DFrame", nrows = nrow, check = 
> FALSE)
>   12. └─methods::new(...)
>   13.   ├─methods::initialize(value, ...)
>   14.   └─methods::initialize(value, ...)
>   15. └─methods::validObject(.Object)
>   16.   └─base::try(...)
>   17. └─base::tryCatch(...)
>   18.   └─base:::tryCatchList(expr, classes,
> parentenv, handlers)
>   19. └─base:::tryCatchOne(expr, names,
> parentenv, handlers[[1L]])
> [ FAIL 1 | WARN 0 | SKIP 0 | PASS 109 ]
>
>
> The full continuous integration log is here:
> https://gitlab.com/coregenomics/tsshmm/-/jobs/1673603868
>
> The function in question is:
>
>
> replace_unstranded <- function (gr) {
>  idx <- strand(gr) == "*"
>  if (length(idx) == 0L)
>  return(gr)
>  sort(c(
>  gr[! idx],
>  `strand<-`(gr[idx], value = "+"),
>  `strand<-`(gr[idx], value = "-")))
> }
>
>
> Also online here:
> 
> https://gitlab.com/coregenomics/tsshmm/-/blob/ef5e19a0e2f68fca93665bc417afbcfb6d437189/R/hmm.R#L170-178
>
> ... and the unit test is:
>
>
> test_that("replace_unstranded splits unstranded into + and -", {
>  expect_equal(replace_unstranded(GRanges("chr:100")),
>   GRanges(c("chr:100:+", "chr:100:-")))
>  expect_equal(replace_unstranded(GRanges(c("chr:100", "chr:200:+"))),
>   sort(GRanges(c("chr:100:+", "chr:100:-", "chr:200:+"
> })
>
>
> Also online here:
> 
> 

Re: [Rd] R-devel: as.character() for hexmode no longer pads with zeros

2021-09-23 Thread Henrik Bengtsson
Thanks for confirming and giving details on the rationale (... and
I'll update R.utils to use format() instead).

Regarding as.character(x)[j] === as.character(x[j]): I agree with this
- is that property of as.character()/subsetting explicitly
stated/documented somewhere?  I wonder whether this is a property we
should all strive for with other types of objects too?
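
For the record, the padded rendering is still available via format()
(a sketch against the thread's own example):

```r
x <- structure(as.integer(c(0, 8, 16, 24, 32)), class = "hexmode")
format(x)        ## "00" "08" "10" "18" "20" -- common width, padded
as.character(x)  ## "0"  "8"  "10" "18" "20" -- element-wise (rev 80946)
```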

/Henrik

On Thu, Sep 23, 2021 at 12:46 AM Martin Maechler
 wrote:
>
> >>>>> Henrik Bengtsson
> >>>>> on Wed, 22 Sep 2021 20:48:05 -0700 writes:
>
> > The update in rev 80946
> > 
> (https://github.com/wch/r-source/commit/d970867722e14811e8ba6b0ba8e0f478ff482f5e)
> > caused as.character() on hexmode objects to no longer pads with zeros.
>
> Yes -- very much on purpose; by me, after discussing a related issue
> within R-core which showed "how wrong" the previous (current R)
> behavior of the as.character() method is for
> hexmode and octmode objects :
>
> If you look at the whole rev 80946 , you also read NEWS
>
>  * as.character() for "hexmode" or "octmode" objects now
>fulfills the important basic rule
>
>   as.character(x)[j] === as.character(x[j])
>   ^
>
> rather than just calling format().
>
> The format() generic (notably for "atomic-alike" objects) should indeed
> return a character vector where each string has the same "width",
> however, the result of  as.character(x) --- at least for all
> "atomic-alike" / "vector-alike" objects --
> for a single x[j] should not be influenced by other elements in x.
>
>
>
>
> > Before:
>
> >> x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
> >> x
> > [1] "00" "08" "10" "18" "20"
> >> as.character(x)
> > [1] "00" "08" "10" "18" "20"
>
> > After:
>
> >> x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
> >> x
> > [1] "00" "08" "10" "18" "20"
> >> as.character(x)
> > [1] "0"  "8"  "10" "18" "20"
>
> > Was that intended?
>
> Yes!
> You have to explore your example a bit to notice how "illogical"
> the behavior before was:
>
> > as.character(as.hexmode(0:15))
>  [1] "0" "1" "2" "3" "4" "5" "6" "7" "8" "9" "a" "b" "c" "d" "e" "f"
> > as.character(as.hexmode(0:16))
>  [1] "00" "01" "02" "03" "04" "05" "06" "07" "08" "09" "0a" "0b" "0c" "0d" 
> "0e"
> [16] "0f" "10"
>
> > as.character(as.hexmode(16^(0:2)))
> [1] "001" "010" "100"
> > as.character(as.hexmode(16^(0:3)))
> [1] "0001" "0010" "0100" "1000"
> > as.character(as.hexmode(16^(0:4)))
> [1] "1" "00010" "00100" "01000" "1"
>
> all breaking the rule in the NEWS  and given above.
>
> If you want format()  you should use format(),
> but as.character() should never have used format() ..
>
> Martin
>
> > /Henrik
>
> > PS. This breaks R.utils::intToHex()
> > [https://cran.r-project.org/web/checks/check_results_R.utils.html]
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R-devel: as.character() for hexmode no longer pads with zeros

2021-09-22 Thread Henrik Bengtsson
The update in rev 80946
(https://github.com/wch/r-source/commit/d970867722e14811e8ba6b0ba8e0f478ff482f5e)
caused as.character() on hexmode objects to no longer pads with zeros.

Before:

> x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
> x
[1] "00" "08" "10" "18" "20"
> as.character(x)
[1] "00" "08" "10" "18" "20"

After:

> x <- structure(as.integer(c(0,8,16,24,32)), class="hexmode")
> x
[1] "00" "08" "10" "18" "20"
> as.character(x)
[1] "0"  "8"  "10" "18" "20"

Was that intended?

/Henrik

PS. This breaks R.utils::intToHex()
[https://cran.r-project.org/web/checks/check_results_R.utils.html]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] WISH: set.seed(seed) to produce error if length(seed) != 1 (now silent)

2021-09-17 Thread Henrik Bengtsson
> I'd say a more serious problem would be using set.seed(.Random.seed) ...

Exactly, I'm pretty sure I also tried that at some point.  This leads
to another thing I wanted to get to, which is to add support for
exactly that case.  So, instead of having to poke around with:

globalenv()$.Random.seed <- new_seed

where 'new_seed' is a valid ".Random.seed" seed, it would be
convenient to be able to do just set.seed(new_seed), which comes in
handy in parallel processing.
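
Until then, restoring a full generator state looks something like this
(a sketch; assign() into the global environment avoids the
set.seed(.Random.seed) pitfall):

```r
## Capture the complete RNG state, draw, then restore the state:
saved <- .Random.seed
x1 <- runif(1)
assign(".Random.seed", saved, envir = globalenv())
x2 <- runif(1)
identical(x1, x2)  ## TRUE -- the exact generator state was restored
```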

/Henrik

On Fri, Sep 17, 2021 at 3:10 PM Duncan Murdoch  wrote:
>
> I'd say a more serious problem would be using set.seed(.Random.seed),
> because the first entry codes for RNGkind, it hardly varies at all.  So
> this sequence could really mislead someone:
>
>  > set.seed(.Random.seed)
>  > sum(.Random.seed)
> [1] 24428993419
>
> # Use it to get a new .Random.seed value:
>  > runif(1)
> [1] 0.3842704
>
>  > sum(.Random.seed)
> [1] -13435151647
>
> # So let's make things really random, by using the new seed as a seed:
>  > set.seed(.Random.seed)
>  > sum(.Random.seed)
> [1] 24428993419
>
> # Back to the original!
>
> Duncan Murdoch
>
>
> On 17/09/2021 8:38 a.m., Henrik Bengtsson wrote:
> >> I’m curious, other than proper programming practice, why?
> >
> > Life's too short for troubleshooting silent mistakes - mine or others.
> >
> > While at it, searching the interwebs for use of set.seed(), gives
> > mistakes/misunderstandings like using set.seed(), e.g.
> >
> >> set.seed(6.1); sum(.Random.seed)
> > [1] 73930104
> >> set.seed(6.2); sum(.Random.seed)
> > [1] 73930104
> >
> > which clearly is not what the user expected.  There are also a few
> > cases of set.seed(), e.g.
> >
> >> set.seed("42"); sum(.Random.seed)
> > [1] -2119381568
> >> set.seed(42); sum(.Random.seed)
> > [1] -2119381568
> >
> > which works just because as.numeric("42") is used.
> >
> > /Henrik
> >
> > On Fri, Sep 17, 2021 at 12:55 PM GILLIBERT, Andre
> >  wrote:
> >>
> >> Hello,
> >>
> >> A vector with a length >= 2 to set.seed would probably be a bug. An error 
> >> message will help the user to fix his R code. The bug may be accidental or 
> >> due to bad understanding of the set.seed function. For instance, a user 
> >> may think that the whole state of the PRNG can be passed to set.seed.
> >>
> >> The "if" instruction, emits a warning when the condition has length >= 2, 
> >> because it is often a bug. I would expect a warning or error with 
> >> set.seed().
> >>
> >> Validating inputs and emitting errors early is a good practice.
> >>
> >> Just my 2 cents.
> >>
> >> Sincerely.
> >> Andre GILLIBERT
> >>
> >> -Message d'origine-
> >> De : R-devel [mailto:r-devel-boun...@r-project.org] De la part de Avraham 
> >> Adler
> >> Envoyé : vendredi 17 septembre 2021 12:07
> >> À : Henrik Bengtsson
> >> Cc : R-devel
> >> Objet : Re: [Rd] WISH: set.seed(seed) to produce error if length(seed) != 
> >> 1 (now silent)
> >>
> >> Hi, Henrik.
> >>
> >> I’m curious, other than proper programming practice, why?
> >>
> >> Avi
> >>
> >> On Fri, Sep 17, 2021 at 11:48 AM Henrik Bengtsson <
> >> henrik.bengts...@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> according to help("set.seed"), argument 'seed' to set.seed() should be:
> >>>
> >>>a single value, interpreted as an integer, or NULL (see ‘Details’).
> >>>
> >>>  From code inspection (src/main/RNG.c) and testing, it turns out that
> >>> if you pass a 'seed' with length greater than one, it silently uses
> >>> seed[1], e.g.
> >>>
> >>>> set.seed(1); sum(.Random.seed)
> >>> [1] 4070365163
> >>>> set.seed(1:3); sum(.Random.seed)
> >>> [1] 4070365163
> >>>> set.seed(1:100); sum(.Random.seed)
> >>> [1] 4070365163
> >>>
> >>> I'd like to suggest that set.seed() produces an error if length(seed)
> >>>> 1.  As a reference, for length(seed) == 0, we get:
> >>>
> >>>> set.seed(integer(0))
> >>> Error in set.seed(integer(0)) : supplied seed is not a valid integer
> >>>
> >>> /Henrik
> >>>
> >>> __
> >>> R-devel@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>>
> >> --
> >> Sent from Gmail Mobile
> >>
> >>  [[alternative HTML version deleted]]
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] WISH: set.seed(seed) to produce error if length(seed) != 1 (now silent)

2021-09-17 Thread Henrik Bengtsson
> I’m curious, other than proper programming practice, why?

Life's too short for troubleshooting silent mistakes - mine or others.

While at it, searching the interwebs for use of set.seed(), gives
mistakes/misunderstandings like using set.seed(), e.g.

> set.seed(6.1); sum(.Random.seed)
[1] 73930104
> set.seed(6.2); sum(.Random.seed)
[1] 73930104

which clearly is not what the user expected.  There are also a few
cases of set.seed(), e.g.

> set.seed("42"); sum(.Random.seed)
[1] -2119381568
> set.seed(42); sum(.Random.seed)
[1] -2119381568

which works just because as.numeric("42") is used.

/Henrik

On Fri, Sep 17, 2021 at 12:55 PM GILLIBERT, Andre
 wrote:
>
> Hello,
>
> A vector with a length >= 2 to set.seed would probably be a bug. An error 
> message will help the user to fix his R code. The bug may be accidental or 
> due to bad understanding of the set.seed function. For instance, a user may 
> think that the whole state of the PRNG can be passed to set.seed.
>
> The "if" instruction, emits a warning when the condition has length >= 2, 
> because it is often a bug. I would expect a warning or error with set.seed().
>
> Validating inputs and emitting errors early is a good practice.
>
> Just my 2 cents.
>
> Sincerely.
> Andre GILLIBERT
>
> -Message d'origine-
> De : R-devel [mailto:r-devel-boun...@r-project.org] De la part de Avraham 
> Adler
> Envoyé : vendredi 17 septembre 2021 12:07
> À : Henrik Bengtsson
> Cc : R-devel
> Objet : Re: [Rd] WISH: set.seed(seed) to produce error if length(seed) != 1 
> (now silent)
>
> Hi, Henrik.
>
> I’m curious, other than proper programming practice, why?
>
> Avi
>
> On Fri, Sep 17, 2021 at 11:48 AM Henrik Bengtsson <
> henrik.bengts...@gmail.com> wrote:
>
> > Hi,
> >
> > according to help("set.seed"), argument 'seed' to set.seed() should be:
> >
> >   a single value, interpreted as an integer, or NULL (see ‘Details’).
> >
> > From code inspection (src/main/RNG.c) and testing, it turns out that
> > if you pass a 'seed' with length greater than one, it silently uses
> > seed[1], e.g.
> >
> > > set.seed(1); sum(.Random.seed)
> > [1] 4070365163
> > > set.seed(1:3); sum(.Random.seed)
> > [1] 4070365163
> > > set.seed(1:100); sum(.Random.seed)
> > [1] 4070365163
> >
> > I'd like to suggest that set.seed() produces an error if length(seed)
> > > 1.  As a reference, for length(seed) == 0, we get:
> >
> > > set.seed(integer(0))
> > Error in set.seed(integer(0)) : supplied seed is not a valid integer
> >
> > /Henrik
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> --
> Sent from Gmail Mobile
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] WISH: set.seed(seed) to produce error if length(seed) != 1 (now silent)

2021-09-17 Thread Henrik Bengtsson
Hi,

according to help("set.seed"), argument 'seed' to set.seed() should be:

  a single value, interpreted as an integer, or NULL (see ‘Details’).

From code inspection (src/main/RNG.c) and testing, it turns out that
if you pass a 'seed' with length greater than one, it silently uses
seed[1], e.g.

> set.seed(1); sum(.Random.seed)
[1] 4070365163
> set.seed(1:3); sum(.Random.seed)
[1] 4070365163
> set.seed(1:100); sum(.Random.seed)
[1] 4070365163

I'd like to suggest that set.seed() produces an error if
length(seed) > 1.  As a reference, for length(seed) == 0, we get:

> set.seed(integer(0))
Error in set.seed(integer(0)) : supplied seed is not a valid integer

/Henrik
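
In the meantime, the proposed behavior can be emulated with a small
wrapper (an illustration only, not the suggested base R change
itself):

```r
## Reject seeds that are not a single value or NULL, instead of
## silently using seed[1] as base R currently does:
set_seed_strict <- function(seed) {
  if (!is.null(seed) && length(seed) != 1L)
    stop("'seed' must be a single value or NULL, not length ",
         length(seed))
  set.seed(seed)
}

set_seed_strict(42)     ## OK
## set_seed_strict(1:3) ## would signal an error
```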

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Bioc-devel] git: lost write access to some repos + what is my BiocCredentials email address?

2021-08-19 Thread Henrik Bengtsson
Thank you, confirming git push now works and

$ ssh -T g...@git.bioconductor.org | grep -E
"(affxparser|aroma.light|illuminaio|QDNAseq)$"
X11 forwarding request failed on channel 0
 R W  packages/QDNAseq
 R W  packages/affxparser
 R W  packages/aroma.light
 R W  packages/illuminaio

/Henrik

On Thu, Aug 19, 2021 at 2:38 PM Nitesh Turaga  wrote:
>
> Hi Henrik,
>
> You should have access to these packages again.
>
> Please try again.
>
> Best
>
> Nitesh Turaga
> Scientist II, Department of Data Science,
> Bioconductor Core Team Member
> Dana Farber Cancer Institute
>
> > On Aug 19, 2021, at 8:08 AM, Henrik Bengtsson  
> > wrote:
> >
> > Hi,
> >
> > I seem to have "lost" write access to several Bioconductor git
> > repositories that I had git push access for before;
> >
> > $ ssh -T g...@git.bioconductor.org | grep -E
> > "(affxparser|aroma.light|illuminaio|QDNAseq)$"
> > X11 forwarding request failed on channel 0
> > R  packages/QDNAseq
> > R  packages/affxparser
> > R  packages/aroma.light
> > R W  packages/illuminaio
> >
> > Using `ssh -v ...`, I see that my git+ssh "offers" the server an RSA
> > public key (B...PwYDZ), which is accepted.  Since this gives me
> > write access to one of the repositories, I either have lost write
> > access to the others, or I somehow have ended up with different SSH
> > keys associated with different repositories (since I had write
> > permissions in the past).
> >
> > For example, with:
> >
> > $ git clone g...@git.bioconductor.org:packages/aroma.light
> > $ cd aroma.light
> > $ git remote -v
> > origin  g...@git.bioconductor.org:packages/aroma.light (fetch)
> > origin  g...@git.bioconductor.org:packages/aroma.light (push)
> >
> > I get:
> >
> > $ git push
> > X11 forwarding request failed on channel 0
> > FATAL: W any packages/aroma.light h.bengtsson DENIED by fallthru
> > (or you mis-spelled the reponame)
> > fatal: Could not read from remote repository.
> >
> > Please make sure you have the correct access rights and the repository 
> > exists.
> >
> > I followed FAQ #15 to check what SSH key I have on BiocCredentials,
> > but when I try to activate the account on
> > https://git.bioconductor.org/BiocCredentials/account_activation/ using
> > the email address I have in the DESCRIPTION file, I get
> > "henr...@braju.com is not associated with a maintainer of a
> > Bioconductor package. Please check the spelling or contact
> > bioc-devel@r-project.org for help."(*) I suspect it's another email
> > address I should use, possibly one from the SVN era. How can I find
> > out which email address I should use?
> >
> > (*) FYI, the webpage hint reading "Enter the email associated with
> > your Bioconductor package" might be ambiguous; Is it really specific
> > to a particular package?  Should it say something like "Enter the
> > email associated with your Bioconductor developer account"?
> >
> > ___
> > Bioc-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
>

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Bioc-devel] git: lost write access to some repos + what is my BiocCredentials email address?

2021-08-19 Thread Henrik Bengtsson
Hi,

I seem to have "lost" write access to several Bioconductor git
repositories that I had git push access for before;

$ ssh -T g...@git.bioconductor.org | grep -E
"(affxparser|aroma.light|illuminaio|QDNAseq)$"
X11 forwarding request failed on channel 0
 R  packages/QDNAseq
 R  packages/affxparser
 R  packages/aroma.light
 R W  packages/illuminaio

Using `ssh -v ...`, I see that my git+ssh "offers" the server an RSA
public key (B...PwYDZ), which is accepted.  Since this gives me
write access to one of the repositories, I either have lost write
access to the others, or I somehow have ended up with different SSH
keys associated with different repositories (since I had write
permissions in the past).

For example, with:

$ git clone g...@git.bioconductor.org:packages/aroma.light
$ cd aroma.light
$ git remote -v
origin  g...@git.bioconductor.org:packages/aroma.light (fetch)
origin  g...@git.bioconductor.org:packages/aroma.light (push)

I get:

$ git push
X11 forwarding request failed on channel 0
FATAL: W any packages/aroma.light h.bengtsson DENIED by fallthru
(or you mis-spelled the reponame)
fatal: Could not read from remote repository.

Please make sure you have the correct access rights and the repository exists.

I followed FAQ #15 to check what SSH key I have on BiocCredentials,
but when I try to activate the account on
https://git.bioconductor.org/BiocCredentials/account_activation/ using
the email address I have in the DESCRIPTION file, I get
"henr...@braju.com is not associated with a maintainer of a
Bioconductor package. Please check the spelling or contact
bioc-devel@r-project.org for help."(*) I suspect it's another email
address I should use, possibly one from the SVN era. How can I find
out which email address I should use?

(*) FYI, the webpage hint reading "Enter the email associated with
your Bioconductor package" might be ambiguous; Is it really specific
to a particular package?  Should it say something like "Enter the
email associated with your Bioconductor developer account"?

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


Re: [Rd] Force quitting a FORK cluster node on macOS and Solaris wreaks havoc

2021-08-16 Thread Henrik Bengtsson
Thank you Simon, this is helpful.  I take it this is specific to quit(),
so it's a poor choice for emulating crashed parallel workers, and
Sys.kill() is much better for that.

I was focusing on that odd extra execution/output, but as you say,
there are lots of other things that are done by quit() here, e.g.
regardless of platform quit() damages the main R process too:

> f <- parallel::mcparallel(quit("no"))
> v <- parallel::mccollect(f)
Warning message:
In parallel::mccollect(f) : 1 parallel job did not deliver a result
> file.exists(tempdir())
[1] FALSE


Would it be sufficient to make quit() fork safe by, conceptually,
doing something like:

quit <- function(save = "default", status = 0, runLast = TRUE) {
  if (parallel:::isChild())
  stop("quit() must not be called in a forked process")
  .Internal(quit(save, status, runLast))
}

This would protect against calling quit() in forked code by mistake,
e.g. when someone parallelize over code/scripts they don't have full
control over and the ones who write those scripts might not be aware
that they may be used in forks.
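
Related to the above, a fork-safe way to emulate a crashed worker
seems to be to signal the child instead (a sketch based on the
tools::pskill() observation earlier in this thread):

```r
## Kill the forked child outright; unlike quit(), this should not run
## the child's cleanup code, so the shared tempdir() survives:
f <- parallel::mcparallel(tools::pskill(Sys.getpid()))
v <- parallel::mccollect(f)  ## expect a warning that the job
                             ## did not deliver a result
file.exists(tempdir())       ## TRUE -- main session unharmed
```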

Thanks,

Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Force quitting a FORK cluster node on macOS and Solaris wreaks havoc

2021-08-12 Thread Henrik Bengtsson
The following smells like a bug in R to me, because it puts the main R
session into an unstable state.  Consider the following R script:

a <- 42
message("a=", a)
cl <- parallel::makeCluster(1L, type="FORK")
try(parallel::clusterEvalQ(cl, quit(save="no")))
message("parallel:::isChild()=", parallel:::isChild())
message("a=", a)
rm(a)

The purpose of this was to emulate what happens when a parallel
worker crashes.

Now, if you source() the above on macOS, you might(*) end up with:

> a <- 42
> message("a=", a)
a=42
> cl <- parallel::makeCluster(1L, type="FORK")
> try(parallel::clusterEvalQ(cl, quit(save="no")))
Error: Error in unserialize(node$con) : error reading from connection
> message("parallel:::isChild()=", parallel:::isChild())
parallel:::isChild()=FALSE
> message("a=", a)
a=42
> rm(a)
> try(parallel::clusterEvalQ(cl, quit(save="no")))
Error: Error in unserialize(node$con) : error reading from connection
> message("parallel:::isChild()=", parallel:::isChild())
parallel:::isChild()=FALSE
> message("a=", a)
Error: Error in message("a=", a) : object 'a' not found
Execution halted

Note how 'rm(a)' is supposed to be the last line of code to be
evaluated.  However, the force quitting of the FORK cluster node
appears to result in the main code being evaluated twice (in
parallel?).

(*) This does not happen on all macOS variants. For example, it works
fine on CRAN's 'r-release-macos-x86_64' but it does give the above
behavior on 'r-release-macos-arm64'.  I can reproduce it on GitHub
Actions 
(https://github.com/HenrikBengtsson/teeny/runs/3309235106?check_suite_focus=true#step:10:219)
but not on R-hub's 'macos-highsierra-release' and
'macos-highsierra-release-cran'.  I can also reproduce it on R-hub's
'solaris-x86-patched' and solaris-x86-patched-ods' machines.  However,
I still haven't found a Linux machine where this happens.

If one replaces quit(save="no") with tools::pskill(Sys.getpid()) or
parallel:::mcexit(0L), this behavior does not take place (at least not
on GitHub Actions and R-hub).

I don't have access to a macOS or a Solaris machine, so I cannot
investigate further myself. For example, could it be an issue with
quit(), or is it possible to trigger it by other means? And more
importantly, should this be fixed? Also, I'd be curious what happens
if you run the above in an interactive R session.

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] find functions with missing Rd tags

2021-06-23 Thread Henrik Bengtsson
$ grep -L -F "\value{" man/*.Rd
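To see why this works: `grep -L` lists the files that do *not* contain the pattern, and `-F` matches `\value{` as a literal string rather than a regex. A quick self-contained demo, using throwaway Rd files with made-up names:

```shell
# Demo of 'grep -L': it prints the names of files that do NOT contain
# the pattern; -F treats '\value{' literally. File names are hypothetical.
dir=$(mktemp -d)
printf '%s\n' '\name{f}' '\value{A number.}' > "$dir/has_value.Rd"
printf '%s\n' '\name{g}' '\title{No value}' > "$dir/missing_value.Rd"
( cd "$dir" && grep -L -F '\value{' *.Rd )   # prints: missing_value.Rd
rm -r "$dir"
```

Run from the package root, the original one-liner therefore prints exactly the man/*.Rd files still lacking a \value{} section.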

/Henrik

On Wed, Jun 23, 2021 at 10:58 AM Alex Chubaty  wrote:
>
> During a recent package submission process, a CRAN maintainer showed one of
> their checks found missing \value{} documentation in some package Rd files,
> and asked us to ensure all exported functions have their return values
> described.
>
> This check (for missing Rd values) is not run by the default checks, so I
> have no idea how to quickly identify which functions are missing those
> components, without manually inspecting everything. I am hoping that
> someone here can tell me which special R CMD check incantation, or similar
> I can use to find _exported_ functions with missing Rd tags.
>
> Thank you,
> Alex
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Bioc-devel] Build error "failure: length > 1 in coercion to logical" not reproducible

2021-06-17 Thread Henrik Bengtsson
On Thu, Jun 17, 2021 at 2:32 AM  wrote:
>
> Dear colleagues,
>
> It seems to me that, starting with the latest BioC devel branch (3.14), the
> build systems have become more pedantic about logical vectors of length > 1
> in conditions. Two of the packages I am maintaining, 'kebabs' and 'procoil'
> currently fail to build. Surely I want to fix this. However, I cannot
> reproduce these errors on my local system (R 4.1.0 alpha on Ubuntu 18.04
> LTS). The discussions https://support.bioconductor.org/p/9137605/ and
> https://github.com/Bioconductor/BBS/issues/71  have pointed me to the
> setting "_R_CHECK_LENGTH_1_CONDITION_=verbose".
>
> First question: Can anybody confirm that this has been changed in the recent
> devel?

Not a Bioc maintainer, but yes, the Bioc build system added this on
May 22, 2021 in order to catch similar bugs in package vignettes, cf.
https://community-bioc.slack.com/archives/CLUJWDQF4/p1622062783020300?thread_ts=1622053611.008100&cid=CLUJWDQF4

>
> Second question: I have tried to include
> "_R_CHECK_LENGTH_1_CONDITION_=verbose" in my .Renviron file and it seems
> that my R session respects that. However, when I run 'R CMD build' on the
> aforementioned packages, they still build fine. The suggestions in
> https://github.com/Bioconductor/BBS/issues/71 don't work for me either
> (maybe I have done something wrong). I would actually like to reproduce the
> errors in my local system, since this will help me fixing the errors and
> testing the changes. So can anybody give me advice how I can make my local
> installation to check for logical vectors of length > 1 in conditions more
> strictly?

You want to set:

_R_CHECK_LENGTH_1_LOGIC2_=verbose

That one catches bugs where x && y or x || y is called with
length(x) > 1 or length(y) > 1.

Using:

_R_CHECK_LENGTH_1_CONDITION_=verbose

catches bugs where if (x) { ... } and similar conditions are called
with length(x) > 1.
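Since settings in .Renviron may not reach the child R processes that 'R CMD build' and 'R CMD check' spawn, one way to reproduce locally is to export both variables in the shell before running the check. A sketch (the package tarball name is hypothetical):

```shell
# Export both strictness settings so all child R processes inherit them,
# then run the check. The R invocation is commented out here as a sketch.
export _R_CHECK_LENGTH_1_CONDITION_=verbose
export _R_CHECK_LENGTH_1_LOGIC2_=verbose
# R CMD check --as-cran kebabs_X.Y.Z.tar.gz
env | grep '^_R_CHECK_LENGTH_1_'
```

Exported environment variables propagate to every subprocess, which is what makes this more reliable than a per-session setting.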

In your case, a reproducible minimal example is:

Sys.setenv("_R_CHECK_LENGTH_1_LOGIC2_"="verbose")
files <- c("a", "b")
files <- Rsubread:::.check_and_NormPath(files)
...
Error in is.na(files) || is.null(files) :
  'length(x) = 2 > 1' in coercion to 'logical(1)'

The problem is that there's an is.na(files) || is.null(files) call in the code, where

> is.na(files)
[1] FALSE FALSE
> is.null(files)
[1] FALSE

so, we have an x || y case with length(x) > 1.

/Henrik




>
> Thanks a lot in advance,
> Ulrich
>
> ___
> Bioc-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel

___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Rd] R for Windows leaves detritus in the temp directory

2021-06-15 Thread Henrik Bengtsson
ISSUE:

The TMPDIR validation done in src/gnuwin32/system.c:

/* in case getpid() is not unique -- has been seen under Windows */
snprintf(ifile, 1024, "%s/Rscript%x%x", tm, getpid(),
 (unsigned int) GetTickCount());
ifp = fopen(ifile, "w+b");
if(!ifp) R_Suicide(_("creation of tmpfile failed -- set TMPDIR suitably?"));
  }

does _not_ clean up after itself, i.e. there's a missing

unlink(ifile);

In contrast, ditto in src/unix/system.c does this.
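The pattern of the proposed fix can be illustrated in shell terms (a sketch of the idea, not the actual C patch): probe that the temp directory is writable by creating a scratch file, then remove the probe so nothing is left behind.

```shell
# Probe-then-clean-up sketch mirroring the missing unlink(ifile):
# create a scratch file to verify TMPDIR is writable, then delete it.
tm=${TMPDIR:-/tmp}
ifile="$tm/Rscript_probe_$$"
if ! : > "$ifile"; then
  echo 'creation of tmpfile failed -- set TMPDIR suitably?' >&2
  exit 1
fi
rm -f "$ifile"        # the clean-up step src/gnuwin32/system.c omits
[ ! -e "$ifile" ] && echo 'no detritus left'
```

Without the removal step, every such probe accumulates as a stray 'Rscript...' file in the temp directory, which is exactly what the 'R CMD check' detritus NOTE below reports.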


BACKGROUND:

When running R CMD check --as-cran on my 'future' package, I get:

* checking for detritus in the temp directory ... NOTE
Found the following files/directories:
  'Rscript171866c62e'

when checked on R Under development (unstable) (2021-06-13 r80496),
including on win-builder.  I can reproduce this with a package
'tests/detritus.R':

  cl <- parallel::makeCluster(1)
  dummy <- parallel::clusterEvalQ(cl, {
cl <- parallel::makeCluster(1)
on.exit(parallel::stopCluster(cl))
parallel::clusterEvalQ(cl, Sys.getpid())
  })
  print(dummy)
  parallel::stopCluster(cl)


I believe it requires a nested PSOCK cluster to reproduce the 'R CMD
check' NOTE, e.g. it does _not_ happen with:

  cl <- parallel::makeCluster(1)
  dummy <- parallel::clusterEvalQ(cl, {
Sys.getpid()
  })
  print(dummy)
  parallel::stopCluster(cl)

/Henrik

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [R-pkg-devel] Old version of rtracklayer on a single check server

2021-05-18 Thread Henrik Bengtsson
I stand corrected about the R-devel version (and now R 4.1.0) on CRAN.

However, isn't it the case that it will never be solved for R 4.0.0 (to
become R-oldrel on CRAN) and CRAN will keep reporting an error on MS
Windows there because
https://bioconductor.org/packages/3.12/bioc/html/rtracklayer.html provides
only an older version for MS Windows?

If so, an alternative to relying on Suggests is to make the package depend
on R (>= 4.1.0).

/Henrik

On Tue, May 18, 2021, 09:08 Martin Morgan  wrote:

> That’s not correct Henrik.
>
> CRAN follows CRAN rules for installing packages, so uses
> tools:::BioC_version_for_R_version(). For R-devel we have
>
> > R.version.string
> [1] "R Under development (unstable) (2021-05-18 r80323)"
> > tools:::.BioC_version_associated_with_R_version()
> [1] '3.13'
>
> For this version of Bioconductor, the rtracklayer version (from
> https://bioconductor.org/packages/3.13/rtracklayer, or
> `available.packages(repos = 
> "https://bioconductor.org/packages/3.13/bioc")["rtracklayer",
> "Version"]`) is 1.51.5.
>
> So the r-devel-windows-ix86+x86_64 builder mentioned in the post has the
> wrong version of rtracklayer for R-devel.
>
> Martin Morgan
>
> On 5/18/21, 11:49 AM, "R-package-devel on behalf of Henrik Bengtsson" <
> r-package-devel-boun...@r-project.org on behalf of
> henrik.bengts...@gmail.com> wrote:
>
> It's a problem with Bioconductor and a broken release history of
> 'rtracklayer' on MS Windows (e.g.
> https://bioconductor.org/packages/3.12/bioc/html/rtracklayer.html)
> plus how each Bioconductor version is tied to a specific R version.
> In other words, even if they fix it in Bioconductor 3.13 (for R
> 4.1.0), it can't be fixed in Bioconductor 3.12 (for R 4.0.0), so
> you're package will keep failing on Windows for R 4.0.0.  The reason
> why it can't be fixed in Bioconductor 3.12 is that they have now
> frozen that release forever.
>
> Because of this, I suspect the only solution is to make 'rtracklayer'
> an optional package, i.e. move it to Suggests: and update all your
> code to run conditionally of that package being available. I recommend
> you reach out to the bioc-devel mailing list for advice.
>
> /Henrik
>
> On Tue, May 18, 2021 at 4:33 AM Dalgleish, James (NIH/NCI) [F] via
> R-package-devel  wrote:
> >
> > To any who might have an idea:
> >
> > I've been reading several posts in the digest about dependency
> version issues on the check servers and I'm having my own issue, which I
> can't solve because I can't upgrade the check server's package version:
> > * installing *source* package 'CNVScope' ...
> > ** using staged installation
> > ** R
> > ** data
> > *** moving datasets to lazyload DB
> > ** inst
> > ** byte-compile and prepare package for lazy loading
> > Warning: multiple methods tables found for 'export'
> > Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()),
> versionCheck = vI[[j]]) :
> >   namespace 'rtracklayer' 1.48.0 is already loaded, but >= 1.51.5 is
> required
> > Calls:  ... namespaceImportFrom -> asNamespace ->
> loadNamespace
> > Execution halted
> > ERROR: lazy loading failed for package 'CNVScope'
> > * removing 'd:/RCompile/CRANguest/R-devel/lib/CNVScope'
> >
> > These errors tend to be check server dependent (only occurs on
> r-devel-windows-ix86+x86_64<
> https://cran.r-project.org/web/checks/check_flavors.html#r-devel-windows-ix86_x86_64>)
> and I'm just trying to make the small change to closeAllConnections() that
> was asked earlier of maintainers by Kurt Hornik and the CRAN team, but I
> can't because of this old package version on the devel check server, which
> has the same error:
> >
> https://win-builder.r-project.org/incoming_pretest/CNVScope_3.5.7_20210518_062953/Windows/00check.log
> >
> https://win-builder.r-project.org/incoming_pretest/CNVScope_3.5.7_20210518_062953/Windows/00install.out
> >
> > Is there any way around this? I notice the maintainer of the
> 'gtsummary' package had a similar issue:
> >
> > "> I am trying to make a release that depends on gt v0.3.0, but I
> get an error
> >
> > > when I test the package on Windows Dev
> `devtools::check_win_devel()` that
> >
> > > the gt package is available but it's an unsuitable version.  Does
> anyone
> >
> > > know why the gt v0.3.0 is unavailable?"
> >
> >
> >
> >

Re: [R-pkg-devel] Old version of rtracklayer on a single check server

2021-05-18 Thread Henrik Bengtsson
It's a problem with Bioconductor and a broken release history of
'rtracklayer' on MS Windows (e.g.
https://bioconductor.org/packages/3.12/bioc/html/rtracklayer.html)
plus how each Bioconductor version is tied to a specific R version.
In other words, even if they fix it in Bioconductor 3.13 (for R
4.1.0), it can't be fixed in Bioconductor 3.12 (for R 4.0.0), so
your package will keep failing on Windows for R 4.0.0.  The reason
why it can't be fixed in Bioconductor 3.12 is that they have now
frozen that release forever.

Because of this, I suspect the only solution is to make 'rtracklayer'
an optional package, i.e. move it to Suggests: and update all your
code to run conditionally of that package being available. I recommend
you reach out to the bioc-devel mailing list for advice.

/Henrik

On Tue, May 18, 2021 at 4:33 AM Dalgleish, James (NIH/NCI) [F] via
R-package-devel  wrote:
>
> To any who might have an idea:
>
> I've been reading several posts in the digest about dependency version issues 
> on the check servers and I'm having my own issue, which I can't solve because 
> I can't upgrade the check server's package version:
> * installing *source* package 'CNVScope' ...
> ** using staged installation
> ** R
> ** data
> *** moving datasets to lazyload DB
> ** inst
> ** byte-compile and prepare package for lazy loading
> Warning: multiple methods tables found for 'export'
> Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = 
> vI[[j]]) :
>   namespace 'rtracklayer' 1.48.0 is already loaded, but >= 1.51.5 is required
> Calls:  ... namespaceImportFrom -> asNamespace -> loadNamespace
> Execution halted
> ERROR: lazy loading failed for package 'CNVScope'
> * removing 'd:/RCompile/CRANguest/R-devel/lib/CNVScope'
>
> These errors tend to be check server dependent (only occurs on 
> r-devel-windows-ix86+x86_64)
>  and I'm just trying to make the small change to closeAllConnections() that 
> was asked earlier of maintainers by Kurt Hornik and the CRAN team, but I 
> can't because of this old package version on the devel check server, which 
> has the same error:
> https://win-builder.r-project.org/incoming_pretest/CNVScope_3.5.7_20210518_062953/Windows/00check.log
> https://win-builder.r-project.org/incoming_pretest/CNVScope_3.5.7_20210518_062953/Windows/00install.out
>
> Is there any way around this? I notice the maintainer of the 'gtsummary' 
> package had a similar issue:
>
> "> I am trying to make a release that depends on gt v0.3.0, but I get an error
>
> > when I test the package on Windows Dev `devtools::check_win_devel()` that
>
> > the gt package is available but it's an unsuitable version.  Does anyone
>
> > know why the gt v0.3.0 is unavailable?"
>
>
>
> I'm open to any suggestions, but can't see a way around this issue from my 
> end without the ability to service the check server.
>
>
> Thanks,
> James Dalgleish
> Cancer Genetics Branch,
> Center for Cancer Research,
> National Cancer Institute,
> National Institutes of Health,
> Bethesda, MD
>
>
> [[alternative HTML version deleted]]
>
> __
> R-package-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-package-devel

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [Rd] Testing R build when using --without-recommended-packages?

2021-05-05 Thread Henrik Bengtsson
On Wed, May 5, 2021 at 2:13 AM Martin Maechler
 wrote:
>
> > Gabriel Becker
> > on Tue, 4 May 2021 14:40:22 -0700 writes:
>
> > Hmm, that's fair enough Ben, I stand corrected.  I will say that this seems
> > to be a pretty "soft" recommendation, as these things go, given that it
> > isn't tested for by R CMD check, including with the --as-cran extensions.
> > In principle, it seems like it could be; similar checks are made in package
> > code for inappropriate external-package-symbol usage.
>
> > Either way, though, I suppose I have a number of packages which have been
> > invisibly non-best-practices compliant for their entire lifetimes (or at
> > least, the portion of that where they had tests/vignettes...).
>
> > Best,
> > ~G
>
> > On Tue, May 4, 2021 at 2:22 PM Ben Bolker  wrote:
>
> >> Sorry if this has been pointed out already, but some relevant text
> >> from
> >>
> >> https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Suggested-packages
> >>
> >> > Note that someone wanting to run the examples/tests/vignettes may not
> >> have a suggested package available (and it may not even be possible to
> >> install it for that platform). The recommendation used to be to make
> >> their use conditional via if(require("pkgname")): this is OK if that
> >> conditioning is done in examples/tests/vignettes, although using
> >> if(requireNamespace("pkgname")) is preferred, if possible.
> >>
> >> ...
> >>
> >> > Some people have assumed that a ‘recommended’ package in ‘Suggests’
> >> can safely be used unconditionally, but this is not so. (R can be
> >> installed without recommended packages, and which packages are
> >> ‘recommended’ may change.)
>
>
> Thank you all (Henrik, Gabe, Dirk & Ben) !
>
> I think it would be a good community effort  and worth the time
> also of R core to further move into the right direction
> as Dirk suggested.
>
> I think we all agree it would be nice if Henrik (and anybody)
> could use  'make check' on R's own sources after using
>  --without-recommended-packages
>
> One more piece of evidence is the tests/README file in
> the R sources.  It has much more but simply starts with:
>
> ---
> There is a hierarchy of check targets:
>
>  make check
>
> for all builders.  If this works one can be reasonably happy R is working
> and do `make install' (or the equivalent).
>
> make check-devel
>
> for people changing the code: this runs things like the demos and
> no-segfault which might be broken by code changes, and checks on the
> documentation (effectively R CMD check on each of the base packages).
> This needs recommended packages installed.
>
> make check-all
>
> runs all the checks, those in check-devel plus tests of the recommended
> packages.
>
> Note that for complete testing you will need a number of other
> ..
> ..
>
> ---
>
> So, our (R core) own intent has been that   'make check'  should
> run w/o rec.packages  but further checking not.
>
> So, yes, please, you are encouraged to send patches against the
> R devel trunk  to fix such examples and tests.

Thanks Martin!  Thanks for confirming and for being open to patches.
This encourages me to try to patch what we've got so that 'make check'
and 'make check-devel' can complete also without 'recommended'
packages.

/Henrik

>
> Best,
> Martin
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Testing R build when using --without-recommended-packages?

2021-05-04 Thread Henrik Bengtsson
Two questions to R Core:

1. Is R designed so that 'recommended' packages are optional, or
should that be considered uncharted territories?

2. Can such an R build/installation be validated using existing check methods?


--

Dirk, it's not clear to me whether you know for sure, or whether you
draw conclusions based on your long experience and reading. I think it's very
important that others don't find this thread later on and read your
comments as if they're the "truth" (unless they are).  I haven't
re-read it from start to finish, but there are passages in 'R
Installation and Administration' suggesting you can build and install
R without 'recommended' packages.  For example, post-installation,
Section 'Testing an Installation' suggests you can run (after first
running `make install-tests`):

cd tests
../bin/R CMD make check

but they fail the same way.  The passage continues: "... and other
useful targets are test-BasePackages and test-Recommended to run tests
of the standard and recommended packages (if installed) respectively."
(*).  So, to me that hints at 'recommended' packages are optional just
as they're "Priority: recommended".  Further down, there's also a
mentioning of:

$ R_LIBS_USER="" R --vanilla
> Sys.setenv(LC_COLLATE = "C", LC_TIME = "C", LANGUAGE = "en")
> tools::testInstalledPackages(scope = "base")

which also produces errors when 'recommended' packages are missing,
e.g. "Failed with error:  'there is no package called 'nlme'".

(*) BTW, '../bin/R CMD make test-BasePackages' gives "make: *** No
rule to make target 'test-BasePackages'.  Stop."

Thanks,

/Henrik

On Tue, May 4, 2021 at 12:22 PM Dirk Eddelbuettel  wrote:
>
>
> On 4 May 2021 at 11:25, Henrik Bengtsson wrote:
> | FWIW,
> |
> | $ ./configure --help
> | ...
> |   --with-recommended-packages
> |   use/install recommended R packages [yes]
>
> Of course. But look at the verb in your Subject: no optionality _in testing_ 
> there.
>
> You obviously need to be able to build R itself to then build the recommended
> packages you need for testing.
>
> Dirk
>
> --
> https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


  1   2   3   4   5   6   7   8   9   10   >