Re: [R-pkg-devel] Compile issues on r-devel-linux-x86_64-debian-clang with OpenMP

2024-05-23 Thread Ivan Krylov via R-package-devel
On Wed, 22 May 2024 09:18:13 -0500
Dirk Eddelbuettel  wrote:

> Testing via 'nm' as you show is possible but not exactly 'portable'.
> So any suggestions as to what to condition on here?

(My apologies if you already got an answer from Kurt. I think we're not
seeing his mails to the list.)

Perhaps take the configure test a bit further and try to dyn.load() the
resulting shared object? To be extra sure, call the function that uses
the OpenMP features? (Some weird systems may have lazy binding enabled,
making dyn.load() succeed but crashing the process on invocation of a
missing function.)

On GNU/Linux, the linker will happily leave undefined symbols in when
creating a shared library (unlike on, say, Windows, where extern void
foo(void); foo(); is a link-time error unless an object file or an
import library providing foo() is also present). When loading such a
library, the operation fails unless the missing symbols are already
present in the address space of the process (e.g. from a different
shared library).

A fresh process of R built without OpenMP support will neither link in
the OpenMP runtime while running SHLIB nor have the OpenMP runtime
loaded and so should successfully fail the test.

I also wouldn't call the entry point "main" just in case some future
compiler considers this a violation of the rules™ [*] and breaks the
code. extern "C" void configtest(int*) would be compatible with .C()
without having to talk to R's memory manager:

# The configure script:
cat > test-omp.cpp <<EOF
#include <omp.h>
extern "C" void configtest(int * arg) {
  *arg = omp_get_num_threads();
}
EOF
# Without the following you're relying on the GNU/Linux-like behaviour
# w.r.t. undefined symbols (see WRE 1.2.1.1):
cat > Makevars 

Re: [R-pkg-devel] [External] Re: Assistance Needed to Resolve CRAN Submission Note

2024-05-19 Thread Ivan Krylov via R-package-devel
On Sun, 19 May 2024 09:52:08 +
Daniel Kelley  wrote:

> In answer to the question about tidy version on macOS, I have the
> latest version of that OS (Sonoma 14.5 release 23F79 -- a beta
> release, unless the official has caught up in recent days) and I get
> as follows.

> $  ~ /usr/bin/tidy --version
> HTML Tidy for Mac OS X released on 31 October 2006 - Apple Inc. build
> 9576

Thank you for providing the output! R CMD check only knows about "Apple
Inc. build 2649" (not 9576) being old, which must be why the spurious
NOTEs appeared on Zeinab's computer.
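To illustrate why matching one exact build number is fragile, here is a rough shell sketch of a looser test. It is not the code R CMD check uses: the helper name tidy_is_old is hypothetical, the first sample string is the output quoted above joined into one line, and the second is only an assumed example of what a Tidy 5.x build might print.

```shell
# Hypothetical helper: succeed if a `tidy --version` line looks like the
# 2006-era Tidy shipped by Apple, whatever the build number is.
tidy_is_old() {
  case "$1" in
    *"Apple Inc. build "[0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

old='HTML Tidy for Mac OS X released on 31 October 2006 - Apple Inc. build 9576'
new='HTML Tidy for Apple macOS version 5.8.0'   # assumed sample output

for v in "$old" "$new"; do
  if tidy_is_old "$v"; then
    echo "old: $v"
  else
    echo "ok:  $v"
  fi
done
```

Matching on the "Apple Inc. build" prefix rather than on a single number would cover both 2649 and 9576, at the cost of trusting that newer Tidy releases never print that phrase.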

Submitted the updated patch at
.

> The second one is from the homebrew project, and that's
> what gets used by default on my machine.  I don't know which of these
> R would be using, but I could check that if required (and if provided
> a hint on how to invoke R to tell me).

I would expect Sys.which(Sys.getenv("R_TIDYCMD", "tidy")) to point to
the new version in /usr/local/bin/.
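For anyone who prefers to check from a terminal, the same lookup can be approximated in shell. This is only a sketch: the two fake tidy scripts and their temporary directories are hypothetical stand-ins for the Homebrew and system copies, and find_tidy is a made-up helper that mirrors the R expression above (R_TIDYCMD if set, otherwise the first tidy on the PATH).

```shell
# Two stand-in "tidy" executables in throwaway directories (hypothetical
# substitutes for /usr/local/bin/tidy and /usr/bin/tidy).
tmp=$(mktemp -d)
mkdir -p "$tmp/local/bin" "$tmp/usr/bin"
printf '#!/bin/sh\necho "new tidy"\n' > "$tmp/local/bin/tidy"
printf '#!/bin/sh\necho "old tidy"\n' > "$tmp/usr/bin/tidy"
chmod +x "$tmp/local/bin/tidy" "$tmp/usr/bin/tidy"

# Same order as the R expression: R_TIDYCMD if set, else "tidy" found via PATH.
find_tidy() {
  command -v "${R_TIDYCMD-tidy}"
}

PATH="$tmp/local/bin:$tmp/usr/bin:$PATH"
unset R_TIDYCMD
first=$(find_tidy)
echo "from PATH:      $first"

R_TIDYCMD="$tmp/usr/bin/tidy"
second=$(find_tidy)
echo "with R_TIDYCMD: $second"
```

The directory that comes first on the PATH wins unless R_TIDYCMD points somewhere else, which is why the Homebrew copy would normally be picked up.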

-- 
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] [External] Re: Assistance Needed to Resolve CRAN Submission Note

2024-05-18 Thread Ivan Krylov via R-package-devel
On Sat, 18 May 2024 21:10:18 +
"Richard M. Heiberger"  wrote:

> when checking a package and discovering these messages about html5,
> can you generate an informational message about tidy with a link to
> updating tidy?

That's a useful suggestion.

Would you mind testing the patch from
?

If you or someone else here has a computer running macOS, what exactly
does it print when running `tidy --version` (1) with an old version of
Tidy (that comes with macOS) and (2) with a new (>= 5) version of Tidy?

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Altrep header, MSVC, and STRUCT_SUBTYPES macro

2024-05-17 Thread Ivan Krylov via R-package-devel
On Thu, 16 May 2024 21:32:24 +0200
David Cortes wrote:

> Unfortunately, after some further testing, it seems this was just a
> matter of getting lucky - using the alternative non-STRUCT_SUBTYPES 
> def. of Altrep still leads to memory corruptions and crashes, just at
> different points than when using the STRUCT_SUBTYPES definition.

So much for the hope for an easy solution.
 
> May I ask: how would you go around getting R code into Godbolt?

Definitely not much of it. I was assuming that the problem was due to
passing structs by value (something that had been a problem for MSVC
compatibility more than a decade ago on x86 Windows), so I only
provided typedef struct SEXPREC *SEXP, one of the two definitions of
R_altrep_class_t and related macros, declared a number of functions
that would accept or return R_altrep_class_t by value, and tried to
call them.

I don't know a good way to use Godbolt for larger amounts of code.

> As far as I can tell, the altrep methods are calling the functions
> which they were assigned so at least the 'set' and 'dataptr' methods
> are working, but memory corruptions that crash the program happen
> after calling such altrep methods, particularly when there is a
> combination of 'R_UnwindProtect', C++ 'catch' that involves
> destructing variables before 'R_ContinueUnwind', and then 'Rf_error'.

What do the crashes look like? Is it heap corruption? Stack corruption?
Are they at least deterministic?

Can you reproduce the error by compiling one of the example ALTREP
classes [1] with MSVC, without C++ exception handling? What about using
R_ContinueUnwind() and/or C++ exceptions in some non-ALTREP code
compiled using MSVC?

Maybe if you run the code under Dr. Memory [2] or Application Verifier
[3], it'll detect the corruption slightly earlier and let you pinpoint
the problem? I'm assuming that there is no good way to link a sanitizer
into the process.

Can you eliminate C runtime incompatibility? Is there a chance that a
heap object allocated by the UCRT linked to R is freed by the CRT
linked to the MSVC-side library (or vice versa)?

-- 
Best regards,
Ivan

[1]
https://github.com/altrep-examples

[2]
https://drmemory.org/

[3]
https://learn.microsoft.com/en-us/windows-hardware/drivers/devtest/application-verifier



Re: [R-pkg-devel] Assistance Needed to Resolve CRAN Submission Note

2024-05-16 Thread Ivan Krylov via R-package-devel
On Thu, 16 May 2024 16:01:45 +
Zeinab Mashreghi wrote:

> checking HTML version of manual ... NOTE
> Found the following HTML validation problems:
> All.data.html:4:1 (All.data.Rd:10): Warning:  inserting "type"
> attribute
> All.data.html:12:1 (All.data.Rd:10): Warning: 

Re: [R-pkg-devel] Altrep header, MSVC, and STRUCT_SUBTYPES macro

2024-05-16 Thread Ivan Krylov via R-package-devel
On Wed, 15 May 2024 18:54:37 +0200
David Cortes wrote:

> The code compiles without errors under MSVC, but executing code that
> involves returning Altrep objects leads to segfaults and memory
> corruptions, even though it works fine under other compilers.
> 
> I see the R Altrep header has this section:
> #define STRUCT_SUBTYPES
> #ifdef STRUCT_SUBTYPES
> # define R_SEXP(x) (x).ptr
> # define R_SUBTYPE_INIT(x) { x }
>   typedef struct { SEXP ptr; } R_altrep_class_t;
> #else
> # define R_SEXP(x) ((SEXP) (x))
> # define R_SUBTYPE_INIT(x) (void *) (x)
>   typedef struct R_altcls *R_altrep_class_t;
> #endif

Interesting ABI incompatibility you've found. Can you show a minimal
example? I've tried playing with https://godbolt.org/ and passing
around values of type R_altrep_class_t between functions, but couldn't
convince "x64 msvc v19.latest" to generate different assembly no matter
whether R_altrep_class_t was a pointer or a struct containing a SEXP.

> If I manually edit the R header to remove the definition of
> 'STRUCT_SUBTYPES', leading to the second definition of
> 'R_altrep_class_t' being used, then things work as expected when the
> package is compiled with MSVC (no segfaults and no memory
> corruptions).

While it's hard to argue with results (I don't think it'll ever be
broken on x86_64 Windows), this workaround relies on undefined
behaviour and will only work as long as the ABI as understood by GCC
passes a structure with a pointer inside exactly the same way as the
ABI as understood by MSVC passes a bare pointer.

Isolating the MSVC-specific code as suggested by Vladimir should be
safer, but it's also important to find out where exactly the
incompatibility arises from. The GCC and MSVC parts still have to use a
common ABI to talk to each other.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Assistance Needed to Resolve CRAN Submission Note

2024-05-16 Thread Ivan Krylov via R-package-devel
Dear Zeinab,

Welcome to R-package-devel!

On Thu, 16 May 2024 03:22:56 +
Zeinab Mashreghi wrote:

> I recently submitted my R package to CRAN, and I received this note
> from the CRAN teams: "checking CRAN incoming feasibility ... NOTE."

Without a link to the full error log and, ideally, to the source code
of the package, it's impossible to help with such a NOTE, because the
check for "CRAN incoming feasibility" encompasses many tests. I was
lucky to fish your package from the archived queue and correlate it
with the publicly available logs, but it's not always this simple:

https://win-builder.r-project.org/incoming_pretest/bootsurv_0.0.0.9000_20240515_212834/

>> New submission

This is expected and will always result in a NOTE to flag the package
to a CRAN reviewer.

>> Version contains large components (0.0.0.9000)

The convention is to use version components like 9000 for pre-release,
untested versions of packages. Could you please use a version like
0.0.1 for the version of the package to be released on CRAN?

> Unknown, possibly misspelled, fields in DESCRIPTION:
>  ‘ImportFrom’ ‘Data’

'importFrom' is a NAMESPACE file directive [1]. The DESCRIPTION must
list 'Imports:' instead [2]. What did you intend the Data: field of
your DESCRIPTION to mean?

>> The Title field should be in title case. Current version is:
>> ‘Bootstrap Methods for complete (absence of missing values) survey
>> data’
>> In title case that is:
>> ‘Bootstrap Methods for Complete (Absence of Missing Values) Survey
>> Data’

This is yet another CRAN convention. You'll need to change the 'Title:'
field of the DESCRIPTION file.

>> * checking Rd line widths ... NOTE
>> Rd file 'boot.twostage.Rd':
>>  \examples lines wider than 100 characters:

Could you please wrap the lines of your \examples{} sections to 100
characters or less?
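A quick way to find the offending lines before resubmitting is sketched below. It is cruder than the real check: the hypothetical helper long_lines looks at every line of the file rather than only at the \examples section, and the Rd fragment is made up for the demonstration.

```shell
# Report lines longer than 100 characters, with their line numbers.
# Unlike R CMD check, this looks at the whole file, not only \examples.
long_lines() {
  awk 'length($0) > 100 { printf "%d: %d characters\n", NR, length($0) }' "$1"
}

# Hypothetical Rd fragment: line 2 is deliberately too wide.
rd=$(mktemp)
{
  printf '%s\n' '\examples{'
  printf 'z <- "%0120d"\n' 0
  printf '%s\n' 'y <- 1' '}'
} > "$rd"

report=$(long_lines "$rd")
echo "$report"
```

Running the helper over man/*.Rd after each edit shows whether any wide lines remain.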

>> * checking examples ... [21m/21m] NOTE
>> Examples with CPU (user + system) or elapsed time > 5s
>>                        user system  elapsed
>> boot.weights.stsrs 1242.735  0.492 1243.285
>> boot.twostage         9.799  0.048    9.847

The \examples{...} in your documentation are not only to be read by the
user. R CMD check runs them periodically on the CRAN servers. The user
should also be able to run example(boot.weights.stsrs) and see your
code directly in action. CRAN requires examples to run in 5 seconds or
less, both elapsed and CPU time. An example(...) that runs for 20
minutes is way too long. You'll need to find a way to reduce the time
spent in the example.

Wasn't something like this said in the e-mail you received from CRAN?

> When I run R CMD check on my device, I do not encounter any issues,

Even with R CMD check --as-cran?

-- 
Best regards,
Ivan

[1]
https://cran.r-project.org/doc/manuals/R-exts.html#Specifying-imports-and-exports

[2]
https://cran.r-project.org/doc/manuals/R-exts.html#The-DESCRIPTION-file



Re: [R-pkg-devel] An issue regarding the authors field in DESCRIPTION

2024-05-13 Thread Ivan Krylov via R-package-devel
On Mon, 13 May 2024 08:33:04 -0500
Ruwani Herath wrote:

> This is what I entered in DESCRIPTION field.
> 
> Authors@R: c(person(given = "Ruwani", family = "Herath", role =
> c("aut","cre"), email = "ruwanirasanja...@gmail.com"),
>person(given = "Leila", family = "Amiri",  role = "ctb"),
>  person(given = "Mahmoud", family = "Torabi", role =
> "ctb"))
> 
> Authors: Ruwani Herath [aut, cre],
>   Leila Amiri [ctb],
>   Mahmoud Torabi [ctb]
> Maintainer: Ruwani Herath 

R CMD build generates the fields "Author" and "Maintainer" from the
field "Authors@R", so the easiest way forward is to delete Authors: and
Maintainer: from your DESCRIPTION. Next time you run R CMD build, the
DESCRIPTION file inside the resulting *.tar.gz file will contain the
correct "Author" and "Maintainer" fields.
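Deleting the fields by hand is the simplest option. If you would rather script it, the following is a minimal sketch, assuming a DESCRIPTION laid out like the hypothetical one below; strip_generated_fields is a made-up helper, not part of R.

```shell
# Drop hand-written author fields (and their indented continuation lines),
# keeping Authors@R so that R CMD build can regenerate the rest.
strip_generated_fields() {
  awk '
    /^(Author|Authors|Maintainer):/ { skip = 1; next }  # field to drop
    /^[ \t]/ && skip { next }                            # continuation of a dropped field
    { skip = 0; print }                                  # everything else is kept
  ' "$1"
}

# Hypothetical DESCRIPTION used only for the demonstration.
desc=$(mktemp)
cat > "$desc" <<'EOF'
Package: demo
Authors@R: c(person(given = "A", family = "B", role = c("aut", "cre"),
    email = "a@example.org"))
Authors: A B [aut, cre],
  C D [ctb]
Maintainer: A B <a@example.org>
Version: 0.0.1
EOF

cleaned=$(strip_generated_fields "$desc")
printf '%s\n' "$cleaned"
```

The Authors@R line survives because the pattern requires a colon directly after the field name.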

-- 
Best regards,
Ivan



Re: [R-pkg-devel] clang-UBSAN

2024-05-13 Thread Ivan Krylov via R-package-devel
On Sun, 12 May 2024 14:43:18 -0400
Kaifeng Lu wrote:

> /data/gannet/ripley/R/test-clang/Rcpp/include/Rcpp/internal/caster.h:30:25:
> runtime error: nan is outside the range of representable values of
> type 'int'

On line 4618 of src/misc.cpp of the lrstat package, you have a
suspicious default parameter value:

>> const int n = NA_REAL

NA_REAL is a special kind of NaN, and C++ signed integers cannot
represent NaNs. You probably meant NA_INTEGER.

I think that Rcpp::traits::input_parameter takes care of
asking R to cast NA_REAL to NA_INTEGER, so this shouldn't directly
cause problems, but without a link to the code and the full error
report we have to resort to forbidden OSINT techniques [1], which don't
always work reliably and may attract the wrong kind of attention on the
darknet [2].

> Is there any way to reproduce the error before submitting the package
> to CRAN?

Yes.

If you use containers, try the rocker/r-devel-ubsan-clang [3] image
that should already contain a "sanitized" build of R produced with the
clang compiler.

If that doesn't help, start with a Fedora 36 installation and follow
the description [4] to install clang and compile R from source with
sanitizers enabled. This procedure is described in more detail in WRE
4.3.4 [5].

If you start having problems using the Docker/podman image or compiling
R from source, don't hesitate to ask further questions.

-- 
Best regards,
Ivan

[1]
Such as searching your name on CRAN and GitHub.

[2]
Such as Google suggesting AI-powered results.

[3]
https://rocker-project.org/images/

[4]
https://www.stats.ox.ac.uk/pub/bdr/memtests/README.txt

[5]
https://cran.r-project.org/doc/manuals/R-exts.html#Using-Undefined-Behaviour-Sanitizer



Re: [R-pkg-devel] Fast Matrix Serialization in R?

2024-05-10 Thread Ivan Krylov via R-package-devel
On Fri, 10 May 2024 15:12:17 +1200
Simon Urbanek  wrote:

> I wonder if it may be worth doing something a bit smarter and tag
> officially a "reverse XDR" format instead - that way it would be
> well-defined and could be made the default.

Do you mean changing R so that when reading a "B\n" serialized stream,
a format code read as 0x0200 or 0x0300 would mean regular
formats 2 or 3 but byte-swapped? That would be backwards-compatible,
and we probably weren't going to have >= 65536 format versions anyway...

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Overcoming CRAN's 5mb vendoring requirement

2024-05-09 Thread Ivan Krylov via R-package-devel
On Wed, 8 May 2024 16:01:23 -0400
Josiah Parry wrote:

>- I'll see if I can get the configure.ac to make the appropriate
> Rscript call for configure.win.
>   - I think the idea of having a single `confgure.ac` file to
> generate both configure and configure.win is nice. Guidance with
> GitHub actions and ChatGPT is essentially a must for me since my bash
> is remedial at best.

Then you might like Kevin Ushey's configure, which is like autoconf
redone in R. The only lines of bash are the system-specific bits in
{configure,cleanup}{.win,} that run the R scripts under tools/, and they
are already written for you.

Generating two system-specific configures from one configure.ac might
be possible - GNU m4 is very versatile - but to implement that, you
would have to program m4, which is even more niche than bash.

> The requirement to avoid GitHub feels surprisingly anachronistic
> given how central it is to the vast majority of software development.

I think that Ben Bolker's answer explains it very well. Part of the
goal of the CRAN archive is to be able to take a package, a
period-appropriate version of R and install the former on the latter.
The URL carrying the code must be able to survive as long. Unlike
Zenodo, GitHub's goal is not directly to provide storage forever, and
its current owners have a reputation [*] that could have played a part
in the requirement to avoid them.

I wonder if it would be ethical to use Archive.org for this.

In an ideal world, CRAN would be able to directly archive larger
software packages (just like PyPI is currently hosting more than a
terabyte of Tensorflow builds and a few terabytes more of other
GPU-related code [**]) without requiring the maintainers to swim
between the Scylla of vendoring the dependencies and the Charybdis of
making the build depend on an external URL, but that's a luxury someone
would have to pay for.

-- 
Best regards,
Ivan

[*]
https://stat.ethz.ch/pipermail/r-package-devel/2024q2/010708.html

[**]
https://discuss.python.org/t/what-to-do-about-gpus-and-the-built-distributions-that-support-them/7125/16



Re: [R-pkg-devel] flang doesn't support derived types

2024-05-09 Thread Ivan Krylov via R-package-devel
On Thu, 09 May 2024 15:31:25 +
Othman El Hammouchi wrote:

> Do I understand it correctly that there is no way to specify a
> Fortran standard in the SystemRequirements?

It's possible (and even recommended) to describe the Fortran version
requirement in SystemRequirements [1], but this field is for now mostly
informational. I think I remember efforts to standardise it, but they
are far from complete.

> I had resubmitted my package in the mean time with a configure script
> that aborts the install if the compiler does not support
> polymorphism, but I understand that this is a fruitless avenue for
> CRAN?

Signs point to yes, at least judging by a previous time we had
flang-related problems [2]. On the other hand, there were relatively
easy workarounds that time, and here I'm not seeing anything as simple.

> I should point out my local flang install is version 16, but I cannot
> install 18 on my system since it's in unstable (this again
> underscores the problem of developing under these constraints).

Would you consider containers for this purpose? I was able to reproduce
the problem relatively quickly by starting podman run --rm -it
debian:sid and installing flang-18 in there. (Unlike Docker a few years
ago, podman can be installed straight from the repository, at least on
Debian, and doesn't require adding users to special groups in order to
work. Maybe Docker has also improved.) I don't like containers as a
basis for software distribution, but I can't deny that they are great
at letting me quickly reproduce problems without installing 10
different GNU/Linux distros.

> What would you advise? And don't you think these Fortran constraints
> should be better documenten.

I'm afraid I don't have any more specific advice besides testing your
workarounds with Debian Sid in a container or a virtual machine or a
chroot. I can try to take a look at more concrete problems. I hope you
will be able to find a relatively painless workaround.

I do wish that flang-new were a better compiler or at least a
better-documented one, but instead of a list of features on their
website, I can only see "Getting Involved [3] for tips on how to get in
touch <...> and to learn more about the current status". There are only
so many projects one can get involved in.

-- 
Best regards,
Ivan

[1]
https://cran.r-project.org/doc/manuals/R-exts.html#Using-modern-Fortran-code

[2]
https://stat.ethz.ch/pipermail/r-package-devel/2023q4/010065.html

[3]
https://flang.llvm.org/docs/GettingInvolved.html



Re: [R-pkg-devel] flang doesn't support derived types

2024-05-09 Thread Ivan Krylov via R-package-devel
Dear Othman El Hammouchi,

Welcome to R-package-devel!

On Wed, 08 May 2024 16:52:51 +
Othman El Hammouchi wrote:

> However, upon submission I received an automatic reply shortly
> afterwards saying the build had failed on CRAN's servers for Debian.
> The log gives the following error:
> 
> flang/lib/Lower/CallInterface.cpp:949: not yet implemented: support
> for polymorphic types

Your use of contained procedures in class(t_mack_triangle) and
class(t_cl_res) signifies the derived types as being extensible and
thus potentially polymorphic. You'll have to replace class(...) with
type(...) and move the contained procedures out of the type definitions
(and maybe additionally make the types 'sequence' or 'bind(C)' to
signify them being non-extensible) to make the code work with flang-18.
I'm afraid this will also prevent you from defining destructors for
these types.

flang-new can be a very disappointing compiler at times [*], but it's
what people do use in the real world, especially for 64-bit ARM
processors, so in order to keep our packages portable, we have to cater
to its whims.

-- 
Best regards,
Ivan

[*] https://stat.ethz.ch/pipermail/r-package-devel/2023q4/009987.html



Re: [R-pkg-devel] Overcoming CRAN's 5mb vendoring requirement

2024-05-08 Thread Ivan Krylov via R-package-devel
On Wed, 8 May 2024 14:08:36 -0400
Josiah Parry wrote:

> With ChatGPT's ability to write autoconf, I *think *I have something
> that can work.

You don't have to write autoconf if your configure.ac is mostly a plain
shell script. You can write the configure script itself. Set the PATH
and then exec "${R_HOME}/bin/Rscript" tools/configure.R (in the
regular, non-multiarch configure for Unix-like systems) or exec
"${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe" tools/configure.R (in
configure.win, which you'll also need). You've already written the rest
of the code in a language you know well: R.
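Spelled out, the two wrapper scripts could look roughly like the sketch below. It writes them into a temporary directory and only syntax-checks them, since running them needs R and a tools/configure.R. The fallback to `R RHOME` when R_HOME is unset follows the general advice in Writing R Extensions; the PATH adjustment is left as a comment because it depends on your toolchain.

```shell
workdir=$(mktemp -d)

# configure: used on Unix-like systems.
cat > "$workdir/configure" <<'EOF'
#!/bin/sh
# R CMD INSTALL normally sets R_HOME; fall back to asking R if it is unset.
: "${R_HOME=$(R RHOME)}"
# (Adjust PATH here if tools/configure.R needs extra programs.)
# exec replaces the shell, so the R script's exit status becomes ours.
exec "${R_HOME}/bin/Rscript" tools/configure.R
EOF

# configure.win: used on Windows, where the binary sits in an arch subdirectory.
cat > "$workdir/configure.win" <<'EOF'
#!/bin/sh
exec "${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe" tools/configure.R
EOF

chmod +x "$workdir/configure" "$workdir/configure.win"
sh -n "$workdir/configure" && sh -n "$workdir/configure.win" && echo "syntax OK"
```

Everything else, including downloading and verifying the dependencies, then lives in tools/configure.R.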

Autoconf would be useful if you had system-specific dependencies with
the need to perform lots of compile tests. Those would have been a pain
to set up in R. Here you mostly need Sys.which() instead of
AC_CHECK_PROGS and command -v.

> The configure file runs tools/get-deps.R which will download the
> dependencies from the repo if available and verify the checksums.

One of the pain points is the need for a strong, cryptographically
secure hash. MD5 is, unfortunately, no longer such a hash. In a cmake
build, you would be able to use cmake's built-in strong hashes (such as
SHA-2 or SHA-3). The CRAN policy doesn't explicitly forbid MD5; it only
requires a "checksum". If you figure out a way to use a strong hash
from tools/configure.R for the downloaded tarball, please let us know.
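One possible direction, outside R, is to lean on a command-line SHA-256 tool. The sketch below assumes that either sha256sum or shasum is installed on the build machine, which is not guaranteed everywhere; the helper names and the throwaway file standing in for the vendor tarball are hypothetical.

```shell
# Print the SHA-256 digest of a file using whichever common tool exists.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d ' ' -f 1
  elif command -v shasum >/dev/null 2>&1; then
    shasum -a 256 "$1" | cut -d ' ' -f 1
  else
    echo "no SHA-256 tool found" >&2
    return 1
  fi
}

# Succeed only if the digest could be computed and equals the expected one.
verify_sha256() {
  actual=$(sha256_of "$1") || return 1
  [ "$actual" = "$2" ]
}

# Demonstration on a throwaway file standing in for the vendor tarball.
tarball=$(mktemp)
printf 'pretend this is vendor.tar.xz\n' > "$tarball"
expected=$(sha256_of "$tarball")   # in practice: a digest recorded in the package

verify_sha256 "$tarball" "$expected" && echo "checksum OK"
printf 'tampered\n' >> "$tarball"
verify_sha256 "$tarball" "$expected" || echo "checksum mismatch, aborting"
```

Calling such a tool from tools/configure.R would still leave the question of what to do on systems where neither utility is present.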

> If the checksums don't match, an error is thrown, otherwise it can
> continue. I believe this meets the requirements of CRAN?

The other important CRAN requirement is to store the vendor tarball
somewhere as permanent as CRAN itself (see the caveats at the bottom of
https://cran.r-project.org/web/packages/using_rust.html), that is, not
GitHub. I think that Zenodo counts as a sufficiently reliable store.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] package removed from CRAN

2024-05-08 Thread Ivan Krylov via R-package-devel
On Wed, 8 May 2024 17:30:46 +0200
"Jose V. Die Ramon" wrote:

> Could anyone please help me understand the reasons behind this, or
> suggest any steps I should take to resolve it?

Here's what I could find in
https://cran.r-project.org/src/contrib/PACKAGES.in:

>> X-CRAN-Comment: Archived on 2024-04-30 for policy violation.
>>  .
>>  On Internet access.  Also other errors.

So Avi is right, this is about the tests and/or examples failing
(possibly due to problems on the remote server).

If possible, try to emit errors with a special class set for
Internet-related errors. This will make it possible for your examples
and tests to catch them, as in:

tests/*.R:

tryCatch(
 {
  # code that accesses the Internet goes here
 },
 refseqR_internet_error = function(e)
  message("Caught Internet-related error")
)

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Cannot repro failing CRAN autochecks

2024-05-07 Thread Ivan Krylov via R-package-devel
On Tue, 7 May 2024 21:40:31 +0300
Ivan Krylov via R-package-devel wrote:

> It's too late for Makevars to exclude files from the source package
> tarball. Use .Rbuildignore instead:

Sorry, that was mostly misguided. .Rbuildignore won't help with the
contents of the Rust vendor tarball.

1. Can you omit the .cff file from src/rust/vendor.tar.xz when building
it?

2. I think that there is --exclude in both GNU tar and BSD tar. How
about tar --exclude="*.cff" -x -f rust/vendor.tar.xz ?

3. From
<https://win-builder.r-project.org/incoming_pretest/arcgisutils_0.3.0_20240507_194020/Debian/00install.out>,
it can be seen that the "clean" target does not get called. Can you
remove the *.cff file in the same Make target?
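The second suggestion can be tried out on a toy archive first. In this sketch the directory layout and file contents are made up, and the archive is left uncompressed so that the demonstration does not depend on xz being installed.

```shell
work=$(mktemp -d)
mkdir -p "$work/src/vendor/chrono/src" "$work/out"
echo 'cff-version: 1.2.0' > "$work/src/vendor/chrono/CITATION.cff"
echo '// stand-in source file' > "$work/src/vendor/chrono/src/lib.rs"

# Pack the hypothetical vendor tree.
tar -c -f "$work/vendor.tar" -C "$work/src" vendor

# Unpack everything except *.cff files.
tar --exclude="*.cff" -x -f "$work/vendor.tar" -C "$work/out"

# List what was actually extracted.
find "$work/out" -type f
```

Only lib.rs should be listed, so the file that triggers the NOTE never lands in the build directory.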

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Cannot repro failing CRAN autochecks

2024-05-07 Thread Ivan Krylov via R-package-devel
On Tue, 7 May 2024 14:03:42 -0400
Josiah Parry wrote:

> This NOTE does not appear in Ubuntu, Mac, or Windows checks
> https://github.com/R-ArcGIS/arcgisutils/actions/runs/8989812276/job/24693685840

That's a bit strange. It fires for me in a local R CMD check for a test
package even without --as-cran. The code performing the check has been
in R since ~2010.

> I've made an edit to the Makevars to specifically remove this
> directory, but it seems to continue to persist.

It's too late for Makevars to exclude files from the source package
tarball. Use .Rbuildignore instead:
https://cran.r-project.org/doc/manuals/R-exts.html#Building-binary-packages

I think that the line src/vendor/chrono/CITATION\.cff will prevent the
file from appearing in the package tarball.
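Since .Rbuildignore lines are regular expressions matched case-insensitively against paths relative to the package root, a candidate line can be sanity-checked from the shell. This is an approximation: grep -E stands in for R's Perl-like matching, the anchored pattern is my own variant written with a single backslash (the file is read as plain lines, not as R strings), and the second path is a hypothetical file that should stay in the tarball.

```shell
# Candidate .Rbuildignore entry, exactly as it would appear in the file.
pattern='^src/vendor/chrono/CITATION\.cff$'

# Succeed if the given relative path would be excluded by the pattern.
matches() {
  printf '%s\n' "$1" | grep -E -i -q -- "$pattern"
}

for path in src/vendor/chrono/CITATION.cff src/vendor/chrono/Cargo.toml; do
  if matches "$path"; then
    echo "excluded: $path"
  else
    echo "kept:     $path"
  fi
done
```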

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Trouble with dependencies on phyloseq and microViz

2024-05-07 Thread Ivan Krylov via R-package-devel
On Tue, 7 May 2024 10:07:59 +1200
Simon Urbanek  wrote:

> That doesn't work - additional repositories are not allowed on CRAN
> other than in very exceptional cases, because it means the package
> cannot be installed by users making it somewhat pointless.

I suppose that with(tools::CRAN_package_db(),
sum(!is.na(Additional_repositories)) / length(Additional_repositories))
= 0.7% does make it very rare. But not even for a weak dependency? Is
it for data packages only, as seems to be the focus of
[10.32614/RJ-2017-026]? The current wording of the CRAN policy makes it
sound like Additional_repositories is preferred to explaining the
non-mainstream weak dependencies in Description.

So what should be done about the non-Bioconductor weak dependency
microViz?

> As for the OP, can you post the name of the package and/or the link
> to the errors so I can have a look?

Sharon has since got rid of the WARNING and now only has NOTEs due to
microViz and a URL to its repo in the Description:
https://win-builder.r-project.org/incoming_pretest/HybridMicrobiomes_0.1.2_20240504_185748/Debian/00check.log

If Additional_repositories: is the correct way to specify a
non-mainstream weak dependency for a CRAN package, the URL must be
specified as https://david-barnett.r-universe.dev/src/contrib, not just
https://david-barnett.r-universe.dev/. I am sorry for not getting it
right the first time.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Trouble with dependencies on phyloseq and microViz

2024-05-04 Thread Ivan Krylov via R-package-devel
On Sat, 4 May 2024 15:53:25 +
Sharon Bewick wrote:

> I have a dependency on phyloseq, which is available through GitHub
> but not on the CRAN site. I have a similar problem with microViz,
> however I’ve managed to make it suggested, rather than required.
> There is no way to get around the phyloseq requirement. How do I fix
> this problem so that I can upload my package to the CRAN website?

Did a human reviewer tell you to get rid of the dependencies? There are
at least 444 packages on CRAN with strong dependencies on Bioconductor
packages, so your phyloseq dependency should work. In fact, 14 of them
depend on phyloseq.

What you need is an Additional_repositories field in your DESCRIPTION
specifying the source package repository where microViz could be
installed from. I think that

Additional_repositories: https://david-barnett.r-universe.dev

...should work.

Besides that, you'll need to increment the version and list the *.Rproj
file in .Rbuildignore:
https://win-builder.r-project.org/incoming_pretest/HybridMicrobiomes_0.1.1_20240504_173331/Debian/00check.log

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Urgent Review of R Packages in Light of Recent RDS Exploit

2024-05-04 Thread Ivan Krylov via R-package-devel
On Sat, 4 May 2024 08:09:28 +0200
Maciej Nasinski  wrote:

> What do you think about promoting containers?

Containers have an attack surface too, have user experience problems
(how's Docker on Windows?) and may bring in more third-party code than
what you're trying to protect against (whole operating system images!).
Even Firejail and Bubblewrap, containers specifically designed to
sandbox untrusted code, have bugs in their setup or implementation
every now and then.

Still, you are welcome to run third-party code in a virtual machine or
a container. It may not be everyone's favourite trade-off, but it is a
net increase in security over running untrusted code directly. Feel free to
search for a point on the Pareto optimal line between security and
convenience that you'll be comfortable with: https://xkcd.com/2044/

> Nowadays, containers are more accessible, with GitHub codespaces
> being more affordable (mostly free for students and the educational
> sector).

The GitHub-isation of the development process is kind of a
vulnerability too, or at the very least has a cost. I'm a few
handshakes away from several people who have been disappeared from
GitHub and couldn't get their accounts back. Microsoft is too big to
have real tech support [*], so once you fall foul of their AI
moderation systems, you'll have to be a Hacker News celebrity to
attract attention of a human on the inside.

I've got an ageing ThinkPad that I cannot afford to replace. It can
process all the data I've been gathering during my PhD and then some,
least squares, inverse problems, you name it, all while playing music
and having Quake I open. But the moment I try to launch Codespaces, it
downloads more bytes of JavaScript than the whole Quake I installation
takes in size, and then the browser overheats the laptop.

Maybe programming other people's computers in the browser is the
future, but then you need a fancy laptop and maybe a friend in
Microsoft just to be admitted into that future. A solution for some,
but not for all.

-- 
Best regards,
Ivan

[*] https://danluu.com/diseconomies-scale/

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Urgent Review of R Packages in Light of Recent RDS Exploit

2024-05-03 Thread Ivan Krylov via R-package-devel
On Fri, 3 May 2024 18:17:52 +0200
Maciej Nasinski  wrote:

> I found the https://github.com/hrbrmstr/rdaradar solution and ran it
> on the 100 most downloaded R packages.
> Happily, all data/inst rda files are safe/non-exposed to RDS exploit
> (using the linked solution).

This is a bit useful - knowing that there are no obvious exploits in
the 100 most downloaded CRAN packages is better than not knowing that -
but it is important to keep the big picture in mind. Bob himself said
that the script is "super basic". Currently, it only checks whether an
*.rda file, when loaded in the global environment, would shadow certain
important functions. This is not an attack a package author would
perform; this is something one would send directly to the victim.

In order to defeat an attacker, you must think like an attacker.

Here's someone jokingly describing how they would trojan the world's
online shop checkout systems if they wanted to commit financial crimes:
https://archive.ph/FCdBu
(With kindness and pull requests.)

Here's someone spending two years to plant a fake maintainer with a
backdoor in a key free software project:
https://lwn.net/Articles/967192/
(The backdoor was assembled from obfuscated "test files for the
decompressor".)

Here's the 2015 Underhanded C Contest, where people competed in writing
the most harmless-looking code that would instead do something
nefarious: http://www.underhanded-c.org/

On the one hand, hiding the bad functions in a data file (which is
compressed and binary) instead of the R files (which are plain text and
indexed everywhere) would be the obvious first step, so it may be
useful to flag data files with functions in them for human review.
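As a rough illustration of such flagging (a sketch only, not how
rdaradar works; the helper names contains_function() and flag_rda() are
made up here), one could load a data file into a scratch environment and
list the objects that carry a function somewhere in their list elements
or attributes. The sketch does not look inside environments or language
objects, and load() itself should only be run somewhere you can afford
to lose:

```r
# Hypothetical helper: TRUE if `x` is a function or has one nested in
# its list elements or attributes. Environments and calls are NOT searched.
contains_function <- function(x) {
  if (is.function(x)) return(TRUE)
  children <- attributes(x)
  if (is.list(x)) children <- c(children, unclass(x))
  for (child in children)
    if (contains_function(child)) return(TRUE)
  FALSE
}

# Hypothetical helper: names of the objects in an *.rda file that
# contain functions and so deserve a look by a human.
flag_rda <- function(path) {
  env <- new.env(parent = emptyenv())
  load(path, envir = env)
  nms <- ls(env, all.names = TRUE)
  flagged <- vapply(
    nms,
    function(n) contains_function(get(n, envir = env)),
    logical(1)
  )
  nms[flagged]
}
```

A clean result from a check like this says very little, for the reasons
given in the next paragraph.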

On the other hand, an evil package author has so many tools at their
disposal that they may not need this one in particular. There are CRAN
packages with tens of megabytes of compiled code inside. Sneaking a
little extra something in a file starting with "// This is generated
grammar parser. Do not edit!" followed by an impenetrable wall of C
could be easier and stay undetected for longer. How many packages use
Java? You don't even have to ship the Java source together with an R
package, so one of your *.jars could have a poisoned dependency with
nobody being the wiser.

Attackers are very cunning, and we don't even know what exactly we are
looking for. We can automate some of it, but the kind of code review
that will spot an evil function tucked 50 layers inside a giant
auxiliary data object is a lot of effort, hours to days per package.

> It will be great to run it on all CRAN packages, but I imagine we
> should be sure that the check is decent enough to not overload the
> servers without a need.

This probably counts as creating an unofficial CRAN mirror:
https://cran.r-project.org/mirror-howto.html

(I remember someone sending too many requests to download packages one
by one and losing access from a university address to CRAN as a result.)

You'll need 12.7 GB for the current versions of the packages or >400 GB
for the whole archive.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Urgent Review of R Packages in Light of Recent RDS Exploit

2024-05-03 Thread Ivan Krylov via R-package-devel
Dear Maciej Nasinski,

On Fri, 3 May 2024 11:37:57 +0200
Maciej Nasinski  wrote:

> I believe we must conduct a comprehensive review of all existing CRAN
> packages.

Why now? R packages are already code. You don't need poisoned RDS files
to wreak havoc using an R package.

On the other hand, R data files contain R objects, which contain code.
You don't need exploits to smuggle code inside an R object.

> Additionally, I will expect an introduction of an additional
> step in the R CMD check process.

What exactly would you like this step to be?

> It is stated that R Team is aware of
> that, and the exploit is fixed in R 4.4.0, but I can not find any
> clear bullet point in the NEWS file for 4.4.0
> (https://cran.r-project.org/doc/manuals/r-release/NEWS.html).

This has recently been discussed in the R-help thread:
https://stat.ethz.ch/pipermail/r-help/2024-May/479287.html

> I look forward to your thoughts and collaborating closely on this
> urgent review.

It may be worth teaching people that in general, R data files should be
as trusted as R code.

It may also be worth setting aside a strict subset of the R data format
to carry data only, without any executable code [*], but it may turn
out to be much less useful than it sounds. For example, you won't be
able to save many kinds of model objects using this plain data format,
which makes it unrealistic to require plain data only inside data files
in CRAN packages.

An independent review of the whole >20000 packages on CRAN for
malicious behaviour is a noble endeavour, but it will require people
and funding. Perhaps you could try to apply for an R Consortium
infrastructure grant to do that.

-- 
Best regards,
Ivan

[*] https://aitap.github.io/2024/05/02/unserialize.html#subset



Re: [R-pkg-devel] Extending proj with proj.line3d methods and overloading the methods

2024-04-28 Thread Ivan Krylov via R-package-devel
On Sun, 28 Apr 2024 15:15:06 +0000
Leo Mada wrote:

> This is why I intended to define a new method "proj.line3d" and
> overload this method. But it seems that R interprets "line3d.numeric"
> as a class - which originates probably from the "data.frame" class.

It may help to call the original 'proj' function and your new
'proj.line3d' function "generics", because that's what most S3
literature calls these functions that you overload. This separates them
from the "methods" 'proj.line3d.numeric' and 'proj.line3d.matrix' that
can be said to "implement" or "overload" the generic.

A concise but very readable guide to S3 and other built-in OOP systems
in R can be found in Advanced R by Hadley Wickham:
http://adv-r.had.co.nz/OO-essentials.html#s3

> How can I define a real method "proj.line3d"?

In order to export an S3 generic and register methods for it from a
package, you need the following directives in your NAMESPACE:

export(proj.line3d)
S3method(proj.line3d, numeric) # will use function proj.line3d.numeric
S3method(proj.line3d, matrix) # similar



> There might be some limitations from Roxygen as well (as I use it for
> the package); but it might be easier to proceed, once I understand
> how to do it in R.

The roxygen2 documentation says that if there are multiple dots in the
name of a function, you need to use the two-argument form of the
@method keyword: @method proj.line3d numeric (untested).


> I thought that this solves the problem:
> proj.line3d <- function(p, x, y, z, ...)
>   UseMethod("proj.line3d")

Right. This is the definition of an S3 generic in R. As long as
all the methods also accept the arguments (p, x, y, z, ...), all will be fine.
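To make this concrete, here is a minimal sketch of the generic with two
methods that share its argument list. The method bodies are
placeholders, and the roxygen2 tags follow the two-argument @method form
mentioned above (still untested against roxygen2 itself):

```r
#' @export
proj.line3d <- function(p, x, y, z, ...) UseMethod("proj.line3d")

#' @method proj.line3d numeric
#' @export
proj.line3d.numeric <- function(p, x, y, z, ...) "numeric method"

#' @method proj.line3d matrix
#' @export
proj.line3d.matrix <- function(p, x, y, z, ...) "matrix method"

# Dispatch happens on the (implicit) class of the first argument, `p`
proj.line3d(c(1, 2, 3))
proj.line3d(matrix(0, 2, 3))
```

When run at top level, the methods are found by name. Inside a package,
they additionally need the S3method() registrations in NAMESPACE.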

> The other solution, as you pointed out, is more cumbersome; and it
> needs 2 separate classes, so I would need to define "proj" as an S4
> class (as S3 does not handle 2 classes at once).

Moreover, you still need an exported generic proj.line3d and registered
methods for it to work. Inheritance does work in S3 (see NextMethod()),
but it alone won't help you call proj.line3d.numeric() from proj() and
'numeric' x.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Extending proj with proj.line3d methods and overloading the methods

2024-04-27 Thread Ivan Krylov via R-package-devel
On 27 April 2024 at 00:49:47 GMT+03:00, Leo Mada via R-package-devel
wrote:
>Dear List-Members,
>
>I try to implement a proj.line3d method and to overload this method as follows:
>
>proj.line3d <- function(p, x, y, z, ...)
>  UseMethod("proj.line3d")
>
>proj.line3d.numeric = function(p, x, y, z, ...) {
>  # ...
>}
>
>proj.line3d.matrix = function(p, x, y, z, ...) {
>  # ...
>}

>p = c(1,2,3)
>line = matrix(c(0,5,2,3,1,4), 2)
>proj.line3d(p, line)
>#  Error in UseMethod("proj.line3d") :
>#   no applicable method for 'proj.line3d' applied to an object of class 
>"c('double', 'numeric')"

>methods(proj)
># [1] proj.aov*   proj.aovlist*   proj.default*   proj.line3d
># [5] proj.line3d.matrix  proj.line3d.numeric proj.lm

In your NAMESPACE, you've registered methods for the generic function 'proj', 
classes 'line3d.matrix' and 'line3d.numeric', but above you are calling a 
different generic, 'proj.line3d', for which no methods are registered.

For proj.line3d(p, line) to work, you'll have to register the
methods for the proj.line3d generic. If you need a visible connection to the 
proj() generic, you can try registering a method on the 'proj' generic, class 
'line3d' *and* creating a class 'line3d' that would wrap your vectors and 
matrices:

proj(line3d(p), line) -> call lands in proj.line3d -> maybe additional dispatch 
on the remaining classes of 'p'?

This seems to work, but I haven't tested it extensively:

> proj.line3d <- \(x, ...) UseMethod('proj.line3d')
> proj.line3d.numeric <- \(x, ...) { message('proj.line3d.numeric'); x }
> line3d <- \(x) structure(x, class = c('line3d', class(x)))
> proj(line3d(pi))
proj.line3d.numeric
[1] 3.141593
attr(,"class")
[1] "line3d"  "numeric"

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Some, but not all vignettes compressed

2024-04-25 Thread Ivan Krylov via R-package-devel
On Thu, 25 Apr 2024 11:54:49 -0700
Bryan Hanson wrote:

> So my version of gs blows things up!

The relatively good news is that GhostScript is not solely to blame. A
fresh build of "GPL Ghostscript 10.03.0 (2024-03-06)" was able to
reduce the files to 16..70% of their original size on my computer. But
I just typed ./configure && make and relied on the dependencies already
present on my system.

We can try to compare the build settings (which will involve compiling
things by hand) or ask the Homebrew people [*] (and they will probably
ask for a PDF file and a specific command line that works on some
builds of gs-10.03.0 but not with Homebrew).

What would you rather do?

qpdf, on the other hand, results in no size reduction (99.7% or worse),
just like on your system.

-- 
Best regards,
Ivan

[*]
https://docs.brew.sh/Troubleshooting
https://github.com/Homebrew/homebrew-core/issues?q=ghostscript



Re: [R-pkg-devel] Some, but not all vignettes compressed

2024-04-25 Thread Ivan Krylov via R-package-devel
On Thu, 25 Apr 2024 08:54:41 -0700
Bryan Hanson wrote:

>   'gs+qpdf' made some significant size reductions:
>  compacted 'Vig_02_Conceptual_Intro_PCA.pdf' from 432Kb to 143Kb
>  compacted 'Vig_03_Step_By_Step_PCA.pdf' from 414Kb to 101Kb
>  compacted 'Vig_04_Scores_Loadings.pdf' from 334Kb to 78Kb
>  compacted 'Vig_06_Math_Behind_PCA.pdf' from 558Kb to 147Kb
>  compacted 'Vig_07_Functions_PCA.pdf' from 381Kb to 90Kb

I'm getting similar (but not identical) results on Debian Stable, gs 10.00.0
& qpdf 11.3.0:

# R CMD build --no-resave-data --compact-vignettes=both
compacted ‘Vig_01_Start_Here.pdf’ from 244Kb to 45Kb   
compacted ‘Vig_02_Conceptual_Intro_PCA.pdf’ from 432Kb to 143Kb
compacted ‘Vig_03_Step_By_Step_PCA.pdf’ from 411Kb to 100Kb
compacted ‘Vig_04_Scores_Loadings.pdf’ from 335Kb to 78Kb  
compacted ‘Vig_05_Visualizing_PCA_3D.pdf’ from 679Kb to 478Kb  
compacted ‘Vig_06_Math_Behind_PCA.pdf’ from 556Kb to 145Kb 
compacted ‘Vig_07_Functions_PCA.pdf’ from 378Kb to 89Kb
compacted ‘Vig_08_Notes.pdf’ from 239Kb to 39Kb

 
> - doc/Vig_01_Start_Here.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=49942)/(old=45101) = 1.10734 .. not worth using  
> - doc/Vig_02_Conceptual_Intro_PCA.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=1.00061e+07)/(old=442210) = 22.6275 .. not worth using  
> - doc/Vig_03_Step_By_Step_PCA.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=5.763e+06)/(old=423484) = 13.6085 .. not worth using  
> - doc/Vig_04_Scores_Loadings.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=5.41409e+06)/(old=341680) = 15.8455 .. not worth using  
> - doc/Vig_05_Visualizing_PCA_3D.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=1.23622e+07)/(old=692901) = 17.8412 .. not worth using  
> - doc/Vig_06_Math_Behind_PCA.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=816690)/(old=571493) = 1.42905 .. not worth using  
> - doc/Vig_07_Functions_PCA.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=1.36419e+06)/(old=389478) = 3.50262 .. not worth using  
> - doc/Vig_08_Notes.pdf:gs: res=0;  + qpdf: res=0; 
> ==> (new=40919)/(old=38953) = 1.05047 .. not worth using  

Thank you for providing this data! Somehow, instead of compacting the
PDFs, one of the tools manages to blow them up in size, as much as ~23
times.

Can you try tools::compactPDF() separately with gs_quality = 'none'
(isolating qpdf) and with qpdf = '' (isolating GhostScript)?
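Something along these lines should do, as a sketch: the helper names
are made up, "ebook" is my assumption for the GhostScript quality
setting, and the work happens on copies because compactPDF() rewrites
files in place. With verbose = TRUE you should get the same kind of
"(new=...)/(old=...)" lines as above, one tool at a time:

```r
# compactPDF() rewrites files in place, so work on copies.
copy_pdfs <- function(from) {
  to <- tempfile("pdfs")
  dir.create(to)
  file.copy(list.files(from, pattern = "[.]pdf$", full.names = TRUE), to)
  to
}

# qpdf only: gs_quality = "none" switches GhostScript off
only_qpdf <- function(dir)
  tools::compactPDF(copy_pdfs(dir), gs_quality = "none", verbose = TRUE)

# GhostScript only: an empty qpdf path disables qpdf
only_gs <- function(dir)
  tools::compactPDF(copy_pdfs(dir), qpdf = "", gs_quality = "ebook",
                    verbose = TRUE)

# Usage (the directory name is a placeholder):
# only_qpdf("inst/doc")
# only_gs("inst/doc")
```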

If the culprit turns out to be GhostScript, it may be due to their
rewritten PDF rendering engine (now in C instead of PostScript with
special extensions) not being up to par when the PDF file needs to be
compressed. If it turns out to be qpdf, we might have to extract the
exact command lines and compare results further.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] [External] Re: Package submission to CRAN not passing incoming checks

2024-04-24 Thread Ivan Krylov via R-package-devel
On Wed, 24 Apr 2024 00:17:28 +0000
"Petersen, Isaac T" wrote:

> I included the packages (including the raw package folders and their
> .tar.gz files) in the /inst/extdata folder.

Would you prefer your test to install them from the source directories
(as you currently do, in which case the *.tar.gz files can be omitted)
or the *.tar.gz files (in which case you can set the `repos` argument
to a file:/// URI and omit the package directories and the setwd()
calls)?

I think (but haven't tested) that the two problems that are currently
breaking your test are with .libPaths() and setwd().

.libPaths(temp_lib) overwrites the library paths with `temp_lib` and
the system libraries, the ones in %PROGRAMFILES%\R\R-*\library. In
particular, this removes %LOCALAPPDATA%\R\win-library\* from the list
of library paths, so the packages installed by the user (including
'waldo', which is needed by 'testthat') stop being available.

In order to add temp_lib to the list of the paths, use
.libPaths(c(temp_lib, .libPaths())).

Since setwd() returns the previous directory, one that was current
before setwd() was called, the code newpath <- setwd(filepath);
setwd(newpath) will keep the current directory, not set it to
`filepath`. Use oldpath <- setwd(filepath) instead.
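Put together, the two fixes could look like the following sketch. The
temporary directories stand in for your own paths, and the *_during
variables exist only so that the intermediate state can be inspected
afterwards:

```r
# 1. Prepend a temporary library instead of replacing the user's libraries.
temp_lib <- tempfile("lib")
dir.create(temp_lib)
old_libs <- .libPaths()
.libPaths(c(temp_lib, old_libs))
libs_during <- .libPaths()

# 2. setwd() returns the directory we came from; keep that value
#    in order to go back later.
work_dir <- tempfile("work")
dir.create(work_dir)
old_dir <- setwd(work_dir)
dir_during <- getwd()

# ... install and load the test packages here ...

# Clean up: restore the working directory and the library paths.
setwd(old_dir)
.libPaths(old_libs)
```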

Since you're already using 'testthat' and it already depends on
'withr', you may find it easier to use withr::local_dir(...) and
withr::local_temp_libpaths(...).

In order to test for a package being attached by load_or_install() (and
not just installed and loadable), check for 'package:testpackage1'
being present in the return value of search(). (This check is good
enough and much easier to write than comparing environments on the
search path with the package exports or comparing searchpaths() with
the paths under the temporary library.)
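A minimal sketch of that check (is_attached() is a name I made up):

```r
# TRUE if the package is on the search path, i.e. attached,
# not merely installed or loaded.
is_attached <- function(pkg) paste0("package:", pkg) %in% search()

is_attached("stats")          # normally attached in a fresh R session
is_attached("testpackage1")   # should only become TRUE after load_or_install()
```

In a testthat test this would presumably become something like
expect_true(is_attached("testpackage1")).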

Finally, I think that there is no need for the test_load_or_install()
call because I don't see the function being defined anywhere. Doesn't
test_that(...) run the tests by itself?

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Package submission to CRAN not passing incoming checks

2024-04-23 Thread Ivan Krylov via R-package-devel
Dear Isaac,

On Mon, 22 Apr 2024 17:00:27 +0000
"Petersen, Isaac T" wrote:

> This my first post--I read the posting guidelines, but my apologies
> in advance if I make a mistake.

Welcome to R-package-devel! You're doing just fine.

> 1) The first note <...> includes the contents of the LICENSE file

It's multiple NOTEs in a trench coat. Kasper has addressed the "large
version components" and the DOIs interpreted as file URIs, but there's
one more.

The '<license> + file LICENSE' syntax has two uses: (1)
for when the text of the license is a template, requiring the author
of the software to substitute some information (e.g. the year and the
copyright holder for MIT) and (2) for when a package puts additional
restrictions on the base license.

(Hmm. Only case (2) is currently described at
; case
(1) is only described inside the license files.)

The CRAN team has expressed a preference for the package authors not to
put 20000 twisty little copies of standard licenses, all slightly
different, inside their packages. Since you're not restricting CC BY
4.0, it's enough to say 'License: CC BY 4.0'. If you'd like a full copy
of the license text in your source code repository, that's fine, but
you'll need to list the file in .Rbuildignore:
https://cran.r-project.org/doc/manuals/R-exts.html#Building-package-tarballs

Speaking of the Creative Commons license: the choice of a license for
your code is obviously yours, but Creative Commons themselves recommend
against using their licenses for software:
.
I can't recommend you a license - that would be politically motivated
meddling in foreign affairs - but the lists linked by the CC FAQ and
Writing R Extensions section 1.1.2 should provide a good starting point.

> Here are the results from win-builder:
> https://win-builder.r-project.org/incoming_pretest/petersenlab_0.1.2-9033_20240415_212322/

There is one more NOTE:

>> * checking examples ... [437s/438s] NOTE
>> Examples with CPU (user + system) or elapsed time > 5s
>>                    user system elapsed
>> load_or_install 349.410 37.410 387.233
>> vwReg            35.199  0.379  35.606
 
The examples are not only for the user to read in the help page; they
are also for the user to run example(vwReg) and see your code in action
(and for R CMD check to see whether they crash, including regularly on
CRAN).

For vwReg, try reducing the number of regressions you are running
(since your dataset is mtcars, which is already very compact).

For load_or_install, we have the additional issue that running
example(load_or_install) modifies the contents of the R library and the
search path, which belong to the user. The CRAN policy forbids such
modifications: 

Examples in general should change as little of the global state of the
R session and the underlying computer as possible. I suggest wrapping
the example in \dontrun{} (since everything about load_or_install() is
about altering global state) and creating a test for the function in
tests/*.R.

The test should set up a new library under tempdir(), run
load_or_install(), check the outcomes (that the desired package is
attached, etc.) and clean up after itself. There's also the matter of
the package not failing without a connection to the Internet, which is
another CRAN policy requirement. You might have to bring a very small
test package in inst/extdata just for load_or_install() to install and
load it, so that R CMD check won't fail when running offline.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Old references in the Description file.

2024-04-11 Thread Ivan Krylov via R-package-devel
On Thu, 11 Apr 2024 11:57:00 +0000
Gabriel Constantino Blain wrote:

> The problem is that it is a paper from the 70's (Priestley and
> Taylor, 1972) and its DOI has very uncommon symbols, such as <>. The
> DOI is: 10.1175/1520-0493(1972)100<0081:OTAOSH>2.3.CO;2.

Since the R CMD check function responsible for locating and checking
the DOIs from the package metadata expects to see them URL-encoded, it
should be possible to percent-encode the angle brackets in your DOI
('<' as '%3C', '>' as '%3E') in order to generate the correct link.
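For what it's worth, the encoding step can be done in R itself. This is
a sketch: I haven't run the result through R CMD check, and wrapping it
in the usual doi: markup for DESCRIPTION is my assumption about how you
would use it:

```r
doi <- "10.1175/1520-0493(1972)100<0081:OTAOSH>2.3.CO;2"

# With the default reserved = FALSE, URLencode() leaves '/', ':', ';'
# and the parentheses alone but percent-encodes '<' and '>'.
encoded <- utils::URLencode(doi)
encoded

# The form to try in the Description field
sprintf("<doi:%s>", encoded)
```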

Another workaround is to generate a shortDOI that would redirect to the
same place as the original DOI:
https://shortdoi.org/10.1175/1520-0493(1972)100%3C0081:OTAOSH%3E2.3.CO;2
The resulting shortDOI should then work like the original DOI.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Question about CRAN submission resulting in 1 note

2024-04-10 Thread Ivan Krylov via R-package-devel
On Wed, 10 Apr 2024 14:11:53 +0000
Chris Knoll wrote:

> For "Package has VignetteBuilder field but no prebuilt vignette
> index", how would this be resolved?

The package at https://github.com/OHDSI/CirceR/ doesn't seem to have any
vignettes. Without vignettes, there's no need for VignetteBuilder:
knitr.

> For "Package has FOSS license, installs .class/.jar but has no 'java
> directory'':  This is custom code that I've written in Java plus has
> a few maven dependencies and I'm not sure if they are asking me to
> bundle the source code of all Java dependencies (that have classes in
> the jar file).   That could be hard to do, and was hoping if anyone
> had experience in this, is it enough to put into the Readme where
> such source code could be found?

Here's what the policy has to say:

>> For Java .class and .jar files, the sources should be in a top-level
>> java directory in the source package (or that directory should
>> explain how they can be obtained).



At the very least, XLConnect seems to be fine supplying just the
README. If it's not too much trouble, shipping your custom source code
(definitely not all of the maven dependencies) would be the kind thing
to do, I think. (Feel free to disregard this part if a more experienced
Java package developer says otherwise.)

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Linking Tutorial Site to CRAN Package site.

2024-04-07 Thread Ivan Krylov via R-package-devel
On Sat, 6 Apr 2024 18:27:24 +0000
"Ruff, Sergej" wrote:

> The CRAN site
> (https://cran.r-project.org/web/packages/RepeatedHighDim/index.html)
> has a "documentation" part with the reference pdf.
> 
> Can I link to our tutorial site (https://software.klausjung-lab.de/.)
> under documentation?

Since your tutorial is relatively short and contains R code intermixed
with the results of running it, it could make a great vignette.
Vignettes are linked on the CRAN page for a package right under the
PDF reference manual. For example, the BiocManager package has one
vignette: https://cran.r-project.org/package=BiocManager

Vignettes are a part of the package and their code is automatically
checked together with your examples. For the users of your package,
this will help keep the tutorial available (even if the website moves
in the future) and compatible with the current version of the package
(even if the package evolves and the tutorial website evolves together
with it).

R has built-in support for PDF vignettes via LaTeX using Sweave [*].
HTML vignettes can be much more accessible than PDF files, but there is
no built-in HTML vignette engine in R [**]. The 'markdown' package is
reasonably lightweight and has an HTML vignette engine. Markdown tries
to be a superset of HTML, so it should be possible to keep most of your
original HTML, including the styling, while rewriting the tutorial as
an executable vignette.

-- 
Best regards,
Ivan

[*]
https://cran.r-project.org/doc/manuals/R-exts.html#Writing-package-vignettes

[**]
It's possible to write a crude HTML vignette engine in ~100 lines of R
code, but we cannot expect every package author to do that.



Re: [R-pkg-devel] How to store large data to be used in an R package?

2024-03-25 Thread Ivan Krylov via R-package-devel
On Mon, 25 Mar 2024 11:12:57 +0100
Jairo Hidalgo Migueles wrote:

> Specifically, this data consists of regression and random forest
> models crucial for making predictions within our R package.

Apologies for asking a silly question, but is there a chance that these
models are large by accident (e.g. because an object references a large
environment containing multiple copies of the training dataset)? Or
are there really more than a million weights required to make
predictions?
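Here is a toy illustration of how this happens; the "model" and the
function names are invented for the example. A function stored inside
the object keeps the environment it was created in, and everything in
that environment gets serialized together with it:

```r
serialized_size <- function(x) length(serialize(x, NULL))

# The closure remembers the whole environment, training data included.
fit_bloated <- function() {
  training <- runif(1e5)
  slope <- mean(training)
  list(slope = slope, predict = function(x) slope * x)
}

# Same model, but the training data is dropped before returning.
fit_lean <- function() {
  training <- runif(1e5)
  slope <- mean(training)
  rm(training)
  list(slope = slope, predict = function(x) slope * x)
}

serialized_size(fit_bloated())  # roughly the size of `training`
serialized_size(fit_lean())     # a tiny fraction of that
```

Real model objects can pick up such baggage through formulas and terms
objects as well, so it may be worth checking where the bytes are before
splitting the package.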

> Initially, I attempted to save these models as internal data within
> the package. While this approach maintains functionality, it has led
> to a package size exceeding 20 MB. I'm concerned that this would
> complicate submitting the package to CRAN in the future.

The policy mentions the possibility of having a separate large
data-only package. Since CRAN strives to archive all package versions,
this data-only package will have to be updated as rarely as possible.
You will need to ask CRAN for approval.

If there is a significant amount of core functionality inside your
package that does *not* require the large data (so that it can still
be installed and used without the data), you can publish the data-only
package yourself (e.g. using the 'drat' package), put it in Suggests
and link to it in the Additional_repositories field of your DESCRIPTION.
Alternatively, you can publish the data on Zenodo and offer to download
it on first use. Make sure to (1) use tools::R_user_dir to determine
where to put the files, (2) only download the files after the user
explicitly agrees to it and (3) test as much of your package
functionality as possible without requiring the data to be downloaded.
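A sketch of the download-on-first-use approach follows. The package
name, the file name and the helper names are placeholders, and the
consent prompt is only one of many ways to ask:

```r
# Where the downloaded models would live for a package called "mypkg"
model_file <- function(pkg = "mypkg", file = "models.rds")
  file.path(tools::R_user_dir(pkg, which = "data"), file)

# Only prompt in interactive sessions; never download silently.
ask_user <- function()
  interactive() &&
    isTRUE(utils::askYesNo("Download the model files for this package?"))

get_models <- function(url, dest = model_file(), consent = ask_user) {
  if (!file.exists(dest)) {
    if (!consent())
      stop("The model files are required for this function but were not downloaded")
    dir.create(dirname(dest), recursive = TRUE, showWarnings = FALSE)
    utils::download.file(url, dest, mode = "wb")
  }
  readRDS(dest)
}
```

Taking `dest` and `consent` as arguments means that tests can point the
function at a small local file and never touch the network.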

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Request for assistance: error in installing on Debian (undefined symbol: omp_get_num_procs) and note in checking the HTML versions (no command 'tidy' found, package 'V8' unavailable

2024-03-22 Thread Ivan Krylov via R-package-devel
On Thu, 21 Mar 2024 18:32:59 +0000
Annaig De-Walsche wrote:

> If ever I condition the use of OpenMP directives, users will indeed
> be capable of installing the package, but they won't have access to a
> performant version of the code, as it necessitates the use of OpenMP.
> Is there a method to explicitly express that the use of OpenMP is
> highly encouraged?

I think the most practical method would be to produce a
packageStartupMessage() from the .onAttach function of your package if
you detect that the package has been compiled without OpenMP support:
https://cran.r-project.org/doc/manuals/R-exts.html#Load-hooks
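In R code this could look roughly as follows. Here has_openmp() is a
stand-in that I made up; in a real package it would call a small
compiled routine that reports whether _OPENMP was defined at build time:

```r
# Stand-in for the real thing, which would call into compiled code
has_openmp <- function() FALSE

.onAttach <- function(libname, pkgname) {
  if (!has_openmp())
    packageStartupMessage(
      pkgname, " was compiled without OpenMP support and will run ",
      "single-threaded. Reinstalling with an OpenMP-capable compiler ",
      "is highly encouraged."
    )
}
```

Unlike message(), this can be silenced by the user with
suppressPackageStartupMessages().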

> In practical, how to know from R code if OpenMP is present or not?

Your C code will have to detect it and provide this information to the
R code. WRE 1.6.4 says:

>> [C]heck carefully that you have followed the advice in the
>> subsection on OpenMP support [WRE 1.2.1.1]. In particular, any use
>> of OpenMP in C/C++ code will need to use
>> 
>>  #ifdef _OPENMP
>>  # include <omp.h>
>>  #endif



Similarly, any time you use #pragma omp ... or call
omp_set_num_threads(), it needs to be wrapped in #ifdef _OPENMP ...
#endif.

Additionally, it is important to make sure that during tests and
examples, your OpenMP code doesn't use more than two threads:
https://cran.r-project.org/web/packages/policies.html
This is in place because CRAN checks are run in parallel, and a package
that tries to helpfully use all of the processor cores would interfere
with other packages being checked at the same time.
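One way to stay on the safe side, sketched under the assumption that
your package lets the caller choose the number of threads: default to
two and never exceed OMP_THREAD_LIMIT (a standard OpenMP environment
variable) when it is set. The helper name n_threads() is made up:

```r
# Hypothetical helper: the thread count to pass on to the compiled code
n_threads <- function(requested = 2L) {
  limit <- suppressWarnings(as.integer(Sys.getenv("OMP_THREAD_LIMIT", NA)))
  if (is.na(limit) || limit < 1L) return(as.integer(requested))
  min(as.integer(requested), limit)
}
```

Examples and tests can then rely on the default, while users who want
all of their cores ask for them explicitly.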

>   [[alternative HTML version deleted]]

This mailing list removes HTML e-mails. If you compose your messages in
HTML, we only get the plain text version automatically prepared by your
mailer:
https://stat.ethz.ch/pipermail/r-package-devel/2024q1/010595.html

In order to preserve the content and the presentation of your messages,
it's best to compose them in plain text.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] help diagnosing win-builder failures

2024-03-17 Thread Ivan Krylov via R-package-devel
Hi,

This may need the help of Uwe Ligges to diagnose. I suspect this may be
related to the Windows machine having too much memory committed (as Uwe
has been able to pinpoint recently [*] about a package that failed to
compile some heavily templated C++), but there is not enough information
to give a conclusive diagnosis.

On Sun, 17 Mar 2024 14:01:33 -0400
Ben Bolker  wrote:

> 2. an ERROR running tests, where the output ends with a cryptic
> 
>Anova: ..
> 
> (please try to refrain from snarky comments about not using testthat
> ...)

Pardon my ignorance, but is it an option to upload a version of the
package that uses test_check(pkg, reporter=LocationReporter()) instead
of the summary reporter?

-- 
Best regards,
Ivan

[*] https://stat.ethz.ch/pipermail/r-package-devel/2024q1/010304.html



Re: [R-pkg-devel] Removing import(methods) stops exporting S4 "meta name"

2024-03-15 Thread Ivan Krylov via R-package-devel
On Thu, 14 Mar 2024 16:06:50 -0400
Duncan Murdoch  wrote:

> Error in xj[i] : invalid subscript type 'list'
> Calls: join_inner -> data.frame -> [ -> [.data.table -> [.data.frame
> Execution halted

And here's how it happens:

join_inner calls xi[yi,on=by,nomatch=0] on data.tables xi and yi.

`[.data.table` calls cedta() to determine whether the calling
environment is data.table-aware. If the import of `.__T__[:base` is
removed, cedta() returns FALSE.

`[.data.table` then forwards the call to `[.data.frame`, which cannot
handle data.table-style subsetting.

This is warned about in
;
the 'do' package should have set the .datatable.aware = TRUE marker in
its environment. In fact, example(join_inner) doesn't raise an error
with the following changes when running with data.table commit f92aee69
(i.e. pre-#6001):

diff -rU2 do/NAMESPACE do_2.0.0.0.2/NAMESPACE
--- do/NAMESPACE  2021-08-03 12:37:00.000000000 +0300
+++ do_2.0.0.0.2/NAMESPACE  2024-03-15 14:01:10.588561222 +0300
@@ -130,5 +130,4 @@
 export(upper.dir)
 export(write_xlsx)
-importFrom(data.table,`.__T__[:base`)
 importFrom(methods,as)
 importFrom(reshape2,melt)
diff -rU2 do/R/join.R do_2.0.0.0.2/R/join.R
--- do/R/join.R 2020-06-30 06:47:22.000000000 +0300
+++ do_2.0.0.0.2/R/join.R   2024-03-15 13:54:02.289440613 +0300
@@ -1,2 +1,4 @@
+.datatable.aware = TRUE
+
 #' @title Join two dataframes together
 #' @description Join two dataframes by the same id column.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] confusion over spellchecking

2024-03-13 Thread Ivan Krylov via R-package-devel
On Sun, 10 Mar 2024 13:55:43 -0400
Ben Bolker wrote:

> I am working on a package and can't seem to get rid of a NOTE about
> 
> Possibly misspelled words in DESCRIPTION:
>glmmTMB (10:88)
>lme (10:82)
> 
> on win-builder.

Do you have these words anywhere else in the package (e.g. in the Rd
files)? It turns out that R has a special environment variable that
makes it ignore custom dictionaries specifically for DESCRIPTION:

>>## Allow providing package defaults but make this controllable via
>>##   _R_ASPELL_USE_DEFAULTS_FOR_PACKAGE_DESCRIPTION_
>>## to safeguard against possible mis-use for CRAN incoming checks.

I cannot see it used anywhere under the trunk/CRAN subdirectory in the
developer.r-project.org Subversion repo, but it could be set somewhere
else on Win-Builder.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Submission after archived version

2024-03-13 Thread Ivan Krylov via R-package-devel
On Mon, 11 Mar 2024 23:45:13 +0100
Nils Mechtel  wrote:

> Despite R CMD check not giving any errors or warnings, the package
> doesn’t pass the pre-tests:

If your question was more about the reasons for the difference between
your R CMD check and the pre-tests, most of it is due to --as-cran:

(Using commit ffe216d from https://github.com/nilsmechtel/MetAlyzer as
the basis for the example, which seems to be different from the
incoming pretest from the link you've shared.)

$ R-devel CMD check MetAlyzer_1.0.0.tar.gz
<...>
Status: OK  
$ R-devel CMD check --as-cran MetAlyzer_1.0.0.tar.gz
<...>
* checking for non-standard things in the check directory ... NOTE
Found the following files/directories: ‘metabolomics_data.csv’
<...>

It's less wasteful to run checks without --as-cran in CI (as you
currently do), but you need to perform additional testing before making
a release. The incoming pre-tests use a custom set of environment
variables that go a bit further than just --as-cran:
https://svn.r-project.org/R-dev-web/trunk/CRAN/QA/Kurt/lib/R/Scripts/check_CRAN_incoming.R

In particular, _R_CHECK_CRAN_INCOMING_USE_ASPELL_=true enables the
check for words that are possibly misspelled:

(Using an extra environment variable because your package has been
already published and R filters out "misspellings" found in the CRAN
version of the package. Congratulations!)

$ env \
 _R_CHECK_CRAN_INCOMING_ASPELL_RECHECK_MAYBE_=FALSE \
 _R_CHECK_CRAN_INCOMING_USE_ASPELL_=true \
 R-devel CMD check --as-cran MetAlyzer_1.0.0.tar.gz
<...>
Possibly misspelled words in DESCRIPTION:
  metabolomics (15:78)
<...>

Yet another way to avoid false misspellings is to create a custom
dictionary:
http://dirk.eddelbuettel.com/blog/2017/08/10/#008_aspell_cran_incoming

$ mkdir -p .aspell
$ echo '
 Rd_files <- vignettes <- R_files <- description <- list(
  encoding = "UTF-8",
  language = "en",
  dictionaries = c("en_stats", "dictionary")
 )
' > .aspell/defaults.R
$ R -q -s -e '
 saveRDS(c(
  "metabolomics" # , extra words go here
 ), file.path(".aspell", "dictionary.rds"))
'
$ R CMD build .
$ env \
 _R_CHECK_CRAN_INCOMING_ASPELL_RECHECK_MAYBE_=FALSE \
 _R_CHECK_CRAN_INCOMING_USE_ASPELL_=true \
 R-devel CMD check --as-cran MetAlyzer_1.0.0.tar.gz
# No more "Possibly misspelled words in DESCRIPTION"!

Some day, this will be documented in Writing R Extensions, or maybe in
R Internals (where the other _R_CHECK_* variables are documented), or
perhaps in the CRAN policy. See also:
https://stat.ethz.ch/pipermail/r-package-devel/2024q1/010558.html
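
The same dictionary setup can also be sketched from inside an R
session, in case that is more convenient than the shell. This is only
an illustration: the package directory below is a temporary stand-in
(not a real package) and the word list is just an example.

```r
# Hypothetical package root; replace with the real source directory.
pkg <- file.path(tempdir(), "examplepkg")
aspell_dir <- file.path(pkg, ".aspell")
dir.create(aspell_dir, recursive = TRUE, showWarnings = FALSE)

# defaults.R names the dictionaries to use for each kind of file.
writeLines(c(
 'Rd_files <- vignettes <- R_files <- description <- list(',
 ' encoding = "UTF-8",',
 ' language = "en",',
 ' dictionaries = c("en_stats", "dictionary")',
 ')'
), file.path(aspell_dir, "defaults.R"))

# dictionary.rds is a plain character vector of accepted words.
words <- c("metabolomics")
saveRDS(words, file.path(aspell_dir, "dictionary.rds"))

# Read both files back to confirm that they are well-formed.
defaults <- new.env()
sys.source(file.path(aspell_dir, "defaults.R"), envir = defaults)
dict <- readRDS(file.path(aspell_dir, "dictionary.rds"))
```

After this, R CMD build and the check with the environment variables
shown above would be run against the real package directory.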

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Submission after archived version

2024-03-12 Thread Ivan Krylov via R-package-devel
On Mon, 11 Mar 2024 23:45:13 +0100
Nils Mechtel  wrote:

> Debian:
> 
> Status: 3 NOTEs

>> * checking CRAN incoming feasibility ... [4s/6s] NOTE

>> Possibly misspelled words in DESCRIPTION:
>>  metabolomics (36:78)

This one can be explained in the submission comment. The rest of the
NOTE is to be expected.

>> * checking DESCRIPTION meta-information ... NOTE
>> Author field differs from that derived from Authors@R

Just remove the Author: field from your DESCRIPTION and let R CMD build
automatically generate it from Authors@R.
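
To see roughly what R derives from Authors@R, the field can be
evaluated by hand. The person below is made up for illustration, and
format() is only an approximation of the text that R CMD build writes
into the Author field:

```r
# Hypothetical Authors@R value, as it could appear in DESCRIPTION.
authors_at_r <- 'person("Jane", "Doe", email = "jane@example.org",
                        role = c("aut", "cre"))'

# The field is R code; evaluating it yields a "person" object.
authors <- eval(parse(text = authors_at_r))

# A plain-text rendering without the e-mail address.
author_text <- format(authors, include = c("given", "family", "role"))
print(author_text)
```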

>> * checking for non-standard things in the check directory ... NOTE
>> Found the following files/directories:
>>  ‘metabolomics_data.csv’

Make sure that when your tests and examples create files, they do so in
the session temp directory and then remove the files afterwards. If a
user had a valuable file named metabolomics_data.csv in the current
directory, ran example(...) and had it overwritten as a result, they
would be very unhappy.
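
A minimal sketch of that pattern for an example or a test (the data
frame and the file name prefix are invented for illustration):

```r
# Write the output under the session temp directory, not the
# current working directory.
out <- tempfile(pattern = "metabolomics_data_", fileext = ".csv")
write.csv(
 data.frame(id = 1:3, value = c(0.1, 0.2, 0.3)),
 out, row.names = FALSE
)

# Use the file as the example requires...
back <- read.csv(out)

# ...and remove it afterwards so that nothing is left behind.
unlink(out)
```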

The NOTEs on Windows are similar.

Good luck!

-- 
Best regards,
Ivan



Re: [R-pkg-devel] [EXTERN] Re: [EXTERN] Re: [EXTERN] Re: @doctype is deprecated. need help for r package documentation

2024-03-12 Thread Ivan Krylov via R-package-devel
On Mon, 11 Mar 2024 14:57:58 +
"Ruff, Sergej"  wrote:

> I uploaded the old version of the package to my repo:
> https://github.com/SergejRuff/boot

After installing this tarball, running RStudio and typing:

library(bootGSEA)
?bootGSEA

...I see the help page in RStudio's help tab, not in the browser. I
think this is the expected behaviour for RStudio.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] RFC: C backtraces for R CMD check via just-in-time debugging

2024-03-11 Thread Ivan Krylov via R-package-devel
Vladimir,

Thank you for the example and for sharing the ideas regarding
symbol-relative offsets!

On Thu, 7 Mar 2024 09:38:18 -0500 (EST)
Vladimir Dergachev  wrote:

>  unw_get_reg(, UNW_REG_IP, );

Is it ever possible for unw_get_reg() to fail (return non-zero) for
UNW_REG_IP? The documentation isn't clear about this. Then
again, if the process is so damaged it cannot even read the instruction
pointer from its own stack frame, any attempts at self-debugging must
be doomed.

>* this should work as a package, but I am not sure whether the
> offsets between package symbols and R symbols would be static or not.

Since package shared objects are mmap()ed into the address space and
(at least on Linux with ASLR enabled) mmap()s are supposed to be made
unpredictable, this offset ends up not being static. On Linux, R seems
to be normally built as a position-independent executable, so no matter
whether there is a libR.so, both the R base address and the package
shared object base address are randomised:

$ cat ex.c
#include <stddef.h>
#include <R.h>
void addr_diff(void) {
 ptrdiff_t diff = (char*)&addr_diff - (char*)&Rprintf;
 Rprintf("self - Rprintf = %td\n", diff);
}
$ R CMD SHLIB ex.c
$ R-dynamic -q -s -e 'dyn.load("ex.so"); .C("addr_diff");'
self - Rprintf = -9900928
$ R-dynamic -q -s -e 'dyn.load("ex.so"); .C("addr_diff");'
self - Rprintf = -15561600
$ R-static -q -s -e 'dyn.load("ex.so"); .C("addr_diff");'
self - Rprintf = 45537907472976
$ R-static -q -s -e 'dyn.load("ex.so"); .C("addr_diff");'
self - Rprintf = 46527711447632

>* R ought to know where packages are loaded, we might want to be
> clever and print out information on which package contains which
> function, or there might be identical R_init_RMVL() printouts.

That's true. Information on all registered symbols is available from
getLoadedDLLs().
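
As a rough sketch of what is available from the R side (this lists
names, paths and registered routines, not load addresses, so it would
only be one part of the symbolisation):

```r
dlls <- getLoadedDLLs()

# One row per loaded shared object: its name and its path on disk.
# Note the use of [[ ]]: the $ operator on a DLLInfo object looks up
# native symbols instead of list elements.
dll_table <- data.frame(
 name = vapply(dlls, function(d) d[["name"]], ""),
 path = vapply(dlls, function(d) d[["path"]], ""),
 row.names = NULL
)

# Registered routines of one DLL, here the one belonging to 'stats'.
loadNamespace("stats")
routines <- getDLLRegisteredRoutines("stats")
```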

-- 
Best regards,
Ivan



Re: [R-pkg-devel] [EXTERN] Re: [EXTERN] Re: @doctype is deprecated. need help for r package documentation

2024-03-07 Thread Ivan Krylov via R-package-devel
On Thu, 7 Mar 2024 20:27:29 +
"Ruff, Sergej"  wrote:

> I am refering to Rstudio. I checked the settings and type is set to
> "htlm", not text. And I was wondering why the package documentation
> opened in a browser when I used @doctype.

Do you still have the source package .tar.gz file for which ?bootGSEA
would start a browser from inside RStudio?

-- 
Best regards,
Ivan



Re: [R-pkg-devel] @doctype is deprecated. need help for r package documentation

2024-03-07 Thread Ivan Krylov via R-package-devel
On Thu, 7 Mar 2024 10:37:51 +
"Ruff, Sergej"  wrote:

> I noticed that when I try _?bootGSEA_ it goes to the help page in R
> itself but not to the html page

That's up to the user to choose. help(bootGSEA, help_type = 'html')
should get you to the HTML documentation; help(bootGSEA, help_type =
'text') should give you plain text. The default depends on
options(help_type=...). On Windows, you get a choice during
installation of R; this gets recorded in file.path(R.home('etc'),
'Rprofile.site').
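
A short sketch of how the option and the argument interact. It uses
help("mean") as a stand-in topic and only inspects the returned
object instead of displaying the page:

```r
# The session-wide default; may be NULL if nothing has set it.
getOption("help_type")

# An explicit argument overrides the option for a single call.
h_html <- help("mean", help_type = "html")

# Changing the option affects subsequent calls without the argument.
old <- options(help_type = "text")
h_text <- help("mean")
options(old)

# The chosen type is recorded on the returned object.
c(attr(h_html, "type"), attr(h_text, "type"))
```

Printing h_html or h_text is what actually opens the browser or the
pager.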

-- 
Best regards,
Ivan



Re: [R-pkg-devel] RFC: C backtraces for R CMD check via just-in-time debugging

2024-03-07 Thread Ivan Krylov via R-package-devel
On Tue, 5 Mar 2024 18:26:28 -0500 (EST)
Vladimir Dergachev  wrote:

> I use libunwind in my programs, works quite well, and simple to use.
> 
> Happy to share the code if there is interest..

Do you mean that you use libunwind in signal handlers? An example on
how to produce a backtrace without calling any async-signal-unsafe
functions would indeed be greatly useful.

Speaking of shared objects injected using LD_PRELOAD, I've experimented
some more, and I think that none of them would work with R without
additional adjustments. They install their signal handler very soon
after the process starts up, and later, when R initialises, it
installs its own signal handler, overwriting the previous one. For this
scheme to work, either R would have to cooperate, remembering a pointer
to the previous signal handler and calling it at some point (which
sounds unsafe), or the injected shared object would have to override
sigaction() and call R's signal handler from its own (which sounds
extremely unsafe).

Without that, if we want C-level backtraces, we either need to patch R
to produce them (using backtrace() and limiting this to glibc systems
or using libunwind and paying the dependency cost) or to use a debugger.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] [External] [External] RcmdrPlugin.HH_1.1-48.tar.gz

2024-03-07 Thread Ivan Krylov via R-package-devel
On Wed, 6 Mar 2024 13:46:55 -0500
Duncan Murdoch  wrote:

> is this just a more or less harmless error, thinking that 
> the dot needs escaping

I think it's this one. You are absolutely right that the dot doesn't
need escaping in either TRE (which is what's used inside exportPattern)
or PCRE. In PCRE, this regular expression would have worked as intended:

# We do match backslashes by mistake.
grepl('[\\.]', '\\')
# [1] TRUE

# In PCRE, this wouldn't have been a mistake.
grepl('[\\.]', c('\\', '.'), perl = TRUE)
# [1] FALSE TRUE

-- 
Best regards,
Ivan



Re: [R-pkg-devel] RcmdrPlugin.HH_1.1-48.tar.gz

2024-03-05 Thread Ivan Krylov via R-package-devel
On Tue, 5 Mar 2024 22:41:32 +
"Richard M. Heiberger"  wrote:

>  Undocumented code objects:
>'.__global__'
>  All user-level objects in a package should have documentation
> entries. See chapter 'Writing R documentation files' in the 'Writing R
>  Extensions' manual.

This object is not here for the user of the package. If you don't
export it, there will be no WARNING about it being undocumented. This
variable is exported because of exportPattern(".") in the file
NAMESPACE. The lone dot is a regular expression that matches any name
of an R object.

If you don't want to manually list your exports in the NAMESPACE file
(which can get tedious) or generate it (which takes additional
dependencies and build steps), you can use exportPattern('^[^\\.]') to
export everything except objects with a name starting with a period:
https://cran.r-project.org/doc/manuals/R-exts.html#Specifying-imports-and-exports
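
The difference between the two patterns can be checked with grepl(),
which uses the same default regular expression engine. The object
names below are made up, except for '.__global__':

```r
objs <- c(".__global__", ".hidden", "visible", "also.visible")

# A lone dot matches any non-empty name, so everything is exported.
all_match <- grepl(".", objs)

# Only names that do not start with a period are exported.
no_dot <- grepl("^[^\\.]", objs)

data.frame(objs, all_match, no_dot)
```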

-- 
Best regards,
Ivan



Re: [R-pkg-devel] RFC: C backtraces for R CMD check via just-in-time debugging

2024-03-03 Thread Ivan Krylov via R-package-devel
On Sun, 3 Mar 2024 19:19:43 -0800
Kevin Ushey  wrote:

> Would libSegFault be useful here?

Glad to know it has been moved to
 and not
just removed altogether after the upstream commit
.

libSegFault is safer than, say, libsegfault [*] because it both
supports SA_ONSTACK (for when a SIGSEGV is caused by stack overflow)
and avoids functions like snprintf() (which depend on the locale code,
which may have been the source of the crash). The only correctness
problem that may still be unaddressed is potential memory allocations
in backtrace() when it loads libgcc on first use. That should be easy
to fix by calling backtrace() once in segfault_init(). Unfortunately,
libSegFault is limited to glibc systems, so a different solution will
be needed on Windows, macOS and Linux systems with the musl libc.

Google-owned "backward" [**] tries to do most of this right, but (1) is
designed to be compiled together with C++ programs, not injected into
unrelated processes and (2) will exit the process if it survives
raise(signum), which will interfere with both rJava (judging by the
number of Java-related SIGSEGVs I saw while running R CMD check) and R's
own stack overflow survival attempts.

-- 
Best regards,
Ivan

[*] https://github.com/stass/libsegfault
(Which doesn't compile out of the box on GNU/Linux due to missing
pthread_np.h, although that should be easy to patch.)

[**] https://github.com/bombela/backward-cpp



[R-pkg-devel] RFC: C backtraces for R CMD check via just-in-time debugging

2024-03-03 Thread Ivan Krylov via R-package-devel
Hello,

This may be of interest to people who run lots of R CMD checks and have
to deal with resulting crashes in compiled code.

Every now and then, the CRAN checks surface a particularly nasty crash.
The R-level traceback stops in the compiled code. It's not obvious
where exactly the crash happens. Naturally, this never happened on the
maintainer's computer before and, in fact, is hard to reproduce.

Containers would help, but they cannot solve the problem completely.
Some problems only surface when there are more than 32 logical
processors, or during certain times of day. It may help to at least see
the location of the crash as it happens on the computer running the
check.

One way to provide that would be to run a special debugger that does
nothing most of the time, attaches to child threads and processes, and
produces backtraces when processes receive a crashing signal. There is
such a debugger for Windows [1], and there is now a proof of concept
for amd64 Linux [2]. 

I've just tried [2] on a 250-package reverse dependency check and saw a
lot of SIGSEGVs with rcx=cafebabe or Java in the backtrace, but
other than that, it seems to work fine. Do you think it's worth
developing further?

The major downside of using a debugger like this is a noticeable change
in the environment: [v]fork(), clone() and exec() become slower,
attaching another tracer becomes impossible, SIGSEGVs may become much
slower (although I do hope that most software I rely upon doesn't care
about SIGSEGVs per second). On the other hand, these wrappers are as
transparent as they get and don't even need R -d to pass the arguments
to the child process.

The other way to provide C-level backtraces is a post-mortem debugger
(registered via the AeDebug registry key on Windows or
kernel.core_pattern sysctl on Linux). This avoids interference with the
process environment during normal execution, but requires more
integration work to collect the crash dumps, process them into usable
backtraces and associate with the R CMD check runs. There are also
injectable DLLs like libbacktrace, but these have to interfere with the
process from the inside, which may be worse than ptrace() in terms of
observable environment changes. On glibc systems (but not musl, macOS,
Windows), R's SIGSEGV handler could be enhanced to call
backtrace_symbols_fd(), which should be safe (no malloc()) as long as
libgcc is preloaded.

Is adding C-level backtraces to R CMD checks worth the effort? Could it
be a good idea to add this on CRAN? If yes, how can I help?

-- 
Best regards,
Ivan

[1] , see "catchsegv"

[2] https://codeberg.org/aitap/tracecrash



Re: [R-pkg-devel] Additional issues: Intel segfault

2024-03-01 Thread Ivan Krylov via R-package-devel
On Sat, 2 Mar 2024 02:07:47 +
Murray Efford  wrote:

> Gabor suggested https://github.com/r-hub/rhub2 and that worked like a
> charm. A check there on the Intel platform found no errors in my
> present version of secrdesign, so I'll resubmit with confidence.

Thank you for letting me know! Having this as a container simplifies a
lot of things.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Additional issues: Intel segfault

2024-03-01 Thread Ivan Krylov via R-package-devel
On Fri, 1 Mar 2024 07:42:01 +
Murray Efford  wrote:

> R CMD check suggests it is most likely in the Examples for
> 'validate', but all code there is wrapped in \dontrun{}.

The crash happens after q('no'), suggesting a corruption in the heap or
in the R memory manager. At least it's a null pointer being
dereferenced and not a 0xRANDOM_LOOKING_NUMBER: this limits the impact
of the problem.

I don't know if anyone created an easily reproducible container with an
Intel build of R (there's https://hub.docker.com/r/intel/oneapi, but
aren't the compilers themselves supposed to be not redistributable?),
so you will most likely have to follow
https://www.stats.ox.ac.uk/pub/bdr/Intel/README.txt and
https://cran.r-project.org/doc/manuals/r-devel/R-admin.html#Intel-compilers
manually, compiling R using Intel compilers yourself in order to
reproduce this.

I think it would be great if CRAN checking machines used a just-in-time
debugger to provide C-level backtraces at the place of the crash. For
Windows, such a utility does exist [*], but I recently learned that the
glibc `catchsegv` program (and most other similar programs) used to
perform shared object preloading (before being thrown out of the
codebase altogether), which is more intrusive than it could be. A proof
of concept using GDB on Linux can be shown to work:

R -d gdb \
 --debugger-args='-batch -ex run -ex bt -ex c -ex q' \
 -e '
  Rcpp::sourceCpp(code =
   "//[[Rcpp::export]]\nvoid rip() { *(double*)(42) = 42; }"
  ); rip()
 '

-- 
Best regards,
Ivan

[*] https://github.com/jrfonseca/drmingw



Re: [R-pkg-devel] Unexpected multi-core CPU usage in package tests

2024-02-28 Thread Ivan Krylov via R-package-devel
On Tue, 27 Feb 2024 11:14:19 +
Jon Clayden  wrote:

> My testing route is to install the packages within the
> 'rocker/r-devel' Docker container, which is Debian-based, then use
> 'time' to evaluate CPU usage. Note that, even though 'RNifti' does not
> use OpenMP, setting OMP_NUM_THREADS changes its CPU usage

I think that's because rocker/r-devel uses parallel OpenBLAS:

$ podman run --rm -it docker.io/rocker/r-devel \
 R -q -s -e 'sessionInfo()' | grep -A1 BLAS
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.24.so;  
LAPACK version 3.11.0

The incoming CRAN check machine either sets the BLAS parallelism to 1
or uses a non-parallel BLAS. With rocker/r-devel, you can run R with
the environment variable OPENBLAS_NUM_THREADS set to 1. It's been
effective in the past to run R -d gdb and set a breakpoint on
pthread_create before launching the test. (In theory, it may be
required to set a breakpoint on every system call that may be used to
create threads, including various variations of clone(), subject to
variations between operating systems, but pthread_create has been
enough for me so far.)
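
From inside R, a quick way to see which BLAS and LAPACK are in use
and whether the variable is visible to the process. This is only a
sketch for inspection; the variable itself should be set in the
environment before R is started:

```r
# Path of the BLAS library, if R could determine it ("" otherwise).
blas <- extSoftVersion()[["BLAS"]]

# Path of the LAPACK library in use.
lapack <- La_library()

# NA here means that the variable is not set for this process.
threads <- Sys.getenv("OPENBLAS_NUM_THREADS", unset = NA)

print(c(BLAS = blas, LAPACK = lapack, OPENBLAS_NUM_THREADS = threads))
```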

With OPENBLAS_NUM_THREADS=1, I'm only seeing OpenMP threads created by
the mmand package during tests for your package tractor.base, and the
latest commit (that temporarily disables testing of mmand) doesn't hit
the breakpoint or raise any NOTEs at all.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN Package Check Note: Warning: trimming empty

2024-02-24 Thread Ivan Krylov via R-package-devel
On Fri, 23 Feb 2024 17:04:39 +
Sunmee Kim  wrote:

> Version: 1.0.4
> Check: HTML version of manual
> Result: NOTE

This may not be immediately obvious in the e-mail from CRAN, but I
think this is a reminder of a warning from the previous version of the
package. Haven't you just uploaded version 1.0.5? I'm not getting any
warnings for gesca_1.0.5.tar.gz from the /incoming/archive subdirectory
on the CRAN FTP server, except perhaps "This build time stamp is over a
month old", and the latest check looks almost clean in the same manner:
https://win-builder.r-project.org/incoming_pretest/gesca_1.0.5_20240223_172938/

What does the rest of the e-mail say?

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Conversion failure in 'mbcsToSbcs'

2024-02-21 Thread Ivan Krylov
On Wed, 21 Feb 2024 12:29:02 +
Package Maintainer  wrote:

> Error: processing vignette 'ggenealogy.Rnw' failed with diagnostics:
>  chunk 58 (label = plotCBText)

In order to use the non-standard graphics device, the chunk must
set the option fig=TRUE. Otherwise, when something calls
graphics::strwidth('Lubomír Kubáček', "inches"), R notices that no
graphics device is active and creates a default one, which happens to
be pdf() and has all these problems. With fig=TRUE, Sweave will
initialise the cairo_pdf() device first, and then graphics::strwidth()
will use the existing device, avoiding the error.
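
The implicit device creation can be observed outside Sweave too. In
this sketch the default device is temporarily pointed at a pdf() file
in the temp directory (an assumption made here to avoid leaving
Rplots.pdf in the working directory), and an ASCII string is measured
so that no conversion is involved:

```r
# Start with no open graphics devices.
graphics.off()
before <- names(dev.cur())

# Make the default device a pdf() writing to a temporary file.
old <- options(device = function(...)
 grDevices::pdf(file = tempfile(fileext = ".pdf")))

# Measuring text needs a device, so R opens the default one.
w <- graphics::strwidth("abc", "inches")
after <- names(dev.cur())

# Close the device and restore the previous default.
dev.off()
options(old)

c(before = before, after = after)
```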

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Conversion failure in 'mbcsToSbcs'

2024-02-15 Thread Ivan Krylov
On Mon, 12 Feb 2024 16:01:27 +
Package Maintainer  wrote:

> Unfortunately, I received a reply from the CRAN submission team
> stating that my vignette file is still obtaining the "mbcsToSbcs"
> ERROR as is shown here
> (https://win-builder.r-project.org/incoming_pretest/ggenealogy_1.0.3_20240212_152455/Debian/00check.log).

I am sorry for leading you down the wrong way with my advice. It turns
out that no 8-bit Type-1 encoding known to pdf() can represent both
'Lubomír Kubáček' and 'Anders Ågren':

lapply(
 setNames(nm = c(
  'latin1', 'cp1252', 'latin2', 'latin7',
  'latin-9', 'CP1250', 'CP1257'
 )), function(enc)
  iconv(enc2utf8(c(
   'Lubomír Kubáček', 'Anders Ågren'
  )), 'UTF-8', enc, toRaw = TRUE)
) |> sapply(lengths)
# one of the two strings cannot be represented, returning a NULL:
#  latin1 cp1252 latin2 latin7 latin-9 CP1250 CP1257
# [1,]  0  0 15  0   0 15  0
# [2,] 12 12  0 12  12  0 12

While it may still be possible to give extra parameters to pdf() to use
a font encoding that covers all the relevant characters, it seems
easier to switch to cairo_pdf() for your multi-lingual plots. Place the
following somewhere in the beginning of the vignette:

<<>>=
my.Swd <- function(name, width, height, ...)
 grDevices::cairo_pdf(
  filename = paste(name, "pdf", sep = "."),
  width = width, height = height
 )
@
\SweaveOpts{grdevice=my.Swd,pdf=FALSE}

This should define a new plot device function for Sweave, one that
handles more Unicode characters correctly.

> PS: Thanks for the advice about plain text mode. Hopefully, I have
> correctly abide by that advice in this current email.

This e-mail arrived in plain text, thank you!

-- 
Best regards,
Ivan



Re: [R-pkg-devel] failing CRAN checks due to problems with dependencies

2024-02-08 Thread Ivan Krylov via R-package-devel
On Wed, 7 Feb 2024 08:40:44 -0600
Marcin Jurek  wrote:

> Packages required but not available: 'Rcpp', 'FNN',
> 'RcppArmadillo' Packages suggested but not available for checking:
> 'fields', 'rmarkdown', 'testthat', 'maptools'

One of the machines running the incoming checks was having problems. If
you followed the failing dependency chain by looking at the CRAN check
results of the packages described as "not available", you could
eventually find a package needing compilation (Rcpp or stringi or
something else), look at the installation log and see Make trying to
run commands that are completely wrong.

It looked like the path to the compiler was empty:
https://web.archive.org/web/20240208191430/https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-clang/Rcpp-00install.html

I think that the problems are solved now, so it should be safe to
increment the version and submit it again.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] r-oldrel-linux- not in CRAN checks?

2024-02-06 Thread Ivan Krylov via R-package-devel
On Tue, 6 Feb 2024 18:27:32 +0100
Vincent van Hees  wrote:

> For details see:
> https://github.com/RfastOfficial/Rfast/issues/99

GitHub processed your plain text description of the problem as if it
was Markdown and among other things ate the text that used to be there
between angle brackets:

> #include
>  ^~~

By digging through the raw source code of the issue at
https://api.github.com/repos/RfastOfficial/Rfast/issues/99 it is
possible to find out which header was missing for Rfast:

> ../inst/include/Rfast/parallel.h:20:10:fatal error: execution: No such
> file or directory
> #include <execution>
>  ^~~
>compilation terminated.

Indeed, <execution> is a C++17 header [1]. While g++ version
7.5.0-3ubuntu1~18.04 seems to accept --std=c++17 without complaint, its
libstdc++-7-dev package is missing this header. Moreover, there's still
no <execution> in libstdc++-8-dev. I think that you need libstdc++-9
for that to work, which is not in Bionic; older versions aren't
C++17-compliant enough to compile Rfast, and C++17 is listed in the
SystemRequirements of the package.

Installing clang-10 and editing Makeconf to use clang++-10 instead of
g++ seems to let the compilation proceed. In order to successfully link
the resulting shared object, I also had to edit Makeconf to specify
-L/usr/lib/gcc/x86_64-linux-gnu/7 when linking -lgfortran.

If you plan to use this in production, be very careful. I don't know
about binary compatibility guarantees between g++-7 and clang++-10, so
you might have to recompile every C++-using R package from source with
clang++-10 in order to avoid hard-to-debug problems when using them
together. (It might also work fine. That's the worst thing about such
problems.)

-- 
Best regards,
Ivan

[1] https://en.cppreference.com/w/cpp/header/execution



Re: [R-pkg-devel] new maintainer for CRAN package XML

2024-02-05 Thread Ivan Krylov via R-package-devel
Dear Uwe Ligges,

On Mon, 22 Jan 2024 15:50:44 +0100
Uwe Ligges  wrote:

> So we are looking for a person volunteering to take over 'XML'.
> Please let us know if you are interested.

Unless someone else has been discussing this with CRAN in private or
had a package depending on XML and was planning to step up but forgot,
I would like to volunteer.

I'm assuming that the Omegahat page is best preserved in its current
form for historical reasons, so instead I have prepared a Git
repository and a page with an option to file issues on the Codeberg
forge: https://codeberg.org/aitap/XML

With the help of the amazing list members, I have also set up a virtual
machine to run the reverse dependency checks, so it should be possible
to avoid immediate breakage if I have to make any changes.

That's the theory, at least.

(Also, thank you for your reply to my question!)

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Bioconductor reverse dependency checks for a CRAN package

2024-02-05 Thread Ivan Krylov via R-package-devel
Thank you Georgi Boshnakov, Ben Bolker, and Diego Hernangómez Herrero
for introducing me to `revdepcheck`!

On Tue, 30 Jan 2024 12:38:57 -0500
Ben Bolker  wrote:

> I have had a few issues with it 
>  but overall it's
> been very helpful.

Indeed that looks perplexing. Writable .Library can also cause problems
for people running R-svn built in their home directories without
R_LIBS_USER set when they check their packages without Suggests.
I'm also relying on .Library.site for the dependencies of the reverse
dependencies. So far, my setup seems to be working as intended, but I'll
keep this issue in mind.

On Tue, 30 Jan 2024 18:57:41 +0100
Diego Hernangómez Herrero  wrote:

> Haven’t tried with a package with such an amount of revdeps, but my
> approach is revdepcheck in GH actions and commiting the result to the
> repo (that is somehow similar to the docker approach if you host the
> package in GitHub).

Great to know that reverse dependency checks can run in CI! I think
I'll keep a stateful virtual machine for now, because otherwise I would
need to find space for 4 to 32 gigabytes of cache somewhere (or download
everything from the repository mirrors every time).

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Bioconductor reverse dependency checks for a CRAN package

2024-02-05 Thread Ivan Krylov via R-package-devel
On Tue, 30 Jan 2024 16:24:40 +
Martin Morgan  wrote:

> BiocManager (the recommended way to install Bioconductor packages) at
> the end of the day does essentially install.packages(repos =
> BiocManager::repositories()), ensuring that the right versions of
> Bioconductor packages are installed for the version of R in use.

That's great to know, thanks! I think I will use BiocManager::install
for now, both because it uses the correct repositories and because it
doesn't forcibly reinstall the packages I am asking for. With bspm, I
can run BiocManager::install(all_the_dependencies) and have the system
perform the least amount of work required to reach the desired state.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Bioconductor reverse dependency checks for a CRAN package

2024-02-05 Thread Ivan Krylov via R-package-devel
Dear Dirk,

Thank you very much for your help here and over on GitHub!

I have finally managed to get the reverse dependency checks working. It
took some additional disk space and a few more system dependencies. If
not for r2u, I would have been stuck for much longer. I really
appreciate the work that went into packaging all these R packages.

On Tue, 30 Jan 2024 10:32:36 -0600
Dirk Eddelbuettel  wrote:

> For what it is worth, my own go-to for many years has been a VM in
> which I install 'all packages needed' for the rev.dep to be checked.

This approach seems to be working for me, too. I had initially hoped to
set something up using CI infrastructure, but there are too many
dependencies to install in a prepare step and it's too much work to
make a container image with all dependencies anew every time I want to
run a reverse dependency check. Easier to just let it run overnight on
a spare computer.

> Well a few of us maintain packages with quite a tail and cope. Rcpp
> has 2700, RcppArmadillo have over 100, BH a few hundred. These aren't
> 'light'.

Maintaining a top-5 CRAN package by in-degree rank
[10.32614/RJ-2023-060] is indeed a very serious responsibility. 

> I wrote myself the `prrd` package (on CRAN) for this, others have
> other tools -- Team data.table managed to release 1.5.0 to CRAN today
> too. So this clearly is possible.

I'll check out `prrd` next, thanks. tools::check_packages_in_dir is
nice, but it could be faster if I could disable mc.preschedule. 

-- 
Best regards,
Ivan

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


[R-pkg-devel] Bioconductor reverse dependency checks for a CRAN package

2024-01-30 Thread Ivan Krylov via R-package-devel
Hello R-package-devel,

What would you recommend in order to run reverse dependency checks for
a package with 182 direct strong dependencies from CRAN and 66 from
Bioconductor (plus 3 more from annotations and experiments)?

Without extra environment variables, R CMD check requires the Suggested
packages to be available, which means installing...

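library(tools)
# revdep: character vector naming the reverse dependencies to check
# everything the reverse dependencies need, including Suggests
revdepdep <- package_dependencies(revdep, which = 'most')
# plus the recursive strong dependencies of all of the above
revdeprest <- package_dependencies(
 unique(unlist(revdepdep)),
 which = 'strong', recursive = TRUE
)
# count the packages to install, excluding those that ship with R
length(setdiff(
 unlist(c(revdepdep, revdeprest)),
 unlist(standard_package_names())
))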

...up to 1316 packages. 7 of these suggested packages aren't on CRAN or
Bioconductor (because they've been archived or have always lived on
GitHub), but even if I filter those out, it's not easy. Some of the
Bioconductor dependencies are large; I now have multiple gigabytes of
genome fragments and mass spectra, but also a 500-megabyte arrow.so in
my library. As long as a data package declares a dependency on your
package, it still has to be installed and checked, right?

Manually installing the SystemRequirements is no fun at all, so I've
tried the rocker/r2u container. It got me most of the way there, but
there were a few remaining packages with newer versions on CRAN. For
these, I had to install the system packages manually in order to build
them from source.

Someone told me to try the rocker/r-base container together with pak.
It was more proactive at telling me about dependency conflicts and
would have got me most of the way there too, except it somehow got me a
'stringi' binary without the corresponding libicu*.so*, which stopped
the installation process. Again, nothing that a bit of manual work
wouldn't fix, but I don't feel comfortable setting this up on a CI
system. (Not on every commit, of course - that would be extremely
wasteful - but it would be nice if it was possible to run these checks
before release on a different computer and spot more problems this way.)

I can't help but notice that neither install.packages() nor pak() is
the recommended way to install Bioconductor packages. Could that
introduce additional problems with checking the reverse dependencies?

Then there's the check_packages_in_dir() function itself. Its behaviour
about the reverse dependencies is not very helpful: they are removed
altogether or at least moved away. Something may be wrong with my CRAN
mirror, because some of the downloaded reverse dependencies come out
with a size of zero and subsequently fail the check very quickly.

I am thinking of keeping a separate persistent library with all the
1316 dependencies required to check the reverse dependencies and a
persistent directory with the reverse dependencies themselves. Instead
of using the reverse=... argument, I'm thinking of using the following
scheme:

1. Use package_dependencies() to determine the list of packages to test.
2. Use download.packages() to download the latest version of everything
if it doesn't already exist. Retry if the download produced zero-sized
or otherwise damaged tarballs. Remove old versions of packages if a
newer version exists.
3. Run check_packages_in_dir() on the whole directory with the
downloaded reverse dependencies.

For this to work, I need a way to run step (3) twice, ensuring that one
of the runs is performed with the CRAN version of the package in the
library and the other one is performed with the to-be-released version
of the package in the library. Has anyone already come up with an
automated way to do that?
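
The "retry" condition in step 2 could be detected along these lines. This is only a sketch: find_bad_tarballs is a hypothetical helper name, and it assumes that an empty file or a failed gzip integrity test is a good enough sign of a damaged download.

```shell
# find_bad_tarballs DIR: print every *.tar.gz in DIR that is empty
# or fails the gzip integrity test (hypothetical helper).
find_bad_tarballs() {
  for f in "$1"/*.tar.gz; do
    [ -e "$f" ] || continue   # the glob matched nothing
    if [ ! -s "$f" ] || ! gzip -t "$f" 2>/dev/null; then
      printf '%s\n' "$f"
    fi
  done
}
```

Anything it prints would be a candidate for deleting and downloading again before running check_packages_in_dir().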

No wonder nobody wants to maintain the XML package.

-- 
Best regards,
Ivan


Re: [R-pkg-devel] Possible malware(?) in a vignette

2024-01-28 Thread Ivan Krylov via R-package-devel
There used to be a long analysis in the draft of this e-mail [*], but
let me cut to the chase.

Even something as simple as replacing the four-byte comment [**] at the
beginning of the file ("%\xd0\xd4\xc5\xd8" -> "%    "), which keeps the
file fully readable (!), results in the same behaviour but zero
detections:

$ sha256sum d_jss_paper*.pdf
0ae3b229fdd763a0571463dc98e02010752bb0213a672db6826afcd72ccaf291  d_jss_paper1.pdf
9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9  d_jss_paper.pdf
$ diff -u <(hd d_jss_paper.pdf) <(hd d_jss_paper1.pdf)
--- /dev/fd/63  2024-01-28 13:00:43.454419322 +0300
+++ /dev/fd/62  2024-01-28 13:00:43.454419322 +0300
@@ -1,4 +1,4 @@
-00000000  25 50 44 46 2d 31 2e 35  0a 25 d0 d4 c5 d8 0a 37  |%PDF-1.5.%.....7|
+00000000  25 50 44 46 2d 31 2e 35  0a 25 20 20 20 20 0a 37  |%PDF-1.5.%    .7|
 00000010  37 20 30 20 6f 62 6a 0a  3c 3c 0a 2f 4c 65 6e 67  |7 0 obj.<<./Leng|
 00000020  74 68 20 32 36 32 38 20  20 20 20 20 20 0a 2f 46  |th 2628      ./F|
 00000030  69 6c 74 65 72 20 2f 46  6c 61 74 65 44 65 63 6f  |ilter /FlateDeco|

https://www.virustotal.com/gui/file/0ae3b229fdd763a0571463dc98e02010752bb0213a672db6826afcd72ccaf291
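
For anyone who wants to repeat the byte replacement, here is a minimal sketch. It works on a small hypothetical stand-in file (demo.pdf), not the real vignette, and assumes dd, od and sha256sum are available:

```shell
# Build a stand-in file that starts like the vignette:
# "%PDF-1.5", newline, "%", bytes d0 d4 c5 d8, newline.
workdir=$(mktemp -d)
printf '%%PDF-1.5\n%%\320\324\305\330\n77 0 obj\n' > "$workdir/demo.pdf"
cp "$workdir/demo.pdf" "$workdir/demo1.pdf"

# Offsets 0..9 hold "%PDF-1.5\n%", so the comment payload starts at
# offset 10. Overwrite exactly four bytes with ASCII spaces in place.
printf '    ' | dd of="$workdir/demo1.pdf" bs=1 seek=10 conv=notrunc 2>/dev/null

# The size is unchanged, but the hash is not.
size_before=$(wc -c < "$workdir/demo.pdf")
size_after=$(wc -c < "$workdir/demo1.pdf")
hash_before=$(sha256sum "$workdir/demo.pdf" | cut -d' ' -f1)
hash_after=$(sha256sum "$workdir/demo1.pdf" | cut -d' ' -f1)

# Read the patched bytes back as hex to confirm what was written.
patched_bytes=$(dd if="$workdir/demo1.pdf" bs=1 skip=10 count=4 2>/dev/null |
  od -An -tx1 | tr -d ' \n')
```

Because the replacement has the same length, none of the byte offsets stored elsewhere in a PDF would move, which is why the file stays readable.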

The scary-looking files and hosts being accessed are just Adobe Reader
and Chrome behaving in a manner indistinguishable from spyware. Upload
any PDF file with links in it and you'll see the same picture. Even the
original report for d_jss_paper.pdf from poweRlaw_0.70.6 says "no
sandboxes flagged this file as malicious".

I think that the few non-major antivirus products that "detected" the
original file remembered a low-quality checksum of a different file,
and this whole thread resulted from a checksum collision. 0x043BC33F
(71025471) is what, four bytes? Doesn't seem to be a standard CRC-32 or
the sum of all bytes modulo 2^32, though.
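
The two quick checks mentioned above can be sketched as follows. byte_sum_mod32 is a hypothetical helper name, and this only covers the plain byte sum, not any CRC variant:

```shell
# The identifier from the detection name, converted to decimal.
id_dec=$(printf '%d' 0x043BC33F)

# byte_sum_mod32 FILE: sum of all bytes of FILE modulo 2^32.
byte_sum_mod32() {
  od -An -v -tu1 "$1" |
    awk '{ for (i = 1; i <= NF; i++) s = (s + $i) % 4294967296 }
         END { print s + 0 }'
}
```

Running byte_sum_mod32 on the PDF and comparing the result with id_dec is one way to rule that hypothesis in or out.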

I cannot prove a negative, but I invite infosec people with more PDF
experience to comment further on the issue.

-- 
Best regards,
Ivan

[*] Colin seems to have used the Debian build of TeX Live 2017 to
generate it, which is non-trivial but possible to reproduce by
installing it from Debian Snapshots on top of Stretch. The resulting
file has a different hash (for valid reasons), the same behaviour, but
zero detections:
https://www.virustotal.com/gui/file/f7b0e0400167e06970ac61fcadfda29daec1c2ee685d4c9ff805e375bcffc985/behavior

Trying a "binary search" by removing PDF objects or replacing byte
ranges with ASCII spaces was also a dead end: any change results in no
detections.

[**] PDF 1.5 specification, section 3.1.2:

>> Comments (other than the %PDF−1.4 and %%EOF comments described in
>> Section 3.4, “File Structure”) have no semantics.

https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/pdfreference1.5_v6.pdf#G8.1860480


Re: [R-pkg-devel] Possible malware(?) in a vignette

2024-01-27 Thread Ivan Krylov via R-package-devel
Apologies for being insufficiently clear. By "a file straight from NOAA" I
meant a completely different PDF that gives the same SHA-256 hash whether
downloaded by VirusTotal or me, comes from a supposedly trusted source, and
still makes Acrobat Reader behave like it's infected, show a crashed
Firefox on the screenshot and drop a number of scary-looking files. Surely
there will be a difference between reading an infected file and a
non-infected file?

On 27 January 2024 15:10:53 GMT+03:00, Bob Rudis wrote:
>Ivan: do you know what mirror NOAA used at that time to get that version of
>the package? Or, did they pull it "directly" from cran.r-project.org
>(scare-quotes only b/c DNS spoofing is and has been a pretty solid attack
>vector)?


Re: [R-pkg-devel] Possible malware(?) in a vignette

2024-01-27 Thread Ivan Krylov via R-package-devel
On Sat, 27 Jan 2024 03:52:01 -0500
Bob Rudis wrote:

> Two VT sandboxes used Adobe Acrobat Reader to open the PDF and the PDF
> seems to either had malicious JavaScript or had been crafted
> sufficiently to caused a buffer overflow in Reader that then let it
> perform other functions on those sandboxes.

Let's talk package versions and SHA256 hashes of
poweRlaw/inst/doc/d_jss_paper.pdf.

poweRlaw version 0.70.4:
Packaged: 2020-04-07 14:55:32 UTC
Date/Publication: 2020-04-07 16:10:02 UTC
SHA-256(poweRlaw/inst/doc/d_jss_paper.pdf):
96535de112f471c66e29b74c77444b34a29b82d6525c04d477ed2d987ea6ccae

Not previously uploaded to VirusTotal, currently checks out clean:
https://www.virustotal.com/gui/file/96535de112f471c66e29b74c77444b34a29b82d6525c04d477ed2d987ea6ccae

poweRlaw version 0.70.5:
Packaged: 2020-04-23 15:36:49 UTC
Date/Publication: 2020-04-23 16:40:06 UTC
SHA-256(poweRlaw/inst/doc/d_jss_paper.pdf):
5f827302ede74e1345fba5ba52c279129823da3c104baa821d654ebb8d7a67fb

Not previously uploaded to VirusTotal, also checks out clean:
https://www.virustotal.com/gui/file/5f827302ede74e1345fba5ba52c279129823da3c104baa821d654ebb8d7a67fb/behavior

For some reason, the Zenbox report shows a browser starting up and
someone (something?) moving the mouse:
https://vtbehaviour.commondatastorage.googleapis.com/5f827302ede74e1345fba5ba52c279129823da3c104baa821d654ebb8d7a67fb_Zenbox.html?GoogleAccessId=758681729565-rc7fgq07icj8c9dm2gi34a4cckv23...@developer.gserviceaccount.com=1706348766=KSTxSZJJUUv0FOA51Kwuot89ep4PKUDTY6tHL7kTyG7VwaMlF8VjmU90loeF4ytLBxKjkEtAk%2Ffr39xFrTTyOym3mehtc3HLyT9DS3C5qGa9OPVcu%2BfQfd8qr%2BRubBWb3SKNnhGpi%2Bn%2BTDhaiRx3PilEz%2BwVGiukfNUzWGBlGweG%2BmR1Y%2F0fIgDxJ3eyZ8KwTaocbywMoOLJeC1GSmoW8VYUAnFS2bb8P9Jt%2Bs%2F0axvAkc0M2pmSN3s2lpMq8u5P%2FZZ8yRIMdmv%2B1kUR5ajBdIa%2FHV8Vw8xAdNjZID6ozwAsmBOOizJmHgzr4zh1tX4V65qmcz8D3jctvDRKsuEqXA%3D%3D=text%2Fhtml;#overview

Lots of file activity. I think that all of it can be attributed to
either normal Acrobat Reader activity or normal Chrome activity.

Then we come to poweRlaw version 0.70.6:
Packaged: 2020-04-24 10:44:31 UTC
Date/Publication: 2020-04-25 07:30:12 UTC
SHA-256(inst/doc/d_jss_paper.pdf):
9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9

The Web Archive capture version 20201205222617 for the address
https://cran.r-project.org/web/packages/poweRlaw/vignettes/d_jss_paper.pdf
has the same SHA-256 hash.

This file is being disputed because some antivirus applications flag it:
https://www.virustotal.com/gui/file/9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9/behavior

The behaviour is exactly the same as the one from version 0.70.5:
browser opens with a link to a wrong DOI. Some links are followed.
https://vtbehaviour.commondatastorage.googleapis.com/9486d99c1c1f2d1b06f0b6c5d27c54d4f6e39d69a91d7fad845f323b0ab88de9_Zenbox.html?GoogleAccessId=758681729565-rc7fgq07icj8c9dm2gi34a4cckv23...@developer.gserviceaccount.com=1706347808=Kv1LXUGvDe988Br0pU1AMlttjYY1K9sDwouvZrlzAVSspkdOGS9Ow%2Bg%2F3VjnQLEshx08QqgOHZzQcghownumPDUJLBbEHbOk6KG9IZSH43rxkYhTIy%2BYT5PfNFIupevbJA5XrnJHrm1wKho2%2BDb4t8vA4cgOJJY0UahXTbIMKUeUmPCKAzx9W5kYKj55WhNDrIPrEuni9EeGWkFV45kPr%2BBwYfl2hK4%2BWv6K78CB7zJtzFltF6P3pewafn5Lg3M3AY5YcZ4TryXi01t0dq04Fha83fLRP7JUkmcfpAJauA48Ct0XN7RdCRPSogb0TAGwG%2BDstxNzLAphOEsVju9LUQ%3D%3D=text%2Fhtml;#dropped-info

I've uploaded a decompressed version (prepared using qpdf in.pdf
--stream-data=uncompress out.pdf) of the same file to VirusTotal, and
there are no detections. Zero detections, but the behaviour is the same:
some files are "dropped", but all of them relate to cache in Acrobat
Reader (which is nowadays a piece of Chrome) and Chrome itself:
https://www.virustotal.com/gui/file/5acbc41f103c88a801db36fa72f01d4fa81b9afa1879c36235b1f5373d46ee1a/behavior

Finally, there's poweRlaw version 0.80.0:
Packaged: 2024-01-25 10:39:42 UTC
Date/Publication: 2024-01-25 18:00:02 UTC
SHA-256(inst/doc/d_jss_paper.pdf):
17c252a38e6c9bcfab90a69070b17c5e9d8a1713b7bb376badaeba28b3a38739
Same zero flags, same behaviour of starting the browser, same "dropped"
files in the cache:
https://www.virustotal.com/gui/file/17c252a38e6c9bcfab90a69070b17c5e9d8a1713b7bb376badaeba28b3a38739/behavior
https://vtbehaviour.commondatastorage.googleapis.com/17c252a38e6c9bcfab90a69070b17c5e9d8a1713b7bb376badaeba28b3a38739_Zenbox.html?GoogleAccessId=758681729565-rc7fgq07icj8c9dm2gi34a4cckv23...@developer.gserviceaccount.com=1706348864=UjXMjCvz0uTjS1sqyr5y%2FOwluE%2BskW9F2XupXuOs5JgODlsL1BuwJcWJ56xddQNEtKDHDOaXoRfNxynsffmSaza4yJD9hvPJ6%2BrNMibbB8hojY53g07WKnCd3wdaOmOHEqIP7Md06QWD4CnLEN0KlRvWdsUUA%2F9YTB1bAVqkIR%2FtiaJcRrOTAmdG%2F9Hwrq4xpiEBaFZzO%2FsQPVj3dzNS1LQEXOHFAfnOTaC1LlbBfn9QQWCPib%2FpCOL7huVYqIFSm%2FO8VHWv67JD1qwcTOY7JSl8XPw1ueyumRpF5xF1rpWYCPjC1awU8tho25A2COA7f7LSkku0BRqkuHYW3kuZaw%3D%3D=text%2Fhtml;#dropped-info

I've also uploaded a PDF that came directly from a US agency (NOAA) and
got a similar report:

Re: [R-pkg-devel] How to deal with issues when using devtools::check_rhub(), rhub::check(), and web form

2024-01-24 Thread Ivan Krylov via R-package-devel
On Wed, 24 Jan 2024 16:14:05 -0800
Carl Schwarz wrote:

> I tried using the web interface at https://builder.r-hub.io/ to
> select the denebian machines, and it returns a message saying
> 
> We're sorry, but something went wrong.
> If you are the application owner check the logs for more information.

> So how do I tell if this a "Rhub issue" or an issue with my package?

A problem with your package would look more like the check at least
starting and then producing errors. Here, it doesn't look like the
check is even starting.

> Or do I just give up on using Rhub to check the denebian machines?

For a while, Rhub used to offer the only on-demand checking service
specifically on Linux machines (there was Win-builder by Uwe Ligges and
macOS builder by Simon Urbanek, but no "Linux builder"), including
Debian [*]. Now that the funding ran out [**], you can try using
various continuous integration services to run your checks in a Linux
virtual machine. Many of them offer free compute minutes.

I think that you've already fulfilled the requirements of the CRAN
policy by fixing all known problems and having R CMD check --as-cran on
R-devel run for you by Win-Builder (which is what
devtools::check_win_devel() does).

-- 
Best regards,
Ivan

[*]
Named after Debra Lynn and Ian Murdock

[**]
https://github.com/RConsortium/r-repositories-wg/blob/main/minutes/2023-09-07_Minutes.md


Re: [R-pkg-devel] New Package Removal because Shared Library Too Large from Debugging Symbols

2024-01-24 Thread Ivan Krylov via R-package-devel
On Mon, 22 Jan 2024 17:14:04 +0100
Tomas Kalibera  wrote:

> Yes, inside a bigger email, reports can get overlooked, particularly 
> when in a thread with a rather different subject. It wasn't
> overlooked this time thanks to Martin.

Then additional thanks go to Martin, and I'll make sure to report in
the right place if a similar situation happens again.

-- 
Best regards,
Ivan


Re: [R-pkg-devel] lost braces note on CRAN pretest related to \itemize

2024-01-23 Thread Ivan Krylov via R-package-devel
On Tue, 23 Jan 2024 19:39:54 +0100
Patrick Giraudoux wrote:

>    \itemize{
>    \item{.}{lm and glm objects can be passed directly as the upper
> scope of term addition (all terms added).

Inside the \itemize and \enumerate commands, the \item command doesn't
take any arguments:
https://cran.r-project.org/doc/manuals/R-exts.html#Lists-and-tables

Instead, it starts a new paragraph with a number (\enumerate) or a
bullet point (\itemize). R CMD check is reminding you that \itemize{
\item{foo}{bar} } is equivalent to \itemize{ \item foo bar } without
any braces.

If you meant to highlight a word by making it an argument of the \item
command, use the \describe command. Here, you're highlighting a dot,
which would be rendered with a bullet point before it, so it's probably
neither semantically nor visually appropriate.

> \value{
>    A \code{\link[sf]{sfc}} object, of POINT geometry, with the
> following columns:
>    \itemize{
>    \item{ID}{ ID number}

The same problem applies here.

Additionally, R CMD check is reminding you that \value{} is implicitly
a special case of a \describe{} environment:
https://cran.r-project.org/doc/manuals/R-exts.html#index-_005cvalue

Since you're already using \item{}{} labels to name the components of
the value, just drop the \itemize{} (but keep its contents). \value{} is
enough.

-- 
Best regards,
Ivan


Re: [R-pkg-devel] Cannot see the failure output on Fedora clang/gcc falvor (page not found)

2024-01-22 Thread Ivan Krylov via R-package-devel
On Sun, 21 Jan 2024 16:51:39 +
Sameh Abdulah  wrote:

> However, we cannot access the webpage (page not found) to identify
> and address the failures on Fedora systems.
> 
> https://cran-archive.r-project.org/web/checks/2024/2024-01-12_check_results_MPCR.html
> 
> How can we see the failures on these systems?

I cannot help you with the exact output from the Fedora system (I think
it's lost), but here's how the package fails on mine:

* installing *source* package 'MPCR' ...
** using staged installation
Linux

/tmp/RtmpCSPOGc/Rbuild6043fb1a651/MPCR
/usr/bin/cmake
CMake is installed in: /usr/bin
-- The C compiler identification is GNU 12.2.0
-- The CXX compiler identification is GNU 12.2.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/gcc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/g++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- WORKING ON RELEASE MODE
MPCR Install Result : FALSE
-- Found OpenMP_C: -fopenmp (found version "4.5")
-- Found OpenMP_CXX: -fopenmp (found version "4.5")
-- Found OpenMP: TRUE (found version "4.5")
OpenMp Found
R Include Path :  /home/ivan/R-build/include
Rcpp Lib Path :  /home/ivan/R-build/library/Rcpp
R Home Path :  /home/ivan/R-build
CMake Error at cmake/FindR.cmake:63 (find_library):
  Could not find R_LIB using the following names: libR.so
Call Stack (most recent call first):
  CMakeLists.txt:70 (FIND_PACKAGE)


-- Configuring incomplete, errors occurred!
See also "/tmp/RtmpCSPOGc/Rbuild6043fb1a651/MPCR/bin/CMakeFiles/CMakeOutput.log".
make: *** No rule to make target 'clean'.  Stop.
make: *** No rule to make target 'all'.  Stop.
cp: cannot stat '/tmp/RtmpCSPOGc/Rbuild6043fb1a651/MPCR/bin/src/libmpcr.so': No such file or directory
Failed: libmpcr.so -> src
** libs
make: Nothing to be done for 'all'.
** R
** inst
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
Error: package or namespace load failed for 'MPCR' in library.dynam(lib, package, package.lib):
 shared object 'MPCR.so' not found

It is not the default to build R as a shared library, and this
installation of R has been built without --enable-R-shlib. I'm sure
that with enough effort it's possible to propagate the information from
R to CMake so that it would make you a shared library in the correct
manner, but I think it's easier to separate your code into two parts:

 1. One part should contain most of your code, without the dependencies
on R. It can be built using CMake if that's what you prefer. It
will probably be more convenient to build it as a static library.

 2. The other part will be the R interface. Let the R build system
(described in WRE 1.2 [*] and below, especially 1.2.6) link the
final shared library from the small remaining part of the source
files (those that include R-related headers) and the static library
from the previous step. If you play your cards right, it will also
work on Windows without significant additional effort.

Have you considered linking your R package against the BLAS and LAPACK
that already come with R? This may not give the user the best possible
performance ever, but those who do care about performance have probably
installed a copy of BLAS of their own choice and may not prefer an
extra copy of OpenBLAS that may or may not match the optimal parameters
for their hardware. Same goes for libgfortran (that may be required
depending on what you're linking) [**].

This would also make it easier to comply with CRAN policy on external
libraries [***]: if you want to download software during package
installation, you may be required to host a fixed version of the
package on something extra reliable (like Zenodo) and verify a
cryptographic hash of the file you download before using it.
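
A minimal sketch of such a hash check, assuming sha256sum (GNU coreutils) is available on the build machine, which is not guaranteed on every platform. verify_sha256 and the file name in the usage comment are hypothetical:

```shell
# verify_sha256 FILE EXPECTED: succeed only if the SHA-256 hash of FILE
# equals EXPECTED (lower-case hex). Hypothetical helper name.
verify_sha256() {
  actual=$(sha256sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}

# Intended use in a configure script (placeholder file name and hash):
#   verify_sha256 downloaded-lib.tar.gz "<hash recorded at release time>" ||
#     { echo "checksum mismatch, refusing to build" >&2; exit 1; }
```

The expected hash would be hard-coded in the package sources, so a changed or corrupted download stops the installation instead of being compiled.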

-- 
Best regards,
Ivan

[*]
https://cran.r-project.org/doc/manuals/R-exts.html#Configure-and-cleanup

[**]
https://cran.r-project.org/doc/manuals/R-exts.html#index-FLIBS

[***]
https://cran.r-project.org/web/packages/using_rust.html
https://cran.r-project.org/web/packages/external_libs.html


Re: [R-pkg-devel] New Package Removal because Shared Library Too Large from Debugging Symbols

2024-01-22 Thread Ivan Krylov via R-package-devel
On Mon, 22 Jan 2024 12:30:46 +0100
Tomas Kalibera  wrote:

> Thanks, ported now to R-patched.

Thank you!

Is it fine to mention problems like this one in the middle of an
e-mail, or should I have left a note in the Bugzilla instead?

-- 
Best regards,
Ivan


Re: [R-pkg-devel] Assistance Needed for Resolving Submission Issues with openaistream Package

2024-01-22 Thread Ivan Krylov via R-package-devel
Hello Li Gen and welcome to R-package-devel!

On Mon, 22 Jan 2024 17:50:33 +0800
 wrote:

> The specific areas of concern are:License Information: There's a note
> indicating that the license stub is an "invalid DCF". I've used 'MIT
> + file LICENSE' as the licensing terms. I would appreciate guidance
> on how to correctly format this section to meet the DCF standards.

Leave just the following lines in the LICENSE file, as it currently is
on CRAN [*]:

YEAR: 2023
COPYRIGHT HOLDER: openaistream authors

Why would you like to change it? CRAN doesn't want packages to provide
yet another copy of the MIT license inside the tarball. The text of the
MIT license is always available in an R install at
file.path(R.home('share'), 'licenses', 'MIT').

If you need a copy of the MIT license inside your GitHub repository,
store it elsewhere (e.g. LICENSE.md) and list it in .Rbuildignore [**].
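
For instance, the .Rbuildignore entry could be added like this. It is a sketch: add_buildignore is a hypothetical helper, and the pattern assumes the file is called LICENSE.md at the top level of the package.

```shell
# add_buildignore FILE PATTERN: append PATTERN to FILE unless that exact
# line is already there (hypothetical helper).
add_buildignore() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Intended use from the package root:
#   add_buildignore .Rbuildignore '^LICENSE\.md$'
```

Each line of .Rbuildignore is a regular expression matched against file paths relative to the package root, hence the anchors and the escaped dot.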

Since you composed your e-mail in HTML and left your mailer to generate
a plain text equivalent, we only got the latter, somewhat mangled:
https://stat.ethz.ch/pipermail/r-package-devel/2024q1/010356.html

Please compose your messages to R mailing lists in plain text.

-- 
Best regards,
Ivan

[*]
https://cran.r-project.org/web/packages/openaistream/LICENSE

[**]
https://cran.r-project.org/doc/manuals/R-exts.html#Building-package-tarballs


Re: [R-pkg-devel] New Package Removal because Shared Library Too Large from Debugging Symbols

2024-01-21 Thread Ivan Krylov via R-package-devel
On Sat, 20 Jan 2024 20:28:00 -0500
Johann Gaebler wrote:

> most likely there’s some error on my part in how I’ve set up cpp11,
> but it also seems possible that cpp11 should have detected that that
> header needs to be included and added it automatically

Upon further investigation, it's more complicated than a missing
#include.

cpp11::cpp_register() uses
tools::package_native_routine_registration_skeleton() to generate these
declarations. This function works by scanning the R code for calls to
.Call(), .C(), .Fortran(), and others and then trying to come up with
appropriate prototypes for the native functions being called. For
.Call()s, the function must output the correct type of SEXP for every
argument in the generated declaration.

This works the right way, for example, in R-4.2.2 (2022-11-10) and
today's R-devel, but was broken for a while (e.g. in R-4.3.1 and
R-4.3.2), and the fix, unfortunately, hasn't been backported (not to
R-patched either): https://bugs.r-project.org/show_bug.cgi?id=18585

I can suggest three workarounds.

1. Edit src/cpp11.cpp on a separate "for-CRAN" branch and rebase it on
   top of the main branch every time you update the package.

2. Install R-devel and use it to generate the source package. Strictly
   speaking, this would go against the letter of the CRAN policy
   (builds "should be done with current R-patched or the current
   release of R"), but would at least follow its spirit (use the
   version of R where the known package-building-related bug was fixed).

3. Add a configure script that would modify src/cpp11.cpp while the
   package is being installed. This way, the only thing modifying
   generated code would be more code, which is considered
   architecturally pure by some developers.

   Lots of ways to implement it, too: you can do it in a single shell
   script (using sed or patch -- are these tools guaranteed to be
   available?), delegate to tools/configure.R (that you would also
   write yourself), or go full GNU Autoconf and generate a
   megabyte-sized ./configure from some m4 macros just to replace one
   line.

   There is definitely a lot of performance art value if you go this
   way, but extra code means extra ways for it to go wrong. For more
   style points, make it a Makevars target instead of a configure
   script.
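
A sketch of the sed variant of workaround 3. It assumes the generated line reads exactly as in the LTO warning quoted earlier in this thread, and that the correct prototype takes a single SEXP; fix_cpp11_decl is a hypothetical helper name. It writes to a temporary file instead of relying on the non-portable sed -i:

```shell
# fix_cpp11_decl FILE: replace the mis-generated prototype of
# run_testthat_tests in FILE with one that takes a SEXP.
fix_cpp11_decl() {
  sed 's/^extern SEXP run_testthat_tests(void \*);$/extern SEXP run_testthat_tests(SEXP);/' \
    "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Intended use in ./configure:
#   fix_cpp11_decl src/cpp11.cpp
```

If the generated line ever changes shape, the substitution silently does nothing, so a real script would probably want to grep for the fixed line afterwards and fail loudly when it is missing.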

-- 
Best regards,
Ivan


Re: [R-pkg-devel] New Package Removal because Shared Library Too Large from Debugging Symbols

2024-01-20 Thread Ivan Krylov via R-package-devel
On Sat, 20 Jan 2024 14:38:55 -0500
Johann Gaebler wrote:

> The issue is that the compiled libraries are too large.

Was it in the e-mail? As you quite correctly observed, many other
packages get the NOTE about shared library size.

It may not be exactly obvious, but the red link saying "LTO" on the
check page is hiding a more serious issue:

> cpp11.cpp:18:13: warning: 'run_testthat_tests' violates the C++ One Definition Rule [-Wodr]
>    18 | extern SEXP run_testthat_tests(void *);
>       |             ^
> /data/gannet/ripley/R/test-dev/testthat/include/testthat/testthat.h:172:17: note: 'run_testthat_tests' was previously declared here
>   172 | extern "C" SEXP run_testthat_tests(SEXP use_xml_sxp) {
>       |                 ^

Modern C++ compilers are painfully pedantic about undefined behaviour
and can optimise away large sections of code if they think they have a
proof that your code causes it [*]. If you edit cpp11.cpp to provide the
correct declaration (#include the testthat header if possible), the
error should go away.

-- 
Best regards,
Ivan

[*] For example, see this issue in R: 
https://bugs.r-project.org/show_bug.cgi?id=18430


Re: [R-pkg-devel] Inquiry Regarding Package Organization in CRAN

2024-01-19 Thread Ivan Krylov via R-package-devel
Hello Andriy and welcome to R-package-devel!

On Fri, 19 Jan 2024 14:34:25 +
Protsak Andriy via R-package-devel 
wrote:

> to achieve this the initial focus is on exploring the possibility of
> renaming the packages so that they share a common prefix, making it
> easier for uses to locate them in the package list.

CRAN package names are long-term identifiers. Assume that there are
many users happy with the packages as they are. If you rename a
package, they will have to patch their scripts and their own packages
just to keep them working as before. Red Queen's race is not something
people like to participate in.

It is certainly not impossible to rename a package, but there has to be
a very good reason to break backwards compatibility and assume a new
name, while the old name stays in the archive, unavailable for new
packages.

Here are some past responses to similar questions:

https://stat.ethz.ch/pipermail/r-package-devel/2022q2/008140.html
https://stat.ethz.ch/pipermail/r-package-devel/2017q2/001678.html
https://stat.ethz.ch/pipermail/r-package-devel/2015q3/000271.html

> If you believe there are alternative strategies to achieve a similar
> result, please feel free to share your perspective.

There are approximately 20000 active packages on CRAN. Looking for
useful packages by scanning a list of names will not be very effective.
Better results can be achieved using tools like RSiteSearch. If you
want a package to be more visible, request its addition to a Task View.
If some packages are related, make them link to each other in their
documentation. David's options are all very good.

> Additionally, I'm looking into the prospect of merging two packages
> that contain similar functionalities. The aim is to create a more
> comprehensive package by incorporation additional features and
> ensuring seamless compatibility.

The previous point about keeping backwards compatibility still stands.
It should be possible to move all the functions to one package and then
import() it from the other package. Both packages can then export() all
functions, making them available to the dependencies of either package.
Eventually, the skeleton package may grow packageStartupMessage()s
letting the users know that it is deprecated and could they please use
the other package instead. After a while, it should be possible to
archive the skeleton package. But deprecation cycles should be long:
for example, rgeos and rgdal took more than a year to retire.

Or do you intend to come up with a completely new API? Beware of the
second system effect (although it's certainly not unheard of for second
system projects to succeed).

The spatstat package went through the opposite process a few years ago:
it grew too big and had to be split into multiple packages. Here's one
of its maintainers sharing the experience:
https://stat.ethz.ch/pipermail/r-package-devel/2022q4/008557.html

What is the nature of your final year project? If it can include
technical writing, you could add well-written vignettes to the packages
(only one of the CRAN packages maintained by people @uah.es has a
vignette, and it's very terse). If it has to be mostly programming or
maintenance of R packages, I'm out of ideas.

Either way, good luck, and I hope your project succeeds!

-- 
Best regards,
Ivan


Re: [R-pkg-devel] Additional Issues: Intel

2024-01-17 Thread Ivan Krylov via R-package-devel
On Wed, 17 Jan 2024 10:30:36 +1100
Hugh Parsonage wrote:

> I am unable to immediately see where in the test suite this error has
> occurred.

Without testthat, you would have gotten a line by line printout of the
code, letting you pinpoint the (top-level) place of the crash. With
testthat, you will need a more verbose reporter that would print tests
as they are executed to find out which test causes the crash.

> The only hunch I have is that the package uses C code and includes
> structs with arrays on the stack, which perhaps are excessive for the
> Intel check machine, but am far from confident that's the issue.

According to GNU cflow, your only recursive C functions are
getListElement (from getListElement.c) and nthOffset (from Offset.c),
but the recursion seems bounded in both cases.

I've tried looking for variable-length arrays in your code using a
Coccinelle patch, but found none. If you had variable-bounded recursion
or variable-length stack arrays (VLA or alloca()), it would be prudent
to use R_CheckStack() or R_CheckStack2(size_of_VLA), but your C code
contains neither, so there's no obvious culprit. If you know about
R-level recursion happening in your code and have a way to reduce it,
that might help too.

Otherwise, it's time to install Intel Everything and reproduce and
debug the problem the hard way.

-- 
Best regards,
Ivan



[R-pkg-devel] CMake on CRAN Systems

2024-01-16 Thread Ivan Krylov via R-package-devel
Dear Sameh,

Regarding your question about the MPCR package and the use of CMake:
on a Mac, you have to look for the cmake executable in more than one
place because it is not guaranteed to be on the $PATH. As described in
Writing R Extensions, the following is one way to work around the
problem:

if test -z "$CMAKE"; then CMAKE="`which cmake`"; fi
if test -z "$CMAKE"; then
 CMAKE=/Applications/CMake.app/Contents/bin/cmake;
fi
if test ! -f "$CMAKE"; then echo "no ‘cmake’ command found"; exit 1; fi

Please don't reply to existing threads when starting a new topic on
mailing lists. Your message had a mangled link that went to
urldefense.com instead of cran-archive.r-project.org, letting Amazon
(who host the website) know about every visit to the link:
https://stat.ethz.ch/pipermail/r-package-devel/2024q1/010328.html

-- 
Best regards,
Ivan



Re: [R-pkg-devel] checking CRAN incoming feasibility

2024-01-16 Thread Ivan Krylov via R-package-devel
On Tue, 16 Jan 2024 08:47:07 +
David Hugh-Jones wrote:

> If I understand correctly, the current procedure is that the client
> downloads every package name from CRAN, and then checks its name is
> unique.

This is not the only check that relies on utils::available.packages().

In particular, the check makes sure that strong dependencies are
available from mainstream repositories. For packages with FOSS
licenses, it also walks the whole strong dependency tree to make sure
that none of the dependencies restricts use.

Additional checks require even more files:

 - src/contrib/PACKAGES.in is checked for CRAN notes on packages
 - src/contrib/Meta/archive.rds is also checked for potential name
   collisions, case-insensitively.
 - src/contrib/Meta/current.rds is checked together with archive.rds
   for update frequency
 - web/packages/packages.rds is checked for maintainer changes
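
To illustrate the case-insensitive name check from the list above,
here is a small sketch. It is not the code used by R CMD check; the
helper name and the list of known packages are made up, standing in
for the combined contents of current.rds and archive.rds:

```r
# Hypothetical helper, not the code used by R CMD check:
# report known package names that differ from `name` only by case.
case_collisions <- function(name, known) {
  known <- unique(known)
  known[tolower(known) == tolower(name) & known != name]
}

# Made-up list standing in for current + archived package names
known <- c("data.table", "ggplot2", "Matrix", "spatstat")

case_collisions("matrix", known)   # collides with "Matrix"
case_collisions("newpkg", known)   # no collisions
```

This only works when the whole list of names is available to the
client, which is why the files have to be downloaded first.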

> Wouldn’t it be faster (for both parties) to check name uniqueness
> directly on the server?

The current scheme, if somewhat wasteful, makes it possible to run R
CMD check with any CRAN mirror without making it run any code server
side. (With the small exception of .htaccess to rewrite some paths, but
that should be translatable for other servers like nginx too.)

It's probably not impossible to transmit only data related to the
current package while keeping this property, but recursive dependency
checks in particular will not be easy. I think it's not worth the
effort.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] checking CRAN incoming feasibility

2024-01-15 Thread Ivan Krylov via R-package-devel
On Tue, 16 Jan 2024 05:49:01 +
Rolf Turner wrote:

> The problem is persistent/repeatable.  I don't believe that there is
> any faulty connection.

One of the things done by R CMD check --as-cran at this point is
sending a HEAD request to every Web link mentioned in the package
documentation and DESCRIPTION. One of the hosts may be slow to respond,
either by accident or due to misguided anti-robot countermeasures.
(Most website protection systems would say that R CMD check counts as a
robot because there's no human behind it to look at the ads.)
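
If you suspect a slow link, you could time the requests yourself. The
following is only a sketch: time_links() is a made-up helper, and I'm
assuming that header requests made with base R's curlGetHeaders() are
representative of what the check does. The `fetch` argument exists so
that the function can be tried without network access:

```r
# Hypothetical helper: time a header request to each URL.
time_links <- function(urls, fetch = curlGetHeaders) {
  elapsed <- vapply(urls, function(u) {
    t0 <- proc.time()[["elapsed"]]
    try(fetch(u), silent = TRUE)  # a failing link should not stop the loop
    proc.time()[["elapsed"]] - t0
  }, numeric(1))
  # Slowest links first; names are the URLs
  sort(elapsed, decreasing = TRUE)
}

# Usage (needs network access):
# time_links(c("https://cran.r-project.org/", "https://www.r-project.org/"))
```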

Here's what you could try. Unpack your built source package. If you
have a fresh .Rcheck directory from an R CMD check, use
YOURPACKAGE.Rcheck/00_pkg_src/YOURPACKAGE. Then profile the check
function, using the subdirectory from the source package archive as the
argument:

Rprof(); tools:::.check_package_CRAN_incoming(dir); Rprof(NULL)

Does any one function stand out in the subsequent summaryRprof()
output? For me, it's readRDS (not very helpful), but by reading
Rprof.out I can see that it's used by CRAN_package_db and
CRAN_archive_db to download web/packages/packages.rds and
src/contrib/Meta/archive.rds from the chosen CRAN mirror, which for me
takes a few seconds for both files.
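
The same steps, spelled out with the profile kept in a temporary file,
might look as follows. This is a sketch: slow_step() is a made-up
stand-in for the function being profiled (in your case, the
.check_package_CRAN_incoming() call above), and the tryCatch() is there
only because summaryRprof() complains when no samples were collected:

```r
# Made-up stand-in for the function being profiled
slow_step <- function() {
  t0 <- proc.time()[["elapsed"]]
  while (proc.time()[["elapsed"]] - t0 < 0.5) sqrt(runif(1e4))
  invisible(NULL)
}

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)   # start profiling into the temporary file
slow_step()
Rprof(NULL)        # stop profiling

# Functions sorted by total time, including time spent in callees
prof <- tryCatch(summaryRprof(prof_file), error = function(e) NULL)
if (!is.null(prof)) head(prof$by.total)
```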

Do you have a CRAN mirror set up in ~/.Rprofile? It could be having a
slow day.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Does dependencies up to date on the pretest CRAN infrastructure

2024-01-13 Thread Ivan Krylov via R-package-devel
On Fri, 12 Jan 2024 21:19:00 +0100
Serge wrote:

> After somme minor midficiations, I make a try on the winbuilder site.
> I was able to build the archive with the static library
> but I get again a Bad address error. You can have a look to
> 
> https://win-builder.r-project.org/bw47qsMX3HTd/00install.out

I think that Win-Builder is running out of memory. It took some
experimenting, but I was able to reproduce something like this using
the following:

1. Set the swap file in the Windows settings to minimal recommended
size and disable its automatic growth

2. Write and run a program that does malloc(LARGE_NUMBER); getchar();
so that almost all physical memory is allocated

3. Run gcc -DFOO=`/path/to/Rscript -e 'some script'` & many times

I got a lot of interesting errors, including the "Bad address":

Warnings:
1: .getGeneric(f, , package) : internal error -4 in R_decompress1
2: package "methods" in options("defaultPackages") was not found

0 [main] bash (2892) child_copy: cygheap read copy failed,
0x0..0x800025420, done 0, windows pid 2892, Win32 error 299

0 [main] bash (3256) C:\rtools43\usr\bin\bash.exe: *** fatal error in
forked process - MEM_COMMIT failed, Win32 error 1455

-bash: fork: retry: Resource temporarily unavailable

-bash: R-devel/bin/Rscript.exe: Bad address

Your package is written in C++, but that by itself shouldn't disqualify
it. On my Linux computer, /usr/bin/time R -e
'install.packages("MixAll")' says that the installation takes slightly
less than a gigabyte of memory ("912516maxresident k"), which is par
for the course for such packages. (My small Rcpp-using package takes
approximately half a gigabyte by the same metric.)

I'm still not 100% sure (if Win-Builder is running out of memory, why
are you seeing "Bad address" only and not the rest of the carnage?),
but I'm not seeing a problem with your package, either. If EFAULT is
Cygwin's way of saying "I caught a bad pointer in your system call"
(which, I must stress, is happening inside /bin/sh, not your package
or even R at all), it's not impossible that Win-Builder is having
hardware problems. Unfortunately, they take a lot of effort and
downtime to diagnose and could be hiding anywhere from RAM to the power
supply.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Does dependencies up to date on the pretest CRAN infrastructure

2024-01-12 Thread Ivan Krylov via R-package-devel
On Fri, 12 Jan 2024 19:09:29 +0100
Serge wrote:

> I updated the package rtkore one month ago, fixing a compilation
> problem on windows devel platform.
> 
> MixAll has a dependency to rtkore. Thus, I suspect that the error
> reported below is due to the presence of the old version of rtkore on
> the pretest infrastructure of the CRAN.

The relevant part of the installation log:

/usr/bin/make -C projects/Clustering/src/
make[2]: Entering directory 
'/d/temp/RtmpYJkDTJ/R.INSTALL316dc7c0f48e6/MixAll/inst/projects/Clustering/src'
g++ -std=gnu++17  -I"D:/RCompile/recent/R/include" -DNDEBUG 
`D:/RCompile/recent/R/bin/Rscript -e "rtkore:::CppFlags()"`  
-I'D:/RCompile/CRANpkg/lib/4.4/Rcpp/include' 
-I'D:/RCompile/CRANpkg/lib/4.4/rtkore/include'   
-I"d:/rtools43/x86_64-w64-mingw32.static.posix/include"
`D:/RCompile/recent/R/bin/Rscript -e "rtkore:::CxxFlags()"` -I../inst/projects/ 
-I../inst/include/ -fopenmp   -pedantic -O2 -Wall  -mfpmath=sse -msse2 
-mstackrealign  -I../../../projects/ -I../../../include/ 
STK_CategoricalParameters.cpp -c -o ../../../bin/STK_CategoricalParameters.o
/bin/sh: line 1: /x86_64-w64-mingw32.static.posix/bin/g++: Bad address
make[2]: *** [makefile:54: ../../../bin/STK_CategoricalParameters.o] Error 126

RTools uses Cygwin features to emulate the presence of certain virtual
paths; /x86_64-w64-mingw32.static.posix/bin/g++ actually exists and is
transparently mapped to
d:/rtools43/x86_64-w64-mingw32.static.posix/bin/g++.exe:

User@WINMACHINE MSYS ~
$ /x86_64-w64-mingw32.static.posix/bin/g++ --version
g++.exe (GCC) 12.2.0

The "Bad address" here means that /bin/sh got an EFAULT while trying to
launch g++.exe:
https://stat.ethz.ch/pipermail/r-package-devel/2023q4/010223.html

Unless there is something extremely weird in the command line arguments
returned by Rscript -e "rtkore:::CxxFlags()" that causes the process to
fail to launch (in my opinion, very unlikely, but can you print them
from your compilation process just in case?), I would be looking for
problems elsewhere.

In particular, the problem cannot be in having rtkore installed that is
one version too old, because you only changed Makevars in that version,
and your package MixAll doesn't use the Makevars from a different
source package.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] "Examples with CPU time > 2.5 times elapsed time" and other NOTEs on CRAN and rhub checks

2024-01-11 Thread Ivan Krylov via R-package-devel
On Thu, 11 Jan 2024 12:39:17 +
D Z wrote:

> The package itself has no parallelism built-in, but Imports
> data.table. This NOTE does not surface on other platforms (eg using
> rhub or on my GitHub actions runners). My unit tests already limit
> data.table to 2 cores using setDTthreads(2), but I would like to keep
> this line out of the help files for my functions.

A breakpoint on pthread_create confirms that these are OpenMP threads
created by data.table. You can wrap setDTthreads(2) in \dontshow{} to
avoid visual pollution:
https://cran.r-project.org/doc/manuals/R-exts.html#index-_005cdontshow
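
With roxygen2, this could look like the sketch below. The function
count_rows() is made up for illustration; only the \dontshow{} line
matters. The wrapped code is hidden in the rendered help page but is
still run by example() and by R CMD check:

```r
#' Count rows of a table (hypothetical example function)
#'
#' @param dt A data.table.
#' @examples
#' \dontshow{data.table::setDTthreads(2)}
#' dt <- data.table::data.table(x = 1:3)
#' count_rows(dt)
#' @export
count_rows <- function(dt) nrow(dt)
```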

> I receive the NOTE that my libs/ sub-directory is at 7.7Mb. Can I
> ignore this or do I need to figure out how to reduce the binary size
> of the package?

I think this is typically accepted for packages using C++.

> And last but not least, on some rhub instances (Fedora and Ubuntu
> GCC) I receive a NOTE that the package runs its examples too slowly
> (eg above 5secs). I have already tweaked the example code already
> that it runs reliably <4 secs on my development laptop

Then it should be fine.

Additionally, you may need to cast some of your Rprintf arguments to
avoid format warnings on Windows:
https://win-builder.r-project.org/incoming_pretest/RITCH_0.1.23_20240110_120457/Windows/00check.log

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN submission struggle

2024-01-07 Thread Ivan Krylov
On Sun, 7 Jan 2024 10:52:44 +0200
Christiaan Pieterse  wrote:

> I have edited my package to have two examples. One uses a small
> self-generated dataset and another uses a big dataset. For the big
> dataset example, I put \donttest{} around it, should this be fine?

The small example is definitely fine: it exercises the code and does so
fast.

The big example wrapped in \donttest{} could be fine, I'm not sure.
I've seen CRAN packages that wrap the long parts of their examples in
\donttest{} while still making sure that R CMD check --run-donttest
exercises most of their code.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN submission struggle

2024-01-06 Thread Ivan Krylov
On Sat, 6 Jan 2024 15:16:01 +0300
Ivan Krylov  wrote:

> Congratulations! I also get the single expected NOTE in my checks.

Apologies for the double e-mail, but I've read the code now, and
wrapping the example of your only function in \dontrun{} will most
likely not be allowed.

Is it really the case that you cannot remove a single row from the
example dataset without making the example crash? It may help to write
a function that would remove rows one by one, make sure that the
example still runs, and keep doing that until not a single row can be
removed. The complexity is terrible (something like O(n^k)), but let it
run for a while, and maybe it'll reduce the dataset enough to fit in
the example time limit.
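
A simple greedy version of this idea is sketched below. Both
shrink_rows() and the predicate are made up: in your case, the
predicate would run the example on the reduced dataset inside
tryCatch() and return FALSE on error. The toy predicate here only
pretends that the example needs two countries and two products:

```r
# Sketch: drop rows one at a time while `still_works(data)` stays TRUE.
shrink_rows <- function(data, still_works) {
  stopifnot(still_works(data))
  repeat {
    removed <- FALSE
    i <- 1
    while (i <= nrow(data)) {
      candidate <- data[-i, , drop = FALSE]
      if (nrow(candidate) > 0 && still_works(candidate)) {
        data <- candidate   # row i was not needed; index i is now the next row
        removed <- TRUE
      } else {
        i <- i + 1          # row i is needed; keep it and move on
      }
    }
    if (!removed) break     # a full pass removed nothing: done
  }
  data
}

# Toy data and predicate, for illustration only
toy <- expand.grid(country = c("A", "B", "C"), product = c("x", "y", "z"))
ok <- function(d) {
  length(unique(d$country)) >= 2 && length(unique(d$product)) >= 2
}
small <- shrink_rows(toy, ok)
nrow(small)
```

The result depends on the order of the rows, so it is a small dataset
that still works, not necessarily the smallest one.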

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN submission struggle

2024-01-06 Thread Ivan Krylov
On Sat, 6 Jan 2024 13:50:49 +0200
Christiaan Pieterse  wrote:

> Is there a way to confirm that this package is ready for submission?
> I submitted it to https://win-builder.r-project.org/ and
> https://mac.r-project.org/macbuilder/submit.html and
> https://builder.r-hub.io/. All of these seem to only show the
> expected new submission note.

Congratulations! I also get the single expected NOTE in my checks.

This may be your last chance to rename the package from iopspackage to,
say, "IOPS". If you go over the package and the CRAN policy one last
time and deem the package compliant, it should be ready for submission.

The CRAN reviewer may find additional problems and ask you to fix them,
but you are likely most of the way there.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] how to use pkgdown::build_site() with a project using S7 with a specialized plot()?

2024-01-03 Thread Ivan Krylov
On Wed, 3 Jan 2024 13:34:27 +
Daniel Kelley  wrote:

> Error: 
> ! in callr subprocess.
> Caused by error in `map2(.x, vec_index(.x), .f, ...)`:
> ! In index: 1.

Interesting that the actual error messages seem to be completely
empty.

By chance (I was searching for "rlang See `$stdout` for standard
output" because I was curious to know what this error message is
telling the user to subset) I found a bug report that seems relevant
(as it's also about S7, has the same warning and crashes in the same
call to rlang::check_installed):
https://github.com/r-lib/pkgdown/issues/2186

Unfortunately, there's no solution, just two similar-looking cases.

Is there an equivalent of options(error = recover) for callr child
processes? If you can recover the expression evaluated by the child
process, it could be worth executing it directly and walking the call
stack looking at the local variables at the time of the crash.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN submission struggle

2023-12-29 Thread Ivan Krylov
On Thu, 28 Dec 2023 18:00:37 +0200
Christiaan Pieterse  wrote:

> I only get 3 notes (see below), and if I run it in PositCloud, it
> crashes or yields the same 1 ERROR and 2 NOTES result as before. Why
> might this be? 

Does the PositCloud check crash with "Killed" (most likely out of RAM)
or with a different error message?

> Is it a problem or is it fine if I continue working in RStudio since
> I cannot increase the specs in PositCloud because I'm working on a
> research group account?

If your local R CMD check works, it should be fine. Rough
specifications for the machines running CRAN checks are published by
CRAN. We ought to
test our packages on the weakest hardware that could plausibly be used
to run our code, but that's not always easy to do. I know I don't
always dig out my old Intel Atom ultraportable to run the checks myself.

> The second is the runtime that is too long:
> * checking examples ... [43s] NOTE
> Examples with CPU (user + system) or elapsed time > 5s
>   user system elapsed
> IOPS 10.06   3.35   35.04

Similar NOTEs can be seen about the use of multi-threading, but here
the "elapsed" (real, as measured by a clock) time exceeds the "user"
(CPU time spent inside applications) + "system" (CPU time spent inside
the operating system kernel) time, so the code uses less than 100% of
one CPU core on average, which fits comfortably in the 200% allowed by
the CRAN policy for examples and tests.
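
The arithmetic, using the numbers from the NOTE quoted above (the
variable names are mine):

```r
# Timings copied from the NOTE quoted above
user <- 10.06
sys <- 3.35
elapsed <- 35.04

# Average share of one CPU core used while the example was running
cpu_share <- (user + sys) / elapsed
round(cpu_share, 2)   # 0.38, i.e. about 38% of one core

# TRUE: well below the 200% mentioned above
cpu_share <= 2
```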

Unfortunately, 35 seconds is still too much.

> How can I reduce this time? I'm not sure how to reduce the size of my
> ExampleTradeData without the check giving errors when running the
> example.

How does the algorithm work? I've seen it fail due to Proximities being
a 0x0 matrix. Can you work backwards from
economiccomplexity::proximity() returning a 0x0 matrix to derive the
requirements that IOPS places on the dataset? It may help to experiment
with sample.int() to subset the rows and see which combinations work.
Perhaps you can reduce the dataset to two countries and two products?
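
One way to run such an experiment is sketched below. try_subsets() and
the predicate are made up; in your case, the predicate would call
IOPS() on the subset inside tryCatch() and return FALSE on error:

```r
# Sketch: try random row subsets of a given size and remember the
# row numbers of those for which `still_works` returns TRUE.
try_subsets <- function(data, size, still_works, tries = 20) {
  found <- list()
  for (k in seq_len(tries)) {
    rows <- sort(sample.int(nrow(data), size))
    if (still_works(data[rows, , drop = FALSE])) {
      found[[length(found) + 1]] <- rows
    }
  }
  found
}

# Toy data and predicate, for illustration only
toy <- expand.grid(country = c("A", "B", "C"), product = c("x", "y", "z"))
ok <- function(d) {
  length(unique(d$country)) >= 2 && length(unique(d$product)) >= 2
}
working <- try_subsets(toy, size = 2, still_works = ok)
length(working)   # how many of the 20 random subsets worked
```

Comparing the subsets that work with those that don't may hint at what
the algorithm requires from its input.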

Have you tried profiling? You can profile your code for both speed and
memory use, and replacing less performant idioms with those using less
CPU time and memory may solve both the CPU time problem and the
OOM-crash problem:
https://cran.r-project.org/doc/manuals/R-exts.html#Tidying-and-profiling-R-code
 
> The third note I am unsure what it means:
> * checking for detritus in the temp directory ... NOTE
> Found the following files/directories:
>   'lastMiKTeXException'

Have you installed the inconsolata MiKTeX package?
https://cran.r-project.org/doc/manuals/R-admin.html#LaTeX-on-Windows
Try running R CMD Rd2pdf on your package directory: maybe MiKTeX will
pop up an interactive dialog to let you install any remaining missing
dependencies. If not, there should be a lastMiKTeXException file for
you to read.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Issue with flang-new (segfault from C stack overflow)

2023-12-18 Thread Ivan Krylov
On Mon, 18 Dec 2023 11:06:16 +0100
Jisca Huisman wrote:

> I isolated the problem in a minimal working example available here: 
> https://github.com/JiscaH/flang_segfault_min_example . All that does
> is pass a vector of length N*N back and forth between R and Fortran.
> It works fine for very long vectors (tested up to length 5e8), but
> throws a segfault when I reshape a large array in Fortran to a vector
> to pass to R, both when using RESHAPE() and when using loops.

You've done an impressive amount of investigative work. Thank you for
reducing your problem to such a small example! My eyes are drawn to
these two lines:

>>  integer, intent(IN) :: N
>>  integer :: M(N,N)

If this was C, such a declaration would mean a variable-length array
that would have to be placed on the (limited-size) stack and eventually
overflow it. gfortran places the array on the heap, so the program
works:

  integer, intent(IN) :: N
  integer, intent(INOUT) :: V(N*N)
  integer :: M(N,N)
1205:   48 63 db                movslq %ebx,%rbx
1208:   b8 00 00 00 00          mov    $0x0,%eax
120d:   48 85 db                test   %rbx,%rbx
1210:   49 89 c4                mov    %rax,%r12
1213:   4c 0f 49 e3             cmovns %rbx,%r12
1217:   48 89 df                mov    %rbx,%rdi
121a:   49 0f af fc             imul   %r12,%rdi
121e:   48 85 ff                test   %rdi,%rdi
1221:   48 0f 48 f8             cmovs  %rax,%rdi
1225:   48 c1 e7 02             shl    $0x2,%rdi
1229:   b8 01 00 00 00          mov    $0x1,%eax
122e:   48 0f 44 f8             cmove  %rax,%rdi
1232:   e8 19 fe ff ff          callq  1050
1237:   48 89 c5                mov    %rax,%rbp
123a:   4c 89 e7                mov    %r12,%rdi
123d:   48 f7 d7                not    %rdi
(Looking at the address of M in GDB and comparing it with the output
of info proc mappings, I can confirm that it lives on the heap.)

flang-new makes M into a C-style VLA:

  integer, intent(IN) :: N
  integer, intent(INOUT) :: V(N*N)
  integer :: M(N,N)
74ec:   48 63 17                movslq (%rdi),%rdx
74ef:   89 d1                   mov    %edx,%ecx
74f1:   31 c0                   xor    %eax,%eax
74f3:   48 85 d2                test   %rdx,%rdx
74f6:   48 0f 49 c2             cmovns %rdx,%rax
74fa:   48 89 85 b0 fe ff ff    mov    %rax,-0x150(%rbp)
7501:   48 89 c2                mov    %rax,%rdx
7504:   48 0f af d2             imul   %rdx,%rdx
7508:   48 8d 34 95 0f 00 00    lea    0xf(,%rdx,4),%rsi
750f:   00
7510:   48 83 e6 f0             and    $0xfffffffffffffff0,%rsi
7514:   48 89 e2                mov    %rsp,%rdx
7517:   48 29 f2                sub    %rsi,%rdx
751a:   48 89 95 b8 fe ff ff    mov    %rdx,-0x148(%rbp)
7521:   48 89 d4                mov    %rdx,%rsp

(Looking at the value of the stack pointer in GDB after M(N,N) is
declared, I can see it way below the end of the stack and the loaded
shared libraries according to info proc mappings. GDB doesn't let me
see the address of M. The program crashes in `M = 42`, trying to
overwrite the code from the C standard library.)

Are Fortran processors allowed to place such "automatic data objects"
like integer :: M(N,N) on the stack? The Fortran standard doesn't seem
to give an answer to this question, but if you make your M allocatable,
you won't have to worry about stack usage:

subroutine dostuff(N,V)
  implicit none

  integer, intent(IN) :: N
  integer, intent(INOUT) :: V(N*N)
  integer, allocatable :: M(:,:) ! <-- here

  allocate(M(N,N))   ! <-- and here
  M = 42
  V = RESHAPE(M, (/N*N/))
end subroutine dostuff

No leaks or crashes observed with these two changes and either
compiler. The Fortran standard requires that local allocatable unsaved
arrays (except for the function result) are deallocated at the end of
procedures.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN submission struggle

2023-12-18 Thread Ivan Krylov
On Sun, 17 Dec 2023 21:48:51 +0200
Christiaan Pieterse  wrote:

> Warning in complexity_measures(Mbin, method = "reflections",
> iterations = iterCompl) :
>   'iterations' was changed to 'iterations + 1' to work with an even
> number of iterations
> Killed

If this is happening on Linux, this could mean the current example (or
some other, completely unrelated process running at the same time)
allocating too much memory and summoning the OOM-killer.

In the current HEAD of the CRAN-prep branch, the man/*.Rd files are
empty, which prevents the package from being installed, so I couldn't
reproduce the problem. Are there any local changes you need to commit
and push to GitHub? If you're not comfortable keeping the package
source in sync with GitHub, we can consider other options.

> * checking HTML version of manual ... NOTE
> Skipping checking HTML validation: no command 'tidy' found

This is a problem with the system running R CMD check, not your
package. In order to check the validity of generated HTML
documentation, R needs to use the program called HTML-Tidy, which is
not installed on this computer.

Where are you getting these results from? Win-Builder? macOS builder?

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN submission struggle

2023-12-17 Thread Ivan Krylov
On Sun, 17 Dec 2023 15:29:34 +0200
Christiaan Pieterse  wrote:

> But, I've uploaded the newly created package as discussed in my first
> email, available at:
> https://github.com/ChristiaanPieterse/iopspackage2.1.0

Are you sure it wouldn't be better to clean up the existing package
instead of creating unrelated forks? If you're afraid of breaking
something that works, do the work on a separate branch until it both
works as well as your current repo does *and* passes R CMD check
--as-cran.

> * checking CRAN incoming feasibility ... [25s] NOTE
> Maintainer: 'C.J Pieterse '
> New submission

"New submission" is to be expected.

> Unknown, possibly misspelled, fields in DESCRIPTION:
>   'Exports'

"Writing R Extensions" doesn't define a field named "Exports". The
exports are declared in the NAMESPACE file. Since you're using
roxygen2, use its @export tag to export your functions and remove the
Exports: field.

> * checking whether package 'iopspackage' can be installed ... [27s]
> WARNING Found the following significant warnings:
>   Warning: package 'Rcpp' was built under R version 4.3.2

This is probably not a problem with your package (but may be a problem
with the way the machine running R CMD check is set up).

> * checking dependencies in R code ... NOTE
> Namespaces in Imports field not imported from:
>   'openxlsx' 'roxygen2' 'tibble'
>   All declared Imports should be used.

> * checking R code for possible problems ... [12s] NOTE
> IOPS: no visible global function definition for 'createWorkbook'
> IOPS: no visible global function definition for 'addWorksheet'
> IOPS: no visible global function definition for 'writeData'
> IOPS: no visible global function definition for 'saveWorkbook'
> Undefined global functions or variables:
>   addWorksheet createWorkbook saveWorkbook writeData

Are you sure you should be importing roxygen2? You only run
roxygenise() before running R CMD build in order to generate
man/*.Rd and NAMESPACE; I don't think it's used after that.

If you don't use functions from tibble, there's no need to import it
or depend on it either. I also don't see you directly using Rcpp, but
there's no warning about it for some reason.

Use the @importFrom tags to import individual functions that you
actually use (i.e. createWorkbook and friends). See the roxygen2
documentation for more information on importing.

Also, remove all library() calls from R/iopspackage2.R. Packages live
in namespaces, not in the global environment; your package should rely
upon the dependency information in DESCRIPTION and NAMESPACE (the
latter generated by roxygen2) for its dependencies.
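
As a sketch of what this could look like for the openxlsx functions
from the NOTE above: write_results() is a made-up helper, not your
code, and I'm assuming that DESCRIPTION lists openxlsx under Imports.
After running roxygenise(), NAMESPACE should contain the corresponding
importFrom() and export() directives:

```r
#' Write a data frame to an Excel file (hypothetical helper)
#'
#' @param x A data frame.
#' @param file Path of the .xlsx file to create.
#' @importFrom openxlsx createWorkbook addWorksheet writeData saveWorkbook
#' @export
write_results <- function(x, file) {
  # No library(openxlsx) here: the functions come from NAMESPACE imports
  wb <- createWorkbook()
  addWorksheet(wb, "Results")
  writeData(wb, "Results", x)
  saveWorkbook(wb, file, overwrite = TRUE)
  invisible(file)
}
```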

> > data(ExampleTradeData)  
> Warning in data(ExampleTradeData) :
>   data set 'ExampleTradeData' not found

There's no 'data' directory and no file named ExampleTradeData.* in it.
Data for use by the function data() should be prepared as described in
the "Data in packages" section of "Writing R Extensions".
If you want to use files under inst/extdata, you have to read them
manually:

ETD <- read.csv(system.file(
 file.path('extdata','ExampleTradeData.csv'),
 package = 'iopspackage'
))

> * checking for detritus in the temp directory ... NOTE
> Found the following files/directories:
>   'lastMiKTeXException'

Is this on R-hub? This usually happens on R-hub and doesn't indicate a
problem with your package.

> #' temp_dir <- tempdir()
> #' 
> #' # Set the working directory to a temporary directory
> #' setwd(temp_dir)
<...>
> #' # Clean up the temporary directory
> #' unlink(temp_dir, recursive = TRUE)

Please make it a subdirectory of the session temporary directory:

temp_dir <- tempfile()
dir.create(temp_dir)
...

Removing the session temporary directory is not as bad as directly
overwriting user data (it's all going away after the process is shut
down), but quite a lot of parts of R and other packages rely on
tempdir() being a directory that exists, not to mention that there
could be other temporary files in use by other packages.
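
A complete version of this pattern might look as follows. It is a
sketch: the file written in the middle is a stand-in for whatever your
example produces, and the "iops-example-" prefix is arbitrary:

```r
# Work in a fresh subdirectory of the session temporary directory
temp_dir <- tempfile("iops-example-")
dir.create(temp_dir)
old_wd <- setwd(temp_dir)

# Stand-in for the real work done by the example
writeLines("example output", "result.txt")

# Restore the working directory, then remove only the subdirectory
setwd(old_wd)
unlink(temp_dir, recursive = TRUE)

# The session temporary directory itself is left untouched
dir.exists(tempdir())
```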

-- 
Best regards,
Ivan



Re: [R-pkg-devel] CRAN submission struggle

2023-12-16 Thread Ivan Krylov
On Sat, 16 Dec 2023 19:41:16 +0200
Christiaan Pieterse  wrote:

> This .R file contained the Roxygen2 comments. (I was very unsure what
> comments to include in this so it might be wrong, I'm unsure)

If you like roxygen2 and would like to keep using it, you're welcome to
keep the roxygen2 comments in your R files. Just don't forget to re-run
roxygenise() every time you update them.

>3. Included a DESCRIPTION file is the 'iopspackage' folder. (Once
> again I was very unsure what to include in this file so it might be
> wrong).

Does it help to follow the guide in "Writing R Extensions"?
Start with the mandatory fields Package, Version, License,
Description, Title and Authors@R (to generate Author: and Maintainer:
from).

>8. I checked the tar file using *R CMD check --as-cran
>"iopspackage_2.1.0.tar.gz". *This yielded errors, warnings and
> notes which I don't know how to solve and suspect are due to me
> setting the file up wrong.

Can you show us the log from the check? It should be fine to copy &
paste the entries that don't say OK (i.e. NOTEs, WARNINGs, and ERRORs).

Most of what you'll need to fix is described in "Writing R Extensions"
and the CRAN policy.

> I've been told before not to include my package as an attachment, so
> can someone please help me with the submission process?

Can you publish the code anywhere? Ideally, this place should provide
instant access to the latest version of every source code file inside
your package. The most popular option nowadays is GitHub, but it does
not have to be GitHub. Two GDPR-friendly alternatives are Codeberg and
SourceHut. If you don't like Git (which does take effort to learn),
there's R-Forge and Chiselapp.com. If you don't want to learn software
version control right now, any free web/file hosting will suffice as
long as you keep the files updated and accessible.

> It is a 10mb file

I think we've discussed this before. A 10-megabyte package mostly
consisting of example data is not a good fit for CRAN. It's possible to
use free Web hosting services to distribute data packages (see the
'drat' package and the function tools::write_PACKAGES) separate from the
CRAN package that should mainly contain the code.
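
A minimal sketch of preparing such a repository by hand follows. The
directory and the file name of the data package are made up; the
resulting top-level directory would then be uploaded to the Web
hosting of your choice:

```r
# Source packages live under src/contrib of the repository root
repo <- file.path(tempfile("repo-"), "src", "contrib")
dir.create(repo, recursive = TRUE)

# Copy the built data package here first, e.g. (hypothetical file name):
# file.copy("iopsdata_1.0.0.tar.gz", repo)

# Write the PACKAGES index files; returns the number of packages found
n <- tools::write_PACKAGES(repo, type = "source")
n
```

Users would then pass the address of the uploaded directory as the
`repos` argument of install.packages().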

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Getting 'Rscript: Bad address' error when CRAN build my package on windows platforms

2023-12-14 Thread Ivan Krylov
On Thu, 7 Dec 2023 19:29:46 +0100
Serge  wrote:

> g++ -std=gnu++11 -I"D:/RCompile/recent/R-4.3.2/include" -DNDEBUG
> -I../inst/projects/ -I../inst/include/ -DIS_RTKPP_LIB -DSTKUSELAPACK
> -I'D:/RCompile/CRANpkg/lib/4.3/Rcpp/include'
> -I"d:/rtools43/x86_64-w64-mingw32.static.posix/include" -fopenmp -O2
> -Wall -mfpmath=sse -msse2 -mstackrealign -c fastRand.cpp -o
> fastRand.o
> /bin/sh: line 1: /x86_64-w64-mingw32.static.posix/bin/g++: Bad address

I don't think this is a problem with your package. The shell says "Bad
address" when it gets an EFAULT while trying to run a program:

$ strace -f -e fault=execve:error=EFAULT:when=1 -e trace=execve \
 /bin/sh -c '/usr/bin/g++'
execve("/bin/sh", ["/bin/sh", "-c", "/usr/bin/g++"], [/* 51 vars */]) = 0
strace: Process 20756 attached
[pid 20756] execve("/usr/bin/g++", ["/usr/bin/g++"], [/* 50 vars */]) = -1 
EFAULT (Bad address) (INJECTED)
/bin/sh: 1: /usr/bin/g++: Bad address

There is not enough information to find out why this happens. I think
that since Rtools are based on MSYS2 which is based on Cygwin, the
place to look for EFAULT is Cygwin's implementation of the exec()
system call. Indeed, there's one such place there, after the giant
structured exception handling __try block, where errno is set to EFAULT
if the system got such an exception while launching a process without
previously setting errno to ENOMEM:
https://cygwin.com/cgit/newlib-cygwin/tree/winsup/cygwin/spawn.cc?id=ca2a4ec243627b19f0ac2c7262703f81712f3be4#n947

Does this happen every time? If not, I think the problem was
Win-Builder temporarily running out of memory.

P.S.: Please don't compose HTML e-mail to this list with Thunderbird.
Thunderbird's auto-generated plain text version is all we get, and it's
severely mangled:
https://stat.ethz.ch/pipermail/r-package-devel/2023q4/010178.html

-- 
Best regards,
Ivan



Re: [R-pkg-devel] How to fix: non-standard things in the check directory: 'NUL' ?

2023-12-13 Thread Ivan Krylov
Dear Friedemann von Lampe,

Welcome to R-package-devel! This is a good, concise description of the
problem, but please also provide a link to your code in the future.

On Wed, 13 Dec 2023 09:58:41 +0100
Friedemann von Lampe wrote:

> Flavor: r-devel-linux-x86_64-debian-gcc
> Check: for non-standard things in the check directory, Result: NOTE
> Found the following files/directories:
> 'NUL'

The file named 'NUL' is created in the function screeplot_NMDS:

R/screeplot_NMDS.R:  capture.output(nmds_i <- invisible(metaMDS(matrix,
distance = distance, k = i, trymax = trymax, engine = "monoMDS",
autotransform = autotransform)), file='NUL')

https://github.com/fvlampe/goeveg/blob/db94c4a567eeac67b6df1df5a4f2d1aa771e629a/R/screeplot_NMDS.R#L76

On Windows, everything goes right and the output is redirected to the
null device. On Linux, the null device is '/dev/null', not 'NUL', and
this name doesn't hold any special powers, so the file with this name
gets created.

Use the base function nullfile() to obtain the name of the null device
in a portable manner. I think you can also not supply the `file`
argument and ignore the return value of the capture.output(...)
expression. This may be less efficient if there's truly a lot of output.
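
For illustration only, here is the same platform split sketched in C
(in R code, nullfile() already makes this choice for you). Both helper
names are made up:

```c
#include <stdio.h>

/* Hypothetical helper mirroring the choice R's nullfile() makes:
 * "NUL" on Windows, "/dev/null" elsewhere. */
static const char *null_device_path(void)
{
#ifdef _WIN32
    return "NUL";
#else
    return "/dev/null";
#endif
}

/* Writes 'text' to the null device. Returns 0 on success, -1 on
 * failure. No regular file is left behind in the working directory. */
static int discard_output(const char *text)
{
    FILE *sink = fopen(null_device_path(), "w");
    if (sink == NULL)
        return -1;
    int failed = fputs(text, sink) == EOF;
    if (fclose(sink) != 0)
        failed = 1;
    return failed ? -1 : 0;
}
```

Hard-coding "NUL" instead of calling null_device_path() would create an
ordinary file of that name on Linux, which is what the check found.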

-- 
Best regards,
Ivan



Re: [R-pkg-devel] vignette with "Run Examples"

2023-12-12 Thread Ivan Krylov
On Tue, 12 Dec 2023 08:24:11 +0100
Sigbert Klinke  wrote:

> is it possible to get a button or link to run an example in a vignette

Technically, yes, but very hard to implement in practice.

Vignettes are a form of literate programming, expressed in terms of
files: there's a source file containing code mixed with prose, and
there are two programs, one of which extracts the code into a runnable
.R file and the other renders the code together with prose and any
resulting plots into a human-readable document. A link to run examples
implies that there's R running somewhere, which cannot be guaranteed by
the time the human-readable document is opened by the human.

One way around this problem would be to embed a copy of webR [*] in the
document so that R would run in the browser. This involves a
significant developer effort and would either bloat your vignette to
the size of an R installation or make it depend on external resources
to load webR from (that could go away or spy on the user). webR is
still experimental; last time I tried it, it crashed the browser tab
when I invoked functions from the quadprog package.

Another way would be to add a hack to the vignette engine to start a
server at vignette rendering time, insert the link to this server into
the vignette as it's being rendered and hope that the server is still
running by the time the vignette is opened. This would require the user
to re-render the vignette every time they restart the server.

Technically, one could also invent a completely new kind of vignette
engine that would output self-contained executable files with a
document rendering engine and R built in, so that a click on the "Run
examples" would use that built-in R. This is basically the webR
solution without the web and with a lot of extra pain.

You could also fake some of it by writing extra JavaScript (with the
help of third-party statistics libraries, e.g. [**]) to do the same
thing in the browser as is done in R, but that's still a lot of work
for little benefit.

Yet another way would be to make these links point to an external
service somewhere on the Internet that would run the R code. Since R is
not designed to work with untrusted input (not to mention untrusted
users entering code), that would be an information security nightmare
both on your side (R would have to run in locked-down read-only
disposable virtual machines hardened against sandbox escape and
privilege escalation exploits) and on the GDPR side of things.

There are doubtlessly more approaches, but I think they would all be
this convoluted or worse.

-- 
Best regards,
Ivan

[*] https://docs.r-wasm.org/webr/latest/

[**] https://github.com/svkucheryavski/mdatools-js



Re: [R-pkg-devel] Fortran compilation issues (errors/warnings) on Fedora clang/llvm

2023-12-11 Thread Ivan Krylov
On Mon, 11 Dec 2023 10:02:14 +
Koen Hufkens wrote:

> error:
> loc("/data/gannet/ripley/R/packages/incoming/rsofun.Rcheck/00_pkg_src/rsofun/src/interface_biosphere_biomee.mod.f90":105:62):
> /data/gannet/ripley/Sources2/LLVM/17.0/llvm-project-17.0.3.src/flang/lib/Lower/ConvertType.cpp:392:
> not yet implemented: derived type components with non default lower
> bounds

Experience shows that reporting bugs in flang-new may get them
acknowledged but not fixed [1]. Have you tried detecting the flang-new
compiler from the ./configure script and only adding the workaround
flag if the compiler matches? Something like the following:

#!/bin/sh
# taken from Writing R Extensions, 1.2. Configure and cleanup
: ${R_HOME=`R RHOME`}
if test -z "${R_HOME}"; then
  echo "could not determine R_HOME"
  exit 1
fi
# determine the Fortran 9x compiler
FC="`"${R_HOME}/bin/R" CMD config FC`"
# Use --version output to determine the compiler
# A different compiler will either accept --version and print something
# else or fail due to "unknown argument". In both cases the branch will
# not be taken
if "$FC" --version 2>/dev/null | grep -q 'flang-new version 17'; then
  echo "PKG_FCFLAGS = `"${R_HOME}/bin/R" CMD config FCFLAGS`" \
    " -fc-prototypes-external" >>src/Makevars
fi

You will still get an "unsupported flag" warning on machines with
flang-new, but at least it won't warn with standards-compliant
compilers or crash with flang-new. Unfortunately, I don't know whether
this is acceptable for CRAN, sorry.

Good luck!

-- 
Best regards,
Ivan

[1] https://stat.ethz.ch/pipermail/r-package-devel/2023q4/009987.html



Re: [R-pkg-devel] problems with Maintainers in DESCRIPTION file

2023-12-08 Thread Ivan Krylov
Dear Olga Viedma,

Welcome to R-package-devel!

When looking for advice here, it's best to link to the package code and
the R CMD check report you have received.

On Thu, 7 Dec 2023 20:58:01 +
María Olga Viedma Sillero  wrote:

> Flavor: r-devel-linux-x86_64-debian-gcc, r-devel-windows-x86_64
> 
> Check: CRAN incoming feasibility, Result: NOTE
> 
>   Maintainer: 'Olga Viedma
> mailto:olga.vie...@uclm.es>>'

The maintainer address is always printed here, with a
Note_for_CRAN_maintainers instead of NOTE if all other tests pass
successfully. Unfortunately, this is not the whole NOTE:
https://win-builder.r-project.org/incoming_pretest/LadderFuelsR_0.0.1_20231207_204411/Debian/00check.log

>> New submission

This NOTE is expected to happen. New submissions require manual review.

>> Possibly misspelled words in DESCRIPTION:
>>  LiDAR (3:67, 9:94)

I think you can mention that this is an established term of art in
the comment field. Only new submissions are spell-checked, so once the
package is accepted, this won't reappear either. Alternatively, you can
serialize a character vector of words into an .rds file under .aspell
(see help(aspell)) so that it will be accepted by the automated check:

saveRDS(c('LiDAR'), '.aspell/dictionary.rds')

(This is untested but follows the documentation and the R CMD check
source code.)

>> Found the following (possibly) invalid URLs:
>>   URL: https://travis-ci.com/olgaviedma/LadderFuelsR (moved to
>>   https://www.travis-ci.com/olgaviedma/LadderFuelsR)
>> From: README.md
>> Status: 301
>> Message: Moved Permanently
>>  For content that is 'Moved Permanently', please change http to
>>  https, add trailing slashes, or replace the old by the new URL.

This automated check makes sure that any URL listed in the
documentation (README or help pages) directly points to a web page.
It's not exactly obvious unless you run R CMD check --as-cran yourself,
but https://travis-ci.com/olgaviedma/LadderFuelsR redirects to
https://www.travis-ci.com/olgaviedma/LadderFuelsR. If you add the
missing www., you will get rid of the problem.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Troubleshooting Winbuilder Run Timeouts

2023-12-07 Thread Ivan Krylov
On Thu, 7 Dec 2023 12:22:11 -0500
Andrew Robbins  wrote:

> I just benchmarked devtools::check() (which is actually a build
> followed by a check) on my Ryzen 5700X/intel 660p system with a bunch
> of stuff going on in the background and got just over 11 minutes. I
> don't actually think its timing out. No output differences, of course.

Then it's probably not a timeout, thanks for confirming this. If you've
eliminated Rcpp::Rcout, OpenMP and std::chrono::high_resolution_clock,
the crash must be happening inside
planc::INMF::computeObjectiveError for T = arma::mat but not T =
arma::sp_mat.

I see that inmf() is one of the first examples that tests the fitting
functions (and exercises Armadillo), but not the first such example.
Inside RcppPlanc-Ex.R, there is bppnnls() which should have exercised
both arma::mat and arma::sp_mat cases, but with a different algorithm.

Yet another unlikely idea of mine is that something is going wrong when
linking the package with OpenBLAS while another BLAS (provided by R) is
present in the address space. Not a lot of CRAN packages do this, and
some of those that did are now archived:


Evidently it works on Linux and on our Windows machines, but maybe the
Win-Builder configuration uncovers something that we don't see
ourselves. I think we need the help of Uwe Ligges to see a backtrace,
otherwise I don't know what to change.

(I wonder if it's possible to enhance the current Win-Builder setup
with something like https://github.com/jrfonseca/drmingw to catch the
process crashing and print the annotated traceback somewhere for the
Win-Builder scripts to pick it up. The tricky part will be to only
handle `R CMD check` launched by the scripts but not anything else.)

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Troubleshooting Winbuilder Run Timeouts

2023-12-07 Thread Ivan Krylov
On Tue, 5 Dec 2023 17:16:37 -0500
Andrew Robbins via R-package-devel 
wrote:

> Every time, it reports a runtime of zero seconds for INMF res1

I get exactly 0 seconds for both res1 and res2 on both my Windows
machines and on my Linux machine. I had discounted that as an integer
type used somewhere for elapsed time. Is that (exactly 0 seconds and an
objective value of 44244.4) supposed to indicate a problem?

(On my Linux machine, I had to crudely hack in linking with -lstdc++fs
because some of the dependencies of RcppPlanc.so didn't pick it up
automatically.)

> and hangs/is killed on initiation of INMF res2 before the
> initialization of RcppProgress. 

Can we eliminate it running out of time? On my Windows 10 LTSC machine
(AMD Ryzen 5 2400G from 2018, SATA SSD, nothing else going on at the
time) the complete R CMD check takes slightly more than 10.5 minutes.
An R-release check running on an Intel Xeon E5-2680 from 2016 with
other checks going on at the same time could plausibly take more than 20
minutes and get terminated [1], with some of the buffered output
never reaching the RcppPlanc-Ex.Rout file... but then it happens on a
much faster R-devel checking machine [2], which runs on an AMD EPYC
7443, and we never see any differences in the output. Right?

-- 
Best regards,
Ivan

[1]
https://svn.r-project.org/R-dev-web/trunk/CRAN/QA/Uwe/make/killR.R
https://svn.r-project.org/R-dev-web/trunk/CRAN/QA/Uwe/make-guest/CRANguest.R

[2]
https://win-builder.r-project.org/incoming_pretest/RcppPlanc_1.0.0_20231118_001540/Windows/00check.log



Re: [R-pkg-devel] NOTE: Examples with CPU time > 2.5 times elapsed time

2023-12-04 Thread Ivan Krylov
Dear Artur,

You've got a well-written package. There are some parts I wasn't able
to understand (e.g. changing the class of a variable by reference using
SET_CLASS and later changing it back), but there are no obvious places
where a mistake could be hiding.

On Fri, 1 Dec 2023 23:43:32 +
Artur Araujo  wrote:

> Examples with CPU time > 2.5 times elapsed time
> user system elapsed  ratio
> contour.TPCmsm 2.605  0.132   0.108 25.343

I managed to reproduce this, but not before compiling R-devel r85646
on a Debian Testing virtual machine using clang-17 and flang-17 with
libomp-17-dev installed. In particular, this doesn't seem to happen
with latest R-devel built using (much older versions of) the GNU
toolchain.
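
For reference, the figures in the quoted NOTE are consistent with the
ratio column being (user + system) / elapsed. This is inferred from the
numbers, not taken from the check's source code, and the function name
below is made up:

```c
/* Ratio as it appears in the NOTE: total CPU time (user + system)
 * divided by elapsed (wall-clock) time.
 * Returns -1.0 if 'elapsed' is not positive. */
static double cpu_to_elapsed_ratio(double user, double sys, double elapsed)
{
    if (elapsed <= 0.0)
        return -1.0;
    return (user + sys) / elapsed;
}
```

With user = 2.605, system = 0.132 and elapsed = 0.108 this gives about
25.34, far above the 2.5 limit, which is what one would expect when
several threads are busy at the same time.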

There was a false start where I saw pthread_create breakpoints hit
inside libpango while contour() was running, but that couldn't be your
problem because examples run with the pdf() device active. Instead,
when source()ing the TPmsm-Ex.R file, the breakpoint fired elsewhere:

### Name: as.data.frame.survTP
### Title: as.data.frame method for a survTP object
### Aliases: as.data.frame.survTP
### Keywords: manip metho  [TRUNCATED]


  
Breakpoint 7, __pthread_create_2_1 (newthread=0x7fff5a70, 
attr=0x7fff5a78, start_routine=0x775882a0 <__kmp_launch_worker()>, 
arg=0x572d9e80)
at ./nptl/pthread_create.c:623
(gdb) bt
#0  __pthread_create_2_1 (newthread=0x7fff5a70, attr=0x7fff5a78, 
start_routine=0x775882a0 <__kmp_launch_worker()>, arg=0x572d9e80)
at ./nptl/pthread_create.c:623
#1  0x775879b4 in __kmp_create_worker () at 
build-llvm/tools/clang/stage2-bins/openmp/runtime/src/z_Linux_util.cpp:795
#2  0x77518899 in __kmp_allocate_thread () at 
build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:4677
#3  0x7751078a in __kmp_allocate_team () at 
build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:5384
#4  0x77511813 in __kmp_fork_call () at 
build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_runtime.cpp:2122
#5  0x774ffc81 in __kmpc_fork_call () at 
build-llvm/tools/clang/stage2-bins/openmp/runtime/src/kmp_csupport.cpp:300
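#6  0x7465100e in cens2 (pT1=<optimized out>, pE1=<optimized out>,
pS=<optimized out>, pE=<optimized out>, pn=<optimized out>, tfunc=<optimized out>,
pcorr=<optimized out>, pdistpar=<optimized out>, cfunc=<optimized out>,
pcenspar=<optimized out>, pstate2prob=<optimized out>) at dgpTP.c:83
#7  dgpTP (n=<optimized out>, corr=<optimized out>, dist=<optimized out>,
distpar=<optimized out>, modelcens=<optimized out>, censpar=<optimized out>,
state2prob=<optimized out>) at dgpTP.c:178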
#8  0x55645cba in R_doDotCall (fun=0x0, nargs=nargs@entry=7, 
cargs=cargs@entry=0x7fff63b0, call=call@entry=0x573b62b8)
at ../../../R-svn/src/main/dotcode.c:1498
(gdb) frame 6
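#6  0x7465100e in cens2 (pT1=<optimized out>, pE1=<optimized out>,
pS=<optimized out>, pE=<optimized out>, pn=<optimized out>, tfunc=<optimized out>,
pcorr=<optimized out>, pdistpar=<optimized out>, cfunc=<optimized out>,
pcenspar=<optimized out>, pstate2prob=<optimized out>) at dgpTP.c:83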
83  #pragma omp parallel num_threads(global_num_threads)
(gdb) p global_num_threads
$4 = 8

Not sure why the CPU time gets counted for the contour() example
instead of as.data.frame.survTP - do libomp threads not get counted
until they terminate or something? - but since the contour.TPCmsm
example directly follows as.data.frame.survTP, the call to dgpTP here
must be the culprit.

I would like to suggest a new diagnostic for the CPU time / elapsed
time tests. If any examples have been found to exceed the time ratio
limit and R is running on Linux with a special environment variable
set, the examples should be re-run under a debugger. The debugger
should set a breakpoint on the clone _system call_ (ideally, all system
calls that can create threads or child processes) and print a C-level
backtrace when it fires. Ideally, it should call something like
R_ConciseTraceback to obtain an R-level traceback too, but it's not an
API (at least yet) and I'm not sure it's generally safe to do this in
the middle of an R call. If there is interest, I can try to see if gdb
can be scripted to make this happen. I think it must be a ptrace
debugger and not an LD_PRELOAD wrapper because there are too many ways
to start threads and processes. For example, with libomp,
pthread_create() breakpoints fire while clone() breakpoints do not,
even though something later does make the clone() system call.

This won't help if the threads get started separately and only later
get utilised in a different example, but I wasn't able to put the
output of `perf report` to good use when debugging similar problems in
a different package. Other ideas for tracing the CPU usage culprits are
also welcome.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Example files for functions reading binary files

2023-12-03 Thread Ivan Krylov
On Sun, 3 Dec 2023 09:33:30 +
Rafael Ayala Hernandez  wrote:

> The binary files that are read contain just arrays of coefficients
> and metadata about these.
> 
> I would like to provide a small file of these to be used in the
> examples of the man page for the functions that read them.

Files can be "binary" in multiple different senses. Your files are
binary in the sense that they contain data not in a plain text format.
This usage is fine. PNG images and files produced by save() or
saveRDS() are binary in this sense too.

Files can also be "binary" in the sense that they are not
human-readable source code for a computer program, but compiled
executable code. *.exe's and *.dll's, *.so's and *.dylib's, and also
*.jar files are binaries in this sense. They are forbidden because
they are much harder to inspect than the source code they have been
compiled from.

It's not always so clear-cut, though. One example is R's own
serialization format where one could save a function and later restore
and run it. If R did not allow this, it would be much less convenient
to use, despite being slightly more secure. In another example, due to
a vulnerability, a binary data file (or even a text file, say, a
JavaScript program) may contain an encoded computer program that the
data-reading program is confused into executing instead of merely
reading. (E.g. jailbreak.me used to confuse iPhones into overwriting
parts of their firmware even though that's normally not allowed for a
website.)

It is important to write parsers for binary and text files so that even
if these files are produced by an adversary, the program reading them
will not be tricked into executing program code stored inside. You
should be mostly safe if you're doing this in pure R (or else R may
need to be fixed).
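
If part of the reading ever moves into compiled code, the same
principle could look like the sketch below. The file layout, the
function name and its parameters are all invented for illustration
(your real format will differ); the point is only that every length
taken from the file is checked before it is used:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(uint64_t), "expects 8-byte double");

/* Hypothetical layout, for illustration only:
 *   bytes 0..3  number of coefficients, unsigned 32-bit little-endian
 *   bytes 4..   that many 8-byte IEEE 754 doubles, little-endian
 * Returns 0 and stores the count in *n_out on success. Returns -1 if
 * the input is truncated, has trailing bytes, or declares more values
 * than fit into 'out'. */
static int parse_coefficients(const unsigned char *buf, size_t len,
                              double *out, size_t out_cap, size_t *n_out)
{
    if (len < 4)
        return -1;
    uint32_t count = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
                     (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;

    /* Never trust the declared count: compare it with the bytes that
     * are actually present and with the room in the output array.
     * Dividing (instead of multiplying count by 8) cannot overflow. */
    size_t payload = len - 4;
    if (payload % 8 != 0 || payload / 8 != count || count > out_cap)
        return -1;

    for (size_t i = 0; i < count; i++) {
        const unsigned char *p = buf + 4 + 8 * i;
        uint64_t bits = 0;
        for (int b = 7; b >= 0; b--)      /* assemble little-endian value */
            bits = bits << 8 | p[b];
        memcpy(&out[i], &bits, sizeof bits);
    }
    *n_out = count;
    return 0;
}
```

A file that claims two coefficients but carries only one is rejected
before anything is written to the output array.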

-- 
Best regards,
Ivan



Re: [R-pkg-devel] URL woes at CRAN: Anaconda edition

2023-11-30 Thread Ivan Krylov
On Thu, 30 Nov 2023 10:49:36 -0600
Dirk Eddelbuettel  wrote:

>   Found the following (possibly) invalid URLs:
> URL: https://anaconda.org/conda-forge/r-tiledb
>   From: README.md
>   Status: 400
>   Message: Bad Request

The problem is that https://anaconda.org/conda-forge/PACKAGE fails to
answer HEAD requests and only answers GET requests. HEAD differs
from GET in that there is no need to send the actual content of the page
in response to the request; it is enough to say "200 OK" (and pass the
check) or, as it happens here, "400 Bad Request".

It may help to ask ContinuumIO to support HEAD requests by filing a bug
report for the Anaconda.org website. It may be worth referencing
RFC 9110, section 9.3.2, in particular:

>> The HEAD method is identical to GET except that the server MUST NOT
>> send content in the response.

>> The server SHOULD send the same header fields in response to a HEAD
>> request as it would have sent if the request method had been GET.

Arguably, the spirit of the standard is being violated here (if not the
letter, which only mentions headers).

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Troubleshooting Winbuilder Run Timeouts

2023-11-30 Thread Ivan Krylov
Hi Andrew Robbins,

On Mon, 27 Nov 2023 12:22:44 -0500
Andrew Robbins via R-package-devel wrote:

> I'm currently attempting to submit a package to CRAN and am getting 
> timeouts during the "running examples" phase of the winbuilder tests 
> that I cannot reproduce locally or on the r-hub windows runner

Do I understand it correctly that in multiple win-builder tests, you
always get a timeout in the same inmf example? Or is it different
examples?

There doesn't seem to be a lot of code between Rcpp::Rcerr << "INMF
started, niter=" << niter << std::endl; and Progress p(niter, verbose);
to crash or hang in. I've tried to reproduce the problem myself, but it
doesn't happen to me either.

So far I only noticed minor things, like the build-time dependency on
pkgbuild (from tools/WindowsOverrides/hdf5/hdf5-config.cmake) that may
be worth declaring in your DESCRIPTION. I also think that your patch to
hwloc can be reduced by adding a -DNDEBUG somewhere instead of patching
out calls to assert(). I think that the C standard (e.g. C99, WG14 draft
version N1256, 7.2) requires assert() to expand to ((void)0) if NDEBUG
is defined.
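
A small sketch of that behaviour follows. The function names are made
up, and in a real build NDEBUG would normally come from -DNDEBUG on the
compiler command line, not from a #define in the source (where exactly
that belongs in the hwloc build is something I have not checked):

```c
/* Part 1: NDEBUG is defined at the point where <assert.h> is included. */
#define NDEBUG
#include <assert.h>

static int evaluations_with_ndebug(void)
{
    int evaluated = 0;
    /* assert(x) expands to ((void)0) here, so the expression and its
     * side effect are never evaluated. */
    assert(++evaluated > 0);
    return evaluated;
}

/* Part 2: <assert.h> may be included again; assert is redefined
 * according to the current state of NDEBUG. */
#undef NDEBUG
#include <assert.h>

static int evaluations_without_ndebug(void)
{
    int evaluated = 0;
    /* The expression is evaluated now; it is true, so nothing aborts. */
    assert(++evaluated > 0);
    return evaluated;
}
```

The first function returns 0 and the second returns 1. This is also why
assert() arguments should not have side effects.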

The amount of systems engineering effort to do the right thing that
went into this package is really impressive; I'd hate to see it fail.

-- 
Best regards,
Ivan



Re: [R-pkg-devel] Canonical way to Rprintf R_xlen_t

2023-11-28 Thread Ivan Krylov
On Wed, 29 Nov 2023 06:11:23 +1100
Hugh Parsonage  wrote:

> Rprintf("%lld", (long long) xlength(x));

This is fine. long longs are guaranteed to be at least 64 bits in size
and are signed, just like lengths in R.

> Rprintf("%td, xlength(x));

Maybe if you cast it to ptrdiff_t first. Otherwise I would expect this
to fail on an (increasingly rare) 32-bit system where R_xlen_t is int
(which is an implementation detail).

In my opinion, ptrdiff_t is just the right type for array lengths if
they have to be signed (which is useful for Fortran interoperability),
so Rprintf("%td", (ptrdiff_t)xlength(x)) would be my preferred option
for now. By definition of ptrdiff_t, you can be sure [*] that there
won't be any vectors on your system longer than PTRDIFF_MAX.
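
A sketch of both casts in plain C, using snprintf in place of Rprintf
(which takes the same printf-style formats) and a stand-in typedef,
because R_xlen_t only comes with the R headers. The names
example_xlen_t, format_length_td and format_length_lld are made up:

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

/* long long is at least 64 bits wide on a conforming C99 compiler. */
_Static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long too narrow");

/* Stand-in for R_xlen_t. Assumption: a signed integer type that is no
 * wider than ptrdiff_t. */
typedef ptrdiff_t example_xlen_t;

/* "%td" expects exactly ptrdiff_t, hence the explicit cast. */
static int format_length_td(char *buf, size_t size, example_xlen_t n)
{
    return snprintf(buf, size, "%td", (ptrdiff_t)n);
}

/* "%lld" expects exactly long long, hence the explicit cast. */
static int format_length_lld(char *buf, size_t size, example_xlen_t n)
{
    return snprintf(buf, size, "%lld", (long long)n);
}
```

Both produce the same digits; the casts are what make the call correct
regardless of how the length type is defined on a given platform.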

> using the string macro found in Mr Kalibera's commit of r85641:
> R_PRIdXLEN_T

I think this will be the best solution once we can afford
having our packages depend on R >= 4.4.

-- 
Best regards,
Ivan

[*] https://en.cppreference.com/w/c/types/ptrdiff_t posits that there
may exist long vectors that fit in SIZE_MAX (unsigned) elements but not
PTRDIFF_MAX (signed) elements. If such a vector exists, subtracting two
pointers to its insides may result in undefined behaviour. This may be
already possible in a 32-bit process on Linux running with a 3G
user-space / 1G kernel-space split. The only way around the problem is
to use unsigned types for lengths, but that would preclude Fortran
compatibility.


