On 4/24/25 00:18, Michael Chirico wrote:
In that case it seems like just erroring instead of returning invalid
pointers is a much friendlier option. Why give developers an unpinned
grenade to carry around?
That would be too strict at this point. There is too much code around
depending on that holding on to an invalid pointer (but not
dereferencing it) is ok, and it is currently still working on the
current platforms where R is used.
The mechanism of returning an invalid pointer from functions like
INTEGER() called on an empty vector (sometimes this is called
"poisoning") is much more precise. When this causes a crash, it is
because the program actually used the pointer, which is a clear error
(and it is a clear error regardless of the platform). There are no
false alarms: when package authors have to debug a package because of
this, there is really an error in the code to be fixed. And, for a large
number of CRAN packages where a recent improvement in the poisoning
showed up more cases, the debugging was done by the CRAN team for the
package authors, in some cases even providing patches.
Best
Tomas
On Wed, Apr 23, 2025 at 1:38 PM Tomas Kalibera <tomas.kalib...@gmail.com> wrote:
On 4/23/25 19:03, Michael Chirico wrote:
h/t Tim Taylor for pointing out my blindspot :)
We have Memcpy() in API already [1], which wraps a 0-aware R_chk_memcpy() [2].
We don't quite have Memset() in API, though; instead we have Memzero()
[3] for R_chk_memset(s, 0, n) which is 0-aware memset() [4].
I don't think that using wrappers for this is the right approach. Even
loading an invalid pointer is undefined behavior. While it wouldn't
matter on typical hardware where R is used today, it could cause a crash
on some hardware, and it may well be that various checkers will start
warning about such things. Also, holding (intentionally) invalid
pointers makes debugging harder.
At least in new code, I would avoid holding invalid pointers. Certainly
I don't think we should be adding wrappers to R for C functions to make
them work with invalid pointers.
Best
Tomas
[1]
https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L59-L60
[2]
https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3575-L3580
[3]
https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/include/R_ext/RS.h#L62-L63
[4]
https://github.com/r-devel/r-svn/blob/9e597ca8132e8b6298b0bedb39fa8deba5c819df/src/main/memory.c#L3582-L3587
Mike C
On Wed, Apr 23, 2025 at 8:57 AM Michael Chirico
<michaelchiri...@gmail.com> wrote:
From R 4.5.0 [1], all builds of R discourage use of INTEGER() [and
friends REAL(), ... and *_RO() equivalents] on length-0 SEXP [2].
Before R 4.5.0, this was the behavior under --enable-strict-barrier.
That means the following can segfault under strict builds (e.g.
-fsanitize=alignment and -O0):
SEXP x = PROTECT(Rf_allocVector(INTSXP, 0));
SEXP y = PROTECT(Rf_allocVector(INTSXP, 0));
const int *x = INTEGER_RO(x); // invalid!
int *y = INTEGER(y);
memcpy(y, x, 0); // alluring, but undefined behavior!
There are a number of CRAN packages that fall victim to this, see e.g.
this PR and others linked to it [3]. I'm sure there are dozens if not
hundreds of other equivalent bugs waiting to be discovered that just
aren't covered by existing tests.
{rlang} took the approach to define r_memcpy() and r_memset() which
wrap memcpy() and memset(), resp., with an added length-0 check [4]; I
think R itself should offer these (probably more consistently styled
as R_Memcpy() and R_Memset()).
(NB there's a possibility I'm still not fully grasping what's going on here :) )
Mike C
[1] related: https://stat.ethz.ch/pipermail/r-devel/2024-June/083456.html
[2]
https://github.com/r-devel/r-svn/blob/2b29e52e1c4e3d26b649cb7ac320b8a3dd13de30/src/main/memory.c#L4146
[3] https://github.com/r-lib/vctrs/pull/1968
[4] https://github.com/r-lib/rlang/pull/1797
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel