Oops... I forgot to attach dotC_NULL.c, the C source file for the test case.
Pavel Krivitsky

On Fri, 2012-05-04 at 13:42 -0400, Pavel N. Krivitsky wrote:
> Dear R-devel,
>
> While tracking down some hard-to-reproduce bugs in a package I maintain,
> I stumbled on a behavior change between R 2.15.0 and the current R-devel
> (or SVN trunk).
>
> In 2.15.0 and earlier, if you passed a 0-length vector of the right
> mode (e.g., double(0) or integer(0)) as one of the arguments in a .C()
> call with DUP=TRUE (the default), the C routine would be passed NULL
> (the C pointer, not R NULL) in the corresponding argument. The current
> development version instead passes it a pointer to what appears to be
> the memory location immediately following the SEXP that holds the
> metadata for the argument. If the argument has length 0, this is often
> memory belonging to a different R object. (DUP=FALSE in 2.15.0
> appears to have the same behavior as R-devel.)
>
> The .C() documentation and Writing R Extensions don't explicitly specify
> a behavior for 0-length vectors, so I don't know if this change is
> intentional, or whether it was a side effect of the following news item:
>
>     .C() and .Fortran() do less copying: arguments which are raw,
>     logical, integer, real or complex vectors and are unnamed are not
>     copied before the call, and (named or not) are not copied after
>     the call. Lists are no longer copied (they are supposed to be
>     used read-only in the C code).
>
> Was the change in the empty vector behavior intentional?
>
> It seems to me that standardizing on the behavior of giving the C
> routine NULL is safer, more consistent with other memory-related
> routines, and more convenient: whereas dereferencing a NULL pointer is
> an immediate (and therefore easily traced) segfault, dereferencing an
> invalid pointer that is nevertheless in the general memory area
> allocated to the program often causes subtle errors down the line;
> R_alloc asked to allocate 0 bytes returns NULL, at least on my platform;
> and the C routine can easily check if a pointer is NULL, but with the
> R-devel behavior, the programmer has to add an explicit way of telling
> that an empty vector was passed.
>
> I've attached a small test case (dotC_NULL.* files) that shows the
> difference. The C file should be built with R CMD SHLIB, and the R file
> calls the functions in the library with a variety of arguments. The
> output I get from running
>     R CMD BATCH --no-timing --vanilla --slave dotC_NULL.R
> on R 2.15.0, R trunk, and R trunk with my patch (described below) is
> attached.
>
> The attached patch (dotC_NULL.patch) against the current trunk
> (affecting src/main/dotcode.c) restores the old behavior for DUP=TRUE
> (i.e., 0-length vector -> NULL pointer) and extends it to the DUP=FALSE
> case. It does so by checking if an argument --- if it's of mode raw,
> integer, real, or complex --- to a .C() or .Fortran() call has length 0,
> and, if so, sets the pointer to be passed to NULL and then skips the
> copying of the C routine's changes back to the R object for that
> argument. The additional computing cost should be negligible (i.e.,
> checking if the vector length equals 0 and breaking out of a switch
> statement if so).
>
> The patch appears to work, at least for my package, and R CMD check
> passes for all recommended packages (on my 64-bit Linux system), but
> this is my first time working with R's internals, so handle with care.
>
> Best,
> Pavel Krivitsky
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
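
To make the convenience argument in the quoted message concrete, here is a minimal sketch of the two coding styles it contrasts. This is not the attached dotC_NULL.c; the routine names wsum_null and wsum_len and the optional "weights" argument are hypothetical, chosen only for illustration. The code is plain C with no R headers, written in the .C() calling style (every argument is a pointer, the routine returns void).

```c
#include <assert.h>
#include <stddef.h>

/* Variant A (hypothetical): relies on the old convention that a 0-length
 * vector arrives as a NULL pointer. A NULL w means "no weights supplied". */
void wsum_null(double *x, double *w, int *n, double *out)
{
    assert(n != NULL && out != NULL);   /* scalars are always passed */
    double total = 0.0;
    for (int i = 0; i < *n; i++)
        total += (w == NULL) ? x[i] : w[i] * x[i];
    *out = total;
}

/* Variant B (hypothetical): carries an explicit length nw for w, so the
 * routine never inspects or dereferences w when *nw == 0. This works
 * whether the caller passes NULL or some other pointer for an empty w. */
void wsum_len(double *x, double *w, int *n, int *nw, double *out)
{
    assert(n != NULL && nw != NULL && out != NULL);
    int use_w = (*nw > 0);
    double total = 0.0;
    for (int i = 0; i < *n; i++)
        total += use_w ? w[i] * x[i] : x[i];
    *out = total;
}
```

If built with R CMD SHLIB, Variant A might be called as .C("wsum_null", as.double(x), double(0), length(x), out = double(1)). Under the R-devel behavior described in the quoted message, w would then be a non-NULL pointer into unrelated memory and Variant A would silently read it, which is the failure mode the message is concerned about. Variant B avoids this at the cost of one extra argument.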