Splus used a similar method for sampling from "bigdata" objects. One
problem was that sample() is used both for creating a sample and for
scrambling the order of a vector. Scrambling the order of a big vector
wastes time. It would be nice to be able to tell sample() that we don't
care about the
Bug 17337. Note that I get R making the zero-length file on both Windows
and Linux, but the return values are different.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Mon, Sep 11, 2017 at 7:01 AM, Martin Maechler <maech...@stat.math.ethz.ch
> wrote:
> >>>>> Willi
When I mistakenly use file.copy() with a directory for the 'from' argument
and a non-directory for the 'to' and overwrite=TRUE, file.copy returns
FALSE, meaning it could not do the copying. However, it also replaces the
'to' file with a zero-length file.
dir.create( fromDir <- tempfile() )
Should arithmetic operations work on zero-column data.frames (returning a
zero-column data.frame with the same number of rows as the data.frame
argument(s))? Currently we get:
> 1 + data.frame(row.names=c("A","B"))
Error in data.frame(value, row.names = rn, check.names = FALSE, check.rows
=
quote(expr) will make no changes in expr, it just returns its one argument,
unevaluated.
substitute could be used in your lapply(..., library) example to give
library a name instead
of a character string for an input (which might be necessary if the
character.only argument
were not available)
Functions, like your loader(), that use substitute to let users confound
things and their names, should give the user a way to avoid the use of
substitute. E.g., library() has the 'character.only' argument; if TRUE
then the package argument is treated as an ordinary argument and not passed
1: substitute(), when given an argument to a function (which will be a
promise) gives you the unevaluated expression given as the argument:
> L <- list(a=1, b=2, c=3)
> str(lapply(L, function(x) substitute(x)))
List of 3
$ a: language X[[i]]
$ b: language X[[i]]
$ c: language X[[i]]
The 'X'
How should R deal with matrices that have a 'names' attribute? S (and S+)
did not allow an object to have both dims and names but R does. However,
some R functions copy the dims but not the names to the returned value and
some copy both. I don't see a pattern to it. Is there a general rule for
>The function Rmpi::mpi.bcast.cmd() calls eventually something along the
lines of
>
>> scmd <- scmd <- substitute(cmd)
>> arg <- list(...)
>> scmd.arg <-serialize(list(scmd=scmd, arg=arg), NULL)
>> if (length(scmd.arg$args) > 0)
>>do.call(as.character(scmd.arg$scmd), scmd.arg$args, envir =
The multcomp package has code in multcomp:::expression2coef that attaches
the 'coef' attribute to symbols. Since there is only one symbol object in
a session with a given name, this means that this attaching has a global
effect. Should this be quietly allowed or should there be a warning or an
re
wdunlap tibco.com
On Fri, Jun 16, 2017 at 11:53 AM, peter dalgaard <pda...@gmail.com> wrote:
>
> Wikipedia claims that C ints are still only guaranteed to be at least 16
bits, and longs are at least 32 bits. So no, R's integers are long.
>
> -pd
>
> > On 16 Jun 2017, at
But R "integers" are C "ints", as opposed to S "integers", which are C
"long ints". (I suppose R never had to run on ancient hardware with 16 bit
ints.)
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Jun 16, 2017 at 10:47 AM, Yihui Xie wrote:
> Yeah, that was what I
You can avoid the warnings and the unneeded calls to FUN by adding
drop=TRUE to the call to ave(), since all of its ... arguments are passed
to interaction (I think).
In TERR we dealt with this problem by adding drop=TRUE to ave's
argument list and we pass ... and drop=drop to interaction. I'm
If you are changing the parser (which is a major change) you
might consider treating strings in the C/C++ way:
char *s = "A"
"B";
means the same as
char *s = "AB";
I am not a big fan of that syntax but it is widely used.
A backslash at the end of the line leads to errors
While you are fiddling with stopifnot(), please consider changing the form
of the error thrown so that it includes the caller's call. The change
would be from something like
stop( <> )
to
stop(simpleError( <>, sys.call(-1)))
For the following code
f <- function(x, y) {
stopifnot(x >
gt;
wrote:
> On Wed, May 10, 2017 at 2:36 AM, William Dunlap via R-devel
> <r-devel@r-project.org> wrote:
> > Some formula methods for S3 generic functions use the idiom
> > returnValue$call <- sys.call(sys.parent())
> > to show how to recreate the returned object or to
9-4973
> IM: bgpbraverock
>
> On Wed, 2017-04-26 at 08:25 -0700, William Dunlap via R-devel wrote:
> > The Ubuntu machine I use a lot (along with others) must not be
> > cleaning
> > /tmp as it has a fair number of Rtmp* directories in /tmp, even when
> > the
The Ubuntu machine I use a lot (along with others) must not be cleaning
/tmp as it has a fair number of Rtmp* directories in /tmp, even when there
are no R sessions running on the machine. I would like to automate their
removal but there is no obvious way to see if the R process that created
the
I recently noticed a change between R-3.3.3 and R-3.4.0 in the definition
of the R_CMethodDef struct.
typedef struct {
const char *name;
DL_FUNC fun;
int numArgs;
-
R_NativePrimitiveArgType *types;
- R_NativeArgStyle *styles;
-
} R_CMethodDef;
I
In S+ on Unix-alikes we dealt with this issue by using fcntl(fd,
F_SETFD, 1) to set the close-on-exec flag on a file descriptor as soon
as we opened it.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, Apr 19, 2017 at 8:40 PM, Winston Chang wrote:
> In addition to
>I think that the suggestion I made, in response to a posting by Barry
>>Rowlingson, that the first argument of lapply() be given the name of ".X"
>rather >than just-plain-X, would be (a) effective, and (b) harmless.
It would break any call to *apply() that used X= to name the first
argument.
> Or is this a bad idea?
I don't like the proposal. I have seen code like the following (in
fact, I have written such code, where I had forgotten a function was
not vectorized) where the error would have been discovered much later
if outer() didn't catch it.
> outer(1:3, 11:13, sum)
Error
> void dnk_c(double *sortedFsample, unsigned long int n, unsigned
long int k, double *dKol)
All arguments to C functions called by .C() must be pointers. Also, R
integers are C ints, not unsigned long ints.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Mon, Mar 20, 2017 at 5:55 AM, Hristo
I can see that allowing a user-defined unary prefix operator can be useful.
We want to make sure its precedence and associative behavior are
convenient for a variety of envisioned uses, as we won't get a chance
to change them after the language construct is introduced.
An example of precedence
OK. I am more concerned now with semantics than the syntax.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Mar 17, 2017 at 1:09 PM, Gabriel Becker wrote:
> Bill,
>
> Right. My example was the functional form for clarity.
>
> There is a desire for a unary-operator
Your example
x = 5
exp = parse(text="f(uq(x)) + y +z") # expression: f(uq(x)) +y + z
do_unquote(expr)
# -> the language object f(5) + y + z
could be done with the following wrapper for bquote
my_do_unquote <- function(language, envir = parent.frame()) {
if
>After off list discussions with Jonathan Carrol and with
>Michael Lawrence I think it's doable, unambiguous,
>and even imo pretty intuitive for an "unquote" operator.
For those of us who are not CS/Lisp mavens, what is an
"unquote" operator? Can you expression quoting and unquoting
in R syntax
I am biased against introducing new syntax, but if one is
experimenting with it one should make sure the precedence feels right.
I think the unary and binary minus-sign operators have different
precedences so I see no a priori reason to make the unary and binary
%xxx% operators to be the same.
I noticed that simplify2array acted oddly when given a list of
data.frames of various sizes. If the data.frames have one row, it
makes a new first dimension with the dimname equal to
rownames(firstDataframe). That dimension does not appear for
data.frames with other numbers of rows.
>
It happens in the fda package because some of the headers are longer
than typical (e.g., "Repository/R-Forge/DateTimeStamp") and formatDL
dies if the indent argument is too large compared to the width argument.
It might be nice to change formatDL so it never gave such an error, but
did something
This error can arise when getOption("width") is too small. 80 seems to be the
limit for me with R-3.3.2 on Windows.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, Mar 8, 2017 at 10:28 PM, Spencer Graves
wrote:
> Hello:
>
>
> I tried "debug(help)" with
Da Zheng would like to override 'if' and 'while' to accept more than
scalar logicals and Martin Maechler would like to change 'if' to
accept only scalar logicals. No one has mentioned '||' and '&&',
which also want scalar logicals.
Perhaps a solution is to have all of these call a new generic
dplyr::translate_sql() redefines lots of functions, include "if", to
translate from R syntax to SQL syntax.
> dplyr::translate_sql(if ("mpg">25) "better" else "worse")
CASE WHEN ('mpg' > 25.0) THEN ('better') ELSE ('worse') END
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Sat, Mar 4, 2017
Were you suppressing warnings? I get a warning along with the "unable
to start device 'png'" in some cases where it fails. E.g., on Linux
> png("Figure1A.png", h = 7, w = 7, res = 1e5, units = "cm")
Error in png("Figure1A.png", h = 7, w = 7, res = 1e+05, units = "cm") :
unable to start device
If you use check.names=FALSE in your call to read.csv you can see that
the first column name starts with the 3 bytes ef bb bf, which is the
UTF-8 "byte-order mark" that Microsoft applications like to put at the
start of a text file stored in UTF-8.
> v0514 <- read.csv(unz(temp, file0514[1]),
In addition, signed zeroes only exist for floating point numbers - the
bit patterns for as.integer(0) and as.integer(-0) are identical.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Jan 26, 2017 at 1:53 AM, Martin Maechler
wrote:
>> Florent Angly
It would be cool if the default for tapply's init.value could be
FUN(X[0]), so it would be 0 for FUN=sum or FUN=length, TRUE for
FUN=all, -Inf for FUN=max, etc. But that would take time and would
break code for which FUN did not work on length-0 objects.
Bill Dunlap
TIBCO Software
wdunlap
If these are identifiers, store them as strings. If not, what sort of
calculations do you plan on doing with them?
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Jan 20, 2017 at 6:33 AM, Nicolas Paris wrote:
> Hello r users,
>
> I have to deal with int8 data with
While doing some speed testing I noticed that in R-3.2.3 the perl=TRUE
variants of strsplit() and gregexpr() took time proportional to the
square of the number of pattern matches in their input strings. E.g.,
the attached test function times gsub, strsplit, and gregexpr, with
perl TRUE (PCRE) and
You can define the data in the R directory. You can keep it all in a *.R
file
by wrapping the text of the *.csv file in quotes and using
read.table(text="quoted stuff"), as in:
theData <- read.csv(header=TRUE, text="
English,Digit
One,1
Two,2
Three,3")
N <- nrow(theData)
You need to make sure
Another example uses formula.character's other arguments:
> as.formula("env")
Error: object of type 'special' is not subsettable
> as.formula("...")
Error in eval(expr, envir, enclos) : '...' used in an incorrect context
It may happen for the same reason that the following does not give an
Here is a simplified version of your problem
> { sqrt }(c(2,4,8))
[1] 1.414214 2.00 2.828427
Do you want that to act differently?
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Oct 21, 2016 at 6:10 AM, Wilm Schumacher
wrote:
> Hi,
>
> I hope this is
t seems appropriate.
> >> >
> >> > If there are no matches, then there is no easy way to see if the
> >> > prototype
> >> > is compatible with the pattern, so should strcapture just assume the
> >> > best
> >> > and fill in the prototype wit
ves NA's when the format does not match the text input.
> >
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> > On Wed, Sep 21, 2016 at 2:10 PM, Michael Lawrence
> > <lawrence.mich...@gene.com> wrote:
> >>
> >> Hi Bill,
In addition, there is a formula method for data.frame that
assumes the first column is the dependent variable.
> z <- data.frame(X1=1:6,X2=letters[1:3],Y=log(1:6))
> formula(z)
X1 ~ X2 + Y
> colnames(model.matrix(formula(z), z))
[1] "(Intercept)" "X2b" "X2c" "Y"
Spencer's
In Splus c() and unlist() called the same C code, but with a different
'sys_index' code (the last argument to .Internal) and c() did not consider
an argument named 'use.names' special.
> c
function(..., recursive = F)
.Internal(c(..., recursive = recursive), "S_unlist", TRUE, 1)
> unlist
Lawrence <lawrence.mich...@gene.com
> wrote:
> Hi Bill,
>
> Thanks, another good suggestion. strcapture() now returns NAs for
> non-matches. It's nice to have someone kicking the tires on that
> function.
>
> Michael
>
> On Wed, Sep 21, 2016 at 12:11 PM, Willia
Michael, thanks for looking at my first issue with utils::strcapture.
Another issue is how it deals with lines that don't match the pattern.
Currently it gives an error
> strcapture("(.+) (.+)", c("One 1", "noSpaceInLine", "Three 3"),
proto=list(Name="", Number=0))
Error in strcapture("(.+)
The new strcapture function in R-devel is handy, capturing
the matches to the parenthesized subpatterns in a regular
expression in the columns of a data.frame, whose column
names and classes are given by the 'proto' argument. E.g.,
> p1 <- data.frame(Name="", Number=0)
>
Other examples of functions like this are log1p(x), which is log(1+x)
accurate for small x, and expm1(x), which is exp(x)-1 accurate for small
x. E.g.,
> log1p( 1e-20 )
[1] 1e-20
> log( 1 + 1e-20 )
[1] 0
log itself cannot be accurate here because the problem is that 1 == 1 +
1e-20 in
tanpi(x) should be more accurate than tan(pi*x), especially near multiples
of pi/2.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Fri, Sep 9, 2016 at 11:55 AM, Hans W Borchers
wrote:
> The same argument would hold for tan(pi/2).
> I don't say the result 'NaN' is wrong,
It should be the case that tan(pi*x) != tanpi(x) in many cases - that is
why it was added. The limits from below and below of the real function
tan(pi*x) as x approaches 1/2 are different, +Inf and -Inf, so the limit is
not well defined. Hence the computer function tanpi(1/2) ought to return
Prior to the mid-1990s, S did "length-0 OP length-n -> rep(NA, n)" and it
was changed
to "length-0 OP length-n -> length-0" to avoid lots of problems like
any(x<0) being NA
when length(x)==0. Yes, people could code defensively by putting lots of
if(length(x)==0)...
in their code, but that is
Shouldn't binary operators (arithmetic and logical) should throw an error
when one operand is NULL (or other type that doesn't make sense)? This is
a different case than a zero-length operand of a legitimate type. E.g.,
any(x < 0)
should return FALSE if x is number-like and length(x)==0 but
Re withAutoprint(), Splus's source() function could take a expression
(literal or not) in place of a file name or text so it could support
withAutoprint-like functionality in its GUI. E.g.,
> source(auto.print=TRUE, exprs.literal= { x <- 3:7 ; sum(x) ; y <- log(x)
; x - 100}, prompt="--> ")
-->
Try comparing the streams for when the 625-integer versions of the seeds
are identical. (R's seed is 626 integers: omit the first value, which
indicates which random number generator the seed is for.). I find the the
MKL Mersenne Twister results match R's (with occassional differences in the
What should findInterval(x,vec,all.inside=TRUE) return when length(vec)<=1,
so there are no inside intervals?
R-3.3.0 gives a decreasing map of x->output when length(vec)==1 and -1's
when length(vec)==0. Would '0' in all those cases be better?
> findInterval(x=c(10, 11, 12), vec=11,
Another solution is to only save the parts of the model object that
interest you. As long as they don't include the formula (which is
what drags along the environment it was created in), you will
save space. E.g.,
tfun2 <- function(subset) {
junk <- 1:1e6
list(subset=subset,
One way around this problem is to make a new environment whose
parent environment is .GlobalEnv and which contains only what the
the call to lm() requires and to compute lm() in that environment. E.g.,
tfun1 <- function (subset)
{
junk <- 1:1e+06
env <- new.env(parent = globalenv())
In R code tryCatch can detect the difference. Hit control-C (on Unixen) or
Escape
(on Windows) to interrupt the long-running for loop and see that the
interrupt clause
gets called:
> z <- tryCatch(for(i in seq_len(1e8))log(exp(i/10)), error=function(e)e,
interrupt=function(e)e)
^C> dput(z)
While constructing some tests of symbolic link code in R, I got
an odd warning when trying the remove a symbolic link:
file.create(tfile <- tempfile())
#[1] TRUE
file.symlink(tfile, tlink <- tempfile())
#[1] TRUE
unlink(tlink)
#Warning message:
#In unlink(tlink) :
# cannot delete reparse point
In knitr-1.13 the DESCRIPTION file's Depends line uses a ">" instead of the
usual ">=".
Depends: R (> 3.0.2)
I don't see the strict greater than in Writing R Extensions. Does it
really mean that 3.0.2 is not suitable and that whatever version comes
after it is?
Bill Dunlap
TIBCO Software
The R language itself has features that limit how much
mulitthreading/parallel processing can be done. There are functions with
side effects, such as library(), plot(), runif(), <-, and <<- and there are
no mechanisms to isolate them.
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, May 12,
Note that cophenetic.default (which works on the output of hclust(dist(X)))
uses the
row names of X as labels. as.dendrogram.hclust does not retain those row
names
so cophenetic.dendrogram cannot use them (so it orders them based on the
topology of the dendrogram).
Bill Dunlap
TIBCO Software
I think the results differ only in the order of the labels. The following
function
puts the labels in a standard order and then the results are the same:
canonicalize.dist <- function (distObject)
{
o <- order(labels(distObject))
as.matrix(distObject)[o, o, drop = FALSE]
}
Shouldn't the following 4 ways to alter an object in a locked environment
either all work or all fail? (All working would be nice, I think.)
E <- new.env()
assign("var", c(1,2,3,4), envir=E)
lockEnvironment(E, bindings=FALSE)
E$var[1] <- 101 ; E$var
#[1] 101 2 3
> but this strategy quickly inflates the number of packages on CRAN.
CRAN contains 8210 packages today, so I would not worry about
adding an extra one.
Also, I think several small packages are preferable to one large one
because attaching a big one just to get the one or two functions you
want
>If I recall correctly, some eigen vectors had their
>direction flipped (negative values became positive and vice versa).
>Did you notice anything of this kind when running 'make check' and
>'make check recommended' ? It is important to us that numeric results
>are reproducible between versions of
Is it intended that in yesterday's version of R-devel the default value of
keepNA is different in nchar (NA) and nzchar (FALSE)?
> args(nchar)
function (x, type = "chars", allowNA = FALSE, keepNA = NA)
NULL
> args(nzchar)
function (x, keepNA = FALSE)
NULL
Is it intended that for
What is the point of the 'endrule' attribute of smooth()'s return value?
For the simpler smooths (not involving 'S'?) it is value of the endrule
argument, for more complicated smooths it is the endrule not asked for (the
choices are 'Tukey' and 'copy'). For kind="S", it is not in the return
as.data.frame methods behave inconsistently when they are given a row.name
argument of the wrong length. The matrix method silently ignores row.names
if it has the wrong length and the numeric, integer, and character methods
do not bother to check and thus make an illegal data.frame.
>
Is there a reason that array() silently ignores dimnames that are not
a list but matrix() gives an error?
> str(matrix(11:14, 2, 2, dimnames=c("Rows","Cols")))
Error in matrix(11:14, 2, 2, dimnames = c("Rows", "Cols")) :
'dimnames' must be a list
> str(array(11:14, dim=c(2, 2),
101 - 172 of 172 matches
Mail list logo