Re: [Rd] [External] R hang/bug with circular references and promises

2024-05-13 Thread luke-tierney--- via R-devel

On Mon, 13 May 2024, Ivan Krylov wrote:



> On Mon, 13 May 2024 09:54:27 -0500 (CDT)
> luke-tierney--- via R-devel  wrote:
>
>> Looks like I added that warning 22 years ago, so that should be enough
>> notice :-). I'll look into removing it now.




For now I have just changed the internal code to throw an error
if the change would produce a cycle (r86545). This gives

> e <- new.env()
> parent.env(e) <- e
Error in `parent.env<-`(`*tmp*`, value = <environment>) :
  cycles in parent chains are not allowed
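
The check described above can be pictured as a walk up the proposed parent's chain before the assignment is made. What follows is a minimal sketch in plain C, not the code from r86545: `ToyEnv`, `would_create_cycle` and `toy_set_parent` are hypothetical names, and an environment is modelled as a bare struct with a parent pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an environment: only the parent link matters here. */
typedef struct ToyEnv {
    struct ToyEnv *parent;   /* NULL plays the role of the empty environment */
} ToyEnv;

/* Return 1 if making `new_parent` the parent of `env` would put `env`
   on its own parent chain, 0 otherwise.  Assumes the existing chain
   starting at `new_parent` is itself acyclic, so the walk terminates. */
int would_create_cycle(const ToyEnv *env, const ToyEnv *new_parent)
{
    assert(env != NULL);
    for (const ToyEnv *p = new_parent; p != NULL; p = p->parent)
        if (p == env)
            return 1;
    return 0;
}

/* Set the parent only when it is safe; return 0 on success, -1 on refusal.
   (In R the refusal is an error; the return code is an assumption of
   this sketch.) */
int toy_set_parent(ToyEnv *env, ToyEnv *new_parent)
{
    if (would_create_cycle(env, new_parent))
        return -1;
    env->parent = new_parent;
    return 0;
}
```

In this model both `parent.env(e) <- e` and the two-environment loop from the original report are refused, because the walk from the proposed parent reaches the environment being modified.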


> Dear Luke,
>
> I've got a somewhat niche use case: as a way of protecting myself
> against rogue *.rds files and vulnerabilities in the C code, I've been
> manually unserializing "plain" data objects (without anything
> executable), including environments, in R [1].


I would try using two passes: create the environments in the first pass
and in a second pass, either over the file or a new object with place holders, 
fill them in.
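
One reading of the two-pass suggestion is sketched below. It is written in plain C over a toy record format rather than R's serialization format: `Record`, `ToyEnv`, `create_env` and `rebuild` are hypothetical, and each record names its parent and one stored value by index (or -1 for none). Pass one creates every environment with its parent already known; pass two fills in contents, which may refer to any environment, including the one being filled.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ENVS 16

/* Toy input record: one per serialized environment (hypothetical format). */
typedef struct {
    int parent;  /* index of the enclosing environment, -1 for none */
    int value;   /* index of an environment stored as a value, -1 for none */
} Record;

/* Toy environment: a fixed parent plus a single value slot. */
typedef struct ToyEnv {
    struct ToyEnv *parent;
    struct ToyEnv *value;
    int created;
} ToyEnv;

/* Pass 1 helper: create environment i, creating its parent first so the
   parent is known at creation time and never has to be changed later.
   Assumes the parent indices in `recs` contain no cycle. */
void create_env(const Record *recs, ToyEnv *envs, int i)
{
    if (envs[i].created)
        return;
    int p = recs[i].parent;
    if (p >= 0)
        create_env(recs, envs, p);
    envs[i].parent = (p >= 0) ? &envs[p] : NULL;
    envs[i].value = NULL;        /* placeholder, filled in by pass 2 */
    envs[i].created = 1;
}

/* Rebuild n environments from recs into envs (caller zero-initialises envs). */
void rebuild(const Record *recs, ToyEnv *envs, int n)
{
    assert(n >= 0 && n <= MAX_ENVS);
    for (int i = 0; i < n; i++)          /* pass 1: create */
        create_env(recs, envs, i);
    for (int i = 0; i < n; i++)          /* pass 2: fill in contents */
        if (recs[i].value >= 0)
            envs[i].value = &envs[recs[i].value];
}
```

Because the parent is fixed when the environment is created, nothing in this scheme needs to reassign a parent afterwards; only the value slots are patched in the second pass.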


> I see that SET_ENCLOS() is already commented as "not API and probably
> should not be <...> used". Do you think there is a way to recreate an
> environment, taking the REFSXP entries into account, without
> `parent.env<-`?  Would you recommend to abandon the folly of
> unserializing environments manually?


SET_ENCLOS is one of a number of SET... functions that are not in the
API and should not be since they are potentially unsafe to use. (One
that is in the API and needs to be removed is SET_TYPEOF). So we would
like to move them out of installed headers and not export them as
entry points. For this particular case most uses I see are something
like

    env = allocSExp(ENVSXP);
    SET_FRAME(env, R_NilValue);
    SET_ENCLOS(env, parent);
    SET_HASHTAB(env, R_NilValue);
    SET_ATTRIB(env, R_NilValue);

which could just use

    env = R_NewEnv(parent, FALSE, 0);

Best,

luke



> --
> Best regards,
> Ivan
>
> [1]
> https://codeberg.org/aitap/unserializeData/src/commit/33d72705c1ee265349b3e369874ce4b47f9cd358/R/unserialize.R#L289-L313



--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] R hang/bug with circular references and promises

2024-05-13 Thread Ivan Krylov via R-devel
On Mon, 13 May 2024 09:54:27 -0500 (CDT)
luke-tierney--- via R-devel  wrote:

> Looks like I added that warning 22 years ago, so that should be enough
> notice :-). I'll look into removing it now.

Dear Luke,

I've got a somewhat niche use case: as a way of protecting myself
against rogue *.rds files and vulnerabilities in the C code, I've been
manually unserializing "plain" data objects (without anything
executable), including environments, in R [1].

I see that SET_ENCLOS() is already commented as "not API and probably
should not be <...> used". Do you think there is a way to recreate an
environment, taking the REFSXP entries into account, without
`parent.env<-`?  Would you recommend to abandon the folly of
unserializing environments manually?

-- 
Best regards,
Ivan

[1]
https://codeberg.org/aitap/unserializeData/src/commit/33d72705c1ee265349b3e369874ce4b47f9cd358/R/unserialize.R#L289-L313



Re: [Rd] [External] R hang/bug with circular references and promises

2024-05-13 Thread luke-tierney--- via R-devel

On Sat, 11 May 2024, Peter Langfelder wrote:


> On Sat, May 11, 2024 at 9:34 AM luke-tierney--- via R-devel
>  wrote:
>>
>> On Sat, 11 May 2024, Travers Ching wrote:
>>
>>> The following code snippet causes R to hang. This example might be a
>>> bit contrived as I was experimenting and trying to understand
>>> promises, but uses only base R.
>>
>> This has nothing to do with promises. You created a cycle in the
>> environment chain. A simpler variant:
>>
>> e <- new.env()
>> parent.env(e) <- e
>> get("x", e)
>>
>> This will hang and is not interruptable -- loops searching up
>> environment chains are too speed-critical to check for interrupts.  It
>> is, however, pretty easy to check whether the parent change would
>> create a cycle and throw an error if it would. Need to think a bit
>> about exactly where the check should go.
>
> FWIW, the help for parent.env already explicitly warns against using
> parent.env <-:
>
> The replacement function ‘parent.env<-’ is extremely dangerous as
> it can be used to destructively change environments in ways that
> violate assumptions made by the internal C code.  It may be
> removed in the near future.


Looks like I added that warning 22 years ago, so that should be enough
notice :-). I'll look into removing it now.

Best,

luke



> Peter





Re: [Rd] [External] R hang/bug with circular references and promises

2024-05-11 Thread Peter Langfelder
On Sat, May 11, 2024 at 9:34 AM luke-tierney--- via R-devel
 wrote:
>
> On Sat, 11 May 2024, Travers Ching wrote:
>
> > The following code snippet causes R to hang. This example might be a
> > bit contrived as I was experimenting and trying to understand
> > promises, but uses only base R.
>
> This has nothing to do with promises. You created a cycle in the
> environment chain. A simpler variant:
>
> e <- new.env()
> parent.env(e) <- e
> get("x", e)
>
> This will hang and is not interruptable -- loops searching up
> environment chains are too speed-critical to check for interrupts.  It
> is, however, pretty easy to check whether the parent change would
> create a cycle and throw an error if it would. Need to think a bit
> about exactly where the check should go.

FWIW, the help for parent.env already explicitly warns against using
parent.env <-:

The replacement function ‘parent.env<-’ is extremely dangerous as
 it can be used to destructively change environments in ways that
 violate assumptions made by the internal C code.  It may be
 removed in the near future.

Peter



Re: [Rd] [External] R hang/bug with circular references and promises

2024-05-10 Thread luke-tierney--- via R-devel

On Sat, 11 May 2024, Travers Ching wrote:


> The following code snippet causes R to hang. This example might be a
> bit contrived as I was experimenting and trying to understand
> promises, but uses only base R.
>
> It looks like it is looking for "not_a_variable" recursively but since
> it doesn't exist it goes on indefinitely.
>
> x0 <- new.env()
> x1 <- new.env(parent = x0)
> parent.env(x0) <- x1
> delayedAssign("v", not_a_variable, eval.env=x1)
> delayedAssign("w", v, assign.env=x1, eval.env=x0)
> x1$w


This has nothing to do with promises. You created a cycle in the
environment chain. A simpler variant:

e <- new.env()
parent.env(e) <- e
get("x", e)

This will hang and is not interruptable -- loops searching up
environment chains are too speed-critical to check for interrupts.  It
is, however, pretty easy to check whether the parent change would
create a cycle and throw an error if it would. Need to think a bit
about exactly where the check should go.

Best,

luke





Re: [Rd] Forcing a PROTECT Bug to Occur

2023-04-30 Thread Tomas Kalibera


On 4/30/23 06:05, Michael Milton wrote:
> Hi Tomas, thanks for the reply.
>
> I played with some of the factors you mentioned like allocating more 
> INTSXP of the same size as vec_1, to little success. The thing that 
> actually "worked" and caused a segfault, was simply allocating a 
> larger vector the first time. 100 elements seemed to do it, but 10 or 
> less elements almost never caused a segfault. Is there some other part 
> of memory that small vectors are allocated that would prevent them 
> being collected? Might this relate to the Ncells vs Vcells distinction?

I don't think it would be directly related (these are both vectors), but 
I don't remember all the details. If you are interested in how exactly 
the allocator works, I suggest reading memory.c - the sources are quite 
small and easy to read. There is some external fragmentation which 
limits what can be re-used. You could in principle force re-use by using 
objects of the same size and type (but the number might have to be 
large, maybe it wasn't large enough), or a single larger object (as you 
did). It doesn't make sense speculating more without reading the code, 
instrumenting and possibly debugging, if you want to find out exactly 
what is happening in your situation.
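
A toy model may help picture why a missing PROTECT can go unnoticed until the freed space is actually handed out again. The sketch below is plain C and is not how memory.c works: `Cell`, `toy_alloc`, `toy_gc`, `toy_protect` and `toy_unprotect` are hypothetical, the collector looks only at the protect stack (a real collector also traces reachable objects), and the rule that a released cell is reused only for a request of the same size class is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NCELLS 8
#define STACK_MAX 8

typedef struct {
    int in_use;       /* 1 while the cell belongs to a live object */
    int size_class;   /* 0 = never used; otherwise reused only for this class */
    int payload;
} Cell;

static Cell heap[NCELLS];
static Cell *protect_stack[STACK_MAX];
static int protect_top = 0;

void toy_protect(Cell *c)
{
    assert(protect_top < STACK_MAX);
    protect_stack[protect_top++] = c;
}

void toy_unprotect(int n)
{
    assert(n >= 0 && n <= protect_top);
    protect_top -= n;
}

/* Release every cell that is not on the protect stack. */
void toy_gc(void)
{
    for (int i = 0; i < NCELLS; i++) {
        int keep = 0;
        for (int j = 0; j < protect_top; j++)
            if (protect_stack[j] == &heap[i])
                keep = 1;
        if (!keep)
            heap[i].in_use = 0;
    }
}

/* Collect on every allocation (the analogue of gctorture), then hand out
   the first free cell that was never used or has the requested class.
   Returns NULL when no suitable cell is free. */
Cell *toy_alloc(int size_class)
{
    assert(size_class > 0);
    toy_gc();
    for (int i = 0; i < NCELLS; i++) {
        Cell *c = &heap[i];
        if (!c->in_use && (c->size_class == 0 || c->size_class == size_class)) {
            c->in_use = 1;
            c->size_class = size_class;
            c->payload = 0;
            return c;
        }
    }
    return NULL;
}
```

In this model an unprotected object is released at the very next allocation, yet its contents survive as long as later requests are of a different class; only a same-class request overwrites it, and protecting the object prevents the release altogether.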

Tomas

> > z = inline::cfunction(body="
>     SEXP vec_1 = Rf_allocVector(INTSXP, 100);
>     SEXP vec_2 = Rf_allocVector(VECSXP, 3);
>     SET_VECTOR_ELT(vec_2, 1, vec_1);
> ")
> > z()
> NULL
> Warning message:
> In z() : your C program does not return anything!
> > gctorture(TRUE)
> > z()
>
>  *** caught segfault ***
> address 0x55a68ce57f90, cause 'memory not mapped'
>
> Traceback:
>  1: doWithOneRestart(return(expr), restart)
>  2: withOneRestart(expr, restarts[[1L]])
>  3: withRestarts({  .Internal(.signalCondition(simpleWarning(msg, 
> call), msg,     call))    .Internal(.dfltWarn(msg, call))}, 
> muffleWarning = function() NULL)
>  4: .signalSimpleWarning("your C program does not return anything!",   
>   base::quote(z()))
>  5: .Primitive(".Call")()
>  6: z()

Re: [Rd] Forcing a PROTECT Bug to Occur

2023-04-29 Thread Tomas Kalibera



On 4/29/23 19:26, Michael Milton wrote:

> I'm trying to learn about R's PROTECT system. To that end I've tried to
> create an example of C code that doesn't protect anything. I was hoping it
> would either segfault from trying to access deallocated memory, or maybe
> print out nonsense results because the unprotected memory got overwritten,
> but I can't make either happen.
>
> Here's my current code (all in R, using the inline package for simplicity):
>
> > gctorture(TRUE)
> > z = inline::cfunction(body="
>  SEXP vec_1 = Rf_ScalarInteger(99);
>  SEXP vec_2 = Rf_allocVector(VECSXP, 10);
>  SET_VECTOR_ELT(vec_2, 1, vec_1);
>  Rf_PrintValue(vec_2);
> ")
>
> My thinking was that, with torture mode enabled, the allocation of vec_2
> should ensure that vec_1 is collected, and then trying to put it into vec_2
> and then print it would then fail. But it consistently prints 99 in the
> list's second element.
>
> Why does this code not have issues? Alternatively, is there a simpler
> example of C code that demonstrates missing PROTECT calls?


It is not guaranteed that a PROTECT error will always lead to memory 
corruption or a crash in all executions of a program, not even with 
gctorture.


To increase the chances, you can instruct gctorture to run the gc more 
often (but the execution would be slower). You can build R with strict 
write barrier checking (see e.g. Writing R Extensions, look for 
gctorture) to make it more likely errors will be detected. There is 
always a tradeoff between this probability and slowdown of the 
execution. In theory we could invalidate all unreachable objects at 
every allocation, but that would be so slow that we could not test any 
programs in practice.


Then, not every piece of memory can be used by any allocation, due to 
how the memory allocator and gc work. If you need to provoke an error 
e.g. for educational purposes, perhaps chances would be higher if you 
allocate an object of the same type and size as the one that was 
erroneously left unprotected.


You can also play with R sources - e.g. introduce a PROTECT error into 
some key part of the R interpreter, it should not be too hard to trigger 
that via make check-devel.


Best
Tomas





[Rd] Forcing a PROTECT Bug to Occur

2023-04-29 Thread Michael Milton
I'm trying to learn about R's PROTECT system. To that end I've tried to
create an example of C code that doesn't protect anything. I was hoping it
would either segfault from trying to access deallocated memory, or maybe
print out nonsense results because the unprotected memory got overwritten,
but I can't make either happen.

Here's my current code (all in R, using the inline package for simplicity):

> gctorture(TRUE)
> z = inline::cfunction(body="
SEXP vec_1 = Rf_ScalarInteger(99);
SEXP vec_2 = Rf_allocVector(VECSXP, 10);
SET_VECTOR_ELT(vec_2, 1, vec_1);
Rf_PrintValue(vec_2);
")

My thinking was that, with torture mode enabled, the allocation of vec_2
should ensure that vec_1 is collected, and then trying to put it into vec_2
and then print it would then fail. But it consistently prints 99 in the
list's second element.

Why does this code not have issues? Alternatively, is there a simpler
example of C code that demonstrates missing PROTECT calls?



Re: [Rd] Problem with contrasts bug fix?

2022-06-30 Thread Murray Efford
Thanks for your comprehensive explanation. Recent changes exposed errors in my 
code (and understanding!).
Murray


From: Sebastian Meyer 
Sent: 30 June 2022 20:19
To: Murray Efford
Cc: r-devel@r-project.org
Subject: Re: [Rd] Problem with contrasts bug fix?

For reference: this is about
<https://bugs.r-project.org/show_bug.cgi?id=17616>.

The problem with your 'contr.none' example is that it is not a proper
contrasts function (and has never been). ?contrasts says

>  Suitable
>  functions have a first argument which is the character vector of
>  levels, a named argument 'contrasts' (always called with
>  'contrasts = TRUE') and optionally a logical argument 'sparse'.

Your example function fails also in R 4.1.3 and 4.2.1:

> contr.none <- function(n) contrasts(factor(1:n), contrasts = FALSE)
> contrasts(CO2$Treatment) <- "contr.none"
> contrasts(CO2$Treatment)

Error in ctrfn(levels(x), contrasts = contrasts) :
   unused argument (contrasts = contrasts)

Even if we fix the missing argument:

> contr.none <- function(n, ...) contrasts(factor(1:n), contrasts = FALSE)
> contrasts(CO2$Treatment) <- "contr.none"
> contrasts(CO2$Treatment)

Error in 1:n : NA/NaN argument
In addition: Warning messages:
1: In 1:n : numerical expression has 2 elements: only the first used
2: In factor(1:n) : NAs introduced by coercion

What *did* work (inconsistently) was :

> contrasts(CO2$Treatment) <- contr.none
> contrasts(CO2$Treatment)

           1
nonchilled 1
chilled    0

It no longer does in R-devel as 'contr.none' is now called with the
factor *levels* (not nlevels) also in this case as documented.

Note that the argument name 'n' in the default contrast functions such
as contr.treatment() is a bit delicate as these support passing either
levels or nlevels.

Hope this helps.
Best regards,

Sebastian Meyer


On 30.06.22 at 00:27, Murray Efford wrote:
> This worked previously but gives an error in R-devel (Windows today), 
> triggering a warning to a package maintainer:
>
> contr.none <- function(n) contrasts(factor(1:n), contrasts = FALSE)
> lm(uptake~Treatment, CO2, contrasts=list(Treatment=contr.none))
>
> Error in 1:n : NA/NaN argument
> In addition: Warning messages:
> 1: In 1:n : numerical expression has 2 elements: only the first used
> 2: In factor(1:n) : NAs introduced by coercion
>


Re: [Rd] [External] Possible ALTREP bug

2021-06-17 Thread Gabriel Becker
Hi Toby,

Just to be more concrete about why to avoid REAL and about the iteration
macros Luke mentioned.

The ITERATE_BY_REGION* macros in include/R_ext/Itermacros.h are built on
top of *_GET_REGION rather than REAL/INTEGER. The key difference here is
that REAL/INTEGER go down to the Dataptr method of the altclass, which
returns a writable data pointer and so will generally "explode" the
ALTREP: the metadata is invalidated and the object is effectively turned
into a non-ALTREP form. The Get_region method, by contrast, provides a way
to return a contiguous region of the data in a way that can (and generally
does) leave the ALTREPness intact.

We can see the difference by looking at the two respective methods for the
compact sequence ALTREP classes that Luke wrote and that ship with R:

static void *compact_intseq_Dataptr(SEXP x, Rboolean writeable)
{
    if (COMPACT_SEQ_EXPANDED(x) == R_NilValue) {
        /* no need to re-run if expanded data exists */
        PROTECT(x);
        SEXP info = COMPACT_SEQ_INFO(x);
        R_xlen_t n = COMPACT_INTSEQ_INFO_LENGTH(info);
        int n1 = COMPACT_INTSEQ_INFO_FIRST(info);
        int inc = COMPACT_INTSEQ_INFO_INCR(info);
        SEXP val = allocVector(INTSXP, n);
        int *data = INTEGER(val);

        if (inc == 1) {
            /* compact sequences n1 : n2 with n1 <= n2 */
            for (R_xlen_t i = 0; i < n; i++)
                data[i] = (int) (n1 + i);
        }
        else if (inc == -1) {
            /* compact sequences n1 : n2 with n1 > n2 */
            for (R_xlen_t i = 0; i < n; i++)
                data[i] = (int) (n1 - i);
        }
        else
            error("compact sequences with increment %d not supported yet", inc);

        SET_COMPACT_SEQ_EXPANDED(x, val);
        UNPROTECT(1);
    }
    return DATAPTR(COMPACT_SEQ_EXPANDED(x));
}

So the above sets the "Expanded" altrep data to a normal SEXP holding all
the data, if it isn't there already, and then returns DATAPTR of that.

Get_region, on the other hand, looks like this:

static R_xlen_t
compact_intseq_Get_region(SEXP sx, R_xlen_t i, R_xlen_t n, int *buf)
{
    /* should not get here if x is already expanded */
    CHECK_NOT_EXPANDED(sx);

    SEXP info = COMPACT_SEQ_INFO(sx);
    R_xlen_t size = COMPACT_INTSEQ_INFO_LENGTH(info);
    R_xlen_t n1 = COMPACT_INTSEQ_INFO_FIRST(info);
    int inc = COMPACT_INTSEQ_INFO_INCR(info);

    R_xlen_t ncopy = size - i > n ? n : size - i;
    if (inc == 1) {
        for (R_xlen_t k = 0; k < ncopy; k++)
            buf[k] = (int) (n1 + k + i);
        return ncopy;
    }
    else if (inc == -1) {
        for (R_xlen_t k = 0; k < ncopy; k++)
            buf[k] = (int) (n1 - k - i);
        return ncopy;
    }
    else
        error("compact sequences with increment %d not supported yet", inc);
}

Here we are passing a buffer to the method, and that buffer gets populated
with the data. No new SEXP, no expanding the vector.

ITERATE_BY_REGION wraps that nicely so you can loop over the whole thing,
but the data is grabbed chunkwise, no huge allocation, no expanding of the
altrep.
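
The chunkwise pattern can be sketched without any of R's headers. What follows is a toy model in plain C, not the ALTREP API: `ToySeq`, `toy_get_region` and `toy_sum` are hypothetical names, and the batch size of 4 is arbitrary. It only illustrates a caller-supplied buffer being refilled batch by batch while the sequence itself stays compact.

```c
#include <assert.h>
#include <stddef.h>

/* Toy compact sequence first, first+inc, ... with `length` elements. */
typedef struct {
    int first;
    int inc;         /* +1 or -1, as in the compact sequence classes */
    size_t length;
} ToySeq;

/* Copy up to n elements starting at position i into buf and return how
   many were copied.  Nothing is allocated and the sequence stays compact. */
size_t toy_get_region(const ToySeq *s, size_t i, size_t n, int *buf)
{
    assert(s != NULL && buf != NULL);
    if (i >= s->length)
        return 0;
    size_t ncopy = (s->length - i > n) ? n : s->length - i;
    for (size_t k = 0; k < ncopy; k++)
        buf[k] = s->first + s->inc * (int) (i + k);
    return ncopy;
}

#define TOY_BATCH 4

/* Sum the whole sequence while only ever holding TOY_BATCH elements. */
long toy_sum(const ToySeq *s)
{
    int buf[TOY_BATCH];
    long total = 0;
    for (size_t i = 0; i < s->length; ) {
        size_t nbatch = toy_get_region(s, i, TOY_BATCH, buf);
        for (size_t k = 0; k < nbatch; k++)
            total += buf[k];
        i += nbatch;     /* nbatch > 0 here because i < s->length */
    }
    return total;
}
```

The outer loop plays the role of the iteration macro: it tracks the start position `i` of each batch and the batch length `nbatch`, and the inner loop is the body that works on the current buffer.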

One of the examples from summary.c is rsum, which looks like:

static Rboolean rsum(SEXP sx, double *value, Rboolean narm)
{
    LDOUBLE s = 0.0;
    Rboolean updated = FALSE;

    ITERATE_BY_REGION(sx, x, i, nbatch, double, REAL, {
        for (R_xlen_t k = 0; k < nbatch; k++) {
            if (!narm || !ISNAN(x[k])) {
                if(!updated) updated = TRUE;
                s += x[k];
            }
        }
    });

    if(s > DBL_MAX) *value = R_PosInf;
    else if (s < -DBL_MAX) *value = R_NegInf;
    else *value = (double) s;

    return updated;
}

sx is the SEXP input; x is the chosen name of a variable, declared in the
scope of the macro, that holds the current batch of data and can be used
within the bracketed expression that is the macro's last argument. nbatch
is the name of a variable that will contain how many elements of data the
current batch has in it; double is the raw data type (used in the
declaration of x, in fact); REAL is the R macro "type", I guess, used
internally. i is the name chosen for another declared-within-the-macro
variable, which will always contain the position in the overall vector
corresponding to the start of the current buffer.

Hope that helps.

Best,
~G


Re: [Rd] [External] Possible ALTREP bug

2021-06-17 Thread luke-tierney

On Thu, 17 Jun 2021, Toby Hocking wrote:


> Oliver, for clarification that section in writing R extensions mentions
> VECTOR_ELT and REAL but not REAL_ELT nor any other *_ELT functions. I was
> looking for an explanation of all the *_ELT functions (which are apparently
> new), not just VECTOR_ELT.
> Thanks Simon that response was very helpful.
> One more question: are there any circumstances in which one should use
> REAL_ELT(x,i) rather than REAL(x)[i] or vice versa? Or can they be used
> interchangeably?


For a single call it is better to use REAL_ELT(x, i) since it doesn't
force allocating a possibly large object in order to get a pointer to
its data with REAL(x).  If you are iterating over a whole object you
may want to get data in chunks. There are iteration macros that
help. Some examples are in src/main/summary.c.
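
The trade-off between the two access styles can be modelled in a few lines of plain C. This is a sketch under stated assumptions, not R's implementation: `ToyVec`, `toy_elt` and `toy_dataptr` are hypothetical, and the "compact" vector simply computes element i as i + 0.5.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy compact vector 0.5, 1.5, 2.5, ...: values are computed on demand
   until someone asks for a pointer to all of them. */
typedef struct {
    size_t length;
    double *expanded;    /* NULL while the vector is still compact */
} ToyVec;

/* Analogue of an element accessor: no allocation, O(1). */
double toy_elt(const ToyVec *v, size_t i)
{
    assert(v != NULL && i < v->length);
    return v->expanded ? v->expanded[i] : (double) i + 0.5;
}

/* Analogue of asking for the data pointer: materialises every element
   the first time it is called.  Returns NULL if allocation fails. */
double *toy_dataptr(ToyVec *v)
{
    assert(v != NULL);
    if (v->expanded == NULL) {
        double *data = malloc(v->length * sizeof *data);
        if (data == NULL)
            return NULL;
        for (size_t i = 0; i < v->length; i++)
            data[i] = (double) i + 0.5;
        v->expanded = data;
    }
    return v->expanded;
}
```

Reading one element through `toy_elt` leaves `expanded` as NULL, whereas the first call to `toy_dataptr` allocates storage for the whole vector; the caller is responsible for freeing that storage in this toy.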

Best,

luke




Re: [Rd] [External] Possible ALTREP bug

2021-06-17 Thread Toby Hocking
Oliver, for clarification that section in writing R extensions mentions
VECTOR_ELT and REAL but not REAL_ELT nor any other *_ELT functions. I was
looking for an explanation of all the *_ELT functions (which are apparently
new), not just VECTOR_ELT.
Thanks Simon that response was very helpful.
One more question: are there any circumstances in which one should use
REAL_ELT(x,i) rather than REAL(x)[i] or vice versa? Or can they be used
interchangeably?

On Wed, Jun 16, 2021 at 4:29 PM Simon Urbanek 
wrote:

> The usual quote applies: "use the source, Luke":
>
> $ grep _ELT *.h | sort
> Rdefines.h:#define SET_ELEMENT(x, i, val)   SET_VECTOR_ELT(x, i, val)
> Rinternals.h:   The function STRING_ELT is used as an argument to
> arrayAssign even
> Rinternals.h:#define VECTOR_ELT(x,i)((SEXP *) DATAPTR(x))[i]
> Rinternals.h://SEXP (STRING_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:Rbyte (RAW_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:Rbyte ALTRAW_ELT(SEXP x, R_xlen_t i);
> Rinternals.h:Rcomplex (COMPLEX_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:Rcomplex ALTCOMPLEX_ELT(SEXP x, R_xlen_t i);
> Rinternals.h:SEXP (STRING_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:SEXP (VECTOR_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:SEXP ALTSTRING_ELT(SEXP, R_xlen_t);
> Rinternals.h:SEXP SET_VECTOR_ELT(SEXP x, R_xlen_t i, SEXP v);
> Rinternals.h:double (REAL_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:double ALTREAL_ELT(SEXP x, R_xlen_t i);
> Rinternals.h:int (INTEGER_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:int (LOGICAL_ELT)(SEXP x, R_xlen_t i);
> Rinternals.h:int ALTINTEGER_ELT(SEXP x, R_xlen_t i);
> Rinternals.h:int ALTLOGICAL_ELT(SEXP x, R_xlen_t i);
> Rinternals.h:void ALTCOMPLEX_SET_ELT(SEXP x, R_xlen_t i, Rcomplex v);
> Rinternals.h:void ALTINTEGER_SET_ELT(SEXP x, R_xlen_t i, int v);
> Rinternals.h:void ALTLOGICAL_SET_ELT(SEXP x, R_xlen_t i, int v);
> Rinternals.h:void ALTRAW_SET_ELT(SEXP x, R_xlen_t i, Rbyte v);
> Rinternals.h:void ALTREAL_SET_ELT(SEXP x, R_xlen_t i, double v);
> Rinternals.h:void ALTSTRING_SET_ELT(SEXP, R_xlen_t, SEXP);
> Rinternals.h:void SET_INTEGER_ELT(SEXP x, R_xlen_t i, int v);
> Rinternals.h:void SET_LOGICAL_ELT(SEXP x, R_xlen_t i, int v);
> Rinternals.h:void SET_REAL_ELT(SEXP x, R_xlen_t i, double v);
> Rinternals.h:void SET_STRING_ELT(SEXP x, R_xlen_t i, SEXP v);
>
> So the indexing is with R_xlen_t and they return the value itself as one
> would expect.
>
> Cheers,
> Simon
>
>
> > On Jun 17, 2021, at 2:22 AM, Toby Hocking  wrote:
> >
> > By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I
> > looked in Writing R Extensions and R Internals but I did not see any
> > mention.
> > REAL_ELT is briefly mentioned on
> > https://svn.r-project.org/R/branches/ALTREP/ALTREP.html
> > Would it be possible to please add some mention of them to Writing R
> > Extensions?
> > - how many of these _ELT functions are there? INTEGER, REAL, ... ?
> > - in what version of R were they introduced?
> > - I guess input types are always SEXP and int?
> > - What are the output types for each?
> >
> > On Fri, May 28, 2021 at 5:16 PM  wrote:
> >
> >> Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may
> >> be possible to check that places where they are used allow for them to
> >> allocate. I have fixed the one that got caught by Gabor's example, and
> >> a rchk run might be able to pick up others if rchk knows these could
> >> allocate. (I may also be forgetting other places where the _ELt
> >> methods are used.)  Fixing all call sites for REAL, INTEGER, etc, was
> >> never realistic so there GC has to be suspended during the method
> >> call, and that is done in the dispatch mechanism.
> >>
> >> The bigger problem is jumps from inside things that existing code
> >> assumes will not do that. Catching those jumps is possible but
> >> expensive; doing anything sensible if one is caught is really not
> >> possible.
> >>
> >> Best,
> >>
> >> luke
> >>
> >> On Fri, 28 May 2021, Gabriel Becker wrote:
> >>
> >>> Hi Jim et al,
> >>> Just to hopefully add a bit to what Luke already answered, from what I
> am
> >>> recalling looking back at that bioconductor thread Elt methods are used
> >> in
> >>> places where there are hard implicit assumptions that no garbage
> >> collection
> >>> will occur (ie they are called on things that aren't PROTECTed), and
> >> beyond
> >>> that, in places where there are hard assumptions that no error
> (longjmp)
> >>> will occur. I could be wrong, but I don't know that suspending garbage
> >>> collection would protect from the second one. Ie it is possible that an
> >>> error *ever* being raised from R code that implements an elt method
> could
> >>> cause all hell to break loose.
> >>>
> >>> Luke or Tomas Kalibera would know more.
> >>>
> >>> I was disappointed that implementing ALTREPs in R code was not in the
> >> cards
> >>> (it was in my original proposal back in 2016 to the DSC) but I trust
> Luke
> >>> that there are important reasons we can't safely allow that.
> >>>

Re: [Rd] [External] Possible ALTREP bug

2021-06-16 Thread Simon Urbanek


The usual quote applies: "use the source, Luke":

$ grep _ELT *.h | sort
Rdefines.h:#define SET_ELEMENT(x, i, val)   SET_VECTOR_ELT(x, i, val)
Rinternals.h:   The function STRING_ELT is used as an argument to arrayAssign even
Rinternals.h:#define VECTOR_ELT(x,i)((SEXP *) DATAPTR(x))[i]
Rinternals.h://SEXP (STRING_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:Rbyte (RAW_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:Rbyte ALTRAW_ELT(SEXP x, R_xlen_t i);
Rinternals.h:Rcomplex (COMPLEX_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:Rcomplex ALTCOMPLEX_ELT(SEXP x, R_xlen_t i);
Rinternals.h:SEXP (STRING_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:SEXP (VECTOR_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:SEXP ALTSTRING_ELT(SEXP, R_xlen_t);
Rinternals.h:SEXP SET_VECTOR_ELT(SEXP x, R_xlen_t i, SEXP v);
Rinternals.h:double (REAL_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:double ALTREAL_ELT(SEXP x, R_xlen_t i);
Rinternals.h:int (INTEGER_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:int (LOGICAL_ELT)(SEXP x, R_xlen_t i);
Rinternals.h:int ALTINTEGER_ELT(SEXP x, R_xlen_t i);
Rinternals.h:int ALTLOGICAL_ELT(SEXP x, R_xlen_t i);
Rinternals.h:void ALTCOMPLEX_SET_ELT(SEXP x, R_xlen_t i, Rcomplex v);
Rinternals.h:void ALTINTEGER_SET_ELT(SEXP x, R_xlen_t i, int v);
Rinternals.h:void ALTLOGICAL_SET_ELT(SEXP x, R_xlen_t i, int v);
Rinternals.h:void ALTRAW_SET_ELT(SEXP x, R_xlen_t i, Rbyte v);
Rinternals.h:void ALTREAL_SET_ELT(SEXP x, R_xlen_t i, double v);
Rinternals.h:void ALTSTRING_SET_ELT(SEXP, R_xlen_t, SEXP);
Rinternals.h:void SET_INTEGER_ELT(SEXP x, R_xlen_t i, int v);
Rinternals.h:void SET_LOGICAL_ELT(SEXP x, R_xlen_t i, int v);
Rinternals.h:void SET_REAL_ELT(SEXP x, R_xlen_t i, double v);
Rinternals.h:void SET_STRING_ELT(SEXP x, R_xlen_t i, SEXP v);

So the indexing is with R_xlen_t and they return the value itself, as one would
expect.

Cheers,
Simon


> On Jun 17, 2021, at 2:22 AM, Toby Hocking  wrote:
> 
> By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I
> looked in Writing R Extensions and R Internals but I did not see any
> mention.
> REAL_ELT is briefly mentioned on
> https://svn.r-project.org/R/branches/ALTREP/ALTREP.html
> Would it be possible to please add some mention of them to Writing R
> Extensions?
> - how many of these _ELT functions are there? INTEGER, REAL, ... ?
> - in what version of R were they introduced?
> - I guess input types are always SEXP and int?
> - What are the output types for each?
> 
> On Fri, May 28, 2021 at 5:16 PM  wrote:
> 
>> Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may
>> be possible to check that places where they are used allow for them to
>> allocate. I have fixed the one that got caught by Gabor's example, and
>> a rchk run might be able to pick up others if rchk knows these could
>> allocate. (I may also be forgetting other places where the _ELt
>> methods are used.)  Fixing all call sites for REAL, INTEGER, etc, was
>> never realistic so there GC has to be suspended during the method
>> call, and that is done in the dispatch mechanism.
>> 
>> The bigger problem is jumps from inside things that existing code
>> assumes will not do that. Catching those jumps is possible but
>> expensive; doing anything sensible if one is caught is really not
>> possible.
>> 
>> Best,
>> 
>> luke
>> 
>> On Fri, 28 May 2021, Gabriel Becker wrote:
>> 
>>> Hi Jim et al,
>>> Just to hopefully add a bit to what Luke already answered, from what I am
>>> recalling looking back at that bioconductor thread Elt methods are used
>> in
>>> places where there are hard implicit assumptions that no garbage
>> collection
>>> will occur (ie they are called on things that aren't PROTECTed), and
>> beyond
>>> that, in places where there are hard assumptions that no error (longjmp)
>>> will occur. I could be wrong, but I don't know that suspending garbage
>>> collection would protect from the second one. Ie it is possible that an
>>> error *ever* being raised from R code that implements an elt method could
>>> cause all hell to break loose.
>>> 
>>> Luke or Tomas Kalibera would know more.
>>> 
>>> I was disappointed that implementing ALTREPs in R code was not in the
>> cards
>>> (it was in my original proposal back in 2016 to the DSC) but I trust Luke
>>> that there are important reasons we can't safely allow that.
>>> 
>>> Best,
>>> ~G
>>> 
>>> On Fri, May 28, 2021 at 8:31 AM Jim Hester 
>> wrote:
>>>  From reading the discussion on the Bioconductor issue tracker it
>>>  seems like
>>>  the reason the GC is not suspended for the non-string ALTREP Elt
>>>  methods is
>>>  primarily due to performance concerns.
>>> 
>>>  If this is the case perhaps an additional flag could be added to
>>>  the
>>>  `R_set_altrep_*()` functions so ALTREP authors could indicate if
>>>  GC should
>>>  be halted when that particular method is called for that
>>>  particular ALTREP
>>>  class.
>>> 
>>>  This would avoid the performance hit (other than a boolean
>>>  

Re: [Rd] [External] Possible ALTREP bug

2021-06-16 Thread Oliver Madsen
The *_ELT accessor functions are described in the "Vector accessor functions"
section of Writing R Extensions.

https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Vector-accessor-functions

On Wed, Jun 16, 2021 at 4:22 PM Toby Hocking  wrote:

> By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I
> looked in Writing R Extensions and R Internals but I did not see any
> mention.
> REAL_ELT is briefly mentioned on
> https://svn.r-project.org/R/branches/ALTREP/ALTREP.html
> Would it be possible to please add some mention of them to Writing R
> Extensions?
> - how many of these _ELT functions are there? INTEGER, REAL, ... ?
> - in what version of R were they introduced?
> - I guess input types are always SEXP and int?
> - What are the output types for each?
>
> On Fri, May 28, 2021 at 5:16 PM  wrote:
>
> > Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may
> > be possible to check that places where they are used allow for them to
> > allocate. I have fixed the one that got caught by Gabor's example, and
> > a rchk run might be able to pick up others if rchk knows these could
> > allocate. (I may also be forgetting other places where the _ELt
> > methods are used.)  Fixing all call sites for REAL, INTEGER, etc, was
> > never realistic so there GC has to be suspended during the method
> > call, and that is done in the dispatch mechanism.
> >
> > The bigger problem is jumps from inside things that existing code
> > assumes will not do that. Catching those jumps is possible but
> > expensive; doing anything sensible if one is caught is really not
> > possible.
> >
> > Best,
> >
> > luke
> >
> > On Fri, 28 May 2021, Gabriel Becker wrote:
> >
> > > Hi Jim et al,
> > > Just to hopefully add a bit to what Luke already answered, from what I
> am
> > > recalling looking back at that bioconductor thread Elt methods are used
> > in
> > > places where there are hard implicit assumptions that no garbage
> > collection
> > > will occur (ie they are called on things that aren't PROTECTed), and
> > beyond
> > > that, in places where there are hard assumptions that no error
> (longjmp)
> > > will occur. I could be wrong, but I don't know that suspending garbage
> > > collection would protect from the second one. Ie it is possible that an
> > > error *ever* being raised from R code that implements an elt method
> could
> > > cause all hell to break loose.
> > >
> > > Luke or Tomas Kalibera would know more.
> > >
> > > I was disappointed that implementing ALTREPs in R code was not in the
> > cards
> > > (it was in my original proposal back in 2016 to the DSC) but I trust
> Luke
> > > that there are important reasons we can't safely allow that.
> > >
> > > Best,
> > > ~G
> > >
> > > On Fri, May 28, 2021 at 8:31 AM Jim Hester 
> > wrote:
> > >   From reading the discussion on the Bioconductor issue tracker it
> > >   seems like
> > >   the reason the GC is not suspended for the non-string ALTREP Elt
> > >   methods is
> > >   primarily due to performance concerns.
> > >
> > >   If this is the case perhaps an additional flag could be added to
> > >   the
> > >   `R_set_altrep_*()` functions so ALTREP authors could indicate if
> > >   GC should
> > >   be halted when that particular method is called for that
> > >   particular ALTREP
> > >   class.
> > >
> > >   This would avoid the performance hit (other than a boolean
> > >   check) for the
> > >   standard case when no allocations are expected, but allow
> > >   authors to
> > >   indicate that R should pause GC if needed for methods in their
> > >   class.
> > >
> > >   On Fri, May 28, 2021 at 9:42 AM  wrote:
> > >
> > >   > integer and real Elt methods are not expected to allocate. You
> > >   would
> > >   > have to suspend GC to be able to do that. This currently can't
> > >   be done
> > >   > from package code.
> > >   >
> > >   > Best,
> > >   >
> > >   > luke
> > >   >
> > >   > On Fri, 28 May 2021, Gábor Csárdi wrote:
> > >   >
> > >   > > I have found some weird SEXP corruption behavior with
> > >   ALTREP, which
> > >   > > could be a bug. (Or I could be doing something wrong.)
> > >   > >
> > >   > > I have an integer ALTREP vector that calls back to R from
> > >   the Elt
> > >   > > method. When this vector is indexed in a lapply(), its first
> > >   element
> > >   > > gets corrupted. Sometimes it's just a type change to
> > >   logical, but
> > >   > > sometimes the corruption causes a crash.
> > >   > >
> > >   > > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small
> > >   package
> > >   > > that demonstrates this:
> > >   https://github.com/gaborcsardi/redfish
> > >   > >
> > >   > > The R callback in this package calls
> > >   `loadNamespace("Matrix")`, but
> > >   > > the same crash happens for other packages as 

Re: [Rd] [External] Possible ALTREP bug

2021-06-16 Thread Toby Hocking
By the way, where is the documentation for INTEGER_ELT, REAL_ELT, etc? I
looked in Writing R Extensions and R Internals but I did not see any
mention.
REAL_ELT is briefly mentioned on
https://svn.r-project.org/R/branches/ALTREP/ALTREP.html
Would it be possible to please add some mention of them to Writing R
Extensions?
- how many of these _ELT functions are there? INTEGER, REAL, ... ?
- in what version of R were they introduced?
- I guess input types are always SEXP and int?
- What are the output types for each?

On Fri, May 28, 2021 at 5:16 PM  wrote:

> Since the INTEGER_ELT, REAL_ELT, etc, functions are fairly new it may
> be possible to check that places where they are used allow for them to
> allocate. I have fixed the one that got caught by Gabor's example, and
> a rchk run might be able to pick up others if rchk knows these could
> allocate. (I may also be forgetting other places where the _ELt
> methods are used.)  Fixing all call sites for REAL, INTEGER, etc, was
> never realistic so there GC has to be suspended during the method
> call, and that is done in the dispatch mechanism.
>
> The bigger problem is jumps from inside things that existing code
> assumes will not do that. Catching those jumps is possible but
> expensive; doing anything sensible if one is caught is really not
> possible.
>
> Best,
>
> luke
>
> On Fri, 28 May 2021, Gabriel Becker wrote:
>
> > Hi Jim et al,
> > Just to hopefully add a bit to what Luke already answered, from what I am
> > recalling looking back at that bioconductor thread Elt methods are used
> in
> > places where there are hard implicit assumptions that no garbage
> collection
> > will occur (ie they are called on things that aren't PROTECTed), and
> beyond
> > that, in places where there are hard assumptions that no error (longjmp)
> > will occur. I could be wrong, but I don't know that suspending garbage
> > collection would protect from the second one. Ie it is possible that an
> > error *ever* being raised from R code that implements an elt method could
> > cause all hell to break loose.
> >
> > Luke or Tomas Kalibera would know more.
> >
> > I was disappointed that implementing ALTREPs in R code was not in the
> cards
> > (it was in my original proposal back in 2016 to the DSC) but I trust Luke
> > that there are important reasons we can't safely allow that.
> >
> > Best,
> > ~G
> >
> > On Fri, May 28, 2021 at 8:31 AM Jim Hester 
> wrote:
> >   From reading the discussion on the Bioconductor issue tracker it
> >   seems like
> >   the reason the GC is not suspended for the non-string ALTREP Elt
> >   methods is
> >   primarily due to performance concerns.
> >
> >   If this is the case perhaps an additional flag could be added to
> >   the
> >   `R_set_altrep_*()` functions so ALTREP authors could indicate if
> >   GC should
> >   be halted when that particular method is called for that
> >   particular ALTREP
> >   class.
> >
> >   This would avoid the performance hit (other than a boolean
> >   check) for the
> >   standard case when no allocations are expected, but allow
> >   authors to
> >   indicate that R should pause GC if needed for methods in their
> >   class.
> >
> >   On Fri, May 28, 2021 at 9:42 AM  wrote:
> >
> >   > integer and real Elt methods are not expected to allocate. You
> >   would
> >   > have to suspend GC to be able to do that. This currently can't
> >   be done
> >   > from package code.
> >   >
> >   > Best,
> >   >
> >   > luke
> >   >
> >   > On Fri, 28 May 2021, Gábor Csárdi wrote:
> >   >
> >   > > I have found some weird SEXP corruption behavior with
> >   ALTREP, which
> >   > > could be a bug. (Or I could be doing something wrong.)
> >   > >
> >   > > I have an integer ALTREP vector that calls back to R from
> >   the Elt
> >   > > method. When this vector is indexed in a lapply(), its first
> >   element
> >   > > gets corrupted. Sometimes it's just a type change to
> >   logical, but
> >   > > sometimes the corruption causes a crash.
> >   > >
> >   > > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small
> >   package
> >   > > that demonstrates this:
> >   https://github.com/gaborcsardi/redfish
> >   > >
> >   > > The R callback in this package calls
> >   `loadNamespace("Matrix")`, but
> >   > > the same crash happens for other packages as well, and
> >   sometimes it
> >   > > also happens if I don't load any packages at all. (But that
> >   example
> >   > > was much more complicated, so I went with the package
> >   loading.)
> >   > >
> >   > > It is somewhat random, and sometimes turning off the JIT
> >   avoids the
> >   > > crash, but not always.
> >   > >
> >   > > Hopefully I am just doing something wrong in the ALTREP code
> >   (see
> >   > >
> >  

Re: [Rd] [External] Possible ALTREP bug

2021-05-28 Thread luke-tierney

Since the INTEGER_ELT, REAL_ELT, etc., functions are fairly new it may
be possible to check that places where they are used allow for them to
allocate. I have fixed the one that got caught by Gabor's example, and
an rchk run might be able to pick up others if rchk knows these could
allocate. (I may also be forgetting other places where the _ELT
methods are used.)  Fixing all call sites for REAL, INTEGER, etc., was
never realistic, so there GC has to be suspended during the method
call, and that is done in the dispatch mechanism.

The bigger problem is jumps from inside things that existing code
assumes will not do that. Catching those jumps is possible but
expensive; doing anything sensible if one is caught is really not
possible.

Best,

luke

On Fri, 28 May 2021, Gabriel Becker wrote:


Hi Jim et al,
Just to hopefully add a bit to what Luke already answered, from what I am
recalling looking back at that bioconductor thread Elt methods are used in
places where there are hard implicit assumptions that no garbage collection
will occur (ie they are called on things that aren't PROTECTed), and beyond
that, in places where there are hard assumptions that no error (longjmp)
will occur. I could be wrong, but I don't know that suspending garbage
collection would protect from the second one. Ie it is possible that an
error *ever* being raised from R code that implements an elt method could
cause all hell to break loose.

Luke or Tomas Kalibera would know more.

I was disappointed that implementing ALTREPs in R code was not in the cards
(it was in my original proposal back in 2016 to the DSC) but I trust Luke
that there are important reasons we can't safely allow that.

Best,
~G

On Fri, May 28, 2021 at 8:31 AM Jim Hester  wrote:
  From reading the discussion on the Bioconductor issue tracker it seems like
  the reason the GC is not suspended for the non-string ALTREP Elt methods is
  primarily due to performance concerns.

  If this is the case perhaps an additional flag could be added to the
  `R_set_altrep_*()` functions so ALTREP authors could indicate if GC should
  be halted when that particular method is called for that particular ALTREP
  class.

  This would avoid the performance hit (other than a boolean check) for the
  standard case when no allocations are expected, but allow authors to
  indicate that R should pause GC if needed for methods in their class.

  On Fri, May 28, 2021 at 9:42 AM  wrote:

  > integer and real Elt methods are not expected to allocate. You would
  > have to suspend GC to be able to do that. This currently can't be done
  > from package code.
  >
  > Best,
  >
  > luke
  >
  > On Fri, 28 May 2021, Gábor Csárdi wrote:
  >
  > > I have found some weird SEXP corruption behavior with ALTREP, which
  > > could be a bug. (Or I could be doing something wrong.)
  > >
  > > I have an integer ALTREP vector that calls back to R from the Elt
  > > method. When this vector is indexed in a lapply(), its first element
  > > gets corrupted. Sometimes it's just a type change to logical, but
  > > sometimes the corruption causes a crash.
  > >
  > > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small package
  > > that demonstrates this: https://github.com/gaborcsardi/redfish
  > >
  > > The R callback in this package calls `loadNamespace("Matrix")`, but
  > > the same crash happens for other packages as well, and sometimes it
  > > also happens if I don't load any packages at all. (But that example
  > > was much more complicated, so I went with the package loading.)
  > >
  > > It is somewhat random, and sometimes turning off the JIT avoids the
  > > crash, but not always.
  > >
  > > Hopefully I am just doing something wrong in the ALTREP code (see
  > > https://github.com/gaborcsardi/redfish/blob/main/src/test.c), and it
  > > is not actually a bug.
  > >
  > > Thanks,
  > > Gabor
  > >
  > > __
  > > R-devel@r-project.org mailing list
  > > https://stat.ethz.ch/mailman/listinfo/r-devel
  > >
  >
  > --
  > Luke Tierney
  > Ralph E. Wareham Professor of Mathematical Sciences
  > University of Iowa                  Phone:  319-335-3386
  > Department of Statistics and        Fax:    319-335-3017
  >     Actuarial Science
  > 241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
  > Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
  > __
  > R-devel@r-project.org mailing list
  > https://stat.ethz.ch/mailman/listinfo/r-devel
  >

 

Re: [Rd] [External] Possible ALTREP bug

2021-05-28 Thread Gabriel Becker
Hi Jim et al,

Just to hopefully add a bit to what Luke already answered: from what I
recall, looking back at that Bioconductor thread, Elt methods are used in
places where there are hard implicit assumptions that no garbage collection
will occur (i.e., they are called on things that aren't PROTECTed), and beyond
that, in places where there are hard assumptions that no error (longjmp)
will occur. I could be wrong, but I don't know that suspending garbage
collection would protect from the second one. That is, it is possible that an
error *ever* being raised from R code that implements an Elt method could
cause all hell to break loose.
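
To make the second hazard concrete, here is a minimal standalone sketch in
plain C. It does not use R's headers or internals: `toplevel`,
`scratch_in_use`, `elt_method`, `sum_first` and `run_guarded` are all
hypothetical names invented for the illustration. The only assumption taken
from the discussion is that an error unwinds with longjmp, so a caller written
on the assumption that the element method always returns never reaches its
own cleanup.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, not R's real internals. */
static jmp_buf toplevel;        /* where an "error" unwinds to */
static int     scratch_in_use;  /* state the caller expects to reset itself */

/* Hypothetical element method: signals an "error" for negative indices
   by jumping straight to the top level instead of returning. */
static int elt_method(ptrdiff_t i)
{
    if (i < 0)
        longjmp(toplevel, 1);
    return (int) i * 2;
}

/* Caller written on the assumption that elt_method() always returns. */
static int sum_first(ptrdiff_t n, ptrdiff_t start)
{
    int total = 0;
    scratch_in_use = 1;                    /* acquire */
    for (ptrdiff_t i = start; i < start + n; i++)
        total += elt_method(i);
    scratch_in_use = 0;                    /* release: skipped on a jump */
    return total;
}

/* Runs sum_first() under a top-level handler. Returns true if it
   completed normally, false if an error jump was taken. */
static bool run_guarded(ptrdiff_t n, ptrdiff_t start, int *result)
{
    if (setjmp(toplevel) != 0)
        return false;                      /* arrived here via longjmp */
    *result = sum_first(n, start);
    return true;
}
```

Calling `run_guarded(3, 0, &r)` completes and leaves `scratch_in_use` at 0,
while `run_guarded(3, -1, &r)` returns false with `scratch_in_use` still set
to 1. Pausing a collector would not change that outcome, because the problem
is the skipped release, not an allocation.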

Luke or Tomas Kalibera would know more.

I was disappointed that implementing ALTREPs in R code was not in the cards
(it was in my original proposal back in 2016 to the DSC) but I trust Luke
that there are important reasons we can't safely allow that.

Best,
~G

On Fri, May 28, 2021 at 8:31 AM Jim Hester  wrote:

> From reading the discussion on the Bioconductor issue tracker it seems like
> the reason the GC is not suspended for the non-string ALTREP Elt methods is
> primarily due to performance concerns.
>
> If this is the case perhaps an additional flag could be added to the
> `R_set_altrep_*()` functions so ALTREP authors could indicate if GC should
> be halted when that particular method is called for that particular ALTREP
> class.
>
> This would avoid the performance hit (other than a boolean check) for the
> standard case when no allocations are expected, but allow authors to
> indicate that R should pause GC if needed for methods in their class.
>
> On Fri, May 28, 2021 at 9:42 AM  wrote:
>
> > integer and real Elt methods are not expected to allocate. You would
> > have to suspend GC to be able to do that. This currently can't be done
> > from package code.
> >
> > Best,
> >
> > luke
> >
> > On Fri, 28 May 2021, Gábor Csárdi wrote:
> >
> > > I have found some weird SEXP corruption behavior with ALTREP, which
> > > could be a bug. (Or I could be doing something wrong.)
> > >
> > > I have an integer ALTREP vector that calls back to R from the Elt
> > > method. When this vector is indexed in a lapply(), its first element
> > > gets corrupted. Sometimes it's just a type change to logical, but
> > > sometimes the corruption causes a crash.
> > >
> > > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small package
> > > that demonstrates this: https://github.com/gaborcsardi/redfish
> > >
> > > The R callback in this package calls `loadNamespace("Matrix")`, but
> > > the same crash happens for other packages as well, and sometimes it
> > > also happens if I don't load any packages at all. (But that example
> > > was much more complicated, so I went with the package loading.)
> > >
> > > It is somewhat random, and sometimes turning off the JIT avoids the
> > > crash, but not always.
> > >
> > > Hopefully I am just doing something wrong in the ALTREP code (see
> > > https://github.com/gaborcsardi/redfish/blob/main/src/test.c), and it
> > > is not actually a bug.
> > >
> > > Thanks,
> > > Gabor
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> > --
> > Luke Tierney
> > Ralph E. Wareham Professor of Mathematical Sciences
> > University of Iowa  Phone: 319-335-3386
> > Department of Statistics andFax:   319-335-3017
> > Actuarial Science
> > 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> > Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>


Re: [Rd] [External] Possible ALTREP bug

2021-05-28 Thread Jim Hester
From reading the discussion on the Bioconductor issue tracker it seems like
the reason the GC is not suspended for the non-string ALTREP Elt methods is
primarily due to performance concerns.

If this is the case perhaps an additional flag could be added to the
`R_set_altrep_*()` functions so ALTREP authors could indicate if GC should
be halted when that particular method is called for that particular ALTREP
class.

This would avoid the performance hit (other than a boolean check) for the
standard case when no allocations are expected, but allow authors to
indicate that R should pause GC if needed for methods in their class.
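
A rough sketch of the idea, as a toy model in plain C rather than anything in
R's API: `toy_class`, `toy_set_elt_method`, `toy_dispatch_elt`,
`gc_inhibit_depth` and `observing_elt` are hypothetical names, and the counter
merely stands in for "GC is paused". It only shows the shape of the proposal,
where a per-method flag is stored at registration and checked once in
dispatch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in R's API. */
static int gc_inhibit_depth;   /* > 0 stands for "collector is paused" */

typedef int (*elt_fn)(ptrdiff_t i);

typedef struct {
    elt_fn elt;         /* the class's Elt method */
    bool   suspend_gc;  /* hypothetical per-method flag set at registration */
} toy_class;

/* Hypothetical registration helper carrying the extra flag. */
static void toy_set_elt_method(toy_class *cls, elt_fn fn, bool suspend_gc)
{
    cls->elt = fn;
    cls->suspend_gc = suspend_gc;
}

/* Dispatch: classes that did not ask for it pay one boolean check;
   only flagged classes pay for pausing around the call. */
static int toy_dispatch_elt(const toy_class *cls, ptrdiff_t i)
{
    if (!cls->suspend_gc)
        return cls->elt(i);
    gc_inhibit_depth++;
    int value = cls->elt(i);
    gc_inhibit_depth--;        /* not reached if the method jumps out */
    return value;
}

/* Example method that just reports the inhibit depth it observed. */
static int observing_elt(ptrdiff_t i)
{
    (void) i;
    return gc_inhibit_depth;
}
```

Registering `observing_elt` with the flag off makes dispatch return 0, and
with the flag on it returns 1, with the counter back at 0 afterwards. The
sketch says nothing about what should happen if the method raises an error
part-way through, which is a separate question.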

On Fri, May 28, 2021 at 9:42 AM  wrote:

> integer and real Elt methods are not expected to allocate. You would
> have to suspend GC to be able to do that. This currently can't be done
> from package code.
>
> Best,
>
> luke
>
> On Fri, 28 May 2021, Gábor Csárdi wrote:
>
> > I have found some weird SEXP corruption behavior with ALTREP, which
> > could be a bug. (Or I could be doing something wrong.)
> >
> > I have an integer ALTREP vector that calls back to R from the Elt
> > method. When this vector is indexed in a lapply(), its first element
> > gets corrupted. Sometimes it's just a type change to logical, but
> > sometimes the corruption causes a crash.
> >
> > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small package
> > that demonstrates this: https://github.com/gaborcsardi/redfish
> >
> > The R callback in this package calls `loadNamespace("Matrix")`, but
> > the same crash happens for other packages as well, and sometimes it
> > also happens if I don't load any packages at all. (But that example
> > was much more complicated, so I went with the package loading.)
> >
> > It is somewhat random, and sometimes turning off the JIT avoids the
> > crash, but not always.
> >
> > Hopefully I am just doing something wrong in the ALTREP code (see
> > https://github.com/gaborcsardi/redfish/blob/main/src/test.c), and it
> > is not actually a bug.
> >
> > Thanks,
> > Gabor
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



Re: [Rd] [External] Possible ALTREP bug

2021-05-28 Thread Gábor Csárdi
Thank you Luke, that makes a lot of sense,
Gabor

On Fri, May 28, 2021 at 3:41 PM  wrote:
>
> integer and real Elt methods are not expected to allocate. You would
> have to suspend GC to be able to do that. This currently can't be done
> from package code.
>
> Best,
>
> luke
>
> On Fri, 28 May 2021, Gábor Csárdi wrote:
>
> > I have found some weird SEXP corruption behavior with ALTREP, which
> > could be a bug. (Or I could be doing something wrong.)
> >
> > I have an integer ALTREP vector that calls back to R from the Elt
> > method. When this vector is indexed in a lapply(), its first element
> > gets corrupted. Sometimes it's just a type change to logical, but
> > sometimes the corruption causes a crash.
> >
> > I saw this on macOS from R 3.5.3 to 4.2.0. I created a small package
> > that demonstrates this: https://github.com/gaborcsardi/redfish
> >
> > The R callback in this package calls `loadNamespace("Matrix")`, but
> > the same crash happens for other packages as well, and sometimes it
> > also happens if I don't load any packages at all. (But that example
> > was much more complicated, so I went with the package loading.)
> >
> > It is somewhat random, and sometimes turning off the JIT avoids the
> > crash, but not always.
> >
> > Hopefully I am just doing something wrong in the ALTREP code (see
> > https://github.com/gaborcsardi/redfish/blob/main/src/test.c), and it
> > is not actually a bug.
> >
> > Thanks,
> > Gabor
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu



Re: [Rd] [External] Possible ALTREP bug

2021-05-28 Thread luke-tierney

integer and real Elt methods are not expected to allocate. You would
have to suspend GC to be able to do that. This currently can't be done
from package code.

Best,

luke

On Fri, 28 May 2021, Gábor Csárdi wrote:


I have found some weird SEXP corruption behavior with ALTREP, which
could be a bug. (Or I could be doing something wrong.)

I have an integer ALTREP vector that calls back to R from the Elt
method. When this vector is indexed in a lapply(), its first element
gets corrupted. Sometimes it's just a type change to logical, but
sometimes the corruption causes a crash.

I saw this on macOS from R 3.5.3 to 4.2.0. I created a small package
that demonstrates this: https://github.com/gaborcsardi/redfish

The R callback in this package calls `loadNamespace("Matrix")`, but
the same crash happens for other packages as well, and sometimes it
also happens if I don't load any packages at all. (But that example
was much more complicated, so I went with the package loading.)

It is somewhat random, and sometimes turning off the JIT avoids the
crash, but not always.

Hopefully I am just doing something wrong in the ALTREP code (see
https://github.com/gaborcsardi/redfish/blob/main/src/test.c), and it
is not actually a bug.

Thanks,
Gabor

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:  319-335-3386
Department of Statistics and        Fax:    319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email:  luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Patch proposal for bug 17770 - xtabs does not act as documented for na.action = na.pass

2020-05-21 Thread SOEIRO Thomas
Dear all,

(This issue was previously reported on Bugzilla 
(https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17770) and discussed on 
Stack Overflow (https://stackoverflow.com/q/61240049).)

The documentation of xtabs says:

"na.action: When it is na.pass and formula has a left hand side (with counts), 
sum(*, na.rm = TRUE) is used instead of sum(*) for the counts."

However, this is not the case:
 
DF <- data.frame(group = c("a", "a", "b", "b"),
                 count = c(NA, TRUE, FALSE, TRUE))

xtabs(formula = count ~ group,
      data = DF,
      na.action = na.pass)

# group
# a b
#   1

In the code, na.rm is TRUE if and only if na.action = na.omit:

na.rm <- 
  identical(naAct, quote(na.omit)) || identical(naAct, na.omit) ||
  identical(naAct, "na.omit")

xtabs(formula = count ~ group,
  data = DF,
  na.action = na.omit)

# group
# a b
# 1 1

The example works as documented if we change the code to:

na.rm <- 
  identical(naAct, quote(na.pass)) || identical(naAct, na.pass) ||
  identical(naAct, "na.pass")

However, there may be something I am missing, and na.omit may be necessary for 
something else...
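
For comparison, a minimal sketch of the per-group sums the documentation describes, computed directly with tapply() rather than through xtabs(). It only illustrates the two sum() variants on the same data; it is not the xtabs() implementation, and the variable names are illustrative.

```r
DF <- data.frame(group = c("a", "a", "b", "b"),
                 count = c(NA, TRUE, FALSE, TRUE))

# sum(*): the NA in group "a" propagates to that group's total
sums_keep_na <- tapply(DF$count, DF$group, sum)

# sum(*, na.rm = TRUE): what the help page says na.pass should give
sums_drop_na <- tapply(DF$count, DF$group, sum, na.rm = TRUE)

print(sums_keep_na)
print(sums_drop_na)
```

With na.rm = TRUE both groups get a count of 1, which matches the na.omit output shown above for this particular data set.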

Best regards,

Thomas

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Possible documentation problem/bug?

2020-05-01 Thread Deepayan Sarkar
On Thu, Apr 30, 2020 at 6:04 PM Dominic Littlewood
<11dlittlew...@gmail.com> wrote:
>
> It seems like there is no obvious way in the documentation to convert the
> expressions in the dots argument to a list without evaluating them. Say, if
> you want to have a function that prints all its arguments:
>
> > foo(abc$de, fg[h], i)
> abc$de
> fg[h]
> i
>
> ...then converting them to a list would be helpful.
> Using substitute(...) was the first thing I tried, but that only gives
> the *first* argument

Isn't that what you would expect anyway? substitute() takes two
arguments, the expression and an environment. You are giving it three.
Normally this should be an error:

foo <- function(a, b, c) substitute(a, b, c)
foo(abc$de, fg[h], i)
# Error in substitute(a, b, c) : unused argument (c)

Clearly ... is being handled in some special way so that we don't get
an error, but otherwise works as expected.

foo <- function(...) substitute(...)
foo(abc$de, fg[h], i)
# abc$de

I would consider this a side-effect of the implementation, and not
something you should rely on.

On the other hand, I would have expected the following to give
something sensible, and it does:

foo <- function(...) substitute({...})
foo(abc$de, fg[h], i)
# {
#     abc$de
#     fg[h]
#     i
# }
as.character(foo(abc$de, fg[h], i))
# [1] "{"  "abc$de" "fg[h]"  "i"

> in dots. It turns out that there is a way to do this, using
> substitute(...()), but this does not appear to be in either the substitute or
> the dots help page.

There is no documented reason for this to work (AFAIK), so again, I
would guess this is a side-effect of the implementation, and not an API
feature you should rely on. This is somewhat borne out by the
following:

> foo <- function(...) substitute({...()})
> foo(abc$de, fg[h], i)
{
   pairlist(abc$de, fg[h], i)
}
> foo(abc$de, fg[h], , i) # add a missing argument for extra fun
{
   as.pairlist(alist(abc$de, fg[h], , i))
}

which is not something you would expect to see at the user level. So
my recommendation: don't use ...() and pretend that you never
discovered it in the first place. Use match.call() instead, as
suggested by Serguei.
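
A minimal sketch of the match.call() route restricted to the dots, assuming only the '...' arguments are wanted. The helper name capture_dots is hypothetical; expand.dots = FALSE is a documented argument of match.call().

```r
capture_dots <- function(...) {
  # With expand.dots = FALSE, everything matched to '...' is kept together
  # in one element of the call, named "..."
  match.call(expand.dots = FALSE)$`...`
}

dots <- capture_dots(abc$de, fg[h], i)

# Each element is an unevaluated expression; deparse() turns it into text
dot_text <- unname(vapply(dots, deparse, character(1)))
print(dot_text)
```

None of abc, fg, h or i needs to exist, since nothing is evaluated. When the function is called with no arguments, the "..." element is absent and the helper returns NULL.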

[Disclaimer: I have no idea what is actually going on, so these are
just guesses. There are some hints at
https://cran.r-project.org/doc/manuals/r-devel/R-ints.html#Dot_002ddot_002ddot-arguments
if you want to follow up.]

-Deepayan

> In fact, there is a clue how to do this in the documentation, if you look
> closely. Let me quote the substitute page:
>
> "Substituting and quoting often cause confusion when the argument is
> expression(...). The result is a call to the expression constructor
> function and needs to be evaluated with eval to give the actual expression
> object."
>
> So this appears to give a way to turn the arguments into a list -
> eval(substitute(expression(...))).  But that's quite long, and hard to
> understand if you just come across it in some code - why are we using eval
> here? why are we substituting expression? - and would definitely require an
> explanatory comment. If the user just wants to iterate over the arguments,
> substitute(...()) is better. In fact, you can get exactly the same effect
> as the above code using as.expression(substitute(...())). Should the
> documentation be updated?
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Possible documentation problem/bug?

2020-04-30 Thread Sokol Serguei

On 30/04/2020 at 14:31, Dominic Littlewood wrote:
> It seems like there is no obvious way in the documentation to convert
> the expressions in the dots argument to a list without evaluating
> them. Say, if you want to have a function that prints all its arguments:

If you wish to iterate through all the arguments (not only '...') then
match.call() seems to be the most straightforward and explicit tool:


f=function(a, ...) {mc <- match.call(); print(as.list(mc)[-1])}
f(x,y[h],abc$d)
#$a
#x
#
#[[2]]
#y[h]
#
#[[3]]
#abc$d

Best,
Serguei.





foo(abc$de, fg[h], i)

abc$de
fg[h]
i

...then converting them to a list would be helpful.
Using substitute(...) was the first thing I tried, but that only gives
the *first* argument in dots. It turns out that there is a way to do this, using
substitute(...()), but this does not appear to be in either the substitute or
the dots help page.

In fact, there is a clue how to do this in the documentation, if you look
closely. Let me quote the substitute page:

"Substituting and quoting often cause confusion when the argument is
expression(...). The result is a call to the expression constructor
function and needs to be evaluated with eval to give the actual expression
object."

So this appears to give a way to turn the arguments into a list -
eval(substitute(expression(...))).  But that's quite long, and hard to
understand if you just come across it in some code - why are we using eval
here? why are we substituting expression? - and would definitely require an
explanatory comment. If the user just wants to iterate over the arguments,
substitute(...()) is better. In fact, you can get exactly the same effect
as the above code using as.expression(substitute(...())). Should the
documentation be updated?

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Possible documentation problem/bug?

2020-04-30 Thread Dominic Littlewood
It seems like there is no obvious way in the documentation to convert the
expressions in the dots argument to a list without evaluating them. Say, if
you want to have a function that prints all its arguments:

> foo(abc$de, fg[h], i)
abc$de
fg[h]
i

...then converting them to a list would be helpful.
Using substitute(...) was the first thing I tried, but that only gives
the *first* argument in dots. It turns out that there is a way to do this, using
substitute(...()), but this does not appear to be in either the substitute or
the dots help page.

In fact, there is a clue how to do this in the documentation, if you look
closely. Let me quote the substitute page:

"Substituting and quoting often cause confusion when the argument is
expression(...). The result is a call to the expression constructor
function and needs to be evaluated with eval to give the actual expression
object."

So this appears to give a way to turn the arguments into a list -
eval(substitute(expression(...))).  But that's quite long, and hard to
understand if you just come across it in some code - why are we using eval
here? why are we substituting expression? - and would definitely require an
explanatory comment. If the user just wants to iterate over the arguments,
substitute(...()) is better. In fact, you can get exactly the same effect
as the above code using as.expression(substitute(...())). Should the
documentation be updated?
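
A minimal sketch of the documented eval(substitute(expression(...))) route, wrapped in a function that prints its arguments as in the foo() example above. The name print_args is hypothetical, and the sketch assumes no argument is left empty.

```r
print_args <- function(...) {
  # substitute() builds the call expression(<arg1>, <arg2>, ...);
  # eval() then runs that call, giving an expression vector of unevaluated args
  exprs <- eval(substitute(expression(...)))
  labels <- vapply(as.list(exprs),
                   function(e) paste(deparse(e), collapse = " "),
                   character(1))
  cat(labels, sep = "\n")
  invisible(labels)
}

shown <- print_args(abc$de, fg[h], i)
```

The eval() step is what turns the call to expression() into an actual expression vector, as the quoted help text describes.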

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gfortran 9 quantreg bug

2019-08-06 Thread Dirk Eddelbuettel


On 6 August 2019 at 10:51, Dirk Eddelbuettel wrote:
| then providing compatibility with what came after. As for Fortran, can't
| recall such a change.

Come to think about it we had it in Debian once or twice in the 20+ years I
contributed but I can't recall anymore when it was either.

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gfortran 9 quantreg bug

2019-08-06 Thread Dirk Eddelbuettel


Hi Kasper,

On 6 August 2019 at 10:33, Kasper Daniel Hansen wrote:
| Thanks for the blog post on this, and the pointers in this email.

My pleasure!  There wasn't much in there that was "new" but it often helps to
just tie it together with a valid and real example (as provided by Roger).
 
| I have a question: it seems to me that you end up using a different
| compiler for the package (quantreg) than was used to build R itself. As I
| understand ABI changes, this is considered unsupported (ok, that depends on

Are you thinking that because the 'number' increased to 9, the ABI must have
changed in some (presumably incompatible) ways?  Luckily that is not
generally the case or we'd be in much dire straits.

Every couple of years there is such a change but it is rare. I can't even
recall when g++ last forced us.  May have been the 3.3 to 4.0 change with 3.4
then providing compatibility with what came after. As for Fortran, can't
recall such a change.

| what version of gcc/gfortran was used to build R, but there has been a lot
| of ABI changes in GCC). Is that correct? I understand that this shortcut
| makes it much easier to use different compilers, and might work for Roger's
| usecase, but I was wondering about this issue of using a different compiler
| for packages. Is this something I should worry about?

No.

But don't take my word for it, but trust 'R CMD check'. If it tests, trust it.

Hth, Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gfortran 9 quantreg bug

2019-08-06 Thread Kasper Daniel Hansen
Dirk,

Thanks for the blog post on this, and the pointers in this email.

I have a question: it seems to me that you end up using a different
compiler for the package (quantreg) than was used to build R itself. As I
understand ABI changes, this is considered unsupported (ok, that depends on
what version of gcc/gfortran was used to build R, but there has been a lot
of ABI changes in GCC). Is that correct? I understand that this shortcut
makes it much easier to use different compilers, and might work for Roger's
usecase, but I was wondering about this issue of using a different compiler
for packages. Is this something I should worry about?

Best,
Kasper

On Sun, Aug 4, 2019 at 10:41 AM Dirk Eddelbuettel  wrote:

>
> Roger,
>
> On 4 August 2019 at 06:48, Koenker, Roger W wrote:
> | I’d like to solicit some advice on a debugging problem I have in the
> quantreg package.
> | Kurt and Brian have reported to me that on Debian machines with gfortran
> 9
> |
> | library(quantreg)
> | f = summary(rq(foodexp ~ income, data = engel, tau = 1:4/5))
> | plot(f)
> |
> | fails because summary() produces bogus estimates of the coefficient
> bounds.
> | This example has been around in my R package from the earliest days of
> R, and
> | before that in various incarnations of S.  The culprit is apparently
> rqbr.f which is
> | even more ancient, but must have something that gfortran 9 doesn’t
> approve of.
> |
> | I note that in R-devel there have been some other issues with gfortran
> 9, but these seem
> | unrelated to my problem.  Not having access to a machine with an
> R/gfortran9
> | configuration, I can’t  apply my rudimentary debugging methods.  I’ve
> considered
> | trying to build gfortran on my mac air and then building R from source,
> but before
> | going down this road, I wondered whether others had other suggestions, or
> | advice about  my proposed route.  As far as I can see there are not yet
> | binaries for gfortran 9 for osx.
>
> Maybe installing and running Docker on your mac is an alternative?
>
> Minimally viable example using
>
>   a) docker (on Linux, but it is portable) and
>
>   b) the current official 'r-base' container (an alias to our Rocker
> r-base container)
>
> r-base is begged to Debian testing, and also allows you to get Debian
> unstable.  Below I fire up the container, tell it to use bash (not R) and
> update
>
>   edd@rob:~/git$ docker run --rm -ti r-base bash
>   root@1307193fadf4:/#
>   root@1307193fadf4:/# apt-get update
>   Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [149 kB]
>   Get:2 http://cdn-fastly.deb.debian.org/debian testing InRelease [117 kB]
>   Get:3 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages
> [8,385 kB]
>   Get:4 http://cdn-fastly.deb.debian.org/debian testing/main amd64
> Packages [7,918 kB]
>   Fetched 16.6 MB in 4s (4,649 kB/s)
>   Reading package lists... Done
>   root@1307193fadf4:/# apt-cache policy gcc-9
>   gcc-9:
> Installed: (none)
> Candidate: 9.1.0-10
> Version table:
>9.1.0-10 990
>   990 http://deb.debian.org/debian testing/main amd64 Packages
>   500 http://http.debian.net/debian sid/main amd64 Packages
>   root@1307193fadf4:/# apt-cache policy gfortran-9
>   gfortran-9:
> Installed: (none)
> Candidate: 9.1.0-10
> Version table:
>9.1.0-10 990
>   990 http://deb.debian.org/debian testing/main amd64 Packages
>   500 http://http.debian.net/debian sid/main amd64 Packages
>   root@1307193fadf4:/#
>
> At this point it just a matter of actually installing gcc-9 and gfortran-9
> (via apt-get install ...), and setting CC, FC, F77 and whichever other
> environment variables the R build reflect to build quantreg.
>
> That said, this will be Debian's standard gfortran-9.  What is at times a
> little frustrating is that some of the builds used by some of the CRAN
> tests
> use local modifications which make their behaviour a little harder to
> reproduce.  I have an open issue with my (also old and stable) digest
> package
> which goes belly-up on a clang-on-Fedora build and nowhere else -- and I
> have
> been unable to reproduce this too.
>
> For such cases, having Docker container would be one possible way of
> giving access to the test environment to make it accessible to more users.
>
> Best,  Dirk
>
>
> |
> | Thanks,
> | Roger
> |
> | Roger Koenker
> | r.koen...@ucl.ac.uk
> | Department of Economics, UCL
> | London  WC1H 0AX.
> |
> |
> |
> |   [[alternative HTML version deleted]]
> |
> | __
> | R-devel@r-project.org mailing list
> | https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


-- 
Best,
Kasper

[[alternative HTML version deleted]]


Re: [Rd] [EXTERNAL] Re: Potential bug in update.formula when updating offsets

2019-08-06 Thread Therneau, Terry M., Ph.D. via R-devel
Yes, it is almost certainly the same issue.  At useR I promised Martin that I
would put together a clear example and fix for him and I have not yet done so.
I will try to do that this week.

The heart of the issue is that in a terms object the offset expression will
appear in the 'variables' attribute but not in the 'term.labels' or 'order'
attributes, and the base code tries to use the same subscripting vector for
all 3 of them.   The same bit of code shows up in update.formula and in
[.formula; a fix for one can be applied to both.
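
A minimal sketch of that mismatch, using only terms(), attr() and reformulate() from the stats package. The last two lines are an assumed workaround for dropping offsets from a one-sided formula, not the fix referred to above.

```r
tt <- terms(y ~ x + offset(z))

vars   <- attr(tt, "variables")    # the call list(y, x, offset(z))
labels <- attr(tt, "term.labels")  # "x" only: the offset is not a term
ord    <- attr(tt, "order")        # one entry per term label
off    <- attr(tt, "offset")       # position of offset(z) among the variables

# Three variables but a single term, so one subscript vector cannot
# index 'variables' and 'term.labels' consistently
n_vars  <- length(vars) - 1L       # drop the leading `list` symbol
n_terms <- length(labels)

# Assumed workaround: rebuild the formula from the term labels alone,
# which drops every offset (it fails if no terms remain)
f <- ~ x + offset(z)
f_no_offset <- reformulate(attr(terms(f), "term.labels"))
print(f_no_offset)
```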

I had all this worked out, then had some problems logging into Bugzilla, then
sent it to Martin about the same time 2-3 more urgent things got dumped on him,
and then we've both let it lie. At useR he said (and I've no reason to
disagree) that my prior note was unclear.  Let me recreate the example and fix,
more carefully.

Terry T.


On 8/6/19 3:21 AM, peter dalgaard wrote:
> Terry, Martin
>
> Would this happen to be related to the "indexing term objects" issue that has 
> been bothering you?
>
> -pd
>
>> On 5 Aug 2019, at 21:44 , Paul Buerkner  wrote:
>>
>> Hi all,
>>
>> update.formula does not seem to correctly update (i.e. remove in my case)
>> offset terms.
>>
>> Here is an example:
>>
>> update(~x + offset(z), ~ . - offset(z))
>>> ~x + offset(z)
>> Also:
>> update(~x, ~ . - offset(z))
>>> ~x + offset(z)
>> In both cases, I would expect the result
>>> ~x
>> as  -   should remove  from the formula as happens for instance
>> in:
>>
>> update(~x + z, ~ . - z)
>>> ~x
>> I don't know if this behavior is intentional but I would say it is at least
>> unfortunate.
>>
>> Paul
>>
>>  [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gfortran 9 quantreg bug

2019-08-05 Thread Koenker, Roger W
With extensive help from Dirk Eddelbuettel I have installed 
docker on my mac mini from 

https://hub.docker.com/editions/community/docker-ce-desktop-mac

which installs from a dmg in quite standard fashion.  This has allowed
me to simulate running R in a Debian environment with gfortran-9 and
begin the process of debugging my ancient rqbr.f code.

Some further details:

0.  After some initial testing, e.g.

docker --version 
docker run hello-world   

1.  Download r-base and test os

docker pull r-base   # downloads r-base for us
docker run --rm -ti r-base R --version   # to check we have the R we want
docker run --rm -ti r-base bash  # now in shell, Ctrl-d to exit

2.  Set up working directory -- tell Docker to mount the current directory
as /work and to start there

cd projects/rq
docker run --rm -ti -v ${PWD}:/work -w /work r-base bash

This makes the contents of projects/rq available in the /work directory.

root@90521904fa86:/work# apt-get update
Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [149 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian testing InRelease [117 kB]
Get:3 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8,385 kB]
Get:4 http://cdn-fastly.deb.debian.org/debian testing/main amd64 Packages [7,916 kB]
Fetched 16.6 MB in 4s (4,411 kB/s)   
Reading package lists... Done
  
3.  Get gcc-9 and gfortran-9

root@90521904fa86:/work# apt-get install gcc-9 gfortran-9
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following additional packages will be installed:
  cpp-9 gcc-9-base libasan5 libatomic1 libcc1-0 libgcc-9-dev libgcc1 libgfortran-9-dev
  libgfortran5 libgomp1 libitm1 liblsan0 libquadmath0 libstdc++6 libtsan0 libubsan1
Suggested packages:
  gcc-9-locales gcc-9-multilib gcc-9-doc libgcc1-dbg libgomp1-dbg libitm1-dbg libatomic1-dbg
  libasan5-dbg liblsan0-dbg libtsan0-dbg libubsan1-dbg libquadmath0-dbg gfortran-9-multilib
  gfortran-9-doc libgfortran5-dbg libcoarrays-dev
The following NEW packages will be installed:
  cpp-9 gcc-9 gfortran-9 libgcc-9-dev libgfortran-9-dev
The following packages will be upgraded:
  gcc-9-base libasan5 libatomic1 libcc1-0 libgcc1 libgfortran5 libgomp1 libitm1 liblsan0
  libquadmath0 libstdc++6 libtsan0 libubsan1
13 upgraded, 5 newly installed, 0 to remove and 71 not upgraded.
Need to get 35.6 MB of archives.
After this operation, 107 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libasan5 amd64 9.1.0-10 [390 kB]
Get:2 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libubsan1 amd64 9.1.0-10 [128 kB]
Get:3 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libtsan0 amd64 9.1.0-10 [295 kB]
Get:4 http://cdn-fastly.deb.debian.org/debian testing/main amd64 gcc-9-base amd64 9.1.0-10 [190 kB]
Get:5 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libstdc++6 amd64 9.1.0-10 [500 kB]
Get:6 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libquadmath0 amd64 9.1.0-10 [145 kB]
Get:7 http://cdn-fastly.deb.debian.org/debian testing/main amd64 liblsan0 amd64 9.1.0-10 [137 kB]
Get:8 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libitm1 amd64 9.1.0-10 [27.6 kB]
Get:9 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libgomp1 amd64 9.1.0-10 [88.1 kB]
Get:10 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libgfortran5 amd64 9.1.0-10 [633 kB]
Get:11 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libcc1-0 amd64 9.1.0-10 [47.7 kB]
Get:12 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libatomic1 amd64 9.1.0-10 [9,012 B]
Get:13 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libgcc1 amd64 1:9.1.0-10 [40.5 kB]
Get:14 http://cdn-fastly.deb.debian.org/debian testing/main amd64 cpp-9 amd64 9.1.0-10 [9,667 kB]
Get:15 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libgcc-9-dev amd64 9.1.0-10 [2,346 kB]
Get:16 http://cdn-fastly.deb.debian.org/debian testing/main amd64 gcc-9 amd64 9.1.0-10 [9,945 kB]
Get:17 http://cdn-fastly.deb.debian.org/debian testing/main amd64 libgfortran-9-dev amd64 9.1.0-10 [676 kB]
Get:18 http://cdn-fastly.deb.debian.org/debian testing/main amd64 gfortran-9 amd64 9.1.0-10 [10.4 MB]
Fetched 35.6 MB in 6s (6,216 kB/s)  
debconf: delaying package configuration, since apt-utils is not installed
(Reading database ... 17787 files and directories currently installed.)
Preparing to unpack .../libasan5_9.1.0-10_amd64.deb ...
Unpacking libasan5:amd64 (9.1.0-10) over (9.1.0-8) ...
Preparing to unpack .../libubsan1_9.1.0-10_amd64.deb ...
Unpacking libubsan1:amd64 (9.1.0-10) over (9.1.0-8) ...



Re: [Rd] gfortran 9 quantreg bug

2019-08-04 Thread Koenker, Roger W
Thanks Berend,

Yes,  I know about these warnings, they are mostly a consequence of the
automated translation from the ancient Bell Labs dialect of fortran called
ratfor.  It is easy to add type declarations for “in”  and the others, but it
seems unlikely that this is going to fix anything.  The extra labels are all
attributable to the ratfor translation.  I agree that the code is ugly -- the
ratfor is somewhat better, but not much.  In fact the algorithm is rather
simple, but I’m reluctant to write it again from scratch, since there are a
few fiddly details and I would worry somewhat about reproducibility.

Roger

Roger Koenker
r.koen...@ucl.ac.uk
Department of Economics, UCL
London  WC1H 0AX.


On Aug 4, 2019, at 3:26 PM, Berend Hasselman <b...@xs4all.nl> wrote:

Roger,

I have run

gfortran -c -fsyntax-only -fimplicit-none -Wall -pedantic rqbr.f

in the src folder of quantreg.

There are many warnings about defined but not used labels.
Also two errors such as "Symbol ‘in’ at (1) has no IMPLICIT type".
And warnings such as: Warning: "Possible change of value in conversion from 
REAL(8) to INTEGER(4)  at ..."

No offense intended but this fortran code is awful. I wouldn't want to debug 
this before an extensive cleanup by
getting rid of as many numerical labels as possible, indenting and doing 
something about the warnings "Possible change of value ...".

This is going to be very difficult.

Berend Hasselman

On 4 Aug 2019, at 08:48, Koenker, Roger W <rkoen...@illinois.edu> wrote:

I’d like to solicit some advice on a debugging problem I have in the quantreg 
package.
Kurt and Brian have reported to me that on Debian machines with gfortran 9

library(quantreg)
f = summary(rq(foodexp ~ income, data = engel, tau = 1:4/5))
plot(f)

fails because summary() produces bogus estimates of the coefficient bounds.
This example has been around in my R package from the earliest days of R, and
before that in various incarnations of S.  The culprit is apparently rqbr.f 
which is
even more ancient, but must have something that gfortran 9 doesn’t approve of.

I note that in R-devel there have been some other issues with gfortran 9, but 
these seem
unrelated to my problem.  Not having access to a machine with an R/gfortran9
configuration, I can’t  apply my rudimentary debugging methods.  I’ve considered
trying to build gfortran on my mac air and then building R from source, but 
before
going down this road, I wondered whether others had other suggestions, or
advice about  my proposed route.  As far as I can see there are not yet
binaries for gfortran 9 for osx.

Thanks,
Roger

Roger Koenker
r.koen...@ucl.ac.uk
Department of Economics, UCL
London  WC1H 0AX.



[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gfortran 9 quantreg bug

2019-08-04 Thread Balasubramanian Narasimhan


On 8/4/19 7:26 AM, Berend Hasselman wrote:
> Roger,
>
> I have run
>
>   gfortran -c -fsyntax-only -fimplicit-none -Wall -pedantic rqbr.f
>
> in the src folder of quantreg.
>
> There are many warnings about defined but not used labels.
> Also two errors such as "Symbol ‘in’ at (1) has no IMPLICIT type".
> And warnings such as: Warning: "Possible change of value in conversion from 
> REAL(8) to INTEGER(4)  at ..."
>
> No offense intended but this fortran code is awful. I wouldn't want to debug 
> this before an extensive cleanup by
> getting rid of as many numerical labels as possible, indenting and doing 
> something about the warnings "Possible change of value ...".

The unused labels, at least, can be removed automatically for fixed-form
source, along the lines shown in steps 8 and 9 of

https://bnaras.github.io/SUtools/articles/SUtools.html

which pertain to lines 261--281 of

https://github.com/bnaras/SUtools/blob/master/R/process.R

In fact, here it is, excerpted.

library(stringr)
code_lines <- readLines(con = "rqbr.f")
cat("Running gfortran to detect warning lines on unused labels\n")
system2(command = "gfortran",
        args = c("-Wunused", "-c", "rqbr.f", "-o", "temp.o"),
        stderr = "gfortran.out")
cat("Scanning gfortran output for warnings on unused labels\n")
warnings <- readLines("gfortran.out")
## Lines of the form "rqbr.f:<line>:<col>:" give the source location
line_numbers <- grep("rqbr.f", warnings, fixed = TRUE)
label_warning_line_numbers <- grep(pattern = "^Warning: Label [0-9]+ at",
                                   warnings)

for (w in label_warning_line_numbers) {
    ## Pair each label warning with the closest preceding location line, so
    ## that other kinds of -Wunused warnings cannot shift the pairing
    loc <- max(line_numbers[line_numbers < w])
    offending_line <- as.integer(str_extract(warnings[loc],
                                             pattern = "[0-9]+"))
    offending_label <- str_extract(warnings[w], pattern = "[0-9]+")
    ## Blank out the label, keeping the fixed-form column layout intact
    code_lines[offending_line] <- sub(pattern = offending_label,
                                      replacement = str_pad("", width = nchar(offending_label)),
                                      x = code_lines[offending_line])
}
writeLines(code_lines, con = "rqbr-new.f")
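
The label-blanking step on its own can be sketched in base R as follows. The helper name blank_label and the two sample source lines are hypothetical; the point is only that the label is overwritten with spaces of the same width, so the statement keeps its column position in fixed-form source.

```r
blank_label <- function(line, label) {
  # Replace the first occurrence of the label with as many spaces as it has
  # characters; in fixed form the label sits in columns 1-5, so the first
  # occurrence on the line is the label itself
  sub(pattern = label, replacement = strrep(" ", nchar(label)),
      x = line, fixed = TRUE)
}

# Hypothetical fixed-form lines, not taken from rqbr.f
src <- c("   10 continue",
         "      x = y + 1")

cleaned <- blank_label(src[1], "10")
print(cleaned)
```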

-Naras

> This is going to be very difficult.
>
> Berend Hasselman
>
>> On 4 Aug 2019, at 08:48, Koenker, Roger W  wrote:
>>
>> I’d like to solicit some advice on a debugging problem I have in the 
>> quantreg package.
>> Kurt and Brian have reported to me that on Debian machines with gfortran 9
>>
>> library(quantreg)
>> f = summary(rq(foodexp ~ income, data = engel, tau = 1:4/5))
>> plot(f)
>>
>> fails because summary() produces bogus estimates of the coefficient bounds.
>> This example has been around in my R package from the earliest days of R, and
>> before that in various incarnations of S.  The culprit is apparently rqbr.f 
>> which is
>> even more ancient, but must have something that gfortran 9 doesn’t approve 
>> of.
>>
>> I note that in R-devel there have been some other issues with gfortran 9, 
>> but these seem
>> unrelated to my problem.  Not having access to a machine with an R/gfortran9
>> configuration, I can’t  apply my rudimentary debugging methods.  I’ve 
>> considered
>> trying to build gfortran on my mac air and then building R from source, but 
>> before
>> going down this road, I wondered whether others had other suggestions, or
>> advice about  my proposed route.  As far as I can see there are not yet
>> binaries for gfortran 9 for osx.
>>
>> Thanks,
>> Roger
>>
>> Roger Koenker
>> r.koen...@ucl.ac.uk
>> Department of Economics, UCL
>> London  WC1H 0AX.
>>
>>
>>
>>  [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gfortran 9 quantreg bug

2019-08-04 Thread Dirk Eddelbuettel


Roger,

On 4 August 2019 at 06:48, Koenker, Roger W wrote:
| I’d like to solicit some advice on a debugging problem I have in the quantreg 
package.
| Kurt and Brian have reported to me that on Debian machines with gfortran 9
| 
| library(quantreg)
| f = summary(rq(foodexp ~ income, data = engel, tau = 1:4/5))
| plot(f)
| 
| fails because summary() produces bogus estimates of the coefficient bounds.
| This example has been around in my R package from the earliest days of R, and
| before that in various incarnations of S.  The culprit is apparently rqbr.f 
which is
| even more ancient, but must have something that gfortran 9 doesn’t approve of.
| 
| I note that in R-devel there have been some other issues with gfortran 9, but 
these seem
| unrelated to my problem.  Not having access to a machine with an R/gfortran9
| configuration, I can’t  apply my rudimentary debugging methods.  I’ve 
considered
| trying to build gfortran on my mac air and then building R from source, but 
before
| going down this road, I wondered whether others had other suggestions, or
| advice about  my proposed route.  As far as I can see there are not yet
| binaries for gfortran 9 for osx.

Maybe installing and running Docker on your mac is an alternative?

Minimally viable example using

  a) docker (on Linux, but it is portable) and
  
  b) the current official 'r-base' container (an alias to our Rocker r-base container)

r-base is pegged to Debian testing, and also allows you to get Debian
unstable.  Below I fire up the container, tell it to use bash (not R) and update

  edd@rob:~/git$ docker run --rm -ti r-base bash
  root@1307193fadf4:/# 
  root@1307193fadf4:/# apt-get update
  Get:1 http://cdn-fastly.deb.debian.org/debian sid InRelease [149 kB]
  Get:2 http://cdn-fastly.deb.debian.org/debian testing InRelease [117 kB]
  Get:3 http://cdn-fastly.deb.debian.org/debian sid/main amd64 Packages [8,385 kB]
  Get:4 http://cdn-fastly.deb.debian.org/debian testing/main amd64 Packages [7,918 kB]
  Fetched 16.6 MB in 4s (4,649 kB/s)   
  Reading package lists... Done
  root@1307193fadf4:/# apt-cache policy gcc-9
  gcc-9:
    Installed: (none)
    Candidate: 9.1.0-10
    Version table:
       9.1.0-10 990
          990 http://deb.debian.org/debian testing/main amd64 Packages
          500 http://http.debian.net/debian sid/main amd64 Packages
  root@1307193fadf4:/# apt-cache policy gfortran-9
  gfortran-9:
    Installed: (none)
    Candidate: 9.1.0-10
    Version table:
       9.1.0-10 990
          990 http://deb.debian.org/debian testing/main amd64 Packages
          500 http://http.debian.net/debian sid/main amd64 Packages
  root@1307193fadf4:/# 

At this point it is just a matter of actually installing gcc-9 and gfortran-9
(via apt-get install ...), and setting CC, FC, F77 and whichever other
environment variables the R build respects in order to build quantreg.
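
For package builds (as opposed to building R itself), one alternative to
exporting environment variables is a user Makevars file.  The sketch below is
only an illustration under stated assumptions: the compiler names gcc-9 and
gfortran-9 are assumed to be what the Debian packages install, and
write_gcc9_makevars() is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: write a user Makevars that selects the version-9
# compilers.  The binary names gcc-9 / gfortran-9 are an assumption.
write_gcc9_makevars <- function(path = tempfile("Makevars")) {
  writeLines(c("CC=gcc-9", "FC=gfortran-9", "F77=gfortran-9"), path)
  path
}

mk <- write_gcc9_makevars()

# R_MAKEVARS_USER tells R CMD INSTALL which user Makevars file to read.
Sys.setenv(R_MAKEVARS_USER = mk)

# Not run here: a source install started from this session would then
# pick up the settings above.
# install.packages("quantreg", type = "source")
```

Using a temporary file keeps the experiment from touching an existing
~/.R/Makevars inside or outside the container.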

That said, this will be Debian's standard gfortran-9.  What is at times a
little frustrating is that some of the builds used by some of the CRAN tests
use local modifications which make their behaviour a little harder to
reproduce.  I have an open issue with my (also old and stable) digest package
which goes belly-up on a clang-on-Fedora build and nowhere else -- and I have
been unable to reproduce this either.

For such cases, having a Docker container would be one possible way of
giving more users access to the test environment.

Best,  Dirk


| 
| Thanks,
| Roger
| 
| Roger Koenker
| r.koen...@ucl.ac.uk
| Department of Economics, UCL
| London  WC1H 0AX.
| 
| 
| 
|   [[alternative HTML version deleted]]
| 
| __
| R-devel@r-project.org mailing list
| https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gfortran 9 quantreg bug

2019-08-04 Thread Berend Hasselman
Roger,

I have run

gfortran -c -fsyntax-only -fimplicit-none -Wall -pedantic rqbr.f

in the src folder of quantreg.

There are many warnings about defined but not used labels.
Also two errors such as "Symbol ‘in’ at (1) has no IMPLICIT type".
And warnings such as: Warning: "Possible change of value in conversion from
REAL(8) to INTEGER(4) at ..."

No offense intended, but this Fortran code is awful. I wouldn't want to debug
this before an extensive cleanup: getting rid of as many numerical labels as
possible, indenting, and doing something about the "Possible change of
value ..." warnings.

This is going to be very difficult.

Berend Hasselman

> On 4 Aug 2019, at 08:48, Koenker, Roger W  wrote:
> 
> I’d like to solicit some advice on a debugging problem I have in the quantreg 
> package.
> Kurt and Brian have reported to me that on Debian machines with gfortran 9
> 
> library(quantreg)
> f = summary(rq(foodexp ~ income, data = engel, tau = 1:4/5))
> plot(f)
> 
> fails because summary() produces bogus estimates of the coefficient bounds.
> This example has been around in my R package from the earliest days of R, and
> before that in various incarnations of S.  The culprit is apparently rqbr.f,
> which is even more ancient, but must have something that gfortran 9 doesn’t
> approve of.
>
> I note that in R-devel there have been some other issues with gfortran 9, but
> these seem unrelated to my problem.  Not having access to a machine with an
> R/gfortran9 configuration, I can’t apply my rudimentary debugging methods.
> I’ve considered trying to build gfortran on my mac air and then building R
> from source, but before going down this road, I wondered whether others had
> other suggestions, or advice about my proposed route.  As far as I can see
> there are not yet binaries for gfortran 9 for osx.
> 
> Thanks,
> Roger
> 
> Roger Koenker
> r.koen...@ucl.ac.uk
> Department of Economics, UCL
> London  WC1H 0AX.
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] gfortran 9 quantreg bug

2019-08-04 Thread Koenker, Roger W
I’d like to solicit some advice on a debugging problem I have in the quantreg 
package.
Kurt and Brian have reported to me that on Debian machines with gfortran 9

library(quantreg)
f = summary(rq(foodexp ~ income, data = engel, tau = 1:4/5))
plot(f)

fails because summary() produces bogus estimates of the coefficient bounds.
This example has been around in my R package from the earliest days of R, and
before that in various incarnations of S.  The culprit is apparently rqbr.f,
which is even more ancient, but must have something that gfortran 9 doesn’t
approve of.

I note that in R-devel there have been some other issues with gfortran 9, but
these seem unrelated to my problem.  Not having access to a machine with an
R/gfortran9 configuration, I can’t apply my rudimentary debugging methods.
I’ve considered trying to build gfortran on my mac air and then building R
from source, but before going down this road, I wondered whether others had
other suggestions, or advice about my proposed route.  As far as I can see
there are not yet binaries for gfortran 9 for osx.

Thanks,
Roger

Roger Koenker
r.koen...@ucl.ac.uk
Department of Economics, UCL
London  WC1H 0AX.




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] Re: Possible bug in `class<-` when a class-specific '[[.' method is defined

2019-07-15 Thread Duncan Murdoch

On 15/07/2019 9:24 a.m., Tierney, Luke wrote:

Pasting the entire example into RStudio and hitting return to evaluate
does not show this. Evaluating the final line to print counttt
separately does.

Looks like RStudio is calling `[[` on your object when examining the
environment for the Environment panel. If this concerns you then you
should contact RStudio.


Now I see it!

Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] Re: Possible bug in `class<-` when a class-specific '[[.' method is defined

2019-07-15 Thread Tierney, Luke
Pasting the entire example into RStudio and hitting return to evaluate
does not show this. Evaluating the final line to print counttt
separately does.

Looks like RStudio is calling `[[` on your object when examining the
environment for the Environment panel. If this concerns you then you
should contact RStudio.
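
One way to check who is triggering the calls, as a minimal sketch: record the
call stack inside the method and inspect it afterwards.  The call_log variable
is a hypothetical addition for illustration and is not part of the original
example.

```r
counttt  <- 0
call_log <- list()   # hypothetical: one entry per call to the method

`[[.MYCLASS` <- function(x, ...) {
  counttt <<- counttt + 1
  # Remember the full call stack so the caller can be inspected later.
  call_log[[length(call_log) + 1L]] <<- sys.calls()
  NextMethod()
}

df <- as.data.frame(matrix(1:20, nrow = 5))
class(df) <- c("MYCLASS", "data.frame")

# Setting the class should not itself go through `[[`; any calls logged at
# this point would come from whatever front end is inspecting df.
invisible(df[[1]])   # one explicit call, so exactly one logged entry

counttt
length(call_log)
```

If entries pile up in call_log without any explicit `[[` call, printing one of
them should show which function in the front end asked for the element.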

Best,

luke

On Mon, 15 Jul 2019, Rui Barradas wrote:

> Hello,
>
> Clean R 3.6.1 session on Ubuntu 19.04, RStudio 1.1.453. sessionInfo() at the 
> end.
>
> I can reproduce this.
>
> counttt <- 0
>
> `[[.MYCLASS` = function(x, ...) {
>  counttt <<- counttt + 1
>  # browser()
>  x = NextMethod()
>  return(x)
> }
>
> df <- as.data.frame(matrix(1:20, nrow=5))
> class(df) <- c("MYCLASS","data.frame")
> counttt
> #[1] 9
>
>
> But there's more. I tried to print the values of x in the method and got 
> really strange results
>
> counttt <- 0
>
> `[[.MYCLASS` = function(x, ...) {
>  counttt <<- counttt + 1
>  print(x)
>  # browser()
>  x = NextMethod()
>  return(x)
> }
>
> df <- as.data.frame(matrix(1:20, nrow=5))
> class(df) <- c("MYCLASS","data.frame")
> counttt
> #[1] 151
>
>
> If I change print to print.data.frame it goes up to
>
> counttt
> #[1] 176
>
> With print.default back to 9. What is the print method called in the second 
> example?
>
>
> sessionInfo()
> R version 3.6.1 (2019-07-05)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Ubuntu 19.04
>
> Matrix products: default
> BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.8.0
> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.8.0
>
> locale:
>  [1] LC_CTYPE=pt_PT.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=pt_PT.UTF-8        LC_COLLATE=pt_PT.UTF-8
>  [5] LC_MONETARY=pt_PT.UTF-8    LC_MESSAGES=pt_PT.UTF-8
>  [7] LC_PAPER=pt_PT.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=pt_PT.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods
> [7] base
>
> loaded via a namespace (and not attached):
>   [1] sos_2.0-0           nlme_3.1-140        matrixStats_0.54.0
>   [4] fs_1.2.7            xts_0.11-2          usethis_1.5.0
>   [7] lubridate_1.7.4     devtools_2.0.2      RColorBrewer_1.1-2
>  [10] rprojroot_1.3-2     rbenchmark_1.0.0    tools_3.6.1
>  [13] backports_1.1.4     R6_2.4.0            rpart_4.1-15
>  [16] Hmisc_4.2-0         lazyeval_0.2.2      colorspace_1.4-1
>  [19] nnet_7.3-12         npsurv_0.4-0        withr_2.1.2
>  [22] tidyselect_0.2.5    gridExtra_2.3       prettyunits_1.0.2
>  [25] processx_3.3.0      curl_3.3            compiler_3.6.1
>  [28] cli_1.1.0           htmlTable_1.13.1    randomNames_1.4-0.0
>  [31] dvmisc_1.1.3        desc_1.2.0          tseries_0.10-46
>  [34] scales_1.0.0        checkmate_1.9.1     lmtest_0.9-36
>  [37] fracdiff_1.4-2      mvtnorm_1.0-10      quadprog_1.5-6
>  [40] callr_3.2.0         stringr_1.4.0       digest_0.6.18
>  [43] foreign_0.8-71      rio_0.5.16          base64enc_0.1-3
>  [46] stocks_1.1.4        pkgconfig_2.0.2     htmltools_0.3.6
>  [49] sessioninfo_1.1.1   readxl_1.3.1        htmlwidgets_1.3
>  [52] rlang_0.3.4         TTR_0.23-4          rstudioapi_0.10
>  [55] quantmod_0.4-14     MLmetrics_1.1.1     zoo_1.8-5
>  [58] zip_2.0.1           acepack_1.4.1       dplyr_0.8.0.1
>  [61] car_3.0-2           magrittr_1.5        Formula_1.2-3
>  [64] Matrix_1.2-17       Rcpp_1.0.1          munsell_0.5.0
>  [67] abind_1.4-5         stringi_1.4.3       forecast_8.6
>  [70] yaml_2.2.0          carData_3.0-2       MASS_7.3-51.3
>  [73] pkgbuild_1.0.3      plyr_1.8.4          grid_3.6.1
>  [76] parallel_3.6.1      forcats_0.4.0       crayon_1.3.4
>  [79] lattice_0.20-38     haven_2.1.0         splines_3.6.1
>  [82] hms_0.4.2           knitr_1.22          ps_1.3.0
>  [85] pillar_1.4.0        pkgload_1.0.2       urca_1.3-0
>  [88] glue_1.3.1          lsei_1.2-0          babynames_1.0.0
>  [91] latticeExtra_0.6-28 data.table_1.12.2   remotes_2.0.4
>  [94] cellranger_1.1.0    testthat_2.1.0      gtable_0.3.0
>  [97] purrr_0.3.2         assertthat_0.2.1    ggplot2_3.1.1
> [100] openxlsx_4.1.0      xfun_0.6            survey_3.35-1
> [103] survival_2.44-1.1   timeDate_3043.102   tibble_2.1.1
> [106] memoise_1.1.0       cluster_2.0.8       toOrdinal_1.1-0.0
> [109] fitdistrplus_1.0-14 brew_1.0-6
>
>
>
> Hope this helps,
>
> Rui Barradas
>
>
> Às 13:16 de 15/07/19, Duncan Murdoch escreveu:
>> On 07/07/2019 11:49 a.m., Ghiggi Gionata wrote:
>>> Hi all !
>>> 
>>> I noticed a strange behaviour of the function `class<-` when a 
>>> class-specific '[[.' method is defined.
>>> 
>>> Here below a reproducible example :
>>> 
>>> 
>>> #---.
>>> 
>>> counttt <- 0
>>> 
>>> `[[.MYCLASS` = function(x, ...) {
>>>    counttt <<- counttt + 1
>>>    # browser()
>>>    x = NextMethod()
>>>    return(x)
>>> }
>>> 
>>> df <- as.data.frame(matrix(1:20, nrow=5))
>>> class(df) <- c("MYCLASS","data.frame")
>>> counttt
>>> 
>>> # The same occurs when using structure(, class=) or 

Re: [Rd] long-standing documentation bug in ?anova.lme

2019-01-21 Thread Ben Bolker

  Here are relevant patches to address the various issues described
below.  Thanks for the SVN info!

  cheers
Ben Bolker


On 2019-01-21 4:54 a.m., Martin Maechler wrote:
>> Ben Bolker 
>> on Thu, 17 Jan 2019 12:32:20 -0500 writes:
> 
> > tl;dr anova.lme() claims to provide sums of squares, but it doesn't. And
> > some names are misspelled in ?lme.  I can submit all this stuff as a bug
> > report if that's preferred.
> 
> > ?anova.lme says:
> 
> > When only one fitted model object is present, a data frame with
> > the sums of squares, numerator degrees of freedom, denominator
> > degrees of freedom, F-values, and P-values
> 
> > The output of
> 
> > fm1 <- lme(distance ~ age, data = Orthodont) # random is ~ age
> > anova(fm1)
> 
> > gives columns
> 
> > numDF denDF   F-value p-value
> 
> > -- i.e. the sums of squares aren't there!  (For fairly good reasons; lme
> > doesn't actually compute them internally, and it might not always be
> > straightforward to compute them, for more complex models. They would
> > mostly be useful for comparison with simpler, method-of-moments based
> > approaches like aov()). Federico Calboli pointed this out on r-help in
> > 2004: https://stat.ethz.ch/pipermail/r-help/2004-May/051444.html
> 
> 
> > Two more points:
> 
> > * the last sentence of the Description might need one fewer comma
> > [after "statistic"] or one more [after "p-value"].
> > * in ?lme, Littell's name is misspelled at least twice and Reinsel's
> > at least once.
> 
> We'd be grateful for patches, thank you Ben!
> 
> Notably for 'nlme' and 'foreign', both of which are maintained
> by R-core (rather than individual R core or R Foundation
> members) we've also encouraged that  R's bugzilla be used for
> non-trivial bug reports as that allows attached patches and
> simple references too. 
> 
> 
> > Is there a publicly accessible SVN server for recommended packages (in
> > general) and nlme (in particular) anywhere?
> 
> nlme's SVN is physically at the same place as the R sources
> (here at ETH Zurich), with URL
> 
>    https://svn.r-project.org/R-packages/trunk/nlme
> 
> in addition to 'nlme', at least  'foreign', 'mgcv'  and
> 'cluster' are also maintained there.
> 
> Thank you for the question:
>  I do think "we" should add the corresponding  svn URL to the
>  respective DESCRIPTION file.
> 
> OTOH, 'Matrix' has moved to R-forge a while ago .. and I'm
> currently also not sure about the other Recommended packages
> such as 'KernSmooth' or 'boot' . 
> 
> Best,
> Martin
> 
> Martin Maechler
> ETH Zurich and R core team
> 
Index: nlme/DESCRIPTION
===
--- nlme/DESCRIPTION    (revision 7616)
+++ nlme/DESCRIPTION    (working copy)
@@ -21,3 +21,4 @@
 Encoding: UTF-8
 License: GPL (>= 2) | file LICENCE
 BugReports: https://bugs.r-project.org
+URL: https://svn.r-project.org/R-packages/trunk/nlme
\ No newline at end of file
Index: nlme/man/anova.lme.Rd
===
--- nlme/man/anova.lme.Rd   (revision 7616)
+++ nlme/man/anova.lme.Rd   (working copy)
@@ -61,7 +61,7 @@
 }
 \description{
   When only one fitted model object is present, a data frame with the
-  sums of squares, numerator degrees of freedom, denominator degrees of
+  numerator degrees of freedom, denominator degrees of
   freedom, F-values, and P-values for Wald tests for the terms in the
   model (when \code{Terms} and \code{L} are \code{NULL}), a combination
   of model terms (when \code{Terms} in not \code{NULL}), or linear
@@ -71,7 +71,7 @@
   log-likelihood, the Akaike Information Criterion (AIC), and the
   Bayesian Information Criterion (BIC) of each object is returned.  If
   \code{test=TRUE}, whenever two consecutive  objects have different
-  number of degrees of freedom, a likelihood ratio statistic, with the
+  number of degrees of freedom, a likelihood ratio statistic with the
   associated p-value is included in the returned data frame.
 }
 \value{
Index: nlme/man/lme.Rd
===
--- nlme/man/lme.Rd (revision 7616)
+++ nlme/man/lme.Rd (working copy)
@@ -117,8 +117,8 @@
   (1982).  The variance-covariance parametrizations are described in
   Pinheiro and Bates (1996).  The different correlation structures
   available for the \code{correlation} argument are described in Box,
-  Jenkins and Reinse (1994), Littel \emph{et al} (1996), and Venables and
-  Ripley, (2002). The use of variance functions for linear and nonlinear
+  Jenkins and Reinsel (1994), Littell \emph{et al} (1996), and Venables and
+  Ripley (2002). The use of variance functions for linear and nonlinear
   mixed effects models is presented in detail in Davidian and Giltinan
   (1995).
 
@@ -136,7 +136,7 @@
   Data", Journal of the American Statistical Association, 83,
   

Re: [Rd] long-standing documentation bug in ?anova.lme

2019-01-21 Thread Martin Maechler
> Ben Bolker 
> on Thu, 17 Jan 2019 12:32:20 -0500 writes:

> tl;dr anova.lme() claims to provide sums of squares, but it doesn't. And
> some names are misspelled in ?lme.  I can submit all this stuff as a bug
> report if that's preferred.

> ?anova.lme says:

> When only one fitted model object is present, a data frame with
> the sums of squares, numerator degrees of freedom, denominator
> degrees of freedom, F-values, and P-values

> The output of

> fm1 <- lme(distance ~ age, data = Orthodont) # random is ~ age
> anova(fm1)

> gives columns

> numDF denDF   F-value p-value

> -- i.e. the sums of squares aren't there!  (For fairly good reasons; lme
> doesn't actually compute them internally, and it might not always be
> straightforward to compute them, for more complex models. They would
> mostly be useful for comparison with simpler, method-of-moments based
> approaches like aov()). Federico Calboli pointed this out on r-help in
> 2004: https://stat.ethz.ch/pipermail/r-help/2004-May/051444.html


> Two more points:

> * the last sentence of the Description might need one fewer comma
> [after "statistic"] or one more [after "p-value"].
> * in ?lme, Littell's name is misspelled at least twice and Reinsel's
> at least once.

We'd be grateful for patches, thank you Ben!

Notably for 'nlme' and 'foreign', both of which are maintained
by R-core (rather than individual R core or R Foundation
members) we've also encouraged that  R's bugzilla be used for
non-trivial bug reports as that allows attached patches and
simple references too. 


> Is there a publicly accessible SVN server for recommended packages (in
> general) and nlme (in particular) anywhere?

nlme's SVN is physically at the same place as the R sources
(here at ETH Zurich), with URL

   https://svn.r-project.org/R-packages/trunk/nlme

in addition to 'nlme', at least  'foreign', 'mgcv'  and
'cluster' are also maintained there.

Thank you for the question:
 I do think "we" should add the corresponding  svn URL to the
 respective DESCRIPTION file.

OTOH, 'Matrix' moved to R-forge a while ago .. and I'm
currently also not sure about the other Recommended packages
such as 'KernSmooth' or 'boot'.

Best,
Martin

Martin Maechler
ETH Zurich and R core team

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] long-standing documentation bug in ?anova.lme

2019-01-20 Thread Ben Bolker


  Silence on this so far? Trying here one more time, otherwise I'll
submit it as a bug report ...

  cheers
   Ben Bolker

On 2019-01-17 12:32 p.m., Ben Bolker wrote:
> tl;dr anova.lme() claims to provide sums of squares, but it doesn't. And
> some names are misspelled in ?lme.  I can submit all this stuff as a bug
> report if that's preferred.
> 
> ?anova.lme says:
> 
> When only one fitted model object is present, a data frame with
>  the sums of squares, numerator degrees of freedom, denominator
>  degrees of freedom, F-values, and P-values
> 
> The output of
> 
> fm1 <- lme(distance ~ age, data = Orthodont) # random is ~ age
> anova(fm1)
> 
> gives columns
> 
> numDF denDF   F-value p-value
> 
> -- i.e. the sums of squares aren't there!  (For fairly good reasons; lme
> doesn't actually compute them internally, and it might not always be
> straightforward to compute them, for more complex models. They would
> mostly be useful for comparison with simpler, method-of-moments based
> approaches like aov()). Federico Calboli pointed this out on r-help in
> 2004: https://stat.ethz.ch/pipermail/r-help/2004-May/051444.html
> 
> 
> Two more points:
> 
>   * the last sentence of the Description might need one fewer comma
> [after "statistic"] or one more [after "p-value"].
>   * in ?lme, Littell's name is misspelled at least twice and Reinsel's
> at least once.
> 
>   Is there a publicly accessible SVN server for recommended packages (in
> general) and nlme (in particular) anywhere?
> 
>   cheers
>   Ben Bolker
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] long-standing documentation bug in ?anova.lme

2019-01-17 Thread Ben Bolker
tl;dr anova.lme() claims to provide sums of squares, but it doesn't. And
some names are misspelled in ?lme.  I can submit all this stuff as a bug
report if that's preferred.

?anova.lme says:

When only one fitted model object is present, a data frame with
 the sums of squares, numerator degrees of freedom, denominator
 degrees of freedom, F-values, and P-values

The output of

fm1 <- lme(distance ~ age, data = Orthodont) # random is ~ age
anova(fm1)

gives columns

numDF denDF   F-value p-value

-- i.e. the sums of squares aren't there!  (For fairly good reasons; lme
doesn't actually compute them internally, and it might not always be
straightforward to compute them, for more complex models. They would
mostly be useful for comparison with simpler, method-of-moments based
approaches like aov()). Federico Calboli pointed this out on r-help in
2004: https://stat.ethz.ch/pipermail/r-help/2004-May/051444.html
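
A minimal sketch to check the claim directly, assuming the recommended package
nlme is installed; the aov() fit is only there for contrast and is not a
substitute for the mixed model.

```r
library(nlme)

# Single-model anova for an lme fit: inspect the column names it returns.
fm1      <- lme(distance ~ age, data = Orthodont)  # random is ~ age
lme_cols <- names(anova(fm1))
lme_cols

# For contrast, a plain aov() table on the same variables does carry
# sums of squares (this ignores the grouping structure entirely).
aov_tab  <- summary(aov(distance ~ age, data = as.data.frame(Orthodont)))[[1]]
aov_cols <- trimws(names(aov_tab))
aov_cols
```

Comparing lme_cols with aov_cols makes the mismatch with the help page easy to
see without reading the printed tables.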


Two more points:

  * the last sentence of the Description might need one fewer comma
[after "statistic"] or one more [after "p-value"].
  * in ?lme, Littell's name is misspelled at least twice and Reinsel's
at least once.

  Is there a publicly accessible SVN server for recommended packages (in
general) and nlme (in particular) anywhere?

  cheers
  Ben Bolker

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-28 Thread Rui Barradas

Hello,

Thanks for the pointer.
Inline.

On 29/08/2018 04:17, Henrik Bengtsson wrote:

FYI, this behavior is documented in Section 3.4.1 'Indexing by
vectors' of 'R Language Definition' (accessible for instance via
help.start()):

"*Integer* [...] A special case is the zero index, which has null
effects: x[0] is an empty vector and otherwise including zeros among
positive or negative indices has the same effect as if they were
omitted."



So I was in part right, the zero index is handled as a special case.
My use case was an operation in a function. I wasn't testing whether the 
result was of length zero, I was just using seq_len(result) to avoid the 
test. And found the error surprising.


Thanks again,

Rui Barradas



The rest of that section is very useful and well written. I used it as
the go-to reference to implement support for all those indexing
alternatives in matrixStats.

/Henrik
On Sun, Aug 5, 2018 at 3:42 AM Iñaki Úcar  wrote:


El dom., 5 ago. 2018 a las 6:27, Kenny Bell () escribió:


This should more clearly illustrate the issue:

c(1, 2, 3, 4)[-seq_len(4)]
#> numeric(0)
c(1, 2, 3, 4)[-seq_len(3)]
#> [1] 4
c(1, 2, 3, 4)[-seq_len(2)]
#> [1] 3 4
c(1, 2, 3, 4)[-seq_len(1)]
#> [1] 2 3 4
c(1, 2, 3, 4)[-seq_len(0)]
#> numeric(0)
Created on 2018-08-05 by the reprex package (v0.2.0.9000).


IMO, the problem is that you are reading it sequentially: "-" remove
"seq_" a sequence "len(0)" of length zero. But that's not how R works
(how programming languages work in general). Instead, the sequence is
evaluated in the first place, and then the sign may apply as long as
you provided something that can hold a sign. And an empty element has
no sign, so the sign is lost.

Iñaki



On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas  wrote:




Às 15:51 de 04/08/2018, Iñaki Úcar escreveu:

El sáb., 4 ago. 2018 a las 15:32, Rui Barradas
() escribió:


Hello,

Maybe I am not understanding how negative indexing works but

1) This is right.

(1:10)[-1]
#[1]  2  3  4  5  6  7  8  9 10

2) Are these right? They are at least surprising to me.

(1:10)[-0]
#integer(0)

(1:10)[-seq_len(0)]
#integer(0)


It was the last example that made me ask, seq_len(0) would avoid an
if/else or something similar.


I think it's ok, because there is no negative zero integer, so -0 is 0.


Ok, this makes sense, I should have thought about that.



1.0/-0L # Inf
1.0/-0.0 # - Inf

And the same can be said for integer(0), which is the result of
seq_len(0): there is no negative empty integer.


I'm not completely convinced about this one, though.
I would expect -seq_len(n) to remove the first n elements from the
vector, therefore, when n == 0, it would remove none.

And integer(0) is not the same as 0.

(1:10)[-0] == (1:10)[0] == integer(0) # empty

(1:10)[-seq_len(0)] == (1:10)[-integer(0)]


And I have just reminded myself to run

identical(-integer(0), integer(0))

It returns TRUE so my intuition is wrong, R is right.
End of story.

Thanks for the help,

Rui Barradas



Iñaki




Thanks in advance,

Rui Barradas

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel






__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-28 Thread Henrik Bengtsson
FYI, this behavior is documented in Section 3.4.1 'Indexing by
vectors' of 'R Language Definition' (accessible for instance via
help.start()):

"*Integer* [...] A special case is the zero index, which has null
effects: x[0] is an empty vector and otherwise including zeros among
positive or negative indices has the same effect as if they were
omitted."
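
As a small illustration of the quoted rule (a sketch with a made-up vector x,
not taken from the manual):

```r
x <- 1:10

# A zero on its own selects nothing.
only_zero <- x[0]

# Zeros mixed with positive or negative indices are simply dropped.
zero_and_pos <- x[c(0, 2)]    # same as x[2]
zero_and_neg <- x[c(-1, 0)]   # same as x[-1]

# Negating an empty index leaves it empty, so nothing is selected
# rather than nothing being removed.
neg_empty <- x[-integer(0)]

only_zero
zero_and_pos
zero_and_neg
neg_empty
```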

The rest of that section is very useful and well written. I used it as
the go-to reference to implement support for all those indexing
alternatives in matrixStats.

/Henrik
On Sun, Aug 5, 2018 at 3:42 AM Iñaki Úcar  wrote:
>
> El dom., 5 ago. 2018 a las 6:27, Kenny Bell () escribió:
> >
> > This should more clearly illustrate the issue:
> >
> > c(1, 2, 3, 4)[-seq_len(4)]
> > #> numeric(0)
> > c(1, 2, 3, 4)[-seq_len(3)]
> > #> [1] 4
> > c(1, 2, 3, 4)[-seq_len(2)]
> > #> [1] 3 4
> > c(1, 2, 3, 4)[-seq_len(1)]
> > #> [1] 2 3 4
> > c(1, 2, 3, 4)[-seq_len(0)]
> > #> numeric(0)
> > Created on 2018-08-05 by the reprex package (v0.2.0.9000).
>
> IMO, the problem is that you are reading it sequentially: "-" remove
> "seq_" a sequence "len(0)" of length zero. But that's not how R works
> (how programming languages work in general). Instead, the sequence is
> evaluated in the first place, and then the sign may apply as long as
> you provided something that can hold a sign. And an empty element has
> no sign, so the sign is lost.
>
> Iñaki
>
> >
> > On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas  wrote:
> >>
> >>
> >>
> >> Às 15:51 de 04/08/2018, Iñaki Úcar escreveu:
> >> > El sáb., 4 ago. 2018 a las 15:32, Rui Barradas
> >> > () escribió:
> >> >>
> >> >> Hello,
> >> >>
> >> >> Maybe I am not understanding how negative indexing works but
> >> >>
> >> >> 1) This is right.
> >> >>
> >> >> (1:10)[-1]
> >> >> #[1]  2  3  4  5  6  7  8  9 10
> >> >>
> >> >> 2) Are these right? They are at least surprising to me.
> >> >>
> >> >> (1:10)[-0]
> >> >> #integer(0)
> >> >>
> >> >> (1:10)[-seq_len(0)]
> >> >> #integer(0)
> >> >>
> >> >>
> >> >> It was the last example that made me ask, seq_len(0) would avoid an
> >> >> if/else or something similar.
> >> >
> >> > I think it's ok, because there is no negative zero integer, so -0 is 0.
> >>
> >> Ok, this makes sense, I should have thought about that.
> >>
> >> >
> >> > 1.0/-0L # Inf
> >> > 1.0/-0.0 # - Inf
> >> >
> >> > And the same can be said for integer(0), which is the result of
> >> > seq_len(0): there is no negative empty integer.
> >>
> >> I'm not completely convinced about this one, though.
> >> I would expect -seq_len(n) to remove the first n elements from the
> >> vector, therefore, when n == 0, it would remove none.
> >>
> >> And integer(0) is not the same as 0.
> >>
> >> (1:10)[-0] == (1:10)[0] == integer(0) # empty
> >>
> >> (1:10)[-seq_len(0)] == (1:10)[-integer(0)]
> >>
> >>
> >> And I have just reminded myself to run
> >>
> >> identical(-integer(0), integer(0))
> >>
> >> It returns TRUE so my intuition is wrong, R is right.
> >> End of story.
> >>
> >> Thanks for the help,
> >>
> >> Rui Barradas
> >>
> >> >
> >> > Iñaki
> >> >
> >> >>
> >> >>
> >> >> Thanks in advance,
> >> >>
> >> >> Rui Barradas
> >> >>
> >> >> __
> >> >> R-devel@r-project.org mailing list
> >> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-05 Thread Iñaki Úcar
El dom., 5 ago. 2018 a las 6:27, Kenny Bell () escribió:
>
> This should more clearly illustrate the issue:
>
> c(1, 2, 3, 4)[-seq_len(4)]
> #> numeric(0)
> c(1, 2, 3, 4)[-seq_len(3)]
> #> [1] 4
> c(1, 2, 3, 4)[-seq_len(2)]
> #> [1] 3 4
> c(1, 2, 3, 4)[-seq_len(1)]
> #> [1] 2 3 4
> c(1, 2, 3, 4)[-seq_len(0)]
> #> numeric(0)
> Created on 2018-08-05 by the reprex package (v0.2.0.9000).

IMO, the problem is that you are reading it sequentially: "-" remove
"seq_" a sequence "len(0)" of length zero. But that's not how R works
(how programming languages work in general). Instead, the sequence is
evaluated in the first place, and then the sign may apply as long as
you provided something that can hold a sign. And an empty element has
no sign, so the sign is lost.
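
For the use case of dropping the first n elements without an if/else, a
logical index avoids the empty-index problem.  This is only a sketch, and
drop_first() is a hypothetical helper name:

```r
# Hypothetical helper: drop the first n elements of x, for any n >= 0.
# The logical index has the same length as x, so n == 0 keeps everything.
drop_first <- function(x, n) {
  x[seq_along(x) > n]
}

x <- c(1, 2, 3, 4)

drop_first(x, 0)   # keeps all four elements
drop_first(x, 3)   # keeps only the last element
drop_first(x, 4)   # keeps nothing

x[-seq_len(0)]     # the surprising case from the thread: also keeps nothing
```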

Iñaki

>
> On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas  wrote:
>>
>>
>>
>> Às 15:51 de 04/08/2018, Iñaki Úcar escreveu:
>> > El sáb., 4 ago. 2018 a las 15:32, Rui Barradas
>> > () escribió:
>> >>
>> >> Hello,
>> >>
>> >> Maybe I am not understanding how negative indexing works but
>> >>
>> >> 1) This is right.
>> >>
>> >> (1:10)[-1]
>> >> #[1]  2  3  4  5  6  7  8  9 10
>> >>
>> >> 2) Are these right? They are at least surprising to me.
>> >>
>> >> (1:10)[-0]
>> >> #integer(0)
>> >>
>> >> (1:10)[-seq_len(0)]
>> >> #integer(0)
>> >>
>> >>
>> >> It was the last example that made me ask, seq_len(0) would avoid an
>> >> if/else or something similar.
>> >
>> > I think it's ok, because there is no negative zero integer, so -0 is 0.
>>
>> Ok, this makes sense, I should have thought about that.
>>
>> >
>> > 1.0/-0L # Inf
>> > 1.0/-0.0 # - Inf
>> >
>> > And the same can be said for integer(0), which is the result of
>> > seq_len(0): there is no negative empty integer.
>>
>> I'm not completely convinced about this one, though.
>> I would expect -seq_len(n) to remove the first n elements from the
>> vector, therefore, when n == 0, it would remove none.
>>
>> And integer(0) is not the same as 0.
>>
>> (1:10)[-0] == (1:10)[0] == integer(0) # empty
>>
>> (1:10)[-seq_len(0)] == (1:10)[-integer(0)]
>>
>>
>> And I have just reminded myself to run
>>
>> identical(-integer(0), integer(0))
>>
>> It returns TRUE so my intuition is wrong, R is right.
>> End of story.
>>
>> Thanks for the help,
>>
>> Rui Barradas
>>
>> >
>> > Iñaki
>> >
>> >>
>> >>
>> >> Thanks in advance,
>> >>
>> >> Rui Barradas
>> >>
>> >> __
>> >> R-devel@r-project.org mailing list
>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-05 Thread Rui Barradas

Thanks,

This is what I needed.
I had read the R Inferno a long time ago and apparently forgot this one.

Rui Barradas

Às 08:46 de 05/08/2018, Patrick Burns escreveu:

This is Circle 8.1.13 of the R Inferno.


On 05/08/2018 06:57, Rui Barradas wrote:

Thanks.
This is exactly the doubt I had.

Rui Barradas

Às 05:26 de 05/08/2018, Kenny Bell escreveu:

This should more clearly illustrate the issue:

c(1, 2, 3, 4)[-seq_len(4)]
#> numeric(0)
c(1, 2, 3, 4)[-seq_len(3)]
#> [1] 4
c(1, 2, 3, 4)[-seq_len(2)]
#> [1] 3 4
c(1, 2, 3, 4)[-seq_len(1)]
#> [1] 2 3 4
c(1, 2, 3, 4)[-seq_len(0)]
#> numeric(0)
Created on 2018-08-05 by the reprex package (v0.2.0.9000).

On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas wrote:




    Às 15:51 de 04/08/2018, Iñaki Úcar escreveu:
 > El sáb., 4 ago. 2018 a las 15:32, Rui Barradas
 > (ruipbarra...@sapo.pt) escribió:
 >>
 >> Hello,
 >>
 >> Maybe I am not understanding how negative indexing works but
 >>
 >> 1) This is right.
 >>
 >> (1:10)[-1]
 >> #[1]  2  3  4  5  6  7  8  9 10
 >>
 >> 2) Are these right? They are at least surprising to me.
 >>
 >> (1:10)[-0]
 >> #integer(0)
 >>
 >> (1:10)[-seq_len(0)]
 >> #integer(0)
 >>
 >>
 >> It was the last example that made me ask, seq_len(0) would avoid an
 >> if/else or something similar.
 >
 > I think it's ok, because there is no negative zero integer, so -0 is 0.

    Ok, this makes sense, I should have thought about that.

 >
 > 1.0/-0L # Inf
 > 1.0/-0.0 # - Inf
 >
 > And the same can be said for integer(0), which is the result of
 > seq_len(0): there is no negative empty integer.

    I'm not completely convinced about this one, though.
    I would expect -seq_len(n) to remove the first n elements from the
    vector, therefore, when n == 0, it would remove none.

    And integer(0) is not the same as 0.

    (1:10)[-0] == (1:10)[0] == integer(0) # empty

    (1:10)[-seq_len(0)] == (1:10)[-integer(0)]


    And I have just reminded myself to run

    identical(-integer(0), integer(0))

    It returns TRUE so my intuition is wrong, R is right.
    End of story.

    Thanks for the help,

    Rui Barradas

 >
 > Iñaki
 >
 >>
 >>
 >> Thanks in advance,
 >>
 >> Rui Barradas
 >>
 >> __
 >> R-devel@r-project.org mailing list
 >> https://stat.ethz.ch/mailman/listinfo/r-devel

    __
    R-devel@r-project.org  mailing list
    https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-05 Thread Patrick Burns

This is Circle 8.1.13 of the R Inferno.


On 05/08/2018 06:57, Rui Barradas wrote:

Thanks.
This is exactly the doubt I had.

Rui Barradas

At 05:26 on 05/08/2018, Kenny Bell wrote:

This should more clearly illustrate the issue:

c(1, 2, 3, 4)[-seq_len(4)]
#> numeric(0)
c(1, 2, 3, 4)[-seq_len(3)]
#> [1] 4
c(1, 2, 3, 4)[-seq_len(2)]
#> [1] 3 4
c(1, 2, 3, 4)[-seq_len(1)]
#> [1] 2 3 4
c(1, 2, 3, 4)[-seq_len(0)]
#> numeric(0)
Created on 2018-08-05 by the reprex package (v0.2.0.9000).

On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas wrote:




    At 15:51 on 04/08/2018, Iñaki Úcar wrote:
 > On Sat, 4 Aug 2018 at 15:32, Rui Barradas wrote:
 >>
 >> Hello,
 >>
 >> Maybe I am not understanding how negative indexing works but
 >>
 >> 1) This is right.
 >>
 >> (1:10)[-1]
 >> #[1]  2  3  4  5  6  7  8  9 10
 >>
 >> 2) Are these right? They are at least surprising to me.
 >>
 >> (1:10)[-0]
 >> #integer(0)
 >>
 >> (1:10)[-seq_len(0)]
 >> #integer(0)
 >>
 >>
 >> It was the last example that made me ask, seq_len(0) would avoid an
 >> if/else or something similar.
 >
 > I think it's ok, because there is no negative zero integer, so -0 is 0.

    Ok, this makes sense, I should have thought about that.

 >
 > 1.0/-0L # Inf
 > 1.0/-0.0 # - Inf
 >
 > And the same can be said for integer(0), which is the result of
 > seq_len(0): there is no negative empty integer.

    I'm not completely convinced about this one, though.
    I would expect -seq_len(n) to remove the first n elements from the
    vector, therefore, when n == 0, it would remove none.

    And integer(0) is not the same as 0.

    (1:10)[-0] == (1:10)[0] == integer(0) # empty

    (1:10)[-seq_len(0)] == (1:10)[-integer(0)]


    And I have just reminded myself to run

    identical(-integer(0), integer(0))

    It returns TRUE so my intuition is wrong, R is right.
    End of story.

    Thanks for the help,

    Rui Barradas

 >
 > Iñaki
 >
 >>
 >>
 >> Thanks in advance,
 >>
 >> Rui Barradas
 >>
 >> __
 >> R-devel@r-project.org  mailing list
 >> https://stat.ethz.ch/mailman/listinfo/r-devel

    __
    R-devel@r-project.org  mailing list
    https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @burnsstat @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of:
 'Impatient R'
 'The R Inferno'
 'Tao Te Programming')

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-04 Thread Rui Barradas

Thanks.
This is exactly the doubt I had.

Rui Barradas

At 05:26 on 05/08/2018, Kenny Bell wrote:

This should more clearly illustrate the issue:

c(1, 2, 3, 4)[-seq_len(4)]
#> numeric(0)
c(1, 2, 3, 4)[-seq_len(3)]
#> [1] 4
c(1, 2, 3, 4)[-seq_len(2)]
#> [1] 3 4
c(1, 2, 3, 4)[-seq_len(1)]
#> [1] 2 3 4
c(1, 2, 3, 4)[-seq_len(0)]
#> numeric(0)
Created on 2018-08-05 by the reprex package (v0.2.0.9000).

On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas wrote:




At 15:51 on 04/08/2018, Iñaki Úcar wrote:
 > On Sat, 4 Aug 2018 at 15:32, Rui Barradas wrote:
 >>
 >> Hello,
 >>
 >> Maybe I am not understanding how negative indexing works but
 >>
 >> 1) This is right.
 >>
 >> (1:10)[-1]
 >> #[1]  2  3  4  5  6  7  8  9 10
 >>
 >> 2) Are these right? They are at least surprising to me.
 >>
 >> (1:10)[-0]
 >> #integer(0)
 >>
 >> (1:10)[-seq_len(0)]
 >> #integer(0)
 >>
 >>
 >> It was the last example that made me ask, seq_len(0) would avoid an
 >> if/else or something similar.
 >
 > I think it's ok, because there is no negative zero integer, so -0 is 0.

Ok, this makes sense, I should have thought about that.

 >
 > 1.0/-0L # Inf
 > 1.0/-0.0 # - Inf
 >
 > And the same can be said for integer(0), which is the result of
 > seq_len(0): there is no negative empty integer.

I'm not completely convinced about this one, though.
I would expect -seq_len(n) to remove the first n elements from the
vector, therefore, when n == 0, it would remove none.

And integer(0) is not the same as 0.

(1:10)[-0] == (1:10)[0] == integer(0) # empty

(1:10)[-seq_len(0)] == (1:10)[-integer(0)]


And I have just reminded myself to run

identical(-integer(0), integer(0))

It returns TRUE so my intuition is wrong, R is right.
End of story.

Thanks for the help,

Rui Barradas

 >
 > Iñaki
 >
 >>
 >>
 >> Thanks in advance,
 >>
 >> Rui Barradas
 >>
 >> __
 >> R-devel@r-project.org  mailing list
 >> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org  mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-04 Thread Kenny Bell
This should more clearly illustrate the issue:

c(1, 2, 3, 4)[-seq_len(4)]
#> numeric(0)
c(1, 2, 3, 4)[-seq_len(3)]
#> [1] 4
c(1, 2, 3, 4)[-seq_len(2)]
#> [1] 3 4
c(1, 2, 3, 4)[-seq_len(1)]
#> [1] 2 3 4
c(1, 2, 3, 4)[-seq_len(0)]
#> numeric(0)
Created on 2018-08-05 by the reprex package (v0.2.0.9000).

On Sun, Aug 5, 2018 at 3:58 AM Rui Barradas  wrote:

>
>
> At 15:51 on 04/08/2018, Iñaki Úcar wrote:
> > On Sat, 4 Aug 2018 at 15:32, Rui Barradas wrote:
> >>
> >> Hello,
> >>
> >> Maybe I am not understanding how negative indexing works but
> >>
> >> 1) This is right.
> >>
> >> (1:10)[-1]
> >> #[1]  2  3  4  5  6  7  8  9 10
> >>
> >> 2) Are these right? They are at least surprising to me.
> >>
> >> (1:10)[-0]
> >> #integer(0)
> >>
> >> (1:10)[-seq_len(0)]
> >> #integer(0)
> >>
> >>
> >> It was the last example that made me ask, seq_len(0) would avoid an
> >> if/else or something similar.
> >
> > I think it's ok, because there is no negative zero integer, so -0 is 0.
>
> Ok, this makes sense, I should have thought about that.
>
> >
> > 1.0/-0L # Inf
> > 1.0/-0.0 # - Inf
> >
> > And the same can be said for integer(0), which is the result of
> > seq_len(0): there is no negative empty integer.
>
> I'm not completely convinced about this one, though.
> I would expect -seq_len(n) to remove the first n elements from the
> vector, therefore, when n == 0, it would remove none.
>
> And integer(0) is not the same as 0.
>
> (1:10)[-0] == (1:10)[0] == integer(0) # empty
>
> (1:10)[-seq_len(0)] == (1:10)[-integer(0)]
>
>
> And I have just reminded myself to run
>
> identical(-integer(0), integer(0))
>
> It returns TRUE so my intuition is wrong, R is right.
> End of story.
>
> Thanks for the help,
>
> Rui Barradas
>
> >
> > Iñaki
> >
> >>
> >>
> >> Thanks in advance,
> >>
> >> Rui Barradas
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-04 Thread Rui Barradas




At 15:51 on 04/08/2018, Iñaki Úcar wrote:

On Sat, 4 Aug 2018 at 15:32, Rui Barradas wrote:


Hello,

Maybe I am not understanding how negative indexing works but

1) This is right.

(1:10)[-1]
#[1]  2  3  4  5  6  7  8  9 10

2) Are these right? They are at least surprising to me.

(1:10)[-0]
#integer(0)

(1:10)[-seq_len(0)]
#integer(0)


It was the last example that made me ask, seq_len(0) would avoid an
if/else or something similar.


I think it's ok, because there is no negative zero integer, so -0 is 0.


Ok, this makes sense, I should have thought about that.



1.0/-0L # Inf
1.0/-0.0 # - Inf

And the same can be said for integer(0), which is the result of
seq_len(0): there is no negative empty integer.


I'm not completely convinced about this one, though.
I would expect -seq_len(n) to remove the first n elements from the 
vector, therefore, when n == 0, it would remove none.


And integer(0) is not the same as 0.

(1:10)[-0] == (1:10)[0] == integer(0) # empty

(1:10)[-seq_len(0)] == (1:10)[-integer(0)]


And I have just reminded myself to run

identical(-integer(0), integer(0))

It returns TRUE so my intuition is wrong, R is right.
End of story.
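
For anyone who wants "drop the first n elements" to also work when n == 0,
here is a minimal sketch that avoids negative indexing altogether. The helper
name drop_first is hypothetical and not part of base R; it relies only on
logical indexing with seq_along().

```r
# Hypothetical helper: drop the first n elements of x, including n == 0.
drop_first <- function(x, n) {
  stopifnot(length(n) == 1, n >= 0)
  # Logical index: keep positions strictly greater than n.
  x[seq_along(x) > n]
}

drop_first(1:10, 0)   # keeps every element
drop_first(1:10, 3)   # keeps elements 4 to 10
drop_first(1:10, 10)  # keeps nothing
```

The logical index never goes through -integer(0), so the n == 0 case needs
no if/else.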

Thanks for the help,

Rui Barradas



Iñaki




Thanks in advance,

Rui Barradas

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Is this a bug in `[`?

2018-08-04 Thread Iñaki Úcar
On Sat, 4 Aug 2018 at 15:32, Rui Barradas wrote:
>
> Hello,
>
> Maybe I am not understanding how negative indexing works but
>
> 1) This is right.
>
> (1:10)[-1]
> #[1]  2  3  4  5  6  7  8  9 10
>
> 2) Are these right? They are at least surprising to me.
>
> (1:10)[-0]
> #integer(0)
>
> (1:10)[-seq_len(0)]
> #integer(0)
>
>
> It was the last example that made me ask, seq_len(0) would avoid an
> if/else or something similar.

I think it's ok, because there is no negative zero integer, so -0 is 0.

1.0/-0L # Inf
1.0/-0.0 # - Inf

And the same can be said for integer(0), which is the result of
seq_len(0): there is no negative empty integer.
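
The points above can be checked directly at the console. This is only a
short sketch using base R, with x as an arbitrary example vector:

```r
# Integer zero has no sign, so negating it changes nothing.
identical(-0L, 0L)

# Double zero does carry a sign, which shows up on division.
1 / -0L    # integer zero: positive infinity
1 / -0.0   # negative double zero: negative infinity

# Negating an empty integer vector gives back the same empty vector,
# so `[` cannot tell "drop nothing" from "select nothing".
identical(-integer(0), integer(0))
x <- 1:10
identical(x[-integer(0)], x[integer(0)])
```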

Iñaki

>
>
> Thanks in advance,
>
> Rui Barradas
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Is this a bug in `[`?

2018-08-04 Thread Rui Barradas

Hello,

Maybe I am not understanding how negative indexing works but

1) This is right.

(1:10)[-1]
#[1]  2  3  4  5  6  7  8  9 10

2) Are these right? They are at least surprising to me.

(1:10)[-0]
#integer(0)

(1:10)[-seq_len(0)]
#integer(0)


It was the last example that made me ask, seq_len(0) would avoid an
if/else or something similar.



Thanks in advance,

Rui Barradas

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Inconsistency, may be bug in read.delim ?

2018-03-21 Thread Tomas Kalibera

On 03/19/2018 02:23 PM, Detlef Steuer wrote:

Dear friends,

I stumbled into behaviour of read.delim which I would consider a bug
or at least an inconsistency that should be improved upon.

Recently we had to work with data that used "", two double quotes, as
symbol to start and end character input.

Essentially the data looked like this

data.csv

V1, V2, V3
""data"", 3, 

The last sequence of  indicating a missing.

After processing the quotes, this is internally parsed as

data 3 "

Which I think is correct; in particular,  represents single quote. 
This is correct and it conforms to RFC 4180. "" in contrast represents 
an empty string.


Based on my reading of RFC4180, ""data"" is not a valid field, but not 
every CSV file follows that RFC, and R supports this pattern as expected 
in your data. So you should be fine here.



One obvious solution to read in this data is using some gsub(),
but that's not the point I want to make.

Consider this case we found during tests:

test.csv

V1, V2, V3, V4
, , 3, ""

and read it with

read.delim("test.csv", sep=",", header=TRUE, na.strings="\"")

After processing the quotes, this is internally parsed as
" " 3 

which is again I think correct (and conforms to RFC 4180)


you get the following

   V1 V2 V3 V4
1 NA  "  3 NA

(and a warning)


I do not get the warning on my system. The reason why the second " is 
not translated to NA by na.strings is white space after the comma in the 
CSV file, this works more consistently:


> read.delim("test.csv", sep=",", header=TRUE, na.strings="\"", 
strip.white=TRUE)

  V1 V2 V3 V4
1 NA NA  3 NA

If one needed to differentiate between " and , then it 
might be necessary to run without the na.strings argument.
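
The interaction between white space and na.strings can be sketched with a
smaller example. The inline data and the "-" marker are assumptions made
for illustration only; they are not the file from the original report.

```r
# Hypothetical two-row input; "-" stands in for the missing-value marker.
# The first data row has a space after the comma, the second does not.
csv <- "x,y\nfoo, -\nbar,-"

# Default: the padded field " -" is compared as-is and does not match.
loose <- read.delim(text = csv, sep = ",", na.strings = "-",
                    stringsAsFactors = FALSE)

# strip.white = TRUE removes the padding before the comparison.
strict <- read.delim(text = csv, sep = ",", na.strings = "-",
                     strip.white = TRUE, stringsAsFactors = FALSE)

is.na(loose$y)
is.na(strict$y)
```

Comparing the two is.na() results shows whether both occurrences of the
marker were treated alike.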


Best
Tomas


I would have assumed to get some error message or at
least the same result for both appearances of  in the
input file.
(the setting na.strings="\"" turned out to be working for
  a colleague and his specific data, while I think it shouldn't)

My main concern is the different interpretation for the two 
sequences.

Real bug? Minor inconsistency? I don't know.

All the best
Detlef




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Inconsistency, may be bug in read.delim ?

2018-03-19 Thread Detlef Steuer
Dear friends,

I stumbled into behaviour of read.delim which I would consider a bug
or at least an inconsistency that should be improved upon.

Recently we had to work with data that used "", two double quotes, as
symbol to start and end character input.

Essentially the data looked like this

data.csv

V1, V2, V3
""data"", 3,  

The last sequence of  indicating a missing.

One obvious solution to read in this data is using some gsub(),
but that's not the point I want to make.

Consider this case we found during tests:

test.csv

V1, V2, V3, V4
, , 3, ""

and read it with 
> read.delim("test.csv", sep=",", header=TRUE, na.strings="\"")  

you get the following

  V1 V2 V3 V4
1 NA  "  3 NA  

(and a warning)

I would have assumed to get some error message or at
least the same result for both appearances of  in the
input file.
(the setting na.strings="\"" turned out to be working for
 a colleague and his specific data, while I think it shouldn't)

My main concern is the different interpretation for the two 
sequences.

Real bug? Minor inconsistency? I don't know.

All the best
Detlef


-- 
'People who say "I have nothing to hide" misunderstand the purpose of
surveillance. It was never about privacy. It's about power.' E. Snowden

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] Somewhat obscure bug in R 3.4.0 building from source

2017-05-22 Thread Peter Carbonetto
Hi Peter, Duncan & Bert,

Thank you kindly for the responses.

Indeed, doc/NEWS.pdf is included in the source distribution, and then
removed upon "make clean".

I thought that it might be useful to report this for your benefit, but on
closer inspection it appears that I'm getting errors that arise due to
incompatibilities in my texlive and texinfo installations. This is the
error I get when trying to build NEWS.pdf using "R CMD Rd2pdf":

R CMD Rd2pdf --output=NEWS.pdf NEWS.Rd
Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,  :
  Running 'texi2dvi' on 'Rd2.tex' failed.
Messages:
/software/texinfo-6.3-el7-x86_64/bin/texi2dvi: TeX neither supports
-recorder nor outputs \openout lines in its log file
Output:

I'm not sure what to make of this error exactly but perhaps it is
introduced by the latest version of texinfo (which seems to be a recurring
issue based on reading the help for texi2dvi in R):

texi2dvi --version
texi2dvi (GNU Texinfo 6.3) 7353

Peter

On Sun, May 21, 2017 at 5:07 PM, Peter Dalgaard  wrote:

> Inline below...
>
> > On 21 May 2017, at 20:57 , Duncan Murdoch 
> wrote:
> >
> > On 21/05/2017 10:30 AM, Peter Carbonetto wrote:
> >> Hi,
> >>
> >> I uncovered a bug in installing R 3.4.0 from source in Linux, following
> the
> >> standard procedure (configure; make; make install). Is this an
> appropriate
> >> place to report this bug? If not, can you please direct me to the
> >> appropriate place?
> >
> > Generally R-devel is better; I've responded there.
> >
> >>
> >> The error occurs only when I do "make clean" followed by "make" again;
> make
> >> works the first time.
> >>
> >> The error is a failure to build NEWS.pdf:
> >>
> >> Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet =
> quiet,  :
> >>  pdflatex is not available
> >> Calls:  -> texi2pdf -> texi2dvi
> >> Execution halted
> >> make[1]: *** [NEWS.pdf] Error 1
> >> make: [docs] Error 2 (ignored)
> >>
> >> and can be reproduced with the following sequence:
> >>
> >> ./configure
> >> make
> >> make clean
> >> make
> >
> > We usually don't build in the source directory; see the second
> recommendation in the admin manual section 2.1.  So it's possible there's a
> bug triggered when you do that.  Can you try building in a separate
> directory?
>
> Notice that the error is that "pdflatex" is missing from your setup. We
> do, for the benefit of users with defective TeX installations supply a
> pre-built NEWS.pdf (and NEWS.html too) in the source tarballs. However,
> they are technically make targets and make clean will wipe them; in that
> case, you had better have the tools to rebuild them!
>
> -pd
>
> >
> > Duncan Murdoch
> >
> >>
> >> This suggests to me that perhaps "make clean" is not working.
> >>
> >> I'm happy to provide more details so that you are able to reproduce the
> bug.
> >>
> >> Thanks,
> >>
> >> Peter Carbonetto, Ph.D.
> >> Computational Staff Scientist, Statistics & Genetics
> >> Research Computing Center
> >> University of Chicago
> >>
> >>  [[alternative HTML version deleted]]
> >>
> >> __
> >> r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/
> posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] Somewhat obscure bug in R 3.4.0 building from source

2017-05-21 Thread Peter Dalgaard
Inline below...

> On 21 May 2017, at 20:57 , Duncan Murdoch  wrote:
> 
> On 21/05/2017 10:30 AM, Peter Carbonetto wrote:
>> Hi,
>> 
>> I uncovered a bug in installing R 3.4.0 from source in Linux, following the
>> standard procedure (configure; make; make install). Is this an appropriate
>> place to report this bug? If not, can you please direct me to the
>> appropriate place?
> 
> Generally R-devel is better; I've responded there.
> 
>> 
>> The error occurs only when I do "make clean" followed by "make" again; make
>> works the first time.
>> 
>> The error is a failure to build NEWS.pdf:
>> 
>> Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,  :
>>  pdflatex is not available
>> Calls:  -> texi2pdf -> texi2dvi
>> Execution halted
>> make[1]: *** [NEWS.pdf] Error 1
>> make: [docs] Error 2 (ignored)
>> 
>> and can be reproduced with the following sequence:
>> 
>> ./configure
>> make
>> make clean
>> make
> 
> We usually don't build in the source directory; see the second recommendation 
> in the admin manual section 2.1.  So it's possible there's a bug triggered 
> when you do that.  Can you try building in a separate directory?

Notice that the error is that "pdflatex" is missing from your setup. We do, for 
the benefit of users with defective TeX installations supply a pre-built 
NEWS.pdf (and NEWS.html too) in the source tarballs. However, they are 
technically make targets and make clean will wipe them; in that case, you had 
better have the tools to rebuild them! 
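
A quick way to see whether the tools are present before running "make
clean" is to ask R itself. This is only a sketch: it checks that a pdflatex
executable is on the PATH, not that the TeX installation is complete or
compatible with texi2dvi.

```r
# Sys.which() returns "" when the program is not found on the PATH.
pdflatex <- Sys.which("pdflatex")
have_pdflatex <- unname(nzchar(pdflatex))

if (have_pdflatex) {
  message("pdflatex found at ", pdflatex)
} else {
  message("pdflatex not found; NEWS.pdf cannot be rebuilt after 'make clean'")
}
```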

-pd

> 
> Duncan Murdoch
> 
>> 
>> This suggests to me that perhaps "make clean" is not working.
>> 
>> I'm happy to provide more details so that you are able to reproduce the bug.
>> 
>> Thanks,
>> 
>> Peter Carbonetto, Ph.D.
>> Computational Staff Scientist, Statistics & Genetics
>> Research Computing Center
>> University of Chicago
>> 
>>  [[alternative HTML version deleted]]
>> 
>> __
>> r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] Somewhat obscure bug in R 3.4.0 building from source

2017-05-21 Thread Duncan Murdoch

On 21/05/2017 10:30 AM, Peter Carbonetto wrote:

Hi,

I uncovered a bug in installing R 3.4.0 from source in Linux, following the
standard procedure (configure; make; make install). Is this an appropriate
place to report this bug? If not, can you please direct me to the
appropriate place?


Generally R-devel is better; I've responded there.



The error occurs only when I do "make clean" followed by "make" again; make
works the first time.

The error is a failure to build NEWS.pdf:

Error in texi2dvi(file = file, pdf = TRUE, clean = clean, quiet = quiet,  :
  pdflatex is not available
Calls:  -> texi2pdf -> texi2dvi
Execution halted
make[1]: *** [NEWS.pdf] Error 1
make: [docs] Error 2 (ignored)

and can be reproduced with the following sequence:

./configure
make
make clean
make


We usually don't build in the source directory; see the second 
recommendation in the admin manual section 2.1.  So it's possible 
there's a bug triggered when you do that.  Can you try building in a 
separate directory?


Duncan Murdoch



This suggests to me that perhaps "make clean" is not working.

I'm happy to provide more details so that you are able to reproduce the bug.

Thanks,

Peter Carbonetto, Ph.D.
Computational Staff Scientist, Statistics & Genetics
Research Computing Center
University of Chicago

[[alternative HTML version deleted]]

__
r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug: floating point bug in nclass.FD can cause hist() to crash

2017-05-20 Thread Sietse Brouwer
Hi, all,

Sietse wrote:
> Floating point errors can cause a data vector to have an ultra-small
> inter-quartile range, which causes `grDevices::nclass.FD` to suggest
> an absurdly large number of breaks to `graphics::hist(breaks="FD")`.
> Because this large float becomes NA when converted to integer, hist's
> call to `base::pretty` crashes.

I have been provided with an account, and filed the bug at
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17274

Discussion continues there.

Cheers,
Sietse

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug: floating point bug in nclass.FD can cause hist() to crash

2017-05-18 Thread Spencer Graves

I just got the same error message with


> sessionInfo()
R version 3.4.0 (2017-04-21)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib


locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils
[5] datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.0 tools_3.4.0
>

On 2017-05-18 3:50 PM, Sietse Brouwer wrote:

Hello everybody,

This is a bug involving functions in core R packages:
graphics::hist.default, grDevices::nclass.FD, and
base::pretty.default. It is not yet on Bugzilla. I cannot submit it
myself, as I do not have an account. Could somebody else add it for
me, perhaps? That would be much appreciated.

Kind regards,

Sietse
Sietse Brouwer


Summary
---

Floating point errors can cause a data vector to have an ultra-small
inter-quartile range, which causes `grDevices::nclass.FD` to suggest
an absurdly large number of breaks to `graphics::hist(breaks="FD")`.
Because this large float becomes NA when converted to integer, hist's
call to `base::pretty` crashes.

How could nclass.FD, which has the job of suggesting a reasonable number of
classes, avoid suggesting an absurdly large number of classes when the
inter-quartile range is absurdly small compared to the range?


Steps to reproduce
--

 hist(c(1, 1, 1, 1 + 1e-15, 2), breaks="FD")


Observed behaviour
--

Running this code gives the following error message:

 Error in pretty.default(range(x), n = breaks, min.n = 1):
   invalid 'n' argument
 In addition: Warning message:
 In pretty.default(range(x), n = breaks, min.n = 1) :
   NAs introduced by coercion to integer range


Expected behaviour
--

That hist() should never crash when given valid numerical data. Specifically,
that it should be robust even to those rare datasets where (through floating
point inaccuracy) the inter-quartile range is tens of orders of magnitude
smaller than the range.


Analysis


Dramatis personae:

* graphics::hist.default
   https://svn.r-project.org/R/trunk/src/library/graphics/R/hist.R

* grDevices::nclass.FD
   https://svn.r-project.org/R/trunk/src/library/grDevices/R/calc.R

* base::pretty.default
   https://svn.r-project.org/R/trunk/src/library/base/R/pretty.R

`nclass.FD` examines the inter-quartile range of `x`, and gets a positive, but
very small floating point value -- let's call it TINYFLOAT. It inserts this
ultra-low IQR into the `nclass` denominator, which means `nclass`
becomes a huge number -- let's call it BIGFLOAT. `nclass.FD` then returns this
huge value to `hist`.

Once `hist` has its 'number of breaks' suggestion, it feeds this
number to `pretty`:

 pretty(range(x), BIGFLOAT, min.n = 1)

`pretty`, in turn, calls

 .Internal(pretty(min(x), max(x), BIGFLOAT, min.n, shrink.sml,
 c(high.u.bias, u5.bias), eps.correct))

Which fails with the error and warning shown at start of this e-mail. (Invalid
'n' argument / NA's introduced by coercion to integer range.) My reading is
that .Internal tried to coerce BIGFLOAT to integer range and produced an NA,
and that (the C implementation of) `pretty`, in turn, choked when confronted
with NA.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Bug: floating point bug in nclass.FD can cause hist() to crash

2017-05-18 Thread Sietse Brouwer
Hello everybody,

This is a bug involving functions in core R packages:
graphics::hist.default, grDevices::nclass.FD, and
base::pretty.default. It is not yet on Bugzilla. I cannot submit it
myself, as I do not have an account. Could somebody else add it for
me, perhaps? That would be much appreciated.

Kind regards,

Sietse
Sietse Brouwer


Summary
---

Floating point errors can cause a data vector to have an ultra-small
inter-quartile range, which causes `grDevices::nclass.FD` to suggest
an absurdly large number of breaks to `graphics::hist(breaks="FD")`.
Because this large float becomes NA when converted to integer, hist's
call to `base::pretty` crashes.

How could nclass.FD, which has the job of suggesting a reasonable number of
classes, avoid suggesting an absurdly large number of classes when the
inter-quartile range is absurdly small compared to the range?


Steps to reproduce
--

hist(c(1, 1, 1, 1 + 1e-15, 2), breaks="FD")


Observed behaviour
--

Running this code gives the following error message:

Error in pretty.default(range(x), n = breaks, min.n = 1):
  invalid 'n' argument
In addition: Warning message:
In pretty.default(range(x), n = breaks, min.n = 1) :
  NAs introduced by coercion to integer range


Expected behaviour
--

That hist() should never crash when given valid numerical data. Specifically,
that it should be robust even to those rare datasets where (through floating
point inaccuracy) the inter-quartile range is tens of orders of magnitude
smaller than the range.


Analysis


Dramatis personae:

* graphics::hist.default
  https://svn.r-project.org/R/trunk/src/library/graphics/R/hist.R

* grDevices::nclass.FD
  https://svn.r-project.org/R/trunk/src/library/grDevices/R/calc.R

* base::pretty.default
  https://svn.r-project.org/R/trunk/src/library/base/R/pretty.R

`nclass.FD` examines the inter-quartile range of `x`, and gets a positive, but
very small floating point value -- let's call it TINYFLOAT. It inserts this
ultra-low IQR into the `nclass` denominator, which means `nclass`
becomes a huge number -- let's call it BIGFLOAT. `nclass.FD` then returns this
huge value to `hist`.

Once `hist` has its 'number of breaks' suggestion, it feeds this
number to `pretty`:

pretty(range(x), BIGFLOAT, min.n = 1)

`pretty`, in turn, calls

.Internal(pretty(min(x), max(x), BIGFLOAT, min.n, shrink.sml,
c(high.u.bias, u5.bias), eps.correct))

Which fails with the error and warning shown at start of this e-mail. (Invalid
'n' argument / NA's introduced by coercion to integer range.) My reading is
that .Internal tried to coerce BIGFLOAT to integer range and produced an NA,
and that (the C implementation of) `pretty`, in turn, choked when confronted
with NA.
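
The arithmetic described in this analysis can be reproduced by hand. The
sketch below writes out the Freedman-Diaconis count directly instead of
calling nclass.FD, since the implementation of nclass.FD may differ between
R versions. The guard fd_capped and its max_classes argument are
hypothetical and only illustrate one possible safeguard; they are not a
fix taken from R.

```r
x <- c(1, 1, 1, 1 + 1e-15, 2)

# Freedman-Diaconis count written out by hand.
iqr <- stats::IQR(x)                       # positive but tiny (TINYFLOAT)
fd_raw <- ceiling(diff(range(x)) / (2 * iqr * length(x)^(-1/3)))

fd_raw > .Machine$integer.max              # BIGFLOAT is out of integer range
suppressWarnings(as.integer(fd_raw))       # coercion yields NA

# Hypothetical guard: cap the suggestion at max_classes.
fd_capped <- function(x, max_classes = 1e6) {
  h <- 2 * stats::IQR(x) * length(x)^(-1/3)
  if (h <= 0) return(1L)
  as.integer(min(ceiling(diff(range(x)) / h), max_classes))
}

fd_capped(x)
```

A cap is the bluntest possible safeguard; whether a fallback to a
different spread estimate would be preferable is left open here.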

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Potential clue for Bug 16975 - lme fixed sigma - inconsistent REML estimation

2017-03-06 Thread Francisco Mauro
Dear list,

I was trying to create a varClass for nlme to work with Fay-Herriot
(FH) models. The idea was to write a modification of varComb that
sums the variance functions instead of multiplying them (I called it
varSum). After some failed attempts I found that I was not getting
the expected results because I needed to hold sigma fixed. While
looking for a way to hold sigma fixed I ran into this bug report
(status unconfirmed, but listed):

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16975

The ideas proposed there make the variance function I had in mind
unnecessary, so I set my idea aside for a couple of days. I recently
tried the variance function after all and got estimates that were
consistent with those provided by the packages sae and metafor and
with the S-PLUS version of nlme.

The way of fitting an FH model proposed by Maciej in the bug report
differs from the way I fitted the model using the additive variance
function varSum. Both formulations lead to a similar distribution of
the response. However, Maciej's formulation adds the unknown variance
component as a random effect, whereas the formulation with the additive
varClass treats it as the variance of an error component.
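
To make the difference between multiplying and summing concrete, here is a
small numeric sketch. It assumes, as documented for nlme, that varWeights
returns inverse standard deviations; the weight vectors w_a and w_b are
made-up values, not output from any fitted model.

```r
# Hypothetical weights (inverse standard deviations) from two components.
w_a <- c(1, 0.5, 0.25)
w_b <- c(2, 2, 2)

# Multiplicative combination: the weights are multiplied.
w_prod <- w_a * w_b

# Additive combination: convert to variances, add them, convert back.
w_sum <- 1 / sqrt((1 / w_a)^2 + (1 / w_b)^2)

# The implied variances are sums of the component variances.
cbind(var_a = 1 / w_a^2, var_b = 1 / w_b^2, var_sum = 1 / w_sum^2)
```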

Results using varSum match those provided by the packages sae and
metafor, while the alternative proposed by Maciej does not agree
with those two packages, which is the subject of the bug mentioned
above. Given this, and since the bug is still open, I thought
that this example (below) could be helpful and provide some clues
about what is causing the bug.

I'm not sure which way is better for fitting models like the FH
model. I find Maciej's solution more flexible in several respects than
the route I was taking, but the reported bug made me lose some
confidence. I believe that information about bug 16975 would be very
interesting. I hope the example below can help or provide some clues.

Thanks

Paco Mauro

# TODO: Add comment
#
# Author: Paco
###
sessionInfo()
#R version 3.3.2 (2016-10-31)
#Platform: x86_64-w64-mingw32/x64 (64-bit)
#Running under: Windows 10 x64 (build 14393)
#
#locale:
#  [1] LC_COLLATE=English_United States.1252
#[2] LC_CTYPE=English_United States.1252
#[3] LC_MONETARY=English_United States.1252
#[4] LC_NUMERIC=C
#[5] LC_TIME=English_United States.1252
#
#attached base packages:
#  [1] stats graphics  grDevices utils datasets  methods   base
#
#other attached packages:
#  [1] sae_1.1  MASS_7.3-45  nlme_3.1-131 rj_2.0.5-2
#
#loaded via a namespace (and not attached):
#  [1] tools_3.3.2 grid_3.3.2  lattice_0.20-34

#Package: nlme
#Version: 3.1-131
library(nlme)
library(sae)
## varSum, a modification of varComb to make the combination
## additive instead of multiplicative
varSum <-
  ## constructor for the varSum class
  function(...)
{
 val <- list(...)
 if (!all(unlist(lapply(val, inherits, "varFunc")))) {
  stop("all arguments to 'varSum' must be of class \"varFunc\".")
 }
 if (is.null(names(val))) {
  names(val) <- LETTERS[seq_along(val)]
 }
 class(val) <- c("varSum","varComb", "varFunc")
 val
}
varWeights.varSum <-
  function(object)
{
 apply(as.data.frame(lapply(object, varWeights)), 1, function(x){
  1/sqrt(sum((1/x)^2))
 })
}
Initialize.varSum <-
  function(object, data, ...)
{
 val <- lapply(object, Initialize, data)
 attr(val, "plen") <- unlist(lapply(val, function(el) length(coef(el))))
 class(val) <- c("varSum","varComb", "varFunc")
 val
}
logLik.varSum <-
  function(object, ...)
{
 lls <- lapply(object, logLik)

 lls2 <- apply(as.data.frame(lapply(object, varWeights)), 1, function(x){
  1/sqrt(sum((1/x)^2))
 })

 val <- sum(log(lls2))
 attr(val, "df") <- sum(unlist(lapply(lls, attr, "df")))
 class(val) <- "logLik"
 val
}

## The methods from here to the example are just copies of the
## varComb methods with different names
coef.varSum <-
  function(object, unconstrained = TRUE, allCoef = FALSE, ...)
{
 unlist(lapply(object, coef, unconstrained, allCoef))
}

"coef<-.varSum" <-
  function(object, ..., value)
{
 plen <- attr(object, "plen")
 if ((len <- sum(plen)) > 0) {  # varying parameters
  if (length(value) != len) {
   stop("cannot change parameter length of initialized \"varSum\" object")
  }
  start <- 0
  for (i in seq_along(object)) {
   if (plen[i] > 0) {
    coef(object[[i]]) <- value[start + (1:plen[i])]
    start <- start + plen[i]
   }
  }
 }
 object
}
formula.varSum <-
  function(x, ...) lapply(x, formula)
needUpdate.varSum <-
  function(object) any(unlist(lapply(object, needUpdate)))

print.varSum <-
  function(x, ...)
{
 cat("Sum of:\n")
 lapply(x, print)
 invisible(x)
}

print.summary.varSum <-
  function(x, ...)
{
 cat(attr(x, "structName"),"\n")
 lapply(x, print, FALSE)
 invisible(x)
}

summary.varSum <-
  function(object, structName = "Sum of variance functions:", ...)
{
 object[] <- 

Re: [Rd] anonymous function parsing bug?

2016-10-21 Thread peter dalgaard

> On 21 Oct 2016, at 19:17 , Wilm Schumacher  wrote:
> 
> Am 21.10.2016 um 18:10 schrieb William Dunlap:
>> Are you saying that
>>f1 <- function(x) log(x)
>>f2 <- function(x) { log } (x)
>> should act differently?
> yes. Or more precisely: I would expect that. "Should" implies, that I want to 
> change something. I just want to understand the behavior (or file a bug, if 
> this would have been one).

I think Bill and Luke are failing in trying to make you work out the logic for 
yourself...

The point is that 
{
  some_computation
}(x)

is an expression that evaluates some_computation and applies it as a function 
to the argument x (or fails if not a function). 

When you define functions, the body can be a single expression, so

f <- function(a)
{
  some_computation
}(x)

is effectively the same as

f <- function(a) {
 {
   some_computation
 }(x)
}

where you seem to be expecting

{f <- function(a) {
 {
   some_computation
 }
}(x)

Got it?
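
For anyone who wants to check this at the prompt, here is a short sketch that inspects the parsed function. The names f and g are just illustrative.

```r
# The trailing "(x)" binds to the braces, so it is part of the body.
f <- function(x) { log }(x)

body(f)          # the whole call: the block { log } applied to x
class(body(f))   # "call", not "{"
body(f)[[1]]     # the function position of that call: the block { log }
f(10)            # evaluates the block, gets log, applies it to 10

# Parenthesizing the definition applies the function immediately instead.
g <- (function(x) { log })(10)
identical(g, log)   # g is the log function itself, not a number
```

So the assignment stores a function whose body happens to be a call, and nothing is applied at definition time.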
  
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] anonymous function parsing bug?

2016-10-21 Thread Wilm Schumacher

Hi,


Am 21.10.2016 um 18:10 schrieb William Dunlap:

Are you saying that
f1 <- function(x) log(x)
f2 <- function(x) { log } (x)
should act differently?
Yes. Or more precisely: I would expect that. "Should" implies that I
want to change something. I just want to understand the behavior (or
file a bug, if this had been one).


As I wrote, in e.g. node.js the counterparts of the lines you wrote are
treated differently (the first is a function, the latter is a parsing
error).


Let's use this example instead:
x <- 20
f1 <- function(x) { x<-x+1; log(x) }
f2 <- function(x) { x<-x+1; log } (x)
which act equally.

But as the latter is a legal statement, I would read it as
f2 <- (function(x) { x<-x+1; log }) (x)

thus, I would expect the first to be a function, the latter to be a 
numeric ( log(20) in this case ).



Using 'return' complicates the matter, because it affects evaluation, 
not parsing.


But perhaps it illustrates my problem a little better:
x <- 20
f1 <- function(x) return(log(x))
f2 <- function(x) { return(log) } (x)

f1(10) is a numeric, f2(10) is the log function. Again: as the latter is 
a legal statement, I would expect:

f2 <- (function(x) { x<-x+1; log }) (x)

However, given the answers, I will try to construct the AST of that
statement according to the grammar defined in gram.y

f2 <- function(x) { x<-x+1; log } (x)
to understand what the R interpreter really does.

Best wishes,

Wilm



Re: [Rd] anonymous function parsing bug?

2016-10-21 Thread luke-tierney

You might find it useful to look at what body() shows you for your
example and to think about what return does.
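
A small sketch of that exercise, using the two definitions from the message quoted below. The tryCatch() wrapper and the names res1 and res2 are additions for illustration.

```r
# print() returns its argument, so the block yields a number, and
# calling that number with (2) fails after the printing has happened.
f1 <- function(x) { print(2 * x) }(2)
res1 <- tryCatch(f1(3), error = function(e) e)
inherits(res1, "error")

# return() leaves f2 while the function position is still being
# evaluated, so the call with (2) is never attempted.
f2 <- function(x) { return(2 * x) }(2)
res2 <- f2(3)

# In both cases the body is the same kind of object: a call whose
# function position is a braced block.
body(f1)
body(f2)
```

Both functions parse the same way; only the evaluation differs, because return() exits before the result of the block is applied.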

Best,

luke

On Fri, 21 Oct 2016, Wilm Schumacher wrote:


Hi,

thx for the reply. Unfortunately that is not a simplified version of the
problem. You have a function, call it and get the result (numeric in,
numeric out in that case). For simplicity lets use the "return" case:

##
foobar<-function(x) { return(sqrt(x)) }(2)
##
which is a function (numeric in, numeric out) which is defined, then
gets called and the return value is a function (with an appendix of
"(2)" which gets ignored), not the numeric.

In my opinion the result of the expression above should be a numeric
(1.41... in this case) or an parser error because of ambiguities.

e.g. in comparison with node.js

##
function(x){
return(2*x)
}(2);
##

leads to

##
SyntaxError: Unexpected token (
##

Or Haskell (and basically every complete functional languange)
##
(\x -> 2*x) 2
##
which leads to 4 (... okay, that is not comparable because here the
parenthesis make a closure which also works in R or node.js).

However, I think it's weird that

> ( function(x) { return(2*x) } ( 2 ) ) (3)

is a legal statement which results to 6 and that the "(2)" is basically
ignored by the parser.

Furthermore it is very strange, that

##
f1<-function(x) { print(2*x) }(2)
f1(3)
##
does the command and gives an error ("attempt to apply non-function") and
##
f2<-function(x) { return(2*x) }(2)
f2(3)
##
is perfectly fine. Thus the return statement changes the interpretation
as a function? Or do I miss something?

Best wishes

Wilm

Am 21.10.2016 um 17:00 schrieb William Dunlap:

Here is a simplified version of your problem
 > { sqrt }(c(2,4,8))
  [1] 1.414214 2.000000 2.828427
Do you want that to act differently?


Bill Dunlap
TIBCO Software
wdunlap tibco.com 

On Fri, Oct 21, 2016 at 6:10 AM, Wilm Schumacher
> wrote:

Hi,

I hope this is the correct list for my question. I found a wired
behaviour of my R installation on the evaluation of anonymous
functions.

minimal working example

###
f<-function(x) {
print( 2*x )
}(2)

class(f)

f(3)

f<-function(x) {
print( 2*x )
}(4)(5)

f(6)
###

leads to

###
   > f<-function(x) {
+ print( 2*x )
+ }(2)
   >
   > class(f)
[1] "function"
   >
   > f(3)
[1] 6
Error in f(3) : attempt to apply non-function
   >
   > f<-function(x) {
+ print( 2*x )
+ }(4)(5)
   >
   > f(6)
[1] 12
Error in f(6) : attempt to apply non-function

###

is this a bug or desired behavior? Using parenthesis of coures
solves the problem. However, I think the operator precedence could
be the problem here. I looked at the "./src/main/gram.y" and I
think that the line 385
|FUNCTION '(' formlist ')' cr expr_or_assign %prec LOW
should be of way higher precedence. But I cannot forsee the side
effects of that (which could be horrible in that case).

If this is the desired behaviour and not a bug, I'm very
interested in the rational behind that.

Best wishes,

Wilm

ps:

$ R --version
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"




--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone: 319-335-3386
Department of Statistics and        Fax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall                  email: luke-tier...@uiowa.edu
Iowa City, IA 52242                 WWW:   http://www.stat.uiowa.edu



Re: [Rd] anonymous function parsing bug?

2016-10-21 Thread Wilm Schumacher
Hi,

Thanks for the reply. Unfortunately that is not a simplified version of the
problem. You have a function, call it and get the result (numeric in,
numeric out in that case). For simplicity let's use the "return" case:

##
foobar<-function(x) { return(sqrt(x)) }(2)
##
This defines a function (numeric in, numeric out) and then calls it,
yet the value assigned is a function (with a trailing "(2)" that gets
ignored), not the numeric.

In my opinion the result of the expression above should be a numeric
(1.41... in this case) or a parser error because of ambiguities.

e.g. in comparison with node.js

##
function(x){
 return(2*x)
}(2);
##

leads to

##
SyntaxError: Unexpected token (
##

Or Haskell (and basically every complete functional language)
##
(\x -> 2*x) 2
##
which leads to 4 (... okay, that is not comparable because here the 
parenthesis make a closure which also works in R or node.js).

However, I think it's weird that

 > ( function(x) { return(2*x) } ( 2 ) ) (3)

is a legal statement which results in 6 and that the "(2)" is basically
ignored by the parser.

Furthermore it is very strange, that

##
f1<-function(x) { print(2*x) }(2)
f1(3)
##
does the command and gives an error ("attempt to apply non-function") and
##
f2<-function(x) { return(2*x) }(2)
f2(3)
##
is perfectly fine. So the return statement changes the interpretation
as a function? Or am I missing something?

Best wishes

Wilm

Am 21.10.2016 um 17:00 schrieb William Dunlap:
> Here is a simplified version of your problem
>   > { sqrt }(c(2,4,8))
>   [1] 1.414214 2.000000 2.828427
> Do you want that to act differently?
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com 
>
> On Fri, Oct 21, 2016 at 6:10 AM, Wilm Schumacher 
> > wrote:
>
> Hi,
>
> I hope this is the correct list for my question. I found a wired
> behaviour of my R installation on the evaluation of anonymous
> functions.
>
> minimal working example
>
> ###
> f<-function(x) {
> print( 2*x )
> }(2)
>
> class(f)
>
> f(3)
>
> f<-function(x) {
> print( 2*x )
> }(4)(5)
>
> f(6)
> ###
>
> leads to
>
> ###
> > f<-function(x) {
> + print( 2*x )
> + }(2)
> >
> > class(f)
> [1] "function"
> >
> > f(3)
> [1] 6
> Error in f(3) : attempt to apply non-function
> >
> > f<-function(x) {
> + print( 2*x )
> + }(4)(5)
> >
> > f(6)
> [1] 12
> Error in f(6) : attempt to apply non-function
>
> ###
>
> is this a bug or desired behavior? Using parenthesis of coures
> solves the problem. However, I think the operator precedence could
> be the problem here. I looked at the "./src/main/gram.y" and I
> think that the line 385
> |FUNCTION '(' formlist ')' cr expr_or_assign %prec LOW
> should be of way higher precedence. But I cannot forsee the side
> effects of that (which could be horrible in that case).
>
> If this is the desired behaviour and not a bug, I'm very
> interested in the rational behind that.
>
> Best wishes,
>
> Wilm
>
> ps:
>
> $ R --version
> R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
>


Re: [Rd] anonymous function parsing bug?

2016-10-21 Thread William Dunlap via R-devel
Here is a simplified version of your problem
  > { sqrt }(c(2,4,8))
  [1] 1.414214 2.000000 2.828427
Do you want that to act differently?


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Oct 21, 2016 at 6:10 AM, Wilm Schumacher 
wrote:

> Hi,
>
> I hope this is the correct list for my question. I found a wired behaviour
> of my R installation on the evaluation of anonymous functions.
>
> minimal working example
>
> ###
> f<-function(x) {
> print( 2*x )
> }(2)
>
> class(f)
>
> f(3)
>
> f<-function(x) {
> print( 2*x )
> }(4)(5)
>
> f(6)
> ###
>
> leads to
>
> ###
> > f<-function(x) {
> + print( 2*x )
> + }(2)
> >
> > class(f)
> [1] "function"
> >
> > f(3)
> [1] 6
> Error in f(3) : attempt to apply non-function
> >
> > f<-function(x) {
> + print( 2*x )
> + }(4)(5)
> >
> > f(6)
> [1] 12
> Error in f(6) : attempt to apply non-function
>
> ###
>
> is this a bug or desired behavior? Using parenthesis of coures solves the
> problem. However, I think the operator precedence could be the problem
> here. I looked at the "./src/main/gram.y" and I think that the line 385
> |FUNCTION '(' formlist ')' cr expr_or_assign %prec LOW
> should be of way higher precedence. But I cannot forsee the side effects
> of that (which could be horrible in that case).
>
> If this is the desired behaviour and not a bug, I'm very interested in the
> rational behind that.
>
> Best wishes,
>
> Wilm
>
> ps:
>
> $ R --version
> R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"
>


Re: [Rd] anonymous function parsing bug?

2016-10-21 Thread Wilm Schumacher

Hi,

sry for the double posting. I forgot to mention that this example

###
f<-function(x) {
return( 2*x )
}(2)

class(f)

f(3)

f<-function(x) {
return( 2*x )
}(4)(5)

f(6)
###

leads to

##
> f<-function(x) {
+ return( 2*x )
+ }(2)
>
> class(f)
[1] "function"
>
> f(3)
[1] 6
>
> f<-function(x) {
+ return( 2*x )
+ }(4)(5)
>
> f(6)
[1] 12
##

which is even stranger (at least for me) and contradicts the first 
listing imho in behaviour.


Best wishes,

Wilm

Am 21.10.2016 um 15:10 schrieb Wilm Schumacher:

Hi,

I hope this is the correct list for my question. I found a wired 
behaviour of my R installation on the evaluation of anonymous functions.


minimal working example

###
f<-function(x) {
print( 2*x )
}(2)

class(f)

f(3)

f<-function(x) {
print( 2*x )
}(4)(5)

f(6)
###

leads to

###
> f<-function(x) {
+ print( 2*x )
+ }(2)
>
> class(f)
[1] "function"
>
> f(3)
[1] 6
Error in f(3) : attempt to apply non-function
>
> f<-function(x) {
+ print( 2*x )
+ }(4)(5)
>
> f(6)
[1] 12
Error in f(6) : attempt to apply non-function

###

is this a bug or desired behavior? Using parenthesis of coures solves 
the problem. However, I think the operator precedence could be the 
problem here. I looked at the "./src/main/gram.y" and I think that the 
line 385

|FUNCTION '(' formlist ')' cr expr_or_assign %prec LOW
should be of way higher precedence. But I cannot forsee the side 
effects of that (which could be horrible in that case).


If this is the desired behaviour and not a bug, I'm very interested in 
the rational behind that.


Best wishes,

Wilm

ps:

$ R --version
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"





[Rd] anonymous function parsing bug?

2016-10-21 Thread Wilm Schumacher

Hi,

I hope this is the correct list for my question. I found some weird
behaviour of my R installation in the evaluation of anonymous functions.


minimal working example

###
f<-function(x) {
print( 2*x )
}(2)

class(f)

f(3)

f<-function(x) {
print( 2*x )
}(4)(5)

f(6)
###

leads to

###
> f<-function(x) {
+ print( 2*x )
+ }(2)
>
> class(f)
[1] "function"
>
> f(3)
[1] 6
Error in f(3) : attempt to apply non-function
>
> f<-function(x) {
+ print( 2*x )
+ }(4)(5)
>
> f(6)
[1] 12
Error in f(6) : attempt to apply non-function

###

Is this a bug or desired behavior? Using parentheses of course solves
the problem. However, I think the operator precedence could be the
problem here. I looked at "./src/main/gram.y" and I think that
line 385

|FUNCTION '(' formlist ')' cr expr_or_assign %prec LOW
should have much higher precedence. But I cannot foresee the side effects
of that (which could be horrible in this case).


If this is the desired behaviour and not a bug, I'm very interested in
the rationale behind it.


Best wishes,

Wilm

ps:

$ R --version
R version 3.3.1 (2016-06-21) -- "Bug in Your Hair"



[Rd] getGraphicsEvent() and setTimeLimit() bug and compatibility patch.

2016-09-18 Thread Richard Bodewits
Hey all.

Setting a time limit with setTimeLimit(), and then using
getGraphicsEvent(), will cause graphics event handling for the current
device to break on timeout, until the device is destroyed and recreated.

The problem lies in do_getGraphicsEvent() checking the value of
dd->gettingEvent and concluding it's being called recursively
(ironically this same test fails to detect actual recursion). The
gettingEvent value is not reset on error conditions, so the device
becomes unusable for the remainder of its life.

As far as I've been able to tell, the values set with setTimeLimit() are
checked in R_ProcessEvents(), which is defined separately in
gnuwin32/system.c and unix/sys-unix.c. If a time limit is exceeded, it
error()s out. That in turn percolates up and through jump_to_top_ex(),
as defined in main/errors.c, which eventually calls GEonExit() in
main/engine.c, a function meant to reset some state on graphics devices
and which from the look of it and its comments was added to fix a
similar bug with recordGraphics().

I've added a single line to GEonExit(), to also reset gettingEvent back
to FALSE, which seems to have no side effects, and serves to make
getGraphicsEvent() timeout-friendly. It errors out as expected, and can
then be reused immediately. The patch is attached to this mail.
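
The failure mode and the effect of the fix can be imitated at the R level. This is only an analogy, not the C code: the environment dev, its gettingEvent field and both functions are hypothetical stand-ins, and the timeout is simulated with stop().

```r
# Hypothetical stand-in for a device carrying a 'gettingEvent' flag.
dev <- new.env()
dev$gettingEvent <- FALSE

# Mimics the reported behaviour: the flag is set, an error (standing in
# for the timeout) interrupts the wait, and the flag is never cleared.
get_event_buggy <- function(dev) {
  if (dev$gettingEvent) stop("recursive use is not supported")
  dev$gettingEvent <- TRUE
  stop("reached elapsed time limit")   # simulated timeout
}

# Same function with the flag cleared on every exit path, which is the
# effect the one-line change to GEonExit() aims for.
get_event_fixed <- function(dev) {
  if (dev$gettingEvent) stop("recursive use is not supported")
  dev$gettingEvent <- TRUE
  on.exit(dev$gettingEvent <- FALSE)
  stop("reached elapsed time limit")   # simulated timeout
}

# Helper that returns the error message instead of stopping.
msg <- function(expr) tryCatch(expr, error = function(e) conditionMessage(e))

first_buggy  <- msg(get_event_buggy(dev))   # the simulated timeout
second_buggy <- msg(get_event_buggy(dev))   # blocked by the stale flag

dev$gettingEvent <- FALSE                   # reset before the second run
first_fixed  <- msg(get_event_fixed(dev))   # the simulated timeout
second_fixed <- msg(get_event_fixed(dev))   # times out again, still usable
```

In the buggy variant the second call never reaches the wait, which corresponds to the device staying unusable until it is destroyed and recreated.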

One problem with this is recursion. My previous patch fixes the problem
of do_getGraphicsEvent() being called recursively from its own handlers,
but without that patch it becomes possible that R_ProcessEvents()
triggers a timeout while we're in a recursive call of
do_getGraphicsEvent(). Resetting gettingEvent would then potentially
affect all parent do_getGraphicsEvent() calls in the recursion stack.

Currently do_getGraphicsEvent() does not actually check the value of
gettingEvent after it would have triggered a recursion, but this seems
like a landmine for future changes to step on. To support setTimeLimit()
properly without applying the recursion fix, would require some sort of
stack-aware alternative to gettingEvent's current single boolean
implementation.

Short version: this patch builds on my do_getGraphicsEvent()
recursion patch, and will fix getGraphicsEvent() choking when
setTimeLimit() times out.


- Richard Bodewits
Index: src/main/engine.c
===================================================================
--- src/main/engine.c   (revision 71293)
+++ src/main/engine.c   (working copy)
@@ -3043,6 +3043,7 @@
     gd = GEgetDevice(devNum);
     gd->recordGraphics = TRUE;
     dd = gd->dev;
+    dd->gettingEvent = FALSE; // Added to allow setTimeLimit() to be used with getGraphicsEvent().
     if (dd->onExit) dd->onExit(dd);
     devNum = nextDevice(devNum);
 }

Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-26 Thread frederik
That was an amusing bug.

I wish I had time to help more directly with this project, which has
been very helpful to me.

Frederick

On Thu, Feb 25, 2016 at 08:04:40PM +0100, peter dalgaard wrote:
> In case you still care, see 
> 
> https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16726
> 
> which even our very human spam detector hasn't decided to label as spam (yet).
> 
> -pd
> 
> > On 08 Feb 2016, at 18:34 , frede...@ofb.net wrote:
> > 
> > Ah, thank you for that explanation. I somehow didn't catch that my
> > Bugzilla account had been disabled by a human.
> > 
> > "Common pattern is to post ... something copied from a generic bug
> > report" - that sounds very annoying.
> > 
> > Frederick
> > 
> > On Sun, Feb 07, 2016 at 11:54:11AM +0100, peter dalgaard wrote:
> >> Unfortunately, the spammers in question appear to be human (of sorts). 
> >> 
> >> We're not sure what they're up to, but a common pattern is to post random 
> >> text, or something copied from a generic bug report (like "able to add 6 
> >> item"), later followed by a comment containing a link or a file 
> >> attachment. 
> >> 
> >> Presumably, it is some sort of click-bait scheme, but it could also be a 
> >> covert channel for contraband files. At any rate, it is very hard to 
> >> distinguish by mechanical means. So it is done by eye, with some risk of 
> >> Type-I error. Thus, the Bugzilla maintainers are pretty vigilant to stamp 
> >> out spammers, sometimes edging on being ham-fisted (er, -footed?).
> >> 
> >> -pd
> >> 
> >> 
> >>> On 07 Feb 2016, at 00:25 , frede...@ofb.net wrote:
> >>> 
> >>> No problem.
> >>> 
> >>> Another suggestion would be to simply validate user input like most
> >>> websites, and reject invalid submissions immediately, rather than
> >>> blocking the user's account. I don't know what kind of spambots you
> >>> are up against, but unless they are very intelligent I doubt they'll
> >>> be able to understand a message like "You submitted a bug with no body
> >>> text, please enter something and try again." There may also be the
> >>> option of using Captchas.
> >>> 
> >>> Not sure how hard it is to get Bugzilla to do these things.
> >>> 
> >>> Thanks,
> >>> 
> >>> Frederick
> >>> 
> >>> P.S. (I now see that all errors on the bug tracker are displayed with
> >>> a red background)
> >>> 
> >>> On Sat, Feb 06, 2016 at 03:00:21AM -0500, Duncan Murdoch wrote:
> >>>> Thanks for the suggestions.
> >>>>
> >>>> Duncan Murdoch
> >>>>
> >>>> On 05/02/2016 10:07 PM, frede...@ofb.net wrote:
> >>>>> Hi Duncan Murdoch,
> >>>>>
> >>>>> Thanks for your time. I apologize for not telling you that my email
> >>>>> address on the bug tracker is slightly different -
> >>>>> "frederik-rproj...@ofb.net" vs "frede...@ofb.net". I was going to
> >>>>> follow up with this information, but then I thought, he probably knows
> >>>>> how to find a tagged email address.
> >>>>>
> >>>>> I do hope that you are able to fix the bug tracker. In particular,
> >>>>> people should be made aware that their account is blocked before being
> >>>>> invited to submit a bug. The error they get should be less rude - no
> >>>>> need to make it red - and the email address in the error should be
> >>>>> filled in. You complained about wasting time having to look for my
> >>>>> email address - well, I wasted time looking for yours. The error
> >>>>> message could even hint at what triggered the ban. I don't think that
> >>>>> you're going to get very far by trying to scare off actual spammers
> >>>>> with a big red accusation - I imagine they all have pretty thick skin.
> >>>>>
> >>>>> Reading the first line of my bug report was generous of you, but if
> >>>>> you read the rest, you'll see that, indeed, after checking with the
> >>>>> knowledgeable i3 guys, it appears to be an R bug. So I would like to
> >>>>> submit it. What appears at the top of my bug report is a copy of the
> >>>>> original bug I posted to i3, at the linked URL (are links OK or will I
> >>>>> get banned again?).
> >>>>>
> >>>>> The reason a bug appeared with the subject "til" is because I noticed
> >>>>> that when typing into the subject field, some "related bugs" come up.
> >>>>> However, this "suggestion" process appeared to be stalled when I typed
> >>>>> "til" (for "tiling" or "tilable"). I tried hitting "enter" and it
> >>>>> ended up opening a bug with that subject, which I never submitted,
> >>>>> because I clicked "back" and figured out that *four* characters are
> >>>>> actually necessary to start getting suggestions. The whole point of
> >>>>> doing this was to see if another bug had been submitted with the same
> >>>>> topic, and thereby save you time! I'm not going to try to reproduce
> >>>>> this error, because you said it will get me banned again, but I think
> >>>>> somebody should try to fix the site so that people don't get banned
> >>>>> for any content which is not submitted. Especially people with
> >>>>> months-old accounts, like me.
> >>>>>
> >>>>> I definitely 

Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-25 Thread peter dalgaard
In case you still care, see 

https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16726

which even our very human spam detector hasn't decided to label as spam (yet).

-pd

> On 08 Feb 2016, at 18:34 , frede...@ofb.net wrote:
> 
> Ah, thank you for that explanation. I somehow didn't catch that my
> Bugzilla account had been disabled by a human.
> 
> "Common pattern is to post ... something copied from a generic bug
> report" - that sounds very annoying.
> 
> Frederick
> 
> On Sun, Feb 07, 2016 at 11:54:11AM +0100, peter dalgaard wrote:
>> Unfortunately, the spammers in question appear to be human (of sorts). 
>> 
>> We're not sure what they're up to, but a common pattern is to post random 
>> text, or something copied from a generic bug report (like "able to add 6 
>> item"), later followed by a comment containing a link or a file attachment. 
>> 
>> Presumably, it is some sort of click-bait scheme, but it could also be a 
>> covert channel for contraband files. At any rate, it is very hard to 
>> distinguish by mechanical means. So it is done by eye, with some risk of 
>> Type-I error. Thus, the Bugzilla maintainers are pretty vigilant to stamp 
>> out spammers, sometimes edging on being ham-fisted (er, -footed?).
>> 
>> -pd
>> 
>> 
>>> On 07 Feb 2016, at 00:25 , frede...@ofb.net wrote:
>>> 
>>> No problem.
>>> 
>>> Another suggestion would be to simply validate user input like most
>>> websites, and reject invalid submissions immediately, rather than
>>> blocking the user's account. I don't know what kind of spambots you
>>> are up against, but unless they are very intelligent I doubt they'll
>>> be able to understand a message like "You submitted a bug with no body
>>> text, please enter something and try again." There may also be the
>>> option of using Captchas.
>>> 
>>> Not sure how hard it is to get Bugzilla to do these things.
>>> 
>>> Thanks,
>>> 
>>> Frederick
>>> 
>>> P.S. (I now see that all errors on the bug tracker are displayed with
>>> a red background)
>>> 
>>> On Sat, Feb 06, 2016 at 03:00:21AM -0500, Duncan Murdoch wrote:
>>>> Thanks for the suggestions.
>>>>
>>>> Duncan Murdoch
>>>>
>>>> On 05/02/2016 10:07 PM, frede...@ofb.net wrote:
>>>>> Hi Duncan Murdoch,
>>>>>
>>>>> Thanks for your time. I apologize for not telling you that my email
>>>>> address on the bug tracker is slightly different -
>>>>> "frederik-rproj...@ofb.net" vs "frede...@ofb.net". I was going to
>>>>> follow up with this information, but then I thought, he probably knows
>>>>> how to find a tagged email address.
>>>>>
>>>>> I do hope that you are able to fix the bug tracker. In particular,
>>>>> people should be made aware that their account is blocked before being
>>>>> invited to submit a bug. The error they get should be less rude - no
>>>>> need to make it red - and the email address in the error should be
>>>>> filled in. You complained about wasting time having to look for my
>>>>> email address - well, I wasted time looking for yours. The error
>>>>> message could even hint at what triggered the ban. I don't think that
>>>>> you're going to get very far by trying to scare off actual spammers
>>>>> with a big red accusation - I imagine they all have pretty thick skin.
>>>>>
>>>>> Reading the first line of my bug report was generous of you, but if
>>>>> you read the rest, you'll see that, indeed, after checking with the
>>>>> knowledgeable i3 guys, it appears to be an R bug. So I would like to
>>>>> submit it. What appears at the top of my bug report is a copy of the
>>>>> original bug I posted to i3, at the linked URL (are links OK or will I
>>>>> get banned again?).
>>>>>
>>>>> The reason a bug appeared with the subject "til" is because I noticed
>>>>> that when typing into the subject field, some "related bugs" come up.
>>>>> However, this "suggestion" process appeared to be stalled when I typed
>>>>> "til" (for "tiling" or "tilable"). I tried hitting "enter" and it
>>>>> ended up opening a bug with that subject, which I never submitted,
>>>>> because I clicked "back" and figured out that *four* characters are
>>>>> actually necessary to start getting suggestions. The whole point of
>>>>> doing this was to see if another bug had been submitted with the same
>>>>> topic, and thereby save you time! I'm not going to try to reproduce
>>>>> this error, because you said it will get me banned again, but I think
>>>>> somebody should try to fix the site so that people don't get banned
>>>>> for any content which is not submitted. Especially people with
>>>>> months-old accounts, like me.
>>>>>
>>>>> I definitely sympathize with the spam problem, and thank you for your
>>>>> hard work. Best wishes,
>>>>>
>>>>> Frederick
>>>>>
>>>>> On Fri, Feb 05, 2016 at 08:19:40PM -0500, Duncan Murdoch wrote:
>>>>>> On 05/02/2016 7:26 PM, frede...@ofb.net wrote:
>>>>>>> Dear Dirk Eddelbuettel and Duncan Murdoch,
>>>>>>>
>>>>>>> Thank you for your work on the wonderful R project!
>>>>>>>
>>>>>>> I recently attempted to 

Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-08 Thread frederik
Ah, thank you for that explanation. I somehow didn't catch that my
Bugzilla account had been disabled by a human.

"Common pattern is to post ... something copied from a generic bug
report" - that sounds very annoying.

Frederick

On Sun, Feb 07, 2016 at 11:54:11AM +0100, peter dalgaard wrote:
> Unfortunately, the spammers in question appear to be human (of sorts). 
> 
> We're not sure what they're up to, but a common pattern is to post random 
> text, or something copied from a generic bug report (like "able to add 6 
> item"), later followed by a comment containing a link or a file attachment. 
> 
> Presumably, it is some sort of click-bait scheme, but it could also be a 
> covert channel for contraband files. At any rate, it is very hard to 
> distinguish by mechanical means. So it is done by eye, with some risk of 
> Type-I error. Thus, the Bugzilla maintainers are pretty vigilant to stamp out 
> spammers, sometimes edging on being ham-fisted (er, -footed?).
> 
> -pd

Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-08 Thread peter dalgaard

On 08 Feb 2016, at 18:34 , frede...@ofb.net wrote:

> Ah, thank you for that explanation. I somehow didn't catch that my
> Bugzilla account had been disabled by a human.

I like the notion that a somewhat rude message is ameliorated by the knowledge 
that it was put there by a chronically annoyed human... ;-) We might still 
consider rephrasing it though.

-pd


> 
> "Common pattern is to post ... something copied from a generic bug
> report" - that sounds very annoying.
> 
> Frederick

Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-07 Thread peter dalgaard
Unfortunately, the spammers in question appear to be human (of sorts). 

We're not sure what they're up to, but a common pattern is to post random text, 
or something copied from a generic bug report (like "able to add 6 item"), 
later followed by a comment containing a link or a file attachment. 

Presumably, it is some sort of click-bait scheme, but it could also be a covert 
channel for contraband files. At any rate, it is very hard to distinguish by 
mechanical means. So it is done by eye, with some risk of Type-I error. Thus, 
the Bugzilla maintainers are pretty vigilant to stamp out spammers, sometimes 
edging on being ham-fisted (er, -footed?).

-pd


> On 07 Feb 2016, at 00:25 , frede...@ofb.net wrote:
> 
> No problem.
> 
> Another suggestion would be to simply validate user input like most
> websites, and reject invalid submissions immediately, rather than
> blocking the user's account. I don't know what kind of spambots you
> are up against, but unless they are very intelligent I doubt they'll
> be able to understand a message like "You submitted a bug with no body
> text, please enter something and try again." There may also be the
> option of using Captchas.
> 
> Not sure how hard it is to get Bugzilla to do these things.
> 
> Thanks,
> 
> Frederick
> 
> P.S. (I now see that all errors on the bug tracker are displayed with
> a red background)

Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-06 Thread frederik
No problem.

Another suggestion would be to simply validate user input like most
websites, and reject invalid submissions immediately, rather than
blocking the user's account. I don't know what kind of spambots you
are up against, but unless they are very intelligent I doubt they'll
be able to understand a message like "You submitted a bug with no body
text, please enter something and try again." There may also be the
option of using Captchas.

Not sure how hard it is to get Bugzilla to do these things.

Thanks,

Frederick

P.S. (I now see that all errors on the bug tracker are displayed with
a red background)

On Sat, Feb 06, 2016 at 03:00:21AM -0500, Duncan Murdoch wrote:
> Thanks for the suggestions.
> 
> Duncan Murdoch
> 
> On 05/02/2016 10:07 PM, frede...@ofb.net wrote:
> >Hi Duncan Murdoch,
> >
> >Thanks for your time. I apologize for not telling you that my email
> >address on the bug tracker is slightly different -
> >"frederik-rproj...@ofb.net" vs "frede...@ofb.net". I was going to
> >follow up with this information, but then I thought, he probably knows
> >how to find a tagged email address.
> >
> >I do hope that you are able to fix the bug tracker. In particular,
> >people should be made aware that their account is blocked before being
> >invited to submit a bug. The error they get should be less rude - no
> >need to make it red - and the email address in the error should be
> >filled in. You complained about wasting time having to look for my
> >email address - well, I wasted time looking for yours. The error
> >message could even hint at what triggered the ban. I don't think that
> >you're going to get very far by trying to scare off actual spammers
> >with a big red accusation - I imagine they all have pretty thick skin.
> >
> >Reading the first line of my bug report was generous of you, but if
> >you read the rest, you'll see that, indeed, after checking with the
> >knowledgeable i3 guys, it appears to be an R bug. So I would like to
> >submit it. What appears at the top of my bug report is a copy of the
> >original bug I posted to i3, at the linked URL (are links OK or will I
> >get banned again?).
> >
> >The reason a bug appeared with the subject "til" is because I noticed
> >that when typing into the subject field, some "related bugs" come up.
> >However, this "suggestion" process appeared to be stalled when I typed
> >"til" (for "tiling" or "tilable"). I tried hitting "enter" and it
> >ended up opening a bug with that subject, which I never submitted,
> >because I clicked "back" and figured out that *four* characters are
> >actually necessary to start getting suggestions. The whole point of
> >doing this was to see if another bug had been submitted with the same
> >topic, and thereby save you time! I'm not going to try to reproduce
> >this error, because you said it will get me banned again, but I think
> >somebody should try to fix the site so that people don't get banned
> >for any content which is not submitted. Especially people with
> >months-old accounts, like me.
> >
> >I definitely sympathize with the spam problem, and thank you for your
> >hard work. Best wishes,
> >
> >Frederick
> >
> >On Fri, Feb 05, 2016 at 08:19:40PM -0500, Duncan Murdoch wrote:
> >>On 05/02/2016 7:26 PM, frede...@ofb.net wrote:
> >>>Dear Dirk Eddelbuettel and Duncan Murdoch,
> >>>
> >>>Thank you for your work on the wonderful R project!
> >>>
> >>>I recently attempted to submit a bug with your Bugzilla interface:
> >>>
> >>>https://bugs.r-project.org/bugzilla/enter_bug.cgi
> >>>
> >>>I created an account, typed in all my information, first checking
> >>>details with another project. Then I clicked submit, and was taken to
> >>>a web page with a big red banner, it said
> >>>
> >>> Spammer
> >>> If you believe your account should be restored, please send email to 
> >>> explaining why.
> >>>
> >>>What a hostile thing to say to your users! I tried resubmitting my
> >>>bug, but removing any links, and I still get the message - so it looks
> >>>like my account has really been blocked. Please do something to warn
> >>>your users about this so they can avoid the upset.
> >>
> >>Your account isn't blocked now, but it wasn't easy to unblock it: you used a
> >>different email address in the submission, not the same one you used in this
> >>email.  At least one of the people who can deal with this kind of thing
> >>would now demand an apology from you before ever reading your email again.
> >>I won't do that, but I have to admit, I don't like the fact that you wasted
> >>10 minutes of my time. I'm Bcc'ing a couple of people who are working on
> >>putting together a better interface to the bug reporting system, so they
> >>know to deal with this kind of issue as well as all the others.
> >>
> >>I'm not hostile, I just sound that way, because I've wasted a lot of time
> >>this week on issues like this.
> >>
> >>Duncan Murdoch
> >>
> >>(Here's my previous email to you, for the benefit of those who are Bcc'd:
> 

Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-05 Thread Duncan Murdoch
You posted a bug report, but it had no content other than "til".  That's 
what many abusers of the system have done, so you were blocked.


I have read the first line of your bug report, and it says " I'm not 
sure if this is a bug with i3 or R ".  If you're not sure if it's a bug 
or not, then please post to R-devel.  That's a moderated list so if this 
is your first post, it may take a while to appear.


This probably seems unreasonable to you, but a lot of abuse is sent to 
the bug list, so we block it quite early.  I'll unblock you now, but 
please don't post there again unless your discussion on R-devel 
indicates this is a problem with R rather than i3.


Duncan Murdoch

On 05/02/2016 7:26 PM, frede...@ofb.net wrote:

> Dear Dirk Eddelbuettel and Duncan Murdoch,
>
> Thank you for your work on the wonderful R project!
>
> I recently attempted to submit a bug with your Bugzilla interface:
>
> https://bugs.r-project.org/bugzilla/enter_bug.cgi
>
> I created an account, typed in all my information, first checking
> details with another project. Then I clicked submit, and was taken to
> a web page with a big red banner, it said
>
>  Spammer
>  If you believe your account should be restored, please send email to
> explaining why.
>
> What a hostile thing to say to your users! I tried resubmitting my
> bug, but removing any links, and I still get the message - so it looks
> like my account has really been blocked. Please do something to warn
> your users about this so they can avoid the upset.
>
> Well, I don't know what it means to "email to explaining why", so I
> tried to subscribe to R-devel. However, it's been ten minutes and no
> confirmation email. So I tracked down your email addresses from the R
> website. I'm still cc'ing r-devel.
>
> I hope it is OK to send the bug by email. I really want to get back to
> what I was doing, but I don't want to lose the work I put into writing
> this bug report, so I'm attaching it to this message.
>
> Thank you,
>
> Frederick Eaton




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] problem submitting R bug; bug plotting in tiling window manager

2016-02-05 Thread Duncan Murdoch

On 05/02/2016 7:26 PM, frede...@ofb.net wrote:

> Dear Dirk Eddelbuettel and Duncan Murdoch,
>
> Thank you for your work on the wonderful R project!
>
> I recently attempted to submit a bug with your Bugzilla interface:
>
> https://bugs.r-project.org/bugzilla/enter_bug.cgi
>
> I created an account, typed in all my information, first checking
> details with another project. Then I clicked submit, and was taken to
> a web page with a big red banner, it said
>
>  Spammer
>  If you believe your account should be restored, please send email to
> explaining why.
>
> What a hostile thing to say to your users! I tried resubmitting my
> bug, but removing any links, and I still get the message - so it looks
> like my account has really been blocked. Please do something to warn
> your users about this so they can avoid the upset.


Your account isn't blocked now, but it wasn't easy to unblock it: you 
used a different email address in the submission, not the same one you 
used in this email.  At least one of the people who can deal with this 
kind of thing would now demand an apology from you before ever reading 
your email again.  I won't do that, but I have to admit, I don't like 
the fact that you wasted 10 minutes of my time. I'm Bcc'ing a couple of 
people who are working on putting together a better interface to the bug 
reporting system, so they know to deal with this kind of issue as well 
as all the others.


I'm not hostile, I just sound that way, because I've wasted a lot of 
time this week on issues like this.


Duncan Murdoch

(Here's my previous email to you, for the benefit of those who are Bcc'd:

You posted a bug report, but it had no content other than "til".  That's
what many abusers of the system have done, so you were blocked.

I have read the first line of your bug report, and it says " I'm not
sure if this is a bug with i3 or R ".  If you're not sure if it's a bug
or not, then please post to R-devel.  That's a moderated list so if this
is your first post, it may take a while to appear.

This probably seems unreasonable to you, but a lot of abuse is sent to
the bug list, so we block it quite early.  I'll unblock you now, but
please don't post there again unless your discussion on R-devel
indicates this is a problem with R rather than i3.

Duncan Murdoch


> Well, I don't know what it means to "email to explaining why", so I
> tried to subscribe to R-devel. However, it's been ten minutes and no
> confirmation email. So I tracked down your email addresses from the R
> website. I'm still cc'ing r-devel.
>
> I hope it is OK to send the bug by email. I really want to get back to
> what I was doing, but I don't want to lose the work I put into writing
> this bug report, so I'm attaching it to this message.
>
> Thank you,
>
> Frederick Eaton






Re: [Rd] Lack of protection bug in current R release candidate

2015-06-15 Thread Radford Neal
> > The current R release candidate has a lack of protect bug
> > (of very long standing)
>
> [ but not really reported,  right ? ]

It's long standing because it exists in versions of R going
back many years.

> but the R 3.2.1 release candidate probably really cannot be
> touched now, with something non-trivial like this.
>
> We'd be *very* happy to get such problems during alpha or beta
> testing phase (or even before).

I'm not sure what you mean to imply here, but for your information,
I reported the bug to r-devel within about an hour of finding 
what caused it.  (I'd noticed the symptoms a few days before, but
hadn't isolated the cause.)

   Radford Neal



Re: [Rd] Lack of protection bug in current R release candidate

2015-06-15 Thread Martin Maechler
 Radford Neal radf...@cs.toronto.edu
 on Sat, 13 Jun 2015 17:24:04 -0400 writes:

> The current R release candidate has a lack of protect bug
> (of very long standing)

[ but not really reported,  right ? ]

> with respect to the
> R_print.na_string and R_print.na_string_noquote fields of
> the static R_print structure declared in Print.h.  This
> shows up very occasionally as incorrect output from the
> following lines in reg-tests-2.R:
>
>   x <- c("a", NA, "b")
>   factor(x)
>   factor(x, exclude="")
>
> The solution (kludgy, but the whole concept is kludgy) is
> to forward R_print.na_string and R_print.na_string_noquote
> with the other roots in RunGenCollect (after the comment
> /* forward all roots */).
>
> Radford Neal

Interesting ... .. more seriously:  Thank you, Radford!

As this is about garbage collection, I found
I can pretty well reproduce the problem via
something like

  x <- c("a", NA, "b")
  fx <- factor(x, exclude="")
  gctorture2(2, 5)
  r <- replicate(30, capture.output(print(fx)))
  stopifnot(r[,1] == r)
  # ^^^ fails because of bug

but the R 3.2.1 release candidate probably really cannot be
touched now, with something non-trivial like this.

We'd be *very* happy to get such problems during alpha or beta
testing phase (or even before).

Martin Maechler



Re: [Rd] Lack of protection bug in current R release candidate

2015-06-15 Thread Martin Maechler
 Radford Neal radf...@cs.toronto.edu
 on Mon, 15 Jun 2015 10:11:33 -0400 writes:

>>> The current R release candidate has a lack of protect bug
>>> (of very long standing)
>>
>> [ but not really reported,  right ? ]
>
> It's long standing because it exists in versions of R going
> back many years.
>
>> but the R 3.2.1 release candidate probably really cannot be
>> touched now, with something non-trivial like this.
>>
>> We'd be *very* happy to get such problems during alpha or beta
>> testing phase (or even before).
>
> I'm not sure what you mean to imply here, but for your information,
> I reported the bug to r-devel within about an hour of finding
> what caused it.  (I'd noticed the symptoms a few days before, but
> hadn't isolated the cause.)
>
> Radford Neal

Thank you, now I understand.
I really completely misinterpreted your "very long standing"
above.

Hence I do apologize for the negative connotation of the
above... and thank you again -- in the name of all involved --
for reporting the bug here!

BTW: I've committed  svn rev 68519  
about ten minutes ago which does fix the bug (simply applying your good advice)
and contains a regression test.

Thank you once more, Radford!
Martin Maechler



[Rd] back port of Bug 15899 fix missing from R 3.2.1 RC release!!!

2015-06-15 Thread Jack Howarth
Is there a reason why the fix for Bug 15899 - Omitted 'extern' on
'R_running_as_main_program' after refactor can cause linker errors for
applications embedding R...

https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15899

was never back ported from R 3.3 for the R 3.2.1 release? Restoring
that omitted 'extern' to the declaration of 'int
R_running_as_main_program;' in src/Rinterface.h is essential for being
able to build RStudio.
   Jack



[Rd] Lack of protection bug in current R release candidate

2015-06-13 Thread Radford Neal
The current R release candidate has a lack of protect bug (of very
long standing) with respect to the R_print.na_string and
R_print.na_string_noquote fields of the static R_print structure
declared in Print.h.  This shows up very occasionally as incorrect
output from the following lines in reg-tests-2.R:

  x <- c("a", NA, "b")
  factor(x)
  factor(x, exclude="")

The solution (kludgy, but the whole concept is kludgy) is to forward
R_print.na_string and R_print.na_string_noquote with the other roots
in RunGenCollect (after the comment /* forward all roots */).
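
For readers following along, here is a minimal sketch of what those three lines are expected to produce on a build where the NA strings have not been collected. It is illustrative only and not part of reg-tests-2.R; the names fx and out are assumptions made for this example, and the check does not force garbage collection, so it only describes the correct behaviour rather than reproducing the bug.

```r
# Illustrative check, assuming a build of R without the protection bug.
x <- c("a", NA, "b")
fx <- factor(x, exclude = "")   # excluding "" leaves NA in as a level

# NA is kept as the third level rather than dropped.
stopifnot(nlevels(fx) == 3L, anyNA(levels(fx)))

# Per the report, the printed NA marker comes from the R_print fields,
# so repeated prints should be identical and should show <NA>.
out <- replicate(5, capture.output(print(fx)))
stopifnot(all(out == out[, 1]), any(grepl("<NA>", out, fixed = TRUE)))
```

If the NA strings were reclaimed between prints, the columns of out would be expected to differ, which is the symptom described above.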

   Radford Neal



Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-09 Thread Martin Maechler
 Martin Maechler maech...@stat.math.ethz.ch
 on Fri, 9 Jan 2015 14:00:38 +0100 writes:

 Michael Lawrence lawrence.mich...@gene.com
 on Thu, 8 Jan 2015 14:02:26 -0800 writes:

> On Thu, Jan 8, 2015 at 11:57 AM, luke-tier...@uiowa.edu wrote:
>> On Thu, 8 Jan 2015, Michael Lawrence wrote:
>>
>>> If we do add an argument to get(), then it should be named consistently
>>> with the ifnotfound argument of mget().

You are right... I forgot to say so earlier in the thread.

The definition now is

get0 <- function (x, envir = pos.to.env(-1L), mode = "any", inherits = TRUE,
                  ifnotfound = NULL)
    .Internal(get0(x, envir, mode, inherits, ifnotfound))

>>> As mentioned, the possibility of a
>>> NULL value is problematic. One solution is a sentinel value that indicates
>>> an unbound value (like R_UnboundValue).
>>
>> A null default is fine -- it's a default; if it isn't right for a
>> particular case you can provide something else.

[..]

>> Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
>> an argument to get() with missing giving current behavior may be OK
>> too. Rewriting exists and get as .Primitives may be sufficient though.

> Thank you, Luke.  Given that, Duncan's and the other inputs,
> I think we should go for a new function -- .Internal() for now.
>
> To Pete's point about arguments, I did drop 'frame' on purpose
> and indeed we could try to do away with 'where/pos' as well and
> have the environment only specified by 'envir'.
>
> Name: I like  get0() for its brevity and prefer it to .get().
>
> Let me expose my current implementation on R-devel ... and start
> using it in the 'methods' package so we (Pete H. :-) can start
> measuring its impact.

I have now committed  get0() to R-devel  (svn rev 67386) 

which is already using it in quite a few places:
in 'base', notably in base/R/namespace.R   where it may speedup, also
in 'methods' in quite a few places also in the hope of some S4
speedup.
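
As a rough illustration of the idiom under discussion, the sketch below compares the two-step exists()/get() lookup with a single get0() call, and shows one way to handle the NULL ambiguity raised earlier in the thread. It assumes an R version that ships get0() with the signature quoted above; env, sentinel and is_bound are hypothetical names used only for this example.

```r
# Sketch only: 'env', 'sentinel' and 'is_bound' are illustrative names.
env <- new.env()
assign("x", 42, envir = env)
assign("nothing", NULL, envir = env)   # bound, but its value is NULL

# Two-step idiom that get0() is meant to replace: two lookups.
old <- if (exists("x", envir = env, inherits = FALSE)) {
    get("x", envir = env, inherits = FALSE)
} else {
    NULL
}

# Single lookup; returns 'ifnotfound' (NULL by default) when unbound.
new <- get0("x", envir = env, inherits = FALSE)
stopifnot(identical(old, new))
stopifnot(is.null(get0("nope", envir = env, inherits = FALSE)))

# A NULL result cannot tell "unbound" from "bound to NULL", so pass a
# private sentinel as 'ifnotfound' when that distinction matters.
sentinel <- new.env()
is_bound <- function(name, e) {
    value <- get0(name, envir = e, inherits = FALSE, ifnotfound = sentinel)
    !identical(value, sentinel)
}
stopifnot(is_bound("nothing", env), !is_bound("nope", env))
```

A fresh environment works as the sentinel here because identical() only treats two environments as equal when they are the same object, so no ordinary value can be mistaken for it.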

{{Now I feel having deserved some weekend break ...}}

Martin



Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-09 Thread Peter Haverty
Fantastic. I'm eager to try it out.  Thanks for seeing this through.

Regards,

Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com

On Fri, Jan 9, 2015 at 7:37 AM, Martin Maechler maech...@stat.math.ethz.ch
wrote:

  Martin Maechler maech...@stat.math.ethz.ch
  on Fri, 9 Jan 2015 14:00:38 +0100 writes:

  Michael Lawrence lawrence.mich...@gene.com
  on Thu, 8 Jan 2015 14:02:26 -0800 writes:

  On Thu, Jan 8, 2015 at 11:57 AM, luke-tier...@uiowa.edu wrote:
  On Thu, 8 Jan 2015, Michael Lawrence wrote:
 
  If we do add an argument to get(), then it should be named consistently
  with the ifnotfound argument of mget().

 You are right... I forgot to say so earlier in the thread.

 The definition now is

 get0 <- function (x, envir = pos.to.env(-1L), mode = "any", inherits = TRUE,
                   ifnotfound = NULL)
     .Internal(get0(x, envir, mode, inherits, ifnotfound))



  As mentioned, the possibility of a
  NULL value is problematic. One solution is a sentinel value that indicates
  an unbound value (like R_UnboundValue).


  A null default is fine -- it's a default; if it isn't right for a
  particular case you can provide something else.


 [..]

  Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
  an argument to get() with missing giving current behavior may be OK
  too. Rewriting exists and get as .Primitives may be sufficient though.

  Thank you, Luke.  Given that, Duncan's and the other inputs,
  I think we should go for a new function -- .Internal() for now.

  To Pete's point about arguments, I did drop 'frame' on purpose
  and indeed we could try to do away with 'where/pos' as well and
  have the environment only specified by 'envir'.

  Name: I like  get0() for its brevity and prefer it to .get().

  Let me expose my current implementation on R-devel ... and start
  using it in the 'methods' package so we (Pete H. :-) can start
  measuring its impact.

 I have now committed  get0() to R-devel  (svn rev 67386)

 which is already using it in quite a few places:
 in 'base', notably in base/R/namespace.R   where it may speedup, also
 in 'methods' in quite a few places also in the hope of some S4
 speedup.

 {{Now I feel having deserved some weekend break ...}}

 Martin



Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-09 Thread Peter Haverty
Here are some quick measurements of Martin's accomplishment with get0:

In loading the package GenomicRanges, 30K calls to exists have been
skipped.  (However 99K still remain!)
Overall, the current usage of get0 seems to save us 10% in package
loading time (no error bars on that measurement).
microbenchmark says that

env = asNamespace("base"); get0("match", env)

is a 6X speedup over the same call to get, which is pretty neat by
itself.  It might be good to just generally use get0.
Of course, when one really doesn't need any exists checking and
NULL results are fine, the .Primitive [[ is 30X faster than get.
Thanks everyone for your thoughts, code and time on this topic!
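
A minimal sketch of the three lookup styles compared above, assuming R >= 3.2.0 (where get0() is available). The environment `e` and the names stored in it are made up for illustration, and the sketch makes no timing claims of its own.

```r
# Hypothetical environment with one ordinary binding and one NULL binding.
e <- new.env()
assign("a", 42, envir = e)
assign("n", NULL, envir = e)

# 1. Classic idiom: two lookups (exists(), then get()).
v_classic <- if (exists("a", envir = e, inherits = FALSE)) {
  get("a", envir = e, inherits = FALSE)
} else NULL

# 2. get0(): one call, with a fallback value when nothing is found.
v_get0    <- get0("a", envir = e, inherits = FALSE)
v_missing <- get0("zzz", envir = e, inherits = FALSE, ifnotfound = NA)

# 3. `[[` on an environment: no inheritance, NULL when not found.
v_sub      <- e[["a"]]
v_sub_miss <- e[["zzz"]]
```

Note that styles 2 (with the default ifnotfound) and 3 cannot tell the NULL binding "n" apart from a missing name, which is the ambiguity discussed elsewhere in this thread.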





Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com



Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-09 Thread Hervé Pagès

Hi,

On 01/08/2015 07:02 AM, Martin Maechler wrote:



Adding an optional argument to get (and mget) like
val <- get(name, where, ..., value.if.not.found=NULL )   (*)



would be useful for many.  HOWEVER, it is possible that there could be
some confusion here: (*) can give a NULL because either x exists and
has value NULL, or because x doesn't exist.   If that matters, the user
would need to be careful about specifying a value.if.not.found that cannot
be confused with a valid value of x.


Exactly -- well, of course: That problem { NULL can be the legit value of what
you want to get() } was the only reason to have a 'value.if.not' argument at all.

Note that this is not about a universal replacement of
the  if(exists(..)) { .. get(..) } idiom,


FWIW, if(exists(..)) { x <- get(..) } is not safe because it's not
atomic. I've seen situations where exists() returns TRUE but then
get() fails to find the symbol (even if called immediately after
exists()).

After scratching my head for a while I found out that the symbol was
removed by some finalizer function defined somewhere (not on the
environment exists() and get() were looking at, of course). Since
garbage collection was triggered between the moment exists() saw
the symbol and the moment get() tried to get it, the finalizer was
executed and the symbol removed.

After that, I started to systematically use x <- try(get(...)) instead
(which is atomic).
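
The single-call pattern described above can be sketched as follows. This is only an illustration: `get_or()`, `env`, and the stored names are hypothetical, and tryCatch() is used here in place of try() so that a fallback value can be returned directly.

```r
# Hypothetical helper: one get() call, with "not found" turned into a
# fallback value by tryCatch() instead of a preceding exists() check.
# Caveat: this catches *any* error raised during the lookup, not only
# "object not found".
get_or <- function(name, env, default = NULL) {
  tryCatch(get(name, envir = env, inherits = FALSE),
           error = function(e) default)
}

env <- new.env()
assign("x", 1:3, envir = env)

found  <- get_or("x", env)
absent <- get_or("y", env, default = "not there")
```

Because there is no separate exists() step, nothing can remove the symbol between a check and the retrieval.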

Cheers,
H.



Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-09 Thread Martin Maechler
 Michael Lawrence lawrence.mich...@gene.com
 on Thu, 8 Jan 2015 14:02:26 -0800 writes:

 On Thu, Jan 8, 2015 at 11:57 AM, luke-tier...@uiowa.edu wrote:
 On Thu, 8 Jan 2015, Michael Lawrence wrote:
 
 If we do add an argument to get(), then it should be named consistently
 with the ifnotfound argument of mget(). As mentioned, the possibility of a
 NULL value is problematic. One solution is a sentinel value that indicates
 an unbound value (like R_UnboundValue).
 
 
 A null default is fine -- it's a default; if it isn't right for a
 particular case you can provide something else.
 
 
 But another idea (and one pretty similar to John's) is to follow the SYMSXP
 design at the C level, where there is a structure that points to the name
 and a value. We already have SYMSXPs at the R level of course (name
 objects) but they do not provide access to the value, which is typically
 R_UnboundValue. But this does not even need to be implemented with SYMSXP.
 The design would allow something like:
 
 binding <- getBinding(x, env)
 if (hasValue(binding)) {
   x <- value(binding) # throws an error if none
   message(name(binding), "has value", x)
 }
 
 That, I think, is a bit verbose but readable and could be made fast. And I
 think binding objects would be useful in other ways, as they are
 essentially a named object. For example, when iterating over an
 environment.
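
For illustration only, the proposed interface can be emulated at the R level as below. None of getBinding(), hasValue(), value() or name() exist in base R; they are hypothetical helpers that merely mirror the names used in the proposal, built on get0() (R >= 3.2.0) with a private marker object standing in for R_UnboundValue.

```r
# Hypothetical emulation of the proposed binding objects. The binding is an
# immutable snapshot taken at lookup time, independent of the environment.
getBinding <- function(x, env) {
  unbound <- new.env()                       # unique "no value" marker
  val <- get0(x, envir = env, inherits = FALSE, ifnotfound = unbound)
  found <- !identical(val, unbound)
  structure(list(name = x, found = found, value = if (found) val),
            class = "binding")
}

hasValue <- function(b) b$found

value <- function(b) {
  if (!b$found) stop("binding '", b$name, "' has no value")
  b$value
}

name <- function(b) b$name

env <- new.env()
assign("x", NULL, envir = env)               # bound, but to NULL

b1 <- getBinding("x", env)                   # has a value (NULL)
b2 <- getBinding("y", env)                   # no value
```

The point of the sketch is that a binding whose value is NULL stays distinguishable from a missing one.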
 
 
 This would need a lot more thought. Directly exposing the internals is
 definitely not something we want to do as we may well want to change
 that design. But there are lots of other corner issues that would have
 to be thought through before going forward, such as what happens if an
 rm occurs between obtaining a binding object and doing something with
 it. Serialization would also need thinking through. This doesn't seem
 like a worthwhile place to spend our efforts to me.
 
 

 Just wanted to be clear that I was not suggesting to expose any internals.
 We could implement the behavior using SYMSXP, or not. Nor would the binding
 need to be mutable. The binding would be considered independent of the
 environment from which it was retrieved. As Pete has mentioned, it could be
 a useful abstraction to have in general.

It could be, indeed.   Luke's advice (above) and my own gut
feeling do tell me that this is a much larger step than solving
the getIfExists() problem.  
In the R development cycle I'd think that it should go to the
next (2015-2016) 3.3 cycle, rather than the current 3.2 one
with goal in April.

 Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
 an argument to get() with missing giving current behavior may be OK
 too. Rewriting exists and get as .Primitives may be sufficient though.

Thank you, Luke.  Given that, Duncan's and the other inputs,
I think we should go for a new function -- .Internal() for now.

To Pete's point about arguments, I did drop 'frame' on purpose 
and indeed we could try to do away with 'where/pos' as well and
have the environment only specified by 'envir'.

Name: I like  get0() for its brevity and prefer it to .get().

Let me expose my current implementation on R-devel ... and start
using it in the 'methods' package so we (Pete H. :-) can start
measuring its impact.

Martin



Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Peter Haverty
For what it's worth, I think we would need a new function if the default
behavior changes.  Since we already have get and mget, maybe cget for
conditional get?  if get, safe get, ...

I like the idea of keeping the original not found behavior if the
if.not.found arg is missing. However, it will be important to keep the
number of arguments down.  (I noticed that Martin's example lacks a frame
argument.)  I've heard rumors that there are plans to reduce the function
call overhead, so perhaps this matters less now.

I like Luke's idea of making exists/get/etc. .Primitives. I think that will
be necessary in order to go fast.  For my two cents, I also think
get/assign should just be synonyms for the [[ .Primitive.  That could
actually simplify things a bit. One might add inherits=FALSE and
if.not.found arguments to the environment [[ code, for example.

Regards,
Pete


Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com

On Thu, Jan 8, 2015 at 11:57 AM, luke-tier...@uiowa.edu wrote:

 On Thu, 8 Jan 2015, Michael Lawrence wrote:

  If we do add an argument to get(), then it should be named consistently
 with the ifnotfound argument of mget(). As mentioned, the possibility of a
 NULL value is problematic. One solution is a sentinel value that indicates
 an unbound value (like R_UnboundValue).


 A null default is fine -- it's a default; if it isn't right for a
 particular case you can provide something else.


 But another idea (and one pretty similar to John's) is to follow the SYMSXP
 design at the C level, where there is a structure that points to the name
 and a value. We already have SYMSXPs at the R level of course (name
 objects) but they do not provide access to the value, which is typically
 R_UnboundValue. But this does not even need to be implemented with SYMSXP.
 The design would allow something like:

 binding <- getBinding(x, env)
 if (hasValue(binding)) {
   x <- value(binding) # throws an error if none
   message(name(binding), "has value", x)
 }

 That, I think, is a bit verbose but readable and could be made fast. And I
 think binding objects would be useful in other ways, as they are
 essentially a named object. For example, when iterating over an
 environment.


 This would need a lot more thought. Directly exposing the internals is
 definitely not something we want to do as we may well want to change
 that design. But there are lots of other corner issues that would have
 to be thought through before going forward, such as what happens if an
 rm occurs between obtaining a binding object and doing something with
 it. Serialization would also need thinking through. This doesn't seem
 like a worthwhile place to spend our efforts to me.

 Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
 an argument to get() with missing giving current behavior may be OK
 too. Rewriting exists and get as .Primitives may be sufficient though.

 Best,

 luke


  Michael




 On Thu, Jan 8, 2015 at 6:03 AM, John Nolan jpno...@american.edu wrote:

  Adding an optional argument to get (and mget) like

 val <- get(name, where, ..., value.if.not.found=NULL )   (*)

 would be useful for many.  HOWEVER, it is possible that there could be
 some confusion here: (*) can give a NULL because either x exists and
 has value NULL, or because x doesn't exist.   If that matters, the user
 would need to be careful about specifying a value.if.not.found that
 cannot
 be confused with a valid value of x.

 To avoid this difficulty, perhaps we want both: have Martin's
 getifexists() return a list with two values:
   - a boolean variable 'found'  # = value returned by exists( )
   - a variable 'value'

 Then implement get( ) as:

 get <- function(x, ..., value.if.not.found) {
   if (missing(value.if.not.found)) {
     a <- getifexists(x, ...)
     if (!a$found) stop(x, " not found")
   } else {
     a <- getifexists(x, ..., value.if.not.found)
   }
   return(a$value)
 }

 Note that value.if.not.found has no default value in above.
 It behaves exactly like current get does if value.if.not.found
 is not specified, and if it is specified, it would be faster
 in the common situation mentioned below:
  if(exists(x,...)) { get(x,...) }
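
A runnable sketch of the found/value idea above, under stated assumptions: getifexists() here is a hypothetical R-level stand-in (not the function eventually added to R), it still does two lookups where a C implementation would do one, and get2() is an invented name used so that base::get is not masked.

```r
# Hypothetical getifexists(): returns both the 'found' flag and the value.
getifexists <- function(x, envir = parent.frame(), inherits = TRUE,
                        value.if.not.found = NULL) {
  found <- exists(x, envir = envir, inherits = inherits)
  list(found = found,
       value = if (found) {
         get(x, envir = envir, inherits = inherits)
       } else {
         value.if.not.found
       })
}

# Hypothetical get2(): errors like get() unless a fallback is supplied.
get2 <- function(x, ..., value.if.not.found) {
  if (missing(value.if.not.found)) {
    a <- getifexists(x, ...)
    if (!a$found) stop("object '", x, "' not found")
  } else {
    a <- getifexists(x, ..., value.if.not.found = value.if.not.found)
  }
  a$value
}

e <- new.env()
assign("n", NULL, envir = e)

r_null    <- getifexists("n", envir = e, inherits = FALSE)    # found, value NULL
r_missing <- getifexists("zzz", envir = e, inherits = FALSE)  # not found
fallback  <- get2("zzz", envir = e, inherits = FALSE, value.if.not.found = 0)
```

With the separate 'found' flag, a binding whose value is NULL is no longer confused with a missing name.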

 John

 P.S. if you like dromedaries call it valueIfNotFound ...

  ..
  John P. Nolan
  Math/Stat Department
  227 Gray Hall,   American University
  4400 Massachusetts Avenue, NW
  Washington, DC 20016-8050

  jpno...@american.edu   voice: 202.885.3140
  web: academic2.american.edu/~jpnolan
  ..


 -R-devel r-devel-boun...@r-project.org wrote: -
 To: Martin Maechler maech...@stat.math.ethz.ch, R-devel@r-project.org
 From: Duncan Murdoch
 Sent by: R-devel
 Date: 01/08/2015 06:39AM
 Subject: Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

 On 08/01/2015 4:16 AM, Martin Maechler wrote:
  In November, we had a bug

Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Peter Haverty
Michael's idea has an interesting bonus that he and I discussed earlier.
It would be very convenient to have a container of key/value pairs.  I
imagine many people often write this:

x <- mapply( names(x), x, FUN=function(k,v) { # work with key and value
} )

especially ex-Perl people accustomed to

while ( ($key, $value) = each( %some_hash ) ) { }

Perhaps there is room for additional discussion of using lists of SYMSXPs
in this manner. (If SYMSXPs are not that safe, perhaps a looping construct
for named vectors that gave the illusion of iterating over a list of
two-tuples.)
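
As a small illustration of the key/value iteration described above, using only existing base R functions (the vector `x` and the result names are made up):

```r
x <- c(a = 1, b = 2, c = 3)

# Map() hands each name and its value to the function together.
labels <- Map(function(k, v) paste0(k, "=", v), names(x), x)

# A plain loop over the names gives the same pairing without building a list.
total <- 0
for (k in names(x)) {
  v <- x[[k]]
  total <- total + v
}
```

Both forms pair keys with values by hand; neither is the dedicated looping construct suggested above.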



Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com


Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Michael Lawrence
On Thu, Jan 8, 2015 at 11:57 AM, luke-tier...@uiowa.edu wrote:

 On Thu, 8 Jan 2015, Michael Lawrence wrote:

  If we do add an argument to get(), then it should be named consistently
 with the ifnotfound argument of mget(). As mentioned, the possibility of a
 NULL value is problematic. One solution is a sentinel value that indicates
 an unbound value (like R_UnboundValue).


 A null default is fine -- it's a default; if it isn't right for a
 particular case you can provide something else.


 But another idea (and one pretty similar to John's) is to follow the SYMSXP
 design at the C level, where there is a structure that points to the name
 and a value. We already have SYMSXPs at the R level of course (name
 objects) but they do not provide access to the value, which is typically
 R_UnboundValue. But this does not even need to be implemented with SYMSXP.
 The design would allow something like:

 binding <- getBinding(x, env)
 if (hasValue(binding)) {
   x <- value(binding) # throws an error if none
   message(name(binding), "has value", x)
 }

 That, I think, is a bit verbose but readable and could be made fast. And I
 think binding objects would be useful in other ways, as they are
 essentially a named object. For example, when iterating over an
 environment.


 This would need a lot more thought. Directly exposing the internals is
 definitely not something we want to do as we may well want to change
 that design. But there are lots of other corner issues that would have
 to be thought through before going forward, such as what happens if an
 rm occurs between obtaining a binding object and doing something with
 it. Serialization would also need thinking through. This doesn't seem
 like a worthwhile place to spend our efforts to me.



Just wanted to be clear that I was not suggesting to expose any internals.
We could implement the behavior using SYMSXP, or not. Nor would the binding
need to be mutable. The binding would be considered independent of the
environment from which it was retrieved. As Pete has mentioned, it could be
a useful abstraction to have in general.


 Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
 an argument to get() with missing giving current behavior may be OK
 too. Rewriting exists and get as .Primitives may be sufficient though.

 Best,

 luke


  Michael





Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Duncan Murdoch
On 08/01/2015 4:16 AM, Martin Maechler wrote:
 In November, we had a bug repository conversation
 with Peter Haverty and myself:
 
   https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16065
 
 where the bug report title started with
 
  ---  exists is a bottleneck for dispatch and package loading, ...
 
 Peter proposed an extra simplified and hence faster version of exists(),
 and I commented
 
  --- Comment #2 from Martin Maechler maech...@stat.math.ethz.ch ---
  I'm very grateful that you've started exploring the bottlenecks of loading
  packages with many S4 classes (and methods)...
  and I hope we can make real progress there rather sooner than later.

  OTOH, your `summaryRprof()` in your vignette indicates that exists() may use
  up to 10% of the time spent in library(reportingTools),  and your speedup
  proposals of exists()  may go up to ca 30%  which is good and well worth
  considering,  but still we can only expect 2-3% speedup for package loading
  which unfortunately is not much.

  Still I agree it is worth looking at exists() as you did  ... and
  consider providing a fast simplified version of it in addition to current
  exists() [I think].

  BTW, as we talk about enhancements here, maybe consider a further possibility:
  My subjective guess is that probably more than half of exists() uses are of the
  form

  if(exists(name, where, ...)) {
     get(name, where, ...)
     ..
  } else {
      NULL / error() / .. or similar
  }

  i.e. many exists() calls when returning TRUE are immediately followed by the
  corresponding get() call which repeats quite a bit of the lookup that exists()
  has done.

  Instead, I'd imagine a function, say  getifexists(name, ...) that does both at
  once in the "exists is TRUE" case but in a way we can easily keep the if(.) ..
  else clause above.  One already existing approach would use

  if(!inherits(tryCatch(xx <- get(name, where, ...), error=function(e)e), "error")) {

    ... (( work with xx )) ...

  } else  {
     NULL / error() / .. or similar
  }

  but of course our C implementation would be more efficient and use more concise
  syntax {which should not look like error handling}.   Follow ups to this idea
  should really go to R-devel (the mailing list).
 
 and now I do follow up here myself :
 
 I found that  'getifexists()' is actually very simple to implement,
 I have already tested it a bit, but not yet committed to R-devel
 (the R trunk aka master branch) because I'd like to get
 public comments {RFC := Request For Comments}.
 

I don't like the name -- I'd prefer getIfExists.  As Baath (2012, R
Journal) pointed out, R names are very inconsistent in naming
conventions, but lowerCamelCase is the most common choice.  Second most
common is period.separated, so an argument could be made for
get.if.exists, but there's still the possibility of confusion with S3
methods, and users of other languages where . is an operator find it a
little strange.

If you don't like lowerCamelCase (and a lot of people don't), then I
think underscore_separated is the next best choice, so would use
get_if_exists.

Another possibility is to make no new name at all, and just add an
optional parameter to get() (which if present acts as your value.if.not
parameter, if not present keeps the current object not found error).

Duncan Murdoch



[Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Martin Maechler
In November, we had a bug repository conversation
with Peter Haverty and myself:

  https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16065

where the bug report title started with

 ---  exists is a bottleneck for dispatch and package loading, ...

Peter proposed an extra simplified and hence faster version of exists(),
and I commented

 --- Comment #2 from Martin Maechler maech...@stat.math.ethz.ch ---
 I'm very grateful that you've started exploring the bottlenecks of loading
 packages with many S4 classes (and methods)...
 and I hope we can make real progress there rather sooner than later.

 OTOH, your `summaryRprof()` in your vignette indicates that exists() may use
 up to 10% of the time spent in library(reportingTools),  and your speedup
 proposals of exists()  may go up to ca 30%  which is good and well worth
 considering,  but still we can only expect 2-3% speedup for package loading
 which unfortunately is not much.

 Still I agree it is worth looking at exists() as you did  ... and
 consider providing a fast simplified version of it in addition to current
 exists() [I think].

 BTW, as we talk about enhancements here, maybe consider a further possibility:
 My subjective guess is that probably more than half of exists() uses are of the
 form

 if(exists(name, where, ...)) {
    get(name, where, ...)
    ..
 } else {
     NULL / error() / .. or similar
 }

 i.e. many exists() calls when returning TRUE are immediately followed by the
 corresponding get() call which repeats quite a bit of the lookup that exists()
 has done.

 Instead, I'd imagine a function, say  getifexists(name, ...) that does both at
 once in the "exists is TRUE" case but in a way we can easily keep the if(.) ..
 else clause above.  One already existing approach would use

 if(!inherits(tryCatch(xx <- get(name, where, ...), error=function(e)e), "error")) {

   ... (( work with xx )) ...

 } else  {
    NULL / error() / .. or similar
 }

 but of course our C implementation would be more efficient and use more concise
 syntax {which should not look like error handling}.   Follow ups to this idea
 should really go to R-devel (the mailing list).

and now I do follow up here myself :

I found that  'getifexists()' is actually very simple to implement,
I have already tested it a bit, but not yet committed to R-devel
(the R trunk aka master branch) because I'd like to get
public comments {RFC := Request For Comments}.

My version of the help file {for both exists() and getifexists()}
rendered in text is

-- help(getifexists) ---
Is an Object Defined?

Description:

 Look for an R object of the given name and possibly return it

Usage:

 exists(x, where = -1, envir = , frame, mode = "any",
        inherits = TRUE)
 
 getifexists(x, where = -1, envir = as.environment(where),
             mode = "any", inherits = TRUE, value.if.not = NULL)
 
Arguments:

       x: a variable name (given as a character string).

   where: where to look for the object (see the details section); if
          omitted, the function will search as if the name of the
          object appeared unquoted in an expression.

   envir: an alternative way to specify an environment to look in, but
          it is usually simpler to just use the ‘where’ argument.

   frame: a frame in the calling list.  Equivalent to giving ‘where’ as
          ‘sys.frame(frame)’.

    mode: the mode or type of object sought: see the ‘Details’ section.

inherits: should the enclosing frames of the environment be searched?

value.if.not: the return value of ‘getifexists(x, *)’ when ‘x’ does not
          exist.

Details:

 The ‘where’ argument can specify the environment in which to look
 for the object in any of several ways: as an integer (the position
 in the ‘search’ list); as the character string name of an element
 in the search list; or as an ‘environment’ (including using
 ‘sys.frame’ to access the currently active function calls).  The
 ‘envir’ argument is an alternative way to specify an environment,
 but is primarily there for back compatibility.

 This function looks to see if the name ‘x’ has a value bound to it
 in the specified environment.  If ‘inherits’ is ‘TRUE’ and a value
 is not found for ‘x’ in the specified environment, the enclosing
 frames of the environment are searched until the name ‘x’ is
 encountered.  See ‘environment’ and the ‘R Language Definition’
 manual for details about the structure of environments and their
 enclosures.

 *Warning:* ‘inherits = TRUE’ is the default behaviour for R but
 not for S.

 If ‘mode’ is specified then only objects of that type are sought.
 The ‘mode’ may specify one of the collections ‘numeric’ and
 ‘function’ (see ‘mode’): any 

Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Duncan Murdoch

On 08/01/2015 9:03 AM, John Nolan wrote:

Adding an optional argument to get (and mget) like

val <- get(name, where, ..., value.if.not.found=NULL )   (*)


That would be a bad idea, as it would change behaviour of existing uses
of get().  What I suggested would not give a default.  If the arg was
missing, we'd get the old behaviour, if the arg was present, we'd use it.


I'm not sure this is preferable to the separate function
implementation.  This makes the documentation and implementation of
get() more complicated, and it would probably be slower for everyone.


Duncan Murdoch



would be useful for many.  HOWEVER, it is possible that there could be
some confusion here: (*) can give a NULL because either x exists and
has value NULL, or because x doesn't exist.   If that matters, the user
would need to be careful about specifying a value.if.not.found that cannot
be confused with a valid value of x.

To avoid this difficulty, perhaps we want both: have Martin's getifexists( )
return a list with two values:
   - a boolean variable 'found'  # = value returned by exists( )
   - a variable 'value'

Then implement get( ) as:

get <- function(x,...,value.if.not.found ) {

   if( missing(value.if.not.found) ) {
     a <- getifexists(x,... )
     if (!a$found) stop(x, " not found")
   } else {
     a <- getifexists(x,...,value.if.not.found )
   }
   return(a$value)
}

Note that value.if.not.found has no default value in above.
It behaves exactly like current get does if value.if.not.found
is not specified, and if it is specified, it would be faster
in the common situation mentioned below:
  if(exists(x,...)) { get(x,...) }
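
A runnable variant of this idea might look as follows. It is a sketch under assumptions: getifexists_list and get_sketch are hypothetical names, the lookup is done with exists() and get() in R rather than in C, and the fallback is handled in the wrapper instead of being passed down to the helper.

```r
# Hypothetical helper: returns both the lookup status and the value.
getifexists_list <- function(x, envir = parent.frame(), inherits = TRUE) {
  found <- exists(x, envir = envir, inherits = inherits)
  list(found = found,
       value = if (found) get(x, envir = envir, inherits = inherits) else NULL)
}

# get()-like wrapper; 'value.if.not.found' deliberately has no default.
get_sketch <- function(x, ..., value.if.not.found) {
  a <- getifexists_list(x, ...)
  if (!a$found) {
    if (missing(value.if.not.found)) stop("object '", x, "' not found")
    return(value.if.not.found)
  }
  a$value
}

e <- new.env()
assign("n", NULL, envir = e)   # a binding whose value really is NULL

r1 <- getifexists_list("n",  envir = e, inherits = FALSE)
r2 <- getifexists_list("zz", envir = e, inherits = FALSE)
```

Here r1$found is TRUE although r1$value is NULL, while r2$found is FALSE, so the two cases that a plain NULL return would conflate stay distinguishable.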

John

P.S. if you like dromedaries call it valueIfNotFound ...

  ..
  John P. Nolan
  Math/Stat Department
  227 Gray Hall,   American University
  4400 Massachusetts Avenue, NW
  Washington, DC 20016-8050

  jpno...@american.edu   voice: 202.885.3140
  web: academic2.american.edu/~jpnolan
  ..



Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Michael Lawrence
If we do add an argument to get(), then it should be named consistently
with the ifnotfound argument of mget(). As mentioned, the possibility of a
NULL value is problematic. One solution is a sentinel value that indicates
an unbound value (like R_UnboundValue).

But another idea (and one pretty similar to John's) is to follow the SYMSXP
design at the C level, where there is a structure that points to the name
and a value. We already have SYMSXPs at the R level of course (name
objects) but they do not provide access to the value, which is typically
R_UnboundValue. But this does not even need to be implemented with SYMSXP.
The design would allow something like:

binding <- getBinding(x, env)
if (hasValue(binding)) {
  x <- value(binding) # throws an error if none
  message(name(binding), " has value ", x)
}

That, I think, is a bit verbose but readable and could be made fast. And I
think binding objects would be useful in other ways, as they are
essentially a named object. For example, when iterating over an
environment.
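
To make the shape of such an interface concrete, here is a small R-level mock-up. All four functions are hypothetical and exist nowhere in base R; the sketch copies the value at lookup time into an ordinary list, which sidesteps, rather than answers, the question of what should happen when the underlying variable changes afterwards.

```r
# Hypothetical binding object: a name plus, possibly, a value.
getBinding <- function(x, env) {
  found <- exists(x, envir = env, inherits = FALSE)
  structure(list(name  = x,
                 found = found,
                 value = if (found) get(x, envir = env, inherits = FALSE)),
            class = "binding_sketch")
}
hasValue <- function(b) b$found
name     <- function(b) b$name
value    <- function(b) {
  if (!b$found) stop("binding '", b$name, "' has no value")
  b$value
}

env <- new.env()
assign("x", 3.14, envir = env)

binding <- getBinding("x", env)
if (hasValue(binding)) {
  message(name(binding), " has value ", value(binding))
}
unbound <- getBinding("y", env)   # no such variable in 'env'
```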

Michael





Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Martin Maechler

 Adding an optional argument to get (and mget) like
 val <- get(name, where, ..., value.if.not.found=NULL )   (*)

 would be useful for many.  HOWEVER, it is possible that there could be 
 some confusion here: (*) can give a NULL because either x exists and 
 has value NULL, or because x doesn't exist.   If that matters, the user 
 would need to be careful about specifying a value.if.not.found that cannot 
 be confused with a valid value of x.  

Exactly -- well, of course: That problem { NULL can be the legit value of what
you want to get() } was the only reason to have a 'value.if.not' argument at all.

Note that this is not about a universal replacement of 
the  if(exists(..)) { .. get(..) } idiom, but rather a
replacement of these in the cases where speed matters very much,
which is e.g. in the low level support code for S4 method dispatch.

'value.if.not.found':
Note that CRAN checks require all arguments to be written in
full length.  Even though we have auto completion in ESS,
RStudio or other good R IDEs,  I very much like to keep
function calls somewhat compact.

And yes, as you mention the dromedars aka 2-hump camels:  
getIfExist is already horrible to my taste (and _ is not S-like; 
yes that's all very much a matter of taste and yes I'm from the
20th century).

 To avoid this difficulty, perhaps we want both: have Martin's getifexists( ) 
 return a list with two values: 
   - a boolean variable 'found'  # = value returned by exists( )
   - a variable 'value'

 Then implement get( ) as:

 get <- function(x,...,value.if.not.found ) {

   if( missing(value.if.not.found) ) {
     a <- getifexists(x,... )
     if (!a$found) stop(x, " not found")
   } else {
     a <- getifexists(x,...,value.if.not.found )
   }
   return(a$value)
 }

Interesting...
Note that the above get() implementation would just be conceptual, as
all of this is also quite a bit about speed, and we do the
different cases in C anyway [via 'op' code].

 Note that value.if.not.found has no default value in above.
 It behaves exactly like current get does if value.if.not.found 
 is not specified, and if it is specified, it would be faster 
 in the common situation mentioned below:   
  if(exists(x,...)) { get(x,...) }

Good... Let's talk about your getifexists() as I argue we'd keep
get() exactly as it is now anyway, if we use a new 3rd function (I keep
calling 'getifexists()' for now):

I think in that case, getifexists() would not even *need* an argument 
'value.if.not' (or 'value.if.not.found'); it rather would return a 
  list(found = *, value = *)
in any case.
Alternatively, it could return
  structure(found, value = *)

In the first case, our main use case would be

  if((r <- getifexists(x, *))$found) {
     ## work with  r$value
  }

in the 2nd case {structure} :

  if((r <- getifexists(x, *))) {
     ## work with  attr(r, "value")
  }

I think that (both cases) would still be a bit slower (for the above
most important use case) but probably not much
and it would look slightly more readable than my

   if (!is.null(r <- getifexists(x, *))) {
      ## work with  r
   }
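
The structure() variant is perhaps the less familiar of the two, so here is a minimal sketch of it. getifexists_s is a hypothetical name and the lookup is done in R with exists() and get(); the point is only that if() looks at the logical value and ignores the attribute that carries the object.

```r
# Hypothetical variant: return the logical 'found', with the object
# attached as attribute "value" when it exists.
getifexists_s <- function(x, envir = parent.frame(), inherits = TRUE) {
  found <- exists(x, envir = envir, inherits = inherits)
  structure(found,
            value = if (found) get(x, envir = envir, inherits = inherits))
}

e <- new.env()
assign("v", c(1, 2, 3), envir = e)

total <- if ((r <- getifexists_s("v", envir = e, inherits = FALSE))) {
  sum(attr(r, "value"))
} else NA

absent <- getifexists_s("w", envir = e, inherits = FALSE)
```

One consequence of this shape is that an existing object whose value is NULL comes back as a bare TRUE without a "value" attribute, since setting an attribute to NULL removes it.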

After all of this, I think I'd still somewhat prefer my original proposal,
but not strongly -- I had originally also thought of returning the
two parts explicitly, but then tended to prefer the version that
behaved exactly like get() in the case the object is found.

... Nice interesting ideas! ... 
let the proposals and consideration flow ...

Martin


 John

 P.S. if you like dromedaries call it valueIfNotFound ...

:-) ;-)  
I don't .. as I said above, I already strongly dislike more than one hump. 
[ Each capital is one key stroke (Shift) more ,
  and each _ is two key strokes more on most key boards...,
  and I do like identifiers that I can also quickly pronounce on
  the phone or in teaching .. ]


Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread Jeroen Ooms
On Thu, Jan 8, 2015 at 6:36 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 val <- get(name, where, ..., value.if.not.found=NULL )   (*)

 That would be a bad idea, as it would change behaviour of existing uses of
 get().

Another approach would be if the not found behavior consists of a
callback, e.g. an expression or function:

  get(name, where, ..., not.found=stop("object ", name, " not found"))

This would cover the case of not.found=NULL, but also allows for
writing code with syntax similar to tryCatch

  obj <- get("foo", not.found = someDefaultValue())

Not sure what this would do to performance though.
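
Because R evaluates arguments lazily, this can be tried out at the R level without touching get() itself. The wrapper below is a sketch: get_or is a hypothetical name, base get() has no 'not.found' argument, and the counter exists only to show that the fallback expression is evaluated solely when the lookup fails.

```r
# Hypothetical wrapper; the default for 'not.found' is an error.
get_or <- function(name, envir = parent.frame(), inherits = TRUE,
                   not.found = stop("object '", name, "' not found")) {
  if (exists(name, envir = envir, inherits = inherits))
    get(name, envir = envir, inherits = inherits)
  else
    not.found   # the promise is forced only on this branch
}

e <- new.env()
assign("foo", "bar", envir = e)

calls <- 0
someDefaultValue <- function() { calls <<- calls + 1; "default" }

hit  <- get_or("foo", envir = e, inherits = FALSE, not.found = someDefaultValue())
miss <- get_or("baz", envir = e, inherits = FALSE, not.found = someDefaultValue())
err  <- tryCatch(get_or("baz", envir = e, inherits = FALSE),
                 error = function(cnd) cnd)
```

After these calls the counter is 1: the successful lookup never ran someDefaultValue(), and the call without 'not.found' fell through to the default stop().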

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RFC: getifexists() {was [Bug 16065] exists ...}

2015-01-08 Thread luke-tierney

On Thu, 8 Jan 2015, Michael Lawrence wrote:


If we do add an argument to get(), then it should be named consistently
with the ifnotfound argument of mget(). As mentioned, the possibility of a
NULL value is problematic. One solution is a sentinel value that indicates
an unbound value (like R_UnboundValue).


A null default is fine -- it's a default; if it isn't right for a
particular case you can provide something else.
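
As an illustration of "provide something else", the sketch below uses get0(), assuming an R version whose base package provides it with an 'ifnotfound' argument. The sentinel is a freshly created environment, chosen here only because it is identical() to nothing but itself.

```r
e <- new.env()
assign("maybe_null", NULL, envir = e)

# With the NULL default, a NULL-valued binding and a missing one look alike.
a <- get0("maybe_null", envir = e, inherits = FALSE)
b <- get0("not_there",  envir = e, inherits = FALSE)

# A unique sentinel as the fallback makes the two cases distinguishable.
not_found <- new.env()
c1 <- get0("maybe_null", envir = e, inherits = FALSE, ifnotfound = not_found)
c2 <- get0("not_there",  envir = e, inherits = FALSE, ifnotfound = not_found)

is_missing_1 <- identical(c1, not_found)   # binding exists, value is NULL
is_missing_2 <- identical(c2, not_found)   # no such binding
```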



But another idea (and one pretty similar to John's) is to follow the SYMSXP
design at the C level, where there is a structure that points to the name
and a value. We already have SYMSXPs at the R level of course (name
objects) but they do not provide access to the value, which is typically
R_UnboundValue. But this does not even need to be implemented with SYMSXP.
The design would allow something like:

binding <- getBinding(x, env)
if (hasValue(binding)) {
 x <- value(binding) # throws an error if none
 message(name(binding), " has value ", x)
}

That, I think, is a bit verbose but readable and could be made fast. And I
think binding objects would be useful in other ways, as they are
essentially a named object. For example, when iterating over an
environment.


This would need a lot more thought. Directly exposing the internals is
definitely not something we want to do as we may well want to change
that design. But there are lots of other corner issues that would have
to be thought through before going forward, such as what happens if an
rm occurs between obtaining a binding object and doing something with
it. Serialization would also need thinking through. This doesn't seem
like a worthwhile place to spend our efforts to me.

Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
an argument to get() with missing giving current behavior may be OK
too. Rewriting exists and get as .Primitives may be sufficient though.

Best,

luke




Re: [Rd] Multiple return values / bug in rpart?

2013-08-13 Thread Simon Urbanek

On Aug 13, 2013, at 5:27 AM, Barry Rowlingson wrote:

 On Mon, Aug 12, 2013 at 6:06 PM, Justin Talbot justintal...@gmail.com wrote:
 In the recommended package rpart (version 4.1-1), the file rpartpl.R
 contains the following line:
 
 return(x = x[!erase], y = y[!erase])
 
 AFAIK, returning multiple values like this is not valid R. Is that
 correct? I can't seem to make it work in my own code.
 
 Works for me, returning a list:
 
 > foo
 function(x){return(x,x*2)}
 > foo(99)
 [[1]]
 [1] 99
 
 [[2]]
 [1] 198
 
 But hey, that might just be because I redefined 'return' earlier:
 
 > return
 function(...){list(...)}
 

eek.. this is bad:

> (function() { if(TRUE) return("OK"); "BAD!" })()
[1] "BAD!"

Trying to modify the behavior of return() is rather tricky since you have to 
return from the function that's calling you ...
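
The effect can be reproduced without touching the global environment. In this sketch the replacement return is confined to a local() scope; f_builtin and f_shadowed are illustrative names.

```r
# With the real return(), control leaves the function immediately.
f_builtin <- function() { if (TRUE) return("OK"); "BAD!" }

# With 'return' shadowed by an ordinary closure, the call just yields
# list("OK") to the if() statement and evaluation carries on.
f_shadowed <- local({
  return <- function(...) list(...)
  function() { if (TRUE) return("OK"); "BAD!" }
})

res_builtin  <- f_builtin()
res_shadowed <- f_shadowed()
```

res_builtin is "OK", whereas res_shadowed is "BAD!": the closure named return gives its list back to its caller like any other function, and the caller then continues to its last expression.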



Re: [Rd] Multiple return values / bug in rpart?

2013-08-13 Thread Terry Therneau
I don't remember what rpartpl once did myself; as you point out it is a routine that is no 
longer used and should be removed.  I've cc'd Brian since he maintains the rpart code.


Long ago return() with multiple arguments was a legal shorthand for returning a list.  
This feature was deprecated in Splus, I think even before R rose to prominence.  I vaguely 
remember a time when its usage generated a warning.


The fact that I've never noticed this unused routine is somewhat embarrassing.  Perhaps I 
need a "not documented, never called" addition to R CMD check to help me along.


Terry Therneau


In the recommended package rpart (version 4.1-1), the file rpartpl.R
contains the following line:

return(x = x[!erase], y = y[!erase])

AFAIK, returning multiple values like this is not valid R. Is that
correct? I can't seem to make it work in my own code.

It doesn't appear that rpartpl.R is used anywhere, so this may have
never caused an issue. But it's tripping up my R compiler.

Thanks,
Justin Talbot




Re: [Rd] Multiple return values / bug in rpart?

2013-08-13 Thread Prof Brian Ripley

On 13/08/2013 13:54, Terry Therneau wrote:

I don't remember what rpartpl once did myself; as you point out it is a
routine that is no longer used and should be removed.  I've cc'd Brian
since he maintains the rpart code.

Long ago return() with multiple arguments was a legal shorthand for
returning a list. This feature was deprecated in Splus, I think even
before R rose to prominence.  I vaguely remember a time when its usage
generated a warning.


Yes, usage generated a warning then an error, but not parsing.

> foo <- function() return(a=1, b=2)
> foo()
Error in return(a = 1, b = 2) : multi-argument returns are not permitted
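
The distinction between parsing and evaluating can be checked directly. The sketch below defines the function, which succeeds, and only then catches the run-time error; foo_list is an illustrative name for the usual replacement that wraps the values in a list.

```r
# Parsing and defining the function succeeds ...
foo <- function() return(a = 1, b = 2)
parsed_ok <- is.function(foo)

# ... the error only appears when return() is actually evaluated.
res <- tryCatch(foo(), error = function(cnd) cnd)

# Usual way to hand back several values: wrap them in a list.
foo_list <- function() return(list(a = 1, b = 2))
vals <- foo_list()
```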


The fact that I've never noticed this unused routine is somewhat
embarrassing.  Perhaps I need a "not documented, never called" addition
to R CMD check to help me along.


But you cannot know 'never called'.  This is callable by 
rpart:::rpartpl() : it is also possible that functions in your namespace 
are called via eval()ing expressions at R or C level.  (There are 
examples around for which that is the only usage.)







--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] Multiple return values / bug in rpart?

2013-08-13 Thread Duncan Murdoch

On 13-08-13 8:59 AM, Prof Brian Ripley wrote:

On 13/08/2013 13:54, Terry Therneau wrote:

I don't remember what rpartpl once did myself; as you point out it is a
routine that is no longer used and should be removed.  I've cc'd Brian
since he maintains the rpart code.

Long ago return() with multiple arguments was a legal shorthand for
returning a list. This feature was deprecated in Splus, I think even
before R rose to prominence.  I vaguely remember a time when its usage
generated a warning.


Yes, usage generated a warning then an error, but not parsing.

> foo <- function() return(a=1, b=2)
> foo()
Error in return(a = 1, b = 2) : multi-argument returns are not permitted


The fact that I've never noticed this unused routine is somewhat
embarrassing.  Perhaps I need a "not documented, never called" addition
to R CMD check to help me along.


But you cannot know 'never called'.  This is callable by
rpart:::rpartpl() : it is also possible that functions in your namespace
are called via eval()ing expressions at R or C level.  (There are
examples around for which that is the only usage.)


An approximation to "never called" is to run Rprof on your test code,
and see which functions are not mentioned.  I have a package under
construction with some students that can use this approach to identify
which lines are never seen while profiling the test code.


Duncan Murdoch



Re: [Rd] Multiple return values / bug in rpart?

2013-08-13 Thread luke-tierney

Both codetools and the compiler should be checking for use of multiple
args in return -- I'll look into adding that.
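
A static check of this kind can be sketched in a few lines of base R by walking a function's body and flagging return() calls that have more than one argument. This is not the codetools or compiler implementation; find_multi_arg_returns is a hypothetical name, and the walker ignores default-argument expressions and cannot know whether return has been redefined.

```r
# Hypothetical checker: collect return() calls with more than one argument.
find_multi_arg_returns <- function(fun) {
  hits <- list()
  walk <- function(e) {
    if (is.call(e)) {
      if (identical(e[[1]], as.name("return")) && length(e) > 2)
        hits[[length(hits) + 1L]] <<- e
      # Skip empty arguments such as the first index in m[, 1].
      for (arg in as.list(e)) if (!missing(arg)) walk(arg)
    }
  }
  walk(body(fun))
  hits
}

# Defining 'bad' is harmless; only calling it would raise the error.
bad <- function(x, y, erase) {
  m <- cbind(x, y)
  first <- m[, 1]
  return(x = x[!erase], y = y[!erase])
}
good <- function(x) return(list(x, x * 2))

bad_hits  <- find_multi_arg_returns(bad)
good_hits <- find_multi_arg_returns(good)
```

The function is never called during the check, so a line like the one in rpartpl.R would be reported even if no test ever reaches it.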

Best,

luke




--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics and        Fax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu
