Prior to the mid-1990s, S did "length-0 OP length-n -> rep(NA, n)" and it was changed to "length-0 OP length-n -> length-0" to avoid lots of problems like any(x<0) being NA when length(x)==0. Yes, people could code defensively by putting lots of if(length(x)==0)... in their code, but that is tedious and error-prone and creates really ugly code.
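As a small illustration of the rule described above, here is a sketch in R. It is not taken from the thread: the names `x`, `r` and `old_style` are made up for the example, and the "old" S behavior is only emulated with rep(NA, n), since current R does not implement it.

```r
# Illustrative sketch; 'x', 'r' and 'old_style' are hypothetical names.
x <- numeric(0)            # a zero-length numeric vector

# Current rule: length-0 OP length-n gives a length-0 result.
r <- x < 0
length(r)                  # 0
any(r)                     # FALSE, so no if (length(x) == 0) guard is needed

# The same rule holds when the other operand is longer than 1.
length(x + 1:3)            # 0

# Emulation of the pre-mid-1990s S rule, length-0 OP length-n -> rep(NA, n):
old_style <- rep(NA, 1)    # what x < 0 would have produced under that rule
any(old_style)             # NA, which is the problem described above
```

With the old rule, every call such as any(x < 0) would have needed a length check in front of it to avoid the NA.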
Is your suggestion to leave the length-0 OP length-1 case as it is but make length-0 OP length-two-or-higher an error or warning (akin to the length-2 OP length-3 case)?

By the way, all(numeric(0)<0) is TRUE, as is all(numeric()>0), by de Morgan's rule, but that is not really relevant here.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Sep 8, 2016 at 10:22 AM, Gabriel Becker <gmbec...@ucdavis.edu> wrote:

>
> On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap <wdun...@tibco.com> wrote:
>
>> Shouldn't binary operators (arithmetic and logical) throw an error
>> when one operand is NULL (or other type that doesn't make sense)? This is
>> a different case than a zero-length operand of a legitimate type. E.g.,
>>     any(x < 0)
>> should return FALSE if x is number-like and length(x)==0 but give an
>> error if x is NULL.
>>
> Bill,
>
> That is a good point. I can see the argument for this in the case that the
> non-zero length is 1. I'm not sure which is better though. If we switch
> any() to all(), things get murky.
>
> Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
> all(x>0)), but the likelihood of this being a thought-bug on the author's
> part is exceedingly high, imho. So the desirable behavior seems to depend
> on the angle we look at it from.
>
> My personal opinion is that x < y with length(x)==0 should fail if
> length(y) > 1, at least, and I'd be for it being an error even if y is
> length 1, though I do acknowledge this is more likely (though still quite
> unlikely imho) to be the intended behavior.
>
> ~G
>
>>
>> I.e., I think the type check should be done before the length check.
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker <gmbec...@ucdavis.edu>
>> wrote:
>>
>>> Martin,
>>>
>>> Like Robin and Oliver I think this type of edge-case consistency is
>>> important and that it's fantastic that R-core - and you personally - are
>>> willing to tackle some of these "gotcha" behaviors. "Little" stuff like
>>> this really does combine to go a long way to making R better and better.
>>>
>>> I do wonder a bit about the
>>>
>>> x = 1:2
>>>
>>> y = NULL
>>>
>>> x < y
>>>
>>> case.
>>>
>>> Returning a logical of length 0 is more backwards compatible, but is it
>>> ever what the author actually intended? I have trouble thinking of a case
>>> where that less-than didn't carry an implicit assumption that y was
>>> non-NULL. I can say that in my own code, I've never hit that behavior
>>> in a case that wasn't an error.
>>>
>>> My vote (unless someone else points out a compelling use for the
>>> behavior) is for this to throw an error. As a developer, I'd rather
>>> things like this break so the bug in my logic is visible, rather than
>>> propagating as the 0-length logical is &'ed or |'ed with other logical
>>> vectors, or used to subset, or (in the case it should be length 1)
>>> passed to if() (if throws an error now, but the rest would silently
>>> "work").
>>>
>>> Best,
>>> ~G
>>>
>>> On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler <maech...@stat.math.ethz.ch>
>>> wrote:
>>>
>>> > >>>>> robin hankin <hankin.ro...@gmail.com>
>>> > >>>>> on Thu, 8 Sep 2016 10:05:21 +1200 writes:
>>> >
>>> > > Martin I'd like to make a comment; I think that R's
>>> > > behaviour on 'edge' cases like this is an important thing
>>> > > and it's great that you are working on it.
>>> >
>>> > > I make heavy use of zero-extent arrays, chiefly because
>>> > > the dimnames are an efficient and logical way to keep
>>> > > track of certain types of information.
>>> >
>>> > > If I have, for example,
>>> >
>>> > > a <- array(0,c(2,0,2))
>>> > > dimnames(a) <- list(name=c('Mike','Kevin'), NULL,item=c("hat","scarf"))
>>> >
>>> >
>>> > > Then in R-3.3.1, 70800 I get
>>> >
>>> > >> a > 0
>>> > > logical(0)
>>> > >>
>>> >
>>> > > But in 71219 I get
>>> >
>>> > >> a > 0
>>> > > , , item = hat
>>> >
>>> >
>>> > > name
>>> > > Mike
>>> > > Kevin
>>> >
>>> > > , , item = scarf
>>> >
>>> >
>>> > > name
>>> > > Mike
>>> > > Kevin
>>> >
>>> > > (which is an empty logical array that holds the names of the people
>>> > > and their clothes). I find the behaviour of 71219 very much preferable
>>> > > because there is no reason to discard the information in the dimnames.
>>> >
>>> > Thanks a lot, Robin, (and Oliver) !
>>> >
>>> > Yes, the above is such a case where the new behavior makes much sense.
>>> > And this behavior remains identical after the 71222 amendment.
>>> >
>>> > Martin
>>> >
>>> > > Best wishes
>>> > > Robin
>>> >
>>> >
>>> > > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler <maech...@stat.math.ethz.ch>
>>> > > wrote:
>>> >
>>> > >> >>>>> Martin Maechler <maech...@stat.math.ethz.ch>
>>> > >> >>>>> on Tue, 6 Sep 2016 22:26:31 +0200 writes:
>>> > >>
>>> > >> > Yesterday, changes to R's development version were committed,
>>> > >> > relating to arithmetic, logic ('&' and '|') and
>>> > >> > comparison/relational ('<', '==') binary operators
>>> > >> > which in NEWS are described as
>>> > >>
>>> > >> > SIGNIFICANT USER-VISIBLE CHANGES:
>>> > >>
>>> > >> > [.............]
>>> > >>
>>> > >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka
>>> > >> >   ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now
>>> > >> >   behave consistently, notably for arrays of length zero.
>>> > >>
>>> > >> >   Arithmetic between length-1 arrays and longer non-arrays had
>>> > >> >   silently dropped the array attributes and recycled. This
>>> > >> >   now gives a warning and will signal an error in the future,
>>> > >> >   as it has always for logic and comparison operations in
>>> > >> >   these cases (e.g., compare ‘matrix(1,1) + 2:3’ and
>>> > >> >   ‘matrix(1,1) < 2:3’).
>>> > >>
>>> > >> > As the above "visually suggests" one could think of the changes
>>> > >> > falling mainly into two groups,
>>> > >> > 1) <0-extent array> (op) <non-array>
>>> > >> > 2) <1-extent array> (arith) <non-array of length != 1>
>>> > >>
>>> > >> > These changes are partly non-back compatible and may break
>>> > >> > existing code. We believe that the internal consistency gained
>>> > >> > from the changes is worth the few places with problems.
>>> > >>
>>> > >> > We expect some package maintainers (10-20, or even more?) need
>>> > >> > to adapt their code.
>>> > >>
>>> > >> > Case '2)' above mainly results in a new warning, e.g.,
>>> > >>
>>> > >> >> matrix(1,1) + 1:2
>>> > >> > [1] 2 3
>>> > >> > Warning message:
>>> > >> > In matrix(1, 1) + 1:2 :
>>> > >> >   dropping dim() of array of length one. Will become ERROR
>>> > >> >>
>>> > >>
>>> > >> > whereas '1)' gives errors in cases the result silently was a
>>> > >> > vector of length zero, or also keeps array (dim & dimnames) in
>>> > >> > cases these were silently dropped.
>>> > >>
>>> > >> > The following is a "heavily" commented R script showing (all ?)
>>> > >> > the important cases with changes :
>>> > >>
>>> > >> > ----------------------------------------------------------------------------
>>> > >>
>>> > >> > (m <- cbind(a=1[0], b=2[0]))
>>> > >> > Lm <- m; storage.mode(Lm) <- "logical"
>>> > >> > Im <- m; storage.mode(Im) <- "integer"
>>> > >>
>>> > >> > ## 1. -------------------------
>>> > >> > try( m & NULL ) # in R <= 3.3.x :
>>> > >> > ## Error in m & NULL :
>>> > >> > ##  operations are possible only for numeric, logical or complex types
>>> > >> > ##
>>> > >> > ## gives 'Lm' in R >= 3.4.0
>>> > >>
>>> > >> > ## 2. -------------------------
>>> > >> > m + 2:3 ## gave numeric(0), now remains matrix identical to m
>>> > >> > Im + 2:3 ## gave integer(0), now remains matrix identical to Im (integer)
>>> > >>
>>> > >> > m > 1 ## gave logical(0), now remains matrix identical to Lm (logical)
>>> > >> > m > 0.1[0] ## ditto
>>> > >> > m > NULL ## ditto
>>> > >>
>>> > >> > ## 3. -------------------------
>>> > >> > mm <- m[,c(1:2,2:1,2)]
>>> > >> > try( m == mm ) ## now gives error "non-conformable arrays",
>>> > >> > ## but gave logical(0) in R <= 3.3.x
>>> > >>
>>> > >> > ## 4. -------------------------
>>> > >> > str( Im + NULL) ## gave "num", now gives "int"
>>> > >>
>>> > >> > ## 5. -------------------------
>>> > >> > ## special case for arithmetic w/ length-1 array
>>> > >> > (m1 <- matrix(1,1,1, dimnames=list("Ro","col")))
>>> > >> > (m2 <- matrix(1,2,1, dimnames=list(c("A","B"),"col")))
>>> > >>
>>> > >> > m1 + 1:2 # -> 2:3 but now with warning to "become ERROR"
>>> > >> > tools::assertError(m1 & 1:2)# ERR: dims [product 1] do not match the length of object [2]
>>> > >> > tools::assertError(m1 < 1:2)# ERR: (ditto)
>>> > >> > ##
>>> > >> > ## non-0-length arrays combined with {NULL or double() or ...} *fail*
>>> > >>
>>> > >> > ### Length-1 arrays: Arithmetic with |vectors| > 1 treated array as scalar
>>> > >> > m1 + NULL # gave numeric(0) in R <= 3.3.x --- still, *but* w/ warning to "be ERROR"
>>> > >> > try(m1 > NULL) # gave logical(0) in R <= 3.3.x --- an *error* now in R >= 3.4.0
>>> > >> > tools::assertError(m1 & NULL) # gave and gives error
>>> > >> > tools::assertError(m1 | double())# ditto
>>> > >> > ## m2 was slightly different:
>>> > >> > tools::assertError(m2 + NULL)
>>> > >> > tools::assertError(m2 & NULL)
>>> > >> > try(m2 == NULL) ## was logical(0) in R <= 3.3.x; now error as above!
>>> > >>
>>> > >> > ----------------------------------------------------------------------------
>>> > >>
>>> > >>
>>> > >> > Note that in R's own 'nls' sources, there was one case of
>>> > >> > situation '2)' above, i.e. a 1x1-matrix was used as a "scalar".
>>> > >>
>>> > >> > In such cases, you should explicitly coerce it to a vector,
>>> > >> > either ("self-explainingly") by as.vector(.), or as I did in
>>> > >> > the nls case by c(.) : The latter is much less
>>> > >> > self-explaining, but nicer to read in mathematical formulae, and
>>> > >> > currently also more efficient because it is a .Primitive.
>>> > >>
>>> > >> > Please use R-devel with your code, and let us know if you see
>>> > >> > effects that seem adverse.
>>> > >>
>>> > >> I've been slightly surprised (or even "frustrated") by the empty
>>> > >> reaction on our R-devel list to this post.
>>> > >>
>>> > >> I would have expected some critique, may be even some praise,
>>> > >> ... in any case some sign people are "thinking along" (as we say
>>> > >> in German).
>>> > >>
>>> > >> In the mean time, I've actually thought along the one case which
>>> > >> is last above: The <op> (binary operation) between a
>>> > >> non-0-length array and a 0-length vector (and NULL which should
>>> > >> be treated like a 0-length vector):
>>> > >>
>>> > >> R <= 3.3.1 *is* quite inconsistent with these:
>>> > >>
>>> > >>
>>> > >> and my proposal above (implemented in R-devel, since Sep.5) would
>>> > >> give an error for all these, but instead, R really could be more
>>> > >> lenient here:
>>> > >> A 0-length result is ok, and it should *not* inherit the array
>>> > >> (dim, dimnames), since the array is not of length 0. So instead
>>> > >> of the above [for the very last part only!!], we would aim for
>>> > >> the following. These *all* give an error in current R-devel,
>>> > >> with the exception of 'm1 + NULL' which "only" gives a "bad
>>> > >> warning" :
>>> > >>
>>> > >> ------------------------
>>> > >>
>>> > >> m1 <- matrix(1,1)
>>> > >> m2 <- matrix(1,2)
>>> > >>
>>> > >> m1 + NULL # numeric(0) in R <= 3.3.x ---> OK ?!
>>> > >> m1 > NULL # logical(0) in R <= 3.3.x ---> OK ?!
>>> > >> try(m1 & NULL) # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>> > >> try(m1 | double())# ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>> > >> ## m2 slightly different:
>>> > >> try(m2 + NULL) # ERROR in R <= 3.3.x ---> change to double(0) ?!
>>> > >> try(m2 & NULL) # ERROR in R <= 3.3.x ---> change to logical(0) ?!
>>> > >> m2 == NULL # logical(0) in R <= 3.3.x ---> OK ?!
>>> > >>
>>> > >> ------------------------
>>> > >>
>>> > >> This would be slightly more back-compatible than the currently
>>> > >> implemented proposal. Everything else I said remains true, and
>>> > >> I'm pretty sure most changes needed in packages would remain to be
>>> > >> done.
>>> > >>
>>> > >> Opinions ?
>>> > >>
>>> > >>
>>> > >>
>>> > >> > In some case where R-devel now gives an error but did not
>>> > >> > previously, we could contemplate giving another "warning
>>> > >> > .... 'to become ERROR'" if there was too much breakage, though
>>> > >> > I don't expect that.
>>> > >>
>>> > >>
>>> > >> > For the R Core Team,
>>> > >>
>>> > >> > Martin Maechler,
>>> > >> > ETH Zurich
>>> > >>
>>> > >> ______________________________________________
>>> > >> R-devel@r-project.org mailing list
>>> > >> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> > >>
>>> >
>>> > > --
>>> > > Robin Hankin
>>> > > Neutral theorist
>>> > > hankin.ro...@gmail.com
>>> >
>>> > > [[alternative HTML version deleted]]
>>> >
>>>
>>> --
>>> Gabriel Becker, PhD
>>> Associate Scientist (Bioinformatics)
>>> Genentech Research