Thanks, all. I had read about recycling, but I guess I didn't fully appreciate all the "weirdness" it might produce. :/
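For anyone finding this thread later, here is a small sketch of the behaviour as I now understand it. The name keep is mine, used only for illustration; the values are the ones from my original post.

```r
# Values from the original post; 'keep' is an illustrative name only.
y <- c(0.3534253, -1.6731597, NA, -0.2079209)

keep <- !is.na(y[1:3])         # TRUE TRUE FALSE (length 3)

# The length-3 logical index is recycled against the length-4 vector,
# so it behaves like c(TRUE, TRUE, FALSE, TRUE) and element 4 is selected.
y[keep]
y[c(TRUE, TRUE, FALSE, TRUE)]  # same selection, written out in full

# Matching the lengths avoids recycling: only the first two elements remain.
y[1:3][keep]

# Converting to integer positions also avoids it: which(keep) is 1 2.
y[which(keep)]
```

The last two forms give the two-element result I originally expected.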
With this explained, I'm going to ask a follow-up, which is only contextually related: the impetus for this discovery was checking "corner cases" to determine if all(x[!is.na(x)]==y[!is.na(y)]) would suffice to determine equality of two vectors containing NA's. Between the above result; my related discovery that this indexing preserves relative positional info but not absolute positional info; and the performance penalty when comparing long vectors that may be unequal "early on"; I've concluded that--if it (can be made to) "short circuit"--it would probably be better to use an implicit loop. So that's my Q: will (or can) an implicit loop (be made to) "exit early" if a specified condition is met before all indices have been checked?

Thanks again!

DLG

On Sat, Mar 9, 2019 at 9:07 PM Jeff Newmiller <[email protected]> wrote:

> Regarding the mention of logical indexing, under ?Extract I see:
>
> For [-indexing only: i, j, ... can be logical vectors, indicating
> elements/slices to select. Such vectors are recycled if necessary to match
> the corresponding extent. i, j, ... can also be negative integers,
> indicating elements/slices to leave out of the selection.
>
> On March 9, 2019 6:57:05 PM PST, Rolf Turner <[email protected]>
> wrote:
> >On 3/10/19 2:36 PM, David Goldsmith wrote:
> >> Hi! Newbie (self-)learning R using P. Dalgaard's "Intro Stats w/ R"; not
> >> new to statistics (have had grad-level courses and work experience in
> >> statistics) or vectorized programming syntax (have extensive experience
> >> with MatLab, Python/NumPy, and IDL, and even a smidgen--a long time ago--of
> >> experience w/ S-plus).
> >>
> >> In exploring the use of is.na in the context of logical indexing, I've come
> >> across the following puzzling-to-me result:
> >>
> >> > y; !is.na(y[1:3]); y[!is.na(y[1:3])]
> >> [1] 0.3534253 -1.6731597 NA -0.2079209
> >> [1] TRUE TRUE FALSE
> >> [1] 0.3534253 -1.6731597 -0.2079209
> >>
> >> As you can see, y is a four element vector, the third element of which is
> >> NA; the next line gives what I would expect--T T F--because the first two
> >> elements are not NA but the third element is. The third line is what
> >> confuses me: why is the result not the two element vector consisting of
> >> simply the first two elements of the vector (or, if vectorized indexing in
> >> R is implemented to return a vector the same length as the logical index
> >> vector, which appears to be the case, at least the first two elements and
> >> then either NA or NaN in the third slot, where the logical indexing vector
> >> is FALSE): why does the implementation "go looking" for an element whose
> >> index in the "original" vector, 4, is larger than BOTH the largest index
> >> specified in the inner-most subsetting index AND the size of the resulting
> >> indexing vector? (Note: at first I didn't even understand why the result
> >> wasn't simply
> >>
> >> 0.3534253 -1.6731597 NA
> >>
> >> but then I realized that the third logical index being FALSE, there was no
> >> reason for *any* element to be there; but if there is, due to some
> >> overriding rule regarding the length of the result relative to the length
> >> of the indexer, shouldn't it revert back to *something* that indicates the
> >> "FALSE"ness of that indexing element?)
> >>
> >> Thanks!
> >
> >It happens because R is eco-conscious and re-cycles.
> >:-)
> >
> >Try:
> >
> >ok <- c(TRUE,TRUE,FALSE)
> >(1:4)[ok]
> >
> >In general in R if there is an operation involving two vectors then
> >the shorter one gets recycled to provide sufficiently many entries to
> >match those of the longer vector.
> >
> >Thus in the foregoing example the first entry of "ok" gets used again,
> >to make a length 4 vector to match up with 1:4. The result is the same
> >as (1:4)[c(TRUE,TRUE,FALSE,TRUE)].
> >
> >If you did (1:7)[ok] you'd get the same result as that from
> >(1:7)[c(TRUE,TRUE,FALSE,TRUE,TRUE,FALSE,TRUE)] i.e. "ok" gets
> >recycled 2 and 1/3 times.
> >
> >Try 10*(1:3) + 1:4, 10*(1:3) + 1:5, 10*(1:3) + 1:6 .
> >
> >Note that in the first two instances you get warnings, but in the third
> >you don't, since 6 is an integer multiple of 3.
> >
> >Why aren't there warnings when logical indexing is used? I guess
> >because it would be annoying. Maybe.
> >
> >Note that integer indices get recycled too, but the recycling is
> >limited so as not to produce redundancies. So
> >
> >(1:4)[1:3] just (sensibly) gives
> >
> >[1] 1 2 3
> >
> >and *not*
> >
> >[1] 1 2 3 1
> >
> >Perhaps a bit subtle, but it gives what you'd actually *want* rather
> >than being pedantic about rules with a result that you wouldn't want.
> >
> >cheers,
> >
> >Rolf Turner
> >
> >P.S. If you do
> >
> >y[1:3][!is.na(y[1:3])]
> >
> >i.e. if you're careful to match the length of the vector and that of
> >the indices, you get what you initially expected.
> >
> >R. T.
> >
> >P^2.S. To the younger and wiser heads on this list: the help on "["
> >does not mention that the index vectors can be logical. I couldn't find
> >anything about logical indexing in the R help files. Is something
> >missing here, or am I just not looking in the right place?
> >
> >R. T.
>
> --
> Sent from my phone. Please excuse my brevity.
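A postscript on the follow-up question at the top of this message, offered as a rough sketch and not a definitive answer. It first shows why the all(x[!is.na(x)]==y[!is.na(y)]) test can be fooled, then compares identical() with an explicit for loop that returns as soon as a mismatch is found. As far as I know, sapply() and friends visit every element, whereas a for loop can stop early via return() or break. The helper name na_equal is hypothetical (mine, not from base R or this thread), and it assumes atomic vectors; note that is.na() is TRUE for NaN as well, so this sketch treats NaN like NA.

```r
# Two vectors that differ in WHERE the NA sits.
x <- c(1, NA, 3)
y <- c(1, 3, NA)

# Dropping NAs first loses absolute positions, so this reports TRUE
# even though x and y are not the same vector.
all(x[!is.na(x)] == y[!is.na(y)])

# identical() compares whole objects, NAs included. It also compares
# type and attributes, so 1:3 and c(1, 2, 3) are not identical.
identical(x, y)

# Hypothetical helper: NA-aware, position-aware equality with early exit.
na_equal <- function(x, y) {
  if (length(x) != length(y)) return(FALSE)
  for (i in seq_along(x)) {
    x_na <- is.na(x[i])
    y_na <- is.na(y[i])
    if (x_na != y_na) return(FALSE)              # NA in one, value in other
    if (!x_na && x[i] != y[i]) return(FALSE)     # two differing values
  }
  TRUE                                           # no mismatch found
}

na_equal(x, y)   # stops at index 2
na_equal(x, x)
```

Whether the loop actually beats a vectorized comparison on long vectors is something I have not measured; it depends on how early the first mismatch tends to occur.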

