>>>>> Bert Gunter <bgunter.4...@gmail.com>
>>>>>     on Sat, 27 Feb 2016 19:06:05 -0800 writes:

> (on list, since others might not have gotten it either).
> OK, I get it now. It was I who misunderstood.

> But isn't the bug in the **misuse** of match() in ecdf()
> (by failing to specify the nomatch argument). Jeff says
> comparisons with NaN should return an unordered result,
> which NaN is afaics:

>> NaN < 0
> [1] NA
>> NaN > 0
> [1] NA

> match() just does its thing:

>> match(c(NA,NaN),c(1,2,NA,3,4,NaN,5))
> [1] 3 6

> It's up to the caller to use it correctly, which
> apparently ecdf() fails to do.

> Am I missing something here?

Not much, if anything.
Let me still clarify:

1) This has *nothing* to do with match(), and I am confused why
   nobody has mentioned this till now.

2) In

     x <- c(1,2,NA,3,4,NaN,5)
     Fn <- ecdf(x)

   there is no error: ecdf() drops all NA/NaN from its input on
   purpose and returns the empirical CDF of the other elements,
   so Fn is identical (practically, not strictly formally) to

     Fn. <- ecdf(1:5)

3) The bug is really in the underlying C code of
   approx() / approxfun(), on which ecdf() and notably the function
   it creates (!) rely:

   > L <- approxfun(1:6, 1:6, method = "constant")
   > L( (2:10)/2)
   [1] 1 1 2 2 3 3 4 4 5
   > L( c(NaN, NA, 2:10)/2)
   [1] 5 NA 1 1 2 2 3 3 4 4 5

   Note the first value: the NaN input is mapped to 5 instead of
   being returned as a missing value.

4) A fix for this bug has been committed to R-devel already,
   a minute ago. [svn rev 70239]

Martin Maechler,
ETH Zurich

> Bert Gunter

> "The trouble with having an open mind is that people keep
> coming along and sticking things into it."  -- Opus (aka
> Berkeley Breathed in his "Bloom County" comic strip )

> On Sat, Feb 27, 2016 at 3:49 PM, Jason Thorpe
> <jdtho...@gmail.com> wrote:
>> The bug is that NaN is not part of any cumulative
>> distribution...
>>
>> -Jason sent from my mobile device
>>
>> On Feb 27, 2016 3:34 PM, "Bert Gunter"
>> <bgunter.4...@gmail.com> wrote:
>>>
>>> If I understand you correctly, the "bug" is that you do
>>> not understand match(). See inline comment below and
>>> note carefully the "Value" section of ?match.
>>>
>>> Cheers, Bert
>>>
>>> Bert Gunter
>>>
>>> On Sat, Feb 27, 2016 at 2:52 PM, Jason Thorpe
>>> <jdtho...@gmail.com> wrote:
>>> > For some reason `match()` treats `NaN`'s as comparables by default:
>>> >
>>> >> x <- c(1,2,3,NaN,4,5)
>>> >> match(x,x)
>>> > [1] 1 2 3 4 5 6
>>> >
>>> > which I can override when using `match()` directly:
>>> >
>>> >> match(x,x,incomparables=NaN)
>>> > [1] 1 2 3 NA 5 6
>>> >
>>> > but not necessarily when calling a function that uses `match()`
>>> > internally:
>>> >
>>> >> stats::ecdf(x)(x)
>>> > [1] 0.2 0.4 0.6 0.8 0.8 1.0
>>> >
>>> > Obviously there are workarounds for any given scenario, but the bigger
>>> > problem is that this behavior causes difficult to discover bugs. For
>>> > example, the behavior of stats::ecdf is definitely a bug introduced by
>>> > its use of `match()` (unless you think NaN == 4 is correct).
>>>
>>> No, you misunderstand. match() returns the POSITION of
>>> the match, and clearly NaN in the 4th position of table
>>> = x matches NaN in x. e.g.
>>>
>>> > match(c(x,NaN),x)
>>> [1] 1 2 3 4 5 6 4
>>>
>>> >
>>> > Is there a good reason that NaN's are treated as comparables by match(),
>>> > or is this a bug?
>>> >
>>> > For reference, I'm using R version 3.2.3
>>> >
>>> > -Jason
>>> >
>>> > [[alternative HTML version deleted]]
>>> >
>>> > ______________________________________________
>>> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide
>>> > http://www.R-project.org/posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
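A minimal sketch of points 2) and 3) from the reply in this thread: ecdf() drops NA/NaN when the function is built, and, on an R version that does not yet contain the fix, the returned step function can be guarded so that NA/NaN query points give NA instead of being passed on to the approxfun()-based code. The helper name safe_eval() is hypothetical; it is not part of R and was not proposed in the thread.

```r
x   <- c(1, 2, NA, 3, 4, NaN, 5)
Fn  <- ecdf(x)      # NA and NaN are dropped when the ECDF is built
Fn. <- ecdf(1:5)

# Practically identical: same jump points, same values at the data
stopifnot(isTRUE(all.equal(knots(Fn), knots(Fn.))))
stopifnot(isTRUE(all.equal(Fn(1:5), Fn.(1:5))))

# Hypothetical guard: evaluate Fn only at non-missing query points,
# and return NA wherever the query itself is NA or NaN.
safe_eval <- function(Fn, q) {
  out <- rep(NA_real_, length(q))
  ok  <- !is.na(q)          # is.na() is TRUE for both NA and NaN
  out[ok] <- Fn(q[ok])
  out
}

safe_eval(Fn, c(1, NaN, 4, NA))
```

The guard never hands a missing value to the step function, so it should behave the same whether or not the installed R includes the approx()/approxfun() fix.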