Dear Martin,

Thank you for following up.
I appreciate that this is entrenched behaviour and that changing the documentation may be preferable to changing the code in practice, and accordingly I filed this as a documentation bug earlier today (#16493). But I don't agree that the current behaviour makes sense.

Firstly, the case where the magnitude of `tolerance` is greater than that of the test vectors must surely be pretty rare. Who wants to test whether 1 and 2 are equal with a tolerance of 10?

Secondly, absolute error is (IMHO) more intuitive, and since the docs don't emphasise that the function prefers relative error, I would think that many users, like me, would expect absolute error to be used. (My assumption, which the docs do not coherently contradict, has been that absolute error is used to decide whether or not to return `TRUE`, but that if the vectors are not considered equal then relative error is used in the return string.)

Finally, if the decision is about numerical precision in the comparison, then comparing `xn` to `tolerance` doesn't seem sensible. Maybe it should be something like `xn * tolerance > .Machine$double.eps`, i.e., a check of whether the test criterion under relative error would be within machine precision. Note that that would make

    all.equal(0.3, 0.1+0.2, tolerance=1e-16)
    # [1] "Mean relative difference: 1.850372e-16"

test TRUE (on my system), since 0.3-(0.1+0.2) is approximately -5.6e-17 (i.e., less in magnitude than 1e-16), while 0.3*1e-16 is less than the .Machine$double.eps of about 2.2e-16 (so absolute error would be chosen).

However, if the code will not be changed, I think the documentation should (i) make clear that relative error is preferred where appropriate; (ii) correct the mistake on line 2, where it is stated that the choice of relative or absolute error is determined by comparing the mean absolute difference to `tolerance`; and (iii) correct the mistake on the final line, where it is stated that relative errors are scaled by the mean absolute difference (for which you have suggested alternative wordings).
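For concreteness, here is a simplified sketch of the `scale = NULL` branch that lets the two criteria be compared side by side. This is my own illustration, not the real `all.equal.numeric` (it omits the handling of NAs and of entries where the vectors are already equal), and `mean_diff` and its `criterion` argument are hypothetical names of my choosing:

```r
## Simplified sketch (NOT the real all.equal.numeric) of the
## 'scale = NULL' branch, comparing the current criterion
## (xn > tolerance) with the proposed alternative
## (xn * tolerance > .Machine$double.eps).
mean_diff <- function(target, current, tolerance = 1.5e-8,
                      criterion = c("current", "proposed")) {
  criterion <- match.arg(criterion)
  xy <- mean(abs(target - current))   # mean absolute difference
  xn <- mean(abs(target))             # mean absolute target value
  use_relative <- switch(criterion,
    current  = is.finite(xn) && xn > tolerance,
    proposed = is.finite(xn) && xn * tolerance > .Machine$double.eps)
  if (use_relative) {
    xy <- xy / xn                     # scale by mean absolute target value
    what <- "relative"
  } else {
    what <- "absolute"
  }
  if (xy <= tolerance) TRUE
  else paste("Mean", what, "difference:", format(xy))
}

mean_diff(0.3, 0.1 + 0.2, tolerance = 1e-16)
# relative error is chosen (0.3 > 1e-16), so the comparison fails

mean_diff(0.3, 0.1 + 0.2, tolerance = 1e-16, criterion = "proposed")
# 0.3 * 1e-16 < .Machine$double.eps, so absolute error is chosen
# and the comparison succeeds (returns TRUE)
```

Under the proposed criterion the 0.3 vs. 0.1+0.2 example above would test TRUE, while well-scaled comparisons such as `all.equal(0.1, 0.102, tolerance=0.01)` would still use relative error.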
All the best,
Jon

On 30 July 2015 at 10:39, Martin Maechler <maech...@stat.math.ethz.ch> wrote:
> Dear Jon,
>
> thank you for raising the issue,
>
>>>>>> Jon Clayden <jon.clay...@gmail.com>
>>>>>>     on Tue, 28 Jul 2015 12:14:48 +0100 writes:
>
>> Sorry; minor clarification. The actual test criterion in the example I
>> gave is of course abs((0.1-0.102)/0.1) < 0.01, not abs(0.1) < 0.01. In
>> any case, this does not match (my reading of) the docs, and the result
>> is not `TRUE`.
>
>> Regards,
>> Jon
>
>> On 28 July 2015 at 11:58, Jon Clayden <jon.clay...@gmail.com> wrote:
>> > Dear all,
>> >
>> > The documentation for `all.equal.numeric` says
>> >
>> >     Numerical comparisons for ‘scale = NULL’ (the default) are done by
>> >     first computing the mean absolute difference of the two numerical
>> >     vectors. If this is smaller than ‘tolerance’ or not finite,
>> >     absolute differences are used, otherwise relative differences
>> >     scaled by the mean absolute difference.
>> >
>> > But the actual behaviour of the function is to use relative
>> > differences if the mean value of the first argument is greater than
>> > `tolerance`:
>> >
>> >     all.equal(0.1, 0.102, tolerance=0.01)
>> >     # [1] "Mean relative difference: 0.02"
>> >
>> > It seems to me that this example should produce `TRUE`, because
>> > abs(0.1-0.102) < 0.01, but it does not, because abs(0.1) > 0.01.
>
> Irrespective of the documentation,
> the above example should continue to produce what it does now.
> These numbers are not close to zero (compared to tol), and so
> relative error should be used.
>
> The whole idea of all.equal.numeric() is to use *relative* error/difference
> __unless__ that is not sensible anymore, namely when the
> denominator of the ratio which defines the relative error
> becomes too close to zero (and hence has to be seen as
> "unstable" / "unreliable").
>
> The exact behavior of all.equal.numeric() has __ I'm pretty sure, but
> can no longer easily prove __ been inherited from the original S
> implementation in most parts, and (if that's correct) has been
> in place for about 30 years [ If not, it has "only" been in
> place about 17 years... ]
> Notably, the code below has been unchanged for a long time, and has been
> in use in too many places to be changed now.
>
> So it is the *documentation* only whose change we should discuss.
>
>> > The relevant section in the source seems to be
>> >
>> >     what <- if (is.null(scale)) {
>> >         xn <- mean(abs(target))
>> >         if (is.finite(xn) && xn > tolerance) {
>> >             xy <- xy/xn
>> >             "relative"
>> >         }
>> >         else "absolute"
>> >     }
>> >
>> > I think `xy`, not `xn`, should be tested here.
>
> As I said above, no such change is acceptable
> {but I don't see *why* either}.
>
>> > The last line of the documentation, indicating that relative
>> > differences are "scaled by the mean absolute difference" also seems
>> > not to match the code, but in this aspect the code is surely right,
>> > i.e., the relative difference is relative to the mean value, not the
>> > mean difference.
>
> Indeed... interestingly, 'target' at that point in the function
> contains only those values of the original target
> which are not NA (fine, of course) but also which are not equal to
> 'current'. This is a bit more surprising, but it does make sense,
> say, when we have two long numeric vectors which are mostly zero
> (and hence often both equal to zero); in other cases, this
> may be more problematic (think of a case where both vectors
> are mostly (exactly) equal, but there are a few
> outliers...).
> But in spite of such cases, as said above, I'm almost sure this
> won't be changed in the code.
>
> My current conclusion would be to change only
>
>     scaled by the mean absolute difference
>
> to
>
>     scaled by the mean absolute target value
>
> or (less easy, but more precise)
>
>     scaled by the mean absolute value of those finite entries of
>     \code{target} where it differs from \code{current}
>
> or yet another version which is both precise and understandable?
>
> Martin Maechler,
> ETH Zurich
>
>> >
>> > All the best,
>> > Jon

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel