On 12 November 2015 at 21:01, Tom Lane <t...@sss.pgh.pa.us> wrote: > Dean Rasheed <dean.a.rash...@gmail.com> writes: >> These results are based on the attached, updated patch which includes >> a few minor improvements. > > I started to look at this patch, and was immediately bemused by the > comment in estimate_ln_weight: > > /* > * 0.9 <= var <= 1.1 > * > * Its logarithm has a negative weight (possibly very large). Estimate > * it using ln(var) = ln(1+x) = x + O(x^2) ~= x. > */ > > This comment's nonsense of course: ln(0.9) is about -0.105, and ln(1.1) is > about 0.095, which is not even negative much less "very large".
That's nonsense. The comment is perfectly correct. It's not saying the logarithm is negative, it's saying that the *weight* of the logarithm is negative. For the examples you cite, the weight is small and negative (around -1), but for inputs closer to 1 it can be large and negative. > We could > just replace that with "For values reasonably close to 1, we can estimate > the result using ln(var) = ln(1+x) ~= x." I am wondering what's the point > though: why not flush this entire branch in favor of always using the > generic case? I rather doubt that on any modern machine two uses of > cmp_var plus init_var/sub_var/free_var are going to be cheaper than the > log() call you save. You would need a different way to special-case 1.0, > but that's not expensive. > No, you're missing the point entirely. Suppose var = 1.000000000000000000001, in which case ln(var) is approximately 1e-21. The code in the first branch uses the approximation ln(var) = ln(1+x) ~= x = var-1 to see that the weight of ln(var) is around -21. You can't do that with the code in second branch just by looking at the first couple of digits of var and doing a floating point approximation, because all you'd see would be 1.0000000 and your approximation to the logarithm would be orders of magnitude out (as is the case with the code that this function is replacing, which comes with no explanatory comments at all). > Larger questions are > > (1) why the Abs() in the specification of estimate_ln_weight --- that > doesn't comport with the text about "Estimate the weight of the most > significant digit". Yes it does, and the formula is correct. The function returns an estimate for the weight of the logarithm of var. Let L = ln(var), then what we are trying to return is the weight of L. So we don't care about the sign of L, we just need to know it's weight -- i.e., we want approximately log10(Abs(L)) which is log10(Abs(ln(var))) as the comment says. The Abs() is necessary because L might be negative (when 0 < var < 1). > (The branch I'm proposing you remove fails to > honor that anyway.) > No it doesn't. It honours the Abs() by ignoring the sign of x, and just looking at its weight. > (2) what should the behavior be for input == 1, and why? The code > is returning zero, but that seems inconsistent or at least > underdocumented. > ln(1) is 0, which has a weight of 0. I suppose you could argue that technically 0 could have any weight, but looking at the bigger picture, what this function is for is deciding how many digits of precision are needed in intermediate calculations to retain full precision in the final result. When the input is exactly 1, the callers will have nice exact results, and no extra intermediate precision is needed, so returning 0 is quite sensible. I guess adding words to that effect to the comment makes sense, since it clearly wasn't obvious. > (3) if it's throwing an error for zero input, why not for negative > input? I'm not sure that either behavior is within the purview of > this function, anyway. Seems like returning zero might be fine. > > regards, tom lane [Shrug] It doesn't make much difference since both those error cases will be caught a little later on in the callers, but since this function needs to test for an empty digit array in the second branch anyway, it seemed like it might as well report that error there. Returning zero would be fine too. Regards, Dean -- Sent via pgsql-hackers mailing list (firstname.lastname@example.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers