rwpenney edited a comment on pull request #30745:
URL: https://github.com/apache/spark/pull/30745#issuecomment-747671362
I'm not sure I see how you'll get such a general-purpose function that is
more numerically stable than simply multiplying a set of numbers together.
The primary argument for **preferring** `exp(sum(log(...)))` is when one can
compute the logarithm of the quantities of interest directly, _without_ having
to call the `log` function at all. An obvious situation is when one is dealing with
probabilities of quantities drawn from a Gaussian distribution, where the
terms with the largest dynamic range are of the form $e^{-x^2/2}$, and so can
be converted to log-space trivially.
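As a purely illustrative sketch (not tied to the implementation in this PR, and with made-up sample values), the Gaussian case looks something like this: the log-density is available in closed form, so one never calls `log` on a density, and the sum of log-densities stays finite even when an individual density term underflows to zero in double precision:

```scala
object GaussianLogSpace {
  def main(args: Array[String]): Unit = {
    // Hypothetical samples; the last one is far enough into the tail that its
    // density exp(-x*x/2) / sqrt(2*pi) underflows to 0.0 in double precision.
    val xs = Seq(0.5, -1.2, 2.0, 40.0)

    // The log-density is available directly: log f(x) = -x*x/2 - 0.5*log(2*pi).
    val logDensities = xs.map(x => -x * x / 2.0 - 0.5 * math.log(2.0 * math.Pi))

    // Direct product of the densities: the tail sample contributes exactly 0.0,
    // so the product collapses to zero and all information is lost.
    val directProduct = logDensities.map(math.exp).product

    // Working in log-space keeps the full dynamic range; in a case like this one
    // would typically report the log-likelihood rather than exponentiate at the end.
    val logLikelihood = logDensities.sum

    println(s"direct product of densities: $directProduct")  // 0.0
    println(s"sum of log-densities:        $logLikelihood")  // roughly -806.5
  }
}
```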
By contrast, I'm rather doubtful that one actually reduces round-off error by
taking a set of plain quantities, computing their logarithms, summing them, and
then exponentiating the result. Rounding errors arise in every one of the
non-linear `log` and `exp` evaluations, and I think one would have to do quite
a careful error analysis to demonstrate that this route is more accurate, in
general, than just multiplying the numbers directly.
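A quick way to probe that claim (again purely illustrative: the values, the seed and the use of `BigDecimal` as a high-precision reference are my own choices here) is to compare both routes against a much higher-precision product for well-scaled inputs, where neither overflow nor underflow is in play:

```scala
object ProductRoundoff {
  def main(args: Array[String]): Unit = {
    val rng = new scala.util.Random(42)
    // Well-scaled positive inputs: no risk of overflow or underflow.
    val xs = Seq.fill(1000)(0.5 + rng.nextDouble())

    // High-precision reference product (DECIMAL128, ~34 significant digits).
    val reference = xs.map(BigDecimal(_)).product

    val direct  = xs.product                      // one rounding per multiplication
    val viaLogs = math.exp(xs.map(math.log).sum)  // rounding in every log, in the sum, and in exp

    def relativeError(v: Double): BigDecimal = ((BigDecimal(v) - reference) / reference).abs

    println(s"relative error, direct product:     ${relativeError(direct)}")
    println(s"relative error, exp(sum(log(...))): ${relativeError(viaLogs)}")
  }
}
```

Both routes accumulate rounding from every floating-point operation involved, so a comparison along these lines (or a proper error analysis) seems necessary before claiming a general accuracy advantage for the log/exp route.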
Do you have a particular algebraic expression in mind that is mathematically
equivalent to the product of a set of numbers, but which is indeed more
accurate for the most likely use-cases when working with finite-precision
arithmetic?