GitHub user srowen opened a pull request:

    https://github.com/apache/spark/pull/1659

    SPARK-2748 [MLLIB] [GRAPHX] Loss of precision for small arguments to 
Math.exp, Math.log

    In a few places in MLlib, an expression of the form `log(1.0 + p)` is 
evaluated. When `p` is so small that `1.0 + p == 1.0`, the result is 0.0, even 
though the correct answer is very nearly `p`. `Math.log1p` exists to handle 
exactly this case.
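
As a rough illustration of the effect (not code from this patch), the sketch below compares the two forms for a tiny argument. The class name `Log1pDemo`, the helper names, and the sample value `1e-20` are made up for the example; only `Math.log` and `Math.log1p` are real JDK methods.

```java
// Hypothetical demo class, not part of MLlib.
class Log1pDemo {
    // Naive form: 1.0 + p rounds to exactly 1.0 once p is far below
    // double-precision machine epsilon (about 2.2e-16), so log returns 0.0.
    static double naiveLog1p(double p) {
        return Math.log(1.0 + p);
    }

    // Math.log1p computes log(1 + p) without forming 1.0 + p first,
    // so the result stays close to p for tiny p.
    static double preciseLog1p(double p) {
        return Math.log1p(p);
    }

    public static void main(String[] args) {
        double p = 1e-20; // illustrative value only
        System.out.println("log(1.0 + p) = " + naiveLog1p(p));
        System.out.println("log1p(p)     = " + preciseLog1p(p));
    }
}
```

With this input the naive form should lose the argument entirely, while `log1p` should return a value very close to `p`.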
    
    The same applies to one instance of `exp(m) - 1` in GraphX, for which 
`Math.expm1` is the dedicated method.
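
The same kind of comparison can be sketched for `exp(m) - 1`. Again, the class name `Expm1Demo`, the helper names, and the sample value are assumptions for illustration and are not taken from the GraphX code; `Math.exp` and `Math.expm1` are the real JDK methods.

```java
// Hypothetical demo class, not part of GraphX.
class Expm1Demo {
    // Naive form: exp(m) rounds to 1.0 for tiny m, so the subtraction
    // cancels everything and yields 0.0.
    static double naiveExpm1(double m) {
        return Math.exp(m) - 1.0;
    }

    // Math.expm1 computes exp(m) - 1 directly, avoiding the cancellation,
    // so the result stays close to m for tiny m.
    static double preciseExpm1(double m) {
        return Math.expm1(m);
    }

    public static void main(String[] args) {
        double m = 1e-20; // illustrative value only
        System.out.println("exp(m) - 1 = " + naiveExpm1(m));
        System.out.println("expm1(m)   = " + preciseExpm1(m));
    }
}
```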
    
    Although the errors occur only for very small arguments, such arguments 
are entirely plausible in machine learning algorithms.
    
    Also note the related PR for Python: 
https://github.com/apache/spark/pull/1652

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srowen/spark SPARK-2748

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1659.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1659
    
----
commit c5926d4d1cc0d071b4ebbd2bddc14af250809fb0
Author: Sean Owen <[email protected]>
Date:   2014-07-30T11:34:20Z

    Use log1p, expm1 for better precision for tiny arguments

----


