GitHub user freeman-lab opened a pull request:

    https://github.com/apache/spark/pull/1725

    StatCounter on NumPy arrays [PYSPARK][SPARK-2012]

    These changes allow StatCounters to work properly on NumPy arrays, fixing 
the issue reported in SPARK-2012 (https://issues.apache.org/jira/browse/SPARK-2012). 
    
    If NumPy is installed, the NumPy functions ``maximum``, ``minimum``, and 
``sqrt``, which operate element-wise on arrays, are used to merge statistics. 
Otherwise, we fall back on the scalar operators, so StatCounter handles arrays 
when NumPy is available and still works on scalars without it.
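
As a rough illustration of the fallback described above (a minimal sketch, not the code in this patch), the import can be guarded and the same names reused when merging. ``MiniStatCounter`` and its attribute names are hypothetical, chosen only for this example, and the running-variance update shown is the standard one-pass (Welford) form, assumed here rather than taken from the pull request.

```python
from math import sqrt as scalar_sqrt

try:
    # Prefer NumPy's element-wise functions when NumPy is available.
    from numpy import maximum, minimum, sqrt
except ImportError:
    # Scalar fallback: built-in max/min and math.sqrt.
    maximum = max
    minimum = min
    sqrt = scalar_sqrt


class MiniStatCounter(object):
    """Hypothetical, simplified stand-in for StatCounter (illustration only)."""

    def __init__(self):
        self.n = 0                          # number of values merged so far
        self.mu = 0.0                       # running mean
        self.m2 = 0.0                       # running sum of squared deviations
        self.max_value = float("-inf")
        self.min_value = float("inf")

    def merge(self, value):
        # One-pass (Welford) update of the mean and squared deviations.
        delta = value - self.mu
        self.n += 1
        self.mu = self.mu + delta / self.n
        self.m2 = self.m2 + delta * (value - self.mu)
        # With NumPy these are element-wise; without it they are plain max/min.
        self.max_value = maximum(self.max_value, value)
        self.min_value = minimum(self.min_value, value)
        return self

    def stdev(self):
        # Population standard deviation.
        return sqrt(self.m2 / self.n)


counter = MiniStatCounter()
for v in (1.0, 2.0, 3.0):
    counter.merge(v)
```

In this sketch, when NumPy is installed, merging arrays of equal shape should yield per-element statistics, because ``maximum``, ``minimum``, and ``sqrt`` then operate element-wise; without NumPy only scalar values are supported.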
    
    New unit tests added, along with a check for NumPy in the tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/freeman-lab/spark numpy-max-statcounter

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/1725.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1725
    
----
commit 176a127c3c35512a2690ad8ccfb020ea94e42596
Author: Jeremy Freeman <[email protected]>
Date:   2014-08-01T22:47:50Z

    Use numpy arrays in StatCounter
    
    - If NumPy is installed, use maximum/minimum/sqrt so that StatCounters
    work on NumPy arrays
    - Otherwise, fall back on scalar operators

commit 1c8a832ac71dafad893b3f92d12d57c284496402
Author: Jeremy Freeman <[email protected]>
Date:   2014-08-01T22:48:04Z

    Unit tests for StatCounter with NumPy arrays

commit 875414c6d79ef8e8a8938cf888eba71a9bdad070
Author: Jeremy Freeman <[email protected]>
Date:   2014-08-01T23:04:16Z

    Fixed indents

commit 8e764dd0e77e1c32827859fe09019c9c912defb1
Author: Jeremy Freeman <[email protected]>
Date:   2014-08-01T23:07:31Z

    Explicit numpy imports

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
