GitHub user avulanov opened a pull request:

    https://github.com/apache/spark/pull/7621

    [SPARK-2352] [ML] Add Artificial Neural Network (ANN) to Spark

    ### Summary
    This pull request contains the following features for ML:
       - Multilayer Perceptron regressor
       - Multilayer Perceptron classifier
    
    This implementation is based on our initial pull request with @bgreeven: 
https://github.com/apache/spark/pull/1290 and is inspired by the insightful 
suggestions from @mengxr and @witgo (I would also like to thank everyone else 
in that thread for the useful discussions). The original code was 
extensively tested and benchmarked. Since then, I've addressed the two main 
requirements that prevented the code from being merged into the main branch: 
       - Extensible interface, so that new types of networks are easy to 
implement
         - The main building blocks are the traits `Layer` and `LayerModel`, 
which are used for constructing the layers of an ANN. New layers can be added 
by extending these traits. The traits are private in this release to leave 
room for improving them based on community feedback
         - Back propagation is implemented in a general form, so it (the 
optimization algorithm) does not need to change when new layers are implemented
       - Speed and scalability: this implementation has to be comparable in 
speed to state-of-the-art single-node implementations.
         - The benchmark developed for large ANNs shows that the proposed code 
is on par with a C++ CPU implementation and scales well with the number of 
workers. Details can be found here: https://github.com/avulanov/ann-benchmark
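
To illustrate the extensibility point above, here is a minimal sketch of how a layer description, a layer model, and a general back propagation routine can fit together. It is written in plain Python for brevity and is not the Scala code from this pull request: the class names mirror `Layer` and `LayerModel`, but every method name (`init_model`, `forward`, `backward`, `step`), the squared-error loss, and the sigmoid activation are assumptions made for the example.

```python
from abc import ABC, abstractmethod
import math
import random


class LayerModel(ABC):
    """A layer with concrete parameters (hypothetical interface)."""

    @abstractmethod
    def forward(self, x):
        """Map the input vector to the output vector."""

    @abstractmethod
    def backward(self, x, y, delta):
        """Given input x, output y and dLoss/dy, return (dLoss/dx, gradient)."""

    def step(self, grad, lr):
        """Apply a gradient step; layers without weights do nothing."""


class Layer(ABC):
    """A layer description that knows how to create its model (hypothetical)."""

    @abstractmethod
    def init_model(self, rng):
        """Return a LayerModel with freshly initialized weights."""


class AffineModel(LayerModel):
    def __init__(self, w, b):
        self.w, self.b = w, b  # w holds n_out rows of n_in weights

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) + bi
                for row, bi in zip(self.w, self.b)]

    def backward(self, x, y, delta):
        # dLoss/dx = W^T * delta; dLoss/dW = delta * x^T; dLoss/db = delta
        prev = [sum(row[i] * d for row, d in zip(self.w, delta))
                for i in range(len(x))]
        grad_w = [[d * xi for xi in x] for d in delta]
        return prev, (grad_w, list(delta))

    def step(self, grad, lr):
        grad_w, grad_b = grad
        for row, grow in zip(self.w, grad_w):
            for i, g in enumerate(grow):
                row[i] -= lr * g
        for j, g in enumerate(grad_b):
            self.b[j] -= lr * g


class Affine(Layer):
    def __init__(self, n_in, n_out):
        self.n_in, self.n_out = n_in, n_out

    def init_model(self, rng):
        w = [[rng.uniform(-0.5, 0.5) for _ in range(self.n_in)]
             for _ in range(self.n_out)]
        return AffineModel(w, [0.0] * self.n_out)


class SigmoidModel(LayerModel):
    def forward(self, x):
        return [1.0 / (1.0 + math.exp(-v)) for v in x]

    def backward(self, x, y, delta):
        # sigmoid'(x) = y * (1 - y); no weights, so there is no gradient
        return [d * yi * (1.0 - yi) for d, yi in zip(delta, y)], None


class Sigmoid(Layer):
    def init_model(self, rng):
        return SigmoidModel()


def predict(models, x):
    for m in models:
        x = m.forward(x)
    return x


def loss(models, x, target):
    # Squared-error loss, chosen only for this example
    return 0.5 * sum((o - t) ** 2 for o, t in zip(predict(models, x), target))


def backprop(models, x, target):
    """General back propagation: only the LayerModel interface is used."""
    acts = [x]
    for m in models:
        acts.append(m.forward(acts[-1]))
    delta = [o - t for o, t in zip(acts[-1], target)]  # dLoss/d(output)
    grads = [None] * len(models)
    for i in reversed(range(len(models))):
        delta, grads[i] = models[i].backward(acts[i], acts[i + 1], delta)
    return grads
```

In this sketch, adding a new kind of layer means writing one more `Layer`/`LayerModel` pair; `backprop` stays untouched because it only calls `forward` and `backward`. A network is then just a list such as `[Affine(2, 3), Sigmoid(), Affine(3, 1), Sigmoid()]`, with each entry turned into a model via `init_model(random.Random(0))`.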
    
    ### Other implementations based on the proposed interface
       - DBN and RBM by @witgo 
https://github.com/witgo/spark/tree/ann-interface-gemm-dbn
       - Dropout https://github.com/avulanov/spark/tree/ann-interface-gemm
    
    @mengxr and @dbtsai kindly agreed to perform code review.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/avulanov/spark SPARK-2352-ann

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/7621.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #7621
    
----
commit a2261330c227be8ef26172dbe355a617d653553a
Author: Alexander Ulanov <[email protected]>
Date:   2015-07-23T14:55:15Z

    Multilayer Perceptron regressor and classifier
    
    ANN test

----

