Github user avulanov commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-62823148
@mengxr I ran a test on mnist8m with our ANN implementation, as you
suggested. I used a 5-node cluster; each node has a 4-core 3.3 GHz Xeon and
16 GB of RAM. Each node ran 4 Spark Workers with 3 GB of RAM and 1 core each,
for 20 Workers in total. I used 99.9% of the data for training and the
remaining 0.1% for testing (random split). The error oscillates around 4%
after 25 iterations. Each iteration takes 30 minutes on average (ranging from
10 to 50 minutes). Is this testing sufficient, or would you like me to produce
some specific measurements?
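
For reference, the figures above work out roughly as in the sketch below. The
variable names are illustrative only, and the mnist8m size of 8,100,000
examples is an assumption, since the message does not state it. The
lower and upper run-time values are loose bounds that assume every iteration
ran at the fastest or slowest reported speed.

```python
# Back-of-the-envelope arithmetic for the reported setup.
# All names here are illustrative; none come from the original message.
NODES = 5
WORKERS_PER_NODE = 4
WORKER_MEMORY_GB = 3
ITERATIONS = 25
AVG_MINUTES_PER_ITERATION = 30
MIN_MINUTES_PER_ITERATION = 10
MAX_MINUTES_PER_ITERATION = 50

# Assumption: mnist8m contains 8,100,000 examples (not stated in the message).
ASSUMED_MNIST8M_EXAMPLES = 8_100_000

total_workers = NODES * WORKERS_PER_NODE
total_worker_memory_gb = total_workers * WORKER_MEMORY_GB

# Total run time in hours at the average, fastest, and slowest iteration speed.
avg_total_hours = ITERATIONS * AVG_MINUTES_PER_ITERATION / 60
min_total_hours = ITERATIONS * MIN_MINUTES_PER_ITERATION / 60
max_total_hours = ITERATIONS * MAX_MINUTES_PER_ITERATION / 60

# A 99.9% / 0.1% split leaves one example in a thousand for testing.
test_examples = ASSUMED_MNIST8M_EXAMPLES // 1000
train_examples = ASSUMED_MNIST8M_EXAMPLES - test_examples

print(f"workers: {total_workers}, worker memory: {total_worker_memory_gb} GB")
print(f"run time: {avg_total_hours:.1f} h average "
      f"({min_total_hours:.1f} to {max_total_hours:.1f} h)")
print(f"train: {train_examples}, test: {test_examples}")
```

Under these assumptions the test set is small (a few thousand examples), which
may be one reason the reported error oscillates rather than settling.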