Github user debasish83 commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-62247730
Our sparse data very often has ~10M users and 1M features. For a 1-hidden-layer
net with 10K nodes, we need 10M x 10K + 1M x 10K doubles. If the model is not
distributed (as in the ALS design), it is not possible to fit such a complicated
model into the memory of one node. Having said that, it makes sense to have a
baseline implementation that can serve as a reference for further enhancements.
Also, as an autoencoder, a neural net should produce better results than sparse
coding (which we can do in MLlib through this PR) due to the non-linearity of
the hidden units. A rough memory estimate for the sizes above is sketched below.
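
As a back-of-the-envelope sketch of that arithmetic (assuming 8-byte doubles;
the object and variable names here are purely illustrative, not part of the PR):

    object ModelSizeEstimate {
      def main(args: Array[String]): Unit = {
        val users    = 10L * 1000 * 1000   // ~10M users
        val features = 1L  * 1000 * 1000   // ~1M input features
        val hidden   = 10L * 1000          // 10K hidden nodes
        val bytesPerDouble = 8L            // JVM primitive double size

        val userSide    = users * hidden * bytesPerDouble      // 10M x 10K doubles
        val featureSide = features * hidden * bytesPerDouble   // 1M x 10K doubles
        val gib = math.pow(1024, 3)

        println(f"user side:    ${userSide / gib}%.0f GiB")    // ~745 GiB
        println(f"feature side: ${featureSide / gib}%.0f GiB") // ~75 GiB
        println(f"total:        ${(userSide + featureSide) / gib}%.0f GiB")
      }
    }

At roughly 800 GiB in total, this is well beyond the heap of a single node,
which is why a distributed model representation matters here.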