Github user codedeft commented on the pull request:
https://github.com/apache/spark/pull/2868#issuecomment-61358798
@mengxr @jkbradley Can you merge this? This is the only way to
effectively train 10 large trees on the mnist8m dataset.
With the node ID cache, it took a very long time, but we were able to finish
training 10 trees on mnist8m in 15 hours with 20 executors. SF with local
training can finish this in 20 minutes, so local training would be a must for
the next release.
However, without the node ID cache, it looks like it's not even possible. The
run is currently only 60% of the way through and has already taken 13 hours
with dozens of fetch failures. I suspect it will eventually fail because the
models are just too big to pass around.