Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/2634#issuecomment-57841035
@derrickburns The `*ClusterSuite` was created to prevent pulling
unnecessary objects into the task closure. You can try removing `Serializable`
from the algorithms. While the models are serializable, the algorithm instances
should stay on the driver node. If you want to use a member method in a task
closure, either make it static (e.g. move it to a companion object) or define
it as a local method. If you want to use a member variable, assign it to a
local `val` first and reference that `val` in the closure.
This is something we can try. Avoiding serializing unnecessary objects is a
good practice, but I'm not sure whether it is worth the effort.
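
A minimal sketch of the member-variable advice above, not code from this PR.
`Scaler`, `badClosure`, `goodClosure` and `ClosureCheck` are hypothetical names,
and plain Java serialization is used here only as a stand-in for what happens
when a task closure is shipped to executors:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical algorithm class. It is deliberately NOT Serializable, so a
// closure that accidentally captures `this` fails to serialize.
class Scaler(val factor: Double) {
  // Reads `factor` through the instance, so the closure captures `this`.
  def badClosure: Double => Double = x => x * factor

  // Copies the field to a local val first, so only a Double is captured.
  def goodClosure: Double => Double = {
    val localFactor = factor
    x => x * localFactor
  }
}

object ClosureCheck {
  // Stand-in for shipping a task: try to Java-serialize the closure.
  def isSerializable(f: AnyRef): Boolean =
    try {
      val out = new ObjectOutputStream(new ByteArrayOutputStream())
      out.writeObject(f)
      out.close()
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

With this setup, `ClosureCheck.isSerializable(new Scaler(2.0).badClosure)` is
expected to be false and the same check on `goodClosure` to be true, which is
the kind of failure a non-`Serializable` algorithm class would surface early.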
Btw, could you update your PR to follow the Spark Code Style Guide
(https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide)?
Thanks!