GitHub user derrickburns commented on the pull request:
https://github.com/apache/spark/pull/2634#issuecomment-57894117
I ran the style tests. They pass. Is there something else in the style guide
that is not captured in the tests?
I have expended much effort to avoid serializing unnecessary objects. I'm
still perplexed why so much data is being captured in the closure that the test
fails.
Anyway, what are the next steps? Omit the test and approve the PR? Ask
someone to help fix the code to avoid the unit test failure?
Thanks!
> On Oct 3, 2014, at 12:17 PM, Xiangrui Meng <[email protected]> wrote:
>
> @derrickburns The *ClusterSuite was created to prevent pulling
> unnecessary objects into the task closure. You can try to remove Serializable
> from algorithms. While the models are serializable, the algorithm instances
> should stay on the driver node. If you want to use a member method in a task
> closure, either make it static or define it as a local method. If you want to
> use a member variable, assign it to a val first.
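
A minimal sketch of the "assign it to a val first" advice, using plain Java serialization rather than Spark itself. The names `Scaler`, `capturesThis`, `capturesLocal`, and `ClosureCheck` are hypothetical and are not part of this PR or of MLlib. It assumes a Scala version whose function literals are serializable (2.12 or later); behavior under other versions may differ.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical algorithm class. It is deliberately NOT Serializable,
// mirroring the idea that algorithm instances stay on the driver.
class Scaler(val factor: Double) {
  // Reads `factor` through the instance, so the closure captures `this`.
  def capturesThis: Double => Double = x => x * factor

  // Copies the member to a local val first, so the closure
  // captures only a Double and not the enclosing Scaler.
  def capturesLocal: Double => Double = {
    val f = factor
    x => x * f
  }
}

object ClosureCheck {
  // Returns true if Java serialization of `obj` succeeds.
  def isSerializable(obj: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }
}
```

Checking `ClosureCheck.isSerializable(new Scaler(2.0).capturesThis)` against the same call on `capturesLocal` is one way to see which closures drag the whole instance along.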
>
> This is something we can try. Avoiding serializing unnecessary objects is
> a good practice, but I'm not sure whether it is worth the effort.
>
> Btw, could you update your PR following the
> https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide ?
> Thanks!