GitHub user viirya commented on the issue:
https://github.com/apache/spark/pull/17936
@jerryshao Yeah, the reason I mentioned caching is to find out how much
re-computing the RDD costs in terms of performance. It seems to me that if
re-computing is much more costly than transferring the data, only caching can help.