GitHub user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19810
Are you trying to optimize the case where the data is too large to fit in
memory? Spark's RDD cache doesn't work well in this case.
