rdblue commented on issue #24991: [SPARK-28188] Materialize Dataframe API
URL: https://github.com/apache/spark/pull/24991#issuecomment-511477843

I think this should be an action, not a sink. A no-op sink is just another way to misuse existing APIs for a different purpose. Worse, a no-op sink doesn't actually accomplish the goal. This call returns a dataframe that will reuse the data stored in shuffle servers. A no-op sink would not work for dataframes because you have to get Spark to reuse the same underlying RDD that has already been run.
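
To make the distinction concrete, here is a minimal toy sketch in plain Python. It does not use Spark or model shuffle files. `LazyFrame`, `write_noop`, and `materialize` are hypothetical names invented for illustration, and none of them is Spark's API or the API proposed in this PR. It only shows why a call that returns nothing cannot give the caller a handle to the already-computed data, while an action that returns a new frame can.

```python
from typing import Callable, List


class LazyFrame:
    """Toy stand-in for a lazily evaluated dataframe (hypothetical, not Spark)."""

    def __init__(self, compute: Callable[[], List[int]]) -> None:
        self._compute = compute

    def collect(self) -> List[int]:
        # Every collect re-runs whatever plan this frame wraps.
        return self._compute()

    def write_noop(self) -> None:
        # Sink-style call: runs the plan, discards the rows, returns nothing.
        self.collect()

    def materialize(self) -> "LazyFrame":
        # Action-style call: runs the plan once and returns a new frame
        # whose plan is just "read the rows that were already computed".
        rows = self.collect()
        return LazyFrame(lambda: rows)


def demo() -> tuple:
    calls = {"expensive": 0}

    def expensive_plan() -> List[int]:
        calls["expensive"] += 1  # count how often the real work runs
        return [x * x for x in range(5)]

    # Sink approach: the original frame is unchanged, so later use recomputes.
    df = LazyFrame(expensive_plan)
    df.write_noop()
    df.collect()
    sink_runs = calls["expensive"]

    # Action approach: later use goes through the returned frame.
    calls["expensive"] = 0
    materialized = LazyFrame(expensive_plan).materialize()
    materialized.collect()
    materialized.collect()
    action_runs = calls["expensive"]

    return sink_runs, action_runs
```

In this toy model, the sink path runs the expensive plan again on the next use, while the action path runs it once, because only the action hands back an object tied to the computed result.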
