tcondie commented on issue #20393: [SPARK-23207][SQL] Shuffle+Repartition on a DataFrame could lead to incorrect answers
URL: https://github.com/apache/spark/pull/20393#issuecomment-518437994

@jiangxb1987 and @sameeragarwal we are seeing this issue in Spark 2.3.2 when a cache step is introduced after each repartition operation. I have not been able to repro it using the example listed in this PR and Jira. Could either of you please verify that this bug fix is complete and that adding a cache step would not affect the solution?
