GitHub user WeichenXu123 commented on the issue:
https://github.com/apache/spark/pull/19229
@viirya I guess the reason is that in the old PR version,
`df.withColumn(..).withColumn(..).withColumn(..)....`, the long df chain
prevents the shuffle from being reused, but now you merge them into one step.
