cloud-fan commented on issue #27096: [SPARK-28148][SQL] Repartition after join
is not optimized away
URL: https://github.com/apache/spark/pull/27096#issuecomment-574507348
In general, shuffles are added by Spark, and Spark can pick the best number of
partitions. However, for user-specified shuffles …
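The distinction between Spark-added and user-specified shuffles can be sketched with a minimal example (the table and column names below are invented for illustration; this is not code from the PR):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

val spark = SparkSession.builder().appName("spark-28148-sketch").getOrCreate()
import spark.implicits._

val left  = Seq((1, "a"), (2, "b")).toDF("key", "l")
val right = Seq((1, "x"), (2, "y")).toDF("key", "r")

// A sort-merge join already hash-partitions both sides by `key`,
// so the user-specified repartition below re-shuffles data that
// is already distributed by `key`.
val joined = left.join(right, "key").repartition(col("key"))

// Comparing the two plans shows the extra Exchange that the
// issue is about: one shuffle Spark added for the join, plus
// one the user requested explicitly.
joined.explain()
left.join(right, "key").explain()
```

For the Spark-added shuffle the optimizer is free to choose the number of partitions; for the user-specified `repartition` it has to respect the user's intent, which is why removing it is not straightforward.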
cloud-fan commented on issue #27096: [SPARK-28148][SQL] Repartition after join
is not optimized away
URL: https://github.com/apache/spark/pull/27096#issuecomment-573548617
Can we try option 1? It seems like we need to do some experiments here. I'm not
sure which option is better without seeing …
cloud-fan commented on issue #27096: [SPARK-28148][SQL] Repartition after join
is not optimized away
URL: https://github.com/apache/spark/pull/27096#issuecomment-572348283
An issue is that at the logical phase we don't know the physical
partitioning/sorting info (e.g. SMJ), so we can't optimize …