viirya commented on pull request #28916: URL: https://github.com/apache/spark/pull/28916#issuecomment-650899063
> It's because `ShuffledRowRDD` is created with the default number of shuffle partitions here:
> https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/ShuffleExchangeExec.scala#L98

When `partitionSpecs` is empty, `CustomShuffleReaderExec` creates a `ShuffledRowRDD` with empty `partitionSpecs`:

https://github.com/apache/spark/blob/079b3623c85192ff61a35cc99a4dae7ba6c599f0/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala#L183-L184

https://github.com/apache/spark/blob/34c7ec8e0cb395da50e5cbeee67463414dacd776/sql/core/src/main/scala/org/apache/spark/sql/execution/ShuffledRowRDD.scala#L156-L160

AQE changes the shuffle, and `CustomShuffleReaderExec` replaces the original `ShuffleExchangeExec`, so I think the code you pointed to is superseded by the code path above. The resulting RDD should therefore have no partitions.
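To illustrate the point, here is a minimal sketch that does not use Spark at all. It assumes the RDD produces one output partition per entry in `partitionSpecs`, which is my reading of the linked `ShuffledRowRDD` lines. `PartitionSpec`, `CoalescedSpec`, `ModelPartition`, and `partitionsFor` are hypothetical stand-ins, not Spark's actual classes or methods.

```scala
// Simplified stand-ins for illustration only; these are NOT Spark's classes.
sealed trait PartitionSpec
final case class CoalescedSpec(startReducerIndex: Int, endReducerIndex: Int)
  extends PartitionSpec

final case class ModelPartition(index: Int, spec: PartitionSpec)

object EmptySpecsSketch {
  // Assumption: one output partition per spec, so an empty spec list
  // yields an array with zero partitions.
  def partitionsFor(specs: Seq[PartitionSpec]): Array[ModelPartition] =
    Array.tabulate(specs.length)(i => ModelPartition(i, specs(i)))

  def main(args: Array[String]): Unit = {
    // 200 is the documented default of spark.sql.shuffle.partitions.
    val defaultShufflePartitions = 200
    val original = partitionsFor(
      Seq.tabulate(defaultShufflePartitions)(i => CoalescedSpec(i, i + 1)))
    // What the reader would see if AQE hands it empty partitionSpecs.
    val afterAqe = partitionsFor(Seq.empty)
    println(s"original: ${original.length}, after AQE: ${afterAqe.length}")
  }
}
```

Under that assumption, the partition count follows the length of `partitionSpecs` rather than the default number of shuffle partitions, which is why I expect an empty result here.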
