hahazyb201 opened a new issue, #5731: URL: https://github.com/apache/incubator-gluten/issues/5731
### Backend

VL (Velox)

### Bug description

In the DAG, the "shuffle write time total" metric was much larger than I expected. After digging into the Gluten code, I found that `writeTime_` is added twice to the final metric reported through `writeMetrics.incWriteTime`.

<img width="371" alt="Screenshot 2024-05-13 17 54 33" src="https://github.com/apache/incubator-gluten/assets/20397108/bd65828d-c609-4a1f-8330-8ad130aca82c">

In `VeloxCelebornHashBasedColumnarShuffleWriter.scala`, the [write time](https://github.com/apache/incubator-gluten/blob/main/gluten-celeborn/velox/src/main/scala/org/apache/spark/shuffle/VeloxCelebornHashBasedColumnarShuffleWriter.scala#L155) is calculated as `splitResult.getTotalWriteTime + splitResult.getTotalPushTime`. `totalWriteTime` is accumulated on this [line](https://github.com/apache/incubator-gluten/blob/main/cpp/core/shuffle/Payload.cc#L238). `totalPushTime` is accumulated [here](https://github.com/apache/incubator-gluten/blob/main/cpp/core/shuffle/rss/RssPartitionWriter.cc#L60) from the `spillTime_` variable. Because `spillTime_` includes `writeTime_`, `writeTime_` is counted twice in the final [write time](https://github.com/apache/incubator-gluten/blob/main/gluten-celeborn/velox/src/main/scala/org/apache/spark/shuffle/VeloxCelebornHashBasedColumnarShuffleWriter.scala#L155) metric.

To fix this, I propose moving the `ScopedTimer` [line](https://github.com/apache/incubator-gluten/blob/main/cpp/core/shuffle/rss/RssPartitionWriter.cc#L60) a few lines down, so that `spillTime_` no longer includes the time already counted in `writeTime_`.

<img width="625" alt="Screenshot 2024-05-13 19 07 51" src="https://github.com/apache/incubator-gluten/assets/20397108/e644e860-bfe5-4e80-90ef-852767913388">

Let me know if you want me to open a PR. Thanks.

### Spark version

Spark-3.2.x

### Spark configurations

_No response_

### System information

_No response_

### Relevant logs

_No response_
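To illustrate the double counting described in the bug description, here is a minimal sketch. It is not the Gluten code: the `ScopedTimer` class, the `Metrics` struct, the fake clock, and the 30/70 split of work are all assumptions made up for this example. It only shows how a scoped timer that encloses an already-timed section inflates the sum of the two counters, and how starting the outer timer later avoids that.

```cpp
#include <cstdint>

// Hypothetical fake clock: "work" advances it by a fixed amount so the
// example is deterministic. Real code would read a steady clock instead.
static int64_t g_now = 0;
void doWork(int64_t nanos) { g_now += nanos; }

// Simplified scoped timer (an assumption, not Gluten's ScopedTimer): adds the
// time elapsed between construction and destruction to *target.
class ScopedTimer {
 public:
  explicit ScopedTimer(int64_t* target) : target_(target), start_(g_now) {}
  ~ScopedTimer() { *target_ += g_now - start_; }
  ScopedTimer(const ScopedTimer&) = delete;
  ScopedTimer& operator=(const ScopedTimer&) = delete;

 private:
  int64_t* target_;
  int64_t start_;
};

// Hypothetical metrics holder mirroring the "write time + push time" sum.
struct Metrics {
  int64_t writeTime = 0;
  int64_t spillTime = 0;
  int64_t reported() const { return writeTime + spillTime; }
};

// The outer timer starts before the section that is already timed as write
// time, so those 30 units end up in both counters.
Metrics overlappingTimers() {
  Metrics m;
  {
    ScopedTimer spill(&m.spillTime);
    {
      ScopedTimer write(&m.writeTime);
      doWork(30);  // stands in for the work timed as write time
    }
    doWork(70);  // stands in for the push to the remote shuffle service
  }
  return m;
}

// The outer timer is started a few lines later, after the write-timed
// section has finished, so each unit of work is counted once.
Metrics disjointTimers() {
  Metrics m;
  {
    ScopedTimer write(&m.writeTime);
    doWork(30);
  }
  {
    ScopedTimer spill(&m.spillTime);
    doWork(70);
  }
  return m;
}
```

With 100 units of simulated work, `overlappingTimers().reported()` yields 130 (30 + 100), while `disjointTimers().reported()` yields 100 (30 + 70).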
