Deependra Patel created SPARK-44209:
---------------------------------------
Summary: Expose amount of shuffle data available on the node
Key: SPARK-44209
URL: https://issues.apache.org/jira/browse/SPARK-44209
Project: Spark
Issue Type: New Feature
Components: Shuffle
Affects Versions: 3.4.1
Reporter: Deependra Patel
[ShuffleMetrics|https://github.com/apache/spark/blob/43f7a86a05ad8c7ec7060607e43d9ca4d0fe4166/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java#L318]
does not currently expose metrics such as
"totalShuffleDataBytes" and "numAppsWithShuffleData". These metrics would be
reported per node by the External Shuffle Service.
Adding these metrics would help with:
1. Deciding whether a node can be decommissioned because no shuffle data is
present on it
2. Better live monitoring of a customer's workload, to see whether skewed
shuffle data is present on the node
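
To make the proposal concrete, here is a minimal sketch of how the two values could be tracked on a node. It is illustrative only and is not code from Spark: the class NodeShuffleDataTracker and its methods (recordShuffleData, removeApplication, safeToDecommission) are hypothetical names, and the sketch assumes the shuffle service can observe how many shuffle bytes each application has stored. Only the two metric names come from this issue.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper, not part of Spark: tracks shuffle bytes per application
// on a single node and derives the two metrics proposed in this issue.
class NodeShuffleDataTracker {
  // Application id -> shuffle bytes currently held on this node (assumed bookkeeping).
  private final Map<String, AtomicLong> bytesByApp = new ConcurrentHashMap<>();

  // Called when shuffle data for an application is written to this node.
  void recordShuffleData(String appId, long bytes) {
    bytesByApp.computeIfAbsent(appId, k -> new AtomicLong()).addAndGet(bytes);
  }

  // Called when an application's shuffle data is cleaned up.
  void removeApplication(String appId) {
    bytesByApp.remove(appId);
  }

  // Proposed metric: total shuffle bytes present on this node.
  long totalShuffleDataBytes() {
    long total = 0L;
    for (AtomicLong bytes : bytesByApp.values()) {
      total += bytes.get();
    }
    return total;
  }

  // Proposed metric: number of applications with shuffle data on this node.
  int numAppsWithShuffleData() {
    int count = 0;
    for (AtomicLong bytes : bytesByApp.values()) {
      if (bytes.get() > 0) {
        count++;
      }
    }
    return count;
  }

  // Use case 1: the node holds no shuffle data, so removing it loses none.
  boolean safeToDecommission() {
    return totalShuffleDataBytes() == 0;
  }
}
```

In a real change these values would more likely be registered as gauges next to the existing ShuffleMetrics entries. For use case 2, a monitoring system could compare totalShuffleDataBytes across nodes to spot skew.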
--
This message was sent by Atlassian Jira
(v8.20.10#820010)