Deependra Patel created SPARK-44209:
---------------------------------------

             Summary: Expose amount of shuffle data available on the node
                 Key: SPARK-44209
                 URL: https://issues.apache.org/jira/browse/SPARK-44209
             Project: Spark
          Issue Type: New Feature
          Components: Shuffle
    Affects Versions: 3.4.1
            Reporter: Deependra Patel


[ShuffleMetrics|https://github.com/apache/spark/blob/43f7a86a05ad8c7ec7060607e43d9ca4d0fe4166/common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExternalBlockHandler.java#L318]
 does not currently expose metrics such as
"totalShuffleDataBytes" and "numAppsWithShuffleData". These would be per-node
metrics published by the External Shuffle Service.
 
Adding these metrics would help with:
1. Deciding whether a node can be decommissioned, i.e. when no shuffle data is
present on it
2. Better live monitoring of customer workloads, to see whether skewed shuffle
data is present on a node
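
To make the proposal concrete, here is a minimal sketch of the per-node bookkeeping the two metrics imply. It is written in Python for brevity and is not Spark code: the class `ShuffleDataTracker` and all of its methods are hypothetical names, and the real External Shuffle Service might derive these values differently (for example by inspecting registered executors' local directories).

```python
from collections import defaultdict
from typing import Dict


class ShuffleDataTracker:
    """Hypothetical per-node tracker of shuffle bytes per application.

    Illustration only; none of these names exist in Spark.
    """

    def __init__(self) -> None:
        # Maps application id -> shuffle bytes currently held on this node.
        self._bytes_by_app: Dict[str, int] = defaultdict(int)

    def record_shuffle_write(self, app_id: str, num_bytes: int) -> None:
        """Account for shuffle data written on this node for an application."""
        if num_bytes < 0:
            raise ValueError("num_bytes must be non-negative")
        self._bytes_by_app[app_id] += num_bytes

    def remove_application(self, app_id: str) -> None:
        """Forget an application once its shuffle data has been cleaned up."""
        self._bytes_by_app.pop(app_id, None)

    def total_shuffle_data_bytes(self) -> int:
        """Value a "totalShuffleDataBytes" gauge would report."""
        return sum(self._bytes_by_app.values())

    def num_apps_with_shuffle_data(self) -> int:
        """Value a "numAppsWithShuffleData" gauge would report."""
        return sum(1 for b in self._bytes_by_app.values() if b > 0)

    def can_decommission(self) -> bool:
        """Use case 1: the node holds no shuffle data at all."""
        return self.total_shuffle_data_bytes() == 0
```

A monitoring system could then compare total_shuffle_data_bytes() across nodes to spot skew (use case 2), and a cluster manager could poll can_decommission() before removing a node.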



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

