Yongjia Wang created SPARK-10912:
------------------------------------

             Summary: Improve Spark metrics executor.filesystem
                 Key: SPARK-10912
                 URL: https://issues.apache.org/jira/browse/SPARK-10912
             Project: Spark
          Issue Type: Improvement
    Affects Versions: 1.5.0
            Reporter: Yongjia Wang


org.apache.spark.executor.ExecutorSource currently exposes two filesystem metrics: "hdfs" 
and "file". I started using s3 as the persistent storage with a Spark standalone 
cluster on EC2, and s3 read/write metrics do not appear anywhere. The "file" 
metric also appears to cover only the driver reading local files. It would be nice to 
additionally report shuffle read/write metrics, which would help answer questions such 
as whether a Spark job has become IO bound.
I think these two additions (s3 and shuffle) would be very useful and would cover the 
missing information about Spark IO, especially for an s3 setup.
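For context, ExecutorSource registers one gauge per (scheme, statistic) pair; the sketch below is a minimal, self-contained illustration of extending that pattern to an "s3" scheme. FsStatistics, registerFileSystemStat, and the sample byte counts are simplified stand-ins for this sketch, not Spark's or Hadoop's actual classes:

```scala
import scala.collection.mutable

// Simplified stand-in for Hadoop's per-scheme FileSystem statistics
// (bytes read/written per filesystem scheme: hdfs, file, s3, ...).
case class FsStatistics(scheme: String, bytesRead: Long, bytesWritten: Long)

object ExecutorFsMetricsSketch {
  // Gauges are registered by name and evaluated lazily, mirroring the
  // register-a-Gauge-per-metric pattern used in ExecutorSource.
  val gauges = mutable.Map.empty[String, () => Long]

  // Hypothetical sample counters standing in for live filesystem stats.
  def allStatistics: Seq[FsStatistics] = Seq(
    FsStatistics("hdfs", 1024L, 512L),
    FsStatistics("file", 256L, 0L),
    FsStatistics("s3", 4096L, 2048L)
  )

  private def fileStats(scheme: String): Option[FsStatistics] =
    allStatistics.find(_.scheme == scheme)

  // Register one gauge per scheme/stat pair, defaulting to 0 when the
  // scheme has produced no statistics yet.
  def registerFileSystemStat(scheme: String, name: String,
                             f: FsStatistics => Long): Unit =
    gauges(s"filesystem.$scheme.$name") = () => fileStats(scheme).map(f).getOrElse(0L)

  def main(args: Array[String]): Unit = {
    // Adding "s3" to the hard-coded ("hdfs", "file") scheme list is the
    // gist of this issue's request.
    for (scheme <- Seq("hdfs", "file", "s3")) {
      registerFileSystemStat(scheme, "read_bytes", _.bytesRead)
      registerFileSystemStat(scheme, "write_bytes", _.bytesWritten)
    }
    println(gauges("filesystem.s3.read_bytes")())  // prints 4096
  }
}
```

In the real ExecutorSource the gauges would come from Hadoop's live filesystem statistics rather than a fixed sequence; the point is only that the existing registration pattern generalizes to additional schemes with a one-line change.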



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
