linxiaojun created SPARK-21882:
----------------------------------

             Summary: OutputMetrics doesn't count written bytes correctly in 
the saveAsHadoopDataset function
                 Key: SPARK-21882
                 URL: https://issues.apache.org/jira/browse/SPARK-21882
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.2.0, 1.6.1
            Reporter: linxiaojun
            Priority: Minor


The first job launched by saveAsHadoopDataset on each executor does not 
report the bytes written in OutputMetrics correctly. The cause is that the 
callback used to look up bytes written is initialized at the wrong point. 
The statisticsTable, which records statistics in a FileSystem, must be 
initialized first (this happens when SparkHadoopWriter is opened), and the 
callback must be created only after that. The fix is to reorder the 
initialization so that the callback is created after the writer is opened. 
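
The ordering problem can be illustrated with a small standalone model. This 
is only a sketch, not Spark or Hadoop code: StatsRegistry, Writer, 
make_bytes_written_callback and run_task are hypothetical names, and the 
model assumes the callback captures whatever statistics tables exist at the 
moment it is created.

```python
class StatsRegistry:
    """Hypothetical stand-in for the statistics table kept per FileSystem."""

    def __init__(self):
        self._tables = []

    def register(self):
        # Called when a writer is opened: creates a fresh statistics entry.
        table = {"bytes_written": 0}
        self._tables.append(table)
        return table

    def snapshot(self):
        # Returns the entries that exist right now (a copy of the list).
        return list(self._tables)


def make_bytes_written_callback(registry):
    # Assumption: the callback only sees entries present at creation time.
    tables = registry.snapshot()

    def total():
        return sum(t["bytes_written"] for t in tables)

    baseline = total()
    return lambda: total() - baseline


class Writer:
    """Hypothetical writer whose open() registers its statistics entry."""

    def __init__(self, registry):
        self._registry = registry
        self._table = None

    def open(self):
        self._table = self._registry.register()

    def write(self, data):
        self._table["bytes_written"] += len(data)


def run_task(callback_before_open):
    registry = StatsRegistry()
    writer = Writer(registry)
    if callback_before_open:
        # Wrong order: nothing is registered yet, so the callback sees nothing.
        callback = make_bytes_written_callback(registry)
        writer.open()
    else:
        # Adjusted order: open the writer first, then create the callback.
        writer.open()
        callback = make_bytes_written_callback(registry)
    writer.write(b"hello")
    return callback()
```

In this model, run_task(True) reports 0 bytes even though 5 bytes were 
written, while run_task(False) reports 5. Later jobs on the same executor 
would not be affected in the same way if the statistics entry already 
exists, which is consistent with only the first job being miscounted.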



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)