Eric Yang created SPARK-48397:
---------------------------------
Summary: Add data write time metric to FileFormatDataWriter/BasicWriteJobStatsTracker
Key: SPARK-48397
URL: https://issues.apache.org/jira/browse/SPARK-48397
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 4.0.0
Reporter: Eric Yang
For FileFormatDataWriter, we currently record the "task commit time" and
"job commit time" metrics in
`org.apache.spark.sql.execution.datasources.BasicWriteJobStatsTracker#metrics`.
We could also record the time spent on "data write" (including the time spent
producing records from the iterator), which is usually one of the largest
parts of the total duration of a write operation. This metric would help
identify bottlenecks and time skew across tasks, and would support general
performance tuning.
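
As a rough illustration of what "data write time" would cover, here is a minimal, self-contained sketch in plain Scala. It does not use Spark and is not the proposed implementation. `DataWriteTimeSketch`, `CountingWriter`, and `writeWithTiming` are hypothetical names invented for this example. The sketch only shows that the measured interval includes both pulling records from the iterator and handing them to the writer.

```scala
// Hypothetical sketch: none of these names exist in Spark.
object DataWriteTimeSketch {

  /** Stand-in for a per-task data writer; it only counts the rows it receives. */
  final class CountingWriter {
    var rows: Long = 0L
    def write(record: String): Unit = rows += 1
  }

  /**
   * Drains `records` into `writer` and returns the elapsed time in nanoseconds.
   * The interval covers both producing records (hasNext/next) and writing them,
   * so the two costs are reported together as one value.
   * `clock` is injectable so the timing logic can be checked deterministically.
   */
  def writeWithTiming(
      records: Iterator[String],
      writer: CountingWriter,
      clock: () => Long = () => System.nanoTime()): Long = {
    val start = clock()
    while (records.hasNext) {
      writer.write(records.next())
    }
    clock() - start
  }

  def main(args: Array[String]): Unit = {
    val writer = new CountingWriter
    val elapsedNs = writeWithTiming(Iterator("a", "b", "c"), writer)
    println(s"rows=${writer.rows}, dataWriteTimeMs=${elapsedNs / 1000000}")
  }
}
```

In a real change, the per-task value would presumably be reported alongside the existing commit-time metrics, but how that is wired into the stats tracker is left open here.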