LantaoJin opened a new pull request #27911: [SPARK-31154][SQL] Expose basic write metrics for InsertIntoDataSourceCommand
URL: https://github.com/apache/spark/pull/27911

### What changes were proposed in this pull request?

Spark provides the `InsertableRelation` interface and the `InsertIntoDataSourceCommand` command to delegate insert processing to a data source. Unlike `DataWritingCommand`, the metrics map of `InsertIntoDataSourceCommand` is empty and is never updated, so we cannot get "number of written files" or "number of output rows" from its metrics.

For example, if a table is a Spark Parquet table, we can read the write metrics like this:
```scala
val df = sql("INSERT INTO TABLE test_table SELECT 1, 'a'")
val numFiles = df.queryExecution.sparkPlan.metrics("numFiles").value
```
But if it is a Delta table, we cannot.

### Does this PR introduce any user-facing change?

Yes. A new method is added to `InsertableRelation`, while the old one is kept for backward compatibility (see the sketch after this description for one way such an overload could look).

### How was this patch tested?

Added a unit test suite.
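As a rough illustration only, here is a minimal Scala sketch of how a metrics-reporting overload on `InsertableRelation` might look. The trait body mirrors Spark's real `org.apache.spark.sql.sources.InsertableRelation`, but the `insertWithMetrics` name, its `Map[String, Long]` return type, and the metric keys are assumptions of this sketch, not the PR's actual API; the real signature is in the linked PR.

```scala
import org.apache.spark.sql.DataFrame

// Simplified stand-in mirroring Spark's InsertableRelation; not the PR's code.
trait InsertableRelation {
  // Existing method, kept unchanged for backward compatibility.
  def insert(data: DataFrame, overwrite: Boolean): Unit

  // Hypothetical new overload (name, return type, and keys are assumptions):
  // the data source performs the insert and reports basic write metrics,
  // e.g. Map("numFiles" -> 10L, "numOutputRows" -> 100L), which
  // InsertIntoDataSourceCommand could then surface in its own metrics map.
  // A default implementation keeps existing implementations source-compatible.
  def insertWithMetrics(data: DataFrame, overwrite: Boolean): Map[String, Long] = {
    insert(data, overwrite)
    Map.empty
  }
}
```

Under this sketch, a data source such as a Delta table implementation could override `insertWithMetrics` to return the counts it already tracks, so that the `df.queryExecution.sparkPlan.metrics(...)` lookup shown above would work uniformly across data sources.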
