[ 
https://issues.apache.org/jira/browse/SPARK-31154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lantao Jin updated SPARK-31154:
-------------------------------
    Description: 
Spark provides the `InsertableRelation` interface and the 
`InsertIntoDataSourceCommand` to delegate insert processing to a data 
source. Unlike `DataWritingCommand`, the metrics in 
`InsertIntoDataSourceCommand` are empty and are never updated, so we cannot 
get "number of written files" or "number of output rows" from its metrics.

For example, if the table is a Spark Parquet table, we can get the write 
metrics with:
{code}
val df = sql("INSERT INTO TABLE test_table SELECT 1, 'a'")
df.executionP
{code}
But if it is a Delta table, we cannot.
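
The gap described above can be illustrated with a toy model in plain Scala. This is only a sketch: `Row`, `ToyInsertableRelation`, `ToyRelation`, `ToyInsertCommand`, and `ToyMetricsDemo` are hypothetical stand-ins and not Spark classes. The one detail taken from Spark is that `InsertableRelation.insert(data: DataFrame, overwrite: Boolean)` returns `Unit`, so the delegating command receives no statistics it could publish.

```scala
// Toy model only: these types are simplified stand-ins, not Spark's classes.
final case class Row(values: Any*)

// Mirrors the shape of InsertableRelation.insert, which returns Unit.
trait ToyInsertableRelation {
  def insert(data: Seq[Row], overwrite: Boolean): Unit
}

// A data source that knows how many rows it holds but cannot report them back.
final class ToyRelation extends ToyInsertableRelation {
  private var stored: Vector[Row] = Vector.empty

  def rowCount: Int = stored.size

  override def insert(data: Seq[Row], overwrite: Boolean): Unit = {
    stored = if (overwrite) data.toVector else stored ++ data
  }
}

// Stand-in for a command that delegates the write and exposes a metrics map.
final class ToyInsertCommand(relation: ToyInsertableRelation) {
  // Nothing ever populates this map, because insert returns no statistics.
  val metrics: Map[String, Long] = Map.empty

  def run(data: Seq[Row], overwrite: Boolean): Unit =
    relation.insert(data, overwrite)
}

object ToyMetricsDemo {
  // Returns the rows held by the relation and the metrics seen by the command.
  def run(): (Int, Map[String, Long]) = {
    val relation = new ToyRelation
    val command = new ToyInsertCommand(relation)
    command.run(Seq(Row(1, "a")), overwrite = false)
    (relation.rowCount, command.metrics)
  }
}
```

In this model the relation ends up holding the inserted row while the command's metrics map stays empty, which is the same asymmetry the issue asks to close for `InsertIntoDataSourceCommand`.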

  was:Spark provides the `InsertableRelation` interface and the 
`InsertIntoDataSourceCommand` to delegate insert processing to a data 
source. Unlike `DataWritingCommand`, the metrics in 
`InsertIntoDataSourceCommand` are empty and are never updated, so we cannot 
get "number of written files" or "number of output rows" from its metrics.


> Expose basic write metrics for InsertIntoDataSourceCommand
> ----------------------------------------------------------
>
>                 Key: SPARK-31154
>                 URL: https://issues.apache.org/jira/browse/SPARK-31154
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.0.0, 3.1.0
>            Reporter: Lantao Jin
>            Priority: Major
>
> Spark provides the `InsertableRelation` interface and the 
> `InsertIntoDataSourceCommand` to delegate insert processing to a data 
> source. Unlike `DataWritingCommand`, the metrics in 
> `InsertIntoDataSourceCommand` are empty and are never updated, so we 
> cannot get "number of written files" or "number of output rows" from its 
> metrics.
> For example, if the table is a Spark Parquet table, we can get the write 
> metrics with:
> {code}
> val df = sql("INSERT INTO TABLE test_table SELECT 1, 'a'")
> df.executionP
> {code}
> But if it is a Delta table, we cannot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
