kazdy created HUDI-5243:
---------------------------

             Summary: Return num_affected_rows from sql INSERT statement
                 Key: HUDI-5243
                 URL: https://issues.apache.org/jira/browse/HUDI-5243
             Project: Apache Hudi
          Issue Type: Improvement
          Components: spark-sql
            Reporter: kazdy
            Assignee: kazdy
             Fix For: 0.13.0


Currently, when running Spark SQL DML, users who want to check how many rows 
were affected have to look up the commit stats using the Hudi CLI or a stored 
procedure.

We can improve the user experience by returning num_affected_rows after an 
INSERT INTO command, so that SQL users can easily see how many rows were 
inserted without having to inspect the commits themselves.

num_affected_rows can be extracted in the writer itself from commitMetadata.
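
As a rough illustration of the extraction idea, the sketch below sums the insert counts over the per-file write stats of a single commit. It does not use the Hudi API: the dictionary layout ("partitionToWriteStats", "numInserts") is an assumption modelled on the JSON layout of a Hudi commit file, the sample data is hand-written, and the helper name num_affected_rows is hypothetical. The actual change would read the same numbers from the commitMetadata object inside the writer.

```python
from typing import Any, Dict


def num_affected_rows(commit_metadata: Dict[str, Any]) -> int:
    """Sum the inserted-record counts over every write stat in one commit.

    Assumes a dict with a "partitionToWriteStats" key that maps each
    partition path to a list of per-file write stats (assumed layout).
    """
    total = 0
    for write_stats in commit_metadata.get("partitionToWriteStats", {}).values():
        for stat in write_stats:
            # A missing counter is treated as zero inserts for that file.
            total += stat.get("numInserts", 0)
    return total


# Hypothetical, hand-written metadata for a non-partitioned table.
sample_commit = {
    "partitionToWriteStats": {
        "": [
            {"fileId": "file-1", "numWrites": 4, "numInserts": 4, "numUpdateWrites": 0},
            {"fileId": "file-2", "numWrites": 3, "numInserts": 3, "numUpdateWrites": 0},
        ]
    }
}

print(num_affected_rows(sample_commit))  # prints 7
```

Only inserts are counted here because the proposal covers INSERT INTO; other DML would presumably need update or delete counters as well.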

Example:
{code:java}
spark.sql("""
create table test_mor (id int, name string) 
using hudi 
tblproperties (primaryKey = 'id', type='mor');
""")

spark.sql(
"""
INSERT INTO test_mor
VALUES 
(1, "a"),
(2, "b"),
(3, "c"),
(4, "d"),
(5, "e"),
(6, "f"),
(7, "g")
""").show()

// returns:
// +-----------------+
// |num_affected_rows|
// +-----------------+
// |                7|
// +-----------------+
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
