kazdy created HUDI-5243:
---------------------------
Summary: Return num_affected_rows from sql INSERT statement
Key: HUDI-5243
URL: https://issues.apache.org/jira/browse/HUDI-5243
Project: Apache Hudi
Issue Type: Improvement
Components: spark-sql
Reporter: kazdy
Assignee: kazdy
Fix For: 0.13.0
Currently, when running Spark SQL DML, users who want to check how many rows
were affected have to look up the commit stats using the Hudi CLI or a stored
procedure.
We can improve the user experience by returning num_affected_rows after an
INSERT INTO command, so that SQL users can easily see how many rows were
inserted without having to inspect the commits themselves.
num_affected_rows can be extracted in the writer itself from commitMetadata.
Example:
{code:java}
spark.sql("""
create table test_mor (id int, name string)
using hudi
tblproperties (primaryKey = 'id', type='mor');
""")
spark.sql(
"""
INSERT INTO test_mor
VALUES
(1, "a"),
(2, "b"),
(3, "c"),
(4, "d"),
(5, "e"),
(6, "f"),
(7, "g")
""").show()
{code}
Returns:
{code}
+-----------------+
|num_affected_rows|
+-----------------+
| 7|
+-----------------+
{code}
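As a rough illustration of extracting the count from commit metadata, the sketch below sums the per-file insert counts found in a commit's JSON. It is not the proposed Hudi implementation: the sample JSON is a hypothetical, trimmed-down commit file, the helper name num_affected_rows is made up for this example, and treating the sum of numInserts as the affected-row count for an INSERT is an assumption.

```python
import json

# Hypothetical, trimmed-down commit metadata for illustration only.
# Real commit files carry many more fields per write stat.
SAMPLE_COMMIT_METADATA = """
{
  "partitionToWriteStats": {
    "": [
      {"fileId": "file-1", "numWrites": 4, "numInserts": 4, "numUpdateWrites": 0, "numDeletes": 0},
      {"fileId": "file-2", "numWrites": 3, "numInserts": 3, "numUpdateWrites": 0, "numDeletes": 0}
    ]
  },
  "operationType": "INSERT"
}
"""


def num_affected_rows(commit_metadata_json: str) -> int:
    """Sum numInserts over every write stat in every partition.

    Assumption: for an INSERT, the affected-row count equals the total
    number of inserted records reported in the commit's write stats.
    """
    metadata = json.loads(commit_metadata_json)
    stats_by_partition = metadata.get("partitionToWriteStats", {})
    return sum(
        stat.get("numInserts", 0)
        for stats in stats_by_partition.values()
        for stat in stats
    )


print(num_affected_rows(SAMPLE_COMMIT_METADATA))
```

For the sample above, the two write stats report 4 and 3 inserts, which matches the 7 rows inserted in the example. Inside the writer, the same totals would presumably come from the in-memory commit metadata object rather than from parsed JSON.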