zhang haoyan created SPARK-42595:
------------------------------------

             Summary: Support query inserted partitions after insert data into 
table when hive.exec.dynamic.partition=true
                 Key: SPARK-42595
                 URL: https://issues.apache.org/jira/browse/SPARK-42595
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: zhang haoyan


When hive.exec.dynamic.partition=true and 
hive.exec.dynamic.partition.mode=nonstrict, we can insert into a table with SQL 
such as 'insert overwrite table aaa partition(dt) select xxxx'. For a single 
statement we can work out the inserted partitions from the SQL itself, but 
general-purpose tooling needs a common way to get the inserted partitions. 
For example:

    spark.sql("insert overwrite table aaa partition(dt) select xxxx")  // insert into the table

    val partitions = getInsertedPartitions()  // need some way to get the inserted partitions

    monitorInsertedPartitions(partitions)  // do something for common use

Since an insert statement should not return any data, this ticket proposes to 
introduce two settings:

    spark.hive.exec.dynamic.partition.savePartitions=true (default false)
    spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix=hive_dynamic_inserted_partitions

When spark.hive.exec.dynamic.partition.savePartitions=true, we save the 
inserted partitions to the temporary view 

$spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix_$dbName_$tableName
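
As a rough illustration of the naming scheme above, the view name could be 
derived as in the sketch below. The object and method names are hypothetical 
and not part of Spark; the only things taken from this ticket are the proposed 
prefix value and the prefix_dbName_tableName layout.

```scala
// Hypothetical helper, not an existing Spark API: builds the name of the
// temporary view proposed in this ticket.
object InsertedPartitionsView {
  // Value proposed for
  // spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix
  val DefaultPrefix: String = "hive_dynamic_inserted_partitions"

  // Joins prefix, database name and table name with underscores.
  def viewName(dbName: String,
               tableName: String,
               prefix: String = DefaultPrefix): String =
    s"${prefix}_${dbName}_${tableName}"
}
```

For db1.test_partition_table this yields 
hive_dynamic_inserted_partitions_db1_test_partition_table.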

This will allow users to do the following:

scala> spark.conf.set("hive.exec.dynamic.partition", true)

scala> spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")

scala> spark.conf.set("spark.hive.exec.dynamic.partition.savePartitions", true)

scala> spark.sql("insert overwrite table db1.test_partition_table partition (dt) select 1, '2023-02-22'").show(false)

++                                                                              

||

++

++

scala> spark.sql("select * from hive_dynamic_inserted_partitions_db1_test_partition_table").show(false)

+----------+                                                                    

|dt        |

+----------+

|2023-02-22|

+----------+
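
For the monitoring use case mentioned earlier, the rows of such a view could 
be turned into Hive-style partition strings. The sketch below is only an 
illustration: PartitionSpecs and toSpec are hypothetical names, and the 
commented usage assumes the view proposed in this ticket exists.

```scala
// Hypothetical helper, not an existing Spark API.
object PartitionSpecs {
  // Renders one row of the proposed view as a Hive-style partition string:
  // each column is paired with its value as "col=value", joined with "/".
  def toSpec(columns: Seq[String], values: Seq[Any]): String = {
    require(columns.length == values.length, "one value per partition column")
    columns.zip(values).map { case (c, v) => s"$c=$v" }.mkString("/")
  }
}

// With a SparkSession named `spark` in scope, and assuming the proposed view
// exists, the rows could be fed in like this:
//   val df = spark.table("hive_dynamic_inserted_partitions_db1_test_partition_table")
//   val specs = df.collect().map(row => PartitionSpecs.toSpec(df.columns.toSeq, row.toSeq))
```

The helper is kept free of Spark types so that it can be checked without a 
cluster.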



