[
https://issues.apache.org/jira/browse/SPARK-42595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-42595:
------------------------------------
Assignee: Apache Spark
> Support query inserted partitions after insert data into table when
> hive.exec.dynamic.partition=true
> ----------------------------------------------------------------------------------------------------
>
> Key: SPARK-42595
> URL: https://issues.apache.org/jira/browse/SPARK-42595
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.5.0
> Reporter: zhang haoyan
> Assignee: Apache Spark
> Priority: Major
>
> When hive.exec.dynamic.partition=true and
> hive.exec.dynamic.partition.mode=nonstrict, we can insert into a table with
> SQL like 'insert overwrite table aaa partition(dt) select xxxx'. For a single
> statement we can work out the inserted partitions from the SQL itself, but
> for general-purpose tooling we need a common way to get the inserted
> partitions, for example:
> spark.sql("insert overwrite table aaa partition(dt) select xxxx") // insert into the table
> val partitions = getInsertedPartitions() // need some way to get the inserted partitions
> monitorInsertedPartitions(partitions) // do something for common use
> Since an insert statement should not return any data, this ticket proposes to
> introduce two configuration options:
> spark.hive.exec.dynamic.partition.savePartitions=true (default false)
> spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix=hive_dynamic_inserted_partitions
> When spark.hive.exec.dynamic.partition.savePartitions=true, the inserted
> partitions are saved to the temporary view
> $spark.hive.exec.dynamic.partition.savePartitions.tableNamePrefix_$dbName_$tableName
> This will allow the user to do the following:
> scala> spark.conf.set("hive.exec.dynamic.partition", true)
> scala> spark.conf.set("hive.exec.dynamic.partition.mode", "nonstrict")
> scala> spark.conf.set("spark.hive.exec.dynamic.partition.savePartitions", true)
> scala> spark.sql("insert overwrite table db1.test_partition_table partition (dt) select 1, '2023-02-22'").show(false)
> ++
> ||
> ++
> ++
> scala> spark.sql("select * from hive_dynamic_inserted_partitions_db1_test_partition_table").show(false)
> +----------+
> |dt        |
> +----------+
> |2023-02-22|
> +----------+
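The view naming scheme described in the ticket can be illustrated with a small sketch. This is only an assumption about how a caller might build the view name on their side: `InsertedPartitionsView` and `viewName` are hypothetical names, not part of Spark, and the proposed `savePartitions` settings are not an existing Spark feature.

```scala
// Hypothetical helper (not part of Spark): builds the temporary view name
// following the scheme proposed in this ticket,
// <tableNamePrefix>_<dbName>_<tableName>.
object InsertedPartitionsView {
  // Prefix value shown in the ticket description.
  val DefaultPrefix: String = "hive_dynamic_inserted_partitions"

  def viewName(
      dbName: String,
      tableName: String,
      prefix: String = DefaultPrefix): String =
    s"${prefix}_${dbName}_${tableName}"
}

// Usage, assuming the proposed feature were available and `spark` is an
// active SparkSession:
//   val partitions = spark.table(
//     InsertedPartitionsView.viewName("db1", "test_partition_table"))
//   monitorInsertedPartitions(partitions)
```

Keeping the name construction in one place would let monitoring code look up the view for any database and table without hard-coding the prefix.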
--
This message was sent by Atlassian Jira
(v8.20.10#820010)