[
https://issues.apache.org/jira/browse/HUDI-8911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
voon updated HUDI-8911:
-----------------------
Fix Version/s: 1.1.0
(was: 1.0.2)
> Support INSERT SQL statement with a subset of columns in Spark 3.4
> ------------------------------------------------------------------
>
> Key: HUDI-8911
> URL: https://issues.apache.org/jira/browse/HUDI-8911
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Y Ethan Guo
> Priority: Critical
> Fix For: 1.1.0
>
>
> The new tests `Test Insert Into with subset of columns` and
> `Test Insert Into with subset of columns on Parquet table` in `TestInsertTable`
> fail on Spark 3.4 because of the validation added to
> HoodieSpark34CatalystPlanUtils in [https://github.com/apache/hudi/pull/11568].
> Before that change, INSERT INTO with a subset of columns worked.
>
> {code:java}
> override def unapplyInsertIntoStatement(plan: LogicalPlan):
>     Option[(LogicalPlan, Seq[String], Map[String, Option[String]], LogicalPlan, Boolean, Boolean)] = {
>   plan match {
>     case insert: InsertIntoStatement =>
>       // https://github.com/apache/spark/pull/36077
>       // first: in this pr, spark34 support default value for insert into, it will regenerate the user specified cols
>       //        so, no need deal with it in hudi side
>       // second: in this pr, it will append hoodie meta field with default value, has some bug, it look like be fixed
>       //         in spark35(https://github.com/apache/spark/pull/41262), so if user want specified cols, need disable default feature.
>       if (SQLConf.get.enableDefaultColumns) {
>         if (insert.userSpecifiedCols.nonEmpty) {
>           throw new AnalysisException("hudi not support specified cols when enable default columns, " +
>             "please disable 'spark.sql.defaultColumn.enabled'")
>         }
>         Some((insert.table, Seq.empty, insert.partitionSpec, insert.query,
>           insert.overwrite, insert.ifPartitionNotExists))
>       } else {
>         Some((insert.table, insert.userSpecifiedCols, insert.partitionSpec, insert.query,
>           insert.overwrite, insert.ifPartitionNotExists))
>       }
>     case _ =>
>       None
>   }
> }
> {code}
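>
> Going by the error message in the snippet above, a possible interim workaround
> is to turn off the default-column feature for the session before running the
> INSERT. This is only a sketch: the table `hudi_tbl` and its columns are
> hypothetical, and it has not been confirmed here that the insert then succeeds
> end to end on Spark 3.4.
>
> {code:sql}
> -- Assumption: with this flag off, the Spark 3.4 code path above takes the
> -- else branch and passes the user-specified columns through unchanged.
> SET spark.sql.defaultColumn.enabled=false;
>
> -- Hypothetical table and column names, for illustration only.
> INSERT INTO hudi_tbl (id, name) VALUES (1, 'a1');
> {code}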
>
>