[
https://issues.apache.org/jira/browse/HUDI-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Danny Chen closed HUDI-5275.
----------------------------
Resolution: Fixed
Fixed via master branch: f750773109abd78f5f1b41bb31b27711a7201126
> Reading data using the HoodieHiveCatalog will cause the Spark write to fail
> ---------------------------------------------------------------------------
>
> Key: HUDI-5275
> URL: https://issues.apache.org/jira/browse/HUDI-5275
> Project: Apache Hudi
> Issue Type: Bug
> Components: flink-sql, spark-sql
> Affects Versions: 0.12.1
> Reporter: waywtdcc
> Assignee: Danny Chen
> Priority: Blocker
> Labels: pull-request-available
> Fix For: 0.13.0
>
>
> After Spark creates a table and writes data, and Flink reads that table through the HoodieHiveCatalog, the next Spark write to the same table fails with a config-conflict error.
>
> The steps are as follows:
> 1. Spark creates the table and writes data:
> {code:java}
> create table test.test_hudi_cc63_no_partition (
> `id` bigint,
> `name` string,
> ts bigint
> ) using hudi
> tblproperties (
> type = 'cow',
> primaryKey = 'id' ,
> preCombineField = 'ts');
> insert into test.test_hudi_cc63_no_partition
> values
> (112, 'cc12', 1231); {code}
> 2. Flink reads the table:
>
> {code:java}
> CREATE CATALOG myhudi WITH(
> 'type' = 'hudi',
> 'default-database' = 'default',
> 'catalog.path' = '/user/hdpu/warehouse',
> 'mode' = 'hms',
> 'hive.conf.dir' = 'hdfs:///user/hdpu/streamx/conf_data/hive_conf'
> );
> select *
> from myhudi.test.test_hudi_cc63_no_partition; {code}
> 3. Spark writes again:
> {code:java}
> insert into test.test_hudi_cc63_no_partition values (11222, 'cc122', 12312);
> {code}
> This reports the following error:
> {code:java}
> org.apache.hudi.exception.HoodieException: Config conflict(key current value existing value):
> hoodie.datasource.write.hive_style_partitioning: false true
>   at org.apache.hudi.HoodieWriterUtils$.validateTableConfig(HoodieWriterUtils.scala:167)
>   at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:88)
>   at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand$.run(InsertIntoHoodieTableCommand.scala:101)
>   at org.apache.spark.sql.hudi.command.InsertIntoHoodieTableCommand.run(InsertIntoHoodieTableCommand.scala:60)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
>   at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
>   at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:110)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
>   at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
>   at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
>   at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
>   at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:110)
>   at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:106)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:481)
>   at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:82)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:481)
>   at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:30)
>   at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
>   at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
>   at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
>   at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:30)
>   at org.apache.spark.sql.catalyst.trees.TreeNode.transformDo {code}
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)