wangjunjie-lnnf opened a new issue, #11738:
URL: https://github.com/apache/hudi/issues/11738
We have many Hudi tables at version 0.6.0 and want to upgrade them to 0.14.1 or 0.15.0, so we ran some tests. When we write to a 0.6.0 table with a 0.15.0 client, the following error occurs.
**To Reproduce**
Steps to reproduce the behavior:
1. create table with 0.6.0
2. write table with 0.15.0
**Expected behavior**
The write succeeds and the table is upgraded to the new version.
**Environment Description**
* Hudi version : 0.6.0 & 0.15.0
* Spark version :
* Hive version :
* Hadoop version :
* Storage (HDFS/S3/GCS..) :
* Running on Docker? (yes/no) :
**Stacktrace**
```scala
Exception in thread "main" org.apache.hudi.exception.HoodieException: Config conflict(key	current value	existing value):
RecordKey:	id	null
	at org.apache.hudi.HoodieWriterUtils$.validateTableConfig(HoodieWriterUtils.scala:229)
	at org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:232)
	at org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:187)
	at org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:125)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:168)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:47)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:73)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:84)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:97)
```
The reason is that tables created with Hudi 0.6.0 do not record the field info in `hoodie.properties`:
```
hoodie.table.precombine.field=x
hoodie.table.partition.fields=y
hoodie.table.recordkey.fields=x
```
But the 0.15.0 client validates the write options against these table properties, and the error occurs in that validation. The validation should be skipped when the current table version is so old that the required entries are missing from `hoodie.properties`.
```scala
object HoodieSparkSqlWriter {
  private def writeInternal(sqlContext: SQLContext,
                            mode: SaveMode,
                            optParams: Map[String, String],
                            ...) {
    var tableConfig = getHoodieTableConfig(sparkContext, path, mode, ...)
    // Validate that the write options are consistent with hoodie.properties.
    // hoodie.properties written by older versions does not record the field
    // info, so this validation fails for 0.6.0 tables.
    validateTableConfig(sqlContext.sparkSession, optParams, tableConfig,
      mode == SaveMode.Overwrite)
  }
}
```
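To illustrate the failure mode, here is a simplified Python sketch of such a config-conflict check. This is not Hudi's actual `validateTableConfig` implementation; the `skip_missing` flag is a hypothetical version of the proposed fix (skip the check when the table config predates the property):

```python
def validate_table_config(opt_params, table_config, skip_missing=False):
    """Simplified sketch of a config-conflict check (not Hudi's real code).

    opt_params:   options passed by the writer (e.g. the 0.15.0 client)
    table_config: entries read from hoodie.properties; the record-key entry
                  is absent for tables created by Hudi 0.6.0
    skip_missing: hypothetical flag -- trust the write options when the
                  table config has no recorded value, instead of failing
    """
    checks = [
        ("hoodie.datasource.write.recordkey.field",
         "hoodie.table.recordkey.fields", "RecordKey"),
    ]
    conflicts = []
    for opt_key, table_key, label in checks:
        current = opt_params.get(opt_key)
        existing = table_config.get(table_key)
        if existing is None and skip_missing:
            continue  # old table: nothing recorded, accept the write options
        if current != existing:
            conflicts.append(f"{label}:\t{current}\t{existing}")
    if conflicts:
        raise RuntimeError(
            "Config conflict(key\tcurrent value\texisting value):\n"
            + "\n".join(conflicts))

# A 0.6.0 table has no record-key entry, so the strict check reports
# "RecordKey: id None" -- the same shape as the stacktrace above.
opts = {"hoodie.datasource.write.recordkey.field": "id"}
old_table = {}  # field info missing from hoodie.properties
try:
    validate_table_config(opts, old_table)
    strict_failed = False
except RuntimeError:
    strict_failed = True
validate_table_config(opts, old_table, skip_missing=True)  # no error
print(strict_failed)
```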
When we manually filled in the field info in `hoodie.properties` of the 0.6.0 table, the upgrade succeeded.
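For reference, that manual workaround can be scripted roughly like this. This is a hedged sketch: the table base path and the pre-existing property content are placeholders, the field values must match what the table was actually written with (x/y as in this issue), and `hoodie.properties` should be backed up before editing:

```python
import os
import tempfile

# Placeholder for the real Hudi table base path (hypothetical in this sketch).
table_path = tempfile.mkdtemp()
hoodie_dir = os.path.join(table_path, ".hoodie")
os.makedirs(hoodie_dir, exist_ok=True)
prop_file = os.path.join(hoodie_dir, "hoodie.properties")

# Simulate a 0.6.0 hoodie.properties that lacks the field entries.
with open(prop_file, "w") as f:
    f.write("hoodie.table.name=example_table\n")

# Append the field info the 0.15.0 validation expects. The values here
# mirror the ones from this issue and must match the table's real keys.
missing_props = {
    "hoodie.table.recordkey.fields": "x",
    "hoodie.table.precombine.field": "x",
    "hoodie.table.partition.fields": "y",
}
with open(prop_file, "a") as f:
    for key, value in missing_props.items():
        f.write(f"{key}={value}\n")

print(open(prop_file).read())
```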
<b>By the way: is it safe to upgrade directly from 0.6.0 to 0.14.1 or 0.15.0?</b>
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]