rahil-c commented on code in PR #17768:
URL: https://github.com/apache/hudi/pull/17768#discussion_r2657974678
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala:
##########
@@ -516,9 +516,6 @@ class HoodieSparkSqlWriterInternal {
// scalastyle:on
val writeConfig = client.getConfig
- if (writeConfig.getRecordMerger.getRecordType ==
HoodieRecordType.SPARK && tableType == MERGE_ON_READ &&
writeConfig.getLogDataBlockFormat.orElse(HoodieLogBlockType.AVRO_DATA_BLOCK) !=
HoodieLogBlockType.PARQUET_DATA_BLOCK) {
Review Comment:
Originally when I was running tests for MOR tables I ran into the following
exception when the condition was present.
```
java.lang.UnsupportedOperationException:
org.apache.hudi.DefaultSparkRecordMerger only support parquet log.
at
org.apache.hudi.HoodieSparkSqlWriterInternal.writeInternal(HoodieSparkSqlWriter.scala:513)
at
org.apache.hudi.HoodieSparkSqlWriterInternal.$anonfun$write$1(HoodieSparkSqlWriter.scala:193)
at
org.apache.hudi.HoodieSparkSqlWriterInternal.write(HoodieSparkSqlWriter.scala:211)
at
org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:133)
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:171)
at
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
at
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:75)
```
My assumption for why this condition in hudi is that we did not implement
the `getAvroBytes` originally in the `HoodieSparkRecord` as i see the following
comment here saying that only parquet log is supported
https://github.com/apache/hudi/blob/master/hudi-common/src/main/java/org/apache/hudi/common/table/log/block/HoodieAvroDataBlock.java#L121
https://github.com/apache/hudi/blob/master/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/common/model/HoodieSparkRecord.java
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]