Tao Meng created HUDI-3096:
------------------------------
Summary: Fix the bug that a COW table (containing DecimalType)
written by Flink cannot be read by Spark
Key: HUDI-3096
URL: https://issues.apache.org/jira/browse/HUDI-3096
Project: Apache Hudi
Issue Type: Bug
Components: Flink Integration
Affects Versions: 0.10.0
Environment: flink 1.13.1
spark 3.1.1
Reporter: Tao Meng
Fix For: 0.11.0

Currently, Flink writes DecimalType values as byte[]. When Spark reads
such a decimal column, if the precision of the decimal is small, Spark
expects it to be stored as an int/long rather than as bytes, which
causes the following error:
Caused by: org.apache.spark.sql.execution.QueryExecutionException: Parquet column cannot be converted in file hdfs://xxxxx/tmp/hudi/hudi_xxxxx/46d44c57-aa43-41e2-a8aa-76dcc9dac7e4_0-4-0_20211221201230.parquet. Column: [c7], Expected: decimal(10,4), Found: BINARY
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:179)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
        at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:517)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
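
For reference, the Parquet format spec allows a decimal to be stored as
INT32 when its precision is at most 9 and as INT64 when it is at most 18,
and Spark's vectorized reader assumes that mapping. The sketch below only
illustrates that spec rule; the helper chooseDecimalPhysicalType is
hypothetical and is not the actual Hudi fix:

import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName;

public final class DecimalPhysicalTypeSketch {
  // Per the Parquet format spec, a decimal may be stored as INT32 when its
  // precision fits in 9 digits, as INT64 up to 18 digits, and as
  // FIXED_LEN_BYTE_ARRAY (or BINARY) beyond that. Spark's vectorized reader
  // assumes this mapping, so a decimal(10,4) column is expected as INT64;
  // writing every decimal as bytes produces the mismatch shown above.
  static PrimitiveTypeName chooseDecimalPhysicalType(int precision) {
    if (precision <= 9) {
      return PrimitiveTypeName.INT32;   // unscaled value fits in an int
    } else if (precision <= 18) {
      return PrimitiveTypeName.INT64;   // unscaled value fits in a long
    } else {
      return PrimitiveTypeName.FIXED_LEN_BYTE_ARRAY; // larger decimals stay as bytes
    }
  }
}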