Tao Meng created HUDI-3096:
------------------------------
Summary: Fix the bug that a COW table (containing DecimalType)
written by Flink cannot be read by Spark
Key: HUDI-3096
URL: https://issues.apache.org/jira/browse/HUDI-3096
Project: Apache Hudi
Issue Type: Bug
Components: Flink Integration
Affects Versions: 0.10.0
Environment: flink 1.13.1
spark 3.1.1
Reporter: Tao Meng
Fix For: 0.11.0

Currently, Flink writes DecimalType values as byte[]. When Spark reads
such a decimal column, if the precision of the decimal is small, Spark
expects it to be stored as an int/long rather than as bytes, which
causes the following error:
Caused by: org.apache.spark.sql.execution.QueryExecutionException: Parquet column cannot be converted in file hdfs://xxxxx/tmp/hudi/hudi_xxxxx/46d44c57-aa43-41e2-a8aa-76dcc9dac7e4_0-4-0_20211221201230.parquet. Column: [c7], Expected: decimal(10,4), Found: BINARY
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:179)
        at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:93)
        at org.apache.spark.sql.execution.FileSourceScanExec$$anon$1.hasNext(DataSourceScanExec.scala:517)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
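
For reference, the Parquet format spec allows a decimal to be stored as
INT32 when its precision is at most 9 and as INT64 when it is at most 18,
and Spark's vectorized reader assumes that mapping. The sketch below only
illustrates that spec rule; the helper chooseDecimalPhysicalType is
hypothetical and is not the actual Hudi fix:

import org.apache.parquet.schema.PrimitiveType.PrimitiveTypeName;

public final class DecimalPhysicalTypeSketch {
  // Per the Parquet format spec, a decimal may be stored as INT32 when its
  // precision fits in 9 digits, as INT64 up to 18 digits, and as
  // FIXED_LEN_BYTE_ARRAY (or BINARY) beyond that. Spark's vectorized reader
  // assumes this mapping, so a decimal(10,4) column is expected as INT64;
  // writing every decimal as bytes produces the mismatch shown above.
  static PrimitiveTypeName chooseDecimalPhysicalType(int precision) {
    if (precision <= 9) {
      return PrimitiveTypeName.INT32;   // unscaled value fits in an int
    } else if (precision <= 18) {
      return PrimitiveTypeName.INT64;   // unscaled value fits in a long
    } else {
      return PrimitiveTypeName.FIXED_LEN_BYTE_ARRAY; // larger decimals stay as bytes
    }
  }
}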