Sagar Sumit created HUDI-2526:
---------------------------------
Summary: Make spark.sql.parquet.writeLegacyFormat configurable
Key: HUDI-2526
URL: https://issues.apache.org/jira/browse/HUDI-2526
Project: Apache Hudi
Issue Type: Improvement
Reporter: Sagar Sumit
Assignee: sivabalan narayanan
From the community,
"I am observing that Hudi bulk insert in version 0.9.0 is not honoring the
spark.sql.parquet.writeLegacyFormat=true
config. Can you suggest a way to set this config?
Reason to use this config:
The current bulk insert path uses the Spark DataFrame writer and does not do Avro conversion.
The decimal columns in my DataFrame are written as INT32 in Parquet.
The upsert path, which does use Avro conversion, generates fixed-length
byte arrays for decimal types, and the two representations fail with a datatype mismatch."
The root cause is that the [config is
hardcoded|https://github.com/apache/hudi/blob/46808dcb1fe22491326a9e831dd4dde4c70796fb/hudi-client/hudi-spark-client/src/main/java/org/apache/hudi/io/storage/row/HoodieRowParquetWriteSupport.java#L48] in HoodieRowParquetWriteSupport, so the user-supplied Spark session value is ignored. We should make it configurable.
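A minimal sketch of the intended fix: instead of hardcoding the legacy-format flag, resolve it from the write configuration with the current behavior ("false") as the default. The key name hoodie.parquet.writelegacyformat.enabled and the helper below are illustrative assumptions, not the final API; a plain Map stands in for the Hadoop Configuration object that HoodieRowParquetWriteSupport actually receives.

```java
import java.util.Map;

public class WriteSupportConfigSketch {

    // Hypothetical config key; the name ultimately chosen by the project may differ.
    static final String WRITE_LEGACY_FORMAT_KEY = "hoodie.parquet.writelegacyformat.enabled";

    // Matches today's hardcoded behavior so existing pipelines are unaffected.
    static final String WRITE_LEGACY_FORMAT_DEFAULT = "false";

    // Resolve the flag from the supplied configuration instead of hardcoding it,
    // falling back to the old default when the user has not set anything.
    static boolean resolveWriteLegacyFormat(Map<String, String> conf) {
        return Boolean.parseBoolean(
            conf.getOrDefault(WRITE_LEGACY_FORMAT_KEY, WRITE_LEGACY_FORMAT_DEFAULT));
    }

    public static void main(String[] args) {
        // User opts in to legacy (fixed-length byte array) decimal encoding.
        System.out.println(resolveWriteLegacyFormat(
            Map.of(WRITE_LEGACY_FORMAT_KEY, "true")));
        // Unset: keeps the current default behavior.
        System.out.println(resolveWriteLegacyFormat(Map.of()));
    }
}
```

With such a flag, bulk insert could write decimals in the same legacy Parquet representation that the Avro-based upsert path produces, avoiding the datatype mismatch reported above.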
--
This message was sent by Atlassian Jira
(v8.3.4#803005)