[
https://issues.apache.org/jira/browse/HUDI-1181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Wenning Ding reassigned HUDI-1181:
----------------------------------
Assignee: Wenning Ding
> Decimal type display issue for record key field
> -----------------------------------------------
>
> Key: HUDI-1181
> URL: https://issues.apache.org/jira/browse/HUDI-1181
> Project: Apache Hudi
> Issue Type: Bug
> Reporter: Wenning Ding
> Assignee: Wenning Ding
> Priority: Major
> Labels: pull-request-available
>
> When a *fixed_len_byte_array* decimal column is used as the Hudi record key,
> Hudi does not display the decimal value correctly; it displays the raw byte
> array instead.
> During the write phase, Hudi converts the Parquet source data into Avro
> GenericRecords. For example, suppose the source Parquet data has a column of
> decimal type:
>
> {code:java}
> optional fixed_len_byte_array(16) OBJ_ID (DECIMAL(38,0));{code}
>
> Hudi then converts it into the following Avro decimal type:
> {code:java}
> {
>   "name" : "OBJ_ID",
>   "type" : [ {
>     "type" : "fixed",
>     "name" : "fixed",
>     "namespace" : "hoodie.hudi_ln.hudi_ln_record.OBJ_ID",
>     "size" : 16,
>     "logicalType" : "decimal",
>     "precision" : 38,
>     "scale" : 0
>   }, "null" ]
> }
> {code}
> This decimal field is stored as a fixed-length byte array. In the read phase,
> Hudi converts this byte array back to a readable decimal value through this
> [converter|https://github.com/apache/hudi/blob/master/hudi-spark/src/main/scala/org/apache/hudi/AvroConversionHelper.scala#L58].
> The problem is that when a decimal column is set as the record key, Hudi
> reads the value from the Avro GenericRecord and converts it directly to a
> String (see
> [here|https://github.com/apache/hudi/blob/master/hudi-spark/src/main/java/org/apache/hudi/DataSourceUtils.java#L76]).
> As a result, the _hoodie_record_key field shows something like:
> LN_LQDN_OBJ_ID:[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 40, 95, -71].
> So we need to handle this special case by converting the byte array back to
> a decimal before converting it to a String.
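
A minimal sketch of the conversion described above, not the actual patch. It assumes the fixed bytes hold the unscaled value in big-endian two's complement, as the Avro decimal logical type specifies, and that the scale is taken from the field's schema. The class and method names (DecimalKeySketch, fixedToDecimal) are hypothetical and are not Hudi APIs.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Hudi.
final class DecimalKeySketch {

    // Interprets the bytes as a big-endian two's-complement unscaled value
    // and applies the scale from the decimal logical type.
    static BigDecimal fixedToDecimal(byte[] fixedBytes, int scale) {
        return new BigDecimal(new BigInteger(fixedBytes), scale);
    }

    public static void main(String[] args) {
        // The 16 bytes from the _hoodie_record_key shown above, DECIMAL(38,0).
        byte[] raw = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 25, 40, 95, -71};

        // Converting the bytes directly to a String reproduces the reported output.
        System.out.println(Arrays.toString(raw));

        // Decoding first yields the readable decimal value.
        System.out.println(fixedToDecimal(raw, 0).toPlainString()); // 422076345
    }
}
```

With the bytes from the example key, the decoded value is 422076345, which is what _hoodie_record_key would be expected to contain. Avro also ships org.apache.avro.Conversions.DecimalConversion for this purpose; whether the fix should use it or a helper like the one above is left open here.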