[
https://issues.apache.org/jira/browse/AVRO-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194526#comment-16194526
]
Zoltan Ivanfi commented on AVRO-2088:
-------------------------------------
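This seems to be a bug in DataStage. According to the
[specification|https://avro.apache.org/docs/1.8.1/spec.html#Decimal], a decimal
"must contain the two's-complement representation of the unscaled integer value
in big-endian byte order". With scale 2, the unscaled value of 3.12 is 312, which
is 0x0138, so the compliant bytes are 0x01 0x38. The JSON encoding renders these
as "\u00018" (0x38 is the ASCII character "8"), which is what sqoop and hive
produce. Storing 3.12 as the text "3.12" is not compliant.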
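
To make the encoding rule above concrete, here is a minimal Python sketch of it. This is not the Avro library's API: the helper names encode_decimal and decode_decimal are made up for illustration, and the sketch assumes the value fits the schema's scale exactly. It also shows how the JSON encoding, which maps each byte to the code point with the same value, arrives at "\u00018".

```python
import json
from decimal import Decimal


def encode_decimal(value: Decimal, scale: int) -> bytes:
    """Hypothetical helper: big-endian two's-complement bytes of the unscaled value."""
    scaled = value.scaleb(scale)  # 3.12 with scale 2 -> 312
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more fractional digits than the scale allows")
    unscaled = int(scaled)
    # Enough bytes for the magnitude plus one sign bit (not always the shortest form).
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Hypothetical helper: inverse of encode_decimal."""
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


raw = encode_decimal(Decimal("3.12"), scale=2)
print(raw.hex())  # 0138

# Avro's JSON encoding writes bytes as a string of code points 0-255,
# which corresponds to decoding the bytes as ISO-8859-1.
print(json.dumps({"col1": {"bytes": raw.decode("latin-1")}}))

# Reading the text "3.12" as if it were a compliant decimal gives a different number.
print(decode_decimal(b"3.12", scale=2))
```

A reader that follows the specification would therefore not recover 3.12 from the DataStage output, because it interprets the four ASCII bytes of "3.12" as one large unscaled integer.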
> Decimal logicalType values serialized in hexadecimal vs decimal
> ---------------------------------------------------------------
>
> Key: AVRO-2088
> URL: https://issues.apache.org/jira/browse/AVRO-2088
> Project: Avro
> Issue Type: Task
> Reporter: liviu
>
> We use this schema for the Avro file:
> {code:java}
> "name":"col1",
> "type":["null",
>   {
>     "type":"bytes",
>     "logicalType":"decimal",
>     "precision":19,
>     "scale":2
>   }
> ]
> {code}
> - If we save data to Avro using sqoop or hive (external table), the values
> are saved in hexadecimal format (e.g. for 3.12 the saved value is
> {color:#d04437}*{"col1":{"bytes":"\u00018"}}*{color}).
> - If we save the data to that Avro file using DataStage, the values are
> saved in decimal format (e.g. for 3.12 the saved value is
> {color:#d04437}*{"col1":{"bytes":"3.12"}}*{color}).
> The questions are:
> 1). Why is there this difference, with the data serialized in hexadecimal in
> one case and in decimal in the other?
> 2). Are these differences caused by the Avro serialization encoding used (binary
> encoding in one case, JSON encoding in the other)?
> 3). How can we control how the values are serialized (e.g. we want to have
> them as "3.12" instead of "\u00018")?
> Thanks,
> Liviu