[ 
https://issues.apache.org/jira/browse/AVRO-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194526#comment-16194526
 ] 

Zoltan Ivanfi commented on AVRO-2088:
-------------------------------------

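This seems to be a bug in DataStage. According to the 
[specification|https://avro.apache.org/docs/1.8.1/spec.html#Decimal], a decimal 
"must contain the two's-complement representation of the unscaled integer value 
in big-endian byte order", so storing 3.12 as the ASCII string "3.12" is not 
compliant. With scale 2, the unscaled value of 3.12 is 312 = 0x0138, i.e. the 
bytes 0x01 0x38, which the JSON encoding renders as "\u00018" (0x38 is the 
character '8'). The Sqoop/Hive output is therefore the correct one.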
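
To make the byte layout concrete, here is a minimal sketch in plain Python (no 
Avro library involved). The helper names encode_decimal and decode_decimal are 
made up for this illustration and are not part of any Avro API. The sketch 
assumes the value fits the schema's scale exactly.

```python
from decimal import Decimal


def encode_decimal(value: Decimal, scale: int) -> bytes:
    """Hypothetical helper: big-endian two's-complement bytes of the unscaled value."""
    scaled = value.scaleb(scale)  # shift the decimal point: 3.12 -> 312
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more fractional digits than the scale allows")
    unscaled = int(scaled)
    # Enough bytes to hold the magnitude plus a sign bit (not always the minimum
    # for negative powers of two, but always a valid representation).
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, "big", signed=True)


def decode_decimal(raw: bytes, scale: int) -> Decimal:
    """Hypothetical helper: inverse of encode_decimal."""
    unscaled = int.from_bytes(raw, "big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


compliant = encode_decimal(Decimal("3.12"), 2)
# Avro's JSON encoding writes each byte as the code point of the same value,
# which latin-1 decoding reproduces here.
as_json_string = compliant.decode("latin-1")

# What a compliant reader would make of the ASCII bytes "3.12".
misread = decode_decimal(b"3.12", 2)
```

Here compliant is the two-byte sequence 0x01 0x38 and as_json_string equals 
"\u00018", matching the Sqoop/Hive output. A compliant reader handed the four 
ASCII bytes of "3.12" would interpret them as the unscaled integer 0x332E3132 
and produce 8586652.66 rather than 3.12.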

> Decimal logicalType values serialized in hexidecimal vs decimal
> ---------------------------------------------------------------
>
>                 Key: AVRO-2088
>                 URL: https://issues.apache.org/jira/browse/AVRO-2088
>             Project: Avro
>          Issue Type: Task
>            Reporter: liviu
>
> We use this schema for the Avro file:
> {code:java}
> "name":"col1",
> "type":["null",
>       {
>               "type":"bytes",
>               "logicalType":"decimal",
>               "precision":19,
>               "scale":2
>       }
>       ]
> {code}
> - if we save data in Avro using Sqoop or Hive (external table), the values 
> are saved in hexadecimal format (e.g. for 3.12 the saved value is: 
> {color:#d04437}*{"col1":{"bytes":"\u00018"}}*{color})
> - if we save the data in that Avro file using DataStage, the values are 
> saved in decimal format (e.g. for 3.12 the saved value is: 
> {color:#d04437}*{"col1":{"bytes":"3.12"}}*{color})
> The questions are:
> 1). why is there this difference: in one case the data is serialised in 
> hexadecimal and in the other case in decimal? 
> 2). are these differences caused by the Avro serialization encoding used 
> (binary encoding in one case, JSON encoding in the other)?
> 3). how can we control how the values are serialized (e.g. we want to have 
> them as "3.12" instead of "\u00018")?
> Thanks,
> Liviu



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
