[ https://issues.apache.org/jira/browse/AVRO-2088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liviu updated AVRO-2088:
------------------------
    Description: 
We use this schema for the Avro file:


{code:java}
{
  "name":"col1",
  "type":["null",
    {
      "type":"bytes",
      "logicalType":"decimal",
      "precision":19,
      "scale":2
    }
  ]
}
{code}

- If we save data to Avro using Sqoop or Hive (external table), the values are 
saved in hexadecimal format (e.g. for 3.12 the saved value is 
{color:#d04437}*{"col1":{"bytes":"\u00018"}}*{color}).
- If we save the data to that Avro file using DataStage, the values are saved 
in decimal format (e.g. for 3.12 the saved value is 
{color:#d04437}*{"col1":{"bytes":"3.12"}}*{color}).

The questions are:
1). Why is there this difference, with the data serialized in hexadecimal in 
one case and in decimal in the other? 
2). Are these differences caused by the Avro serialization encoding used 
(binary encoding in one case, JSON encoding in the other)?
3). How can we control how the values are serialized (e.g. we want to have them 
as "3.12" instead of "\u00018")?

Thanks,
Liviu


> Decimal logicalType values serialized in hexadecimal vs decimal
> ---------------------------------------------------------------
>
>                 Key: AVRO-2088
>                 URL: https://issues.apache.org/jira/browse/AVRO-2088
>             Project: Avro
>          Issue Type: Task
>            Reporter: liviu
>
> We use this schema for the Avro file:
> {code:java}
> {
>   "name":"col1",
>   "type":["null",
>     {
>       "type":"bytes",
>       "logicalType":"decimal",
>       "precision":19,
>       "scale":2
>     }
>   ]
> }
> {code}
> - If we save data to Avro using Sqoop or Hive (external table), the values 
> are saved in hexadecimal format (e.g. for 3.12 the saved value is 
> {color:#d04437}*{"col1":{"bytes":"\u00018"}}*{color}).
> - If we save the data to that Avro file using DataStage, the values are 
> saved in decimal format (e.g. for 3.12 the saved value is 
> {color:#d04437}*{"col1":{"bytes":"3.12"}}*{color}).
> The questions are:
> 1). Why is there this difference, with the data serialized in hexadecimal in 
> one case and in decimal in the other? 
> 2). Are these differences caused by the Avro serialization encoding used 
> (binary encoding in one case, JSON encoding in the other)?
> 3). How can we control how the values are serialized (e.g. we want to have 
> them as "3.12" instead of "\u00018")?
> Thanks,
> Liviu
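
For reference, a minimal sketch of where the two byte strings in the report can come from. The Avro specification defines a decimal over bytes as the two's-complement, big-endian representation of the unscaled integer, and the JSON encoding renders bytes as a string of code points 0-255. The function names below are hypothetical helpers written for this note, not part of any Avro library, and the reading that DataStage stored the ASCII text "3.12" is an inference from the output quoted above, not something confirmed in the issue.

```python
import json
from decimal import Decimal


def encode_decimal(value: Decimal, scale: int) -> bytes:
    """Hypothetical helper: unscaled integer as two's-complement big-endian bytes."""
    scaled = value.scaleb(scale)  # 3.12 with scale 2 becomes 312
    if scaled != scaled.to_integral_value():
        raise ValueError("value has more fractional digits than the scale allows")
    unscaled = int(scaled)
    # One extra bit for the sign; not always the minimal length for negatives.
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, byteorder="big", signed=True)


def decode_decimal(data: bytes, scale: int) -> Decimal:
    """Hypothetical helper: inverse of encode_decimal."""
    unscaled = int.from_bytes(data, byteorder="big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


def to_avro_json_string(data: bytes) -> str:
    """Render bytes as a JSON string, one code point (0-255) per byte."""
    return json.dumps(data.decode("latin-1"))


raw = encode_decimal(Decimal("3.12"), 2)  # bytes 0x01 0x38
print(to_avro_json_string(raw))           # "\u00018"
print(decode_decimal(raw, 2))             # 3.12
```

Under these assumptions, 3.12 at scale 2 has the unscaled value 312 = 0x0138, and the byte 0x38 happens to be the ASCII character "8", which would explain the "\u00018" seen with Sqoop and Hive. A reader applying the same rule to the four ASCII bytes of "3.12" would not get 3.12 back.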



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
