[ 
https://issues.apache.org/jira/browse/AVRO-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16523943#comment-16523943
 ] 

Ryan Blue commented on AVRO-2164:
---------------------------------

First-class types are difficult to add because they break the format's 
forward compatibility (old readers can't read newer data). I don't think 
there's a compelling argument to add decimal as a primitive anyway; we can make 
it work with logical types. Similarly, I don't see a benefit to including the 
scale in the serialized form. Doing so would achieve nothing that we can't 
already do with the scale encoded in the schema, other than storing values with 
different scales, which is beyond the scope of the type (because no SQL system 
supports it).

Part of the problem is that we don't have well-defined rules for decimal 
evolution. Because changing the scale of a value changes the value itself (4.00 
is NOT equal to 4.000), I think that at a minimum, decimals should always be 
returned in the scale they were written with. That would solve many of these 
problems, right? I'd like to hear ideas for clearly defined rules about what 
happens when you evolve a decimal. (In Iceberg, we don't allow scale changes at 
all because of the problems here.) Without a clear set of rules first, I don't 
think we can confidently make changes.
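
To make the problem concrete, here is a minimal sketch in Python of what goes 
wrong when the reader's scale differs from the writer's. It assumes the layout 
the decimal logical type uses (the two's-complement big-endian bytes of the 
unscaled integer, with the scale kept only in the schema). The helper names 
encode_unscaled and decode_unscaled are made up for illustration and are not 
part of any Avro library.

```python
from decimal import Decimal


def encode_unscaled(value: Decimal, scale: int) -> bytes:
    """Return the two's-complement big-endian bytes of value * 10**scale.

    Hypothetical helper: only the unscaled integer is serialized,
    the scale itself is not written anywhere in the bytes.
    """
    shifted = value.scaleb(scale)
    if shifted != shifted.to_integral_value():
        raise ValueError("value has more fractional digits than the scale allows")
    unscaled = int(shifted)
    # Smallest length that always leaves room for the sign bit.
    length = (unscaled.bit_length() + 8) // 8
    return unscaled.to_bytes(length, "big", signed=True)


def decode_unscaled(data: bytes, scale: int) -> Decimal:
    """Rebuild a Decimal from the bytes using the scale the reader assumes."""
    unscaled = int.from_bytes(data, "big", signed=True)
    return Decimal(unscaled).scaleb(-scale)


written = encode_unscaled(Decimal("4.00"), scale=2)  # unscaled integer 400

same_scale = decode_unscaled(written, scale=2)   # reader agrees with writer
other_scale = decode_unscaled(written, scale=3)  # reader assumes scale 3

print(same_scale)   # 4.00
print(other_scale)  # 0.400
```

The bytes carry no hint of which scale is right, so a reader that applies 
scale 3 silently turns 4.00 into 0.400. Any evolution rule would have to 
either forbid this or resolve it using the writer's schema.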


> Make Decimal a first class type.
> --------------------------------
>
>                 Key: AVRO-2164
>                 URL: https://issues.apache.org/jira/browse/AVRO-2164
>             Project: Avro
>          Issue Type: Improvement
>          Components: logical types
>    Affects Versions: 1.8.2
>            Reporter: Andy Coates
>            Priority: Major
>
> I'd be interested to hear the community's thoughts on making decimal a first 
> class type. 
> The current logical type encodes a decimal into a _bytes_ or _fixed_. This 
> encoding does not include any information about the scale, i.e. this encoding 
> is lossy. 
> There are open issues around the compatibility / evolvability of schemas 
> containing decimal logical types (e.g. AVRO-2078 & AVRO-1721), which mean that 
> reading data that was previously written with a different scale will result 
> in data corruption.
> If these issues were fixed, with suitable compatibility checks put in place, 
> this would then make it impossible to evolve an Avro schema where the scale 
> needs to be changed. This inability to evolve the scale is very restrictive, 
> and can result in high overhead for organizations that _need_ to change the 
> scale, i.e. they may potentially need to copy their entire data set, 
> deserializing with the old scale and re-serializing with the new.
> If _decimal_ were promoted to a first class type, this would allow the scale 
> to be captured in the serialized form, allowing for schema evolution support.



