[
https://issues.apache.org/jira/browse/AVRO-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13955862#comment-13955862
]
Xuefu Zhang commented on AVRO-1402:
-----------------------------------
{quote}
The scale and precision specified in the column definition are maximums. My
understanding is that decimals with smaller scale and precision may be stored
in such columns. For example in Hive if I have a DECIMAL(5,3) column then 12.4
is stored as [124, 1] as the unscaled int-scale pair (so precision 3, scale 1),
and not as [12400, 3]. So in effect there are per-record precision and scale
values, unless I'm missing something.
{quote}
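To make the quoted example concrete, here is a minimal sketch using Python's standard decimal module. The helper name unscaled_and_scale is hypothetical and is not part of Hive or Avro; it only illustrates how the same value 12.4 maps to two different unscaled-int/scale pairs depending on whether it is padded to the column's scale.

```python
from decimal import Decimal
from typing import Tuple


def unscaled_and_scale(value: Decimal) -> Tuple[int, int]:
    """Split a finite Decimal into (unscaled integer, scale).

    Hypothetical helper for illustration only; assumes a finite value.
    """
    sign, digits, exponent = value.as_tuple()
    unscaled = int("".join(str(d) for d in digits))
    if sign:
        unscaled = -unscaled
    # Decimal stores value = unscaled * 10**exponent, so scale = -exponent.
    return unscaled, -exponent


natural = Decimal("12.4")
# Pad to the column's declared scale of 3, as in DECIMAL(5,3).
padded = natural.quantize(Decimal("0.001"))

print(unscaled_and_scale(natural))  # (124, 1)
print(unscaled_and_scale(padded))   # (12400, 3)
print(natural == padded)            # True: same numeric value
```

Both pairs denote the same number; the only difference is whether the scale is taken from the record or fixed by the column definition.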
Sorry for being late on this. While it's possible for a column of type decimal
to have different scales from row to row, I'm not sure there is any use case
for that. In the majority of use cases, at least, all rows will have the same
scale, so storage efficiency will suffer a little from storing the scale on a
per-row basis. Most of the time the same application (such as Hive AvroSerde)
will be used to write and read the data, so the way the data is stored can be
controlled. Of course, storing the scale per row is more generic. If that's the
case, maxPrecision/maxScale in the schema becomes less meaningful, as a
consumer of the decimal data will need, and be able, to figure out the
precision/scale on a per-row basis.
If we do choose to store the scale per row, I'm wondering if a byte instead of
an int can be used as the type, saving some storage space.
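As a rough illustration of the storage question, the sketch below compares a per-row scale stored in one byte with one stored in a fixed four-byte int. The layout (scale prefix followed by the unscaled value as big-endian two's-complement bytes) and the function name encode_row are assumptions made for this example only; this is not Avro's actual wire encoding.

```python
import struct


def encode_row(unscaled: int, scale: int, scale_format: str) -> bytes:
    """Encode one decimal as <scale><unscaled bytes> (hypothetical layout).

    scale_format is a struct format: ">b" for a 1-byte signed scale,
    ">i" for a fixed 4-byte signed scale.
    """
    # Enough bytes to hold the unscaled value as a signed big-endian integer.
    length = (unscaled.bit_length() + 8) // 8
    body = unscaled.to_bytes(length, "big", signed=True)
    return struct.pack(scale_format, scale) + body


with_byte = encode_row(124, 1, ">b")
with_int = encode_row(124, 1, ">i")

print(len(with_byte))  # 2
print(len(with_int))   # 5
```

Under these assumptions the per-row difference is three bytes, which only matters at scale; a schema-level scale would avoid the per-row prefix altogether.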
> Support for DECIMAL primitive type
> ----------------------------------
>
> Key: AVRO-1402
> URL: https://issues.apache.org/jira/browse/AVRO-1402
> Project: Avro
> Issue Type: New Feature
> Affects Versions: 1.7.5
> Reporter: Mariano Dominguez
> Priority: Minor
> Labels: Hive
> Attachments: AVRO-1402.patch, AVRO-1402.patch, AVRO-1402.patch
>
>
> Currently, Avro does not seem to support a DECIMAL type or equivalent.
> http://avro.apache.org/docs/1.7.5/spec.html#schema_primitive
> Adding DECIMAL support would be particularly interesting when converting
> types from Avro to Hive, since DECIMAL is already a supported data type in
> Hive (0.11.0).