[
https://issues.apache.org/jira/browse/AVRO-1402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13943244#comment-13943244
]
Doug Cutting commented on AVRO-1402:
------------------------------------
> Hive would probably prefer a binary representation for performance [ ... ]
It might be useful to quantify the performance difference, perhaps by
benchmarking the writing and reading of a snappy-compressed file that contains
a decimal field represented as either bytes or as a string.
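As a rough starting point, the two representations can be timed in isolation before involving Avro files at all. The sketch below is not the benchmark proposed above: it skips Avro and snappy entirely and only round-trips a BigDecimal through a UTF-8 string and through its unscaled two's-complement bytes. The class and method names are hypothetical, and the naive System.nanoTime() timing is only indicative (a harness such as JMH would be needed for trustworthy numbers).

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical micro-benchmark; none of these names come from Avro.
final class DecimalEncodingTiming {

    // String form: the decimal's text, UTF-8 encoded, then parsed back.
    static BigDecimal roundTripString(BigDecimal d) {
        byte[] encoded = d.toString().getBytes(StandardCharsets.UTF_8);
        return new BigDecimal(new String(encoded, StandardCharsets.UTF_8));
    }

    // Bytes form: two's-complement unscaled value, with the scale carried separately.
    static BigDecimal roundTripBytes(BigDecimal d) {
        byte[] encoded = d.unscaledValue().toByteArray();
        return new BigDecimal(new BigInteger(encoded), d.scale());
    }

    public static void main(String[] args) {
        // Arbitrary test data: 10,000 values with a fixed scale of 4.
        BigDecimal[] values = new BigDecimal[10_000];
        for (int i = 0; i < values.length; i++) {
            values[i] = BigDecimal.valueOf(i * 7919L - 5_000_000L, 4);
        }
        int rounds = 200;
        long sink = 0; // printed at the end so the JIT cannot discard the loops

        long t0 = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (BigDecimal v : values) {
                sink += roundTripString(v).scale();
            }
        }
        long t1 = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (BigDecimal v : values) {
                sink += roundTripBytes(v).scale();
            }
        }
        long t2 = System.nanoTime();

        System.out.printf("string: %d ms, bytes: %d ms (sink=%d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, sink);
    }
}
```

A real comparison would still need to write and read actual data files, since compression and I/O may dominate the conversion cost measured here.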
A faster alternative to a subtype might be to use a record, e.g.:
{code}
{"type":"record","name":"org.apache.avro.Decimal","fields":[
{"name":"scale","type":"int"},
{"name":"value","type":"bytes"}
]}
{code}
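To make the mapping concrete, here is a minimal sketch of how a BigDecimal could be split into the two fields of the record above and rebuilt from them. It uses only java.math and java.nio rather than any Avro API. The class name DecimalRecordSketch is hypothetical, and storing "value" as the big-endian two's-complement unscaled integer is an assumption, since the encoding of the bytes field is not specified here.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Avro: mirrors the two fields of the record.
final class DecimalRecordSketch {
    final int scale;         // the "scale" field
    final ByteBuffer value;  // the "value" field (assumed: two's-complement unscaled value)

    DecimalRecordSketch(int scale, ByteBuffer value) {
        this.scale = scale;
        this.value = value;
    }

    // Split a BigDecimal into its scale and unscaled bytes.
    static DecimalRecordSketch fromBigDecimal(BigDecimal d) {
        byte[] unscaled = d.unscaledValue().toByteArray();
        return new DecimalRecordSketch(d.scale(), ByteBuffer.wrap(unscaled));
    }

    // Rebuild the BigDecimal from the two fields.
    BigDecimal toBigDecimal() {
        byte[] unscaled = new byte[value.remaining()];
        value.duplicate().get(unscaled); // duplicate() leaves this buffer's position untouched
        return new BigDecimal(new BigInteger(unscaled), scale);
    }

    public static void main(String[] args) {
        BigDecimal original = new BigDecimal("-123.4500");
        DecimalRecordSketch rec = fromBigDecimal(original);
        System.out.println("scale=" + rec.scale
                + ", round trip equal=" + original.equals(rec.toBigDecimal()));
    }
}
```

Because the scale travels with each value, BigDecimal.equals (which compares scale as well as numeric value) holds after the round trip.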
If we still changed GenericData to implement this directly then there would be
no overhead and implementation would be easier & faster, since it wouldn't need
a temporary buffer. It wouldn't be very useful to implementations that don't
yet know about it, but neither would the binary subtype. We could add this
type to the specification as something that implementations might optimize,
just like a subtype. So this might be something to benchmark too.
> if it upgraded to the new version of Avro and read a file with a decimal
> subtype it would receive a BigDecimal when it was only expecting a ByteBuffer.
Today if an application using specific or reflect uses BigDecimal then it will
be read as BigDecimal, since that's currently encoded using the schema
{"type":"string", "java-class":"java.math.BigDecimal"}. So the schema would
change when they upgrade, but the object would not. That seems compatible to
me. You?
If the application is using Generic to write, then BigDecimal will currently
fail.
Assuming that existing applications are not currently using
"subType":"decimal", no application should start receiving BigDecimal that
wasn't before. If the write path is upgraded before the read path then the
application will start seeing bytes where before it saw either BigDecimal or
nothing. This is a potential compatibility problem, but not the one you seem
to describe.
> Support for DECIMAL primitive type
> ----------------------------------
>
> Key: AVRO-1402
> URL: https://issues.apache.org/jira/browse/AVRO-1402
> Project: Avro
> Issue Type: New Feature
> Affects Versions: 1.7.5
> Reporter: Mariano Dominguez
> Priority: Minor
> Labels: Hive
> Attachments: AVRO-1402.patch
>
>
> Currently, Avro does not seem to support a DECIMAL type or equivalent.
> http://avro.apache.org/docs/1.7.5/spec.html#schema_primitive
> Adding DECIMAL support would be particularly interesting when converting
> types from Avro to Hive, since DECIMAL is already a supported data type in
> Hive (0.11.0).