Hello! I commented on the JIRA too, but I think I've clarified some ideas in my head.
Before logical types, one way to use BigDecimal values in Java was to take advantage of the "Stringable" types in ReflectData. This would add a java-class annotation to the schema, and the data would be serialized from/to that class (like BigDecimal) using .toString() and a constructor that takes a single String argument. This was inefficient, and one of the reasons that LogicalTypes were added (to have a better binary representation).

Meanwhile in the schema resolution world, BYTES and STRING have always been interoperable, just converting between the two seamlessly as UTF-8 data.

Now, it's unclear how you should migrate from the "Stringable" style to the logical type style. The change proposed is to add STRING underneath the decimal logical type, and interpret it as ASCII digits: the string "1.23" (bytes 0x31 0x2E 0x32 0x33) would be interpreted as 1.23 instead of 8,251,110.91, which is what those bytes mean as an unscaled integer with scale 2 (Ivan has an exact example in the JIRA).

I keep changing my mind about the right thing to do, but it seems that this is a use case we should support!

I was thinking about building in the resolution from "stringable" to logical type as a special case in the Java SDK (since java-class is a Java peculiarity). This could help with migrating other stringables like Dates and UUID too... but then we end up with one Schema Resolution that is successful for Java but behaves differently in other SDKs. But is that even a serious problem? It might be a nice compromise to limit the scope of the change, and could be delivered in a minor release (unlike a spec change).

Thanks so much for your investigation Ivan, your JIRA is *perfect*!

Ryan

On Wed, Mar 30, 2022 at 9:03 AM Oscar Westra van Holthe - Kind <[email protected]> wrote:
>
> Hi all,
>
> As I understand, the string format is the same as the toString()
> representation of a BigDecimal.
> Thus, it has:
> - at least [scale] and at most [precision] digits (base 10)
> - the decimal separator is a dot, placed [scale] digits before the end
> - there is no thousands separator
>
> As for compatibility:
> - All old representations, bytes & fixed, with or without logical type,
>   remain readable. Thus the change is backwards compatible.
> - The change introduces a new representation, which old implementations can
>   read raw, but cannot convert to a BigDecimal. Thus, the change is
>   not forwards compatible.
>
> This change introduces a natural/human format for the decimal type (though
> inefficient in size). As it both adds convenience and is backwards
> compatible, it has my vote.
>
>
> Kind regards,
> Oscar
>
> Oscar Westra van Holthe - Kind <[email protected]>
>
> On Wed, Mar 30, 2022 at 06:30, Micah Kornfield <[email protected]> wrote:
>
> > Hi Ivan,
> > Thank you for contributing.
> >
> > > that my changes also require changing the specification for BigDecimal
> > > <https://avro.apache.org/docs/current/spec.html#Decimal> to accept
> > > string as an underlying format for encoding/decoding.
> >
> > Could you elaborate on what the string representation would be for
> > decimal? On the surface this change seems undesirable as it potentially
> > breaks backwards compatibility of the format.
> >
> > Thanks,
> > Micah
> >
> > On Tue, Mar 29, 2022 at 2:21 PM Ivan Zemlyanskiy <
> > [email protected]> wrote:
> >
> > > Hello, avro developers!
> > > First of all thank you all for your hard work and the project you move
> > > forward. The result is marvellous and it was my pleasure to make a small
> > > adjustment for it.
> > >
> > > Some time ago I opened an issue
> > > https://issues.apache.org/jira/browse/AVRO-3408 and made a pull-request
> > > with my understanding of how my ideas should be implemented
> > > https://github.com/apache/avro/pull/1584
> > > During the discussion, Ryan Skraba noticed
> > > <https://issues.apache.org/jira/browse/AVRO-3408?focusedCommentId=17508373&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17508373>
> > > that my changes also require changing the specification for BigDecimal
> > > <https://avro.apache.org/docs/current/spec.html#Decimal> to accept
> > > string as an underlying format for encoding/decoding. He recommended
> > > sending a note here to discuss if someone has any objections to doing so.
> > >
> > > I'm looking forward to reading any ideas/comments/concerns about my work
> > > and this specification change.
> > >
> > > Thank you in advance.
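
To make the "1.23" example from the top of this thread concrete, here is a minimal Java sketch of the two readings of the same bytes. It assumes the existing decimal logical type reads BYTES as a big-endian two's-complement unscaled integer with a schema-supplied scale (as the Avro spec describes), and that a STRING-backed decimal would be parsed with the BigDecimal(String) constructor. The class and method names are hypothetical and are not taken from the pull request.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

public class DecimalStringVsBytes {

    // Existing BYTES-backed reading: the bytes are the big-endian
    // two's-complement unscaled value; the scale comes from the schema.
    static BigDecimal fromUnscaledBytes(byte[] bytes, int scale) {
        return new BigDecimal(new BigInteger(bytes), scale);
    }

    // Assumed STRING-backed reading: the bytes are UTF-8 text in the
    // BigDecimal.toString() format, for example "1.23".
    static BigDecimal fromStringBytes(byte[] bytes) {
        return new BigDecimal(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // "1.23" encodes to the four bytes 0x31 0x2E 0x32 0x33.
        byte[] utf8 = "1.23".getBytes(StandardCharsets.UTF_8);

        System.out.println(fromUnscaledBytes(utf8, 2)); // prints 8251110.91
        System.out.println(fromStringBytes(utf8));      // prints 1.23
    }
}
```

The gap between the two printed values is the migration hazard discussed above: BYTES and STRING resolve to each other silently, so the reader needs to know which interpretation the writer intended.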
