ziggythehamster opened a new pull request #918:
URL: https://github.com/apache/avro/pull/918


   ### Jira
   
   - [x] My PR addresses the following [Avro 
Jira](https://issues.apache.org/jira/browse/AVRO/) issues and references them 
in the PR title. For example, "AVRO-1234: My Avro PR"
     - https://issues.apache.org/jira/browse/AVRO-2677
     - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
       - `memory_profiler` is MIT licensed and is only required for testing
   
   ### Tests
   
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
     - Unit tests added for decimal logical type encoding/decoding according to 
the Avro specification
     - Unit tests added to ensure performance regressions are not unknowingly 
introduced with the encoder/decoder, as we have made an effort to make this the 
most performant encoder/decoder possible
   
   ### Commits
   
   - [x] My commits all reference Jira issues in their subject lines. In 
addition, my commits follow the guidelines from "[How to write a good git 
commit message](https://chris.beams.io/posts/git-commit/)":
     1. Subject is separated from body by a blank line
     1. Subject is limited to 50 characters (not including Jira issue reference)
     1. Subject does not end with a period
     1. Subject uses the imperative mood ("add", not "adding")
     1. Body wraps at 72 characters
     1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [x] In case of new functionality, my PR adds documentation that describes 
how to use it.
     - README/CHANGELOG not updated, but the logical type support from 
AVRO-2677 should have included this encoder/decoder rather than leaving it up 
to consumers to shove the correct bytes in there. AVRO-2677 is closed, and we 
are unsure how best to handle that. Should we have opened a new issue because 
AVRO-2677 was not fully implemented, or reopened it? If the former, and someone 
wants to create that ticket, we're happy to rewrite our commit messages with 
the correct ticket number.
   
   ### Supersedes PRs
   
   The following PRs are currently open and implement an incorrect version of 
this feature, and should be closed:
   
   * https://github.com/apache/avro/pull/829
   * https://github.com/apache/avro/pull/840
   
   Both PRs shove a string like `"1.234"` into the bytes, rather than encoding 
the value according to the specification. Neither PR validates inputs or 
introduces infrastructure to do so.
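   
   For comparison, here is a minimal Ruby sketch of what the specification 
asks for: the unscaled integer written as big-endian two's-complement bytes, 
matching Java's `BigInteger#toByteArray`. `encode_decimal` is a hypothetical 
helper written for illustration, not the API added by this PR.
   
   ```ruby
   require "bigdecimal"
   
   # Hypothetical helper (illustration only, not this PR's API).
   # Encodes a decimal as the minimal big-endian two's-complement
   # representation of its unscaled integer value.
   def encode_decimal(value, scale)
     unscaled = BigDecimal(value.to_s) * (10**scale)
     # Reject values that carry more fractional digits than the scale allows.
     raise ArgumentError, "#{value} does not fit scale #{scale}" unless unscaled.frac.zero?
   
     unscaled = unscaled.to_i
     bytes = []
     loop do
       bytes.unshift(unscaled & 0xff)
       unscaled >>= 8 # arithmetic shift, so negatives converge to -1
       # Stop once the rest is pure sign extension and the top bit agrees.
       done_positive = unscaled.zero? && bytes.first < 0x80
       done_negative = unscaled == -1 && bytes.first >= 0x80
       break if done_positive || done_negative
     end
     bytes.pack("C*")
   end
   
   puts encode_decimal("3.4562", 6).unpack("H*").first # prints 34bcc8
   puts "3.4562".unpack("H*").first                    # prints 332e34353632
   ```
   
   The second line shows what the string-based approach would store instead: 
the ASCII characters of the literal, not the unscaled value.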
   
   ### Notes
   
   * The Avro specification is imprecise about how decimals are to be 
implemented, which required us to dig into the source code of Avro for Java as 
well as dig into Java's BigDecimal and BigInteger to make sure we were doing 
the same thing. Perhaps the specification could include a Java one-liner that 
implements the encoder/decoder? Here's a Scala one-liner that we used to test 
our implementation:
   
   ```scala
   val encoded = new java.math.BigDecimal("3.4562").setScale(6).unscaledValue().toByteArray()
   val decoded = new java.math.BigDecimal(new java.math.BigInteger(encoded), 6)
   
   encoded.map("%02x".format(_)).mkString(" ") // 34 bc c8: String
   decoded // 3.456200: java.math.BigDecimal
   ```
   
   * We tested this in Ruby 2.3, 2.4, 2.5, 2.6, and 2.7. Some Ruby versions 
retain objects where others don't, which is why the retained-object checks use 
`<=` rather than an exact comparison. We think this is either a bug in 
`memory_profiler` or a bug in Ruby itself.
   * Your build system depends on the `echoe` gem, but `echoe` is not 
compatible with RubyGems > 2.7. RubyGems 3.x has been out since 2018, and 
RubyGems 2.7 barely works in newer versions of Ruby. Consider upgrading this.
   * This PR is against master, but 1.9 is the current stable version. [This 
branch](https://github.com/art19/avro/tree/art19-patched-1.9-with-pr-761) is a 
version based on 1.9 with #761 incorporated (#761 was the PR that incompletely 
implemented AVRO-2677).
     * There's [a gem published to GitHub 
packages](https://github.com/art19/avro/packages/262653?version=1.9.3.pre.b88b65e2)
 as well, if you're like us and need a version with decimal support before this 
hits an official channel.
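   
   The decoding half of the Scala one-liner in the first note can be sketched 
in Ruby roughly as follows. `decode_decimal` is a hypothetical name used for 
illustration, not this PR's API. Ruby's `BigDecimal` does not preserve trailing 
zeros the way Java's does, so the result prints as 3.4562 rather than 3.456200.
   
   ```ruby
   require "bigdecimal"
   
   # Hypothetical helper (illustration only, not this PR's API).
   # Interprets the bytes as a big-endian two's-complement integer and
   # applies the scale to recover the decimal value.
   def decode_decimal(bytes, scale)
     unscaled = bytes.bytes.inject(0) { |acc, byte| (acc << 8) | byte }
     # A set top bit in the first byte marks a negative number.
     if !bytes.empty? && bytes.getbyte(0) >= 0x80
       unscaled -= 1 << (8 * bytes.bytesize)
     end
     # Build from exponent notation so no division rounding is involved.
     BigDecimal("#{unscaled}e-#{scale}")
   end
   
   encoded = [0x34, 0xbc, 0xc8].pack("C*")
   puts decode_decimal(encoded, 6).to_s("F") # prints 3.4562
   ```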
   
   ### Thanks
   
   I'm filing the upstream PR here, but @johvet did almost all of the work, 
performance tuning, and testing.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

