tjwp commented on a change in pull request #918:
URL: https://github.com/apache/avro/pull/918#discussion_r442170152



##########
File path: lang/ruby/lib/avro/logical_types.rb
##########
@@ -16,9 +16,209 @@
 # limitations under the License.
 
 require 'date'
+require 'bigdecimal'
+require 'bigdecimal/util'
 
 module Avro
   module LogicalTypes
+
+    ##
+    # Base class for logical types requiring a schema to be present
+    class LogicalTypeWithSchema
+      ##
+      # @return [Avro::Schema] The schema this logical type is dealing with
+      attr_reader :schema
+
+      ##
+      # Build a new isntance of a logical type using the provided schema

Review comment:
       "instance"

##########
File path: lang/ruby/lib/avro/logical_types.rb
##########
@@ -81,10 +284,11 @@ def self.decode(datum)
       },
     }.freeze
 
-    def self.type_adapter(type, logical_type)
+    def self.type_adapter(type, logical_type, schema = nil)
       return unless logical_type
 
-      TYPES.fetch(type, {}.freeze).fetch(logical_type, Identity)
+      adapter = TYPES.fetch(type, {}.freeze).fetch(logical_type, Identity)
+      adapter.is_a?(Class) ? adapter.new(schema) : adapter

Review comment:
       The adapters are currently modules or classes. Instead of checking for 
class here would it be better to have both respond to a method like 
`.requires_schema?` and use that for the condition here.

##########
File path: lang/ruby/lib/avro/logical_types.rb
##########
@@ -16,9 +16,209 @@
 # limitations under the License.
 
 require 'date'
+require 'bigdecimal'
+require 'bigdecimal/util'
 
 module Avro
   module LogicalTypes
+
+    ##
+    # Base class for logical types requiring a schema to be present
+    class LogicalTypeWithSchema
+      ##
+      # @return [Avro::Schema] The schema this logical type is dealing with
+      attr_reader :schema
+
+      ##
+      # Build a new isntance of a logical type using the provided schema
+      #
+      # @param schema [Avro::Schema]
+      #     The schema to use with this instance
+      #
+      # @raise [ArgumentError]
+      #     If the provided schema is nil
+      def initialize(schema)
+        raise ArgumentError, 'schema is required' if schema.nil?
+
+        @schema = schema
+      end
+
+      ##
+      # Encode the provided datum
+      #
+      # @param datum [Object] The datum to encode
+      #
+      # @raise [NotImplementedError]
+      #     Subclass will need to override this method
+      def encode(datum)
+        raise NotImplementedError
+      end
+
+      ##
+      # Decode the provided datum
+      #
+      # @param datum [Object] The datum to decode
+      #
+      # @raise [NotImplementedError]
+      #     Subclass will need to override this method
+      def decode(datum)
+        raise NotImplementedError
+      end
+    end
+
+    ##
+    # Logical type to handle arbitrary-precision decimals using byte array.
+    #
+    # The byte array contains the two's-complement representation of the 
unscaled integer
+    # value in big-endian byte order.
+    class BytesDecimal < LogicalTypeWithSchema
+      # Messages for exceptions
+      ERROR_INSUFICIENT_PRECISION = 'Precision is too small'.freeze
+      ERROR_INVALID_SCALE         = 'Scale must be greater than or equal to 
0'.freeze
+      ERROR_INVALID_PRECISION     = 'Precision must be positive'.freeze
+      ERROR_PRECISION_TOO_SMALL   = 'Precision must be greater than 
scale'.freeze
+      ERROR_ROUNDING_NECESSARY    = 'Rounding necessary'.freeze
+      ERROR_VALUE_MUST_BE_NUMERIC = 'value must be numeric'.freeze
+
+      # The pattern used to pack up the byte array (8 bit unsigned 
integer/char)
+      PACK_UNSIGNED_CHARS = 'C*'.freeze
+
+      # The number 10 as BigDecimal
+      TEN = BigDecimal(10).freeze
+
+      ##
+      # @return [Integer] The number of total digits supported by the decimal
+      attr_reader :precision
+
+      ##
+      # @return [Integer] The number of fractional digits
+      attr_reader :scale
+
+      ##
+      # Build a new decimal logical type
+      #
+      # @param schema [Avro::Schema]
+      #     The schema defining precision and scale for the conversion
+      #
+      # @raise [ArgumentError]
+      #     If precision or scale have invalid values. Precision must be
+      #     positive and greater or equal to scale. If scale is missing,
+      #     it defaults to 0.
+      def initialize(schema)

Review comment:
       Maybe when the schema is defined it should it should create this 
"adapter" and keep a reference to it, avoiding creating a new instance for each 
encode/decode.

##########
File path: lang/ruby/lib/avro/logical_types.rb
##########
@@ -16,9 +16,209 @@
 # limitations under the License.
 
 require 'date'
+require 'bigdecimal'
+require 'bigdecimal/util'
 
 module Avro
   module LogicalTypes
+
+    ##
+    # Base class for logical types requiring a schema to be present
+    class LogicalTypeWithSchema
+      ##
+      # @return [Avro::Schema] The schema this logical type is dealing with
+      attr_reader :schema
+
+      ##
+      # Build a new isntance of a logical type using the provided schema
+      #
+      # @param schema [Avro::Schema]
+      #     The schema to use with this instance
+      #
+      # @raise [ArgumentError]
+      #     If the provided schema is nil
+      def initialize(schema)
+        raise ArgumentError, 'schema is required' if schema.nil?
+
+        @schema = schema
+      end
+
+      ##
+      # Encode the provided datum
+      #
+      # @param datum [Object] The datum to encode
+      #
+      # @raise [NotImplementedError]
+      #     Subclass will need to override this method
+      def encode(datum)
+        raise NotImplementedError
+      end
+
+      ##
+      # Decode the provided datum
+      #
+      # @param datum [Object] The datum to decode
+      #
+      # @raise [NotImplementedError]
+      #     Subclass will need to override this method
+      def decode(datum)
+        raise NotImplementedError
+      end
+    end
+
+    ##
+    # Logical type to handle arbitrary-precision decimals using byte array.
+    #
+    # The byte array contains the two's-complement representation of the 
unscaled integer
+    # value in big-endian byte order.
+    class BytesDecimal < LogicalTypeWithSchema
+      # Messages for exceptions
+      ERROR_INSUFICIENT_PRECISION = 'Precision is too small'.freeze
+      ERROR_INVALID_SCALE         = 'Scale must be greater than or equal to 
0'.freeze
+      ERROR_INVALID_PRECISION     = 'Precision must be positive'.freeze
+      ERROR_PRECISION_TOO_SMALL   = 'Precision must be greater than 
scale'.freeze
+      ERROR_ROUNDING_NECESSARY    = 'Rounding necessary'.freeze
+      ERROR_VALUE_MUST_BE_NUMERIC = 'value must be numeric'.freeze
+
+      # The pattern used to pack up the byte array (8 bit unsigned 
integer/char)
+      PACK_UNSIGNED_CHARS = 'C*'.freeze
+
+      # The number 10 as BigDecimal
+      TEN = BigDecimal(10).freeze
+
+      ##
+      # @return [Integer] The number of total digits supported by the decimal
+      attr_reader :precision
+
+      ##
+      # @return [Integer] The number of fractional digits
+      attr_reader :scale
+
+      ##
+      # Build a new decimal logical type
+      #
+      # @param schema [Avro::Schema]
+      #     The schema defining precision and scale for the conversion
+      #
+      # @raise [ArgumentError]
+      #     If precision or scale have invalid values. Precision must be
+      #     positive and greater or equal to scale. If scale is missing,
+      #     it defaults to 0.
+      def initialize(schema)
+        super
+
+        @scale     = schema.scale.to_i
+        @precision = schema.precision.to_i
+        @factor    = TEN ** @scale
+
+        validate!
+      end
+
+      ##
+      # Encode the provided value into a byte array
+      #
+      # @param value [BigDecimal, Float, Integer]
+      #     The numeric value to encode
+      #
+      # @raise [ArgumentError]
+      #     If the provided value is not a numeric type
+      #
+      # @raise [RangeError]
+      #     If the provided value has a scale higher than the schema permits,
+      #     or does not fit into the schema's precision
+      def encode(value)
+        raise ArgumentError, ERROR_VALUE_MUST_BE_NUMERIC unless 
value.is_a?(Numeric)
+
+        
to_byte_array(unscaled_value(value.to_d)).pack(PACK_UNSIGNED_CHARS).freeze
+      end
+
+      ##
+      # Decode a byte array (in form of a string) into a BigDecimal of the
+      # given precision and scale
+      #
+      # @param stream [String]
+      #     The byte array to decode
+      #
+      # @return [BigDecimal]
+      def decode(stream)
+        from_byte_array(stream) / @factor
+      end
+
+      private
+
+      ##
+      # Convert the provided stream of bytes into the unscaled value
+      #
+      # @param stream [String]
+      #     The stream of bytes to convert
+      #
+      # @return [Integer]
+      def from_byte_array(stream)
+        bytes    = stream.bytes
+        positive = bytes.first[7].zero?
+        total    = 0
+
+        bytes.each_with_index do |value, ix|
+          total += (positive ? value : (value ^ 0xff)) << (bytes.length - ix - 
1) * 8
+        end
+
+        return total if positive
+
+        -(total + 1)
+      end
+
+      ##
+      # Convert the provided number into its two's complement representation
+      # in network order (big endian).
+      #
+      # @param number [Integer]
+      #     The number to convert
+      #
+      # @return [Array<Integer>]
+      #     The byte array in network order
+      def to_byte_array(number)
+        [].tap do |result|
+          loop do
+            result.unshift(number & 0xff)
+            number >>= 8
+
+            break if (number == 0 || number == -1) && (result.first[7] == 
number[7])
+          end
+        end
+      end
+
+      ##
+      # Get the unscaled value from a BigDecimal considering the schema's scale
+      #
+      # @param decimal [BigDecimal]
+      #     The decimal to get the unscaled value from
+      #
+      # @return [Integer]
+      def unscaled_value(decimal)
+        details = decimal.split
+        length  = details[1].length
+
+        fractional_part = length - details[3]
+        raise RangeError, ERROR_ROUNDING_NECESSARY if fractional_part > scale
+
+        if length > precision || (length - fractional_part) > (precision - 
scale)
+          raise RangeError, ERROR_INSUFICIENT_PRECISION
+        end
+
+        (decimal * @factor).to_i
+      end
+
+      ##
+      # Make sure the schema definition is legit
+      #
+      # @raise [ArgumentError] If scale or precision are not quite right
+      def validate!
+        raise ArgumentError, ERROR_INVALID_SCALE if scale.negative?
+        raise ArgumentError, ERROR_INVALID_PRECISION unless precision.positive?
+        raise ArgumentError, ERROR_PRECISION_TOO_SMALL if precision < scale
+      end

Review comment:
       Should this validation happen when the schema object is created since it 
does not depend on the value being encoded? Then we avoid repeating the 
validation for each encode/decode.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to