jorgecarleitao opened a new issue #508:
URL: https://github.com/apache/arrow-rs/issues/508


   The computation of the number of bytes given a precision seems incorrect, and writing decimals with a precision larger than 18 crashes?
   
   Found while working on the corresponding code in arrow2.
   
   See [parquet's definitions](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal) for details, but it seems that `decimal_length_from_precision` is incorrect because it is not the inverse of the maximum number of digits for a given size of parquet's FixedSizeBytes.
   
   IMO this is the correct version:
   
   ```rust
   fn decimal_length_from_precision(precision: usize) -> usize {
       // digits = floor(log10(2^(8*n - 1) - 1))  // definition in parquet's logical types
       // ceil(digits) = log10(2^(8*n - 1) - 1)
       // 10^ceil(digits) = 2^(8*n - 1) - 1
       // 10^ceil(digits) + 1 = 2^(8*n - 1)
       // log2(10^ceil(digits) + 1) = (8*n - 1)
       // log2(10^ceil(digits) + 1) + 1 = 8*n
       // (log2(10^ceil(digits) + 1) + 1) / 8 = n
       (((10.0_f64.powi(precision as i32) + 1.0).log2() + 1.0) / 8.0).ceil() as usize
   }
   ```
   
   (at least this definition causes arrow2 to write all variants in the 
`generated_decimal` file).
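
As a sanity check on the formula above, here is a minimal sketch that compares it against an integer-only reference. The helper `min_bytes_for_precision_exact` is a hypothetical name introduced only for this illustration, not part of arrow-rs or arrow2. It assumes the parquet definition quoted in the comments, i.e. that `n` bytes can hold any value with `p` digits exactly when `10^p - 1 <= 2^(8*n - 1) - 1`.

```rust
/// Float-based formula proposed in this issue, repeated so the sketch is self-contained.
fn decimal_length_from_precision(precision: usize) -> usize {
    (((10.0_f64.powi(precision as i32) + 1.0).log2() + 1.0) / 8.0).ceil() as usize
}

/// Hypothetical integer-only reference (illustration only): the smallest number
/// of bytes `n`, up to 16, such that every value with `precision` decimal digits
/// fits in an `n`-byte two's-complement integer.
fn min_bytes_for_precision_exact(precision: u32) -> Option<usize> {
    // 10^precision; `None` if it does not fit in a u128 (precision > 38).
    let bound = 10u128.checked_pow(precision)?;
    // n bytes hold magnitudes up to 2^(8n - 1) - 1, so we need
    // 10^precision - 1 <= 2^(8n - 1) - 1, i.e. 10^precision <= 2^(8n - 1).
    (1..=16usize).find(|&n| bound <= (1u128 << (8 * n - 1)))
}
```

Under these assumptions the two functions should agree for every precision from 1 to 38, for example 4 bytes for precision 9, 8 bytes for precision 18 and 16 bytes for precision 38. The integer version avoids any floating-point rounding concerns, at the cost of being limited to what fits in a `u128`.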

