alamb commented on a change in pull request #9397: URL: https://github.com/apache/arrow/pull/9397#discussion_r571604769
########## File path: rust/parquet/src/arrow/schema.rs ##########

```diff
@@ -409,13 +413,15 @@ fn arrow_to_parquet_type(field: &Field) -> Result<Type> {
             .with_length(*length)
             .build()
         }
-        DataType::Decimal(precision, _) => Type::primitive_type_builder(
-            name,
-            PhysicalType::FIXED_LEN_BYTE_ARRAY,
-        )
-        .with_repetition(repetition)
-        .with_length((10.0_f64.powi(*precision as i32).log2() / 8.0).ceil() as i32)
-        .build(),
+        DataType::Decimal(precision, scale) => {
+            Type::primitive_type_builder(name, PhysicalType::FIXED_LEN_BYTE_ARRAY)
```

Review comment:

   Fixed Length Byte Array makes sense to me (the interpretation of the bytes for a decimal is different from either i32 or i64). Reasons I could imagine for using i32 or i64 to write decimals into Parquet would be:

   1. Ecosystem compatibility (i.e. that the pandas parquet reader assumed decimals were stored using those types).
   2. Possibly so that better / more performant encodings could be used.

   But I am just SWAG'ing it here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
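For context on the `.with_length(...)` expression in the diff: the Parquet format requires that a FIXED_LEN_BYTE_ARRAY decimal be wide enough to hold `10^precision - 1` as a two's-complement signed integer. Below is a minimal sketch of that minimum-byte-width calculation; the function name `decimal_byte_width` is hypothetical (not from the PR), and the `+ 1.0` accounts for the sign bit, which the diff's inline formula does not include.

```rust
// Hypothetical helper (not the PR's code): the minimum number of bytes
// needed to store a decimal of the given precision as a two's-complement
// signed integer, i.e. the smallest n such that 10^precision - 1 fits in
// (8n - 1) magnitude bits.
fn decimal_byte_width(precision: u32) -> i32 {
    // bits for the magnitude of 10^precision, plus one sign bit
    let bits = 10.0_f64.powi(precision as i32).log2() + 1.0;
    (bits / 8.0).ceil() as i32
}

fn main() {
    // precision 18 still fits in 8 bytes (an i64 range),
    // precision 19 no longer does and needs a 9th byte
    println!("{}", decimal_byte_width(18)); // 8
    println!("{}", decimal_byte_width(19)); // 9
}
```

The jump from 8 to 9 bytes between precision 18 and 19 mirrors why i64 alone cannot represent all decimals, which is one more argument for the FIXED_LEN_BYTE_ARRAY representation the reviewer endorses.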