CynicDog commented on code in PR #9985:
URL: https://github.com/apache/arrow-rs/pull/9985#discussion_r3288159774
##########
parquet/src/arrow/schema/primitive.rs:
##########
@@ -313,23 +323,29 @@ fn from_fixed_len_byte_array(
precision: i32,
type_length: i32,
) -> Result<DataType> {
- // TODO: This should check the type length for the decimal and interval types
match (info.logical_type_ref(), info.converted_type()) {
(Some(LogicalType::Decimal { scale, precision }), _) => {
+ check_decimal_length(type_length)?;
Review Comment:
Good catch on the wording! You're right: any positive byte length is valid
here, not just 16 or 32.
Per the [Parquet
spec](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#decimal)
for `fixed_len_byte_array`, the precision is limited only by the array size.
A decimal is valid as long as its precision fits within the byte length.
The upper bound of 32 is an Arrow constraint (`Decimal256` is 256 bits, which
caps it at 32 bytes), not a limitation from Parquet itself. Checking for a
range of `1..=32` makes sense here, especially since legitimate Parquet files
in the wild use byte lengths like 4 or 7.
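
To make the two constraints concrete, here is a rough sketch of how they could be checked separately. It is not the code in this PR: the body of `check_decimal_length` is my guess at the shape, the `String` error stands in for the crate's real error type, and `max_precision_for_length` is a hypothetical helper. The precision bound follows the spec's formula for `fixed_len_byte_array`, `floor(log10(2^(8*n - 1) - 1))`.

```rust
/// Widest decimal Arrow can represent: `Decimal256` is 256 bits = 32 bytes.
const MAX_DECIMAL_BYTES: i32 = 32;

/// Hypothetical stand-in for the PR's `check_decimal_length`.
/// Accepts any length Arrow can hold (1..=32), not just 16 or 32.
fn check_decimal_length(type_length: i32) -> Result<(), String> {
    if (1..=MAX_DECIMAL_BYTES).contains(&type_length) {
        Ok(())
    } else {
        Err(format!(
            "decimal type_length must be in 1..={MAX_DECIMAL_BYTES}, got {type_length}"
        ))
    }
}

/// Hypothetical helper: the largest precision a signed two's-complement
/// value of `type_length` bytes can hold, i.e. floor(log10(2^(8n - 1) - 1)).
/// Because 2^k is never a power of ten, this equals floor(k * log10(2)).
/// Assumes `type_length` has already passed `check_decimal_length`.
fn max_precision_for_length(type_length: i32) -> i32 {
    let value_bits = 8 * type_length - 1; // one bit is reserved for the sign
    (f64::from(value_bits) * 2.0_f64.log10()).floor() as i32
}
```

Under these assumptions a 4-byte array allows up to 9 digits of precision and a 16-byte array up to 38, which lines up with the `Decimal128` limit.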