tustvold commented on code in PR #2357:
URL: https://github.com/apache/arrow-rs/pull/2357#discussion_r940134949
##########
parquet/src/arrow/array_reader/primitive_array.rs:
##########
@@ -208,7 +210,7 @@ where
))
}
}
- .with_precision_and_scale(p, s)?;
+ .with_precision_and_scale(p, s, false)?;
Review Comment:
An API is unsound if safe code can use it to violate the invariants of a data
structure, and thereby potentially introduce UB, without calling any unsafe
APIs. As parquet IO is safe, if precision were such an invariant and parquet
did not validate the data on read, it would be unsound.
This leads to a few options, and possibly more besides:
1. Don't care about decimals exceeding the precision and don't make this an
invariant
2. Make this an invariant and validate on read
3. Add an unsafe flag to skip validation on read
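To make options 2 and 3 concrete, here is a rough sketch. `DecimalValues`,
`try_new` and `new_unchecked` are hypothetical names for illustration only,
not the arrow-rs `DecimalArray` API, and the check assumes `precision <= 38`
so that the power of ten fits in an `i128`:

```rust
/// Hypothetical stand-in for a decimal array; NOT the arrow-rs type.
#[derive(Debug)]
pub struct DecimalValues {
    values: Vec<i128>,
    precision: u32,
}

/// Largest magnitude representable with `precision` decimal digits.
/// Assumes `precision <= 38`, otherwise the power overflows i128.
fn max_for_precision(precision: u32) -> i128 {
    10_i128.pow(precision) - 1
}

impl DecimalValues {
    /// Option 2: precision is an invariant, so every safe constructor
    /// (including an IO reader) has to validate the values.
    pub fn try_new(values: Vec<i128>, precision: u32) -> Result<Self, String> {
        let max = max_for_precision(precision);
        for v in &values {
            if *v > max || *v < -max {
                return Err(format!("{} does not fit in precision {}", v, precision));
            }
        }
        Ok(Self { values, precision })
    }

    /// Option 3: skip validation, pushing the obligation onto the caller.
    ///
    /// # Safety
    /// Every value must fit within `precision` decimal digits.
    pub unsafe fn new_unchecked(values: Vec<i128>, precision: u32) -> Self {
        Self { values, precision }
    }

    pub fn values(&self) -> &[i128] {
        &self.values
    }

    pub fn precision(&self) -> u32 {
        self.precision
    }
}
```

Under option 1 neither constructor would need the check, because nothing
downstream would be allowed to rely on values fitting the precision.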
I want to take a step back before undertaking any of these, as frankly I am
deeply confused by what this precision argument is actually for - why
arbitrarily truncate your value space?