etseidl commented on code in PR #231: URL: https://github.com/apache/parquet-format/pull/231#discussion_r1502929393
##########
Encodings.md:
##########
@@ -247,6 +253,15 @@ and handled as wrapping around in 2's complement notation so that the original
 values are correctly restituted. This may require explicit care in some programming languages
 (for example by doing all arithmetic in the unsigned domain).
+One strategy that might be employed to avoid the above mentioned overflow is to
+perform the subtraction utilizing integers with a larger number of bits. For example,
+while encoding INT32 data one might choose to perform arithmetic operations using
+64-bit integers. This can lead to situtations where the number of bits used to encode
+the resulting deltas is greater than the number of bits used to represent the input
+values. While this behavior is allowed, data produced in this manner may not be

Review Comment:
   I actually agree with @pitrou, but after feedback on the mailing list (and watching other similar proposals), I thought the squishy middle ground of "writers _should_ not do this, but readers _should_ accept it" would be more palatable. I'm totally fine with simply adding a sentence to the end of the preceding paragraph. That adds the clarity I want as the developer of an implementation (and is much less wordy 😉).
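
   For anyone following along, here is a rough sketch of the difference the hunk describes. It is not taken from any Parquet implementation: the helper names (`wrap_int32`, `deltas_wrapping`, `deltas_wide`, `packed_bit_width`, `decode_wrapping`) are made up for illustration, and the model is simplified to a single block where the minimum delta is subtracted before measuring the bit width. Python integers are unbounded, so 32-bit wrapping is emulated with explicit modular arithmetic.

```python
INT32_MIN = -(2**31)
INT32_MAX = 2**31 - 1


def wrap_int32(x: int) -> int:
    """Reduce x modulo 2**32 into the signed 32-bit range (2's complement wrap)."""
    return (x + 2**31) % 2**32 - 2**31


def deltas_wrapping(values):
    """Deltas computed in 32 bits, letting overflow wrap around."""
    return [wrap_int32(b - a) for a, b in zip(values, values[1:])]


def deltas_wide(values):
    """Deltas computed with wider integers, so the subtraction never overflows."""
    return [b - a for a, b in zip(values, values[1:])]


def packed_bit_width(deltas, wrap: bool) -> int:
    """Bits needed for the largest (delta - min_delta) in this simplified block.

    With wrap=True the adjusted delta is treated as an unsigned 32-bit value.
    """
    min_delta = min(deltas)
    adjusted = [d - min_delta for d in deltas]
    if wrap:
        adjusted = [a % 2**32 for a in adjusted]
    return max(a.bit_length() for a in adjusted)


def decode_wrapping(first, deltas):
    """Rebuild the values by adding deltas with 32-bit wraparound."""
    out = [first]
    for d in deltas:
        out.append(wrap_int32(out[-1] + d))
    return out


values = [INT32_MIN, INT32_MAX, INT32_MIN]

wrapped = deltas_wrapping(values)  # [-1, 1]
wide = deltas_wide(values)         # [4294967295, -4294967295]

print(packed_bit_width(wrapped, wrap=True))   # 2
print(packed_bit_width(wide, wrap=False))     # 33
print(decode_wrapping(values[0], wrapped) == values)  # True
```

   In this toy case the wrapping encoder needs 2 bits per delta and still round-trips, while the wide encoder needs 33 bits for INT32 input, which is the situation the added paragraph is talking about.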
