wjones127 commented on code in PR #33694:
URL: https://github.com/apache/arrow/pull/33694#discussion_r1081979118
##########
cpp/src/parquet/properties.h:
##########
@@ -452,19 +452,39 @@ class PARQUET_EXPORT WriterProperties {
return this->disable_statistics(path->ToDotString());
}
- /// Enable integer type to annotate decimal type as below:
- /// int32: 1 <= precision <= 9
- /// int64: 10 <= precision <= 18
- /// Default disabled.
- Builder* enable_integer_annotate_decimal() {
- integer_annotate_decimal_ = true;
+ /// Enable decimal logical type with 1 <= precision <= 18 to be stored as
+ /// integer physical type.
+ ///
+ /// According to the specs, DECIMAL can be used to annotate the following types:
+ /// - int32: for 1 <= precision <= 9.
+ /// - int64: for 1 <= precision <= 18; precision < 10 will produce a warning.
+ /// - fixed_len_byte_array: precision is limited by the array size.
+ /// Length n can store <= floor(log_10(2^(8*n - 1) - 1)) base-10 digits.
+ /// - binary: precision is not limited, but is required. The minimum number of
+ /// bytes to store the unscaled value should be used.
+ ///
+ /// By default, this is DISABLED and all decimal types annotate fixed_len_byte_array.
+ ///
+ /// When enabled, the C++ writer will use the following physical types to store decimals:
+ /// - int32: for 1 <= precision <= 9.
+ /// - int64: for 10 <= precision <= 18.
+ /// - fixed_len_byte_array: for precision > 18.
+ ///
+ /// As a consequence, decimal columns stored in integer types are more compact,
+ /// but there is a risk that the Parquet file may not be readable by previous
+ /// Arrow C++ versions or other implementations.
Review Comment:
FYI @pitrou
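
For anyone checking the precision bounds in the doc comment above, here is a minimal sketch of the two rules it describes: the digit capacity of an n-byte fixed_len_byte_array, and the physical type the writer is documented to pick. The names `MaxDecimalDigits`, `DecimalStorage`, and `ChooseDecimalStorage` are hypothetical and exist only for this illustration; they are not part of the Parquet C++ API, and the real writer logic may be structured differently.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical enum for illustration only; not a Parquet C++ type.
enum class DecimalStorage { kInt32, kInt64, kFixedLenByteArray };

// Number of base-10 digits that an n-byte two's-complement value can always
// hold: floor(log10(2^(8n - 1) - 1)), as stated in the doc comment.
// Because 2^k is never a power of 10 for k >= 1, this equals
// floor(k * log10(2)) with k = 8n - 1, which avoids computing 2^k directly.
inline int MaxDecimalDigits(int n_bytes) {
  assert(n_bytes >= 1);
  const int k = 8 * n_bytes - 1;
  return static_cast<int>(std::floor(k * std::log10(2.0)));
}

// Physical type selection as described in the doc comment. The flag mirrors
// the builder option under review; this function itself is an assumption.
inline DecimalStorage ChooseDecimalStorage(int precision, bool integer_annotate) {
  if (integer_annotate && precision >= 1 && precision <= 9) {
    return DecimalStorage::kInt32;
  }
  if (integer_annotate && precision >= 10 && precision <= 18) {
    return DecimalStorage::kInt64;
  }
  // Default (option disabled) and any precision > 18.
  return DecimalStorage::kFixedLenByteArray;
}
```

With 4 and 8 bytes the formula gives 9 and 18 digits, which matches the int32 and int64 precision limits quoted in the comment.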