AdamGS commented on code in PR #4389:
URL: https://github.com/apache/arrow-rs/pull/4389#discussion_r1225237874
##########
parquet/src/column/writer/mod.rs:
##########
@@ -683,6 +683,35 @@ impl<'a, E: ColumnValueEncoder> GenericColumnWriter<'a, E>
{
.append_row_count(self.page_metrics.num_buffered_rows as i64);
}
+ fn truncate_min_value(&self, data: &[u8]) -> Vec<u8> {
+ let max_effective_len =
+ self.props.minmax_value_truncate_len().unwrap_or(data.len());
+
+ match std::str::from_utf8(data) {
+ Ok(str_data) => truncate_utf8(str_data, max_effective_len),
+ Err(_) => truncate_binary(data, max_effective_len),
+ }
+ }
+
+ fn truncate_max_value(&self, data: &[u8]) -> Vec<u8> {
+ // Even if the user disables value truncation, we want to make sure to increase the max value so the user doesn't miss it.
Review Comment:
Agreed, it should be handled correctly now.
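For readers following the thread, here is a minimal sketch of the idea behind the hunk above: a truncated minimum can simply be cut short, while a truncated maximum has to be bumped upward so it still bounds every value that shares the prefix. The function names `truncate_utf8_sketch` and `increment_bytes_sketch` are hypothetical and are not the helpers used in the PR. The sketch also assumes plain byte-wise ordering, and incrementing the bytes of a UTF-8 string may produce bytes that are no longer valid UTF-8.

```rust
/// Hypothetical helper: truncate a UTF-8 string to at most `max_len` bytes,
/// backing up to the nearest character boundary so no code point is split.
fn truncate_utf8_sketch(data: &str, max_len: usize) -> Vec<u8> {
    let mut end = max_len.min(data.len());
    // Index 0 is always a boundary, so this loop terminates.
    while !data.is_char_boundary(end) {
        end -= 1;
    }
    data.as_bytes()[..end].to_vec()
}

/// Hypothetical helper: treat the bytes as a big-endian number and add one,
/// carrying leftwards. The result compares greater than any value that starts
/// with the original bytes. Returns `None` if every byte is 0xFF (or the input
/// is empty), since no same-length upper bound exists in that case.
fn increment_bytes_sketch(mut data: Vec<u8>) -> Option<Vec<u8>> {
    for byte in data.iter_mut().rev() {
        let (next, overflowed) = byte.overflowing_add(1);
        *byte = next;
        if !overflowed {
            return Some(data);
        }
    }
    None
}
```

Under these assumptions, a truncated max statistic would be the truncated prefix passed through `increment_bytes_sketch`, falling back to the untruncated value when it returns `None`.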