alamb commented on code in PR #3578:
URL: https://github.com/apache/arrow-rs/pull/3578#discussion_r1083968718
##########
parquet/src/encodings/rle.rs:
##########
@@ -18,8 +18,9 @@
use std::{cmp, mem::size_of};
use crate::errors::{ParquetError, Result};
+use crate::util::bit_util::from_le_slice;
use crate::util::{
- bit_util::{self, from_ne_slice, BitReader, BitWriter, FromBytes},
Review Comment:
to other reviewers, this PR changes from native endian to little endian
(parquet is always little endian) -- more comments on this below
##########
parquet/src/file/serialized_reader.rs:
##########
@@ -1502,7 +1503,7 @@ mod tests {
//col9->date_string_col: BINARY UNCOMPRESSED DO:0 FPO:332847
SZ:111948/111948/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 01/01/09, max:
12/31/10, num_nulls: 0]
assert!(!&page_indexes[0][8].is_sorted());
if let Index::BYTE_ARRAY(index) = &page_indexes[0][8] {
- check_bytes_page_index(
Review Comment:
I confirmed there is exiting coverage for boolean a few lines above (bonus
that the interface hasn't changed 👍 )

##########
parquet/src/bin/parquet-index.rs:
##########
@@ -156,12 +156,12 @@ fn print_index<T: std::fmt::Debug>(
idx, o.offset, o.compressed_page_size, row_count
);
match &c.min {
- Some(m) => print!(", min {:>10?}", m),
+ Some(m) => print!(", min {:>10}", m),
Review Comment:
👍
##########
parquet/src/file/statistics.rs:
##########
@@ -181,11 +181,11 @@ pub fn from_thrift(
// min/max statistics for INT96 columns.
let min = min.map(|data| {
assert_eq!(data.len(), 12);
- from_ne_slice::<Int96>(&data)
+ from_le_slice::<Int96>(&data)
Review Comment:
> For native types, this outputs the data as little endian. Floating point
types are encoded in IEEE.
👍
##########
parquet/src/file/serialized_reader.rs:
##########
@@ -1515,7 +1516,7 @@ mod tests {
//col10->string_col: BINARY UNCOMPRESSED DO:0 FPO:444795
SZ:45298/45298/1.00 VC:7300 ENC:BIT_PACKED,RLE,PLAIN ST:[min: 0, max: 9,
num_nulls: 0]
assert!(&page_indexes[0][9].is_sorted());
if let Index::BYTE_ARRAY(index) = &page_indexes[0][9] {
- check_bytes_page_index(
Review Comment:
is there any coverage for FIXED_LENGTH_BYTE_ARRAY?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]