hhhizzz commented on code in PR #8733:
URL: https://github.com/apache/arrow-rs/pull/8733#discussion_r2504577957


##########
parquet/src/column/reader.rs:
##########
@@ -403,7 +465,27 @@ where
     /// Returns false if there's no page left.
     fn read_new_page(&mut self) -> Result<bool> {
         loop {
-            match self.page_reader.get_next_page()? {
+            let page_result = match self.page_reader.get_next_page() {
+                Ok(page) => page,
+                Err(err) => {
+                    return match err {
+                        ParquetError::General(message)
+                            if message
+                                .starts_with("Invalid offset in sparse column chunk data:") =>
+                        {
+                            let metadata = self.page_reader.peek_next_page()?;
+                            // Some writers omit data pages for sparse column chunks and encode the gap
+                            // as a reader-visible error. Use the metadata peek to synthesise a page of
+                            // null definition levels so downstream consumers see consistent row counts.
+                            self.try_create_synthetic_page(metadata)?;

Review Comment:
   Removed the code for this synthetic page.


