wjones127 edited a comment on pull request #12216: URL: https://github.com/apache/arrow/pull/12216#issuecomment-1022834715
@emkornfield I have debugged further and believe I have narrowed down the approximate place where the data is being corrupted. I added two `ValidateFull()` calls that appear to bracket the corruption: the one at `parquet/arrow/reader_internal.cc:780` passes, but the one at `parquet/arrow/reader.cc:482` fails.

The error I get when I run:

```
56: /Users/willjones/Documents/arrows/arrow/cpp/src/parquet/arrow/reader.cc:482: Check failed: _s.ok() Operation failed: out_->ValidateFull()
56: Bad status: Invalid: In chunk 0: Invalid: null_count value (854) doesn't match actual number of nulls in array (861)
56: /Users/willjones/Documents/arrows/arrow/cpp/src/arrow/array/validate.cc:118 ValidateNulls(*data.type)
```

It seems that `LoadBatch()` on the leaf node reads the correct data, but by the time `MakeArray()` is called the data has somehow been corrupted. I will debug further to try to narrow down where that happens.
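
For readers unfamiliar with the failing check: the message reports that the cached `null_count` (854) disagrees with the number of nulls counted from the array's validity bitmap (861). The following is a minimal standalone sketch of that kind of consistency check, not Arrow's actual implementation. `ToyArray`, `CountNulls`, and `CheckNullCount` are hypothetical names introduced only for illustration; the sketch assumes an LSB-first validity bitmap in which a set bit marks a valid slot.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for an array's metadata; this is not Arrow's ArrayData.
struct ToyArray {
  std::vector<uint8_t> validity;  // LSB-first bitmap: bit i set means slot i is valid
  int64_t length;                 // number of logical slots covered by the bitmap
  int64_t null_count;             // cached null count stored alongside the bitmap
};

// Count the slots whose validity bit is unset.
int64_t CountNulls(const ToyArray& arr) {
  int64_t nulls = 0;
  for (int64_t i = 0; i < arr.length; ++i) {
    const uint8_t byte = arr.validity[static_cast<std::size_t>(i / 8)];
    const bool valid = ((byte >> (i % 8)) & 1) != 0;
    if (!valid) ++nulls;
  }
  return nulls;
}

// Return an empty string when the cached count matches the bitmap,
// otherwise a message shaped like the one in the log above.
std::string CheckNullCount(const ToyArray& arr) {
  const int64_t actual = CountNulls(arr);
  if (actual == arr.null_count) return "";
  std::ostringstream msg;
  msg << "null_count value (" << arr.null_count
      << ") doesn't match actual number of nulls in array (" << actual << ")";
  return msg.str();
}
```

A mismatch like this means either the bitmap or the cached count changed after the other was computed, which is why a check that passes at one call site and fails at a later one helps bracket where the change happens.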
