etseidl commented on issue #6778: URL: https://github.com/apache/arrow-rs/issues/6778#issuecomment-2495136003
After some git dumpster diving, it seems that at one point the offset indexes were not written if the column indexes were not valid: https://github.com/apache/arrow-rs/blame/fba19b0142daed54c181cdb8f634f29cf7d37f8d/parquet/src/column/writer/mod.rs#L503-L510

Writing of the page indexes was decoupled in #4567. Since this is a special case where page indexes are desired but the column index cannot be written because the values are all NaN, it seems the original intent was to not write the offset index. I think parquet-rs should therefore not write offset indexes if page statistics are not enabled.
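To make the two behaviours discussed in this comment concrete, here is a minimal sketch of the decision logic as I read it. It is not the parquet-rs implementation: `StatsLevel`, `PageIndexPlan`, `coupled_plan` and `proposed_plan` are hypothetical names invented for illustration, and the choice to still write the offset index when page statistics are enabled but the column index is invalid is an assumption, since the comment only states the "not enabled" case.

```rust
/// Hypothetical stand-in for the writer's statistics setting (not the crate's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum StatsLevel {
    None,
    Chunk,
    Page,
}

/// Which page indexes would be emitted for one column chunk.
#[derive(Debug, PartialEq, Eq)]
struct PageIndexPlan {
    write_column_index: bool,
    write_offset_index: bool,
}

/// Coupled behaviour as described for the older code: the offset index
/// is only written when the column index is valid.
fn coupled_plan(level: StatsLevel, column_index_valid: bool) -> PageIndexPlan {
    let write = level == StatsLevel::Page && column_index_valid;
    PageIndexPlan {
        write_column_index: write,
        write_offset_index: write,
    }
}

/// Proposed behaviour (one reading): no offset index unless page statistics
/// are enabled. Assumption: when they are enabled, the offset index is written
/// even if the column index had to be dropped (e.g. all values are NaN).
fn proposed_plan(level: StatsLevel, column_index_valid: bool) -> PageIndexPlan {
    let page_stats = level == StatsLevel::Page;
    PageIndexPlan {
        write_column_index: page_stats && column_index_valid,
        write_offset_index: page_stats,
    }
}
```

Under this sketch the two policies differ only in the all-NaN case: with page statistics enabled and an invalid column index, `coupled_plan` writes neither index while `proposed_plan` still writes the offset index. With statistics set to `None` or `Chunk`, both write nothing.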
