tustvold opened a new issue #1416: URL: https://github.com/apache/arrow-rs/issues/1416
**Describe the bug**

The code at https://github.com/apache/arrow-rs/blob/master/parquet/src/encodings/encoding.rs#L577 skips over the miniblock bit widths, and then only goes back and writes a value for the miniblocks that contain a non-zero number of values. The empty miniblocks are left with whatever value happens to be in the encoder's buffer.

**To Reproduce**

This is one of the underlying bugs behind https://github.com/apache/arrow-datafusion/issues/1976

**Expected behavior**

Whilst the specification technically allows for arbitrary padding, it seems like a good idea to avoid non-deterministic output where possible.
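To illustrate the expected behavior, here is a minimal sketch of zero-filling the reserved bit-width bytes before writing the widths of the non-empty miniblocks. It is not the arrow-rs encoder: `write_bit_widths`, its parameters, and the buffer layout are hypothetical names chosen for this example, and the stale bytes are simulated with a pre-filled `Vec<u8>`.

```rust
/// Hypothetical helper (not part of arrow-rs): writes one bit-width byte per
/// miniblock into `buf[offset..offset + num_mini_blocks]`.
/// `widths` holds the bit widths of the non-empty miniblocks only.
fn write_bit_widths(buf: &mut [u8], offset: usize, num_mini_blocks: usize, widths: &[u8]) {
    assert!(widths.len() <= num_mini_blocks);
    let slots = &mut buf[offset..offset + num_mini_blocks];
    // Zero the whole reserved region first, so empty miniblocks get a
    // deterministic width of 0 instead of whatever was in the buffer.
    slots.fill(0);
    // Then write the real widths for the miniblocks that contain values.
    slots[..widths.len()].copy_from_slice(widths);
}

fn main() {
    // Two buffers with different stale contents, simulating buffer reuse.
    let mut a = vec![0xAA_u8; 8];
    let mut b = vec![0x55_u8; 8];

    // Four miniblocks, of which only the first two contain values.
    write_bit_widths(&mut a, 2, 4, &[3, 5]);
    write_bit_widths(&mut b, 2, 4, &[3, 5]);

    // The bit-width region is identical regardless of the stale contents.
    assert_eq!(&a[2..6], &b[2..6]);
    println!("{:?}", &a[2..6]);
}
```

Without the `fill(0)` call, the last two bytes of the region would differ between `a` and `b`, which is the non-determinism described above.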
