Hi Arrow developers,

I'm currently working on IPC in Rust, specifically reading Arrow files.
I've noticed that null buffers/bitmaps in files written by pyarrow are
always padded to 64 bits (I haven't checked other implementations), while
in Rust we pad to 8 bits.

1. Is Rust's 8-bit padding fine per the spec?

I'm having issues with reading, but only because I'm comparing the full
array data rather than just the values and nullness of slots. I see this
being more of a problem when writing to files and streams, as we'd need to
pad null buffers almost every time. For large arrays IPC could need 2048
while we have 2046, so it's not only a small-data issue.
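
To make the padding concrete, here is a rough sketch of what the writer
would have to do. The function names are hypothetical (they are not from the
Rust Arrow crate), and I'm assuming the 2046/2048 figures are byte lengths
and that "64 bits" means rounding up to a multiple of 8 bytes:

```rust
/// Round `len` (in bytes) up to the next multiple of `alignment` bytes.
/// Assumes `alignment` is a power of two, e.g. 8 for 64-bit padding.
fn padded_len(len: usize, alignment: usize) -> usize {
    debug_assert!(alignment.is_power_of_two());
    (len + alignment - 1) & !(alignment - 1)
}

/// Bytes needed to hold a validity bitmap for `num_slots` slots,
/// before any padding (one bit per slot, rounded up to whole bytes).
fn bitmap_bytes(num_slots: usize) -> usize {
    (num_slots + 7) / 8
}

/// Copy a null bitmap and zero-fill it up to a multiple of 8 bytes.
/// This is the extra step a writer needs if buffers are only padded
/// to 8 bits in memory.
fn pad_to_64_bits(buf: &[u8]) -> Vec<u8> {
    let mut out = buf.to_vec();
    out.resize(padded_len(buf.len(), 8), 0);
    out
}
```

With these, padded_len(2046, 8) gives 2048, which matches the mismatch
described above. If buffers were already allocated at the padded length,
the copy in pad_to_64_bits would not be needed on write.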

2. If implementations are allowed to choose either 8 or 64, are the Rust
committers happy with us changing to 64-bit padding?

The benefits of changing to 64 would be removing the need to pad the
buffer when writing to streams and files, and it would make us more
compatible with other implementations. I suspect this would come up as an
issue anyway once we add Rust to the interop tests.

I tried changing to 64-bit before writing this mail, but bit-fu is still
beyond my knowledge, so I'd need help from someone else with implementing
this, or at least with pointing out which lines to change. I don't mind
then making sure all tests still pass.

My goal is to complete the IPC work by the 0.14 release, so this is a bit
urgent as I'm stuck right now.

Thanks
Neville
