Hi all,

As we've been discussing [1], there is a need to introduce 4 bytes of padding into the preamble of the "encapsulated IPC message" format to ensure that the Flatbuffers metadata payload begins on an 8-byte aligned memory offset. The alternative to this would be for Arrow implementations where alignment is important (e.g. C or C++) to copy the metadata (which is not always small) into aligned memory whenever it is unaligned.
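To make the alignment issue concrete, here is a small Python sketch. It is illustrative only: the helper names `metadata_offset` and `is_aligned` are hypothetical and not part of any Arrow library, and it assumes the encapsulated message itself starts on an 8-byte boundary. It computes where the Flatbuffers metadata would begin with the current 4-byte length prefix versus with 4 extra bytes in the preamble.

```python
ALIGNMENT = 8  # desired alignment of the Flatbuffers metadata, in bytes


def metadata_offset(message_start: int, prefix_size: int) -> int:
    """Offset at which the metadata begins, given the preamble size."""
    return message_start + prefix_size


def is_aligned(offset: int, alignment: int = ALIGNMENT) -> bool:
    """True if the offset is a multiple of the alignment."""
    return offset % alignment == 0


# Assumption: the message begins on an 8-byte boundary.
message_start = 64

current = metadata_offset(message_start, prefix_size=4)  # 4-byte length only
padded = metadata_offset(message_start, prefix_size=8)   # 4 extra bytes + length

print(current, is_aligned(current))  # 68 False
print(padded, is_aligned(padded))    # 72 True
```

With only the 4-byte length in front, the metadata always lands 4 bytes past an 8-byte boundary, which is why the extra 4 bytes are needed.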
Micah has proposed to address this by adding a 4-byte "continuation" value at the beginning of the payload having the value 0xFFFFFFFF. The reason to do it this way is that old clients will see an invalid length (what is currently the first 4 bytes of the message -- a 32-bit little endian signed integer indicating the metadata length) rather than potentially crashing on a valid length.

We also propose to expand the "end of stream" marker used in the stream and file format from 4 to 8 bytes. This has the additional effect of aligning the file footer defined in File.fbs.

This would be a backwards incompatible protocol change, so older Arrow libraries would not be able to read these new messages. Maintaining forward compatibility (reading data produced by older libraries) would be possible as we can reason that a value other than the continuation value was produced by an older library (and then validate the Flatbuffer message of course). Arrow implementations could offer a backward compatibility mode for the sake of old readers if they desire (this may also assist with testing).

Additionally with this vote, we want to formally approve the change to the Arrow "file" format to always write the (new 8-byte) end-of-stream marker, which enables code that processes Arrow streams to safely read the file's internal messages as though they were a normal stream.

The PR making these changes to the IPC documentation is here

https://github.com/apache/arrow/pull/4951

Please vote to accept these changes. This vote will be open for at least 72 hours

[ ] +1 Adopt these Arrow protocol changes
[ ] +0
[ ] -1 I disagree because...

Here is my vote: +1

Thanks,
Wes

[1]: https://lists.apache.org/thread.html/8440be572c49b7b2ffb76b63e6d935ada9efd9c1c2021369b6d27786@%3Cdev.arrow.apache.org%3E
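
P.S. For anyone implementing this, the reader-side detection described above could look roughly like the following Python sketch. This is not code from any Arrow implementation: the function name `read_metadata_length` is hypothetical, and it only covers the preamble (continuation value and metadata length), not Flatbuffer validation or end-of-stream handling.

```python
import io
import struct

CONTINUATION = 0xFFFFFFFF  # proposed 4-byte continuation value


def read_metadata_length(stream):
    """Read the message preamble; return (metadata_length, has_continuation)."""
    first = stream.read(4)
    if len(first) != 4:
        raise EOFError("truncated message preamble")
    (value,) = struct.unpack("<I", first)
    if value == CONTINUATION:
        # New format: the 32-bit length follows the continuation value.
        second = stream.read(4)
        if len(second) != 4:
            raise EOFError("truncated message preamble")
        (length,) = struct.unpack("<i", second)
        return length, True
    # Old format: the first 4 bytes are the signed little-endian length.
    (length,) = struct.unpack("<i", first)
    return length, False


new_msg = io.BytesIO(struct.pack("<Ii", CONTINUATION, 256))
old_msg = io.BytesIO(struct.pack("<i", 256))

print(read_metadata_length(new_msg))  # (256, True)
print(read_metadata_length(old_msg))  # (256, False)

# What an old reader would see when handed a new-format message:
(legacy_view,) = struct.unpack("<i", struct.pack("<I", CONTINUATION))
print(legacy_view)  # -1, an invalid (negative) length
```

The last lines show why 0xFFFFFFFF is a convenient choice: read as a signed 32-bit little-endian integer it is -1, so an old reader gets an obviously invalid length instead of a plausible one.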