![arrow-hex](https://user-images.githubusercontent.com/2136005/45557397-7dd40c80-b80b-11e8-93e5-0f9275401f0e.png)

I am working on an Arrow implementation according to the specification as of 
the 0.10 release, and I’m confused about what appears to be extra padding in 
the file format. I generated a record batch and serialized it to the file 
format using pyarrow (0.10) and inspected the file in a hex editor.

• The first 6 bytes are the magic (ARROW1), as expected.
• The next 2 bytes are padding bytes (0x00) up to offset 0x07 (the 8-byte boundary).

As far as I understand, the following bytes should be the streaming format; 
however, there is another zero byte (padding?) at offset 0x08 (just after the 
boundary). This byte is followed by a valid message size, and the rest of the 
format is constructed as expected. Am I missing something in the serialization 
documentation about padding after the magic? Am I misunderstanding the concept 
of an 8-byte boundary? 
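
For anyone who wants to check the same bytes without a hex editor, here is a minimal sketch of how I am reading the header. It assumes the length prefix is a little-endian 32-bit integer, and the helper name `describe_header` is my own, not part of pyarrow. It reads an int32 both at offset 0x08 and at offset 0x09 so the two interpretations can be compared.

```python
import struct

MAGIC = b"ARROW1"


def describe_header(data: bytes) -> dict:
    """Split the start of an Arrow file into magic, padding, and what follows.

    Hypothetical helper for illustration only; it is not a pyarrow API.
    """
    magic = data[:len(MAGIC)]

    # Bytes needed to round len(MAGIC) up to the next multiple of 8.
    pad_len = -len(MAGIC) % 8
    boundary = len(MAGIC) + pad_len
    padding = data[len(MAGIC):boundary]

    # Assumption: the size prefix is a little-endian int32.
    # Read it at the boundary and one byte later, to compare both readings.
    (at_boundary,) = struct.unpack_from("<i", data, boundary)
    (after_extra_byte,) = struct.unpack_from("<i", data, boundary + 1)

    return {
        "magic_ok": magic == MAGIC,
        "pad_len": pad_len,
        "padding_is_zero": padding == b"\x00" * pad_len,
        "boundary": boundary,
        "int32_at_boundary": at_boundary,
        "int32_after_extra_byte": after_extra_byte,
    }
```

To run it against a real file, pass in the first 16 or so bytes, for example `describe_header(open(path, "rb").read(16))`, where `path` points at the file written by pyarrow.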

I am using this documentation as a reference:

https://github.com/apache/arrow/blob/master/format/IPC.md

What is that extra byte doing there? I can't seem to find the definition in the 
spec.

[ Full content available at: https://github.com/apache/arrow/issues/2559 ]