lidavidm commented on issue #47824:
URL: https://github.com/apache/arrow/issues/47824#issuecomment-3420071912

   If I understand right: 
   
   This refers to the length written as part of the message:
   
   > The metadata_size includes the size of the Message plus padding. The metadata_flatbuffer contains a serialized Message Flatbuffer value, which internally includes:
   
   But what we're concerned about is the length written to the IPC file footer, 
which also includes this size prefix? 
   
   I wish PyArrow offered more ways to introspect files, but if I serialize 
a message, I get this:
   
   ```
   b'\xff\xff\xff\xff0\x00\x00\x00\x10\x00\x00\x00\x00\x00\n\x00\x0c\x00\x06\x00\x05\x00\x08\x00\n\x00\x00\x00\x00\x01\x04\x00\x0c\x00\x00\x00\x08\x00\x08\x00\x00\x00\x04\x00\x08\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\x00\x00\x00\x00'
   ```
   
   which has `metadata_size` = 48
   
   ```
   In [12]: struct.unpack('<I', b'0\x00\x00\x00')
   Out[12]: (48,)
   ```
   
   which is consistent with the quote from the documentation. But for the file 
footer, I assume it needs to be `56` instead (48 plus the 8-byte prefix: the 
4-byte continuation marker and the 4-byte length), and that's where the 
confusion stems from. Are there two "message lengths" implicitly being 
discussed here, one the actual size of the Flatbuf message and the other the 
size of the "Arrow IPC message"?
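
To make the two sizes concrete, here is a minimal sketch that splits the bytes shown above into prefix and metadata. It assumes the 8 bytes ahead of the Flatbuffer are a 4-byte continuation marker followed by the 4-byte little-endian `metadata_size`, and that the footer value of 56 is that prefix added to the 48. `split_prefix` is a hypothetical helper written for this illustration, not a PyArrow API.

```python
import struct

# Bytes copied from the serialized message above, split into
# prefix / 48 bytes of metadata / trailing 8 bytes for readability.
buf = (
    b'\xff\xff\xff\xff0\x00\x00\x00'
    b'\x10\x00\x00\x00\x00\x00\n\x00\x0c\x00\x06\x00\x05\x00\x08\x00'
    b'\n\x00\x00\x00\x00\x01\x04\x00\x0c\x00\x00\x00\x08\x00\x08\x00'
    b'\x00\x00\x04\x00\x08\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00'
    b'\xff\xff\xff\xff\x00\x00\x00\x00'
)

CONTINUATION = 0xFFFFFFFF  # marker that precedes the length field
PREFIX_SIZE = 8            # 4-byte marker + 4-byte metadata_size


def split_prefix(data, offset=0):
    """Return (metadata_size, framed_size) for the message starting at offset.

    Hypothetical helper: framed_size is the assumed "length including the
    prefix", i.e. metadata_size plus the 8 prefix bytes.
    """
    marker, metadata_size = struct.unpack_from('<II', data, offset)
    if marker != CONTINUATION:
        raise ValueError("expected continuation marker 0xFFFFFFFF")
    return metadata_size, PREFIX_SIZE + metadata_size


metadata_size, framed_size = split_prefix(buf)
print(metadata_size, framed_size)  # 48 56
```

The 8 bytes left over after those 56 are another continuation marker followed by a zero length, so `split_prefix(buf, framed_size)` returns `(0, 8)`.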
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
