lidavidm commented on issue #47824:
URL: https://github.com/apache/arrow/issues/47824#issuecomment-3420071912
If I understand correctly, this refers to the length written as part of the message:
> The metadata_size includes the size of the Message plus padding. The
> metadata_flatbuffer contains a serialized Message Flatbuffer value, which
> internally includes:
But what we're concerned about is the length written to the IPC file footer,
which also includes this size prefix?
I wish PyArrow offered better ways to introspect files, but if I serialize
a message, I get this:
```
b'\xff\xff\xff\xff0\x00\x00\x00\x10\x00\x00\x00\x00\x00\n\x00\x0c\x00\x06\x00\x05\x00\x08\x00\n\x00\x00\x00\x00\x01\x04\x00\x0c\x00\x00\x00\x08\x00\x08\x00\x00\x00\x04\x00\x08\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00\xff\xff\xff\xff\x00\x00\x00\x00'
```
which has `metadata_size` = 48:
```
In [12]: struct.unpack('<I', b'0\x00\x00\x00')
Out[12]: (48,)
```
which is consistent with the quote from the documentation. But for the file
footer, I assume it needs to be `56` instead, i.e. the 48 bytes of metadata
plus the 8-byte prefix (the 4-byte continuation marker and the 4-byte
`metadata_size`), and that's where the confusion stems from. There are two
"message lengths" implicitly being discussed here: one is the actual size of
the Flatbuffer message, and the other is the size of the "Arrow IPC message"?
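To make the two lengths concrete, here is a rough sketch that splits the bytes quoted above using only `struct`. The variable names (`prefix_size`, `prefixed_size`, and so on) are mine, not Arrow's, and treating `56` as the value the footer wants is still just the assumption stated above, not something this snippet verifies:

```python
import struct

# The serialized bytes from above, copied verbatim and split for readability.
data = (
    b'\xff\xff\xff\xff0\x00\x00\x00'  # continuation marker + metadata_size
    b'\x10\x00\x00\x00\x00\x00\n\x00\x0c\x00\x06\x00\x05\x00\x08\x00'
    b'\n\x00\x00\x00\x00\x01\x04\x00\x0c\x00\x00\x00\x08\x00\x08\x00'
    b'\x00\x00\x04\x00\x08\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00'
    b'\xff\xff\xff\xff\x00\x00\x00\x00'  # what follows the message
)

# Two little-endian uint32 values: the continuation marker and metadata_size.
continuation, metadata_size = struct.unpack_from('<II', data, 0)

prefix_size = 8                                 # 4-byte marker + 4-byte size
prefixed_size = prefix_size + metadata_size     # length including the prefix
flatbuffer_bytes = data[prefix_size:prefixed_size]
trailing = data[prefixed_size:]                 # bytes after this message

print(hex(continuation))      # 0xffffffff
print(metadata_size)          # 48 (Flatbuffer plus padding)
print(prefixed_size)          # 56 (assumed to be what the footer records)
print(len(flatbuffer_bytes))  # 48
print(trailing)               # b'\xff\xff\xff\xff\x00\x00\x00\x00'
```

The trailing 8 bytes look like the stream's end-of-stream marker (a continuation marker followed by a zero length), so they are not part of either length.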