[
https://issues.apache.org/jira/browse/ARROW-9035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17126052#comment-17126052
]
Anthony Abate commented on ARROW-9035:
--------------------------------------
Perhaps in RFC terms (https://tools.ietf.org/html/rfc2119), the
doc should say:
All buffers (both the Flatbuffers metadata and the data buffers) MUST be 8-byte
aligned but SHOULD be 64-byte aligned. This would apply to both sections.
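As a rough illustration of that wording, a buffer offset could be checked against the two levels as follows. This is only a sketch: `classify_offset` and its return labels are made up for this comment and are not part of any Arrow API.

```python
# Hypothetical helper, not part of the Arrow libraries: classify a buffer
# offset against the proposed RFC 2119 wording.
MUST_ALIGNMENT = 8     # proposed hard requirement
SHOULD_ALIGNMENT = 64  # proposed recommendation


def classify_offset(offset: int) -> str:
    """Return which level of the proposed wording an offset satisfies."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    if offset % MUST_ALIGNMENT != 0:
        return "violates MUST"
    if offset % SHOULD_ALIGNMENT != 0:
        return "meets MUST only"
    return "meets MUST and SHOULD"
```

Under this reading, a reader would reject only the first case and accept the other two.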
With most of the docs stressing 64-byte alignment, I didn't realize that the
'default' alignment of the C++ library is 8 bytes. I assumed it would be 64 bytes.
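For reference, here is the arithmetic behind the two sizes in the issue below, as a small sketch (`padded_length` is an illustrative name, not a library function). Five int32 values occupy 20 bytes, which rounds up to 24 at 8-byte alignment and to 64 at 64-byte alignment.

```python
def padded_length(nbytes: int, alignment: int) -> int:
    """Round nbytes up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (nbytes + alignment - 1) & ~(alignment - 1)


data_bytes = 5 * 4  # five int32 values
print(padded_length(data_bytes, 8))   # 24, matching bodyLength in the issue
print(padded_length(data_bytes, 64))  # 64, what the padding examples suggest
```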
> 8 vs 64 byte alignment
> ----------------------
>
> Key: ARROW-9035
> URL: https://issues.apache.org/jira/browse/ARROW-9035
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Documentation
> Affects Versions: 0.17.0
> Reporter: Anthony Abate
> Priority: Minor
>
> I used the C++ library to create a very small Arrow file (1 field of 5 int32)
> and was surprised that the buffers are not aligned to 64 bytes as per the
> documentation section "Buffer Alignment and Padding" with examples. Based on
> the examples there, the 20 bytes of int32 should be padded to 64 bytes, but
> they are only padded to 24 (see below).
> Extracted message metadata (see bodyLength = 24):
> {code:java}
> {
>   version: "V4",
>   header_type: "RecordBatch",
>   header: {
>     nodes: [
>       {
>         length: 5,
>         null_count: 0
>       }
>     ],
>     buffers: [
>       {
>         offset: 0,
>         length: 0
>       },
>       {
>         offset: 0,
>         length: 20
>       }
>     ]
>   },
>   bodyLength: 24
> }
> {code}
> Reading further down, the documentation section "Encapsulated message format"
> says serialization should use 8 byte alignment.
> These two sections seem to be at odds with each other, and some clarification
> is needed.
> Is the documentation wrong?
> Or should 8 byte alignment be used for File and 64 byte for IPC?