paddyhoran commented on pull request #8116: URL: https://github.com/apache/arrow/pull/8116#issuecomment-687998129
I mentioned this over on [ARROW-9921](https://github.com/apache/arrow/pull/8117) but I don't think we originally intended `From<Vec<_>>` to be used for anything other than testing. I thought that we needed to use the functions from [memory](https://github.com/apache/arrow/blob/master/rust/arrow/src/memory.rs) to allocate (to control alignment and padding), but this change lets the `Vec` do the allocation (via `collect`). I think you would want to ensure that all arrays are allocated with a consistent alignment, i.e. use memory.rs.

From the spec:

> Implementations are recommended to allocate memory on aligned addresses (multiple of 8- or 64-bytes) and pad (overallocate) to a length that is a multiple of 8 or 64 bytes. When serializing Arrow data for interprocess communication, these alignment and padding requirements are enforced.

This approach might be fine for an application in the wild (that won't use IPC), but DataFusion is part of the Arrow project itself and so *should* follow the rules/recommendations. Thoughts?
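
To make the alignment and padding point concrete, here is a minimal sketch of what I mean. It uses only `std::alloc` and is not the actual code in memory.rs; `ALIGNMENT`, `padded_len` and `AlignedBuf` are hypothetical names for illustration, and 64 is just one of the two values the spec recommends.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

/// Alignment and padding granularity assumed for this sketch
/// (the spec recommends 8 or 64 bytes).
const ALIGNMENT: usize = 64;

/// Round `len` up to the next multiple of `ALIGNMENT`.
fn padded_len(len: usize) -> usize {
    (len + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT
}

/// Hypothetical helper: a zeroed byte buffer whose start address is
/// `ALIGNMENT`-aligned and whose capacity is padded to a multiple of
/// `ALIGNMENT`.
struct AlignedBuf {
    ptr: *mut u8,
    layout: Layout,
}

impl AlignedBuf {
    fn new(len: usize) -> Self {
        // Request at least one full block so the layout is never zero-sized,
        // which `alloc_zeroed` does not permit.
        let size = padded_len(len.max(1));
        let layout = Layout::from_size_align(size, ALIGNMENT).expect("invalid layout");
        // Safety: `layout` has a non-zero size.
        let ptr = unsafe { alloc_zeroed(layout) };
        assert!(!ptr.is_null(), "allocation failed");
        AlignedBuf { ptr, layout }
    }

    fn as_ptr(&self) -> *const u8 {
        self.ptr
    }

    /// Padded capacity in bytes.
    fn capacity(&self) -> usize {
        self.layout.size()
    }
}

impl Drop for AlignedBuf {
    fn drop(&mut self) {
        // Safety: `ptr` was allocated with exactly this `layout`.
        unsafe { dealloc(self.ptr, self.layout) };
    }
}
```

By contrast, a `Vec<T>` produced by `collect` is only guaranteed to be aligned to `align_of::<T>()` (4 bytes for `i32`, for example), and its capacity is not padded to any particular multiple, so whether it happens to satisfy the recommendation depends on the allocator.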
