jorgecarleitao commented on pull request #8796: URL: https://github.com/apache/arrow/pull/8796#issuecomment-735383594
Thanks a lot for all the comments so far.

@jhorstmann really good points. Unfortunately, I do not think it is just the asserts, because `Vec` performs the same asserts, as it is also safe code.

With respect to FFI, @jhorstmann and @nevi-me: I don't think FFI can rely on it. As @jhorstmann mentions, since this is only a recommendation, implementations must be able to handle non-aligned buffers. The Rust implementation is even funnier here, because the C data interface has no API to export `Buffer::offset` (only `Array::offset`). This implies that we need to offset the pointer by `Buffer::offset` when we export to the C data interface (details on #8401). I think that this makes the receiving end unable to determine whether the allocated region is aligned or not.

I think that this `Buffer::offset` may also destroy the benefit of alignment in our own implementation, as `ArrayData::data` will output a non-aligned byte slice whenever `Buffer::offset` is not 0. To use the aligned memory, I think we would need to use the data without the offset, perform the SIMD operation in chunks of 64 bytes _starting at the beginning of the buffer_, and then pass the offset to the new buffer.

@ritchie46 good question. The benchmarks include allocations and mutations, as they cover a wide range of situations.

@Dandandan that is also my current hypothesis: the implementation is competing with some of the brightest minds when we try to re-invent a `Vec`, and the benefits of 64-byte aligned memory do not overcome the benefits of a highly optimized container (`Vec`).
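
To make the `Buffer::offset` point concrete, here is a minimal sketch that uses only the standard library, not the Arrow crate. The helpers `is_aligned` and `alignment_after_offset` are hypothetical names introduced for illustration. The sketch allocates a 64-byte aligned region, shifts the pointer by a byte offset the way an export through the C data interface would have to, and reports whether each pointer still sits on a 64-byte boundary.

```rust
use std::alloc::{alloc, dealloc, Layout};

/// Alignment (in bytes) that the Arrow format recommends for buffers.
const ALIGNMENT: usize = 64;

/// Returns true when `ptr` sits on a 64-byte boundary.
fn is_aligned(ptr: *const u8) -> bool {
    (ptr as usize) % ALIGNMENT == 0
}

/// Hypothetical helper (not part of Arrow): allocates `len` bytes on a
/// 64-byte boundary, shifts the pointer by `offset` bytes, and returns
/// `(base is aligned, shifted pointer is aligned)`.
fn alignment_after_offset(len: usize, offset: usize) -> (bool, bool) {
    assert!(len > 0 && offset <= len);
    let layout = Layout::from_size_align(len, ALIGNMENT).expect("valid layout");
    // SAFETY: the layout has a non-zero size, the pointer is checked for
    // null, the shifted pointer stays within the allocation (or one past
    // its end), and the memory is freed with the same layout.
    unsafe {
        let base = alloc(layout);
        assert!(!base.is_null(), "allocation failed");
        let shifted = base.add(offset);
        let result = (is_aligned(base), is_aligned(shifted));
        dealloc(base, layout);
        result
    }
}

fn main() {
    for offset in [0, 8, 64, 100] {
        let (base, shifted) = alignment_after_offset(1024, offset);
        println!("offset {offset:>3}: base aligned = {base}, shifted pointer aligned = {shifted}");
    }
}
```

In this sketch only offsets that are a multiple of 64 keep the shifted pointer aligned. A consumer that receives just the shifted pointer cannot recover the base address, so it cannot tell whether the underlying allocation was aligned.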
