jhorstmann commented on pull request #8796:
URL: https://github.com/apache/arrow/pull/8796#issuecomment-735366043
I'm surprised too. It might just be the removed assertions, but I did not
expect them to have measurable overhead. If you want to investigate further you
could try removing only those, or replacing them with `debug_assert!`. I can't easily
reproduce it on my notebook because the run-to-run variation is too high; I would
need to spin up another EC2 instance to get stable benchmarks.
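To illustrate the `debug_assert!` idea, here is a minimal sketch. The two functions are hypothetical and not taken from the arrow code base; they only show that `assert_eq!` is checked in every build, while `debug_assert_eq!` is compiled out of release builds unless debug assertions are enabled.

```rust
/// Hypothetical kernel: the length check runs in debug and release builds.
fn add_checked(a: &[i32], b: &[i32]) -> Vec<i32> {
    assert_eq!(a.len(), b.len(), "inputs must have the same length");
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}

/// Same kernel, but the check is only evaluated when debug assertions
/// are enabled (the default for debug builds, off for `--release`).
fn add_debug_checked(a: &[i32], b: &[i32]) -> Vec<i32> {
    debug_assert_eq!(a.len(), b.len(), "inputs must have the same length");
    // `zip` stops at the shorter input, so this stays memory-safe even
    // when the check is compiled out.
    a.iter().zip(b).map(|(x, y)| x + y).collect()
}
```

Benchmarking both variants in release mode would show whether the assertion alone accounts for the difference.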
Reading the [columnar specification][1] again, I see that the alignment is only a
recommendation and is required only when the data is serialized. I'm not familiar with that
part of the code, but I assume it already has to ensure the required padding.
The only problem I could see would be with shared memory or FFI if the other
side relies on the padding. I think it already can't rely on 64-byte alignment,
because arrays can be arbitrary slices of the underlying buffers. But relying
on padding could happen when accessing data using vector instructions.
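A small sketch of the slicing point, using only the standard allocator API. The helper names and the 100-byte length are assumptions chosen for illustration, not arrow's actual buffer code: the base pointer is 64-byte aligned and the allocation is padded to a multiple of 64 bytes, but a pointer into the middle of the buffer no longer has that alignment.

```rust
use std::alloc::{alloc_zeroed, dealloc, Layout};

const ALIGNMENT: usize = 64;

/// Rounds `len` up to the next multiple of 64 bytes (hypothetical helper).
fn padded_len(len: usize) -> usize {
    (len + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT
}

fn is_aligned(ptr: *const u8) -> bool {
    ptr as usize % ALIGNMENT == 0
}

/// Returns (base pointer aligned?, sliced pointer aligned?).
fn slice_alignment_demo() -> (bool, bool) {
    // 100 requested bytes are padded to 128, allocated on a 64-byte boundary.
    let layout = Layout::from_size_align(padded_len(100), ALIGNMENT).unwrap();
    unsafe {
        let base = alloc_zeroed(layout);
        assert!(!base.is_null(), "allocation failed");
        // A slice starting one i32 (4 bytes) into the buffer.
        let sliced = base.add(4);
        let result = (is_aligned(base), is_aligned(sliced));
        dealloc(base, layout);
        result
    }
}
```

The padding at the end of the allocation is still there for the sliced view, which is why a consumer using vector instructions could come to depend on padding even though it cannot depend on alignment.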
It seems Rust will soon get support for [custom allocators for `Vec`][2].
That way we could get a simplified internal API and still ensure padded
allocations by using a custom allocator.
[1]: https://arrow.apache.org/docs/format/Columnar.html#buffer-alignment-and-padding
[2]: https://github.com/rust-lang/rust/pull/78461