hi Brian -- have you had any luck reproducing this issue?

Thanks
Wes

On Sun, Jul 16, 2017 at 7:06 PM, Wes McKinney <wesmck...@gmail.com> wrote:
> I just created https://issues.apache.org/jira/browse/ARROW-1224
>
> On Sun, Jul 16, 2017 at 7:03 PM, Wes McKinney <wesmck...@gmail.com> wrote:
>> hi Brian,
>>
>> In the record batch IPC formats (stream and file), the buffers are
>> supposed to be padded at minimum to an 8-byte offset, so that all
>> buffers start on an 8-byte aligned offset.
>>
>> We should revisit this aspect of the format documents -- ideally
>> buffers would be 64-byte padded so that code that uses AVX512 can be
>> used more frequently. I think it would be better in the specification
>> to say: 64-byte padding is preferred, but 8-byte alignment (of start
>> offsets) and padding in IPC is the minimum requirement. In the C++
>> library for example, we are rounding up all our allocations to a
>> multiple of 64 bytes.
>>
>> It's possible there's a missing alignment in the Java writer, so if
>> you can find a reproducible case where the IPC payload has a
>> misaligned buffer start offset we should definitely fix that as soon
>> as possible.
>>
>> - Wes
>>
>> On Sun, Jul 16, 2017 at 9:05 AM, bhulette <bhule...@ccri.com> wrote:
>>> Emilio and I ran into some byte alignment issues last week. We're generating
>>> data in the streaming format with the java lib, but the javascript lib is
>>> failing to read it because some of the buffers don't appear to be aligned.
>>>
>>> It's not clear to us which end is implemented incorrectly - the spec
>>> (https://arrow.apache.org/docs/memory_layout.html) says buffers should be
>>> padded to 64 byte boundaries - does that extend to record batches in the IPC
>>> formats?
>>>
>>> The javascript implementation currently uses typed arrays to create views
>>> for each buffer, which need to be aligned. We're looking into using a
>>> DataView or a flatbuffers ByteBuffer to get around this issue for now, but
>>> I'm wondering if this is a bug in the java implementation.
>>>
>>> Brian
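
For anyone trying to reproduce this, the padding rule discussed in the thread can be sketched in a few lines. This is only an illustration, written in Python for brevity, and not code from the Java, JavaScript, or C++ libraries. The helper names pad_to, layout_offsets, and misaligned are hypothetical, and the sketch assumes buffers are laid out back to back with each one padded up to the chosen alignment (8 bytes as the minimum, 64 bytes as the preferred value).

```python
def pad_to(nbytes: int, alignment: int = 8) -> int:
    """Round nbytes up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (nbytes + alignment - 1) & ~(alignment - 1)


def layout_offsets(buffer_sizes, alignment=8):
    """Return (start offsets, total padded length) for back-to-back buffers.

    Hypothetical helper: each buffer is padded to `alignment`, so every
    start offset is a multiple of `alignment`.
    """
    offsets = []
    position = 0
    for size in buffer_sizes:
        offsets.append(position)
        position += pad_to(size, alignment)
    return offsets, position


def misaligned(offsets, alignment=8):
    """Return the start offsets that are not multiples of alignment."""
    return [off for off in offsets if off % alignment]
```

With 8-byte padding, buffers of 5, 16, and 3 bytes start at offsets 0, 8, and 24. Running misaligned over the buffer start offsets read from a payload would flag an offset such as 20, which is the kind of case that would point to a missing alignment step in the writer.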