Hi Rares,

I think there is much more potential for confusion with binary values (vs. fixed-width values) because any array may have a non-zero offset (from being sliced).
If you have

    const int32_t* value_offsets = binary_arr.raw_value_offsets();

then this accounts for any non-zero slice offset (binary_arr.offset()). The problem IMHO with having

    const uint8_t* values = binary_arr.raw_values();

is that you have two choices, neither of them good:

* Return a pointer to where the data for that array starts (including any offset). But then you cannot index into this array with the values from raw_value_offsets().
* Return a pointer to the memory inside the data buffer (not accounting for the offset), but then raw_values() has inconsistent semantics with other raw_values methods.

There is already the value_data() method, which returns the data buffer, so if you want the raw data you can do

    const uint8_t* raw_data = binary_arr.value_data()->data();

https://github.com/apache/arrow/blob/master/cpp/src/arrow/array.h#L451

This is something you can index into with the result of raw_value_offsets(). Or you can use the BinaryArray::GetValue method.

To make sure I'm getting the issue across clearly, consider a binary array with 5 values:

    a
    bb
    ccc
    dddd
    eeeee

This has buffers:

    length: 5
    offset: 0
    buffer[1] (offset): [0, 1, 3, 6, 10, 15]
    buffer[2] (data): abbcccddddeeeee

Now suppose you slice this array, say

    auto sliced = arr->Slice(2);

Now the sliced array has:

    length: 3
    offset: 2
    buffer[1] (offset): [0, 1, 3, 6, 10, 15]
    buffer[2] (data): abbcccddddeeeee

The buffers are shared with the original array; only length and offset have changed.

Because of the offsets and the potential for confusion with zero-copy array slices, I think that if you want to interact with the raw data you should go directly to the buffer (value_data()->data()).

- Wes

On Sun, Sep 17, 2017 at 2:40 PM, Rares Vernica wrote:
> Hi,
>
> I have a question about the Array C++ API. BinaryArray has a
> raw_value_offsets() public member. Should it also have a raw_values() public
> member to give a pointer to the start of raw data? Or is this not feasible?
>
> Thanks,
> Rares
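
For illustration, the indexing recommended in the reply can be sketched with a small stand-alone model. This is not the Arrow API: BinaryArrayModel and all of its members are hypothetical names that only mimic raw_value_offsets(), value_data()->data(), Slice() and GetValue() using std::vector and std::string. It assumes, as described in the reply, that the offsets pointer is already advanced by the slice offset while the data pointer is not.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy model of a variable-width binary array view. Hypothetical type,
// NOT an Arrow class: it only mirrors the layout discussed in the thread.
struct BinaryArrayModel {
  const std::vector<int32_t>* offsets;  // shared offsets buffer
  const std::string* data;              // shared data buffer
  int64_t length;                       // number of values visible in this view
  int64_t offset;                       // slice offset, counted in values

  // Mimics raw_value_offsets(): already advanced by the slice offset.
  const int32_t* raw_value_offsets() const { return offsets->data() + offset; }

  // Mimics value_data()->data(): start of the data buffer, slice offset ignored.
  const char* raw_data() const { return data->data(); }

  // Zero-copy slice: same buffers, only length and offset change.
  BinaryArrayModel Slice(int64_t start) const {
    assert(start >= 0 && start <= length);
    return {offsets, data, length - start, offset + start};
  }

  // Value i of this view: the adjusted offsets index the unadjusted data pointer.
  std::string GetValue(int64_t i) const {
    assert(i >= 0 && i < length);
    const int32_t* pos = raw_value_offsets();
    const auto size = static_cast<std::string::size_type>(pos[i + 1] - pos[i]);
    return std::string(raw_data() + pos[i], size);
  }
};
```

With offsets [0, 1, 3, 6, 10, 15] and data bytes abbcccddddeeeee, Slice(2) gives a view whose raw_value_offsets()[0] is 3, so GetValue(0) on the slice reads "ccc" from the start of the shared data buffer. A data pointer that had also been advanced by the slice would make those same offset values point past the intended bytes.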
