chriscasola commented on code in PR #13381:
URL: https://github.com/apache/arrow/pull/13381#discussion_r897288469
##########
go/arrow/array/string.go:
##########
@@ -113,6 +113,12 @@ func (a *String) setData(data *Data) {
if offsets := data.buffers[1]; offsets != nil {
a.offsets = arrow.Int32Traits.CastFromBytes(offsets.Bytes())
}
+
+ expNumOffsets := a.array.data.offset + a.array.data.length + 1
+ if a.array.data.length > 0 &&
+ (len(a.offsets) < expNumOffsets ||
int(a.offsets[expNumOffsets-2]) > len(a.values)) {
+ panic("arrow/array: string offsets out of bounds of data
buffer")
+ }
Review Comment:
The length check is needed because if length is 0 it appears that lots of
things allow `len(a.offsets)` to also be `0`. The offsets seem to be ignored in
that case.
I think `len(a.offsets) < expNumOffsets` is correct. If there aren't at
least `expNumOffsets` we should panic. If there are extra offsets we can't
panic because that can happen in slice scenarios like `array.NewSliceData`
(unit tests seem to confirm this).
`a.offsets[expNumOffsets-2]` is where the last value in the array starts and
it must be less than `len(a.values)`. I'll change the `>` to a `>=`. The final
byte, indicated by `a.offsets[expNumOffsets-1]` seems to be allowed to overflow
and be gracefully handled, based on [this
test](https://github.com/apache/arrow/blob/master/go/arrow/array/concat_test.go#L293),
which is why I am not checking it.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]