jorisvandenbossche commented on code in PR #37467:
URL: https://github.com/apache/arrow/pull/37467#discussion_r1310391926
##########
cpp/src/arrow/array/validate.cc:
##########
@@ -713,8 +713,10 @@ struct ValidateArrayImpl {
}
// An empty list array can have 0 offsets
- const auto required_offsets = (data.length > 0) ? data.length +
data.offset + 1 : 0;
const auto offsets_byte_size = data.buffers[1]->size();
+ const auto required_offsets = ((data.length > 0) || (offsets_byte_size >
0))
+ ? data.length + data.offset + 1
+ : 0;
if (offsets_byte_size / static_cast<int32_t>(sizeof(offset_type)) <
required_offsets) {
return Status::Invalid("Offsets buffer size (bytes): ",
offsets_byte_size,
Review Comment:
@pitrou do you know if this "empty list array can have 0 offsets" also
applies to other types that have offsets, i.e. string types?
In the current version of the PR, I am relying on ValidateFull to catch the
error of creating a length-0 LargeString array with an offsets buffer of size
4, while it should be 8.
At the moment the offsets buffer size is not checked if the length of the
array is 0, but I changed this to actually check it if the offset buffer has
_some_ length (so assuming that _if_ there is an offsets buffer with a non-zero
legth, it should have a correct length, also if the array itself has length 0)
But if it's difficult to make guarantees about this / to change
ValidateFull, I can also write a dedicated test for this specific case that
doesn't rely on calling ValidateFull.
##########
cpp/src/arrow/array/validate.cc:
##########
@@ -713,8 +713,10 @@ struct ValidateArrayImpl {
}
// An empty list array can have 0 offsets
- const auto required_offsets = (data.length > 0) ? data.length +
data.offset + 1 : 0;
const auto offsets_byte_size = data.buffers[1]->size();
+ const auto required_offsets = ((data.length > 0) || (offsets_byte_size >
0))
+ ? data.length + data.offset + 1
+ : 0;
if (offsets_byte_size / static_cast<int32_t>(sizeof(offset_type)) <
required_offsets) {
return Status::Invalid("Offsets buffer size (bytes): ",
offsets_byte_size,
Review Comment:
@pitrou do you know if this "empty list array can have 0 offsets" also
applies to other types that have offsets, i.e. string types?
In the current version of the PR, I am relying on ValidateFull to catch the
error of creating a length-0 LargeString array with an offsets buffer of size
4, while it should be 8.
At the moment the offsets buffer size is not checked if the length of the
array is 0, but I changed this to actually check it if the offset buffer has
_some_ length (so assuming that _if_ there is an offsets buffer with a non-zero
length, it should have a correct length, also if the array itself has length 0)
But if it's difficult to make guarantees about this / to change
ValidateFull, I can also write a dedicated test for this specific case that
doesn't rely on calling ValidateFull.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]