wgtmac commented on code in PR #48717:
URL: https://github.com/apache/arrow/pull/48717#discussion_r2665550214
##########
cpp/src/parquet/statistics.cc:
##########
@@ -290,7 +296,26 @@ struct BinaryLikeCompareHelperBase {
template <bool is_signed>
struct CompareHelper<ByteArrayType, is_signed>
- : public BinaryLikeCompareHelperBase<ByteArrayType, is_signed> {};
+ : public BinaryLikeCompareHelperBase<ByteArrayType, is_signed> {
+ using Base = BinaryLikeCompareHelperBase<ByteArrayType, is_signed>;
+ using T = ByteArray;
+
+ // Use kNoValueSentinel instead of nullptr to distinguish "no value" from empty string.
+ static T DefaultMin() { return T{0, kNoValueSentinel}; }
+ static T DefaultMax() { return T{0, kNoValueSentinel}; }
+
+ static T Min(int type_length, const T& a, const T& b) {
+ if (a.ptr == kNoValueSentinel) return b;
Review Comment:
Not related to this PR, but since `ptr` is allowed to be nullptr for an empty
string, the following `Copy` may be UB if min is an empty string with a null
`ptr`: passing a null pointer to `std::memcpy` is UB even when the length is 0.
```cpp
template <>
inline void TypedStatisticsImpl<ByteArrayType>::Copy(const ByteArray& src,
                                                     ByteArray* dst,
                                                     ResizableBuffer* buffer) {
  if (dst->ptr == src.ptr) return;
  PARQUET_THROW_NOT_OK(buffer->Resize(src.len, false));
  std::memcpy(buffer->mutable_data(), src.ptr, src.len);
  *dst = ByteArray(src.len, buffer->data());
}
```
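One possible guard, as a minimal standalone sketch rather than a proposed patch: skip the `memcpy` when the length is zero so a null `src.ptr` never reaches it. `ByteArraySketch`, `BufferSketch`, and `CopyGuarded` are hypothetical stand-ins introduced here for illustration (a `std::vector` replaces `ResizableBuffer`); they are not Arrow or Parquet APIs.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for parquet::ByteArray: a length plus a non-owning pointer.
struct ByteArraySketch {
  uint32_t len;
  const uint8_t* ptr;
};

// Hypothetical stand-in for the ResizableBuffer used by the statistics code.
using BufferSketch = std::vector<uint8_t>;

// Copies src into buffer and points dst at the copy. The memcpy is skipped
// for zero-length input, so a null src.ptr is never passed to it.
inline void CopyGuarded(const ByteArraySketch& src, ByteArraySketch* dst,
                        BufferSketch* buffer) {
  if (dst->ptr == src.ptr) return;
  buffer->resize(src.len);
  if (src.len > 0) {
    std::memcpy(buffer->data(), src.ptr, src.len);
  }
  *dst = ByteArraySketch{src.len, buffer->data()};
}
```

With this shape, an empty value with a null pointer still updates `dst` to a zero-length array backed by the buffer, and only non-empty values touch `memcpy`.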