Copilot commented on code in PR #2622:
URL: https://github.com/apache/orc/pull/2622#discussion_r3300800391
##########
c++/src/BlockBuffer.cc:
##########
@@ -51,10 +53,19 @@ namespace orc {
if (currentSize_ < currentCapacity_) {
Block emptyBlock(blocks_[currentSize_ / blockSize_] + currentSize_ %
blockSize_,
blockSize_ - currentSize_ % blockSize_);
- currentSize_ = (currentSize_ / blockSize_ + 1) * blockSize_;
+ uint64_t nextBlockNumber = currentSize_ / blockSize_ + 1;
+ uint64_t nextSize = 0;
+ if (multiplyWithOverflow(nextBlockNumber, blockSize_, &nextSize)) {
Review Comment:
In `getNextBlock()`, `uint64_t nextBlockNumber = currentSize_ / blockSize_ +
1;` can overflow (wrap to 0) before the `multiplyWithOverflow` check runs
(e.g., `currentSize_ == UINT64_MAX` and `blockSize_ == 1`). That would bypass
the overflow guard and can reset `currentSize_` to an unexpected value. Compute
`nextBlockNumber` (or `nextSize`) using checked addition (e.g.,
`addWithOverflow` for `+ 1`) so overflow is detected before the multiplication.
##########
c++/src/ColumnReader.cc:
##########
@@ -694,17 +734,38 @@ namespace orc {
size_t StringDirectColumnReader::computeSize(const int64_t* lengths, const
char* notNull,
uint64_t numValues) {
size_t totalLength = 0;
+ bool hasNegativeLength = false;
+ bool hasLengthOverflow = false;
+ auto addLength = [&](int64_t value) {
+ if (value < 0) {
+ hasNegativeLength = true;
+ return;
+ }
+ size_t length = static_cast<size_t>(value);
+ size_t nextTotalLength = 0;
+ bool overflow = addWithOverflow(totalLength, length, &nextTotalLength);
+ hasLengthOverflow |= overflow;
Review Comment:
`StringDirectColumnReader::computeSize()` casts `int64_t value` to `size_t`
without first validating that `value <= std::numeric_limits<size_t>::max()`. On
platforms where `size_t` is narrower than `int64_t` (e.g., 32-bit builds),
large positive lengths will truncate during the cast and can evade the
`addWithOverflow(totalLength, length, ...)` detection, potentially leading to
under-allocation and unsafe reads/copies. Add an explicit range check before
the cast (treat out-of-range as overflow) so malformed length streams are
reliably rejected.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]