dmitry-chirkov-dremio commented on code in PR #50187:
URL: https://github.com/apache/arrow/pull/50187#discussion_r3455496910


##########
cpp/src/gandiva/precompiled/string_ops.cc:
##########
@@ -1895,8 +1864,7 @@ const char* replace_with_max_len_utf8_utf8_utf8(gdv_int64 
context, const char* t
   for (; text_index <= text_len - from_str_len;) {
     if (memcmp(text + text_index, from_str, from_str_len) == 0) {
       if (out_index + text_index - last_match_index + to_str_len > max_length) 
{

Review Comment:
   Nice catch reordering the subtraction first!
   
   I think we still want to widen this to gdv_int64, though. All three operands 
here are still gdv_int32, so when max_length is close to INT_MAX, the sum 
itself can still overflow before the comparison runs. For example, out_index = 
INT_MAX - 10, span = 8, and to_str_len = 8 should cleanly trip the guard, but 
the intermediate addition is already out of range for signed 32-bit. 
   
   Could we compute a gdv_int64 prospective_len here (and in the final tail 
check below) and compare against static_cast<gdv_int64>(max_length)? That would 
make the new near-2GB path fully safe.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to