Sean-Kenneth-Doherty opened a new pull request, #22293:
URL: https://github.com/apache/datafusion/pull/22293

   ## Which issue does this PR close?
   
   - Closes #22217.
   
   ## Rationale for this change
   
   The array execution path for `repeat(string, count)` calculated 
`string.len() * count` before checking the configured string-size limit. For 
very large counts, that multiplication can overflow and panic instead of 
returning the same string-size overflow error used by the scalar path.
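
To illustrate the failure mode, here is a minimal sketch of a length calculation that cannot overflow before the limit is checked. It is not the code from this PR: `MAX_LEN` and `checked_repeated_len` are hypothetical names, and the limit value is arbitrary (the real limit comes from DataFusion itself).

```rust
/// Illustrative limit only; an assumption, not DataFusion's actual value.
const MAX_LEN: usize = 1 << 20;

/// Computes `len * count` with `checked_mul`, so an overflowing product
/// becomes an error instead of a panic or a silently wrapped value.
/// Hypothetical helper, not the function added by the PR.
fn checked_repeated_len(len: usize, count: usize) -> Result<usize, String> {
    match len.checked_mul(count) {
        // The product fits in `usize` and is within the limit.
        Some(total) if total <= MAX_LEN => Ok(total),
        // Either the multiplication overflowed or the result is too large.
        _ => Err(format!(
            "string size overflow: {len} * {count} exceeds {MAX_LEN}"
        )),
    }
}
```

Because the overflow case and the over-limit case fall into the same arm, both surface as the same kind of error, which matches the behavior the scalar path is described as having.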
   
   ## What changes are included in this PR?
   
   - Adds checked count conversion and repeated-length calculation helpers.
   - Uses checked multiplication and checked total-capacity accumulation in the 
array path.
   - Adds Rust and sqllogictest coverage for the one-row columnar reproducer 
from the issue.
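
The changes listed above could look roughly like the following sketch. The names `count_to_usize` and `total_capacity`, the tuple-slice input, and the treatment of non-positive counts as zero repetitions are all assumptions made for illustration; the PR's actual helpers operate on Arrow arrays and may differ.

```rust
/// Converts a signed SQL-style count to `usize`.
/// Assumption: non-positive counts mean "repeat zero times".
fn count_to_usize(count: i64) -> Option<usize> {
    if count <= 0 {
        Some(0)
    } else {
        // Fails only on targets where `usize` is narrower than the count.
        usize::try_from(count).ok()
    }
}

/// Sums the repeated length of every row, returning `None` as soon as a
/// row exceeds `max_len` or any intermediate arithmetic overflows.
/// Hypothetical stand-in for the array path's capacity calculation.
fn total_capacity(rows: &[(&str, i64)], max_len: usize) -> Option<usize> {
    let mut total: usize = 0;
    for (s, count) in rows {
        let n = count_to_usize(*count)?;
        // Checked multiplication, then the per-row size limit.
        let row_len = s.len().checked_mul(n).filter(|&l| l <= max_len)?;
        // Checked accumulation of the total buffer capacity.
        total = total.checked_add(row_len)?;
    }
    Some(total)
}
```

In a real implementation the `None` result would be mapped to the string-size overflow error rather than returned directly.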
   
   ## Are these changes tested?
   
   - `cargo fmt --all`
   - `TMPDIR=/home/sean/Projects/datafusion-repeat-overflow/target/tmp cargo test -p datafusion-functions string::repeat::tests::test_repeat_string_array_overflow -- --nocapture`
   - `TMPDIR=/home/sean/Projects/datafusion-repeat-overflow/target/tmp cargo test --profile=ci --test sqllogictests -- string/string_literal.slt`
   - `TMPDIR=/home/sean/Projects/datafusion-repeat-overflow/target/tmp cargo clippy --all-targets --all-features -- -D warnings`
   - `git diff --check`
   
   ## Are there any user-facing changes?
   
   When a `repeat` result in the columnar path would exceed the string-size 
limit, the query now returns a normal DataFusion string-size overflow error 
instead of panicking.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
