sweb opened a new pull request, #22323:
URL: https://github.com/apache/datafusion/pull/22323

   ## Which issue does this PR close?
   
   - Closes #22188.
   
   ## Rationale for this change
   
   Captured in the issue description.
   
   ## What changes are included in this PR?
   
   Bounds the element count at `isize::MAX / size_of::<i64>()` in 
`generate_range_values`. When the limit is exceeded, the function now returns 
a DataFusion execution error instead of panicking inside `Vec::reserve`.
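   
   A minimal, standalone sketch of the kind of bound check described above. It is not the code from this PR: the helper names `range_len` and `checked_range_len`, the half-open range semantics, the `String` error type, and the exact message wording are assumptions made for illustration. Only the cap `isize::MAX / size_of::<i64>()` comes from the description.

```rust
use std::mem::size_of;

/// Cap taken from the PR description: the largest number of `i64` values
/// whose backing buffer still fits in `isize::MAX` bytes.
const MAX_RANGE_ELEMENTS: usize = isize::MAX as usize / size_of::<i64>();

/// Hypothetical helper: number of values in the half-open range
/// `[start, stop)` walked with `step`. Arithmetic is widened to `i128`
/// so that `stop - start` cannot overflow.
fn range_len(start: i64, stop: i64, step: i64) -> Result<u128, String> {
    if step == 0 {
        return Err("step must be non-zero".to_string());
    }
    let span = if step > 0 {
        stop as i128 - start as i128
    } else {
        start as i128 - stop as i128
    };
    if span <= 0 {
        return Ok(0);
    }
    let step_abs = (step as i128).abs();
    // Ceiling division: a partial final step still yields one element.
    Ok(((span + step_abs - 1) / step_abs) as u128)
}

/// Hypothetical helper: reject oversized ranges with an error before any
/// allocation is attempted, instead of letting `Vec::reserve` panic.
fn checked_range_len(start: i64, stop: i64, step: i64) -> Result<usize, String> {
    let count = range_len(start, stop, step)?;
    if count > MAX_RANGE_ELEMENTS as u128 {
        return Err(format!(
            "Range too large to materialize: would produce {} elements (max {})",
            count, MAX_RANGE_ELEMENTS
        ));
    }
    Ok(count as usize)
}
```

   With this shape, a caller would only reach `Vec::with_capacity(count)` after `checked_range_len` has returned `Ok`, so a request such as `checked_range_len(0, i64::MAX, 1)` surfaces as an ordinary error value.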
   
   Only the scalar UDF path in `datafusion/functions-nested/src/range.rs` 
is affected; the table-valued `generate_series` already streams and is unchanged.
   
   This PR just takes care of the panic and does not attempt a larger refactor 
towards streaming or introducing a max-elements config value. Happy to follow 
up on either.
   
   ## Are these changes tested?
   
   Added sqllogictest cases to 
`datafusion/sqllogictest/test_files/array/array_range.slt` to cover the cases 
presented in the issue.
   
   ## Are there any user-facing changes?
   
   Queries that previously crashed the process with a capacity overflow now 
return `Execution error: Range too large to materialize: would produce {count} 
elements ({MAX_RANGE_ELEMENTS})`.
   
   Behavior for valid ranges is unchanged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

