mkleen opened a new pull request, #23277:
URL: https://github.com/apache/datafusion/pull/23277

   ## Which issue does this PR close?
   
   - Closes https://github.com/apache/datafusion/issues/23219.
   
   ## Rationale for this change
   
   The original query from the issue: 
   
   ```sql
   SELECT   (((Cast(id AS BIGINT) % 1024) + 1024) % 1024) AS computed_bucket
   FROM     profile
   ORDER BY computed_bucket,
            Cast(id AS BIGINT) limit 10;
   ```
   
   caused:
   
   ```
   thread 'main' panicked at .../datafusion-datasource-54.0.0/src/statistics.rs:100:48:
   index out of bounds: the len is 0 but the index is 0
   ```
   
   The underlying issue is that the current code panics when files are split by statistics and no statistics are available for the column that defines the sort order.
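
The failure mode described above can be illustrated with a minimal sketch. It is not the code from this PR: `ColumnStats` and `min_max_for_column` are hypothetical names, and the real `MinMaxStatistics` works on DataFusion's own statistics types. The sketch only shows the general idea of checking that an entry exists before reading it, so that an empty statistics list yields `None` instead of an out-of-bounds panic.

```rust
// Hypothetical stand-in for per-column statistics; not DataFusion's actual type.
#[derive(Debug, Clone, PartialEq)]
struct ColumnStats {
    min: Option<i64>,
    max: Option<i64>,
}

/// Returns the (min, max) pair for `column_index`, or `None` when no
/// statistics entry exists for that column or a bound is unknown.
fn min_max_for_column(stats: &[ColumnStats], column_index: usize) -> Option<(i64, i64)> {
    // `get` returns `None` for an out-of-range index, whereas
    // `stats[column_index]` would panic with "index out of bounds".
    let col = stats.get(column_index)?;
    // A column may be present but still lack a known min or max.
    Some((col.min?, col.max?))
}
```

With this shape, a caller that wants to split files by statistics can treat `None` as "cannot order by this column" and fall back to a path that does not rely on min/max values.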
   
   ## What changes are included in this PR?
   
   - Fix in `MinMaxStatistics` that checks whether statistics are available for a given column
   - Test
   
   ## Are these changes tested?
   
   Yes
   
   ## Are there any user-facing changes?
   
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

