kosiew opened a new pull request, #16466:
URL: https://github.com/apache/datafusion/pull/16466

   ## Which issue does this PR close?
   
   - Closes #16266
   
   ## Rationale for this change
   
   This change addresses a bug where `combine_hashes` was applied even if a 
dictionary value was null, leading to incorrect hash computations. 
   This was discovered while investigating #16266.
   Additionally, this PR extends the test coverage for aggregate functions to 
better validate behavior with dictionary arrays containing nulls.
   
   ## What changes are included in this PR?
   
   - Fixes logic in `hash_dictionary` to ensure `combine_hashes` is only 
applied when the dictionary value is valid.
   - Corrects grammar in error messages for dataset generation expectations.
   - Enables null value generation in fuzz tests for dictionary arrays.
   - Adds comprehensive tests for aggregate functions (`COUNT`, `SUM`, `MIN`, 
`MAX`, `MEDIAN`, `FIRST_VALUE`, `LAST_VALUE`) using dictionary arrays with null 
keys and values.
   - Ensures consistent behavior across single and multi-partition execution.
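
The `hash_dictionary` fix described above can be sketched roughly as follows. This is a simplified, dependency-free model and not the actual DataFusion code: the dictionary is represented as plain slices of `Option`s, `hash_dictionary_rows` is a hypothetical name, and the `combine_hashes` body here is only an illustrative stand-in for the real helper.

```rust
/// Illustrative stand-in for a hash-combining helper; the constants are an
/// assumption and are not claimed to match the real implementation.
fn combine_hashes(l: u64, r: u64) -> u64 {
    let hash = (17u64 * 37).wrapping_add(l);
    hash.wrapping_mul(37).wrapping_add(r)
}

/// Toy model of hashing a dictionary column (hypothetical helper).
/// `keys[i]` is the optional dictionary index for row `i`, and
/// `value_hashes[k]` is the precomputed hash of dictionary entry `k`,
/// or `None` when that entry is null.
fn hash_dictionary_rows(
    keys: &[Option<usize>],
    value_hashes: &[Option<u64>],
    hashes: &mut [u64],
) {
    for (row_hash, &key) in hashes.iter_mut().zip(keys) {
        // A row is logically null if its key is null OR the key points at a
        // null dictionary value.
        let value_hash = key.and_then(|k| value_hashes[k]);
        if let Some(h) = value_hash {
            // Only combine when the dictionary value is valid.
            *row_hash = combine_hashes(h, *row_hash);
        }
        // Otherwise the running hash is left untouched, so both kinds of
        // null rows hash identically.
    }
}
```

In this model, a row with a null key and a row whose key points at a null value both keep their incoming hash, so they land in the same group.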
   
   ## Are these changes tested?
   
   Yes, extensive new tests are added covering:
   - Aggregates on dictionary columns with null keys/values.
   - Window functions with null handling (IGNORE/RESPECT NULLS).
   - Partitioned vs. unpartitioned execution consistency.
   
   ## Are there any user-facing changes?
   
   No direct API changes, but query behavior involving dictionary arrays with 
nulls will now produce correct and consistent results in line with SQL 
semantics.
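
As a rough illustration of the SQL semantics referred to above, the sketch below resolves a toy dictionary column to its logical values and then aggregates them. All names (`logical_values`, `count_non_null`, `sum_non_null`) are hypothetical and the model uses plain slices instead of Arrow arrays; it only shows the expected behavior, not how DataFusion implements it.

```rust
/// Resolve each row of a toy dictionary column to its logical value.
/// A null key and a key that points at a null entry both yield `None`.
fn logical_values(keys: &[Option<usize>], values: &[Option<i64>]) -> Vec<Option<i64>> {
    keys.iter().map(|&key| key.and_then(|k| values[k])).collect()
}

/// COUNT(col): counts only non-null rows.
fn count_non_null(rows: &[Option<i64>]) -> usize {
    rows.iter().flatten().count()
}

/// SUM(col): ignores nulls and returns `None` (SQL NULL) when every row is null.
fn sum_non_null(rows: &[Option<i64>]) -> Option<i64> {
    let mut non_null = rows.iter().flatten();
    let first = *non_null.next()?;
    Some(non_null.fold(first, |acc, v| acc + v))
}
```

Under these assumptions, aggregating the whole column and aggregating it in chunks (then merging the partial counts or sums) should agree, which is the single versus multi-partition consistency the new tests are meant to check.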
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

