abhishek593 opened a new pull request, #49304: URL: https://github.com/apache/arrow/pull/49304
### Rationale for this change

The rank kernel incorrectly treated NaNs and nulls as ties. This fix ensures they are treated as distinct values, in line with Arrow's sorting conventions.

### What changes are included in this PR?

Updated the internal `MarkDuplicates` helper in `vector_rank.cc` to distinguish between NaNs and nulls.

### Are these changes tested?

Yes. Added a regression test, `TestRank.NaNsAndNulls`, in `vector_sort_test.cc` and verified that all compute tests pass.

### Are there any user-facing changes?

Yes. The output of the rank function now differentiates between NaNs and nulls instead of ranking them as ties. This fixes incorrect ranking results for datasets that contain both NaNs and nulls.
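To illustrate the behavior described above, here is a minimal, self-contained sketch of a "min" tiebreaker rank in which NaNs tie only with NaNs and nulls tie only with nulls. It is not the code from `vector_rank.cc` and does not use the Arrow API: `SlotKind`, `Classify`, `SameRankGroup`, `Less`, and `RankMin` are hypothetical names, nulls are modeled as `std::nullopt`, and the ordering (ordinary values, then NaNs, then nulls) is assumed to match Arrow's behavior when nulls are placed at the end.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <optional>
#include <vector>

// Hypothetical classification (not an Arrow type): ordinary values sort
// first, then NaNs, then nulls. This ordering is an assumption.
enum class SlotKind : int { kValue = 0, kNaN = 1, kNull = 2 };

SlotKind Classify(const std::optional<double>& v) {
  if (!v.has_value()) return SlotKind::kNull;
  return std::isnan(*v) ? SlotKind::kNaN : SlotKind::kValue;
}

// Two slots share a rank only if they are the same kind of slot:
// NaN ties with NaN, null ties with null, and values tie when equal.
bool SameRankGroup(const std::optional<double>& a,
                   const std::optional<double>& b) {
  const SlotKind ka = Classify(a);
  const SlotKind kb = Classify(b);
  if (ka != kb) return false;  // in particular, NaN never ties with null
  if (ka != SlotKind::kValue) return true;
  return *a == *b;
}

// Strict weak ordering consistent with SameRankGroup.
bool Less(const std::optional<double>& a, const std::optional<double>& b) {
  const SlotKind ka = Classify(a);
  const SlotKind kb = Classify(b);
  if (ka != kb) return static_cast<int>(ka) < static_cast<int>(kb);
  if (ka != SlotKind::kValue) return false;
  return *a < *b;
}

// 1-based rank where tied elements all receive the smallest rank of
// their group ("min" tiebreaker).
std::vector<std::uint64_t> RankMin(
    const std::vector<std::optional<double>>& values) {
  std::vector<std::size_t> order(values.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  std::stable_sort(order.begin(), order.end(),
                   [&](std::size_t l, std::size_t r) {
                     return Less(values[l], values[r]);
                   });

  std::vector<std::uint64_t> ranks(values.size());
  std::uint64_t current = 0;
  for (std::size_t i = 0; i < order.size(); ++i) {
    // Start a new rank group whenever the previous sorted slot differs.
    if (i == 0 || !SameRankGroup(values[order[i - 1]], values[order[i]])) {
      current = static_cast<std::uint64_t>(i) + 1;
    }
    ranks[order[i]] = current;
  }
  return ranks;
}
```

Under these assumptions, the input `[1.0, NaN, null, 1.0, NaN, null]` ranks as `[1, 3, 5, 1, 3, 5]`. If NaNs and nulls were treated as one tie group, as in the bug this PR describes, the same input would instead yield `[1, 3, 3, 1, 3, 3]`.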
