tustvold commented on code in PR #4747:
URL: https://github.com/apache/arrow-rs/pull/4747#discussion_r1309078112
##########
arrow-ord/src/sort.rs:
##########
@@ -400,14 +401,7 @@ fn child_rank(values: &dyn Array, options: SortOptions) -> Result<Vec<u32>, Arro
descending: false,
nulls_first: options.nulls_first != options.descending,
});
-
- let sorted_value_indices = sort_to_indices(values, value_options, None)?;
- let sorted_indices = sorted_value_indices.values();
- let mut out: Vec<_> = vec![0_u32; sorted_indices.len()];
- for (ix, val) in sorted_indices.iter().enumerate() {
- out[*val as usize] = ix as u32;
- }
- Ok(out)
+ rank(values, value_options)
Review Comment:
This is the fix for #4746.

The issue is caused by the previous logic assigning distinct ranks to equal
values. This was fine when used for dictionaries, as it just made the sort less
stable, but the changes in #4613 reused the same logic for the
lexicographic-style comparison of lists, where giving equal values distinct
ranks produces an incorrect ordering.
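
To illustrate why distinct ranks for equal values break a lexicographic comparison, here is a minimal, self-contained sketch. It does not use the arrow-ord code: `distinct_ranks` and `tied_ranks` are hypothetical helpers over a plain `&[i32]`, and the dense ranking in `tied_ranks` is an assumption chosen for simplicity (any scheme that gives equal values the same rank would serve the same purpose).

```rust
/// Hypothetical stand-in for the removed logic: invert the sorted
/// permutation, so equal values receive distinct ranks.
fn distinct_ranks(values: &[i32]) -> Vec<u32> {
    let mut indices: Vec<usize> = (0..values.len()).collect();
    indices.sort_by_key(|&i| values[i]); // stable sort by value
    let mut out = vec![0_u32; values.len()];
    for (rank, &ix) in indices.iter().enumerate() {
        out[ix] = rank as u32;
    }
    out
}

/// Hypothetical tie-aware ranking: equal values share a rank.
fn tied_ranks(values: &[i32]) -> Vec<u32> {
    let mut indices: Vec<usize> = (0..values.len()).collect();
    indices.sort_by_key(|&i| values[i]);
    let mut out = vec![0_u32; values.len()];
    let mut rank = 0_u32;
    for (pos, &ix) in indices.iter().enumerate() {
        // Only advance the rank when the value actually changes.
        if pos > 0 && values[ix] != values[indices[pos - 1]] {
            rank += 1;
        }
        out[ix] = rank;
    }
    out
}

fn main() {
    // Flattened child values of two lists: a = [5, 2], b = [5, 1].
    // The correct lexicographic order is b < a.
    let child = [5, 2, 5, 1];

    let distinct = distinct_ranks(&child);
    let tied = tied_ranks(&child);

    // The two 5s get ranks 2 and 3, so the first element alone decides
    // the comparison and a is wrongly ordered before b.
    assert!(distinct[0..2] < distinct[2..4]);

    // With shared ranks the first elements tie and the second decides.
    assert!(tied[2..4] < tied[0..2]);

    println!("distinct: {:?}, tied: {:?}", distinct, tied);
}
```

For a dictionary, the distinct ranks only change the relative order of equal keys, which is why the old behaviour merely made the sort less stable there.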