akurmustafa commented on code in PR #12992:
URL: https://github.com/apache/datafusion/pull/12992#discussion_r1806736635
##########
datafusion/physical-expr/src/equivalence/ordering.rs:
##########
@@ -1065,4 +1066,63 @@ mod tests {
         Ok(())
     }
+
+    #[test]
+    fn test_ordering_satisfy_on_data() -> Result<()> {
+        let schema = create_test_schema()?;
+        let col_a = &col("a", &schema)?;
+        let col_b = &col("b", &schema)?;
+        let col_c = &col("c", &schema)?;
+        let col_d = &col("d", &schema)?;
+
+        let option_asc = SortOptions {
+            descending: false,
+            nulls_first: false,
+        };
+
+        let orderings = vec![
+            // [a ASC, b ASC, c ASC, d ASC]
+            vec![
+                (col_a, option_asc),
+                (col_b, option_asc),
+                (col_c, option_asc),
+                (col_d, option_asc),
+            ],
+            // [a ASC, c ASC, b ASC, d ASC]
+            vec![
+                (col_a, option_asc),
+                (col_c, option_asc),
+                (col_b, option_asc),
+                (col_d, option_asc),
+            ],
+        ];
+        let orderings = convert_to_orderings(&orderings);
+
+        let batch = generate_table_for_orderings(orderings, schema, 1000, 10)?;
+
+        // [a ASC, c ASC, d ASC] cannot be deduced
+        let ordering = vec![
+            (col_a, option_asc),
+            (col_c, option_asc),
+            (col_d, option_asc),
+        ];
+        let ordering = convert_to_orderings(&[ordering])[0].clone();
+        assert!(!is_table_same_after_sort(ordering, batch.clone())?);
+
+        // [a ASC, b ASC, d ASC] cannot be deduced
+        let ordering = vec![
+            (col_a, option_asc),
+            (col_b, option_asc),
+            (col_d, option_asc),
+        ];
+        let ordering = convert_to_orderings(&[ordering])[0].clone();
+        assert!(!is_table_same_after_sort(ordering, batch.clone())?);
Review Comment:
This depends on the parameters used to generate the test batch. For
sufficiently large tables with enough cardinality, it is really hard to hit
this case, so statistically the probability is very low. However, we can
still encounter it in other use cases. Hence, if the expected result is
counterintuitive, we should test the hypothesis over multiple runs with
various parameters.
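
One possible reading of "multiple runs with various parameters" is a sweep over table sizes, cardinalities, and seeds that counts how often randomly generated data happens to satisfy an ordering. The sketch below is a simplified, dependency-free illustration, not the DataFusion helpers: `Lcg`, `generate_sorted_table`, `is_sorted_by_columns`, and `count_accidental_hits` are hypothetical names, and the generated table is only sorted by [a, b, c, d] rather than satisfying both orderings used in the test above.

```rust
/// Tiny deterministic linear congruential generator (hypothetical helper),
/// used so the sketch needs no external crates.
struct Lcg(u64);

impl Lcg {
    /// Returns a pseudo-random value in `0..bound`. `bound` must be non-zero.
    fn next_below(&mut self, bound: u64) -> u64 {
        // Commonly used 64-bit LCG constants; only the high bits are kept.
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 33) % bound
    }
}

/// One row with columns a, b, c, d (indices 0..=3).
type Row = [u64; 4];

/// Simplified stand-in for a table generator: random rows sorted by
/// [a, b, c, d]. This is an assumption, not `generate_table_for_orderings`.
fn generate_sorted_table(n_rows: usize, cardinality: u64, seed: u64) -> Vec<Row> {
    let mut rng = Lcg(seed);
    let mut rows: Vec<Row> = Vec::with_capacity(n_rows);
    for _ in 0..n_rows {
        let mut row: Row = [0; 4];
        for value in row.iter_mut() {
            *value = rng.next_below(cardinality);
        }
        rows.push(row);
    }
    rows.sort(); // arrays compare lexicographically, i.e. by a, then b, c, d
    rows
}

/// Returns true if `rows` is already non-decreasing under the projected key
/// `cols`, which is the condition under which a sort would leave it unchanged.
fn is_sorted_by_columns(rows: &[Row], cols: &[usize]) -> bool {
    rows.windows(2).all(|w| {
        let left: Vec<u64> = cols.iter().map(|&c| w[0][c]).collect();
        let right: Vec<u64> = cols.iter().map(|&c| w[1][c]).collect();
        left <= right
    })
}

/// Counts how many of the seeds `0..n_seeds` yield a table that satisfies `cols`.
fn count_accidental_hits(n_rows: usize, cardinality: u64, n_seeds: u64, cols: &[usize]) -> u64 {
    (0..n_seeds)
        .filter(|&seed| {
            is_sorted_by_columns(&generate_sorted_table(n_rows, cardinality, seed), cols)
        })
        .count() as u64
}

#[test]
fn sweep_generation_parameters() {
    let a_c_d: [usize; 3] = [0, 2, 3];

    // Degenerate parameters: with a single distinct value every ordering holds.
    assert_eq!(count_accidental_hits(100, 1, 20, &a_c_d), 20);

    // Report how the hit rate changes with table size and cardinality.
    let params: [(usize, u64); 3] = [(10, 2), (100, 5), (1000, 10)];
    for &(n_rows, cardinality) in params.iter() {
        let hits = count_accidental_hits(n_rows, cardinality, 20, &a_c_d);
        println!("rows={n_rows} cardinality={cardinality}: {hits}/20 seeds satisfy [a, c, d]");
    }
}
```

Running the sweep with `cargo test -- --nocapture` prints the hit counts per parameter pair, which makes it visible whether a negative assertion such as `!is_table_same_after_sort(...)` is robust or only holds for one particular size and cardinality.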
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]