findepi commented on code in PR #16860:
URL: https://github.com/apache/datafusion/pull/16860#discussion_r2225377710

##########
datafusion/sqllogictest/test_files/joins.slt:
##########
@@ -4164,23 +4164,40 @@ AS VALUES
 (3, 3, true),
 (3, 3, false);

-query IIIIB
-SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 2;
+query IIIIB rowsort
+SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 20;
 ----
-2 2 2 2 true
+1 1 NULL NULL NULL
 2 2 2 2 false
+2 2 2 2 true
+3 3 3 3 false
+3 3 3 3 true
+4 4 NULL NULL NULL

-query IIIIB
-SELECT * FROM t0 FULL JOIN t1 ON t0.c2 >= t1.c2 LIMIT 2;
+query IIIIB rowsort
+SELECT * FROM t0 FULL JOIN t1 ON t0.c2 >= t1.c2 LIMIT 20;
 ----
+1 1 NULL NULL NULL
+2 2 2 2 false
 2 2 2 2 true
+3 3 2 2 false
 3 3 2 2 true
+3 3 3 3 false
+3 3 3 3 true
+4 4 2 2 false
+4 4 2 2 true
+4 4 3 3 false
+4 4 3 3 true

-query IIIIB
-SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 AND t0.c2 >= t1.c2 LIMIT 2;
+query IIIIB rowsort
+SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 AND t0.c2 >= t1.c2 LIMIT 20;

Review Comment:
   ```suggestion
   -- Note: using LIMIT value higher than cardinality before LIMIT to avoid query non-determinism
   SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 AND t0.c2 >= t1.c2 LIMIT 20;
   ```

##########
datafusion/sqllogictest/test_files/joins.slt:
##########
@@ -4164,23 +4164,40 @@ AS VALUES
 (3, 3, true),
 (3, 3, false);

-query IIIIB
-SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 2;
+query IIIIB rowsort
+SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 20;

Review Comment:
   ```suggestion
   -- Note: using LIMIT value higher than cardinality before LIMIT to avoid query non-determinism
   SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 20;
   ```

##########
datafusion/sqllogictest/test_files/joins.slt:
##########
@@ -4164,23 +4164,40 @@ AS VALUES
 (3, 3, true),
 (3, 3, false);

-query IIIIB
-SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 2;
+query IIIIB rowsort
+SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 20;

Review Comment:
   > using `order by` might change the test goal.

   Good point.

   > set the limit to a number larger than the result set.

   Doesn't it also change the test goal, though?
   For example, if `LIMIT n` let `n+3` rows through, this test would still pass.

   I don't think SLT with static expected values is a way to test `LIMIT n` at all;
   we need to accept this fact.

   I am OK with the change proposed here, i.e. LIMIT being higher than the actual row count.

   nit: let's maybe add a code comment saying that LIMIT should be higher than the row count. That way, when the input data changes and this test becomes flaky again, we still know how to fix it.

##########
datafusion/sqllogictest/test_files/joins.slt:
##########
@@ -4164,23 +4164,40 @@ AS VALUES
 (3, 3, true),
 (3, 3, false);

-query IIIIB
-SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 2;
+query IIIIB rowsort
+SELECT * FROM t0 FULL JOIN t1 ON t0.c1 = t1.c1 LIMIT 20;
 ----
-2 2 2 2 true
+1 1 NULL NULL NULL
 2 2 2 2 false
+2 2 2 2 true
+3 3 3 3 false
+3 3 3 3 true
+4 4 NULL NULL NULL

-query IIIIB
-SELECT * FROM t0 FULL JOIN t1 ON t0.c2 >= t1.c2 LIMIT 2;
+query IIIIB rowsort
+SELECT * FROM t0 FULL JOIN t1 ON t0.c2 >= t1.c2 LIMIT 20;

Review Comment:
   ```suggestion
   -- Note: using LIMIT value higher than cardinality before LIMIT to avoid query non-determinism
   SELECT * FROM t0 FULL JOIN t1 ON t0.c2 >= t1.c2 LIMIT 20;
   ```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: github-h...@datafusion.apache.org
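
The trade-off raised in the review comment above can be sketched with a small Python simulation. This is only an illustration, not DataFusion or sqllogictest code: `FULL_JOIN_ROWS` copies the expected rows of the first query in the diff, and `limit`, `buggy_limit`, `rowsort`, and `distinct_results` are hypothetical helpers introduced here. It assumes the engine may emit join rows in any order and that `rowsort` roughly amounts to sorting rows by their text rendering.

```python
from itertools import permutations

# Rows copied from the expected output of the first query in the diff.
# Assumption: the engine may produce them in any order.
FULL_JOIN_ROWS = [
    (1, 1, None, None, None),
    (2, 2, 2, 2, False),
    (2, 2, 2, 2, True),
    (3, 3, 3, 3, False),
    (3, 3, 3, 3, True),
    (4, 4, None, None, None),
]


def limit(rows, n):
    """Correct LIMIT: keep at most n rows in arrival order."""
    return list(rows)[:n]


def buggy_limit(rows, n):
    """Hypothetical broken LIMIT that lets n + 3 rows through."""
    return list(rows)[: n + 3]


def rowsort(rows):
    """Rough stand-in for the SLT `rowsort` option: sort rows by their text form."""
    return sorted(rows, key=lambda row: tuple(str(value) for value in row))


def distinct_results(limit_fn, n):
    """Collect every distinct sorted result over all possible arrival orders."""
    return {
        tuple(rowsort(limit_fn(order, n)))
        for order in permutations(FULL_JOIN_ROWS)
    }


# LIMIT 2 depends on arrival order, so static expected values are flaky.
print(len(distinct_results(limit, 2)))   # 15
# LIMIT 20 exceeds the 6-row result, so the sorted output is always the same.
print(len(distinct_results(limit, 20)))  # 1
# The same test cannot tell a correct LIMIT from one that passes n + 3 rows.
print(distinct_results(buggy_limit, 20) == distinct_results(limit, 20))  # True
```

With `LIMIT 2` there are 15 possible outcomes (every 2-row subset of the 6 rows), which is why the original test was flaky. With `LIMIT 20` the outcome is unique, but the last line shows the cost noted in the review: the test no longer exercises `LIMIT` itself.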