danepitkin commented on code in PR #36768:
URL: https://github.com/apache/arrow/pull/36768#discussion_r1268164338


##########
python/pyarrow/tests/test_exec_plan.py:
##########
@@ -321,3 +321,17 @@ def test_join_extension_array_column():
     result = _perform_join(
         "left outer", t1, ["colB"], t3, ["colC"])
     assert result["colB"] == pa.chunked_array(ext_array)
+
+
+def test_group_by_ordering():
+    # GH-36709 - preserve ordering in groupby by setting use_threads=False
+    table1 = pa.table({'a': [1, 2, 3, 4], 'b': ['a'] * 4})
+    table2 = pa.table({'a': [1, 2, 3, 4], 'b': ['b'] * 4})
+    table = pa.concat_tables([table1, table2])
+
+    for _ in range(50):
+        # 50 seems to consistently cause errors when order is not preserved.
+        # If the order problem is reintroduced this test will become flaky
+        # which is still a signal that the order is not preserved.

Review Comment:
   I do worry that a flaky test in the future might go unnoticed for a while. 
For example, a developer might just rerun the test until it "passes" once and 
assume the flakiness is unexpected. I don't really see a better way to test 
something like this without having control over the Acero execution 
environment, though. This is a good approach given this constraint!
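
To put a rough number on why the repeated loop helps, here is a minimal sketch. It assumes each iteration fails independently with some probability `p_single` when ordering is broken. Both the independence assumption and the example values are hypothetical and not measured against Acero; `detection_probability` is an illustrative helper, not part of pyarrow.

```python
def detection_probability(p_single: float, iterations: int) -> float:
    """Chance that at least one of `iterations` independent runs fails.

    p_single is the (hypothetical) probability that a single group_by call
    returns rows out of order when the ordering bug is present.
    """
    return 1.0 - (1.0 - p_single) ** iterations


if __name__ == "__main__":
    # Hypothetical per-iteration failure rates, chosen only for illustration.
    for p_single in (0.01, 0.05, 0.2):
        caught = detection_probability(p_single, iterations=50)
        print(f"p_single={p_single}: test run fails with probability {caught:.3f}")
```

If the per-iteration failure rate turns out to be low, a single test run can still pass with a regression present, which is the scenario where a rerun-until-green habit would hide the problem.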



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
