Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21291#discussion_r188872101

    --- Diff: python/pyspark/sql/tests.py ---
    @@ -5239,8 +5239,8 @@ def test_complex_groupby(self):
             expected2 = df.groupby().agg(sum(df.v))

             # groupby one column and one sql expression
    -        result3 = df.groupby(df.id, df.v % 2).agg(sum_udf(df.v))
    -        expected3 = df.groupby(df.id, df.v % 2).agg(sum(df.v))
    +        result3 = df.groupby(df.id, df.v % 2).agg(sum_udf(df.v)).orderBy(df.id, df.v % 2)
    --- End diff --

    Thanks for your detailed explanation. Anyway, can we just use `orderBy(df.id)` instead of `orderBy(df.id, df.v % 2)`?
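
Whether `orderBy(df.id)` alone would be enough likely depends on whether `id` uniquely identifies each output row. Since the test groups by `(df.id, df.v % 2)`, one `id` can appear in two groups. Below is a minimal plain-Python sketch of that concern, not Spark code: the rows, values, and the helper `sort_rows` are made up for illustration, and Python's stable sort stands in for "ties keep whatever order they arrived in".

```python
# Hypothetical aggregated rows: (id, v % 2, sum). Values are made up for illustration.
rows_a = [(1, 0, 20.0), (1, 1, 10.0), (2, 0, 40.0), (2, 1, 30.0)]
# The same rows arriving in a different order, as two different query plans might produce.
rows_b = [(2, 1, 30.0), (1, 1, 10.0), (2, 0, 40.0), (1, 0, 20.0)]


def sort_rows(rows, key_len):
    """Sort by the first key_len columns only.

    Python's sort is stable, so rows that tie on the key keep their input order.
    """
    return sorted(rows, key=lambda r: r[:key_len])


# Sorting by id only: rows sharing an id stay in arrival order, so the results differ.
ids_only_match = sort_rows(rows_a, 1) == sort_rows(rows_b, 1)

# Sorting by (id, v % 2): the key is unique per group, so the results are identical.
full_key_match = sort_rows(rows_a, 2) == sort_rows(rows_b, 2)

print(ids_only_match, full_key_match)
```

If this analogy carries over, ordering by `id` alone would make the comparison between `result3` and `expected3` depend on how ties within an `id` happen to be arranged, while ordering by the full grouping key would not.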