[GitHub] spark pull request #21291: [SPARK-24242][SQL] RangeExec should have correct ...

mgaido91 Thu, 17 May 2018 01:13:32 -0700

Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21291#discussion_r188872101
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -5239,8 +5239,8 @@ def test_complex_groupby(self):
             expected2 = df.groupby().agg(sum(df.v))
     
             # groupby one column and one sql expression
    -        result3 = df.groupby(df.id, df.v % 2).agg(sum_udf(df.v))
    -        expected3 = df.groupby(df.id, df.v % 2).agg(sum(df.v))
    +        result3 = df.groupby(df.id, df.v % 2).agg(sum_udf(df.v)).orderBy(df.id, df.v % 2)
    --- End diff --
    
    Thanks for your detailed explanation. Anyway, can we just use `orderBy(df.id)` instead of `orderBy(df.id, df.v % 2)`?
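
To make the trade-off in this question concrete, here is a minimal plain-Python sketch (not Spark code). It assumes, hypothetically, that each `id` has groups for both values of `v % 2`, and that the grouped rows may come back in a different order from one run to the next. The row values and the helper names `order_by_id` and `order_by_id_and_parity` are made up for illustration.

```python
# Hypothetical grouped rows: (id, v % 2, sum of v). Values are illustrative only.
rows_run_a = [(0, 0, 120.0), (0, 1, 125.0), (1, 0, 130.0), (1, 1, 125.0)]
# The same rows, as if they had been returned in a different order.
rows_run_b = [(1, 1, 125.0), (0, 1, 125.0), (1, 0, 130.0), (0, 0, 120.0)]


def order_by_id(rows):
    # Sort on the first grouping key only; rows sharing an id keep their input order.
    return sorted(rows, key=lambda r: r[0])


def order_by_id_and_parity(rows):
    # Sort on both grouping keys, which together identify each row uniquely.
    return sorted(rows, key=lambda r: (r[0], r[1]))


# Sorting by id alone leaves ties, so the two runs can still differ.
same_with_id_only = order_by_id(rows_run_a) == order_by_id(rows_run_b)
# Sorting by both keys gives one total order regardless of input order.
same_with_both_keys = order_by_id_and_parity(rows_run_a) == order_by_id_and_parity(rows_run_b)
```

Under these assumptions, ordering by `id` alone is only enough when `id` by itself identifies each output row; if the test data has a single `v % 2` value per `id`, the shorter `orderBy(df.id)` would be sufficient.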


