[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

HyukjinKwon Mon, 15 Jan 2018 05:02:39 -0800

Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19872#discussion_r161495918
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -4279,6 +4272,386 @@ def test_unsupported_types(self):
                     df.groupby('id').apply(f).collect()
     
     
    +@unittest.skipIf(not _have_pandas or not _have_arrow, "Pandas or Arrow not 
installed")
    +class GroupbyAggTests(ReusedSQLTestCase):
    +
    +    @property
    +    def data(self):
    +        from pyspark.sql.functions import array, explode, col, lit
    +        return self.spark.range(10).toDF('id') \
    +            .withColumn("vs", array([lit(i * 1.0) + col('id') for i in 
range(20, 30)])) \
    +            .withColumn("v", explode(col('vs'))) \
    +            .drop('vs') \
    +            .withColumn('w', lit(1.0))
    +
    +    @property
    +    def plus_one(self):
    --- End diff --
    
    Could we add a prefix to note the differences like `udf`, `padnas_udf` and 
`functionType`?



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request #19872: [SPARK-22274][PYTHON][SQL] User-defined aggregati...

Reply via email to