[GitHub] spark pull request #21082: [SPARK-22239][SQL][Python] Enable grouped aggrega...

ueshin Fri, 18 May 2018 02:59:04 -0700

Github user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21082#discussion_r189198335
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -5181,6 +5190,235 @@ def test_invalid_args(self):
                         'mixture.*aggregate function.*group aggregate pandas UDF'):
                     df.groupby(df.id).agg(mean_udf(df.v), mean(df.v)).collect()
     
    +
    +@unittest.skipIf(
    +    not _have_pandas or not _have_pyarrow,
    +    _pandas_requirement_message or _pyarrow_requirement_message)
    +class WindowPandasUDFTests(ReusedSQLTestCase):
    +    @property
    +    def data(self):
    +        from pyspark.sql.functions import array, explode, col, lit
    +        return self.spark.range(10).toDF('id') \
    +            .withColumn("vs", array([lit(i * 1.0) + col('id') for i in range(20, 30)])) \
    +            .withColumn("v", explode(col('vs'))) \
    +            .drop('vs') \
    +            .withColumn('w', lit(1.0))
    +
    +    @property
    +    def python_plus_one(self):
    --- End diff --
    
    Shall we move the `pandas_udf`s used by these tests to a common place?
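
One possible shape for such a common place is a mixin that several test classes inherit from. The sketch below is only an illustration of that idea, not the change made in the pull request: `SharedUDFsMixin`, `plus_one`, `mean_of`, `GroupedAggTests`, and `WindowTests` are hypothetical names, and plain Python callables stand in for the real UDFs so the example runs without Spark, pandas, or pyarrow.

```python
import unittest


class SharedUDFsMixin(object):
    """Hypothetical mixin holding helpers shared by several test classes."""

    @property
    def plus_one(self):
        # Stand-in for a scalar UDF; a real test would wrap this
        # function with pyspark's udf or pandas_udf instead.
        return lambda v: v + 1

    @property
    def mean_of(self):
        # Stand-in for a grouped aggregate UDF.
        return lambda values: sum(values) / float(len(values))


class GroupedAggTests(SharedUDFsMixin, unittest.TestCase):
    def test_mean(self):
        self.assertEqual(self.mean_of([1.0, 2.0, 3.0]), 2.0)


class WindowTests(SharedUDFsMixin, unittest.TestCase):
    def test_plus_one_then_mean(self):
        # Both test classes reuse the same helpers from the mixin.
        shifted = [self.plus_one(v) for v in [1.0, 2.0, 3.0]]
        self.assertEqual(self.mean_of(shifted), 3.0)
```

With this layout each helper is defined once, and a new test class gains access to all of them by adding the mixin to its bases.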


