[jira] [Commented] (SPARK-35346) More clause needed for combining groupby and cube

Takeshi Yamamuro (Jira) Wed, 12 May 2021 05:21:04 -0700


    [ https://issues.apache.org/jira/browse/SPARK-35346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17343208#comment-17343208 ]


Takeshi Yamamuro commented on SPARK-35346:
------------------------------------------

Do you mean this feature? https://issues.apache.org/jira/browse/SPARK-33229
(https://github.com/apache/spark/blame/master/sql/core/src/test/resources/sql-tests/inputs/group-analytics.sql#L74-L81)

If so, it is already supported in the current master.
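
For reference, the mixed form from the description ("group by xxx, xxx, cube(xxx,xxx)") is equivalent to a smaller list of grouping sets than a full cube over every column. The sketch below is plain Python, not a Spark API: the helper names `partial_cube_grouping_sets` and `grouping_sets_clause` and the column names `a`, `b`, `c` are made up for illustration, and whether a given Spark version accepts the rendered clause is an assumption to check against its SQL reference.

```python
from itertools import combinations


def partial_cube_grouping_sets(group_cols, cube_cols):
    """Expand GROUP BY group_cols, CUBE(cube_cols) into explicit grouping sets.

    Hypothetical helper for illustration only; it is not part of PySpark.
    """
    sets = []
    # CUBE produces every subset of cube_cols, from the full set down to the
    # empty set; the plain group-by columns are kept in every grouping set.
    for size in range(len(cube_cols), -1, -1):
        for subset in combinations(cube_cols, size):
            sets.append(tuple(group_cols) + subset)
    return sets


def grouping_sets_clause(group_cols, cube_cols):
    """Render the expansion as a standard-SQL GROUPING SETS clause."""
    sets = partial_cube_grouping_sets(group_cols, cube_cols)
    rendered = ", ".join("(" + ", ".join(cols) + ")" for cols in sets)
    return "GROUPING SETS (" + rendered + ")"


if __name__ == "__main__":
    # GROUP BY a, CUBE(b, c) needs 4 grouping sets; CUBE(a, b, c) would need 8.
    print(partial_cube_grouping_sets(["a"], ["b", "c"]))
    print(grouping_sets_clause(["a"], ["b", "c"]))
```

With one plain column and two cubed columns this gives 4 grouping sets instead of the 8 a full cube would compute, which is the cost difference the description points at.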

> More clause needed for combining groupby and cube
> -------------------------------------------------
>
>                 Key: SPARK-35346
>                 URL: https://issues.apache.org/jira/browse/SPARK-35346
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 3.0.0, 3.0.2, 3.1.1
>            Reporter: Kai
>            Priority: Major
>
> As we all know, an aggregation must directly follow a groupby, rollup, or 
> cube call in PySpark. I think this part needs more flexibility. In SQL, we 
> can write "group by xxx, xxx, cube(xxx,xxx)". In PySpark, however, there is 
> no way to cube one field and plainly group by the others. Cubing all fields 
> adds significant cost by computing rows that are never used. So I think we 
> should improve this. Thank you!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
