[jira] [Commented] (HIVE-28489) Partitioning the input data of Grouping Set GroupBy operator

Stamatis Zampetakis (Jira) Wed, 30 Oct 2024 03:13:26 -0700


    [ 
https://issues.apache.org/jira/browse/HIVE-28489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17894117#comment-17894117
 ]


Stamatis Zampetakis commented on HIVE-28489:
--------------------------------------------

Hey [~seonggon], according to the [discussion in the user 
list|https://lists.apache.org/thread/o2rm4xjmlfv7co2ort7o5tpg37bo5zhj] there 
might be a new patch around this optimization. Is PR#5424 ready for review or 
should we wait for another PR?

nit: I was checking the attached slides, but since I don't have PowerPoint 
they appear somewhat broken in LibreOffice. Consider sharing such info in a 
more widely supported format such as PDF.

> Partitioning the input data of Grouping Set GroupBy operator
> ------------------------------------------------------------
>
>                 Key: HIVE-28489
>                 URL: https://issues.apache.org/jira/browse/HIVE-28489
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Seonggon Namgung
>            Assignee: Seonggon Namgung
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: 2.PartitionDataBeforeGroupingSet.pptx
>
>
> A GroupBy operator with grouping sets often emits too many rows, which becomes 
> the bottleneck of query execution. To reduce the number of output rows, this 
> JIRA proposes partitioning the input data of such a GroupBy operator.
> Please check out the attached slides for a detailed explanation.
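
For readers without access to the slides, the row blow-up described above can 
be illustrated with a small Python sketch. It is only a toy model and not the 
design from the slides or the PR: it assumes each input row is expanded once 
per grouping set and then partially aggregated per task, and that 
"partitioning" means routing rows with the same value of one grouping column 
to the same task. All names (expand, partial_aggregate, emitted_rows) are 
hypothetical.

```python
from collections import Counter

def expand(row, grouping_sets, columns):
    """Yield one key per grouping set; columns outside the set become None."""
    for gs in grouping_sets:
        yield tuple(row[c] if c in gs else None for c in columns)

def partial_aggregate(rows, grouping_sets, columns):
    """Toy per-task hash aggregation (COUNT(*)) over the expanded rows."""
    counts = Counter()
    for row in rows:
        for key in expand(row, grouping_sets, columns):
            counts[key] += 1
    return counts

def emitted_rows(tasks, grouping_sets, columns):
    """Total number of partial-aggregate rows emitted across all tasks."""
    return sum(len(partial_aggregate(t, grouping_sets, columns)) for t in tasks)

columns = ("a", "b")
# Equivalent to GROUP BY a, b WITH CUBE: four grouping sets.
grouping_sets = [("a", "b"), ("a",), ("b",), ()]
rows = [{"a": a, "b": b} for a in range(4) for b in range(4)]

# Assumed baseline: every task sees a mix of values of both columns.
mixed = [[r for r in rows if (r["a"] + r["b"]) % 4 == i] for i in range(4)]
# Assumed partitioning: each task sees exactly one value of column a.
by_a = [[r for r in rows if r["a"] == i] for i in range(4)]

print(len(rows) * len(grouping_sets))             # 64 rows with no aggregation
print(emitted_rows(mixed, grouping_sets, columns))  # 52
print(emitted_rows(by_a, grouping_sets, columns))   # 40
```

In this toy model, partitioning by a collapses the (a) grouping set to a 
single row per task, which is where the reduction comes from. The actual 
operator and partitioning scheme in the patch may differ.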



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
