[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

Rui Li (JIRA) Wed, 18 Jan 2017 06:39:11 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828183#comment-15828183
 ]


Rui Li commented on HIVE-15580:
-------------------------------

[~xuefuz], thanks for the explanation. It makes sense. So in general, the
input to reducers doesn't have to be <Key, Iterator<Value>>, right? I think one
drawback is that we have to shuffle more data over the network. And I'm curious:
does this happen with MR too, i.e. does MR also spill at key-group boundaries?
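
To make the question about reducer input concrete, here is a minimal sketch of the two input shapes. It is illustrative only and is not Hive's or Spark's actual code; the function names (group_sorted, reduce_grouped, reduce_flat_sorted) and the sum aggregation are hypothetical. The first reducer is handed <Key, Iterator<Value>> groups; the second is handed a flat, key-sorted stream of pairs and detects the key-group boundaries itself, holding only the current key and a running aggregate.

```python
from itertools import groupby
from operator import itemgetter


def group_sorted(pairs):
    """Turn key-sorted (key, value) pairs into (key, iterator-of-values) groups."""
    for key, group in groupby(pairs, key=itemgetter(0)):
        yield key, (value for _, value in group)


def reduce_grouped(grouped):
    """Reducer fed <Key, Iterator<Value>>: one aggregation per key group."""
    return {key: sum(values) for key, values in grouped}


def reduce_flat_sorted(pairs):
    """Reducer fed a flat, key-sorted stream of (key, value) pairs.

    Only the current key and a running sum are kept; a change of key
    marks the boundary between two key groups.
    """
    result = {}
    current_key, acc, seen_any = None, 0, False
    for key, value in pairs:
        if seen_any and key != current_key:
            result[current_key] = acc  # key-group boundary reached
            acc = 0
        current_key, seen_any = key, True
        acc += value
    if seen_any:
        result[current_key] = acc  # flush the last group
    return result


rows = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("c", 5)]
sorted_rows = sorted(rows, key=itemgetter(0))  # stands in for the sorted shuffle output
```

Both reducers produce the same totals for the same sorted input; the sketch says nothing about how much data either approach moves over the network.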

> Replace Spark's groupByKey operator with something with bounded memory
> ----------------------------------------------------------------------
>
>                 Key: HIVE-15580
>                 URL: https://issues.apache.org/jira/browse/HIVE-15580
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, 
> HIVE-15580.2.patch, HIVE-15580.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
