[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828183#comment-15828183 ]

Rui Li commented on HIVE-15580:
-------------------------------

[~xuefuz], thanks for your explanations. It makes sense.
So in general, the input to reducers doesn't have to be <Key, Iterator<Value>>, right? I think one drawback of this is that we have to shuffle more data over the network.
I'm also curious whether this happens with MR too, i.e. does MR also spill at key-group boundaries?

> Replace Spark's groupByKey operator with something with bounded memory
> ----------------------------------------------------------------------
>
>                 Key: HIVE-15580
>                 URL: https://issues.apache.org/jira/browse/HIVE-15580
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-15580.1.patch, HIVE-15580.1.patch, 
> HIVE-15580.2.patch, HIVE-15580.patch
>
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)