[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-19 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829567#comment-15829567 ] Hive QA commented on HIVE-15580: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829346#comment-15829346 ] Hive QA commented on HIVE-15580: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Dapeng Sun (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829218#comment-15829218 ] Dapeng Sun commented on HIVE-15580: Thanks [~xuefuz] for the suggestion; currently the heap size is 290G

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829203#comment-15829203 ] Xuefu Zhang commented on HIVE-15580: [~dapengsun], for the OOM error you get, you can probably

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Dapeng Sun (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829154#comment-15829154 ] Dapeng Sun commented on HIVE-15580: Thanks [~xuefuz], [~csun] and [~Ferd], we are running a 100TB test

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15829099#comment-15829099 ] Ferdinand Xu commented on HIVE-15580: [~xuefuz], patch in HIVE-15527 solved the issue and we're

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828249#comment-15828249 ] Rui Li commented on HIVE-15580: Hmm, Spark does make it clear groupByKey can't handle large keys:
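Context for the excerpt above: per partition, groupByKey effectively builds an in-memory list of every value for a key, so memory use grows with the largest key's value count. A minimal pure-Python sketch of that behavior (illustrative only, not Hive or Spark code):

```python
# Illustrative only -- not the Hive/Spark implementation. A dict-of-lists
# grouping (what groupByKey effectively does per partition) must hold every
# value for a key in memory at once, so one hot key can exhaust the heap.
from collections import defaultdict

def group_by_key(pairs):
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)  # all values for a key buffered in memory
    return dict(groups)

pairs = [("a", 1), ("b", 2), ("a", 3), ("a", 4)]
print(group_by_key(pairs))  # -> {'a': [1, 3, 4], 'b': [2]}
```

This is why the Spark docs steer users toward reduceByKey or aggregateByKey when a per-key combine is possible, and why a skewed key can OOM an executor regardless of heap size.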

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828183#comment-15828183 ] Rui Li commented on HIVE-15580: [~xuefuz], thanks for your explanations. It makes sense. So in general, the

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828155#comment-15828155 ] Xuefu Zhang commented on HIVE-15580: Hi [~lirui], your understanding is correct. And yes, groupByKey

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828143#comment-15828143 ] Rui Li commented on HIVE-15580: Hi [~xuefuz], I'd like to check my understanding too. Before the patch, we

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-17 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827470#comment-15827470 ] Xuefu Zhang commented on HIVE-15580: [~Ferd], functionally, I don't see anything bad because

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-17 Thread Ferdinand Xu (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827399#comment-15827399 ] Ferdinand Xu commented on HIVE-15580: Hi [~xuefuz], the main change is about replacing *groupByKey*
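The bounded-memory idea behind replacing groupByKey is sort-based grouping: if records arrive sorted by key, each key's values can be consumed as a stream instead of being buffered as a list. A simplified pure-Python sketch of that idea (an assumed illustration, not the actual HIVE-15580 patch; Spark performs the sort externally during the shuffle, whereas `sorted()` here is in-memory):

```python
# Illustrative sketch of sort-based streaming grouping (not the HIVE-15580
# patch itself). Once records are sorted by key, each group can be consumed
# one record at a time, so per-key memory stays bounded.
from itertools import groupby
from operator import itemgetter

def stream_groups(pairs):
    # groupby requires key-sorted input; it then yields each key's values
    # as a lazy iterator rather than a materialized list.
    for key, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0)):
        total = 0
        for _, value in group:  # consume incrementally, e.g. a running sum
            total += value
        yield key, total

print(list(stream_groups([("a", 1), ("b", 2), ("a", 3)])))  # -> [('a', 4), ('b', 2)]
```

In Spark terms, the analogous building block would be a shuffle that sorts within partitions (e.g. repartitionAndSortWithinPartitions) followed by iterator-based processing, so no single key's values need to fit in memory at once.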

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-11 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819638#comment-15819638 ] Hive QA commented on HIVE-15580: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-11 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15819356#comment-15819356 ] Hive QA commented on HIVE-15580: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-15580) Replace Spark's groupByKey operator with something with bounded memory

2017-01-11 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-15580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818740#comment-15818740 ] Hive QA commented on HIVE-15580: Here are the results of testing the latest attachment: