[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2017-07-03 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16071981#comment-16071981 ] Hive QA commented on HIVE-14797: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-21 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15597040#comment-15597040 ] Rui Li commented on HIVE-14797: --- [~roncenzhao], would you mind updating the patch as Xuefu suggested? Or let

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-20 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592184#comment-15592184 ] Xuefu Zhang commented on HIVE-14797: That sounds good enough to me. > reducer number estimating may

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15591149#comment-15591149 ] Rui Li commented on HIVE-14797: --- [~xuefuz] - I see. Thanks for the information. In that case, I think we

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-19 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590047#comment-15590047 ] Xuefu Zhang commented on HIVE-14797: I guess my point is that the chance for a user to hit this

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-18 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587451#comment-15587451 ] Rui Li commented on HIVE-14797: --- Hi [~xuefuz], for the example in the description, B is skewed but (A, B)

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-18 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587229#comment-15587229 ] Xuefu Zhang commented on HIVE-14797: [~lirui] Choosing a different seed for determining bucket number

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-18 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584787#comment-15584787 ] Hive QA commented on HIVE-14797: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584307#comment-15584307 ] Rui Li commented on HIVE-14797: --- Thanks for the update [~roncenzhao]. I have one more question.

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-17 Thread roncenzhao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15584262#comment-15584262 ] roncenzhao commented on HIVE-14797: --- Hi [~lirui], I have resolved this problem in the new patch.

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-12 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15568152#comment-15568152 ] Rui Li commented on HIVE-14797: --- Seems for MR, we need to get #reducers from hconf, but for Spark/Tez, we

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-11 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15565084#comment-15565084 ] Rui Li commented on HIVE-14797: --- I did some tests locally. It turns out

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-09 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15561154#comment-15561154 ] Xuefu Zhang commented on HIVE-14797: +1 > reducer number estimating may lead to data skew >

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-10-08 Thread roncenzhao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15559171#comment-15559171 ] roncenzhao commented on HIVE-14797: --- Is there anyone who can review this patch? thanks~ > reducer

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-22 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15512290#comment-15512290 ] Hive QA commented on HIVE-14797: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-21 Thread roncenzhao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511843#comment-15511843 ] roncenzhao commented on HIVE-14797: --- I think they are not related to my patch. The failing test cases

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-21 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15511708#comment-15511708 ] Rui Li commented on HIVE-14797: --- I see some failures "did not produce a TEST-*.xml file". Are they related?

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-21 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15510378#comment-15510378 ] Xuefu Zhang commented on HIVE-14797: The new change seems good. Minor nit: can we change the

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-21 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15509958#comment-15509958 ] Hive QA commented on HIVE-14797: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-20 Thread Hive QA (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508829#comment-15508829 ] Hive QA commented on HIVE-14797: Here are the results of testing the latest attachment:

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508805#comment-15508805 ] Rui Li commented on HIVE-14797: --- [~roncenzhao] your solution seems also OK and simpler. Would like to know

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508789#comment-15508789 ] Rui Li commented on HIVE-14797: --- Hmm, a random prime won't work because we need to make sure the same rows always

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-20 Thread roncenzhao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508756#comment-15508756 ] roncenzhao commented on HIVE-14797: --- Or we can do it the following way: let the seed have two options: 31 and

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-20 Thread Rui Li (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508643#comment-15508643 ] Rui Li commented on HIVE-14797: --- If user specifies #reducers to be 31, we shouldn't change it. Is it

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-20 Thread roncenzhao (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15508405#comment-15508405 ] roncenzhao commented on HIVE-14797: --- Yes, we can avoid hard-coding the number (31). But we cannot know which

[jira] [Commented] (HIVE-14797) reducer number estimating may lead to data skew

2016-09-20 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/HIVE-14797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15506967#comment-15506967 ] Xuefu Zhang commented on HIVE-14797: This seems to make sense, but can we avoid hard-coding the number
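
For readers skimming this thread, the kind of skew under discussion can be illustrated with a small sketch. It assumes, and this is an assumption rather than something the snippets above confirm, that a row's hash is built by folding the field hashes with a multiplier of 31 and that the reducer is chosen by taking that hash modulo the reducer count. The names `combined_hash`, `reducer_for`, and `distribution` are hypothetical, and the alternative seed 17 is only an illustrative value, not the one proposed in the thread.

```python
from collections import Counter


def combined_hash(fields, seed=31):
    """Fold per-field hashes into one value: h = seed * h + field.

    Assumption: fields are small non-negative ints used directly as
    their own hash, which keeps the arithmetic easy to follow.
    """
    h = 0
    for f in fields:
        h = seed * h + f
    return h


def reducer_for(fields, num_reducers, seed=31):
    """Pick a reducer by taking the combined hash modulo the reducer count."""
    return combined_hash(fields, seed) % num_reducers


def distribution(rows, num_reducers, seed=31):
    """Count how many rows land on each reducer."""
    return Counter(reducer_for(r, num_reducers, seed) for r in rows)


# Column A is distinct per row, column B is constant (skewed),
# so the pair (A, B) is itself well distributed.
rows = [(a, 7) for a in range(1000)]

# seed == num_reducers == 31: (31 * a + 7) % 31 == 7 for every row.
skewed = distribution(rows, 31)

# Same seed, a reducer count that is not a multiple of 31.
spread = distribution(rows, 32)

# Hypothetical alternative seed with the original reducer count.
reseeded = distribution(rows, 31, seed=17)

print(len(skewed), len(spread), len(reseeded))
```

When the reducer count equals the multiplier, the leading field drops out of the modulo and only the last field decides the reducer, which is why a skewed B can dominate even though (A, B) is not skewed. Because the seed in this sketch is fixed per run rather than random, identical rows still map to the same reducer.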