[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632727#comment-13632727 ]

Namit Jain commented on HIVE-4167:

added comments

Hive converts bucket map join to SMB join even when tables are not sorted

Key: HIVE-4167
URL: https://issues.apache.org/jira/browse/HIVE-4167
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
Attachments: hive.4167.1.patch, hive.4167.2.patch, hive.4167.3.patch, hive.4167.4.patch, hive.4167.4.patch-nohcat, HIVE-4167.patch

If tables are just bucketed but not sorted, we are generating smb join operator. This results in loss of rows in queries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
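The row loss described in this issue can be illustrated with a toy model. The sketch below is not Hive code: `sort_merge_join` and `hash_join` are hypothetical helpers written for this note, and the sample rows are invented. It only shows why a single-pass merge, which assumes both inputs are sorted on the join key, silently drops matches when one input is unsorted.

```python
def sort_merge_join(left, right):
    """Inner-join two lists of (key, value) pairs in one forward pass.

    The result is only correct if BOTH inputs are sorted by key.
    """
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key is behind, advance left
        elif lk > rk:
            j += 1          # right key is behind, advance right
        else:
            # Find the run of equal keys on each side and emit their product.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for _, lv in left[i:i_end]:
                for _, rv in right[j:j_end]:
                    out.append((lk, lv, rv))
            i, j = i_end, j_end
    return out


def hash_join(left, right):
    """Reference inner join that does not depend on input order."""
    index = {}
    for k, v in right:
        index.setdefault(k, []).append(v)
    return [(k, lv, rv) for k, lv in left for rv in index.get(k, [])]


# Invented sample data: the left side is sorted, the right side is not.
left = [("1", "a"), ("2", "b"), ("3", "c")]
right_unsorted = [("3", "z"), ("1", "x"), ("2", "y")]

print(len(hash_join(left, right_unsorted)))               # every match found
print(len(sort_merge_join(left, sorted(right_unsorted))))  # same count once sorted
print(len(sort_merge_join(left, right_unsorted)))          # fewer rows: matches lost
```

In this model the merge skips past keys "1" and "2" on the left while looking for "3", so those matches are never produced and no error is raised.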
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633062#comment-13633062 ]

Hudson commented on HIVE-4167:

Integrated in Hive-trunk-hadoop2 #162 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/162/])
HIVE-4167 Hive converts bucket map join to SMB join even when tables are not sorted (Namit Jain via Ashutosh) (Revision 1468349)

Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468349
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java
* /hive/trunk/ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633661#comment-13633661 ]

Hudson commented on HIVE-4167:

Integrated in Hive-trunk-h0.21 #2067 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2067/])
HIVE-4167 Hive converts bucket map join to SMB join even when tables are not sorted (Namit Jain via Ashutosh) (Revision 1468349)

Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468349
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java
* /hive/trunk/ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631785#comment-13631785 ]

Ashutosh Chauhan commented on HIVE-4167:

+1 [~namit] Can you take care of committing this?

Assignee: Namit Jain
Priority: Blocker
Attachments: hive.4167.1.patch, hive.4167.2.patch, HIVE-4167.patch
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629948#comment-13629948 ]

Namit Jain commented on HIVE-4167:

I was able to reproduce it:

CREATE TABLE bucket_small (key string, value string) partitioned by (ds string)
CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
load data local inpath '../data/files/smallsrcsortbucket1outof4.txt' INTO TABLE bucket_small partition(ds='2008-04-08');
load data local inpath '../data/files/smallsrcsortbucket2outof4.txt' INTO TABLE bucket_small partition(ds='2008-04-08');

CREATE TABLE bucket_big (key string, value string) partitioned by (ds string)
CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE;
load data local inpath '../data/files/srcsortbucket1outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket2outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket3outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket4outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');

load data local inpath '../data/files/srcsortbucket1outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket2outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket3outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket4outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');

set hive.auto.convert.join=true;
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;

-- Since size is being used to find the big table, the order of the tables in the join does not matter
explain extended select count(*) FROM bucket_small a JOIN bucket_big b ON a.key = b.key;
select count(*) FROM bucket_small a JOIN bucket_big b ON a.key = b.key;

explain extended select count(*) FROM bucket_big a JOIN bucket_small b ON a.key = b.key;
select count(*) FROM bucket_big a JOIN bucket_small b ON a.key = b.key;

Assignee: Namit Jain
Priority: Blocker
Attachments: HIVE-4167.patch
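The reproduction above uses tables declared with CLUSTERED BY but without SORTED BY. As a rough illustration of the condition at stake, the following sketch models a planner decision in which a sort-merge bucket join is chosen only when every table is both bucketed and sorted on the join keys. This is an assumption-laden toy: `TableLayout`, `bucketed_on`, `sorted_on`, and `choose_join` are hypothetical names invented here, and Hive's actual checks (for example in AbstractSMBJoinProc.java) consider more conditions than this.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TableLayout:
    """Hypothetical, simplified description of a table's physical layout."""
    name: str
    bucket_cols: List[str]
    num_buckets: int
    sort_cols: List[str] = field(default_factory=list)


def bucketed_on(table: TableLayout, join_keys: List[str]) -> bool:
    # Simplification: require the bucketing columns to equal the join keys.
    return table.num_buckets > 0 and table.bucket_cols == join_keys


def sorted_on(table: TableLayout, join_keys: List[str]) -> bool:
    # Simplification: require the sort columns to equal the join keys.
    return table.sort_cols == join_keys


def choose_join(tables: List[TableLayout], join_keys: List[str]) -> str:
    """Pick a join strategy; SMB is allowed only if all tables are sorted."""
    if all(bucketed_on(t, join_keys) for t in tables):
        if all(sorted_on(t, join_keys) for t in tables):
            return "sort-merge bucket join"
        return "bucket map join"
    return "common join"


# Layouts mirroring the reproduction: bucketed on key, no sort columns.
bucket_small = TableLayout("bucket_small", ["key"], 2)
bucket_big = TableLayout("bucket_big", ["key"], 4)

print(choose_join([bucket_small, bucket_big], ["key"]))
```

Under this model the two tables from the reproduction qualify for a bucket map join but not for a sort-merge bucket join, which matches the behavior the issue title asks for.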
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629956#comment-13629956 ]

Vikram Dixit K commented on HIVE-4167:

Hi [~namit]

I have a rebased patch on trunk. I was trying to produce a test using the tables available in the unit tests. Can I use the test you have provided in this jira?

Thanks
Vikram.

Assignee: Namit Jain
Priority: Blocker
Attachments: HIVE-4167.patch
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629965#comment-13629965 ]

Vikram Dixit K commented on HIVE-4167:

Hi Namit,

I was able to reproduce this issue on my setup. However, I wasn't sure how to reproduce it using the tables in the unit tests. I can provide an updated patch with your test right away. I am still actively working on this issue.

Thanks
Vikram.

--
Nothing better than when appreciated for hard work.
-Mark

Assignee: Namit Jain
Priority: Blocker
Attachments: HIVE-4167.patch
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629966#comment-13629966 ]

Namit Jain commented on HIVE-4167:

I have a fix, and a testcase for this. Can you take a look?

Assignee: Namit Jain
Priority: Blocker
Attachments: hive.4167.1.patch, HIVE-4167.patch
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629967#comment-13629967 ]

Namit Jain commented on HIVE-4167:

https://reviews.facebook.net/D10209

Assignee: Namit Jain
Priority: Blocker
Attachments: hive.4167.1.patch, HIVE-4167.patch
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629744#comment-13629744 ]

Namit Jain commented on HIVE-4167:

This is a very serious bug [~vikram.dixit], can you answer Ashutosh's question, and in your patch add a test along with a phabricator/reviewboard entry?

Assignee: Vikram Dixit K
Priority: Blocker
Attachments: HIVE-4167.patch
[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted
[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621058#comment-13621058 ]

Ashutosh Chauhan commented on HIVE-4167:

[~vikram.dixit] Does this happen only in presence of HIVE-3891 or even without it?

Assignee: Vikram Dixit K
Attachments: HIVE-4167.patch