[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-16 Thread Namit Jain (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13632727#comment-13632727 ]

Namit Jain commented on HIVE-4167:
--

added comments

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Namit Jain
Priority: Blocker
 Attachments: hive.4167.1.patch, hive.4167.2.patch, hive.4167.3.patch, 
 hive.4167.4.patch, hive.4167.4.patch-nohcat, HIVE-4167.patch


 If tables are bucketed but not sorted, Hive still generates an SMB (sort-merge
 bucket) join operator. This results in loss of rows in queries.
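
 Why a merge-style join can lose rows on unsorted input can be sketched outside
 Hive. The following is a minimal illustration in Python, not Hive's actual
 operator code: merge_join_count and hash_join_count are hypothetical helper
 names, and the key lists are made-up sample data. A two-pointer merge only
 moves forward, so when one side is not sorted it can skip past keys that
 would have matched later.

```python
from collections import Counter


def merge_join_count(left, right):
    """Count matching key pairs with a two-pointer merge.

    The result is only correct when BOTH inputs are sorted by key.
    """
    i = j = matches = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            i += 1
        elif left[i] > right[j]:
            j += 1
        else:
            key = left[i]
            # Measure the run of equal keys on each side.
            li = i
            while li < len(left) and left[li] == key:
                li += 1
            rj = j
            while rj < len(right) and right[rj] == key:
                rj += 1
            matches += (li - i) * (rj - j)
            i, j = li, rj
    return matches


def hash_join_count(left, right):
    """Order-independent reference count (what a map join would return)."""
    counts = Counter(left)
    return sum(counts[k] for k in right)


# Hypothetical sample keys: "small" is bucketed but NOT sorted.
small = ["3", "1", "2"]
big = ["1", "2", "3", "3"]

expected = hash_join_count(small, big)
on_unsorted = merge_join_count(small, big)
on_sorted = merge_join_count(sorted(small), sorted(big))
print(expected, on_unsorted, on_sorted)
```

 On this sample the merge over the unsorted side reports fewer matches than the
 hash-based count, while sorting both sides first restores agreement. That is
 the same kind of row loss described above.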

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-16 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633062#comment-13633062 ]

Hudson commented on HIVE-4167:
--

Integrated in Hive-trunk-hadoop2 #162 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/162/])
HIVE-4167 Hive converts bucket map join to SMB join even when tables are 
not sorted
(Namit Jain via Ashutosh) (Revision 1468349)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468349
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java
* /hive/trunk/ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out




[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-16 Thread Hudson (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633661#comment-13633661 ]

Hudson commented on HIVE-4167:
--

Integrated in Hive-trunk-h0.21 #2067 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/2067/])
HIVE-4167 Hive converts bucket map join to SMB join even when tables are 
not sorted
(Namit Jain via Ashutosh) (Revision 1468349)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1468349
Files : 
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/AbstractSMBJoinProc.java
* /hive/trunk/ql/src/test/queries/clientpositive/auto_sortmerge_join_11.q
* /hive/trunk/ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out




[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-15 Thread Ashutosh Chauhan (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13631785#comment-13631785 ]

Ashutosh Chauhan commented on HIVE-4167:


+1 [~namit] Can you take care of committing this?



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629948#comment-13629948 ]

Namit Jain commented on HIVE-4167:
--

I was able to reproduce it:

CREATE TABLE bucket_small (key string, value string) partitioned by (ds string) CLUSTERED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
load data local inpath '../data/files/smallsrcsortbucket1outof4.txt' INTO TABLE bucket_small partition(ds='2008-04-08');
load data local inpath '../data/files/smallsrcsortbucket2outof4.txt' INTO TABLE bucket_small partition(ds='2008-04-08');

CREATE TABLE bucket_big (key string, value string) partitioned by (ds string) CLUSTERED BY (key) INTO 4 BUCKETS STORED AS TEXTFILE;
load data local inpath '../data/files/srcsortbucket1outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket2outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket3outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');
load data local inpath '../data/files/srcsortbucket4outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-08');

load data local inpath '../data/files/srcsortbucket1outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket2outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket3outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');
load data local inpath '../data/files/srcsortbucket4outof4.txt' INTO TABLE bucket_big partition(ds='2008-04-09');

set hive.auto.convert.join=true;
set hive.auto.convert.sortmerge.join=true;
set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;

-- Since size is being used to find the big table, the order of the tables in the join does not matter
explain extended select count(*) FROM bucket_small a JOIN bucket_big b ON a.key = b.key;
select count(*) FROM bucket_small a JOIN bucket_big b ON a.key = b.key;

explain extended select count(*) FROM bucket_big a JOIN bucket_small b ON a.key = b.key;
select count(*) FROM bucket_big a JOIN bucket_small b ON a.key = b.key;



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Vikram Dixit K (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629956#comment-13629956 ]

Vikram Dixit K commented on HIVE-4167:
--

Hi [~namit]

I have a rebased patch on trunk. I was trying to produce a test using the 
tables available in the unit tests. Can I use the test you have provided in 
this jira?

Thanks
Vikram.



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Vikram Dixit K (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629965#comment-13629965 ]

Vikram Dixit K commented on HIVE-4167:
--

Hi Namit,

I have been able to reproduce this issue on my setup. However, I wasn't
sure how to reproduce it using the tables in the unit tests. I can
provide an updated patch with your test right away. I am still actively
working on this issue.

Thanks
Vikram.










[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629966#comment-13629966 ]

Namit Jain commented on HIVE-4167:
--

I have a fix and a test case for this.
Can you take a look?





[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-12 Thread Namit Jain (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629967#comment-13629967 ]

Namit Jain commented on HIVE-4167:
--

https://reviews.facebook.net/D10209



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-11 Thread Namit Jain (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629744#comment-13629744 ]

Namit Jain commented on HIVE-4167:
--

This is a very serious bug.
[~vikram.dixit], can you answer Ashutosh's question, and add a test to your 
patch along with a Phabricator/ReviewBoard entry?

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
Priority: Blocker
 Attachments: HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.



[jira] [Commented] (HIVE-4167) Hive converts bucket map join to SMB join even when tables are not sorted

2013-04-03 Thread Ashutosh Chauhan (JIRA)

[ https://issues.apache.org/jira/browse/HIVE-4167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621058#comment-13621058 ]

Ashutosh Chauhan commented on HIVE-4167:


[~vikram.dixit] Does this happen only in the presence of HIVE-3891, or even 
without it?

 Hive converts bucket map join to SMB join even when tables are not sorted
 -

 Key: HIVE-4167
 URL: https://issues.apache.org/jira/browse/HIVE-4167
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.11.0
Reporter: Vikram Dixit K
Assignee: Vikram Dixit K
 Attachments: HIVE-4167.patch


 If tables are just bucketed but not sorted, we are generating smb join 
 operator. This results in loss of rows in queries.
