[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2013-01-09 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13548255#comment-13548255
 ] 

Hudson commented on HIVE-3218:
--

Integrated in Hive-trunk-hadoop2 #54 (See 
[https://builds.apache.org/job/Hive-trunk-hadoop2/54/])
HIVE-3218 Stream table of SMBJoin/BucketMapJoin with two or more 
  partitions is not handled properly (Navis via namit) (Revision 
1367012)

 Result = ABORTED
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1367012
Files : 
* /hive/trunk/data/files/srcsortbucket1outof4.txt
* /hive/trunk/data/files/srcsortbucket2outof4.txt
* /hive/trunk/data/files/srcsortbucket3outof4.txt
* /hive/trunk/data/files/srcsortbucket4outof4.txt
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/BucketMatcher.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DefaultBucketMatcher.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecMapperContext.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/BucketMapJoinOptimizer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/SortedMergeBucketMapJoinOptimizer.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java
* 
/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/BucketMapJoinContext.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapJoinDesc.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_1.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_2.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_3.q
* /hive/trunk/ql/src/test/queries/clientpositive/bucketcontext_4.q
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketcontext_4.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin1.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin3.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin5.q.out
* /hive/trunk/ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/stats11.q.out


 Stream table of SMBJoin/BucketMapJoin with two or more partitions is not 
 handled properly
 -

 Key: HIVE-3218
 URL: https://issues.apache.org/jira/browse/HIVE-3218
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Critical
 Fix For: 0.10.0

 Attachments: HIVE-3218.1.patch.txt, hive.3218.2.patch


 {noformat}
 drop table hive_test_smb_bucket1;
 drop table hive_test_smb_bucket2;
 create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 set hive.enforce.bucketing = true;
 set hive.enforce.sorting = true;
 insert overwrite table hive_test_smb_bucket1 partition (ds='2010-10-14') 
 select key, value from src;
 insert overwrite table hive_test_smb_bucket1 partition (ds='2010-10-15') 
 select key, value from src;
 insert overwrite table hive_test_smb_bucket2 partition (ds='2010-10-15') 
 select key, value from src;
 set hive.optimize.bucketmapjoin = true;
 set hive.optimize.bucketmapjoin.sortedmerge = true;
 set hive.input.format = 
 org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 SELECT /* + MAPJOIN(b) */ * FROM hive_test_smb_bucket1 a JOIN 
 hive_test_smb_bucket2 b ON a.key = b.key;
 {noformat}
 which makes the following bucket join context:
 {noformat}
 Alias Bucket Output File Name Mapping:
 
 

[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424860#comment-13424860
 ] 

Hudson commented on HIVE-3218:
--

Integrated in Hive-trunk-h0.21 #1575 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/1575/])
HIVE-3218 Stream table of SMBJoin/BucketMapJoin with two or more 
  partitions is not handled properly (Navis via namit) (Revision 
1367012)

 Result = FAILURE
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1367012

[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424691#comment-13424691
 ] 

Namit Jain commented on HIVE-3218:
--

@Navis, can you update the test file ?
Let us try to get this in.

 Stream table of SMBJoin/BucketMapJoin with two or more partitions is not 
 handled properly
 -

 Key: HIVE-3218
 URL: https://issues.apache.org/jira/browse/HIVE-3218
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: HIVE-3218.1.patch.txt


 {noformat}
 drop table hive_test_smb_bucket1;
 drop table hive_test_smb_bucket2;
 create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 set hive.enforce.bucketing = true;
 set hive.enforce.sorting = true;
 insert overwrite table hive_test_smb_bucket1 partition (ds='2010-10-14') 
 select key, value from src;
 insert overwrite table hive_test_smb_bucket1 partition (ds='2010-10-15') 
 select key, value from src;
 insert overwrite table hive_test_smb_bucket2 partition (ds='2010-10-15') 
 select key, value from src;
 set hive.optimize.bucketmapjoin = true;
 set hive.optimize.bucketmapjoin.sortedmerge = true;
 set hive.input.format = 
 org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 SELECT /* + MAPJOIN(b) */ * FROM hive_test_smb_bucket1 a JOIN 
 hive_test_smb_bucket2 b ON a.key = b.key;
 {noformat}
 which makes the following bucket join context:
 {noformat}
 Alias Bucket Output File Name Mapping:
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-14/00_0
  0
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-14/01_0
  1
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-15/00_0
  0
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-15/01_0
  1
 {noformat}
 and fails with this exception:
 {noformat}
 java.lang.RuntimeException: Hive Runtime Error while closing operators
   at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:226)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename 
 output from: 
 hdfs://localhost:9000/tmp/hive-navis/hive_2012-06-29_22-17-49_574_6018646381714861925/_task_tmp.-ext-10001/_tmp.01_0
  to: 
 hdfs://localhost:9000/tmp/hive-navis/hive_2012-06-29_22-17-49_574_6018646381714861925/_tmp.-ext-10001/01_0
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:198)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:100)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:717)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
   ... 8 more
 {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-29 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424692#comment-13424692
 ] 

Navis commented on HIVE-3218:
-

@Namit Jain, updated the test file just now.





[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-29 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13424698#comment-13424698
 ] 

Namit Jain commented on HIVE-3218:
--

+1





[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-27 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423696#comment-13423696
 ] 

Namit Jain commented on HIVE-3218:
--

minor comments on phabricator





[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-26 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422993#comment-13422993
 ] 

Namit Jain commented on HIVE-3218:
--

A   M   data/files/srcsbucket20.txt (118 lines)
A   M   data/files/srcsbucket21.txt (120 lines)
A   M   data/files/srcsbucket22.txt (124 lines)
A   M   data/files/srcsbucket23.txt (138 lines)

Are these files sorted and bucketed?

If yes, can you change the names of these files to 

srcsortbucket1outof4.txt
srcsortbucket2outof4.txt ..


These files might be used for a lot of tests, so it might be a good idea to be 
clear about the names of these files.
Sorry about being picky on the name of these files.


[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-26 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423002#comment-13423002
 ] 

Namit Jain commented on HIVE-3218:
--

I am missing something:

srcbucket20.txt=[srcbucket20.txt]
srcbucket21.txt=[srcbucket21.txt]
srcbucket22.txt=[srcbucket20.txt]
srcbucket23.txt=[srcbucket21.txt]
ds=2008-04-09/srcbucket20.txt=[srcbucket20.txt]
ds=2008-04-09/srcbucket21.txt=[srcbucket21.txt]
ds=2008-04-09/srcbucket22.txt=[srcbucket20.txt]
ds=2008-04-09/srcbucket23.txt=[srcbucket21.txt]

The mapping is:

small table alias -> big table file name -> list of small table file names

Shouldn't the big table file name and the small table file names be fully 
qualified? In the above example, the big table file name appears both as 
srcbucket20.txt and as ds=2008-04-09/srcbucket20.txt. Why is it sometimes 
qualified by the partition name and sometimes not?

Similarly, shouldn't the small table file names be fully qualified?
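
The keying concern above can be illustrated with a minimal sketch (hypothetical names, not Hive's actual BucketMapJoinContext structures): keying the alias-bucket mapping by bare file name silently drops one partition's entry when two partitions contain bucket files with the same name, while partition-qualified keys keep both.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BucketKeyCollision {
    // Hypothetical stand-ins for the alias-bucket file name mapping; the real
    // structure lives in BucketMapJoinContext, but the keying issue is the same.
    static final Map<String, List<String>> byBareName = new LinkedHashMap<>();
    static final Map<String, List<String>> byQualifiedName = new LinkedHashMap<>();

    static {
        // ds=2008-04-08 partition: bucket file srcbucket20.txt
        byBareName.put("srcbucket20.txt", List.of("srcbucket20.txt"));
        // The ds=2008-04-09 partition has a file with the same bare name, so
        // this put() silently overwrites the previous partition's entry.
        byBareName.put("srcbucket20.txt", List.of("srcbucket22.txt"));

        // Partition-qualified keys keep the two partitions distinct.
        byQualifiedName.put("ds=2008-04-08/srcbucket20.txt", List.of("srcbucket20.txt"));
        byQualifiedName.put("ds=2008-04-09/srcbucket20.txt", List.of("srcbucket22.txt"));
    }

    public static void main(String[] args) {
        System.out.println(byBareName.size());      // 1: one partition was lost
        System.out.println(byQualifiedName.size()); // 2: both partitions survive
    }
}
```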

 Stream table of SMBJoin/BucketMapJoin with two or more partitions is not 
 handled properly
 -

 Key: HIVE-3218
 URL: https://issues.apache.org/jira/browse/HIVE-3218
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Critical
 Attachments: HIVE-3218.1.patch.txt


 {noformat}
 drop table hive_test_smb_bucket1;
 drop table hive_test_smb_bucket2;
 create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 set hive.enforce.bucketing = true;
 set hive.enforce.sorting = true;
 insert overwrite table hive_test_smb_bucket1 partition (ds='2010-10-14') 
 select key, value from src;
 insert overwrite table hive_test_smb_bucket1 partition (ds='2010-10-15') 
 select key, value from src;
 insert overwrite table hive_test_smb_bucket2 partition (ds='2010-10-15') 
 select key, value from src;
 set hive.optimize.bucketmapjoin = true;
 set hive.optimize.bucketmapjoin.sortedmerge = true;
 set hive.input.format = 
 org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 SELECT /* + MAPJOIN(b) */ * FROM hive_test_smb_bucket1 a JOIN 
 hive_test_smb_bucket2 b ON a.key = b.key;
 {noformat}
 which produces the following bucket join context:
 {noformat}
 Alias Bucket Output File Name Mapping:
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-14/00_0
  0
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-14/01_0
  1
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-15/00_0
  0
 
 hdfs://localhost:9000/user/hive/warehouse/hive_test_smb_bucket1/ds=2010-10-15/01_0
  1
 {noformat}
 and fails with this exception:
 {noformat}
 java.lang.RuntimeException: Hive Runtime Error while closing operators
   at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:226)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
   at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:416)
   at 
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
   at org.apache.hadoop.mapred.Child.main(Child.java:264)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to rename 
 output from: 
 hdfs://localhost:9000/tmp/hive-navis/hive_2012-06-29_22-17-49_574_6018646381714861925/_task_tmp.-ext-10001/_tmp.01_0
  to: 
 hdfs://localhost:9000/tmp/hive-navis/hive_2012-06-29_22-17-49_574_6018646381714861925/_tmp.-ext-10001/01_0
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.commit(FileSinkOperator.java:198)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.access$300(FileSinkOperator.java:100)
   at 
 org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:717)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:566)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193)
   ... 8 more
  {noformat}

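The rename failure in the quoted stack trace can be reproduced in miniature with plain java.nio (a hedged sketch, not Hive's FileSinkOperator code): two tasks, standing in for the two partitions of the stream table, each commit a bucket file under the same final name, 01_0, and the second rename fails because the target already exists.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RenameCollision {
    // Two "tasks" (one per partition of the stream table) each write a temp
    // file and then try to commit it to the same final bucket name, 01_0,
    // analogous to FSPaths.commit() renaming _tmp.01_0 to 01_0.
    static boolean secondCommitFails() throws IOException {
        Path dir = Files.createTempDirectory("commit-demo");
        boolean failed = false;
        for (int task = 0; task < 2; task++) {
            Path tmp = Files.writeString(dir.resolve("_tmp.01_0." + task), "data" + task);
            try {
                Files.move(tmp, dir.resolve("01_0")); // no REPLACE_EXISTING option
            } catch (FileAlreadyExistsException e) {
                failed = true; // the second commit hits the rename failure
            }
        }
        return failed;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(secondCommitFails()); // prints "true"
    }
}
```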
[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-26 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423260#comment-13423260
 ] 

Namit Jain commented on HIVE-3218:
--

Comments on Phabricator.

[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-26 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13423676#comment-13423676
 ] 

Navis commented on HIVE-3218:
-

Addressed the review comments.

[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-17 Thread Namit Jain (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13416053#comment-13416053
 ] 

Namit Jain commented on HIVE-3218:
--

I was thinking about approaches 1, 2, and 3.

Approach 2 seems better, since approach 3 would mean a single mapper processing multiple files.

[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-17 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13416071#comment-13416071
 ] 

Navis commented on HIVE-3218:
-

For queries handling many partitions with many buckets, it might be necessary 
to use option 3 in parallel. I'm considering that for another issue.

[jira] [Commented] (HIVE-3218) Stream table of SMBJoin/BucketMapJoin with two or more partitions is not handled properly

2012-07-02 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13404919#comment-13404919
 ] 

Navis commented on HIVE-3218:
-

For BucketMapJoin it silently returns an invalid result, which is worse than 
the SMBJoin case (the bucketedmapjoin5.q test is broken).
