[jira] [Commented] (HIVE-3168) LazyBinaryObjectInspector.getPrimitiveJavaObject copies beyond length of underlying BytesWritable

2012-06-27 Thread Thejas M Nair (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401991#comment-13401991
 ] 

Thejas M Nair commented on HIVE-3168:
-

Updated patch on Phabricator. It fixes JavaBinaryObjectInspector and 
WritableBinaryObjectInspector as well, and also tries to avoid copying the 
byte[] in BytesWritable.

 LazyBinaryObjectInspector.getPrimitiveJavaObject copies beyond length of 
 underlying BytesWritable
 -

 Key: HIVE-3168
 URL: https://issues.apache.org/jira/browse/HIVE-3168
 Project: Hive
  Issue Type: Bug
  Components: Serializers/Deserializers
Affects Versions: 0.9.0
Reporter: Thejas M Nair
Assignee: Thejas M Nair
 Fix For: 0.10.0, 0.9.1

 Attachments: HIVE-3168.1.patch


 LazyBinaryObjectInspector.getPrimitiveJavaObject copies the full capacity of 
 the LazyBinary's underlying BytesWritable object, which can be greater than 
 the size of the actual contents. 
 This leads to additional characters at the end of the returned ByteArrayRef. 
 When the LazyBinary object gets reused, the tail can contain remnants of the 
 previous entry. 
 This was not seen while reading through Hive queries, which I think is 
 because a copy elsewhere seems to create the LazyBinary with length == 
 capacity (probably the LazyBinary copy constructor). It was seen when MR or 
 Pig used HCatalog to read the data.
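The capacity-vs-length distinction above can be sketched with a minimal stand-in for BytesWritable (the stand-in below is illustrative, not Hive's or Hadoop's actual code; only the grow-but-never-shrink buffer behavior is modeled):

```java
import java.util.Arrays;

// Minimal stand-in for Hadoop's BytesWritable: the backing array
// ("capacity") can be longer than the logical content ("length"),
// because set() only grows the buffer, so reuse leaves stale bytes behind.
class FakeBytesWritable {
    private byte[] bytes = new byte[0];
    private int length = 0;

    void set(byte[] src, int len) {
        if (bytes.length < len) {
            bytes = new byte[len];          // grow, never shrink
        }
        System.arraycopy(src, 0, bytes, 0, len);
        length = len;
    }

    byte[] getBytes() { return bytes; }     // full capacity, incl. remnants
    int getLength()   { return length; }    // logical content size
}

public class CopyBeyondLength {
    // Buggy: copies the whole backing array, capacity included.
    static byte[] copyAll(FakeBytesWritable bw) {
        return bw.getBytes().clone();
    }

    // Fixed: copy only getLength() bytes, which is the kind of change
    // the patch makes in the binary object inspectors.
    static byte[] copyExact(FakeBytesWritable bw) {
        return Arrays.copyOf(bw.getBytes(), bw.getLength());
    }

    public static void main(String[] args) {
        FakeBytesWritable bw = new FakeBytesWritable();
        bw.set("long-first-value".getBytes(), 16);
        bw.set("hi".getBytes(), 2);   // object reused with a shorter value

        System.out.println(copyAll(bw).length);   // 16: trailing remnants of the previous entry
        System.out.println(copyExact(bw).length); // 2: only the actual contents
    }
}
```

The buggy variant returns 16 bytes even though only 2 are valid, which is exactly the "remnants of the previous entry" symptom described above.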

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3127) Don't pass -hconf values as command line arguments to child JVM to avoid command line exceeding char limit on Windows

2012-06-27 Thread Kanna Karanam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401995#comment-13401995
 ] 

Kanna Karanam commented on HIVE-3127:
-

Thanks Edward.

 Don't pass -hconf values as command line arguments to child JVM to avoid 
 command line exceeding char limit on Windows
 

 Key: HIVE-3127
 URL: https://issues.apache.org/jira/browse/HIVE-3127
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Windows
Affects Versions: 0.9.0, 0.10.0, 0.9.1
Reporter: Kanna Karanam
Assignee: Kanna Karanam
  Labels: Windows
 Fix For: 0.10.0

 Attachments: HIVE-3127.1.patch.txt, HIVE-3127.2.patch.txt, 
 HIVE-3127.3.patch.txt


 The maximum length of a DOS command string is 8191 characters in recent 
 Windows versions (http://support.microsoft.com/kb/830473). This limit is 
 easily exceeded when Hive appends individual -hconf values to the command 
 string. To work around this, write all changed hconf values to a temp file 
 and pass the temp file's path to the child JVM, which reads the file and 
 initializes the -hconf parameters from it.
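The temp-file handoff can be sketched with java.util.Properties; this is a simplified illustration of the approach, not Hive's actual implementation (class and method names are hypothetical):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Instead of appending each -hconf key=value to the child JVM's command
// line (risking the ~8191-char DOS limit), the parent writes the changed
// values to a temp properties file and passes only that one path; the
// child loads the file on startup.
public class HconfViaTempFile {
    // Parent side: persist the changed conf values.
    static Path writeConf(Properties changed) throws IOException {
        Path tmp = Files.createTempFile("hive-hconf", ".properties");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            changed.store(out, "changed -hconf values");
        }
        return tmp;
    }

    // Child side: load them back and apply.
    static Properties readConf(Path path) throws IOException {
        Properties p = new Properties();
        try (InputStream in = Files.newInputStream(path)) {
            p.load(in);
        }
        return p;
    }

    public static void main(String[] args) throws IOException {
        Properties changed = new Properties();
        changed.setProperty("hive.exec.parallel", "true");
        Path tmp = writeConf(changed);                 // parent process
        Properties loaded = readConf(tmp);             // child process
        System.out.println(loaded.getProperty("hive.exec.parallel")); // true
        Files.delete(tmp);
    }
}
```

However the values are serialized, the command line now carries a single fixed-length path regardless of how many confs changed.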





[jira] [Commented] (HIVE-3202) Add hive command for resetting hive confs

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402242#comment-13402242
 ] 

Edward Capriolo commented on HIVE-3202:
---

+1. Will commit if tests pass.

 Add hive command for resetting hive confs
 -

 Key: HIVE-3202
 URL: https://issues.apache.org/jira/browse/HIVE-3202
 Project: Hive
  Issue Type: Improvement
  Components: Configuration
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Trivial

 For the purpose of optimization we set various configs per query. That is 
 worthwhile, but all of those configs must be reset before the next query.
 A simple reset command would make this less painful.
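One way to sketch such a reset command: snapshot the session defaults once, and restore the snapshot on reset. The class and method names below are illustrative, not Hive's actual SessionState API:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of a resettable configuration: remember the session defaults,
// let queries override freely, and restore the snapshot on reset().
public class ResettableConf {
    private final Map<String, String> conf = new HashMap<>();
    private final Map<String, String> defaults = new HashMap<>();

    ResettableConf(Map<String, String> initial) {
        conf.putAll(initial);
        defaults.putAll(initial);
    }

    void set(String key, String value) { conf.put(key, value); }
    String get(String key)             { return conf.get(key); }

    // reset: drop all per-query overrides, back to session defaults
    void reset() {
        conf.clear();
        conf.putAll(defaults);
    }

    public static void main(String[] args) {
        ResettableConf c =
            new ResettableConf(Map.of("hive.optimize.bucketmapjoin", "false"));
        c.set("hive.optimize.bucketmapjoin", "true"); // tuned for one query
        c.reset();                                    // next query starts clean
        System.out.println(c.get("hive.optimize.bucketmapjoin")); // false
    }
}
```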





[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-27 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402246#comment-13402246
 ] 

Daryn Sharp commented on HIVE-3098:
---

I believe neither disabling the fs cache nor switching to {{FileContext}} will 
help since Rohini stated earlier the desire to take advantage of the cache for 
performance, albeit w/o leaking.

{{FileContext}} isn't going to be a panacea for this issue.  I think it only 
implements hdfs, view, and ftp (not hftp).  It provides no {{close()}} method, 
so there's no way to clean up or shut down clients until JVM shutdown, i.e. 
aborting streams, deleting tmp files, closing the dfs client, etc.  That will 
lead to leaks such as the dfs socket cache leaks, dfs lease renewer threads, 
etc.

Even with the fs cache disabled, leaks such as the aforementioned dfs leaks 
will still occur unless _all_ fs instances are explicitly closed.

I'd suggest either {{closeAllForUGI}}, which preserves the cache boost within 
each request but forgoes it across requests, or oozie-style UGI caching with 
periodic cache purging.
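The periodic-purge idea can be sketched as a cache whose entries carry a last-use timestamp, with a purge pass that evicts idle entries and fires a cleanup hook (the analogue of FileSystem.closeAllForUGI). All names here are hypothetical; this is a sketch of the pattern, not Oozie's or Hive's code:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Expiring cache: purge() evicts entries idle longer than maxIdleMs and
// invokes a cleanup hook for each one. Time is passed in explicitly so
// the behavior is deterministic and testable.
public class ExpiringCache<K, V> {
    static final class Entry<V> {
        final V value;
        volatile long lastUsed;
        Entry(V value, long now) { this.value = value; this.lastUsed = now; }
    }

    interface EvictionHook<K, V> { void onEvict(K key, V value); }

    private final Map<K, Entry<V>> map = new ConcurrentHashMap<>();
    private final long maxIdleMs;
    private final EvictionHook<K, V> hook;

    ExpiringCache(long maxIdleMs, EvictionHook<K, V> hook) {
        this.maxIdleMs = maxIdleMs;
        this.hook = hook;
    }

    V get(K key, long now) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.lastUsed = now;                 // touch on use
        return e.value;
    }

    void put(K key, V value, long now) { map.put(key, new Entry<>(value, now)); }

    // Run periodically (e.g. from a scheduled thread).
    void purge(long now) {
        Iterator<Map.Entry<K, Entry<V>>> it = map.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<K, Entry<V>> me = it.next();
            if (now - me.getValue().lastUsed > maxIdleMs) {
                it.remove();
                hook.onEvict(me.getKey(), me.getValue().value);
            }
        }
    }

    int size() { return map.size(); }

    public static void main(String[] args) {
        ExpiringCache<String, String> cache =
            new ExpiringCache<>(1000, (k, v) -> System.out.println("evicted " + k));
        cache.put("ugi-alice", "fs-handle", 0);
        cache.purge(500);    // still fresh, kept
        cache.purge(2000);   // idle > 1s: evicted, cleanup hook runs
        System.out.println(cache.size()); // 0
    }
}
```

With UGIs as keys, the eviction hook is where a closeAllForUGI-style call would release the cached FileSystem instances.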

 Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
 (Must cache UGIs.)
 -

 Key: HIVE-3098
 URL: https://issues.apache.org/jira/browse/HIVE-3098
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.9.0
 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
 turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3098.patch


 The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
 the Oracle backend).
 The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads 
 in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
 100 instances of FileSystem, whose combined retained memory consumed the 
 entire heap.
 It boiled down to hadoop::UserGroupInformation::equals() being implemented 
 such that the Subject member is compared for equality (==), and not 
 equivalence (.equals()). This causes equivalent UGI instances to compare as 
 unequal, and causes a new FileSystem instance to be created and cached.
 The UGI.equals() is so implemented, incidentally, as a fix for yet another 
 problem (HADOOP-6670); so it is unlikely that that implementation can be 
 modified.
 The solution is to check for UGI equivalence in HCatalog (i.e. in 
 the Hive metastore), using a cache for UGI instances in the shims.
 I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
 test to confirm that the memory-leak has been arrested.
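The failure mode described above (identity-equals on the Subject making equivalent UGIs hash to different cache slots) can be reproduced with simplified stand-ins; Subject and UgiKey below are hypothetical mock-ups, not the real Hadoop classes:

```java
import java.util.HashMap;
import java.util.Map;

// If the cache key compares its Subject field by identity (==), two UGIs
// for the same user built from distinct Subject objects never hit the
// same cache slot, so the cache grows by one FileSystem per request.
public class IdentityEqualsLeak {
    static final class Subject { final String user; Subject(String u) { user = u; } }

    static final class UgiKey {
        final Subject subject;
        UgiKey(Subject s) { subject = s; }
        @Override public boolean equals(Object o) {
            // mirrors the reported behavior: identity, not equivalence
            return o instanceof UgiKey && ((UgiKey) o).subject == subject;
        }
        @Override public int hashCode() { return System.identityHashCode(subject); }
    }

    // Simulate N requests for the same logical user, each constructing a
    // fresh Subject; returns how many cache entries accumulate.
    static int cacheSizeAfter(int requests) {
        Map<UgiKey, String> fsCache = new HashMap<>();
        for (int i = 0; i < requests; i++) {
            UgiKey key = new UgiKey(new Subject("alice"));
            fsCache.computeIfAbsent(key, k -> "new FileSystem instance");
        }
        return fsCache.size();
    }

    public static void main(String[] args) {
        System.out.println(cacheSizeAfter(60)); // 60 entries for one logical user
    }
}
```

An equivalence-based key (comparing the user, not the Subject's identity) would collapse all 60 requests onto one cached instance, which is what caching UGIs in the shims achieves.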





Hive-trunk-h0.21 - Build # 1516 - Still Failing

2012-06-27 Thread Apache Jenkins Server
Changes for Build #1509

Changes for Build #1510

Changes for Build #1511

Changes for Build #1512

Changes for Build #1513

Changes for Build #1514

Changes for Build #1515
[ecapriolo] HIVE-3180 Fix Eclipse classpath template broken in HIVE-3128. Carl 
Steinbach (via egc)

[hashutosh] HIVE-3048 : Collect_set Aggregate does uneccesary check for value. 
(Ed Capriolo via Ashutosh Chauhan)


Changes for Build #1516



No tests ran.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1516)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1516/ to 
view the results.

Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #61

2012-06-27 Thread Apache Jenkins Server
See 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/

--
[...truncated 5479 lines...]
[mkdir] Created dir: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test
[mkdir] Created dir: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test/src
[mkdir] Created dir: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test/classes
[mkdir] Created dir: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test/resources
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/builtins/src/test/resources
 does not exist.

init:
 [echo] Project: builtins

jar:
 [echo] Project: hive

create-dirs:
 [echo] Project: shims
 [copy] Warning: 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/test/resources
 does not exist.

init:
 [echo] Project: shims

ivy-init-settings:
 [echo] Project: shims

ivy-resolve:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/zookeeper/zookeeper/3.4.3/zookeeper-3.4.3.jar ... (749kB)
[ivy:resolve]   [SUCCESSFUL ] org.apache.zookeeper#zookeeper;3.4.3!zookeeper.jar (98ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.7.0/libthrift-0.7.0.jar ... (294kB)
[ivy:resolve]   [SUCCESSFUL ] org.apache.thrift#libthrift;0.7.0!libthrift.jar (153ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/commons-logging/commons-logging/1.0.4/commons-logging-1.0.4.jar ... (37kB)
[ivy:resolve]   [SUCCESSFUL ] commons-logging#commons-logging;1.0.4!commons-logging.jar (22ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/commons-logging/commons-logging-api/1.0.4/commons-logging-api-1.0.4.jar ... (25kB)
[ivy:resolve]   [SUCCESSFUL ] commons-logging#commons-logging-api;1.0.4!commons-logging-api.jar (20ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/com/google/guava/guava/r09/guava-r09.jar ... (1117kB)
[ivy:resolve]   [SUCCESSFUL ] com.google.guava#guava;r09!guava.jar (260ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/codehaus/jackson/jackson-core-asl/1.8.8/jackson-core-asl-1.8.8.jar ... (222kB)
[ivy:resolve]   [SUCCESSFUL ] org.codehaus.jackson#jackson-core-asl;1.8.8!jackson-core-asl.jar (58ms)
[ivy:resolve] downloading http://repo1.maven.org/maven2/org/codehaus/jackson/jackson-mapper-asl/1.8.8/jackson-mapper-asl-1.8.8.jar ... (652kB)
[ivy:resolve]   [SUCCESSFUL ] org.codehaus.jackson#jackson-mapper-asl;1.8.8!jackson-mapper-asl.jar (169ms)
[ivy:report] Processing 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml
 to 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
 [echo] Project: shims

compile:
 [echo] Project: shims
 [echo] Building shims 0.20

build_shims:
 [echo] Project: shims
 [echo] Compiling 
https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java
 against hadoop 0.20.2 
(https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
 [echo] Project: shims

ivy-resolve-hadoop-shim:
 [echo] Project: shims
[ivy:resolve] :: loading settings :: file = 

[jira] [Created] (HIVE-3203) Drop partition throws NPE if table doesn't exist

2012-06-27 Thread Kevin Wilfong (JIRA)
Kevin Wilfong created HIVE-3203:
---

 Summary: Drop partition throws NPE if table doesn't exist
 Key: HIVE-3203
 URL: https://issues.apache.org/jira/browse/HIVE-3203
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0
 Attachments: HIVE-3203.1.patch.txt

ALTER TABLE t1 DROP PARTITION (part = '1');

This throws an NPE if t1 doesn't exist.  A SemanticException would be cleaner.
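The cleaner behavior the report asks for amounts to a null check before dereferencing the table. The sketch below uses simplified stand-ins for the Hive query-processor types (Table, getTable, SemanticException here are mock-ups, not the real classes):

```java
import java.util.HashMap;
import java.util.Map;

// Look the table up first and raise a semantic error if it is missing,
// instead of proceeding and dereferencing null.
public class DropPartitionCheck {
    static class SemanticException extends Exception {
        SemanticException(String msg) { super(msg); }
    }
    static class Table { final String name; Table(String n) { name = n; } }

    static final Map<String, Table> METASTORE = new HashMap<>();

    static Table getTable(String name) { return METASTORE.get(name); } // null if absent

    static void analyzeDropPartition(String tableName) throws SemanticException {
        Table t = getTable(tableName);
        if (t == null) {
            // before the fix, code proceeded to use t and threw an NPE
            throw new SemanticException("Table not found: " + tableName);
        }
        // ... proceed to validate the partition spec against t ...
    }

    public static void main(String[] args) {
        try {
            analyzeDropPartition("t1"); // t1 does not exist
        } catch (SemanticException e) {
            System.out.println(e.getMessage()); // Table not found: t1
        }
    }
}
```

The user now gets a diagnosable error naming the missing table rather than a bare NullPointerException stack trace.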





[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist

2012-06-27 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3203:


Attachment: HIVE-3203.1.patch.txt

 Drop partition throws NPE if table doesn't exist
 

 Key: HIVE-3203
 URL: https://issues.apache.org/jira/browse/HIVE-3203
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3203.1.patch.txt


 ALTER TABLE t1 DROP PARTITION (part = '1');
 This throws an NPE if t1 doesn't exist.  A SemanticException would be cleaner.





[jira] [Commented] (HIVE-3203) Drop partition throws NPE if table doesn't exist

2012-06-27 Thread Kevin Wilfong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402364#comment-13402364
 ] 

Kevin Wilfong commented on HIVE-3203:
-

Submitted a diff here https://reviews.facebook.net/D3843

 Drop partition throws NPE if table doesn't exist
 

 Key: HIVE-3203
 URL: https://issues.apache.org/jira/browse/HIVE-3203
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt


 ALTER TABLE t1 DROP PARTITION (part = '1');
 This throws an NPE if t1 doesn't exist.  A SemanticException would be cleaner.





[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist

2012-06-27 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3203:


Attachment: HIVE-3203.2.patch.txt

 Drop partition throws NPE if table doesn't exist
 

 Key: HIVE-3203
 URL: https://issues.apache.org/jira/browse/HIVE-3203
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt


 ALTER TABLE t1 DROP PARTITION (part = '1');
 This throws an NPE if t1 doesn't exist.  A SemanticException would be cleaner.





[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist

2012-06-27 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3203:


Status: Patch Available  (was: Open)

 Drop partition throws NPE if table doesn't exist
 

 Key: HIVE-3203
 URL: https://issues.apache.org/jira/browse/HIVE-3203
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt


 ALTER TABLE t1 DROP PARTITION (part = '1');
 This throws an NPE if t1 doesn't exist.  A SemanticException would be cleaner.





[jira] [Commented] (HIVE-3126) Generate & build the velocity based Hive tests on windows by fixing the path issues

2012-06-27 Thread Kanna Karanam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402370#comment-13402370
 ] 

Kanna Karanam commented on HIVE-3126:
-

Updated the patch after addressing most of the review comments:
1) Replaced most of the Windows-specific checks in build.xml files with a 
generic OS-check approach.
2) Removed some of the if(Windows) checks in the production code.


 Generate & build the velocity based Hive tests on windows by fixing the path 
 issues
 ---

 Key: HIVE-3126
 URL: https://issues.apache.org/jira/browse/HIVE-3126
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.9.0, 0.10.0, 0.9.1
Reporter: Kanna Karanam
Assignee: Kanna Karanam
  Labels: Windows, test
 Fix For: 0.10.0

 Attachments: HIVE-3126.1.patch.txt, HIVE-3126.2.patch.txt, 
 HIVE-3126.3.patch.txt


 1) Escape the backslash in the canonical path if the unit test runs on 
  Windows.
 2) Diff comparison:
   a. Ignore the extra spacing on Windows.
   b. Ignore the different line endings on Windows & Unix.
   c. Convert the file paths to Windows-specific form (handle spaces, etc.).
 3) Set the right file scheme & classpath separators while invoking the junit 
  task from 
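The diff normalization in item 2 can be sketched as a pre-pass over both files before comparison. This is an illustrative sketch of the technique, not the patch's actual code:

```java
// Normalize query-test output before diffing: unify Windows/Unix line
// endings and collapse runs of spaces/tabs, so golden-file comparison
// passes on both platforms.
public class DiffNormalize {
    static String normalize(String s) {
        return s.replace("\r\n", "\n")        // (b) line endings
                .replaceAll("[ \t]+", " ")    // (a) extra spacing
                .replaceAll(" +\n", "\n");    // drop trailing spaces at EOL
    }

    static boolean sameOutput(String expected, String actual) {
        return normalize(expected).equals(normalize(actual));
    }

    public static void main(String[] args) {
        String unix = "a\tb\nrow 1\n";
        String windows = "a  b\r\nrow  1\r\n";
        System.out.println(sameOutput(unix, windows)); // true
    }
}
```

Path conversion (item 2c) would be a further step on the same normalized text, rewriting separators and quoting paths containing spaces.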





[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-27 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402394#comment-13402394
 ] 

Alejandro Abdelnur commented on HIVE-3098:
--

Or FIX Hadoop FileSystem caching; this issue is gremlin-ing all over, as we 
have long-running, multi-user systems that use Hadoop FileSystem.

 Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
 (Must cache UGIs.)
 -

 Key: HIVE-3098
 URL: https://issues.apache.org/jira/browse/HIVE-3098
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.9.0
 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
 turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3098.patch







[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist

2012-06-27 Thread Kevin Wilfong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Wilfong updated HIVE-3203:


Attachment: HIVE-3203.3.patch.txt

 Drop partition throws NPE if table doesn't exist
 

 Key: HIVE-3203
 URL: https://issues.apache.org/jira/browse/HIVE-3203
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
Priority: Minor
 Fix For: 0.10.0

 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt, 
 HIVE-3203.3.patch.txt


 ALTER TABLE t1 DROP PARTITION (part = '1');
 This throws an NPE if t1 doesn't exist.  A SemanticException would be cleaner.





[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-27 Thread Ashutosh Chauhan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402448#comment-13402448
 ] 

Ashutosh Chauhan commented on HIVE-3098:


I think we are mixing two things here: performance vs. memory leak.

The patch was originally intended to plug the memory leak, not to improve 
performance. It fixes the original leak, but will introduce a new one. It 
fixes the case where a limited number of users contact the metastore (which 
was Mithun's original test, where the same 60 clients keep hitting the 
metastore). But if different users hit the metastore, the leak is still 
there, and is in fact more acute with the patch, since there are now two 
caches (ugi and fs) which both grow, instead of just one (fs).

The problem stems from the fact that there is no expiration policy in either 
the fs or the ugi cache. We need to design a UGI cache eviction policy. Then, 
when we expire stale UGIs from the ugi-cache, we can call {{closeAllForUGI}} 
for each evicted UGI to evict its cached FS objects from the fs-cache.

 Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
 (Must cache UGIs.)
 -

 Key: HIVE-3098
 URL: https://issues.apache.org/jira/browse/HIVE-3098
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.9.0
 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
 turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3098.patch







[jira] [Created] (HIVE-3204) Windows: Fix the unit tests which contain “!cmd” commands (Unix shell commands)

2012-06-27 Thread Kanna Karanam (JIRA)
Kanna Karanam created HIVE-3204:
---

 Summary: Windows: Fix the unit tests which contain “!cmd” 
commands (Unix shell commands)
 Key: HIVE-3204
 URL: https://issues.apache.org/jira/browse/HIVE-3204
 Project: Hive
  Issue Type: Sub-task
  Components: Tests, Windows
Affects Versions: 0.10.0
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 0.10.0


Possible solution 1 (preferred):
!Unix cmd | Windows cmd = keep the same syntax. Hive uses the Java runtime to 
launch the shell command, so any attempt to run Windows commands on Unix will 
fail, and vice versa.

To deal with unit tests, the Unix commands in each .q file will be modified as 
shown below, and I will filter out the !commands which can't be run on the 
current platform.

Original entry in the .q file:
!rm -rf ../build/ql/test/data/exports/exim_department;

It will be replaced with the following entries:
UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;
WINDOWS::!del ../build/ql/test/data/exports/exim_department

Possible solution 2:
Provide a shell UDF library (Java-based code) to support platform-independent 
shell functionality.
Cons:
1) Difficult to provide full shell functionality
2) Takes a long time
3) Difficult to manage
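The filtering in solution 1 can be sketched as a pass over the .q file's lines: keep prefixed lines only on the matching OS (stripping the prefix), and keep unprefixed lines everywhere. The prefix tokens follow the proposal above; the parsing itself is illustrative:

```java
import java.util.List;
import java.util.stream.Collectors;

// Keep "UNIX::"/"WINDOWS::" lines only on the matching platform, with the
// prefix stripped; lines without a prefix run on both.
public class PlatformFilter {
    static List<String> filter(List<String> qLines, boolean onWindows) {
        String keep = onWindows ? "WINDOWS::" : "UNIX::";
        String drop = onWindows ? "UNIX::" : "WINDOWS::";
        return qLines.stream()
                .filter(l -> !l.startsWith(drop))
                .map(l -> l.startsWith(keep) ? l.substring(keep.length()) : l)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> q = List.of(
            "UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;",
            "WINDOWS::!del ../build/ql/test/data/exports/exim_department",
            "SELECT 1;");
        // On Unix: the !rm line (prefix stripped) and the SQL line survive.
        System.out.println(filter(q, false));
    }
}
```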






[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-27 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402525#comment-13402525
 ] 

Daryn Sharp commented on HIVE-3098:
---

bq. Or FIX Hadoop FileSystem caching

I wanted to fix the cache for the NM, but after digging around, I think the 
cache is working as designed.  UGIs are mutable, so if two different requests 
share the same cached UGI, then tokens from one request will be shared with 
another request.  This contamination may lead to security issues and will cause 
bugs.

 Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
 (Must cache UGIs.)
 -

 Key: HIVE-3098
 URL: https://issues.apache.org/jira/browse/HIVE-3098
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.9.0
 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
 turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3098.patch







[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-27 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402530#comment-13402530
 ] 

Alejandro Abdelnur commented on HIVE-3098:
--

But we have a bug: it affects clients creating UGIs on the fly for the same 
user, and if caching is not off it will choke the NN with open sockets. And 
the more clients doing that, the more likely the NN is to choke. Could we 
make UGIs immutable (which they should have been in the first place)?

 Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
 (Must cache UGIs.)
 -

 Key: HIVE-3098
 URL: https://issues.apache.org/jira/browse/HIVE-3098
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.9.0
 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
 turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3098.patch


 The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
 the Oracle backend).
 The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
 in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
 100 instances of FileSystem, whose combined retained-mem consumed the 
 entire heap.
 It boiled down to hadoop::UserGroupInformation::equals() being implemented 
 such that the Subject member is compared for equality (==), and not 
 equivalence (.equals()). This causes equivalent UGI instances to compare as 
 unequal, and causes a new FileSystem instance to be created and cached.
 The UGI.equals() is so implemented, incidentally, as a fix for yet another 
 problem (HADOOP-6670); so it is unlikely that that implementation can be 
 modified.
 The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
 the Hive metastore), using a cache for UGI instances in the shims.
 I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
 test to confirm that the memory-leak has been arrested.
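The shim-level UGI cache described above can be sketched as follows. This is a hedged, minimal illustration using stand-in classes (the names Ugi and UgiCache are hypothetical, not Hadoop's real UserGroupInformation API): equivalent requests for the same user map to one shared instance, so FileSystem.CACHE sees a single key per user instead of one per freshly created UGI.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class UgiCache {
    // Stand-in for org.apache.hadoop.security.UserGroupInformation,
    // whose equals() compares the Subject by identity (==).
    static class Ugi {
        final String userName;
        Ugi(String userName) { this.userName = userName; }
    }

    // Key the cache on a value that IS equal for equivalent users
    // (here: the user name), sidestepping UGI.equals() entirely.
    private final Map<String, Ugi> cache = new ConcurrentHashMap<>();

    // Return one shared UGI per user name, so downstream caches keyed
    // on the UGI collapse to a single entry per user.
    public Ugi getOrCreate(String userName) {
        return cache.computeIfAbsent(userName, Ugi::new);
    }
}
```

With this, two requests for "alice" yield the same instance, while "bob" gets his own.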





[jira] [Updated] (HIVE-3171) Bucketed sort merge join doesn't work on tables with more than one partition

2012-06-27 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-3171:
-

Component/s: Query Processor
 Labels: bucketing joins partitioning  (was: )

 Bucketed sort merge join doesn't work on tables with more than one partition
 

 Key: HIVE-3171
 URL: https://issues.apache.org/jira/browse/HIVE-3171
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Reporter: Joey Echeverria
  Labels: bucketing, joins, partitioning

 Executing a query with the MAPJOIN hint and the bucketed sort merge join 
 optimizations enabled:
 {noformat}
 set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 set hive.optimize.bucketmapjoin = true;
 set hive.optimize.bucketmapjoin.sortedmerge = true;
 {noformat}
 works fine with partitioned tables if there is only one partition in the 
 table. However, if you add a second partition, Hive attempts to do a regular 
 map-side join which can fail because the tables are too large. Hive ought to 
 be able to still do the bucketed sort merge join with partitions.





[jira] [Updated] (HIVE-2535) Use sorted nature of compact indexes

2012-06-27 Thread Carl Steinbach (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carl Steinbach updated HIVE-2535:
-

Component/s: Query Processor
 Indexing
 Labels: indexing performance  (was: )

 Use sorted nature of compact indexes
 

 Key: HIVE-2535
 URL: https://issues.apache.org/jira/browse/HIVE-2535
 Project: Hive
  Issue Type: Improvement
  Components: Indexing, Query Processor
Reporter: Kevin Wilfong
Assignee: Kevin Wilfong
  Labels: indexing, performance
 Fix For: 0.8.0

 Attachments: HIVE-2535.1.patch.txt, HIVE-2535.2.patch.txt, 
 HIVE-2535.3.patch.txt, HIVE-2535.4.patch.txt


 Compact indexes are sorted based on the indexed columns, but we are not using 
 this fact when we access the index.
 To start with, if the index is stored as an RC file, and if the predicate 
 being used to access the index consists of only one non-partition condition 
 using one of the operators <, <=, >, >=, = we could use a binary search (if 
 necessary) to find the block to begin scanning for unfiltered rows, and we 
 could use the result of comparing the value in the column with the constant 
 (this is necessarily the form of a predicate which is optimized using an 
 index) to determine when we have found all the rows which will be unfiltered.
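The block-level binary search sketched above can be illustrated as follows; the method name firstCandidateBlock and the per-block last-key list are hypothetical stand-ins for whatever block metadata the real RCFile index exposes. For a predicate like col >= constant over a sorted index, it finds the first block where scanning for unfiltered rows must begin.

```java
import java.util.List;

public class IndexSearch {
    // Given the last (i.e. maximum) indexed-column value of each block,
    // in sorted order, return the index of the first block whose last
    // key is >= the predicate constant. Blocks before it can only hold
    // values strictly smaller than the constant, so they are skipped.
    public static int firstCandidateBlock(List<Long> lastKeyPerBlock, long constant) {
        int lo = 0, hi = lastKeyPerBlock.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;       // overflow-safe midpoint
            if (lastKeyPerBlock.get(mid) < constant) {
                lo = mid + 1;                // whole block is below the constant
            } else {
                hi = mid;                    // this block may contain matches
            }
        }
        return lo;                           // == size() when no block can match
    }
}
```

Because the index is sorted, the comparison result at each block also tells the scanner when it has passed the last possible match for <, <= predicates.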





[jira] [Updated] (HIVE-3127) Pass hconf values as XML instead of command line arguments to child JVM

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-3127:
--

Summary: Pass hconf values as XML instead of command line arguments to 
child JVM  (was: Don't pass -hconf values as command line arguments to child JVM 
to avoid command line exceeding char limit on Windows)

 Pass hconf values as XML instead of command line arguments to child JVM
 ---

 Key: HIVE-3127
 URL: https://issues.apache.org/jira/browse/HIVE-3127
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Windows
Affects Versions: 0.9.0, 0.10.0, 0.9.1
Reporter: Kanna Karanam
Assignee: Kanna Karanam
  Labels: Windows
 Fix For: 0.10.0

 Attachments: HIVE-3127.1.patch.txt, HIVE-3127.2.patch.txt, 
 HIVE-3127.3.patch.txt


 The maximum length of the DOS command string is 8191 characters in recent 
 Windows versions (http://support.microsoft.com/kb/830473). This limit is 
 easily exceeded when individual -hconf values are appended to the command 
 string. To work around this, write all changed hconf values to a temp file 
 and pass the temp file path to the child JVM, which reads and initializes 
 the -hconf parameters from that file.
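A rough sketch of that workaround, assuming the changed values fit in a java.util.Properties (the class name HconfTempFile is hypothetical, not Hive's actual implementation): the parent serializes the values to one XML temp file, and the child loads them back, so the command line only carries a single short path.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.util.Properties;

public class HconfTempFile {
    // Parent side: write all changed key=value pairs to a temp XML file
    // instead of appending each one as a -hconf argument.
    public static File write(Properties changed) throws Exception {
        File tmp = File.createTempFile("hive-hconf", ".xml");
        try (FileOutputStream out = new FileOutputStream(tmp)) {
            changed.storeToXML(out, "changed hconf values for child JVM");
        }
        return tmp;
    }

    // Child side: read the single file path passed on the (now short)
    // command line and reconstruct the configuration overrides.
    public static Properties read(File tmp) throws Exception {
        Properties p = new Properties();
        try (FileInputStream in = new FileInputStream(tmp)) {
            p.loadFromXML(in);
        }
        return p;
    }
}
```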





[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

2012-06-27 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402716#comment-13402716
 ] 

Daryn Sharp commented on HIVE-3098:
---

Unfortunately, no.  If you make a {{UGI}} immutable, it really means marking 
the {{Subject}} immutable.  Everything token-based breaks because they can't be 
added to the {{Subject}}, you can't login from a keytab or relogin because you 
can't update the TGT in the {{Subject}}, and proxy users (and maybe SPNEGO?) won't 
work because service tickets can't be added to the {{Subject}}.

A {{Subject}}/{{UGI}} is an execution context with specific privileges.  Those 
contexts cannot be cached and shared w/o risking escalated privileges. Think of 
it this way:  If I entrust you with the keys to my home to pickup a delivery, I 
don't want you to make a copy of the keys and have the ability to enter anytime 
you want w/o my explicit permission.

Without knowing the intricacies, I recommend leaving the fs cache on, creating 
a new UGI per connection, and calling {{closeAllForUGI}} when the request is 
complete.

 Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
 (Must cache UGIs.)
 -

 Key: HIVE-3098
 URL: https://issues.apache.org/jira/browse/HIVE-3098
 Project: Hive
  Issue Type: Bug
  Components: Shims
Affects Versions: 0.9.0
 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
 turned on.
Reporter: Mithun Radhakrishnan
Assignee: Mithun Radhakrishnan
 Attachments: HIVE-3098.patch


 The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing 
 the Oracle backend).
 The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60-threads, 
 in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 
 100 instances of FileSystem, whose combined retained-mem consumed the 
 entire heap.
 It boiled down to hadoop::UserGroupInformation::equals() being implemented 
 such that the Subject member is compared for equality (==), and not 
 equivalence (.equals()). This causes equivalent UGI instances to compare as 
 unequal, and causes a new FileSystem instance to be created and cached.
 The UGI.equals() is so implemented, incidentally, as a fix for yet another 
 problem (HADOOP-6670); so it is unlikely that that implementation can be 
 modified.
 The solution for this is to check for UGI equivalence in HCatalog (i.e. in 
 the Hive metastore), using a cache for UGI instances in the shims.
 I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
 test to confirm that the memory-leak has been arrested.





[jira] [Updated] (HIVE-3127) Pass hconf values as XML instead of command line arguments to child JVM

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-3127:
--

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed. Thank you. 

 Pass hconf values as XML instead of command line arguments to child JVM
 ---

 Key: HIVE-3127
 URL: https://issues.apache.org/jira/browse/HIVE-3127
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Windows
Affects Versions: 0.9.0, 0.10.0, 0.9.1
Reporter: Kanna Karanam
Assignee: Kanna Karanam
  Labels: Windows
 Fix For: 0.10.0

 Attachments: HIVE-3127.1.patch.txt, HIVE-3127.2.patch.txt, 
 HIVE-3127.3.patch.txt


 The maximum length of the DOS command string is 8191 characters in recent 
 Windows versions (http://support.microsoft.com/kb/830473). This limit is 
 easily exceeded when individual -hconf values are appended to the command 
 string. To work around this, write all changed hconf values to a temp file 
 and pass the temp file path to the child JVM, which reads and initializes 
 the -hconf parameters from that file.





[jira] [Created] (HIVE-3205) Bucketed mapjoin on partitioned table which has no partition throws NPE

2012-06-27 Thread Navis (JIRA)
Navis created HIVE-3205:
---

 Summary: Bucketed mapjoin on partitioned table which has no 
partition throws NPE
 Key: HIVE-3205
 URL: https://issues.apache.org/jira/browse/HIVE-3205
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
 Environment: ubuntu 10.04
Reporter: Navis
Assignee: Navis
Priority: Minor


{code}
create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
string) clustered by (key) sorted by (key) into 2 buckets;
create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
string) clustered by (key) sorted by (key) into 2 buckets;

set hive.optimize.bucketmapjoin = true;
set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;

explain
SELECT /* + MAPJOIN(b) */ b.key as k1, b.value, b.ds, a.key as k2
FROM hive_test_smb_bucket1 a JOIN
hive_test_smb_bucket2 b
ON a.key = b.key WHERE a.ds = '2010-10-15' and b.ds='2010-10-15' and  b.key IS 
NOT NULL;
{code}

throws NPE
{noformat}
2012-06-28 08:59:13,459 ERROR ql.Driver (SessionState.java:printError(400)) - 
FAILED: NullPointerException null
java.lang.NullPointerException
at 
org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer$BucketMapjoinOptProc.process(BucketMapJoinOptimizer.java:269)
at 
org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
at 
org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
at 
org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer.transform(BucketMapJoinOptimizer.java:100)
at 
org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7564)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
{noformat}





[jira] [Updated] (HIVE-3205) Bucketed mapjoin on partitioned table which has no partition throws NPE

2012-06-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3205:


Status: Patch Available  (was: Open)

https://reviews.facebook.net/D3849

 Bucketed mapjoin on partitioned table which has no partition throws NPE
 ---

 Key: HIVE-3205
 URL: https://issues.apache.org/jira/browse/HIVE-3205
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
 Environment: ubuntu 10.04
Reporter: Navis
Assignee: Navis
Priority: Minor

 {code}
 create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds 
 string) clustered by (key) sorted by (key) into 2 buckets;
 set hive.optimize.bucketmapjoin = true;
 set hive.input.format = 
 org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
 explain
 SELECT /* + MAPJOIN(b) */ b.key as k1, b.value, b.ds, a.key as k2
 FROM hive_test_smb_bucket1 a JOIN
 hive_test_smb_bucket2 b
 ON a.key = b.key WHERE a.ds = '2010-10-15' and b.ds='2010-10-15' and  b.key 
 IS NOT NULL;
 {code}
 throws NPE
 {noformat}
 2012-06-28 08:59:13,459 ERROR ql.Driver (SessionState.java:printError(400)) - 
 FAILED: NullPointerException null
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer$BucketMapjoinOptProc.process(BucketMapJoinOptimizer.java:269)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
   at 
 org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer.transform(BucketMapJoinOptimizer.java:100)
   at 
 org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7564)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
   at 
 org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {noformat}





[jira] [Updated] (HIVE-3172) Remove the duplicate JAR entries from the (“test.classpath”) to avoid command line exceeding char limit on windows

2012-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3172:
---

Status: Patch Available  (was: Open)

 Remove the duplicate JAR entries from the (“test.classpath”) to avoid command 
 line exceeding char limit on windows 
 ---

 Key: HIVE-3172
 URL: https://issues.apache.org/jira/browse/HIVE-3172
 Project: Hive
  Issue Type: Sub-task
  Components: Tests, Windows
Affects Versions: 0.10.0
 Environment: Windows
Reporter: Kanna Karanam
Assignee: Kanna Karanam
  Labels: Windows
 Fix For: 0.10.0

 Attachments: HIVE-3172.1.patch.txt, HIVE-3172.2.patch.txt


 The maximum length of the DOS command string is 8191 characters (in Windows 
 latest versions: http://support.microsoft.com/kb/830473). The following entries 
 in "build-common.xml" add a lot of duplicate JAR entries to "test.classpath", 
 so it exceeds the max character limit on Windows very easily:
 <!-- Include build/dist/lib on the classpath before Ivy and exclude hive jars 
 from Ivy to make sure we get the local changes when we test Hive -->
 <fileset dir="${build.dir.hive}/dist/lib" includes="*.jar" 
 erroronmissingdir="false" 
 excludes="**/hive_contrib*.jar,**/hive-contrib*.jar,**/lib*.jar"/>
 <fileset dir="${hive.root}/build/ivy/lib/test" includes="*.jar" 
 erroronmissingdir="false" excludes="**/hive_*.jar,**/hive-*.jar"/>
 <fileset dir="${hive.root}/build/ivy/lib/default" includes="*.jar" 
 erroronmissingdir="false" excludes="**/hive_*.jar,**/hive-*.jar"/>
 <fileset dir="${hive.root}/testlibs" includes="*.jar"/>
 Proposed solution (workaround):
 1) Include all JARs from dist\lib excluding 
 **/hive_contrib*.jar,**/hive-contrib*.jar,**/lib*.jar
 2) Select the specific jars (missing jars) from test/other folders (that 
 includes the Hadoop-*.jar files)
 Thanks





[jira] [Updated] (HIVE-3126) Generate & build the velocity based Hive tests on windows by fixing the path issues

2012-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3126:
---

Status: Patch Available  (was: Open)

 Generate & build the velocity-based Hive tests on Windows by fixing the path 
 issues
 ---

 Key: HIVE-3126
 URL: https://issues.apache.org/jira/browse/HIVE-3126
 Project: Hive
  Issue Type: Bug
  Components: Tests
Affects Versions: 0.9.0, 0.10.0, 0.9.1
Reporter: Kanna Karanam
Assignee: Kanna Karanam
  Labels: Windows, test
 Fix For: 0.10.0

 Attachments: HIVE-3126.1.patch.txt, HIVE-3126.2.patch.txt, 
 HIVE-3126.3.patch.txt


 1) Escape the backward slash in the canonical path if a unit test runs on 
 Windows.
 2) Diff comparison:
  a. Ignore the extra spacing on Windows
  b. Ignore the different line endings on Windows & Unix
  c. Convert the file paths to Windows-specific ones (handle spaces, etc.)
 3) Set the right file scheme & class path separators while invoking the junit 
 task from 
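Items 1) and 2) above amount to small string transformations; here is a hedged sketch (the class and method names TestPathUtil, escapePath, and normalizeForDiff are hypothetical, not the patch's actual helpers). Backslashes are doubled so a Windows canonical path survives being embedded in generated Java source, and outputs are normalized before diffing so Windows and Unix runs compare equal.

```java
public class TestPathUtil {
    // 1) Escape each backslash so "C:\hive\build" can be embedded in a
    // generated string literal without being read as escape sequences.
    public static String escapePath(String canonicalPath) {
        return canonicalPath.replace("\\", "\\\\");
    }

    // 2a/2b) Normalize CRLF line endings to LF and drop trailing spaces
    // before each newline, so Windows and Unix test outputs diff clean.
    public static String normalizeForDiff(String text) {
        return text.replace("\r\n", "\n").replaceAll("[ \t]+\n", "\n");
    }
}
```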





[jira] [Updated] (HIVE-1399) Nested UDAFs cause Hive Internal Error (NullPointerException)

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-1399:
--

Status: Patch Available  (was: Open)

 Nested UDAFs cause Hive Internal Error (NullPointerException)
 -

 Key: HIVE-1399
 URL: https://issues.apache.org/jira/browse/HIVE-1399
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: Mayank Lahiri
Assignee: Adam Fokken
 Attachments: HIVE-1399.1.patch.txt


 This query does not make real-world sense, and I'm guessing it's not even 
 supported by HQL/SQL, but I'm pretty sure that it shouldn't be causing an 
 internal error with a NullPointerException. The "normal" table just has one 
 column called "val". I'm running on trunk, svn updated 5 minutes ago, ant clean 
 package.
 SELECT percentile(val, percentile(val, 0.5)) FROM normal;
 FAILED: Hive Internal Error: java.lang.NullPointerException(null)
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:153)
   at 
 org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:587)
   at 
 org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:708)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:6241)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2301)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2860)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5002)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5524)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6055)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 I've also recreated this error with a GenericUDAF I'm writing, and also with 
 the following:
 SELECT percentile(val, percentile()) FROM normal;   
 SELECT avg(variance(dob_year)) FROM somedata; // this makes no sense, but 
 still a NullPointerException





[jira] [Updated] (HIVE-1399) Nested UDAFs cause Hive Internal Error (NullPointerException)

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-1399:
--

Affects Version/s: (was: 0.6.0)
   0.9.0

 Nested UDAFs cause Hive Internal Error (NullPointerException)
 -

 Key: HIVE-1399
 URL: https://issues.apache.org/jira/browse/HIVE-1399
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Mayank Lahiri
Assignee: Adam Fokken
 Attachments: HIVE-1399.1.patch.txt


 This query does not make real-world sense, and I'm guessing it's not even 
 supported by HQL/SQL, but I'm pretty sure that it shouldn't be causing an 
 internal error with a NullPointerException. The "normal" table just has one 
 column called "val". I'm running on trunk, svn updated 5 minutes ago, ant clean 
 package.
 SELECT percentile(val, percentile(val, 0.5)) FROM normal;
 FAILED: Hive Internal Error: java.lang.NullPointerException(null)
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:153)
   at 
 org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:587)
   at 
 org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:708)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:6241)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2301)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2860)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5002)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5524)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6055)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 I've also recreated this error with a GenericUDAF I'm writing, and also with 
 the following:
 SELECT percentile(val, percentile()) FROM normal;   
 SELECT avg(variance(dob_year)) FROM somedata; // this makes no sense, but 
 still a NullPointerException





[jira] [Commented] (HIVE-1399) Nested UDAFs cause Hive Internal Error (NullPointerException)

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402762#comment-13402762
 ] 

Edward Capriolo commented on HIVE-1399:
---

All. This patch looks good. Let's not let this sit over an error message. 

{quote}
Aggregate functions should not be nested and can not be in a clause other than 
a SELECT or HAVING clause. 
{quote}
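That error message implies a semantic check for nested aggregates; the following is a speculative sketch of such a check (NestedAggCheck and its Expr node are hypothetical stand-ins, not Hive's SemanticAnalyzer or ExprNodeGenericFuncDesc), showing only the nesting detection itself on a toy expression tree.

```java
import java.util.List;
import java.util.Set;

public class NestedAggCheck {
    // Toy set of aggregate function names for illustration.
    static final Set<String> AGGREGATES = Set.of("percentile", "avg", "variance");

    // Minimal expression node: a function/column name plus child expressions.
    static class Expr {
        final String name;
        final List<Expr> children;
        Expr(String name, List<Expr> children) {
            this.name = name;
            this.children = children;
        }
    }

    // Walk the tree; return true when an aggregate call appears inside
    // another aggregate, which is the case that should be rejected with
    // a proper error instead of an NPE.
    public static boolean hasNestedAggregate(Expr e, boolean insideAgg) {
        boolean isAgg = AGGREGATES.contains(e.name);
        if (isAgg && insideAgg) {
            return true;
        }
        for (Expr c : e.children) {
            if (hasNestedAggregate(c, insideAgg || isAgg)) {
                return true;
            }
        }
        return false;
    }
}
```

For a query like SELECT percentile(val, percentile(val, 0.5)), the outer percentile sets insideAgg and the inner one trips the check.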

 Nested UDAFs cause Hive Internal Error (NullPointerException)
 -

 Key: HIVE-1399
 URL: https://issues.apache.org/jira/browse/HIVE-1399
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Mayank Lahiri
Assignee: Adam Fokken
 Attachments: HIVE-1399.1.patch.txt


 This query does not make real-world sense, and I'm guessing it's not even 
 supported by HQL/SQL, but I'm pretty sure that it shouldn't be causing an 
 internal error with a NullPointerException. The "normal" table just has one 
 column called "val". I'm running on trunk, svn updated 5 minutes ago, ant clean 
 package.
 SELECT percentile(val, percentile(val, 0.5)) FROM normal;
 FAILED: Hive Internal Error: java.lang.NullPointerException(null)
 java.lang.NullPointerException
   at 
 org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:153)
   at 
 org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:587)
   at 
 org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:708)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128)
   at 
 org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:6241)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2301)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2860)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5002)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5524)
   at 
 org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6055)
   at 
 org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126)
   at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I've also recreated this error with a GenericUDAF I'm writing, as well as with 
the following:
 SELECT percentile(val, percentile()) FROM normal;   
 SELECT avg(variance(dob_year)) FROM somedata; // this makes no sense, but 
 still a NullPointerException

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402765#comment-13402765
 ] 

Edward Capriolo commented on HIVE-3068:
---

+1 I like this. Add a unit test and we are good to go.

 Add ability to export table metadata as JSON on table drop
 --

 Key: HIVE-3068
 URL: https://issues.apache.org/jira/browse/HIVE-3068
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Serializers/Deserializers
Reporter: Andrew Chalfant
Assignee: Andrew Chalfant
Priority: Minor
  Labels: features, newbie
   Original Estimate: 24h
  Remaining Estimate: 24h

 When a table is dropped, the contents go to the users trash but the metadata 
 is lost. It would be super neat to be able to save the metadata as well so 
 that tables could be trivially re-instantiated via thrift.
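
A minimal sketch of what such an export might look like (the class and method 
names here are hypothetical, and the real metastore table object is much richer 
than a flat property map; this only illustrates the idea of flattening metadata 
to JSON before the drop):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: before a table is dropped, flatten its metadata
// into a JSON string that could be written alongside the trashed data,
// so the table can later be re-created.
public class TableMetadataJson {
    public static String toJson(String tableName, Map<String, String> props) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"table\":\"").append(tableName).append("\"");
        for (Map.Entry<String, String> e : props.entrySet()) {
            // assumes keys/values need no JSON escaping; a real
            // implementation would use a JSON library
            sb.append(",\"").append(e.getKey()).append("\":\"")
              .append(e.getValue()).append("\"");
        }
        return sb.append("}").toString();
    }
}
```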





[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402767#comment-13402767
 ] 

Edward Capriolo commented on HIVE-3068:
---

We have another issue for import and export functions. Maybe you can 
borrow that code to dump the metadata instead of the JSON export you have done.

 Add ability to export table metadata as JSON on table drop
 --

 Key: HIVE-3068
 URL: https://issues.apache.org/jira/browse/HIVE-3068
 Project: Hive
  Issue Type: New Feature
  Components: Metastore, Serializers/Deserializers
Reporter: Andrew Chalfant
Assignee: Andrew Chalfant
Priority: Minor
  Labels: features, newbie
   Original Estimate: 24h
  Remaining Estimate: 24h

 When a table is dropped, the contents go to the users trash but the metadata 
 is lost. It would be super neat to be able to save the metadata as well so 
 that tables could be trivially re-instantiated via thrift.





[jira] [Updated] (HIVE-2544) Nullpointer on registering udfs.

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-2544:
--

Attachment: HIVE-2544.patch.2.txt

 Nullpointer on registering udfs.
 

 Key: HIVE-2544
 URL: https://issues.apache.org/jira/browse/HIVE-2544
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Bennie Schut
Assignee: Bennie Schut
 Attachments: HIVE-2544.1.patch.txt, HIVE-2544.patch.2.txt


 Currently the Function registry can throw NullPointers when multiple threads 
 are trying to register the same function. The normal put() will replace the 
 existing registered function object even if it's exactly the same function.





[jira] [Commented] (HIVE-2544) Nullpointer on registering udfs.

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402776#comment-13402776
 ] 

Edward Capriolo commented on HIVE-2544:
---

I have included a patch that uses Collections.synchronizedMap(); this should 
deal with the atomicity issues without changing the ordering of things.
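
A minimal sketch of the pattern (names are hypothetical; the real registry maps 
function names to FunctionInfo objects, and the map values here are simplified 
to class-name strings):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch of the race HIVE-2544 addresses: wrapping the registry map with
// Collections.synchronizedMap() makes individual put/get calls atomic,
// and an explicit synchronized block keeps the check-then-put sequence
// atomic so a concurrent registration never replaces an existing entry.
public class FunctionRegistrySketch {
    private static final Map<String, String> mFunctions =
        Collections.synchronizedMap(new HashMap<String, String>());

    public static void registerUDF(String name, String className) {
        // lock on the wrapper map so containsKey + put run as one unit
        synchronized (mFunctions) {
            if (!mFunctions.containsKey(name)) {
                mFunctions.put(name, className);
            }
        }
    }

    public static String lookup(String name) {
        return mFunctions.get(name);
    }
}
```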

 Nullpointer on registering udfs.
 

 Key: HIVE-2544
 URL: https://issues.apache.org/jira/browse/HIVE-2544
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.9.0
Reporter: Bennie Schut
Assignee: Bennie Schut
 Attachments: HIVE-2544.1.patch.txt, HIVE-2544.patch.2.txt


 Currently the Function registry can throw NullPointers when multiple threads 
 are trying to register the same function. The normal put() will replace the 
 existing registered function object even if it's exactly the same function.





[jira] [Commented] (HIVE-3204) Windows: Fix the unit tests which contains “!cmd” commands (Unix shell commands)

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402781#comment-13402781
 ] 

Edward Capriolo commented on HIVE-3204:
---

For this case we must be able to use dfs commands to do the rm's.

 Windows: Fix the unit tests which contains “!cmd” commands (Unix shell 
 commands)
 --

 Key: HIVE-3204
 URL: https://issues.apache.org/jira/browse/HIVE-3204
 Project: Hive
  Issue Type: Sub-task
  Components: Tests, Windows
Affects Versions: 0.10.0
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 0.10.0


 Possible solution 1 (preferred):
 !Unix cmd | !Windows cmd = keeping the same syntax. Hive uses the Java 
 runtime to launch the shell command, so any attempt to run Windows commands 
 on Unix will fail and vice versa.
 To deal with unit tests, the Unix commands in each .q file will be modified 
 as shown below. I will filter out the ! commands which can't be run on the 
 current platform.
 Original entry in the .q file:
 !rm -rf ../build/ql/test/data/exports/exim_department;
 It will be replaced with the following entries:
 UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;
 WINDOWS::!del ../build/ql/test/data/exports/exim_department
 Possible solution 2:
 Provide a shell UDF library (Java-based code) to support platform-independent 
 shell functionality.
 Cons: 
 1) Difficult to provide full shell functionality
 2) Takes a long time
 3) Difficult to manage
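
A minimal sketch of how the UNIX::/WINDOWS:: filtering from solution 1 could 
work (class and method names are hypothetical; the actual test driver would 
detect the OS itself rather than take a flag):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of solution 1: each "!" shell command in a .q file
// is tagged with the platform it targets, and the test driver keeps only
// the commands matching the OS it is running on, stripping the tag.
public class PlatformCommandFilter {
    static final String UNIX_PREFIX = "UNIX::";
    static final String WINDOWS_PREFIX = "WINDOWS::";

    public static List<String> filter(List<String> lines, boolean isWindows) {
        String keep = isWindows ? WINDOWS_PREFIX : UNIX_PREFIX;
        String drop = isWindows ? UNIX_PREFIX : WINDOWS_PREFIX;
        List<String> out = new ArrayList<String>();
        for (String line : lines) {
            if (line.startsWith(keep)) {
                out.add(line.substring(keep.length())); // strip the tag
            } else if (!line.startsWith(drop)) {
                out.add(line); // untagged lines run on every platform
            }
        }
        return out;
    }
}
```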





Re: Review Request: HIVE-3100 : Add HiveCLI that runs over JDBC

2012-06-27 Thread Prasad Mujumdar


 On June 26, 2012, 7:46 p.m., Carl Steinbach wrote:
  eclipse-templates/HiveBeeLine.launchtemplate, line 1
  https://reviews.apache.org/r/5554/diff/1/?file=116339#file116339line1
 
  Please add sqlline to the eclipse classpath template in 
  eclipse-files/.classpath

Done


 On June 26, 2012, 7:46 p.m., Carl Steinbach wrote:
  jdbc/ivy.xml, line 36
  https://reviews.apache.org/r/5554/diff/1/?file=116342#file116342line36
 
  This doesn't apply cleanly due to a change that went in a couple days 
  ago. Needs a rebase.

Done


 On June 26, 2012, 7:46 p.m., Carl Steinbach wrote:
  jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java, line 37
  https://reviews.apache.org/r/5554/diff/1/?file=116343#file116343line37
 
  Remove.

Done


 On June 26, 2012, 7:46 p.m., Carl Steinbach wrote:
  jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java, line 77
  https://reviews.apache.org/r/5554/diff/1/?file=116343#file116343line77
 
  Remove?

yes, old code. Removed


 On June 26, 2012, 7:46 p.m., Carl Steinbach wrote:
  jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java, line 113
  https://reviews.apache.org/r/5554/diff/1/?file=116343#file116343line113
 
  Remove.

Done


- Prasad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5554/#review8633
---


On June 25, 2012, 6:52 p.m., Prasad Mujumdar wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/5554/
 ---
 
 (Updated June 25, 2012, 6:52 p.m.)
 
 
 Review request for hive and Carl Steinbach.
 
 
 Description
 ---
 
 New patch for supporting SQLLine as command line editor for Hive.
 LICENSE and NOTICE updates to mention SQLLine
 Build changes to add SQLLine
 java wrapper to translate the CLI arguments to SQLLine
 
 
 This addresses bug HIVE-3100.
 https://issues.apache.org/jira/browse/HIVE-3100
 
 
 Diffs
 -
 
   LICENSE 05085da 
   NOTICE 871fdde 
   bin/ext/beeline.sh PRE-CREATION 
   eclipse-templates/HiveBeeLine.launchtemplate PRE-CREATION 
   ivy/ivysettings.xml fb6f4b8 
   ivy/libraries.properties 8461da1 
   jdbc/ivy.xml d8e3a5f 
   jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java PRE-CREATION 
   jdbc/src/java/org/apache/hive/jdbc/beeline/OptionsProcessor.java 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/5554/diff/
 
 
 Testing
 ---
 
 
 Thanks,
 
 Prasad Mujumdar
 




Re: Review Request: HIVE-3100 : Add HiveCLI that runs over JDBC

2012-06-27 Thread Prasad Mujumdar

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/5554/
---

(Updated June 28, 2012, 2:10 a.m.)


Review request for hive and Carl Steinbach.


Changes
---

Updates per review comments.


Description
---

New patch for supporting SQLLine as command line editor for Hive.
LICENSE and NOTICE updates to mention SQLLine
Build changes to add SQLLine
java wrapper to translate the CLI arguments to SQLLine


This addresses bug HIVE-3100.
https://issues.apache.org/jira/browse/HIVE-3100


Diffs (updated)
-

  LICENSE 05085da 
  NOTICE 871fdde 
  bin/ext/beeline.sh PRE-CREATION 
  eclipse-templates/.classpath 49e9140 
  eclipse-templates/HiveBeeLine.launchtemplate PRE-CREATION 
  ivy/ivysettings.xml fb6f4b8 
  ivy/libraries.properties c3b53bf 
  jdbc/ivy.xml 9269bd1 
  jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java PRE-CREATION 
  jdbc/src/java/org/apache/hive/jdbc/beeline/OptionsProcessor.java PRE-CREATION 

Diff: https://reviews.apache.org/r/5554/diff/


Testing
---


Thanks,

Prasad Mujumdar



[jira] [Assigned] (HIVE-1626) stop using java.util.Stack

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo reassigned HIVE-1626:
-

Assignee: Edward Capriolo  (was: John Sichi)

 stop using java.util.Stack
 --

 Key: HIVE-1626
 URL: https://issues.apache.org/jira/browse/HIVE-1626
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.7.0
Reporter: John Sichi
Assignee: Edward Capriolo

 We currently use Stack as part of the generic node walking library.  Stack 
 should not be used for this since its inheritance from Vector incurs 
 superfluous synchronization overhead.
 Most projects end up adding an ArrayStack implementation and using that 
 instead.
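
A minimal sketch of the replacement (java.util.ArrayDeque offers the same 
push/pop/peek operations without Vector's per-call synchronization; the walker 
code itself is not shown and the demo method is purely illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: using Deque/ArrayDeque where java.util.Stack was used before.
// Each push/pop avoids the synchronized methods Stack inherits from Vector.
public class WalkerStackDemo {
    public static int depthAfter(String[] pushes, int pops) {
        Deque<String> stack = new ArrayDeque<String>();
        for (String node : pushes) {
            stack.push(node);   // same LIFO semantics as Stack.push
        }
        for (int i = 0; i < pops; i++) {
            stack.pop();        // same semantics as Stack.pop
        }
        return stack.size();
    }
}
```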





[jira] [Commented] (HIVE-3148) LOCATION clause is not honored when adding multiple partitions

2012-06-27 Thread Shengsheng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402797#comment-13402797
 ] 

Shengsheng Huang commented on HIVE-3148:


@Carl @Namit @JQ I've attached the new patch with a test case and updated the 
review request at https://reviews.apache.org/r/5476/
Is there anything else I need to do?

 LOCATION clause is not honored when adding multiple partitions
 --

 Key: HIVE-3148
 URL: https://issues.apache.org/jira/browse/HIVE-3148
 Project: Hive
  Issue Type: Bug
  Components: Metastore, Query Processor
Affects Versions: 0.4.0, 0.5.0, 0.6.0, 0.7.0, 0.8.0, 0.9.0
Reporter: Carl Steinbach
  Labels: patch
 Attachments: 3148.for0.9.0.patch, HIVE-3148.1.patch.txt, 
 HIVE-3148.2.patch








[jira] [Created] (HIVE-3206) Bucket mapjoin in trunk is not working

2012-06-27 Thread Navis (JIRA)
Navis created HIVE-3206:
---

 Summary: Bucket mapjoin in trunk is not working 
 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis


Bucket mapjoin throws exception archiving stored hashtables. 
{noformat}
hive> set hive.optimize.bucketmapjoin = true;
hive> select /*+mapjoin(a)*/ a.key, a.value, b.value 
 from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b 
 on a.key=b.key;
Total MapReduce jobs = 1
12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml 
found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
Execution log at: 
/tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log
2012-06-28 12:36:18 Starting to launch local task to process map join;  
maximum memory = 932118528
2012-06-28 12:36:18 Processing rows:153 Hashtable size: 153 
Memory usage:   1771376 rate:   0.002
2012-06-28 12:36:18 Dump the hashtable into file: 
file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
2012-06-28 12:36:18 Upload 1 File to: 
file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
 File size: 9644
2012-06-28 12:36:19 Processing rows:309 Hashtable size: 156 
Memory usage:   1844568 rate:   0.002
2012-06-28 12:36:19 Dump the hashtable into file: 
file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
2012-06-28 12:36:19 Upload 1 File to: 
file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
 File size: 10023
2012-06-28 12:36:19 End of local task; Time Taken: 0.773 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: This archives contains unclosed entries.
at 
org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214)
at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276)
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
at 
org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Job Submission failed with exception 'java.io.IOException(This archives 
contains unclosed entries.)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:90)
at 
org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380)
at 
org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:193)
at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
at 
org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
at 

[jira] [Commented] (HIVE-3108) SELECT count(DISTINCT col) ... returns 0 if col is a partition column

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402814#comment-13402814
 ] 

Edward Capriolo commented on HIVE-3108:
---

Confirmed fixed in trunk.
{noformat}
[edward@tablitha dist]$ bin/hive
Logging initialized using configuration in 
jar:file:/home/edward/hive/trunk/build/dist/lib/hive-common-0.10.0-SNAPSHOT.jar!/hive-log4j.properties
Hive history file=/tmp/edward/hive_job_log_edward_201206272349_386020253.txt
hive> create table stocks (x int, y string) partitioned by (exchange string, 
symbol string);
OK
Time taken: 17.382 seconds
hive> alter table stocks add partition (exchange='nasdaq', symbol='ed');
OK
Time taken: 2.022 seconds
hive> alter table stocks add partition (exchange='nasdaq', symbol='guy');
OK
Time taken: 0.219 seconds
hive> alter table stocks add partition (exchange='jp', symbol='bla');
OK
Time taken: 0.245 seconds
hive> select count(distinct exchange), count(distinct symbol) from stocks;
2   3
Time taken: 5.742 seconds
{noformat}

 SELECT count(DISTINCT col) ... returns 0 if col is a partition column
 ---

 Key: HIVE-3108
 URL: https://issues.apache.org/jira/browse/HIVE-3108
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.8.0, 0.9.0
 Environment: Mac OSX running Apache distribution of hadoop and hive 
 natively.
Reporter: Dean Wampler
  Labels: Hive

 Suppose stocks is a managed OR external table, partitioned by exchange 
 and symbol. count(DISTINCT x) returns 0 for either exchange, symbol, 
 or both:
 hive> SELECT count(DISTINCT exchange), count(DISTINCT symbol) from stocks;
 0  0





[jira] [Resolved] (HIVE-3108) SELECT count(DISTINCT col) ... returns 0 if col is a partition column

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo resolved HIVE-3108.
---

Resolution: Duplicate

 SELECT count(DISTINCT col) ... returns 0 if col is a partition column
 ---

 Key: HIVE-3108
 URL: https://issues.apache.org/jira/browse/HIVE-3108
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.8.0, 0.9.0
 Environment: Mac OSX running Apache distribution of hadoop and hive 
 natively.
Reporter: Dean Wampler
  Labels: Hive

 Suppose stocks is a managed OR external table, partitioned by exchange 
 and symbol. count(DISTINCT x) returns 0 for either exchange, symbol, 
 or both:
 hive> SELECT count(DISTINCT exchange), count(DISTINCT symbol) from stocks;
 0  0





[jira] [Updated] (HIVE-3206) Bucket mapjoin in trunk is not working

2012-06-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3206:


Status: Patch Available  (was: Open)

https://reviews.facebook.net/D3861

The working directory could be different from the temp directory for 
MapredLocalWork (intentional?), so I just prepended the temp directory as a 
parent of the stored files, with some error handling.

 Bucket mapjoin in trunk is not working 
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis

 Bucket mapjoin throws exception archiving stored hashtables. 
 {noformat}
 hive> set hive.optimize.bucketmapjoin = true;
 hive> select /*+mapjoin(a)*/ a.key, a.value, b.value 
  from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b 
  on a.key=b.key;
 Total MapReduce jobs = 1
 12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml 
 found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
 Execution log at: 
 /tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log
 2012-06-28 12:36:18   Starting to launch local task to process map join;  
 maximum memory = 932118528
 2012-06-28 12:36:18   Processing rows:153 Hashtable size: 153 
 Memory usage:   1771376 rate:   0.002
 2012-06-28 12:36:18   Dump the hashtable into file: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
 2012-06-28 12:36:18   Upload 1 File to: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
  File size: 9644
 2012-06-28 12:36:19   Processing rows:309 Hashtable size: 156 
 Memory usage:   1844568 rate:   0.002
 2012-06-28 12:36:19   Dump the hashtable into file: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
 2012-06-28 12:36:19   Upload 1 File to: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
  File size: 10023
 2012-06-28 12:36:19   End of local task; Time Taken: 0.773 sec.
 Execution completed successfully
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 java.io.IOException: This archives contains unclosed entries.
   at 
 org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214)
   at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Job Submission failed with exception 'java.io.IOException(This archives 
 contains unclosed entries.)'
 java.lang.IllegalArgumentException: Can not create a Path from an empty string
   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
 at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:90)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:193)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
   at 

[jira] [Commented] (HIVE-3206) Bucket mapjoin in trunk is not working

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402835#comment-13402835
 ] 

Edward Capriolo commented on HIVE-3206:
---

Did you run a very-clean? The tests look ok.

[edward@tablitha trunk]$ ant test -Dtestcase=TestCliDriver 
-Dqfile=bucketmapjoin1.q
BUILD SUCCESSFUL
Total time: 4 minutes 18 seconds


 Bucket mapjoin in trunk is not working 
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis

 Bucket mapjoin throws exception archiving stored hashtables. 
 {noformat}
 hive> set hive.optimize.bucketmapjoin = true;
 hive> select /*+mapjoin(a)*/ a.key, a.value, b.value 
  from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b 
  on a.key=b.key;
 Total MapReduce jobs = 1
 12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml 
 found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
 Execution log at: 
 /tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log
 2012-06-28 12:36:18   Starting to launch local task to process map join;  
 maximum memory = 932118528
 2012-06-28 12:36:18   Processing rows:153 Hashtable size: 153 
 Memory usage:   1771376 rate:   0.002
 2012-06-28 12:36:18   Dump the hashtable into file: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
 2012-06-28 12:36:18   Upload 1 File to: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
  File size: 9644
 2012-06-28 12:36:19   Processing rows:309 Hashtable size: 156 
 Memory usage:   1844568 rate:   0.002
 2012-06-28 12:36:19   Dump the hashtable into file: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
 2012-06-28 12:36:19   Upload 1 File to: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
  File size: 10023
 2012-06-28 12:36:19   End of local task; Time Taken: 0.773 sec.
 Execution completed successfully
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 java.io.IOException: This archives contains unclosed entries.
   at 
 org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214)
   at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Job Submission failed with exception 'java.io.IOException(This archives 
 contains unclosed entries.)'
 java.lang.IllegalArgumentException: Can not create a Path from an empty string
   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
 at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:90)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:193)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
   at 

[jira] [Commented] (HIVE-3206) Bucket mapjoin in trunk is not working

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402840#comment-13402840
 ] 

Edward Capriolo commented on HIVE-3206:
---

@Navis, is there any way we can unit test this? It slipped through the unit 
testing.

 Bucket mapjoin in trunk is not working 
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis

 Bucket mapjoin throws exception archiving stored hashtables. 
 {noformat}
 hive> set hive.optimize.bucketmapjoin = true;
 hive> select /*+mapjoin(a)*/ a.key, a.value, b.value 
  from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b 
  on a.key=b.key;
 Total MapReduce jobs = 1
 12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml 
 found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
 Execution log at: 
 /tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log
 2012-06-28 12:36:18   Starting to launch local task to process map join;  
 maximum memory = 932118528
 2012-06-28 12:36:18   Processing rows:153 Hashtable size: 153 
 Memory usage:   1771376 rate:   0.002
 2012-06-28 12:36:18   Dump the hashtable into file: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
 2012-06-28 12:36:18   Upload 1 File to: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
  File size: 9644
 2012-06-28 12:36:19   Processing rows:309 Hashtable size: 156 
 Memory usage:   1844568 rate:   0.002
 2012-06-28 12:36:19   Dump the hashtable into file: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
 2012-06-28 12:36:19   Upload 1 File to: 
 file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
  File size: 10023
 2012-06-28 12:36:19   End of local task; Time Taken: 0.773 sec.
 Execution completed successfully
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 java.io.IOException: This archives contains unclosed entries.
   at 
 org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214)
   at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
   at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
   at 
 org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Job Submission failed with exception 'java.io.IOException(This archives 
 contains unclosed entries.)'
 java.lang.IllegalArgumentException: Can not create a Path from an empty string
   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
   at org.apache.hadoop.fs.Path.init(Path.java:90)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380)
   at 
 org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:193)
   at 
 org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
   at 
 org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)

[jira] [Commented] (HIVE-3206) Bucket mapjoin in trunk is not working

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402843#comment-13402843
 ] 

Edward Capriolo commented on HIVE-3206:
---

Please ignore my previous comment. There is not much that can be done about this 
one unit-test-wise. +1

 Bucket mapjoin in trunk is not working 
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis

 Bucket mapjoin throws exception archiving stored hashtables. 

[jira] [Updated] (HIVE-3206) Bucket mapjoin in trunk is not working

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-3206:
--

Attachment: hive-3206.1.patch.txt

 Bucket mapjoin in trunk is not working 
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
 Attachments: hive-3206.1.patch.txt


 Bucket mapjoin throws exception archiving stored hashtables. 

[jira] [Updated] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-3206:
--

Summary: FileUtils.tar assumes wrong directory in some cases  (was: Bucket 
mapjoin in trunk is not working )

 FileUtils.tar assumes wrong directory in some cases
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
 Attachments: hive-3206.1.patch.txt


 Bucket mapjoin throws exception archiving stored hashtables. 

[jira] [Updated] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo updated HIVE-3206:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Resolved. Thanks Navis.

 FileUtils.tar assumes wrong directory in some cases
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
 Attachments: hive-3206.1.patch.txt


 Bucket mapjoin throws exception archiving stored hashtables. 

[jira] [Resolved] (HIVE-1162) Distribute ant test to speed process up

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo resolved HIVE-1162.
---

Resolution: Won't Fix

 Distribute ant test to speed process up
 ---

 Key: HIVE-1162
 URL: https://issues.apache.org/jira/browse/HIVE-1162
 Project: Hive
  Issue Type: New Feature
  Components: Build Infrastructure
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Priority: Minor
 Attachments: distributed_hive_test.sh


 Hive runs atop map/reduce. Running our unit tests via map/reduce would speed up 
 our test runs and our development cycle. Ideally we would want an ant 
 target to which we could supply a hadoop binary. Then we would distribute the 
 build dir (and possibly ant) via the distributed cache, and finally run all tests 
 in parallel. 
 My proof of concept uses bash + ssh public keys.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3204) Windows: Fix the unit tests which contains “!cmd” commands (Unix shell commands)

2012-06-27 Thread Kanna Karanam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402850#comment-13402850
 ] 

Kanna Karanam commented on HIVE-3204:
-

Wow, thanks Edward. You simplified the problem very quickly. I will start with 
this approach. I noticed that our unit tests use the "!rm", "!ls", "!mkdir", and 
"!sleep" shell commands as of today, so I can easily replace all of them with 
either DFS or java_method. But I am worried that this leaves no unit test 
coverage for the "!cmd" command itself.

I can work around this by adding platform-specific unit tests for this command 
(one for Windows and another for Unix). Do you have any suggestions?
Thanks


 Windows: Fix the unit tests which contains “!cmd” commands (Unix shell 
 commands)
 --

 Key: HIVE-3204
 URL: https://issues.apache.org/jira/browse/HIVE-3204
 Project: Hive
  Issue Type: Sub-task
  Components: Tests, Windows
Affects Versions: 0.10.0
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 0.10.0


 Possible solution 1 (preferred):
 "!Unix cmd | Windows cmd", keeping the same syntax. Hive uses the Java runtime 
 to launch the shell command, so any attempt to run Windows commands on Unix 
 will fail, and vice versa.
 To deal with unit tests, Unix commands in each .q file will be modified as 
 shown below. I will filter out the !commands that cannot be run on the 
 current platform.
 Original entry in the .q file:
 !rm -rf ../build/ql/test/data/exports/exim_department;
 It will be replaced with the following entries:
 UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;
 WINDOWS::!del ../build/ql/test/data/exports/exim_department
 Possible solution 2:
 Provide a shell UDF library (Java-based) to support platform-independent 
 shell functionality.
 Cons: 
 1) Difficult to provide full shell functionality
 2) Takes a long time
 3) Difficult to manage
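The prefix-filtering idea in solution 1 can be sketched as a small preprocessing step over the .q file lines. The class and method names below are illustrative, not Hive's actual test harness; only the UNIX::/WINDOWS:: markers come from the proposal above:

```java
import java.util.ArrayList;
import java.util.List;

public class PlatformFilter {
    // Keep a line if it has no platform prefix, or if its prefix matches the
    // current platform; strip the prefix from matching lines so the remaining
    // text is a plain .q command. Lines for the other platform are dropped.
    static List<String> filter(List<String> qLines, boolean onWindows) {
        String keep = onWindows ? "WINDOWS::" : "UNIX::";
        String drop = onWindows ? "UNIX::" : "WINDOWS::";
        List<String> out = new ArrayList<String>();
        for (String line : qLines) {
            if (line.startsWith(keep)) {
                out.add(line.substring(keep.length()));
            } else if (!line.startsWith(drop)) {
                out.add(line);
            }
        }
        return out;
    }
}
```

With this shape, the same .q file can carry both the Unix and the Windows variant of a shell command, and the harness picks one per run.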





[jira] [Updated] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-3206:
---

Fix Version/s: 0.10.0

 FileUtils.tar assumes wrong directory in some cases
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
 Fix For: 0.10.0

 Attachments: hive-3206.1.patch.txt


 Bucket mapjoin throws exception archiving stored hashtables. 

[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402853#comment-13402853
 ] 

Navis commented on HIVE-3206:
-

Committed this fast? o.o 
I forgot to close the input files. Could you apply the following change if it is 
not yet committed?
{code}
IOUtils.copy(new FileInputStream(f), tOut); // copy with 8K buffer, not close
--
FileInputStream input = new FileInputStream(f);
try {
  IOUtils.copy(input, tOut); // copy with 8K buffer, not close
} finally {
  input.close();
}
{code}
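On Java 7 and later, the same fix can also be written with try-with-resources, which closes the stream even when the copy throws. A minimal, self-contained sketch; the copy helper here stands in for Commons IO's IOUtils.copy, and the class name is illustrative:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class SafeCopy {
    // Copy all bytes from in to out with an 8K buffer. Neither stream is
    // closed here; the caller owns both (matching IOUtils.copy semantics).
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello tar entry".getBytes("UTF-8");
        ByteArrayOutputStream tOut = new ByteArrayOutputStream();
        // try-with-resources closes the input even if copy() throws,
        // which is the point of the fix above.
        try (InputStream input = new ByteArrayInputStream(data)) {
            copy(input, tOut);
        }
        System.out.println(tOut.size());
    }
}
```

Either form avoids leaking the FileInputStream when archiving many hashtable files in a row.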

 FileUtils.tar assumes wrong directory in some cases
 ---

 Key: HIVE-3206
 URL: https://issues.apache.org/jira/browse/HIVE-3206
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
 Fix For: 0.10.0

 Attachments: hive-3206.1.patch.txt


 Bucket mapjoin throws exception archiving stored hashtables. 
 {noformat}
 hive> set hive.optimize.bucketmapjoin = true;
 hive> select /*+mapjoin(a)*/ a.key, a.value, b.value
  from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b
  on a.key=b.key;
 Total MapReduce jobs = 1
 12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
 Execution log at: /tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log
 2012-06-28 12:36:18   Starting to launch local task to process map join; maximum memory = 932118528
 2012-06-28 12:36:18   Processing rows:153 Hashtable size: 153 Memory usage:   1771376 rate:   0.002
 2012-06-28 12:36:18   Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
 2012-06-28 12:36:18   Upload 1 File to: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable  File size: 9644
 2012-06-28 12:36:19   Processing rows:309 Hashtable size: 156 Memory usage:   1844568 rate:   0.002
 2012-06-28 12:36:19   Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
 2012-06-28 12:36:19   Upload 1 File to: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable  File size: 10023
 2012-06-28 12:36:19   End of local task; Time Taken: 0.773 sec.
 Execution completed successfully
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Mapred Local Task Succeeded . Convert the Join into MapJoin
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 java.io.IOException: This archives contains unclosed entries.
   at org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214)
   at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276)
   at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
   at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
   at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
   at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
   at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
   at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
   at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
   at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 Job Submission failed with exception 'java.io.IOException(This archives contains unclosed entries.)'
 java.lang.IllegalArgumentException: Can not create a Path from an empty string
   at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
   at org.apache.hadoop.fs.Path.<init>(Path.java:90)
   at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380)
   at 

[jira] [Commented] (HIVE-2498) Group by operator doesn't estimate size of Timestamp & Binary data correctly

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402855#comment-13402855
 ] 

Edward Capriolo commented on HIVE-2498:
---

@Ashutosh this looks good to me. Any changes you would like since it has been a 
while?

 Group by operator doesn't estimate size of Timestamp & Binary data correctly
 ---

 Key: HIVE-2498
 URL: https://issues.apache.org/jira/browse/HIVE-2498
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0, 0.8.1, 0.9.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Attachments: HIVE-2498.D1185.1.patch, hive-2498.patch, 
 hive-2498_1.patch


 It currently falls through to the default case and returns a constant value, whereas 
 we can do better by getting the actual size at runtime.
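The gist of the description: measure the payload at runtime instead of falling through to a constant. A hypothetical estimator sketch (the 16-byte default and 12-byte timestamp figure are assumptions for illustration, not the values in the actual patch):

```java
import java.sql.Timestamp;

public class SizeEstimate {
    static final int DEFAULT_AVG_SIZE = 16; // constant fallback the issue describes

    // Return a measured size for binary and timestamp values; everything
    // else still falls back to the constant default.
    static int estimate(Object value) {
        if (value instanceof byte[]) {
            return ((byte[]) value).length; // actual payload length
        }
        if (value instanceof Timestamp) {
            return 12; // assumed seconds + nanos footprint
        }
        return DEFAULT_AVG_SIZE;
    }

    public static void main(String[] args) {
        System.out.println(estimate(new byte[]{1, 2, 3})); // measured, not default
    }
}
```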

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3204) Windows: Fix the unit tests which contains “!cmd” commands (Unix shell commands)

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402858#comment-13402858
 ] 

Edward Capriolo commented on HIVE-3204:
---

Really we cannot cover '!' since we are totally relying on something external. 
We could do:
{noformat}
! ${env:JAVA_HOME}/bin/java -jar SomeClassWeMade arg[1] arg[2]
{noformat}
Or we could add these things directly to the hive language.

In the end we might really need your windows!!/unix!! approach; it's not a 
problem worth losing much sleep over.




 Windows: Fix the unit tests which contains “!cmd” commands (Unix shell 
 commands)
 --

 Key: HIVE-3204
 URL: https://issues.apache.org/jira/browse/HIVE-3204
 Project: Hive
  Issue Type: Sub-task
  Components: Tests, Windows
Affects Versions: 0.10.0
Reporter: Kanna Karanam
Assignee: Kanna Karanam
 Fix For: 0.10.0


 Possible solution 1 (preferred):
 !unix-cmd | windows-cmd = keeping the same syntax. Hive uses the Java runtime 
 to launch the shell command, so any attempt to run Windows commands on Unix 
 will fail and vice versa.
 To deal with unit tests, Unix commands in each .q file will be modified as 
 shown below. I will filter out the ! commands which can't be run on the 
 current platform.
 Original entry in the .q file:
 !rm -rf ../build/ql/test/data/exports/exim_department;
 It will be replaced with the following entries:
 UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;
 WINDOWS::!del ../build/ql/test/data/exports/exim_department
 Possible solution 2:
 Provide a shell UDF library (Java-based code) to support platform-independent 
 shell functionality.
 Cons:
 1) Difficult to provide full shell functionality
 2) Takes a long time
 3) Difficult to manage
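Solution 1 amounts to a per-line filter over the .q file. A hypothetical sketch of how a harness might apply the UNIX::/WINDOWS:: markers (illustrative only, not the actual Hive test-framework code):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class QFileFilter {
    // Keep unprefixed lines as-is, keep lines marked for this platform
    // with the marker stripped, and drop lines marked for the other one.
    static List<String> filterForPlatform(List<String> lines, boolean windows) {
        String keep = windows ? "WINDOWS::" : "UNIX::";
        String drop = windows ? "UNIX::" : "WINDOWS::";
        return lines.stream()
                .filter(l -> !l.startsWith(drop))
                .map(l -> l.startsWith(keep) ? l.substring(keep.length()) : l)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "select 1;",
                "UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;",
                "WINDOWS::!del ../build/ql/test/data/exports/exim_department");
        System.out.println(filterForPlatform(lines, false));
        System.out.println(filterForPlatform(lines, true));
    }
}
```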

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Kanna Karanam (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402859#comment-13402859
 ] 

Kanna Karanam commented on HIVE-3206:
-

@Navis - Would it be possible to update/add a unit test? It looks like some 
settings are missing in the .q file.

Thanks


[jira] [Reopened] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Edward Capriolo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Edward Capriolo reopened HIVE-3206:
---



[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402861#comment-13402861
 ] 

Edward Capriolo commented on HIVE-3206:
---

@Navis please supply a patch to place on top of this one, or we can revert the 
change and supply a full patch.


[jira] [Created] (HIVE-3207) FileUtils.tar does not close input files

2012-06-27 Thread Navis (JIRA)
Navis created HIVE-3207:
---

 Summary: FileUtils.tar does not close input files
 Key: HIVE-3207
 URL: https://issues.apache.org/jira/browse/HIVE-3207
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Trivial


It should close input files too. I missed this in HIVE-3206. (sorry)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-3207) FileUtils.tar does not close input files

2012-06-27 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-3207:


Status: Patch Available  (was: Open)

https://reviews.facebook.net/D3867

 FileUtils.tar does not close input files
 

 Key: HIVE-3207
 URL: https://issues.apache.org/jira/browse/HIVE-3207
 Project: Hive
  Issue Type: Sub-task
  Components: Query Processor
Affects Versions: 0.10.0
Reporter: Navis
Assignee: Navis
Priority: Trivial

 It should close input files too. I missed this in HIVE-3206. (sorry)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases

2012-06-27 Thread Navis (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402878#comment-13402878
 ] 

Navis commented on HIVE-3206:
-

@Kanna Karanam - Some problems are not shown by the Hive test framework. I don't 
know why.
@Edward Capriolo - Added HIVE-3207. Sorry for the inconvenience.
