[jira] [Commented] (HIVE-3168) LazyBinaryObjectInspector.getPrimitiveJavaObject copies beyond length of underlying BytesWritable
[ https://issues.apache.org/jira/browse/HIVE-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401991#comment-13401991 ] Thejas M Nair commented on HIVE-3168: - Updated patch on phabricator. Fixes JavaBinaryObjectInspector and WritableBinaryObjectInspector as well. Also tries to avoid copying the byte[] in BytesWritable. LazyBinaryObjectInspector.getPrimitiveJavaObject copies beyond length of underlying BytesWritable - Key: HIVE-3168 URL: https://issues.apache.org/jira/browse/HIVE-3168 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.9.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.10.0, 0.9.1 Attachments: HIVE-3168.1.patch LazyBinaryObjectInspector.getPrimitiveJavaObject copies the full capacity of the LazyBinary's underlying BytesWritable object, which can be greater than the size of the actual contents. This leads to additional characters at the end of the returned ByteArrayRef. When the LazyBinary object gets reused, there can be remnants of the trailing portion of the previous entry. This was not seen while reading through Hive queries, which I think is because a copy elsewhere seems to create LazyBinary with length == capacity (probably the LazyBinary copy constructor). This was seen when MR or Pig used HCatalog to read the data. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
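As a rough illustration of the bug described above (a minimal stand-in, not Hadoop's actual BytesWritable nor the patch): a reused writable's backing array can be longer than its logical length, so copying the whole array picks up stale bytes from a previous, longer value, while copying only getLength() bytes does not.

```java
import java.util.Arrays;

// Minimal stand-in for Hadoop's BytesWritable: after reuse, the backing
// array (capacity) can be longer than the logical length.
class FakeBytesWritable {
    byte[] buf; int len;
    FakeBytesWritable(byte[] b) { buf = b.clone(); len = b.length; }
    void set(byte[] b) {                  // reuse: grow the buffer only if needed
        if (buf.length < b.length) buf = new byte[b.length];
        System.arraycopy(b, 0, buf, 0, b.length);
        len = b.length;
    }
    byte[] getBytes() { return buf; }     // full capacity, may hold a stale tail
    int getLength() { return len; }
}

public class CopyLength {
    // Buggy: copies the whole backing array, like the pre-patch code path.
    static byte[] copyAll(FakeBytesWritable w) {
        return w.getBytes().clone();
    }
    // Fixed: copy only getLength() bytes.
    static byte[] copyExact(FakeBytesWritable w) {
        return Arrays.copyOf(w.getBytes(), w.getLength());
    }
    public static void main(String[] args) {
        FakeBytesWritable w = new FakeBytesWritable(new byte[]{1, 2, 3, 4, 5});
        w.set(new byte[]{9, 9});          // reuse with a shorter value
        System.out.println(copyAll(w).length);   // 5: stale tail included
        System.out.println(copyExact(w).length); // 2
    }
}
```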
[jira] [Commented] (HIVE-3127) Don't pass -hconf values as command line arguments to child JVM to avoid command line exceeding char limit on Windows
[ https://issues.apache.org/jira/browse/HIVE-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13401995#comment-13401995 ] Kanna Karanam commented on HIVE-3127: - Thanks Edward. Don't pass -hconf values as command line arguments to child JVM to avoid command line exceeding char limit on Windows Key: HIVE-3127 URL: https://issues.apache.org/jira/browse/HIVE-3127 Project: Hive Issue Type: Bug Components: Configuration, Windows Affects Versions: 0.9.0, 0.10.0, 0.9.1 Reporter: Kanna Karanam Assignee: Kanna Karanam Labels: Windows Fix For: 0.10.0 Attachments: HIVE-3127.1.patch.txt, HIVE-3127.2.patch.txt, HIVE-3127.3.patch.txt The maximum length of the DOS command string is 8191 characters (in recent Windows versions; see http://support.microsoft.com/kb/830473). This limit is easily exceeded when individual -hconf values are appended to the command string. To work around this problem, write all changed hconf values to a temp file and pass the temp file path to the child JVM, which reads the file and initializes the -hconf parameters from it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
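The temp-file handoff described above can be sketched roughly as follows (a minimal sketch using java.util.Properties as a stand-in for Hive's conf serialization; the real patch works with Hive/Hadoop Configuration objects, and all names here are hypothetical):

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

public class ConfHandoff {
    // Parent side: persist the overridden values; only this short file path is
    // appended to the child's command line, keeping it well under the DOS limit.
    static Path writeConf(Map<String, String> changed) {
        try {
            Properties p = new Properties();
            p.putAll(changed);
            Path f = Files.createTempFile("hive-hconf-", ".properties");
            try (OutputStream out = Files.newOutputStream(f)) {
                p.store(out, "overridden -hconf values");
            }
            return f;
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }

    // Child side: load the values back from the file instead of parsing argv.
    static Map<String, String> readConf(Path f) {
        try (InputStream in = Files.newInputStream(f)) {
            Properties p = new Properties();
            p.load(in);
            Map<String, String> m = new HashMap<>();
            for (String k : p.stringPropertyNames()) m.put(k, p.getProperty(k));
            return m;
        } catch (IOException e) { throw new UncheckedIOException(e); }
    }
}
```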
[jira] [Commented] (HIVE-3202) Add hive command for resetting hive confs
[ https://issues.apache.org/jira/browse/HIVE-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402242#comment-13402242 ] Edward Capriolo commented on HIVE-3202: --- +1, will commit if tests pass. Add hive command for resetting hive confs - Key: HIVE-3202 URL: https://issues.apache.org/jira/browse/HIVE-3202 Project: Hive Issue Type: Improvement Components: Configuration Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Priority: Trivial For the purpose of optimization we set various configs per query. That is worthwhile, but all those configs should be reset before the next query. Just a simple reset command would make it less painful. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
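The idea behind the reset command can be sketched as an overlay over default configs (a hypothetical sketch, not Hive's SessionState/HiveConf code): "set k=v" writes into a per-session overlay, and "reset" drops the overlay so the next query starts from the defaults again.

```java
import java.util.*;

// Sketch of per-session conf overrides (class and method names hypothetical).
public class SessionConf {
    private final Map<String, String> defaults;
    private final Map<String, String> overrides = new HashMap<>();

    SessionConf(Map<String, String> defaults) { this.defaults = defaults; }

    void set(String k, String v) { overrides.put(k, v); }     // "set k=v"

    String get(String k) {                                    // overlay wins
        return overrides.getOrDefault(k, defaults.get(k));
    }

    void reset() { overrides.clear(); }                       // the "reset" command
}
```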
[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
[ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402246#comment-13402246 ] Daryn Sharp commented on HIVE-3098: --- I believe neither disabling the fs cache nor switching to {{FileContext}} will help, since Rohini stated earlier the desire to take advantage of the cache for performance, albeit w/o leaking. {{FileContext}} isn't going to be a panacea for this issue. I think it only implements hdfs, view, and ftp (not hftp). It provides no {{close()}} method, so there's no way to clean up or shut down clients until JVM shutdown, i.e. aborting streams, deleting tmp files, closing the dfs client, etc. The latter will lead to leaks such as the dfs socket cache leaks, dfs lease renewer threads, etc. Even with the fs cache disabled, leaks such as the aforementioned dfs leaks will still occur unless _all_ fs instances are explicitly closed. I'd suggest either {{closeAllForUGI}}, which preserves the cache boost within each request but degrades performance across multiple requests, or the oozie-style UGI caching with periodic cache purging. Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.) - Key: HIVE-3098 URL: https://issues.apache.org/jira/browse/HIVE-3098 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.9.0 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on. Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-3098.patch The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend). The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads, in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 100 instances of FileSystem, whose combined retained memory consumed the entire heap. 
It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the Subject member is compared for equality (==), and not equivalence (.equals()). This causes equivalent UGI instances to compare as unequal, and causes a new FileSystem instance to be created and cached. UGI.equals() is implemented this way, incidentally, as a fix for yet another problem (HADOOP-6670); so it is unlikely that that implementation can be modified. The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims. I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory leak has been arrested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
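The identity-vs-equivalence problem described above can be demonstrated with a small stand-in (not Hadoop's actual UGI, and the key wrapper here is only a hypothetical mirror of the shim-level cache the patch proposes): with identity-based equals(), every request's UGI produces a new cache entry; keying by field equivalence collapses them.

```java
import java.util.*;

// Stand-in for UserGroupInformation: equals() is identity-based (per
// HADOOP-6670), so two UGIs for the same user never compare equal.
class FakeUgi {
    final String user;
    FakeUgi(String user) { this.user = user; }
    // inherits Object.equals/hashCode: identity semantics
}

// Wrapper key comparing by the fields we consider equivalent.
class UgiKey {
    final String user;
    UgiKey(FakeUgi u) { this.user = u.user; }
    @Override public boolean equals(Object o) {
        return o instanceof UgiKey && ((UgiKey) o).user.equals(user);
    }
    @Override public int hashCode() { return user.hashCode(); }
}

public class UgiCacheDemo {
    // Simulates FileSystem.CACHE: returns how many FS instances get cached.
    static int cachedFs(List<FakeUgi> requests, boolean useEquivalenceKey) {
        Map<Object, String> fsCache = new HashMap<>();
        for (FakeUgi u : requests) {
            Object key = useEquivalenceKey ? new UgiKey(u) : u;
            fsCache.putIfAbsent(key, "FileSystem for " + u.user);
        }
        return fsCache.size();
    }
}
```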
Hive-trunk-h0.21 - Build # 1516 - Still Failing
Changes for Build #1509 Changes for Build #1510 Changes for Build #1511 Changes for Build #1512 Changes for Build #1513 Changes for Build #1514 Changes for Build #1515 [ecapriolo] HIVE-3180 Fix Eclipse classpath template broken in HIVE-3128. Carl Steinbach (via egc) [hashutosh] HIVE-3048 : Collect_set Aggregate does uneccesary check for value. (Ed Capriolo via Ashutosh Chauhan) Changes for Build #1516 No tests ran. The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1516) Status: Still Failing Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1516/ to view the results.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #61
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/ -- [...truncated 5479 lines...] [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test/src [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test/classes [mkdir] Created dir: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/builtins/test/resources [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/builtins/src/test/resources does not exist. init: [echo] Project: builtins jar: [echo] Project: hive create-dirs: [echo] Project: shims [copy] Warning: https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/test/resources does not exist. init: [echo] Project: shims ivy-init-settings: [echo] Project: shims ivy-resolve: [echo] Project: shims [ivy:resolve] :: loading settings :: file = https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/ivy/ivysettings.xml [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/zookeeper/zookeeper/3.4.3/zookeeper-3.4.3.jar ... [ivy:resolve] (749kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.zookeeper#zookeeper;3.4.3!zookeeper.jar (98ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/apache/thrift/libthrift/0.7.0/libthrift-0.7.0.jar ... [ivy:resolve] . (294kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.apache.thrift#libthrift;0.7.0!libthrift.jar (153ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/commons-logging/commons-logging/1.0.4/commons-logging-1.0.4.jar ... [ivy:resolve] .. (37kB) [ivy:resolve] .. 
(0kB) [ivy:resolve] [SUCCESSFUL ] commons-logging#commons-logging;1.0.4!commons-logging.jar (22ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/commons-logging/commons-logging-api/1.0.4/commons-logging-api-1.0.4.jar ... [ivy:resolve] .. (25kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] commons-logging#commons-logging-api;1.0.4!commons-logging-api.jar (20ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/com/google/guava/guava/r09/guava-r09.jar ... [ivy:resolve] .. (1117kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] com.google.guava#guava;r09!guava.jar (260ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/codehaus/jackson/jackson-core-asl/1.8.8/jackson-core-asl-1.8.8.jar ... [ivy:resolve] ... (222kB) [ivy:resolve] .. (0kB) [ivy:resolve] [SUCCESSFUL ] org.codehaus.jackson#jackson-core-asl;1.8.8!jackson-core-asl.jar (58ms) [ivy:resolve] downloading http://repo1.maven.org/maven2/org/codehaus/jackson/jackson-mapper-asl/1.8.8/jackson-mapper-asl-1.8.8.jar ... [ivy:resolve] .. (652kB) [ivy:resolve] .. 
(0kB) [ivy:resolve] [SUCCESSFUL ] org.codehaus.jackson#jackson-mapper-asl;1.8.8!jackson-mapper-asl.jar (169ms) [ivy:report] Processing https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml to https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/ivy/report/org.apache.hive-hive-shims-default.html ivy-retrieve: [echo] Project: shims compile: [echo] Project: shims [echo] Building shims 0.20 build_shims: [echo] Project: shims [echo] Compiling https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/ws/hive/shims/src/common/java;/home/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java against hadoop 0.20.2 (https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/61/artifact/hive/build/hadoopcore/hadoop-0.20.2) ivy-init-settings: [echo] Project: shims ivy-resolve-hadoop-shim: [echo] Project: shims [ivy:resolve] :: loading settings :: file =
[jira] [Created] (HIVE-3203) Drop partition throws NPE if table doesn't exist
Kevin Wilfong created HIVE-3203: --- Summary: Drop partition throws NPE if table doesn't exist Key: HIVE-3203 URL: https://issues.apache.org/jira/browse/HIVE-3203 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3203.1.patch.txt ALTER TABLE t1 DROP PARTITION (part = '1'); This throws an NPE if t1 doesn't exist. A SemanticException would be cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
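The shape of the fix can be sketched as follows (a hypothetical sketch, not the attached patch; the metastore lookup and class names here are stand-ins): look the table up before analyzing DROP PARTITION and raise a SemanticException instead of dereferencing a null Table.

```java
import java.util.*;

// Stand-in for Hive's SemanticException.
class SemanticException extends Exception {
    SemanticException(String msg) { super(msg); }
}

public class DropPartitionCheck {
    // metastore maps table name -> table object; null means the table is missing.
    static String analyzeDropPartition(Map<String, Object> metastore,
                                       String tableName) throws SemanticException {
        Object table = metastore.get(tableName);
        if (table == null) {
            // Pre-fix code would proceed and hit an NPE here.
            throw new SemanticException("Table not found: " + tableName);
        }
        return "DROP PARTITION plan for " + tableName;
    }
}
```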
[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3203: Attachment: HIVE-3203.1.patch.txt Drop partition throws NPE if table doesn't exist Key: HIVE-3203 URL: https://issues.apache.org/jira/browse/HIVE-3203 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3203.1.patch.txt ALTER TABLE t1 DROP PARTITION (part = '1'); This throws an NPE if t1 doesn't exist. A SemanticException would be cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3203) Drop partition throws NPE if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402364#comment-13402364 ] Kevin Wilfong commented on HIVE-3203: - Submitted a diff here https://reviews.facebook.net/D3843 Drop partition throws NPE if table doesn't exist Key: HIVE-3203 URL: https://issues.apache.org/jira/browse/HIVE-3203 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt ALTER TABLE t1 DROP PARTITION (part = '1'); This throws an NPE if t1 doesn't exist. A SemanticException would be cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3203: Attachment: HIVE-3203.2.patch.txt Drop partition throws NPE if table doesn't exist Key: HIVE-3203 URL: https://issues.apache.org/jira/browse/HIVE-3203 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt ALTER TABLE t1 DROP PARTITION (part = '1'); This throws an NPE if t1 doesn't exist. A SemanticException would be cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3203: Status: Patch Available (was: Open) Drop partition throws NPE if table doesn't exist Key: HIVE-3203 URL: https://issues.apache.org/jira/browse/HIVE-3203 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt ALTER TABLE t1 DROP PARTITION (part = '1'); This throws an NPE if t1 doesn't exist. A SemanticException would be cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3126) Generate and build the Velocity-based Hive tests on Windows by fixing the path issues
[ https://issues.apache.org/jira/browse/HIVE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402370#comment-13402370 ] Kanna Karanam commented on HIVE-3126: - Updated the patch after addressing most of the review comments. 1) Replaced most of the Windows-specific checks in build.xml files with a generic OS check approach. 2) Removed some of the if(Windows) checks in the production code. Generate and build the Velocity-based Hive tests on Windows by fixing the path issues --- Key: HIVE-3126 URL: https://issues.apache.org/jira/browse/HIVE-3126 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.9.0, 0.10.0, 0.9.1 Reporter: Kanna Karanam Assignee: Kanna Karanam Labels: Windows, test Fix For: 0.10.0 Attachments: HIVE-3126.1.patch.txt, HIVE-3126.2.patch.txt, HIVE-3126.3.patch.txt 1) Escape the backslash in the canonical path if the unit test runs on Windows. 2) Diff comparison: a. Ignore the extra spacing on Windows. b. Ignore the different line endings on Windows and Unix. c. Convert the file paths to Windows-specific ones (handle spaces etc.). 3) Set the right file scheme and classpath separators while invoking the junit task from -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
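The first two kinds of normalization listed above can be sketched as small helpers (helper names hypothetical, not the patch's actual code): escape backslashes in a canonical Windows path before substituting it into a .q file, and normalize line endings and trailing spaces before comparing diffs.

```java
public class TestPathFixups {
    // 1) A Windows canonical path like C:\hive\build must have its
    //    backslashes doubled before substitution into generated test source.
    static String escapeBackslashes(String canonicalPath) {
        return canonicalPath.replace("\\", "\\\\");
    }

    // 2) Unify CRLF/CR line endings and drop trailing spaces per line so the
    //    diff comparison ignores Windows/Unix formatting differences.
    static String normalizeForDiff(String s) {
        return s.replace("\r\n", "\n").replace("\r", "\n")
                .replaceAll("[ \t]+\n", "\n");
    }
}
```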
[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
[ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402394#comment-13402394 ] Alejandro Abdelnur commented on HIVE-3098: -- Or FIX Hadoop FileSystem caching; this issue is gremlin-ing all over, as we have long-running/multiuser systems that use Hadoop FileSystem. Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.) - Key: HIVE-3098 URL: https://issues.apache.org/jira/browse/HIVE-3098 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.9.0 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on. Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-3098.patch The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend). The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads, in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 100 instances of FileSystem, whose combined retained memory consumed the entire heap. It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the Subject member is compared for equality (==), and not equivalence (.equals()). This causes equivalent UGI instances to compare as unequal, and causes a new FileSystem instance to be created and cached. UGI.equals() is implemented this way, incidentally, as a fix for yet another problem (HADOOP-6670); so it is unlikely that that implementation can be modified. The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims. I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory leak has been arrested. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3203) Drop partition throws NPE if table doesn't exist
[ https://issues.apache.org/jira/browse/HIVE-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3203: Attachment: HIVE-3203.3.patch.txt Drop partition throws NPE if table doesn't exist Key: HIVE-3203 URL: https://issues.apache.org/jira/browse/HIVE-3203 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Fix For: 0.10.0 Attachments: HIVE-3203.1.patch.txt, HIVE-3203.2.patch.txt, HIVE-3203.3.patch.txt ALTER TABLE t1 DROP PARTITION (part = '1'); This throws an NPE if t1 doesn't exist. A SemanticException would be cleaner. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
[ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402448#comment-13402448 ] Ashutosh Chauhan commented on HIVE-3098: I think we are mixing two things here: performance vs. memory leak. The patch is intended to plug the memory leak, not to improve performance. It fixes the original leak, but will introduce a new one. It fixes the case where a limited number of users contact the metastore (which was Mithun's original test, where the same 60 clients keep hitting the metastore). But if different users hit the metastore, the leak is still there, and is rather more acute with the patch, since there are now two caches (ugi and fs) which will both grow, instead of just one (fs). The problem stems from the fact that there is no expiration policy in either the fs or the ugi cache. We need to design a UGI cache eviction policy. Then, when we expire stale ugi's from the ugi-cache, we can do {{closeAllForUGI}} for the evicted ugi to evict its cached FS objects from the fs-cache. Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.) - Key: HIVE-3098 URL: https://issues.apache.org/jira/browse/HIVE-3098 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.9.0 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on. Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-3098.patch The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend). The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads, in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 100 instances of FileSystem, whose combined retained memory consumed the entire heap. It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the Subject member is compared for equality (==), and not equivalence (.equals()). This causes equivalent UGI instances to compare as unequal, and causes a new FileSystem instance to be created and cached. UGI.equals() is implemented this way, incidentally, as a fix for yet another problem (HADOOP-6670); so it is unlikely that that implementation can be modified. The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims. I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory leak has been arrested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
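The eviction-policy suggestion above can be sketched with a simple LRU cache whose eviction hook is where the FileSystem cleanup would happen (a hypothetical sketch using java.util.LinkedHashMap, not Hive or Hadoop code; the closeAllForUGI call is represented only by a comment):

```java
import java.util.*;

// LRU-style UGI cache: evicting a stale UGI is the point at which
// FileSystem.closeAllForUGI would be invoked, so its cached FS objects
// get dropped from the fs-cache at the same time.
public class EvictingUgiCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;
    final List<V> evicted = new ArrayList<>();   // recorded here for the demo

    EvictingUgiCache(int maxEntries) {
        super(16, 0.75f, true);                  // access-order: LRU semantics
        this.maxEntries = maxEntries;
    }

    @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        if (size() > maxEntries) {
            // Real code would call FileSystem.closeAllForUGI(eldest.getValue())
            // here before letting the entry go.
            evicted.add(eldest.getValue());
            return true;
        }
        return false;
    }
}
```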
[jira] [Created] (HIVE-3204) Windows: Fix the unit tests which contain “!cmd” commands (Unix shell commands)
Kanna Karanam created HIVE-3204: --- Summary: Windows: Fix the unit tests which contain “!cmd” commands (Unix shell commands) Key: HIVE-3204 URL: https://issues.apache.org/jira/browse/HIVE-3204 Project: Hive Issue Type: Sub-task Components: Tests, Windows Affects Versions: 0.10.0 Reporter: Kanna Karanam Assignee: Kanna Karanam Fix For: 0.10.0 Possible solution 1 (preferred): keep the same "!cmd" syntax, with a Unix and a Windows variant of each command. Hive uses the Java runtime to launch the shell command, so any attempt to run Windows commands on Unix will fail and vice versa. To deal with the unit tests, Unix commands in each .q file will be modified as shown below, and I will filter out the !commands which can't be run on the current OS. Original entry in .q file: !rm -rf ../build/ql/test/data/exports/exim_department; It will be replaced with the following entries: UNIX::!rm -rf ../build/ql/test/data/exports/exim_department; WINDOWS::!del ../build/ql/test/data/exports/exim_department Possible solution 2: Provide a shell UDF library (Java-based code) to support platform-independent shell functionality. Cons: 1) Difficult to provide full shell functionality 2) Takes a long time 3) Difficult to manage -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
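Solution 1's filtering step can be sketched as follows (a hypothetical sketch of the proposed UNIX::/WINDOWS:: prefix handling, not the test driver's actual parser): keep the variant matching the current platform, strip its prefix, and drop the other variant.

```java
public class OsCommandFilter {
    // Rewrites .q file lines: keep the current platform's prefixed variant
    // (minus the prefix), skip the other platform's, pass plain lines through.
    static String select(String[] qFileLines, boolean isWindows) {
        String want = isWindows ? "WINDOWS::" : "UNIX::";
        String skip = isWindows ? "UNIX::" : "WINDOWS::";
        StringBuilder out = new StringBuilder();
        for (String line : qFileLines) {
            if (line.startsWith(want)) {
                out.append(line.substring(want.length())).append('\n');
            } else if (line.startsWith(skip)) {
                continue;                         // other platform's variant
            } else {
                out.append(line).append('\n');    // OS-neutral line
            }
        }
        return out.toString();
    }
}
```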
[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
[ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402525#comment-13402525 ] Daryn Sharp commented on HIVE-3098: --- bq. Or FIX Hadoop FileSystem caching I wanted to fix the cache for the NM, but after digging around, I think the cache is working as designed. UGIs are mutable, so if two different requests share the same cached UGI, then tokens from one request will be shared with another request. This contamination may lead to security issues and will cause bugs. Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.) - Key: HIVE-3098 URL: https://issues.apache.org/jira/browse/HIVE-3098 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.9.0 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on. Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-3098.patch The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend). The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads, in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 100 instances of FileSystem, whose combined retained memory consumed the entire heap. It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the Subject member is compared for equality (==), and not equivalence (.equals()). This causes equivalent UGI instances to compare as unequal, and causes a new FileSystem instance to be created and cached. UGI.equals() is implemented this way, incidentally, as a fix for yet another problem (HADOOP-6670); so it is unlikely that that implementation can be modified. The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims. I have a patch to fix this. I'll upload it shortly. 
I just ran an overnight test to confirm that the memory-leak has been arrested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
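The contamination risk Daryn describes can be shown with a small stand-in (not Hadoop's UGI; all names here are hypothetical): if a cache hands the same mutable UGI to two requests, a token added for one request is visible to the other.

```java
import java.util.*;

// Stand-in for a mutable UGI: identity plus a mutable token set.
class MutableUgi {
    final String user;
    final Set<String> tokens = new HashSet<>();
    MutableUgi(String user) { this.user = user; }
}

public class SharedUgiDemo {
    // Two requests for the same user resolve to one cached UGI, so request A's
    // delegation token leaks into request B's view.
    static boolean leaksTokenAcrossRequests() {
        Map<String, MutableUgi> cache = new HashMap<>();
        MutableUgi reqA = cache.computeIfAbsent("bob", MutableUgi::new);
        MutableUgi reqB = cache.computeIfAbsent("bob", MutableUgi::new);
        reqA.tokens.add("hdfs-delegation-token-for-job-A");
        return reqB.tokens.contains("hdfs-delegation-token-for-job-A");
    }
}
```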
[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
[ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402530#comment-13402530 ] Alejandro Abdelnur commented on HIVE-3098: -- But we have a bug that not only affects clients creating UGIs on the fly for the same user but, if caching is not off, will also choke the NN with open sockets. And the more clients doing that, the more likely the NN is to choke. Could we make UGIs immutable (which they should have been in the first place)? Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.) - Key: HIVE-3098 URL: https://issues.apache.org/jira/browse/HIVE-3098 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.9.0 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on. Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-3098.patch The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend). The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads, in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 100 instances of FileSystem, whose combined retained memory consumed the entire heap. It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the Subject member is compared for equality (==), and not equivalence (.equals()). This causes equivalent UGI instances to compare as unequal, and causes a new FileSystem instance to be created and cached. UGI.equals() is implemented this way, incidentally, as a fix for yet another problem (HADOOP-6670); so it is unlikely that that implementation can be modified. The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims. I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory leak has been arrested. 
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3171) Bucketed sort merge join doesn't work on tables with more than one partition
[ https://issues.apache.org/jira/browse/HIVE-3171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3171: - Component/s: Query Processor Labels: bucketing joins partitioning (was: ) Bucketed sort merge join doesn't work on tables with more than one partition Key: HIVE-3171 URL: https://issues.apache.org/jira/browse/HIVE-3171 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Joey Echeverria Labels: bucketing, joins, partitioning Executing a query with the MAPJOIN hint and the bucketed sort merge join optimizations enabled: {noformat} set hive.input.format=org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; set hive.optimize.bucketmapjoin = true; set hive.optimize.bucketmapjoin.sortedmerge = true; {noformat} works fine with partitioned tables if there is only one partition in the table. However, if you add a second partition, Hive attempts to do a regular map-side join which can fail because the tables are too large. Hive ought to be able to still do the bucketed sort merge join with partitions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2535) Use sorted nature of compact indexes
[ https://issues.apache.org/jira/browse/HIVE-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2535: - Component/s: Query Processor Indexing Labels: indexing performance (was: ) Use sorted nature of compact indexes Key: HIVE-2535 URL: https://issues.apache.org/jira/browse/HIVE-2535 Project: Hive Issue Type: Improvement Components: Indexing, Query Processor Reporter: Kevin Wilfong Assignee: Kevin Wilfong Labels: indexing, performance Fix For: 0.8.0 Attachments: HIVE-2535.1.patch.txt, HIVE-2535.2.patch.txt, HIVE-2535.3.patch.txt, HIVE-2535.4.patch.txt Compact indexes are sorted based on the indexed columns, but we are not using this fact when we access the index. To start with, if the index is stored as an RC file, and if the predicate being used to access the index consists of only one non-partition condition using one of the operators <, <=, >, >=, =, we could use a binary search (if necessary) to find the block to begin scanning for unfiltered rows, and we could use the result of comparing the value in the column with the constant (this is necessarily the form of a predicate which is optimized using an index) to determine when we have found all the rows which will be unfiltered. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
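The access pattern proposed above reduces, for a predicate like `col >= c`, to finding the first qualifying position in sorted data. A minimal sketch under that assumption (a plain `long[]` stands in for the RCFile index blocks; the name `firstAtLeast` is illustrative, not from the patch):

```java
// HIVE-2535 idea: exploit the sort order of a compact index so that a single
// comparison predicate can be answered by binary search instead of a full scan.
public class SortedIndexSeek {
    /** Returns the index of the first element >= target, or a.length if none exists. */
    public static int firstAtLeast(long[] a, long target) {
        int lo = 0, hi = a.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (a[mid] < target) {
                lo = mid + 1;   // everything at or below mid is filtered out
            } else {
                hi = mid;       // mid still qualifies; keep searching left
            }
        }
        return lo;
    }
}
```

Because the data is sorted, everything from the returned position onward qualifies for `>=`, which is exactly the "determine when we have found all the rows" property the description relies on.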
[jira] [Updated] (HIVE-3127) Pass hconf values as XML instead of command line arguments to child JVM
[ https://issues.apache.org/jira/browse/HIVE-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-3127: -- Summary: Pass hconf values as XML instead of command line arguments to child JVM (was: Don't pass -hconf values as command line arguments to child JVM to avoid command line exceeding char limit on windows) Pass hconf values as XML instead of command line arguments to child JVM --- Key: HIVE-3127 URL: https://issues.apache.org/jira/browse/HIVE-3127 Project: Hive Issue Type: Bug Components: Configuration, Windows Affects Versions: 0.9.0, 0.10.0, 0.9.1 Reporter: Kanna Karanam Assignee: Kanna Karanam Labels: Windows Fix For: 0.10.0 Attachments: HIVE-3127.1.patch.txt, HIVE-3127.2.patch.txt, HIVE-3127.3.patch.txt The maximum length of the DOS command string is 8191 characters (in the latest Windows versions; http://support.microsoft.com/kb/830473). This limit is easily exceeded when Hive appends individual -hconf values to the command string. To work around this problem, write all changed hconf values to a temp file and pass the temp file path to the child JVM, which reads the file and initializes the -hconf parameters from it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
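The workaround described in the issue amounts to a simple file round trip: the parent serializes the changed values, and the child re-reads them from a short path instead of a huge argument list. A hedged sketch assuming a plain properties file (Hive's actual patch uses an XML config file, and these class/method names are illustrative):

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Sketch of the HIVE-3127 approach: rather than appending each changed
// -hconf key=value to the child JVM's command line (which can blow past the
// ~8191-char Windows limit), write them to a temp file and pass only the path.
public class HconfFilePassing {
    public static Path writeConf(Properties changed) throws IOException {
        Path tmp = Files.createTempFile("hive-hconf", ".properties");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            changed.store(out, "changed hconf values for the child JVM");
        }
        return tmp; // the parent passes only this short path on the command line
    }

    public static Properties readConf(Path path) throws IOException {
        Properties p = new Properties();
        try (InputStream in = Files.newInputStream(path)) {
            p.load(in); // the child JVM re-initializes its configuration from the file
        }
        return p;
    }
}
```

The command line now carries one fixed-length path regardless of how many configuration overrides changed, which is why the limit can no longer be exceeded.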
[jira] [Commented] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)
[ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402716#comment-13402716 ] Daryn Sharp commented on HIVE-3098: --- Unfortunately, no. If you make a {{UGI}} immutable, it really means marking the {{Subject}} immutable. Everything token-based breaks because tokens can't be added to the {{Subject}}, you can't login from a keytab or relogin because you can't update the TGT in the {{Subject}}, and proxy users (and maybe SPNEGO?) won't work because service tickets can't be added to the {{Subject}}. A {{Subject}}/{{UGI}} is an execution context with specific privileges. Those contexts cannot be cached and shared w/o risking escalated privileges. Think of it this way: if I entrust you with the keys to my home to pick up a delivery, I don't want you to make a copy of the keys and have the ability to enter anytime you want w/o my explicit permission. Without knowing the intricacies, I recommend leaving the fs cache on, creating a new UGI per connection, and calling {{closeAllForUGI}} when the request is complete. Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.) - Key: HIVE-3098 URL: https://issues.apache.org/jira/browse/HIVE-3098 Project: Hive Issue Type: Bug Components: Shims Affects Versions: 0.9.0 Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security turned on. Reporter: Mithun Radhakrishnan Assignee: Mithun Radhakrishnan Attachments: HIVE-3098.patch The problem manifested from stress-testing HCatalog 0.4.1 (as part of testing the Oracle backend). The HCatalog server ran out of memory (-Xmx2048m) when pounded by 60 threads, in under 24 hours. The heap-dump indicates that hadoop::FileSystem.CACHE had 100 instances of FileSystem, whose combined retained-mem consumed the entire heap. 
It boiled down to hadoop::UserGroupInformation::equals() being implemented such that the Subject member is compared for identity (==), not equivalence (.equals()). This causes equivalent UGI instances to compare as unequal, and causes a new FileSystem instance to be created and cached. UGI.equals() is implemented that way, incidentally, as a fix for yet another problem (HADOOP-6670), so it is unlikely that that implementation can be modified. The solution for this is to check for UGI equivalence in HCatalog (i.e. in the Hive metastore), using a cache for UGI instances in the shims. I have a patch to fix this. I'll upload it shortly. I just ran an overnight test to confirm that the memory-leak has been arrested. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
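The identity-based equals() described above defeats any hash-map-style cache. A minimal, self-contained sketch of the failure mode (the key class below is an illustrative stand-in, not Hadoop's actual UGI or FileSystem.CACHE):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrates the HIVE-3098 leak: a key class whose equals()/hashCode() use
// object identity (like UGI comparing its Subject with ==) means every
// logically-equivalent-but-distinct key misses the cache and adds an entry.
public class IdentityKeyCache {
    static final class UgiLikeKey {
        final String user;
        UgiLikeKey(String user) { this.user = user; }
        @Override public boolean equals(Object o) { return this == o; } // identity only
        @Override public int hashCode() { return System.identityHashCode(this); }
    }

    /** Simulates N requests that each build a fresh key for the same user. */
    public static int cacheSizeAfter(int lookups) {
        Map<UgiLikeKey, Object> cache = new HashMap<>();
        for (int i = 0; i < lookups; i++) {
            UgiLikeKey key = new UgiLikeKey("hcat"); // equivalent key, new object
            cache.computeIfAbsent(key, k -> new Object()); // always a miss: leaks one entry
        }
        return cache.size(); // grows linearly with lookups, one entry per request
    }
}
```

Caching UGI instances by logical equivalence in the shims, as the attached patch proposes, collapses this back to one cached FileSystem per user.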
[jira] [Updated] (HIVE-3127) Pass hconf values as XML instead of command line arguments to child JVM
[ https://issues.apache.org/jira/browse/HIVE-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-3127: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed. Thank you. Pass hconf values as XML instead of command line arguments to child JVM --- Key: HIVE-3127 URL: https://issues.apache.org/jira/browse/HIVE-3127 Project: Hive Issue Type: Bug Components: Configuration, Windows Affects Versions: 0.9.0, 0.10.0, 0.9.1 Reporter: Kanna Karanam Assignee: Kanna Karanam Labels: Windows Fix For: 0.10.0 Attachments: HIVE-3127.1.patch.txt, HIVE-3127.2.patch.txt, HIVE-3127.3.patch.txt The maximum length of the DOS command string is 8191 characters (in the latest Windows versions; http://support.microsoft.com/kb/830473). This limit is easily exceeded when Hive appends individual -hconf values to the command string. To work around this problem, write all changed hconf values to a temp file and pass the temp file path to the child JVM, which reads the file and initializes the -hconf parameters from it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-3205) Bucketed mapjoin on partitioned table which has no partition throws NPE
Navis created HIVE-3205: --- Summary: Bucketed mapjoin on partitioned table which has no partition throws NPE Key: HIVE-3205 URL: https://issues.apache.org/jira/browse/HIVE-3205 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Environment: ubuntu 10.04 Reporter: Navis Assignee: Navis Priority: Minor {code} create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds string) clustered by (key) sorted by (key) into 2 buckets; create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds string) clustered by (key) sorted by (key) into 2 buckets; set hive.optimize.bucketmapjoin = true; set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; explain SELECT /* + MAPJOIN(b) */ b.key as k1, b.value, b.ds, a.key as k2 FROM hive_test_smb_bucket1 a JOIN hive_test_smb_bucket2 b ON a.key = b.key WHERE a.ds = '2010-10-15' and b.ds='2010-10-15' and b.key IS NOT NULL; {code} throws NPE {noformat} 2012-06-28 08:59:13,459 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer$BucketMapjoinOptProc.process(BucketMapJoinOptimizer.java:269) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102) at org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer.transform(BucketMapJoinOptimizer.java:100) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7564) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245) at 
org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3205) Bucketed mapjoin on partitioned table which has no partition throws NPE
[ https://issues.apache.org/jira/browse/HIVE-3205?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3205: Status: Patch Available (was: Open) https://reviews.facebook.net/D3849 Bucketed mapjoin on partitioned table which has no partition throws NPE --- Key: HIVE-3205 URL: https://issues.apache.org/jira/browse/HIVE-3205 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Environment: ubuntu 10.04 Reporter: Navis Assignee: Navis Priority: Minor {code} create table hive_test_smb_bucket1 (key int, value string) partitioned by (ds string) clustered by (key) sorted by (key) into 2 buckets; create table hive_test_smb_bucket2 (key int, value string) partitioned by (ds string) clustered by (key) sorted by (key) into 2 buckets; set hive.optimize.bucketmapjoin = true; set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat; explain SELECT /* + MAPJOIN(b) */ b.key as k1, b.value, b.ds, a.key as k2 FROM hive_test_smb_bucket1 a JOIN hive_test_smb_bucket2 b ON a.key = b.key WHERE a.ds = '2010-10-15' and b.ds='2010-10-15' and b.key IS NOT NULL; {code} throws NPE {noformat} 2012-06-28 08:59:13,459 ERROR ql.Driver (SessionState.java:printError(400)) - FAILED: NullPointerException null java.lang.NullPointerException at org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer$BucketMapjoinOptProc.process(BucketMapJoinOptimizer.java:269) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:125) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102) at org.apache.hadoop.hive.ql.optimizer.BucketMapJoinOptimizer.transform(BucketMapJoinOptimizer.java:100) at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:87) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7564) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245) at org.apache.hadoop.hive.ql.parse.ExplainSemanticAnalyzer.analyzeInternal(ExplainSemanticAnalyzer.java:50) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:245) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
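The NPE in BucketMapJoinOptimizer above comes from assuming a partitioned table always has at least one partition. A defensive sketch of the fix direction: treat a null or empty partition list as "not eligible" and fall back to a plain join rather than dereferencing it. All names here are illustrative, not the actual BucketMapJoinOptimizer code.

```java
import java.util.List;

// Sketch for HIVE-3205: guard the bucket-map-join eligibility check against
// partitioned tables that have no partitions yet, instead of throwing NPE.
public class PartitionGuard {
    /**
     * bucketCountPerPartition holds the bucket count observed in each partition.
     * Returns true only if there is at least one partition and every partition
     * has the same bucket count (a common precondition for bucketed joins).
     */
    public static boolean canBucketMapJoin(List<Integer> bucketCountPerPartition) {
        if (bucketCountPerPartition == null || bucketCountPerPartition.isEmpty()) {
            return false; // no partitions: skip the optimization, don't NPE
        }
        int expected = bucketCountPerPartition.get(0);
        for (int count : bucketCountPerPartition) {
            if (count != expected) {
                return false; // inconsistent bucketing: optimization not safe
            }
        }
        return true;
    }
}
```

Returning false here simply disables the optimization for the degenerate case, which matches the query's expected behavior before any partitions are loaded.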
[jira] [Updated] (HIVE-3172) Remove the duplicate JAR entries from the (“test.classpath”) to avoid command line exceeding char limit on windows
[ https://issues.apache.org/jira/browse/HIVE-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3172: --- Status: Patch Available (was: Open) Remove the duplicate JAR entries from the (“test.classpath”) to avoid command line exceeding char limit on windows --- Key: HIVE-3172 URL: https://issues.apache.org/jira/browse/HIVE-3172 Project: Hive Issue Type: Sub-task Components: Tests, Windows Affects Versions: 0.10.0 Environment: Windows Reporter: Kanna Karanam Assignee: Kanna Karanam Labels: Windows Fix For: 0.10.0 Attachments: HIVE-3172.1.patch.txt, HIVE-3172.2.patch.txt The maximum length of the DOS command string is 8191 characters (in the latest Windows versions; http://support.microsoft.com/kb/830473). The following entries in “build-common.xml” add a lot of duplicate JAR entries to the “test.classpath”, and it easily exceeds the max character limit on windows:
<!-- Include build/dist/lib on the classpath before Ivy and exclude hive jars from Ivy to make sure we get the local changes when we test Hive -->
<fileset dir="${build.dir.hive}/dist/lib" includes="*.jar" erroronmissingdir="false" excludes="**/hive_contrib*.jar,**/hive-contrib*.jar,**/lib*.jar"/>
<fileset dir="${hive.root}/build/ivy/lib/test" includes="*.jar" erroronmissingdir="false" excludes="**/hive_*.jar,**/hive-*.jar"/>
<fileset dir="${hive.root}/build/ivy/lib/default" includes="*.jar" erroronmissingdir="false" excludes="**/hive_*.jar,**/hive-*.jar"/>
<fileset dir="${hive.root}/testlibs" includes="*.jar"/>
Proposed solution (workaround): 1) Include all JARs from dist\lib, excluding **/hive_contrib*.jar,**/hive-contrib*.jar,**/lib*.jar 2) Select the specific (missing) jars from test/other folders (that includes Hadoop-*.jar files) Thanks -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3126) Generate & build the velocity based Hive tests on windows by fixing the path issues
[ https://issues.apache.org/jira/browse/HIVE-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3126: --- Status: Patch Available (was: Open) Generate & build the velocity based Hive tests on windows by fixing the path issues --- Key: HIVE-3126 URL: https://issues.apache.org/jira/browse/HIVE-3126 Project: Hive Issue Type: Bug Components: Tests Affects Versions: 0.9.0, 0.10.0, 0.9.1 Reporter: Kanna Karanam Assignee: Kanna Karanam Labels: Windows, test Fix For: 0.10.0 Attachments: HIVE-3126.1.patch.txt, HIVE-3126.2.patch.txt, HIVE-3126.3.patch.txt 1) Escape the backward slash in the canonical path if the unit test runs on windows. 2) Diff comparison – a. Ignore the extra spacing on windows; b. Ignore the different line endings on windows & Unix; c. Convert the file paths to windows specific (handle spaces etc.). 3) Set the right file scheme & class path separators while invoking the junit task from -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1399) Nested UDAFs cause Hive Internal Error (NullPointerException)
[ https://issues.apache.org/jira/browse/HIVE-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-1399: -- Status: Patch Available (was: Open) Nested UDAFs cause Hive Internal Error (NullPointerException) - Key: HIVE-1399 URL: https://issues.apache.org/jira/browse/HIVE-1399 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.6.0 Reporter: Mayank Lahiri Assignee: Adam Fokken Attachments: HIVE-1399.1.patch.txt This query does not make real-world sense, and I'm guessing it's not even supported by HQL/SQL, but I'm pretty sure that it shouldn't be causing an internal error with a NullPointerException. normal just has one column called val. I'm running on trunk, svn updated 5 minutes ago, ant clean package. SELECT percentile(val, percentile(val, 0.5)) FROM normal; FAILED: Hive Internal Error: java.lang.NullPointerException(null) java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:153) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:587) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:708) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:6241) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2301) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2860) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5002) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5524) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6055) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) I've also recreated this error with a GenericUDAF I'm writing, and also with the following: SELECT percentile(val, percentile()) FROM normal; SELECT avg(variance(dob_year)) FROM somedata; // this makes no sense, but still a NullPointerException -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1399) Nested UDAFs cause Hive Internal Error (NullPointerException)
[ https://issues.apache.org/jira/browse/HIVE-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-1399: -- Affects Version/s: (was: 0.6.0) 0.9.0 Nested UDAFs cause Hive Internal Error (NullPointerException) - Key: HIVE-1399 URL: https://issues.apache.org/jira/browse/HIVE-1399 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Reporter: Mayank Lahiri Assignee: Adam Fokken Attachments: HIVE-1399.1.patch.txt This query does not make real-world sense, and I'm guessing it's not even supported by HQL/SQL, but I'm pretty sure that it shouldn't be causing an internal error with a NullPointerException. normal just has one column called val. I'm running on trunk, svn updated 5 minutes ago, ant clean package. SELECT percentile(val, percentile(val, 0.5)) FROM normal; FAILED: Hive Internal Error: java.lang.NullPointerException(null) java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:153) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:587) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:708) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:6241) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2301) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2860) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5002) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5524) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6055) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) I've also recreated this error with a GenericUDAF I'm writing, and also with the following: SELECT percentile(val, percentile()) FROM normal; SELECT avg(variance(dob_year)) FROM somedata; // this makes no sense, but still a NullPointerException -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1399) Nested UDAFs cause Hive Internal Error (NullPointerException)
[ https://issues.apache.org/jira/browse/HIVE-1399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402762#comment-13402762 ] Edward Capriolo commented on HIVE-1399: --- All. This patch looks good. Let's not let this sit over an error message. {quote} Aggregate functions should not be nested and can not be in a clause other than a SELECT or HAVING clause. {quote} Nested UDAFs cause Hive Internal Error (NullPointerException) - Key: HIVE-1399 URL: https://issues.apache.org/jira/browse/HIVE-1399 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Reporter: Mayank Lahiri Assignee: Adam Fokken Attachments: HIVE-1399.1.patch.txt This query does not make real-world sense, and I'm guessing it's not even supported by HQL/SQL, but I'm pretty sure that it shouldn't be causing an internal error with a NullPointerException. normal just has one column called val. I'm running on trunk, svn updated 5 minutes ago, ant clean package. SELECT percentile(val, percentile(val, 0.5)) FROM normal; FAILED: Hive Internal Error: java.lang.NullPointerException(null) java.lang.NullPointerException at org.apache.hadoop.hive.ql.plan.ExprNodeGenericFuncDesc.newInstance(ExprNodeGenericFuncDesc.java:153) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:587) at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:708) at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:89) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:88) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:128) at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:102) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:6241) at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2301) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2860) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:5002) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5524) at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6055) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:126) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:304) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:377) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:303) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) I've also recreated this error with a GenericUDAF I'm writing, and also with the following: SELECT percentile(val, percentile()) FROM normal; SELECT avg(variance(dob_year)) FROM somedata; // this makes no sense, but still a NullPointerException -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop
[ https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402765#comment-13402765 ] Edward Capriolo commented on HIVE-3068: --- +1 I like this. Add a unit test and we are good to go. Add ability to export table metadata as JSON on table drop -- Key: HIVE-3068 URL: https://issues.apache.org/jira/browse/HIVE-3068 Project: Hive Issue Type: New Feature Components: Metastore, Serializers/Deserializers Reporter: Andrew Chalfant Assignee: Andrew Chalfant Priority: Minor Labels: features, newbie Original Estimate: 24h Remaining Estimate: 24h When a table is dropped, the contents go to the users trash but the metadata is lost. It would be super neat to be able to save the metadata as well so that tables could be trivially re-instantiated via thrift. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop
[ https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402767#comment-13402767 ] Edward Capriolo commented on HIVE-3068: --- We have another issue for import and export functions. Maybe you can borrow that code to dump the metadata instead of the JSON export you have done. Add ability to export table metadata as JSON on table drop -- Key: HIVE-3068 URL: https://issues.apache.org/jira/browse/HIVE-3068 Project: Hive Issue Type: New Feature Components: Metastore, Serializers/Deserializers Reporter: Andrew Chalfant Assignee: Andrew Chalfant Priority: Minor Labels: features, newbie Original Estimate: 24h Remaining Estimate: 24h When a table is dropped, the contents go to the users trash but the metadata is lost. It would be super neat to be able to save the metadata as well so that tables could be trivially re-instantiated via thrift. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
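For illustration, a minimal hand-rolled sketch of dumping table metadata as JSON on drop (the real feature would serialize the metastore's Thrift Table object, presumably with a proper JSON library; the class, method, and field layout here are all assumptions):

```java
// Sketch for HIVE-3068: render enough table metadata as JSON that the table
// could later be re-created. Only name and column name/type pairs are shown.
public class TableMetadataJson {
    /** columns is an array of {name, type} pairs, e.g. {{"val", "double"}}. */
    public static String toJson(String tableName, String[][] columns) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"table\":\"").append(tableName).append("\",\"columns\":[");
        for (int i = 0; i < columns.length; i++) {
            if (i > 0) sb.append(',');
            sb.append("{\"name\":\"").append(columns[i][0])
              .append("\",\"type\":\"").append(columns[i][1]).append("\"}");
        }
        return sb.append("]}").toString();
    }
}
```

A drop handler could write this string next to the trashed data, so the schema survives alongside the contents.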
[jira] [Updated] (HIVE-2544) Nullpointer on registering udfs.
[ https://issues.apache.org/jira/browse/HIVE-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-2544: -- Attachment: HIVE-2544.patch.2.txt Nullpointer on registering udfs. Key: HIVE-2544 URL: https://issues.apache.org/jira/browse/HIVE-2544 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Reporter: Bennie Schut Assignee: Bennie Schut Attachments: HIVE-2544.1.patch.txt, HIVE-2544.patch.2.txt Currently the Function registry can throw NullPointers when multiple threads are trying to register the same function. The normal put() will replace the existing registered function object even if it's exactly the same function. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2544) Nullpointer on registering udfs.
[ https://issues.apache.org/jira/browse/HIVE-2544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402776#comment-13402776 ] Edward Capriolo commented on HIVE-2544: --- I have included a patch to use Collections.synchronizedMap(); this should deal with the atomicity issues without changing the order of things. Nullpointer on registering udfs. Key: HIVE-2544 URL: https://issues.apache.org/jira/browse/HIVE-2544 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.9.0 Reporter: Bennie Schut Assignee: Bennie Schut Attachments: HIVE-2544.1.patch.txt, HIVE-2544.patch.2.txt
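A minimal sketch of the idea in the comment above (the class and method names here are illustrative, not Hive's actual FunctionRegistry): wrapping the registry map with Collections.synchronizedMap() makes each put/get atomic, and guarding compound check-then-put operations with a synchronized block on the map prevents two threads from racing to register the same UDF.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry sketch, assuming a name -> UDF-class mapping.
public class RegistrySketch {
    private static final Map<String, Class<?>> FUNCTIONS =
        Collections.synchronizedMap(new HashMap<String, Class<?>>());

    public static void registerUDF(String name, Class<?> udfClass) {
        // Compound check-then-put must hold the map's own monitor:
        // an already-registered function is never replaced.
        synchronized (FUNCTIONS) {
            if (!FUNCTIONS.containsKey(name)) {
                FUNCTIONS.put(name, udfClass);
            }
        }
    }

    public static Class<?> get(String name) {
        return FUNCTIONS.get(name);
    }

    public static int size() {
        return FUNCTIONS.size();
    }

    public static void main(String[] args) {
        registerUDF("my_upper", String.class);
        registerUDF("my_upper", Integer.class); // no-op: already registered
        System.out.println(size()); // prints 1
    }
}
```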
[jira] [Commented] (HIVE-3204) Windows: Fix the unit tests which contains “!cmd” commands (Unix shell commands)
[ https://issues.apache.org/jira/browse/HIVE-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402781#comment-13402781 ] Edward Capriolo commented on HIVE-3204: --- For this case we must be able to use dfs to do rm's. Windows: Fix the unit tests which contains “!cmd” commands (Unix shell commands) -- Key: HIVE-3204 URL: https://issues.apache.org/jira/browse/HIVE-3204 Project: Hive Issue Type: Sub-task Components: Tests, Windows Affects Versions: 0.10.0 Reporter: Kanna Karanam Assignee: Kanna Karanam Fix For: 0.10.0 Possible solution 1 (preferred): !Unix cmd | Windows cmd = keeping the same syntax. Hive uses the Java runtime to launch the shell command, so any attempt to run Windows commands on Unix will fail and vice versa. To deal with the unit tests, the Unix commands in each .q file will be modified as shown below, and the !commands that cannot run on the current platform will be filtered out. Original entry in the .q file: !rm -rf ../build/ql/test/data/exports/exim_department; It will be replaced with the following entries: UNIX::!rm -rf ../build/ql/test/data/exports/exim_department; WINDOWS::!del ../build/ql/test/data/exports/exim_department Possible solution 2: Provide a shell UDF library (Java-based code) to support platform-independent shell functionality. Cons: 1) Difficult to provide full shell functionality 2) Takes a long time 3) Difficult to manage
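The proposed UNIX::/WINDOWS:: prefix scheme can be sketched as a small filter (this is an illustrative sketch, not the actual patch; the class and method names are made up): before a "!" shell line from a .q file is executed, keep it only if it is untagged or tagged for the platform the test is running on.

```java
// Hypothetical filter for platform-prefixed "!" commands in .q test files.
public class PlatformFilter {
    static final boolean WINDOWS =
        System.getProperty("os.name").toLowerCase().startsWith("windows");

    /** Returns the command to run, or null if the line targets another platform. */
    public static String filter(String line) {
        if (line.startsWith("UNIX::")) {
            return WINDOWS ? null : line.substring("UNIX::".length());
        }
        if (line.startsWith("WINDOWS::")) {
            return WINDOWS ? line.substring("WINDOWS::".length()) : null;
        }
        return line; // untagged lines run everywhere
    }

    public static void main(String[] args) {
        System.out.println(
            filter("UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;"));
    }
}
```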
Re: Review Request: HIVE-3100 : Add HiveCLI that runs over JDBC
On June 26, 2012, 7:46 p.m., Carl Steinbach wrote: eclipse-templates/HiveBeeLine.launchtemplate, line 1 https://reviews.apache.org/r/5554/diff/1/?file=116339#file116339line1 Please add sqlline to the eclipse classpath template in eclipse-files/.classpath Done On June 26, 2012, 7:46 p.m., Carl Steinbach wrote: jdbc/ivy.xml, line 36 https://reviews.apache.org/r/5554/diff/1/?file=116342#file116342line36 This doesn't apply cleanly due to a change that went in a couple days ago. Needs a rebase. Done On June 26, 2012, 7:46 p.m., Carl Steinbach wrote: jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java, line 37 https://reviews.apache.org/r/5554/diff/1/?file=116343#file116343line37 Remove. Done On June 26, 2012, 7:46 p.m., Carl Steinbach wrote: jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java, line 77 https://reviews.apache.org/r/5554/diff/1/?file=116343#file116343line77 Remove? yes, old code. Removed On June 26, 2012, 7:46 p.m., Carl Steinbach wrote: jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java, line 113 https://reviews.apache.org/r/5554/diff/1/?file=116343#file116343line113 Remove. Done - Prasad --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5554/#review8633 --- On June 25, 2012, 6:52 p.m., Prasad Mujumdar wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5554/ --- (Updated June 25, 2012, 6:52 p.m.) Review request for hive and Carl Steinbach. Description --- New patch for supporting SQLLine as command line editor for Hive. LICENSE and NOTICE updates to mention SQLLine Build changes to add SQLLine java wrapper to translate the CLI arguments to SQLLine This addresses bug HIVE-3100. 
https://issues.apache.org/jira/browse/HIVE-3100 Diffs - LICENSE 05085da NOTICE 871fdde bin/ext/beeline.sh PRE-CREATION eclipse-templates/HiveBeeLine.launchtemplate PRE-CREATION ivy/ivysettings.xml fb6f4b8 ivy/libraries.properties 8461da1 jdbc/ivy.xml d8e3a5f jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java PRE-CREATION jdbc/src/java/org/apache/hive/jdbc/beeline/OptionsProcessor.java PRE-CREATION Diff: https://reviews.apache.org/r/5554/diff/ Testing --- Thanks, Prasad Mujumdar
Re: Review Request: HIVE-3100 : Add HiveCLI that runs over JDBC
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/5554/ --- (Updated June 28, 2012, 2:10 a.m.) Review request for hive and Carl Steinbach. Changes --- Updates per review comments. Description --- New patch for supporting SQLLine as command line editor for Hive. LICENSE and NOTICE updates to mention SQLLine Build changes to add SQLLine java wrapper to translate the CLI arguments to SQLLine This addresses bug HIVE-3100. https://issues.apache.org/jira/browse/HIVE-3100 Diffs (updated) - LICENSE 05085da NOTICE 871fdde bin/ext/beeline.sh PRE-CREATION eclipse-templates/.classpath 49e9140 eclipse-templates/HiveBeeLine.launchtemplate PRE-CREATION ivy/ivysettings.xml fb6f4b8 ivy/libraries.properties c3b53bf jdbc/ivy.xml 9269bd1 jdbc/src/java/org/apache/hive/jdbc/beeline/HiveBeeline.java PRE-CREATION jdbc/src/java/org/apache/hive/jdbc/beeline/OptionsProcessor.java PRE-CREATION Diff: https://reviews.apache.org/r/5554/diff/ Testing --- Thanks, Prasad Mujumdar
[jira] [Assigned] (HIVE-1626) stop using java.util.Stack
[ https://issues.apache.org/jira/browse/HIVE-1626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo reassigned HIVE-1626: - Assignee: Edward Capriolo (was: John Sichi) stop using java.util.Stack -- Key: HIVE-1626 URL: https://issues.apache.org/jira/browse/HIVE-1626 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.7.0 Reporter: John Sichi Assignee: Edward Capriolo We currently use Stack as part of the generic node walking library. Stack should not be used for this since its inheritance from Vector incurs superfluous synchronization overhead. Most projects end up adding an ArrayStack implementation and using that instead.
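To illustrate the point in the issue description: java.util.Stack extends Vector, so every push/pop acquires a monitor lock even in single-threaded node walking. The JDK's ArrayDeque offers the same LIFO operations without synchronization, a common drop-in replacement (a sketch, not the actual Hive change):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Unsynchronized LIFO stack via ArrayDeque, replacing java.util.Stack.
public class StackSketch {
    public static void main(String[] args) {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("root");   // like Stack.push
        stack.push("child");
        System.out.println(stack.pop());  // prints "child" (LIFO order)
        System.out.println(stack.peek()); // prints "root"
    }
}
```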
[jira] [Commented] (HIVE-3148) LOCATION clause is not honored when adding multiple partitions
[ https://issues.apache.org/jira/browse/HIVE-3148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402797#comment-13402797 ] Shengsheng Huang commented on HIVE-3148: @Carl @Namit @JQ I've attached the new patch with a test case and updated the request on Review Board. https://reviews.apache.org/r/5476/ Is there anything else I need to do? LOCATION clause is not honored when adding multiple partitions -- Key: HIVE-3148 URL: https://issues.apache.org/jira/browse/HIVE-3148 Project: Hive Issue Type: Bug Components: Metastore, Query Processor Affects Versions: 0.4.0, 0.5.0, 0.6.0, 0.7.0, 0.8.0, 0.9.0 Reporter: Carl Steinbach Labels: patch Attachments: 3148.for0.9.0.patch, HIVE-3148.1.patch.txt, HIVE-3148.2.patch
[jira] [Created] (HIVE-3206) Bucket mapjoin in trunk is not working
Navis created HIVE-3206: --- Summary: Bucket mapjoin in trunk is not working Key: HIVE-3206 URL: https://issues.apache.org/jira/browse/HIVE-3206 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Bucket mapjoin throws exception archiving stored hashtables. {noformat} hive set hive.optimize.bucketmapjoin = true; hive select /*+mapjoin(a)*/ a.key, a.value, b.value from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b on a.key=b.key; Total MapReduce jobs = 1 12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml Execution log at: /tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log 2012-06-28 12:36:18 Starting to launch local task to process map join; maximum memory = 932118528 2012-06-28 12:36:18 Processing rows:153 Hashtable size: 153 Memory usage: 1771376 rate: 0.002 2012-06-28 12:36:18 Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable 2012-06-28 12:36:18 Upload 1 File to: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable File size: 9644 2012-06-28 12:36:19 Processing rows:309 Hashtable size: 156 Memory usage: 1844568 rate: 0.002 2012-06-28 12:36:19 Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable 2012-06-28 12:36:19 Upload 1 File to: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable File size: 10023 2012-06-28 12:36:19 End of local task; Time Taken: 0.773 sec. Execution completed successfully Mapred Local Task Succeeded . Convert the Join into MapJoin Mapred Local Task Succeeded . 
Convert the Join into MapJoin Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator java.io.IOException: This archives contains unclosed entries. at org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214) at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391) at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) Job Submission failed with exception 'java.io.IOException(This archives contains unclosed entries.)' java.lang.IllegalArgumentException: Can not create a Path from an empty string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82) at org.apache.hadoop.fs.Path.init(Path.java:90) at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380) at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:193) at 
org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460) at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944) at
[jira] [Commented] (HIVE-3108) SELECT count(DISTINCT col) ... returns 0 if col is a partition column
[ https://issues.apache.org/jira/browse/HIVE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402814#comment-13402814 ] Edward Capriolo commented on HIVE-3108: --- Confirmed fixed in trunk. {noformat} [edward@tablitha dist]$ bin/hive Logging initialized using configuration in jar:file:/home/edward/hive/trunk/build/dist/lib/hive-common-0.10.0-SNAPSHOT.jar!/hive-log4j.properties Hive history file=/tmp/edward/hive_job_log_edward_201206272349_386020253.txt hive create table stocks (x int, y string) partitioned by (exchange string, symbol string); OK Time taken: 17.382 seconds hive alter table stocks add partition (exchange='nasdaq', symbol='ed'); OK Time taken: 2.022 seconds hive alter table stocks add partition (exchange='nasdaq', symbol='guy'); OK Time taken: 0.219 seconds hive alter table stocks add partition (exchange='jp', symbol='bla'); OK Time taken: 0.245 seconds hive select count(distinct exchange), count(distinct symbol) from stocks; 2 3 Time taken: 5.742 seconds {noformat} SELECT count(DISTINCT col) ... returns 0 if col is a partition column --- Key: HIVE-3108 URL: https://issues.apache.org/jira/browse/HIVE-3108 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.8.0, 0.9.0 Environment: Mac OSX running Apache distribution of hadoop and hive natively. Reporter: Dean Wampler Labels: Hive Suppose stocks is a managed OR external table, partitioned by exchange and symbol. count(DISTINCT x) returns 0 for either exchange, symbol, or both: hive SELECT count(DISTINCT exchange), count(DISTINCT symbol) from stocks; 0 0
[jira] [Resolved] (HIVE-3108) SELECT count(DISTINCT col) ... returns 0 if col is a partition column
[ https://issues.apache.org/jira/browse/HIVE-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo resolved HIVE-3108. --- Resolution: Duplicate SELECT count(DISTINCT col) ... returns 0 if col is a partition column --- Key: HIVE-3108 URL: https://issues.apache.org/jira/browse/HIVE-3108 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.8.0, 0.9.0 Environment: Mac OSX running Apache distribution of hadoop and hive natively. Reporter: Dean Wampler Labels: Hive
[jira] [Updated] (HIVE-3206) Bucket mapjoin in trunk is not working
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-3206: Status: Patch Available (was: Open) https://reviews.facebook.net/D3861 The working directory can differ from the temp directory for MapredLocalWork (intentional?), so I prepended the temp directory as the parent of the stored files, with some error handling. Bucket mapjoin in trunk is not working --- Key: HIVE-3206 URL: https://issues.apache.org/jira/browse/HIVE-3206 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Bucket mapjoin throws exception archiving stored hashtables.
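The gist of the fix described in the patch comment can be sketched as follows (an illustrative sketch using java.io.File rather than Hadoop's Path; the class and method names are made up): resolving each hashtable dump file against the MapredLocalWork temp directory, instead of relying on the process working directory, guarantees every stored file has a well-formed parent path.

```java
import java.io.File;

// Hypothetical sketch: anchor dump files under the local-work temp directory.
public class TmpDirResolve {
    static File resolve(File tmpDir, String dumpFileName) {
        // new File(parent, child) always yields a path rooted at tmpDir,
        // regardless of the current working directory.
        return new File(tmpDir, dumpFileName);
    }

    public static void main(String[] args) {
        File f = resolve(new File("/tmp/hive-local-10002"),
                         "MapJoin-a-00-srcbucket22.txt.hashtable");
        System.out.println(f.getPath());
    }
}
```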
[jira] [Commented] (HIVE-3206) Bucket mapjoin in trunk is not working
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402835#comment-13402835 ] Edward Capriolo commented on HIVE-3206: --- Did you very-clean? The tests look ok. [edward@tablitha trunk]$ ant test -Dtestcase=TestCliDriver -Dqfile=bucketmapjoin1.q BUILD SUCCESSFUL Total time: 4 minutes 18 seconds Bucket mapjoin in trunk is not working --- Key: HIVE-3206 URL: https://issues.apache.org/jira/browse/HIVE-3206 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Bucket mapjoin throws exception archiving stored hashtables.
[jira] [Commented] (HIVE-3206) Bucket mapjoin in trunk is not working
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402840#comment-13402840 ] Edward Capriolo commented on HIVE-3206: --- @Navis is there any way we can unit test here? This slipped through the unit testing. Bucket mapjoin in trunk is not working --- Key: HIVE-3206 URL: https://issues.apache.org/jira/browse/HIVE-3206 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Bucket mapjoin throws exception archiving stored hashtables.
[jira] [Commented] (HIVE-3206) Bucket mapjoin in trunk is not working
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402843#comment-13402843 ] Edward Capriolo commented on HIVE-3206: --- Please ignore here. There is not much that can be done about this one unit test wise. +1 Bucket mapjoin in trunk is not working --- Key: HIVE-3206 URL: https://issues.apache.org/jira/browse/HIVE-3206 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Bucket mapjoin throws exception archiving stored hashtables. {noformat} hive set hive.optimize.bucketmapjoin = true; hive select /*+mapjoin(a)*/ a.key, a.value, b.value from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b on a.key=b.key; Total MapReduce jobs = 1 12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml Execution log at: /tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log 2012-06-28 12:36:18 Starting to launch local task to process map join; maximum memory = 932118528 2012-06-28 12:36:18 Processing rows:153 Hashtable size: 153 Memory usage: 1771376 rate: 0.002 2012-06-28 12:36:18 Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable 2012-06-28 12:36:18 Upload 1 File to: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable File size: 9644 2012-06-28 12:36:19 Processing rows:309 Hashtable size: 156 Memory usage: 1844568 rate: 0.002 2012-06-28 12:36:19 Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable 2012-06-28 12:36:19 Upload 1 File to: 
file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable File size: 10023 2012-06-28 12:36:19 End of local task; Time Taken: 0.773 sec. Execution completed successfully Mapred Local Task Succeeded . Convert the Join into MapJoin Mapred Local Task Succeeded . Convert the Join into MapJoin Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator java.io.IOException: This archives contains unclosed entries. at org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214) at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391) at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:186) Job Submission failed with exception 'java.io.IOException(This archives contains unclosed entries.)' java.lang.IllegalArgumentException: 
Can not create a Path from an empty string at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82) at org.apache.hadoop.fs.Path.&lt;init&gt;(Path.java:90) at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380) at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:193) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460) at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134) at
[jira] [Updated] (HIVE-3206) Bucket mapjoin in trunk is not working
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-3206: -- Attachment: hive-3206.1.patch.txt
[jira] [Updated] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-3206: -- Summary: FileUtils.tar assumes wrong directory in some cases (was: Bucket mapjoin in trunk is not working)
[jira] [Updated] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo updated HIVE-3206: -- Resolution: Fixed Status: Resolved (was: Patch Available) Resolved. Thanks Navis.
[jira] [Resolved] (HIVE-1162) Distribute ant test to speed process up
[ https://issues.apache.org/jira/browse/HIVE-1162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edward Capriolo resolved HIVE-1162. --- Resolution: Won't Fix Distribute ant test to speed process up --- Key: HIVE-1162 URL: https://issues.apache.org/jira/browse/HIVE-1162 Project: Hive Issue Type: New Feature Components: Build Infrastructure Reporter: Edward Capriolo Assignee: Edward Capriolo Priority: Minor Attachments: distributed_hive_test.sh Hive runs atop map/reduce. Running our unit tests via map/reduce would be faster and would speed up our development. Ideally we would want an ant target to which we could supply a hadoop binary. Then we would distribute the build dir and possibly ant via the distributed cache, and finally run all tests in parallel. My proof of concept uses bash + ssh public keys. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3204) Windows: Fix the unit tests which contains “!cmd” commands (Unix shell commands)
[ https://issues.apache.org/jira/browse/HIVE-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402850#comment-13402850 ] Kanna Karanam commented on HIVE-3204: - WOW.. Thanks Edward. You simplified the problem very quickly. I will start with this approach. I noticed that our unit tests are using "!rm, !ls, !mkdir, !sleep" shell commands as of today, so I can easily replace all of them using either DFS or java_method. But I am worried that it leaves me with no unit testing for the "!cmd" command. I can work around this by adding platform-specific unit tests for this command (one for Windows and another one for Unix). Do you have any suggestions? Thanks Windows: Fix the unit tests which contains "!cmd" commands (Unix shell commands) -- Key: HIVE-3204 URL: https://issues.apache.org/jira/browse/HIVE-3204 Project: Hive Issue Type: Sub-task Components: Tests, Windows Affects Versions: 0.10.0 Reporter: Kanna Karanam Assignee: Kanna Karanam Fix For: 0.10.0 Possible solution 1 (preferred): !Unix cmd | Windows cmd = keeping the same syntax. Hive uses the Java runtime to launch the shell command, so any attempt to run Windows commands on Unix will fail and vice versa. To deal with unit tests, Unix commands in each .q file will be modified as shown below; I will filter out the !commands which can't be run on the current platform. Original entry in .q file: !rm -rf ../build/ql/test/data/exports/exim_department; It will be replaced with the following entries. UNIX::!rm -rf ../build/ql/test/data/exports/exim_department; WINDOWS::!del ../build/ql/test/data/exports/exim_department Possible solution 2: Provide a shell UDF library (Java-based code) to support platform-independent shell functionality. Cons: 1) Difficult to provide full shell functionality 2) Takes a long time 3) Difficult to manage
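The UNIX::/WINDOWS:: prefix scheme proposed above amounts to a line filter applied when a .q file is read: keep unprefixed lines everywhere, keep prefixed lines only on the matching platform, and strip the prefix. A minimal hypothetical sketch of that idea (the class and method names are illustrative, not from any Hive patch):

```java
import java.util.ArrayList;
import java.util.List;

public class PlatformFilter {
    // Keep only the commands that apply to the current platform,
    // stripping the UNIX:: / WINDOWS:: prefix when present.
    public static List<String> filter(String[] lines, boolean isUnix) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            if (line.startsWith("UNIX::")) {
                if (isUnix) out.add(line.substring("UNIX::".length()));
            } else if (line.startsWith("WINDOWS::")) {
                if (!isUnix) out.add(line.substring("WINDOWS::".length()));
            } else {
                out.add(line); // unprefixed lines run on every platform
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] q = {
            "UNIX::!rm -rf ../build/ql/test/data/exports/exim_department;",
            "WINDOWS::!del ../build/ql/test/data/exports/exim_department",
            "select 1;"
        };
        // On Unix only the !rm line and the plain query survive.
        System.out.println(filter(q, true));
    }
}
```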
[jira] [Updated] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan updated HIVE-3206: --- Fix Version/s: 0.10.0
[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402853#comment-13402853 ] Navis commented on HIVE-3206: - Committed this fast? o.o I missed closing the input files. Could you change the fix if not yet committed?
{code}
IOUtils.copy(new FileInputStream(f), tOut); // copy with 8K buffer, does not close the stream
--
FileInputStream input = new FileInputStream(f);
try {
  IOUtils.copy(input, tOut); // copy with 8K buffer, does not close the stream
} finally {
  input.close();
}
{code}
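Navis's fix wraps the copy in try/finally so the input stream is always closed even when the copy throws; on Java 7+ the same guarantee can be written with try-with-resources. A self-contained sketch of the pattern, with in-memory streams standing in for the hashtable files and a hand-rolled copy standing in for commons-compress `IOUtils.copy` (none of these names are from the actual patch):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class CopyClose {
    // Copy with an 8K buffer, mirroring what IOUtils.copy does;
    // deliberately closes neither stream itself.
    public static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // try-with-resources closes the input even if copy throws,
        // which is the leak the try/finally patch plugs.
        try (InputStream in = new ByteArrayInputStream(
                "hashtable bytes".getBytes(StandardCharsets.UTF_8))) {
            copy(in, sink);
        }
        System.out.println(sink.toString("UTF-8"));
    }
}
```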
[jira] [Commented] (HIVE-2498) Group by operator doesn't estimate size of Timestamp Binary data correctly
[ https://issues.apache.org/jira/browse/HIVE-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402855#comment-13402855 ] Edward Capriolo commented on HIVE-2498: --- @Ashutosh this looks good to me. Any changes you would like since it has been a while? Group by operator doesn't estimate size of Timestamp Binary data correctly --- Key: HIVE-2498 URL: https://issues.apache.org/jira/browse/HIVE-2498 Project: Hive Issue Type: Bug Affects Versions: 0.8.0, 0.8.1, 0.9.0 Reporter: Ashutosh Chauhan Assignee: Ashutosh Chauhan Attachments: HIVE-2498.D1185.1.patch, hive-2498.patch, hive-2498_1.patch It currently falls through to the default case and returns a constant value, whereas we can do better by getting the actual size at runtime.
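The fix described above amounts to adding type-specific branches where the size estimator previously fell through to a constant. A hypothetical sketch of that idea (the class, method, and the fallback constant are illustrative, not Hive's actual code), assuming binary data arrives as byte[] and timestamps as java.sql.Timestamp:

```java
import java.sql.Timestamp;

public class SizeEstimator {
    static final int DEFAULT_AVG_SIZE = 16; // illustrative fallback constant

    // Estimate the footprint of a group-by key/value at runtime
    // instead of returning one fixed constant for every type.
    public static int estimate(Object o) {
        if (o instanceof byte[]) {
            return ((byte[]) o).length;       // actual payload length
        } else if (o instanceof Timestamp) {
            return 8 + 4;                     // millis (long) + nanos (int)
        } else if (o instanceof String) {
            return ((String) o).length() * 2; // UTF-16 chars
        }
        return DEFAULT_AVG_SIZE;              // old behavior: constant for everything
    }

    public static void main(String[] args) {
        System.out.println(estimate(new byte[100]));     // 100
        System.out.println(estimate(new Timestamp(0L))); // 12
        System.out.println(estimate("abc"));             // 6
    }
}
```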
[jira] [Commented] (HIVE-3204) Windows: Fix the unit tests which contains “!cmd” commands (Unix shell commands)
[ https://issues.apache.org/jira/browse/HIVE-3204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402858#comment-13402858 ] Edward Capriolo commented on HIVE-3204: --- Really we cannot cover '!' since we are totally relying on something external. We could do: {noformat} ! ${env:JAVA_HOME}/bin/java -jar SomeClassWeMade arg[1] arg[2] {noformat} Or we could add these things directly to the Hive language. In the end we might really need your WINDOWS::/UNIX:: approach; it's not a problem worth losing much sleep over.
[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402859#comment-13402859 ] Kanna Karanam commented on HIVE-3206: - @Navis - Would it be possible to update/add a unit test? It looks like some settings are missing in the .q file. Thanks
[jira] [Reopened] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo reopened HIVE-3206:
-----------------------------------

FileUtils.tar assumes wrong directory in some cases
---------------------------------------------------

                Key: HIVE-3206
                URL: https://issues.apache.org/jira/browse/HIVE-3206
            Project: Hive
         Issue Type: Bug
         Components: Query Processor
   Affects Versions: 0.10.0
           Reporter: Navis
           Assignee: Navis
            Fix For: 0.10.0
        Attachments: hive-3206.1.patch.txt

Bucket mapjoin throws an exception while archiving the stored hashtables.
{noformat}
hive> set hive.optimize.bucketmapjoin = true;
hive> select /*+mapjoin(a)*/ a.key, a.value, b.value from srcbucket_mapjoin_part a join srcbucket_mapjoin_part_2 b on a.key=b.key;
Total MapReduce jobs = 1
12/06/28 12:36:18 WARN conf.HiveConf: DEPRECATED: Ignoring hive-default.xml found on the CLASSPATH at /home/navis/hive/conf/hive-default.xml
Execution log at: /tmp/navis/navis_20120628123636_5298a863-605c-4b98-bbb3-0a132c85c5a3.log
2012-06-28 12:36:18	Starting to launch local task to process map join; maximum memory = 932118528
2012-06-28 12:36:18	Processing rows: 153	Hashtable size: 153	Memory usage: 1771376	rate: 0.002
2012-06-28 12:36:18	Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable
2012-06-28 12:36:18	Upload 1 File to: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket22.txt.hashtable File size: 9644
2012-06-28 12:36:19	Processing rows: 309	Hashtable size: 156	Memory usage: 1844568	rate: 0.002
2012-06-28 12:36:19	Dump the hashtable into file: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable
2012-06-28 12:36:19	Upload 1 File to: file:/tmp/navis/hive_2012-06-28_12-36-17_003_3016196240171705142/-local-10002/HashTable-Stage-1/MapJoin-a-00-srcbucket23.txt.hashtable File size: 10023
2012-06-28 12:36:19	End of local task; Time Taken: 0.773 sec.
Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
Mapred Local Task Succeeded . Convert the Join into MapJoin
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: This archives contains unclosed entries.
	at org.apache.commons.compress.archivers.tar.TarArchiveOutputStream.finish(TarArchiveOutputStream.java:214)
	at org.apache.hadoop.hive.common.FileUtils.tar(FileUtils.java:276)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:391)
	at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1324)
	at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1110)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:944)
	at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
	at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:744)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:607)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Job Submission failed with exception 'java.io.IOException(This archives contains unclosed entries.)'
java.lang.IllegalArgumentException: Can not create a Path from an empty string
	at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)
	at org.apache.hadoop.fs.Path.<init>(Path.java:90)
	at org.apache.hadoop.hive.ql.exec.Utilities.getHiveJobID(Utilities.java:380)
	at org.apache.hadoop.hive.ql.exec.Utilities.clearMapRedWork(Utilities.java:193)
	at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:460)
	at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:137)
	at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:134)
	at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
	at
{noformat}
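The first exception in the log comes from the commons-compress contract: every entry started with `putArchiveEntry()` must be paired with `closeArchiveEntry()` before `finish()` is called, otherwise `finish()` throws the "This archives contains unclosed entries." `IOException` seen above. A minimal sketch of the required lifecycle (hypothetical file name; assumes commons-compress on the classpath):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;

public class TarEntryLifecycle {
    public static void main(String[] args) throws IOException {
        try (TarArchiveOutputStream tar =
                 new TarArchiveOutputStream(new FileOutputStream("out.tar"))) {
            File f = new File("srcbucket22.txt.hashtable"); // hypothetical input
            tar.putArchiveEntry(new TarArchiveEntry(f, f.getName()));
            // ... copy the file's bytes into `tar` here ...
            tar.closeArchiveEntry(); // skipping this makes finish() throw
                                     // "This archives contains unclosed entries."
            tar.finish();
        }
    }
}
```

The stack trace places the failure inside `FileUtils.tar`, which suggests at least one entry was added without a matching `closeArchiveEntry()` call.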
[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402861#comment-13402861 ]

Edward Capriolo commented on HIVE-3206:
---------------------------------------

@Navis please supply a patch to place on top of this one. Or we can revert the change and supply a full patch.
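The trailing `IllegalArgumentException` in the HIVE-3206 log is a secondary symptom: Hadoop's `Path` constructor rejects empty strings, and `Utilities.getHiveJobID` evidently reached it with an empty plan path during cleanup. A hypothetical re-creation of the guard (the real check lives in `org.apache.hadoop.fs.Path.checkPathArg`):

```java
public class PathArgCheck {
    // Hypothetical re-creation of the validation that produces the
    // "Can not create a Path from an empty string" error in the log.
    static void checkPathArg(String path) {
        if (path == null) {
            throw new IllegalArgumentException("Can not create a Path from a null string");
        }
        if (path.length() == 0) {
            throw new IllegalArgumentException("Can not create a Path from an empty string");
        }
    }

    public static void main(String[] args) {
        try {
            checkPathArg("");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // Can not create a Path from an empty string
        }
    }
}
```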
[jira] [Created] (HIVE-3207) FileUtils.tar does not close input files
Navis created HIVE-3207:
---------------------------

             Summary: FileUtils.tar does not close input files
                 Key: HIVE-3207
                 URL: https://issues.apache.org/jira/browse/HIVE-3207
             Project: Hive
          Issue Type: Sub-task
          Components: Query Processor
    Affects Versions: 0.10.0
            Reporter: Navis
            Assignee: Navis
            Priority: Trivial

It should close input files too. I missed this in HIVE-3206. (sorry)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
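The fix HIVE-3207 describes is the standard resource-management pattern: each input stream opened while copying a file into the archive must be closed even if the copy fails. A stdlib-only sketch of the intended shape (hypothetical helper; the actual change is inside `FileUtils.tar`):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class CloseInputsSketch {
    // Copy each input file into `archive`, closing every input stream.
    // try-with-resources closes `in` even if write() throws, which is
    // exactly the leak HIVE-3207 is about.
    static void copyAll(List<Path> inputs, OutputStream archive) throws IOException {
        byte[] buf = new byte[8192];
        for (Path p : inputs) {
            try (InputStream in = Files.newInputStream(p)) {
                int n;
                while ((n = in.read(buf)) != -1) {
                    archive.write(buf, 0, n);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("hashtable", ".txt");
        Files.write(tmp, "12345".getBytes());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyAll(List.of(tmp), out);
        System.out.println(out.size()); // 5
        Files.delete(tmp);
    }
}
```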
[jira] [Updated] (HIVE-3207) FileUtils.tar does not close input files
[ https://issues.apache.org/jira/browse/HIVE-3207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-3207:
------------------------

    Status: Patch Available  (was: Open)

https://reviews.facebook.net/D3867
[jira] [Commented] (HIVE-3206) FileUtils.tar assumes wrong directory in some cases
[ https://issues.apache.org/jira/browse/HIVE-3206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13402878#comment-13402878 ]

Navis commented on HIVE-3206:
-----------------------------

@Kanna Karanam - Some problems are not shown by the Hive test framework; I don't know why.
@Edward Capriolo - Added HIVE-3207. Sorry for the inconvenience.