[jira] [Commented] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435843#comment-13435843 ]

Namit Jain commented on HIVE-3029:
----------------------------------

@Carl, can you take care of this? If you are busy, I can wrap it up - let me know.

> Update ShimLoader to work with Hadoop 2.x
> -----------------------------------------
>
>          Key: HIVE-3029
>          URL: https://issues.apache.org/jira/browse/HIVE-3029
>      Project: Hive
>   Issue Type: Bug
>   Components: Shims
>     Reporter: Carl Steinbach
>     Assignee: Carl Steinbach
>  Attachments: HIVE-3029.D3255.1.patch

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3380) As a follow up for HIVE-3276, optimize union for dynamic partition queries
[ https://issues.apache.org/jira/browse/HIVE-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3380:
-----------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Will do as part of HIVE-3276

> As a follow up for HIVE-3276, optimize union for dynamic partition queries
> --------------------------------------------------------------------------
>
>          Key: HIVE-3380
>          URL: https://issues.apache.org/jira/browse/HIVE-3380
>      Project: Hive
>   Issue Type: Improvement
>     Reporter: Namit Jain
>     Assignee: Namit Jain
Hive-trunk-h0.21 - Build # 1610 - Still Failing
Changes for Build #1606
[cws] HIVE-2804. Task log retrieval fails on Hadoop 0.23 (Zhenxiao Luo via cws)

Changes for Build #1607
[cws] HIVE-3337. Create Table Like should copy configured Table Parameters (Bhushan Mandhani via cws)

Changes for Build #1608

Changes for Build #1609
[hashutosh] HIVE-3385 : fixing 0.23 test build (Sushanth Sowmyan via Ashutosh Chauhan)

Changes for Build #1610

No tests ran.

The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1610)

Status: Still Failing

Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1610/ to view the results.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false #107
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/107/

[...truncated 10116 lines...]
     [echo] Project: odbc
     [copy] Warning: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/odbc/src/conf does not exist.

ivy-resolve-test:
     [echo] Project: odbc

ivy-retrieve-test:
     [echo] Project: odbc

compile-test:
     [echo] Project: odbc

create-dirs:
     [echo] Project: serde
     [copy] Warning: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/serde/src/test/resources does not exist.

init:
     [echo] Project: serde

ivy-init-settings:
     [echo] Project: serde

ivy-resolve:
     [echo] Project: serde
[ivy:resolve] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-serde-default.xml to /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-serde-default.html

ivy-retrieve:
     [echo] Project: serde

dynamic-serde:

compile:
     [echo] Project: serde

ivy-resolve-test:
     [echo] Project: serde

ivy-retrieve-test:
     [echo] Project: serde

compile-test:
     [echo] Project: serde
    [javac] Compiling 26 source files to /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/serde/test/classes
    [javac] Note: Some input files use or override a deprecated API.
    [javac] Note: Recompile with -Xlint:deprecation for details.
    [javac] Note: Some input files use unchecked or unsafe operations.
    [javac] Note: Recompile with -Xlint:unchecked for details.

create-dirs:
     [echo] Project: service
     [copy] Warning: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/service/src/test/resources does not exist.
init:
     [echo] Project: service

ivy-init-settings:
     [echo] Project: service

ivy-resolve:
     [echo] Project: service
[ivy:resolve] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-service-default.xml to /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-service-default.html

ivy-retrieve:
     [echo] Project: service

compile:
     [echo] Project: service

ivy-resolve-test:
     [echo] Project: service

ivy-retrieve-test:
     [echo] Project: service

compile-test:
     [echo] Project: service
    [javac] Compiling 2 source files to /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/service/test/classes

test:
     [echo] Project: hive

test-shims:
     [echo] Project: hive

test-conditions:
     [echo] Project: shims

gen-test:
     [echo] Project: shims

create-dirs:
     [echo] Project: shims
     [copy] Warning: /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/test/resources does not exist.
init:
     [echo] Project: shims

ivy-init-settings:
     [echo] Project: shims

ivy-resolve:
     [echo] Project: shims
[ivy:resolve] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml
[ivy:report] Processing /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/resolution-cache/org.apache.hive-hive-shims-default.xml to /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/ivy/report/org.apache.hive-hive-shims-default.html

ivy-retrieve:
     [echo] Project: shims

compile:
     [echo] Project: shims
     [echo] Building shims 0.20

build_shims:
     [echo] Project: shims
     [echo] Compiling /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/common/java;/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/shims/src/0.20/java against hadoop 0.20.2 (/x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/build/hadoopcore/hadoop-0.20.2)

ivy-init-settings:
     [echo] Project: shims

ivy-resolve-hadoop-shim:
     [echo] Project: shims
[ivy:resolve] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-0.9.1-SNAPSHOT-h0.21-keepgoing=false/hive/ivy/ivysettings.xml

ivy-retrieve-hadoop-shim:
     [echo] Project: shims
     [echo] Building shims 0.20S

build_shims:
     [echo] Project: shims
     [echo] Compiling
Re: Problem with Hive Indexing
Hi,

At least, the table size must be greater than 5GB for the index to be used for filter pushdown. Otherwise you have to comment out the checkQuerySize method.

Cheers,
Mahsa

On Mon, Jul 30, 2012 at 11:12 AM, Ablimit Aji abli...@gmail.com wrote:

I have written a custom index handler and wanted to test it. However, Hive is not using it. So I tested with the simple table (pokes (int foo, string bar)) which comes with the Hive distribution for testing purposes. Then I created a compact index and set hive.optimize.index.filter=true. However, upon checking the log info, it seems Hive is still not using the index. So, what is the problem?

The query I issued is as follows:

select foo from pokes WHERE foo=498;

Below is the log info I got after issuing the query:

12/07/26 12:25:17 INFO index.IndexWhereProcessor: Processing predicate for index optimization
12/07/26 12:25:17 INFO index.IndexWhereProcessor: (foo = 498)
12/07/26 12:25:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=pokes_idx
12/07/26 12:25:17 INFO hive.log: DDL: struct pokes_idx { i32 foo, string _bucketname, list _offsets}
12/07/26 12:25:17 INFO index.IndexWhereProcessor: checking index staleness...
12/07/26 12:25:17 INFO index.IndexWhereProcessor: 1342465077455
12/07/26 12:25:17 INFO index.IndexWhereProcessor: 1342465077455
12/07/26 12:25:17 INFO util.NativeCodeLoader: Loaded the native-hadoop library
12/07/26 12:25:17 WARN snappy.LoadSnappy: Snappy native library not loaded
Re: Problem with Hive Indexing
Thanks, Mahsa! I didn't know that there is such a constraint.

Best,
Ablimit

On Thu, Aug 16, 2012 at 12:32 PM, Mahsa Mofidpoor mofidp...@gmail.com wrote:

Hi,

At least, the table size must be greater than 5GB for the index to be used for filter pushdown. Otherwise you have to comment out the checkQuerySize method.

Cheers,
Mahsa
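For reference, the workflow being debugged in this thread can be sketched in HiveQL. This is a hedged sketch against Hive 0.7-era compact indexing, not taken from the thread itself: the `pokes` table ships with the Hive distribution, and the 5GB floor Mahsa mentions corresponds (as far as I can tell) to the `hive.optimize.index.filter.compact.minsize` property, so small test tables are skipped even when everything else is set up correctly.

```sql
-- Build a compact index on the test table (Hive 0.7+ CREATE INDEX syntax).
CREATE INDEX pokes_idx ON TABLE pokes (foo)
  AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
  WITH DEFERRED REBUILD;

-- Populate the index table.
ALTER INDEX pokes_idx ON pokes REBUILD;

-- Ask the optimizer to rewrite eligible predicates against the index.
SET hive.optimize.index.filter=true;

-- Assumption: the minimum-input-size gate discussed above; lowering it
-- lets a small table like pokes qualify for index-based filtering.
SET hive.optimize.index.filter.compact.minsize=0;

SELECT foo FROM pokes WHERE foo = 498;
```

Lowering the minsize property is less invasive than commenting out checkQuerySize, if the property is honored in the version being tested.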
[jira] [Created] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle
Gang Tim Liu created HIVE-3390:
----------------------------------

             Summary: Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle
                 Key: HIVE-3390
                 URL: https://issues.apache.org/jira/browse/HIVE-3390
             Project: Hive
          Issue Type: New Feature
            Reporter: Gang Tim Liu

This is a follow-up for HIVE-3072. We need upgrade scripts for Derby, Postgres, and Oracle.
[jira] [Updated] (HIVE-3268) expressions in cluster by are not working
[ https://issues.apache.org/jira/browse/HIVE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kevin Wilfong updated HIVE-3268:
--------------------------------
       Resolution: Fixed
    Fix Version/s: 0.10.0
           Status: Resolved  (was: Patch Available)

Committed, thanks Namit.

> expressions in cluster by are not working
> -----------------------------------------
>
>          Key: HIVE-3268
>          URL: https://issues.apache.org/jira/browse/HIVE-3268
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>      Fix For: 0.10.0
>  Attachments: hive.3268.1.patch, hive.3268.2.patch
>
> The following query fails:
> select key+key, value from src cluster by key+key, value;
[jira] [Commented] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436105#comment-13436105 ]

Kevin Wilfong commented on HIVE-3276:
-------------------------------------

Namit, is this ready for review? You mention that more tests need to be added, but the JIRA is marked Patch Available.

> optimize union sub-queries
> --------------------------
>
>          Key: HIVE-3276
>          URL: https://issues.apache.org/jira/browse/HIVE-3276
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: HIVE-3276.1.patch, hive.3276.2.patch
>
> It might be a good idea to optimize simple union queries containing map-reduce jobs in at least one of the sub-queries. For eg: a query like:
>
> insert overwrite table T1 partition P1 select * from ( subq1 union all subq2 ) u;
>
> today creates 3 map-reduce jobs: one for subq1, another for subq2, and the final one for the union. It might be a good idea to optimize this. Instead of creating the union task, it might be simpler to create a move task (or something like a move task), where the outputs of the two sub-queries are moved to the final directory. This can easily extend to more than 2 sub-queries in the union. This is only useful if there is a select * followed by a filesink after the union. This can be independently useful, and can also be used to optimize skewed joins: https://cwiki.apache.org/Hive/skewed-join-optimization.html
[jira] [Commented] (HIVE-138) Provide option to export a HEADER
[ https://issues.apache.org/jira/browse/HIVE-138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436115#comment-13436115 ]

Andrew Perepelytsya commented on HIVE-138:
------------------------------------------

Adam, it doesn't look like the 'with header' syntax was implemented as described originally, but I checked the patch diff, and setting this option did work for me in 0.7.x:

{code}set hive.cli.print.header=true;{code}

> Provide option to export a HEADER
> ---------------------------------
>
>          Key: HIVE-138
>          URL: https://issues.apache.org/jira/browse/HIVE-138
>      Project: Hive
>   Issue Type: Improvement
>   Components: Clients, Query Processor
>     Reporter: Adam Kramer
>     Assignee: Paul Butler
>     Priority: Minor
>      Fix For: 0.7.0
>  Attachments: HIVE-138.patch
>
> When writing data to directories or files for later analysis, or when exploring data in the hive CLI with raw SELECT statements, it'd be great if we could get a header or something so we know which columns our output comes from. Any chance this is easy to add? Just print the column names (or formula used to generate them) in the first row?
>
> SELECT foo.* WITH HEADER FROM some_table foo limit 3;
> col1  col2  col3
> 1     9     6
> 7     5     0
> 7     5     3
>
> SELECT f.col1-f.col2, col3 WITH HEADER FROM some_table foo limit 3;
> f.col1-f.col2  col3
> -8             6
> 2              0
> 2              3
>
> ...etc
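The setting Andrew confirms above can be exercised directly in a CLI session. A minimal sketch (table and column names are illustrative, reusing the `pokes` sample table that ships with Hive):

```sql
-- Print column headers before result rows in the Hive CLI.
SET hive.cli.print.header=true;

-- The first output line now carries the column names, e.g. "foo  bar",
-- followed by the data rows.
SELECT foo, bar FROM pokes LIMIT 3;
```

Note this is the session-level property route from the patch, not the proposed `WITH HEADER` query syntax, which per the comment above was never implemented.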
[jira] [Assigned] (HIVE-3388) Improve Performance of UDF PERCENTILE_APPROX()
[ https://issues.apache.org/jira/browse/HIVE-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rongrong Zhong reassigned HIVE-3388:
------------------------------------
    Assignee: Rongrong Zhong

> Improve Performance of UDF PERCENTILE_APPROX()
> ----------------------------------------------
>
>          Key: HIVE-3388
>          URL: https://issues.apache.org/jira/browse/HIVE-3388
>      Project: Hive
>   Issue Type: Task
>     Reporter: Rongrong Zhong
>     Assignee: Rongrong Zhong
>     Priority: Minor
[jira] [Updated] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3276:
-----------------------------
    Status: Open  (was: Patch Available)

> optimize union sub-queries
> --------------------------
>
>          Key: HIVE-3276
>          URL: https://issues.apache.org/jira/browse/HIVE-3276
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: HIVE-3276.1.patch, hive.3276.2.patch
[jira] [Commented] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436176#comment-13436176 ]

Namit Jain commented on HIVE-3276:
----------------------------------

I was able to run tests for hadoop 23. I will upload the new patch soon.

> optimize union sub-queries
> --------------------------
>
>          Key: HIVE-3276
>          URL: https://issues.apache.org/jira/browse/HIVE-3276
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: HIVE-3276.1.patch, hive.3276.2.patch
Re: Possible patch to fix column comments with non-native SerDe
You'll need to update your serde to use the method call that takes comments. See https://github.com/jghoman/haivvreo/commit/29ead1fe101baafa8e9844eaf92022cbe4846c6f for an example.

On Tuesday, August 7, 2012 at 7:21 AM, Stephen R. Scaffidi wrote:

So, the patch does not seem to fix the problem we are having, but combined with the one I sent to the list it seems to take care of it. I will continue to study this issue and report back on any issues. Thanks again!

On 08/06/2012 06:59 PM, Stephen Scaffidi wrote:

Thanks! I'll see how it goes! (Better yet, this could be what it takes to convince the team to upgrade!)

On Aug 6, 2012, at 6:47 PM, Jakob Homan wrote:

This was fixed in Hive 0.8 (https://issues.apache.org/jira/browse/HIVE-2171). Can you just apply that patch?

On Mon, Aug 6, 2012 at 2:15 PM, Stephen R. Scaffidi sscaff...@tripadvisor.com wrote:

My team and I have been trying, with limited success, to use the COMMENT feature of Hive columns to maintain documentation for the tables and columns in our data warehouse built on Hive. However, we use a number of custom and non-native SerDes, and what happens to those tables is that the comments always get overwritten with the string "from deserializer". I've possibly found a way to work around this from within Hive, but I want to get some insight from the hive-dev community to figure out whether or not this is a patently bad idea and we are just setting ourselves up for pain later on. I won't go into all the details, but it seems to work in our (so far) limited testing. However, we are using Hive 0.7.1 and the patch I am sending is against master/HEAD. Please let me know whether this is an acceptable approach to preserving column comments with non-native SerDes!
[jira] [Updated] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets
[ https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3375:
-----------------------------
    Attachment: hive.3375.3.patch

> bucketed map join should check that the number of files match the number of buckets
> -----------------------------------------------------------------------------------
>
>          Key: HIVE-3375
>          URL: https://issues.apache.org/jira/browse/HIVE-3375
>      Project: Hive
>   Issue Type: Bug
>   Components: Query Processor
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch
>
> Currently, we get an NPE if that is not the case.
[jira] [Commented] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets
[ https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436190#comment-13436190 ]

Namit Jain commented on HIVE-3375:
----------------------------------

Refreshed after a few other commits.

> bucketed map join should check that the number of files match the number of buckets
> -----------------------------------------------------------------------------------
>
>          Key: HIVE-3375
>          URL: https://issues.apache.org/jira/browse/HIVE-3375
>      Project: Hive
>   Issue Type: Bug
>   Components: Query Processor
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch
[jira] [Updated] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3276:
-----------------------------
    Attachment: hive.3276.3.patch

> optimize union sub-queries
> --------------------------
>
>          Key: HIVE-3276
>          URL: https://issues.apache.org/jira/browse/HIVE-3276
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch
[jira] [Commented] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436202#comment-13436202 ]

Namit Jain commented on HIVE-3276:
----------------------------------

@Kevin, this is ready for review.

> optimize union sub-queries
> --------------------------
>
>          Key: HIVE-3276
>          URL: https://issues.apache.org/jira/browse/HIVE-3276
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch
[jira] [Updated] (HIVE-3276) optimize union sub-queries
[ https://issues.apache.org/jira/browse/HIVE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-3276:
-----------------------------
    Status: Patch Available  (was: Open)

The ShimLoader changes are copied from HIVE-3029, only to run tests on hadoop 23. Once HIVE-3029 is checked in, this file will be reverted.

Also, to run the newly added tests only on hadoop 23:

ant clean package
ant test -Dhadoop.mr.rev=23 -Dtest.print.classpath=true -Dhadoop.version=2.0.0-alpha -Dhadoop.security.version=2.0.0-alpha -Dtestcase=TestCliDriver -Dqfile=union_remove_1.q,union_remove_2.q,union_remove_3.q,union_remove_4.q,union_remove_5.q,union_remove_6.q,union_remove_7.q,union_remove_8.q,union_remove_9.q,union_remove_10.q,union_remove_11.q,union_remove_12.q,union_remove_13.q,union_remove_14.q,union_remove_15.q,union_remove_16.q,union_remove_17.q,union_remove_18.q

> optimize union sub-queries
> --------------------------
>
>          Key: HIVE-3276
>          URL: https://issues.apache.org/jira/browse/HIVE-3276
>      Project: Hive
>   Issue Type: Bug
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: HIVE-3276.1.patch, hive.3276.2.patch, hive.3276.3.patch
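The `union_remove_*.q` test names above suggest the optimization is driven by a configuration switch. As a hedged illustration, not a quote from the patch: assuming the `hive.optimize.union.remove` property this work introduces (and its documented companion `hive.mapred.supports.subdirectories`), the optimized pattern described in the issue would look roughly like:

```sql
-- Assumed switches for the union-remove optimization sketched in the
-- HIVE-3276 description; names inferred from the union_remove_*.q tests.
SET hive.mapred.supports.subdirectories=true;
SET hive.optimize.union.remove=true;

-- With the optimization on, the two sub-query jobs write straight to the
-- final directory (via a move-task-like step) instead of feeding a third
-- map-reduce job that exists only to materialize the union.
INSERT OVERWRITE TABLE T1 PARTITION (P1='x')
SELECT * FROM (
  SELECT key, count(1) AS cnt FROM src GROUP BY key
  UNION ALL
  SELECT key, count(1) AS cnt FROM src GROUP BY key
) u;
```

The win is one fewer map-reduce job per query, which matters most when the union's sub-queries are themselves cheap.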
[jira] [Resolved] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle
[ https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Carl Steinbach resolved HIVE-3390.
----------------------------------
    Resolution: Invalid

This work needs to be done as part of HIVE-3072.

> Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle
> -------------------------------------------------------------------------------------
>
>          Key: HIVE-3390
>          URL: https://issues.apache.org/jira/browse/HIVE-3390
>      Project: Hive
>   Issue Type: New Feature
>     Reporter: Gang Tim Liu
>
> This is a follow-up for HIVE-3072. We need upgrade scripts for Derby, Postgres, and Oracle.
[jira] [Created] (HIVE-3391) Keep the original query in HiveDriverRunHookContextImpl
Dawid Dabrowski created HIVE-3391:
-------------------------------------

             Summary: Keep the original query in HiveDriverRunHookContextImpl
                 Key: HIVE-3391
                 URL: https://issues.apache.org/jira/browse/HIVE-3391
             Project: Hive
          Issue Type: Improvement
            Reporter: Dawid Dabrowski
            Priority: Minor

It'd be useful to have access to the original query in hooks. The hook that's executed first is HiveDriverRunHook, so let's add it there.
[jira] [Commented] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets
[ https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436227#comment-13436227 ]

Carl Steinbach commented on HIVE-3375:
--------------------------------------

+1

@Namit: Can you handle testing this and getting it committed? Thanks.

> bucketed map join should check that the number of files match the number of buckets
> -----------------------------------------------------------------------------------
>
>          Key: HIVE-3375
>          URL: https://issues.apache.org/jira/browse/HIVE-3375
>      Project: Hive
>   Issue Type: Bug
>   Components: Query Processor
>     Reporter: Namit Jain
>     Assignee: Namit Jain
>  Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch
[jira] [Commented] (HIVE-3228) unable to load null values that represent a timestamp value
[ https://issues.apache.org/jira/browse/HIVE-3228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436245#comment-13436245 ]

Neha Tomar commented on HIVE-3228:
----------------------------------

I don't think this issue is fixed, as I am still seeing it. Can anyone please update?

> unable to load null values that represent a timestamp value
> -----------------------------------------------------------
>
>              Key: HIVE-3228
>              URL: https://issues.apache.org/jira/browse/HIVE-3228
>          Project: Hive
>       Issue Type: Bug
> Affects Versions: 0.8.0
>         Reporter: N Campbell
>      Attachments: CERT.TTS.txt
>
> Attempting to load delimited data into a table with one or more timestamp columns will fail when null values are represented in the input set.
>
> load data local inpath 'CERT.TTS.txt' overwrite into table CERT.TTS_E;
> insert overwrite table CERT.TTS select * from CERT.TTS_E;
>
> Error: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> SQLState: 08S01
> ErrorCode: 9
>
> create table if not exists CERT.TTS_E ( RNUM int , CTS timestamp) row format delimited fields terminated by '\t' stored as textfile;
> create table if not exists CERT.TTS ( RNUM int , CTS timestamp) stored as sequencefile;
>
> 0
> 1	1996-01-01 00:00:00.0
> 2	1996-01-01 12:00:00.0
> 3	1996-01-01 23:59:30.12300
> 4	2000-01-01 00:00:00.0
> 5	2000-01-01 12:00:00.0
> 6	2000-01-01 23:59:30.12300
> 7	2000-12-31 00:00:00.0
> 8	2000-12-31 12:00:00.0
> 9	2000-12-31 12:15:30.12300
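A hedged workaround sketch, not from this thread: Hive's text SerDes read the literal sequence `\N` as SQL NULL by default, and the marker is configurable per table via the `serialization.null.format` property. Representing the missing timestamp (the `0` row above) as an explicit NULL marker in the input file may sidestep the failed conversion; whether it does in 0.8.0 would need to be verified against this bug.

```sql
-- Hypothetical variant of the reporter's staging table: tell the SerDe
-- which string in the text file should be read back as NULL.
CREATE TABLE IF NOT EXISTS CERT.TTS_E (RNUM int, CTS timestamp)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
  STORED AS TEXTFILE
  TBLPROPERTIES ('serialization.null.format' = '\\N');

-- Input rows then carry an explicit marker instead of an empty field:
--   0	\N
--   1	1996-01-01 00:00:00.0
```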
[jira] [Commented] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle
[ https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436246#comment-13436246 ] Gang Tim Liu commented on HIVE-3390: @Carl, got you. Do you have instructions for generating the upgrade scripts for Derby, Postgres, and Oracle? Are you using SchemaTool? If you have instructions, it would be a big help. Thanks. Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle - Key: HIVE-3390 URL: https://issues.apache.org/jira/browse/HIVE-3390 Project: Hive Issue Type: New Feature Reporter: Gang Tim Liu This is a follow-up for HIVE-3072. We need upgrade scripts for Derby, Postgres, and Oracle.
[jira] [Commented] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle
[ https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436251#comment-13436251 ] Carl Steinbach commented on HIVE-3390: -- Schema tool won't generate the upgrade script for you. These need to be written by hand. I recommend looking at the other upgrade script examples in the derby/postgres/oracle directories. Also, it's probably worth doing this last once you're fairly certain that people won't request any more changes to the JDO mapping file. Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle - Key: HIVE-3390 URL: https://issues.apache.org/jira/browse/HIVE-3390 Project: Hive Issue Type: New Feature Reporter: Gang Tim Liu This is a follow-up for HIVE-3072. We need upgrade scripts for Derby, Postgres, and Oracle. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3390) Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle
[ https://issues.apache.org/jira/browse/HIVE-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436253#comment-13436253 ] Gang Tim Liu commented on HIVE-3390: I see, I will do it last. I am making changes and aim to get you a patch to review today. Thanks. Hive List Bucketing - DDL support - DB upgrade script for Derby, Postgres, and Oracle - Key: HIVE-3390 URL: https://issues.apache.org/jira/browse/HIVE-3390 Project: Hive Issue Type: New Feature Reporter: Gang Tim Liu This is a follow-up for HIVE-3072. We need upgrade scripts for Derby, Postgres, and Oracle.
Build failed in Jenkins: Hive-0.9.1-SNAPSHOT-h0.21 #107
See https://builds.apache.org/job/Hive-0.9.1-SNAPSHOT-h0.21/107/ -- [...truncated 36653 lines...] [junit console output trimmed: repeated PREHOOK/POSTHOOK cycles that create, load, query, and drop testhivedrivertable, each step ending in OK, with Hive history files written under build/service/tmp/]
[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop
[ https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436323#comment-13436323 ] Andrew Chalfant commented on HIVE-3068: --- ping Add ability to export table metadata as JSON on table drop -- Key: HIVE-3068 URL: https://issues.apache.org/jira/browse/HIVE-3068 Project: Hive Issue Type: New Feature Components: Metastore, Serializers/Deserializers Reporter: Andrew Chalfant Assignee: Andrew Chalfant Priority: Minor Labels: features, newbie Attachments: HIVE-3068.2.patch.txt Original Estimate: 24h Remaining Estimate: 24h When a table is dropped, the contents go to the user's trash but the metadata is lost. It would be super neat to be able to save the metadata as well so that tables could be trivially re-instantiated via thrift.
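The requested behavior could look roughly like this (a minimal Python sketch with hypothetical names; the real feature would hook into the Java metastore's drop path):

```python
import json
import os
import time

def export_table_metadata(table_name, metadata, trash_dir):
    """Hypothetical sketch of the feature requested above: before a table
    is dropped, serialize its metadata to JSON alongside the trashed data
    so the table could later be re-created (e.g. via thrift)."""
    os.makedirs(trash_dir, exist_ok=True)
    # Timestamp the file so repeated drops of the same table name
    # don't overwrite one another.
    path = os.path.join(trash_dir, "%s.%d.json" % (table_name, int(time.time())))
    with open(path, "w") as f:
        json.dump(metadata, f, indent=2)
    return path
```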
[jira] [Commented] (HIVE-3268) expressions in cluster by are not working
[ https://issues.apache.org/jira/browse/HIVE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436365#comment-13436365 ] Hudson commented on HIVE-3268: -- Integrated in Hive-trunk-h0.21 #1611 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1611/]) HIVE-3268. expressions in cluster by are not working. (njain via kevinwilfong) (Revision 1373918) Result = SUCCESS kevinwilfong : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1373918 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ErrorMsg.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/QBParseInfo.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java * /hive/trunk/ql/src/test/queries/clientnegative/expr_clusterby1.q * /hive/trunk/ql/src/test/queries/clientnegative/expr_distributeby1.q * /hive/trunk/ql/src/test/queries/clientnegative/expr_distributeby_sortby_1.q * /hive/trunk/ql/src/test/queries/clientnegative/expr_orderby1.q * /hive/trunk/ql/src/test/queries/clientnegative/expr_sortby1.q * /hive/trunk/ql/src/test/results/clientnegative/expr_clusterby1.q.out * /hive/trunk/ql/src/test/results/clientnegative/expr_distributeby1.q.out * /hive/trunk/ql/src/test/results/clientnegative/expr_distributeby_sortby_1.q.out * /hive/trunk/ql/src/test/results/clientnegative/expr_orderby1.q.out * /hive/trunk/ql/src/test/results/clientnegative/expr_sortby1.q.out expressions in cluster by are not working - Key: HIVE-3268 URL: https://issues.apache.org/jira/browse/HIVE-3268 Project: Hive Issue Type: Bug Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.10.0 Attachments: hive.3268.1.patch, hive.3268.2.patch The following query fails: select key+key, value from src cluster by key+key, value; -- This message is automatically generated by JIRA. 
[jira] [Created] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table
Jonathan Natkins created HIVE-3392: -- Summary: Hive unnecessarily validates table SerDes when dropping a table Key: HIVE-3392 URL: https://issues.apache.org/jira/browse/HIVE-3392 Project: Hive Issue Type: Bug Reporter: Jonathan Natkins natty@hadoop1:~$ hive hive add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar; Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar hive create table test (a int) row format serde 'hive.serde.JSONSerDe'; OK Time taken: 2.399 seconds natty@hadoop1:~$ hive hive drop table test; FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)) java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253) at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490) at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:210) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at 
org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) Caused by: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:211) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260) ... 20 more hive add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar; Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar hive drop table test; OK Time taken: 0.658 seconds hive -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
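The fix direction the report implies is to defer SerDe instantiation until column information is actually needed, so that DROP TABLE never triggers it. A minimal sketch with hypothetical names (not Hive's actual API):

```python
def load_serde(class_name):
    # Stand-in for the metastore call that fails when the SerDe jar
    # is not on the class path, as in the stack trace above.
    raise RuntimeError("SerDe %s does not exist" % class_name)

def check_validity(table, needs_columns=True):
    """Validate a table, instantiating its SerDe only when the caller
    actually needs column information."""
    if not table.get("name"):
        raise ValueError("table has no name")
    if needs_columns:
        # Only reached for operations that read the columns; the drop
        # path passes needs_columns=False and never gets here.
        load_serde(table["serde"])

# The drop path succeeds even though the SerDe class is unavailable.
check_validity({"name": "test", "serde": "hive.serde.JSONSerDe"},
               needs_columns=False)
```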
[jira] [Updated] (HIVE-3392) Hive unnecessarily validates table SerDes when dropping a table
[ https://issues.apache.org/jira/browse/HIVE-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Natkins updated HIVE-3392: --- Description: natty@hadoop1:~$ hive hive add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar; Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar hive create table test (a int) row format serde 'hive.serde.JSONSerDe'; OK Time taken: 2.399 seconds natty@hadoop1:~$ hive hive drop table test; FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist)) java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe hive.serde.JSONSerDe does not exist) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253) at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490) at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:210) at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:430) at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:337) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:889) at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:255) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:212) at 
org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:403) at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:671) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:208) Caused by: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist) at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:211) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:260) ... 20 more hive add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar; Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar hive drop table test; OK Time taken: 0.658 seconds hive was: natty@hadoop1:~$ hive hive add jar /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar; Added /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar to class path Added resource: /home/natty/source/sample-code/custom-serdes/target/custom-serdes-1.0-SNAPSHOT.jar hive create table test (a int) row format serde 'hive.serde.JSONSerDe'; OK Time taken: 2.399 seconds natty@hadoop1:~$ hive hive drop table test; FAILED: Hive Internal Error: java.lang.RuntimeException(MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe com.cloudera.hive.serde.JSONSerDe does not exist)) java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.SerDeException SerDe 
com.cloudera.hive.serde.JSONSerDe does not exist) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializerFromMetaStore(Table.java:262) at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253) at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:490) at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:162) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:943) at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeDropTable(DDLSemanticAnalyzer.java:700) at
Hive-trunk-h0.21 - Build # 1611 - Fixed
Changes for Build #1606 [cws] HIVE-2804. Task log retrieval fails on Hadoop 0.23 (Zhenxiao Luo via cws) Changes for Build #1607 [cws] HIVE-3337. Create Table Like should copy configured Table Parameters (Bhushan Mandhani via cws) Changes for Build #1608 Changes for Build #1609 [hashutosh] HIVE-3385 : fixing 0.23 test build (Sushanth Sowmyan via Ashutosh Chauhan) Changes for Build #1610 Changes for Build #1611 [kevinwilfong] HIVE-3268. expressions in cluster by are not working. (njain via kevinwilfong) All tests passed The Apache Jenkins build system has built Hive-trunk-h0.21 (build #1611) Status: Fixed Check console output at https://builds.apache.org/job/Hive-trunk-h0.21/1611/ to view the results.
[jira] [Updated] (HIVE-3268) expressions in cluster by are not working
[ https://issues.apache.org/jira/browse/HIVE-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3268: - Component/s: Query Processor expressions in cluster by are not working - Key: HIVE-3268 URL: https://issues.apache.org/jira/browse/HIVE-3268 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Fix For: 0.10.0 Attachments: hive.3268.1.patch, hive.3268.2.patch The following query fails: select key+key, value from src cluster by key+key, value; -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3068) Add ability to export table metadata as JSON on table drop
[ https://issues.apache.org/jira/browse/HIVE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436416#comment-13436416 ] Edward Capriolo commented on HIVE-3068: --- Dude, don't do that. The average patch sits in the queue for some time, and many committers volunteer their time. I'll review ASAP. Add ability to export table metadata as JSON on table drop -- Key: HIVE-3068 URL: https://issues.apache.org/jira/browse/HIVE-3068 Project: Hive Issue Type: New Feature Components: Metastore, Serializers/Deserializers Reporter: Andrew Chalfant Assignee: Andrew Chalfant Priority: Minor Labels: features, newbie Attachments: HIVE-3068.2.patch.txt Original Estimate: 24h Remaining Estimate: 24h When a table is dropped, the contents go to the user's trash but the metadata is lost. It would be super neat to be able to save the metadata as well so that tables could be trivially re-instantiated via thrift.
[jira] [Commented] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436422#comment-13436422 ] Carl Steinbach commented on HIVE-3029: -- @Namit: Looking at this now. Will commit soon. Thanks. Update ShimLoader to work with Hadoop 2.x - Key: HIVE-3029 URL: https://issues.apache.org/jira/browse/HIVE-3029 Project: Hive Issue Type: Bug Components: Shims Reporter: Carl Steinbach Assignee: Carl Steinbach Attachments: HIVE-3029.D3255.1.patch, HIVE-3029.D3255.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2925) Support non-MR fetching for simple queries with select/limit/filter operations only
[ https://issues.apache.org/jira/browse/HIVE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2925: Attachment: HIVE-2925.3.patch.txt Support non-MR fetching for simple queries with select/limit/filter operations only --- Key: HIVE-2925 URL: https://issues.apache.org/jira/browse/HIVE-2925 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-2925.1.patch.txt, HIVE-2925.2.patch.txt, HIVE-2925.3.patch.txt, HIVE-2925.D2607.1.patch, HIVE-2925.D2607.2.patch, HIVE-2925.D2607.3.patch, HIVE-2925.D2607.4.patch It's trivial but frequently asked by end-users. Currently, select queries with simple conditions or limit should run MR job which takes some time especially for big tables, making the people irritated. For that kind of simple queries, using fetch task would make them happy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2925) Support non-MR fetching for simple queries with select/limit/filter operations only
[ https://issues.apache.org/jira/browse/HIVE-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Navis updated HIVE-2925: Status: Patch Available (was: Open) Rebased on trunk. Sorry for late reply. Support non-MR fetching for simple queries with select/limit/filter operations only --- Key: HIVE-2925 URL: https://issues.apache.org/jira/browse/HIVE-2925 Project: Hive Issue Type: Improvement Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Priority: Trivial Attachments: HIVE-2925.1.patch.txt, HIVE-2925.2.patch.txt, HIVE-2925.3.patch.txt, HIVE-2925.D2607.1.patch, HIVE-2925.D2607.2.patch, HIVE-2925.D2607.3.patch, HIVE-2925.D2607.4.patch It's trivial but frequently asked by end-users. Currently, select queries with simple conditions or limit should run MR job which takes some time especially for big tables, making the people irritated. For that kind of simple queries, using fetch task would make them happy. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
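The eligibility test behind HIVE-2925 can be modeled roughly like this (illustrative Python with made-up operator names; the real logic inspects Hive's Java operator tree):

```python
# A query can be answered by a direct fetch, with no MapReduce job, only
# when its plan contains nothing beyond scanning, filtering, projecting,
# and limiting. These operator names are illustrative, not Hive's actual
# operator classes.
FETCH_SAFE = {"TABLESCAN", "FILTER", "SELECT", "LIMIT"}

def can_use_fetch_task(plan_operators):
    """Return True when every operator in the plan is fetch-safe."""
    return all(op in FETCH_SAFE for op in plan_operators)

# select * from t where x > 1 limit 10  -> simple, no MR job needed
assert can_use_fetch_task(["TABLESCAN", "FILTER", "SELECT", "LIMIT"])
# select x, count(*) from t group by x  -> aggregation still needs MR
assert not can_use_fetch_task(["TABLESCAN", "SELECT", "GROUPBY"])
```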
[jira] [Created] (HIVE-3393) get_json_object and json_tuple should use Jackson library
Kevin Wilfong created HIVE-3393: --- Summary: get_json_object and json_tuple should use Jackson library Key: HIVE-3393 URL: https://issues.apache.org/jira/browse/HIVE-3393 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor The Jackson library's JSON parsers have been shown to be significantly faster than json.org's. The library is already included, so I can't think of a reason not to use it. There's also the potential for further improvements in replacing many of the try catch blocks with if statements.
[jira] [Updated] (HIVE-3393) get_json_object and json_tuple should use Jackson library
[ https://issues.apache.org/jira/browse/HIVE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3393: Attachment: HIVE-3393.1.patch.txt get_json_object and json_tuple should use Jackson library - Key: HIVE-3393 URL: https://issues.apache.org/jira/browse/HIVE-3393 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Attachments: HIVE-3393.1.patch.txt The Jackson library's JSON parsers have been shown to be significantly faster than json.org's. The library is already included, so I can't think of a reason not to use it. There's also the potential for further improvements in replacing many of the try catch blocks with if statements.
[jira] [Updated] (HIVE-3393) get_json_object and json_tuple should use Jackson library
[ https://issues.apache.org/jira/browse/HIVE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-3393: Status: Patch Available (was: Open) get_json_object and json_tuple should use Jackson library - Key: HIVE-3393 URL: https://issues.apache.org/jira/browse/HIVE-3393 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Attachments: HIVE-3393.1.patch.txt The Jackson library's JSON parsers have been shown to be significantly faster than json.org's. The library is already included, so I can't think of a reason not to use it. There's also the potential for further improvements in replacing many of the try catch blocks with if statements.
[jira] [Commented] (HIVE-3393) get_json_object and json_tuple should use Jackson library
[ https://issues.apache.org/jira/browse/HIVE-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13436466#comment-13436466 ] Kevin Wilfong commented on HIVE-3393: - Uploaded a diff here https://reviews.facebook.net/D4701 get_json_object and json_tuple should use Jackson library - Key: HIVE-3393 URL: https://issues.apache.org/jira/browse/HIVE-3393 Project: Hive Issue Type: Improvement Components: UDF Affects Versions: 0.10.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Priority: Minor Attachments: HIVE-3393.1.patch.txt The Jackson library's JSON parsers have been shown to be significantly faster than json.org's. The library is already included, so I can't think of a reason not to use it. There's also the potential for further improvements in replacing many of the try catch blocks with if statements.
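For readers unfamiliar with get_json_object, its behavior for simple paths can be approximated in a few lines of Python (illustrative only; Hive's implementation is Java, and HIVE-3393 is about swapping its parser from json.org to Jackson). It also shows the "if statements instead of try/catch" style the ticket mentions:

```python
import json

def get_json_object(json_str, path):
    """Approximate Hive's get_json_object for simple $.a.b paths:
    return the value at the path, or None on any miss. Misses are
    detected with checks rather than caught exceptions."""
    obj = json.loads(json_str)
    for key in path.lstrip("$.").split("."):
        if not isinstance(obj, dict) or key not in obj:  # check, don't catch
            return None
        obj = obj[key]
    return obj

assert get_json_object('{"a": {"b": 1}}', "$.a.b") == 1
assert get_json_object('{"a": {"b": 1}}', "$.a.c") is None
```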
[jira] [Updated] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3029: - Attachment: HIVE-3029.2.patch.txt Update ShimLoader to work with Hadoop 2.x - Key: HIVE-3029 URL: https://issues.apache.org/jira/browse/HIVE-3029 Project: Hive Issue Type: Bug Components: Shims Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.10.0 Attachments: HIVE-3029.2.patch.txt, HIVE-3029.D3255.1.patch, HIVE-3029.D3255.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-3029: - Resolution: Fixed Fix Version/s: 0.10.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Update ShimLoader to work with Hadoop 2.x - Key: HIVE-3029 URL: https://issues.apache.org/jira/browse/HIVE-3029 Project: Hive Issue Type: Bug Components: Shims Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.10.0 Attachments: HIVE-3029.2.patch.txt, HIVE-3029.D3255.1.patch, HIVE-3029.D3255.1.patch -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3389) running tests for hadoop 23
[ https://issues.apache.org/jira/browse/HIVE-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436493#comment-13436493 ] Sushanth Sowmyan commented on HIVE-3389: Namit, I hit a similar issue a while back and Ashutosh pointed me to the patch in HIVE-3029 - I tested with tst.q as you mentioned, and if I apply HIVE-3029, it isn't skipped. running tests for hadoop 23 --- Key: HIVE-3389 URL: https://issues.apache.org/jira/browse/HIVE-3389 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Sushanth Sowmyan
[jira] [Commented] (HIVE-3385) fixing 0.23 test build
[ https://issues.apache.org/jira/browse/HIVE-3385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436496#comment-13436496 ] Sushanth Sowmyan commented on HIVE-3385: As commented on HIVE-3389, that issue seems to be fixed by HIVE-3029 fixing 0.23 test build -- Key: HIVE-3385 URL: https://issues.apache.org/jira/browse/HIVE-3385 Project: Hive Issue Type: Bug Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Labels: build, test Fix For: 0.10.0 Attachments: HIVE-3385.patch, HIVE-3385.patch.2 Follow-up jira after HIVE-3341: we need to make Hive tests work on 0.23. For starters, we need to add a jar into build/ivy/lib/hadoop0.23.shim/ that includes MiniMRCluster. With 0.23, MiniMRCluster has moved to hadoop-mapreduce-client-jobclient-{$version}-tests.jar, and that needs to be included as an ivy dependency.
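For illustration, the dependency change described above might look roughly like the following ivy.xml fragment. This is a sketch, not the actual HIVE-3385 patch: the conf mapping, the rev property name, and the maven namespace prefix are assumptions.

```xml
<!-- Hypothetical ivy.xml fragment: pulls the "tests" artifact of
     hadoop-mapreduce-client-jobclient, which is where MiniMRCluster lives
     on 0.23. Assumes xmlns:m="http://ant.apache.org/ivy/maven" is declared
     on the enclosing <ivy-module>; conf name and rev are illustrative. -->
<dependency org="org.apache.hadoop" name="hadoop-mapreduce-client-jobclient"
            rev="${hadoop-0.23.version}" conf="hadoop0.23.shim->default">
  <artifact name="hadoop-mapreduce-client-jobclient" type="jar" ext="jar"
            m:classifier="tests"/>
</dependency>
```

The key detail is the `classifier="tests"` attribute, which selects the `-tests.jar` artifact rather than the main jar.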
[jira] [Commented] (HIVE-3226) ColumnPruner is not working on LateralView
[ https://issues.apache.org/jira/browse/HIVE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436535#comment-13436535 ] Navis commented on HIVE-3226: - added comments ColumnPruner is not working on LateralView -- Key: HIVE-3226 URL: https://issues.apache.org/jira/browse/HIVE-3226 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Attachments: HIVE-3226.1.patch.txt, HIVE-3226.2.patch.txt Column pruning is not applied to the LVJ and SEL operators, which causes exceptions at various stages. For example, {noformat} drop table array_valued_src; create table array_valued_src (key string, value array<string>); insert overwrite table array_valued_src select key, array(value) from src; select sum(val) from (select a.key as key, b.value as array_val from src a join array_valued_src b on a.key=b.key) i lateral view explode (array_val) c as val; ... 9 more Caused by: java.lang.RuntimeException: Reduce operator initialization failed at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:157) ... 14 more Caused by: java.lang.RuntimeException: cannot find field _col0 from [0:_col5] at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:345) at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.getStructFieldRef(StandardStructObjectInspector.java:143) at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57) at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896) at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922) at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433) at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389) at org.apache.hadoop.hive.ql.exec.JoinOperator.initializeOp(JoinOperator.java:62) at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357) at org.apache.hadoop.hive.ql.exec.ExecReducer.configure(ExecReducer.java:150) {noformat}
[jira] [Commented] (HIVE-3387) meta data file size exceeds limit
[ https://issues.apache.org/jira/browse/HIVE-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436544#comment-13436544 ] Navis commented on HIVE-3387: - Configuration values set via the set command are not propagated to the JobConf for the MR job; they are only used inside Hive. In the case you mentioned, the value of mapreduce.jobtracker.split.metainfo.maxsize applied to Hadoop is the 10M default, which is 1/10 of what you expected. If you change mapred-site.xml instead, the error would not occur. I also think there should be a way to change properties of the JobConf, but some permission issues should be sorted out first. meta data file size exceeds limit - Key: HIVE-3387 URL: https://issues.apache.org/jira/browse/HIVE-3387 Project: Hive Issue Type: Bug Affects Versions: 0.7.1 Reporter: Alexander Alten-Lorenz Fix For: 0.9.1 The cause is certainly that we use an array list instead of a set structure in the split locations API. Looks like a bug in Hive's CombineFileInputFormat. Reproduce: Set mapreduce.jobtracker.split.metainfo.maxsize=1 when submitting the Hive query. Run a big Hive query that writes data into a partitioned table. Due to the large number of splits, you encounter an exception on the job submitted to Hadoop, and the exception says: meta data size exceeds 1.
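Per the comment above, the limit has to be raised cluster-side rather than through Hive's set command. A minimal sketch of the mapred-site.xml entry, with an illustrative value:

```xml
<!-- mapred-site.xml: unlike Hive's "set" command, this is picked up by the
     JobConf of submitted MR jobs. The 100000000 (100M) value is illustrative;
     size it to your split count, or use -1 to disable the check entirely. -->
<property>
  <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
  <value>100000000</value>
</property>
```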
[jira] [Assigned] (HIVE-3391) Keep the original query in HiveDriverRunHookContextImpl
[ https://issues.apache.org/jira/browse/HIVE-3391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Dabrowski reassigned HIVE-3391: - Assignee: Dawid Dabrowski Keep the original query in HiveDriverRunHookContextImpl --- Key: HIVE-3391 URL: https://issues.apache.org/jira/browse/HIVE-3391 Project: Hive Issue Type: Improvement Reporter: Dawid Dabrowski Assignee: Dawid Dabrowski Priority: Minor Original Estimate: 72h Remaining Estimate: 72h It'd be useful to have access to the original query in hooks. The hook that's executed first is HiveDriverRunHook, so let's add it there.
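The hook pattern this ticket builds on can be sketched as follows. The interfaces are re-declared here in miniature so the example compiles standalone (the real ones live in org.apache.hadoop.hive.ql), and the getCommand() accessor is exactly the hypothetical addition being proposed, not existing API:

```java
// Miniature sketch of the hook pattern behind HIVE-3391. Interface names
// mirror Hive's HiveDriverRunHook / HiveDriverRunHookContext but are
// redefined here so the example is self-contained; getCommand() is the
// proposed (hypothetical) addition.
interface DriverRunHookContext {
    String getCommand(); // the original query string the ticket wants exposed
}

interface DriverRunHook {
    // Called before the Driver compiles and runs the query.
    void preDriverRun(DriverRunHookContext ctx) throws Exception;
}

// A hook that records the query it was handed, e.g. for audit logging.
class RecordingHook implements DriverRunHook {
    String lastCommand;

    @Override
    public void preDriverRun(DriverRunHookContext ctx) {
        lastCommand = ctx.getCommand();
    }
}

public class HookSketch {
    public static void main(String[] args) throws Exception {
        RecordingHook hook = new RecordingHook();
        // A lambda suffices because DriverRunHookContext has a single method.
        hook.preDriverRun(() -> "SELECT count(*) FROM src");
        System.out.println("hook saw: " + hook.lastCommand);
    }
}
```

With a context accessor like this, a hook gets the query text before any rewriting, which is the use case the ticket describes.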
[jira] [Commented] (HIVE-3226) ColumnPruner is not working on LateralView
[ https://issues.apache.org/jira/browse/HIVE-3226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436549#comment-13436549 ] Namit Jain commented on HIVE-3226: -- +1 Running tests ColumnPruner is not working on LateralView -- Key: HIVE-3226 URL: https://issues.apache.org/jira/browse/HIVE-3226 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Attachments: HIVE-3226.1.patch.txt, HIVE-3226.2.patch.txt
[jira] [Resolved] (HIVE-3389) running tests for hadoop 23
[ https://issues.apache.org/jira/browse/HIVE-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain resolved HIVE-3389. -- Resolution: Fixed Duplicate of HIVE-3029 running tests for hadoop 23 --- Key: HIVE-3389 URL: https://issues.apache.org/jira/browse/HIVE-3389 Project: Hive Issue Type: Bug Components: Testing Infrastructure Reporter: Namit Jain Assignee: Sushanth Sowmyan
[jira] [Commented] (HIVE-3029) Update ShimLoader to work with Hadoop 2.x
[ https://issues.apache.org/jira/browse/HIVE-3029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13436571#comment-13436571 ] Hudson commented on HIVE-3029: -- Integrated in Hive-trunk-h0.21 #1612 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1612/]) HIVE-3029. Update ShimLoader to work with Hadoop 2.x (Carl Steinbach via cws) (Revision 1374101) Result = SUCCESS cws : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1374101 Files : * /hive/trunk/shims/src/common/java/org/apache/hadoop/hive/shims/ShimLoader.java Update ShimLoader to work with Hadoop 2.x - Key: HIVE-3029 URL: https://issues.apache.org/jira/browse/HIVE-3029 Project: Hive Issue Type: Bug Components: Shims Reporter: Carl Steinbach Assignee: Carl Steinbach Fix For: 0.10.0 Attachments: HIVE-3029.2.patch.txt, HIVE-3029.D3255.1.patch, HIVE-3029.D3255.1.patch
[jira] [Updated] (HIVE-3375) bucketed map join should check that the number of files match the number of buckets
[ https://issues.apache.org/jira/browse/HIVE-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Namit Jain updated HIVE-3375: - Resolution: Fixed Status: Resolved (was: Patch Available) Committed. (Thanks Kevin and Carl) bucketed map join should check that the number of files match the number of buckets --- Key: HIVE-3375 URL: https://issues.apache.org/jira/browse/HIVE-3375 Project: Hive Issue Type: Bug Components: Query Processor Reporter: Namit Jain Assignee: Namit Jain Attachments: hive.3375.1.patch, hive.3375.2.patch, hive.3375.3.patch Currently, we get an NPE if that is not the case