Re: Review Request: HIVE-1078: CREATE VIEW followup: CREATE OR REPLACE
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1058/#review1128 ---

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
https://reviews.apache.org/r/1058/#comment2356
Defer the db.getPartitions call (which could be expensive) so that we don't do it unless we're sure that the partition keys are actually changing.

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1058/#comment2357
Avoid using java.util.Stack. Some old Hive code uses it, but it's deprecated because it's synchronized for no good reason.

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1058/#comment2358
Add spaces around operators such as =.

- John

On 2011-07-20 01:01:53, Charles Chen wrote:

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1058/ ---

(Updated 2011-07-20 01:01:53)

Review request for hive.

Summary
-------

https://issues.apache.org/jira/browse/HIVE-1078

This addresses bug HIVE-1078.
https://issues.apache.org/jira/browse/HIVE-1078

Diffs
-----

http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1146902
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1146902
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1146902
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1146902
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1146902
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java 1146902
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out PRE-CREATION
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out 1146902

Diff: https://reviews.apache.org/r/1058/diff

Testing
-------

Passes unit tests

Thanks,
Charles
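The reviewer's advice to replace java.util.Stack (whose methods are all synchronized, a legacy of Vector) is commonly followed by using java.util.ArrayDeque instead. A minimal sketch, not Hive's actual code; the class and method names here are illustrative:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackExample {
    // LIFO traversal using the unsynchronized ArrayDeque
    // instead of the legacy, synchronized java.util.Stack.
    public static String drain(String... items) {
        Deque<String> stack = new ArrayDeque<>();
        for (String item : items) {
            stack.push(item);            // push onto the head
        }
        StringBuilder out = new StringBuilder();
        while (!stack.isEmpty()) {
            out.append(stack.pop());     // pop in last-in, first-out order
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(drain("a", "b", "c")); // prints "cba"
    }
}
```

Deque supports the same push/pop/peek operations as Stack, so a mechanical swap usually suffices when no cross-thread sharing is involved.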
[jira] [Commented] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE
[ https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068165#comment-13068165 ]

jirapos...@reviews.apache.org commented on HIVE-1078:
-----------------------------------------------------

--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1058/#review1128 ---

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
https://reviews.apache.org/r/1058/#comment2356
Defer the db.getPartitions (which could be expensive) so that we don't do it unless we're sure that the partition keys are actually changing.

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1058/#comment2357
Avoid usage of java.util.Stack. Some old Hive code uses it but it's deprecated because it's synchronized for no good reason.

http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1058/#comment2358
Add spaces around operators such as =.

- John
CREATE VIEW followup: CREATE OR REPLACE
---------------------------------------

Key: HIVE-1078
URL: https://issues.apache.org/jira/browse/HIVE-1078
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, HIVE-1078v6.patch, HIVE-1078v7.patch

Currently, replacing a view requires:

DROP VIEW v;
CREATE VIEW v AS new-definition;

CREATE OR REPLACE would allow these to be combined into a single operation.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE
[ https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Sichi updated HIVE-1078:
-----------------------------
Status: Open (was: Patch Available)

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Cli: Print Hadoop's CPU milliseconds
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/948/ ---

(Updated 2011-07-20 06:27:19.820431)

Review request for hive, Yongqiang He, Ning Zhang, and Namit Jain.

Changes
-------

Remove MapRedStats from DriverContext and add more counters to it.

Summary
-------

In the Hive CLI, print out CPU msec from the Hadoop MapReduce counters.

This addresses bug HIVE-2236.
https://issues.apache.org/jira/browse/HIVE-2236

Diffs (updated)
---------------

trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1148623
trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148623
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1148623
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148623
trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1148623

Diff: https://reviews.apache.org/r/948/diff

Testing
-------

Ran the updated code against real clusters and verified that the printed output is correct.

Thanks,
Siying
[jira] [Commented] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds
[ https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068169#comment-13068169 ]

jirapos...@reviews.apache.org commented on HIVE-2236:
-----------------------------------------------------

Cli: Print Hadoop's CPU milliseconds
------------------------------------

Key: HIVE-2236
URL: https://issues.apache.org/jira/browse/HIVE-2236
Project: Hive
Issue Type: New Feature
Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
Attachments: HIVE-2236.1.patch, HIVE-2236.2.patch

CPU milliseconds information is available from Hadoop's framework. Printing it out to the Hive CLI when executing a job will help users know more about their jobs.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds
[ https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siying Dong updated HIVE-2236:
------------------------------
Attachment: HIVE-2236.2.patch

Remove the MapRedStats list from DriverContext and add more counters.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2139) Enables HiveServer to accept -hiveconf option
[ https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068173#comment-13068173 ]

Carl Steinbach commented on HIVE-2139:
--------------------------------------
+1. Will commit if tests pass.

Enables HiveServer to accept -hiveconf option
---------------------------------------------

Key: HIVE-2139
URL: https://issues.apache.org/jira/browse/HIVE-2139
Project: Hive
Issue Type: Improvement
Components: CLI
Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch

Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it doesn't seem to accept the -hiveconf option.

{code}
hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
Starting Hive Thrift Server
java.lang.NumberFormatException: For input string: -hiveconf
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Integer.parseInt(Integer.java:449)
    at java.lang.Integer.parseInt(Integer.java:499)
    at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
{code}

Therefore, you need to issue a query like "set hbase.zookeeper.quorum=hdp0,hdp1,hdp2" every time. It's not convenient for separating the configuration between the server side and the client side.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
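The stack trace shows HiveServer.main feeding its first argument straight into Integer.parseInt as a port number, so any flag such as -hiveconf blows up. The kind of option scan the patch would need can be sketched as follows (class and field names here are illustrative assumptions, not Hive's actual code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ServerArgs {
    public int port = 10000;                               // assumed default Thrift port
    public Map<String, String> hiveconf = new LinkedHashMap<>();

    // Scan args for "-hiveconf key=value" pairs; treat any other
    // argument as the positional port number (the old behavior).
    public static ServerArgs parse(String[] args) {
        ServerArgs parsed = new ServerArgs();
        for (int i = 0; i < args.length; i++) {
            if ("-hiveconf".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);     // split into key and value
                parsed.hiveconf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                parsed.port = Integer.parseInt(args[i]);   // positional port argument
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        ServerArgs a = parse(new String[] {"-hiveconf", "hbase.zookeeper.quorum=hdp0,hdp1,hdp2"});
        System.out.println(a.hiveconf.get("hbase.zookeeper.quorum")); // prints "hdp0,hdp1,hdp2"
    }
}
```

With a scan like this, server-side settings can be supplied on the command line instead of being sent as a "set ..." query from every client.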
[jira] [Commented] (HIVE-1884) Potential risk of resource leaks in Hive
[ https://issues.apache.org/jira/browse/HIVE-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068179#comment-13068179 ]

John Sichi commented on HIVE-1884:
----------------------------------
+1. Will commit when tests pass.

Potential risk of resource leaks in Hive
----------------------------------------

Key: HIVE-1884
URL: https://issues.apache.org/jira/browse/HIVE-1884
Project: Hive
Issue Type: Bug
Components: CLI, Metastore, Query Processor, Server Infrastructure
Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0
Environment: Hive 0.6.0, Hadoop 0.20.1, SUSE Linux Enterprise Server 11 (i586)
Reporter: Mohit Sikri
Assignee: Chinna Rao Lalam
Attachments: HIVE-1884.1.PATCH, HIVE-1884.2.patch, HIVE-1884.3.patch, HIVE-1884.4.patch, HIVE-1884.5.patch

h3. There are a couple of resource leaks.
h4. For example, in CliDriver.java, method processReader(), the buffered reader is not closed.

h3. There are also risks of resources getting leaked; in such cases we need to refactor the code to move the closing of resources into a finally block.
h4. For example, in Throttle.java, method checkJobTracker(), the following code snippet might cause a resource leak.

{code}
InputStream in = url.openStream();
in.read(buffer);
in.close();
{code}

Ideally, and as per best coding practices, it should be like below:

{code}
InputStream in = null;
try {
  in = url.openStream();
  int numRead = in.read(buffer);
} finally {
  IOUtils.closeStream(in);
}
{code}

Similar cases were found in ExplainTask.java, DDLTask.java, etc. We need to refactor all such occurrences.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
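The try/finally pattern the report recommends can be shown end-to-end in a small self-contained sketch. Here ByteArrayInputStream stands in for the URL stream and a plain null-checked close() for Hadoop's IOUtils.closeStream; the class and method names are illustrative:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SafeRead {
    // Read up to buffer.length bytes, guaranteeing that the stream is
    // closed even when read() throws, which is the fix HIVE-1884 asks for.
    public static int readOnce(InputStream source, byte[] buffer) throws IOException {
        InputStream in = null;
        try {
            in = source;
            return in.read(buffer);
        } finally {
            if (in != null) {
                in.close();   // runs on both the success and the exception path
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[16];
        int n = readOnce(new ByteArrayInputStream("hello".getBytes()), buffer);
        System.out.println(n); // prints 5: the number of bytes read
    }
}
```

On Java 7 and later the same guarantee is usually written as a try-with-resources block, which closes the stream automatically.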
[jira] [Commented] (HIVE-707) add group_concat
[ https://issues.apache.org/jira/browse/HIVE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068182#comment-13068182 ]

guyan commented on HIVE-707:
----------------------------
Hi all, will this issue be resolved?

add group_concat
----------------

Key: HIVE-707
URL: https://issues.apache.org/jira/browse/HIVE-707
Project: Hive
Issue Type: New Feature
Components: Query Processor
Reporter: Namit Jain
Assignee: Min Zhou

Moving the discussion to a new jira:

I've implemented group_cat() in a rush, and found some things difficult to solve:
1. The function group_cat() has an internal order by clause; currently, we can't implement such an aggregation in hive.
2. When the strings to be group-concatenated are too large (in other words, if data skew appears), there is often not enough memory to store such a big result.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append
Allow ShimLoader to work with Hadoop 0.20-append
------------------------------------------------

Key: HIVE-2294
URL: https://issues.apache.org/jira/browse/HIVE-2294
Project: Hive
Issue Type: Bug
Affects Versions: 0.7.1
Reporter: YoungWoo Kim
Assignee: YoungWoo Kim
Priority: Trivial
Fix For: 0.8.0

If we are running Hive with Hadoop 0.20-append, the Hive ShimLoader does not get the Hadoop version correctly. A suffix starting with '-' should be removed from the major version info.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
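The fix the issue describes, stripping a '-' suffix before deriving the major version, can be sketched as below. The getMajorVersion helper is illustrative; ShimLoader's real parsing logic may differ:

```java
public class VersionParse {
    // Reduce a Hadoop version string to its "major.minor" prefix,
    // dropping any '-' suffix such as "-append" first.
    public static String getMajorVersion(String version) {
        int dash = version.indexOf('-');
        if (dash != -1) {
            version = version.substring(0, dash);   // "0.20-append" -> "0.20"
        }
        String[] parts = version.split("\\.");
        return parts[0] + "." + parts[1];           // keep only major.minor
    }

    public static void main(String[] args) {
        System.out.println(getMajorVersion("0.20-append")); // prints "0.20"
        System.out.println(getMajorVersion("0.20.2"));      // prints "0.20"
    }
}
```

Without the suffix strip, "0.20-append" would yield a major version token of "20-append", which no shim matches.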
[jira] [Updated] (HIVE-2283) Backtracking real column names for EXPLAIN output
[ https://issues.apache.org/jira/browse/HIVE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-2283:
------------------------
Attachment: HIVE-2283.test.patch
            HIVE-2283.2.patch

Bug fixes; added ql/src/test/queries/clientpositive/explain_columns.q as requested.

Backtracking real column names for EXPLAIN output
-------------------------------------------------

Key: HIVE-2283
URL: https://issues.apache.org/jira/browse/HIVE-2283
Project: Hive
Issue Type: Improvement
Components: Query Processor
Affects Versions: 0.8.0
Reporter: Navis
Priority: Minor
Attachments: HIVE-2283.1.patch, HIVE-2283.2.patch, HIVE-2283.test.patch

GUI people suggested that showing real column names in the result of an EXPLAIN statement would make customers feel more comfortable with Hive. I agreed and am working on it.

{code}
a. current EXPLAIN
Select Operator
  expressions:
    expr: _col10 type: int
    expr: _col17 type: string
Group By Operator
  keys:
    expr: _col0 type: int
    expr: _col17 type: int

b. suggested EXPLAIN
Select Operator
  expressions: _col10=t2.key_int1, _col17=upper(t1.key_int1), _col22=t3.key_string2
Group By Operator
  keys: _col10=t2.key_int1, _col17=upper(t1.key_int1)
{code}

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append
[ https://issues.apache.org/jira/browse/HIVE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YoungWoo Kim updated HIVE-2294:
-------------------------------
Attachment: HIVE-2294.1.patch

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append
[ https://issues.apache.org/jira/browse/HIVE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YoungWoo Kim updated HIVE-2294:
-------------------------------
Status: Patch Available (was: Open)

Patch for HIVE-2294

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append
[ https://issues.apache.org/jira/browse/HIVE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

YoungWoo Kim updated HIVE-2294:
-------------------------------
Attachment: HIVE-2294.2.patch

Re-create a patch from svn

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2295) Implement CLUSTERED BY, DISTRIBUTED BY, SORTED BY directives for a single query level.
Implement CLUSTERED BY, DISTRIBUTED BY, SORTED BY directives for a single query level.
--------------------------------------------------------------------------------------

Key: HIVE-2295
URL: https://issues.apache.org/jira/browse/HIVE-2295
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Adam Kramer

The common pattern for utilizing the mapreduce framework looks like this:

SELECT TRANSFORM(a.foo, a.bar) USING 'mapper.py' AS x, y, z
FROM (
  SELECT b.foo, b.bar FROM tablename b CLUSTER BY b.foo
) a;

...however, this is exceptionally fragile, as it relies on the assumption that Hive is not doing any magic in between the query steps. People familiar with SQL frequently assume that query steps are effectively separated from each other. CLUSTER BY, then, would guarantee that data are clustered on their way OUT of the query, but really what we need is a directive to indicate that data must be clustered on the way INTO the query. This is not pedantic, because there is no reason that Hive wouldn't try to optimize data flow between queries, for example by systematically splitting up big queries. The UDAF framework, with its merging step, would allow a single key/value pair to be split across SEVERAL reducers, violating the mapreduce assumptions but returning the correct data... however, for a TRANSFORM statement, no such protections are afforded.

I propose, for greater clarity, that these directives be part of the same query level. Example syntax:

SELECT TRANSFORM(foo, bar) USING 'reducer.py' AS x, y, z
FROM tablename CLUSTERED BY foo;

...in other words, move the directive regarding data distribution to the query that actually cares about it, allowing users who are making the assumptions of the mapreduce framework to formally indicate that their transformer really DOES need clustered data.

Or to put it another way: CLUSTER BY is a directive guaranteeing that data are clustered on the way OUT OF a query (i.e., for bucketed tables), whereas CLUSTERED BY is a directive guaranteeing that data are clustered on the way INTO a query.

Bonus points: for tables that are already CLUSTERED BY in their definition, allow this query to run in the map phase.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1434) Cassandra Storage Handler
[ https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068396#comment-13068396 ]

Edward Capriolo commented on HIVE-1434:
---------------------------------------
It is now pretty easy to take the Brisk jar and drop it into Hive: https://github.com/riptano/hive/wiki/Cassandra-Handler-usage-in-Hive-0.7-with-Cassandra-0.7

Also, the Brisk version of the handler has more features than this one, as it can transpose wide rows into long columns. I think at this point we might as well abandon trying to get this code into Hive. It is much easier to code/innovate it as an external project with git than inside hadoop-hive.

Cassandra Storage Handler
-------------------------

Key: HIVE-1434
URL: https://issues.apache.org/jira/browse/HIVE-1434
Project: Hive
Issue Type: New Feature
Affects Versions: 0.7.0
Reporter: Edward Capriolo
Assignee: Edward Capriolo
Attachments: cas-handle.tar.gz, cass_handler.diff, hive-1434-1.txt, hive-1434-2-patch.txt, hive-1434-2011-02-26.patch.txt, hive-1434-2011-03-07.patch.txt, hive-1434-2011-03-07.patch.txt, hive-1434-2011-03-14.patch.txt, hive-1434-3-patch.txt, hive-1434-4-patch.txt, hive-1434-5.patch.txt, hive-1434.2011-02-27.diff.txt, hive-cassandra.2011-02-25.txt, hive.diff

Add a cassandra storage handler.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2080) Few code improvements in the ql and serde packages.
[ https://issues.apache.org/jira/browse/HIVE-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HIVE-2080:
-----------------------------------
Attachment: HIVE-2080.1.Patch

Few code improvements in the ql and serde packages.
---------------------------------------------------

Key: HIVE-2080
URL: https://issues.apache.org/jira/browse/HIVE-2080
Project: Hive
Issue Type: Bug
Components: Query Processor, Serializers/Deserializers
Affects Versions: 0.7.0
Environment: Hadoop 0.20.1, Hive 0.7.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Attachments: HIVE-2080.1.Patch, HIVE-2080.Patch

Few code improvements in the ql and serde packages:
1) Small performance improvements
2) Null checks to avoid NPEs
3) Effective variable management

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: Few code improvements in the ql and serde packages.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1144/ ---

Review request for hive.

Summary
-------

Few code improvements in the ql and serde packages:
1) Small performance improvements
2) Null checks to avoid NPEs
3) Effective variable management

This addresses bug HIVE-2080.
https://issues.apache.org/jira/browse/HIVE-2080

Diffs
-----

trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/UnionOperator.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ASTNode.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 1148179
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1148179
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeField.java 1148179
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldType.java 1148179
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFunction.java 1148179

Diff: https://reviews.apache.org/r/1144/diff

Testing
-------

All unit tests passed

Thanks,
chinna
[jira] [Updated] (HIVE-2183) In Task class and its subclasses logger is initialized in constructor
[ https://issues.apache.org/jira/browse/HIVE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chinna Rao Lalam updated HIVE-2183:
-----------------------------------
Attachment: HIVE-2183.1.patch

In Task class and its subclasses logger is initialized in constructor
---------------------------------------------------------------------

Key: HIVE-2183
URL: https://issues.apache.org/jira/browse/HIVE-2183
Project: Hive
Issue Type: Bug
Components: Query Processor
Affects Versions: 0.5.0, 0.8.0
Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Priority: Minor
Attachments: HIVE-2183.1.patch, HIVE-2183.patch

In the Task class and its subclasses, the logger is initialized in the constructor. The Log object does not need to be initialized every time in the constructor; it can be made a static object.

{noformat}
Ex:
public ExecDriver() {
  super();
  LOG = LogFactory.getLog(this.getClass().getName());
  console = new LogHelper(LOG);
  this.jobExecHelper = new HadoopJobExecHelper(job, console, this, this);
}
{noformat}

This needs to change to something like:

{noformat}
private static final Log LOG = LogFactory.getLog(ExecDriver.class);
{noformat}

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
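The fix the issue describes, hoisting per-instance logger creation into a single static field, can be sketched with java.util.logging standing in for commons-logging (an assumption for the sake of a self-contained example; Hive itself uses org.apache.commons.logging.LogFactory):

```java
import java.util.logging.Logger;

public class LoggerDemo {
    // One shared Logger created once at class-load time, instead of
    // calling a log factory in every constructor invocation.
    private static final Logger LOG = Logger.getLogger(LoggerDemo.class.getName());

    public static Logger log() {
        return LOG;
    }

    public static void main(String[] args) {
        // Every caller observes the same Logger object.
        System.out.println(log() == log()); // prints "true"
    }
}
```

Besides avoiding redundant factory calls, the static field also removes the mutable LOG assignment from the constructor, so instances can no longer race on it.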
[jira] [Commented] (HIVE-2080) Few code improvements in the ql and serde packages.
[ https://issues.apache.org/jira/browse/HIVE-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068404#comment-13068404 ]

jirapos...@reviews.apache.org commented on HIVE-2080:
-----------------------------------------------------

Few code improvements in the ql and serde packages.
---------------------------------------------------

Key: HIVE-2080
URL: https://issues.apache.org/jira/browse/HIVE-2080
Project: Hive
Issue Type: Bug
Components: Query Processor, Serializers/Deserializers
Affects Versions: 0.7.0
Environment: Hadoop 0.20.1, Hive 0.7.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Attachments: HIVE-2080.1.Patch, HIVE-2080.Patch

Few code improvements in the ql and serde packages:
1) Small performance improvements
2) Null checks to avoid NPEs
3) Effective variable management

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: In Task class and its subclasses logger is initialized in constructor
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1146/ --- Review request for hive. Summary --- In the Task class and its subclasses, the logger is initialized in the constructor. The Log object does not need to be initialized on every construction; it can be made a static field. This addresses bug HIVE-2183. https://issues.apache.org/jira/browse/HIVE-2183 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 1145025 Diff: https://reviews.apache.org/r/1146/diff Testing --- All unit tests passed Thanks, chinna
[jira] [Commented] (HIVE-2183) In Task class and its subclasses logger is initialized in constructor
[ https://issues.apache.org/jira/browse/HIVE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068408#comment-13068408 ] jirapos...@reviews.apache.org commented on HIVE-2183: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1146/ --- Review request for hive. Summary --- In the Task class and its subclasses, the logger is initialized in the constructor. The Log object does not need to be initialized on every construction; it can be made a static field. This addresses bug HIVE-2183. https://issues.apache.org/jira/browse/HIVE-2183 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1145025 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 1145025 Diff: https://reviews.apache.org/r/1146/diff Testing --- All unit tests passed Thanks, chinna In Task class and its subclasses logger is initialized in constructor - Key: HIVE-2183 URL: https://issues.apache.org/jira/browse/HIVE-2183 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0, 0.8.0 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5) Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Priority: Minor Attachments: HIVE-2183.1.patch, HIVE-2183.patch In the Task class and its subclasses, the logger is initialized in the constructor. The Log object does not need to be initialized on every construction; it can be made a static field.
{noformat}
Ex:
public ExecDriver() {
  super();
  LOG = LogFactory.getLog(this.getClass().getName());
  console = new LogHelper(LOG);
  this.jobExecHelper = new HadoopJobExecHelper(job, console, this, this);
}
{noformat}
It needs to change to:
{noformat}
private static final Log LOG = LogFactory.getLog(ExecDriver.class);
{noformat}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
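The change described above can be sketched with a runnable analogue. This is a hedged illustration, not Hive's actual code: it uses java.util.logging (JDK stdlib) in place of commons-logging so it compiles standalone, and the class name ExecDriverLoggingSketch is hypothetical. The point is that a per-class logger is shared, cached state, so re-fetching it in every constructor just repeats the work that one static final field does once.

```java
import java.util.logging.Logger;

public class ExecDriverLoggingSketch {

    // After the change: one logger per class, looked up once at class load.
    private static final Logger LOG =
        Logger.getLogger(ExecDriverLoggingSketch.class.getName());

    // Before the change (simulated): every construction re-fetches the logger.
    private final Logger perInstanceLog;

    public ExecDriverLoggingSketch() {
        perInstanceLog = Logger.getLogger(this.getClass().getName());
    }

    public static Logger staticLog() { return LOG; }
    public Logger instanceLog() { return perInstanceLog; }
}
```

Because named loggers are cached by the logging framework, the per-instance lookup returns the very same object the static field already holds, which is why the constructor-time initialization is pure overhead.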
[jira] [Updated] (HIVE-2184) Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
[ https://issues.apache.org/jira/browse/HIVE-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-2184: --- Attachment: HIVE-2184.2.patch Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close() --- Key: HIVE-2184 URL: https://issues.apache.org/jira/browse/HIVE-2184 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.5.0, 0.8.0 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5) Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-2184.1.patch, HIVE-2184.1.patch, HIVE-2184.2.patch, HIVE-2184.patch 1) Hive.close() calls HiveMetaStoreClient.close(); in that method the variable standAloneClient never becomes true, so client.shutdown() is never called. 2) In Hive.close(), after calling metaStoreClient.close(), metaStoreClient needs to be set to null. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
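The two fixes described in HIVE-2184 can be sketched as follows. This is a hedged, self-contained illustration: MetaStoreClientLike, CountingClient, and HiveCloseSketch are hypothetical stand-ins, not Hive's real classes. It shows only the intended close-once-and-null-out behavior.

```java
public class HiveCloseSketch {
    public interface MetaStoreClientLike { void close(); }

    // Test double that records how many times close() is invoked.
    public static class CountingClient implements MetaStoreClientLike {
        public int closeCalls = 0;
        public void close() { closeCalls++; }
    }

    private MetaStoreClientLike metaStoreClient;

    public HiveCloseSketch(MetaStoreClientLike client) {
        metaStoreClient = client;
    }

    // Releases the client exactly once and drops the reference, so repeated
    // close() calls are safe and later code can never reuse a closed client.
    public void close() {
        if (metaStoreClient != null) {
            metaStoreClient.close();
            metaStoreClient = null;  // fix (2) from the issue description
        }
    }

    public boolean hasClient() { return metaStoreClient != null; }
}
```

Nulling the field after close is what makes the method idempotent: a second call finds no client and does nothing, instead of shutting down the same connection twice.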
[jira] [Updated] (HIVE-2183) In Task class and its subclasses logger is initialized in constructor
[ https://issues.apache.org/jira/browse/HIVE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-2183: --- Status: Patch Available (was: Open) In Task class and its subclasses logger is initialized in constructor - Key: HIVE-2183 URL: https://issues.apache.org/jira/browse/HIVE-2183 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.5.0, 0.8.0 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5) Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Priority: Minor Attachments: HIVE-2183.1.patch, HIVE-2183.patch In the Task class and its subclasses, the logger is initialized in the constructor. The Log object does not need to be initialized on every construction; it can be made a static field.
{noformat}
Ex:
public ExecDriver() {
  super();
  LOG = LogFactory.getLog(this.getClass().getName());
  console = new LogHelper(LOG);
  this.jobExecHelper = new HadoopJobExecHelper(job, console, this, this);
}
{noformat}
It needs to change to:
{noformat}
private static final Log LOG = LogFactory.getLog(ExecDriver.class);
{noformat}
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2080) Few code improvements in the ql and serde packages.
[ https://issues.apache.org/jira/browse/HIVE-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-2080: --- Status: Patch Available (was: Open) Few code improvements in the ql and serde packages. --- Key: HIVE-2080 URL: https://issues.apache.org/jira/browse/HIVE-2080 Project: Hive Issue Type: Bug Components: Query Processor, Serializers/Deserializers Affects Versions: 0.7.0 Environment: Hadoop 0.20.1, Hive 0.7.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5). Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-2080.1.Patch, HIVE-2080.Patch Few code improvements in the ql and serde packages. 1) Small performance improvements 2) Null checks to avoid NPEs 3) More effective variable management. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: HIVE-1078: CREATE VIEW followup: CREATE OR REPLACE
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1058/ --- (Updated 2011-07-20 18:03:04.848086) Review request for hive. Changes --- Add testcases, fixed issues in comments above (btw the previous revision passed unit tests) Summary --- https://issues.apache.org/jira/browse/HIVE-1078 This addresses bug HIVE-1078. https://issues.apache.org/jira/browse/HIVE-1078 Diffs (updated) - http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view5.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view6.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view7.q PRE-CREATION 
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view8.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/recursive_view.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view5.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view6.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view7.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view8.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out 1146902 Diff: https://reviews.apache.org/r/1058/diff Testing --- Passes unit tests Thanks, Charles
[jira] [Commented] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE
[ https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068519#comment-13068519 ] jirapos...@reviews.apache.org commented on HIVE-1078: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1058/ --- (Updated 2011-07-20 18:03:04.848086) Review request for hive. Changes --- Add testcases, fixed issues in comments above (btw the previous revision passed unit tests) Summary --- https://issues.apache.org/jira/browse/HIVE-1078 This addresses bug HIVE-1078. https://issues.apache.org/jira/browse/HIVE-1078 Diffs (updated) - http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java 1146902 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view5.q PRE-CREATION 
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view6.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view7.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view8.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/recursive_view.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view5.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view6.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view7.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view8.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out PRE-CREATION http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out 1146902 Diff: https://reviews.apache.org/r/1058/diff Testing --- Passes unit tests Thanks, Charles CREATE VIEW followup: CREATE 
OR REPLACE Key: HIVE-1078 URL: https://issues.apache.org/jira/browse/HIVE-1078 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Charles Chen Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch Currently, replacing a view requires DROP VIEW v; CREATE VIEW v AS new-definition; CREATE OR REPLACE would allow these to be combined into a single operation. -- This message is automatically generated by JIRA. For more information
[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE
[ https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Chen updated HIVE-1078: --- Attachment: HIVE-1078v8.patch CREATE VIEW followup: CREATE OR REPLACE Key: HIVE-1078 URL: https://issues.apache.org/jira/browse/HIVE-1078 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Charles Chen Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch Currently, replacing a view requires DROP VIEW v; CREATE VIEW v AS new-definition; CREATE OR REPLACE would allow these to be combined into a single operation. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE
[ https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Chen updated HIVE-1078: --- Status: Patch Available (was: Open) CREATE VIEW followup: CREATE OR REPLACE Key: HIVE-1078 URL: https://issues.apache.org/jira/browse/HIVE-1078 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Charles Chen Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch Currently, replacing a view requires DROP VIEW v; CREATE VIEW v AS new-definition; CREATE OR REPLACE would allow these to be combined into a single operation. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2289) NumberFormatException with respect to _offsets when running a query with index
[ https://issues.apache.org/jira/browse/HIVE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068529#comment-13068529 ] siddharth ramanan commented on HIVE-2289: - Thanks John for your really quick replies. I just have one final question: from the Hive page, I understand that there is considerable overhead in running queries, which in turn affects performance (query response time). Can Hive be configured so that the response time for a query over, say, a million rows is under 5 seconds (these numbers are only examples)? I understand this question is quite subjective (it depends on the cluster, the configuration of the machines used, etc.), but I am confused about Hive's performance over huge data. Thanks, Siddharth NumberFormatException with respect to _offsets when running a query with index --- Key: HIVE-2289 URL: https://issues.apache.org/jira/browse/HIVE-2289 Project: Hive Issue Type: Bug Components: Indexing Affects Versions: 0.7.0 Environment: RedHat 5 Reporter: siddharth ramanan I have a table named foo with columns origin, destination and information. 
Steps I followed to create the index foosample for foo:
1) create index foosample on table foo(origin) as 'compact' with deferred rebuild;
2) alter index foosample on foo rebuild;
3) insert overwrite directory /tmp/index_result select '_bucketname','_offsets' from default__foo_foosample__ where origin='WAW';
4) set hive.index.compact.file=/tmp/index_result;
5) set hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;
6) select * from foo where origin='WAW';
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.NumberFormatException: For input string: _offsets
  at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
  at java.lang.Long.parseLong(Long.java:410)
  at java.lang.Long.parseLong(Long.java:468)
  at org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.add(HiveCompactIndexResult.java:158)
  at org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.init(HiveCompactIndexResult.java:107)
  at org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat.getSplits(HiveCompactIndexInputFormat.java:89)
  at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
  at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
  at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
  at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:657)
  at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
  at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
  at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
  at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
  at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
  at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
  at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
  at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
  at java.lang.reflect.Method.invoke(Method.java:597)
  at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Job Submission failed with exception 'java.lang.NumberFormatException(For input string: _offsets)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
Steps 2 and 3 ran a successful mapreduce job, and the table default__foo_foosample__ (the index table) has data with three columns: origin, _bucketname and _offsets. Thanks, Siddharth -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
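The failure mode in the stack trace can be reproduced in isolation. The likely trigger, an assumption based on the steps above rather than anything confirmed in this thread, is that step 3 selects the quoted string literals '_bucketname' and '_offsets' rather than the index table's columns, so the file handed to the compact index reader contains the literal text "_offsets" where Long.parseLong expects a numeric byte offset:

```java
public class OffsetsParseSketch {
    // Mirrors the parse that the compact index code performs on the offsets
    // field; a non-numeric field raises NumberFormatException, as in the trace.
    public static boolean parsesAsOffset(String field) {
        try {
            Long.parseLong(field);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }
}
```

If step 3 instead referenced the columns themselves (e.g. with backtick-quoted identifiers), the directory would contain real byte offsets, which parse cleanly.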
[jira] [Updated] (HIVE-2224) Ability to add partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2224: Summary: Ability to add partitions atomically (was: Ability to add_partitions, and atomically) Ability to add partitions atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
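The rollback mechanics sketched in the description (cache the created directories, remove them on failure, fire events only at the end) might look like this. All names here are hypothetical illustrations, not the metastore's actual API:

```java
import java.util.ArrayList;
import java.util.List;

public class AtomicAddPartitionsSketch {
    // Minimal filesystem stand-in; real code would talk to HDFS.
    public interface Fs {
        boolean mkdir(String path);   // returns false on failure
        void rmdir(String path);
    }

    // Returns true only if every partition directory was created; on the
    // first failure, removes everything created so far (the rollback).
    public static boolean addPartitionsAtomic(Fs fs, List<String> paths) {
        List<String> created = new ArrayList<>();
        for (String p : paths) {
            if (fs.mkdir(p)) {
                created.add(p);
            } else {
                for (String c : created) fs.rmdir(c);  // undo partial work
                return false;
            }
        }
        // Real code would commit metadata and trigger the cached
        // AddPartitionEvents here, only after every directory exists.
        return true;
    }
}
```

The all-or-nothing contract is exactly what the issue asks for: callers never observe a half-added batch.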
[jira] [Commented] (HIVE-2224) Ability to add partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068536#comment-13068536 ] Paul Yang commented on HIVE-2224: - Seems like it was an issue with the machine. But it has been committed - thanks Sushanth! Ability to add partitions atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2224) Ability to add partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Yang updated HIVE-2224: Resolution: Fixed Fix Version/s: 0.8.0 Status: Resolved (was: Patch Available) Ability to add partitions atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.8.0 Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2209) Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object
[ https://issues.apache.org/jira/browse/HIVE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068540#comment-13068540 ] He Yongqiang commented on HIVE-2209: +1, will commit after tests pass. Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object - Key: HIVE-2209 URL: https://issues.apache.org/jira/browse/HIVE-2209 Project: Hive Issue Type: Improvement Reporter: Krishna Kumar Assignee: Krishna Kumar Priority: Minor Attachments: HIVE-2209v0.patch, HIVE-2209v2.patch, HIVE2209v1.patch Now ObjectInspectorUtils.compare throws an exception if a map is contained (recursively) within the objects being compared. Two obvious implementations are - a simple map comparer which assumes keys of the first map can be used to fetch values from the second - a 'cross-product' comparer which compares every pair of key-value pairs in the two maps, and calls a match if and only if all pairs are matched Note that it would be difficult to provide a transitive greater-than/less-than indication with maps so that is not in scope. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
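A minimal sketch of the first strategy above: the "simple" comparer that probes the second map with the first map's keys. The class and method names are illustrative, not the actual ObjectInspectorUtils API, and it reports only equal/not-equal, since a transitive less/greater ordering for maps is explicitly out of scope:

```java
import java.util.Map;

public class SimpleMapComparer {
    // Assumes keys of `a` can be used to fetch values from `b` -- exactly
    // the simplification the issue describes for the simple comparer.
    public static boolean mapsEqual(Map<?, ?> a, Map<?, ?> b) {
        if (a.size() != b.size()) return false;
        for (Map.Entry<?, ?> e : a.entrySet()) {
            Object other = b.get(e.getKey());
            if (other == null || !other.equals(e.getValue())) return false;
        }
        return true;
    }
}
```

The cross-product comparer mentioned in the description would instead match every key-value pair of one map against every pair of the other, which is what you need when keys themselves require deep comparison.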
[jira] [Commented] (HIVE-2289) NumberFormatException with respect to _offsets when running a query with index
[ https://issues.apache.org/jira/browse/HIVE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068581#comment-13068581 ] John Sichi commented on HIVE-2289: -- (This JIRA issue is not the right place for these discussions; mailing list u...@hive.apache.org is.) NumberFormatException with respect to _offsets when running a query with index --- Key: HIVE-2289 URL: https://issues.apache.org/jira/browse/HIVE-2289 Project: Hive Issue Type: Bug Components: Indexing Affects Versions: 0.7.0 Environment: RedHat 5 Reporter: siddharth ramanan I am having a table named foo with columns origin, destination and information. Steps I followed to create index named foosample for foo, 1)create index foosample on table foo(origin) as 'compact' with deferred rebuild; 2)alter index foosample on foo rebuild; 3)insert overwrite directory /tmp/index_result select '_bucketname','_offsets' from default__foo_foosample__ where origin='WAW'; 4)set hive.index.compact.file=/tmp/index_result; 5)set hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat; 6)select * from foo where origin='WAW'; Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks is set to 0 since there's no reduce operator java.lang.NumberFormatException: For input string: _offsets at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48) at java.lang.Long.parseLong(Long.java:410) at java.lang.Long.parseLong(Long.java:468) at org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.add(HiveCompactIndexResult.java:158) at org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.init(HiveCompactIndexResult.java:107) at org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat.getSplits(HiveCompactIndexInputFormat.java:89) at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810) at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781) at 
org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730) at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:657) at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Job Submission failed with exception 'java.lang.NumberFormatException(For input string: _offsets)' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask Steps 2 and 3 ran a successful mapreduce job and also the table default__foo_foosample__ (index table) has data with three columns origin, _bucketname and _offsets. Thanks, Siddharth -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1884) Potential risk of resource leaks in Hive
[ https://issues.apache.org/jira/browse/HIVE-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-1884: - Resolution: Fixed Fix Version/s: 0.8.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed. Thanks Chinna! Potential risk of resource leaks in Hive Key: HIVE-1884 URL: https://issues.apache.org/jira/browse/HIVE-1884 Project: Hive Issue Type: Bug Components: CLI, Metastore, Query Processor, Server Infrastructure Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0 Environment: Hive 0.6.0, Hadoop 0.20.1 SUSE Linux Enterprise Server 11 (i586) Reporter: Mohit Sikri Assignee: Chinna Rao Lalam Fix For: 0.8.0 Attachments: HIVE-1884.1.PATCH, HIVE-1884.2.patch, HIVE-1884.3.patch, HIVE-1884.4.patch, HIVE-1884.5.patch h3. There are a couple of resource leaks. h4. For example, in CliDriver.java, method processReader(), the buffered reader is not closed. h3. There are also risks of resources being leaked; in such cases we need to refactor the code to move the closing of resources into a finally block. h4. For example, in Throttle.java, method checkJobTracker(), the following code snippet might cause a resource leak.
{code}
InputStream in = url.openStream();
in.read(buffer);
in.close();
{code}
Ideally, and as per best coding practices, it should be like this:
{code}
InputStream in = null;
try {
  in = url.openStream();
  int numRead = in.read(buffer);
} finally {
  IOUtils.closeStream(in);
}
{code}
Similar cases were found in ExplainTask.java, DDLTask.java, etc. All such occurrences need to be refactored. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
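The try/finally pattern recommended above, shown as a self-contained sketch. It reads from an in-memory stream rather than a URL so it runs standalone, the quiet close in the finally block plays the role of Hadoop's IOUtils.closeStream, and the class name is hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SafeReadSketch {
    // Reads one byte and always releases the stream, even if read() throws.
    // Returns -1 on EOF or on a read error, keeping the sketch exception-free.
    public static int readFirstByte(InputStream in) {
        try {
            return in.read();
        } catch (IOException e) {
            return -1;
        } finally {
            // Equivalent of IOUtils.closeStream: close quietly on every path,
            // so the resource cannot leak when read() fails.
            try { in.close(); } catch (IOException ignored) { }
        }
    }
}
```

The key property is that close() sits on every exit path; the straight-line openStream/read/close version in the issue leaks the stream whenever read() throws.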
[jira] [Commented] (HIVE-2283) Backtracking real column names for EXPLAIN output
[ https://issues.apache.org/jira/browse/HIVE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068590#comment-13068590 ] John Sichi commented on HIVE-2283: -- Submit one combined patch with everything, including the .q.out file which we need for the test to pass. Once ready, add it to Review Board, upload it here, and then click the Submit Patch button. https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-ReviewProcess Backtracking real column names for EXPLAIN output - Key: HIVE-2283 URL: https://issues.apache.org/jira/browse/HIVE-2283 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.8.0 Reporter: Navis Priority: Minor Attachments: HIVE-2283.1.patch, HIVE-2283.2.patch, HIVE-2283.test.patch GUI people suggested that showing real column names in the result of an EXPLAIN statement would make customers feel more comfortable with Hive. I agreed and am working on it. {code} a. current EXPLAIN Select Operator expressions: expr: _col10 type: int expr: _col17 type: string Group By Operator keys: expr: _col0 type: int expr: _col17 type: int b. suggested EXPLAIN Select Operator expressions: _col10=t2.key_int1, _col17=upper(t1.key_int1), _col22=t3.key_string2 Group By Operator keys: _col10=t2.key_int1, _col17=upper(t1.key_int1) {code} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Build failed in Jenkins: Hive-trunk-h0.21 #836
See https://builds.apache.org/job/Hive-trunk-h0.21/836/ -- [...truncated 32873 lines...] [echo] Writing POM to https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/jdbc/pom.xml No ivy:settings found for the default reference 'ivy.instance'. A default instance will be used no settings file found, using default... :: loading settings :: url = jar:file:/home/hudson/.ant/lib/ivy-2.0.0-rc2.jar!/org/apache/ivy/core/settings/ivysettings.xml ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: check-ivy: create-dirs: compile-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks [javac] https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds deploy-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks [javac] https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds jar: init: install-hadoopcore: install-hadoopcore-default: ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-retrieve-hadoop-source: :: loading settings :: file = https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml [ivy:retrieve] :: resolving dependencies :: org.apache.hive#hive-hwi;0.8.0-SNAPSHOT [ivy:retrieve] confs: [default] [ivy:retrieve] found hadoop#core;0.20.1 in hadoop-source [ivy:retrieve] :: resolution report :: 
resolve 663ms :: artifacts dl 1ms - | |modules|| artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| - | default | 1 | 0 | 0 | 0 || 1 | 0 | - [ivy:retrieve] :: retrieving :: org.apache.hive#hive-hwi [ivy:retrieve] confs: [default] [ivy:retrieve] 0 artifacts copied, 1 already retrieved (0kB/1ms) install-hadoopcore-internal: setup: war: compile: [echo] Compiling: hwi [javac] https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/hwi/build.xml:71: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds jar: [echo] Jar: hwi make-pom: [echo] Writing POM to https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/hwi/pom.xml No ivy:settings found for the default reference 'ivy.instance'. A default instance will be used no settings file found, using default... :: loading settings :: url = jar:file:/home/hudson/.ant/lib/ivy-2.0.0-rc2.jar!/org/apache/ivy/core/settings/ivysettings.xml ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: check-ivy: create-dirs: compile-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks [javac] https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds deploy-ant-tasks: create-dirs: init: compile: [echo] Compiling: anttasks [javac] https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds jar: init: setup: compile: [echo] Compiling: hbase-handler [javac] https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build-common.xml:301: warning: 'includeantruntime' 
was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds [copy] Warning: https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/hbase-handler/src/java/conf does not exist. jar: [echo] Jar: hbase-handler make-pom: [echo] Writing POM to https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/hbase-handler/pom.xml No ivy:settings found for the default
[jira] [Commented] (HIVE-2224) Ability to add partitions atomically
[ https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068629#comment-13068629 ] Sushanth Sowmyan commented on HIVE-2224: Thanks! Ability to add partitions atomically Key: HIVE-2224 URL: https://issues.apache.org/jira/browse/HIVE-2224 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sushanth Sowmyan Assignee: Sushanth Sowmyan Fix For: 0.8.0 Attachments: HIVE-2224.patch I'd like to see an atomic version of the add_partitions() call. Whether this is to be done by config to affect add_partitions() behaviour (not my preference) or just changing add_partitions() default behaviour (my preference, but likely to affect current behaviour, so will need others' input) or by making a new add_partitions_atomic() call depends on discussion. This looks relatively doable to implement (will need a dependent add_partition_core to not do a ms.commit_partition() early, and to cache list of directories created to remove on rollback, and a list of AddPartitionEvent to trigger in one shot later) Thoughts? This also seems like something to implement for allowing HIVE-1805. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
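The rollback scheme described in the comment (cache what was created, undo it all on failure, commit once at the end) might look roughly like the sketch below. The class and the set-of-strings "metastore" are invented stand-ins for illustration, not the real metastore API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class AtomicAddPartitions {
    // Invented stand-in for metastore state: the set of committed partitions.
    final Set<String> committed = new LinkedHashSet<>();

    // Adds all partitions or none. Work done along the way is cached in
    // "created" so it can be undone on failure, and nothing reaches
    // "committed" until every partition has succeeded (one commit at the end).
    boolean addPartitionsAtomic(List<String> parts) {
        List<String> created = new ArrayList<>();
        try {
            for (String p : parts) {
                if (committed.contains(p)) {
                    throw new IllegalStateException("partition exists: " + p);
                }
                created.add(p); // stands in for a directory created on disk
            }
            committed.addAll(created); // single commit at the end
            return true;
        } catch (IllegalStateException e) {
            created.clear(); // rollback: drop everything created so far
            return false;
        }
    }

    public static void main(String[] args) {
        AtomicAddPartitions store = new AtomicAddPartitions();
        System.out.println(store.addPartitionsAtomic(Arrays.asList("ds=1", "ds=2"))); // true
        System.out.println(store.addPartitionsAtomic(Arrays.asList("ds=3", "ds=1"))); // false
        System.out.println(store.committed); // ds=3 was rolled back
    }
}
```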
[jira] [Updated] (HIVE-2296) bad compressed file names from insert into
[ https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklin Hu updated HIVE-2296: -- Affects Version/s: 0.8.0 bad compressed file names from insert into -- Key: HIVE-2296 URL: https://issues.apache.org/jira/browse/HIVE-2296 Project: Hive Issue Type: Bug Affects Versions: 0.8.0 Reporter: Franklin Hu Assignee: Franklin Hu When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0 00_0.gz_copy_1 Correct behavior should be to pick a valid filename -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2296) bad compressed file names from insert into
bad compressed file names from insert into -- Key: HIVE-2296 URL: https://issues.apache.org/jira/browse/HIVE-2296 Project: Hive Issue Type: Bug Reporter: Franklin Hu Assignee: Franklin Hu When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0 00_0.gz_copy_1 Correct behavior should be to pick a valid filename -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2296) bad compressed file names from insert into
[ https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklin Hu updated HIVE-2296: -- Description: When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0.gz 00_0.gz_copy_1 Correct behavior should be to pick a valid filename was: When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0 00_0.gz_copy_1 Correct behavior should be to pick a valid filename bad compressed file names from insert into -- Key: HIVE-2296 URL: https://issues.apache.org/jira/browse/HIVE-2296 Project: Hive Issue Type: Bug Affects Versions: 0.8.0 Reporter: Franklin Hu Assignee: Franklin Hu When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0.gz 00_0.gz_copy_1 Correct behavior should be to pick a valid filename -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: reduce name node calls in hive by creating temporary directories
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/952/ --- (Updated 2011-07-20 23:31:54.007436) Review request for hive, Yongqiang He, Ning Zhang, and namit jain. Changes --- 1. change block merge task too 2. change the capital file name Summary --- reduce name node calls in hive by creating temporary directories This addresses bug HIVE-2201. https://issues.apache.org/jira/browse/HIVE-2201 Diffs (updated) - trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFileOutputFormat.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java 1148905 Diff: https://reviews.apache.org/r/952/diff Testing --- Thanks, Siying
[jira] [Updated] (HIVE-2201) reduce name node calls in hive by creating temporary directories
[ https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2201: -- Attachment: HIVE-2201.4.patch 1. change block merge task too 2. change the capital file name reduce name node calls in hive by creating temporary directories Key: HIVE-2201 URL: https://issues.apache.org/jira/browse/HIVE-2201 Project: Hive Issue Type: Improvement Reporter: Namit Jain Assignee: Siying Dong Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, HIVE-2201.4.patch Currently, in Hive, when a file gets written by a FileSinkOperator, the sequence of operations is as follows: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp1/1 3. Move directory /tmp1 to /tmp2 4. For all files in /tmp2, remove all files starting with _tmp and duplicate files. Due to speculative execution, a lot of temporary files are created in /tmp1 (or /tmp2). This leads to a lot of name node calls, specially for large queries. The protocol above can be modified slightly: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1 3. Move directory /tmp2 to /tmp3 4. For all files in /tmp3, remove all duplicate files. This should reduce the number of tmp files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories
[ https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068715#comment-13068715 ] jirapos...@reviews.apache.org commented on HIVE-2201: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/952/ --- (Updated 2011-07-20 23:31:54.007436) Review request for hive, Yongqiang He, Ning Zhang, and namit jain. Changes --- 1. change block merge task too 2. change the capital file name Summary --- reduce name node calls in hive by creating temporary directories This addresses bug HIVE-2201. https://issues.apache.org/jira/browse/HIVE-2201 Diffs (updated) - trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFileOutputFormat.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 1148905 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java 1148905 Diff: https://reviews.apache.org/r/952/diff Testing --- Thanks, Siying reduce name node calls in hive by creating temporary directories Key: HIVE-2201 URL: https://issues.apache.org/jira/browse/HIVE-2201 Project: Hive Issue Type: Improvement Reporter: Namit Jain Assignee: Siying Dong Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, HIVE-2201.4.patch Currently, in Hive, when a file gets written by a FileSinkOperator, the sequence of operations is as follows: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp1/1 3. Move directory /tmp1 to /tmp2 4. For all files in /tmp2, remove all files starting with _tmp and duplicate files. Due to speculative execution, a lot of temporary files are created in /tmp1 (or /tmp2). 
This leads to a lot of name node calls, especially for large queries. The protocol above can be modified slightly: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1 3. Move directory /tmp2 to /tmp3 4. For all files in /tmp3, remove all duplicate files. This should reduce the number of tmp files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
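The modified protocol can be sketched with local java.nio.file moves standing in for the HDFS renames; the tmp1/tmp2/tmp3 names follow the issue's narrative and are purely illustrative:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TmpFileProtocol {
    // Walks the four steps of the modified protocol and returns the final
    // output directory (tmp3 in the description above).
    static Path run() throws IOException {
        Path base = Files.createTempDirectory("fsop");
        Path tmp1 = Files.createDirectory(base.resolve("tmp1"));
        Path tmp2 = Files.createDirectory(base.resolve("tmp2"));

        // Step 1: the task writes its output under a _tmp_ name in tmp1.
        Path taskFile = Files.createFile(tmp1.resolve("_tmp_1"));

        // Step 2 (the change): on success, move the file straight into tmp2
        // under its final name, so tmp2 never accumulates _tmp_ files from
        // failed or speculative attempts.
        Files.move(taskFile, tmp2.resolve("1"));

        // Step 3: move the whole directory tmp2 to the final location tmp3.
        Path tmp3 = base.resolve("tmp3");
        Files.move(tmp2, tmp3);

        // Step 4: only duplicate removal remains; no _tmp_ scan is needed.
        return tmp3;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(Files.exists(run().resolve("1"))); // true
    }
}
```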
[jira] [Updated] (HIVE-2296) bad compressed file names from insert into
[ https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklin Hu updated HIVE-2296: -- Attachment: hive-2296.1.patch bad compressed file names from insert into -- Key: HIVE-2296 URL: https://issues.apache.org/jira/browse/HIVE-2296 Project: Hive Issue Type: Bug Affects Versions: 0.8.0 Reporter: Franklin Hu Assignee: Franklin Hu Attachments: hive-2296.1.patch When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0.gz 00_0.gz_copy_1 Correct behavior should be to pick a valid filename -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2296) bad compressed file names from insert into
[ https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklin Hu updated HIVE-2296: -- Description: When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0.gz 00_0.gz_copy_1 This causes corrupted output when doing a SELECT * on the table. Correct behavior should be to pick a valid filename such as: 00_0_copy_1.gz was: When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0.gz 00_0.gz_copy_1 Correct behavior should be to pick a valid filename bad compressed file names from insert into -- Key: HIVE-2296 URL: https://issues.apache.org/jira/browse/HIVE-2296 Project: Hive Issue Type: Bug Affects Versions: 0.8.0 Reporter: Franklin Hu Assignee: Franklin Hu Attachments: hive-2296.1.patch When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0.gz 00_0.gz_copy_1 This causes corrupted output when doing a SELECT * on the table. Correct behavior should be to pick a valid filename such as: 00_0_copy_1.gz -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2296) bad compressed file names from insert into
[ https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Franklin Hu updated HIVE-2296: -- Attachment: hive-2296.2.patch add unit test bad compressed file names from insert into -- Key: HIVE-2296 URL: https://issues.apache.org/jira/browse/HIVE-2296 Project: Hive Issue Type: Bug Affects Versions: 0.8.0 Reporter: Franklin Hu Assignee: Franklin Hu Attachments: hive-2296.1.patch, hive-2296.2.patch When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in bad file names: Before INSERT INTO: 00_0.gz After INSERT INTO: 00_0.gz 00_0.gz_copy_1 This causes corrupted output when doing a SELECT * on the table. Correct behavior should be to pick a valid filename such as: 00_0_copy_1.gz -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Build failed in Jenkins: Hive-trunk-h0.21 #837
See https://builds.apache.org/job/Hive-trunk-h0.21/837/changes Changes: [jvs] HIVE-1884. Potential risk of resource leaks in Hive (Chinna Rao Lalam via jvs) [pauly] HIVE-2224. Ability to add partitions atomically (Sushanth Sowmyan via pauly) -- [...truncated 33321 lines...] [artifact:deploy] Uploading: org/apache/hive/hive-hbase-handler/0.8.0-SNAPSHOT/hive-hbase-handler-0.8.0-20110721.003735-38.jar to repository apache.snapshots.https at https://repository.apache.org/content/repositories/snapshots [artifact:deploy] Transferring 49K from apache.snapshots.https [artifact:deploy] Uploaded 49K [artifact:deploy] [INFO] Uploading project information for hive-hbase-handler 0.8.0-20110721.003735-38 [artifact:deploy] [INFO] Retrieving previous metadata from apache.snapshots.https [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot org.apache.hive:hive-hbase-handler:0.8.0-SNAPSHOT' [artifact:deploy] [INFO] Retrieving previous metadata from apache.snapshots.https [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact org.apache.hive:hive-hbase-handler' ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: /x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-resolve-maven-ant-tasks: [ivy:resolve] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml ivy-retrieve-maven-ant-tasks: [ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 'ivy.settings.file' instead [ivy:cachepath] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml mvn-taskdef: maven-publish-artifact: [artifact:install-provider] Installing provider: org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime [artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots [artifact:deploy] [INFO] Retrieving previous build number from apache.snapshots.https [artifact:deploy] Uploading: org/apache/hive/hive-hwi/0.8.0-SNAPSHOT/hive-hwi-0.8.0-20110721.003736-38.jar to repository apache.snapshots.https at https://repository.apache.org/content/repositories/snapshots [artifact:deploy] Transferring 23K from apache.snapshots.https [artifact:deploy] Uploaded 23K [artifact:deploy] [INFO] Retrieving previous metadata from apache.snapshots.https [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot org.apache.hive:hive-hwi:0.8.0-SNAPSHOT' [artifact:deploy] [INFO] Retrieving previous metadata from apache.snapshots.https [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact org.apache.hive:hive-hwi' [artifact:deploy] [INFO] Uploading project information for hive-hwi 0.8.0-20110721.003736-38 ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To: /x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build/ivy/lib/ivy-2.1.0.jar [get] Not modified - so not downloaded ivy-probe-antlib: ivy-init-antlib: ivy-init: ivy-resolve-maven-ant-tasks: [ivy:resolve] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml ivy-retrieve-maven-ant-tasks: [ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 'ivy.settings.file' instead [ivy:cachepath] :: loading settings :: file = /x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml mvn-taskdef: maven-publish-artifact: [artifact:install-provider] Installing provider: org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime [artifact:deploy] Deploying to https://repository.apache.org/content/repositories/snapshots [artifact:deploy] [INFO] Retrieving previous build number from apache.snapshots.https [artifact:deploy] Uploading: 
org/apache/hive/hive-jdbc/0.8.0-SNAPSHOT/hive-jdbc-0.8.0-20110721.003738-38.jar to repository apache.snapshots.https at https://repository.apache.org/content/repositories/snapshots [artifact:deploy] Transferring 56K from apache.snapshots.https [artifact:deploy] Uploaded 56K [artifact:deploy] [INFO] Uploading project information for hive-jdbc 0.8.0-20110721.003738-38 [artifact:deploy] [INFO] Retrieving previous metadata from apache.snapshots.https [artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot org.apache.hive:hive-jdbc:0.8.0-SNAPSHOT' [artifact:deploy] [INFO] Retrieving previous metadata from apache.snapshots.https [artifact:deploy] [INFO] Uploading repository metadata for: 'artifact org.apache.hive:hive-jdbc' ivy-init-dirs: ivy-download: [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar [get] To:
[jira] [Commented] (HIVE-1884) Potential risk of resource leaks in Hive
[ https://issues.apache.org/jira/browse/HIVE-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068745#comment-13068745 ] Hudson commented on HIVE-1884: -- Integrated in Hive-trunk-h0.21 #837 (See [https://builds.apache.org/job/Hive-trunk-h0.21/837/]) HIVE-1884. Potential risk of resource leaks in Hive (Chinna Rao Lalam via jvs) jvs : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148921 Files : * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java * /hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/util/typedbytes/TypedBytesWritableInput.java * /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java Potential risk of resource leaks in Hive Key: HIVE-1884 URL: https://issues.apache.org/jira/browse/HIVE-1884 Project: Hive Issue Type: Bug Components: CLI, Metastore, Query Processor, Server Infrastructure Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0 Environment: Hive 0.6.0, Hadoop 0.20.1 SUSE Linux Enterprise Server 11 (i586) Reporter: Mohit Sikri Assignee: Chinna Rao Lalam Fix For: 0.8.0 Attachments: HIVE-1884.1.PATCH, HIVE-1884.2.patch, HIVE-1884.3.patch, HIVE-1884.4.patch, HIVE-1884.5.patch h3.There are a couple of resource leaks. h4.For example, in CliDriver.java, method processReader(), the buffered reader is not closed. h3.There is also a risk of resources being leaked; in such cases we need to refactor the code to move the closing of resources into a finally block. h4.For example, in Throttle.java, method checkJobTracker(), the following code snippet might cause a resource leak.
{code} InputStream in = url.openStream(); in.read(buffer); in.close(); {code} Ideally, and as per best coding practices, it should be like below: {code} InputStream in = null; try { in = url.openStream(); int numRead = in.read(buffer); } finally { IOUtils.closeStream(in); } {code} Similar cases were found in ExplainTask.java, DDLTask.java, etc. All such occurrences need to be refactored. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2247) ALTER TABLE RENAME PARTITION
[ https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiyan Wang updated HIVE-2247: -- Attachment: HIVE-2247.5.patch.txt Use alter_partition(db_name, tbl_name, newPart, part_vals) to replace the rename_partition thrift API Add one authorization unit test to check that the new partition has the same privileges as the old one ALTER TABLE RENAME PARTITION Key: HIVE-2247 URL: https://issues.apache.org/jira/browse/HIVE-2247 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Weiyan Wang Attachments: HIVE-2247.3.patch.txt, HIVE-2247.4.patch.txt, HIVE-2247.5.patch.txt We need an ALTER TABLE RENAME PARTITION function that is similar to ALTER TABLE RENAME. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2247) ALTER TABLE RENAME PARTITION
[ https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068753#comment-13068753 ] jirapos...@reviews.apache.org commented on HIVE-2247: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1105/ --- (Updated 2011-07-21 01:20:25.242756) Review request for Siying Dong. Changes --- Refactor the code, rename_partition shares the same thrift API as alter_partition, we do alter_partition when part_vals is empty, we do rename_partition when part_vals is given Summary --- Implement ALTER TABLE PARTITION RENAME function to rename a partition. Add HiveQL syntax ALTER TABLE bar PARTITION (k1='v1', k2='v2') RENAME TO PARTITION (k1='v3', k2='v4'); This is my first Hive diff, I just learn everything from existing codebase and may not have a good understanding on it. Feel free to inform me if I make something wrong. Thanks This addresses bug HIVE-2247. https://issues.apache.org/jira/browse/HIVE-2247 Diffs (updated) - trunk/metastore/if/hive_metastore.thrift 1145366 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1145366 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1145366 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1145366 trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1145366 trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1145366 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1145366 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1145366 trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1145366 
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1145366 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java PRE-CREATION trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure.q PRE-CREATION trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure2.q PRE-CREATION trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure3.q PRE-CREATION trunk/ql/src/test/queries/clientpositive/alter_rename_partition.q PRE-CREATION trunk/ql/src/test/queries/clientpositive/alter_rename_partition_authorization.q PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out PRE-CREATION trunk/ql/src/test/results/clientpositive/alter_rename_partition.q.out PRE-CREATION trunk/ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out 
PRE-CREATION Diff: https://reviews.apache.org/r/1105/diff Testing --- Add a partition A in the table Rename partition A to partition B Show the partitions in the table, it returns partition B. SELECT the data from partition A, it returns no results SELECT the data from partition B, it returns the data originally stored in partition A Thanks, Weiyan ALTER TABLE RENAME PARTITION Key: HIVE-2247 URL: https://issues.apache.org/jira/browse/HIVE-2247 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Weiyan Wang Attachments: HIVE-2247.3.patch.txt,
[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories
[ https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068787#comment-13068787 ] He Yongqiang commented on HIVE-2201: +1, will commit after tests pass. reduce name node calls in hive by creating temporary directories Key: HIVE-2201 URL: https://issues.apache.org/jira/browse/HIVE-2201 Project: Hive Issue Type: Improvement Reporter: Namit Jain Assignee: Siying Dong Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, HIVE-2201.4.patch Currently, in Hive, when a file gets written by a FileSinkOperator, the sequence of operations is as follows: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp1/1 3. Move directory /tmp1 to /tmp2 4. For all files in /tmp2, remove all files starting with _tmp and duplicate files. Due to speculative execution, a lot of temporary files are created in /tmp1 (or /tmp2). This leads to a lot of name node calls, specially for large queries. The protocol above can be modified slightly: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1 3. Move directory /tmp2 to /tmp3 4. For all files in /tmp3, remove all duplicate files. This should reduce the number of tmp files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2209) Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object
[ https://issues.apache.org/jira/browse/HIVE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-2209:
---
Resolution: Fixed
Status: Resolved (was: Patch Available)

Committed, thanks Krishna Kumar!

Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object
-
Key: HIVE-2209
URL: https://issues.apache.org/jira/browse/HIVE-2209
Project: Hive
Issue Type: Improvement
Reporter: Krishna Kumar
Assignee: Krishna Kumar
Priority: Minor
Attachments: HIVE-2209v0.patch, HIVE-2209v2.patch, HIVE2209v1.patch

Now ObjectInspectorUtils.compare throws an exception if a map is contained (recursively) within the objects being compared. Two obvious implementations are:

- a simple map comparer, which assumes keys of the first map can be used to fetch values from the second
- a 'cross-product' comparer, which compares every pair of key-value pairs in the two maps, and calls a match if and only if all pairs are matched

Note that it would be difficult to provide a transitive greater-than/less-than indication with maps, so that is not in scope.

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
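The two comparison strategies described above can be sketched against plain java.util.Map. This is an illustration only, not the HIVE-2209 patch itself: the real implementations work through Hive's ObjectInspector machinery, and the class and method names here are invented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the two map-comparer strategies from HIVE-2209,
// reduced to equality checks on plain java.util.Map.
public class MapComparers {

    // Simple comparer: assume each key of the first map can be used to look
    // up the corresponding value in the second map directly.
    static boolean simpleEquals(Map<?, ?> a, Map<?, ?> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (Map.Entry<?, ?> e : a.entrySet()) {
            Object other = b.get(e.getKey());
            if (other == null || !other.equals(e.getValue())) {
                return false;
            }
        }
        return true;
    }

    // Cross-product comparer: compare every pair of entries across the two
    // maps, and call a match iff every entry of the first map matches some
    // entry of the second. Quadratic, but makes no lookup assumption.
    static boolean crossProductEquals(Map<?, ?> a, Map<?, ?> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (Map.Entry<?, ?> ea : a.entrySet()) {
            boolean matched = false;
            for (Map.Entry<?, ?> eb : b.entrySet()) {
                if (ea.getKey().equals(eb.getKey())
                        && ea.getValue().equals(eb.getValue())) {
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> m1 = new LinkedHashMap<>();
        m1.put("x", 1);
        m1.put("y", 2);
        Map<String, Integer> m2 = new LinkedHashMap<>();
        m2.put("y", 2);
        m2.put("x", 1);
        System.out.println(simpleEquals(m1, m2));        // true
        System.out.println(crossProductEquals(m1, m2));  // true
    }
}
```

The simple comparer is the cheap default when map keys hash and compare consistently across both objects; the cross-product comparer trades O(n²) pairings for robustness when they may not. Neither yields a total order, which is why greater-than/less-than is out of scope.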
Review Request: HIVE-2296 bad compressed file names when calling INSERT INTO
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1155/ ---

Review request for hive and Siying Dong.

Summary
---
Fixes the problem of bad compressed file names by stripping off the file-format extension (e.g. .gz) and reappending it to the path later.

This addresses bug HIVE-2296.
https://issues.apache.org/jira/browse/HIVE-2296

Diffs
-
trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1148973
trunk/ql/src/test/queries/clientpositive/insert_compressed.q PRE-CREATION
trunk/ql/src/test/results/clientpositive/insert_compressed.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/1155/diff

Testing
---
Unit tests pass

Thanks,
Franklin
[jira] [Commented] (HIVE-2296) bad compressed file names from insert into
[ https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068794#comment-13068794 ]

jirapos...@reviews.apache.org commented on HIVE-2296:
-
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1155/ ---

Review request for hive and Siying Dong.

Summary
---
Fixes the problem of bad compressed file names by stripping off the file-format extension (e.g. .gz) and reappending it to the path later.

This addresses bug HIVE-2296.
https://issues.apache.org/jira/browse/HIVE-2296

Diffs
-
trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1148973
trunk/ql/src/test/queries/clientpositive/insert_compressed.q PRE-CREATION
trunk/ql/src/test/results/clientpositive/insert_compressed.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/1155/diff

Testing
---
Unit tests pass

Thanks,
Franklin

bad compressed file names from insert into
--
Key: HIVE-2296
URL: https://issues.apache.org/jira/browse/HIVE-2296
Project: Hive
Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
Attachments: hive-2296.1.patch, hive-2296.2.patch

When INSERT INTO is run on a table with compressed output (hive.exec.compress.output=true) and existing files in the table, it may copy the new files in under bad file names:

Before INSERT INTO: 00_0.gz
After INSERT INTO: 00_0.gz, 00_0.gz_copy_1

This causes corrupted output when doing a SELECT * on the table. The correct behavior is to pick a valid filename such as: 00_0_copy_1.gz

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
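The renaming fix can be sketched as follows. This is only an illustration of the strip-and-reappend idea described in the review summary, not the actual code in Hive.java; the class and method names are invented.

```java
// Hypothetical sketch of the HIVE-2296 fix: when a destination file already
// exists, insert the "_copy_N" suffix before the compression extension
// rather than appending it after, so .gz readers still recognize the file.
public class CopyNameSketch {

    static String copyName(String fileName, int copyIndex) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            // No extension (uncompressed output): just append the suffix.
            return fileName + "_copy_" + copyIndex;
        }
        // Strip the extension, append the suffix, then reattach the extension.
        String base = fileName.substring(0, dot);
        String ext = fileName.substring(dot);
        return base + "_copy_" + copyIndex + ext;
    }

    public static void main(String[] args) {
        System.out.println(copyName("00_0.gz", 1));  // 00_0_copy_1.gz
        System.out.println(copyName("00_0", 1));     // 00_0_copy_1
    }
}
```

The point of the fix is that codecs and readers select a decompressor by file suffix, so a name ending in .gz_copy_1 is read as uncompressed garbage, while one ending in _copy_1.gz decompresses correctly.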