Re: Review Request: HIVE-1078: CREATE VIEW followup: CREATE OR REPLACE

2011-07-20 Thread John Sichi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1058/#review1128
---



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
https://reviews.apache.org/r/1058/#comment2356

Defer the db.getPartitions (which could be expensive) so that we don't do 
it unless we're sure that the partition keys are actually changing.



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1058/#comment2357

Avoid usage of java.util.Stack.  Some old Hive code uses it but it's 
deprecated because it's synchronized for no good reason.



http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
https://reviews.apache.org/r/1058/#comment2358

Add spaces around operators such as =.


- John


On 2011-07-20 01:01:53, Charles Chen wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/1058/
 ---
 
 (Updated 2011-07-20 01:01:53)
 
 
 Review request for hive.
 
 
 Summary
 ---
 
 https://issues.apache.org/jira/browse/HIVE-1078
 
 
 This addresses bug HIVE-1078.
 https://issues.apache.org/jira/browse/HIVE-1078
 
 
 Diffs
 -
 
   
 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
  1146902 
   
 http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
  1146902 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
  1146902 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
  1146902 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
  1146902 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java
  1146902 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out
  PRE-CREATION 
   
 http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out
  1146902 
 
 Diff: https://reviews.apache.org/r/1058/diff
 
 
 Testing
 ---
 
 Passes unit tests
 
 
 Thanks,
 
 Charles
 




[jira] [Commented] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068165#comment-13068165
 ] 

jirapos...@reviews.apache.org commented on HIVE-1078:
-





 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-20 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1078:
-

Status: Open  (was: Patch Available)


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: Cli: Print Hadoop's CPU milliseconds

2011-07-20 Thread Siying Dong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/948/
---

(Updated 2011-07-20 06:27:19.820431)


Review request for hive, Yongqiang He, Ning Zhang, and namit jain.


Changes
---

remove MapRedStats from DriverContext and add more counters to it.


Summary
---

In the Hive CLI, print out CPU milliseconds from the Hadoop MapReduce counters.


This addresses bug HIVE-2236.
https://issues.apache.org/jira/browse/HIVE-2236


Diffs (updated)
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 
1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148623 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1148623 

Diff: https://reviews.apache.org/r/948/diff


Testing
---

Ran the updated code against real clusters and verified that the printed output is 
correct.


Thanks,

Siying



[jira] [Commented] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068169#comment-13068169
 ] 

jirapos...@reviews.apache.org commented on HIVE-2236:
-





 Cli: Print Hadoop's CPU milliseconds
 

 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2236.1.patch, HIVE-2236.2.patch


 CPU milliseconds information is available from Hadoop's framework. Printing it 
 out to the Hive CLI when executing a job will help users learn more about their 
 jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-07-20 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2236:
--

Attachment: HIVE-2236.2.patch

Remove the MapRedStats list from DriverContext and add more counters.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2139) Enables HiveServer to accept -hiveconf option

2011-07-20 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068173#comment-13068173
 ] 

Carl Steinbach commented on HIVE-2139:
--

+1. Will commit if tests pass.

 Enables HiveServer to accept -hiveconf option
 -

 Key: HIVE-2139
 URL: https://issues.apache.org/jira/browse/HIVE-2139
 Project: Hive
  Issue Type: Improvement
  Components: CLI
 Environment: Linux + CDH3u0 (Hive 0.7.0+27.1-2~lucid-cdh3)
Reporter: Kazuki Ohta
Assignee: Patrick Hunt
 Attachments: HIVE-2139.patch, HIVE-2139.patch, HIVE-2139.patch


 Currently, I'm trying to test HiveHBaseIntegration on HiveServer, but it 
 doesn't seem to accept the -hiveconf option.
 {code}
 hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2
 Starting Hive Thrift Server
 java.lang.NumberFormatException: For input string: -hiveconf
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Integer.parseInt(Integer.java:449)
 at java.lang.Integer.parseInt(Integer.java:499)
 at org.apache.hadoop.hive.service.HiveServer.main(HiveServer.java:382)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
 {code}
 Therefore, you need to issue a query like set 
 hbase.zookeeper.quorum=hdp0,hdp1,hdp2 every time. This makes it inconvenient to 
 separate server-side configuration from client-side configuration.
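The stack trace above shows HiveServer.main parsing its first argument directly as a port number, so any flag such as -hiveconf is fed straight to Integer.parseInt. A minimal sketch of the kind of option scan the fix needs (the class and field names here are illustrative, not the actual HiveServer code):

```java
import java.util.HashMap;
import java.util.Map;

public class ServerArgs {
    public int port = 10000;  // assumed default Thrift port
    public Map<String, String> hiveconf = new HashMap<String, String>();

    // Accept an optional bare port argument plus repeated "-hiveconf key=value" pairs,
    // instead of treating args[0] unconditionally as the port.
    public static ServerArgs parse(String[] args) {
        ServerArgs parsed = new ServerArgs();
        for (int i = 0; i < args.length; i++) {
            if ("-hiveconf".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                parsed.hiveconf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                parsed.port = Integer.parseInt(args[i]);  // bare argument = port
            }
        }
        return parsed;
    }
}
```

With this scan, `hive --service hiveserver -hiveconf hbase.zookeeper.quorum=hdp0,hdp1,hdp2` would populate the configuration map instead of throwing NumberFormatException.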

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-1884) Potential risk of resource leaks in Hive

2011-07-20 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068179#comment-13068179
 ] 

John Sichi commented on HIVE-1884:
--

+1.  Will commit when tests pass.

 Potential risk of resource leaks in Hive
 

 Key: HIVE-1884
 URL: https://issues.apache.org/jira/browse/HIVE-1884
 Project: Hive
  Issue Type: Bug
  Components: CLI, Metastore, Query Processor, Server Infrastructure
Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0
 Environment: Hive 0.6.0, Hadoop 0.20.1
 SUSE Linux Enterprise Server 11 (i586)
Reporter: Mohit Sikri
Assignee: Chinna Rao Lalam
 Attachments: HIVE-1884.1.PATCH, HIVE-1884.2.patch, HIVE-1884.3.patch, 
 HIVE-1884.4.patch, HIVE-1884.5.patch


 h3. There are a couple of resource leaks.
 h4. For example:
 In CliDriver.java, the method processReader() does not close its buffered reader.
 h3. There are also other places at risk of leaking resources; in such cases we need to 
 refactor the code to move the closing of resources into a finally block.
 h4. For example:
 In Throttle.java, the method checkJobTracker() contains the following code snippet, 
 which might cause a resource leak.
 {code}
 InputStream in = url.openStream();
 in.read(buffer);
 in.close();
 {code}
 Ideally, and per best coding practices, it should look like this:
 {code}
 InputStream in = null;
 try {
   in = url.openStream();
   int numRead = in.read(buffer);
 } finally {
   IOUtils.closeStream(in);
 }
 {code}
 Similar cases were found in ExplainTask.java, DDLTask.java, etc. All such 
 occurrences need to be refactored.
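For reference, the finally-based pattern recommended above can be written as a complete, compilable helper. Since Hadoop's IOUtils.closeStream is essentially a null-safe, exception-swallowing close, this sketch inlines that behavior to stay dependency-free (the class and method names are illustrative, not actual Hive code):

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

public class SafeRead {
    // Read up to buffer.length bytes from a URL, guaranteeing the stream is
    // closed even if read() throws. Returns the number of bytes read, or -1 at EOF.
    public static int readFully(URL url, byte[] buffer) throws IOException {
        InputStream in = null;
        try {
            in = url.openStream();
            return in.read(buffer);
        } finally {
            if (in != null) {
                try {
                    in.close();
                } catch (IOException ignored) {
                    // swallow close() failures, matching IOUtils.closeStream behavior
                }
            }
        }
    }
}
```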

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-707) add group_concat

2011-07-20 Thread guyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068182#comment-13068182
 ] 

guyan commented on HIVE-707:


Hi all, will this issue be resolved?

 add group_concat
 

 Key: HIVE-707
 URL: https://issues.apache.org/jira/browse/HIVE-707
 Project: Hive
  Issue Type: New Feature
  Components: Query Processor
Reporter: Namit Jain
Assignee: Min Zhou

 Moving the discussion to a new JIRA:
 I've implemented group_cat() in a rush, and found some things difficult to 
 solve:
 1. The function group_cat() has an internal order-by clause; currently, we can't 
 implement such an aggregation in Hive.
 2. When the strings to be group-concatenated are too large (in other words, when 
 data skew appears), there is often not enough memory to store such a big 
 result.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append

2011-07-20 Thread YoungWoo Kim (JIRA)
Allow ShimLoader to work with Hadoop 0.20-append


 Key: HIVE-2294
 URL: https://issues.apache.org/jira/browse/HIVE-2294
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.7.1
Reporter: YoungWoo Kim
Assignee: YoungWoo Kim
Priority: Trivial
 Fix For: 0.8.0


If we are running Hive with Hadoop 0.20-append, Hive's ShimLoader does not determine 
the Hadoop version correctly. A suffix starting with '-' should be removed from the 
major version info.
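The fix described amounts to trimming everything from the first '-' before extracting the major version. A minimal, dependency-free sketch (the class and method names are hypothetical, not the actual ShimLoader code):

```java
public class VersionUtil {
    // Extract the "major.minor" prefix from a Hadoop version string,
    // dropping any suffix introduced by a '-' (e.g. "0.20-append" -> "0.20").
    public static String getMajorVersion(String version) {
        int dash = version.indexOf('-');
        String trimmed = (dash >= 0) ? version.substring(0, dash) : version;
        String[] parts = trimmed.split("\\.");
        if (parts.length < 2) {
            throw new IllegalArgumentException("Illegal Hadoop version: " + version);
        }
        return parts[0] + "." + parts[1];
    }
}
```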

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2283) Backtracking real column names for EXPLAIN output

2011-07-20 Thread Navis (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Navis updated HIVE-2283:


Attachment: HIVE-2283.test.patch
HIVE-2283.2.patch

Bug fixes; added ql/src/test/queries/clientpositive/explain_columns.q as 
requested.

 Backtracking real column names for EXPLAIN output
 -

 Key: HIVE-2283
 URL: https://issues.apache.org/jira/browse/HIVE-2283
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Navis
Priority: Minor
 Attachments: HIVE-2283.1.patch, HIVE-2283.2.patch, 
 HIVE-2283.test.patch


 GUI people suggested that showing real column names in the result of an EXPLAIN 
 statement would make customers feel more comfortable with Hive. I agreed and am 
 working on it.
 {code}
 a. current EXPLAIN
  Select Operator
expressions:
  expr: _col10
  type: int
  expr: _col17
  type: string
Group By Operator
  keys:
expr: _col0
type: int
expr: _col17
type: int
 b. suggested EXPLAIN
  Select Operator
expressions: _col10=t2.key_int1, _col17=upper(t1.key_int1), 
 _col22=t3.key_string2
Group By Operator
  keys: _col10=t2.key_int1, _col17=upper(t1.key_int1)
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append

2011-07-20 Thread YoungWoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YoungWoo Kim updated HIVE-2294:
---

Attachment: HIVE-2294.1.patch


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append

2011-07-20 Thread YoungWoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YoungWoo Kim updated HIVE-2294:
---

Status: Patch Available  (was: Open)

Patch for HIVE-2294


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2294) Allow ShimLoader to work with Hadoop 0.20-append

2011-07-20 Thread YoungWoo Kim (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YoungWoo Kim updated HIVE-2294:
---

Attachment: HIVE-2294.2.patch

Re-create a patch from svn


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2295) Implement CLUSTERED BY, DISTRIBUTED BY, SORTED BY directives for a single query level.

2011-07-20 Thread Adam Kramer (JIRA)
Implement CLUSTERED BY, DISTRIBUTED BY, SORTED BY directives for a single query 
level.
--

 Key: HIVE-2295
 URL: https://issues.apache.org/jira/browse/HIVE-2295
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Adam Kramer


The common framework for utilizing the mapreduce framework looks like this:

SELECT TRANSFORM(a.foo, a.bar)
USING 'mapper.py'
AS x, y, z
FROM (
  SELECT b.foo, b.bar
  FROM tablename b
  CLUSTER BY b.foo
) a;

...however, this is exceptionally fragile, as it relies on the assumption that 
Hive is not doing any magic in between the query steps. People familiar with 
SQL frequently assume that query steps are effectively separated from each 
other. CLUSTER BY, then, would guarantee that data are clustered on their way 
OUT of the query, but really what we need is a directive to indicate that data 
must be clustered on the way INTO the query.

This is not pedantic, because there is no reason that Hive wouldn't try to 
optimize data flow between queries, for example, systematically splitting up 
big queries. The UDAF framework, with its merging step, would allow a single 
key/value pair to be split across SEVERAL reducers, violating the mapreduce 
assumptions but returning the correct data...however, for a TRANSFORM 
statement, no such protections are afforded.

I propose, for greater clarity, that these directives be part of the same query 
level. Example syntax:

SELECT TRANSFORM(foo, bar)
USING 'reducer.py'
AS x, y, z
FROM tablename
CLUSTERED BY foo;

...in other words, move the directive regarding data distribution to the query 
that actually cares about it, allowing for users who are making the assumptions 
of the mapreduce framework to formally indicate that their transformer really 
DOES need clustered data. Or to put it in other words, CLUSTER BY is a 
directive guaranteeing that data are clustered on the way OUT OF a query (i.e., 
for bucketed tables), whereas CLUSTERED BY is a directive guaranteeing that 
data are clustered on the way INTO a query.

Bonus points: For tables that are already CLUSTERED BY in their definition, 
allow this query to run in the map phase.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-1434) Cassandra Storage Handler

2011-07-20 Thread Edward Capriolo (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068396#comment-13068396
 ] 

Edward Capriolo commented on HIVE-1434:
---

It is now pretty easy to take the Brisk jar and drop it into hive:

https://github.com/riptano/hive/wiki/Cassandra-Handler-usage-in-Hive-0.7-with-Cassandra-0.7

Also, the Brisk version of the handler has more features than this one, as it can 
transpose wide rows into long columns. I think at this point we might as well 
abandon trying to get this code into Hive. It is much easier to develop and innovate 
on it as an external project with git than inside hadoop-hive.

 Cassandra Storage Handler
 -

 Key: HIVE-1434
 URL: https://issues.apache.org/jira/browse/HIVE-1434
 Project: Hive
  Issue Type: New Feature
Affects Versions: 0.7.0
Reporter: Edward Capriolo
Assignee: Edward Capriolo
 Attachments: cas-handle.tar.gz, cass_handler.diff, hive-1434-1.txt, 
 hive-1434-2-patch.txt, hive-1434-2011-02-26.patch.txt, 
 hive-1434-2011-03-07.patch.txt, hive-1434-2011-03-07.patch.txt, 
 hive-1434-2011-03-14.patch.txt, hive-1434-3-patch.txt, hive-1434-4-patch.txt, 
 hive-1434-5.patch.txt, hive-1434.2011-02-27.diff.txt, 
 hive-cassandra.2011-02-25.txt, hive.diff


 Add a cassandra storage handler.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2080) Few code improvements in the ql and serde packages.

2011-07-20 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2080:
---

Attachment: HIVE-2080.1.Patch

 Few code improvements in the ql and serde packages.
 ---

 Key: HIVE-2080
 URL: https://issues.apache.org/jira/browse/HIVE-2080
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Serializers/Deserializers
Affects Versions: 0.7.0
 Environment: Hadoop 0.20.1, Hive0.7.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2080.1.Patch, HIVE-2080.Patch


 Few code improvements in the ql and serde packages.
 1) Small performance improvements
 2) Null checks to avoid NPEs
 3) Effective variable management.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: Few code improvements in the ql and serde packages.

2011-07-20 Thread chinnarao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1144/
---

Review request for hive.


Summary
---

Few code improvements in the ql and serde packages.
1) Small performance improvements
2) Null checks to avoid NPEs
3) Effective variable management.


This addresses bug HIVE-2080.
https://issues.apache.org/jira/browse/HIVE-2080


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/UnionOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ASTNode.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1148179 
  
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeField.java
 1148179 
  
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldType.java
 1148179 
  
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFunction.java
 1148179 

Diff: https://reviews.apache.org/r/1144/diff


Testing
---

All unit tests passed


Thanks,

chinna



[jira] [Updated] (HIVE-2183) In Task class and its subclasses logger is initialized in constructor

2011-07-20 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2183:
---

Attachment: HIVE-2183.1.patch

 In Task class and its subclasses logger is initialized in constructor
 -

 Key: HIVE-2183
 URL: https://issues.apache.org/jira/browse/HIVE-2183
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Priority: Minor
 Attachments: HIVE-2183.1.patch, HIVE-2183.patch


 In the Task class and its subclasses, the logger is initialized in the constructor. 
 The Log object does not need to be initialized every time in the constructor; it can 
 be made a static object.
 {noformat}
 Ex:
   public ExecDriver() {
     super();
     LOG = LogFactory.getLog(this.getClass().getName());
     console = new LogHelper(LOG);
     this.jobExecHelper = new HadoopJobExecHelper(job, console, this, this);
   }
 {noformat}
 It needs to be changed to this:
 {noformat}
 private static final Log LOG = LogFactory.getLog(ExecDriver.class);
 {noformat}
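The static-field pattern proposed above can be sketched as a small, compilable class; java.util.logging stands in for commons-logging here so the example is dependency-free, and the class name is illustrative rather than the actual Hive ExecDriver:

```java
import java.util.logging.Logger;

public class ExecDriverSketch {
    // Initialized once when the class is loaded, not on every construction.
    private static final Logger LOG =
        Logger.getLogger(ExecDriverSketch.class.getName());

    public ExecDriverSketch() {
        // The constructor no longer initializes the logger; it is already available.
        LOG.fine("ExecDriverSketch created");
    }

    // Exposed only so the pattern can be demonstrated from outside the class.
    public static String loggerName() {
        return LOG.getName();
    }
}
```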

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2080) Few code improvements in the ql and serde packages.

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068404#comment-13068404
 ] 

jirapos...@reviews.apache.org commented on HIVE-2080:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1144/
---

Review request for hive.


Summary
---

Few code improvements in the ql and serde packages.
1) Small performance improvements
2) Null checks to avoid NPEs
3) Effective variable management.


This addresses bug HIVE-2080.
https://issues.apache.org/jira/browse/HIVE-2080


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FilterOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/SelectOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/UnionOperator.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ASTNode.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/BaseSemanticAnalyzer.java 
1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 1148179 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1148179 
  
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeField.java
 1148179 
  
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFieldType.java
 1148179 
  
trunk/serde/src/java/org/apache/hadoop/hive/serde2/dynamic_type/DynamicSerDeFunction.java
 1148179 

Diff: https://reviews.apache.org/r/1144/diff


Testing
---

All unit tests passed


Thanks,

chinna



 Few code improvements in the ql and serde packages.
 ---

 Key: HIVE-2080
 URL: https://issues.apache.org/jira/browse/HIVE-2080
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Serializers/Deserializers
Affects Versions: 0.7.0
 Environment: Hadoop 0.20.1, Hive0.7.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2080.1.Patch, HIVE-2080.Patch


 Few code improvements in the ql and serde packages.
 1) Small performance improvements
 2) Null checks to avoid NPEs
 3) Effective variable management.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: In Task class and its subclasses logger is initialized in constructor

2011-07-20 Thread chinnarao

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1146/
---

Review request for hive.


Summary
---

In the Task class and its subclasses, the logger is initialized in the 
constructor. The Log object does not need to be re-initialized on every 
construction; it can be made a static field.


This addresses bug HIVE-2183.
https://issues.apache.org/jira/browse/HIVE-2183


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1145025 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
1145025 

Diff: https://reviews.apache.org/r/1146/diff


Testing
---

All unit tests passed


Thanks,

chinna



[jira] [Commented] (HIVE-2183) In Task class and its subclasses logger is initialized in constructor

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068408#comment-13068408
 ] 

jirapos...@reviews.apache.org commented on HIVE-2183:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1146/
---

Review request for hive.


Summary
---

In the Task class and its subclasses, the logger is initialized in the 
constructor. The Log object does not need to be re-initialized on every 
construction; it can be made a static field.


This addresses bug HIVE-2183.
https://issues.apache.org/jira/browse/HIVE-2183


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/CopyTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FetchTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionRegistry.java 
1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FunctionTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MapredLocalTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java 
1145025 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1145025 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/FunctionSemanticAnalyzer.java 
1145025 

Diff: https://reviews.apache.org/r/1146/diff


Testing
---

All unit tests passed


Thanks,

chinna



 In Task class and its subclasses logger is initialized in constructor
 -

 Key: HIVE-2183
 URL: https://issues.apache.org/jira/browse/HIVE-2183
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Priority: Minor
 Attachments: HIVE-2183.1.patch, HIVE-2183.patch


 In the Task class and its subclasses, the logger is initialized in the 
 constructor. The Log object does not need to be re-initialized on every 
 construction; it can be made a static field.
 {noformat}
 Ex:
   public ExecDriver() {
     super();
     LOG = LogFactory.getLog(this.getClass().getName());
     console = new LogHelper(LOG);
     this.jobExecHelper = new HadoopJobExecHelper(job, console, this, this);
   }
 {noformat}
 It should instead be:
 {noformat}
 private static final Log LOG = LogFactory.getLog(ExecDriver.class);
 {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2184) Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()

2011-07-20 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2184:
---

Attachment: HIVE-2184.2.patch

 Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
 ---

 Key: HIVE-2184
 URL: https://issues.apache.org/jira/browse/HIVE-2184
 Project: Hive
  Issue Type: Bug
  Components: Metastore
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2184.1.patch, HIVE-2184.1.patch, HIVE-2184.2.patch, 
 HIVE-2184.patch


 1) Hive.close() calls HiveMetaStoreClient.close(), but in that method the 
 variable standAloneClient never becomes true, so client.shutdown() is never 
 called.
 2) In Hive.close(), after calling metaStoreClient.close(), metaStoreClient 
 needs to be set to null.
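A minimal, self-contained sketch of the second point (the `MetaStoreClient` stub below is hypothetical and stands in for the real HiveMetaStoreClient; only the null-out pattern is the point):

```java
// Hypothetical sketch: guard against double-close and drop the client
// reference after closing, so a later call can re-create it cleanly.
public class HiveCloseSketch {
    // Stand-in for HiveMetaStoreClient; the real class lives in the
    // org.apache.hadoop.hive.metastore package.
    static class MetaStoreClient {
        void close() { /* would release the underlying connection */ }
    }

    private MetaStoreClient metaStoreClient = new MetaStoreClient();

    public void close() {
        if (metaStoreClient != null) {
            metaStoreClient.close();
            metaStoreClient = null; // point 2 of the report
        }
    }

    public boolean isClosed() {
        return metaStoreClient == null;
    }

    public static void main(String[] args) {
        HiveCloseSketch hive = new HiveCloseSketch();
        hive.close();
        hive.close(); // second close is now a harmless no-op
        System.out.println(hive.isClosed()); // prints "true"
    }
}
```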

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2183) In Task class and its subclasses logger is initialized in constructor

2011-07-20 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2183:
---

Status: Patch Available  (was: Open)

 In Task class and its subclasses logger is initialized in constructor
 -

 Key: HIVE-2183
 URL: https://issues.apache.org/jira/browse/HIVE-2183
 Project: Hive
  Issue Type: Bug
  Components: Query Processor
Affects Versions: 0.5.0, 0.8.0
 Environment: Hadoop 0.20.1, Hive0.8.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5)
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
Priority: Minor
 Attachments: HIVE-2183.1.patch, HIVE-2183.patch


 In the Task class and its subclasses, the logger is initialized in the 
 constructor. The Log object does not need to be re-initialized on every 
 construction; it can be made a static field.
 {noformat}
 Ex:
   public ExecDriver() {
     super();
     LOG = LogFactory.getLog(this.getClass().getName());
     console = new LogHelper(LOG);
     this.jobExecHelper = new HadoopJobExecHelper(job, console, this, this);
   }
 {noformat}
 It should instead be:
 {noformat}
 private static final Log LOG = LogFactory.getLog(ExecDriver.class);
 {noformat}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2080) Few code improvements in the ql and serde packages.

2011-07-20 Thread Chinna Rao Lalam (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chinna Rao Lalam updated HIVE-2080:
---

Status: Patch Available  (was: Open)

 Few code improvements in the ql and serde packages.
 ---

 Key: HIVE-2080
 URL: https://issues.apache.org/jira/browse/HIVE-2080
 Project: Hive
  Issue Type: Bug
  Components: Query Processor, Serializers/Deserializers
Affects Versions: 0.7.0
 Environment: Hadoop 0.20.1, Hive0.7.0 and SUSE Linux Enterprise 
 Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5).
Reporter: Chinna Rao Lalam
Assignee: Chinna Rao Lalam
 Attachments: HIVE-2080.1.Patch, HIVE-2080.Patch


 Few code improvements in the ql and serde packages.
 1) Small performance improvements
 2) Null checks to avoid NPEs
 3) Effective variable management.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: HIVE-1078: CREATE VIEW followup: CREATE OR REPLACE

2011-07-20 Thread Charles Chen

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1058/
---

(Updated 2011-07-20 18:03:04.848086)


Review request for hive.


Changes
---

Added testcases and fixed the issues in the comments above (btw, the previous 
revision passed unit tests)


Summary
---

https://issues.apache.org/jira/browse/HIVE-1078


This addresses bug HIVE-1078.
https://issues.apache.org/jira/browse/HIVE-1078


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view6.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view7.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view8.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/recursive_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view6.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view7.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view8.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out
 1146902 

Diff: https://reviews.apache.org/r/1058/diff


Testing
---

Passes unit tests


Thanks,

Charles



[jira] [Commented] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068519#comment-13068519
 ] 

jirapos...@reviews.apache.org commented on HIVE-1078:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1058/
---

(Updated 2011-07-20 18:03:04.848086)


Review request for hive.


Changes
---

Added testcases and fixed the issues in the comments above (btw, the previous 
revision passed unit tests)


Summary
---

https://issues.apache.org/jira/browse/HIVE-1078


This addresses bug HIVE-1078.
https://issues.apache.org/jira/browse/HIVE-1078


Diffs (updated)
-

  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/CreateViewDesc.java
 1146902 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view1.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view2.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view3.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view4.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view5.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view6.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view7.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/create_or_replace_view8.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientnegative/recursive_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/queries/clientpositive/create_or_replace_view.q
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view1.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view2.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view3.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view4.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view5.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view6.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view7.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/create_or_replace_view8.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientnegative/recursive_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_or_replace_view.q.out
 PRE-CREATION 
  
http://svn.apache.org/repos/asf/hive/trunk/ql/src/test/results/clientpositive/create_view.q.out
 1146902 

Diff: https://reviews.apache.org/r/1058/diff


Testing
---

Passes unit tests


Thanks,

Charles



 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-20 Thread Charles Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-1078:
---

Attachment: HIVE-1078v8.patch

 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-1078) CREATE VIEW followup: CREATE OR REPLACE

2011-07-20 Thread Charles Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Chen updated HIVE-1078:
---

Status: Patch Available  (was: Open)

 CREATE VIEW followup:  CREATE OR REPLACE
 

 Key: HIVE-1078
 URL: https://issues.apache.org/jira/browse/HIVE-1078
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.6.0
Reporter: John Sichi
Assignee: Charles Chen
 Attachments: HIVE-1078v3.patch, HIVE-1078v4.patch, HIVE-1078v5.patch, 
 HIVE-1078v6.patch, HIVE-1078v7.patch, HIVE-1078v8.patch


 Currently, replacing a view requires
 DROP VIEW v;
 CREATE VIEW v AS new-definition;
 CREATE OR REPLACE would allow these to be combined into a single operation.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2289) NumberFormatException with respect to _offsets when running a query with index

2011-07-20 Thread siddharth ramanan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068529#comment-13068529
 ] 

siddharth ramanan commented on HIVE-2289:
-

Thanks John for your really quick replies. I just have one final question: from 
the Hive page, I understand that there are a lot of overheads while running 
queries, and that these in turn affect performance (query response time). Can 
Hive be configured so that the response time for a query over, say, a million 
rows is under 5 seconds? (I am only giving these numbers as an example.) I 
understand that this question is quite subjective (it depends on the cluster, 
the configuration of the machines used, etc.), but I am quite confused about 
Hive's performance over huge data.

Thanks,
Siddharth

 NumberFormatException with respect to _offsets when running a query with  
 index
 ---

 Key: HIVE-2289
 URL: https://issues.apache.org/jira/browse/HIVE-2289
 Project: Hive
  Issue Type: Bug
  Components: Indexing
Affects Versions: 0.7.0
 Environment: RedHat 5
Reporter: siddharth ramanan

 I have a table named foo with columns origin, destination, and information.
 Steps I followed to create index named foosample for foo,
 1)create index foosample on table foo(origin) as 'compact' with deferred 
 rebuild;
 2)alter index foosample on foo rebuild;
 3)insert overwrite directory /tmp/index_result select 
 '_bucketname','_offsets' from default__foo_foosample__ where origin='WAW';
 4)set hive.index.compact.file=/tmp/index_result;
 5)set 
 hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;
 6)select * from foo where origin='WAW';
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 java.lang.NumberFormatException: For input string: _offsets
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Long.parseLong(Long.java:410)
 at java.lang.Long.parseLong(Long.java:468)
 at 
 org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.add(HiveCompactIndexResult.java:158)
 at 
 org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.init(HiveCompactIndexResult.java:107)
 at 
 org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat.getSplits(HiveCompactIndexInputFormat.java:89)
 at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
 at 
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:657)
 at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Job Submission failed with exception 'java.lang.NumberFormatException(For 
 input string: _offsets)'
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 Steps 2 and 3 ran a successful mapreduce job and also the table 
 default__foo_foosample__ (index table) has data with three columns origin, 
 _bucketname and _offsets.
 Thanks,
 Siddharth

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2224) Ability to add partitions atomically

2011-07-20 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-2224:


Summary: Ability to add partitions atomically  (was: Ability to 
add_partitions, and atomically)

 Ability to add partitions atomically
 

 Key: HIVE-2224
 URL: https://issues.apache.org/jira/browse/HIVE-2224
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-2224.patch


 I'd like to see an atomic version of the add_partitions() call.
 Whether this is done via a config option affecting add_partitions() behaviour 
 (not my preference), by changing the default add_partitions() behaviour (my 
 preference, but likely to affect current behaviour, so it will need others' 
 input), or by adding a new add_partitions_atomic() call depends on discussion.
 This looks relatively doable to implement: a dependent add_partition_core 
 would need to skip the early ms.commit_partition(), cache the list of 
 directories created so they can be removed on rollback, and collect a list of 
 AddPartitionEvents to trigger in one shot later.
 Thoughts? This also seems like something to implement for allowing HIVE-1805.
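The rollback scheme described above can be sketched as follows; the `Store` interface is invented for illustration and does not match the real metastore API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of atomic add_partitions: create all partition
// directories without committing, commit once at the end, and on any
// failure remove every directory created so far.
public class AtomicAddPartitions {
    // Invented stand-in for the metastore operations involved.
    interface Store {
        void createPartitionDir(String part) throws Exception;
        void commitAll(List<String> parts) throws Exception;
        void deleteDir(String part);
    }

    static void addPartitionsAtomic(Store ms, List<String> parts) throws Exception {
        List<String> createdDirs = new ArrayList<>();
        try {
            for (String p : parts) {
                ms.createPartitionDir(p); // no per-partition commit here
                createdDirs.add(p);       // remember for possible rollback
            }
            ms.commitAll(parts);          // single commit; events fire after this
        } catch (Exception e) {
            for (String dir : createdDirs) {
                ms.deleteDir(dir);        // rollback: remove directories we created
            }
            throw e;
        }
    }
}
```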

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2224) Ability to add partitions atomically

2011-07-20 Thread Paul Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068536#comment-13068536
 ] 

Paul Yang commented on HIVE-2224:
-

Seems like it was an issue with the machine. But it has been committed - thanks 
Sushanth!

 Ability to add partitions atomically
 

 Key: HIVE-2224
 URL: https://issues.apache.org/jira/browse/HIVE-2224
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Attachments: HIVE-2224.patch


 I'd like to see an atomic version of the add_partitions() call.
 Whether this is done via a config option affecting add_partitions() behaviour 
 (not my preference), by changing the default add_partitions() behaviour (my 
 preference, but likely to affect current behaviour, so it will need others' 
 input), or by adding a new add_partitions_atomic() call depends on discussion.
 This looks relatively doable to implement: a dependent add_partition_core 
 would need to skip the early ms.commit_partition(), cache the list of 
 directories created so they can be removed on rollback, and collect a list of 
 AddPartitionEvents to trigger in one shot later.
 Thoughts? This also seems like something to implement for allowing HIVE-1805.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2224) Ability to add partitions atomically

2011-07-20 Thread Paul Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Yang updated HIVE-2224:


   Resolution: Fixed
Fix Version/s: 0.8.0
   Status: Resolved  (was: Patch Available)

 Ability to add partitions atomically
 

 Key: HIVE-2224
 URL: https://issues.apache.org/jira/browse/HIVE-2224
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 0.8.0

 Attachments: HIVE-2224.patch


 I'd like to see an atomic version of the add_partitions() call.
 Whether this is done via a config option affecting add_partitions() behaviour 
 (not my preference), by changing the default add_partitions() behaviour (my 
 preference, but likely to affect current behaviour, so it will need others' 
 input), or by adding a new add_partitions_atomic() call depends on discussion.
 This looks relatively doable to implement: a dependent add_partition_core 
 would need to skip the early ms.commit_partition(), cache the list of 
 directories created so they can be removed on rollback, and collect a list of 
 AddPartitionEvents to trigger in one shot later.
 Thoughts? This also seems like something to implement for allowing HIVE-1805.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2209) Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object

2011-07-20 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13068540#comment-13068540
 ] 

He Yongqiang commented on HIVE-2209:


+1, will commit after tests pass.

 Provide a way by which ObjectInspectorUtils.compare can be extended by the 
 caller for comparing maps which are part of the object
 -

 Key: HIVE-2209
 URL: https://issues.apache.org/jira/browse/HIVE-2209
 Project: Hive
  Issue Type: Improvement
Reporter: Krishna Kumar
Assignee: Krishna Kumar
Priority: Minor
 Attachments: HIVE-2209v0.patch, HIVE-2209v2.patch, HIVE2209v1.patch


 Currently ObjectInspectorUtils.compare throws an exception if a map is 
 contained (recursively) within the objects being compared. Two obvious 
 implementations are
 - a simple map comparer which assumes the keys of the first map can be used 
 to fetch values from the second
 - a 'cross-product' comparer which compares every pair of key-value pairs in 
 the two maps, and calls it a match if and only if all pairs are matched
 Note that it would be difficult to provide a transitive 
 greater-than/less-than ordering for maps, so that is not in scope.
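The first option above can be sketched as follows; this is a hypothetical, self-contained illustration, not the signature of the actual ObjectInspectorUtils extension point:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical "simple map comparer": each key of the first map is used to
// look up a value in the second. It only reports equal (0) or unequal (1);
// no transitive greater-than/less-than ordering is attempted, matching the
// limitation noted in the issue.
public class SimpleMapComparer {
    public static int compare(Map<?, ?> m1, Map<?, ?> m2) {
        if (m1.size() != m2.size()) {
            return 1;
        }
        for (Map.Entry<?, ?> e : m1.entrySet()) {
            // containsKey distinguishes a missing key from a null value
            if (!m2.containsKey(e.getKey())
                    || !Objects.equals(e.getValue(), m2.get(e.getKey()))) {
                return 1;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("x", 1);
        a.put("y", 2);
        Map<String, Integer> b = new HashMap<>(a);
        System.out.println(compare(a, b)); // prints "0"
    }
}
```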

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2289) NumberFormatException with respect to _offsets when running a query with index

2011-07-20 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068581#comment-13068581
 ] 

John Sichi commented on HIVE-2289:
--

(This JIRA issue is not the right place for these discussions; mailing list 
u...@hive.apache.org is.)

 NumberFormatException with respect to _offsets when running a query with  
 index
 ---

 Key: HIVE-2289
 URL: https://issues.apache.org/jira/browse/HIVE-2289
 Project: Hive
  Issue Type: Bug
  Components: Indexing
Affects Versions: 0.7.0
 Environment: RedHat 5
Reporter: siddharth ramanan

 I am having a table named foo with columns origin, destination and 
 information.
 Steps I followed to create index named foosample for foo,
 1)create index foosample on table foo(origin) as 'compact' with deferred 
 rebuild;
 2)alter index foosample on foo rebuild;
 3)insert overwrite directory /tmp/index_result select 
 '_bucketname','_offsets' from default__foo_foosample__ where origin='WAW';
 4)set hive.index.compact.file=/tmp/index_result;
 5)set 
 hive.input.format=org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat;
 6)select * from foo where origin='WAW';
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 java.lang.NumberFormatException: For input string: _offsets
 at 
 java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
 at java.lang.Long.parseLong(Long.java:410)
 at java.lang.Long.parseLong(Long.java:468)
 at 
 org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.add(HiveCompactIndexResult.java:158)
 at 
 org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexResult.init(HiveCompactIndexResult.java:107)
 at 
 org.apache.hadoop.hive.ql.index.compact.HiveCompactIndexInputFormat.getSplits(HiveCompactIndexInputFormat.java:89)
 at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
 at 
 org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
 at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
 at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:657)
 at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
 at 
 org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Job Submission failed with exception 'java.lang.NumberFormatException(For 
 input string: _offsets)'
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.MapRedTask
 Steps 2 and 3 ran a successful mapreduce job and also the table 
 default__foo_foosample__ (index table) has data with three columns origin, 
 _bucketname and _offsets.
 Thanks,
 Siddharth
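A likely cause, though not confirmed in this thread: step 3 selects the single-quoted string literals '_bucketname' and '_offsets' (HiveQL treats single quotes as string literals, not identifiers), so the literal text _offsets is written into /tmp/index_result and later fed to Long.parseLong when the index result is read. The failure itself is easy to reproduce in isolation:

```java
public class OffsetsParseDemo {
    public static void main(String[] args) {
        // HiveCompactIndexResult parses each offset token with
        // Long.parseLong; a literal string like "_offsets" fails.
        try {
            Long.parseLong("_offsets");
            System.out.println("parsed (unexpected)");
        } catch (NumberFormatException e) {
            // prints: For input string: "_offsets"
            System.out.println(e.getMessage());
        }
    }
}
```

Selecting the actual index columns in step 3 (rather than quoted literals) should avoid the exception.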

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-1884) Potential risk of resource leaks in Hive

2011-07-20 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-1884:
-

   Resolution: Fixed
Fix Version/s: 0.8.0
 Hadoop Flags: [Reviewed]
   Status: Resolved  (was: Patch Available)

Committed.  Thanks Chinna!


 Potential risk of resource leaks in Hive
 

 Key: HIVE-1884
 URL: https://issues.apache.org/jira/browse/HIVE-1884
 Project: Hive
  Issue Type: Bug
  Components: CLI, Metastore, Query Processor, Server Infrastructure
Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0
 Environment: Hive 0.6.0, Hadoop 0.20.1
 SUSE Linux Enterprise Server 11 (i586)
Reporter: Mohit Sikri
Assignee: Chinna Rao Lalam
 Fix For: 0.8.0

 Attachments: HIVE-1884.1.PATCH, HIVE-1884.2.patch, HIVE-1884.3.patch, 
 HIVE-1884.4.patch, HIVE-1884.5.patch


 h3.There are a couple of resource leaks.
 h4.For example,
 In CliDriver.java, method processReader(), the buffered reader is not 
 closed.
 h3.There is also a risk of resources getting leaked; in such cases we need 
 to refactor the code to move the closing of resources into a finally block.
 h4.For example, 
 In Throttle.java, method checkJobTracker(), the following code snippet 
 might cause a resource leak.
 {code}
 InputStream in = url.openStream();
 in.read(buffer);
 in.close();
 {code}
 Ideally, per best coding practices, it should be:
 {code}
 InputStream in = null;
 try {
   in = url.openStream();
   int numRead = in.read(buffer);
 } finally {
   IOUtils.closeStream(in);
 }
 {code}
 Similar cases were found in ExplainTask.java, DDLTask.java, etc. All such 
 occurrences need to be refactored.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2283) Backtracking real column names for EXPLAIN output

2011-07-20 Thread John Sichi (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068590#comment-13068590
 ] 

John Sichi commented on HIVE-2283:
--

Submit one combined patch with everything, including the .q.out file, which we 
need for the test to pass.  Once ready, add it to Review Board, upload it here, 
and then click the Submit Patch button.

https://cwiki.apache.org/confluence/display/Hive/HowToContribute#HowToContribute-ReviewProcess

 Backtracking real column names for EXPLAIN output
 -

 Key: HIVE-2283
 URL: https://issues.apache.org/jira/browse/HIVE-2283
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Affects Versions: 0.8.0
Reporter: Navis
Priority: Minor
 Attachments: HIVE-2283.1.patch, HIVE-2283.2.patch, 
 HIVE-2283.test.patch


 GUI people suggested that showing real column names in the result of an EXPLAIN 
 statement would make customers feel more comfortable with Hive. I agreed and am 
 working on it. 
 {code}
 a. current EXPLAIN
  Select Operator
expressions:
  expr: _col10
  type: int
  expr: _col17
  type: string
Group By Operator
  keys:
expr: _col0
type: int
expr: _col17
type: int
 b. suggested EXPLAIN
  Select Operator
expressions: _col10=t2.key_int1, _col17=upper(t1.key_int1), 
 _col22=t3.key_string2
Group By Operator
  keys: _col10=t2.key_int1, _col17=upper(t1.key_int1)
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-trunk-h0.21 #836

2011-07-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/836/

--
[...truncated 32873 lines...]
 [echo]  Writing POM to 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/jdbc/pom.xml
No ivy:settings found for the default reference 'ivy.instance'.  A default 
instance will be used
no settings file found, using default...
:: loading settings :: url = 
jar:file:/home/hudson/.ant/lib/ivy-2.0.0-rc2.jar!/org/apache/ivy/core/settings/ivysettings.xml

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

check-ivy:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: 
warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

deploy-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: 
warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

jar:

init:

install-hadoopcore:

install-hadoopcore-default:

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-retrieve-hadoop-source:
:: loading settings :: file = 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ivy/ivysettings.xml
[ivy:retrieve] :: resolving dependencies :: 
org.apache.hive#hive-hwi;0.8.0-SNAPSHOT
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  found hadoop#core;0.20.1 in hadoop-source
[ivy:retrieve] :: resolution report :: resolve 663ms :: artifacts dl 1ms
-
|  |modules||   artifacts   |
|   conf   | number| search|dwnlded|evicted|| number|dwnlded|
-
|  default |   1   |   0   |   0   |   0   ||   1   |   0   |
-
[ivy:retrieve] :: retrieving :: org.apache.hive#hive-hwi
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  0 artifacts copied, 1 already retrieved (0kB/1ms)

install-hadoopcore-internal:

setup:

war:

compile:
 [echo] Compiling: hwi
[javac] 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/hwi/build.xml:71: 
warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

jar:
 [echo] Jar: hwi

make-pom:
 [echo]  Writing POM to 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/hwi/pom.xml
No ivy:settings found for the default reference 'ivy.instance'.  A default 
instance will be used
no settings file found, using default...
:: loading settings :: url = 
jar:file:/home/hudson/.ant/lib/ivy-2.0.0-rc2.jar!/org/apache/ivy/core/settings/ivysettings.xml

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

check-ivy:

create-dirs:

compile-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: 
warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

deploy-ant-tasks:

create-dirs:

init:

compile:
 [echo] Compiling: anttasks
[javac] 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/ant/build.xml:40: 
warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds

jar:

init:

setup:

compile:
 [echo] Compiling: hbase-handler
[javac] 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build-common.xml:301: 
warning: 'includeantruntime' was not set, defaulting to 
build.sysclasspath=last; set to false for repeatable builds
 [copy] Warning: 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/hbase-handler/src/java/conf
 does not exist.

jar:
 [echo] Jar: hbase-handler

make-pom:
 [echo]  Writing POM to 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/hbase-handler/pom.xml
No ivy:settings found for the default 

[jira] [Commented] (HIVE-2224) Ability to add partitions atomically

2011-07-20 Thread Sushanth Sowmyan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068629#comment-13068629
 ] 

Sushanth Sowmyan commented on HIVE-2224:


Thanks!

 Ability to add partitions atomically
 

 Key: HIVE-2224
 URL: https://issues.apache.org/jira/browse/HIVE-2224
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sushanth Sowmyan
Assignee: Sushanth Sowmyan
 Fix For: 0.8.0

 Attachments: HIVE-2224.patch


 I'd like to see an atomic version of the add_partitions() call.
 Whether this is to be done by config to affect add_partitions() behaviour 
 (not my preference) or just changing add_partitions() default behaviour (my 
 preference, but likely to affect current behaviour, so will need others' 
 input) or by making a new add_partitions_atomic() call depends on discussion.
 This looks relatively doable to implement (will need a dependent 
 add_partition_core to not do a ms.commit_partition() early, and to cache list 
 of directories created to remove on rollback, and a list of AddPartitionEvent 
 to trigger in one shot later)
 Thoughts? This also seems like something to implement for allowing HIVE-1805.
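The rollback scheme sketched above (cache what was created, undo it on failure, fire the cached events in one shot on success) reduces to a generic pattern. The Java below is an illustrative sketch with hypothetical names, not the metastore implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class AtomicAdder {

    /**
     * Applies {@code add} to every item; if any step fails, undoes all
     * completed steps in reverse order and rethrows. Analogous to an
     * add_partitions() that defers commit_partition() and removes the
     * directories it created on rollback.
     */
    public static <T> void addAllAtomically(List<T> items,
                                            Consumer<T> add,
                                            Consumer<T> undo) {
        List<T> done = new ArrayList<>();    // like the cached directory list
        try {
            for (T item : items) {
                add.accept(item);            // may throw partway through
                done.add(item);
            }
            // Success: this is where all AddPartitionEvents would be
            // triggered in one shot.
        } catch (RuntimeException e) {
            for (int i = done.size() - 1; i >= 0; i--) {
                undo.accept(done.get(i));    // roll back what was created
            }
            throw e;
        }
    }
}
```

The key property is that a caller observes either all items added or none, at the cost of tracking enough state to undo partial work.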

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2296) bad compressed file names from insert into

2011-07-20 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2296:
--

Affects Version/s: 0.8.0

 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu

 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may copy 
 the new files in bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0
 00_0.gz_copy_1
 Correct behavior should be to pick a valid filename

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2296) bad compressed file names from insert into

2011-07-20 Thread Franklin Hu (JIRA)
bad compressed file names from insert into
--

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Reporter: Franklin Hu
Assignee: Franklin Hu


When INSERT INTO is run on a table with compressed output 
(hive.exec.compress.output=true) and existing files in the table, it may copy 
the new files in bad file names:

Before INSERT INTO:
00_0.gz

After INSERT INTO:
00_0
00_0.gz_copy_1

Correct behavior should be to pick a valid filename

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2296) bad compressed file names from insert into

2011-07-20 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2296:
--

Description: 
When INSERT INTO is run on a table with compressed output 
(hive.exec.compress.output=true) and existing files in the table, it may copy 
the new files in bad file names:

Before INSERT INTO:
00_0.gz

After INSERT INTO:
00_0.gz
00_0.gz_copy_1

Correct behavior should be to pick a valid filename

  was:
When INSERT INTO is run on a table with compressed output 
(hive.exec.compress.output=true) and existing files in the table, it may copy 
the new files in bad file names:

Before INSERT INTO:
00_0.gz

After INSERT INTO:
00_0
00_0.gz_copy_1

Correct behavior should be to pick a valid filename


 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu

 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may copy 
 the new files in bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 Correct behavior should be to pick a valid filename

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: reduce name node calls in hive by creating temporary directories

2011-07-20 Thread Siying Dong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/952/
---

(Updated 2011-07-20 23:31:54.007436)


Review request for hive, Yongqiang He, Ning Zhang, and namit jain.


Changes
---

1. change block merge task too
2. change the capital file name


Summary
---

reduce name node calls in hive by creating temporary directories


This addresses bug HIVE-2201.
https://issues.apache.org/jira/browse/HIVE-2201


Diffs (updated)
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148905 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
1148905 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148905 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFileOutputFormat.java 
1148905 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 
1148905 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
 1148905 

Diff: https://reviews.apache.org/r/952/diff


Testing
---


Thanks,

Siying



[jira] [Updated] (HIVE-2201) reduce name node calls in hive by creating temporary directories

2011-07-20 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2201:
--

Attachment: HIVE-2201.4.patch

1. change block merge task too
2. change the capital file name

 reduce name node calls in hive by creating temporary directories
 

 Key: HIVE-2201
 URL: https://issues.apache.org/jira/browse/HIVE-2201
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, 
 HIVE-2201.4.patch


 Currently, in Hive, when a file gets written by a FileSinkOperator,
 the sequence of operations is as follows:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp1/1
 3. Move directory /tmp1 to /tmp2
 4. For all files in /tmp2, remove all files starting with _tmp and
 duplicate files.
 Due to speculative execution, a lot of temporary files are created
 in /tmp1 (or /tmp2). This leads to a lot of name node calls,
 especially for large queries.
 The protocol above can be modified slightly:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp2/1
 3. Move directory /tmp2 to /tmp3
 4. For all files in /tmp3, remove all duplicate files.
 This should reduce the number of tmp files.
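The revised protocol can be sketched with local files standing in for HDFS paths (java.nio.file here is only an illustration; Hive uses the Hadoop FileSystem API):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TmpDirProtocol {

    /** Runs the revised move sequence under {@code base} and returns tmp3. */
    public static Path run(Path base) throws IOException {
        Path tmp1 = Files.createDirectories(base.resolve("tmp1"));
        Path tmp2 = Files.createDirectories(base.resolve("tmp2"));

        // 1. In tmp directory tmp1, create a tmp file _tmp_1.
        Path tmpFile = Files.createFile(tmp1.resolve("_tmp_1"));

        // 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1:
        //    finished output never sits next to in-progress _tmp files,
        //    so no _tmp scan of the final directory is needed.
        Files.move(tmpFile, tmp2.resolve("1"));

        // 3. Move directory /tmp2 to /tmp3.
        Path tmp3 = Files.move(tmp2, base.resolve("tmp3"));

        // 4. Only duplicate removal remains for /tmp3 (not shown).
        return tmp3;
    }
}
```

Because speculative/failed attempts leave their _tmp files behind in tmp1 rather than in the directory that gets promoted, the final cleanup touches far fewer files, which is where the name node call savings come from.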

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068715#comment-13068715
 ] 

jirapos...@reviews.apache.org commented on HIVE-2201:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/952/
---

(Updated 2011-07-20 23:31:54.007436)


Review request for hive, Yongqiang He, Ning Zhang, and namit jain.


Changes
---

1. change block merge task too
2. change the capital file name


Summary
---

reduce name node calls in hive by creating temporary directories


This addresses bug HIVE-2201.
https://issues.apache.org/jira/browse/HIVE-2201


Diffs (updated)
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1148905 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java 
1148905 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1148905 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFileOutputFormat.java 
1148905 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 
1148905 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
 1148905 

Diff: https://reviews.apache.org/r/952/diff


Testing
---


Thanks,

Siying



 reduce name node calls in hive by creating temporary directories
 

 Key: HIVE-2201
 URL: https://issues.apache.org/jira/browse/HIVE-2201
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, 
 HIVE-2201.4.patch


 Currently, in Hive, when a file gets written by a FileSinkOperator,
 the sequence of operations is as follows:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp1/1
 3. Move directory /tmp1 to /tmp2
 4. For all files in /tmp2, remove all files starting with _tmp and
 duplicate files.
 Due to speculative execution, a lot of temporary files are created
 in /tmp1 (or /tmp2). This leads to a lot of name node calls,
 especially for large queries.
 The protocol above can be modified slightly:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp2/1
 3. Move directory /tmp2 to /tmp3
 4. For all files in /tmp3, remove all duplicate files.
 This should reduce the number of tmp files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2296) bad compressed file names from insert into

2011-07-20 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2296:
--

Attachment: hive-2296.1.patch

 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
 Attachments: hive-2296.1.patch


 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may copy 
 the new files in bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 Correct behavior should be to pick a valid filename

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2296) bad compressed file names from insert into

2011-07-20 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2296:
--

Description: 
When INSERT INTO is run on a table with compressed output 
(hive.exec.compress.output=true) and existing files in the table, it may copy 
the new files in bad file names:

Before INSERT INTO:
00_0.gz

After INSERT INTO:
00_0.gz
00_0.gz_copy_1

This causes corrupted output when doing a SELECT * on the table.
Correct behavior should be to pick a valid filename such as:
00_0_copy_1.gz

  was:
When INSERT INTO is run on a table with compressed output 
(hive.exec.compress.output=true) and existing files in the table, it may copy 
the new files in bad file names:

Before INSERT INTO:
00_0.gz

After INSERT INTO:
00_0.gz
00_0.gz_copy_1

Correct behavior should be to pick a valid filename


 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
 Attachments: hive-2296.1.patch


 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may copy 
 the new files in bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 This causes corrupted output when doing a SELECT * on the table.
 Correct behavior should be to pick a valid filename such as:
 00_0_copy_1.gz
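The intended fix can be illustrated with a small hypothetical helper that inserts the _copy_N suffix before the compression extension rather than after it (this is not the attached patch):

```java
public class CopyName {

    /**
     * Builds a copy file name that keeps the compression extension last,
     * so "00_0.gz" becomes "00_0_copy_1.gz" instead of "00_0.gz_copy_1",
     * which codecs and SELECT * would otherwise fail to recognize.
     */
    public static String copyName(String name, int copy) {
        int dot = name.lastIndexOf('.');
        if (dot < 0) {
            // Uncompressed file: a plain suffix is already valid.
            return name + "_copy_" + copy;
        }
        return name.substring(0, dot) + "_copy_" + copy + name.substring(dot);
    }
}
```

The point of the design is that the extension is what output codecs key off, so any disambiguating suffix has to go before it.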

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2296) bad compressed file names from insert into

2011-07-20 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2296:
--

Attachment: hive-2296.2.patch

add unit test

 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
 Attachments: hive-2296.1.patch, hive-2296.2.patch


 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may copy 
 the new files in bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 This causes corrupted output when doing a SELECT * on the table.
 Correct behavior should be to pick a valid filename such as:
 00_0_copy_1.gz

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-trunk-h0.21 #837

2011-07-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/837/changes

Changes:

[jvs] HIVE-1884. Potential risk of resource leaks in Hive
(Chinna Rao Lalam via jvs)

[pauly] HIVE-2224. Ability to add partitions atomically (Sushanth Sowmyan via 
pauly)

--
[...truncated 33321 lines...]
[artifact:deploy] Uploading: 
org/apache/hive/hive-hbase-handler/0.8.0-SNAPSHOT/hive-hbase-handler-0.8.0-20110721.003735-38.jar
 to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 49K from apache.snapshots.https
[artifact:deploy] Uploaded 49K
[artifact:deploy] [INFO] Uploading project information for hive-hbase-handler 
0.8.0-20110721.003735-38
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 
org.apache.hive:hive-hbase-handler:0.8.0-SNAPSHOT'
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-hbase-handler'

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-resolve-maven-ant-tasks:
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml

ivy-retrieve-maven-ant-tasks:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-hwi/0.8.0-SNAPSHOT/hive-hwi-0.8.0-20110721.003736-38.jar 
to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 23K from apache.snapshots.https
[artifact:deploy] Uploaded 23K
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 
org.apache.hive:hive-hwi:0.8.0-SNAPSHOT'
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-hwi'
[artifact:deploy] [INFO] Uploading project information for hive-hwi 
0.8.0-20110721.003736-38

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/build/ivy/lib/ivy-2.1.0.jar
  [get] Not modified - so not downloaded

ivy-probe-antlib:

ivy-init-antlib:

ivy-init:

ivy-resolve-maven-ant-tasks:
[ivy:resolve] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml

ivy-retrieve-maven-ant-tasks:
[ivy:cachepath] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:cachepath] :: loading settings :: file = 
/x1/jenkins/jenkins-slave/workspace/Hive-trunk-h0.21/hive/ivy/ivysettings.xml

mvn-taskdef:

maven-publish-artifact:
[artifact:install-provider] Installing provider: 
org.apache.maven.wagon:wagon-http:jar:1.0-beta-2:runtime
[artifact:deploy] Deploying to 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] [INFO] Retrieving previous build number from 
apache.snapshots.https
[artifact:deploy] Uploading: 
org/apache/hive/hive-jdbc/0.8.0-SNAPSHOT/hive-jdbc-0.8.0-20110721.003738-38.jar 
to repository apache.snapshots.https at 
https://repository.apache.org/content/repositories/snapshots
[artifact:deploy] Transferring 56K from apache.snapshots.https
[artifact:deploy] Uploaded 56K
[artifact:deploy] [INFO] Uploading project information for hive-jdbc 
0.8.0-20110721.003738-38
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'snapshot 
org.apache.hive:hive-jdbc:0.8.0-SNAPSHOT'
[artifact:deploy] [INFO] Retrieving previous metadata from 
apache.snapshots.https
[artifact:deploy] [INFO] Uploading repository metadata for: 'artifact 
org.apache.hive:hive-jdbc'

ivy-init-dirs:

ivy-download:
  [get] Getting: 
http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
  [get] To: 

[jira] [Commented] (HIVE-1884) Potential risk of resource leaks in Hive

2011-07-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068745#comment-13068745
 ] 

Hudson commented on HIVE-1884:
--

Integrated in Hive-trunk-h0.21 #837 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/837/])
HIVE-1884. Potential risk of resource leaks in Hive
(Chinna Rao Lalam via jvs)

jvs : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1148921
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/io/RCFileInputFormat.java
* 
/hive/trunk/contrib/src/java/org/apache/hadoop/hive/contrib/util/typedbytes/TypedBytesWritableInput.java
* /hive/trunk/cli/src/java/org/apache/hadoop/hive/cli/CliDriver.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


 Potential risk of resource leaks in Hive
 

 Key: HIVE-1884
 URL: https://issues.apache.org/jira/browse/HIVE-1884
 Project: Hive
  Issue Type: Bug
  Components: CLI, Metastore, Query Processor, Server Infrastructure
Affects Versions: 0.3.0, 0.4.0, 0.4.1, 0.5.0, 0.6.0
 Environment: Hive 0.6.0, Hadoop 0.20.1
 SUSE Linux Enterprise Server 11 (i586)
Reporter: Mohit Sikri
Assignee: Chinna Rao Lalam
 Fix For: 0.8.0

 Attachments: HIVE-1884.1.PATCH, HIVE-1884.2.patch, HIVE-1884.3.patch, 
 HIVE-1884.4.patch, HIVE-1884.5.patch


 h3. There are a couple of resource leaks.
 h4. For example:
 In CliDriver.java, in processReader(), the buffered reader is not closed.
 h3. There is also a risk of resources being leaked; in such cases we need to 
 refactor the code to move the closing of resources into a finally block.
 h4. For example:
 In Throttle.java, in checkJobTracker(), the following code snippet might 
 cause a resource leak.
 {code}
 InputStream in = url.openStream();
 in.read(buffer);
 in.close();
 {code}
 Ideally, per best coding practices, it should be:
 {code}
 InputStream in = null;
 try {
   in = url.openStream();
   int numRead = in.read(buffer);
 } finally {
   IOUtils.closeStream(in);
 }
 {code}
 Similar cases were found in ExplainTask.java, DDLTask.java, etc. We need to 
 refactor all such occurrences.
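As a hedged sketch of the same cleanup idea (the class and method names here are illustrative, not part of the actual patch), Java 7 try-with-resources closes the stream automatically even when read() throws, which removes the leak without an explicit finally block:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamClose {
    // Reads up to buffer.length bytes and returns the count; the
    // try-with-resources block closes the stream even if read() throws.
    static int readOnce(InputStream in, byte[] buffer) throws IOException {
        try (InputStream closing = in) {
            return closing.read(buffer);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] buffer = new byte[16];
        int n = readOnce(new ByteArrayInputStream("hello".getBytes()), buffer);
        System.out.println(n); // prints 5
    }
}
```

IOUtils.closeStream (as in the snippet above) remains the right choice on codebases that must tolerate pre-Java-7 compilers.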

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2247) ALTER TABLE RENAME PARTITION

2011-07-20 Thread Weiyan Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiyan Wang updated HIVE-2247:
--

Attachment: HIVE-2247.5.patch.txt

Use alter_partition(db_name, tbl_name, newPart, part_vals) to replace 
rename_partition thrift API
Add one authorization unit test to test if new partition has the same privilege 
as old one

 ALTER TABLE RENAME PARTITION
 

 Key: HIVE-2247
 URL: https://issues.apache.org/jira/browse/HIVE-2247
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Weiyan Wang
 Attachments: HIVE-2247.3.patch.txt, HIVE-2247.4.patch.txt, 
 HIVE-2247.5.patch.txt


 We need an ALTER TABLE RENAME PARTITION function that is similar to ALTER 
 TABLE RENAME.





[jira] [Commented] (HIVE-2247) ALTER TABLE RENAME PARTITION

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068753#comment-13068753
 ] 

jirapos...@reviews.apache.org commented on HIVE-2247:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1105/
---

(Updated 2011-07-21 01:20:25.242756)


Review request for Siying Dong.


Changes
---

Refactored the code so that rename_partition shares the same Thrift API as 
alter_partition: we do alter_partition when part_vals is empty and 
rename_partition when part_vals is given.


Summary
---

Implement an ALTER TABLE PARTITION RENAME function to rename a partition. 
Add HiveQL syntax: ALTER TABLE bar PARTITION (k1='v1', k2='v2') RENAME TO 
PARTITION (k1='v3', k2='v4');
This is my first Hive diff; I learned everything from the existing codebase 
and may not have a good understanding of it. 
Feel free to let me know if I got something wrong. Thanks


This addresses bug HIVE-2247.
https://issues.apache.org/jira/browse/HIVE-2247


Diffs (updated)
-

  trunk/metastore/if/hive_metastore.thrift 1145366 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1145366 
  trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1145366 
  
trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 
1145366 
  
trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java
 1145366 
  trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 
1145366 
  
trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 
1145366 
  trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 
1145366 
  trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1145366 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
1145366 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1145366 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java
 1145366 
  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 
1145366 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1145366 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 
1145366 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1145366 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 1145366 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java 
PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure.q 
PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure2.q 
PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure3.q 
PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/alter_rename_partition.q 
PRE-CREATION 
  
trunk/ql/src/test/queries/clientpositive/alter_rename_partition_authorization.q 
PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out 
PRE-CREATION 
  
trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out 
PRE-CREATION 
  
trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out 
PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/alter_rename_partition.q.out 
PRE-CREATION 
  
trunk/ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out
 PRE-CREATION 

Diff: https://reviews.apache.org/r/1105/diff


Testing
---

Add a partition A to the table.
Rename partition A to partition B.
Show the partitions in the table; it returns partition B.
SELECT the data from partition A; it returns no results.
SELECT the data from partition B; it returns the data originally stored in 
partition A.


Thanks,

Weiyan



 ALTER TABLE RENAME PARTITION
 

 Key: HIVE-2247
 URL: https://issues.apache.org/jira/browse/HIVE-2247
 Project: Hive
  Issue Type: New Feature
Reporter: Siying Dong
Assignee: Weiyan Wang
 Attachments: HIVE-2247.3.patch.txt, 

[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories

2011-07-20 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068787#comment-13068787
 ] 

He Yongqiang commented on HIVE-2201:


+1, will commit after tests pass.

 reduce name node calls in hive by creating temporary directories
 

 Key: HIVE-2201
 URL: https://issues.apache.org/jira/browse/HIVE-2201
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch, 
 HIVE-2201.4.patch


 Currently, in Hive, when a file gets written by a FileSinkOperator,
 the sequence of operations is as follows:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp1/1
 3. Move directory /tmp1 to /tmp2
 4. For all files in /tmp2, remove all files starting with _tmp and
 duplicate files.
 Due to speculative execution, a lot of temporary files are created
 in /tmp1 (or /tmp2). This leads to a lot of name node calls,
 especially for large queries.
 The protocol above can be modified slightly:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp2/1
 3. Move directory /tmp2 to /tmp3
 4. For all files in /tmp3, remove all duplicate files.
 This should reduce the number of tmp files.
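The cleanup in step 4 can be sketched as a pure filename filter. Everything below is an illustrative assumption, not Hive's actual implementation: the class and method names are invented, and the duplicate rule (keep the first file seen per task prefix, i.e. the part before the last underscore) merely stands in for removing speculative-execution duplicates:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class TmpCleanup {
    // Sketch of step 4: drop files starting with "_tmp" and keep only the
    // first file seen for each task prefix (the part before the last '_').
    static List<String> keepFinalFiles(List<String> names) {
        Set<String> seenTasks = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith("_tmp")) {
                continue; // leftover temporary file
            }
            int us = name.lastIndexOf('_');
            String task = us >= 0 ? name.substring(0, us) : name;
            if (seenTasks.add(task)) {
                kept.add(name); // first attempt for this task wins
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = List.of("_tmp_1", "000000_0", "000000_1", "000001_0");
        System.out.println(keepFinalFiles(names)); // [000000_0, 000001_0]
    }
}
```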





[jira] [Updated] (HIVE-2209) Provide a way by which ObjectInspectorUtils.compare can be extended by the caller for comparing maps which are part of the object

2011-07-20 Thread He Yongqiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

He Yongqiang updated HIVE-2209:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed, thanks Krishna Kumar!

 Provide a way by which ObjectInspectorUtils.compare can be extended by the 
 caller for comparing maps which are part of the object
 -

 Key: HIVE-2209
 URL: https://issues.apache.org/jira/browse/HIVE-2209
 Project: Hive
  Issue Type: Improvement
Reporter: Krishna Kumar
Assignee: Krishna Kumar
Priority: Minor
 Attachments: HIVE-2209v0.patch, HIVE-2209v2.patch, HIVE2209v1.patch


 Currently, ObjectInspectorUtils.compare throws an exception if a map is 
 contained (recursively) within the objects being compared. Two obvious 
 implementations are:
 - a simple map comparer which assumes keys of the first map can be used to 
 fetch values from the second
 - a 'cross-product' comparer which compares every pair of key-value pairs in 
 the two maps, and calls a match if and only if all pairs are matched
 Note that it would be difficult to provide a transitive 
 greater-than/less-than indication with maps so that is not in scope. 
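A minimal sketch of the first proposed strategy; the class name, method name, and signature are illustrative assumptions, not the committed Hive API. It compares two maps by using the keys of the first to fetch values from the second:

```java
import java.util.Map;

public class SimpleMapComparer {
    // Sketch of the "simple" strategy: maps are equal iff they have the same
    // size and, for every key of the first map, the second map holds an equal
    // value. As noted above, this yields only an equal/not-equal answer, not
    // a transitive greater-than/less-than ordering.
    static <K, V> boolean mapsEqual(Map<K, V> a, Map<K, V> b) {
        if (a.size() != b.size()) {
            return false;
        }
        for (Map.Entry<K, V> e : a.entrySet()) {
            V other = b.get(e.getKey());
            if (other == null || !other.equals(e.getValue())) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(mapsEqual(Map.of("k1", "v1"), Map.of("k1", "v1"))); // true
        System.out.println(mapsEqual(Map.of("k1", "v1"), Map.of("k1", "v2"))); // false
    }
}
```

The 'cross-product' comparer would instead test every pairing of entries, which is quadratic but does not assume the first map's keys can look up values in the second.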





Review Request: HIVE-2296 bad compressed file names when calling INSERT INTO

2011-07-20 Thread Franklin Hu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1155/
---

Review request for hive and Siying Dong.


Summary
---

Fixes the problem of bad compressed file names by stripping off the 
file-format extension (e.g. .gz) and re-appending it to the path later.


This addresses bug HIVE-2296.
https://issues.apache.org/jira/browse/HIVE-2296


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1148973 
  trunk/ql/src/test/queries/clientpositive/insert_compressed.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_compressed.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1155/diff


Testing
---

Unit tests pass


Thanks,

Franklin



[jira] [Commented] (HIVE-2296) bad compressed file names from insert into

2011-07-20 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068794#comment-13068794
 ] 

jirapos...@reviews.apache.org commented on HIVE-2296:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1155/
---

Review request for hive and Siying Dong.


Summary
---

Fixes the problem of bad compressed file names by stripping off the 
file-format extension (e.g. .gz) and re-appending it to the path later.


This addresses bug HIVE-2296.
https://issues.apache.org/jira/browse/HIVE-2296


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1148973 
  trunk/ql/src/test/queries/clientpositive/insert_compressed.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/insert_compressed.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1155/diff


Testing
---

Unit tests pass


Thanks,

Franklin



 bad compressed file names from insert into
 --

 Key: HIVE-2296
 URL: https://issues.apache.org/jira/browse/HIVE-2296
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Franklin Hu
Assignee: Franklin Hu
 Attachments: hive-2296.1.patch, hive-2296.2.patch


 When INSERT INTO is run on a table with compressed output 
 (hive.exec.compress.output=true) and existing files in the table, it may 
 write the new files with bad file names:
 Before INSERT INTO:
 00_0.gz
 After INSERT INTO:
 00_0.gz
 00_0.gz_copy_1
 This causes corrupted output when doing a SELECT * on the table.
 Correct behavior should be to pick a valid filename such as:
 00_0_copy_1.gz
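The described fix can be sketched as a small helper that splits off the compression extension, inserts the copy suffix, and re-appends the extension; the class name, method name, and signature are illustrative assumptions, not the actual patch:

```java
public class CopyName {
    // Turns e.g. "00_0.gz" plus copy number 1 into "00_0_copy_1.gz" by
    // placing the "_copy_N" suffix before the compression extension
    // instead of after it.
    static String makeCopyName(String fileName, int copyNumber) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return fileName + "_copy_" + copyNumber; // no extension to preserve
        }
        return fileName.substring(0, dot) + "_copy_" + copyNumber
                + fileName.substring(dot);
    }

    public static void main(String[] args) {
        System.out.println(makeCopyName("00_0.gz", 1)); // 00_0_copy_1.gz
    }
}
```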
