[jira] [Updated] (HIVE-2354) Support automatic rebuilding of indexes when they go stale
[ https://issues.apache.org/jira/browse/HIVE-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Syed S. Albiz updated HIVE-2354: Status: Patch Available (was: Open) Support automatic rebuilding of indexes when they go stale -- Key: HIVE-2354 URL: https://issues.apache.org/jira/browse/HIVE-2354 Project: Hive Issue Type: Improvement Reporter: Syed S. Albiz Assignee: Syed S. Albiz Attachments: HIVE-2354.1.patch, HIVE-2354.2.patch, HIVE-2354.3.patch, HIVE-2354.4.patch, HIVE-2354.5.patch, HIVE-2354.6.patch Support a mode where indexes will be automatically rebuilt when the table/partition they are based on is modified. So if index foo is built on table bar, and bar has its contents overwritten, we should support a mode where index foo will automatically rebuild itself. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
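The requested mode amounts to tracking, per index, whether the base table was modified after the index was last built. A minimal, hypothetical sketch of that staleness check (plain timestamps instead of Hive's actual metastore metadata; all names are illustrative, not Hive's):

```java
public class IndexStalenessDemo {
    // An index is stale when its base table was modified after the last rebuild.
    static boolean isStale(long tableLastModified, long indexLastBuilt) {
        return tableLastModified > indexLastBuilt;
    }

    // Returns the new "last built" timestamp; a real implementation would
    // trigger something like ALTER INDEX foo ON bar REBUILD here.
    static long rebuildIfStale(long tableLastModified, long indexLastBuilt) {
        if (isStale(tableLastModified, indexLastBuilt)) {
            return tableLastModified; // index is now as fresh as the table
        }
        return indexLastBuilt;
    }

    public static void main(String[] args) {
        long built = 100, modified = 200;   // table overwritten after index build
        built = rebuildIfStale(modified, built);
        System.out.println(isStale(modified, built)); // false after the rebuild
    }
}
```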
[jira] [Updated] (HIVE-2247) ALTER TABLE RENAME PARTITION
[ https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiyan Wang updated HIVE-2247: -- Attachment: HIVE-2247.7.patch.txt Refactored code. ALTER TABLE RENAME PARTITION Key: HIVE-2247 URL: https://issues.apache.org/jira/browse/HIVE-2247 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Weiyan Wang Attachments: HIVE-2247.3.patch.txt, HIVE-2247.4.patch.txt, HIVE-2247.5.patch.txt, HIVE-2247.6.patch.txt, HIVE-2247.7.patch.txt We need an ALTER TABLE RENAME PARTITION function that is similar to ALTER TABLE RENAME. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-2399) When using PARTITION, cannot execute SELECT statement
When using PARTITION, cannot execute SELECT statement - Key: HIVE-2399 URL: https://issues.apache.org/jira/browse/HIVE-2399 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.7.0 Environment: OS: Red Hat Enterprise Linux AS release 4 (Nahant Update 5) Hadoop: 0.20.2 hive: 0.7.0 Reporter: yue.zhang When PARTITIONED BY is added to the CREATE TABLE statement, a SELECT over malformed rows fails. Create table statement == CREATE TABLE pplive( ip STRING, n1 STRING, n2 STRING, log_date STRING, method STRING, uri STRING, version STRING, status STRING, flux STRING, n3 STRING, n4 STRING ) PARTITIONED BY(path STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ( 'input.regex' = '([^ ]+)\\s+([^ ]+)\\s+([^ ]+)\\s+\\[(.+)\\]\\s+\([^ ]+)\\s+(.+)\\s(.+)\\\s+([^ ]+)\\s+([^ ]+)\\s+\([^ ]+)\\\s+\(.+)\', 'output.format.string' = '%1$s %2$s %3$s %4$s' ) ; hive.bb.txt == Error line1. Error line2. Error line3. Load data == LOAD DATA INPATH '/user/hive/warehouse/input/hive.bb.txt' OVERWRITE INTO TABLE pplive PARTITION(path='haha') ; CLI command == select * from pplive; Failed with exception java.io.IOException:java.lang.NullPointerException hive log error: == 2011-08-22 15:54:19,451 WARN serde2.RegexSerDe (RegexSerDe.java:deserialize(180)) - 1 unmatched rows are found: Error line1. 
2011-08-22 15:54:19,453 ERROR CliDriver (SessionState.java:printError(343)) - Failed with exception java.io.IOException:java.lang.NullPointerException java.io.IOException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:150) at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) Caused by: java.lang.NullPointerException at java.util.ArrayList.addAll(ArrayList.java:472) at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldsDataAsList(UnionStructObjectInspector.java:144) at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:357) at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:142) ... 9 more -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
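The NullPointerException in the trace above stems from RegexSerDe returning null for rows that do not match input.regex; downstream code then passes that null to ArrayList.addAll. A minimal, hypothetical Java sketch of the failure mode (the class and method names are illustrative stand-ins, not Hive's actual code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexRowDemo {
    // Mimics RegexSerDe: a field list for matching rows, null for unmatched ones.
    static List<String> deserialize(Pattern p, String row) {
        Matcher m = p.matcher(row);
        if (!m.matches()) {
            return null; // unmatched row deserializes to null
        }
        List<String> fields = new ArrayList<>();
        for (int i = 1; i <= m.groupCount(); i++) {
            fields.add(m.group(i));
        }
        return fields;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("(\\S+) (\\S+) (\\S+)");
        // matching row -> [GET, /index.html, 200]
        System.out.println(deserialize(p, "GET /index.html 200"));
        List<String> bad = deserialize(p, "Error line1."); // only two tokens -> null
        List<String> rowData = new ArrayList<>();
        try {
            rowData.addAll(bad); // ArrayList.addAll(null), as in the stack trace
        } catch (NullPointerException e) {
            System.out.println("NullPointerException on the unmatched row");
        }
    }
}
```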
[jira] [Commented] (HIVE-2247) ALTER TABLE RENAME PARTITION
[ https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088625#comment-13088625 ] jirapos...@reviews.apache.org commented on HIVE-2247: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1105/ --- (Updated 2011-08-21 22:27:31.236138) Review request for Siying Dong. Changes --- Refactor code Summary --- Implement ALTER TABLE PARTITION RENAME function to rename a partition. Add HiveQL syntax ALTER TABLE bar PARTITION (k1='v1', k2='v2') RENAME TO PARTITION (k1='v3', k2='v4'); This is my first Hive diff, I just learn everything from existing codebase and may not have a good understanding on it. Feel free to inform me if I make something wrong. Thanks This addresses bug HIVE-2247. https://issues.apache.org/jira/browse/HIVE-2247 Diffs (updated) - trunk/metastore/if/hive_metastore.thrift 1145366 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1145366 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1145366 trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1145366 trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1145366 trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1145366 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1145366 trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1145366 trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1145366 
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1145366 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1145366 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 1145366 trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java PRE-CREATION trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure.q PRE-CREATION trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure2.q PRE-CREATION trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure3.q PRE-CREATION trunk/ql/src/test/queries/clientpositive/alter_rename_partition.q PRE-CREATION trunk/ql/src/test/queries/clientpositive/alter_rename_partition_authorization.q PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out PRE-CREATION trunk/ql/src/test/results/clientpositive/alter_rename_partition.q.out PRE-CREATION trunk/ql/src/test/results/clientpositive/alter_rename_partition_authorization.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1105/diff Testing --- Add a partition A in the table Rename 
partition A to partition B Show the partitions in the table, it returns partition B. SELECT the data from partition A, it returns no results SELECT the data from partition B, it returns the data originally stored in partition A Thanks, Weiyan ALTER TABLE RENAME PARTITION Key: HIVE-2247 URL: https://issues.apache.org/jira/browse/HIVE-2247 Project: Hive Issue Type: New Feature Reporter: Siying Dong Assignee: Weiyan Wang Attachments: HIVE-2247.3.patch.txt, HIVE-2247.4.patch.txt, HIVE-2247.5.patch.txt, HIVE-2247.6.patch.txt, HIVE-2247.7.patch.txt We
[jira] [Commented] (HIVE-2247) ALTER TABLE RENAME PARTITION
[ https://issues.apache.org/jira/browse/HIVE-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088626#comment-13088626 ] jirapos...@reviews.apache.org commented on HIVE-2247: - bq. On 2011-08-19 22:52:29, Siying Dong wrote: bq. I don't think so, because I already renamed the old partition to new one. If I run Partition part = db.getPartition(tbl, renamePartitionDesc.getOldPartSpec(), false);, I will get null instead. - Weiyan --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1105/#review1572 --- On 2011-07-29 02:39:50, Weiyan Wang wrote: bq. bq. --- bq. This is an automatically generated e-mail. To reply, visit: bq. https://reviews.apache.org/r/1105/ bq. --- bq. bq. (Updated 2011-07-29 02:39:50) bq. bq. bq. Review request for Siying Dong. bq. bq. bq. Summary bq. --- bq. bq. Implement ALTER TABLE PARTITION RENAME function to rename a partition. bq. Add HiveQL syntax ALTER TABLE bar PARTITION (k1='v1', k2='v2') RENAME TO PARTITION (k1='v3', k2='v4'); bq. This is my first Hive diff, I just learn everything from existing codebase and may not have a good understanding on it. bq. Feel free to inform me if I make something wrong. Thanks bq. bq. bq. This addresses bug HIVE-2247. bq. https://issues.apache.org/jira/browse/HIVE-2247 bq. bq. bq. Diffs bq. - bq. bq.trunk/metastore/if/hive_metastore.thrift 1145366 bq.trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.h 1145366 bq.trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp 1145366 bq. trunk/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore_server.skeleton.cpp 1145366 bq. trunk/metastore/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/metastore/api/ThriftHiveMetastore.java 1145366 bq. trunk/metastore/src/gen/thrift/gen-php/hive_metastore/ThriftHiveMetastore.php 1145366 bq. trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore-remote 1145366 bq. 
trunk/metastore/src/gen/thrift/gen-py/hive_metastore/ThriftHiveMetastore.py 1145366 bq.trunk/metastore/src/gen/thrift/gen-rb/thrift_hive_metastore.rb 1145366 bq. trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1145366 bq. trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1145366 bq. trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1145366 bq. trunk/metastore/src/java/org/apache/hadoop/hive/metastore/IMetaStoreClient.java 1145366 bq. trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1145366 bq.trunk/metastore/src/java/org/apache/hadoop/hive/metastore/RawStore.java 1145366 bq. trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1145366 bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1145366 bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1145366 bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java 1145366 bq. trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1145366 bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1145366 bq. trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 1145366 bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/AlterTableDesc.java 1145366 bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/DDLWork.java 1145366 bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/HiveOperation.java 1145366 bq. trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/RenamePartitionDesc.java PRE-CREATION bq. trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure.q PRE-CREATION bq. trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure2.q PRE-CREATION bq. trunk/ql/src/test/queries/clientnegative/alter_rename_partition_failure3.q PRE-CREATION bq.trunk/ql/src/test/queries/clientpositive/alter_rename_partition.q PRE-CREATION bq. 
trunk/ql/src/test/queries/clientpositive/alter_rename_partition_authorization.q PRE-CREATION bq. trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure.q.out PRE-CREATION bq. trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure2.q.out PRE-CREATION bq. trunk/ql/src/test/results/clientnegative/alter_rename_partition_failure3.q.out PRE-CREATION bq.trunk/ql/src/test/results/clientpositive/alter_rename_partition.q.out PRE-CREATION bq.
[jira] [Commented] (HIVE-2399) When using PARTITION, cannot execute SELECT statement
[ https://issues.apache.org/jira/browse/HIVE-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088683#comment-13088683 ] Chinna Rao Lalam commented on HIVE-2399: This issue is related to HIVE-2111. Please check that issue for more information. When using PARTITION, cannot execute SELECT statement - Key: HIVE-2399 URL: https://issues.apache.org/jira/browse/HIVE-2399 Project: Hive Issue Type: Bug Components: CLI Affects Versions: 0.7.0 Environment: OS: Red Hat Enterprise Linux AS release 4 (Nahant Update 5) Hadoop: 0.20.2 hive: 0.7.0 Reporter: yue.zhang -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-1105) Add service script for starting metastore server
[ https://issues.apache.org/jira/browse/HIVE-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088728#comment-13088728 ] XiaoboGu commented on HIVE-1105: But how do we start the metastore service as a daemon, write its pid into a pid file, and stop it with a command such as hive --service metastore stop? Add service script for starting metastore server Key: HIVE-1105 URL: https://issues.apache.org/jira/browse/HIVE-1105 Project: Hive Issue Type: Improvement Components: Server Infrastructure Affects Versions: 0.4.1 Reporter: John Sichi Assignee: John Sichi Priority: Minor Fix For: 0.5.0 Attachments: HIVE-1105.1.patch The instructions on this page recommend running Java directly in order to start the metastore: http://wiki.apache.org/hadoop/Hive/AdminManual/MetastoreAdmin Since we already have a generic service-starter script, it would be nice to be able to do this instead: hive --service metastore I've written a metastore.sh for this purpose. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2017) Driver.execute() should maintain SessionState in case of runtime errors
[ https://issues.apache.org/jira/browse/HIVE-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-2017: --- Attachment: HIVE-2017.1.patch Driver.execute() should maintain SessionState in case of runtime errors -- Key: HIVE-2017 URL: https://issues.apache.org/jira/browse/HIVE-2017 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-2017.1.patch Here's a snippet from Driver.execute(): {code} // TODO: This error messaging is not very informative. Fix that. errorMessage = "FAILED: Execution Error, return code " + exitVal + " from " + tsk.getClass().getName(); SQLState = "08S01"; console.printError(errorMessage); if (running.size() != 0) { taskCleanup(); } return 9; {code} It simply returns in case of runtime errors without maintaining SessionState. It could cause the resource leak mentioned in HIVE-1959. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: removed system.exit() from Driver.taskCleanup() and added code to kill the remaining tasks
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1609/ --- Review request for hive, John Sichi and Ning Zhang. Summary --- While processing two parallel tasks, if one of the tasks fails, Driver.taskCleanup() calls System.exit(), which shuts down the JVM. That call is replaced with logic to stop the remaining tasks. Two kinds of tasks need cleanup: non-MR tasks and MR tasks. Non-MR tasks are stopped with Thread.interrupt(), since every non-MR task executes as a thread. For MR tasks, a variable named jobKillUri tracks each spawned job's kill URI; in taskCleanup(), an MR task is killed through this URI. This addresses bug HIVE-2017. https://issues.apache.org/jira/browse/HIVE-2017 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 1160102 trunk/ql/src/test/queries/clientnegative/alter_exit.q PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_exit.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1609/diff Testing --- Added a test case for this scenario. Ran existing test cases. Thanks, chinna
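The non-MR half of the cleanup described above can be sketched in plain Java: instead of System.exit(), interrupt the threads running the remaining tasks and wait for them to stop. This is a hypothetical illustration, not Hive's actual Driver code; the class and field names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskCleanupDemo {
    // Illustrative stand-in for a running non-MR task thread.
    static class TaskThread extends Thread {
        volatile boolean stopped = false;
        @Override public void run() {
            try {
                Thread.sleep(60_000); // simulate a long-running task
            } catch (InterruptedException e) {
                stopped = true;       // task observes the interrupt and stops
            }
        }
    }

    // Interrupt every still-running task instead of calling System.exit().
    static void taskCleanup(List<TaskThread> running) throws InterruptedException {
        for (TaskThread t : running) {
            t.interrupt();
        }
        for (TaskThread t : running) {
            t.join();                 // wait for each task to wind down cleanly
        }
    }

    public static void main(String[] args) throws InterruptedException {
        List<TaskThread> running = new ArrayList<>();
        for (int i = 0; i < 2; i++) {
            TaskThread t = new TaskThread();
            t.start();
            running.add(t);
        }
        taskCleanup(running);
        for (TaskThread t : running) {
            System.out.println("task stopped: " + t.stopped); // true for both
        }
    }
}
```

Killing MR tasks would additionally require tracking each job's kill URI, as the review describes; that part is not modeled here.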
[jira] [Commented] (HIVE-2017) Driver.execute() should maintain SessionState in case of runtime errors
[ https://issues.apache.org/jira/browse/HIVE-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1306#comment-1306 ] jirapos...@reviews.apache.org commented on HIVE-2017: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1609/ --- Review request for hive, John Sichi and Ning Zhang. Summary --- While processing two parallel tasks, if one of the tasks fails, Driver.taskCleanup() calls System.exit(), which shuts down the JVM. That call is replaced with logic to stop the remaining tasks. Two kinds of tasks need cleanup: non-MR tasks and MR tasks. Non-MR tasks are stopped with Thread.interrupt(), since every non-MR task executes as a thread. For MR tasks, a variable named jobKillUri tracks each spawned job's kill URI; in taskCleanup(), an MR task is killed through this URI. This addresses bug HIVE-2017. https://issues.apache.org/jira/browse/HIVE-2017 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 1160102 trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 1160102 trunk/ql/src/test/queries/clientnegative/alter_exit.q PRE-CREATION trunk/ql/src/test/results/clientnegative/alter_exit.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1609/diff Testing --- Added a test case for this scenario. 
Ran existing test cases. Thanks, chinna -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2017) Driver.execute() should maintain SessionState in case of runtime errors
[ https://issues.apache.org/jira/browse/HIVE-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1305#comment-1305 ] Chinna Rao Lalam commented on HIVE-2017: While processing two parallel tasks, if one of the tasks fails, Driver.taskCleanup() calls System.exit(), which shuts down the JVM. That call is replaced with logic to stop the remaining tasks. Two kinds of tasks need cleanup: non-MR tasks and MR tasks. Non-MR tasks are stopped with Thread.interrupt(), since every non-MR task executes as a thread. For MR tasks, a variable named jobKillUri tracks each spawned job's kill URI; in taskCleanup(), an MR task is killed through this URI. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: Review Request: Clean up the scratch.dir (tmp/hive-root) while restarting Hive server.
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1481/ --- (Updated 2011-08-22 18:23:15.749654) Review request for hive. Changes --- Fixed review comments and updated the patch. Summary --- Queries currently leave their map outputs under scratch.dir after execution. If the Hive server is stopped, there is no need to keep the stopped server's map outputs, so the scratch.dir can be cleared while starting the server. This helps improve disk usage. Implemented a cleanup method in HiveServer that triggers based on the value of the property hive.start.cleanup.scrachdir This addresses bug HIVE-2181. https://issues.apache.org/jira/browse/HIVE-2181 Diffs (updated) - trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1158629 trunk/conf/hive-default.xml 1158629 trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1158629 trunk/service/src/test/org/apache/hadoop/hive/service/TestHiveServer.java 1158629 Diff: https://reviews.apache.org/r/1481/diff Testing --- Added a test case for this scenario. Thanks, chinna
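The startup cleanup described above can be sketched in plain Java against the local filesystem (a hypothetical stand-in for the HiveServer method; the actual patch presumably works with Hadoop FileSystem paths, and the method name and flag are invented for the example):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class ScratchDirCleanup {
    // If the cleanup flag is enabled, delete everything under the scratch
    // directory before the server starts serving queries.
    static void cleanupOnStart(Path scratchDir, boolean cleanupEnabled) throws IOException {
        if (!cleanupEnabled || !Files.isDirectory(scratchDir)) {
            return; // nothing to do, or the dir does not exist yet
        }
        try (Stream<Path> walk = Files.walk(scratchDir)) {
            walk.sorted(Comparator.reverseOrder())   // children before parents
                .filter(p -> !p.equals(scratchDir))  // keep the scratch dir itself
                .forEach(p -> {
                    try {
                        Files.delete(p);
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                });
        }
    }

    public static void main(String[] args) throws IOException {
        Path scratch = Files.createTempDirectory("hive-scratch");
        Files.createDirectories(scratch.resolve("job1"));
        Files.writeString(scratch.resolve("job1").resolve("map-output.txt"), "stale data");
        cleanupOnStart(scratch, true);
        System.out.println("scratch dir kept: " + Files.isDirectory(scratch));
        System.out.println("contents cleared: " + (Files.list(scratch).count() == 0));
    }
}
```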
Re: Review Request: Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1088/ --- (Updated 2011-08-22 18:23:44.041004) Review request for hive and John Sichi. Changes --- Fixed review comments and updated the patch. Summary --- Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close() This addresses bug HIVE-2184. https://issues.apache.org/jira/browse/HIVE-2184 Diffs (updated) - trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1154844 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1154844 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1154844 Diff: https://reviews.apache.org/r/1088/diff Testing --- Ran all the testcases Thanks, chinna
[jira] [Commented] (HIVE-2181) Clean up the scratch.dir (tmp/hive-root) while restarting Hive server.
[ https://issues.apache.org/jira/browse/HIVE-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1308#comment-1308 ] jirapos...@reviews.apache.org commented on HIVE-2181: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1481/ --- (Updated 2011-08-22 18:23:15.749654) Review request for hive. Changes --- Fixed review comments and updated the patch. Summary --- Queries currently leave their map outputs under scratch.dir after execution. If the Hive server is stopped, there is no need to keep the stopped server's map outputs, so the scratch.dir can be cleared while starting the server. This helps improve disk usage. Implemented a cleanup method in HiveServer that triggers based on the value of the property hive.start.cleanup.scrachdir This addresses bug HIVE-2181. https://issues.apache.org/jira/browse/HIVE-2181 Diffs (updated) - trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1158629 trunk/conf/hive-default.xml 1158629 trunk/service/src/java/org/apache/hadoop/hive/service/HiveServer.java 1158629 trunk/service/src/test/org/apache/hadoop/hive/service/TestHiveServer.java 1158629 Diff: https://reviews.apache.org/r/1481/diff Testing --- Added a test case for this scenario. Thanks, chinna Clean up the scratch.dir (tmp/hive-root) while restarting Hive server. Key: HIVE-2181 URL: https://issues.apache.org/jira/browse/HIVE-2181 Project: Hive Issue Type: Bug Components: Server Infrastructure Affects Versions: 0.8.0 Environment: Suse linux, Hadoop 20.1, Hive 0.8 Reporter: sanoj mathew Assignee: Chinna Rao Lalam Priority: Minor Attachments: HIVE-2181.1.patch, HIVE-2181.2.patch, HIVE-2181.patch Original Estimate: 48h Remaining Estimate: 48h Queries currently leave their map outputs under scratch.dir after execution. If the Hive server is stopped, there is no need to keep the stopped server's map outputs, so the scratch.dir can be cleared while starting the server. This helps improve disk usage. 
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name
[ https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-1996: --- Attachment: HIVE-1996.1.Patch LOAD DATA INPATH fails when the table already contains a file of the same name Key: HIVE-1996 URL: https://issues.apache.org/jira/browse/HIVE-1996 Project: Hive Issue Type: Bug Affects Versions: 0.7.0 Reporter: Kirk True Assignee: Chinna Rao Lalam Attachments: HIVE-1996.1.Patch, HIVE-1996.Patch Steps: 1. From the command line copy the kv2.txt data file into the current user's HDFS directory: {{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt kv2.txt}} 2. In Hive, create the table: {{create table tst_src1 (key_ int, value_ string);}} 3. Load the data into the table from HDFS: {{load data inpath './kv2.txt' into table tst_src1;}} 4. Repeat step 1 5. Repeat step 3 Expected: To have kv2.txt renamed in HDFS and then copied to the destination as per HIVE-307. Actual: File is renamed, but {{Hive.copyFiles}} doesn't see the change in {{srcs}} as it continues to use the same array elements (with the un-renamed, old file names). 
It crashes with this error: {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725) at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541) at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173) at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
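The expected behavior per HIVE-307 (rename the incoming file when the destination already has one of the same name, then copy the renamed file) can be sketched as follows. This is a hypothetical illustration, not Hive's actual Hive.copyFiles code, and the renaming scheme (name_copy_N) is invented for the example; the reported bug is equivalent to continuing to use the pre-rename source names after deciding on the new ones.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class CopyWithRename {
    // Copy src into destDir; if a file with the same name already exists,
    // pick a fresh name and use the RENAMED path for the copy.
    static Path copyFile(Path src, Path destDir) throws IOException {
        Path dest = destDir.resolve(src.getFileName());
        int n = 1;
        while (Files.exists(dest)) {
            // The HIVE-1996 bug amounted to copying to the ORIGINAL name
            // even after computing this renamed destination.
            dest = destDir.resolve(src.getFileName() + "_copy_" + n++);
        }
        return Files.copy(src, dest);
    }

    public static void main(String[] args) throws IOException {
        Path destDir = Files.createTempDirectory("warehouse");
        Path src = Files.createTempFile("kv2", ".txt");
        Path first = copyFile(src, destDir);
        Path second = copyFile(src, destDir); // same name present -> renamed copy
        System.out.println("distinct names: " + !first.getFileName().equals(second.getFileName()));
    }
}
```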
[jira] [Updated] (HIVE-2017) Driver.execute() should maintain SessionState in case of runtime errors
[ https://issues.apache.org/jira/browse/HIVE-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-2017: --- Status: Patch Available (was: Open) Driver.execute() should maintain SessionState in case of runtime errors -- Key: HIVE-2017 URL: https://issues.apache.org/jira/browse/HIVE-2017 Project: Hive Issue Type: Bug Reporter: Ning Zhang Assignee: Chinna Rao Lalam Attachments: HIVE-2017.1.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2184) Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close()
[ https://issues.apache.org/jira/browse/HIVE-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088891#comment-13088891 ] jirapos...@reviews.apache.org commented on HIVE-2184: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1088/ --- (Updated 2011-08-22 18:23:44.041004) Review request for hive and John Sichi. Changes --- Fixed review comments and updated the patch. Summary --- Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close() This addresses bug HIVE-2184. https://issues.apache.org/jira/browse/HIVE-2184 Diffs (updated) - trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1154844 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java 1154844 trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1154844 Diff: https://reviews.apache.org/r/1088/diff Testing --- Ran all the test cases Thanks, chinna Few improvements in org.apache.hadoop.hive.ql.metadata.Hive.close() --- Key: HIVE-2184 URL: https://issues.apache.org/jira/browse/HIVE-2184 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.5.0, 0.8.0 Environment: Hadoop 0.20.1, Hive 0.8.0 and SUSE Linux Enterprise Server 10 SP2 (i586) - Kernel 2.6.16.60-0.21-smp (5) Reporter: Chinna Rao Lalam Assignee: Chinna Rao Lalam Attachments: HIVE-2184.1.patch, HIVE-2184.1.patch, HIVE-2184.2.patch, HIVE-2184.3.patch, HIVE-2184.patch 1) Hive.close() calls HiveMetaStoreClient.close(); in that method the variable standAloneClient never becomes true, so client.shutdown() is never called. 2) In Hive.close(), after calling metaStoreClient.close(), metaStoreClient should be set to null.
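The two improvements in the report can be sketched as follows; MetaClient and HiveCloseSketch are invented stand-ins for HiveMetaStoreClient and Hive, not the real classes.

```java
// Hypothetical sketch of the two fixes described in the report.
class MetaClient {
    boolean shutdownCalled = false;

    void close() {
        // fix 1: actually shut the client down instead of gating it on a
        // standAloneClient flag that never becomes true
        shutdownCalled = true;
    }
}

class HiveCloseSketch {
    MetaClient metaStoreClient = new MetaClient();

    void close() {
        if (metaStoreClient != null) {
            metaStoreClient.close();
            // fix 2: drop the reference so later use creates a fresh client
            // and a second close() is a harmless no-op
            metaStoreClient = null;
        }
    }
}
```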
[jira] [Commented] (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name
[ https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088903#comment-13088903 ] Chinna Rao Lalam commented on HIVE-1996: This scenario works when loading from the local file system, but it fails when loading from HDFS. To replicate this scenario I used the following query: create table load_overwrite2 (key string, value string) stored as textfile location 'file:/tmp1/load2_overwrite2'; As part of this query's execution it should create file:/tmp1/load2_overwrite2. I have verified this in my environment and it works without failure. Please let me know if there are any issues. LOAD DATA INPATH fails when the table already contains a file of the same name Key: HIVE-1996 URL: https://issues.apache.org/jira/browse/HIVE-1996 Project: Hive Issue Type: Bug Affects Versions: 0.7.0 Reporter: Kirk True Assignee: Chinna Rao Lalam Attachments: HIVE-1996.1.Patch, HIVE-1996.Patch Steps: 1. From the command line copy the kv2.txt data file into the current user's HDFS directory: {{$ hadoop fs -copyFromLocal /path/to/hive/sources/data/files/kv2.txt kv2.txt}} 2. In Hive, create the table: {{create table tst_src1 (key_ int, value_ string);}} 3. Load the data into the table from HDFS: {{load data inpath './kv2.txt' into table tst_src1;}} 4. Repeat step 1 5. Repeat step 3 Expected: To have kv2.txt renamed in HDFS and then copied to the destination as per HIVE-307. Actual: File is renamed, but {{Hive.copyFiles}} doesn't see the change in {{srcs}} as it continues to use the same array elements (with the un-renamed, old file names).
It crashes with this error: {noformat} java.lang.NullPointerException at org.apache.hadoop.hive.ql.metadata.Hive.copyFiles(Hive.java:1725) at org.apache.hadoop.hive.ql.metadata.Table.copyFiles(Table.java:541) at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1173) at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:197) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1060) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:897) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:745) at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164) at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241) at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.hadoop.util.RunJar.main(RunJar.java:156) {noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: LOAD DATA INPATH fails when the table already contains a file of the same name
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1610/ --- Review request for hive, Carl Steinbach and John Sichi. Summary --- LOAD DATA INPATH fails when the table already contains a file of the same name. If a name conflict occurs the file is renamed, but the load step then still uses the old name, so the load fails. The fix is to load with the changed filename: a map maintains the old and new filenames as key-value pairs, and the load step consults this map. This addresses bug HIVE-1996. https://issues.apache.org/jira/browse/HIVE-1996 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1160102 trunk/ql/src/test/queries/clientpositive/input44.q PRE-CREATION trunk/ql/src/test/results/clientpositive/input44.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1610/diff Testing --- Added a test case for this scenario. Thanks, chinna
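The map-based approach described in the summary can be sketched as below. The names (CopyFilesSketch, planCopies, the "_copy_N" suffix) are invented for illustration and are not Hive's actual code; the real patch lives in Hive.copyFiles.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: when a source file collides with an existing file in
// the table directory, record old name -> new name in a map, and have the
// load step look up the final name instead of reusing the stale one.
class CopyFilesSketch {
    static Map<String, String> planCopies(String[] srcs, Set<String> existing) {
        Map<String, String> finalNames = new HashMap<>();
        for (String src : srcs) {
            String target = src;
            int attempt = 0;
            while (existing.contains(target)) {
                // pick a conflict-free name (suffix scheme is an assumption)
                target = src + "_copy_" + (++attempt);
            }
            existing.add(target);
            // the load step consults this map instead of the original name
            finalNames.put(src, target);
        }
        return finalNames;
    }
}
```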
[jira] [Commented] (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name
[ https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088911#comment-13088911 ] jirapos...@reviews.apache.org commented on HIVE-1996: - --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/1610/ --- Review request for hive, Carl Steinbach and John Sichi. Summary --- LOAD DATA INPATH fails when the table already contains a file of the same name. If a name conflict occurs the file is renamed, but the load step then still uses the old name, so the load fails. The fix is to load with the changed filename: a map maintains the old and new filenames as key-value pairs, and the load step consults this map. This addresses bug HIVE-1996. https://issues.apache.org/jira/browse/HIVE-1996 Diffs - trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 1160102 trunk/ql/src/test/queries/clientpositive/input44.q PRE-CREATION trunk/ql/src/test/results/clientpositive/input44.q.out PRE-CREATION Diff: https://reviews.apache.org/r/1610/diff Testing --- Added a test case for this scenario. Thanks, chinna
[jira] [Updated] (HIVE-1996) LOAD DATA INPATH fails when the table already contains a file of the same name
[ https://issues.apache.org/jira/browse/HIVE-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chinna Rao Lalam updated HIVE-1996: --- Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-1989) recognize transitivity of predicates on join keys
[ https://issues.apache.org/jira/browse/HIVE-1989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1989: - Status: Open (was: Patch Available) I see failures in the following tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_auto_join_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_join_filters org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_ppd_outer_join4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8 recognize transitivity of predicates on join keys - Key: HIVE-1989 URL: https://issues.apache.org/jira/browse/HIVE-1989 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.6.0 Reporter: John Sichi Assignee: Charles Chen Fix For: 0.8.0 Attachments: HIVE-1989v1.patch, HIVE-1989v4.patch, HIVE-1989v5-WITH-HIVE-2382v1.patch, HIVE-1989v6-WITH-HIVE-2383v1.patch Given {noformat} set hive.mapred.mode=strict; create table invites (foo int, bar string) partitioned by (ds string); create table invites2 (foo int, bar string) partitioned by (ds string); select count(*) from invites join invites2 on invites.ds=invites2.ds where invites.ds='2011-01-01'; {noformat} currently an error occurs: {noformat} Error in semantic analysis: No Partition Predicate Found for Alias invites2 Table invites2 {noformat} The optimizer should be able to infer a predicate on invites2 via transitivity. 
This limitation places a burden on the user to add a redundant predicate, and makes it impossible (at least in strict mode) to define join views where both underlying tables are partitioned (the join's select list has to pick one of the tables arbitrarily).
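The kind of inference the optimizer should perform can be sketched as constant propagation across join-key equalities, run to a fixed point. This is an illustration of the idea only (the names and data structures are invented, not Hive's optimizer code): given invites.ds = invites2.ds and invites.ds = '2011-01-01', it infers invites2.ds = '2011-01-01', which would let partition pruning apply to both tables.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: propagate known column = constant predicates across
// join-key equality pairs until no new predicate can be inferred.
class TransitivitySketch {
    static Map<String, String> inferConstants(List<String[]> equalities,
                                              Map<String, String> known) {
        Map<String, String> inferred = new HashMap<>(known);
        boolean changed = true;
        while (changed) {                     // iterate to a fixed point
            changed = false;
            for (String[] eq : equalities) {
                String a = eq[0], b = eq[1];  // "a = b" on join keys
                if (inferred.containsKey(a) && !inferred.containsKey(b)) {
                    inferred.put(b, inferred.get(a));
                    changed = true;
                } else if (inferred.containsKey(b) && !inferred.containsKey(a)) {
                    inferred.put(a, inferred.get(b));
                    changed = true;
                }
            }
        }
        return inferred;
    }
}
```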
[jira] [Updated] (HIVE-2278) Support archiving for multiple partitions if the table is partitioned by multiple columns
[ https://issues.apache.org/jira/browse/HIVE-2278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2278: - Status: Open (was: Patch Available) I see failures in the following tests. Please take a look: org.apache.hadoop.hive.metastore.TestMetaStoreEventListener.testListener org.apache.hadoop.hive.metastore.TestRemoteHiveMetaStore.testSynchronized org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_corrupt org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_archive_multi org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_authorization_2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_create_view_partitioned org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_drop_multi_partitions org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_escape1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_inputddl6 Support archiving for multiple partitions if the table is partitioned by multiple columns - Key: HIVE-2278 URL: https://issues.apache.org/jira/browse/HIVE-2278 Project: Hive Issue Type: New Feature Reporter: Namit Jain Assignee: Marcin Kurczych Attachments: HIVE-2278.2.patch, HIVE-2278.3.patch, HIVE-2278.4.patch, HIVE-2278.5.patch, HIVE-2278.5.patch, HIVE-2278.6.patch, hive.2278.1.patch If a table is partitioned by ds,hr it should be possible to archive all the files in ds to reduce the number of files -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Reopened] (HIVE-2350) Improve RCFile Read Speed
[ https://issues.apache.org/jira/browse/HIVE-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach reopened HIVE-2350: -- @Tim: Yes, looks like closing this was a mistake on my part. Your latest patch looks good, but you forgot to check the box that grants license rights to the ASF. Can you please attach the patch again and check the box this time? Thanks. Improve RCFile Read Speed - Key: HIVE-2350 URL: https://issues.apache.org/jira/browse/HIVE-2350 Project: Hive Issue Type: Improvement Reporter: Tim Armstrong Assignee: Tim Armstrong Priority: Minor Fix For: 0.8.0 Attachments: rcfile-2011-08-04.diff, rcfile_opt_2011-08-05.diff, rcfile_opt_2011-08-05b.diff, rcfile_opt_2011-08-11.patch Original Estimate: 0h Remaining Estimate: 0h By tweaking the RCFile$Reader implementation to allow more efficient memory access I was able to reduce CPU usage. I measured the speed required to scan a gzipped RCFile, decompress it, and assemble it into records. CPU time was reduced by about 7% for a full table scan; an improvement of about 2% was realised when a smaller subset of columns (3-5 out of tens) was selected.
[jira] [Commented] (HIVE-2396) RCFileReader Buffer Reuse
[ https://issues.apache.org/jira/browse/HIVE-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13088971#comment-13088971 ] Carl Steinbach commented on HIVE-2396: -- @Yongqiang: HIVE-2350 was not committed. Closing that ticket was a mistake on my part. Yes, I'll commit it to trunk. RCFileReader Buffer Reuse - Key: HIVE-2396 URL: https://issues.apache.org/jira/browse/HIVE-2396 Project: Hive Issue Type: Improvement Components: Query Processor Affects Versions: 0.8.0 Reporter: Tim Armstrong Assignee: Tim Armstrong Priority: Minor Fix For: 0.8.0 Attachments: rcfile_bufreuse_2011-08-19.patch Minor tweak to RCFile$Reader which improves read performance by reusing value buffers. This reduces object allocation. Depends on HIVE-2350 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2350) Improve RCFile Read Speed
[ https://issues.apache.org/jira/browse/HIVE-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Armstrong updated HIVE-2350: Attachment: rcfile_opt_2011-08-11.patch
[jira] [Commented] (HIVE-2350) Improve RCFile Read Speed
[ https://issues.apache.org/jira/browse/HIVE-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13088978#comment-13088978 ] Tim Armstrong commented on HIVE-2350: - Oops - my mistake.
[jira] [Commented] (HIVE-2350) Improve RCFile Read Speed
[ https://issues.apache.org/jira/browse/HIVE-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089024#comment-13089024 ] Carl Steinbach commented on HIVE-2350: -- +1. Will commit if tests pass.
[jira] [Updated] (HIVE-2303) files with control-A,B are not delimited correctly.
[ https://issues.apache.org/jira/browse/HIVE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2303: - Status: Open (was: Patch Available) I see diffs in the following tests: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_combine2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_filter_join_breaktask org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input23 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input42 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part7 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input_part9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_louter_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_outer_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_pcr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_rand_partitionpruner3 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_regexp_extract org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_router_join_ppr org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample10 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample8 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_sample9 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr1 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_transform_ppr2 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_explode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udf_reflect org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_udtf_explode org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_union_ppr org.apache.hadoop.hive.cli.TestNegativeMinimrCliDriver.testNegativeCliDriver_minimr_broken_pipe org.apache.hadoop.hive.ql.parse.TestParse.testParse_cast1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 
org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input8 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_case org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_when @Amareshwari: Can you please take a look? Thanks. files with control-A,B are not delimited correctly. --- Key: HIVE-2303 URL: https://issues.apache.org/jira/browse/HIVE-2303 Project: Hive Issue Type: Bug Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.8.0 Attachments: patch-2303.txt The following is from one of our users: create external table impressions (imp string, msg string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile location '/xxx'; Some strings in my data contains Control-A, Control-B etc as internal delimiters. If I do a Select * from impressions limit 10; All fields were able to print correctly. 
However, if I do a select * from impressions where msg regexp '.*' limit 10; the fields are broken by the control characters. The difference between the two commands is that the latter requires a map-reduce job.
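One plausible reading of the symptom above, sketched for illustration (this is not Hive's serde code, and the intermediate-serialization explanation is an assumption): the table declares '\t' as its field delimiter, but if a map-reduce stage re-serializes rows using Hive's default Ctrl-A ('\u0001') delimiter, any Ctrl-A embedded in the data shifts the field boundaries.

```java
// Illustrative sketch of how an embedded control character moves field
// boundaries when the wrong delimiter is applied.
class DelimiterSketch {
    static String[] split(String row, char delim) {
        // split on a single literal character; -1 keeps trailing empty fields
        return row.split(java.util.regex.Pattern.quote(String.valueOf(delim)), -1);
    }
}
```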
hive-0.7.1: TestCliDriver FAILED
Hi, all When I tried to run the standard test cases in Hive 0.7.1 against the Sun 1.6 JDK, I found that TestCliDriver failed. The version of the JDK I used is: java version 1.6.0_27-ea Java(TM) SE Runtime Environment (build 1.6.0_27-ea-b03) Java HotSpot(TM) 64-Bit Server VM (build 20.2-b03, mixed mode) My steps: 1. ant clean 2. ant package 3. ant test Here is a snapshot of the failure: [junit] Done query: script_env_var2.q [junit] Begin query: script_pipe.q [junit] junit.framework.AssertionFailedError: Client execution results failed with error code = 1 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. [junit] at junit.framework.Assert.fail(Assert.java:47) [junit] at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_script_pipe(TestCliDriver.java:21067) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) [junit] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) [junit] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) [junit] at java.lang.reflect.Method.invoke(Method.java:597) [junit] at junit.framework.TestCase.runTest(TestCase.java:154) [junit] at junit.framework.TestCase.runBare(TestCase.java:127) [junit] at junit.framework.TestResult$1.protect(TestResult.java:106) [junit] at junit.framework.TestResult.runProtected(TestResult.java:124) [junit] at junit.framework.TestResult.run(TestResult.java:109) [junit] at junit.framework.TestCase.run(TestCase.java:118) [junit] at junit.framework.TestSuite.runTest(TestSuite.java:208) [junit] at junit.framework.TestSuite.run(TestSuite.java:203) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052) [junit] at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906) [junit] diff -a -I file: -I pfile: -I hdfs: -I
/tmp/ -I invalidscheme: -I lastUpdateTime -I lastAccessTime -I [Oo]wner -I CreateTime -I LastAccessTime -I Location -I transient_lastDdlTime -I last_modified_ -I java.lang.RuntimeException -I at org -I at sun -I at java -I at junit -I Caused by: -I LOCK_QUERYID: -I grantTime -I [.][.][.] [0-9]* more -I USING 'java -cp /home/libing/hive-0.7.1/src/build/ql/test/logs/clientpositive/script_pipe.q.out /home/libing/hive-0.7.1/src/ql/src/test/results/clientpositive/script_pipe.q.out [junit] 143c143,144 [junit] POSTHOOK: Output: file:/tmp/libing/hive_2011-08-21_23-27-41_670_8767305526316071428/-mr-1 [junit] --- [junit] POSTHOOK: Output: file:/tmp/sdong/hive_2011-02-10_17-04-27_817_7785884157237702561/-mr-1 [junit] 238 val_238 238 val_238 [junit] Exception: Client execution results failed with error code = 1 [junit] See build/ql/tmp/hive.log, or try ant test ... -Dtest.silent=false to get more logs. [junit] Begin query: select_as_omitted.q Have you met this before? Thanks
Re: hive-0.7.1: TestCliDriver FAILED
From the result, we can see that the only difference between the source and target files is the path, which should be masked when the files are compared. Bing --- On Monday, August 22, 2011, 李 冰 lib...@yahoo.com.cn wrote: From: 李 冰 lib...@yahoo.com.cn Subject: hive-0.7.1: TestCliDriver FAILED To: u...@hive.apache.org Cc: dev@hive.apache.org Date: Monday, August 22, 2011, 10:43 PM
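The masking Bing describes can be illustrated as below. The regex and the MASKED token are assumptions for the sketch, not the patterns the Hive test harness actually uses (the harness passes -I exclusion patterns to diff, as the log shows): the user- and timestamp-dependent parts of output paths are neutralized before comparison so they cannot cause spurious diffs.

```java
// Hypothetical illustration: normalize volatile /tmp output paths so that
// expected and actual .q.out lines compare equal.
class MaskPaths {
    static String mask(String line) {
        // e.g. file:/tmp/libing/hive_2011-08-21_..._8767.../-mr-1 -> file:MASKED
        return line.replaceAll("file:/tmp/\\S+", "file:MASKED");
    }
}
```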
[jira] [Resolved] (HIVE-2350) Improve RCFile Read Speed
[ https://issues.apache.org/jira/browse/HIVE-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach resolved HIVE-2350. -- Resolution: Fixed Committed to trunk (for real this time!). Thanks Tim! Improve RCFile Read Speed - Key: HIVE-2350 URL: https://issues.apache.org/jira/browse/HIVE-2350 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Tim Armstrong Assignee: Tim Armstrong Priority: Minor Labels: rcfile Fix For: 0.8.0 Attachments: rcfile-2011-08-04.diff, rcfile_opt_2011-08-05.diff, rcfile_opt_2011-08-05b.diff, rcfile_opt_2011-08-11.patch, rcfile_opt_2011-08-11.patch Original Estimate: 0h Remaining Estimate: 0h By tweaking the RCFile$Reader implementation to allow more efficient memory access, I was able to reduce CPU usage. I measured the speed required to scan a gzipped RCFile, decompress it, and assemble it into records. CPU time was reduced by about 7% for a full table scan. An improvement of about 2% was realised when a smaller subset of columns (3-5 out of tens) was selected. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2350) Improve RCFile Read Speed
[ https://issues.apache.org/jira/browse/HIVE-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2350: - Component/s: Serializers/Deserializers Improve RCFile Read Speed - Key: HIVE-2350 URL: https://issues.apache.org/jira/browse/HIVE-2350 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Reporter: Tim Armstrong Assignee: Tim Armstrong Priority: Minor Labels: rcfile Fix For: 0.8.0 Attachments: rcfile-2011-08-04.diff, rcfile_opt_2011-08-05.diff, rcfile_opt_2011-08-05b.diff, rcfile_opt_2011-08-11.patch, rcfile_opt_2011-08-11.patch Original Estimate: 0h Remaining Estimate: 0h By tweaking the RCFile$Reader implementation to allow more efficient memory access, I was able to reduce CPU usage. I measured the speed required to scan a gzipped RCFile, decompress it, and assemble it into records. CPU time was reduced by about 7% for a full table scan. An improvement of about 2% was realised when a smaller subset of columns (3-5 out of tens) was selected.
[jira] [Updated] (HIVE-2305) UNION ALL on different types throws runtime exception
[ https://issues.apache.org/jira/browse/HIVE-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2305: - Status: Open (was: Patch Available) @Franklin: I see test failures in TestParse.union and TestCliDriver.union24. Can you please take a look? Thanks. UNION ALL on different types throws runtime exception - Key: HIVE-2305 URL: https://issues.apache.org/jira/browse/HIVE-2305 Project: Hive Issue Type: Bug Affects Versions: 0.7.1 Reporter: Franklin Hu Assignee: Franklin Hu Fix For: 0.8.0 Attachments: hive-2305.1.patch, hive-2305.2.patch, hive-2305.3.patch Ex: SELECT * FROM (SELECT 123 FROM ... UNION ALL SELECT '123' FROM ..) t; Unioning columns of different types currently throws runtime exceptions.
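A general fix for this class of bug is to resolve a common type for each unioned column at compile time instead of failing at runtime. The sketch below is hypothetical (it is not Hive's actual type-resolution code; the type names and widening order are simplified):

```python
# Simplified numeric widening order; Hive's real type hierarchy is richer.
NUMERIC_ORDER = ["tinyint", "smallint", "int", "bigint", "float", "double"]

def common_type(left, right):
    """Pick a type both UNION ALL branch columns can be implicitly cast to."""
    if left == right:
        return left
    if left in NUMERIC_ORDER and right in NUMERIC_ORDER:
        # Widen to the larger of the two numeric types.
        return NUMERIC_ORDER[max(NUMERIC_ORDER.index(left),
                                 NUMERIC_ORDER.index(right))]
    # Mixed numeric/string, as in 123 vs '123': fall back to string.
    return "string"

print(common_type("int", "string"))  # string: 123 would be cast to '123'
```

Under this scheme the example query would compile with the column typed as string, rather than hitting a runtime exception when the two branches disagree.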
[jira] [Updated] (HIVE-2266) Fix compression parameters
[ https://issues.apache.org/jira/browse/HIVE-2266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2266: - Status: Open (was: Patch Available) @Vaibhav: The changes look good, but they need a test. Is it feasible to add something to TestCliDriver? Thanks. Fix compression parameters -- Key: HIVE-2266 URL: https://issues.apache.org/jira/browse/HIVE-2266 Project: Hive Issue Type: Bug Reporter: Vaibhav Aggarwal Assignee: Vaibhav Aggarwal Attachments: HIVE-2266.patch There are a number of places where compression values are not set correctly in FileSinkOperator. This results in uncompressed files.
[jira] [Updated] (HIVE-2303) files with control-A,B are not delimited correctly.
[ https://issues.apache.org/jira/browse/HIVE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated HIVE-2303: -- Status: Patch Available (was: Open) files with control-A,B are not delimited correctly. --- Key: HIVE-2303 URL: https://issues.apache.org/jira/browse/HIVE-2303 Project: Hive Issue Type: Bug Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.8.0 Attachments: patch-2303-2.txt, patch-2303.txt The following is from one of our users: create external table impressions (imp string, msg string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile location '/xxx'; Some strings in my data contain Control-A, Control-B, etc. as internal delimiters. If I do a Select * from impressions limit 10; all fields print correctly. However, if I do a Select * from impressions where msg regexp '.*' limit 10; the fields are broken by the control characters. The difference between the two commands is that the latter requires a map-reduce job.
[jira] [Updated] (HIVE-2303) files with control-A,B are not delimited correctly.
[ https://issues.apache.org/jira/browse/HIVE-2303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Amareshwari Sriramadasu updated HIVE-2303: -- Attachment: patch-2303-2.txt The patch on review board has the test outputs regenerated. Uploading the patch from review board. files with control-A,B are not delimited correctly. --- Key: HIVE-2303 URL: https://issues.apache.org/jira/browse/HIVE-2303 Project: Hive Issue Type: Bug Reporter: Amareshwari Sriramadasu Assignee: Amareshwari Sriramadasu Fix For: 0.8.0 Attachments: patch-2303-2.txt, patch-2303.txt The following is from one of our users: create external table impressions (imp string, msg string) row format delimited fields terminated by '\t' lines terminated by '\n' stored as textfile location '/xxx'; Some strings in my data contain Control-A, Control-B, etc. as internal delimiters. If I do a Select * from impressions limit 10; all fields print correctly. However, if I do a Select * from impressions where msg regexp '.*' limit 10; the fields are broken by the control characters. The difference between the two commands is that the latter requires a map-reduce job.
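A plausible reading of the symptom (not confirmed in the report) is that the map-reduce path re-serializes rows with Hive's default Ctrl-A field delimiter instead of the table's declared tab delimiter, so Ctrl-A bytes embedded in the data start acting as separators. The effect in miniature, in Python:

```python
# A row whose two fields are tab-separated, where the second field happens
# to contain Ctrl-A (\x01) and Ctrl-B (\x02) bytes internally.
row = "imp123\tmsg with \x01 and \x02 inside"

# Splitting on the table's declared delimiter keeps both fields intact.
assert row.split("\t") == ["imp123", "msg with \x01 and \x02 inside"]

# Splitting on Hive's default field delimiter (\x01) breaks the second
# field apart, matching the garbled output seen after the map-reduce job.
print(row.split("\x01"))  # ['imp123\tmsg with ', ' and \x02 inside']
```

This would also explain why the plain `Select *` (no map-reduce job, no intermediate re-serialization) prints the fields correctly.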
[jira] [Updated] (HIVE-2338) Alter table always throws an unhelpful error on failure
[ https://issues.apache.org/jira/browse/HIVE-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2338: - Resolution: Fixed Fix Version/s: 0.8.0 Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) Committed to trunk. Thanks Sohan! Alter table always throws an unhelpful error on failure --- Key: HIVE-2338 URL: https://issues.apache.org/jira/browse/HIVE-2338 Project: Hive Issue Type: Bug Components: Diagnosability, Metastore Reporter: Sohan Jain Assignee: Sohan Jain Priority: Minor Fix For: 0.8.0 Attachments: HIVE-2338.1.patch, HIVE-2338.2.patch Every failure in an alter table function always returns a MetaException. When altering tables and catching exceptions, we throw a MetaException in the finally part of a try-catch-finally block, which overrides any other exception thrown.
[jira] [Updated] (HIVE-2338) Alter table always throws an unhelpful error on failure
[ https://issues.apache.org/jira/browse/HIVE-2338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-2338: - Component/s: Diagnosability Alter table always throws an unhelpful error on failure --- Key: HIVE-2338 URL: https://issues.apache.org/jira/browse/HIVE-2338 Project: Hive Issue Type: Bug Components: Diagnosability, Metastore Reporter: Sohan Jain Assignee: Sohan Jain Priority: Minor Fix For: 0.8.0 Attachments: HIVE-2338.1.patch, HIVE-2338.2.patch Every failure in an alter table function always returns a MetaException. When altering tables and catching exceptions, we throw a MetaException in the finally part of a try-catch-finally block, which overrides any other exception thrown.
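The described bug is the classic pitfall of throwing inside a finally block: the new exception replaces whatever was already propagating, so the informative error is lost. A Python analogue of the Java try-catch-finally pattern (the function and messages are illustrative, not Hive's actual code):

```python
class MetaException(Exception):
    """Stand-in for the metastore's generic exception type."""

def alter_table():
    try:
        raise ValueError("the real, informative error")
    finally:
        # Raising here discards the in-flight ValueError, so every
        # failure surfaces as the same unhelpful MetaException.
        raise MetaException("generic alter table failure")

try:
    alter_table()
except Exception as e:
    print(type(e).__name__)  # MetaException; the ValueError is gone
```

The fix is to throw the generic exception only when no more specific exception is already propagating, so callers see the original cause.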
[jira] [Updated] (HIVE-1850) alter table set serdeproperties bypasses regexps checks (leaves table in a non-recoverable state?)
[ https://issues.apache.org/jira/browse/HIVE-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-1850: - Status: Open (was: Patch Available) @Amareshwari: I see a diff in TestCliDriver.protectmode. Can you please take a look and see if you get the same result? Thanks. alter table set serdeproperties bypasses regexps checks (leaves table in a non-recoverable state?) -- Key: HIVE-1850 URL: https://issues.apache.org/jira/browse/HIVE-1850 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Affects Versions: 0.7.0 Environment: Trunk build from a few days ago, but seen once before with an older version as well. Reporter: Terje Marthinussen Assignee: Amareshwari Sriramadasu Fix For: 0.8.0 Attachments: patch-1850.txt
{code}
create table aa ( test STRING )
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "[^\\](.*)", "output.format.string" = "$1s");
{code}
This will fail. Great!
{code}
create table aa ( test STRING )
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES ("input.regex" = "(.*)", "output.format.string" = "$1s");
{code}
Works, no problem there.
{code}
alter table aa set serdeproperties ("input.regex" = "[^\\](.*)", "output.format.string" = "$1s");
{code}
Whoops... I can set that without any problems!
{code}
alter table aa set serdeproperties ("input.regex" = "(.*)", "output.format.string" = "$1s");
FAILED: Hive Internal Error: java.util.regex.PatternSyntaxException(Unclosed character class near index 7 [^\](.*) ^)
java.util.regex.PatternSyntaxException: Unclosed character class near index 7
[^\](.*)
       ^
	at java.util.regex.Pattern.error(Pattern.java:1713)
	at java.util.regex.Pattern.clazz(Pattern.java:2254)
	at java.util.regex.Pattern.sequence(Pattern.java:1818)
	at java.util.regex.Pattern.expr(Pattern.java:1752)
	at java.util.regex.Pattern.compile(Pattern.java:1460)
	at java.util.regex.Pattern.<init>(Pattern.java:1133)
	at java.util.regex.Pattern.compile(Pattern.java:847)
	at org.apache.hadoop.hive.contrib.serde2.RegexSerDe.initialize(RegexSerDe.java:101)
	at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:199)
	at org.apache.hadoop.hive.ql.metadata.Table.getDeserializer(Table.java:253)
	at org.apache.hadoop.hive.ql.metadata.Table.getCols(Table.java:484)
	at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:161)
	at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:803)
	at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeAlterTableSerdeProps(DDLSemanticAnalyzer.java:558)
	at org.apache.hadoop.hive.ql.parse.DDLSemanticAnalyzer.analyzeInternal(DDLSemanticAnalyzer.java:232)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:335)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:686)
	at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:142)
	at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:216)
	at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:370)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
{code}
After this, all further commands on the table fail, including drop table :)
1. The alter table command should probably check the regexp just like the create table command does.
2. Even though the regexp is bad, it should be possible to do things like set the regexp again or drop the table.
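Point 1 amounts to compiling the regex before accepting the ALTER TABLE, the same validation CREATE TABLE effectively gets by initializing the SerDe up front. Sketched in Python rather than Hive's Java (both java.util.regex and Python's re reject this pattern for the same reason, an unclosed character class); the helper name is illustrative:

```python
import re

def validate_serde_regex(pattern):
    """Reject a bad input.regex up front instead of corrupting the table."""
    try:
        re.compile(pattern)
        return True
    except re.error as err:
        print(f"rejected {pattern!r}: {err}")
        return False

assert validate_serde_regex(r"(.*)")          # the pattern that works above
assert not validate_serde_regex(r"[^\](.*)")  # unclosed character class
```

Running such a check in the DDL analyzer would turn the late PatternSyntaxException into an immediate, recoverable error, and the table would never enter the state where even drop table fails.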