[jira] [Created] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-06-23 Thread Siying Dong (JIRA)
Cli: Print Hadoop's CPU milliseconds


 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Priority: Minor


CPU milliseconds information is available from Hadoop's framework. Printing it 
out in the Hive CLI when executing a job will help users learn more about their 
jobs.
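
A minimal sketch of how such a counter could be read from a finished Hadoop job 
and printed (the counter group/name and class below are assumptions that vary 
across Hadoop versions; this is not the actual HIVE-2236 patch):

{code}
// Hypothetical sketch: read Hadoop's CPU counter for a job and print it.
import java.io.IOException;

import org.apache.hadoop.mapred.Counters;
import org.apache.hadoop.mapred.RunningJob;

public class CpuMsecPrinter {
  public static void printCpuMsec(RunningJob rj) throws IOException {
    Counters ctrs = rj.getCounters();
    if (ctrs == null) {
      return;
    }
    // Assumed group/name; on recent Hadoop versions the CPU counter is
    // reported per task as CPU_MILLISECONDS in the task counter group.
    Counters.Counter cpu =
        ctrs.findCounter("org.apache.hadoop.mapred.Task$Counter", "CPU_MILLISECONDS");
    if (cpu != null) {
      System.out.println("Total CPU time: " + cpu.getCounter() + " msec");
    }
  }
}
{code}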

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Review Request: Cli: Print Hadoop's CPU milliseconds

2011-06-23 Thread Siying Dong

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/948/
---

Review request for hive, Yongqiang He, Ning Zhang, and namit jain.


Summary
---

In the Hive CLI, print out CPU msec from the Hadoop MapReduce counters.


This addresses bug HIVE-2236.
https://issues.apache.org/jira/browse/HIVE-2236


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 
1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1138748 

Diff: https://reviews.apache.org/r/948/diff


Testing
---

Ran the updated code against real clusters and verified that the printed output 
is correct.


Thanks,

Siying



[jira] [Commented] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-06-23 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053684#comment-13053684
 ] 

jirapos...@reviews.apache.org commented on HIVE-2236:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/948/
---

Review request for hive, Yongqiang He, Ning Zhang, and namit jain.


Summary
---

In the Hive CLI, print out CPU msec from the Hadoop MapReduce counters.


This addresses bug HIVE-2236.
https://issues.apache.org/jira/browse/HIVE-2236


Diffs
-

  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 
1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1138748 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1138748 

Diff: https://reviews.apache.org/r/948/diff


Testing
---

Ran the updated code against real clusters and verified that the printed output 
is correct.


Thanks,

Siying



 Cli: Print Hadoop's CPU milliseconds
 

 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2236.1.patch


 CPU milliseconds information is available from Hadoop's framework. Printing it 
 out in the Hive CLI when executing a job will help users learn more about their 
 jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-1537) Allow users to specify LOCATION in CREATE DATABASE statement

2011-06-23 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053764#comment-13053764
 ] 

jirapos...@reviews.apache.org commented on HIVE-1537:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/
---

Review request for hive, Ning Zhang and Amareshwari Sriramadasu.


Summary
---

Usage:

create database location 'path1';
alter database location 'path2';

After 'alter', only newly created tables will be located under the new 
location. Tables created before 'alter' will be under 'path1'.

Notes:
--
1. I have moved getDefaultDatabasePath() to HiveMetaStore and made it private. 
There should only be one API to obtain the location of a database, and it has to 
accept 'Database' as an arg; hence the new method in Warehouse, 
'getDatabasePath()', and similarly 'getTablePath()'. The usages of the older API 
have also been changed. I hope that is fine.
2. One could argue against having getDatabasePath(), since the location can be 
obtained via db.getLocationUri(). I wanted to retain this method to do any 
additional processing if necessary (getDns or whatever).
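
A minimal sketch of the single-API idea in note 1 (the fallback behavior and 
helper names are assumptions based on this description, not necessarily the 
committed patch; the methods are meant to live in Warehouse):

{code}
// Sketch: accept a Database object and fall back to the default warehouse
// location when no explicit location was set.
public Path getDatabasePath(Database db) throws MetaException {
  String loc = db.getLocationUri();
  if (loc != null && loc.length() > 0) {
    return new Path(loc);
  }
  // Same default as the old getDefaultDatabasePath():
  // <hive.metastore.warehouse.dir>/<dbname>.db
  return new Path(getWhRoot(), db.getName().toLowerCase() + ".db");
}

public Path getTablePath(Database db, String tableName) throws MetaException {
  return new Path(getDatabasePath(db), tableName.toLowerCase());
}
{code}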


This addresses bug HIVE-1537.
https://issues.apache.org/jira/browse/HIVE-1537


Diffs
-

  
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
1138011 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1138011 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1138011 
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
1138011 
  
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1138011 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1138011 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1138011 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1138011 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1138011 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
1138011 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1138011 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
1138011 
  trunk/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 1138011 
  trunk/ql/src/test/queries/clientpositive/database_location.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/database_location.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/949/diff


Testing
---

1. Updated TestHiveMetaStore.java to test the functionality (database 
creation, alteration, and table locations), since TestCliDriver outputs ignore 
locations.
2. Added database_location.q, primarily to test the grammar.

Thanks,
Thiruvel


Thanks,

Thiruvel



 Allow users to specify LOCATION in CREATE DATABASE statement
 

 Key: HIVE-1537
 URL: https://issues.apache.org/jira/browse/HIVE-1537
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Reporter: Carl Steinbach
Assignee: Thiruvel Thirumoolan
 Attachments: HIVE-1537.patch, hive-1537.metastore.part.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2230) Hive Client build error

2011-06-23 Thread Bennie Schut (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053806#comment-13053806
 ] 

Bennie Schut commented on HIVE-2230:


I talked with Dmytro offline, and this line on the wiki should probably be changed:
"The Hive ODBC driver was developed with Thrift trunk version r790732, but the 
latest revision should also be fine."

Hive 0.7 and higher uses Thrift 0.5.0. I'm not sure what happens when you mix 
with a newer version of Thrift, but the older version (r790732) doesn't seem to 
work. I would probably advise others to use 0.5.0.



 Hive Client build error
 ---

 Key: HIVE-2230
 URL: https://issues.apache.org/jira/browse/HIVE-2230
 Project: Hive
  Issue Type: Bug
  Components: Clients, ODBC
 Environment: hive:
 {code}
 Path: .
 URL: http://svn.apache.org/repos/asf/hive/trunk
 Repository Root: http://svn.apache.org/repos/asf
 Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
 Revision: 1138016
 Node Kind: directory
 Schedule: normal
 Last Changed Author: jvs
 Last Changed Rev: 1137839
 Last Changed Date: 2011-06-21 03:41:17 +0200 (Tue, 21 Jun 2011)
 {code}
 thrift:
 {code}
 Path: .
 URL: http://svn.apache.org/repos/asf/thrift/trunk
 Repository Root: http://svn.apache.org/repos/asf
 Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
 Revision: 1138011
 Node Kind: directory
 Schedule: normal
 Last Changed Author: molinaro
 Last Changed Rev: 1137870
 Last Changed Date: 2011-06-21 08:20:18 +0200 (Tue, 21 Jun 2011)
 {code}
Reporter: Dmytro Korochkin

 While running ant 
 {code}
 ant compile-cpp -Dthrift.home=/usr/local
 {code}
 to build the Hive Client according to http://wiki.apache.org/hadoop/Hive/HiveODBC, 
 I got the following error message:
 {code}
 compile-cpp:
  [exec] mkdir -p /home/ubuntu/hive/build/metastore/objs
  [exec] g++ -Wall -g -fPIC -m32 -DARCH32 -I/usr/local/include/thrift 
 -I/usr/local/include/thrift/fb303 -I/include 
 -I/home/ubuntu/hive/service/src/gen/thrift/gen-cpp 
 -I/home/ubuntu/hive/ql/src/gen/thrift/gen-cpp 
 -I/home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp 
 -I/home/ubuntu/hive/odbc/src/cpp -c 
 /home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp -o 
 /home/ubuntu/hive/build/metastore/objs/ThriftHiveMetastore.o
  [exec] 
 /home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp: 
 In member function 'virtual bool 
 Apache::Hadoop::Hive::ThriftHiveMetastoreProcessor::process_fn(apache::thrift::protocol::TProtocol*,
  apache::thrift::protocol::TProtocol*, std::string, int32_t)':
  [exec] 
 /home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp:18014:92:
  error: no matching function for call to 
 'Apache::Hadoop::Hive::ThriftHiveMetastoreProcessor::process_fn(apache::thrift::protocol::TProtocol*,
  apache::thrift::protocol::TProtocol*, std::string, int32_t)'
  [exec] /usr/local/include/thrift/fb303/FacebookService.h:1299:16: note: 
 candidate is: virtual bool 
 facebook::fb303::FacebookServiceProcessor::process_fn(apache::thrift::protocol::TProtocol*,
  apache::thrift::protocol::TProtocol*, std::string, int32_t, void*)
  [exec] make: *** 
 [/home/ubuntu/hive/build/metastore/objs/ThriftHiveMetastore.o] Error 1
 BUILD FAILED
 {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: trunk busted?

2011-06-23 Thread John Sichi
OK, I guess it's because I'm hitting the permissions mystery discussed with you 
and Paul back-channel.  We need to get that resolved.

JVS

On Jun 22, 2011, at 10:20 PM, Ning Zhang wrote:

 FYI my test just succeeded on the clean check out of trunk. 
 
 On Jun 22, 2011, at 5:14 PM, yongqiang he wrote:
 
 database.q failed me when testing HIVE-2100
 
 On Wed, Jun 22, 2011 at 2:23 PM, John Sichi jsi...@fb.com wrote:
 Yeah, that's one of the failures (out of many different ones) that Jenkins 
 has been hitting (see the end of this log):
 
 https://builds.apache.org/view/G-L/view/Hive/job/Hive-trunk-h0.21/788/console
 
 It's sporadic, probably based on server load.
 
 JVS
 
 On Jun 22, 2011, at 2:10 PM, Ning Zhang wrote:
 
 John, here's what I got for 'ant clean package'. It seems ivy is flaky now?
 
 
 ivy-download:
 [get] Getting: 
 http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
 [get] To: 
 /data/users/nzhang/reviews/2/apache-hive/build/ivy/lib/ivy-2.1.0.jar
 [get] Error getting 
 http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar to 
 /data/users/nzhang/reviews/2/apache-hive/build/ivy/lib/ivy-2.1.0.jar
 
 BUILD FAILED
 /data/users/nzhang/reviews/2/apache-hive/build.xml:196: The following 
 error occurred while executing this line:
 /data/users/nzhang/reviews/2/apache-hive/build.xml:130: The following 
 error occurred while executing this line:
 /data/users/nzhang/reviews/2/apache-hive/build-common.xml:128: 
 java.net.ConnectException: Connection refused
 
 
 On Jun 22, 2011, at 12:43 PM, John Sichi wrote:
 
 Yeah, all tests passed when I committed the bitmap indexes, so I'm not 
 sure what's up.
 
 JVS
 
 On Jun 22, 2011, at 12:36 PM, Ning Zhang wrote:
 
 trunk was fine the last time I committed. John the last ones who 
 committed were Carl (branching 0.7.1) and you (bitmap index). :) Did you 
 get all the tests passed? I'll test with a clean checkout.
 
 On Jun 22, 2011, at 12:02 PM, John Sichi wrote:
 
 Are other committers able to pass tests on Hive trunk?  I'm getting 
 lots of failures, and Jenkins seems to have been barfing for a while 
 too.
 
 JVS
 
 
 
 
 
 
 



Re: Review Request: HIVE-1537 - Allow users to specify LOCATION in CREATE DATABASE statement

2011-06-23 Thread Ashutosh Chauhan

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/#review898
---



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1938

This may not always succeed. Creating the directories can fail for a number 
of reasons, so this needs to be handled gracefully: the transaction needs to 
roll back in that case, and the CREATE DATABASE DDL needs to fail. For more info, 
look at Devaraj's first comment and also his attached partial patch.
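
A hedged sketch of the handling being asked for (method and helper names are 
assumptions, not the exact HiveMetaStore code): fail the DDL when the directory 
cannot be created, and roll back any metadata written in the same transaction.

{code}
// Sketch only: create the directory and the metadata inside one transaction,
// and clean both up if either step fails. Assumes the handler already has a
// RawStore (ms) and a Warehouse (wh) available.
private void createDatabaseWithRollback(RawStore ms, Warehouse wh, Database db)
    throws Exception {
  Path dbPath = new Path(db.getLocationUri());
  boolean success = false;
  boolean madeDir = false;
  try {
    ms.openTransaction();
    madeDir = wh.mkdirs(dbPath);             // FS operation; can fail for many reasons
    if (!madeDir) {
      throw new MetaException("Unable to create database directory " + dbPath);
    }
    ms.createDatabase(db);
    success = ms.commitTransaction();
  } finally {
    if (!success) {
      ms.rollbackTransaction();              // no metadata survives a failed create
      if (madeDir) {
        wh.deleteDir(dbPath, true);          // best-effort cleanup of the directory
      }
    }
  }
}
{code}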



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1941

As noted previously, mkdirs() can fail, so handle it the same way as in createDatabase().



trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
https://reviews.apache.org/r/949/#comment1942

Please also add a test for the case where create database fails because an FS 
operation fails. In that case no metadata should be created. One way to simulate 
it is to make the location unwritable and then try to create a database at that 
location.
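
A rough sketch of such a test (fixture names like client and hiveConf, and the 
chosen permissions, are assumptions about the existing TestHiveMetaStore setup):

{code}
// Sketch of a negative test: make the parent location unwritable, expect
// createDatabase() to fail, then verify no metadata was left behind.
// Assumes the usual imports (Path, FileSystem, FsPermission, Database,
// NoSuchObjectException) at the top of TestHiveMetaStore.
public void testCreateDatabaseFailsOnUnwritableLocation() throws Exception {
  Path parent = new Path("/tmp/db_loc_fail_parent");            // assumed test path
  FileSystem fs = FileSystem.get(hiveConf);                     // from test setup
  fs.mkdirs(parent);
  fs.setPermission(parent, new FsPermission((short) 0555));     // read + execute only

  Database db = new Database();
  db.setName("loc_fail_db");
  db.setLocationUri(new Path(parent, "loc_fail_db.db").toString());

  boolean createFailed = false;
  try {
    client.createDatabase(db);                                  // metastore client from setup
  } catch (Exception expected) {
    createFailed = true;
  }
  assertTrue("create database should fail on an unwritable location", createFailed);

  try {
    client.getDatabase("loc_fail_db");
    fail("no metadata should have been created for the failed database");
  } catch (NoSuchObjectException expected) {
    // expected: the database must not exist in the metastore
  }
}
{code}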


- Ashutosh


On 2011-06-23 09:55:50, Thiruvel Thirumoolan wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/949/
 ---
 
 (Updated 2011-06-23 09:55:50)
 
 
 Review request for hive, Ning Zhang and Amareshwari Sriramadasu.
 
 
 Summary
 ---
 
 Usage:
 
 create database location 'path1';
 alter database location 'path2';
 
 After 'alter', only newly created tables will be located under the new 
 location. Tables created before 'alter' will be under 'path1'.
 
 Notes:
 --
 1. I have moved getDefaultDatabasePath() to HiveMetaStore and made it 
 private. There should only be one API to obtain the location of a database 
 and it has to accept 'Database' as an arg and hence the new method in 
 Warehouse 'getDatabasePath()' and similarly 'getTablePath()'. The usages of 
 older API also has been changed. Hope that should be fine.
 2. One could argue why have getDatabasePath() as location can be obtained by 
 db.getLocationUri(). I wanted to retain this method to do any additional 
 processing if necessary (getDns or whatever).
 
 
 This addresses bug HIVE-1537.
 https://issues.apache.org/jira/browse/HIVE-1537
 
 
 Diffs
 -
 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java
  1138011 
   
 trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
 1138011 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
 1138011 
   trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
 1138011 
   
 trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
  1138011 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1138011 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1138011 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
 1138011 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1138011 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
 1138011 
   trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
 1138011 
   
 trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java
  1138011 
   trunk/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 1138011 
   trunk/ql/src/test/queries/clientpositive/database_location.q PRE-CREATION 
   trunk/ql/src/test/results/clientpositive/database_location.q.out 
 PRE-CREATION 
 
 Diff: https://reviews.apache.org/r/949/diff
 
 
 Testing
 ---
 
 1. Updated TestHiveMetaStore.java for testing the functionality - database 
 creation, alteration and table's locations as TestCliDriver outputs ignore 
 locations.
 2. Added database_location.q for testing the grammar primarily.
 
 Thanks,
 Thiruvel
 
 
 Thanks,
 
 Thiruvel
 




[jira] [Commented] (HIVE-1537) Allow users to specify LOCATION in CREATE DATABASE statement

2011-06-23 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053959#comment-13053959
 ] 

jirapos...@reviews.apache.org commented on HIVE-1537:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/#review898
---



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1938

This may not always succeed. Creating the directories can fail for a number 
of reasons, so this needs to be handled gracefully: the transaction needs to 
roll back in that case, and the CREATE DATABASE DDL needs to fail. For more info, 
look at Devaraj's first comment and also his attached partial patch.



trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1941

As noted previously, mkdirs() can fail, so handle it the same way as in createDatabase().



trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
https://reviews.apache.org/r/949/#comment1942

Please also add a test for the case where create database fails because an FS 
operation fails. In that case no metadata should be created. One way to simulate 
it is to make the location unwritable and then try to create a database at that 
location.


- Ashutosh


On 2011-06-23 09:55:50, Thiruvel Thirumoolan wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/949/
bq.  ---
bq.  
bq.  (Updated 2011-06-23 09:55:50)
bq.  
bq.  
bq.  Review request for hive, Ning Zhang and Amareshwari Sriramadasu.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Usage:
bq.  
bq.  create database location 'path1';
bq.  alter database location 'path2';
bq.  
bq.  After 'alter', only newly created tables will be located under the new 
location. Tables created before 'alter' will be under 'path1'.
bq.  
bq.  Notes:
bq.  --
bq.  1. I have moved getDefaultDatabasePath() to HiveMetaStore and made it 
private. There should only be one API to obtain the location of a database and 
it has to accept 'Database' as an arg and hence the new method in Warehouse 
'getDatabasePath()' and similarly 'getTablePath()'. The usages of older API 
also has been changed. Hope that should be fine.
bq.  2. One could argue why have getDatabasePath() as location can be obtained 
by db.getLocationUri(). I wanted to retain this method to do any additional 
processing if necessary (getDns or whatever).
bq.  
bq.  
bq.  This addresses bug HIVE-1537.
bq.  https://issues.apache.org/jira/browse/HIVE-1537
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 
1138011 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 
1138011 
bq.
trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 
1138011 
bq.trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 
1138011 
bq.
trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
 1138011 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1138011 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1138011 
bq.
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 
1138011 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1138011 
bq.
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 
1138011 
bq.trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 
1138011 
bq.
trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 
1138011 
bq.trunk/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 
1138011 
bq.trunk/ql/src/test/queries/clientpositive/database_location.q 
PRE-CREATION 
bq.trunk/ql/src/test/results/clientpositive/database_location.q.out 
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/949/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  1. Updated TestHiveMetaStore.java for testing the functionality - database 
creation, alteration and table's locations as TestCliDriver outputs ignore 
locations.
bq.  2. Added database_location.q for testing the grammar primarily.
bq.  
bq.  Thanks,
bq.  Thiruvel
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Thiruvel
bq.  
bq.



 Allow users to specify LOCATION in CREATE DATABASE statement
 

 Key: HIVE-1537
 URL: https://issues.apache.org/jira/browse/HIVE-1537
 Project: Hive
  Issue 

Re: Review Request: HIVE-2035 Use block level merge on rcfile if intermediate merge is needed

2011-06-23 Thread Franklin Hu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/935/
---

(Updated 2011-06-23 18:56:14.903379)


Review request for hive.


Changes
---

Add max and min split size configs to unit tests


Summary
---

For a table stored as RCFile, intermediate results are sometimes merged if 
those files are below a certain threshold. For RCFiles, we can do a block level 
merge that does not deserialize the blocks and is more efficient. This patch 
leverages the existing code used to merge for ALTER TABLE ... CONCATENATE.
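
As a hedged illustration of the "merge if the intermediate files are small" 
decision described above (threshold handling and names are assumptions, not the 
actual Hive code; the block-level path additionally requires the output to be 
RCFile):

{code}
// Sketch: merge when the average output file size falls below a threshold.
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MergeDecisionSketch {
  public static boolean shouldMerge(FileSystem fs, Path outputDir, long avgSizeThreshold)
      throws IOException {
    FileStatus[] files = fs.listStatus(outputDir);
    if (files == null || files.length == 0) {
      return false;
    }
    long total = 0;
    for (FileStatus f : files) {
      total += f.getLen();
    }
    // Small average file size means the extra merge step is worthwhile.
    return (total / files.length) < avgSizeThreshold;
  }
}
{code}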


This addresses bug HIVE-2035.
https://issues.apache.org/jira/browse/HIVE-2035


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 
1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 
1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java
 1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java
 1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
 1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
 1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1139014 
  trunk/ql/src/test/queries/clientpositive/rcfile_createas1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge4.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_createas1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge4.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/935/diff


Testing
---


Thanks,

Franklin



[jira] [Updated] (HIVE-2035) Use block-level merge for RCFile if merging intermediate results are needed

2011-06-23 Thread Franklin Hu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Franklin Hu updated HIVE-2035:
--

Attachment: hive-2035.3.patch

Add min/max split size settings to unit tests

 Use block-level merge for RCFile if merging intermediate results are needed
 ---

 Key: HIVE-2035
 URL: https://issues.apache.org/jira/browse/HIVE-2035
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Franklin Hu
 Attachments: hive-2035.1.patch, hive-2035.3.patch


 Currently, if hive.merge.mapredfiles and/or hive.merge.mapfiles is set to true, 
 the intermediate data could be merged using an additional MapReduce job. This 
 could be quite expensive if the data size is large. With HIVE-1950, merging 
 can be done at the RCFile block level so that it bypasses the 
 (de-)compression and (de-)serialization phases. This could improve the merge 
 process significantly. 
 This JIRA should handle the case where the input table is not stored as 
 RCFile, but the destination table is (which requires that the intermediate data 
 be stored in the same format as the destination table). 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2035) Use block-level merge for RCFile if merging intermediate results are needed

2011-06-23 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054033#comment-13054033
 ] 

jirapos...@reviews.apache.org commented on HIVE-2035:
-


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/935/
---

(Updated 2011-06-23 18:56:14.903379)


Review request for hive.


Changes
---

Add max and min split size configs to unit tests


Summary
---

For a table stored as RCFile, intermediate results are sometimes merged if 
those files are below a certain threshold. For RCFiles, we can do a block level 
merge that does not deserialize the blocks and is more efficient. This patch 
leverages the existing code used to merge for ALTER TABLE ... CONCATENATE.


This addresses bug HIVE-2035.
https://issues.apache.org/jira/browse/HIVE-2035


Diffs (updated)
-

  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 
1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 
1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java
 1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java
 1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
 1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 
1139014 
  
trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java
 1139014 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1139014 
  trunk/ql/src/test/queries/clientpositive/rcfile_createas1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge1.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge3.q PRE-CREATION 
  trunk/ql/src/test/queries/clientpositive/rcfile_merge4.q PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_createas1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge1.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientpositive/rcfile_merge4.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/935/diff


Testing
---


Thanks,

Franklin



 Use block-level merge for RCFile if merging intermediate results are needed
 ---

 Key: HIVE-2035
 URL: https://issues.apache.org/jira/browse/HIVE-2035
 Project: Hive
  Issue Type: Improvement
Reporter: Ning Zhang
Assignee: Franklin Hu
 Attachments: hive-2035.1.patch, hive-2035.3.patch


 Currently, if hive.merge.mapredfiles and/or hive.merge.mapfiles is set to true, 
 the intermediate data could be merged using an additional MapReduce job. This 
 could be quite expensive if the data size is large. With HIVE-1950, merging 
 can be done at the RCFile block level so that it bypasses the 
 (de-)compression and (de-)serialization phases. This could improve the merge 
 process significantly. 
 This JIRA should handle the case where the input table is not stored as 
 RCFile, but the destination table is (which requires that the intermediate data 
 be stored in the same format as the destination table). 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Jenkins build is back to normal : Hive-trunk-h0.21 #790

2011-06-23 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/790/




[jira] [Commented] (HIVE-2215) Add api for marking / querying set of partitions for events

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054057#comment-13054057
 ] 

Hudson commented on HIVE-2215:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 Add api for marking / querying set of partitions for events
 ---

 Key: HIVE-2215
 URL: https://issues.apache.org/jira/browse/HIVE-2215
 Project: Hive
  Issue Type: New Feature
  Components: Metastore
Affects Versions: 0.8.0
Reporter: Ashutosh Chauhan
Assignee: Ashutosh Chauhan
 Fix For: 0.8.0

 Attachments: hive-2215_full-1.patch, hive_2215.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2218) speedup addInputPaths

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054060#comment-13054060
 ] 

Hudson commented on HIVE-2218:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 speedup addInputPaths
 -

 Key: HIVE-2218
 URL: https://issues.apache.org/jira/browse/HIVE-2218
 Project: Hive
  Issue Type: Improvement
Reporter: He Yongqiang
Assignee: He Yongqiang
 Fix For: 0.8.0

 Attachments: HIVE-2218.1.patch, HIVE-2218.2.patch, HIVE-2218.3.patch


 Speed up addInputPaths for the combined symlink input format, and add some 
 other micro-optimizations that also work for normal cases.
 This can help reduce the start time of one query from 5 hours to less than 
 20 minutes.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2176) Schema creation scripts are incomplete since they leave out tables that are specific to DataNucleus

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054059#comment-13054059
 ] 

Hudson commented on HIVE-2176:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 Schema creation scripts are incomplete since they leave out tables that are 
 specific to DataNucleus
 ---

 Key: HIVE-2176
 URL: https://issues.apache.org/jira/browse/HIVE-2176
 Project: Hive
  Issue Type: Bug
  Components: Configuration, Metastore
Affects Versions: 0.5.0, 0.6.0, 0.7.0
Reporter: Esteban Gutierrez
Assignee: Esteban Gutierrez
  Labels: derby, mysql, postgres
 Fix For: 0.7.1, 0.8.0

 Attachments: HIVE-2176.3.patch.txt


 When using the DDL SQL scripts to create the Metastore, tables like 
 SEQUENCE_TABLE are missing, which forces the user to change the configuration so 
 that DataNucleus does all the provisioning of the Metastore tables. Adding 
 the missing table definitions to the DDL scripts will allow users to have a 
 functional Hive Metastore without granting additional privileges to the 
 Metastore user and/or enabling the datanucleus.autoCreateSchema property in 
 hive-site.xml.
 [After running hive-schema-0.7.0.mysql.sql and revoking the ALTER and CREATE 
 privileges from the 'metastoreuser']
 hive> show tables; 
 FAILED: Error in metadata: javax.jdo.JDOException: Exception thrown calling 
 table.exists() for `SEQUENCE_TABLE` 
 NestedThrowables: 
 com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: CREATE command 
 denied to user 'metastoreuser'@'localhost' for table 'SEQUENCE_TABLE' 
 FAILED: Execution Error, return code 1 from 
 org.apache.hadoop.hive.ql.exec.DDLTask

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2158) add the HivePreparedStatement implementation based on current HIVE supported data-type

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054061#comment-13054061
 ] 

Hudson commented on HIVE-2158:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 add the HivePreparedStatement implementation based on current HIVE supported 
 data-type
 --

 Key: HIVE-2158
 URL: https://issues.apache.org/jira/browse/HIVE-2158
 Project: Hive
  Issue Type: Sub-task
  Components: JDBC
Affects Versions: 0.6.0, 0.7.0, 0.8.0
Reporter: Yuanjun Li
Assignee: Yuanjun Li
 Fix For: 0.7.1, 0.8.0

 Attachments: HIVE-0.7.1-PreparedStatement.1.patch.txt, 
 HIVE-0.8-PreparedStatement.1.patch.txt




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054058#comment-13054058
 ] 

Hudson commented on HIVE-2036:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 Update bitmap indexes for automatic usage
 -

 Key: HIVE-2036
 URL: https://issues.apache.org/jira/browse/HIVE-2036
 Project: Hive
  Issue Type: Improvement
  Components: Indexing
Affects Versions: 0.8.0
Reporter: Russell Melick
Assignee: Syed S. Albiz
 Fix For: 0.8.0

 Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch, HIVE-2036.8.patch


 HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap 
 index support.  The bitmap code will need to be extended after it is 
 committed to enable automatic use of indexing.  Most work will be focused in 
 the BitmapIndexHandler, which needs to generate the re-entrant QL index 
 query.  There may also be significant work in the IndexPredicateAnalyzer to 
 support predicates with OR's, instead of just AND's as it is currently.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2222) runnable queue in Driver and DriverContext is not thread safe

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054063#comment-13054063
 ] 

Hudson commented on HIVE-2222:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 runnable queue in Driver and DriverContext is not thread safe
 -

 Key: HIVE-2222
 URL: https://issues.apache.org/jira/browse/HIVE-2222
 Project: Hive
  Issue Type: Bug
Reporter: He Yongqiang
Assignee: Namit Jain
 Attachments: hive.2222.1.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2140) Return correct Major / Minor version numbers for Hive Driver

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054062#comment-13054062
 ] 

Hudson commented on HIVE-2140:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 Return correct Major / Minor version numbers for Hive Driver
 

 Key: HIVE-2140
 URL: https://issues.apache.org/jira/browse/HIVE-2140
 Project: Hive
  Issue Type: Sub-task
  Components: JDBC
Affects Versions: 0.6.0, 0.7.0
Reporter: Curtis Boyden
Assignee: Curtis Boyden
 Fix For: 0.7.1, 0.8.0

 Attachments: hive-0.6-driver-version.patch, 
 hive-0.7-driver-version.patch, hive-trunk-driver-version.patch




--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2213) Optimize partial specification metastore functions

2011-06-23 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054064#comment-13054064
 ] 

Hudson commented on HIVE-2213:
--

Integrated in Hive-trunk-h0.21 #790 (See 
[https://builds.apache.org/job/Hive-trunk-h0.21/790/])


 Optimize partial specification metastore functions
 --

 Key: HIVE-2213
 URL: https://issues.apache.org/jira/browse/HIVE-2213
 Project: Hive
  Issue Type: Improvement
  Components: Metastore
Reporter: Sohan Jain
Assignee: Sohan Jain
 Fix For: 0.8.0

 Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch


 If a table has a large number of partitions, get_partition_names_ps() may 
 take a long time to execute, because we get all of the partition names from 
 the database.  This is not very memory efficient, and the operation can be 
 pushed down to the JDO layer without getting all of the names first.
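
Roughly, the pushdown being described could look like the hedged JDOQL sketch 
below (the model class and field names are assumptions about the metastore 
object model, and the input variables pm, maxParts, tableName, dbName, and 
partialNameRegex are assumed to be in scope; this is not the committed 
HIVE-2213 change):

{code}
// Sketch: let the datastore filter and project partition names instead of
// materializing every partition name in memory first.
javax.jdo.Query query = pm.newQuery(
    "select partitionName from org.apache.hadoop.hive.metastore.model.MPartition"
    + " where table.tableName == t1 && table.database.name == t2"
    + " && partitionName.matches(t3)");
query.declareParameters("java.lang.String t1, java.lang.String t2, java.lang.String t3");
query.setOrdering("partitionName ascending");
query.setRange(0, maxParts);
@SuppressWarnings("unchecked")
java.util.Collection<String> names =
    (java.util.Collection<String>) query.execute(tableName, dbName, partialNameRegex);
{code}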

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java

2011-06-23 Thread Patrick Hunt (JIRA)
hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
-

 Key: HIVE-2237
 URL: https://issues.apache.org/jira/browse/HIVE-2237
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.8.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt


I see the following error in Helios Eclipse with the latest trunk (although the 
build on the command line is fine):

Syntax error on token ";", delete this token

It seems to have been introduced by this change in HIVE-2036:

+import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;;


I have a patch forthcoming.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java

2011-06-23 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated HIVE-2237:
---

Attachment: HIVE-2237.patch

Patch to remove the extra semicolon.

 hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
 -

 Key: HIVE-2237
 URL: https://issues.apache.org/jira/browse/HIVE-2237
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.8.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Attachments: HIVE-2237.patch


 I see the following error in Helios Eclipse with the latest trunk (although the 
 build on the command line is fine):
 Syntax error on token ";", delete this token
 It seems to have been introduced by this change in HIVE-2036:
 +import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;;
 I have a patch forthcoming.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java

2011-06-23 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated HIVE-2237:
---

Status: Patch Available  (was: Open)

 hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
 -

 Key: HIVE-2237
 URL: https://issues.apache.org/jira/browse/HIVE-2237
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.8.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Attachments: HIVE-2237.patch


 I see the following error in Helios Eclipse with the latest trunk (although the 
 build on the command line is fine):
 Syntax error on token ";", delete this token
 It seems to have been introduced by this change in HIVE-2036:
 +import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;;
 I have a patch forthcoming.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java

2011-06-23 Thread John Sichi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Sichi updated HIVE-2237:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

+1, committed to trunk.  Thanks Patrick!


 hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
 -

 Key: HIVE-2237
 URL: https://issues.apache.org/jira/browse/HIVE-2237
 Project: Hive
  Issue Type: Bug
  Components: Build Infrastructure
Affects Versions: 0.8.0
Reporter: Patrick Hunt
Assignee: Patrick Hunt
 Attachments: HIVE-2237.patch


 I see the following error in Helios Eclipse with the latest trunk (although the 
 build on the command line is fine):
 Syntax error on token ";", delete this token
 It seems to have been introduced by this change in HIVE-2036:
 +import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;;
 I have a patch forthcoming.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2011-06-23 Thread Carl Steinbach (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054116#comment-13054116
 ] 

Carl Steinbach commented on HIVE-895:
-

@Jakob: The code on GitHub looks really good.

The release branch for 0.8.0 is going to get created sometime in the next 
couple of weeks. Do you think it will be possible to get a patch ready for 
review before then?

 Add SerDe for Avro serialized data
 --

 Key: HIVE-895
 URL: https://issues.apache.org/jira/browse/HIVE-895
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Jeff Hammerbacher
Assignee: Jakob Homan

 As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
 data seems like a solid win.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data

2011-06-23 Thread Jakob Homan (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054124#comment-13054124
 ] 

Jakob Homan commented on HIVE-895:
--

A couple of weeks is probably not feasible.  Assuming 0.9 comes out a few 
months after that, that's probably a better bet.

 Add SerDe for Avro serialized data
 --

 Key: HIVE-895
 URL: https://issues.apache.org/jira/browse/HIVE-895
 Project: Hive
  Issue Type: New Feature
  Components: Serializers/Deserializers
Reporter: Jeff Hammerbacher
Assignee: Jakob Homan

 As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro 
 data seems like a solid win.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds

2011-06-23 Thread Siying Dong (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siying Dong updated HIVE-2236:
--

Status: Patch Available  (was: Open)

 Cli: Print Hadoop's CPU milliseconds
 

 Key: HIVE-2236
 URL: https://issues.apache.org/jira/browse/HIVE-2236
 Project: Hive
  Issue Type: New Feature
  Components: CLI
Reporter: Siying Dong
Assignee: Siying Dong
Priority: Minor
 Attachments: HIVE-2236.1.patch


 CPU milliseconds information is available from Hadoop's framework. Printing it 
 out in the Hive CLI when executing a job will help users learn more about their 
 jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories

2011-06-23 Thread Siying Dong (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054188#comment-13054188
 ] 

Siying Dong commented on HIVE-2201:
---

ping

 reduce name node calls in hive by creating temporary directories
 

 Key: HIVE-2201
 URL: https://issues.apache.org/jira/browse/HIVE-2201
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch


 Currently, in Hive, when a file gets written by a FileSinkOperator,
 the sequence of operations is as follows:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp1/1
 3. Move directory /tmp1 to /tmp2
 4. For all files in /tmp2, remove all files starting with _tmp and
 duplicate files.
 Due to speculative execution, a lot of temporary files are created
 in /tmp1 (or /tmp2). This leads to a lot of name node calls,
 especially for large queries.
 The protocol above can be modified slightly:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp2/1
 3. Move directory /tmp2 to /tmp3
 4. For all files in /tmp3, remove all duplicate files.
 This should reduce the number of tmp files.
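
 As a hedged illustration of the modified protocol (paths and helper names are 
 assumptions, not the actual HIVE-2201 patch), each task renames its _tmp file 
 straight into the next staging directory, so only one directory move remains 
 at the end:

{code}
// Sketch of the modified commit protocol using plain FileSystem renames.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class FileSinkCommitSketch {
  // Steps 1-2: the task writes /tmp1/_tmp_<id> and renames it directly into /tmp2.
  public static void commitTaskOutput(FileSystem fs, String taskId) throws Exception {
    Path tmpFile = new Path("/tmp1/_tmp_" + taskId);
    Path finalFile = new Path("/tmp2/" + taskId);
    if (!fs.rename(tmpFile, finalFile)) {
      throw new RuntimeException("rename failed: " + tmpFile + " -> " + finalFile);
    }
  }

  // Steps 3-4: one directory move; only duplicate files (from speculative
  // execution) still need to be removed under /tmp3.
  public static void commitJobOutput(Configuration conf) throws Exception {
    FileSystem fs = FileSystem.get(conf);
    fs.rename(new Path("/tmp2"), new Path("/tmp3"));
  }
}
{code}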

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories

2011-06-23 Thread He Yongqiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054200#comment-13054200
 ] 

He Yongqiang commented on HIVE-2201:


I will take a look...

 reduce name node calls in hive by creating temporary directories
 

 Key: HIVE-2201
 URL: https://issues.apache.org/jira/browse/HIVE-2201
 Project: Hive
  Issue Type: Improvement
Reporter: Namit Jain
Assignee: Siying Dong
 Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch


 Currently, in Hive, when a file gets written by a FileSinkOperator,
 the sequence of operations is as follows:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp1/1
 3. Move directory /tmp1 to /tmp2
 4. For all files in /tmp2, remove all files starting with _tmp and
 duplicate files.
 Due to speculative execution, a lot of temporary files are created
 in /tmp1 (or /tmp2). This leads to a lot of name node calls,
 especially for large queries.
 The protocol above can be modified slightly:
 1. In tmp directory tmp1, create a tmp file _tmp_1
 2. At the end of the operator, move
 /tmp1/_tmp_1 to /tmp2/1
 3. Move directory /tmp2 to /tmp3
 4. For all files in /tmp3, remove all duplicate files.
 This should reduce the number of tmp files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




Build failed in Jenkins: Hive-trunk-h0.21 #791

2011-06-23 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hive-trunk-h0.21/791/changes

Changes:

[jvs] HIVE-2237.  hive fails to build in eclipse due to syntax error in
BitmapIndexHandler.java
(Patrick Hunt via jvs)

--
[...truncated 30941 lines...]
[junit] OK
[junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-06-23_18-48-32_025_6710146509679189022/-mr-1
[junit] Total MapReduce jobs = 1
[junit] Launching Job 1 out of 1
[junit] Number of reduce tasks determined at compile time: 1
[junit] In order to change the average load for a reducer (in bytes):
[junit]   set hive.exec.reducers.bytes.per.reducer=number
[junit] In order to limit the maximum number of reducers:
[junit]   set hive.exec.reducers.max=number
[junit] In order to set a constant number of reducers:
[junit]   set mapred.reduce.tasks=number
[junit] Job running in-process (local Hadoop)
[junit] Hadoop job information for null: number of mappers: 0; number of 
reducers: 0
[junit] 2011-06-23 18:48:35,141 null map = 100%,  reduce = 100%
[junit] Ended Job = job_local_0001
[junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-06-23_18-48-32_025_6710146509679189022/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/service/tmp/hive_job_log_hudson_201106231848_247218233.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt' 
into table testhivedrivertable
[junit] PREHOOK: type: LOAD
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] Copying data from 
https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt
[junit] Loading data to table default.testhivedrivertable
[junit] POSTHOOK: query: load data local inpath 
'https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt' 
into table testhivedrivertable
[junit] POSTHOOK: type: LOAD
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] PREHOOK: query: select * from testhivedrivertable limit 10
[junit] PREHOOK: type: QUERY
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: 
file:/tmp/hudson/hive_2011-06-23_18-48-36_668_4580842764428959983/-mr-1
[junit] POSTHOOK: query: select * from testhivedrivertable limit 10
[junit] POSTHOOK: type: QUERY
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: 
file:/tmp/hudson/hive_2011-06-23_18-48-36_668_4580842764428959983/-mr-1
[junit] OK
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] PREHOOK: Input: default@testhivedrivertable
[junit] PREHOOK: Output: default@testhivedrivertable
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] POSTHOOK: Input: default@testhivedrivertable
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK
[junit] Hive history 
file=https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/service/tmp/hive_job_log_hudson_201106231848_1722739147.txt
[junit] PREHOOK: query: drop table testhivedrivertable
[junit] PREHOOK: type: DROPTABLE
[junit] POSTHOOK: query: drop table testhivedrivertable
[junit] POSTHOOK: type: DROPTABLE
[junit] OK
[junit] PREHOOK: query: create table testhivedrivertable (num int)
[junit] PREHOOK: type: CREATETABLE
[junit] POSTHOOK: query: create table testhivedrivertable (num int)
[junit] POSTHOOK: type: CREATETABLE
[junit] POSTHOOK: Output: default@testhivedrivertable
[junit] OK